Bachelor Thesis - Exploiting the Graph Structure of the Event Data contained in SAP



Alessandro Berti

Software Engineer


+49 241 80 21912





SAP ERP is a popular enterprise resource planning system used by organizations worldwide. It supports essential processes such as procure-to-pay and order-to-cash. Process mining techniques have been used to analyze and optimize these processes.

However, extracting a high-quality event log from mainstream SAP processes (such as order-to-cash, accounts-payable, and accounts-receivable) is particularly challenging: the events relate to several different objects (for example, deliveries, invoices, and payments), and a single case notion has to be chosen. Object-centric event logs mitigate these problems, but extracting a good-quality object-centric event log is still a considerable challenge.

Another viewpoint from which the data contained in SAP systems can be analyzed is graph-based. For example, in an order-to-cash process, an order can be related to several invoices. Similarly, an invoice can be settled by several separate payments. This is reflected in the relational structure of SAP as a document flow between documents. For example, the document flow of sales orders is stored in the table VBFA, the document flow of deliveries is stored in the table VTFA, and the accounting document flow is stored in the tables BKPF/BSEG.
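To make the graph view concrete, the following minimal Python sketch turns a handful of VBFA-like rows into a directed graph. It is an illustration only: the document numbers are invented, the function name `build_document_flow` is hypothetical, and the sketch assumes the usual VBFA columns VBELV/VBTYP_V (preceding document and its category) and VBELN/VBTYP_N (subsequent document and its category), with the category codes C, J, and M standing for order, delivery, and invoice. These assumptions should be checked against the target system.

```python
from collections import defaultdict

# Hypothetical VBFA-like rows: (VBELV, VBTYP_V, VBELN, VBTYP_N)
# = (preceding document, its category, subsequent document, its category).
# The document numbers are invented for illustration.
ROWS = [
    ("0000001001", "C", "0080002001", "J"),  # order -> first delivery
    ("0080002001", "J", "0090003001", "M"),  # first delivery -> invoice
    ("0000001001", "C", "0080002002", "J"),  # same order -> second delivery
    ("0080002002", "J", "0090003002", "M"),  # second delivery -> invoice
]


def build_document_flow(rows):
    """Return (category per document, successors, predecessors)."""
    category = {}
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for vbelv, vbtyp_v, vbeln, vbtyp_n in rows:
        category[vbelv] = vbtyp_v
        category[vbeln] = vbtyp_n
        successors[vbelv].add(vbeln)    # edge: preceding -> subsequent
        predecessors[vbeln].add(vbelv)  # reverse edge for backward lookups
    return category, dict(successors), dict(predecessors)


category, successors, predecessors = build_document_flow(ROWS)
```

With this structure, `successors["0000001001"]` lists the deliveries created from the order, and `predecessors` allows walking back from an invoice to its delivery.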

The main goals of the thesis are:

  • The translation of mainstream document flows in SAP (e.g., VBFA for order-to-cash, BKPF/BSEG for accounts-payable/accounts-receivable) into a labeled property graph. The labeled property graph can be implemented in a graph database (e.g., Neo4j).
  • The identification of querying paradigms that lead to interesting insights, including:
    • The verification of logical-temporal rules such as the Four-Eyes-Principle.
    • The identification of the variants of the process and of rework.
    • The verification of aggregate properties (2-way and 3-way matching).
    • Conformance checking (e.g., missing documents in the flow).
  • The realization of these queries using graph-related techniques (connected components, forward/backward propagation).
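
The goals above can be illustrated with a small, self-contained Python sketch. It is not the method prescribed by the thesis: the node labels, property names (`amount`, `created_by`, `approved_by`), document identifiers, and function names are all hypothetical, and a plain in-memory dictionary stands in for a graph database. The sketch shows forward propagation, connected components, a Four-Eyes check, and a simplified 3-way match that also reports missing documents.

```python
from collections import defaultdict, deque

# Hypothetical labeled property graph: document id -> (label, properties).
NODES = {
    "PO1": ("PurchaseOrder", {"amount": 100.0, "created_by": "alice", "approved_by": "bob"}),
    "GR1": ("GoodsReceipt", {"amount": 100.0}),
    "IV1": ("Invoice", {"amount": 100.0}),
    "PO2": ("PurchaseOrder", {"amount": 50.0, "created_by": "carol", "approved_by": "carol"}),
    "IV2": ("Invoice", {"amount": 60.0}),
}
EDGES = [("PO1", "GR1"), ("PO1", "IV1"), ("PO2", "IV2")]  # preceding -> subsequent


def reachable(start, neighbors):
    """Forward propagation: every node reachable from start (breadth-first)."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def connected_components(nodes, edges):
    """Group documents that belong to the same flow, ignoring edge direction."""
    undirected = defaultdict(set)
    for a, b in edges:
        undirected[a].add(b)
        undirected[b].add(a)
    components, assigned = [], set()
    for node in nodes:
        if node not in assigned:
            component = {node} | reachable(node, undirected)
            assigned |= component
            components.append(component)
    return components


def four_eyes_violations(nodes):
    """Documents whose creator and approver are the same user."""
    return [doc for doc, (_, props) in nodes.items()
            if "created_by" in props and props["created_by"] == props.get("approved_by")]


def three_way_match(component, nodes, tolerance=0.01):
    """Compare total amounts of order, goods receipt, and invoice in one flow.

    Returns (matched, missing_labels); a missing label is a conformance issue.
    """
    totals = defaultdict(float)
    for doc in component:
        label, props = nodes[doc]
        totals[label] += props.get("amount", 0.0)
    required = ("PurchaseOrder", "GoodsReceipt", "Invoice")
    missing = [label for label in required if label not in totals]
    if missing:
        return False, missing
    values = [totals[label] for label in required]
    return max(values) - min(values) <= tolerance, []


successors = defaultdict(set)
for a, b in EDGES:
    successors[a].add(b)

components = connected_components(NODES, EDGES)
results = [three_way_match(c, NODES) for c in components]
violations = four_eyes_violations(NODES)
```

In this toy data, the flow around `PO1` passes the 3-way match, the flow around `PO2` lacks a goods receipt, and `PO2` violates the Four-Eyes check because the same user created and approved it. In a graph database, the same checks would be written as queries instead of Python functions.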

The student should be familiar with the Python programming language and have some knowledge of process mining. In particular, hands-on experience with process mining (practical courses, pro-seminar, BPI course, IDS course, APM course) and with data extraction is preferred. Experience with ERP systems (specifically, SAP) will be highly valued.


Supervisor: Wil van der Aalst

Alessandro Berti

For more information:
Send an e-mail to . Make sure to include detailed information about your background and scores for completed courses.

Chair of Process and Data Science
Ahornstr. 55 (Eingang Mies-van-der-Rohe-Str.), Erweiterungsbau E2
52074 Aachen
Phone: +49 241 80 21 901