Bachelor Thesis - Extracting Event Data from ERP Systems
IEEE Standard 1849-2016, IEEE Standard for eXtensible Event Stream (XES) for Achieving Interoperability in Event Logs and Event Streams, is a technical standard developed by the IEEE Standards Association. It standardizes "a language to transport, store, and exchange (possibly large volumes of) event data (e.g., for process mining)".
Roughly speaking, process mining aims to discover, monitor, and improve processes by extracting knowledge from event logs that represent actual process executions in a given setting. Process mining depends on the availability of accurate and unambiguous event logs that follow established standards. The purpose of this standard is to provide a generally acknowledged (W3C) XML format for the interchange of event data between information systems in many application domains on the one hand and analysis tools for such data on the other. As such, the standard aims to fix the syntax and the semantics of the event data that is, for example, transferred from the site generating the data to the site analyzing it. Consequently, if event data is transferred using the syntax described by this standard, its semantics will be clear and well understood at both sites.
However, in practice most log files produced by standard ERP systems, such as Oracle and SAP, do not conform to the XES standard. As a result, the semantics of their attributes, as well as their use within an analysis or KPI framework, is unclear. The goal of this thesis is to create an automated translator from “plain” log files into the XES format. Specifically, focusing on standard event log models for the procurement and sales processes, the translator will automatically create the XES log file. This will require defining and using XES extensions to capture the log attributes appropriately. The translation tool can be programmed in, e.g., Java or Python, at the student’s discretion. Note that the tool must be scalable to cope with real-life log files. Experience in handling and processing large data sets, XML, and process mining is therefore essential.
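As a rough illustration of the kind of translation meant here, the following Python sketch groups the rows of a CSV export by case and writes them out as XES traces and events. It is only a minimal sketch under stated assumptions: the column names `case_id`, `activity`, and `timestamp` and the function `csv_to_xes` are hypothetical, real ERP exports will look different, and only the standard Concept and Time extensions are used rather than the thesis-specific extensions still to be defined.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical column names; a real ERP export will use different ones.
CASE_COL, ACTIVITY_COL, TIME_COL = "case_id", "activity", "timestamp"


def csv_to_xes(csv_text: str) -> str:
    """Group CSV rows by case and emit one XES trace per case."""
    traces = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        traces[row[CASE_COL]].append(row)

    # Assumption: these attribute values are the ones commonly seen in
    # published XES logs; check them against the standard before relying on them.
    log = ET.Element("log", {"xes.version": "1.0", "xes.features": ""})

    # Declare the two standard extensions whose keys are used below.
    for name, prefix in (("Concept", "concept"), ("Time", "time")):
        ET.SubElement(log, "extension", {
            "name": name,
            "prefix": prefix,
            "uri": f"http://www.xes-standard.org/{prefix}.xesext",
        })

    for case_id, rows in traces.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
        # Assumption: timestamps are ISO 8601 strings in one uniform format,
        # so sorting them as text orders the events chronologically.
        for row in sorted(rows, key=lambda r: r[TIME_COL]):
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string",
                          {"key": "concept:name", "value": row[ACTIVITY_COL]})
            ET.SubElement(event, "date",
                          {"key": "time:timestamp", "value": row[TIME_COL]})

    return ET.tostring(log, encoding="unicode")
```

This sketch keeps the whole log in memory, which would not meet the scalability requirement for real-life log files; a tool for that purpose would more likely read and write the data in a streaming fashion.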
This thesis will be advised in cooperation with Rafael Accorsi at PricewaterhouseCoopers Switzerland.
Candidates should be familiar with process mining, have an interest in real-world information systems such as SAP and Oracle, and have good implementation skills (Java or Python). Experience in handling and processing large data sets, XML, and process mining is essential.
Giovanni Acampora, Autilia Vitiello, Bruno Di Stefano, Wil van der Aalst, Christian Gunther, Eric Verbeek: IEEE 1849: The XES Standard: The Second IEEE Standard Sponsored by IEEE Computational Intelligence Society [Society Briefs]. IEEE Computational Intelligence Magazine 12(2): 4-8 (2017)
Prof.dr.ir. Wil van der Aalst
Rafael Accorsi (PricewaterhouseCoopers Switzerland)
For More Information
To apply, send an e-mail to Mahsa Bafrani. Include detailed information about your background and your grades for all completed courses, and use “Bachelor Thesis: Extracting Event Data from ERP Systems” as the subject of the e-mail.