Master Thesis – Object-Centric Process Mining Using Celonis
Most process mining approaches assume that each event is characterized by three mandatory attributes (case identifier, activity name, and timestamp) and any number of optional attributes. This is the event notion supported by XES (https://xes-standard.org/), and it is natural when loading an event log from a single database table, CSV file, or XLS file. However, reality is often more complex.
One viewpoint is that events do refer to a single case identifier, but that events and cases interact. This viewpoint is adopted by the Celonis EMS and is referred to as the multi-event-log concept. Another viewpoint is that an event can refer to any number of objects rather than to one case. A "place order" event may refer to one order, five items, and one customer. A "deliver package" event may refer to multiple items from multiple orders of the same customer. The OCEL standard (http://ocel-standard.org/) provides a general format for interchanging object-centric event data with multiple object types.
OCEL is closer to the events as they actually occurred, whereas the Celonis multi-event-log concept is closer to the way information is stored in systems such as Oracle, Salesforce, and SAP. Multi-event-logs and OCEL are clearly related. However, the multi-event-log concept allows for links with different semantics, and there are currently multiple multi-event-log miners (interleaved, non-interleaved, match, and manual). This makes conversions non-trivial, and automated translations will almost certainly be incomplete. The whole area can be described as Object-Centric Process Mining (OCPM). OCPM is receiving increasing attention in the research community, in commercial systems, and in real-life applications of process mining.
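To make the difference between the two event notions concrete, the following minimal Python sketch contrasts a classic event (one case identifier) with an object-centric event (any number of objects per object type), using the "place order" example from above. The class names, field names, and identifiers such as `o1` and `i1` are assumptions chosen for illustration; they follow neither the XES nor the OCEL serialization.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class ClassicEvent:
    """Classic event notion: exactly one case identifier per event."""
    case_id: str
    activity: str
    timestamp: datetime


@dataclass
class ObjectCentricEvent:
    """Object-centric event notion: any number of objects per object type."""
    event_id: str
    activity: str
    timestamp: datetime
    # Maps an object type (e.g. "order") to the identifiers of the objects involved.
    objects: Dict[str, List[str]] = field(default_factory=dict)


def flatten(event: ObjectCentricEvent, object_type: str) -> List[ClassicEvent]:
    """Project one object-centric event onto a single object type.

    The event is copied once per object of the chosen type; if it refers
    to no object of that type, the result is empty.
    """
    return [
        ClassicEvent(case_id=obj_id, activity=event.activity, timestamp=event.timestamp)
        for obj_id in event.objects.get(object_type, [])
    ]


# Hypothetical "place order" event: one order, five items, one customer.
place_order = ObjectCentricEvent(
    event_id="e1",
    activity="place order",
    timestamp=datetime(2021, 6, 1, 9, 0),
    objects={
        "order": ["o1"],
        "item": ["i1", "i2", "i3", "i4", "i5"],
        "customer": ["c1"],
    },
)

item_view = flatten(place_order, "item")
order_view = flatten(place_order, "order")
```

Choosing "item" as the case notion turns the single event into five classic events, while choosing "order" yields exactly one. Neither view alone retains the information that all three object types were involved in the same event.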
The goal of this master thesis is to conceptualize, formalize, and compare the different concepts for OCPM. Moreover, automated conversions between the different viewpoints will be designed and implemented on top of the Celonis EMS and existing Python libraries.
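One simple direction of such a conversion can be sketched as follows: an object-centric log is flattened onto one object type, and the sketch counts how many events are dropped and how many extra copies are created along the way. This is a generic flattening under assumed names and an assumed dictionary layout; it is not the Celonis multi-event-log format, one of its miners, or the OCEL serialization.

```python
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]


def flatten_log(events: List[Event], object_type: str) -> Tuple[List[Event], Dict[str, int]]:
    """Flatten an object-centric log onto one object type.

    Returns the flattened rows (one per event/object pair) together with
    two counters: events that were dropped because they refer to no object
    of the chosen type, and extra copies created by replication.
    """
    rows: List[Event] = []
    dropped = 0
    extra_copies = 0
    for event in events:
        object_ids = event["objects"].get(object_type, [])
        if not object_ids:
            dropped += 1  # the event is invisible from this viewpoint
            continue
        extra_copies += len(object_ids) - 1  # one copy per additional object
        for object_id in object_ids:
            rows.append({
                "case_id": object_id,
                "activity": event["activity"],
                "timestamp": event["timestamp"],
            })
    # Order rows per case and by time; ISO 8601 strings sort chronologically.
    rows.sort(key=lambda row: (row["case_id"], row["timestamp"]))
    return rows, {"dropped_events": dropped, "extra_copies": extra_copies}


# Hypothetical toy log with two object types.
log = [
    {"activity": "place order", "timestamp": "2021-06-01T09:00",
     "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
    {"activity": "pick item", "timestamp": "2021-06-01T09:30",
     "objects": {"item": ["i1"]}},
    {"activity": "send invoice", "timestamp": "2021-06-01T10:00",
     "objects": {"order": ["o1"]}},
    {"activity": "deliver package", "timestamp": "2021-06-01T11:00",
     "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
]

item_rows, item_stats = flatten_log(log, "item")
order_rows, order_stats = flatten_log(log, "order")
```

In this toy log, the item view loses the "send invoice" event and duplicates two events, whereas the order view loses "pick item" and duplicates nothing. Counters of this kind are one possible way to quantify what a translation loses or replicates.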
Concretely, the assignment consists of the following parts:
- Creating a taxonomy of Object-Centric Process Mining (OCPM) approaches and formalizing the different concepts.
- Comparing the different approaches, i.e., the multi-event-log miners (e.g., interleaved, non-interleaved, match, and manual), OCEL-based approaches, and alternatives.
- Designing translation schemes between the different approaches. This includes OCEL to various types of multi-event-logs in Celonis and vice versa. Note that Celonis uses complete data models rather than a flat event log.
- Implementing these approaches in Celonis. PyCelonis, a Python package for connecting to Celonis from Python, is helpful here. Some initial bridges have already been implemented to ensure feasibility.
- Conducting experiments using event data from different systems. What information is lost? What information is replicated? How well do the conversions scale?
- Providing generic insights and findings. This may range from ideas for new process mining techniques to recommendations.
Prerequisites:
- Data science, software engineering, and conceptualization/formalization skills to systematically define and compare the different OCPM approaches, implement a prototype solution, and evaluate the approach.
- Programming / scripting languages
- SQL and Celonis PQL knowledge is a plus
Literature:
- Wil M. P. van der Aalst, Alessandro Berti: Discovering Object-Centric Petri Nets. Fundam. Informaticae 175(1-4): 1-40 (2020)
- Wil M. P. van der Aalst: Concurrency and Objects Matter! Disentangling the Fabric of Real Operational Processes to Create Digital Twins. ICTAC 2021: 3-17
- Alessandro Berti, Wil M. P. van der Aalst: Extracting Multiple Viewpoint Models from Relational Databases. CoRR abs/2001.02562 (2020)
- T. Vogelgesang, J. Kaufmann, D. Becher, R. Seilbeck, J. Geyer-Klingeberg, M. Klenk: Celonis PQL: A Query Language for Process Mining. Process Querying Methods. Springer, 2021.
Supervisors:
prof.dr.ir. Wil van der Aalst (formal thesis supervisor)
Alessandro Berti, MSc. (RWTH).
Sabeth Steiner, MSc., Vice President Product Management of Celonis Studio at Celonis.
Dr. Nils Gerken, Team Lead Process Mining Consulting at Celonis.
Robert Seilbeck, MSc., Head of Backend Development at Celonis.
This is a joint thesis project with Celonis. Do not approach people at Celonis directly without first discussing this with Dr. Seran Uysal (email@example.com‐aachen.de) and/or prof. Wil van der Aalst (firstname.lastname@example.org). It is NOT allowed to organize your own thesis project in a company! We only do joint thesis projects with companies we already work with (e.g., Celonis). The project needs to be defined in the context of this collaboration to ensure good supervision and avoid later confusion.
Since external assignments are particularly challenging, they are open only to strong students. Meeting the expectations of Celonis while ensuring a good academic level will be demanding. Hence, we only consider students with an average grade of 2.0 or better for such assignments.
If you are interested, make sure to include detailed information about your background (including a detailed CV), scores for completed courses, and your motivation to do this project. After an initial screening from the RWTH side, we will connect you to the person responsible on the Celonis side.
Celonis believes that every company can unlock its full execution capacity using process mining. The company has grown 5,000% in 4 years and 300% in the past year. Celonis is now considered a decacorn, having raised $1 billion in its most recent funding round in June 2021, which valued the company at more than $11 billion. Powered by its market-leading process mining core, the Celonis Execution Management System provides a set of applications as well as developer studio and platform capabilities for business executives and users to eliminate billions in corporate inefficiencies. Celonis has thousands of customers, including ABB, AstraZeneca, Bosch, Coca-Cola, Citibank, Danaher Corporation, Dell, GSK, John Deere, L’Oréal, Siemens, Uber, Vodafone, and Whirlpool. Celonis is headquartered in Munich, Germany, and New York City, USA, and has 15 offices worldwide.