Master Thesis – Automated Machine Learning for Process Mining Using Celonis



Alessandro Berti

Software Engineer


+49 241 80 21949




Process mining techniques can be used to generate machine-learning problems. Typically, process discovery and conformance checking are used to align the event data with a discovered or hand-made process model. Many generic questions can then be answered with standard machine-learning techniques such as neural networks and regression. Examples are:

  • Explaining or predicting a specific bottleneck in the process.
  • Explaining or predicting a specific deviation in the process.
  • Explaining or predicting a routing decision in the process.
  • Explaining or predicting the overall flow time of a process.
  • Explaining or predicting the flow time between two activities.
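As an illustration of the last two questions, the target variable (a flow time) can be derived directly from an event log before any model is trained. The following is a minimal sketch in pandas, assuming a toy event log with hypothetical column names (case_id, activity, timestamp) and made-up activities. The helper flow_time_hours is illustrative only and is not part of Celonis or PyCelonis.

```python
import pandas as pd

# Hypothetical toy event log; column and activity names are assumptions.
events = pd.DataFrame(
    {
        "case_id": ["c1", "c1", "c1", "c2", "c2", "c2"],
        "activity": [
            "Create Order", "Approve Order", "Ship Goods",
            "Create Order", "Approve Order", "Ship Goods",
        ],
        "timestamp": pd.to_datetime(
            [
                "2021-01-01 08:00", "2021-01-01 12:00", "2021-01-03 08:00",
                "2021-01-02 09:00", "2021-01-02 10:00", "2021-01-02 21:00",
            ]
        ),
    }
)


def flow_time_hours(log, source, target):
    """Hours from the first `source` event to the first later `target` event, per case."""
    # First occurrence of the source activity in each case.
    first_source = (
        log[log["activity"] == source]
        .groupby("case_id")["timestamp"]
        .min()
        .rename("t_source")
    )
    # Target events that happen at or after that source event.
    candidates = log[log["activity"] == target].join(first_source, on="case_id")
    candidates = candidates[candidates["timestamp"] >= candidates["t_source"]]
    first_target = (
        candidates.groupby("case_id")["timestamp"].min().rename("t_target")
    )
    # Keep only cases in which both activities occur in the right order.
    both = pd.concat([first_source, first_target], axis=1).dropna()
    return (both["t_target"] - both["t_source"]).dt.total_seconds() / 3600.0


overall = flow_time_hours(events, "Create Order", "Ship Goods")
approval_to_ship = flow_time_hours(events, "Approve Order", "Ship Goods")
print(overall.to_dict())           # {'c1': 48.0, 'c2': 12.0}
print(approval_to_ship.to_dict())  # {'c1': 44.0, 'c2': 11.0}
```

The resulting per-case values could serve as the label column of a regression problem. How repeated activities are handled (first, last, or every occurrence) is a design choice that the sketch fixes arbitrarily to the first occurrence.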

The Celonis software supports all process mining tasks (process discovery, conformance checking, performance analysis, automated actions, etc.) and includes a gateway to Python-based machine-learning libraries. Celonis Machine Learning is integrated through Jupyter Notebooks and the latest Python packages. PyCelonis is a Python package for connecting to Celonis from Python. Although this provides great flexibility and expressiveness, users of Celonis Machine Learning need to code and configure parts themselves even when the problems are fairly generic. This assignment consists of the following parts:

  • Creating a taxonomy of machine-learning questions for process mining (critically reflecting on the questions that are easy but have no business value).
  • Extracting so-called situation feature tables from Celonis using PyCelonis and PQL. A situation feature table consists of situations (e.g., a decision, a deviation, an activity instance, a case) and for each situation there are standard features.
  • Automatically selecting and applying a suitable machine-learning technique (regression, decision trees, random forest, neural networks) to a given situation feature table.
  • Visualizing the results in a process-oriented manner, e.g., link to activities in a BPMN model. This can be achieved on the Jupyter Notebook / Python side or by returning results back to Celonis.
  • Evaluating the approach using several data sets available for demonstration purposes (and optionally new real-life projects).
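The automatic selection step can be pictured as a small model-comparison loop over a situation feature table. The following is a minimal sketch under stated assumptions, not the method expected from the thesis: the table is synthetic (in the project it would be extracted from Celonis with PyCelonis and PQL, which is not shown here), the feature names are hypothetical, and select_model is an illustrative helper that simply picks the candidate with the best cross-validated R² using scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic situation feature table: one row per situation (here: a case).
# All feature names and the generating relation are assumptions.
rng = np.random.default_rng(0)
n = 200
table = pd.DataFrame(
    {
        "num_events": rng.integers(3, 20, size=n),
        "num_rework_loops": rng.integers(0, 4, size=n),
        "order_value": rng.uniform(100, 10_000, size=n),
    }
)
# Hypothetical target: flow time in hours, linear in two features plus noise.
table["flow_time_hours"] = (
    5.0 * table["num_events"]
    + 24.0 * table["num_rework_loops"]
    + rng.normal(0, 5, size=n)
)


def select_model(table, target, candidates, n_splits=5):
    """Return (best name, best model fitted on all rows, mean CV R^2 per candidate)."""
    X = table.drop(columns=[target])
    y = table[target]
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = {
        name: cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
        for name, model in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, candidates[best].fit(X, y), scores


candidates = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}
best_name, best_model, scores = select_model(table, "flow_time_hours", candidates)
print(best_name)
print({name: round(score, 3) for name, score in scores.items()})
```

A classification question (for example, predicting a deviation or a routing decision) would follow the same pattern with classifiers and a different scoring metric. A real solution would also need to handle categorical features, missing values, and hyperparameter tuning, all of which this sketch leaves out.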


  • Data science, machine learning, software engineering, and conceptualization/formalization skills to systematically define the taxonomy of machine-learning questions for process mining, implement a prototype solution, and evaluate the approach.
  • Programming / scripting languages
    • Python
    • Jupyter Notebook
    • SQL and Celonis PQL knowledge is a plus


Supervisor Wil van der Aalst (formal thesis supervisor)

Daily advisor(s)

Dr. Nikou Guennemann, Team Lead Machine Learning & Principal ML Engineer at Celonis.


This is a joint thesis project with Celonis. Do not approach people at Celonis directly without first discussing this with Dr. Seran Uysal and/or Prof. Wil van der Aalst. It is NOT allowed to organize your own thesis project in a company! We only do joint thesis projects with companies we already work with (e.g., Celonis). The project needs to be defined in the context of this collaboration to ensure good supervision and avoid later confusion.

Since external assignments are particularly challenging, we will only allow good students to do this. Concurrently meeting the expectations of Celonis and ensuring a good academic level will be demanding. Hence, we only consider students with an average of 2.0 or better for such assignments.

If you are interested, make sure to include detailed information about your background (including a detailed CV), scores for completed courses, and your motivation to do this project. After an initial screening from the RWTH side, we will connect you to the person responsible on the Celonis side.

About Celonis

Celonis believes that every company can unlock its full execution capacity using process mining. The company has grown 5,000% in 4 years and 300% in the past year. Celonis now counts as a decacorn, having raised $1 billion in its most recent funding round in June 2021, valuing the company at more than $11 billion. Powered by its market-leading process mining core, the Celonis Execution Management System provides a set of applications as well as developer studio and platform capabilities for business executives and users to eliminate billions in corporate inefficiencies. Celonis has thousands of customers, including ABB, AstraZeneca, Bosch, Coca-Cola, Citibank, Danaher Corporation, Dell, GSK, John Deere, L’Oréal, Siemens, Uber, Vodafone and Whirlpool. Celonis is headquartered in Munich, Germany and New York City, USA and has 15 offices worldwide.
