Master Thesis - Protecting Event Data and Process Mining Results Using Generalization Techniques



Majid Rafiei

Research Assistant


+49 241 80 21948




Recent years have seen tremendous advances in process mining, reflected in the growing number of commercial process mining tools available today. There are over 25 commercial products supporting process mining (Celonis, Disco, Minit, myInvenio, ProcessGold, QPR, etc.). All of them support process discovery and can be used to address compliance and performance issues. The three main activities in process mining are process discovery, conformance checking, and process enhancement. All of these activities are performed on event data, which may contain directly or indirectly sensitive information.

Responsible Data Science (RDS) is a set of technical algorithms and social laws for ensuring Fairness, Accuracy, Confidentiality, and Transparency (FACT) throughout the whole data science pipeline. In this project, the focus is on confidentiality in process mining: how can questions be answered without revealing secrets? The goal is to use generalization/abstraction techniques to protect both the event data and the process mining results. We assume two parties: the business owner and the process miner. Process mining algorithms are run on the process miner's side as a service (PMaaS). The process miner should never have access to the entire event data, and should not be able to reconstruct it from the generalized event data or the partial detailed event data it receives in order to apply process mining algorithms. For example, to discover a process model, the activities are first generalized; the event data with the generalized activities is then sent to the process miner, where the desired discovery algorithm is applied, and the result is sent back to the owner. Users have access only to the generalized result, and any request to see the details must pass an authorization phase and comply with privacy requirements.
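
The discovery example can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the activity names, the GENERALIZATION mapping, and the function names are hypothetical, and a simple directly-follows count stands in for "the desired discovery algorithm". It shows only the data flow between the two parties and is not a privacy guarantee.

```python
from collections import Counter

# Hypothetical activity hierarchy (an assumption for this sketch):
# detailed activity -> generalized activity.
GENERALIZATION = {
    "register": "administration",
    "discharge": "administration",
    "blood test": "diagnostics",
    "x-ray": "diagnostics",
    "surgery": "treatment",
    "chemotherapy": "treatment",
}


def generalize_log(log, mapping):
    """Owner side: replace every activity by its generalized label.

    Activities without an entry in the mapping become "other", so no
    detailed label leaves the owner side by accident.
    """
    return [[mapping.get(activity, "other") for activity in trace] for trace in log]


def directly_follows(log):
    """Miner side: count how often one activity directly follows another.

    This is a stand-in for a real discovery algorithm; it only ever
    sees the generalized traces.
    """
    dfg = Counter()
    for trace in log:
        for current, following in zip(trace, trace[1:]):
            dfg[(current, following)] += 1
    return dfg


# Detailed event log, kept by the business owner (one list per case).
detailed_log = [
    ["register", "blood test", "surgery", "discharge"],
    ["register", "x-ray", "chemotherapy", "discharge"],
    ["register", "blood test", "x-ray", "surgery", "discharge"],
]

# Step 1 (owner): generalize before anything is sent out.
generalized_log = generalize_log(detailed_log, GENERALIZATION)

# Step 2 (miner): run discovery on the generalized log only.
dfg = directly_follows(generalized_log)

# Step 3 (owner): receive the generalized result.
for (source, target), count in sorted(dfg.items()):
    print(f"{source} -> {target}: {count}")
```

In this sketch the miner can learn that diagnostics are usually followed by treatment, but cannot tell a blood test from an x-ray. Access to the detailed log would remain on the owner side, behind the authorization phase described above.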



Good programming skills, specifically in Python and in web development (JavaScript, HTML, CSS, etc.). Knowledge of basic computer science concepts and an interest in process mining and responsible data science.

Supervisor: Wil van der Aalst


Majid Rafiei

For more information

Contact: Make sure to include detailed information about your background and your grades in related courses.