Methods to deal with Discrimination in Process Mining Data
Student: Lukas Hörsch
Title: Methods to deal with Discrimination in Process Mining Data
Supervisor: Mahnaz Qafari
1st Examiner: Prof. Dr. Wil M.P. van der Aalst
2nd Examiner: Prof. Dr. Markus Strohmaier
Summary
Process Mining aims to improve business processes by applying data mining techniques to the event data gathered by an organization’s information systems. This data is organized in event logs, which allow the process to be analyzed at the case level. To enhance a process, the results of this analysis can be used to identify the root causes of performance or compliance problems. Such an analysis may conclude that a single person or a specific group is to blame for these problems. However, data mining techniques produce results that are based on correlation, and correlation does not imply causation, so decisions based on these results may be unfair. It must therefore be ensured that a conclusion neither rests on discrimination nor merely states obvious facts that cannot be changed. This thesis puts forward the detection of discrimination in event data as a major challenge to ensuring fairness in Process Mining. Detecting discrimination in an event log takes two steps: extracting the relevant information from the event log and measuring discrimination on the extracted data. Both steps are realized in a Python application that serves as a framework for different data mining methods that can detect discrimination. Two such methods are implemented in the application. An evaluation finds that one of them can be used to detect discrimination in event data. The evaluation also identifies problems that limit the expressiveness of the implemented approaches and that might also apply to similar methods implemented in the future.
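The two steps named in the summary, extraction and measurement, can be illustrated with a minimal Python sketch. It is not the thesis's application and does not reproduce either of its two methods, which the summary does not name. The event fields (`case`, `group`, `time`), the delay threshold, and the measure (a plain difference in delay rates between a protected group and all other cases) are assumptions chosen only for illustration.

```python
from collections import defaultdict


def extract_cases(event_log, threshold):
    """Step 1 (assumed form): aggregate events into one record per case."""
    times = defaultdict(list)
    groups = {}
    for event in event_log:
        times[event["case"]].append(event["time"])
        groups[event["case"]] = event["group"]
    # A case counts as delayed if its duration exceeds the threshold.
    return [
        {"case": case, "group": groups[case], "delayed": max(ts) - min(ts) > threshold}
        for case, ts in times.items()
    ]


def delay_rate_difference(cases, protected_group):
    """Step 2 (assumed measure): delay rate of the protected group
    minus the delay rate of all other cases."""
    protected = [c for c in cases if c["group"] == protected_group]
    others = [c for c in cases if c["group"] != protected_group]
    if not protected or not others:
        raise ValueError("both groups need at least one case")

    def rate(rows):
        return sum(r["delayed"] for r in rows) / len(rows)

    return rate(protected) - rate(others)


# Hypothetical toy event log: two events per case, four cases, two groups.
event_log = [
    {"case": "c1", "group": "A", "time": 0}, {"case": "c1", "group": "A", "time": 9},
    {"case": "c2", "group": "A", "time": 0}, {"case": "c2", "group": "A", "time": 3},
    {"case": "c3", "group": "B", "time": 0}, {"case": "c3", "group": "B", "time": 2},
    {"case": "c4", "group": "B", "time": 1}, {"case": "c4", "group": "B", "time": 4},
]
cases = extract_cases(event_log, threshold=5)
difference = delay_rate_difference(cases, protected_group="A")
```

A nonzero difference in this sketch is only a correlation between group membership and delay. As the summary notes, it does not by itself show that the group causes the delay.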