Implementation of automated feature selection algorithm
Student: Georgina Hermes
Title: Implementation of automated feature selection algorithm
Supervisor: Mahnaz Qafari
1st Examiner: Prof. Wil M.P. van der Aalst
2nd Examiner: Prof. Dr. Martin Grohe
Summary
One of the main purposes of process mining is to improve processes that are derived from event logs, a task known as process enhancement. This requires knowing the current state of the process (process discovery), its strengths and weaknesses (process analysis), and how to improve it (process enhancement). Knowing the causal structure among the features of a process helps both to understand its behavior and to anticipate the effect of an intervention on it. However, a process typically has many features, and causal discovery relies on statistical independence tests, which can be time-consuming. In this work, we add a preprocessing step before causal structure discovery, aiming to improve both the performance and the outcome of the causal analysis. To this end, we implement an existing feature selection algorithm in the ProM framework to prune the log down to its meaningful features. On the resulting log, which contains only the selected features, we search for causal graphs with Tetrad, a causal model discovery tool. We then compare the causal graph obtained from the original log with the one obtained from the selected-feature log. The results show that the selected-feature log outperforms the original log in terms of complexity, computation time, readability, and causal relations.