Process Discovery using Python (Master’s)
Course Details
Language: The language of the course is English; therefore, all meetings and the written reports will be in English.
Introduction
Process Mining is a growing branch of Data Science that analyzes event data recorded in information systems from the process perspective. Investments in Process Mining from public and private companies are steadily increasing and are expected to more than double in the next five years. Hence, a good knowledge of Process Mining is an important skill for Data Scientists.
Process discovery is the first and one of the most challenging process mining tasks. Based on an event log, a process model is constructed that captures the behavior seen in the log. This software lab course is designed to give students hands-on experience with process discovery. The course assignments include implementing algorithms that either discover a process model or support its discovery. Process discovery involves the core discovery algorithms and their visualizations.
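To make the idea of constructing a model from an event log concrete, the following is a minimal sketch in plain Python. It builds a directly-follows graph, one of the simplest process representations, by counting how often one activity is immediately followed by another within a case. The function name `discover_dfg`, the toy log, and its activity names are assumptions made for illustration; this is not necessarily one of the algorithms assigned in the course, and it does not use PM4Py.

```python
from collections import Counter


def discover_dfg(event_log):
    """Count how often each activity is directly followed by another.

    `event_log` is assumed to be a list of traces, where each trace is the
    ordered list of activity names recorded for one case.
    """
    dfg = Counter()
    for trace in event_log:
        # Pair every activity with its immediate successor in the same trace.
        for current, following in zip(trace, trace[1:]):
            dfg[(current, following)] += 1
    return dfg


# Hypothetical toy log with three cases (illustrative data only).
toy_log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "approve"],
]

dfg = discover_dfg(toy_log)
for (source, target), count in sorted(dfg.items()):
    print(f"{source} -> {target}: {count}")
```

Each counted pair corresponds to an edge in the graph, and the count can serve as the edge weight when the result is visualized.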
This software lab course includes tutorials on the Software Development Life Cycle, common industry practices, and Process Discovery. Students are expected to follow software engineering principles throughout the course term to achieve the milestones of each assignment. The course will use Python as the core implementation language.
Important Dates
Registration: through the central registration process / Supra in January 2022
Kick-off Meeting: TBD
Location: Zoom (details will be provided later)
Introductory Sessions
All of the above topics will be introduced briefly. Participation is mandatory throughout the course. In the introductory sessions, topics will be assigned to the students, and the deadlines for submitting the report and the implementation will be discussed. Groups will be formed to work on the assignments.
Student work structure:
Students will be required to understand and implement the assignment requirements in Python and provide proper visualizations. A proper software development life cycle (SDLC) will be followed during this phase to track the development. The details of the methodology will be communicated in the introductory session. Each student must individually produce a written report on the implementation, its advantages, and its issues.
Grading
Grading will take into account the written report and the Python implementation.
Prerequisites
- Software Engineering knowledge (design, development, and testing)
- Prior programming experience (not necessarily in Java or Python)
- Coursera "Process Mining: Data Science in Action" course
- Business Process Intelligence Course
- Interest in learning and coding in Python
Recommended
- Introduction to Data Science Course
- Advanced Process Mining Course
- Seminar - Selected Topics in Process Mining
- Pro-Seminar - Data Preprocessing
Registration
The registration is carried out by the central registration process.
To increase your chance of being selected for this lab, please state your qualifications, experience, and overall grades in your enrolled study program in as much detail as possible. Please specify clearly why you are a suitable candidate for this lab.
You will be informed about the first meeting in the weeks after the registration closes. All communications will be through the Moodle portal.
Resources:
- PM4Py: installation tutorial
- PM4Py: documentation
- Python Tutorial (The Python Foundation)
- Interactive Tutorial covering the basics of Python
- Introduction to Git
- Introduction to Sprint Planning / SCRUM
- Introduction to Unit Testing
- (Advanced) Deploying a Flask application on Docker