Process Discovery Using Python

Contact

Madhavi Shankar

Name

Madhavi Shankara Narayana

Software Developer

Phone

work
+49 241 80 21949

E-Mail


Organisational Information

  • Room: 6329, seminar room, Chair of Process and Data Science, Ahornstr. 55, 52074 Aachen
  • Lecturer: Prof. Dr. Wil van der Aalst
  • Coordinating teaching assistant: Madhavi Shankara Narayana, M.Sc.
  • Registration: During the central registration process in July, using this link
  • Language: The language of the course is English; therefore, all sessions, written reports, and oral questions will be in English.

Important Dates

The mandatory meetings are the following:

  • Thursday 8 November 2018, 14:30-16:00 CET
  • Monday 19 November 2018, 14:30-16:00 CET
  • Monday 26 November 2018, 14:30-16:00 CET
  • Monday 10 December 2018, 14:30-16:00 CET
  • Thursday 20 December 2018, 14:30-16:00 CET
  • Thursday 17 January 2019, 14:30-16:00 CET
  • Monday 21 January 2019, 14:30-16:00 CET

Attendance at the meetings is mandatory. It is acceptable to skip at most 1 of the 7 meetings (excluding the kick-off); a medical certificate is NOT required, but the skipped meeting is still expected to be communicated.

The milestones for the project are the following:

  • Monday 29 October 2018, 23:59:59 CET / Documentation for project initiation
  • Monday 12 November 2018, 23:59:59 CET / Documentation for requirements analysis
  • Thursday 22 November 2018, 23:59:59 CET / Design and POC of web services and client
  • Wednesday 05 December 2018, 23:59:59 CET / End of Sprint 1
  • Wednesday 19 December 2018, 23:59:59 CET / End of Sprint 2
  • Tuesday 15 January 2019, 23:59:59 CET / End of Sprint 3
  • Wednesday 23 January 2019, 23:59:59 CET / Testing, assessment and documentation
  • UP TO FRIDAY 01 FEBRUARY 2019 / Final oral examination

It is possible to withdraw from the course until Thursday 08 November 2018, 23:59:59.

The withdrawal has to be communicated to the following e-mail address:

Contact hours available by appointment:

The proposed contact hours are listed below. It is highly advised to request an appointment with the teaching assistant, preferably on the proposed dates and hours.

Changes to the contact hours will be communicated to students by e-mail and promptly posted on this website.

Preferred meeting slots - Mondays   | Preferred meeting slots - Wednesdays   | Preferred meeting slots - Thursdays
Monday 22 October 2018 14:15-15:45  | Wednesday 24 October 2018 14:15-15:45  | -
-                                   | Wednesday 07 November 2018 14:15-15:45 | Thursday 08 November 2018 14:15-15:45
-                                   | Wednesday 14 November 2018 14:15-15:45 | Thursday 15 November 2018 14:15-15:45
-                                   | Wednesday 21 November 2018 14:15-15:45 | Thursday 22 November 2018 14:15-15:45
Monday 26 November 2018 14:15-15:45 | Wednesday 28 November 2018 14:15-15:45 | Thursday 29 November 2018 14:15-15:45
Monday 03 December 2018 14:15-15:45 | Wednesday 05 December 2018 14:15-15:45 | Thursday 06 December 2018 14:15-15:45
Monday 10 December 2018 14:15-15:45 | -                                      | Thursday 13 December 2018 14:15-15:45
-                                   | Wednesday 19 December 2018 14:15-15:45 | Thursday 20 December 2018 14:15-15:45
Monday 07 January 2019 14:15-15:45  | -                                      | Thursday 10 January 2019 14:15-15:45
Monday 14 January 2019 14:15-15:45  | Wednesday 16 January 2019 14:15-15:45  | Thursday 17 January 2019 14:15-15:45
Monday 21 January 2019 14:15-15:45  | Wednesday 23 January 2019 14:15-15:45  | Thursday 24 January 2019 14:15-15:45

The final oral examination is scheduled by personal appointment. It should be taken before Friday 01 February 2019.

Seminar Details

Introduction

Process Mining is a growing branch of Data Science that analyzes event data recorded in Information Systems from the process perspective.

Investments in Process Mining from public and private companies are steadily increasing, and are expected to more than double in the next five years.

Hence a good knowledge of Process Mining is an important skill for Data Scientists.

Process discovery is the first and one of the most challenging process mining tasks. Based on an event log, a process model is constructed that captures the behaviour seen in the log.
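As a rough illustration of the starting point for discovery, the sketch below counts how often one activity is directly followed by another in an event log. It is a minimal example under stated assumptions, not course material: the log is represented as a plain list of traces, and the function name `directly_follows` and the toy activities are hypothetical.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in log:
        # Pair each event with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy log: each trace is the ordered activity list of one case.
toy_log = [
    ["register", "check", "pay"],
    ["register", "check", "pay"],
    ["register", "pay"],
]
dfg = directly_follows(toy_log)
```

Many discovery algorithms start from counts of this kind and then derive a process model from them.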

This Software Lab course includes tutorials, guest lectures, and hands-on sessions on the existing Process Discovery algorithms using the ProM Framework. The course also includes implementing the Discovery algorithms in Python, following software engineering principles.

Introductory Sessions

All the above topics will be introduced briefly. Participation is mandatory throughout the course. In the introductory sessions, topics will be assigned to the students, and the deadlines for submitting the report and the implementation will be discussed.

First Part

Students are expected to understand the algorithms and their current implementations in Java, and to know the basics of Process Mining and how tools like ProM work. Some of the algorithms to be discussed are:

  1. Naive α-algorithm
  2. Heuristic Miner
  3. Inductive Miner

In addition, other approaches and variants of the algorithms will be discussed.
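To give a feel for what two of these algorithms compute, here is a minimal sketch of the ordering relations (footprint) used by the α-algorithm and of the basic dependency measure used by the Heuristic Miner for two distinct activities. The function names and the toy log are assumptions made for illustration; the sketch covers only these two building blocks, not the full algorithms or the ProM implementations.

```python
from collections import Counter


def directly_follows(log):
    """Count directly-follows pairs (a, b) over all traces."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def footprint(log):
    """Classify every activity pair as causal, parallel, or unrelated."""
    df = directly_follows(log)
    activities = sorted({a for trace in log for a in trace})
    relations = {}
    for a in activities:
        for b in activities:
            ab, ba = df[(a, b)] > 0, df[(b, a)] > 0
            if ab and not ba:
                relations[(a, b)] = "->"   # a causes b
            elif ba and not ab:
                relations[(a, b)] = "<-"   # b causes a
            elif ab and ba:
                relations[(a, b)] = "||"   # a and b are parallel
            else:
                relations[(a, b)] = "#"    # a and b never follow each other
    return relations


def dependency(df, a, b):
    """Heuristic Miner dependency measure for two distinct activities."""
    return (df[(a, b)] - df[(b, a)]) / (df[(a, b)] + df[(b, a)] + 1)


# Hypothetical toy log with one parallel pair (b, c) and one choice (e).
toy_log = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["a", "e", "d"]]
fp = footprint(toy_log)
df = directly_follows(toy_log)
```

The footprint treats every observed pair as equally reliable, while the dependency measure weighs frequencies, which is one reason the Heuristic Miner copes better with noisy logs.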

Second Part

Students will be required to implement the discovery algorithms in Python, in groups of at most 4 people, and to provide proper visualizations. A proper software development life cycle (SDLC) will be followed during this phase to track the development. The details of the methodology will be communicated in the introductory session.
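One possible way to approach the visualization part, shown here only as a sketch, is to export a directly-follows graph as Graphviz DOT text that any DOT renderer can draw. The function name `to_dot`, the graph name, and the example counts are hypothetical; the course may prescribe a different visualization approach.

```python
def to_dot(dfg):
    """Render directly-follows counts as Graphviz DOT text."""
    lines = ["digraph dfg {", "  rankdir=LR;"]
    for (a, b), n in sorted(dfg.items()):
        # One edge per directly-follows pair, labelled with its frequency.
        lines.append(f'  "{a}" -> "{b}" [label="{n}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical counts, e.g. produced by a directly-follows computation.
example = {("register", "check"): 2, ("check", "pay"): 2, ("register", "pay"): 1}
dot_text = to_dot(example)
```

Keeping the export as plain text separates the discovery logic from the rendering tool, which makes both easier to test.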

A written report on the implementation, its advantages and issues should be produced individually by the students.

Grading

The grade will take into account the written report, the implemented Python model, and oral questions on the report (around 15-30 minutes per student).

Prerequisites

  • Software Engineering knowledge (design, development and testing)
  • Prior programming experience (not necessarily in Java or Python)
  • Interest in learning and coding in Python

Optional

  • Coursera "Process Mining: Data Science in Action" course
  • BPI Course

Resources

Registration

The registration is carried out by the central registration process in July 2018.

To increase your chance of being selected for this lab, please state your qualifications, experience, and overall grades in your enrolled study program in as much detail as possible. Please state clearly why you are a suitable candidate for this lab.

You will be informed about the first meeting in the weeks after the registration closes.