Process Conformance Checking in Python (Master’s, WS 2018/2019)

Contact

Alessandro Berti
Software developer
Phone (work): +49 241 80 21949
E-Mail


Organisational Information

  • Room: 6329, seminar room, Chair of Process and Data Science, Ahornstr. 55, 52074 Aachen
  • Lecturer: Prof. Dr. Wil van der Aalst
  • Coordinating teaching assistant: Alessandro Berti
  • Registration: During the central registration process in July, using this link.
  • Language: The course is held in English; all sessions, written reports, and oral questions will be in English.

Important Dates

The mandatory meetings are the following:

  • Friday 9 November 2018, 15:30-17:00 CET
  • Friday 16 November 2018, 15:30-17:00 CET
  • Friday 23 November 2018, 15:30-17:00 CET
  • Friday 30 November 2018, 15:30-17:00 CET
  • Friday 07 December 2018, 15:30-17:00 CET
  • Friday 11 January 2019, 15:30-17:00 CET
  • Friday 18 January 2019, 15:30-17:00 CET

Students may miss one meeting; no medical certificate is required.

The meetings take place in the PADS seminar room (room 6329). Any change to the meeting dates will be communicated to the students by e-mail and promptly reported on this website.

The milestones for the project are the following:

  • Monday 22 October 2018, 23:59:59 CET: Documentation for project initiation (4/100 in the final grade formula)
  • Monday 12 November 2018, 23:59:59 CET: Documentation for requirements analysis (4/100 in the final grade formula)
  • Thursday 22 November 2018, 23:59:59 CET: Design and proof of concept (PoC) of web services and client (4/100 in the final grade formula)
  • Monday 03 December 2018, 23:59:59 CET: Sprint 1 and documentation (4/100 in the final grade formula)
  • Friday 21 December 2018, 23:59:59 CET: Sprint 2 and documentation (4/100 in the final grade formula)
  • Friday 11 January 2019, 23:59:59 CET: Sprint 3 and documentation (4/100 in the final grade formula)
  • Friday 18 January 2019, 23:59:59 CET: Testing, assessment and documentation (4/100 in the final grade formula)

Students may withdraw from the course until Thursday 01 November 2018, 23:59:59.

The withdrawal must be announced by e-mail to the following address:

Please make sure that you receive a reply to your withdrawal e-mail.

Kick-off Meeting:

Wednesday 10 October 2018, 15:30-17:00

Room 6329, seminar room, Chair of Process and Data Science, Ahornstraße 55, 52074 Aachen

The proposed contact hours are listed below. Students are strongly advised to request an appointment with the teaching assistant, preferably within the proposed dates and hours.

Changes to the contact hours will be communicated to students by e-mail and promptly reported on this website.

  • Monday 15 October 2018 09:15-10:45
  • Thursday 18 October 2018 14:30-16:00
  • Wednesday 24 October 2018 09:15-10:45
  • Wednesday 07 November 2018 09:15-10:45
  • Thursday 08 November 2018 09:15-10:45
  • Wednesday 14 November 2018 09:15-10:45
  • Thursday 15 November 2018 09:15-10:45
  • Wednesday 21 November 2018 09:15-10:45
  • Thursday 22 November 2018 09:15-10:45
  • Monday 26 November 2018 09:15-10:45
  • Wednesday 28 November 2018 09:15-10:45
  • Thursday 29 November 2018 09:15-10:45
  • Monday 03 December 2018 09:15-10:45
  • Wednesday 05 December 2018 09:15-10:45
  • Thursday 06 December 2018 09:15-10:45
  • Monday 10 December 2018 09:15-10:45
  • Thursday 13 December 2018 09:15-10:45
  • Wednesday 19 December 2018 09:15-10:45
  • Thursday 20 December 2018 09:15-10:45
  • Monday 07 January 2019 09:15-10:45
  • Thursday 10 January 2019 09:15-10:45
  • Monday 14 January 2019 09:15-10:45
  • Wednesday 16 January 2019 09:15-10:45
  • Thursday 17 January 2019 09:15-10:45
  • Monday 21 January 2019 09:15-10:45
  • Wednesday 23 January 2019 09:15-10:45
  • Thursday 24 January 2019 09:15-10:45

The final oral examination is scheduled by individual appointment and should be taken before Friday 01 February 2019.

Seminar Details

Introduction

Process Mining is a growing branch of Data Science that analyses event data recorded in Information Systems and focuses on the process perspective. Investments in Process Mining by public and private companies are steadily increasing and are expected to more than double in the next five years. Hence, a good knowledge of Process Mining is an important skill for Data Scientists.

Conformance Checking is a part of the Process Mining discipline and consists of techniques that compare a process model with the real behavior recorded in an Information System in order to find commonalities and discrepancies. Discrepancies may signal the need for better control of the process, or that the model needs to be improved to capture reality better.

Common implementations of Conformance Checking in Information Systems are event listeners that raise an alert when deviations occur, and post-mortem analyses that detect fraudulent behavior, such as violations of the four-eyes principle.
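
As a minimal sketch of such a post-mortem check, the following Python snippet flags cases in which the same person performed two activities that should be carried out by different people. The event-log layout (tuples of case id, activity, resource), the activity names, and the function name are assumptions made for illustration; they do not come from a specific Information System.

```python
from collections import defaultdict

# Hypothetical event log: (case id, activity, resource).
EVENT_LOG = [
    ("c1", "submit request", "alice"),
    ("c1", "approve request", "bob"),
    ("c2", "submit request", "carol"),
    ("c2", "approve request", "carol"),  # same person submits and approves
]


def four_eyes_violations(log, first_activity, second_activity):
    """Return the sorted case ids where one resource performed both activities."""
    # case id -> activity -> set of resources that executed it
    performers = defaultdict(lambda: defaultdict(set))
    for case_id, activity, resource in log:
        performers[case_id][activity].add(resource)
    return sorted(
        case_id
        for case_id, by_activity in performers.items()
        if by_activity[first_activity] & by_activity[second_activity]
    )


violations = four_eyes_violations(EVENT_LOG, "submit request", "approve request")
print(violations)
```

The same function could be called from an event listener on each newly recorded event to raise an alert as soon as a violation appears, instead of after the fact.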

This Software Lab course includes tutorials, guest lectures, and hands-on sessions on the existing Conformance Checking algorithms in the ProM framework. It also includes the implementation of Conformance Checking in Python, following software engineering principles.

Introductory Sessions

All the above topics will be introduced briefly. Participation is mandatory throughout the course. In the introductory sessions, topics will be assigned to the students and deadlines for submitting the report and implementation will be discussed.

First part

The first part of this course aims to give students an overview of the current implementations of Conformance Checking approaches in the ProM framework:

  • Token-based replay
  • Alignment-based replay
  • ETConformance

In addition, other approaches and variants of these algorithms will be discussed. By the end of the first part, students should know how to use these approaches and understand the current implementations and their limitations.
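
To give a first impression of the idea behind token-based replay, here is a simplified sketch in Python. It is not the ProM implementation: the net encoding (a dictionary from transition label to input and output places), the example net, and the function name are assumptions, and the sketch ignores invisible and duplicate transitions as well as activities that do not occur in the model. It counts produced, consumed, missing, and remaining tokens for one trace and combines them with the usual fitness formula, 0.5 * (1 - missing/consumed) + 0.5 * (1 - remaining/produced).

```python
from collections import Counter

# Hypothetical sequential net: start -> a -> p1 -> b -> p2 -> c -> end.
# Each transition label maps to (input places, output places).
NET = {
    "a": (["start"], ["p1"]),
    "b": (["p1"], ["p2"]),
    "c": (["p2"], ["end"]),
}


def token_replay_fitness(net, trace, source="start", sink="end"):
    """Replay one trace; return (fitness, produced, consumed, missing, remaining)."""
    marking = Counter({source: 1})
    produced, consumed, missing = 1, 0, 0  # the initial token counts as produced

    for activity in trace:
        inputs, outputs = net[activity]
        for place in inputs:
            if marking[place] == 0:
                missing += 1  # insert the token the model did not provide
                marking[place] += 1
            marking[place] -= 1
            consumed += 1
        for place in outputs:
            marking[place] += 1
            produced += 1

    # At the end, the token in the sink place is consumed.
    if marking[sink] == 0:
        missing += 1
        marking[sink] += 1
    marking[sink] -= 1
    consumed += 1

    remaining = sum(marking.values())
    fitness = 0.5 * (1 - missing / consumed) + 0.5 * (1 - remaining / produced)
    return fitness, produced, consumed, missing, remaining


fitting = token_replay_fitness(NET, ["a", "b", "c"])
deviating = token_replay_fitness(NET, ["a", "c"])  # activity "b" was skipped
print(fitting[0], deviating[0])
```

In the deviating trace, one token is missing in place p2 and one token remains in place p1, which is exactly the kind of information that can later be projected onto the model.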

Second part

In the second part of the course, students will work in groups of at most four people to implement a new Conformance Checking plug-in in a new Python-based framework. The plug-in will implement one of the available approaches (token-based or alignment-based replay) and must display deviations directly on the process model, for example using colors or interactions.
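
One possible way to prepare such a display is to map a deviation count per model element onto a color scale. The sketch below is only an illustration: the function name, the white-to-red scale, and the example counts are assumptions and are not part of the course framework.

```python
def deviation_color(count, max_count):
    """Map a deviation count to a hex color between white (none) and red (most)."""
    if max_count <= 0:
        return "#ffffff"
    ratio = min(max(count / max_count, 0.0), 1.0)
    # The green and blue channels fade out as the number of deviations grows.
    fade = round(255 * (1 - ratio))
    return "#ff{0:02x}{0:02x}".format(fade)


# Hypothetical number of deviations observed per transition.
deviations = {"a": 0, "b": 5, "c": 10}
worst = max(deviations.values())
colors = {name: deviation_color(n, worst) for name, n in deviations.items()}
print(colors)
```

The resulting color strings could then be passed as fill colors to whatever drawing library renders the process model.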

Each student should individually produce a written report on the implementation, its advantages, and its issues. A proper software development life cycle (SDLC) will be followed during this phase to track the development. The details of the methodology will be communicated in the introductory session. By the end of the course, students should have gained solid practical knowledge of Process Mining applications, with a focus on Conformance Checking, and be able to implement the techniques on top of existing Information Systems.

Grading

The grading consists of individual oral questions on the written report and on the implemented Python plug-in (around 15-30 minutes per student).

Prerequisites

  • Software engineering knowledge (design, development, and testing)
  • Prior programming experience (not necessarily in Java or Python)
  • Interest in learning and coding in Python

Optional:

  • Coursera "Process Mining: Data Science in Action" course
  • BPI Course

Resources

  • Python Tutorial (The Python Foundation)
  • Interactive Tutorial covering the basics of Python

Registration

Registration is carried out through the central registration process in July 2018.

To increase your chance of being selected for this lab, please state your qualifications, experience, and overall grades in your current study program in as much detail as possible. Please explain clearly why you are a suitable candidate for this lab.

You will be informed about the first meeting in the weeks after the registration closes.