Process Mining Basics

Process mining turns event data into an objective view of how work flows. It discovers actual paths, checks conformance to the target model, and enhances processes with facts about time, rework, and handoffs.

Overview

Event data in business systems records who did what and when. Process mining reads these logs to reconstruct the flow, compare it to the intended model, and quantify delays, rework, and variants. It complements modeling notations like BPMN by showing the real path, not only the designed one.

Event logs & formats

Minimum fields

  • Case ID (process instance, e.g., order, ticket, claim)
  • Activity (event class)
  • Timestamp (with time zone or UTC)

Helpful attributes

  • Resource/role, lifecycle transition (start/complete), cost, channel, product, region
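
A minimal sketch of such a log as a pandas table; the column names (case_id, activity, timestamp, resource) and the order data are illustrative and map to whatever your source system calls them:

  import pandas as pd

  # Three required columns plus one helpful attribute (resource).
  log = pd.DataFrame([
      {"case_id": "O-1001", "activity": "Create Order",  "timestamp": "2024-03-01T09:15:00Z", "resource": "alice"},
      {"case_id": "O-1001", "activity": "Approve Order", "timestamp": "2024-03-01T11:40:00Z", "resource": "bob"},
      {"case_id": "O-1001", "activity": "Ship Order",    "timestamp": "2024-03-02T08:05:00Z", "resource": "carol"},
      {"case_id": "O-1002", "activity": "Create Order",  "timestamp": "2024-03-01T10:02:00Z", "resource": "alice"},
      {"case_id": "O-1002", "activity": "Cancel Order",  "timestamp": "2024-03-01T10:30:00Z", "resource": "alice"},
  ])
  log["timestamp"] = pd.to_datetime(log["timestamp"], utc=True)  # keep timestamps in UTC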

Standards

  • XES (IEEE 1849-2016) — classic log format for discovery and conformance
  • OCEL (Object-Centric Event Log) — supports events linked to several objects at once (orders, items, invoices)

Three use cases

Discovery

Build a model from data. See the common path and the long tail of variants.
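
A minimal sketch of variant counting with pandas, continuing from the log table above: sort events per case, join activities into a variant string, and count cases per variant.

  variants = (
      log.sort_values(["case_id", "timestamp"])
         .groupby("case_id")["activity"]
         .agg(" -> ".join)            # one path string per case
         .value_counts()              # cases per variant
  )
  print(variants.head(10))            # the common paths
  print((variants == 1).sum())        # long tail: variants seen only once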

Conformance

Compare data to the target model. Quantify fitness and highlight violations and missing steps.

Enhancement

Annotate the model with times, queues, rework, resources, and cost to find delay and waste.

Preparing data

Extraction patterns

  • Identify the case (order, ticket, claim). Join tables that hold status changes.
  • Build events from lifecycle changes (created, assigned, completed).
  • Keep UTC timestamps; store time zone and daylight rules if needed.
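
A sketch of the second pattern, turning a status-history extract into events; the table and column names (ticket_id, new_status, changed_at, changed_by) are hypothetical:

  import pandas as pd

  # Hypothetical extract: one row per status change on a ticket.
  history = pd.DataFrame([
      {"ticket_id": "T-9", "new_status": "created",   "changed_at": "2024-03-01T09:00:00Z", "changed_by": "system"},
      {"ticket_id": "T-9", "new_status": "assigned",  "changed_at": "2024-03-01T09:20:00Z", "changed_by": "dina"},
      {"ticket_id": "T-9", "new_status": "completed", "changed_at": "2024-03-01T15:45:00Z", "changed_by": "dina"},
  ])

  # Each status change becomes one event in the log.
  events = pd.DataFrame({
      "case_id":   history["ticket_id"],
      "activity":  history["new_status"],
      "timestamp": pd.to_datetime(history["changed_at"], utc=True),
      "resource":  history["changed_by"],
  }).sort_values(["case_id", "timestamp"])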

Data hygiene

  • Deduplicate events; sort by timestamp; handle ties and same-second events.
  • Normalize activity names; document filters and exclusions.
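
A sketch of these steps with pandas on the events table above; the name_map, the tie-breaking column, and the TEST- filter are placeholders for your own rules:

  # Deduplicate exact repeats; sort with the activity name as an arbitrary tie-breaker
  # for same-second events.
  events = events.drop_duplicates(subset=["case_id", "activity", "timestamp"])
  events = events.sort_values(["case_id", "timestamp", "activity"])

  # Normalize activity names so spelling variants count as one step.
  name_map = {"approve order": "Approve Order", "approve_order": "Approve Order"}
  cleaned = events["activity"].str.strip().str.lower()
  events["activity"] = cleaned.map(name_map).fillna(events["activity"])

  # Document every exclusion, e.g. dropping test cases.
  events = events[~events["case_id"].str.startswith("TEST-")]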

Multi-object processes

Use OCEL when a case spans objects (order ↔ items ↔ invoice). Avoid flattening that loses relations.
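
An illustrative sketch of the idea in plain Python (not the official OCEL schema): each event points at every object it touches, so the order, item, and invoice relations survive.

  event = {
      "id": "e-501",
      "activity": "Create Invoice",
      "timestamp": "2024-03-02T08:10:00Z",
      "objects": {                      # one event, three object types
          "order":   ["O-1001"],
          "item":    ["I-77", "I-78"],
          "invoice": ["INV-9"],
      },
  }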

Algorithms & tools

Discovery (examples)

  • Inductive Miner — robust to noise; produces sound models
  • Heuristics Miner — frequency-based; handles noise with thresholds
  • Alpha Miner — classic; good for teaching, less robust in practice
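
Discovery itself is best left to a library, but the directly-follows counts these miners start from are easy to compute; a simplified sketch with pandas on the events table above (not the Inductive Miner itself):

  ordered = events.sort_values(["case_id", "timestamp"]).copy()
  ordered["next_activity"] = ordered.groupby("case_id")["activity"].shift(-1)

  # Count how often activity B directly follows activity A within the same case.
  dfg = (
      ordered.dropna(subset=["next_activity"])
             .groupby(["activity", "next_activity"])
             .size()
             .sort_values(ascending=False)
  )
  print(dfg.head(20))   # the most frequent directly-follows edges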

Conformance

  • Token-based replay (fast)
  • Alignments (optimal matching of log to model)
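
A simplified conformance sketch: rather than full token replay or alignments, it checks each directly-follows step against a whitelist read off the target model by hand (the allowed set below is an assumption for an order flow):

  allowed = {
      ("Create Order", "Approve Order"),
      ("Approve Order", "Ship Order"),
      ("Create Order", "Cancel Order"),
  }

  ordered = events.sort_values(["case_id", "timestamp"]).copy()
  ordered["next_activity"] = ordered.groupby("case_id")["activity"].shift(-1)
  pairs = ordered.dropna(subset=["next_activity"])

  observed = list(zip(pairs["activity"], pairs["next_activity"]))
  violations = [p for p in observed if p not in allowed]
  print(f"steps checked: {len(observed)}, violating steps: {len(violations)}")
  if violations:
      print(pd.Series(violations).value_counts().head())   # most common violations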

Quality, drift & privacy

Data quality

  • Completeness, accuracy, validity, timeliness (CAVT)
  • Missing timestamps, inconsistent IDs, activity naming drift

Concept drift

Processes change over time. Split logs into windows; compare models; detect shifts in variants and cycle time.
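
A sketch of window-based comparison with pandas; the quarterly windows and the two statistics compared (median cycle time, top-variant share) are illustrative choices:

  start = events.groupby("case_id")["timestamp"].min()
  end   = events.groupby("case_id")["timestamp"].max()
  cases = pd.DataFrame({
      "window": start.dt.tz_localize(None).dt.to_period("Q"),   # quarter in which the case started
      "cycle_time_days": (end - start).dt.total_seconds() / 86400,
      "variant": events.sort_values(["case_id", "timestamp"])
                       .groupby("case_id")["activity"].agg(" -> ".join),
  })

  by_window = cases.groupby("window").agg(
      cases=("variant", "size"),
      median_cycle_days=("cycle_time_days", "median"),
      top_variant_share=("variant", lambda v: v.value_counts(normalize=True).iloc[0]),
  )
  print(by_window)   # a jump between windows hints at drift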

Privacy & ethics

Mask personal data; pseudonymize IDs; restrict resource views when needed. Keep access logs and retention rules.
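
A sketch of pseudonymizing case and resource IDs with a keyed hash; the key handling here is a placeholder, not a full security design:

  import hashlib
  import hmac

  SECRET_KEY = b"store-and-rotate-this-outside-the-log"   # placeholder

  def pseudonymize(value: str) -> str:
      # Keyed hash: stable within one analysis, not reversible without the key.
      return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()[:12]

  events["case_id"]  = events["case_id"].map(pseudonymize)
  events["resource"] = events["resource"].map(pseudonymize)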

Fitness, precision & other metrics

Model quality

  • Fitness — how much of the log the model can replay
  • Precision — the degree to which the model allows only behavior seen in the log
  • Simplicity — model complexity (prefer smaller, sound models)
  • Generalization — avoids overfitting to the sample log
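
For fitness, a common token-replay formula combines produced, consumed, missing, and remaining token counts; a small sketch with an invented example case:

  def token_replay_fitness(produced: int, consumed: int, missing: int, remaining: int) -> float:
      # fitness = 1/2 * (1 - missing/consumed) + 1/2 * (1 - remaining/produced)
      return 0.5 * (1 - missing / consumed) + 0.5 * (1 - remaining / produced)

  # Example: 40 tokens produced and consumed, 2 missing, 3 remaining -> fitness = 0.9375
  print(token_replay_fitness(produced=40, consumed=40, missing=2, remaining=3))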

Operational metrics

  • Lead time and wait time by path/variant
  • Rework rate; return loops
  • Handoffs by role; social network density
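
A sketch of these three metrics with pandas on the events table above; the handoff figure here is a simple resource-change count, a stand-in for a fuller social-network analysis:

  by_case = events.sort_values(["case_id", "timestamp"]).groupby("case_id")

  # Lead time per case: first to last event, in days.
  lead_time_days = (by_case["timestamp"].max() - by_case["timestamp"].min()).dt.total_seconds() / 86400

  # Rework rate: share of cases that execute any activity more than once.
  rework_rate = by_case["activity"].apply(lambda a: a.duplicated().any()).mean()

  # Handoffs: steps where the resource changes between consecutive events of a case.
  ordered = events.sort_values(["case_id", "timestamp"]).copy()
  ordered["prev_resource"] = ordered.groupby("case_id")["resource"].shift()
  changed = (ordered["resource"] != ordered["prev_resource"]) & ordered["prev_resource"].notna()
  handoffs_per_case = changed.groupby(ordered["case_id"]).sum()

  print(lead_time_days.describe(), rework_rate, handoffs_per_case.mean())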

Typical applications

Finance

Order-to-Cash, Procure-to-Pay: maverick buying, price variance, three-way match issues.

IT service

Incident-to-Resolution: ping-pong handoffs, SLA breaches, backlog aging patterns.

Healthcare / public

Referral-to-Treatment or Permitting: queue hotspots, rework loops, missing documents.

90-day starter

Days 0–30: Data

  • Pick one flow. Extract case ID, activity, timestamp, resource.
  • Clean names; deduplicate; store UTC; document filters.

Days 31–60: Discovery & conformance

  • Run Inductive/Heuristics Miner; list top variants.
  • Check conformance; log violations with counts and impact.

Days 61–90: Action

  • Target one bottleneck or loop; implement a small change.
  • Re-measure lead time and rework; publish the deltas.

Turn event data into a clear path for change.

If you want a log spec (XES/OCEL) and a discovery checklist, ask for a copy.
