Overview
KPIs are the small set of measures that show if a process delivers its outcome. They must be defined, reproducible, and tied to the step where value or risk occurs. Use one source of truth for daily and month-end views. Avoid parallel spreadsheets.
Principles
Outcome first
State the customer, the promise, and the test for success. Pick 3–5 KPIs per flow.
Defined & testable
Write the operational definition. Name the event timestamps, the unit of analysis, and the sampling rule.
Actionable
Assign owners and thresholds. When a KPI breaks, action follows within the review window.
Types of KPIs
Flow (speed)
- Lead time, cycle time, throughput, WIP
- On-time %; queue time; touch time
Quality
- First-pass yield, defect rate, rework rate
- Right-first-time; escape rate
Cost & service
- Cost per unit/case; unit margin
- SLA attainment; backlog size & aging
Also track control health (KCIs): late approvals, failed reconciliations, access review exceptions, control test pass rate.
Operational definitions & formulas
Core formulas
- Lead time = end_ts − start_ts (request → delivery)
- Cycle time = active work time per unit
- Throughput = completed units / period
- WIP = units in process now
- First-pass yield = good units / total units
- Defect rate = defects / opportunities (or per unit)
- Cost per unit = total process cost / output count
- SLA attainment = met commitments / total commitments
Little’s Law (queueing)
WIP = Throughput × Lead time. If WIP grows and throughput is flat, lead time rises. Use this to detect hidden queues.
Reference: MIT/Stanford explanations; Lean VSM practice. MIT OCW · Lean Institute
Measurement system design
Data model
- Define the unit (order, ticket, batch) and the lifecycle events (created, started, paused, completed).
- Specify sources and system clocks; store UTC and local when needed.
- Keep IDs stable; track rework links and parent/child items.
Operational definition
- Exact filter (inclusions/exclusions); numerator/denominator
- Time base (calendar vs. business hours; holiday rules)
- Sampling: full population preferred; else risk-based sample
Baselines, targets & segmentation
Baselines
- Use last 90–180 days for a stable baseline where seasonality is low.
- Show distribution (median, p90/p95) not only the mean.
Targets
- Set targets above common cause variation (use SPC limits).
- Define acceptable error bands; avoid whiplash from outliers.
Segmentation & normalization
- Segment by product, channel, region, customer tier.
- Normalize by size/complexity (per item, per 100 orders, per $1k).
- Watch for Simpson’s paradox when mixing segments.
Visuals & review cadence
Daily
Flow board: WIP, starts, completes, blockers. Show yesterday and 7-day trend.
Weekly
Trend board: lead time (median/p90), FPY, backlog aging, SLA hits/misses, actions closed.
Monthly
SPC/control view, conformance checks, cost per unit, and benefits tracking vs. plan.
Statistical control (SPC)
Why SPC
Distinguish common-cause noise from special-cause signals. Act only when there is a signal.
Charts
- p-chart for proportions (defect rate, FPY)
- X-bar/R (or XmR) for continuous measures (cycle time)
- u-chart for defects per unit
Pitfalls & anti-patterns
Averages that lie
Averaging ratios or mixing segments hides reality. Show distribution and per-segment views.
Vanity metrics
Counts without denominators; dashboards with color but no decisions. Tie each KPI to an owner and a threshold.
Utilization obsession
Maxed utilization increases WIP and lead time (Little’s Law). Protect flow first.
90-day starter
Days 0–30
- Pick one flow. Define 3–5 KPIs with operational definitions.
- Baseline last 90 days; add median and p90.
Days 31–60
- Install daily/weekly boards and owners.
- Add SPC to one KPI; stop reacting to noise.
Days 61–90
- Segment and normalize; resolve one chronic bottleneck.
- Publish the scale plan and retire vanity metrics.
References
- ISO 9001 documented information / process approach — iso.org
- NIST e-Handbook of Statistical Methods — nist.gov
- MIT OCW: Little’s Law — ocw.mit.edu
- Lean Enterprise Institute: Value-stream mapping — lean.org
- ASQ: Control charts (SPC) — asq.org
Design the critical few KPIs. Make them testable. Review them on a clock.
If you want a KPI dictionary template and SPC starter, ask for a copy.