Overview

Before selecting a workflow, integration, RPA, or AI tool, verify that inputs and outputs are defined, sources are trustworthy, and evidence is stored where the work occurs. Use one logic path for both daily operations and month-end reporting, with no parallel spreadsheets.

Readiness criteria

Clarity

Operational definitions for inputs/outputs
Stable names and units (ISO 8601 dates; RFC 3339 timestamps)
Versioned schemas; change policy

Trust

Authoritative sources named; owners assigned
Data quality checks (completeness, accuracy, validity, timeliness)
Lineage to system of record

Control

Role-based access control (RBAC); segregation of duties (SoD) for changes
Evidence stored at the producing step
Logging and retention rules

Data contracts & schemas

Define contracts

JSON Schema for payloads (json-schema.org)
OpenAPI for REST (openapis.org)
Enumerations, ranges, requiredness, defaults

Metadata

Business glossary + technical dictionary
Metadata registration (ISO/IEC 11179 concept) — iso.org

Good practice

Reject invalid payloads at the edge (fail fast)
Version schemas; deprecate with dates
Automated contract tests in CI
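The contract elements listed above (enumerations, ranges, requiredness, defaults) and the fail-fast rule can be sketched in a few lines. This is a hand-rolled illustration using only the Python standard library, not a JSON Schema implementation; the `ORDER_CONTRACT` fields and the `validate` helper are hypothetical names chosen for the example. A real deployment would more likely publish a JSON Schema document and check payloads with a maintained validator.

```python
# Hypothetical contract for an "order" payload; field names are illustrative.
ORDER_CONTRACT = {
    "order_id": {"type": str, "required": True},
    "currency": {"type": str, "required": True, "enum": {"EUR", "USD", "GBP"}},
    "quantity": {"type": int, "required": True, "min": 1, "max": 10_000},
    "priority": {"type": str, "required": False, "default": "normal",
                 "enum": {"low", "normal", "high"}},
}


def validate(payload: dict, contract: dict) -> tuple[dict, list[str]]:
    """Return (payload with defaults applied, list of violations)."""
    errors = []
    clean = {}
    for name, rule in contract.items():
        if name not in payload:
            if rule.get("required"):
                errors.append(f"{name}: missing required field")
            elif "default" in rule:
                clean[name] = rule["default"]
            continue
        value = payload[name]
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(value, rule["type"]) or isinstance(value, bool):
            errors.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"{name}: {value!r} not in allowed values")
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{name}: above maximum {rule['max']}")
        clean[name] = value
    # Fields the contract does not know about are rejected, not ignored.
    for name in payload.keys() - contract.keys():
        errors.append(f"{name}: unknown field")
    return clean, errors
```

An edge service would call `validate` before any processing and reject the request when the error list is non-empty. The same function can run in CI against sample payloads as a simple contract test.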

Master & reference data

What to stabilize

Customers, products, locations, vendors
Reference lists: currencies, units, tax codes
Identifiers and merge/survivorship rules

IDs & keys

Global IDs where possible (e.g., GS1 keys) — gs1.org
Natural vs surrogate keys documented

Governance

Data owners/stewards; change SLA
Golden records; match/merge strategies
Audit of master data changes
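A merge/survivorship rule is easier to review when it is written down as code. The sketch below assumes one possible rule: per field, the non-empty value from the most trusted source wins, with ties broken by the most recent update. The source names, the ranking in `SOURCE_PRIORITY`, and the record layout are all hypothetical.

```python
from datetime import date

# Hypothetical source ranking: lower number = more trusted.
SOURCE_PRIORITY = {"erp": 0, "crm": 1, "web_form": 2}


def golden_record(duplicates: list[dict]) -> dict:
    """Merge duplicate records field by field.

    Rule (an assumption for illustration): for each field, take the
    non-empty value from the most trusted source; break ties between
    records from the same source by the most recent `updated` date.
    """
    ranked = sorted(
        duplicates,
        key=lambda r: (SOURCE_PRIORITY[r["source"]], -r["updated"].toordinal()),
    )
    merged = {}
    for record in ranked:
        for field, value in record["fields"].items():
            # The first non-empty value seen in ranked order survives.
            if value not in (None, "") and field not in merged:
                merged[field] = value
    return merged


records = [
    {"source": "crm", "updated": date(2024, 5, 1),
     "fields": {"name": "Acme GmbH", "phone": "+49 30 1234", "vat_id": ""}},
    {"source": "erp", "updated": date(2023, 1, 1),
     "fields": {"name": "ACME GmbH", "phone": None, "vat_id": "DE123"}},
]
golden = golden_record(records)
```

Here the ERP name and VAT ID survive because ERP ranks higher, while the phone number is filled from the CRM record because the ERP value is empty.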

Event data & logs

Minimum fields

Case ID (process instance)
Activity name
Timestamp (UTC or with zone)
Actor/resource (optional but useful)

Formats

XES (IEEE 1849) — ieeexplore
CloudEvents for system events — cloudevents.io

Uses

Process mining (discovery/conformance)
SLI/SLO and queue health
Exception root-cause analysis
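The minimum fields above (case ID, activity name, timestamp) are enough for a first process-discovery step. As a minimal sketch, the following counts directly-follows relations, a common starting point in process mining; the `EVENTS` sample and the activity names are made up for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Illustrative event log: (case_id, activity, timestamp with UTC offset).
EVENTS = [
    ("c1", "receive", "2024-03-01T09:00:00+00:00"),
    ("c1", "approve", "2024-03-01T11:30:00+00:00"),
    ("c2", "receive", "2024-03-01T09:05:00+00:00"),
    ("c1", "pay", "2024-03-02T08:00:00+00:00"),
    ("c2", "reject", "2024-03-01T10:00:00+00:00"),
]


def directly_follows(events):
    """Count how often activity A is immediately followed by B within a case."""
    traces = defaultdict(list)
    for case_id, activity, ts in events:
        traces[case_id].append((datetime.fromisoformat(ts), activity))
    counts = Counter()
    for steps in traces.values():
        # Order each case by time, then pair every activity with its successor.
        ordered = [activity for _, activity in sorted(steps)]
        counts.update(zip(ordered, ordered[1:]))
    return counts


relations = directly_follows(EVENTS)
```

If any of the three fields is missing or unreliable, this count cannot be produced, which is why they are treated as the minimum.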

IDs, time & idempotency

Identity

Stable primary keys; correlation IDs across systems
Idempotency keys for writes/retries
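To illustrate how an idempotency key makes a retried write safe, here is a small in-memory sketch. `InvoiceStore` is a hypothetical stand-in for a system of record, and deriving the key from the payload is only one option (an assumption here); many clients instead generate a random key once per logical request and resend it on every retry.

```python
import hashlib
import json


def idempotency_key(operation: str, payload: dict) -> str:
    """Derive a deterministic key from the operation and a canonical payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{operation}:{canonical}".encode()).hexdigest()


class InvoiceStore:
    """In-memory stand-in for a system of record (hypothetical)."""

    def __init__(self):
        self._results = {}  # idempotency key -> stored result
        self.writes = 0

    def create_invoice(self, key: str, payload: dict) -> dict:
        # A retry with the same key returns the first result; no new write.
        if key in self._results:
            return self._results[key]
        self.writes += 1
        result = {"invoice_no": self.writes, **payload}
        self._results[key] = result
        return result


store = InvoiceStore()
payload = {"customer": "C-1", "amount": 100}
key = idempotency_key("create_invoice", payload)
first = store.create_invoice(key, payload)
retry = store.create_invoice(key, payload)  # e.g. after a timeout
```

Both calls return the same invoice and only one write is recorded. With a payload-derived key, two genuinely separate requests with identical content would also collapse into one, so this choice needs to match the business rule.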

Time

ISO 8601 / RFC 3339 timestamps; store UTC
Record start/complete/paused states
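Both time rules can be shown with the standard library. The sketch below normalizes an offset-bearing timestamp to UTC in RFC 3339 form and then uses start/paused/completed transitions to compute active working time. The state names and function names are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def to_utc_rfc3339(text: str) -> str:
    """Parse an offset-bearing timestamp and return it in UTC with a 'Z' suffix."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError("timestamp has no UTC offset; refusing to guess")
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def active_time(transitions: list[tuple[str, str]]) -> timedelta:
    """Sum the time between each 'started' and the next 'paused' or 'completed'.

    `transitions` is a time-ordered list of (state, timestamp) pairs.
    """
    total = timedelta()
    running_since = None
    for state, ts in transitions:
        moment = datetime.fromisoformat(ts)
        if state == "started":
            running_since = moment
        elif state in ("paused", "completed") and running_since is not None:
            total += moment - running_since
            running_since = None
    return total


stored = to_utc_rfc3339("2024-03-01T10:30:00+02:00")
worked = active_time([
    ("started", "2024-03-01T08:00:00+00:00"),
    ("paused", "2024-03-01T09:00:00+00:00"),
    ("started", "2024-03-01T10:00:00+00:00"),
    ("completed", "2024-03-01T10:30:00+00:00"),
])
```

Without the paused state, the same case would appear to take two and a half hours of work instead of one and a half. Note that `datetime.fromisoformat` accepts a trailing `Z` only from Python 3.11, so the inputs here use explicit `+00:00` offsets.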

Error taxonomy

Retryable vs non-retryable
Business vs technical error classes
Dead-letter rules and alerts
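The taxonomy above translates into two exception classes and one routing rule. This is a simplified sketch: the class names, the `process_with_retries` helper, and the in-memory dead-letter list are hypothetical, and a real worker would add backoff between attempts and raise an alert when a message is dead-lettered.

```python
class RetryableError(Exception):
    """Technical and transient, e.g. a timeout; safe to try again."""


class NonRetryableError(Exception):
    """Business or permanent, e.g. an unknown tax code; retrying cannot help."""


def process_with_retries(message, handler, dead_letters, max_attempts=3):
    """Run handler(message); retry transient failures, dead-letter the rest."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(message)
        except NonRetryableError as exc:
            # Permanent failure: park the message immediately for review.
            dead_letters.append(
                {"message": message, "reason": str(exc), "attempts": attempt}
            )
            return None
        except RetryableError as exc:
            last_error = exc  # a real worker would back off before retrying
    # Retries exhausted: the transient error is now treated as a dead letter.
    dead_letters.append(
        {"message": message, "reason": str(last_error), "attempts": max_attempts}
    )
    return None
```

Separating the two classes keeps business errors from consuming retry capacity and makes the dead-letter queue a list of cases that need a person, not a timer.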

APIs & integration

Contracts & auth

OpenAPI/GraphQL; input validation
OAuth 2.0 (RFC 6749) for authorization; OIDC (openid.net) for identity

Events & queues

AMQP/Kafka for at-least-once delivery — OASIS · kafka.apache.org
Idempotent consumers; replay with retention

RPA fallback

Use UI automation only when APIs are absent and screens are stable. Prefer API contracts for durability.

Data quality (CAVT) & profiling

Dimensions

Completeness, Accuracy, Validity, Timeliness (CAVT)
Consistency and Uniqueness as supporting checks

Profiling

Nulls, ranges, patterns, referential integrity
Drift detection on key distributions
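As a rough sketch of the profiling checks above, the following computes a null rate and a pattern-match rate for one column, and measures drift between two categorical samples. Total variation distance is used here as one simple drift measure; it is an assumption of this example, not a method named by the standards below. Column names and sample values are made up, and both functions assume non-empty inputs.

```python
import re
from collections import Counter


def profile(rows: list[dict], column: str, pattern: str) -> dict:
    """Null rate and pattern-match rate for one column."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    matching = [v for v in present if re.fullmatch(pattern, str(v))]
    return {
        "null_rate": 1 - len(present) / len(values),
        "pattern_rate": len(matching) / len(present) if present else 0.0,
    }


def drift(baseline: list[str], current: list[str]) -> float:
    """Total variation distance between two categorical distributions (0 to 1)."""
    base, cur = Counter(baseline), Counter(current)
    categories = base.keys() | cur.keys()
    return 0.5 * sum(
        abs(base[c] / len(baseline) - cur[c] / len(current)) for c in categories
    )


rows = [{"country": "DE"}, {"country": "fr"}, {"country": None}, {"country": "NL"}]
stats = profile(rows, "country", r"[A-Z]{2}")
shift = drift(["a", "a", "b", "b"], ["a", "a", "a", "b"])
```

A drift value of 0 means the two samples have identical category shares and 1 means they share no categories. The alert threshold is a judgment call per field.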

Standards

ISO 8000 (data quality concepts) — iso.org
NIST statistical methods — nist.gov

Catalog & lineage

Catalog

Business glossary + technical metadata
Schemas, owners, retention, quality rules

Lineage

End-to-end data flow visibility
OpenLineage compatible where possible — openlineage.io

Why it matters

Faster impact analysis, cleaner audits, fewer surprises in change windows.
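Impact analysis on lineage is essentially a graph walk. The sketch below assumes lineage is available as a simple adjacency mapping; the dataset names in `LINEAGE` are hypothetical, and a lineage tool would supply this graph from collected metadata.

```python
from collections import deque

# Hypothetical lineage edges: dataset -> datasets that read from it.
LINEAGE = {
    "erp.orders": ["staging.orders"],
    "staging.orders": ["mart.revenue", "mart.backlog"],
    "mart.revenue": ["report.month_end"],
    "mart.backlog": [],
    "report.month_end": [],
}


def downstream(lineage: dict, start: str) -> list[str]:
    """Breadth-first walk returning every dataset affected by a change to start."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


affected = downstream(LINEAGE, "erp.orders")
```

Before a schema change to `erp.orders`, the result lists everything that needs a test or an owner sign-off, including the month-end report.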

Privacy & security

Access & retention

Least privilege; role-based access; periodic reviews
Retention by policy and law (e.g., GDPR) — EUR-Lex

Protection

Encrypt in transit/at rest; mask PII where possible
Log reads/writes; immutable audit trails
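Two of the protections above can be sketched with the standard library: masking a PII field before it is displayed or logged, and chaining audit entries by hash so that edits become detectable. The masking rule and the entry layout are assumptions for illustration. A hash chain makes a log tamper-evident, not immutable on its own; write-once storage or an external anchor is still needed.

```python
import hashlib
import json


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log: list[dict], actor: str, action: str, target: str) -> None:
    """Append an entry whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"actor": actor, "action": action, "target": target, "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})


def verify(log: list[dict]) -> bool:
    """Recompute every hash; an edited entry or one removed mid-chain fails."""
    prev_hash = "0" * 64
    for entry in log:
        body = {k: entry[k] for k in ("actor", "action", "target", "prev")}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


audit = []
append_entry(audit, "u42", "read", "customer/1001")
append_entry(audit, "u42", "write", "customer/1001")
```

Calling `verify(audit)` returns `True` for the untouched log and `False` once any stored field is altered. Truncating entries from the end is not detected by this sketch.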

Frameworks

ISO/IEC 27001 — iso.org
NIST SP 800-53 — nist.gov

Testing & monitoring

Pre-prod

Contract tests (schemas, enums, ranges)
Golden datasets and replay tests

Prod

SLIs/SLOs for freshness, completeness, error rate
Dead-letter queues, retries, alerting
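The production indicators above reduce to a few ratios compared against targets. The following is a minimal sketch; the thresholds in `SLO` and the `evaluate` signature are hypothetical, and in practice the inputs would come from load metadata and processing counters.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical targets for one data flow.
SLO = {
    "max_staleness": timedelta(hours=1),
    "min_completeness": 0.99,
    "max_error_rate": 0.01,
}


def evaluate(last_loaded, now, rows_expected, rows_received, errors, slo=SLO):
    """Return the names of the SLOs that are currently breached."""
    breaches = []
    if now - last_loaded > slo["max_staleness"]:
        breaches.append("freshness")
    if rows_expected and rows_received / rows_expected < slo["min_completeness"]:
        breaches.append("completeness")
    if rows_received and errors / rows_received > slo["max_error_rate"]:
        breaches.append("error_rate")
    return breaches


now = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
healthy = evaluate(now - timedelta(minutes=30), now, 1000, 995, 2)
degraded = evaluate(now - timedelta(hours=2), now, 1000, 900, 50)
```

An alerting job could run this on a schedule and page the flow's owner whenever the returned list is non-empty.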

Change windows

Version bumps coordinated; rollback plans
Deprecations with sunset dates and dashboards

90-day starter

Days 0–30

Pick one flow; define input/output contracts (JSON Schema)
Name owners; catalog fields and sources
Baseline CAVT checks; fix blocking issues

Days 31–60

Publish API contracts (OpenAPI) with auth
Stand up lineage + dashboards; alert on breaks
Add idempotency keys and error taxonomy

Days 61–90

Pilot the automation; track lead time, first-pass yield (FPY), and exception rates
Harden retention, access reviews, and rollback
Publish before/after deltas against the baseline; plan scale-out

References

JSON Schema — json-schema.org
OpenAPI Initiative — openapis.org
ISO/IEC 11179 (metadata registries, concept) — iso.org
GS1 Identification Keys — gs1.org
IEEE XES (process event logs) — ieeexplore
CloudEvents — cloudevents.io
ISO 8000 (data quality concepts) — iso.org
NIST e-Handbook (statistics) — nist.gov
OpenLineage — openlineage.io
ISO/IEC 27001 — iso.org
NIST SP 800-53 — nist.gov
GDPR (EU) — EUR-Lex

Prove the data. Then automate.

If you want a data-readiness scorecard (contracts, quality, lineage, privacy), ask for a copy.

Data Readiness for Automation