Disclaimer: This article is an engineering-focused interpretation of regulatory requirements. The EU AI Act is subject to ongoing interpretation by the EU AI Office and national authorities. Regulatory requirements may evolve through implementing acts, guidelines, and enforcement decisions. Readers should consult qualified legal counsel for compliance advice specific to their organization and use cases.
Article 13 of the EU AI Act is titled "Transparency and provision of information to deployers." In regulatory language, it requires that high-risk AI systems be designed and developed to ensure their operation is sufficiently transparent to enable deployers to interpret the system's output and use it appropriately. In engineering language, it requires that you can explain what your AI system did, why it did it, and how a human can oversee and correct it -- not as a narrative, but as retrievable, versioned technical artifacts.
This article translates Article 13's requirements into engineering specifications. It maps each sub-requirement to the technical artifact that satisfies it, identifies the three compliance gaps most commonly found in production AI systems, and provides a readiness checklist that engineering and compliance teams can use to assess their current state. For background on Article 13's relationship to the broader EU AI Act transparency framework, see The EU AI Act's Article 13 Problem: What "Transparency" Actually Requires from Your AI System.
High-Risk Categories: Which Systems Are in Scope
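Article 13 applies to high-risk AI systems. Article 6 defines that classification, and Annex III lists the standalone use cases that fall under it (AI used as a safety component of products regulated under the Annex I harmonisation legislation is classified separately). Before translating requirements into engineering work, teams must determine whether their system qualifies. The Act defines high-risk through a use-case classification, not a technology classification: the same model architecture may be high-risk in one application and minimal-risk in another.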
The Annex III categories most relevant to enterprise AI systems are:
- Employment and workforce management: AI systems used to make or materially influence decisions about recruitment, task allocation, performance monitoring, or termination. This includes resume screening tools, candidate ranking systems, workforce scheduling algorithms, and performance evaluation systems.
- Access to essential private services: AI systems used in credit scoring, insurance risk assessment, or eligibility determination for financial products or services. This category captures fintech, insurtech, and lending technology products.
- Education and vocational training: AI systems that determine access to educational institutions, evaluate learning outcomes, or influence student assessment. This includes automated grading systems and admissions decision tools.
- Critical infrastructure: AI systems used in the management and operation of energy, water, transport, and digital infrastructure. This includes predictive maintenance systems, load balancing algorithms, and network management tools with autonomous decision authority.
Stanford HAI's analysis emphasizes that the classification is determined by the system's application context, not its technical architecture. A large language model used to draft marketing copy is minimal-risk. The same model used to evaluate job applicants' written responses as part of a hiring pipeline is high-risk. The obligation follows the use case.
Article 13 Sub-Requirements Mapped to Technical Artifacts
Article 13 contains several sub-requirements, each of which maps to a specific type of technical artifact. The following mapping translates regulatory language into engineering deliverables.
Sub-Requirement 1: Transparent Operation and Instructions for Use (Article 13(1)-(2))
Regulatory text (paraphrased): High-risk AI systems shall be designed and developed in such a manner that their operation is sufficiently transparent to enable deployers to interpret the system's output and use it appropriately. An appropriate type and degree of transparency shall be ensured, with a view to achieving compliance with the relevant obligations of the provider and deployer.
Technical artifact: System Card. A version-controlled document describing the system's purpose, capabilities, limitations, intended use conditions, and output interpretation guidance. Unlike a product datasheet, a system card is a living engineering artifact that is updated when the system's behavior changes materially.
```yaml
# System card structure (version-controlled)
system_card:
  version: "3.2.0"
  last_updated: "2026-05-10"
  system_name: "CreditEval Decision Engine"
  intended_purpose: >
    Automated pre-screening of consumer credit applications
    to produce a risk category (low/medium/high) for human
    underwriter review. Not intended as a final credit decision.
  intended_users:
    - "Licensed credit underwriters with access to full applicant file"
  known_limitations:
    - "Performance degrades on applicants with fewer than 3 years of credit history"
    - "Not validated for commercial/business credit applications"
    - "Risk category should not be used as sole basis for credit decision"
  output_interpretation:
    risk_low: "Applicant profile matches patterns associated with < 2% default rate"
    risk_medium: "Applicant profile matches patterns associated with 2-8% default rate"
    risk_high: "Applicant profile matches patterns associated with > 8% default rate"
  human_oversight_requirement: >
    All risk categories must be reviewed by a licensed underwriter
    before any credit decision is communicated to the applicant.
  change_history:
    - version: "3.2.0"
      date: "2026-05-10"
      change: "Updated performance metrics after Q1 2026 model revalidation"
    - version: "3.1.0"
      date: "2026-02-15"
      change: "Added thin-file limitation disclosure"
```

Sub-Requirement 2: Performance Characteristics (Article 13(3)(b))
Regulatory text (paraphrased): Instructions for use shall include the levels of accuracy, robustness, and cybersecurity against which the high-risk AI system has been tested and validated, and any known or foreseeable circumstances that may impact expected performance.
Technical artifact: Performance Metrics Documentation. A current record of the system's validated accuracy, precision, recall, and robustness metrics, disaggregated by relevant subgroups where the system makes decisions affecting individuals. This documentation must be updated when the model is retrained, the input distribution changes materially, or validation testing reveals new performance characteristics.
The NIST AI Risk Management Framework provides complementary guidance under its Map and Measure functions, emphasizing that performance documentation should include not just aggregate metrics but performance under adversarial conditions, out-of-distribution inputs, and data quality degradation scenarios.
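As an illustration of what "disaggregated by relevant subgroups" can mean in practice, the following is a minimal sketch that computes precision and recall per subgroup from labelled validation outcomes. The `Outcome` type, the subgroup labels, and the sample data are hypothetical; which subgroups are relevant and which metrics matter depend on the system and are not prescribed here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    subgroup: str    # hypothetical segment label, e.g. "thin_file"
    predicted: bool  # model flagged the case as high risk
    actual: bool     # ground truth observed later (e.g. default occurred)


def disaggregated_metrics(outcomes):
    """Return sample count, precision, and recall for each subgroup."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "n": 0})
    for o in outcomes:
        c = counts[o.subgroup]
        c["n"] += 1
        if o.predicted and o.actual:
            c["tp"] += 1
        elif o.predicted:
            c["fp"] += 1
        elif o.actual:
            c["fn"] += 1

    report = {}
    for group, c in counts.items():
        flagged = c["tp"] + c["fp"]
        positives = c["tp"] + c["fn"]
        report[group] = {
            "n": c["n"],
            # None marks an undefined metric instead of reporting it as 0.
            "precision": c["tp"] / flagged if flagged else None,
            "recall": c["tp"] / positives if positives else None,
        }
    return report


# Illustrative data only; not real validation results.
validation_set = [
    Outcome("thin_file", True, True),
    Outcome("thin_file", False, True),
    Outcome("thin_file", True, False),
    Outcome("established", True, True),
    Outcome("established", False, False),
]
report = disaggregated_metrics(validation_set)
```

Reporting the sample count next to each metric lets a reviewer see when a subgroup figure rests on too few cases to be meaningful.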
Sub-Requirement 3: Input Specifications and Limitations (Article 13(3)(b)(vi))
Regulatory text (paraphrased): Instructions for use shall include the specifications for the input data, including any requirements regarding the characteristics, types, and quality of input data.
Technical artifact: Input Schema and Data Quality Requirements. A formal specification of what data the system expects, what format it must be in, what happens when data is missing or malformed, and under what data quality conditions the system's outputs should not be relied upon. For AI systems that accept structured inputs, this is an annotated schema. For systems that accept unstructured inputs (text, images), this includes format requirements, length limits, language coverage, and known failure modes for specific input types.
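For structured inputs, an annotated schema can double as a runtime check. The sketch below is one possible shape, assuming a hypothetical three-field schema for the credit pre-screening example. It separates errors (the input must not be scored) from warnings (the input can be scored, but the output should not be relied upon alone). The field names are illustrative, and the 3-year threshold is borrowed from the example system card.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    name: str
    type_: type
    required: bool = True
    minimum: Optional[float] = None


# Hypothetical schema; a real one would be generated from the system card.
INPUT_SCHEMA = [
    FieldSpec("annual_income", float, minimum=0.0),
    FieldSpec("credit_history_years", float, minimum=0.0),
    FieldSpec("employment_status", str, required=False),
]

# From the example system card: performance degrades below 3 years of history.
MIN_RELIABLE_HISTORY_YEARS = 3.0


def validate_input(record):
    """Return (errors, warnings) for one input record.

    Errors block scoring. Warnings mark inputs outside the validated range.
    Type checks are strict: an int is not accepted where a float is expected.
    """
    errors, warnings = [], []
    for spec in INPUT_SCHEMA:
        value = record.get(spec.name)
        if value is None:
            if spec.required:
                errors.append(f"{spec.name}: missing required field")
            continue
        if not isinstance(value, spec.type_):
            errors.append(f"{spec.name}: expected {spec.type_.__name__}")
            continue
        if spec.minimum is not None and value < spec.minimum:
            errors.append(f"{spec.name}: below minimum {spec.minimum}")

    history = record.get("credit_history_years")
    if isinstance(history, float) and 0 <= history < MIN_RELIABLE_HISTORY_YEARS:
        warnings.append("credit_history_years: below validated range")
    return errors, warnings
```

Calling `validate_input({"annual_income": 52000.0, "credit_history_years": 1.5})` yields no errors and one warning, which the caller can attach to the decision record so the reviewer sees the limitation alongside the output.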
Sub-Requirement 4: Human Oversight Measures (Article 13(3)(d))
Regulatory text (paraphrased): Instructions for use shall include the human oversight measures, including the technical measures put in place to facilitate the interpretation of the outputs of the AI system by the deployers.
Technical artifact: Decision Traces with Override Capability. This is not a document -- it is an architectural capability. The system must produce decision records that are interpretable by a human reviewer, and the system must support human override of automated decisions. A system that produces a risk score without any explanation of the factors that contributed to it does not satisfy this requirement, regardless of what the instructions-for-use document says. The technical artifact is the combination of structured decision traces (showing inputs, rules evaluated, and outcomes) and an override mechanism (allowing a human reviewer to accept, reject, or modify the automated decision with the override recorded in the audit trail).
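The combination described above can be sketched as two small record types: a decision trace and an override attached to it. Everything here is an assumption made for illustration, including the field names, the three action labels, and the choice to model an override as a new immutable copy of the trace rather than an in-place edit.

```python
import uuid
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone
from typing import Optional


def _now():
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class Override:
    reviewer_id: str
    action: str          # "accept", "reject", or "modify"
    final_outcome: str
    reason: str
    recorded_at: str = field(default_factory=_now)


@dataclass(frozen=True)
class DecisionTrace:
    inputs: dict
    rule_version: str
    factors: tuple       # human-readable factors that drove the outcome
    automated_outcome: str
    trace_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: str = field(default_factory=_now)
    override: Optional[Override] = None

    @property
    def final_outcome(self):
        """The reviewer's outcome if one exists, else the automated one."""
        return self.override.final_outcome if self.override else self.automated_outcome


VALID_ACTIONS = {"accept", "reject", "modify"}


def apply_override(trace, reviewer_id, action, final_outcome, reason):
    """Return a copy of the trace carrying the override; the input is unchanged."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown override action: {action}")
    if action == "accept" and final_outcome != trace.automated_outcome:
        raise ValueError("an 'accept' override must keep the automated outcome")
    return replace(trace, override=Override(reviewer_id, action, final_outcome, reason))


trace = DecisionTrace(
    inputs={"credit_history_years": 1.5},
    rule_version="3.2.0",
    factors=("thin credit file",),
    automated_outcome="risk_high",
)
reviewed = apply_override(
    trace, "uw-017", "modify", "risk_medium", "verified income offsets thin file"
)
```

Keeping the automated outcome and the reviewer's outcome as separate fields under the same `trace_id` means the audit trail shows both what the system proposed and what the human decided.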
Sub-Requirement 5: Decision Logic Documentation (Article 13 + Articles 9, 11)
Regulatory text (paraphrased): Read in conjunction with Article 9 (risk management) and Article 11 (technical documentation), Article 13 requires that the logic underlying automated decisions be documented in a form that supports post-hoc review and reconstruction.
Technical artifact: Versioned Policy Rules and Audit Logs. The decision logic must be externalized from the model and stored as named, versioned artifacts. Each version is an immutable snapshot. Decision records reference the specific rule version that was active when the decision was made. This combination -- versioned rules plus version-pinned decision records -- enables the post-hoc reconstruction that the Act requires: for any specific past decision, the exact inputs, the exact rules, and the exact outcome can be retrieved.
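A minimal sketch of version-pinned reconstruction follows. It assumes an in-memory store standing in for a real immutable version store, and a hypothetical threshold rule whose cut-offs mirror the 2% and 8% bands in the example system card. `RuleStore`, `evaluate`, and `reconstruct` are illustrative names, not part of any library.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class RuleVersion:
    name: str
    version: int
    thresholds: MappingProxyType  # read-only view of the rule parameters


class RuleStore:
    """In-memory stand-in for an immutable version store."""

    def __init__(self):
        self._versions = {}

    def publish(self, name, thresholds):
        """Store a new version; existing versions are never modified."""
        latest = max((v for n, v in self._versions if n == name), default=0)
        rule = RuleVersion(name, latest + 1, MappingProxyType(dict(thresholds)))
        self._versions[(name, rule.version)] = rule
        return rule

    def get(self, name, version):
        return self._versions[(name, version)]


def evaluate(rule, inputs):
    """Hypothetical rule: map an estimated default probability to a category."""
    p = inputs["default_probability"]
    if p < rule.thresholds["low_max"]:
        return "risk_low"
    if p <= rule.thresholds["medium_max"]:
        return "risk_medium"
    return "risk_high"


def reconstruct(store, record):
    """Re-run a past decision against the rule version pinned in its record."""
    rule = store.get(record["rule_name"], record["rule_version"])
    return evaluate(rule, record["inputs"]) == record["outcome"]


store = RuleStore()
v1 = store.publish("credit_risk", {"low_max": 0.02, "medium_max": 0.08})
inputs = {"default_probability": 0.05}
record = {
    "rule_name": "credit_risk",
    "rule_version": v1.version,
    "inputs": inputs,
    "outcome": evaluate(v1, inputs),
}
# A later rule change creates version 2 and leaves version 1 retrievable.
v2 = store.publish("credit_risk", {"low_max": 0.01, "medium_max": 0.04})
```

Here the stored record still reconstructs to `risk_medium` under version 1, even though the same inputs would produce `risk_high` under version 2. Without the version pin, a reviewer could not tell which of the two answers the applicant actually received.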
The Deployer vs. Provider Split
The EU AI Act distinguishes between two roles with different obligations: the provider (the organization that develops or places the AI system on the market) and the deployer (the organization that uses the AI system under its authority). Many SaaS companies occupy both roles simultaneously -- they are providers of their product and deployers of third-party AI models embedded within it.
Article 13's primary obligations fall on the provider. The provider must design the system to be transparent, produce the instructions for use, document performance characteristics, and implement human oversight mechanisms. The deployer's obligations (under Article 26) include using the system in accordance with the instructions for use, implementing the human oversight measures the provider specifies, monitoring the system's operation, and reporting serious incidents.
The practical complexity arises in three scenarios:
SaaS vendor embedding third-party models. A fintech company that embeds an OpenAI or Anthropic model in a credit pre-screening tool is simultaneously a deployer (of the foundation model) and a provider (of the credit pre-screening system). As a provider, the fintech company has full Article 13 obligations for the credit pre-screening system. The foundation model provider's documentation (model card, safety assessment) supports but does not replace the fintech company's own transparency obligations.
Platform providers enabling deployer customization. A platform that allows customers to configure AI-powered workflows (rules, thresholds, model parameters) must provide transparency documentation for the platform's capabilities, but the deployer is responsible for documenting the specific configuration they have applied. The line between "platform behavior" and "configuration behavior" must be clear in the system card.
Multi-vendor AI chains. A system that chains multiple AI services (one vendor for data enrichment, another for risk scoring, a third for decision recommendation) creates a shared transparency obligation. Each provider must document their component. The integrator -- typically the deployer who assembled the chain -- is responsible for documenting the chain's combined behavior, including how component outputs interact and how errors or limitations in one component propagate through the chain.
Gartner's compliance framework recommends that organizations map the provider-deployer split explicitly for every AI system in their portfolio, identifying which obligations they own as provider, which they own as deployer, and which are shared or ambiguous. This mapping exercise frequently reveals gaps -- obligations that no party has clearly accepted responsibility for.
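The mapping exercise lends itself to a simple automated check. The sketch below assumes the responsibility matrix is kept as a nested mapping of component to obligation to owner, and flags any obligation whose owner is missing or is not one of three accepted labels. The component and obligation names are hypothetical.

```python
VALID_OWNERS = {"provider", "deployer", "shared"}


def find_unowned(matrix):
    """Return sorted (component, obligation) pairs that lack a valid owner."""
    gaps = []
    for component, obligations in matrix.items():
        for obligation, owner in obligations.items():
            if owner not in VALID_OWNERS:
                gaps.append((component, obligation))
    return sorted(gaps)


# Illustrative matrix for a product that embeds a third-party model.
responsibility_matrix = {
    "foundation_model": {
        "performance_documentation": "provider",
        "incident_reporting": "deployer",
    },
    "credit_prescreening": {
        "instructions_for_use": "provider",
        "human_oversight": None,     # nobody has accepted this yet
        "log_retention": "unclear",  # not an accepted label, so it is a gap
    },
}
gaps = find_unowned(responsibility_matrix)
```

Treating any unrecognized label as a gap is deliberate: an entry such as "unclear" or "TBD" is exactly the ambiguous obligation the exercise is meant to surface.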
Three Most Common Compliance Gaps
Based on the requirements above, three compliance gaps appear most frequently in production AI systems.
Gap 1: Decision Logic Embedded in Prompts or Application Code
The most common gap. The rules governing automated decisions are embedded in LLM prompts, if-else chains in application code, or configuration files that are overwritten on each deployment. There is no version history. There is no immutable snapshot of the logic as it existed when a specific past decision was made. When asked "what rule governed this decision three months ago?", the team cannot answer with certainty because the rule may have been modified since then without a versioned record.
Remediation: Externalize decision logic into named, versioned rule artifacts stored in an immutable version store. Each decision record references the specific rule version that produced it. Rule changes create new versions rather than modifying existing ones.
Gap 2: No Structured Decision Traces
The system makes decisions but does not capture structured records of what inputs were evaluated, what rules were applied, and what the outcome was. The only records are application logs -- unstructured text entries that capture operational events but not decision rationale. Reconstructing a specific past decision from application logs is unreliable, time-consuming, and may be impossible if logs have been rotated or if the relevant events are spread across multiple services.
Remediation: Implement structured decision traces that capture, at decision time, the typed inputs evaluated, the rule version applied, the evaluation outcome, and a unique trace identifier. Store these traces in an append-only store with retention policies aligned to the Act's requirements. The Act does not specify a retention period for Article 13 artifacts, but Articles 19 and 26(6) require providers and deployers to keep the logs generated under Article 12 for a period appropriate to the intended purpose of the high-risk AI system, and for at least six months unless other applicable law provides otherwise.
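One lightweight way to prototype this remediation is a JSON Lines file opened in append mode. The sketch below is illustrative only: the file layout, the function names, and the 183-day constant (an approximation of the six-month floor mentioned above) are assumptions, and a local file does not by itself provide tamper-evidence or access control.

```python
import json
import os
import tempfile
import uuid
from datetime import datetime, timedelta, timezone

# Assumption: about six months, treated as a floor. The appropriate period
# for a given system is a legal determination, not a constant.
MIN_RETENTION = timedelta(days=183)


def append_trace(path, inputs, rule_version, outcome, now=None):
    """Append one structured trace as a JSON line and return its trace id."""
    now = now or datetime.now(timezone.utc)
    trace = {
        "trace_id": str(uuid.uuid4()),
        "recorded_at": now.isoformat(),
        "inputs": inputs,
        "rule_version": rule_version,
        "outcome": outcome,
    }
    # Mode "a" writes at the end of the file; this code path never
    # rewrites lines that are already there.
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(trace, sort_keys=True) + "\n")
    return trace["trace_id"]


def find_trace(path, trace_id):
    """Linear scan by trace id; a production store would index this."""
    with open(path, encoding="utf-8") as log:
        for line in log:
            trace = json.loads(line)
            if trace["trace_id"] == trace_id:
                return trace
    return None


def may_purge(trace, now=None):
    """True only once the trace is older than the minimum retention period."""
    now = now or datetime.now(timezone.utc)
    recorded = datetime.fromisoformat(trace["recorded_at"])
    return now - recorded > MIN_RETENTION


log_path = os.path.join(tempfile.mkdtemp(), "decision_traces.jsonl")
recorded_at = datetime(2026, 1, 1, tzinfo=timezone.utc)
trace_id = append_trace(
    log_path, {"credit_history_years": 1.5}, "3.2.0", "risk_high", now=recorded_at
)
stored = find_trace(log_path, trace_id)
```

The trace is written at decision time with its own identifier and rule version, so retrieving it later does not depend on correlating application log lines across services.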
Gap 3: System Documentation Is Static
The system has a documentation artifact -- perhaps a model card, a technical specification, or a compliance document -- but it was written at deployment time and has not been updated since. The system has changed: the model has been retrained, new features have been added, thresholds have been adjusted, new data sources have been integrated. The documentation describes the system as it was, not the system as it is.
Remediation: Tie documentation updates to the system change process. When a model is retrained, the performance metrics documentation is updated. When a rule is changed, the decision logic documentation is updated. When a new data source is integrated, the input specification is updated. The simplest enforcement mechanism is a deployment gate: the system cannot be deployed if the documentation version does not match the system version.
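The deployment gate can be as small as a version comparison run in the release pipeline. The following sketch assumes a hypothetical release manifest that records the system version, the system card version, and the model version the published metrics were measured on. How those values are collected is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseManifest:
    system_version: str
    system_card_version: str
    model_version: str
    metrics_model_version: str  # model version the documented metrics refer to


def deployment_gate(manifest):
    """Return blocking problems; an empty list means the deploy may proceed."""
    problems = []
    if manifest.system_card_version != manifest.system_version:
        problems.append(
            f"system card is {manifest.system_card_version}, "
            f"system is {manifest.system_version}"
        )
    if manifest.metrics_model_version != manifest.model_version:
        problems.append(
            f"metrics describe model {manifest.metrics_model_version}, "
            f"deploying model {manifest.model_version}"
        )
    return problems


stale = ReleaseManifest("3.3.0", "3.2.0", "m-2026-05", "m-2026-02")
current = ReleaseManifest("3.3.0", "3.3.0", "m-2026-05", "m-2026-05")
```

A pipeline step would fail the build whenever `deployment_gate(manifest)` returns a non-empty list, which makes updating the documentation part of shipping the change rather than a follow-up task.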
Article 13 Readiness Checklist
The following checklist maps Article 13 sub-requirements to engineering deliverables. Each item is binary: the artifact either exists in a current, maintained form, or it does not.
| Requirement | Technical Artifact | Key Properties | Status |
|---|---|---|---|
| System purpose and limitations (Art. 13(1)) | System Card | Version-controlled, updated on material changes, includes limitations and output interpretation | Ready / Not Ready |
| Performance characteristics (Art. 13(3)(b)) | Performance Metrics Documentation | Current accuracy/robustness metrics, disaggregated by relevant subgroups, updated on retraining | Ready / Not Ready |
| Input specifications (Art. 13(3)(b)(vi)) | Input Schema and Data Quality Requirements | Formal input specification, missing data behavior, quality thresholds, degradation conditions | Ready / Not Ready |
| Human oversight measures (Art. 13(3)(d)) | Decision Traces + Override Mechanism | Interpretable decision records, human override capability, override audit trail | Ready / Not Ready |
| Decision logic documentation (Art. 13 + Art. 9, 11) | Versioned Policy Rules + Audit Logs | Externalized rules, immutable version history, decision records pinned to rule versions | Ready / Not Ready |
| Risk management (Art. 9) | Living Risk Management Log | Ongoing risk register, updated at material changes, residual risk assessment after mitigations | Ready / Not Ready |
| Provider-deployer obligation mapping | Responsibility Matrix | Clear assignment of each obligation to provider, deployer, or shared, for every AI component | Ready / Not Ready |
Teams with three or more "Not Ready" items should expect significant engineering effort before the August 2026 compliance deadline. The structured decision traces and decision logic documentation (rows 4 and 5) typically require the most engineering work because they involve architectural changes, not just documentation.
From Checklist to Implementation
The readiness checklist identifies gaps. The implementation sequence determines the order in which to close them. For most teams, the recommended sequence is:
1. Risk classification assessment: Determine whether your system is high-risk under Annex III. If it is not, Article 13's obligations do not apply (though lighter transparency obligations under Article 50 may still apply). If classification is ambiguous, consult legal counsel and document your classification rationale.
2. Decision trace infrastructure: Implement structured decision traces before addressing documentation. Traces provide the raw material for every other artifact: they show what decisions the system makes, what inputs it evaluates, and what rules it applies. Without traces, the remaining documentation is aspirational rather than evidence-based.
3. Rule externalization and versioning: Extract decision logic from prompts and application code into versioned rule artifacts. This is typically the highest-effort work item and the one with the longest lead time.
4. Documentation artifacts: With traces and versioned rules in place, the system card, performance documentation, and input specification can be written from evidence rather than from memory.
5. Human oversight mechanism: Implement the override capability and link it to the decision trace store so overrides are recorded as part of the audit trail.
6. Responsibility matrix: Map provider-deployer obligations for every AI component and ensure each obligation has a clear owner.
The Engineering Case for Transparency Infrastructure
Article 13 compliance is a regulatory obligation, but the engineering infrastructure it requires -- system cards, decision traces, versioned rules, performance documentation -- is also the infrastructure that makes AI systems operationally reliable. Teams that build transparency infrastructure for compliance typically gain three secondary benefits:
Faster incident response. When something goes wrong, structured decision traces allow the team to reconstruct the specific decision in minutes rather than hours. The exact inputs, the exact rule version, and the exact outcome are immediately retrievable. There is no log archaeology.
Safer rule changes. Versioned rules with immutable snapshots enable staged rollout (draft, shadow, canary, active) and instant rollback. Rule changes become routine operations rather than high-risk production events.
Clearer organizational accountability. The responsibility matrix and system card make explicit who owns what. When a decision is questioned, the system card identifies the rule owner, the decision trace identifies the specific rule that fired, and the version history shows when and by whom the rule was last changed.
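The staged rollout and rollback described under "Safer rule changes" can be sketched as a small state machine over rule versions. The stage names come from the text. The `superseded` and `rolled_back` labels, the class name, and the single-step promotion policy are assumptions added for illustration.

```python
STAGES = ("draft", "shadow", "canary", "active")


class RuleRollout:
    """Tracks the rollout stage of each rule version and the active history."""

    def __init__(self):
        self.stage = {}            # version -> current stage label
        self.active_history = []   # versions that became active, oldest first

    def create(self, version):
        self.stage[version] = "draft"

    def promote(self, version):
        """Advance a version by exactly one stage and return the new stage."""
        current = self.stage[version]
        if current not in STAGES[:-1]:
            raise ValueError(f"cannot promote a version in stage {current!r}")
        next_stage = STAGES[STAGES.index(current) + 1]
        if next_stage == "active":
            if self.active_history:
                self.stage[self.active_history[-1]] = "superseded"
            self.active_history.append(version)
        self.stage[version] = next_stage
        return next_stage

    def rollback(self):
        """Retire the active version and reactivate the one before it."""
        if len(self.active_history) < 2:
            raise ValueError("no earlier active version to roll back to")
        retired = self.active_history.pop()
        self.stage[retired] = "rolled_back"
        restored = self.active_history[-1]
        self.stage[restored] = "active"
        return restored

    @property
    def active(self):
        return self.active_history[-1] if self.active_history else None


rollout = RuleRollout()
for version in ("1", "2"):
    rollout.create(version)
    while rollout.promote(version) != "active":
        pass
active_before_rollback = rollout.active
restored = rollout.rollback()
```

Rollback is instant in this model because the earlier version was never deleted or edited. Reactivating it is a pointer change, which only works if rule versions are immutable snapshots.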
The compliance deadline is the catalyst, but the value extends well beyond compliance. Organizations that treat Article 13 as an engineering project rather than a documentation exercise will find themselves with AI systems that are not just compliant, but genuinely more operable.
This article is an engineering interpretation of regulatory requirements and does not constitute legal advice. Organizations subject to the EU AI Act should consult qualified legal counsel to assess their specific compliance obligations. Regulatory requirements are subject to interpretation through EU AI Office guidance, national implementing measures, and enforcement decisions that may evolve over time.
