AI and Automation in ATM
Adoption of artificial intelligence and higher automation in ATM — EASA AI Levels L1/L2/L3, trustworthy AI building blocks, learning assurance, and human-AI teaming
Definition
AI and Automation in ATM covers the adoption of artificial intelligence and machine learning techniques in air traffic management systems, from algorithmic decision-support tools through to higher levels of autonomous ATM operation. The practical scope runs from currently deployed tools — Short-Term Conflict Alert (STCA), arrival managers (AMAN), and traffic demand prediction — to emerging ML-based separation assurance and eventually to limited autonomous operation in constrained environments.
The ICAO Global ATM Operational Concept (Doc 9854) establishes automation as a foundational design element: automated multi-radar tracking, flight plan correlation, and decision-support tools appear in the mandatory interoperability requirements. Doc 9854 section 3.4 specifies that surface-movement decision-support systems will be an integral part of the total ATM automation environment. The ICAO Human Factors Training Manual (Doc 9683) frames automation as a tool whose purpose is to aid the human operator, who retains the responsibility for management and direction of the overall system.
The AI/ML layer goes beyond classical rule-based automation. The European Union Aviation Safety Agency (EASA) has developed the primary civil aviation regulatory framework for machine learning, anchored in the EASA AI Roadmap 2.0 (2023) and the Concept Paper on Guidance for Level 1 and Level 2 Machine Learning Applications (Issue 2, 2024).
Regulatory Basis
ICAO foundations
Doc 9854 (Global ATM Operational Concept) establishes the design principle that ATM systems must be human-centred and that automation serves to augment the controller rather than replace human judgment. The interoperability requirement for "automation and human/machine interface" is explicit: a minimum level of interoperability must be defined to ensure smooth traffic flow. Automated functions listed as normative include multi-radar tracking, correlation of radar track and flight plan, and automated sector-to-sector coordination.
Doc 9683 (Human Factors Training Manual) addresses ATC automation in Chapter 3 (§3.3.4 to §3.3.9) and in the dedicated Chapter 5, §5.3 "Automation in Air Traffic Control". It establishes that future ATC architectures will use automated conflict detection and resolution tools for routine separation, with controller intervention for exceptions (§3.3.4). It recognises "cooperative human-machine architecture" as the design goal (§3.3.9), where automation continuously conveys its difficulty level to the supervisor. It also states explicitly that automation must be considered a tool or resource, capable of learning and acting independently on a task but directed by the human (§3.3.8).
PANS-ATM (Doc 4444), §4.13.3, requires ATC automation systems to present data in accordance with Human Factors principles and in a timely manner. §8.1.3 states that ATS surveillance systems should integrate with other automated systems to reduce controller workload and the need for verbal coordination. §15.7.2 codifies procedures for STCA, the most widely implemented automation safety net, and defines its objective as assisting the controller in preventing collision by timely alerting of potential separation minima infringement.
EASA AI framework
EASA AI Roadmap 2.0 (2023) defines the three-level AI classification for civil aviation and sets out the phased development of regulatory material through to Level 3 advanced automation.
EASA Concept Paper "Guidance for Level 1 and Level 2 Machine Learning Applications" (Issue 2, 2024) is the principal regulatory document for ML in civil aviation. It defines:
Level 1 (Assistance): ML outputs are information to the human, who retains the decision. The human can detect and override incorrect output.
Level 2A (Human-AI Cooperation): ML output directly triggers an action, but the human can monitor and override in adequate time before the action has effect.
Level 2B (Human-AI Teaming): ML output directly triggers an action and the human cannot effectively override in real time. The human monitors outcomes rather than individual actions.
Level 3 (Autonomous AI): ML model operates without human intervention in the loop. Corresponds to advanced automation in the ICAO sense.
The five trustworthy-AI building blocks required for any ML application are: learning assurance (W-shaped process), explainability and human-AI interface, safety risk mitigation, data governance, and ethics/governance.
EU AI Act
Regulation (EU) 2024/1689 (the EU AI Act), which entered into force on 1 August 2024 and applies in stages, classifies aviation safety systems as high-risk AI. High-risk AI must satisfy requirements for data governance, transparency, accuracy, robustness, and human oversight before market placement. ATM ML applications approved under EASA rules must additionally satisfy the EU AI Act for European operations.
FAA
The FAA Roadmap for Artificial Intelligence Safety Assurance (2024) extends equivalent trustworthiness concepts to US aviation. FAA Order 8040.4B on safety risk management applies to any ATM system change, including AI-enabled systems.
Operational Meaning
AI and automation in ATM follows a maturity progression from classical algorithmic tools to learned behaviour:
Currently deployed (L1 / classical decision support): STCA and MTCD (Medium-Term Conflict Detection) flag separation conflicts for controller action. AMAN computes arrival sequences and assigns metering times. These are rule-based algorithms but establish the human-machine teaming pattern against which ML applications are assessed.
Emerging ML applications (L1 and early L2A): ML-based traffic demand prediction, AMAN sequence optimisation using learned traffic patterns, taxiway routing recommendations, and runway configuration selection support. The controller retains decision authority; ML outputs are advisory.
Medium-term (L2B): ML-assisted separation provision in low-density oceanic sectors, automated demand-capacity balancing that proposes network restrictions for human endorsement, and ML-optimised AMAN in high-complexity terminal environments.
Long-term (L3): Highly automated ATM for specific constrained task types such as pre-departure de-confliction in fully trajectory-based environments and automated ground movement in defined surface areas, with a human supervisor monitoring exception states.
Framework Structure
EASA AI Level Classification
The four tiers of the three-level classification (Level 1, Level 2A, Level 2B, and Level 3) map onto current and planned ATM capabilities:
Level 1 includes all current advisory tools (STCA, MTCD, AMAN, demand prediction displays) where the human always decides.
Level 2A covers systems where automated action follows an ML recommendation unless the human overrides within a defined time window, for example automated AMAN sequencing or sector configuration selection.
Level 2B applies where the human monitors system outcomes without overriding individual actions, for example in oceanic automated separation provision or automated de-confliction of pre-departure flows.
Level 3 applies to fully automated operation in bounded environments, requiring the highest level of safety evidence and regulatory scrutiny.
Trustworthy-AI building blocks
EASA defines five building blocks for any ML application:
Learning assurance (W-shaped process): the structured lifecycle from requirements through data management, training, verification, and system integration, with bidirectional feedback loops at each stage.
Explainability and human-AI interface: model outputs must be interpretable in the operational context; interface design must prevent automation bias and automation over-rejection.
Safety risk mitigation: threat analysis of ML-specific failure modes, including distributional shift, dataset bias, adversarial inputs, and concept drift over operational lifetime.
Data governance: data quality, representativeness, provenance, labelling integrity, and production monitoring for drift.
Ethics and governance: fairness, accountability, and compliance with the EU AI Act for European deployments.
EUROCONTROL and SESAR 3 JU
EUROCONTROL's FLY AI report and the SESAR 3 JU / Digital European Sky programme are the primary implementation vehicles in Europe. SESAR 3 projects demonstrate ML applications in areas including AMAN, demand-capacity balancing, and automated sector configuration, generating the operational performance data needed for EASA approval cases.
External Sources
- https://www.easa.europa.eu/en/document-library/general-publications/easa-artificial-intelligence-roadmap-20 - EASA AI Roadmap 2.0 (2023)
- https://www.easa.europa.eu/en/document-library/general-publications/concept-paper-guidance-level-1-2-machine-learning-applications - EASA Concept Paper Issue 2 (2024)
- https://www.eurocontrol.int/publication/fly-ai-report - EUROCONTROL FLY AI Report
- https://www.faa.gov/aircraft/air_cert/design_approvals/aiassurance - FAA AI Safety Assurance Roadmap (2024)
- https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689 - EU AI Act, Regulation (EU) 2024/1689
- https://www.sesarju.eu/projects - SESAR 3 JU project catalogue
References
Doc 9854 (Global ATM Operational Concept), Chapter 1, §1.6.2 — greater access to decision-support information for airspace users as a key ATM concept objective.
Doc 9854, Appendix F, interoperability requirements, item n — "automation and human/machine interface: a minimum level of interoperability should be defined to ensure the smooth flow of traffic".
Doc 9854, Appendix F, interoperability requirements, item o — automated functions including multi-radar tracking, flight plan correlation, flight progress strip distribution, and automated sector coordination.
Doc 9854, Chapter 3, §3.4 — surface-movement decision-support systems as an integral part of the total ATM automation environment.
Doc 9683 (Human Factors Training Manual), Chapter 3, §3.3.4 — future ATC architectures using automated conflict detection/resolution tools; controller as exception manager.
Doc 9683, Chapter 3, §3.3.8 — definition of automation as a tool or resource with capacity to learn; human retains management and direction responsibility.
Doc 9683, Chapter 3, §3.3.9 — "cooperative human-machine architecture" as the design goal for advanced ATC automation.
Doc 9683, Chapter 5, §5.3 — dedicated section on "Automation in Air Traffic Control": advisory vs. autonomous roles, workload effects, team function changes.
Doc 4444 (PANS-ATM), Chapter 4, §4.13.3 — ATC automation systems must present data in accordance with Human Factors principles.
Doc 4444, Chapter 8, §8.1.3 — ATS surveillance systems integration with other automated systems to reduce controller workload and verbal coordination.
Doc 4444, Chapter 15, §15.7.2 — Short-Term Conflict Alert (STCA) procedures as a codified ATM safety-net automation function (authoritative source — not in local library for EASA supplements; PANS-ATM clause verified in local library).
EASA AI Roadmap 2.0 (2023) — three-level AI classification and phased regulatory development programme (authoritative source — not in local library).
EASA Concept Paper "Guidance for Level 1 and Level 2 Machine Learning Applications", Issue 2 (2024) — principal regulatory framework for ML in civil aviation: Level definitions, W-shaped learning assurance, trustworthy-AI building blocks (authoritative source — not in local library).
Regulation (EU) 2024/1689 (EU AI Act) — high-risk AI classification for aviation safety systems; human oversight and conformity assessment requirements (authoritative source — not in local library).
Related topics
Detailed working notes on the adoption of artificial intelligence and
higher automation in air traffic management. This folder expands the
summary in topics/ai_atm.md into per-aspect files so each can be read
independently.
Files in this folder
- overview.md: what AI/ML in ATM is, where it sits in the ICAO/EASA framework, and why the concept matters for global ATM modernisation.
- components.md: the trustworthy-AI building blocks: learning assurance, explainability, safety risk mitigation, data governance, and ethics/governance.
- blocks.md: the EASA AI Level classification (L1/L2A/L2B/L3) as the maturity stages; mermaid flow L1 to L3; certification implications.
- threads.md: the functional axes that together constitute AI/ATM: learning assurance, human-AI teaming, data governance, V&V, operational use-cases, and oversight.
- modules.md: anatomy of approving one ML application via the W-shaped learning assurance lifecycle; worked example using ML-based AMAN / traffic-prediction.
- enablers.md: CNS infrastructure, data pipelines, training/licensing, certification frameworks, and institutional arrangements required for AI in ATM.
- performance_objectives.md: KPA-keyed performance matrix and KPIs for AI-enabled ATM operations.
- timeline.md: historical evolution: EASA AI Roadmap 1.0 (2020), Concept Paper issues, Roadmap 2.0 (2023), EU AI Act (2024), first approvals and SESAR milestones.
- references.md: consolidated ICAO and authoritative external references for all content in this folder.
Reading order
Start with overview.md for context, then blocks.md for the EASA AI
Level framework. components.md details the trustworthy-AI building
blocks. threads.md maps the functional axes. modules.md works through
a concrete approval example. Use enablers.md for deployment planning,
performance_objectives.md for KPA/KPI mapping, and timeline.md for
date context. references.md is the citation master list.
Source basis
Content is grounded in:
- ICAO Doc 9854 (Global ATM Operational Concept), 1st edition (2005).
- ICAO Doc 9683 (Human Factors Training Manual).
- ICAO Doc 4444 (PANS-ATM), Chapters 4, 8, and 15.
- EASA AI Roadmap 2.0 (2023).
- EASA Concept Paper "Guidance for Level 1 and Level 2 Machine Learning Applications", Issue 2 (2024).
- EUROCONTROL FLY AI Report.
- FAA Roadmap for AI Safety Assurance (2024).
- Regulation (EU) 2024/1689 (EU AI Act).
What it is
AI and Automation in ATM refers to the use of artificial intelligence, machine learning, and higher levels of process automation in air traffic management — from current decision-support tools that advise controllers to emerging systems where ML models act autonomously in constrained ATM environments.
The scope is broad:
- Classical rule-based automation: Short-Term Conflict Alert (STCA), Medium-Term Conflict Detection (MTCD), Arrival Manager (AMAN), Departure Manager (DMAN), A-SMGCS routing, ATFM demand-capacity balancing. These are algorithmic tools, not ML, but they establish the human-machine teaming patterns against which ML applications are evaluated.
- Level 1 ML applications: ML models whose outputs are advisory — the controller uses them as information but always retains the decision. Traffic demand prediction, AMAN sequence optimisation using learned models, and weather impact assessment are typical examples.
- Level 2A/2B ML applications: ML outputs that trigger automated actions, either with human override (2A) or with human monitoring of outcome rather than individual action (2B). Automated oceanic separation provision and automated sector configuration selection are examples currently being researched and demonstrated.
- Level 3 advanced automation: fully autonomous operation in bounded environments. Regulatory frameworks are still under development by EASA.
Where it sits in the ICAO/EASA framework
The ICAO layer is set by Doc 9854 (Global ATM Operational Concept, 2005), which establishes that automation serves to augment the controller and that all ATM systems must be human-centred. Doc 9683 (Human Factors Training Manual), Chapter 5, §5.3, which is dedicated to ATC automation, defines the human-as-supervisor, machine-as-tool model and identifies the double bind of management-by-exception.
PANS-ATM (Doc 4444) §15.7.2 codifies STCA as the most widely deployed safety-net automation and defines the controller's obligation to assess and act on every alert. This establishes the normative precedent for AI output in the safety chain: the controller is responsible for acting on system output, and must be capable of independent assessment.
The EASA layer, which is the primary regulatory framework for ML in civil aviation, is provided by:
- EASA AI Roadmap 2.0 (2023): defines the three-level classification, the W-shaped learning assurance lifecycle, and the phased regulatory programme.
- EASA Concept Paper "Guidance for Level 1 and Level 2 Machine Learning Applications" (Issue 2, 2024): the principal pre-regulatory guidance document. States and ANSPs developing ML-based ATM tools use this to structure their safety cases.
Above EASA, the EU AI Act (Regulation (EU) 2024/1689, 2024) classifies aviation safety systems as high-risk AI and mandates conformity assessment before market placement for European operations.
The FAA Roadmap for Artificial Intelligence Safety Assurance (2024) is the parallel US framework, closely aligned with EASA concepts.
Why it matters for global ATM modernisation
AI and ML are entering ATM at a moment when three pressures converge:
- Traffic growth: ICAO projects global traffic volume to double by the mid-2030s. Existing controller staffing models and rule-based tools cannot absorb that growth without structural automation support.
- Complexity: Full trajectory-based operations (TBO Block 2 from 2025, Block 3 from 2031) depend on automation for trajectory negotiation, conformance monitoring, and network demand-capacity balancing.
- Data richness: SWIM, ADS-B, ADS-C, and FF-ICE together create the data infrastructure on which ML models can be trained and operated.
At the same time, aviation is safety-critical, certified, and operated by licensed personnel. AI adoption is gated by regulatory approval, not by algorithmic performance alone. The EASA AI Level framework is the principal gate.
Human oversight: the non-negotiable principle
Across all current regulatory frameworks — ICAO, EASA, FAA — the human operator retains ultimate responsibility for safety-critical outcomes. Doc 9683 §3.3.8 states this explicitly: "Automation should be considered to be a tool or resource... the human operator retains the responsibility for management and direction of the overall system." EASA Level 3 guidance (in preparation) will address the narrow conditions under which that principle is relaxed, but even then bounded human supervisory oversight will be required.
References
- Doc 9854 (Global ATM Operational Concept), Chapter 1, §1.6.2 — decision-support access as a key ATM concept objective.
- Doc 9854, Appendix F, items n and o — normative automation and human/machine interface interoperability requirements.
- Doc 9683 (Human Factors Training Manual), Chapter 3, §3.3.4–§3.3.8 — automation in advanced aviation systems: human-machine roles, management by exception, cooperative architecture.
- Doc 9683, Chapter 5, §5.3 — Automation in Air Traffic Control.
- Doc 4444 (PANS-ATM), Chapter 15, §15.7.2 — STCA as a codified safety-net automation function.
- EASA AI Roadmap 2.0 (2023) — three-level classification and regulatory programme (authoritative source — not in local library).
- EASA Concept Paper Issue 2 (2024) — Level definitions and trustworthy-AI building blocks (authoritative source — not in local library).
- Regulation (EU) 2024/1689 (EU AI Act, 2024) — high-risk AI classification and conformity assessment (authoritative source — not in local library).
- FAA Roadmap for Artificial Intelligence Safety Assurance (2024) — US parallel framework (authoritative source — not in local library).
The trustworthy-AI building blocks
EASA identifies five trustworthy-AI building blocks that any civil aviation ML application approval must address. Each block requires evidence in the safety case.
The five blocks, which together form the core of any ML approval, are:
- Learning assurance
- Explainability and human-AI interface
- Safety risk mitigation
- Data governance
- Ethics and governance
1. Learning assurance
Learning assurance is the process by which the ML development lifecycle is structured to provide confidence that the ML model performs as intended across the operational envelope. It is analogous to development assurance for classical software, but adapted to the non-deterministic and data-driven nature of ML.
The W-shaped learning assurance process (defined in the EASA Concept Paper) has two symmetric V-processes joined at the centre:
Left V: requirements and design
- ML requirements (what the model must achieve, under what conditions).
- Data requirements (what training data is needed, what is out of scope).
- Model architecture selection.
Centre: training and validation
- Dataset preparation, cleaning, and annotation.
- Model training and hyperparameter selection.
- Internal validation against held-out data.
Right V: integration and deployment assurance
- Operational performance evaluation.
- System integration testing (model within the ATM system context).
- Deployment monitoring and change management.
Bidirectional feedback at each stage means that failures found in the right V drive design and data corrections in the left V — the W shape.
Specific ML failure modes that learning assurance must address:
- Distributional shift: model performance degrades when operational traffic patterns differ from the training distribution (e.g., new aircraft types, new routes, post-pandemic traffic patterns).
- Overfitting: model learns artefacts of the training dataset rather than generalising to operational conditions.
- Concept drift: the underlying relationship between inputs and outputs changes over time (e.g., new separation standards, airspace redesign).
- Shortcut learning: model exploits spurious correlations in training data that do not hold in all operational conditions.
2. Explainability and human-AI interface
Explainability covers two distinct requirements:
Technical explainability: the ability to understand why the model produces a given output — important for certification evidence, debugging, and incident investigation.
Operational explainability: the ability of the controller or ATFM manager to understand the model output well enough to apply appropriate reliance — neither accepting every recommendation blindly (automation bias) nor rejecting outputs reflexively (automation over-rejection).
The human-AI interface must be designed so that:
- The controller can detect incorrect or unusual model outputs.
- The confidence or uncertainty of the model output is conveyed in a form that supports appropriate trust calibration.
- The controller's ability to maintain manual skills is preserved (mitigation of skill degradation under high automation).
Doc 9683 §5.3 identifies automation bias and complacency as longstanding risks in ATC automation. ML systems amplify these risks because the model basis for an output is typically less transparent than a rule-based algorithm.
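One way to check whether a displayed confidence value supports trust calibration is to compare stated confidence with the observed hit rate. The sketch below computes an expected calibration error (ECE) in plain Python. It is an illustration only: the function name, the ten-bin default, and the idea of using ECE as interface evidence are assumptions made here, not something prescribed by EASA or Doc 9683.

```python
def expected_calibration_error(confidences, outcomes, n_bins=10):
    """Weighted mean gap between stated confidence and observed hit rate.

    confidences: model confidence per advisory, each in [0, 1].
    outcomes:    1 if the advisory turned out correct, else 0.
    Both names and the binning scheme are assumptions for this sketch.
    """
    if len(confidences) != len(outcomes) or not confidences:
        raise ValueError("need equal-length, non-empty inputs")

    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, outcomes):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[idx].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        hit_rate = sum(h for _, h in members) / len(members)
        # Each bin contributes its confidence/hit-rate gap, weighted by size.
        ece += (len(members) / total) * abs(mean_conf - hit_rate)
    return ece
```

A value near zero suggests that a stated 90 % confidence is right about nine times in ten; a large value suggests the display would invite either over-reliance or reflexive rejection.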
3. Safety risk mitigation
ML applications introduce failure modes that differ from classical ATM software:
- Adversarial inputs: deliberate manipulation of input data to cause incorrect model output. Relevant for ML models consuming surveillance data or SWIM feeds.
- Sensor/data failure mode propagation: ML models that aggregate many input streams may degrade gracefully or catastrophically depending on how failure modes propagate through the model.
- Tail events: rare but safety-critical scenarios (for example, closely spaced simultaneous operations in degraded surveillance) may be underrepresented in the training set, causing the model to perform poorly precisely where performance matters most.
The safety risk mitigation building block requires:
- Formal hazard identification specific to ML failure modes.
- Mitigation measures (input monitoring, output monitoring, performance bounds, fallback to classical automation).
- Safety objectives allocated to the ML component.
- Operational suitability demonstrations.
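The mitigation measures listed above (input monitoring, output monitoring, performance bounds, fallback to classical automation) can be sketched as a wrapper around the ML component. Everything here is hypothetical: `guarded_predict`, the bound tuples, and the two callables stand in for whatever a real system would use, and a real safety net would need far more than range checks.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class GuardedAdvisory:
    value: float
    source: str  # "ml" or "fallback"


def guarded_predict(
    features: Sequence[float],
    ml_model: Callable[[Sequence[float]], float],
    fallback: Callable[[Sequence[float]], float],
    input_bounds: Sequence[Tuple[float, float]],
    output_bounds: Tuple[float, float],
) -> GuardedAdvisory:
    """Return the ML output only when inputs and output stay inside bounds."""
    # Input monitoring: every feature must lie inside its validated range.
    # NaN fails both comparisons, so it is treated as out of envelope.
    in_envelope = len(features) == len(input_bounds) and all(
        lo <= x <= hi for x, (lo, hi) in zip(features, input_bounds)
    )
    if not in_envelope:
        return GuardedAdvisory(fallback(features), "fallback")

    # Output monitoring: reject predictions outside the performance bounds.
    prediction = ml_model(features)
    lo, hi = output_bounds
    if not (lo <= prediction <= hi):
        return GuardedAdvisory(fallback(features), "fallback")

    return GuardedAdvisory(prediction, "ml")
```

The design choice illustrated is that the classical (rule-based) function is always available and is selected by simple, auditable checks rather than by the ML model itself.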
4. Data governance
Data is to an ML model what code is to classical software. Data governance covers:
Data quality: accuracy, completeness, timeliness, and consistency of training data. For ATM, this means radar track data, flight plan data, SWIM feeds, weather data, and operational annotations (labels).
Data representativeness: the training dataset must cover the full operational envelope, including edge cases and rare scenarios. Gaps in coverage are safety hazards.
Data provenance and integrity: chain of custody for training data, version control, and protection against contamination.
Production monitoring: continuous comparison of operational input distributions against training distributions, with alerting when drift reaches defined thresholds.
Annotation quality: for supervised learning, the quality of human labels or ground-truth references determines the ceiling of model performance. Label audits are a required activity.
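Production monitoring, as described above, compares operational input distributions with the training distribution and alerts at a defined threshold. A minimal sketch using the Population Stability Index (PSI) for one numeric feature follows. PSI is one metric among several that could serve here; the bin edges, the `eps` floor, and the alert threshold of 0.25 are illustrative assumptions, not values taken from any EASA or ICAO document.

```python
import math


def population_stability_index(reference, live, edges, eps=1e-6):
    """PSI between a reference (training) sample and a live sample.

    edges: ascending bin boundaries; values below the first edge fall in
    bin 0, values at or above the last edge fall in the final bin.
    """
    if not reference or not live:
        raise ValueError("both samples must be non-empty")

    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(1 for e in edges if v >= e)] += 1
        # Floor each share at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref, cur = shares(reference), shares(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


DRIFT_ALERT_THRESHOLD = 0.25  # hypothetical value for this sketch


def drift_alert(reference, live, edges):
    """True when the live sample has drifted past the alert threshold."""
    return population_stability_index(reference, live, edges) > DRIFT_ALERT_THRESHOLD
```

For example, `reference` could hold sector entry counts from the training period and `live` the same feature over the last operational week; an alert would then trigger a review rather than an automatic retrain.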
5. Ethics and governance
The ethics and governance building block covers:
Fairness: ensuring that the ML model does not systematically disadvantage specific user groups, aircraft types, or regions due to biased training data or model design.
Accountability: clear assignment of responsibility for model outputs across the certification hierarchy (developer, integrator, ANSP, regulator).
Human rights and privacy: ML models consuming surveillance data or flight data must comply with applicable privacy and data protection regulations.
EU AI Act compliance: for European operations, all five building blocks must be documented in the technical documentation required under Article 11 of Regulation (EU) 2024/1689 for high-risk AI systems.
Relationship between building blocks
The five building blocks are not independent. They interact:
- Learning assurance defines the lifecycle; data governance controls the quality of the data flowing through that lifecycle.
- Explainability informs safety risk mitigation by identifying which failure modes are detectable by the human operator.
- Ethics/governance sets the regulatory envelope within which learning assurance and safety risk mitigation must operate.
A trustworthiness analysis bringing together all five blocks is the evidence base for the EASA-approval safety case.
References
- EASA Concept Paper "Guidance for Level 1 and Level 2 Machine Learning Applications", Issue 2 (2024) — five trustworthy-AI building blocks, W-shaped learning assurance process, explainability requirements (authoritative source — not in local library).
- EASA AI Roadmap 2.0 (2023) — building block programme structure and phased development (authoritative source — not in local library).
- Doc 9683 (Human Factors Training Manual), Chapter 5, §5.3 — automation bias, complacency, and skill degradation in ATC automation contexts.
- Doc 9683, Chapter 3, §3.3.8 — automation as a tool; human oversight principle.
- Doc 4444 (PANS-ATM), Chapter 4, §4.13.3 — Human Factors principles for ATC automation data presentation.
- Regulation (EU) 2024/1689 (EU AI Act), Articles 9–15 — risk management, data governance, technical documentation, transparency, and human oversight requirements for high-risk AI (authoritative source — not in local library).
What a Level is
In the EASA AI framework, a Level is a classification of the degree of human involvement in the ML-to-action chain. It is not a maturity ranking in the sense that L3 is always better than L1; it is a characterisation of the relationship between the ML output and the safety-critical action, which determines the stringency of the required certification evidence.
The Level determines:
- The required depth of the safety case.
- The required characteristics of the human-AI interface.
- The required robustness of learning assurance.
- The applicable oversight regime under EASA and the EU AI Act.
The four Levels
| Level | Label | Human role | Action mechanism | Example ATM application |
|---|---|---|---|---|
| L1 | Assistance | Decides and acts | ML output is information only | AMAN sequence prediction, demand forecast, conflict risk display |
| L2A | Human-AI Cooperation | Can override in time | Automated action, human override within defined window | Automated AMAN sequencing with override capability |
| L2B | Human-AI Teaming | Monitors outcomes | Automated action, human cannot override in real time | Oceanic automated separation provision with supervisor monitoring |
| L3 | Autonomous AI | Supervisory only | System operates without human in loop | Automated ground movement de-confliction in bounded surface areas |
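The table can be read as a small decision procedure over the relationship between ML output and action. The sketch below encodes that reading. Reducing the classification to three boolean questions is a simplification made for illustration; the parameter names are hypothetical, and an actual Level assignment rests on a full operational assessment.

```python
from enum import Enum


class AILevel(Enum):
    L1 = "Assistance"
    L2A = "Human-AI Cooperation"
    L2B = "Human-AI Teaming"
    L3 = "Autonomous AI"


def classify_level(triggers_action: bool,
                   human_can_override_in_time: bool,
                   human_in_loop: bool) -> AILevel:
    """Map the human/ML relationship onto the Levels in the table above."""
    if not triggers_action:
        return AILevel.L1   # ML output is information only
    if not human_in_loop:
        return AILevel.L3   # system operates without a human in the loop
    if human_can_override_in_time:
        return AILevel.L2A  # automated action with an override window
    return AILevel.L2B      # automated action, human monitors outcomes


# An automated action that the supervisor cannot override in real time:
print(classify_level(True, False, True).value)  # Human-AI Teaming
```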
Level 1 — Assistance (currently deployed)
Level 1 is the current state of AI/ML deployment in ATM. The ML model provides advisory output that the controller sees as information. The controller exercises independent judgment in deciding whether to act on the output and what action to take.
The human is expected to be able to:
- Detect incorrect or unusual ML outputs.
- Understand the basis for the output sufficiently to apply or reject it.
- Act correctly even if the ML output is wrong.
Level 1 applications require the lowest level of certification evidence among ML applications, but must still satisfy learning assurance, data governance, and interface design requirements.
Representative Level 1 applications in ATM:
- ML-based traffic demand prediction: predicting sector load 30-90 minutes ahead from historical patterns and current flight plans. Controller uses as planning information; no automated action.
- AMAN sequence optimisation: ML model computes an optimised arrival sequence; AMAN displays it; controller approves or modifies.
- Conflict risk estimation: ML model assigns a risk score to trajectory pairs; displayed as colour-coded overlay; controller investigates.
- Runway configuration recommendation: ML model suggests optimal runway configuration based on predicted demand and wind; ATC management approves.
Level 2A — Human-AI Cooperation
Level 2A introduces automated action: the ML output directly triggers a system state change (a recommendation is executed automatically). However, the human can monitor the action before it has operational effect and override it within a defined time window.
Certification requirements are significantly higher than Level 1:
- The override mechanism must be reliable and the time window must be sufficient for the human to detect, assess, and act.
- The interface must convey pending automated actions with sufficient clarity for timely human assessment.
- The model's performance bounds must be established so that the human can assess the risk of accepting without override.
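The override requirement can be pictured as a pending action that takes effect only once a defined window has closed without human intervention. The following is a minimal sketch under that reading; the class name, the status strings, and the use of plain float timestamps are assumptions for illustration, not part of any EASA specification.

```python
from dataclasses import dataclass


@dataclass
class PendingAction:
    description: str
    proposed_at: float        # seconds on any monotonic clock
    override_window_s: float  # how long the human has to intervene
    overridden: bool = False

    def status(self, now: float) -> str:
        """'overridden', 'executed' (window elapsed), or 'pending'."""
        if self.overridden:
            return "overridden"
        if now >= self.proposed_at + self.override_window_s:
            return "executed"
        return "pending"

    def override(self, now: float) -> bool:
        """Human override; accepted only while the action is still pending."""
        if self.status(now) != "pending":
            return False
        self.overridden = True
        return True


action = PendingAction("apply AMAN sequence", proposed_at=0.0,
                       override_window_s=30.0)
print(action.status(10.0))    # pending
print(action.override(10.0))  # True
print(action.status(60.0))    # overridden
```

Time-window adequacy evidence would then amount to showing that `override_window_s` is long enough for the controller to detect, assess, and act, which this sketch does not attempt to model.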
Representative Level 2A applications:
- Automated AMAN sequencing: ML-optimised sequence is automatically applied to the AMAN tool; controller can modify within a defined window before the sequence becomes operationally binding.
- Automated sector configuration selection: ML recommends and applies a sector grouping based on predicted load; ATFM manager can override within the planning horizon.
- Automated speed advisory uplink: ML-computed optimal speed is automatically transmitted to aircraft via datalink; pilot can decline.
Level 2B — Human-AI Teaming
Level 2B applies where automated action is taken and the human cannot effectively override each action in real time. The human monitors outcomes and maintains supervisory situational awareness rather than approving individual outputs.
Certification requirements are very high:
- The model must demonstrate high reliability in the operational envelope.
- Comprehensive monitoring must detect model degradation.
- Fallback to lower automation level must be fast and reliable.
- Human supervisory responsibilities must be explicitly specified.
Representative Level 2B applications (largely in research/demonstration phase as of 2026):
- Automated oceanic separation provision: in low-density oceanic track systems, the ATM automation maintains separation based on ADS-C conformance monitoring; the human supervisor monitors but does not approve each separation action.
- Automated demand-capacity balancing: ML-driven network restriction proposals are automatically applied with the human reviewing aggregate outcomes at defined intervals rather than individual decisions.
Level 3 — Autonomous AI
Level 3 applies to systems operating without human intervention in the control loop. The human role is supervisory: monitoring system state and intervening only for exceptions or degraded modes.
Regulatory frameworks for Level 3 are under development by EASA as of 2026. The EASA AI Roadmap 2.0 flags Level 3 guidance as a longer-term activity. No civil aviation ATM applications have been certificated at Level 3 as of this review date.
Candidate Level 3 applications (concept level only):
- Automated surface movement management in bounded, segregated surface areas.
- Automated de-confliction of pre-departure flows in fully trajectory-based environments.
- Remote Virtual Tower operation with AI-based situational awareness and alerting in low-traffic environments.
Certification implications by Level
The EASA framework implies a staircase of evidence:
Level 1: Safety case focused on interface design, output monitoring, and human factors. Learning assurance documentation. Data governance plan. Explainability evidence.
Level 2A: All Level 1 requirements plus: override mechanism design and testing, time-window adequacy evidence, automated action monitoring, and higher-stringency learning assurance (DAL-equivalent assignment).
Level 2B: All Level 2A requirements plus: comprehensive model reliability demonstration, fallback system design and testing, supervisory task analysis, and conformity with EU AI Act for European operations.
Level 3: All Level 2B requirements plus: full autonomous operation safety case, operational suitability demonstration in the bounded environment, and EASA-specific Level 3 guidance (pending as of 2026).
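Because each Level inherits everything required at the Level below it, the staircase can be expressed as a cumulative lookup. The sketch below is illustrative: the evidence strings paraphrase the list above, and the names `ADDED_EVIDENCE` and `required_evidence` are hypothetical, not EASA terminology.

```python
LEVEL_ORDER = ["1", "2A", "2B", "3"]

# Evidence items introduced at each Level, paraphrased from the list above.
ADDED_EVIDENCE = {
    "1": [
        "safety case (interface design, output monitoring, human factors)",
        "learning assurance documentation",
        "data governance plan",
        "explainability evidence",
    ],
    "2A": [
        "override mechanism design and testing",
        "time-window adequacy evidence",
        "automated action monitoring",
        "higher-stringency learning assurance",
    ],
    "2B": [
        "model reliability demonstration",
        "fallback system design and testing",
        "supervisory task analysis",
        "EU AI Act conformity (European operations)",
    ],
    "3": [
        "full autonomous operation safety case",
        "operational suitability demonstration in the bounded environment",
        "EASA Level 3 guidance (pending)",
    ],
}


def required_evidence(level: str) -> list[str]:
    """Return all evidence items for a Level, including those inherited from lower Levels."""
    idx = LEVEL_ORDER.index(level)  # raises ValueError for an unknown Level
    items: list[str] = []
    for lower in LEVEL_ORDER[: idx + 1]:
        items.extend(ADDED_EVIDENCE[lower])
    return items
```

Calling `required_evidence("2A")` returns the four Level 1 items followed by the four Level 2A additions, which is one way to generate a checklist for an approval plan.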
Relationship to the ASBU Block progression
ASBU Block 2 (from 2025) introduces initial TBO elements and enhanced network management automation. The AMAN, demand-capacity balancing, and trajectory conformance monitoring functions in Block 2 will increasingly rely on ML components, predominantly at Level 1 and Level 2A.
ASBU Block 3 (from 2031) envisions full automation of trajectory negotiation and network management functions, which maps to Level 2B and early Level 3 in the EASA taxonomy.
References
- EASA Concept Paper "Guidance for Level 1 and Level 2 Machine Learning Applications", Issue 2 (2024) — Level 1, 2A, 2B definitions; certification implications; interface requirements (authoritative source — not in local library).
- EASA AI Roadmap 2.0 (2023) — Level classification, Level 3 as future work, phased regulatory programme (authoritative source — not in local library).
- Doc 9683 (Human Factors Training Manual), Chapter 5, §5.3.15–§5.3.18 — advisory vs. automated roles; delegation of problem-solving to the machine; human-machine cooperation patterns in ATC.
- Regulation (EU) 2024/1689 (EU AI Act, 2024) — high-risk AI requirements applicable at all levels (authoritative source — not in local library).
The functional axes of AI/ML in ATM
Six functional axes, referred to below as threads, cut across all EASA AI Levels and all ATM use-cases. Treating each thread separately prevents gaps in the safety case and helps ANSPs sequence their capability development coherently.
The six threads and their dependencies:
- Thread 1: Learning assurance
- Thread 2: Human-AI teaming
- Thread 3: Data governance
- Thread 4: Verification and validation (V&V)
- Thread 5: Operational use-cases
- Thread 6: Oversight and governance
Thread 1: Learning assurance
Learning assurance is the structured process that provides confidence in ML model behaviour throughout the lifecycle.
Key axis activities
Requirements definition: specifying what the model must achieve, under what input conditions, and what operational envelope it covers. Requirements must be testable and traceable to operational objectives.
Data acquisition and management: sourcing operational data (radar tracks, flight plans, SWIM feeds, weather data), cleaning, annotating, and versioning. The training dataset is the primary artefact of accountability in an ML system.
Model development and training: architecture selection, training, hyperparameter tuning, and internal validation. For ATM, model interpretability (or certified proxies for it) is usually required to support Level 1/2A explainability requirements.
Operational performance evaluation: testing the trained model against held-out operational data, including edge cases, rare events, and degraded input scenarios. Performance bounds must be established at this stage.
Integration assurance: integrating the ML component into the ATM system and verifying that the end-to-end system behaviour meets safety requirements.
Deployment monitoring: continuous tracking of model performance in live operation against the baseline established during evaluation. Trigger conditions for re-training or de-activation must be defined.
Thread dependencies
Learning assurance depends on data governance (Thread 3) for the quality of its inputs and on V&V (Thread 4) for the verification methods. It feeds human-AI teaming (Thread 2) by defining the performance bounds that the interface must convey.
Thread 2: Human-AI teaming
Human-AI teaming covers the design and assurance of the relationship between the controller (or ATM manager) and the ML system.
Key axis activities
Role allocation: defining which tasks remain with the human, which are delegated to the ML system, and which are collaborative. Doc 9683 §5.3.15 establishes the baseline taxonomy: advisory, cooperative, and delegated roles.
Interface design: designing the controller interface so that ML outputs are interpretable, uncertainty is conveyed, and the override mechanism (for Level 2A) or monitoring dashboard (for Level 2B) is effective.
Trust calibration: ensuring that controllers develop appropriate reliance, neither too high (automation bias) nor too low (over-rejection). Training and experience design are the primary tools.
Skill maintenance: ML systems that reduce routine workload may erode manual skills needed for degraded-mode operation. Skill maintenance programmes must be specified alongside the AI deployment plan.
Supervisory task design: for Level 2B, the human role is supervisory. Supervisory tasks (what to monitor, what constitutes an exception, what is the intervention protocol) must be explicitly designed and trained.
Thread dependencies
Human-AI teaming depends on learning assurance (Thread 1) for the performance bounds it must convey, and on V&V (Thread 4) for validation of the human-system interaction. It feeds the operational use-cases thread (Thread 5) by constraining which use-cases are feasible at which EASA Level.
Thread 3: Data governance
Data governance manages the quality, integrity, and lifecycle of the data assets on which ML models depend.
Key axis activities
Data sourcing: identifying and gaining access to operational data from radar systems, SWIM services, ADS-B/ADS-C receivers, flight plan systems, and weather services. Data sharing agreements between ANSPs and data providers must be in place.
Data quality management: defining and enforcing data quality standards (accuracy, completeness, timeliness, consistency). For ATM, radar track accuracy and ADS-B message completeness have known geographic variation that must be addressed.
Data pipeline design: engineering the pipeline from raw data to model-ready features, with version control at each stage.
Annotation management: for supervised learning, the ground-truth labelling process (who defines correct outcomes, under what protocol) must be documented and audited.
Production data monitoring: comparing the distribution of live operational inputs against the training distribution. Drift detection triggers model review.
Privacy and access control: flight data is commercially sensitive and may be subject to data protection regulations. Data governance must address access controls and data retention.
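The production data monitoring activity above (comparing live inputs against the training distribution) can be implemented in several ways. One common choice, shown here purely as an assumption, is the Population Stability Index (PSI) computed per input feature. The function name, bin count, and the 0.25 review threshold mentioned afterwards are illustrative conventions, not requirements taken from the source.

```python
import math
from bisect import bisect_right


def psi(reference: list[float], live: list[float], n_bins: int = 10) -> float:
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins
    # Interior bin edges are derived from the reference (training) sample only.
    edges = [lo + width * i for i in range(1, n_bins)]

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range fall into the first or last bin.
            counts[bisect_right(edges, v)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p_ref = proportions(reference)
    p_live = proportions(live)
    return sum((q - p) * math.log(q / p) for p, q in zip(p_ref, p_live))
```

A PSI of zero means the binned distributions are identical. A frequently quoted rule of thumb treats values above roughly 0.25 as a significant shift, but the actual trigger for model review would need to be set and justified in the safety case.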
Thread dependencies
Data governance is the input dependency for both learning assurance (Thread 1) and V&V (Thread 4). It also feeds oversight and governance (Thread 6) by establishing the transparency of the data on which model behaviour is based.
Thread 4: Verification and validation (V&V)
V&V provides the evidence that the ML system does what its requirements say, and does not do what it should not do.
Key axis activities
Unit testing: component-level testing of the ML model and its surrounding software.
Integration testing: testing the ML component within the ATM system context, including fail-safe mechanisms and fallback behaviour.
Operational testing: testing in realistic operational scenarios, including simulation and live trials with safety observers. Fast-time simulation is essential for covering rare and high-risk scenarios at scale.
Safety net verification: verifying that monitoring and alerting functions (override mechanisms for Level 2A, fallback triggers for Level 2B) work as designed.
Adversarial testing: deliberately probing the model with inputs designed to expose boundary behaviour, distributional shift vulnerability, and adversarial susceptibility.
Independent verification: for higher EASA Levels, third-party independent verification of the learning assurance evidence package may be required by the national CAA.
Thread dependencies
V&V depends on requirements from learning assurance (Thread 1) and data quality from data governance (Thread 3). It feeds the operational use-cases thread (Thread 5) by establishing which use-cases have sufficient safety evidence.
Thread 5: Operational use-cases
The operational use-cases thread maps ML capabilities to ATM functions and sequences their deployment.
Current deployed use-cases (Level 1)
Traffic demand prediction: ML models trained on historical traffic patterns predict sector load and delay propagation, supporting ATFM planning decisions.
AMAN sequence optimisation: ML models compute optimised arrival sequences for busy terminal areas, reducing holding and fuel burn.
Conflict risk scoring: ML models assign conflict risk scores to aircraft pairs based on trajectory data and historical encounter patterns, supporting MTCD alert prioritisation.
Taxiway routing: ML models suggest optimal taxi routes based on predicted surface congestion, supporting A-SMGCS functions.
Near-term use-cases (Level 2A, research/demonstration)
Automated AMAN sequencing with controller override: ML-optimised sequence applied automatically, controller can modify within the planning horizon.
Automated sector configuration selection: ML-driven reconfiguration proposal applied unless manager overrides.
Conflict resolution advisory: ML generates conflict resolution manoeuvres and uplinks the preferred option to the crew unless the controller modifies it.
Medium-term use-cases (Level 2B, research phase)
Automated oceanic separation provision in low-density track systems, with human supervisory monitoring.
Automated demand-capacity balancing at network level.
Long-term use-cases (Level 3, concept phase)
Automated surface movement management in bounded segregated areas.
Automated pre-departure de-confliction in fully TBO environments.
Thread dependencies
The operational use-cases thread depends on human-AI teaming (Thread 2) to determine feasible Level assignments and on V&V (Thread 4) to determine safety evidence adequacy. It feeds oversight and governance (Thread 6) by defining what oversight is needed for each use-case.
Thread 6: Oversight and governance
Oversight and governance provides the institutional and regulatory framework within which AI/ML deployment in ATM occurs.
Key axis activities
Regulatory engagement: engaging with EASA (for European operations), the national CAA, and ICAO (for global harmonisation) during the development and approval process. EASA's Concept Paper provides a consultation pathway even before formal rules are in place.
Safety management integration: integrating AI/ML change management into the ANSP Safety Management System (SMS). Every ML application is a significant system change and triggers the SMS change management process.
Post-deployment surveillance: monitoring operational performance against the safety case predictions, including mandatory reporting of ML-related incidents to the CAA and to EASA as required.
EU AI Act compliance programme: for European operations, establishing the processes required by Regulation (EU) 2024/1689 — including the technical documentation, conformity assessment, and post-market monitoring plan.
International harmonisation: participating in ICAO working groups (for example, the ICAO GANP and ASBU threads on automation), the EUROCONTROL FLY AI programme, and SESAR 3 JU projects to align standards and share safety evidence across the community.
Thread dependencies
Oversight and governance wraps all other threads and provides the accountability structure. It depends on the completeness of the learning assurance, V&V, and data governance evidence produced by Threads 1, 4, and 3.
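The "Thread dependencies" paragraphs above can be captured as a small graph, which makes it easy to check who depends on whom. The sketch below encodes only the "depends on" statements as written; the helper names are hypothetical.

```python
# Thread number -> threads it depends on, as stated in the dependency paragraphs.
DEPENDS_ON: dict[int, set[int]] = {
    1: {3, 4},     # learning assurance
    2: {1, 4},     # human-AI teaming
    3: set(),      # data governance
    4: {1, 3},     # verification and validation
    5: {2, 4},     # operational use-cases
    6: {1, 3, 4},  # oversight and governance
}


def dependents(graph: dict[int, set[int]], thread: int) -> list[int]:
    """Threads that list the given thread as a dependency."""
    return sorted(t for t, deps in graph.items() if thread in deps)


def mutual_dependencies(graph: dict[int, set[int]]) -> list[tuple[int, int]]:
    """Pairs of threads that depend on each other."""
    return sorted(
        (a, b) for a in graph for b in graph[a] if a < b and a in graph[b]
    )
```

With the dependencies encoded this way, `mutual_dependencies(DEPENDS_ON)` returns `[(1, 4)]`: learning assurance and V&V each depend on the other, which suggests they are better planned as an iterative pair than as strictly sequential steps. Data governance (Thread 3) has no upstream dependency and three direct dependents, consistent with its role as the input to the other assurance threads.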
References
- EASA Concept Paper Issue 2 (2024) — thread definitions for learning assurance, explainability, safety risk mitigation, data governance, and ethics/governance (authoritative source — not in local library).
- Doc 9683 (Human Factors Training Manual), Chapter 5, §5.3.15–§5.3.24 — automation role allocation, advisory vs. delegated task design, team function changes, skill maintenance.
- Doc 4444 (PANS-ATM), Chapter 8, §8.1.3 — ATS surveillance system integration to reduce workload; operational requirement underpinning the use-case thread.
- EUROCONTROL FLY AI Report — operational use-case taxonomy and implementation maturity assessment across European ANSPs (authoritative source — not in local library).
- Regulation (EU) 2024/1689 (EU AI Act, 2024) — governance and oversight requirements for high-risk AI (authoritative source — not in local library).
What a module is
A module is a worked example of approving and deploying a single ML application in ATM. It shows how the EASA AI Level classification, the five trustworthy-AI building blocks, and the W-shaped learning assurance lifecycle combine into a concrete approval artefact.
This file presents one detailed module: an ML-based Arrival Manager (AMAN) enhancement with traffic prediction, assessed as EASA Level 1 advancing toward Level 2A. A shorter sketch follows for a second use-case: ML-based traffic demand prediction for ATFM planning.
Module 1: ML-enhanced AMAN sequence optimisation
Operational objective
An Arrival Manager (AMAN) computes a landing sequence and assigns metering times (Controlled Time Over / Controlled Time of Arrival, CTO/CTA) to inbound flights at a busy terminal area. Classical AMAN uses rule-based heuristics constrained by separation minima, wake turbulence categories, and ATFM slots.
The ML enhancement replaces or supplements the heuristics with a trained model that learns optimal sequencing patterns from historical traffic data, including:
- The actual runway throughput delivered by different sequence decisions under different traffic mix conditions.
- The relationship between sequence decisions, fuel burn, and final approach speed targets.
- The sensitivity of on-time performance to AMAN horizon length and traffic complexity.
The resulting system predicts an optimised sequence up to 60 minutes ahead and updates it dynamically as new flight information arrives via SWIM/FIXM.
EASA Level assessment
Initial deployment: Level 1 (Assistance). The ML-generated sequence is displayed to the AMAN controller as a recommendation. The controller reviews it, modifies as needed, and manually applies it to the operational AMAN tool. The ML output is information; the controller decides.
Target upgrade: Level 2A (Human-AI Cooperation). The ML-generated sequence is automatically applied to the AMAN tool. The controller can modify within a defined review window (for example, 15 minutes before the first flight enters the metering fix). After the window closes, the sequence is operationally binding unless the controller explicitly overrides.
W-shaped learning assurance lifecycle
Requirements stage (left V top):
- ML requirements: predict a landing sequence that maximises runway throughput while meeting separation and wake minima and minimising ATFM slot violations. Bounded by the terminal area's operational envelope (traffic mix, runway configurations, wind bands).
- Data requirements: minimum 3 years of historical radar track data, flight plan data, ATFM slot data, and weather METAR/TAF covering the full seasonal and traffic-mix range. Edge cases (severe weather, runway closures, major events) must be represented.
Data preparation stage (left V middle):
- Radar track data cleaned and annotated with sequence decisions and outcomes (throughput achieved, delay incurred).
- Dataset split: 70% training, 15% validation, 15% test. Test set held out from all model development decisions.
- Labelling: "optimal" sequence label derived from post-hoc optimisation of historical traffic, reviewed by experienced AMAN controllers.
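One way to keep the 70/15/15 split reproducible, and to keep the test set genuinely held out, is to assign whole operating days to a partition deterministically. This is a sketch under assumptions: splitting by day (rather than by individual flight) and the hash-based assignment are design choices made for illustration, and the record field `day` is hypothetical.

```python
import hashlib


def partition_for_day(day: str) -> str:
    """Deterministically map an operating day (e.g. '2024-03-17') to a partition."""
    digest = hashlib.sha256(day.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    if bucket < 70:
        return "train"
    if bucket < 85:
        return "validation"
    return "test"


def split_by_day(records: list[dict]) -> dict[str, list[dict]]:
    """Split records so that all flights from one day land in the same partition."""
    parts: dict[str, list[dict]] = {"train": [], "validation": [], "test": []}
    for rec in records:
        parts[partition_for_day(rec["day"])].append(rec)
    return parts
```

Because the assignment depends only on the date string, re-running the pipeline or adding new data never moves an existing day from the test set into training. The resulting proportions are approximately, not exactly, 70/15/15, and seasonal coverage of each partition should still be checked separately.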
Training stage (centre):
- Architecture: gradient-boosted decision tree or attention-based sequence model, depending on interpretability requirements.
- Hyperparameter tuning on validation set.
- Internal validation: accuracy, mean deviation from optimum sequence, separation violation rate on validation set.
Operational evaluation stage (right V middle):
- Performance bounds established: the model achieves X% of optimal throughput under Y% of conditions; degrades to no worse than Z% under adversarial inputs defined in the test protocol.
- Edge case testing: severe weather scenarios, equipment degradation (no ADS-B, Mode S only), high traffic complexity.
Integration and system testing stage (right V top):
- ML component integrated into AMAN tool in a shadow mode (running in parallel with classical AMAN without operational effect).
- 6-month shadow trial: comparing ML recommendations with controller-applied sequences and outcomes.
- Human factors validation: usability testing with controllers; trust calibration assessment; override rate monitoring.
Deployment:
- Live Level 1 deployment: controller uses ML recommendation as advisory; performance monitored continuously.
- Transition to Level 2A after: evidence from live Level 1 operation that override rate is within bounds, that automation bias is absent, and that the model maintains performance across seasonal variation.
Trustworthy-AI evidence summary
Learning assurance: W-shaped lifecycle documented, all artefacts under version control, performance bounds formally stated.
Explainability: Feature importance scores provided for each sequence recommendation; interface displays top-3 factors driving the recommendation (e.g., "mixed B777/A320 wake pair avoidance", "ATFM slot constraint at 14:32").
Safety risk mitigation: Distributional shift monitoring active; fallback to classical AMAN triggered automatically if model confidence drops below threshold or if live performance deviates by more than 2 standard deviations from shadow trial baseline.
Data governance: Data sourcing agreements with ANSP radar and flight plan systems; monthly data quality report; annotation protocol reviewed annually by operational experts.
Ethics/governance: Conformity assessment documentation prepared under EU AI Act Article 11; technical file lodged with national CAA; post-market monitoring plan in place.
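The fallback rule in the safety risk mitigation item above (low model confidence, or live performance more than 2 standard deviations from the shadow trial baseline) could be expressed as a single predicate. This is a minimal sketch: the names `ShadowBaseline` and `should_fall_back`, and the default confidence threshold of 0.8, are hypothetical; only the 2-sigma limit comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowBaseline:
    mean: float  # baseline value of the monitored metric from the shadow trial
    std: float   # standard deviation of that metric during the shadow trial


def should_fall_back(
    confidence: float,
    live_metric: float,
    baseline: ShadowBaseline,
    min_confidence: float = 0.8,  # hypothetical threshold, to be set by the safety case
    max_sigma: float = 2.0,       # 2 standard deviations, as stated in the text
) -> bool:
    """Return True when control should revert to the classical AMAN."""
    if confidence < min_confidence:
        return True
    deviation = abs(live_metric - baseline.mean)
    return deviation > max_sigma * baseline.std
```

For a baseline of 40 movements per hour with a standard deviation of 2 (illustrative numbers), a live value of 45 would trigger fallback while 41 would not. The check is symmetric, so unexpectedly good performance is also flagged for investigation; whether that is desirable is a design decision for the safety case.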
Module 2: ML-based traffic demand prediction for ATFM
Operational objective
ATFM planners regulate traffic flow by issuing slots and restrictions. Classical demand prediction uses filed flight plans and historical punctuality factors. An ML model trained on radar track data, historical slot compliance, weather impact patterns, and airline schedule data can provide sector load estimates with tighter confidence bounds and earlier prediction horizons.
EASA Level assessment
Level 1 (Assistance): The ML demand prediction is displayed on the ATFM planning tool alongside the classical prediction. The ATFM manager compares both outputs and applies professional judgment. No automated action.
W-shaped lifecycle (abbreviated)
Requirements: predict sector entry counts and ATFM delay at 30-, 60-, 90-, and 120-minute horizons, with confidence intervals. Cover all seasonal and traffic-mix patterns including major events.
Data: 5 years of ATFM sector data, filed flight plans, radar track actuals, METAR/TAF, and historical slot compliance rates.
Training: ensemble model combining gradient-boosted regressor for baseline demand with recurrent component for delay propagation.
Evaluation: mean absolute error (MAE) at each horizon; comparison with classical prediction on held-out test year.
Integration: parallel output on ATFM planning tool; no automated action.
Deployment monitoring: MAE tracked weekly; seasonal recalibration annually; major schedule change (IATA season) triggers re-evaluation.
Performance outcome (indicative)
Typical improvement over classical prediction: 10-20% reduction in MAE at the 90-minute horizon, primarily driven by improved propagation of early morning delay patterns. The benefit is highest in congested terminal areas where delay propagates non-linearly.
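The evaluation metric used in this module, MAE at a given horizon compared against the classical prediction, is straightforward to compute. The sketch below shows the arithmetic only; the function names are hypothetical and the numbers in the usage note are made up for illustration, not measured results.

```python
def mean_absolute_error(predicted: list[float], actual: list[float]) -> float:
    """Mean absolute error between predicted and actual sector entry counts."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mae_reduction_pct(
    ml_pred: list[float], classical_pred: list[float], actual: list[float]
) -> float:
    """Percentage reduction in MAE of the ML prediction relative to the classical one."""
    mae_ml = mean_absolute_error(ml_pred, actual)
    mae_classical = mean_absolute_error(classical_pred, actual)
    if mae_classical == 0:
        raise ValueError("classical MAE is zero; relative reduction is undefined")
    return 100.0 * (mae_classical - mae_ml) / mae_classical
```

With illustrative actual counts `[30, 42, 38, 50]`, a classical forecast of `[35, 37, 43, 45]` (MAE 5) and an ML forecast of `[34, 38, 42, 46]` (MAE 4), the reduction is 20%. In practice the comparison would be run separately for each horizon on the held-out test year.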
Key design principles for any ATM ML module
From the worked examples, five principles emerge:
- Shadow operation before live deployment. Every ATM ML application should run in shadow mode (computing but not acting) long enough to establish performance bounds and human trust before any automated action is introduced.
- Fallback is not optional. Every Level 2A or 2B system must have a reliable, tested, fast fallback to a lower Level or to the classical algorithm. The fallback must work under the conditions where the ML model is most likely to degrade.
- Override rate is a safety metric. For Level 2A, the rate at which controllers override the automated action is a leading indicator of model degradation and of automation bias. Both extremes (0% override and 100% override) indicate a problem.
- Seasonal recalibration is a maintenance task. ATM traffic patterns have strong seasonal, weekly, and event-driven variation. An ML model deployed without recalibration degrades predictably.
- Interface design is co-equal with model design. A high-performing ML model presented through a poor interface will be distrusted or blindly accepted; both are safety hazards.
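The override-rate principle lends itself to a simple monitoring check. The following sketch assumes a two-sided band around the expected override rate; the band limits (2% and 40%) and the function names are hypothetical placeholders that would need to be derived from shadow trial and Level 1 operating data.

```python
def override_rate(n_overridden: int, n_actions: int) -> float:
    """Fraction of automated actions that controllers overrode."""
    if n_actions <= 0:
        raise ValueError("n_actions must be positive")
    return n_overridden / n_actions


def classify_override_rate(rate: float, low: float = 0.02, high: float = 0.40) -> str:
    """Flag override rates outside a band; both limits are illustrative only."""
    if rate < low:
        return "possible automation bias"
    if rate > high:
        return "possible model degradation"
    return "within expected band"
```

For instance, zero overrides in 200 automated actions would be flagged as possible automation bias, while 150 overrides in 200 would be flagged as possible model degradation. Either flag is a prompt for investigation, not a diagnosis.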
References
- EASA Concept Paper Issue 2 (2024) — W-shaped learning assurance lifecycle; Level 1 and Level 2A interface requirements; shadow trial expectations (authoritative source — not in local library).
- Doc 9683 (Human Factors Training Manual), Chapter 5, §5.3.16 — risk of excessive trust in automation leading to complacency; reliability requirements for advisory systems.
- Doc 4444 (PANS-ATM), Chapter 8, §8.1.3 — ATS surveillance system integration and automation to reduce workload; normative baseline for AMAN automation.
- EUROCONTROL FLY AI Report — AMAN ML use-case taxonomy, shadow trial methodology, and performance benchmarks from European demonstrations (authoritative source — not in local library).
What enables AI/ML deployment in ATM
AI and ML applications in ATM are not standalone technologies. They depend on a set of enabling conditions — infrastructure, data, training, regulatory, and institutional — that must be in place before any ML application can be safely deployed. Many of these enablers are ASBU modules already being implemented under the GANP Block 0 and Block 1 frameworks.
1. Data infrastructure enablers
SWIM and digital information exchange
ML models for ATM consume large volumes of operational data: radar tracks, flight plan updates, ADS-B/ADS-C position reports, METAR/TAF, NOTAMs, and network management data. SWIM (ASBU thread, Block 0/1) provides the technical infrastructure for distributing this data in near-real-time with defined quality of service.
Without SWIM in place, ML models must consume data from point-to-point interfaces with varying quality and latency. This increases data governance complexity and degrades model performance.
Enabling requirements:
- SWIM Yellow Profile (ATN IPS) or Blue Profile operational in the region.
- Flight information exchange using FIXM/IWXXM/AIXM data standards.
- ATFM data (demand and capacity, slot allocations) available in machine-readable form.
ADS-B and ADS-C coverage
ML models for en-route traffic prediction and conflict risk assessment depend on position data quality. ADS-B (ASBU ASUR-B0/B1) provides 4D position at 1-second update rates across most airspace. ADS-C (ASBU COMS-B0) provides intent data (EPP) in oceanic and remote areas.
Without adequate surveillance coverage, the training data has spatial gaps that degrade ML model performance in exactly those areas where conventional surveillance is weakest.
Enabling requirements:
- ADS-B mandate in effect for relevant airspace (ICAO Annexes 11/10 provisions implemented).
- ADS-C contracts in place for oceanic applications.
- Surveillance data fusion performed by the ATM system before feeding the ML model.
FF-ICE and trajectory data
For ML applications targeting trajectory prediction and trajectory conformance monitoring, the availability of full 4D trajectory data via FF-ICE (ASBU FICE-B1/B2) substantially improves model inputs. The ML model can learn from the relationship between the desired trajectory and the executed trajectory, supporting conformance monitoring and conflict risk estimation.
2. CNS infrastructure enablers
Controller-pilot datalink (CPDLC/ATN B2)
Level 2A ML applications that automatically uplink clearances (for example, speed adjustments from an ML-optimised AMAN) require a reliable, authenticated datalink channel. CPDLC ATN B2 (ASBU COMI-B1) is the required substrate. Without it, automated uplinks cannot be implemented and the Level 2A architecture falls back to Level 1 (controller must voice the clearance).
Cybersecurity
AI/ML applications introduce new attack surfaces: adversarial inputs designed to manipulate model output, data poisoning attacks on training pipelines, and model extraction. Cybersecurity protections must be applied at both the data pipeline layer and the model inference layer.
Doc 9683 identifies data integrity as a prerequisite for reliable automation. EASA's trustworthy-AI building blocks include safety risk mitigation against adversarial inputs. ICAO Annex 17 (Security) and the relevant EU cybersecurity regulations apply to ATM AI systems that are classified as critical infrastructure.
3. Training and human performance enablers
Initial and recurrent training for ML-augmented operations
Deploying an ML advisory tool (Level 1) changes the controller's task. Controllers must be trained to:
- Understand the basis and limitations of ML outputs.
- Apply appropriate levels of trust (neither automation bias nor over-rejection).
- Detect anomalous model behaviour and report it.
- Maintain manual skills for degraded-mode operation.
ICAO Annex 1 and PANS-Training (Doc 9868) require competency-based training that must be updated when the operational environment changes. Introduction of ML tools is a significant operational change that triggers the training system change process.
For Level 2A systems, additional training is required on the override mechanism, the time window, and the criteria for override.
For Level 2B systems, supervisory task training replaces operational task training for the automated function. Simulators that replicate the ML-augmented environment are required for recurrent assessment.
Cognitive skills maintenance
The automation complacency risk (Doc 9683 §5.3.16) is particularly acute in ML environments where the automation is perceived as more competent than classical algorithms. Skill maintenance programmes must include periodic manual operations exercises (in simulators) that keep controller skills viable for system failure scenarios.
4. Regulatory and certification enablers
National CAA preparedness
National CAAs must develop or adopt the regulatory framework for ML application approval before their supervised ANSPs can deploy ML tools. In Europe, this means applying the EASA AI Roadmap 2.0 and Concept Paper framework. Non-EASA states must develop equivalent national frameworks or adopt ICAO guidance material when it is produced.
ICAO is developing guidance on AI in ATM as part of the GANP review cycle, but formal ICAO SARPs on ML application certification are not yet in place as of 2026.
Safety Management System (SMS) integration
Every AI/ML deployment is a significant system change and must be managed through the ANSP SMS under ICAO Annex 19 requirements. The SMS change management process must be adapted to:
- Handle the iterative, data-driven nature of ML development (not a conventional software waterfall process).
- Address the ongoing nature of ML maintenance (re-training, recalibration) as distinct from one-time software upgrades.
- Incorporate post-deployment ML performance monitoring into the safety assurance process.
EU AI Act compliance infrastructure
For European operations, ANSPs and ATM system developers must establish EU AI Act compliance processes: technical documentation, conformity assessment, notified body engagement for certain applications, CE marking, and post-market monitoring. This is a novel regulatory burden on top of EASA approval.
5. Institutional enablers
EUROCONTROL and SESAR 3 JU programme support
The EUROCONTROL FLY AI programme provides European ANSPs with:
- A maturity assessment framework for ML readiness.
- A community of practice for sharing safety evidence.
- Technical support for data governance and pipeline design.
SESAR 3 JU projects generate the operational demonstrations that provide the safety evidence base for EASA Concept Paper compliance. ANSPs not participating in SESAR projects can leverage published results to reduce their own safety case burden.
ICAO global harmonisation
ICAO is developing guidance on AI/ML in aviation through the GANP review cycle and technical panels. Global harmonisation of AI certification standards is needed to prevent fragmentation where EASA, FAA, CAAC, and other authorities develop incompatible approaches that complicate cross-border ATM system deployment.
Enabler dependency summary
| Enabler | Required for | ASBU link |
|---|---|---|
| SWIM operational | All ML data pipelines | SWIM-B0/B1 |
| ADS-B mandate | Traffic prediction, conflict risk | ASUR-B0/B1 |
| FF-ICE R1 | Trajectory-based ML inputs | FICE-B1 |
| CPDLC ATN B2 | Level 2A automated uplinks | COMI-B1 |
| Cybersecurity controls | Adversarial input mitigation | COMI-B2 / national |
| EASA Concept Paper framework | All ML approvals in Europe | Non-ASBU (EASA) |
| SMS change management updated | Any ML deployment | ICAO Annex 19 |
| Controller ML training | Level 1 and above | PANS-Training |
| Supervisory task training | Level 2B | PANS-Training |
References
- Doc 9683 (Human Factors Training Manual), Chapter 3, §3.3.19–§3.3.20 — increased automation is inevitable; issue is how, when, and where; human-centred design principles.
- Doc 9683, Chapter 5, §5.3.16 — complacency risk when automation is perceived as reliable; design and training mitigation.
- Doc 4444 (PANS-ATM), Chapter 4, §4.13.3 — Human Factors principles for ATC automation; ongoing human-centred design requirement.
- Doc 4444, Chapter 13, §13.2.2 — ADS-C ground systems integration with other automated systems; workload reduction objective.
- PANS-Training (Doc 9868), competency framework — workload management and automation level selection as trained competencies.
- EASA Concept Paper Issue 2 (2024) — enabler requirements for each AI Level including interface, training, and monitoring (authoritative source — not in local library).
- EUROCONTROL FLY AI Report — ANSP maturity assessment, data governance guidance, and SESAR evidence pathway (authoritative source — not in local library).
The performance lens
AI and ML applications in ATM are justified by performance benefits across the KPAs defined in ICAO Doc 9854 and Doc 9883. Every ML deployment must quantify its expected contribution to performance objectives, and those contributions must be evidenced post-deployment.
The chain is:
KPA --(measured by)--> KPI <--(targeted by)-- Performance Objective --(achieved by)--> ML application
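As a minimal sketch, one path through this chain can be held as a small record. The class name `PerformanceLink` and its field names are illustrative assumptions, not taken from any ICAO or EASA document; the example values come from PO 3 and the predictability KPIs in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceLink:
    """One path: KPA -> KPI <- Performance Objective -> ML application (hypothetical model)."""
    kpa: str
    kpi: str
    objective: str
    ml_application: str

    def describe(self) -> str:
        # Render the link in the same arrow notation used in the text.
        return (
            f"{self.kpa} --(measured by)--> {self.kpi} "
            f"<--(targeted by)-- {self.objective} "
            f"--(achieved by)--> {self.ml_application}"
        )


# Example built from PO 3 in this section.
po3_link = PerformanceLink(
    kpa="Predictability",
    kpi="Demand forecast MAE",
    objective="PO 3",
    ml_application="ML traffic demand prediction",
)
print(po3_link.describe())
```

Keeping the link explicit makes it straightforward to check that every ML application is tied to at least one KPI that will be measured after deployment.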
Performance Objectives for AI/ML in ATM
The following performance objectives represent the primary expected benefits of AI and ML adoption in ATM, mapped to the EASA AI Levels at which each becomes achievable.
PO 1 — Reduce controller workload under high traffic complexity
Reduce the cognitive load on controllers during peak traffic by providing ML-assisted conflict detection, sequencing, and coordination tools. Measured by task completion time, self-reported workload (NASA-TLX), and error rate in simulated complexity scenarios. Achievable at: Level 1 (advisory) and Level 2A (automated relief).
PO 2 — Improve runway throughput at complex terminal areas
Increase the number of movements per hour at high-density airports by optimising arrival and departure sequencing through ML-enhanced AMAN/DMAN. Measured by actual vs. declared capacity utilisation. Achievable at: Level 1 and Level 2A.
PO 3 — Increase accuracy and earliness of traffic demand prediction
Improve ATFM planning by providing demand forecasts with tighter confidence intervals at longer horizons. Measured by prediction MAE (mean absolute error) at defined horizons vs. classical baseline. Achievable at: Level 1.
PO 4 — Reduce ATFM delay per flight
Reduce the average delay attributed to ATM network management by improving demand-capacity balancing through ML-enhanced prediction and automated restriction proposals. Achievable at: Level 1 (prediction accuracy) and Level 2A (automated restriction proposals).
PO 5 — Reduce separation infringement events
Reduce the rate of loss-of-separation events and STCA alerts through improved ML-based conflict detection with earlier warning and better prioritisation of actionable alerts. Achievable at: Level 1 (enhanced conflict risk display).
PO 6 — Maintain human performance under higher automation
Prevent degradation of manual control skills and situation awareness as automation levels increase. Measured by performance on manual operation exercises in simulation (pre/post deployment comparison). Achieved by: training programme design and skill-maintenance exercises linked to each Level transition.
KPA contribution by AI Level
The following matrix scores the expected benefit to each KPA at EASA AI Levels 1, 2A, 2B, and 3 (1 = some benefit, 2 = clear benefit, 3 = primary driver):
| KPA | Level 1 | Level 2A | Level 2B | Level 3 |
|---|---|---|---|---|
| Safety | 2 | 3 | 3 | 3 |
| Capacity | 1 | 3 | 3 | 3 |
| Flight efficiency | 1 | 2 | 3 | 3 |
| Predictability | 2 | 2 | 3 | 3 |
| Environment | 1 | 2 | 2 | 3 |
| Human performance | 2 | 2 | 2 | 2 |
| Cost-effectiveness | 1 | 2 | 3 | 3 |
Note: Human performance scores 2 at every Level because the benefit depends on the quality of the training system, not only on the automation level. A poorly managed Level 3 deployment can degrade human performance.
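The matrix above can be held as a small lookup table so that scores can be queried per Level. This is only a sketch: the scores are copied from the table, while the names `KPA_SCORES`, `primary_drivers`, and `first_level_reaching` are hypothetical helpers introduced for illustration.

```python
from typing import List, Optional

LEVELS = ("Level 1", "Level 2A", "Level 2B", "Level 3")

# Scores copied from the matrix (1 = some benefit, 2 = clear benefit, 3 = primary driver).
KPA_SCORES = {
    "Safety": (2, 3, 3, 3),
    "Capacity": (1, 3, 3, 3),
    "Flight efficiency": (1, 2, 3, 3),
    "Predictability": (2, 2, 3, 3),
    "Environment": (1, 2, 2, 3),
    "Human performance": (2, 2, 2, 2),
    "Cost-effectiveness": (1, 2, 3, 3),
}


def primary_drivers(level: str) -> List[str]:
    """Return the KPAs scored 3 (primary driver) at the given Level."""
    idx = LEVELS.index(level)
    return [kpa for kpa, scores in KPA_SCORES.items() if scores[idx] == 3]


def first_level_reaching(kpa: str, score: int) -> Optional[str]:
    """Return the lowest Level at which the KPA reaches the score, or None if it never does."""
    for level, value in zip(LEVELS, KPA_SCORES[kpa]):
        if value >= score:
            return level
    return None
```

For example, `primary_drivers("Level 2A")` lists the KPAs for which Level 2A is already the primary driver, and `first_level_reaching("Human performance", 3)` returns `None`, which matches the note that human performance does not improve with automation level alone.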
Key Performance Indicators
Safety KPIs
- STCA alert rate per flight hour (lower = fewer potential conflicts).
- STCA false-positive rate (lower = higher trust in alerts).
- Separation infringement events per million flight hours.
- ML model false-negative rate on held-out safety-critical scenarios.
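The false-positive and false-negative rates in this list can be computed from labelled scenarios. The sketch below assumes the standard confusion-matrix definitions, FP / (FP + TN) and FN / (FN + TP); the source does not state which definition is intended, and an operational STCA nuisance-alert rate may be defined differently. The function name `error_rates` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def error_rates(
    y_true: Sequence[bool], y_pred: Sequence[bool]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (false_positive_rate, false_negative_rate).

    y_true[i] is True when scenario i contains a real conflict;
    y_pred[i] is True when the model raised an alert for it.
    A rate is None when its denominator is zero.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction sequences must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1  # missed conflict: the safety-critical error
        elif pred:
            fp += 1  # alert without a real conflict
        else:
            tn += 1
    fpr = fp / (fp + tn) if (fp + tn) else None
    fnr = fn / (fn + tp) if (fn + tp) else None
    return fpr, fnr
```

Reporting the two rates separately matters because they pull in opposite directions: tuning a model to suppress nuisance alerts tends to raise the missed-conflict rate, which is the one evaluated on held-out safety-critical scenarios.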
Capacity KPIs
- Runway movements per hour (actual vs. declared capacity).
- ATFM sector regulation rate.
- ATFM delay minutes per flight (ML-assisted ATFM vs. classical baseline).
Flight efficiency KPIs
- AMAN sequence efficiency: actual vs. theoretical minimum sequence delay.
- Fuel burn per arrival movement (ML-optimised AMAN vs. classical).
- Track-mile efficiency: actual vs. direct route.
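Track-mile efficiency compares the distance flown with the direct distance. The sketch below assumes the direct route is the great-circle distance between two points (haversine formula on a spherical Earth) and expresses the KPI as the fractional excess over that distance; both choices are assumptions, since the source does not fix a formula. The function names are hypothetical.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles (6371 km / 1.852)


def great_circle_nm(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in nautical miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def track_mile_extension(actual_nm: float, direct_nm: float) -> float:
    """Fractional excess of flown distance over direct distance (0.0 means flown direct)."""
    if direct_nm <= 0:
        raise ValueError("direct distance must be positive")
    return (actual_nm - direct_nm) / direct_nm
```

A flight that covers 110 NM where the direct distance is 100 NM has an extension of 0.10, that is, 10% more track miles than the direct route.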
Predictability KPIs
- Demand forecast MAE at 30, 60, 90-minute horizons (ML vs. classical).
- CTA adherence rate: flights meeting ML-assigned CTA within tolerance.
- Variance in actual vs. planned landing time.
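The demand forecast MAE comparison can be sketched as follows. MAE is the mean of the absolute differences between forecast and observed values; the function names and the dictionary layout (horizon in minutes mapped to a sequence of counts) are illustrative assumptions.

```python
from typing import Dict, Mapping, Sequence


def mae(forecast: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error between paired forecast and observed values."""
    if len(forecast) != len(actual) or not actual:
        raise ValueError("forecast and actual must be non-empty and of equal length")
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


def mae_by_horizon(
    actual: Mapping[int, Sequence[float]],
    ml: Mapping[int, Sequence[float]],
    classical: Mapping[int, Sequence[float]],
) -> Dict[int, Dict[str, float]]:
    """Compare ML and classical forecast MAE at each horizon (keys are minutes ahead)."""
    return {
        horizon: {
            "ml": mae(ml[horizon], observed),
            "classical": mae(classical[horizon], observed),
        }
        for horizon, observed in actual.items()
    }
```

Calling `mae_by_horizon` with observed sector counts and the two sets of forecasts keyed by 30, 60, and 90 yields one ML-versus-classical pair per horizon, which is the form in which this KPI is stated.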
Human performance KPIs
- Controller self-reported workload (NASA-TLX) in ML-assisted vs. classical environment.
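- Override rate for Level 2A automated actions (target: within defined bounds; a rate near 0% may indicate uncritical acceptance of automated actions, while a consistently high rate may indicate low trust or poor model performance).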
- Manual operations exercise score (periodic simulation assessment).
- Time-to-detect anomalous model output in controller training scenarios.
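The override-rate KPI in this list is a band check rather than a simple minimisation. A minimal sketch follows; the bounds `lower=0.02` and `upper=0.30` are placeholder values chosen for illustration only, not thresholds from EASA or any ANSP, and the function names are hypothetical.

```python
def override_rate(overridden: int, total_automated_actions: int) -> float:
    """Share of Level 2A automated actions that controllers overrode."""
    if total_automated_actions <= 0:
        raise ValueError("total_automated_actions must be positive")
    return overridden / total_automated_actions


def classify_override_rate(rate: float, lower: float = 0.02, upper: float = 0.30) -> str:
    """Classify the rate against a target band (placeholder bounds, to be set locally)."""
    if rate < lower:
        return "too_low"   # possible over-reliance on the automation
    if rate > upper:
        return "too_high"  # possible low trust or poor model performance
    return "within_bounds"
```

In practice the band would be set from shadow-trial data and reviewed over time, since the source states only that neither 0% nor a consistently high rate is acceptable.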
Environmental KPIs
- Fuel per arrival movement at ML-optimised terminal area.
- CDO conformance rate under ML-assisted AMAN sequencing.
Cost-effectiveness KPIs
- ATFM delay cost reduction (delay minutes averted multiplied by published cost-per-minute).
- Controller productivity: movements managed per controller hour.
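Both cost-effectiveness KPIs are simple ratios or products. The sketch below follows the wording of the list: delay minutes averted multiplied by a cost per minute, and movements divided by controller hours. The function names are hypothetical, and the cost-per-minute value must come from a published source; none is assumed here.

```python
def delay_cost_reduction(
    baseline_delay_min: float, ml_delay_min: float, cost_per_minute: float
) -> float:
    """Delay minutes averted multiplied by cost per minute.

    A negative result means delay increased relative to the baseline.
    """
    return (baseline_delay_min - ml_delay_min) * cost_per_minute


def controller_productivity(movements: int, controller_hours: float) -> float:
    """Movements managed per controller hour."""
    if controller_hours <= 0:
        raise ValueError("controller_hours must be positive")
    return movements / controller_hours
```

The baseline and ML-assisted delay totals must cover comparable periods and traffic levels, otherwise the difference reflects demand changes rather than the ML application.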
How performance is measured and reported
Baseline measurement: performance data should be collected for a defined pre-deployment baseline period (minimum 6 months) using the same measurement methods that will be used post-deployment.
Shadow trial measurement: performance is measured during the shadow trial (ML running in parallel without operational effect) to establish the expected benefit before any operational change.
Post-deployment monitoring: all performance KPIs are tracked continuously post-deployment. Deviation from shadow trial expectations triggers model review.
Regional reporting: EUROCONTROL Performance Review Body for European operations; FAA Aviation System Performance Metrics (ASPM) for US operations; ICAO ASBU implementation monitoring for global reporting.
The quantitative evidence gathered through these KPIs also forms part of the EASA post-market monitoring plan required under the Concept Paper and the EU AI Act.
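The post-deployment rule above, where deviation from shadow trial expectations triggers model review, can be sketched as a relative-tolerance check per KPI. The 10% tolerance, the KPI key names, and the function names are assumptions for illustration; the source does not specify a threshold.

```python
from typing import Dict, List


def needs_model_review(
    shadow_value: float, observed_value: float, rel_tolerance: float = 0.10
) -> bool:
    """True when the observed KPI deviates from the shadow-trial value by more than the tolerance."""
    if shadow_value == 0:
        return observed_value != 0
    return abs(observed_value - shadow_value) / abs(shadow_value) > rel_tolerance


def kpis_for_review(
    shadow: Dict[str, float], observed: Dict[str, float], rel_tolerance: float = 0.10
) -> List[str]:
    """Names of KPIs whose post-deployment value has drifted beyond the tolerance."""
    return sorted(
        name
        for name, expected in shadow.items()
        if name in observed and needs_model_review(expected, observed[name], rel_tolerance)
    )
```

A single tolerance for all KPIs is a simplification; safety KPIs would normally warrant tighter limits than efficiency KPIs.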
References
- Doc 9854 (Global ATM Operational Concept), Chapter 2 — KPAs and the performance-based framework for ATM.
- Doc 9683 (Human Factors Training Manual), Chapter 5, §5.3.21–§5.3.24 — changed performance measurement under automation; team function effects; supervisory assessment challenges.
- EASA Concept Paper Issue 2 (2024) — performance monitoring requirements for Level 1 and Level 2 ML applications; post-deployment surveillance obligations (authoritative source — not in local library).
- EUROCONTROL FLY AI Report — operational KPI benchmarks from European AI/ML demonstrations (authoritative source — not in local library).
Historical evolution
The following table records the key milestones in the development of the AI and automation regulatory framework for civil aviation ATM.
| Year | Event | Significance |
|---|---|---|
| 1993 | ICAO Doc 9683 (Human Factors Training Manual) first published | Established the foundational ICAO framework for automation in ATC; §5.3 dedicated to ATC automation remains the primary ICAO reference |
| 2001 | STCA generalised across European ATM | Short-Term Conflict Alert becomes the first widely deployed safety-net automation function in ATM; establishes the advisory-tool model |
| 2005 | ICAO Doc 9854 (Global ATM Operational Concept) published | Defines automation and decision-support as normative ATM design requirements; establishes human-centred automation principle |
| 2010 | EUROCONTROL introduces MTCD at MUAC | Medium-Term Conflict Detection deployed operationally; advances advisory-tool model to longer horizons |
| 2015 | SESAR programme demonstrates ML-based AMAN at Paris CDG | First operational-context demonstration of ML-assisted arrival sequencing in Europe |
| 2020 | EASA AI Roadmap 1.0 published | Identified AI as transformative; announced phased regulatory development programme; introduced preliminary AI Level concept |
| 2021 | EASA Concept Paper Issue 1: Guidance for Level 1 and Level 2 ML applications | First structured regulatory guidance for ML in civil aviation; Level 1/2A/2B definitions issued for consultation |
| 2022 | EUROCONTROL FLY AI report published | First systematic survey of AI/ML deployment across European ANSPs; defined maturity levels; documented use-case taxonomy |
| 2023 | EASA AI Roadmap 2.0 published | Consolidated three-level classification, W-shaped learning assurance, and roadmap to Level 3 guidance; referenced in EASA rulemaking agenda |
| 2024 | EASA Concept Paper Issue 2 published | Updated and expanded guidance for Level 1 and Level 2A/2B; principal pre-regulatory framework for ML in civil aviation |
| 2024 | Regulation (EU) 2024/1689 (EU AI Act) enters into force | Aviation safety systems classified high-risk; conformity assessment and human oversight requirements apply; obligations become applicable in phases through 2027 |
| 2024 | FAA Roadmap for Artificial Intelligence Safety Assurance published | US parallel framework aligned with EASA concepts; defines AI trustworthiness programme for FAA oversight |
| 2025 | ASBU Block 2 availability window opens | TBO and enhanced network management automation in Block 2 create the operational context for Level 2A ML applications in AMAN and ATFM |
| 2026 | First EASA-accepted ML ATM applications expected | Level 1 ML applications (traffic prediction, AMAN sequence advisory) progressing through EASA Concept Paper review process |
| 2026 | ICAO GANP review cycle considering AI guidance material | ICAO working groups developing global harmonisation material for AI in ATM; no formal SARPs yet adopted |
| 2027 | EU AI Act high-risk AI provisions fully applicable | Full conformity assessment obligations for AI-enabled ATM systems in Europe; technical documentation and post-market monitoring required |
| 2029+ | Level 2B applications regulatory framework expected | EASA targeting completion of Level 2B guidance material; first Level 2B demonstrations anticipated |
| 2031 | ASBU Block 3 availability window opens | Full TBO and network-centric automation in Block 3 creates operational context for Level 2B/3 ML applications |
| 2033+ | Level 3 guidance material expected (indicative) | EASA Roadmap 2.0 flags Level 3 as a longer-term programme item; timeline subject to revision |
Key observations from the timeline
Regulatory development lags operational need. AMAN and traffic prediction ML tools are demonstrably beneficial, but regulatory frameworks to approve them as certified components have been in development since 2020 and only reached pre-regulatory status with the 2024 Concept Paper.
The EU AI Act introduces a new compliance layer on top of the EASA aviation-specific framework. For European ATM, the 2024-2027 period requires ANSPs to build EU AI Act compliance processes in parallel with EASA approval processes — a significant institutional burden.
ASBU Block 2 (2025) and Block 3 (2031) create the operational context for AI/ML deployment. The AMAN, TBO, and network automation functions in Block 2 and Block 3 will increasingly rely on ML at Level 1 and Level 2A as the regulatory framework matures.
ICAO is behind EASA on AI regulation. As of 2026, there are no formal ICAO SARPs on ML application certification. ICAO member states outside the EASA perimeter must develop national frameworks or adopt EASA guidance by reference, which creates fragmentation risk for cross-border ATM systems.
References
- EASA AI Roadmap 2.0 (2023) — programme timeline and milestones (authoritative source — not in local library).
- EASA Concept Paper Issue 1 (2021) and Issue 2 (2024) — regulatory development history (authoritative source — not in local library).
- Regulation (EU) 2024/1689 (EU AI Act) — entry-into-force provisions and applicability dates (authoritative source — not in local library).
- FAA Roadmap for Artificial Intelligence Safety Assurance (2024) — US parallel timeline (authoritative source — not in local library).
- EUROCONTROL FLY AI Report (2022) — survey of European deployment status (authoritative source — not in local library).
- Doc 9854 (Global ATM Operational Concept, 2005) — original ICAO automation framework publication date.
- Doc 9683 (Human Factors Training Manual, 1993) — foundational ATC automation framework publication date.
ICAO documents (in local library)
- Doc 9854 (Global ATM Operational Concept), Chapter 1, §1.6.2 — greater access to decision-support information for airspace users as a core ATM concept objective; human-centred automation.
- Doc 9854, Appendix F, interoperability requirements, item n — "automation and human/machine interface: a minimum level of interoperability should be defined to ensure the smooth flow of traffic".
- Doc 9854, Appendix F, interoperability requirements, item o — automated functions including multi-radar tracking, flight plan correlation, flight progress strip distribution, and automated sector coordination as normative requirements.
- Doc 9854, Chapter 3, §3.4 — surface-movement decision-support systems as an integral part of the total ATM automation environment; integration with departure and arrival automation.
- Doc 9683 (Human Factors Training Manual), Chapter 3, §3.3.2–§3.3.4 — role of the human operator in highly automated systems; management-by-exception dilemma for controllers.
- Doc 9683, Chapter 3, §3.3.7–§3.3.8 — human cognitive strengths in uncertain situations; automation defined as a tool or resource; human retains management and direction responsibility.
- Doc 9683, Chapter 3, §3.3.9–§3.3.10 — "cooperative human-machine architecture" as the design goal; determining appropriate levels and modes of interaction for controllers.
- Doc 9683, Chapter 3, §3.3.13 — automation role in establishing negotiation between FMS and ATM ground systems; human decides outcome.
- Doc 9683, Chapter 5, §5.3 — dedicated section "Automation in Air Traffic Control": advisory, cooperative, and delegated roles; workload effects; team function changes under automation.
- Doc 9683, Chapter 5, §5.3.15–§5.3.18 — advisory vs. automated roles; machine as advisor or decision-maker; planning vs. executive function automation.
- Doc 9683, Chapter 5, §5.3.21–§5.3.24 — supervisory function changes; skill assessment challenges; simulator-based competency maintenance.
- Doc 4444 (PANS-ATM), Chapter 4, §4.13.3.2–§4.13.3.5 — Human Factors principles for data presentation in ATC automation systems; timely display requirements.
- Doc 4444, Chapter 8, §8.1.3 — ATS surveillance systems integration with other automated systems to reduce controller workload and verbal coordination.
- Doc 4444, Chapter 15, §15.7.2 — Short-Term Conflict Alert (STCA) procedures: objective, parameters, controller obligations; most widely deployed safety-net automation function in ATM.
- Doc 9868 (PANS-Training), competency framework — workload management and automation level selection as trained competencies for pilots and air traffic controllers.
ICAO documents (not in local library)
- Doc 9750 (GANP), ASBU Threads — Block 2 network management automation and Block 3 advanced automation as the operational context for AI/ML deployment (authoritative source — not in local library; see https://ganpportal.icao.int/).
- ICAO Annex 19 (Safety Management) — SMS change management requirements applicable to AI/ML system deployments as significant operational changes (authoritative source — not in local library).
- ICAO Annex 17 (Security) — cybersecurity requirements applicable to ATM AI systems as critical infrastructure components (authoritative source — not in local library).
EASA documents
- EASA AI Roadmap 2.0 (2023) — three-level AI classification (L1/L2A/L2B/L3), W-shaped learning assurance, trustworthy-AI building blocks, phased regulatory programme (authoritative source — not in local library).
- EASA Concept Paper "Guidance for Level 1 and Level 2 Machine Learning Applications", Issue 2 (2024) — principal pre-regulatory guidance: Level definitions, W-shaped lifecycle, explainability requirements, data governance, safety risk mitigation, ethics/governance (authoritative source — not in local library).
- EASA Concept Paper Issue 1 (2021) — predecessor document; established initial Level 1/2A/2B framework; superseded in detail by Issue 2 but remains reference for development history (authoritative source — not in local library).
EU regulatory documents
- Regulation (EU) 2024/1689 (EU AI Act, 2024) — high-risk AI classification for aviation safety systems; data governance, transparency, accuracy, robustness, human oversight, and conformity assessment requirements; in force since 2024 with phased application through 2027 (authoritative source — not in local library).
FAA documents
- FAA Roadmap for Artificial Intelligence Safety Assurance (2024) — US framework for AI trustworthiness in civil aviation; aligned with EASA concepts; applicable to FAA-regulated ATM systems (authoritative source — not in local library).
- FAA Order 8040.4B (Safety Risk Management) — applicable to any ATM system change including AI-enabled systems (authoritative source — not in local library).
EUROCONTROL and SESAR documents
- EUROCONTROL FLY AI Report: Demystifying and Accelerating AI in Aviation/ATM (2022) — European ANSP AI maturity survey; use-case taxonomy; implementation maturity framework (authoritative source — not in local library).
- SESAR 3 JU project deliverables (ongoing) — AI and ML demonstration results under the Digital European Sky programme; primary source of operational performance evidence for EASA approval cases (authoritative source — not in local library).
External sources
- https://www.easa.europa.eu/en/document-library/general-publications/easa-artificial-intelligence-roadmap-20 - EASA AI Roadmap 2.0 (2023)
- https://www.easa.europa.eu/en/document-library/general-publications/concept-paper-guidance-level-1-2-machine-learning-applications - EASA Concept Paper Issue 2 (2024)
- https://www.eurocontrol.int/publication/fly-ai-report - EUROCONTROL FLY AI Report (2022)
- https://www.faa.gov/aircraft/air_cert/design_approvals/aiassurance - FAA Roadmap for Artificial Intelligence Safety Assurance (2024)
- https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689 - Regulation (EU) 2024/1689 (EU AI Act)
- https://www.sesarju.eu/projects - SESAR 3 JU project catalogue
- https://ganpportal.icao.int/ - ICAO GANP Portal; ASBU automation threads