The Airline IT Failures Making Headlines Are Not PSS Problems
At $100.76 per minute of aircraft block time [1], airline operations failures are not abstract events. Southwest’s December 2022 meltdown: 16,700 cancellations [2], $800 million in losses [3], ten days to recover.
SWAPA President Casey Murray had warned management six weeks earlier that the airline was “one thunderstorm, one ATC event, one router brownout from a complete meltdown” [4]. The union had been raising the risk since 2016.
Delta’s July 2024 CrowdStrike recovery: five days, $500 million [5], while American restored operations the same day. The difference in both cases was not primarily the weather or a vendor patch. It was the state of each airline’s bespoke crew scheduling and crew-tracking software.
These failures did not originate in Amadeus, SabreSonic, or any commercial PSS platform. They originated in custom-built operations software: undocumented, running on end-of-life stacks, and structurally unable to absorb either a sudden volume spike or a cascading infrastructure failure.
A third failure mode requires no operational event at all. FAA and EASA rule cycles arrive continuously, and any crew scheduling or compliance system too opaque to modify safely accumulates exposure with every cycle.
The applications in this category sit outside every vendor’s modernization roadmap. This article is about them.
A Map of the Bespoke Operations Layer: The Applications No Vendor Will Modernize for You
PSS platforms (Amadeus Altéa, SabreSonic, Navitaire) handle reservations, inventory, ticketing, and departure control. These are commercial platforms with vendor-managed modernization roadmaps and contracts measured in decades. They are not the terrain this article covers.
The bespoke operations layer is a distinct, identifiable category. It runs the airline’s actual operations. As airline operations software is the definitional case of tightly coupled systems, every subsystem in this layer shares a structural DNA: built by specialists over decades, on stacks that are now end-of-life, with logic that has never been written down.

Crew Scheduling Engines
The optimization core of airline operations. A crew scheduling engine simultaneously models FAA/EASA/DGCA duty time regulations, union contract logic, pilot qualifications, type ratings, rest requirements, base assignments, and personal preferences across thousands of crew members, in real time, during irregular operations.
The systems that do this work were built by specialists who have since retired, and the optimization algorithms they wrote were never documented.
The consequences are well understood by anyone who has tried to touch one. A frequently cited account from an aviation engineering forum captures the problem precisely: “Our crew scheduling system was built in the early 2000s. The optimization algorithm is undocumented. The developer who wrote it retired. We can’t modify it, we can’t replace it because it’s too critical.”
Southwest’s SkySolver is the most public failure case of this category. It is not an outlier.
Flight Planning and Dispatch Release Systems
Weight and balance calculations, fuel burn validation, and dispatch release logic. Many touch safety-certified code paths that require FAA/EASA recertification if modified, a constraint that has frozen modernization programs for years.
The American Airlines Flight Operating System illustrates the facade trap clearly: modern alerting features and updated UIs have been built on top of core dispatch and flight planning logic dating to the 1970s.
The interface has modernized. The underlying calculation layer has not. New capability is only as reliable as the calculation layer beneath it.
Compliance Monitoring Applications
ETOPS compliance tracking, MEL management, and airworthiness compliance systems. Safety-adjacent, audit-subject, and regulatory-deadline-driven. These are the quiet version of the Southwest problem: no mass cancellations, but a hard regulatory deadline, zero documentation, and often no application-tier failover.
An audit finding on an ETOPS compliance system can ground route approvals. An MEL system that cannot absorb a regulatory change puts every maintenance release at risk.
OCC and IROPS Decision Support Tools
Where legacy architecture fails most visibly. During a disruption, recovery requires simultaneous aircraft recovery, crew recovery, passenger recovery, and maintenance coordination across systems built at different times by different teams that do not communicate in real time.
Peer-reviewed research published in November 2025 concluded that no existing framework has successfully achieved a comprehensive solution for IROPS management at scale, describing current approaches as highly fragmented. The architectural description maps precisely to what most airlines are running.
MRO Workflow and Maintenance Tracking Systems
Airworthiness records, maintenance scheduling, and component tracking. Often the most neglected subsystem in any modernization conversation, and the least visible until a regulatory audit surfaces gaps or an airworthiness record cannot be traced.
ACARS Interface Handlers and Legacy Data Integrations
Custom middleware connecting aircraft communication systems to operations platforms: ACARS message parsing, EDIFACT transaction handling, OOOI event processing. Undocumented integration logic accumulated across system generations. The interfaces are rarely tested independently; they are discovered when something downstream breaks.
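The OOOI events handled by this middleware (Out, Off, On, In) are the inputs from which block time is derived, so a handler that mis-sequences them can skew downstream calculations. A minimal sketch, assuming the four timestamps have already been parsed from the message feed. The `OooiTimes` type and the function names are hypothetical, and real message formats vary by carrier and datalink provider:

```python
from dataclasses import dataclass
from datetime import datetime

# Cost per minute of aircraft block time cited in this article [1].
BLOCK_COST_PER_MINUTE = 100.76


@dataclass(frozen=True)
class OooiTimes:
    """Hypothetical, already-parsed OOOI timestamps for one flight leg (UTC)."""
    out: datetime  # gate departure
    off: datetime  # wheels off
    on: datetime   # wheels on
    in_: datetime  # gate arrival


def validate_sequence(t: OooiTimes) -> None:
    """Reject legs whose events are not in Out <= Off <= On <= In order."""
    events = [t.out, t.off, t.on, t.in_]
    if events != sorted(events):
        raise ValueError("OOOI events out of order")


def block_minutes(t: OooiTimes) -> float:
    """Block time: gate departure (Out) to gate arrival (In)."""
    validate_sequence(t)
    return (t.in_ - t.out).total_seconds() / 60


def flight_minutes(t: OooiTimes) -> float:
    """Flight time: wheels off (Off) to wheels on (On)."""
    validate_sequence(t)
    return (t.on - t.off).total_seconds() / 60


leg = OooiTimes(
    out=datetime(2025, 1, 6, 10, 0),
    off=datetime(2025, 1, 6, 10, 15),
    on=datetime(2025, 1, 6, 12, 5),
    in_=datetime(2025, 1, 6, 12, 12),
)
print(block_minutes(leg))   # 132.0
print(flight_minutes(leg))  # 110.0
print(round(block_minutes(leg) * BLOCK_COST_PER_MINUTE, 2))  # 13300.32
```

Validating the event order before computing anything is the kind of independent interface test that this layer usually lacks.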
All of these share the same structural DNA: custom-built over decades by engineers who have since retired, on end-of-life stacks (VB6, legacy .NET, legacy Java), with zero documentation and zero test coverage, and tightly coupled through undocumented interfaces to adjacent systems.
Emirates chairman Sir Tim Clark described the industry’s technology infrastructure as “Jurassic” as far back as 2014. That characterization has only become more accurate.
The reason these systems have not been modernized is not budget. It is opacity. Every engagement that has attempted to touch them without first achieving system-level comprehension has produced variants of the failure cases described at the start of this article.
Why Airline Software Is the Hardest Modernization Problem in Any Industry
Uptime Requirements That Have No Equivalent
At $100.76 per minute of delay [1], technical debt in airline systems has a direct per-minute cost that most industries can only approximate. OCC systems, crew scheduling engines, and dispatch release systems run continuously.
No maintenance window accommodates a multi-day migration. Any modernization approach that requires a cutover period, rather than parallel-run incremental extraction, is disqualified by operational reality.
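A parallel run can be sketched as a wrapper that keeps answering from the legacy path while exercising the extracted replacement in shadow. The following is a minimal illustration, not a description of any specific airline's or vendor's tooling. All function names are hypothetical, and the truncate-versus-round difference is an invented example of a silent behavioral deviation:

```python
from typing import Any, Callable, Dict, List


def serve_with_shadow(
    legacy: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    request: Any,
    mismatches: List[Dict[str, Any]],
) -> Any:
    """Answer from the legacy path; run the candidate in shadow and log divergence."""
    result = legacy(request)  # the production answer always comes from legacy
    try:
        shadow = candidate(request)
        if shadow != result:
            mismatches.append(
                {"request": request, "legacy": result, "candidate": shadow}
            )
    except Exception as exc:  # a candidate failure must never affect production
        mismatches.append({"request": request, "legacy": result, "error": repr(exc)})
    return result


# Hypothetical stand-ins for one extracted calculation.
def legacy_whole_minutes(seconds: int) -> int:
    return seconds // 60  # legacy behavior: truncates


def candidate_whole_minutes(seconds: int) -> int:
    return round(seconds / 60)  # rewrite: rounds, which differs at 179 s


log: List[Dict[str, Any]] = []
answers = [
    serve_with_shadow(legacy_whole_minutes, candidate_whole_minutes, s, log)
    for s in (120, 179, 3600)
]
print(answers)  # [2, 2, 60]
print(log)      # [{'request': 179, 'legacy': 2, 'candidate': 3}]
```

Traffic moves to the candidate only after the mismatch log stays empty across representative load, so no cutover window is needed.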
Safety-Certified Code Paths
Dispatch release logic, weight and balance calculations, and fuel burn validation in many airline systems carry FAA or EASA safety certification. Modifying a certified code path requires a recertification process that can take months.
A modernization program that cannot distinguish certified from non-certified code paths before transformation begins will either avoid these modules entirely, leaving the highest-risk code untouched, or trigger a recertification cycle that was never scoped into the program.
Undocumented Optimization Algorithms
Crew scheduling engines encode optimization logic that reflects years of operational learning, union contract negotiation outcomes, and regulatory interpretation. None of it is written down. The logic lives in the code, and in some cases, in the memory of engineers who are no longer with the airline.
Transformation without comprehension replaces the system with something that compiles and passes basic tests but does not reproduce the operational behavior the airline depends on. The failure is not visible at UAT. It surfaces during the first IROPS event after go-live.
Regulatory Change Absorption
FAA and EASA rule changes arrive on a continuous cycle. Each update requires systems to absorb new constraint logic. An undocumented, tightly coupled crew scheduling engine cannot be modified safely without first understanding what it currently models.
The exposure is not a single missed compliance deadline. It is a structural inability to respond to any future regulatory cycle.
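One way to picture what "absorbing new constraint logic" requires is a system where the limits are explicit data and not constants buried in scheduling code. The sketch below is an assumption about how such a structure could look, not a description of any real engine. `DutyLimits`, `check_assignment`, and every numeric value are hypothetical and are not FAA or EASA figures:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DutyLimits:
    """One regulatory rule set. Values used below are illustrative only."""
    rule_set: str
    max_duty_hours: float
    min_rest_hours: float


def check_assignment(
    duty_hours: float, rest_before_hours: float, limits: DutyLimits
) -> List[str]:
    """Return the violated constraints (an empty list means legal under `limits`)."""
    violations = []
    if duty_hours > limits.max_duty_hours:
        violations.append(
            f"{limits.rule_set}: duty {duty_hours}h exceeds {limits.max_duty_hours}h"
        )
    if rest_before_hours < limits.min_rest_hours:
        violations.append(
            f"{limits.rule_set}: rest {rest_before_hours}h below {limits.min_rest_hours}h"
        )
    return violations


# Hypothetical current and revised rule sets.
current = DutyLimits("rules-v1", max_duty_hours=13.0, min_rest_hours=10.0)
revised = DutyLimits("rules-v2", max_duty_hours=12.0, min_rest_hours=11.0)

print(check_assignment(12.5, 10.5, current))  # []
print(len(check_assignment(12.5, 10.5, revised)))  # 2
```

When the rules are isolated like this, a regulatory update becomes a reviewable data change. In an undocumented engine, the same update means first locating every place the old limit is encoded.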
Cybersecurity Exposure Through Architectural Fragmentation
Legacy monolithic architectures represent a concentrated attack surface. A single application encoding crew scheduling, compliance tracking, and IROPS logic in a tightly coupled codebase propagates failure broadly when compromised.
The September 2025 ransomware attack on Collins Aerospace’s MUSE system took check-in systems offline simultaneously across Heathrow, Brussels, Berlin, and Dublin. The propagation was architectural, not incidental. Modular, independently deployable systems fail in smaller perimeters.
Constraints vs. Consequences
| Constraint | Violation Scenario | Operational Consequence |
| --- | --- | --- |
| Uptime requirement | Cutover-based migration requiring downtime window | OCC or crew scheduling unavailable during disruption recovery |
| Safety-certified code paths | Transformation without certification pathway planning | Recertification cycle not scoped; FAA/EASA finding possible |
| Undocumented optimization logic | Transformation without system comprehension | Silent behavioral deviation; surfaces during first IROPS event post-go-live |
| Regulatory change absorption | System too opaque to modify for new FDTL rules | Compounding exposure across every future regulatory cycle |
| Cybersecurity surface | Monolithic legacy architecture with no module boundaries | Single attack point propagates failure across multiple systems or operators |
Before any modernization program can safely sequence its work, it needs a complete map of system dependencies, safety-critical code paths, and regulatory risk areas. That is precisely what the Legacyleap $0 Modernization Assessment produces, at no cost, before any commitment. Request yours today.
How Legacyleap Handles the Applications Airlines Cannot Afford to Get Wrong
The constraints described in the previous section are not solved by faster code generation. They are solved by a platform that operationalizes comprehension before transformation begins.
That approach has a name: comprehension-first modernization. The principle is that no transformation decision, be it sequencing, architecture, module extraction, or certification pathway planning, can be made correctly without first achieving complete system understanding.
For airline bespoke operations software, where the cost of a silent behavioral deviation surfaces during an IROPS event rather than a test run, this is not a methodology preference. It is the only sequence that manages the risk correctly.
Legacyleap is built around five specialized agents operating in sequence: each stage produces the inputs the next one requires.
For this operational software, the first two agents carry the most weight.
Assessment Agent
Maps full system dependencies, module boundaries, call graphs, and regulatory risk areas across the entire codebase.
For a crew scheduling engine or ETOPS compliance monitor, this means identifying which code paths touch safety-certified logic, which modules carry regulatory constraint encoding, and which interface boundaries are undocumented. Without this output, modernization sequencing is based on assumptions.
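Identifying which code paths touch certified logic is, at its simplest, a reachability question over the call graph. The sketch below is a generic illustration of that idea, not Legacyleap's implementation. The module names and the graph are hypothetical, and the function flags every module from which a certified module can be reached:

```python
from collections import deque
from typing import Dict, List, Set


def modules_touching_certified(
    call_graph: Dict[str, List[str]], certified: Set[str]
) -> Set[str]:
    """Return certified modules plus every module that calls into them, directly or transitively."""
    # Reverse the edges: callee -> set of direct callers.
    callers: Dict[str, Set[str]] = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    flagged = set(certified)
    queue = deque(certified)
    while queue:  # breadth-first walk up the caller chain
        module = queue.popleft()
        for caller in callers.get(module, ()):
            if caller not in flagged:
                flagged.add(caller)
                queue.append(caller)
    return flagged


# Hypothetical call graph: caller -> callees.
call_graph = {
    "ui_dispatch_form": ["release_builder"],
    "release_builder": ["weight_balance", "report_export"],
    "report_export": ["pdf_writer"],
    "crew_roster_view": ["report_export"],
}
certified = {"weight_balance"}

print(sorted(modules_touching_certified(call_graph, certified)))
# ['release_builder', 'ui_dispatch_form', 'weight_balance']
```

Running the same walk on the forward graph would list what the certified module itself depends on, which matters equally when scoping a recertification.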
Documentation Agent
Reconstructs the functional and technical documentation that never existed: business logic, regulatory constraint encoding, workflow rules, module boundaries, interface contracts. For airline compliance software, this is the prerequisite for every transformation decision that follows. It produces what retiring engineers took with them, codified.
Recommendation Agent
Suggests modernization paths and target architecture options based on complexity, regulatory dependency, and feasibility. For aviation software, this means identifying which modules to extract first, which require certification pathway planning before transformation begins, and which can be replaced with commercial alternatives rather than rebuilt.
Modernization Agent
Executes governed, diff-based code transformations that are always human-reviewable before acceptance. The agent cannot merge, deploy, or execute code directly. Every transformation is a reviewable diff. Engineering control is preserved at every step.
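The general shape of a diff-gated change can be shown with Python's standard `difflib`. This is a generic sketch of the pattern, not the platform's actual mechanism. The `propose_change` and `accept` helpers, the file name, and the sample source text are all hypothetical:

```python
import difflib
from typing import Optional


def propose_change(original: str, transformed: str, path: str) -> str:
    """Return a unified diff for human review; nothing is written or executed."""
    return "".join(
        difflib.unified_diff(
            original.splitlines(keepends=True),
            transformed.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )


def accept(original: str, transformed: str, approved_by: Optional[str]) -> str:
    """Keep the transformed text only when a named reviewer has approved it."""
    return transformed if approved_by else original


original = "limit_minutes = 180\ncheck(limit_minutes)\n"
transformed = "LIMIT_MINUTES = 180\ncheck(LIMIT_MINUTES)\n"

diff = propose_change(original, transformed, "etops_check.py")
print(diff)

# Without an approver, the original text is what survives.
print(accept(original, transformed, approved_by=None) == original)  # True
```

The point of the pattern is that the proposed transformation and the decision to apply it are separate steps, with a person in between.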
QA Agent
Validates functional parity through automated unit, API, functional, and regression test suites. Parity validation is evidence-based, not calendar-based: outputs are validated across structured test scenarios before anything advances. For safety-adjacent systems, this is the difference between a deployment that can be defended to a regulator and one that cannot.
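Evidence-based parity is commonly approached with characterization (golden-master) tests: record what the legacy code does across named scenarios, then require the replacement to match. A minimal sketch follows, assuming the legacy logic can be invoked as a function. The scenario names, both functions, and the 180-minute limit are hypothetical and do not represent an actual ETOPS rule:

```python
from typing import Any, Callable, Dict


def record_baseline(
    legacy: Callable[..., Any], scenarios: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Capture the legacy output for every named scenario."""
    return {name: legacy(**inputs) for name, inputs in scenarios.items()}


def parity_report(
    candidate: Callable[..., Any],
    scenarios: Dict[str, Dict[str, Any]],
    baseline: Dict[str, Any],
) -> Dict[str, Dict[str, Any]]:
    """Return the scenarios where the candidate diverges from the recorded baseline."""
    failures = {}
    for name, inputs in scenarios.items():
        actual = candidate(**inputs)
        if actual != baseline[name]:
            failures[name] = {"expected": baseline[name], "actual": actual}
    return failures


# Hypothetical legacy rule and a rewrite that handles the boundary differently.
def legacy_within_limit(minutes_to_alternate: int, limit_minutes: int) -> bool:
    return minutes_to_alternate <= limit_minutes


def candidate_within_limit(minutes_to_alternate: int, limit_minutes: int) -> bool:
    return minutes_to_alternate < limit_minutes


scenarios = {
    "well_inside": {"minutes_to_alternate": 90, "limit_minutes": 180},
    "at_boundary": {"minutes_to_alternate": 180, "limit_minutes": 180},
    "outside": {"minutes_to_alternate": 200, "limit_minutes": 180},
}
baseline = record_baseline(legacy_within_limit, scenarios)
failures = parity_report(candidate_within_limit, scenarios, baseline)
print(failures)  # {'at_boundary': {'expected': True, 'actual': False}}
```

The rewrite compiles and passes two of three scenarios. Only the boundary case exposes the deviation, which is why scenario coverage matters more than elapsed test time.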
Legacyleap Agent Architecture Mapped to Aviation-Specific Constraints
| Agent | Aviation-Specific Function | Constraint It Resolves |
| --- | --- | --- |
| Assessment Agent | Maps safety-certified code paths, regulatory logic areas, and module boundary risks across full codebase | Sequencing problem: cannot modernize safely without knowing what must be certified before transformation |
| Documentation Agent | Reconstructs FDTL constraint encoding, union contract logic, regulatory rule implementations, and undocumented interface contracts | Opacity problem: the logic exists only in the code; extraction without documentation produces silent behavioral risk |
| Recommendation Agent | Identifies which modules to extract first, which require certification pathway planning, and which can be replaced vs. rebuilt | Architecture problem: not all modernization paths are equivalent; wrong sequencing triggers certification cycles |
| Modernization Agent | Executes diff-based transformations with human review at every step; cannot auto-merge or deploy | Control problem: safety-adjacent systems require human sign-off on every code change before advancement |
| QA Agent | Generates and runs structured test scenarios against original behavior; evidence-based parity, not calendar-based | Validation problem: behavioral parity cannot be asserted without test coverage that did not previously exist |
The ETOPS Monitor Engagement
This methodology has been executed against real aviation compliance software under a real regulatory deadline.
A leading North American airline had an ETOPS compliance monitoring application that needed to be modernized by December 2025. Compliance tracking for ETOPS-certified routes is continuous, audit-subject, and safety-adjacent.
The application’s profile:
- VB6 desktop application,
- 11,000 lines of code across 29 files,
- Four heterogeneous databases (Informix, Oracle, SQL Server, Microsoft Access),
- 2,588 COM component references,
- 195 WIN32 API calls,
- Zero documentation, zero test coverage,
- A recent outage with no application-tier failover, and
- An audit flag that made the December 2025 deadline non-negotiable.
This is the quiet version of the Southwest problem. No mass cancellations, but a compliance system with no failover, no documentation, and a hard regulatory deadline. Every airline has applications that look exactly like this.
Delivery was structured across three phases:
- One week of assessment and discovery,
- Two weeks of automated code conversion at 60-65% automation, and
- Four weeks of manual refinement, testing, and documentation.
Outcome:
- 8 weeks to UAT readiness versus a 14-to-16-week manual baseline with a comparable team size.
- React frontend and .NET Core/LTS backend, designed for cloud-readiness and application-tier HA/failover. The single-point-of-failure risk that had caused the prior outage was resolved at the architecture level.
On documentation: the engagement delivered complete functional and technical documentation covering the application’s logic, data flows, and integration dependencies across all four databases. These assets did not exist before the engagement and are now the baseline for every future modification.
On test coverage: the engagement delivered automated unit and functional test suites built from scratch. Before the engagement, no structured test plans or automated tests existed. Afterward, every functional path through the application had coverage, providing the validation baseline required for UAT and for any future change.
The December 2025 deadline was met. The audit flag was resolved. ETOPS route approvals were not put at risk. The result was a 50% reduction in time and cost versus the manual baseline.

The $0 Modernization Assessment delivers a dependency and module map, regulatory code path visibility, risk indicators, and a structured modernization readiness view, at no cost, before any commitment.
Every Airline Has a Portfolio of ETOPS Monitors
The ETOPS Monitor is one application. Every airline has dozens carrying the same profile: undocumented, end-of-life stacks, zero test coverage, safety-adjacent or compliance-adjacent, left alone because nobody was confident they understood them.
The urgency drivers are compounding. FDTL regulatory updates will continue to arrive, and each one creates a new exposure window for any crew scheduling or dispatch system that cannot be safely modified. Legacy monolithic architectures remain the attack surface of record. At $100.76 per minute, the cost of the next operational failure is not a projection. It is a rate.
The choice is whether to build the system understanding and modernization blueprint before the deadline, whether it is regulatory or operational, or wait for a Southwest or Delta event to force the decision.
Incremental modernization is the only viable approach for airline-scale systems, and the sequencing logic that applies at portfolio scale starts with a complete map of what you have.
Request a $0 Modernization Assessment. The place to start is a complete map of your portfolio: dependencies, risk areas, safety-critical code paths, and a modernization blueprint. At no cost, before any commitment.
Book a Demo. See how Legacyleap operates against bespoke operations software, including systems with no documentation and no test coverage.
FAQs
PSS modernization addresses commercial platforms such as Amadeus, SabreSonic, and Navitaire. Operations software modernization addresses the bespoke layer that those platforms never touch: crew scheduling engines, compliance monitors, dispatch systems, and OCC tooling. An airline can complete a full PSS migration and remain entirely exposed to the failure modes Southwest and Delta documented, because the systems that failed in both cases had no connection to PSS. For most airlines, the bespoke operations layer carries the higher business risk and has no vendor managing it.
The failure model for bespoke operations software is rarely a single outage. It accumulates as a constraint: a system that cannot absorb an FDTL rule change without months of manual effort, a crew scheduling engine with an unknown threshold under disruption load. The business case is built on exposure quantification, not past incidents. A modernization readiness assessment maps regulatory change risk, single points of failure, and end-of-life stack exposure across the portfolio, producing a risk-weighted view that is defensible to a board without requiring a Southwest-scale event to justify it.
Modernizing a crew scheduling system without immediately replacing its optimization algorithm is possible, and for most airlines it is the correct initial sequence. The optimization algorithm carries the highest operational risk. Replacing it requires full comprehension of existing behavior and parity validation across irregular operations edge cases. The safer path is to first modernize the infrastructure layer, integration surface, and interface while preserving the algorithm as a contained, documented module. That produces a deployable, testable system before any decision about algorithm replacement is made, and keeps the highest-risk logic change as a deliberate second phase.
A legacy compliance application under audit presents outputs with no traceable logic. The system produces a compliance record, but the rules it applies and the conditions under which it flags violations cannot be independently verified. A modernized application with complete functional documentation presents the inverse: compliance logic documented at the rule level, data flows mapped, and a test suite providing an auditable record of validated behavior. The difference determines whether an examiner accepts the application on faith or can verify it independently.
Three variables govern sequencing: regulatory deadline proximity, operational consequence of failure, and integration complexity. Applications with active audit flags have non-negotiable positions. Beyond those, the common failure mode is sequencing by apparent simplicity. Applications that look small in line count often carry deep integration dependencies, and that is where sequencing assumptions break down. A portfolio-level dependency map is the precondition for correct sequencing. Without it, programs consume most of their budget on approachable applications and arrive at the highest-risk systems last.
References
[1] Airlines for America. Cost of Aircraft Block Time. $100.76 per minute figure for aircraft block time cost.
[2] U.S. Department of Transportation. Report on Southwest Airlines Holiday Meltdown (2023). Documents SkySolver failure, flight cancellation totals, and recovery timeline.
[3] Southwest Airlines. Fourth Quarter and Full Year 2022 Earnings Release. CEO Bob Jordan: “more than 16,700 flight cancellations, pre-tax negative impact of approximately $800 million.” $1.3B 2023 technology budget commitment confirmed on earnings call.
[4] Southwest Airlines Pilots Association (SWAPA). Written Testimony of Captain Casey Murray, SWAPA President, U.S. Senate Commerce Committee (February 2023). Documents the November 2022 warning and SWAPA’s multi-year history of flagging structural risk in Southwest’s crew scheduling infrastructure.
[5] Travel Weekly. Why the CrowdStrike crash hit Delta harder (August 2024). Confirms $500M cost, 5-day recovery vs. American’s same-day and United’s 3-day recovery, crew-tracking system as primary recovery bottleneck, and Windows infrastructure dependency.








