
Tech debt impact on engineering velocity: the DORA gap

DORA's elite-to-low gap on deploy frequency is 38x. The lead-time gap is 106x. These are not aspirational targets; they are the measured spread between top-quartile and bottom-quartile teams. The spread is almost entirely a tech debt story.

The 90-Second Answer

Low DORA performers ship 38x less often, take 106x longer to get a change into production, recover 6,570x more slowly from incidents, and fail 5x more often when they do ship (DORA State of DevOps 2024). The aggregated gap is a strategic competitive disadvantage that no single hire or process change closes. Tech debt is the dominant explanatory variable behind the spread.

The Four Keys

What DORA actually measures

DORA, the DevOps Research and Assessment programme (now part of Google Cloud), has been measuring software delivery performance annually since 2014. The four keys are the framework that has emerged as the industry-standard productivity instrument. Each measures a different dimension of delivery, and the spread between performance bands has widened year on year in recent reports.

Metric                | Elite                        | High            | Medium            | Low
Deploy frequency      | On demand (multiple per day) | Daily to weekly | Weekly to monthly | Fewer than monthly
Lead time for changes | Less than 1 hour             | 1 day to 1 week | 1 week to 1 month | 1 to 6 months
Mean time to recovery | Less than 1 hour             | Less than 1 day | 1 day to 1 week   | 1 week to 6 months
Change failure rate   | 0-15%                        | 16-30%          | 16-30%            | 16-30%+

Bands derived from DORA State of DevOps 2024. Annual updates revise the ranges slightly.
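As a rough illustration of how the table can be applied, the sketch below maps a measured lead time onto a band. It uses only the lead-time row, and the function name and thresholds layout are hypothetical. The table leaves a gap between "less than 1 hour" and "1 day"; the sketch assumes values in that gap count as high, and treats "1 month" as 30 days. Both are illustrative choices, not rules taken from DORA.

```python
from datetime import timedelta


def lead_time_band(lead_time: timedelta) -> str:
    """Map a lead time for changes onto a band from the table above.

    Assumptions (not from DORA): the gap between 1 hour and 1 day is
    treated as "high", and "1 month" is approximated as 30 days.
    """
    if lead_time < timedelta(hours=1):
        return "elite"
    if lead_time <= timedelta(weeks=1):
        return "high"
    if lead_time <= timedelta(days=30):
        return "medium"
    return "low"


if __name__ == "__main__":
    # Hypothetical sample values, one per band.
    for sample in (timedelta(minutes=30), timedelta(days=3),
                   timedelta(days=20), timedelta(days=90)):
        print(sample, "->", lead_time_band(sample))
```

A fuller classifier would combine all four rows, but the overlapping change-failure-rate ranges in the table mean that row alone cannot separate the high, medium, and low bands.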

Why Debt Is the Bottleneck

The mechanism, in concrete terms

The metrics gap is not an artefact of measurement or process maturity alone. It tracks closely to accumulated tech debt because each of the four keys has a debt-sensitive failure mode. Deploy frequency falls when the deployment system is brittle and each deployment is therefore costly. Lead time grows when changes require coordination across multiple poorly decoupled systems. MTTR explodes when incident diagnosis requires reverse-engineering systems whose documentation has drifted. Change-failure rate spikes when test coverage cannot keep pace with the changing surface area of the codebase.

Each failure mode has a debt-shaped cause. A team can paper over individual failures with workarounds, on-call rotations, and heroic last-minute fixes, but the four metrics taken together expose the underlying debt because they all degrade simultaneously when the codebase deteriorates. This is why DORA scores have become a useful proxy for tech debt severity even though DORA itself does not explicitly measure debt.

The strongest predictor of which band a team falls into is not headcount, not language choice, not framework, and not any specific tool. It is the cumulative engineering investment in delivery-system hygiene over the previous three to five years. Teams that invested early sit in the high or elite bands and continue improving; teams that deferred investment sit in the medium or low bands and find each metric harder to move year by year as the debt compounds.

The Strategic Translation

What 38x means in product terms

A 38x deploy-frequency gap sounds abstract until it is translated into product capacity. An elite team that deploys 38 times for every one deploy from a low-performer team can run 38 product experiments in the same period. They can ship 38 incremental fixes for the same user-research finding. They can respond 38 times to a competitor's release while the low-performer is still preparing their single response. The aggregated strategic capacity gap is enormous.

For a CEO evaluating competitive position, the DORA bands are a useful diagnostic. If the company's most relevant competitors are clustered in higher DORA bands, the velocity gap is itself a strategic position the CEO is fighting against, regardless of headcount or budget parity. The fix is not headcount; the fix is the structural tech-debt remediation that DORA scores are a proxy for. Hiring into a low-performer team often produces no measurable velocity gain because the bottleneck is not capacity.

For a CFO evaluating engineering spend, the DORA scores are a useful efficiency proxy. Two teams with identical opex lines can be operating at very different levels of effective capacity depending on their DORA band. The CFO who knows the team is in the low band has a defensible reason to fund a debt-reduction initiative rather than adding headcount; the dollar return on the debt initiative is typically larger and faster than the dollar return on hiring.

The Improvement Path

Moving up the bands

DORA's research consistently identifies an improvement path that does not require expensive structural change. Most teams can move from low to medium performance with three operational changes: trunk-based development with feature flags (replacing long-lived branches), automated test infrastructure that runs on every PR, and on-call ownership by the team that wrote the code (replacing centralised ops teams). These three changes typically take 6-12 months and require no major code rewrites.

The harder step is medium to high. Moving from monthly-to-weekly deploys into daily-or-better deploys requires structural changes to the codebase itself: service decomposition where the monolith blocks small-batch deployment, dependency-injection patterns where tight coupling prevents safe partial deployments, and observability investment so issues are caught in production within minutes rather than days. This is where the substantive debt work appears.

The transition from high to elite is genuinely hard and most teams never make it. Elite performance requires a culture of continuous deployment, mature progressive-delivery practices, and a codebase architected for high-frequency small-batch change. Few engineering organisations reach this band without sustained multi-year investment, and the band's defining characteristic is that debt accumulation is actively prevented through ongoing engineering hygiene rather than periodically remediated.

Cross-Reference

Velocity sits inside the business-impact stack

Velocity is the most visible business-metric impact. The companion pages cover the less-visible impacts: onboarding time-to-productive, cost of revenue (COGS), gross margin, and burn rate. The pitch pages translate velocity into stakeholder-specific framing: CFO, CEO, board.

For the engineering-practitioner version of the velocity story, including the specific tooling and patterns that move the four keys, see the sister site technicaldebtcost.com. The cicdcost site (a portfolio sibling) covers the CI/CD-specific cost arithmetic that intersects with the lead-time metric.

Field Notes

Frequently asked questions

What is the DORA velocity gap between elite and low performers?

DORA's 2024 State of DevOps report shows roughly a 38x deploy frequency gap, 106x lead time, 6,570x recovery time, and 5x change-failure rate between elite and low performers. The aggregated gap is much larger than any individual metric and almost entirely traceable to accumulated tech debt.

How is velocity measured for tech-debt purposes?

The DORA four keys (deploy frequency, lead time, MTTR, change-failure rate) are the industry standard. Most engineering organisations can derive them from their existing CI/CD and incident systems in a few weeks of work, rather than the months of custom instrumentation often assumed.
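To make the measurement concrete, here is a minimal sketch of computing the four keys from deployment records. The `Deployment` record shape and the `four_keys` function are hypothetical; a real pipeline would populate these fields from whatever CI/CD and incident tooling the team already runs. It also assumes recovery time is measured from the failed deploy to service restoration, which is one common convention rather than the only one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record joined from CI/CD and incident data."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def four_keys(deployments: list[Deployment], period_days: int) -> dict:
    """Compute the four keys over a reporting period of `period_days`."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    # Only failures with a recorded restoration time contribute to MTTR.
    recoveries = [d.restored_at - d.deployed_at
                  for d in failures if d.restored_at is not None]

    return {
        "deploys_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }
```

The median is used for lead time so that a single long-running change does not dominate the figure; a team could equally report a mean or a percentile, as long as the choice stays consistent between periods.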

Does moving from low to medium performer require a rewrite?

Usually no. DORA's research consistently shows that low performers can reach medium performance with operational changes (trunk-based development, automated testing, on-call ownership) before any structural codebase work. The biggest velocity gains often come before any code is touched.

Why is velocity a CEO concern and not just a CTO concern?

Velocity translates directly into roadmap risk and competitive responsiveness. A 38x deploy frequency gap means the elite-performer competitor can run 38 product experiments for every one the low-performer runs. The compounding cost of that gap accrues to the CEO's strategic position, not the CTO's operational metrics.

How do we improve DORA scores in a debt-heavy codebase?

Start with the lead time metric since it has the most operational levers (smaller PRs, automated tests, deployment automation). MTTR follows once observability and runbooks are mature. Deploy frequency and change-failure rate are the hardest to move and typically require structural debt work.

What is the dollar cost of being a low DORA performer?

It is hard to express as a single number, but the McKinsey 25-42% drag figure correlates strongly with DORA performance bands. Low performers tend to cluster at the high end of that range (38-42%); elite performers sit well below it (5-15%). The dollar cost is the drag percentage applied to the engineering opex line.
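The arithmetic can be sketched in a few lines. The function name and the opex figure below are hypothetical and purely illustrative; the drag percentages are the ranges quoted in the answer above.

```python
def drag_cost_range(engineering_opex: float,
                    drag_low: float,
                    drag_high: float) -> tuple[float, float]:
    """Apply a drag range (as fractions, e.g. 0.38 to 0.42) to annual opex."""
    if not 0.0 <= drag_low <= drag_high <= 1.0:
        raise ValueError("drag fractions must satisfy 0 <= low <= high <= 1")
    return engineering_opex * drag_low, engineering_opex * drag_high


if __name__ == "__main__":
    opex = 10_000_000  # hypothetical annual engineering opex, in dollars

    low_band = drag_cost_range(opex, 0.38, 0.42)    # low performers
    elite_band = drag_cost_range(opex, 0.05, 0.15)  # elite performers

    print(f"Low performer drag:   ${low_band[0]:,.0f} to ${low_band[1]:,.0f}")
    print(f"Elite performer drag: ${elite_band[0]:,.0f} to ${elite_band[1]:,.0f}")
```

The difference between the two ranges is a first-order estimate of what a debt-reduction initiative could recover on the same opex line, under the assumption that the drag figures transfer to the team in question.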
