4. Supply chain and dependencies


Sixteen hours after the Shock, recovery has ground to a halt. At first, progress was real. A few internal services responded again. Some systems that had gone dark now answer health checks. The IT team thought the worst was behind them. Then recovery stalled.

The nightly database snapshots had completed just in time. On paper, nothing is lost. In practice, those databases were managed services that the team never operated. Faced with raw data files, no one knows how to recreate (let alone operate) a database cluster from scratch.

Anything that requires a build is blocked. Most of the source code has been reassembled from the local backups of a few old-school engineers, but the thousands of dependencies it requires are inaccessible.

Worse, the services that are technically back remain unreachable. The company no longer controls its DNS. Domains cannot be updated, traffic cannot be rerouted, certificates cannot be reissued. Some systems may be alive, but they are invisible. The company is completely cut off from the outside world.

Failure mode

Modern businesses are not standalone systems. They are assemblies of services, stitched together through APIs, SDKs, identity layers, and managed platforms. This architecture enables speed and scale, but it also creates hidden, compounding dependencies that become dangerous under widespread disruption.

Most SaaS failures do not start inside the company:

  • a single identity provider gating access to all operational tools,
  • DNS or CDN outages severing access to otherwise healthy systems,
  • online CI/CD platforms becoming unavailable,
  • container image registries becoming unreachable (e.g. Docker or Kubernetes registries),
  • build dependencies becoming inaccessible (e.g. open-source repositories, PyPI, npm),
  • critical APIs becoming unavailable (e.g. LLMs, OCR services),
  • logging and monitoring systems failing when they are most needed,
  • payment, messaging, or notification providers failing.

Objective

Preparedness in this area has three core objectives:

  • Visibility: understand which vendors are critical to operating and recovering the business. Make hidden dependencies visible.
  • Control: retain the ability to act even when key vendors fail.
  • Recoverability: avoid dependency patterns that make recovery impossible under stress.

The goal is not eliminating vendors. It is preventing any single vendor failure from becoming existential.

Solutions

Map critical vendor dependencies

SaaS tools have become deeply integrated into development workflows: developers code in the cloud, host their version control online, and so on. Because these services work reliably most of the time, teams rarely model what happens when they are unavailable.

Preparedness starts with an explicit dependency map that extends beyond infrastructure. Whereas ISO 27001 and SOC 2 assess vendor dependency to preserve service continuity in the event of an isolated provider failure, preparedness aims to define a recovery path toward a Minimal Survivable Service under conditions of systemic provider unavailability.

Create a simple table: list the vendors you currently use. Define whether each is required for your Minimal Survivable Service. Identify a replacement or a workaround. Assess the steps required to reach readiness, whether you start now or on the spot.

| Component | Provider | Required for MSS? | Replacement? | Workaround? | Readiness? |
| --- | --- | --- | --- | --- | --- |
| Compute | Amazon EC2 | Yes | Hetzner, OVH | | |
| Storage | Amazon DynamoDB | Yes | Self-hosted MongoDB | | Important rework of production code |
| LLM API | ChatGPT API | Yes | Mistral | | Isolate calls behind LangChain |
| SMS delivery for 2FA | Twilio | No | | Disable 2FA temporarily | |
| CDN | Cloudflare | No | | Change DNS | Self-host JS dependencies |
| DNS | Amazon Route 53 | Yes | OVH | | Separate platform domain from corporate domain |
| Source code versioning | GitHub | Yes | Self-hosted Git | | Need to set up |
| Build dependencies | npm | Yes | Self-hosted npm registry | | Need to set up |
| CI pipeline | GitHub Actions | Yes | TBD | | Need to explore |
| Ticketing | Atlassian | No | | Email | |
| Payment | | | | | |
Make sure you go beyond the obvious infrastructure and SaaS tools. Review identity, monitoring, billing, third-party APIs, build dependencies, support tools, analytics, and more.
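The map is more useful if it is also machine-readable, so that gaps can be flagged automatically. A minimal sketch, assuming a hypothetical in-code vendor list (field names and entries are illustrative, not a recommendation):

```python
# A machine-readable version of the vendor dependency map.
# Entries and field names are illustrative assumptions.
VENDORS = [
    {"component": "Compute", "provider": "Amazon EC2", "mss": True,
     "replacement": "Hetzner, OVH", "workaround": ""},
    {"component": "CI pipeline", "provider": "GitHub Actions", "mss": True,
     "replacement": "", "workaround": ""},
    {"component": "Ticketing", "provider": "Atlassian", "mss": False,
     "replacement": "", "workaround": "Email"},
]

def unresolved_critical(vendors):
    """MSS-critical components with neither a replacement nor a workaround."""
    return [v["component"] for v in vendors
            if v["mss"] and not v["replacement"] and not v["workaround"]]

print(unresolved_critical(VENDORS))  # ['CI pipeline']
```

Run as part of CI or a periodic review, such a check keeps the dependency map honest: any MSS-critical vendor without an exit plan shows up immediately.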

We have started to list sovereign alternatives to the main SaaS tools here.

Deliverables:
- Map critical SaaS and infrastructure dependencies.
- Decision on each critical component required for the MSS: is the lock-in acceptable?

Mandatory XKCD here:

XKCD #2347 – Dependency
Comic by xkcd.com © Randall Munroe.

Reducing dependency risk without over-engineering

To take preparedness further, you can assign an owner to each dependency and size the effort required. It is up to you to decide whether to do the work now, for preparedness, or to execute it only if an event occurs.

Digital preparedness largely aligns with pragmatic engineering and infrastructure best practices: 

  • Centralize all third-party integrations in well-named modules or services.
  • Introduce thin abstraction layers: if replacing a dependency requires editing more than one module, you’re too tightly coupled.
  • Ensure external API calls fail gracefully.
  • Prefer boring, well-understood tech for critical paths.
  • Avoid lock-in to proprietary cloud services.
  • Use infrastructure-as-code. 

Do not push teams into multi-cloud fantasy architectures or massive rewrites, but limit the number of vendors in critical paths towards the Minimal Survivable Service.
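The abstraction-layer and graceful-failure points above can be sketched in a few lines. In this hypothetical example, the provider functions stand in for real SDK calls, and the primary outage is simulated:

```python
from typing import Callable

# Stand-ins for real vendor SDK calls (hypothetical, for illustration).
def call_primary(prompt: str) -> str:
    raise ConnectionError("primary LLM provider unreachable")  # simulated outage

def call_fallback(prompt: str) -> str:
    return f"[fallback] {prompt}"

class LLMGateway:
    """Single module through which all LLM traffic flows (thin abstraction layer).
    Swapping vendors means editing this class only, never the call sites."""
    def __init__(self, providers: list[Callable[[str], str]]):
        self.providers = providers

    def complete(self, prompt: str) -> str:
        for provider in self.providers:
            try:
                return provider(prompt)
            except Exception:
                continue  # fail gracefully: try the next provider
        return ""  # degraded but non-crashing default

gateway = LLMGateway([call_primary, call_fallback])
print(gateway.complete("hello"))  # [fallback] hello
```

Because every call site depends on `LLMGateway` rather than a vendor SDK, replacing ChatGPT with Mistral (as in the table above) touches one module, not the whole codebase.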

Deliverables:
- Double-check how your development guidelines handle external dependency management.

Beware of managed services

For components that are required to operate the Minimum Survivable Service, managed services (e.g. cloud-hosted databases, message queues) should be treated with caution. Convenience must be weighed against exitability: can the data be exported quickly, can the service be re-created elsewhere, can encryption keys be accessed, can operational control be regained without the vendor?

Dependency risk is not only technical. It is also human. As platforms abstract more complexity away, teams gradually lose the skills required to operate systems directly. 

For components that are part of the Minimum Survivable Service, companies must consciously preserve hands-on expertise. This does not mean rejecting managed services, but ensuring that the team retains the ability to operate equivalent systems independently if required. Documented runbooks, periodic self-hosted exercises, and shared operational ownership help keep this knowledge alive.
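A restore drill can be rehearsed in miniature. The sketch below uses SQLite purely as a stand-in for a managed database: export the data, rebuild the schema and contents in a fresh instance, and verify the round-trip, which is exactly what a real exitability exercise must prove at scale.

```python
import sqlite3

def export_rows(db):
    """Export all data, standing in for a real dump/export of a managed service."""
    return db.execute("SELECT id, name FROM users ORDER BY id").fetchall()

def restore_drill(rows):
    """Rebuild schema and data in a fresh database, as a stand-in for
    re-creating a managed service elsewhere, and return what was restored."""
    fresh = sqlite3.connect(":memory:")
    fresh.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    fresh.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return export_rows(fresh)

# Simulate the managed service holding live data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "linus")])

original = export_rows(source)
assert restore_drill(original) == original  # the drill round-trips cleanly
```

The point is not the code but the habit: if the team has never executed the export-and-restore loop, the snapshot files in the opening story are exactly as useful as they were to that IT team.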

Deliverables:
- Review all managed services and verify whether critical operational competencies are missing.

Beware of custom workflows

Modern SaaS platforms encourage deep customization: bespoke workflows, automations, triggers, and scripts embedded directly inside the vendor’s environment. While powerful in normal conditions, these custom workflows become a hard lock-in vector. The more logic you embed inside a SaaS tool, the more you turn that vendor into an execution dependency.

Unlike data, these workflows are rarely portable. They are tightly coupled to the vendor’s execution engine, event model, and proprietary logic. When access to the platform is lost, there is often no clean export, no reproducible definition, and no equivalent runtime elsewhere. Even when an alternative tool exists, reproducing the behavior requires time - precisely what is missing during a crisis.

For systems that are part of the Minimum Survivable Service, prefer out-of-the-box capabilities that will likely be available in alternative, sovereign solutions. If implementing a custom workflow is absolutely necessary, make sure to implement it in code you control, version, and can redeploy independently. 
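As an illustration, here is a hypothetical escalation rule kept as plain, version-controlled code rather than inside a vendor's automation engine; the rule itself is invented for the example:

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    priority: str
    hours_open: float

def escalation_rule(ticket: Ticket) -> str:
    """A workflow rule that lives in the company's own repository: portable,
    testable, and redeployable against any ticketing backend."""
    if ticket.priority == "critical" or ticket.hours_open > 48:
        return "escalate"
    return "queue"

print(escalation_rule(Ticket(priority="critical", hours_open=1)))  # escalate
print(escalation_rule(Ticket(priority="low", hours_open=2)))       # queue
```

The same logic expressed as a vendor-side automation would vanish with the vendor; expressed as code, it only needs a new trigger to run elsewhere.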

Deliverables:
- Review whether you have critical custom workflows implemented within vendor software.

Keeping access to support

In a large-scale outage, support channels may be overwhelmed, response times may degrade, and escalation paths may be unavailable. 

If corporate email is down, vendor support teams lose their main verification channel. Requests to restore access, unlock accounts, or change billing details are no longer coming from a recognized domain or mailbox. Even legitimate founders or executives may be treated as unverified third parties.

If password managers or identity platforms are unavailable, the problem compounds. Account identifiers, customer IDs, tenant names, contract references, and support case numbers are often stored only inside these tools. Without them, you may know which service you use but not how the vendor identifies you. Recovery stalls not because access is denied, but because you cannot even reference the correct account.

Deliverables:
- Offline access to proof of ownership documents
- Offline access to vendor account identifiers, references, passwords

For the most critical vendors, you can pre-agree escalation paths that do not require mailbox access, or pre-register secondary contact channels (e.g. from another domain).
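One way to keep those identifiers available is to render them from a small version-controlled record into a plain-text sheet and store a printed copy offline. The entries below are hypothetical placeholders:

```python
# Hypothetical vendor account records; real values would come from your
# own inventory, never be committed in clear text to a public repository.
VENDOR_ACCOUNTS = [
    {"vendor": "Amazon Web Services", "account_id": "123456789012",
     "support_plan": "Business", "owner": "cto@example.com"},
    {"vendor": "Twilio", "account_id": "AC-placeholder",
     "support_plan": "Standard", "owner": "ops@example.com"},
]

def render_sheet(accounts):
    """Render account references as printable plain text for offline storage."""
    lines = ["VENDOR ACCOUNT SHEET (store a printed copy offline)"]
    for a in accounts:
        lines.append(f"- {a['vendor']}: account {a['account_id']}, "
                     f"plan {a['support_plan']}, owner {a['owner']}")
    return "\n".join(lines)

print(render_sheet(VENDOR_ACCOUNTS))
```

Regenerating and reprinting the sheet on a schedule keeps it current; a sheet that is a year stale fails exactly when the narrative above begins.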

Conclusion

Most companies discover they depend on 20-40 external services. Digital preparedness means treating vendor dependencies as first-class risks, mapping them honestly, and designing the business to survive if those dependencies break.