
The Strangler Fig Pattern: How to Modernize a Legacy System Without Stopping the Business

Niclas Kusenbach

In 2004, Martin Fowler named a migration strategy after a tree. The strangler fig grows around a host tree, slowly replacing it — the original eventually dies and falls away, and the fig stands on its own. The metaphor is exact: you wrap new functionality around the old system, gradually shift traffic to the new paths, and retire legacy code only once its replacement is proven in production.

In our experience, it is the migration approach that works most reliably for systems that cannot go offline.

Why big-bang rewrites fail

The appeal of a full rewrite is obvious. Start clean. No legacy constraints. Modern stack, test coverage, CI from day one. The reality is that most large rewrites take two to three times longer than estimated, go over budget, and frequently get cancelled before they ship — leaving the team with two half-working systems instead of one.

The problems are structural, not a matter of team quality:

  • You are building blind. The old system encodes years of business logic, edge cases, and regulatory requirements that were never formally documented. You only discover them when you miss them.
  • The target moves. While you rewrite, the business keeps running on the old system. By the time you are ready to cut over, the systems have diverged.
  • The cutover is a single point of failure. No matter how much you test, the switch from old to new is a high-risk moment. If something goes wrong, you roll back and lose months of work.

How the strangler fig works

The pattern has three phases, repeated until the old system is gone:

1. Identify a seam

A seam is a natural boundary in the existing system: a discrete piece of functionality that can be extracted without changing anything else. Good seams include a single API endpoint, a background job, a reporting module, or a specific user-facing workflow.

The first seam you pick should be low-risk and high-value. Low-risk because you are still learning the pattern. High-value because it proves the approach to stakeholders.

2. Build in parallel

Write the replacement for that seam in the new stack. Run both the old and new implementations simultaneously. Do not delete old code yet.

At the boundary — usually an API gateway, a feature flag, or a routing layer — you control which implementation handles each request. Initially, 100% goes to the old path. You can shift traffic gradually: 1%, 10%, 50%, 100%.

         ┌──────────────────────────────┐
         │         API Gateway          │
         └──────────────┬───────────────┘
                        │
            ┌───────────┴───────────┐
            │                       │
     [old system]             [new service]
     (90% of traffic)         (10% of traffic)
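
The percentage split can be sketched in a few lines. This is a minimal illustration, not a gateway configuration: the function names `bucket` and `choose_backend` and the idea of keying on a customer or shipment ID are assumptions made for the example. Hashing a stable key, rather than picking randomly per request, means the same caller always lands on the same implementation at a given percentage.

```python
import hashlib


def bucket(request_key: str) -> int:
    """Map a stable request key (for example a customer ID) to a bucket in 0..99."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_backend(request_key: str, new_percent: int) -> str:
    """Return "new" for roughly new_percent% of keys and "old" for the rest."""
    if not 0 <= new_percent <= 100:
        raise ValueError("new_percent must be between 0 and 100")
    return "new" if bucket(request_key) < new_percent else "old"


if __name__ == "__main__":
    # Hypothetical keys, used only to show how the observed share tracks the target.
    keys = [f"customer-{i}" for i in range(10_000)]
    for percent in (0, 1, 10, 50, 100):
        share = sum(choose_backend(k, percent) == "new" for k in keys) / len(keys)
        print(f"target {percent:>3}% -> observed {share:.1%}")
```

Because a key's bucket never changes, raising the percentage only ever moves callers from the old path to the new one. Nobody bounces back and forth between implementations as the rollout progresses.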

3. Validate, then retire

Once the new implementation is handling 100% of traffic and has been stable in production for an agreed period, the old code is deleted. Not archived. Deleted.

This is important. Code that is "kept just in case" becomes code that is depended on, then code that is maintained, then legacy code that you have to migrate again in five years.

Practical concerns

Database migrations. The hardest part of any migration is the data layer. The strangler fig does not make this easy — it makes it manageable. While running in parallel, both systems write to the same database, or you run a synchronisation layer between the old and new schemas. Column-by-column migrations, using tools like pgroll or Flyway, let you evolve the schema incrementally without downtime.
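
An incremental column migration might look like the sketch below, using an in-memory SQLite database so it runs anywhere. The `shipments` table, the `status_code` column, and the status mapping are all hypothetical, and this is not how pgroll or Flyway work internally. It only illustrates the general sequence: add the new column, write to both columns during the transition, and backfill old rows in small batches.

```python
import sqlite3

# Hypothetical mapping from the legacy text status to the new integer code.
STATUS_CODES = {"created": 1, "in_transit": 2, "delivered": 3}


def expand(conn):
    """Step 1: add the new column as nullable so existing writers keep working."""
    conn.execute("ALTER TABLE shipments ADD COLUMN status_code INTEGER")


def write_status(conn, shipment_id, status):
    """Step 2: during the transition, every write fills both columns."""
    conn.execute(
        "UPDATE shipments SET status = ?, status_code = ? WHERE id = ?",
        (status, STATUS_CODES[status], shipment_id),
    )


def backfill(conn, batch_size=2):
    """Step 3: fill the new column for older rows in small batches."""
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id, status FROM shipments WHERE status_code IS NULL LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return total
        conn.executemany(
            "UPDATE shipments SET status_code = ? WHERE id = ?",
            [(STATUS_CODES[status], row_id) for row_id, status in rows],
        )
        conn.commit()  # short transactions keep locks brief
        total += len(rows)


def remaining(conn):
    """Rows that still lack the new column; zero means the backfill is done."""
    query = "SELECT COUNT(*) FROM shipments WHERE status_code IS NULL"
    return conn.execute(query).fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE shipments (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO shipments (status) VALUES (?)",
        [("created",), ("in_transit",), ("delivered",), ("created",), ("delivered",)],
    )
    expand(conn)
    write_status(conn, 1, "in_transit")  # a live write that arrives mid-migration
    return conn, backfill(conn)
```

The old `status` column is deliberately left in place. Dropping it belongs to the retirement step, once nothing reads it any more.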

Feature flags. Tools like LaunchDarkly, Unleash, or even a simple database table of flags give you fine-grained control over traffic routing. They also give you an instant kill switch if the new path behaves unexpectedly.
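
The "simple database table of flags" option could look roughly like this. The table layout, the flag name `tracking_v2`, and the helper names are assumptions made for illustration. They are not the API of LaunchDarkly or Unleash. The `bucket` argument stands for any stable number from 0 to 99 derived from the request.

```python
import sqlite3


def create_flag_table(conn):
    # enabled = 0 acts as the kill switch; rollout_percent is the new path's share.
    conn.execute(
        "CREATE TABLE feature_flags ("
        " name TEXT PRIMARY KEY,"
        " enabled INTEGER NOT NULL,"
        " rollout_percent INTEGER NOT NULL)"
    )


def set_flag(conn, name, enabled, rollout_percent):
    conn.execute(
        "INSERT OR REPLACE INTO feature_flags (name, enabled, rollout_percent) "
        "VALUES (?, ?, ?)",
        (name, int(enabled), rollout_percent),
    )


def use_new_path(conn, name, bucket):
    """Decide per request; unknown or disabled flags fall back to the old path."""
    row = conn.execute(
        "SELECT enabled, rollout_percent FROM feature_flags WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        return False
    enabled, percent = row
    return bool(enabled) and bucket < percent


def kill(conn, name):
    """Kill switch: route everything back to the old path without a deploy."""
    conn.execute("UPDATE feature_flags SET enabled = 0 WHERE name = ?", (name,))


def handle(conn, bucket, old_handler, new_handler):
    """Route one request through the flag named tracking_v2 (hypothetical)."""
    if use_new_path(conn, "tracking_v2", bucket):
        return new_handler()
    return old_handler()
```

Reading the flag on every request is what makes the kill switch instant. In a real system you would likely cache the lookup for a few seconds, which trades a little reaction time for less database load.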

Observability. You need metrics, logs, and traces for both the old and new paths during the transition. Without them, you cannot make confident decisions about when to shift traffic. Instrument both sides before you flip a single percentage point.
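
As a rough sketch of what "instrument both sides" means, the wrapper below records request counts, errors, and latency per path in memory. The class name `PathMetrics` and the path labels are hypothetical. A production setup would export these numbers to whatever metrics backend you already run rather than keep them in a dictionary.

```python
import time
from collections import defaultdict


class PathMetrics:
    """Per-path counters so the old and new implementations can be compared."""

    def __init__(self):
        self.requests = defaultdict(int)
        self.errors = defaultdict(int)
        self.latency_total = defaultdict(float)

    def observe(self, path, handler, *args):
        """Call handler(*args), recording the outcome under the given path label."""
        start = time.perf_counter()
        self.requests[path] += 1
        try:
            return handler(*args)
        except Exception:
            self.errors[path] += 1
            raise  # the caller still sees the failure; we only count it
        finally:
            self.latency_total[path] += time.perf_counter() - start

    def error_rate(self, path):
        total = self.requests[path]
        return self.errors[path] / total if total else 0.0

    def mean_latency(self, path):
        total = self.requests[path]
        return self.latency_total[path] / total if total else 0.0
```

With both paths reporting through the same wrapper, a decision such as "move from 10% to 50%" can rest on a direct comparison of `error_rate("old")` and `error_rate("new")` instead of on a feeling that nothing has broken yet.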

Team knowledge. The team who built the old system holds context that is not in the code. Their involvement in the migration is not optional — it is the difference between a migration that ships and one that discovers a critical edge case six months in.

When not to use the strangler fig

The pattern works best when the system has identifiable boundaries — an API surface, discrete modules, separable workflows. If the legacy codebase is a monolithic tangle where every function reaches into every other, you will need to do some refactoring work first to create the seams. That work is worth doing; it is not wasted.

The strangler fig is also not appropriate when the business genuinely cannot tolerate two implementations running simultaneously — for example, if the cost of maintaining the routing layer exceeds the cost of a planned outage. For most production systems that concern is theoretical. Most businesses have more tolerance for a careful dual-track migration than they do for the risk of a failed cutover.

What a migration actually looks like

A mid-market logistics company comes to us with a PHP monolith: 180,000 lines, no tests, last updated in 2019. They process 3,000 shipments a day and cannot take downtime.

We start with the tracking API, a well-defined surface that accounts for 40% of inbound traffic. Over six weeks, we build a new service in Go that runs behind the same API gateway and starts at 0% traffic. We add observability, run load tests, and fix three bugs that only surface under production traffic patterns. At week seven, it is handling 100% of tracking requests. The PHP tracking code is deleted.

We repeat the process for the shipment creation workflow, then the billing module, then the admin interface. Eighteen months later, the PHP monolith is retired. It was live and processing shipments every day of those eighteen months. No planned downtime. No failed cutover.

That is the strangler fig in practice.


If you are running a system that your business depends on but cannot safely change, an audit is where we start — mapping the seams that make a migration like this possible.