Mar 26, 2026

Every Company Already Has a Digital Twin. It’s Just Locked in Backups

Most companies think of a “digital twin” as something futuristic: something you build with simulations, sensors, or expensive infrastructure, and something that belongs in manufacturing, not in a law firm, a hedge fund, or a SaaS company. But if you zoom in on how companies actually operate day to day, something more interesting shows up: you are already generating a complete, time-indexed record of how your business works. It’s just not being used...

The System You Already Have (But Don’t Use)

Take a typical quant fund or data-driven company.

Your stack probably looks like this:

  • kdb+ or Snowflake for structured data and queries

  • S3 or internal data lake for research datasets

  • MLflow / Weights & Biases for experiments

  • Python + Pandas / NumPy / PyTorch for modeling

  • Slack, email, internal tools for decision-making context

  • Rubrik, Veeam, or Cohesity for backups

Each of these systems captures a different piece of reality:

  • What data you saw

  • What models you trained

  • What decisions you made

  • What outcomes followed

But they are all fragmented.

And critically, your backup system is the only place where everything actually comes together over time.

It contains the full history:

  • Old model versions that were discarded

  • Past datasets before they were cleaned or filtered

  • Internal discussions before trades or product decisions

  • The exact state of your systems at specific points in time

Backups are not just storage.

They are the closest thing you have to a complete historical replay of your company.

Why This Matters Now

Modern AI systems are not limited by modeling techniques anymore.

They are limited by training data that actually reflects reality.

And most companies solve this the wrong way.

They go external:

  • Buying datasets from Snowflake Marketplace

  • Paying for alternative data feeds

  • Scraping public data

  • Building synthetic datasets

At the same time, they are sitting on 10–20 years of their own operational history, stored for compliance, never used for learning.

So you get a strange situation:

  • Your AI models are trained on generic data

  • Your real institutional knowledge is locked away

You are effectively ignoring the only dataset that actually captures how your business behaves.

What a “Digital Twin” Actually Means in Practice

A digital twin is not a 3D simulation.

It is the ability to answer one simple question:

“What would have happened if we made a different decision?”

To do that, you need three things:

  1. State — what the world looked like at a moment in time

  2. Actions — what decisions were made

  3. Outcomes — what happened next

Your backups already contain all three.

The problem is that they are not accessible in a way that AI systems can use.
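
To make the three ingredients concrete, here is a minimal sketch of a single "decision record" in Python. The `DecisionRecord` type, its field names, and the toy values are all assumptions made for illustration; they are not a Duplicati schema or API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class DecisionRecord:
    """One point in a company's history (hypothetical structure)."""
    timestamp: datetime              # when the decision was made
    state: Dict[str, Any]            # what the world looked like at that moment
    action: str                      # what decision was made
    outcome: Optional[float] = None  # what happened next (None if unknown)


def is_replayable(record: DecisionRecord) -> bool:
    """A record can support a 'what if' question only if all three parts exist."""
    return bool(record.state) and bool(record.action) and record.outcome is not None


# Toy values, invented for illustration only.
record = DecisionRecord(
    timestamp=datetime(2021, 3, 15),
    state={"signal": 0.8, "volatility": 0.2},
    action="hold",
    outcome=-0.01,
)
incomplete = DecisionRecord(datetime(2021, 3, 16), {"signal": 0.1}, "buy")

print(is_replayable(record), is_replayable(incomplete))
```

The point of the sketch is the shape, not the fields: if any of the three parts is missing, the record is just data, not something you can replay.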

How This Works With Duplicati

Duplicati turns backups into something you can actually operate on.

Instead of treating them as encrypted archives for disaster recovery, it:

  • Extracts historical data into Parquet / Delta Lake on S3-compatible storage

  • Indexes and versions data across time

  • Preserves permissions and audit trails

  • Connects directly into your existing ML workflows

So instead of this:

“We need to rebuild a dataset from scratch to test this idea”

You get this:

“Load the system state from March 2021 and rerun the strategy”
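
Under the hood, "load the system state from March 2021" is a point-in-time lookup: find the latest snapshot taken at or before the requested date. A minimal sketch, assuming the snapshots have already been extracted into a list sorted by timestamp; `state_as_of` and the snapshot contents are hypothetical, not part of Duplicati:

```python
from bisect import bisect_right
from datetime import datetime
from typing import Any, Dict, List, Tuple

Snapshot = Tuple[datetime, Dict[str, Any]]


def state_as_of(snapshots: List[Snapshot], when: datetime) -> Dict[str, Any]:
    """Return the most recent snapshot taken at or before `when`.

    `snapshots` must be sorted by timestamp, oldest first.
    """
    timestamps = [ts for ts, _ in snapshots]
    idx = bisect_right(timestamps, when)  # count of snapshots at or before `when`
    if idx == 0:
        raise LookupError(f"no snapshot at or before {when.isoformat()}")
    return snapshots[idx - 1][1]


# Toy snapshot index; the contents are invented for illustration.
snapshots = [
    (datetime(2021, 1, 31), {"model_version": "v1"}),
    (datetime(2021, 2, 28), {"model_version": "v2"}),
    (datetime(2021, 4, 30), {"model_version": "v3"}),
]

march_state = state_as_of(snapshots, datetime(2021, 3, 15))
print(march_state["model_version"])  # v2
```

Note that the lookup never returns a later snapshot. That is what keeps a replay honest: the strategy sees only what existed at the time.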

A Concrete Example (Quant Workflow)

Imagine a quant researcher working on a new strategy.

Today, their workflow looks like this:

  1. Pull data from kdb+ or Snowflake

  2. Clean and reconstruct datasets manually

  3. Backtest using Python

  4. Compare results against current strategies

What’s missing is everything that was tried before.

  • Why was a similar strategy abandoned?

  • What data conditions existed at that time?

  • What did the model actually see?

With Duplicati, the workflow changes:

  1. Query historical system states directly from backup-derived datasets

  2. Load:

    • Previous model versions

    • Historical feature sets

    • Market conditions at that exact time

  3. Replay the decision environment

  4. Simulate alternative strategies against real past conditions

Now you are not just backtesting on data.

You are replaying your firm’s actual history.

That is a fundamentally different capability.
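
As a rough sketch of what "replaying the decision environment" could look like, the snippet below runs two policies over the same recorded history and compares the results. Everything here is an assumption for illustration: the `replay` function, the `signal` and `next_return` fields, and the toy numbers. It also assumes an alternative decision would not have changed the recorded outcomes, which is a simplification.

```python
from typing import Callable, Dict, List

History = List[Dict[str, float]]
Policy = Callable[[Dict[str, float]], int]  # returns a position: -1, 0, or +1


def replay(history: History, policy: Policy) -> float:
    """Apply `policy` to each recorded state and sum the resulting returns.

    The policy is shown only the signal recorded at decision time, never
    `next_return`, so it cannot look ahead.
    """
    total = 0.0
    for step in history:
        visible = {"signal": step["signal"]}
        total += policy(visible) * step["next_return"]
    return total


def original_policy(state: Dict[str, float]) -> int:
    """Stand-in for what was actually done: always hold a long position."""
    return 1


def alternative_policy(state: Dict[str, float]) -> int:
    """The 'what if': go long only when the signal is positive."""
    return 1 if state["signal"] > 0 else 0


# Toy history, invented for illustration; not real market data.
history = [
    {"signal": 0.5, "next_return": 0.02},
    {"signal": -0.3, "next_return": -0.01},
    {"signal": 0.1, "next_return": 0.01},
    {"signal": -0.2, "next_return": -0.03},
]

actual = replay(history, original_policy)
what_if = replay(history, alternative_policy)
print(f"actual={actual:.4f} what_if={what_if:.4f}")
```

The design choice that matters is the separation between what the policy can see and what is used to score it. A real replay would need the same discipline across models, features, and market data.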

Beyond Quant: This Applies Everywhere

This is not specific to trading.

The same pattern exists in every company.

SaaS

You already have:

  • Product logs

  • Support tickets

  • Feature rollouts

  • User behavior over time

A digital twin lets you simulate:

  • “What if we shipped this feature earlier?”

  • “What if we handled support differently?”

Healthcare

You already have:

  • Patient records

  • Treatment decisions

  • Outcomes

A digital twin lets you analyze:

  • “What treatment path led to better outcomes under similar conditions?”

Enterprise Ops

You already have:

  • Internal communications

  • Planning documents

  • Execution timelines

A digital twin lets you understand:

  • “Why did this decision work in one case but fail in another?”

Why Backup Is the Only Place This Can Happen

No operational system gives you this.

  • Snowflake gives you structured snapshots

  • Databricks gives you pipelines

  • BI tools give you dashboards

But none of them give you complete, time-indexed history across systems.

Backup does.

It is the only system that:

  • Captures everything

  • Preserves it over time

  • Stores it immutably, provided the backup target is configured for immutability

That is why the shift matters.
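
For a sense of what "time-indexed history across systems" means in practice, here is a minimal sketch that merges per-system event streams into one timeline. It assumes each stream has already been extracted and sorted by timestamp; the `merged_timeline` helper, the source names, and the events are hypothetical.

```python
import heapq
from datetime import datetime
from typing import Iterable, Iterator, Tuple

Event = Tuple[datetime, str, str]  # (timestamp, source system, description)


def merged_timeline(*streams: Iterable[Event]) -> Iterator[Event]:
    """Lazily merge event streams that are each already sorted by time."""
    return heapq.merge(*streams, key=lambda event: event[0])


# Toy events, invented for illustration only.
chat = [
    (datetime(2021, 3, 1, 9, 0), "chat", "discussion: reduce position size"),
    (datetime(2021, 3, 1, 15, 0), "chat", "follow-up: review results"),
]
experiments = [
    (datetime(2021, 3, 1, 11, 0), "experiments", "run logged"),
]
warehouse = [
    (datetime(2021, 3, 1, 8, 0), "warehouse", "daily table refreshed"),
    (datetime(2021, 3, 1, 12, 0), "warehouse", "filter applied to dataset"),
]

timeline = list(merged_timeline(chat, experiments, warehouse))
for ts, source, description in timeline:
    print(ts.isoformat(), source, description)
```

Because `heapq.merge` is lazy, the same approach works when each stream is far too large to hold in memory at once.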

The Shift

Backup used to be about recovery.

Now it becomes:

  • A training data layer

  • A simulation engine

  • A decision replay system

Over time, this evolves into something much bigger:

A continuously updating digital twin of your company.

Not a model you build once.

But a system that reflects every decision, every dataset, and every outcome as they happen.

The Real Insight

The most valuable AI dataset a company owns is not external.

It is its own history.

The problem is not collection.

It is activation.

Duplicati exists to bridge that gap.

Not by replacing your data stack, but by unlocking the one system that already sees everything:

Your backups.

Get started for free

Pick your own backend and store encrypted backups of your files anywhere, online or offline. For macOS, Windows, and Linux.
