Article
Mar 26, 2026
Every Company Already Has a Digital Twin. It’s Just Locked in Backups
Most companies think of a “digital twin” as something futuristic: something you build with simulations, sensors, or expensive infrastructure; something that belongs in manufacturing, not in a law firm, a hedge fund, or a SaaS company. But if you zoom in on how companies actually operate day to day, something more interesting shows up: you are already generating a complete, time-indexed record of how your business works. It’s just not being used.
The System You Already Have (But Don’t Use)
Take a typical quant fund or data-driven company.
Your stack probably looks like this:
kdb+ or Snowflake for structured data and queries
S3 or internal data lake for research datasets
MLflow / Weights & Biases for experiments
Python + Pandas / NumPy / PyTorch for modeling
Slack, email, internal tools for decision-making context
Rubrik, Veeam, or Cohesity for backups
Each of these systems captures a different piece of reality:
What data you saw
What models you trained
What decisions you made
What outcomes followed
But they are all fragmented.
And critically, your backup system is the only place where everything actually comes together over time.
It contains the full history:
Old model versions that were discarded
Past datasets before they were cleaned or filtered
Internal discussions before trades or product decisions
The exact state of your systems at specific points in time
Backups are not just storage.
They are the closest thing you have to a complete historical replay of your company.
Why This Matters Now
Modern AI systems are not limited by modeling techniques anymore.
They are limited by training data that actually reflects reality.
And most companies solve this the wrong way.
They go external:
Buying datasets from Snowflake Marketplace
Paying for alternative data feeds
Scraping public data
Building synthetic datasets
At the same time, they are sitting on 10–20 years of their own operational history, stored for compliance, never used for learning.
So you get a strange situation:
Your AI models are trained on generic data
Your real institutional knowledge is locked away
You are effectively ignoring the only dataset that actually captures how your business behaves.
What a “Digital Twin” Actually Means in Practice
A digital twin is not a 3D simulation.
It is the ability to answer one simple question:
“What would have happened if we made a different decision?”
To do that, you need three things:
State — what the world looked like at a moment in time
Actions — what decisions were made
Outcomes — what happened next
Your backups already contain all three.
The problem is that they are not accessible in a way that AI systems can use.
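As a minimal sketch of what "accessible" would mean, the three ingredients can be expressed as a plain data structure. All field names and values here are illustrative assumptions, not a schema Duplicati or any backup product actually emits:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class DecisionRecord:
    """One time-indexed entry in a company's history.

    Field names are hypothetical, for illustration only."""
    timestamp: datetime  # when this state was captured
    state: dict          # what the world looked like (datasets, configs, model versions)
    action: str          # what decision was made
    outcome: dict        # what happened next

record = DecisionRecord(
    timestamp=datetime(2021, 3, 15),
    state={"model_version": "v12", "dataset": "prices_2021_03.parquet"},
    action="deployed momentum strategy",
    outcome={"pnl": -0.02, "retired": True},
)

# A digital twin is, at minimum, the ability to query records like this
# across time and ask what a different `action` would have produced.
print(record.action)
```

In this framing, a backup archive already holds the raw material for `state`, `action`, and `outcome`; what is missing is the structured, queryable form.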
How This Works With Duplicati
Duplicati turns backups into something you can actually operate on.
Instead of treating them as encrypted archives for disaster recovery, it:
Extracts historical data into Parquet / Delta Lake on S3-compatible storage
Indexes and versions data across time
Preserves permissions and audit trails
Connects directly into your existing ML workflows
So instead of this:
“We need to rebuild a dataset from scratch to test this idea”
You get this:
“Load the system state from March 2021 and rerun the strategy”
A Concrete Example (Quant Workflow)
Imagine a quant researcher working on a new strategy.
Today, their workflow looks like this:
Pull data from kdb+ or Snowflake
Clean and reconstruct datasets manually
Backtest using Python
Compare results against current strategies
What’s missing is everything that was tried before.
Why was a similar strategy abandoned?
What data conditions existed at that time?
What did the model actually see?
With Duplicati, the workflow changes:
Query historical system states directly from backup-derived datasets
Load:
Previous model versions
Historical feature sets
Market conditions at that exact time
Replay the decision environment
Simulate alternative strategies against real past conditions
Now you are not just backtesting on data.
You are replaying your firm’s actual history.
That is a fundamentally different capability.
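The replay loop above can be sketched in a few lines. Everything here is a toy: the snapshot data, the strategy, and the `load_state` lookup are hypothetical stand-ins for backup-derived state, not a real Duplicati API:

```python
from datetime import date

# Backup-derived snapshots: system state as of each point in time.
snapshots = {
    date(2021, 3, 1): {"prices": [100.0, 101.5, 99.8, 102.3],
                       "model_version": "v12"},
    date(2022, 1, 1): {"prices": [102.3, 104.0, 103.1, 105.6],
                       "model_version": "v13"},
}

def load_state(as_of: date) -> dict:
    """Return the most recent snapshot at or before `as_of`."""
    eligible = [d for d in snapshots if d <= as_of]
    return snapshots[max(eligible)]

def momentum_strategy(prices: list[float]) -> str:
    """Toy strategy: go long if the last move was up."""
    return "long" if prices[-1] > prices[-2] else "flat"

# Replay: load the March 2021 environment and rerun the strategy
# against exactly what the firm saw then.
state = load_state(date(2021, 3, 15))
decision = momentum_strategy(state["prices"])
print(state["model_version"], decision)  # v12 long
```

Swapping `momentum_strategy` for an alternative and comparing decisions across snapshots is the "what would have happened" question from the previous section, asked against real past conditions.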
Beyond Quant: This Applies Everywhere
This is not specific to trading.
The same pattern exists in every company.
SaaS
You already have:
Product logs
Support tickets
Feature rollouts
User behavior over time
A digital twin lets you simulate:
“What if we shipped this feature earlier?”
“What if we handled support differently?”
Healthcare
You already have:
Patient records
Treatment decisions
Outcomes
A digital twin lets you analyze:
“What treatment path led to better outcomes under similar conditions?”
Enterprise Ops
You already have:
Internal communications
Planning documents
Execution timelines
A digital twin lets you understand:
“Why did this decision work in one case but fail in another?”
Why Backup Is the Only Place This Can Happen
No operational system gives you this.
Snowflake gives you structured snapshots
Databricks gives you pipelines
BI tools give you dashboards
But none of them give you complete, time-indexed history across systems.
Backup does.
It is the only system that:
Captures everything
Preserves it over time
Stores it immutably
That is why the shift matters.
The Shift
Backup used to be about recovery.
Now it becomes:
A training data layer
A simulation engine
A decision replay system
Over time, this evolves into something much bigger:
A continuously updating digital twin of your company.
Not a model you build once.
But a system that reflects every decision, every dataset, and every outcome as they happen.
The Real Insight
The most valuable AI dataset a company owns is not external.
It is its own history.
The problem is not collection.
It is activation.
Duplicati exists to bridge that gap.
Not by replacing your data stack, but by unlocking the one system that already sees everything:
Your backups.