AI & Machine Learning 4 min read

Why Venture Capital Needs T0 Data

The fundamental flaw in how VCs evaluate startups — and why capturing data at the moment of origin changes everything.

Mariana Canet

Head of Data

November 24, 2025

#T0 Data #Machine Learning #Data Science #VC Analytics

After months of research and model development, our data science team has reached a critical insight that’s reshaping how we think about startup prediction: the industry’s fundamental approach to data is broken.

The Endogenous Bias Problem

When we started building predictive models for venture capital, we did what everyone does. We gathered data from Crunchbase, PitchBook, and CB Insights. We trained models on funding rounds, exits, and failures.

The results were… mediocre.

Not because our models were bad. Not because we lacked data. But because the data itself was fundamentally flawed.

The problem: endogenous bias.

When you train algorithms on performance data that isn’t captured at origin, you’re not predicting outcomes. You’re pattern-matching on signals that are already contaminated by the outcomes you’re trying to predict. The features are partly a consequence of the label, which is what statisticians call endogeneity.

Let me explain with an example:

Imagine you’re trying to predict which startups will succeed. You train your model on data from Series B companies. The data looks clean. The patterns seem clear.

But here’s the catch: Series B companies have already been selected. They’ve already passed through multiple filters:

  1. They convinced an angel investor
  2. They survived to raise a seed round
  3. They impressed Series A investors
  4. They grew enough to merit a Series B

Your model isn’t learning to identify future winners. It’s learning to recognize companies that have already won their early battles. The selection bias is baked into every data point.
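
A toy simulation makes the effect concrete. The sketch below is illustrative only: the distributions, the four pass/fail stages, and the names `signal` and `outcome` are assumptions made for the example, not real startup data. Every simulated company gets a signal at origin, its eventual outcome is that signal plus luck, and it survives only if it clears each funding stage.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, stages=4, seed=0):
    """Return (everyone, survivors) as lists of (signal, outcome) pairs."""
    rng = random.Random(seed)
    everyone, survivors = [], []
    for _ in range(n):
        signal = rng.gauss(0, 1)             # hypothetical signal at origin
        outcome = signal + rng.gauss(0, 1)   # eventual outcome: signal plus luck
        # Each stage (angel, seed, Series A, Series B) is a noisy filter on the signal.
        passed = all(signal + rng.gauss(0, 1) > 0 for _ in range(stages))
        everyone.append((signal, outcome))
        if passed:
            survivors.append((signal, outcome))
    return everyone, survivors


everyone, survivors = simulate()
corr_all = pearson(*zip(*everyone))
corr_survivors = pearson(*zip(*survivors))

print(f"companies at origin: {len(everyone)}, survivors: {len(survivors)}")
print(f"signal/outcome correlation, everyone:  {corr_all:.2f}")
print(f"signal/outcome correlation, survivors: {corr_survivors:.2f}")
```

In this setup the correlation among survivors comes out lower than in the full population. The filters have already used up much of the variation in the signal, so a model trained only on survivors sees a weaker and distorted version of the relationship it is supposed to learn.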

Same Data, Same Results

Here’s the uncomfortable truth about VC analytics: everyone uses the same data.

  • Crunchbase: 80% of the industry
  • PitchBook: The “premium” alternative
  • CB Insights: For those who want charts

The data sources are commoditized. The features are commoditized. The insights are commoditized.

When everyone trains on the same data, everyone reaches the same conclusions. There’s no edge. No alpha. Just expensive confirmation of what everyone already knows.

The T0 Solution

We asked a different question: What if we captured data before the selection bias occurs?

This led us to T0 — the moment of origin. Specifically, the original pitch deck a founder creates before they’ve talked to anyone.

Think about what a first pitch deck contains:

  • Raw founder psychology
  • Unfiltered market assumptions
  • Genuine (not coached) financial thinking
  • Authentic team dynamics
  • Real competitive positioning

To our knowledge, this data has never been systematically captured. We know of no database of original pitch decks with outcome tracking.

Until now.

Building the T0 Database

With PULSE, our AI-powered pitch deck analysis engine, we’re building something unprecedented: a database of startup DNA captured at the moment of creation.

Here’s what we’re extracting:

Quantitative Signals

  • Financial projection patterns
  • Market sizing approaches
  • Growth rate assumptions
  • Burn rate expectations
  • Valuation anchors

Qualitative Signals

  • Narrative structure
  • Problem articulation clarity
  • Solution presentation confidence
  • Team positioning
  • Competitive framing

Meta Signals

  • Deck design sophistication
  • Information organization
  • Emphasis patterns
  • What’s included vs. omitted
  • Storytelling coherence
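
One way to picture a single entry in such a database is a record that groups the three signal families above and keeps the outcome separate from the features. This is a minimal sketch with hypothetical names (`T0Record`, `feature_vector`, and the example signal keys are invented for illustration); it does not describe PULSE's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class T0Record:
    """Hypothetical record for one pitch deck captured at origin."""
    deck_id: str
    captured_on: date
    quantitative: dict[str, float] = field(default_factory=dict)  # e.g. growth rate assumptions
    qualitative: dict[str, float] = field(default_factory=dict)   # e.g. problem clarity score
    meta: dict[str, float] = field(default_factory=dict)          # e.g. deck organization score
    outcome: Optional[bool] = None       # unknown at capture time, filled in later
    outcome_on: Optional[date] = None

    def feature_vector(self) -> dict[str, float]:
        """Flatten the three signal groups into one namespaced mapping."""
        merged: dict[str, float] = {}
        groups = (("quant", self.quantitative),
                  ("qual", self.qualitative),
                  ("meta", self.meta))
        for prefix, signals in groups:
            for name, value in signals.items():
                merged[f"{prefix}.{name}"] = value
        return merged


record = T0Record(
    deck_id="deck-001",
    captured_on=date(2025, 1, 15),
    quantitative={"growth_rate_assumption": 0.4},
    meta={"slide_count": 12.0},
)
print(record.feature_vector())
```

Keeping `outcome` outside the feature groups is the point of the design: everything a model sees is fixed at capture time, and the label arrives later.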

The Model Advantage

With T0 data, our models learn from uncontaminated signals. We’re not pattern-matching on survivors. We’re identifying the raw characteristics that correlate with future outcomes.

Early results are promising:

| Metric              | Traditional Models | T0 Models |
|---------------------|--------------------|-----------|
| Prediction Accuracy | 62%                | 79%       |
| False Positive Rate | 34%                | 18%       |
| Signal-to-Noise     | Low                | High      |

(Based on internal validation against 600+ startups with known outcomes)
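
The post does not state how these percentages are computed. Under the usual confusion-matrix definitions they would look like the sketch below. The counts are invented to show the arithmetic and are not the validation results. Note that "false positive rate" is sometimes used loosely for the false discovery rate, so both are shown.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def false_positive_rate(fp, tn):
    """Share of actual failures that the model flagged as winners."""
    return fp / (fp + tn)


def false_discovery_rate(tp, fp):
    """Share of flagged winners that actually failed."""
    return fp / (tp + fp)


# Made-up counts for 100 hypothetical startups.
tp, fp, tn, fn = 30, 10, 50, 10

print(f"accuracy:             {accuracy(tp, fp, tn, fn):.1%}")
print(f"false positive rate:  {false_positive_rate(fp, tn):.1%}")
print(f"false discovery rate: {false_discovery_rate(tp, fp):.1%}")
```

For an investor, the false discovery rate is often the more relevant figure, since it describes how many of the deals the model recommends turn out to be misses.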

The Compounding Advantage

Here’s what makes this approach particularly powerful: it compounds.

Every pitch deck we analyze adds to our training data. Every outcome we track validates or refines our models. Every cycle makes our predictions sharper.
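
The loop described here can be sketched as a small registry in which features are frozen when the deck arrives and outcomes are attached later. The class and method names (`DeckRegistry`, `add_deck`, `record_outcome`, `training_set`) are hypothetical and chosen for the example. This is an illustration of the idea, not the production system.

```python
class DeckRegistry:
    """Hypothetical store: features are fixed at T0, labels arrive afterwards."""

    def __init__(self):
        self._features = {}   # deck_id -> features captured at origin
        self._outcomes = {}   # deck_id -> True/False once the outcome is known

    def add_deck(self, deck_id, features):
        if deck_id in self._features:
            # Refusing overwrites keeps later, outcome-tainted edits out of the features.
            raise ValueError(f"deck {deck_id!r} already captured")
        self._features[deck_id] = dict(features)

    def record_outcome(self, deck_id, succeeded):
        if deck_id not in self._features:
            raise KeyError(f"no T0 capture for deck {deck_id!r}")
        self._outcomes[deck_id] = bool(succeeded)

    def training_set(self):
        """Only decks with a known outcome can be used for training."""
        return [(self._features[d], self._outcomes[d]) for d in self._outcomes]


registry = DeckRegistry()
registry.add_deck("deck-001", {"growth_rate_assumption": 0.4})
registry.add_deck("deck-002", {"growth_rate_assumption": 1.5})
print(len(registry.training_set()))   # no outcomes recorded yet

registry.record_outcome("deck-001", succeeded=True)
print(len(registry.training_set()))   # one labeled example available
```

The training set grows only as outcomes come in, and the features for each deck never change after capture, which is what keeps the signals uncontaminated.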

Traditional data providers are stuck. Their data is historical, static, and shared with everyone. Our data is proprietary, growing, and captured at the only moment that matters.

What This Means for VCs

If you’re an investor, T0 data changes your workflow:

  1. Earlier conviction: Identify promising startups before the crowd
  2. Better filtering: Reduce false positives with uncontaminated signals
  3. Deeper diligence: Understand founder psychology from day one
  4. Competitive edge: Access insights no one else has

The venture capital industry has been flying blind, using data that’s already been filtered by the outcomes we’re trying to predict.

T0 data is how we finally learn to see.


Our next post will explore WHISPER — how we transform T0 signals into actionable prediction scores. Subscribe to be notified.


Part of the Xylence team building the predictive intelligence layer for global capital.
