From 175,000 Competitors
to 30 Portfolio Companies
Turning the world’s most exclusive scientific talent pool into a concentrated, high-conviction seed portfolio
Current Pipeline
Early-stage alumni-founded companies scoring in the top half (score ≥ 50), founded 2021+. Anonymized for LP review.
Portfolio Outcome Projections
Using calibrated per-company probabilities from the ML scoring model, applied to the current pipeline as a hypothetical portfolio.
Unicorn Probability Distribution
| Scenario | Probability |
|---|---|
Methodology: Each company’s P($1B+) is treated as an independent Bernoulli trial. Aggregate probabilities computed via Poisson approximation with λ = Σ p1B.
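The aggregation step can be sketched in a few lines. The per-company probabilities below are made up for illustration (the real pipeline values are anonymized); the Poisson approximation treats the portfolio's unicorn count as Poisson-distributed with rate λ = Σ p1B.

```python
import math

# Hypothetical per-company P($1B+) values (illustrative only --
# the real pipeline probabilities are anonymized for LP review).
p_1b = [0.05, 0.12, 0.30, 0.08, 0.02, 0.15]

# Poisson approximation: the number of unicorns across independent
# Bernoulli trials is approximately Poisson with rate lambda = sum(p).
lam = sum(p_1b)

def poisson_pmf(k: int, rate: float) -> float:
    """P(exactly k unicorns) under the Poisson approximation."""
    return math.exp(-rate) * rate**k / math.factorial(k)

p_zero = poisson_pmf(0, lam)     # P(no unicorns)
p_at_least_one = 1 - p_zero      # P(>= 1 unicorn)

print(f"lambda = {lam:.2f}")
print(f"P(0 unicorns)  = {p_zero:.3f}")
print(f"P(>=1 unicorn) = {p_at_least_one:.3f}")
```

The approximation is accurate when individual probabilities are small; for a pipeline with a few high-probability companies, exact enumeration (a Poisson-binomial) tightens the tail estimates.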
| Scenario | Gross MOIC | Net TVPI | Unicorns | $100M+ Exits | Gross Proceeds |
|---|---|---|---|---|---|
| P25 (Conservative) | - | - | - | - | - |
| P50 (Median) | - | - | - | - | - |
| P75 (Upside) | - | - | - | - | - |
Explore projected fund returns across 10,000 scenarios
Play with assumptions, run full Monte Carlo simulations, and see probability distributions.
Each simulated portfolio draws company outcomes from a pool of 127 real alumni-founded companies (funded, VC-investable, founded ≤ 2017). Two sampling modes are available: SFF Scored uses the ML model’s predictive power to weight draws — Tier 1 samples with 2.2× weight on $100M+ outcomes (AUC 0.742 at funding, walk-forward CV); Random draws uniformly from the pool, representing what any seed investor would get without the scoring edge.
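A minimal sketch of the two sampling modes, using a toy five-company pool in place of the real 127-company pool. The 2.2× weight on $100M+ outcomes mirrors the Tier 1 weighting described above; the pool entries and names are illustrative.

```python
import random

# Simplified outcome pool: (label, is_100m_plus). The real pool holds
# 127 alumni-founded companies with verified outcomes.
pool = [("unicorn_a", True), ("exit_b", True), ("active_c", False),
        ("closed_d", False), ("active_e", False)]

def draw(mode: str, rng: random.Random):
    """Draw one company outcome. 'scored' up-weights $100M+ outcomes
    2.2x (the Tier 1 weighting above); 'random' is uniform."""
    if mode == "random":
        return rng.choice(pool)
    weights = [2.2 if is_big else 1.0 for _, is_big in pool]
    return rng.choices(pool, weights=weights, k=1)[0]

rng = random.Random(42)
scored_hits = sum(draw("scored", rng)[1] for _ in range(100_000))
random_hits = sum(draw("random", rng)[1] for _ in range(100_000))
print(f"$100M+ draw rate, scored: {scored_hits/100_000:.3f}")
print(f"$100M+ draw rate, random: {random_hits/100_000:.3f}")
```

With 2 of 5 companies at $100M+, the scored mode draws winners at 4.4/7.4 ≈ 59% versus 40% uniform, which is the mechanical effect of the tier weighting.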
| Outcome Tier | Count | % of Pool | Example Values | Source |
|---|---|---|---|---|
| $1B+ (Unicorn) | 21 | 16.5% | $1B – $500B | Database |
| $100M – $999M | 16 | 12.6% | $100M – $700M | Database |
| $1M – $99M | 10 | 7.9% | $5M – $75M | Database |
| No known valuation | 80 | 63.0% | 45 active (no public valuation) · 19 acquired (value unknown) · 8 closed · 7 unknown · 1 IPO micro-cap. | |
| Tier | Scoring Quartile | $100M+ Rate | $1B+ Rate | Source |
|---|---|---|---|---|
| Tier 1 (Gold) | Q1 (n=32) | 62.5% | 34.4% | Database |
| Tier 2 (Teal) | Q2+Q3 (n=62) | 25.9% | 14.5% | Database |
| Pool Baseline (Random mode) | All (n=127) | 29.1% | 16.5% | Database |
Backtest tier rates assume perfect classification — every company the model ranks as Tier 1 truly belongs there. In practice, no model is perfect. Scoring Fidelity controls what fraction of companies are drawn from their assigned tier’s outcome distribution vs. the overall base-rate distribution.
At 100% fidelity, every company draws from its scored tier (full backtest rates). At 80% (default, calibrated to walk-forward AUC = 0.742 at first funding), each company has a 20% chance of drawing from the base pool instead, modeling the probability that the scoring model mis-ranked it. At 50%, tier assignment carries little residual signal and returns converge toward an unscored portfolio.
| Fidelity | Interpretation | Effective T1 $100M+ Rate |
|---|---|---|
| 100% | Perfect classifier (full backtest) | 62.5% |
| 80% (default) | Calibrated to AUC at first funding | 55.8% |
| 50% | Random classification (no signal) | 45.7% |
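The fidelity blend is a simple linear mix of tier rate and base rate. The sketch below reproduces the 62.5% and 55.8% rows exactly; the 50% row comes out at 45.8%, within rounding of the published 45.7%.

```python
def effective_rate(fidelity: float, tier_rate: float, base_rate: float) -> float:
    """Blend the tier's backtest rate with the pool base rate.
    With probability `fidelity` a company draws from its scored tier;
    otherwise it falls back to the base pool (a mis-ranked company)."""
    return fidelity * tier_rate + (1 - fidelity) * base_rate

T1_RATE, BASE_RATE = 0.625, 0.291   # Tier 1 and pool $100M+ rates above

for f in (1.0, 0.8, 0.5):
    print(f"fidelity {f:.0%}: effective T1 $100M+ rate = "
          f"{effective_rate(f, T1_RATE, BASE_RATE):.1%}")
```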
| Parameter | Value | Notes | Source |
|---|---|---|---|
| Entry stage | Seed | Post-money: $5M–$25M (adjustable, default $13M) | Adjustable |
| Per-round dilution | 20% | Each round after entry (adjustable, 15%–30%) | Industry |
| Avg rounds: unicorns | 4.7 | From actual company data | Database n=20 |
| Follow-on (T1) | Super pro rata | Offsets 2 dilution rounds | Industry |
| Follow-on (T2) | Pro rata | Offsets 1 dilution round | Industry |
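As a sketch of how these parameters combine: ownership after n post-entry rounds is the entry stake times (1 − dilution)^n, with follow-on modeled as offsetting dilution in some rounds, per the table. Treating follow-on as a pure dilution offset is a simplification (in practice follow-on buys additional shares); the $13M post-money and $200K check are the stated defaults.

```python
def final_ownership(entry_pct: float, rounds: int,
                    dilution: float = 0.20, offset_rounds: int = 0) -> float:
    """Ownership after `rounds` post-entry financings, each diluting by
    `dilution`. `offset_rounds` models follow-on that fully offsets
    dilution in that many rounds (2 for Tier 1 super pro rata, 1 for
    Tier 2 pro rata, per the table above)."""
    diluted_rounds = max(rounds - offset_rounds, 0)
    return entry_pct * (1 - dilution) ** diluted_rounds

# Example: $200K into a $13M post-money seed = ~1.54% entry ownership.
entry = 200_000 / 13_000_000
rounds = 5  # ~4.7 average rounds for unicorns, rounded up
print(f"No follow-on:       {final_ownership(entry, rounds):.3%}")
print(f"Tier 1 (offsets 2): {final_ownership(entry, rounds, offset_rounds=2):.3%}")
```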
| Parameter | Value | Notes | Source |
|---|---|---|---|
| Management fee | 2% | × fund size × 10yr life (modeled as 20% upfront) | Industry |
| Carry | 20% | Of profits above 1× return of capital | Industry |
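A simplified gross-to-net conversion under these terms. This is a sketch, not the fund's actual waterfall: it nets the 20% lifetime fee load and 20% carry above a 1× hurdle from gross proceeds in one step, ignoring fee recycling and distribution timing.

```python
def net_to_lps(gross_proceeds: float, fund_size: float,
               mgmt_fee_rate: float = 0.02, fund_life: int = 10,
               carry: float = 0.20) -> float:
    """Net LP proceeds under the terms above: 2% x fund size x 10 years
    (modeled as 20% of commitments upfront), 20% carry on profits above
    a 1x return of capital."""
    fees = mgmt_fee_rate * fund_life * fund_size   # 20% of fund size
    profit = max(gross_proceeds - fund_size, 0)    # above the 1x hurdle
    carried = carry * profit
    return gross_proceeds - fees - carried

fund = 15_000_000
gross = 75_000_000                     # hypothetical 5x gross outcome
net = net_to_lps(gross, fund)
print(f"Net TVPI: {net / fund:.2f}x")  # vs 5.00x gross MOIC
```

Under these assumptions a 5× gross portfolio nets to 4× TVPI, which is the spread between the Gross MOIC and Net TVPI columns in the scenario table above.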
Data & Methodology
Scoring System, Evaluation Framework & Results
Executive Summary
Science-fair competitions (ISEF, STS) select for a rare combination of technical depth, independent research ability, and competitive drive — traits that compound in venture-backed founders. Science Fair Fund uses a proprietary ML scoring system to rank these alumni at each investment stage, enabling three portfolio construction decisions: prioritizing sourcing, sizing initial checks, and concentrating follow-on capital.
The scoring system’s edge is early identification: at founding — before any institutional capital — the top tier produces $100M+ outcomes at 2.56× the baseline rate. Signal strengthens through funding and Series A as information accumulates, enabling disciplined capital allocation at every stage.
- Primary metric: $500M+ outcomes (stable sample, directly relevant to fund returns). $100M+ and $1B+ shown for context.
- Two evaluation frameworks: Selection Advantage (tier separation within each stage) and Model Validation (walk-forward CV + time-gated backtests).
- At Series A: Tier 1 achieves a 50% $500M+ hit rate (1.80× lift).
Data Universe
Reference cohort = VC-investable alumni-linked companies founded 2006–2020, with outcomes observed through early 2026. Every company has at least five years of maturity.
| Metric | Full Cohort | Funded ≥$500K | Series A+ |
|---|---|---|---|
| Alumni-linked companies | 240 | 148 | 105 |
| $100M+ outcomes | 38 (15.8%) | 37 (25.0%) | 37 (35.2%) |
| $500M+ outcomes | 25 (10.4%) | 25 (16.9%) | 25 (23.8%) |
| $1B+ outcomes | 21 (8.8%) | 21 (14.2%) | 21 (20.0%) |
172 of 240 companies (71.7%) are US/Canada-headquartered, consistent with the fund’s North American deployment focus.
Scoring Methodology
The fund operates two complementary evaluation systems, each answering a different question about alumni founders.
Ranking Model (V4 Ensemble). Each company receives a stage-specific score based on its highest-scoring founder. The score is a within-cohort rank — not a calibrated probability — designed to sort companies relative to peers at each decision point. It answers: who should we prioritize?
Outcome Prediction (Walk-Forward CV). Separately, a walk-forward cross-validated model tests whether the ranking generalizes forward in time — trained on earlier cohorts, evaluated on later ones — using only information available at each stage. It answers: does the model actually predict who wins?
Three Decision Points
At Founding. Competition results, education trajectory, and sector signals available before institutional funding. Focuses sourcing on the highest-potential alumni.
At First Funding. Founding signals augmented by round size, investor quality, and co-founder composition. Primary signal for initial check sizing.
At Series A. Funding signals plus milestone progression — time to Series A, round scaling, traction markers. Drives follow-on concentration.
Feature Importance by Stage
| Signal Category | At Founding | At Funding | At Series A |
|---|---|---|---|
| Competition/Awards | #2 | #3 | #5 |
| Skills | #4 | #4 | #4 |
| Education | #5 | #5 | #3 |
| Capital/Stage | — | #2 | #2 |
| Sector/Market | #1 | — | #6 |
Model Validation
The predictive test: does the model generalize forward in time? Walk-forward cross-validation trains on earlier founding-year cohorts, evaluates on later ones, and uses only information available at each stage.
| Decision Point | Walk-Forward AUC | Fund Application |
|---|---|---|
| At Founding | 0.666 | Sourcing priority within alumni universe |
| At First Funding | 0.742 | Initial check sizing ($100K–$250K range) |
| At Series A | 0.843 | Follow-on concentration decisions |
An AUC of 0.843 at Series A means the model correctly ranks a random $100M+ outcome above a random non-outcome 84% of the time.
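That rank-comparison reading is exactly how empirical AUC is computed: the fraction of (positive, negative) pairs where the positive scores higher, with ties counted half. A minimal illustration with toy scores (not real model output):

```python
from itertools import product

def auc(pos_scores, neg_scores):
    """Empirical AUC: probability a random positive outranks a random
    negative (ties count half) -- the interpretation given above."""
    pairs = list(product(pos_scores, neg_scores))
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)

# Toy scores: companies that reached $100M+ vs those that did not.
positives = [0.9, 0.8, 0.6]
negatives = [0.7, 0.4, 0.3, 0.2]
print(f"AUC = {auc(positives, negatives):.3f}")
```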
Precision/Recall at Tier 1 Threshold
| Stage | Selection % | Precision ($500M+) | Recall ($500M+) |
|---|---|---|---|
| At Founding | 25% | 34.2% | 61.9% |
| At First Funding | 26% | 37.5% | 50.0% |
| At Series A | 25% | 50.0% | 45.5% |
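Precision and recall at a tier threshold are simple set ratios. The toy counts below are chosen to reproduce the Series A row (20 selected, 10 of them $500M+ winners, 22 winners cohort-wide); the winner counts are back-calculated from the published rates, so treat them as approximate.

```python
def precision_recall(selected: set, winners: set):
    """Precision = winners among selected / selected size;
    recall = winners among selected / all winners."""
    hits = len(selected & winners)
    return hits / len(selected), hits / len(winners)

# Illustrative IDs: 20 selected companies, 10 of which are among the
# 22 cohort-wide $500M+ winners (the other 12 winners were missed).
selected = set(range(20))
winners = set(range(10)) | set(range(100, 112))
prec, rec = precision_recall(selected, winners)
print(f"precision = {prec:.1%}, recall = {rec:.1%}")
```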
Selection Advantage
The ranking model’s cross-sectional test: within each stage cohort, does the score concentrate future winners into a smaller, actionable subset?
Across all three decision points, the scoring system concentrates $500M+ outcomes toward the top of the ranked list.
At Founding
n=151 • Baseline $500M+: 13.9% • Tier 1: 34.2% (2.46×)
| At Founding (n=151) | n | $100M+ | Lift | $500M+ | Lift | $1B+ | Lift |
|---|---|---|---|---|---|---|---|
| Tier 1 (p75+) | 38 | 52.6% | 2.56× | 34.2% | 2.46× | 28.9% | 2.57× |
| Tier 2 (p25–74) | 76 | 14.5% | 0.71× | 10.6% | 0.76× | 7.9% | 0.70× |
| Tier 3 (p0–24) | 37 | 0.0% | 0.00× | 0.0% | 0.00× | 0.0% | 0.00× |
| Baseline | 151 | 20.5% | 1.00× | 13.9% | 1.00× | 11.2% | 1.00× |
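Lift, as used throughout these tables, is just the tier's outcome rate over the full-cohort baseline rate. The hit counts below are back-calculated from the published percentages (20/38 for Tier 1, 31/151 for the baseline), so they are approximate reconstructions rather than source data.

```python
def lift(tier_hits: int, tier_n: int, base_hits: int, base_n: int) -> float:
    """Outcome rate in a tier divided by the baseline rate for the
    full stage cohort."""
    return (tier_hits / tier_n) / (base_hits / base_n)

# At Founding, $100M+: Tier 1 = 20/38 (52.6%) vs baseline 31/151 (20.5%).
print(f"Tier 1 $100M+ lift at founding: {lift(20, 38, 31, 151):.2f}x")
```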
At First Funding
n=125 • Baseline $500M+: 19.2% • Tier 1: 37.5% (1.95×)
| At First Funding (n=125) | n | $100M+ | Lift | $500M+ | Lift | $1B+ | Lift |
|---|---|---|---|---|---|---|---|
| Tier 1 (p75+) | 32 | 62.5% | 2.17× | 37.5% | 1.95× | 34.4% | 2.15× |
| Tier 2 (p25–74) | 62 | 25.9% | 0.90× | 19.4% | 1.01× | 14.5% | 0.91× |
| Tier 3 (p0–24) | 31 | 0.0% | 0.00× | 0.0% | 0.00× | 0.0% | 0.00× |
| Baseline | 125 | 28.8% | 1.00× | 19.2% | 1.00× | 16.0% | 1.00× |
At Series A
n=79 • Baseline $500M+: 27.8% • Tier 1: 50.0% (1.80×)
| At Series A (n=79) | n | $100M+ | Lift | $500M+ | Lift | $1B+ | Lift |
|---|---|---|---|---|---|---|---|
| Tier 1 (p75+) | 20 | 90.0% | 2.15× | 50.0% | 1.80× | 45.0% | 1.98× |
| Tier 2 (p25–74) | 40 | 37.5% | 0.90× | 30.0% | 1.08× | 22.5% | 0.99× |
| Tier 3 (p0–24) | 19 | 0.0% | 0.00× | 0.0% | 0.00× | 0.0% | 0.00× |
| Baseline | 79 | 41.8% | 1.00× | 27.8% | 1.00× | 22.8% | 1.00× |
Score Tier Lift
$500M+ outcome rate by score quartile — higher bars = stronger signal.
Precision / Recall Tradeoff
As you widen the selection pool (x-axis), precision drops but recall rises.
Feature Importance
Which signal categories drive the model at each stage.
Interpretation Notes
- Stage cohort sizes differ because each decision point uses its own eligible universe (151 at founding → 125 funded → 79 Series A+).
- Early signal is observable before institutional capital. The founding score shows Tier 1 $100M+ rates at 2.56× the baseline rate.
- Signal strengthens with information. Tier 1 $500M+ lift ranges from 1.80–2.46× across stages.
- The score is a rank, not P($500M+). The V4 score is a percentile rank. $500M+ probabilities are separately calibrated.
Definitions
| Term | Definition |
|---|---|
| $100M+ | Exit value or last known valuation ≥$100M. |
| $500M+ | Exit value or last known valuation ≥$500M. |
| $1B+ (Unicorn) | Exit value or last known valuation ≥$1B. |
| Funded | Institutional capital raised ≥$500K. |
| Series A+ | Venture stage at Series A or later, funded ≥$500K. |
| Walk-Forward CV | Trains on earlier cohorts, tests on later ones. No future data leaks into training. |
| AUC | Area Under ROC Curve. Probability a positive is ranked above a negative. 0.5 = random; 1.0 = perfect. |
| Precision | Fraction of selected companies that are actual $500M+ outcomes. |
| Recall | Fraction of all $500M+ outcomes captured by the selected set. |
| Lift | Outcome rate in a tier divided by the baseline rate for the full stage cohort. |
Alumni-Founded Companies
Top 50 alumni-founded companies with $100M+ outcomes
Science Fair Fund
Every generation, a few thousand teenagers do something extraordinary — they choose a hard, unsolved problem and spend a year trying to push the boundary of what’s known. Science fairs don’t just test knowledge. They select for the rarest combination in venture: deep technical ability fused with the obsession to ship.
These kids grow up to build the defining companies of their era — and yet no one underwrites that pattern. The entire venture industry watches Stanford GSB and YC Demo Day while the highest-signal cohort in technology goes uninvested at the moment of founding.
We exist to fix that. Science Fair Fund is the first venture fund built entirely around this community — identifying alumni founders before institutional capital arrives, scoring them with proprietary data no one else has, and concentrating capital behind the ones the model says will break out. This isn’t a thesis about sectors or stages. It’s a thesis about people — that the teenagers who chose to do something hard when no one was watching become the founders who do it again when everything is on the line.
The Opportunity
Fund Details
How Scoring Drives Portfolio Construction
The ML model scores every alumni-founded company at each stage of its lifecycle. At founding, competition results, education trajectory, and sector signals identify the highest-potential founders before institutional capital enters.
At first funding, the model augments biographical signals with round size, investor quality, and co-founder composition to size initial checks.
Tier assignment directly determines capital allocation: Tier 1 companies receive 2x the check size and get super pro rata follow-on at Series A. This scoring-driven concentration is the mechanism that converts the alumni network’s structural advantage into portfolio returns.
The Structural Moat
| Edge | What It Means |
|---|---|
| Tribal Access | Science fair is formative identity (age 14–18). Alumni told Society for Science they’d “save space” for affiliated investors. This isn’t networking — it’s kinship. Our relatively small check sizes ($100K–$200K) make it easy for founders to make room on the cap table. |
| Information Asymmetry | Proprietary database of 49,183+ alumni. ML model (55 features) identifies founders before launch. LinkedIn triggers catch “stealth mode” → first-ticket SAFEs before Sequoia/YC sees the deck. |
| Proven Alpha | Among funded alumni, Tier 1 (top 40% by score): 18% unicorn rate, 37% $100M+ outcomes — roughly 1.8× baseline. Cohort baseline: 14.19% unicorn rate, 21% $100M+ rate (vs ~5% / ~15% industry). |
| Insider Credentials | ISEF 1st Place (’04) + multi-year judge. The only comprehensive dataset on this cohort, built since 2006. |
Returns Profile
Monte Carlo validated, 10K simulations. Historical outcome rates applied to scored pipeline with position-level modeling.
Downside protection: Even if ML fails completely, baseline cohort delivers 14.19% unicorn rate, 15× gross MOIC. Loss ratio: 9% (vs. 30–40% typical seed).
The Manager: Anthony Atlas
- ISEF 1st Place Winner (2004) — I’m one of them
- Multi-year Grand Awards Judge — deep community relationships
- Raised >$75M in venture capital as an operator
- Supported 3 deep-tech companies through ~10× valuation step-ups (seed → Series B)
Why This Works Now
Capital is commoditized. Geographic moats have dissolved. The edge is now signal extraction + proprietary access.
Science Fair Fund isn’t competing on capital or brand. We’re competing on identity and information asymmetry — a structural arbitrage generalist funds cannot replicate.
Fund II expansion: The thesis is portable. Fund II expands to Math/Physics Olympiads (the Collison brothers won Irish Young Scientist before building Stripe).
How We Compare
| Metric | SFF Alumni | VC Industry | Multiple |
|---|---|---|---|
| Unicorn rate | 14.19% | 1–2% | 7–14x |
| Series A graduation | 67% | ~50% | 1.3x |
| Loss ratio | ~9% | 30–40% | 0.25x |
| Scoring AUC | 0.84 | N/A | — |
Facts & Methodology
1. The Thesis
Science fair alumni are the most under-recognized founder talent pool in venture capital. Every year, thousands of teenagers compete at ISEF (International Science and Engineering Fair) and STS (Science Talent Search). These competitions select for a rare combination: deep technical ability, independent research capacity, and the obsession to ship results under pressure — exactly the traits that predict startup success.
Despite producing 14.19% unicorn founders (3.1x the venture industry baseline of ~5%), this cohort receives almost no structured institutional capital at the moment of founding. Science Fair Fund exists to close that gap — identifying alumni founders before institutional capital arrives and concentrating capital behind the highest-conviction opportunities.
The fund’s structural edge is threefold: (1) proprietary database of 49,183+ alumni that no other investor maintains, (2) an ML scoring model trained on 20 years of alumni outcomes, and (3) tribal access — the GP is an ISEF 1st Place Winner (’04) with deep community relationships.
2. Database Methodology
The SFF Alumni Intelligence Engine contains 49,183+ verified alumni profiles spanning ISEF and STS competitions from 1950 to present. Data sources include official competition records, Society for Science archives, LinkedIn enrichment, and public company databases.
Each alumni record includes: competition history (year, placement, project), educational trajectory (undergraduate and graduate institutions), career path (current role, company, location), and founder status (linked to specific companies with founding dates and roles).
LinkedIn enrichment covers approximately 17% of the database. For funded founders, coverage is substantially higher. The database identifies 178+ companies with $500K+ in funding and tracks $248B+ in aggregate enterprise value across exits and current valuations.
Data quality is maintained through automated health checks, deduplication pipelines, and materialized views that are refreshed after every import. All LP-facing metrics are derived from a standardized definitions file to ensure consistency across reports.
3. Scoring Model
The SFF scoring model is a gradient-boosted classifier trained on the full alumni-founded company universe. The model predicts $100M+ outcomes using 55 features across five categories: competition results, education, company fundamentals, founder biography, and sector signals.
Best AUC: 0.84 at Series A stage. At founding (before any funding data is available), the model still achieves meaningful separation between tiers. Walk-forward cross-validation is used to prevent data leakage — the model is always trained on companies founded before the test period.
Tier 1 alumni (top quartile by score) show 18% unicorn rate and 37% $100M+ outcome rate — approximately 2x the baseline cohort rates. This lift is consistent across all three scoring stages (at founding, at first funding, at Series A).
The model is retrained quarterly as new outcome data becomes available. Feature importance shifts predictably across stages: competition results dominate at founding, while funding round characteristics become more important at later stages.
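The walk-forward split logic can be sketched as follows. The three-year minimum training window is an assumption for illustration; the fund's actual fold schedule is not published here.

```python
def walk_forward_splits(years, min_train_years=3):
    """Yield (train_years, test_year) pairs: train only on cohorts
    founded strictly before the test year, so no future outcome data
    leaks into training."""
    for i in range(min_train_years, len(years)):
        yield years[:i], years[i]

cohort_years = list(range(2006, 2021))   # founding years 2006-2020
for train, test in walk_forward_splits(cohort_years):
    pass  # fit on `train` cohorts, score companies founded in `test`
print(f"{len(cohort_years) - 3} walk-forward folds over 2006-2020")
```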
4. Backtest Methodology
The Monte Carlo simulation uses a 127-company real outcome pool: 47 companies with known positive outcomes (ranging from $5M to $500B) and 80 companies with zero exit value. This is not synthetic data — every company in the pool is a real alumni-founded venture with verified outcomes.
Simulation methodology: bootstrap resampling from the outcome pool with replacement. Each simulation run constructs a full portfolio of 30 companies (10 Tier 1 at $200K, 20 Tier 2 at $100K), applies position-level dilution modeling, and calculates net fund returns including management fees and carry.
Conservative assumptions: (1) 80% fidelity — the model captures only 80% of the historical signal, (2) 40% liquidity discount on unrealized positions, (3) exit caps at $10B to limit tail-driven results, (4) no credit for follow-on allocation alpha.
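A stripped-down version of one bootstrap run, with toy outcome pools standing in for the real 127-company pool. Entry stakes use the default $13M post-money from the assumptions table; dilution, fidelity, fees, and the liquidity discount are omitted for brevity, so this is a structural sketch rather than the actual simulator.

```python
import random

def simulate_portfolio(pool_t1, pool_t2, rng, exit_cap=10_000_000_000):
    """One bootstrap run: resample 30 outcomes with replacement
    (10 Tier 1 @ $200K, 20 Tier 2 @ $100K), cap exits at $10B, and
    return gross proceeds at the entry stake."""
    proceeds = 0.0
    for _ in range(10):   # Tier 1 positions
        exit_val = min(rng.choice(pool_t1), exit_cap)
        proceeds += 200_000 / 13_000_000 * exit_val   # ~1.54% stake
    for _ in range(20):   # Tier 2 positions
        exit_val = min(rng.choice(pool_t2), exit_cap)
        proceeds += 100_000 / 13_000_000 * exit_val   # ~0.77% stake
    return proceeds

# Toy outcome pools (illustrative; real pools hold verified outcomes).
t1 = [0, 0, 0, 150e6, 1.2e9]
t2 = [0, 0, 0, 0, 0, 0, 50e6, 400e6]
rng = random.Random(7)
runs = sorted(simulate_portfolio(t1, t2, rng) for _ in range(10_000))
print(f"median gross proceeds: ${runs[len(runs)//2]/1e6:.1f}M")
```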
The interactive simulator on the Fund Returns tab allows investors to adjust all assumptions and see the impact on return distributions in real time.
5. Fund Structure
Fund I: $15M target. Stage: pre-seed and seed. Portfolio: 30 companies deployed over 2 years. Check sizes are tier-based: $200K for Tier 1 (top quartile by ML score, 10 companies) and $100K for Tier 2 (second quartile, 20 companies). Total first-check capital: $4M.
Follow-on reserve: $8M allocated for Series A pro rata rights. Tier 1 companies receive super pro rata follow-on; Tier 2 companies receive standard pro rata. This reserve strategy concentrates capital in winners identified by both the scoring model and market validation.
Management fee: 2% annually on committed capital during the investment period. Carried interest: 20% of profits above a 1× return of capital. Fund term: 10 years with two 1-year extensions.
6. Comparable Benchmarks
| Metric | SFF Alumni Cohort | VC Industry Baseline |
|---|---|---|
| Unicorn rate (funded companies) | 14.19% | 1–2% |
| Series A graduation rate | 67% | ~50% |
| Loss ratio (total write-off) | ~9% | 30–40% |
| $100M+ outcome rate | 21% | ~15% |
| ML scoring AUC (Series A) | 0.84 | N/A |
Sources: SFF Alumni Intelligence Engine (alumni cohort); PitchBook, Cambridge Associates (industry baselines). Cohort data covers alumni-founded companies with $500K+ funding, founded 2006–2020.
7. Important Disclosures
For Accredited Investors Only. This material is provided for informational purposes only and does not constitute an offer to sell or a solicitation of an offer to buy any securities. Any such offer would be made only pursuant to a definitive offering memorandum and subscription agreement.
Past performance is not indicative of future results. The backtest results presented are based on historical data and Monte Carlo simulation. Actual fund returns may differ materially from simulated results. All investments involve risk, including the possible loss of principal.
The scoring model’s predictive accuracy (AUC 0.84) is measured on historical data and may not persist in future periods. Model performance may degrade as market conditions change.
Science Fair Fund is not registered as an investment adviser under the Investment Advisers Act of 1940. This communication is not investment advice. Prospective investors should consult their own legal, tax, and financial advisors before making any investment decision.