





Pick ten past opportunities across outcomes: wins, losses, and unknowns. Blind the brand names and re-score using your draft rubric. Where do raters diverge, and why? Which anchors confuse? Did weights overvalue charisma or undervalue distribution? Capture disagreements, rewrite definitions, and repeat until differences shrink meaningfully. This practice reveals whether your rubric explains success retroactively without overfitting, and whether it offers predictive signal before outcomes are obvious to everyone else.
Quantify agreement using simple statistics like Cohen’s kappa or intraclass correlation, even informally. High dispersion means your anchors or training need work. Encourage independent scoring before discussion to prevent anchoring on senior voices. After debate, record final scores and rationales, keeping both versions to study convergence. Reliability is not about forced consensus; it is about consistent meanings that survive pressure, excitement, and the occasional charismatic founder who dazzles unprepared rooms.
Design guardrails that counteract known biases. Use structured questions to equalize interviews, blind non-essential pedigree signals during early screens, and timebox partner commentary to ensure quieter voices surface. Rotate devil’s advocate roles and capture pre-mortems for high-scoring but risky bets. When possible, seek direct customer evidence rather than introductions from status-heavy networks. Small process tweaks compound into fairer, sharper judgments, protecting both your portfolio and the trust of founders you hope to back.






Schedule honest post-mortems on missed winners and painful write-offs. Was the rubric missing a criterion, or were anchors misleading? Did weights mirror outdated beliefs? Capture what would have flipped the decision and test that adjustment against other deals. Publish a short internal note so learning spreads. Invite readers and peers to challenge your conclusions, turning embarrassment into institutional wisdom rather than folklore that vanishes when memory or personnel change.
Pre-seed hardware needs different anchors than growth-stage fintech. Create modular variants that inherit shared principles but adapt evidence expectations, timelines, and risk thresholds. Keep the language consistent across variants so partners can compare apples to informed apples. Track performance by variant to avoid cargo-culting one-size-fits-all scoring. As sectors evolve, sunset criteria that no longer predict outcomes, and introduce new ones carefully, backed by data, not fashion or FOMO-driven enthusiasm.
Treat founders as partners, not puzzles. Share what you evaluate, why it matters, and how to provide the most helpful evidence. Protect sensitive data, avoid performative meetings, and close loops quickly. Ethical clarity earns trust, surfaces better information, and attracts referrals from people you pass on today. Invite subscribers to hold you accountable: publish anonymized aggregate stats on pass reasons and calibration outcomes. Reputation is your compound interest; invest deliberately each interaction.