Your ICP document isn’t wrong. It’s static. And markets aren’t.
Most revenue teams have something that looks like an ICP. A slide deck or Notion page listing the characteristics of their ideal buyer: industry, company size, geography, maybe a tech stack or two. It was probably assembled from a combination of gut feel, the last few closed-won deals, and a workshop where someone drew a circle on a whiteboard and called it “our sweet spot.”
That document is not useless. But it’s not a scoring system, and scoring is what you actually need.
The Static Block Problem
The typical ICP description fits somewhere between 20,000 and 80,000 companies in most TAMs. “B2B SaaS, 100–2000 employees, North America, uses Salesforce” is not a selection criterion — it’s a market segment. It tells you where the opportunity exists, not which specific companies are in a buying moment right now.
The companies that match your ICP description on Monday aren’t uniformly buyable. Some just raised a Series B and are rebuilding their stack. Some just had their VP of Revenue leave and are frozen. Some have been evaluating your category for eight months and are one reference call away from signing. The static ICP document treats all of them identically.
The math doesn’t.
Three Numbers That Actually Matter
Traditional ICP work focuses on who your buyers are. The math focuses on how they buy. Three metrics, properly measured, tell you more about your real ICP than any firmographic list.
Conversion Velocity. Not average velocity across your whole pipeline — that number is a blended average of your best deals and your worst. Cluster velocity: the time from first meaningful signal to closed-won, segmented by firmographic attribute combinations. When you slice it this way, you often find that a specific cluster — say, fintech companies with 200–500 employees who use Stripe — closes in half the time of your overall average. That cluster is your actual Tier 1 ICP, whether you’ve named it that or not.
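If your CRM export has a first-signal date and a close date, this slicing is a few lines of analysis. A rough sketch in pandas (the column names are assumptions; swap in whatever your export actually calls them):

```python
import pandas as pd

# Hypothetical export of closed-won deals; column names are illustrative.
deals = pd.read_csv("closed_won.csv", parse_dates=["first_signal_at", "closed_at"])
deals["days_to_close"] = (deals["closed_at"] - deals["first_signal_at"]).dt.days

# Velocity per firmographic cluster, compared against the blended average.
cluster_cols = ["industry", "employee_band", "primary_tech"]
velocity = (
    deals.groupby(cluster_cols)["days_to_close"]
    .agg(median_days="median", deal_count="count")
    .query("deal_count >= 5")      # ignore clusters too small to trust
    .sort_values("median_days")
)
print("Blended median days to close:", deals["days_to_close"].median())
print(velocity.head(10))           # the fastest-closing clusters are your candidate Tier 1
```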
ACV Efficiency. The ratio of annual contract value to total sales effort invested (touches, hours, stakeholders managed). This is the metric that separates a genuinely good ICP segment from a vanity one. A segment might have high ACVs but terrible efficiency — they require six-month enterprise cycles, legal review, and five stakeholders to get a signature. Compare that to a segment with 60% of the ACV but a fraction of the effort. The second segment may have better unit economics. Tier 1 ICP companies should close faster and with less friction, not just at higher prices.
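Continuing the sketch above, and assuming your CRM tracks some proxy for effort per deal (the effort columns and weights here are illustrative, not a prescribed formula):

```python
# Hypothetical effort fields; use whatever proxy your CRM actually captures.
deals["effort_hours"] = (
    deals["touch_count"] * 0.5          # assume roughly 30 minutes per logged touch
    + deals["meeting_hours"]
    + deals["stakeholder_count"] * 2.0  # coordination overhead per stakeholder
)
deals["acv_efficiency"] = deals["acv"] / deals["effort_hours"]

efficiency = (
    deals.groupby(cluster_cols)[["acv", "acv_efficiency"]]
    .median()
    .sort_values("acv_efficiency", ascending=False)
)
print(efficiency.head(10))  # high-ACV but low-efficiency segments fall down this list
```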
Retention Fractals. The repeating pattern in your best long-term accounts. Churned customers often look identical to retained customers at the point of sale — same size, same industry, same job titles in the room. What differs is subtle: a fractally recurring signal in your pipeline data that, when present at deal stage, predicts three-year retention with disproportionate accuracy. This might be the presence of a RevOps leader in the initial call. It might be whether the champion had budget authority versus committee approval. Find it in your historical data. It’s usually there, and it’s almost never in your ICP document.
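One way to hunt for it, continuing the same sketch: flag candidate signals at deal stage and compare three-year retention with and without them. The flag names below are hypothetical examples, not a checklist:

```python
# Hypothetical flags captured at deal stage, joined with a three-year retention outcome.
candidate_signals = ["revops_leader_on_first_call", "champion_had_budget_authority"]

for signal in candidate_signals:
    by_signal = deals.groupby(signal)["retained_3yr"].agg(["mean", "count"])
    lift = by_signal.loc[True, "mean"] / by_signal.loc[False, "mean"]
    print(f"{signal}: retention when present {by_signal.loc[True, 'mean']:.0%}, "
          f"when absent {by_signal.loc[False, 'mean']:.0%}, lift {lift:.1f}x")
```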
Why Scores Without Attribution Are Not Decisions
Once you’ve built a scoring model — whether rule-based, ML-derived, or something in between — you face a problem that most teams don’t see coming: a rep who sees a lead score of 84 and doesn’t know what drove it cannot act on it.
Does the 84 mean the company is a perfect industry fit with weak intent signals? Or a moderate industry fit with very strong behavioral signals? The same number can describe fundamentally different situations requiring completely different outreach strategies. Without attribution, the score is decoration.
This is where Shapley values, borrowed from cooperative game theory and now standard in explainable AI, become useful. The core idea is elegant: every feature’s contribution to a prediction should equal its average marginal contribution across all possible orderings of features. For a linear model with independent features, this has a closed-form solution:
SHAP_i = w_i × (x_i − E[x_i])
Where w_i is the feature’s weight, x_i is the encoded feature value for this specific company, and E[x_i] is the mean feature value across your entire dataset. The sum of all SHAP values equals the company’s deviation from the baseline score — exact decomposition, no residual.
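A minimal sketch of that decomposition, with illustrative weights, dataset means, and one company’s encoded values:

```python
# Linear SHAP attribution for one company; weights, means, and values are illustrative.
weights       = {"industry": 35, "size": 15, "revenue": 15, "target_market": 15, "tech_stack": 20}
dataset_means = {"industry": 0.40, "size": 0.60, "revenue": 0.55, "target_market": 0.50, "tech_stack": 0.45}
company       = {"industry": 1.0, "size": 1.0, "revenue": 1.0, "target_market": 1.0, "tech_stack": 0.0}

baseline = sum(w * dataset_means[f] for f, w in weights.items())            # E[score]
shap = {f: w * (company[f] - dataset_means[f]) for f, w in weights.items()}
score = baseline + sum(shap.values())                                       # exact, no residual

print(f"Baseline {baseline:.0f} -> Score {score:.0f}")
for feature, contribution in sorted(shap.items(), key=lambda kv: -kv[1]):
    print(f"  {feature:<14} {contribution:+.1f}")
```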
In practice, this means a rep sees something like:
Meridian Analytics — ICP Score: 89
Industry match ✓ +9 · Size in range ✓ +5 · Revenue in range ✓ +6 · Target market ✓ +4 · Tech stack match ✓ +3 · Baseline: 62
That’s not a score. That’s a briefing. The rep knows why this account is high-priority, which objections to anticipate, and what questions to lead with. Explainability isn’t a nice-to-have. It’s the difference between a scoring system that changes behavior and one that gets ignored after the first QBR.
The Living ICP
A model that isn’t recalibrated on outcomes will drift. This is guaranteed, not a risk — market conditions shift, your product evolves, your go-to-market motion changes. A static scoring model trained on your 2023 closed-won data will become increasingly miscalibrated against your 2026 pipeline.
The Living ICP is a closed loop: every quarter, your most recent closed-won and closed-lost deals are fed back into the model as ground truth. Feature weights are updated. Tier thresholds are adjusted. The companies your reps thought were Tier 1 but didn’t close become as informative as the ones that did.
This isn’t a continuous retraining system that requires a data science team. A quarterly calibration review — compare your model’s predicted scores against actual outcomes for the quarter’s closed deals, identify where it was systematically wrong, adjust weights accordingly — takes a few hours and produces outsized improvement.
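A minimal version of that review, assuming you can export the quarter’s closed deals with the score the model assigned them at the time (column names are illustrative):

```python
import pandas as pd

# The quarter's closed deals plus the score the model gave them; illustrative columns.
closed = pd.read_csv("q_closed_deals.csv")   # columns: predicted_score, won

closed["score_band"] = pd.cut(closed["predicted_score"], bins=[0, 40, 60, 80, 100])
calibration = (
    closed.groupby("score_band", observed=True)["won"]
    .agg(win_rate="mean", deals="count")
)
print(calibration)
# A band whose observed win rate is far out of line with its neighbors is where
# the weights need adjusting before next quarter.
```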
The signal that’s most often miscalibrated in models we’ve seen: tech stack. Companies list what they use on their website three years after they’ve migrated away from it. A rep who calls assuming a Salesforce shop and finds a HubSpot-only environment loses trust immediately. Weight tech stack data relative to how fresh your enrichment is.
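One simple way to do that, as a sketch: decay the tech-stack weight by enrichment age. The half-life below is an assumption to tune, not a benchmark:

```python
# Decay the tech-stack weight by enrichment age; the half-life is an assumption to tune.
TECH_STACK_WEIGHT = 20
HALF_LIFE_DAYS = 180

def tech_stack_weight(days_since_enriched: int) -> float:
    """Halve the tech-stack weight every HALF_LIFE_DAYS since the last enrichment."""
    return TECH_STACK_WEIGHT * 0.5 ** (days_since_enriched / HALF_LIFE_DAYS)

print(tech_stack_weight(30))    # ~17.8: fresh data keeps most of its weight
print(tech_stack_weight(540))   # ~2.5: eighteen-month-old tech data barely counts
```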
GTM Architecture: Strategy, Execution, Evolution
There’s a pattern that separates companies that build effective ICP systems from those that build elaborate ones that nobody uses.
Strategy layer. Define ICP criteria as weights, not lists. The difference is that weighted criteria force you to express trade-offs explicitly: “industry match matters three times as much as geography for us” is a statement of commercial strategy, not a filter. It also makes your scoring model auditable — you can point to each weight and explain why.
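To make the contrast concrete, here is a sketch of the same ICP expressed both ways; the numbers are illustrative, not a recommended weighting:

```python
# The same ICP as a filter list versus as explicit weights (numbers are illustrative).
icp_filter = {
    "industry": ["fintech", "b2b_saas"],
    "geo": ["north_america"],
    "employees": (100, 2000),
}

icp_weights = {
    "industry_match": 30,    # matters roughly 3x as much as geography for us
    "geo_match": 10,
    "size_in_range": 20,
    "tech_stack_match": 20,
    "intent_signal": 20,
}
assert sum(icp_weights.values()) == 100  # forces the trade-offs to be explicit and auditable
```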
Execution layer. Apply the model at the point of prospecting and at pipeline review. A score assigned at import is stale by the time it reaches a rep. Scoring should update when new data arrives — a company that just hired a RevOps VP should see its score recalculated immediately, not at the next overnight batch.
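As a sketch of what that looks like in code (the function and the model interface here are hypothetical):

```python
# Sketch of event-driven rescoring; the function and model interface are hypothetical.
def on_enrichment_update(company: dict, new_fields: dict, model) -> dict:
    """Recompute the score the moment new data lands, not at the next batch run."""
    updated = {**company, **new_fields}
    updated["icp_score"], updated["attribution"] = model.score(updated)
    return updated

# e.g. the enrichment provider detects a new RevOps VP:
# on_enrichment_update(company, {"has_revops_leader": True}, model)
```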
Evolution layer. Close the loop. The closed-won data is the most valuable data you have, and most companies use it only for celebration. Every deal that closed contains information about which attributes actually predicted success versus which ones you thought predicted success. Every deal that didn’t close despite a high ICP score is even more informative — it tells you something is broken in your model, your product, or your go-to-market.
The companies that get this right stop talking about ICP as a concept. They start talking about pipeline quality scores, feature attribution, and calibration accuracy. That’s when you know the model is working — not because the scores got higher, but because the conversation changed.
We built a free tool that runs this scoring model against your company list in your browser — no data leaves your machine. Upload a CSV, map your columns, define your ICP criteria, and get a SHAP-scored, fully attributed report in under 60 seconds.