Revenue Intelligence · February 4, 2026 · 7 min read

Lead Scoring Is Broken. Here’s What Actually Works.

Traditional lead scoring conflates activity with intent and calls it intelligence. There’s a better approach, and it starts with throwing out the MQL.

Lead scoring was invented in an era when the alternative was no scoring at all. A spreadsheet, a gut feeling, and a handoff meeting on Friday afternoons. Against that baseline, assigning points for job title, company size, and whitepaper downloads felt like a meaningful step forward.

It wasn’t. We just didn’t have anything to compare it to.

Thirty years later, we’re still building AI on top of the same conceptual framework: demographic fit plus behavioral engagement equals a number that tells you who to call. We’ve added machine learning to it, which makes it feel more sophisticated. It isn’t. We’ve just built a more complicated version of a broken system.

What Traditional Scoring Actually Measures

Take a typical lead scoring model. It assigns points for:

  - Demographic fit: job title, company size, industry
  - Engagement: email opens, webinar attendance, whitepaper downloads
  - High-intent behavior: pricing page visits

A lead who visits the pricing page twice, opens three emails, and has the right job title becomes an MQL. They get handed to sales with a score of 80.

Here’s the problem: this measures our behavior toward the lead as much as it measures the lead’s intent. We emailed them, they opened it. We hosted a webinar, they showed up. We built a pricing page, they clicked it. We have no idea if they’re actually in the market to buy or if they were doing competitive research, killing time, or confused about what we sell.

The score tells you what happened in your systems. It doesn’t tell you what’s happening in their business.

The Intent Gap

What actually predicts whether a lead will buy is something the traditional model completely ignores: what’s happening on their side.

A company that just raised a Series B is in buying mode. A company that just lost their VP of Sales is making urgent decisions about their sales stack. A company where three people from the same org all visited your site in the same week without being emailed has a buying committee forming. A contact who went from never engaging to scheduling a call is a different signal than a contact who’s been passively opening your nurture emails for eight months.

None of this shows up in a traditional lead score because it requires reading signals that aren’t generated by your own marketing activity.
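One of these signals, several people from the same organization visiting in the same week, is straightforward to detect once you group visits by company. A minimal sketch (the visit records and field layout are hypothetical):

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical visit records: (visitor_email, company_domain, visit_date)
visits = [
    ("ana@acme.com", "acme.com", date(2026, 1, 5)),
    ("bo@acme.com",  "acme.com", date(2026, 1, 7)),
    ("cy@acme.com",  "acme.com", date(2026, 1, 9)),
    ("dee@other.io", "other.io", date(2026, 1, 6)),
]

def committee_signals(visits, window_days=7, min_people=3):
    """Flag companies where >= min_people distinct visitors appeared
    within any rolling window of window_days."""
    by_company = defaultdict(list)
    for email, domain, day in visits:
        by_company[domain].append((day, email))
    flagged = set()
    for domain, events in by_company.items():
        events.sort()
        for start, _ in events:
            window_end = start + timedelta(days=window_days)
            people = {e for d, e in events if start <= d <= window_end}
            if len(people) >= min_people:
                flagged.add(domain)
                break
    return flagged

print(committee_signals(visits))  # -> {'acme.com'}
```

The key point: nothing in this signal depends on your outbound activity. It fires because of what they did, unprompted.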

A Better Framework

What works, based on what we’ve seen across GTM implementations, has three components.

Fit scoring (static, not time-decayed). This is the traditional firmographic stuff: segment, size, industry, tech stack compatibility. It should be binary: either a lead is in your addressable market or they’re not. Don’t give it too much weight; fit is a prerequisite, not a signal.
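Treating fit as a binary gate rather than a weighted score is simple to express in code. A sketch, with hypothetical criteria:

```python
# Hypothetical fit criteria; a lead either passes every gate or is out.
FIT_CRITERIA = {
    "segments": {"mid-market", "enterprise"},
    "min_employees": 50,
    "industries": {"saas", "fintech", "ecommerce"},
}

def fits(lead: dict) -> bool:
    """Binary fit gate: a prerequisite, not a weighted signal."""
    return (
        lead.get("segment") in FIT_CRITERIA["segments"]
        and lead.get("employees", 0) >= FIT_CRITERIA["min_employees"]
        and lead.get("industry") in FIT_CRITERIA["industries"]
    )

print(fits({"segment": "mid-market", "employees": 120, "industry": "saas"}))  # True
print(fits({"segment": "smb", "employees": 120, "industry": "saas"}))         # False
```

Because the output is True or False, there is no way for strong fit to compensate for weak intent, which is exactly the failure mode a weighted fit score invites.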

Intent signals (behavioral, but theirs, not yours). This means third-party intent data (signals about which companies are actively researching your category), combined with high-value behavioral signals that indicate genuine consideration. A pricing page visit counts for something. An email open counts for almost nothing. Reading a comparison article between you and a competitor means they’re in an active evaluation. Weight these signals accordingly and apply time decay aggressively: a pricing page visit from six months ago is meaningless.
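Aggressive time decay is easy to implement with an exponential half-life. A sketch, where the base weights and the 14-day half-life are illustrative assumptions, not recommendations:

```python
from datetime import date

# Hypothetical base weights: high-value signals dominate, email opens ~0.
SIGNAL_WEIGHTS = {"pricing_page": 10.0, "comparison_article": 15.0, "email_open": 0.2}
HALF_LIFE_DAYS = 14  # aggressive decay: a signal loses half its value every two weeks

def intent_score(events, today):
    """Sum of signal weights, each decayed exponentially by age in days."""
    score = 0.0
    for signal, when in events:
        age = (today - when).days
        score += SIGNAL_WEIGHTS.get(signal, 0.0) * 0.5 ** (age / HALF_LIFE_DAYS)
    return score

today = date(2026, 2, 4)
recent = [("pricing_page", date(2026, 2, 2))]
stale  = [("pricing_page", date(2025, 8, 2))]
print(round(intent_score(recent, today), 2))  # close to the full 10.0
print(round(intent_score(stale, today), 2))   # effectively zero after six months
```

The same pricing page visit is worth nearly full weight this week and nothing six months out, which matches how buying windows actually behave.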

Pipeline pattern matching (proprietary, trained on your data). This is the part nobody talks about and it’s where the real lift comes from. Every company’s closed-won deals have a fingerprint: a characteristic sequence of touches, stakeholder patterns, and timeline behaviors that precede a close. If you have enough historical data (typically two to three years of pipeline), you can train a model that recognizes that fingerprint in early-stage leads.

The output of this model isn’t a score. It’s a probability: specifically, the probability that a lead will reach a certain stage within a given timeframe, based on how similar leads have behaved historically. It’s calibrated to your business, not to aggregate industry behavior.
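To make the idea concrete, here is a deliberately simplified stand-in for pattern matching: estimate P(close) as the closed-won rate among the most similar historical leads. A real implementation would use a trained model over sequences of touches; the feature names and historical records below are hypothetical:

```python
import math

# Hypothetical historical leads: feature vector -> did the deal close?
# Features: (touches_first_30d, stakeholders_engaged, days_to_first_call)
history = [
    ((8, 3, 5),  1), ((9, 4, 3),  1), ((7, 3, 7),  1),
    ((2, 1, 30), 0), ((1, 1, 45), 0), ((3, 1, 25), 0),
]

def close_probability(lead, history, k=3):
    """Estimate P(close) as the closed-won rate among the k most
    similar historical leads (Euclidean distance in feature space)."""
    nearest = sorted(history, key=lambda h: math.dist(h[0], lead))[:k]
    return sum(won for _, won in nearest) / k

print(close_probability((8, 3, 6), history))   # -> 1.0: matches the closed-won fingerprint
print(close_probability((2, 1, 40), history))  # -> 0.0: matches the closed-lost pattern
```

The output is a probability calibrated against your own historical deals, not a point total, which is the property that matters regardless of which model family you use.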

Throwing Out the MQL

The MQL threshold is an arbitrary line drawn on a conceptually flawed score. It gets renegotiated at every QBR, used as a weapon in sales-marketing alignment debates, and produces different results every time someone adjusts the point values.

A better handoff mechanism: pass leads to sales when a combination of fit, intent, and pattern match exceeds a threshold you’ve calibrated against actual conversion rates. Not when they’ve accumulated enough points to cross a line you drew in 2019.
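As a sketch, the handoff rule combines the three components from the framework above: a binary fit gate, then a blended intent/pattern estimate against a calibrated threshold. The 0.4/0.6 weights and the 0.35 threshold are placeholders you would calibrate against your own conversion rates:

```python
def should_hand_off(lead, threshold=0.35):
    """Hand off when fit is satisfied AND the combined intent/pattern
    estimate clears a threshold calibrated against conversion rates."""
    if not lead["fit"]:  # binary gate: a prerequisite, not a signal
        return False
    combined = 0.4 * lead["intent"] + 0.6 * lead["pattern_probability"]
    return combined >= threshold

hot  = {"fit": True,  "intent": 0.7, "pattern_probability": 0.5}
cold = {"fit": True,  "intent": 0.5, "pattern_probability": 0.02}
bad  = {"fit": False, "intent": 0.9, "pattern_probability": 0.9}
print(should_hand_off(hot), should_hand_off(cold), should_hand_off(bad))  # True False False
```

Note that `bad` fails despite sky-high engagement: fit gates the decision, so no amount of activity rescues an out-of-market lead.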

The operational difference matters. Sales reps should be able to look at a lead and understand why it was flagged: what signals triggered it, what similar leads have done, what the current probability is. Not just a number. Explainability drives adoption, and without adoption none of this matters.

The Data You Need

If you want to build this properly, you need three things:

  1. A clean historical pipeline dataset. All your closed-won and closed-lost deals for the past two to three years, with accurate timestamps on stage progression. Most CRMs have this data; it’s usually messy. Clean it.

  2. Intent data integration. Bombora, G2, or category-equivalent intent providers. Not mandatory, but high-value. Even basic website behavior analytics, properly instrumented, give you something to work with.

  3. Commitment to calibration. A model that isn’t periodically recalibrated against actual outcomes will drift. You need a process for comparing predicted probabilities to actual conversion rates and adjusting. Quarterly is fine. Never is not.
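The quarterly calibration check can be as simple as bucketing leads by predicted probability and comparing each bucket's mean prediction to its actual conversion rate. A sketch with fabricated illustrative numbers:

```python
def calibration_report(predictions, outcomes, bins=4):
    """Bucket leads by predicted probability and compare each bucket's
    mean prediction to its actual conversion rate."""
    buckets = [[] for _ in range(bins)]
    for p, won in zip(predictions, outcomes):
        idx = min(int(p * bins), bins - 1)
        buckets[idx].append((p, won))
    report = []
    for b in buckets:
        if not b:
            continue
        mean_pred = sum(p for p, _ in b) / len(b)
        actual = sum(w for _, w in b) / len(b)
        report.append((round(mean_pred, 2), round(actual, 2), len(b)))
    return report

# Hypothetical quarter of handoffs: predicted P(close) vs. actual outcome
preds = [0.1, 0.15, 0.4, 0.45, 0.6, 0.65, 0.9, 0.85]
wins  = [0,   0,    0,   1,    1,   1,    1,   1]
for mean_pred, actual, n in calibration_report(preds, wins):
    print(f"predicted {mean_pred:.2f} vs actual {actual:.2f} (n={n})")
```

When a bucket's predicted and actual rates drift apart, that is your signal to recalibrate; a model whose 0.6 bucket converts at 0.2 is lying to your reps.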

The companies that get this right stop talking about MQLs within six months. They start talking about pipeline quality, conversion probability, and velocity. That’s when you know the model is working, not because the score got higher, but because the conversation changed.

Want to talk through what this means for your pipeline?
We do this for a living. No pitch, just a conversation.
Get in touch