---
title: "The ICP Scoring Prompt: A Drop-In Template for AI Sales Agents"
description: "Score prospects against your ICP automatically. This is the prompt our team uses with Claude and GPT, plus the rubric structure that makes the scores reliable."
url: https://timkilroy.com/blog/icp-scoring-prompt-ai-sales-agents
date: 2026-05-12
updated: 2026-05-11T17:41:47Z
category: "Sales"
author: Tim Kilroy
---

# The ICP Scoring Prompt: A Drop-In Template for AI Sales Agents

_Score prospects against your ICP automatically. This is the prompt our team uses with Claude and GPT, plus the rubric structure that makes the scores reliable._


Most agencies are running ICP scoring through an LLM right now, and here is what’s happening:

- The prompts look fine. 
- The output looks confident. 
- The potential pipeline looks healthy. 
- **CLOSE RATE SUCKS**

If you are asking your LLM of choice to create an ICP for you and it has no rubric, no tier thresholds, & no calibration, you aren't really scoring prospects or creating a target list. What you are doing is creating a model where everybody in a vertical sort of fits. This means your team is going to fill up your pipeline with prospects that don't actually fit who you can serve best.

## Why a Generic AI-Created ICP Scoring Prompt Fails

Ask any LLM to "score this prospect against our ICP" without giving it a rubric and you get something like this every time: 

1. A BS paragraph telling you surface-level information about the target and **how smart you are & how lucky they are that you are going to reach out**.
2. A score in the **good to great range** (without any distinction between good and great).
3. Three reasons why they fit, written with words like “drift”, “clarity”, “compound”, and other telltale signs that there's no true thinking behind the score.


The output reads with authority. The reasoning - uh - well, maybe less authoritative.

There are 3 big reasons that AI-powered ICP scoring prompts produce scores that are somewhere between meaningless and useless without some serious hand-holding:

- **No rubric means no consistency:** Two prospects that look similar get different scores. The same prospect scored on Tuesday and Friday gets different scores. The model is averaging vibes, not measuring against a standard.
- **No tier thresholds means no decision rule:** If a "7" means pursue, an "8" also means pursue, and a "6" sometimes means pursue, you don't have a scoring system. You've got somebody giving you vague guidance, not helping you focus your team's attention on the best possible fits.
- **No calibration means no accuracy:** A basic LLM prompt with information about the prospect and one line about you has no way to receive feedback. No matter how smart you are, the first version of your rubric will be wrong: *it'll be overweighted in some areas and underweighted in others.* You have to create a feedback mechanism that calibrates your scores against your actual sales results. When you build that, you get a model that you can trust.

> **Tim's Take:** *An LLM with a generic prompt will give you good scores for average to bad prospects and also give you good scores for the prospects who should be demanding most of your attention. You need to build a system that gets better over time, one that you can refine, constrain, and expand as your business evolves.*

## The Structure of a Reliable Scoring Prompt

Every ICP scoring prompt worth its salt has five parts - if you skip any, the scores aren't worth paying attention to:

1. **Role:** The model needs to know what kind of analyst it's playing. *"You are a revenue operations leader at a B2B agency"* is more useful than the default "You are a helpful assistant." The role anchors tone, gives some directional perspective, and helps the model have a little bit of skepticism.
2. **Rubric:** A list of weighted criteria with explicit scoring guidance for each. Industry fit, size, growth trajectory, buying readiness, and delivery fit are the five that matter for most agencies. Each criterion needs a 0 to 20 (or 0 to 10 or whatever) scale and a description of what each level looks like.
3. **Tier thresholds:** What does a 90-100 mean? What does a 60-74 mean? Tier thresholds turn a number into a decision. Without them, every prospect ends up in the gray zone and the team keeps coming back to ask "should we pursue this one?"
4. **Output format:** Structured output - always. Have it spit out JSON if the scores are going to be read by another AI or tool; otherwise, markdown for people. You have to give your LLM a really great example of what you want. Otherwise, it's going to freestyle and ramble.
5. **Know about you:** In order to generate quality output, your LLM has to have information about you: the size of your team, great clients you've had in the past, things that you do well, things that you do poorly, & what kind of impact you’ve had on clients.
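To make those five parts concrete, here is a minimal Python sketch of how they could be assembled into one prompt string. It is not the drop-in prompt itself. The `Criterion` class, the `build_prompt` function, the JSON field names, and the bracketed guidance strings are all assumptions made up for illustration, and the tier bands are borrowed from the A-F ranges used for persona scoring later in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    max_points: int
    guidance: str  # what a low, middle, and high score looks like


# The five criteria named above, 0-20 each, so the total is out of 100.
# The guidance strings are placeholders, not a recommended rubric.
RUBRIC = [
    Criterion("Industry fit", 20, "[what 0, 10, and 20 look like for you]"),
    Criterion("Size", 20, "[what 0, 10, and 20 look like for you]"),
    Criterion("Growth trajectory", 20, "[what 0, 10, and 20 look like for you]"),
    Criterion("Buying readiness", 20, "[what 0, 10, and 20 look like for you]"),
    Criterion("Delivery fit", 20, "[what 0, 10, and 20 look like for you]"),
]

# Assumed tier bands (same shape as the persona tiers later in the post).
TIERS = "A: 90-100, B: 75-89, C: 60-74, D: 40-59, F: 0-39"

# Hypothetical output schema; swap in whatever your downstream tool expects.
OUTPUT_FORMAT = (
    'Return only JSON: {"scores": {"<criterion>": <int>}, '
    '"total": <int>, "tier": "<A-F>", "reasons": ["<string>"]}'
)


def build_prompt(role: str, about_us: str, prospect: str) -> str:
    """Join role, company context, rubric, tiers, and format into one prompt."""
    rubric_lines = "\n".join(
        f"- {c.name} (0-{c.max_points}): {c.guidance}" for c in RUBRIC
    )
    sections = [
        f"ROLE\n{role}",
        f"ABOUT US\n{about_us}",
        f"RUBRIC\n{rubric_lines}",
        f"TIER THRESHOLDS\n{TIERS}",
        f"OUTPUT FORMAT\n{OUTPUT_FORMAT}",
        f"PROSPECT\n{prospect}",
    ]
    return "\n\n".join(sections)
```

Keeping the rubric as data rather than as a hand-typed paragraph means you can change a weight or a description in one place when calibration tells you the first version was wrong.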

That's the shape. Below is the prompt.

## The ICP Scoring Prompt (Drop-In)

This is the prompt we use. It works with Claude and ChatGPT. We haven't tested it on Grok or any open source models, but I'm gonna guess that it works pretty well there, too.

Replace the bracketed sections with your own ICP definition. 

## Variations: Persona-Level Scoring & Account-Level Scoring

The prompt above is account-level scoring. You're scoring the company. If you sell into named buyer roles, you also need persona-level scoring, which is a separate pass.

For persona scoring, swap the rubric for:

1. **Role fit (0-25):** is this the actual buyer, an influencer, a blocker, or an end user?
2. **Influence (0-25):** can this person move budget or sign? Or do they need to escalate?
3. **Pain alignment (0-25):** does this person feel the pain your offer solves, or are they two layers removed from it?
4. **Engagement signals (0-25):** have they engaged with your content, your category, or competitors? Or are they completely cold?

Tier thresholds for persona scoring run the same shape: A (90-100), B (75-89), C (60-74), D (40-59), F (0-39).
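As a rough sketch of the arithmetic, the four persona criteria add up to a 0-100 total, and the total maps to a tier using the bands above. The function names and dictionary keys below are hypothetical; only the point caps and the tier ranges come from this section.

```python
# Point caps for the four persona criteria listed above.
PERSONA_MAX = {
    "role_fit": 25,
    "influence": 25,
    "pain_alignment": 25,
    "engagement_signals": 25,
}

# Lowest total that still earns each tier, checked from the top down.
TIER_FLOORS = [("A", 90), ("B", 75), ("C", 60), ("D", 40), ("F", 0)]


def tier_for(total: int) -> str:
    """Map a 0-100 total to its letter tier."""
    if not 0 <= total <= 100:
        raise ValueError(f"total must be between 0 and 100, got {total}")
    for tier, floor in TIER_FLOORS:
        if total >= floor:
            return tier
    raise AssertionError("unreachable: the F tier floor is 0")


def persona_total(scores: dict[str, int]) -> int:
    """Validate each criterion against its cap, then sum them."""
    for name, cap in PERSONA_MAX.items():
        if name not in scores:
            raise ValueError(f"missing criterion: {name}")
        if not 0 <= scores[name] <= cap:
            raise ValueError(f"{name} must be between 0 and {cap}")
    return sum(scores[name] for name in PERSONA_MAX)
```

Doing the sum and the tier lookup in code, instead of trusting the model's own math, is one cheap way to keep the tier consistent with the criterion scores.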

A single account can have an A-tier persona and a C-tier account score. That happens when the right person works at the wrong company. Document both scores and pursue the persona if you have a long enough sales cycle and a smart enough nurture motion.

## How to Calibrate the Prompt Against Your Actual Pipeline

The prompt will be wrong on the first run (probably the 2nd & 3rd, too), so just put in the work. Calibration is the work that makes the scores trustworthy.

Run the prompt against your last 10-30 closed-won deals in the same ICP (or as many as you have). If the average score for your won deals is 65, your tier thresholds are sitting about 20 points too high and you need to recalibrate them down. If most won deals are scoring in the 75-89 B-tier and almost none are scoring in the 90-100 A-tier, your rubric is too harsh.

Then run it against your last 10-30 closed-lost or gone-dark deals in the same ICP. You might find an attribute that separates the deals you won from the deals you lost. That way you can improve your model right off the bat.
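One way to run that comparison is sketched below, assuming you have saved each deal's per-criterion scores as a dictionary. The function names are made up for illustration. A criterion with a large gap between won and lost deals is doing real work; a criterion with a gap near zero is probably not telling you much.

```python
from statistics import mean


def average_total(deals: list[dict[str, int]]) -> float:
    """Average total score across a list of scored deals."""
    return mean(sum(deal.values()) for deal in deals)


def criterion_gaps(
    won: list[dict[str, int]], lost: list[dict[str, int]]
) -> dict[str, float]:
    """Mean won score minus mean lost score, for each criterion.

    Assumes every deal was scored on the same set of criteria.
    """
    return {
        name: mean(deal[name] for deal in won) - mean(deal[name] for deal in lost)
        for name in won[0]
    }
```

If `average_total` for your closed-won deals comes back well below your A and B bands, that points at the thresholds. If one criterion's gap is negative, that criterion is actively rewarding the deals you lose and deserves a hard look.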

This is not going to take you very long, and it yields dividends forever.

## When the Prompt is the Limit and When the Rubric is the Limit

**Prompt failure** looks like wildly different scores for the same prospect on different runs, or missing fields in the JSON output, or the model refusing to score because of "insufficient information." Prompt failures are fixable with better instruction, better examples, and constraining the output format.
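Both symptoms are easy to check mechanically. Here is a small sketch, assuming the model returns a JSON object and that you score the same prospect several times. The required field names are a hypothetical schema, not a standard, and what counts as too much spread between runs is your call.

```python
import json

# Hypothetical schema: adjust to whatever fields your prompt asks for.
REQUIRED_FIELDS = {"scores", "total", "tier", "reasons"}


def missing_fields(raw: str) -> set[str]:
    """Return the required fields absent from the model's raw output.

    Output that is not a JSON object counts as missing everything.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return set(REQUIRED_FIELDS)
    if not isinstance(data, dict):
        return set(REQUIRED_FIELDS)
    return REQUIRED_FIELDS - set(data)


def score_spread(totals: list[int]) -> int:
    """Gap between the highest and lowest total across repeated runs."""
    return max(totals) - min(totals)
```

A non-empty result from `missing_fields`, or a spread that crosses a tier boundary, suggests the instructions or the output example need tightening before you touch the rubric.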

**Rubric failure** looks like the prompt running smoothly, producing consistent scores, and the scores still not filtering out bad fits. The numbers are reliable. They're just not measuring the right things. Rubric failures require revising the criteria, often by interviewing your top sales rep and asking what they actually look for in a prospect that doesn't show up on a website.

Most agencies blame the prompt when the rubric is the problem. The prompt is the easy fix. The rubric requires honest internal work.

## What to Do Next

If you're running ICP scoring through an LLM today, take this prompt, plug in your ICP definition, and run it against your last 30 deals. Compare to your historical close rate by tier. You'll either confirm your scoring system is calibrated or discover where it's wrong.
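The tier comparison can be as simple as the sketch below, which assumes each historical deal has been reduced to a tier letter and a won/lost flag. The function name is hypothetical. In a calibrated system you would expect the close rate to fall as you move from A toward F.

```python
from collections import defaultdict


def close_rate_by_tier(deals: list[tuple[str, bool]]) -> dict[str, float]:
    """Share of deals won in each tier.

    `deals` is a list of (tier, won) pairs, one per closed deal.
    """
    counts = defaultdict(lambda: [0, 0])  # tier -> [won, total]
    for tier, won in deals:
        counts[tier][0] += int(won)
        counts[tier][1] += 1
    return {
        tier: won_count / total
        for tier, (won_count, total) in sorted(counts.items())
    }
```

With only 30 deals, some tiers will hold just a handful of examples, so treat the rates as a direction check rather than a precise measurement.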

If you're not running ICP scoring through an LLM yet, this is the place to start. The setup is easy and the payoff is a sales team that stops chasing the wrong-fit accounts.

---
Canonical URL: https://timkilroy.com/blog/icp-scoring-prompt-ai-sales-agents