Building a Multi-Agent GTM Engine with Claude

Outbound sales is a grind. Find companies that match your ICP. Find the right contacts. Research them on LinkedIn. Write personalized emails. Load them into your campaign tool. Repeat.

It takes hours per day to do well. Most people don't do it well, so they blast generic emails and wonder why response rates are terrible.

I built cheerful-gtm to automate the entire pipeline. It's a multi-agent system where each agent specializes in one part of the workflow: finding leads, enriching them with research, and writing personalized email sequences. The result: high-quality outbound at scale, running on autopilot.

Here's how it works.


The Architecture

The pipeline has three agents connected in sequence:

Lead Finder Agent → Enrichment Agent → Copywriter Agent → Instantly → Slack Digest

Each agent is a Claude-powered loop with access to specific tools. They hand off structured data to the next agent in the chain.
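The handoff format can be as simple as a dataclass. A minimal sketch, with field names that are illustrative rather than the project's exact schema:

```python
from dataclasses import dataclass, field

@dataclass
class Lead:
    """What one agent passes to the next. Field names are illustrative."""
    email: str
    first_name: str
    company: str
    title: str
    score: str = "UNSCORED"                          # filled in by the Enrichment agent
    angles: list[str] = field(default_factory=list)  # personalization hooks from posts

# The Lead Finder emits bare records; downstream agents add fields.
lead = Lead(email="sarah@example.com", first_name="Sarah",
            company="Acme", title="Head of Support")
```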

Agent 1: Lead Finder

Job: Find companies matching our ICP, identify contacts, deduplicate against existing data.

Tools:

  • apollo_search: Search Apollo.io for companies by criteria
  • apollo_get_contacts: Get contacts at a specific company
  • clarify_check: Check if a contact exists in our CRM
  • instantly_check: Check if a contact is in an active campaign

Flow:

  1. Search Apollo for companies matching criteria (industry, size, location, technologies)
  2. For each company, find relevant contacts (title, seniority)
  3. Check each contact against Clarify CRM - skip if already a lead
  4. Check against Instantly - skip if already in a campaign
  5. Return list of net-new leads

The agent decides how many searches to run based on results. If the first search returns low-quality matches, it might refine the query. If it finds enough leads quickly, it stops early.
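Steps 3-5 amount to a dedupe filter. A sketch of that logic, with the two check tools stood in by async callables (the names here are mine, not the repo's):

```python
import asyncio

async def filter_net_new(contacts, crm_check, campaign_check):
    """Drop contacts already in the CRM or an active campaign.

    crm_check / campaign_check stand in for the clarify_check and
    instantly_check tools: async callables taking an email, returning bool.
    """
    net_new = []
    for contact in contacts:
        if await crm_check(contact["email"]):
            continue  # already a lead in Clarify
        if await campaign_check(contact["email"]):
            continue  # already receiving a sequence in Instantly
        net_new.append(contact)
    return net_new

# Demo with fake checks
async def _in_crm(email): return email == "a@x.com"
async def _in_campaign(email): return email == "b@x.com"

contacts = [{"email": "a@x.com"}, {"email": "b@x.com"}, {"email": "c@x.com"}]
net_new = asyncio.run(filter_net_new(contacts, _in_crm, _in_campaign))
```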

Agent 2: Enrichment

Job: Research each lead to score them and find personalization angles.

Tools:

  • linkedin_get_profile: Fetch LinkedIn profile via Apify
  • linkedin_get_posts: Fetch recent LinkedIn posts
  • clarify_create: Create a record in our CRM

Flow:

  1. For each lead, fetch their LinkedIn profile
  2. Get their recent posts (last 30 days)
  3. Score based on activity:
    • HIGH: Recent posts relevant to what we do
    • MEDIUM: Any recent activity
    • LOW: No posts or minimal presence
  4. Extract personalization angles from posts
  5. Create a CRM record with the enriched data

The score determines how much effort the copywriter spends on personalization.
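The rubric above can be expressed as a small function. The 30-day window comes from the flow; the keyword list is an assumption for illustration:

```python
from datetime import datetime, timedelta, timezone

RELEVANT_KEYWORDS = ("support", "automation", "ai", "email")  # assumed, not the real list

def score_lead(posts, now=None):
    """HIGH / MEDIUM / LOW per the rubric: relevance beats mere activity."""
    now = now or datetime.now(timezone.utc)
    recent = [p for p in posts if now - p["posted_at"] <= timedelta(days=30)]
    if any(kw in p["text"].lower() for p in recent for kw in RELEVANT_KEYWORDS):
        return "HIGH"    # recent posts touch topics we sell into
    if recent:
        return "MEDIUM"  # active, but off-topic
    return "LOW"         # dormant or empty profile

now = datetime.now(timezone.utc)
on_topic = [{"text": "Thoughts on AI support automation", "posted_at": now - timedelta(days=3)}]
off_topic = [{"text": "Great offsite last week!", "posted_at": now - timedelta(days=3)}]
```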

Agent 3: Copywriter

Job: Generate a 3-email sequence for each lead, personalized based on their score.

Tools: None (pure generation)

Flow:

  1. Receive enriched lead with score and personalization angles
  2. Generate email 1 (day 0): Initial outreach
  3. Generate email 2 (day 4): Follow-up with different angle
  4. Generate email 3 (day 8): Final touch, softer ask

For HIGH-score leads, emails reference specific posts and demonstrate genuine familiarity. For LOW-score leads, personalization is lighter - company and role based, not individual.
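One way to wire the score into generation is to swap the personalization instructions in the copywriter's prompt. A sketch, with invented instruction text:

```python
DEPTH_INSTRUCTIONS = {
    # Illustrative fragments; the real system prompt is presumably longer.
    "HIGH": "Reference the specific posts listed below and show genuine familiarity.",
    "MEDIUM": "Personalize on role and company context; do not cite individual posts.",
    "LOW": "Personalize on company, role, and industry only. Never imply you read their posts.",
}

def build_copywriter_prompt(lead: dict) -> str:
    """Assemble the per-lead prompt from score and angles."""
    prompt = (
        f"Write a 3-email sequence for {lead['first_name']} at {lead['company']}.\n"
        f"{DEPTH_INSTRUCTIONS[lead['score']]}\n"
    )
    if lead.get("angles"):
        prompt += "Personalization angles:\n" + "\n".join(f"- {a}" for a in lead["angles"])
    return prompt

prompt = build_copywriter_prompt(
    {"first_name": "Sarah", "company": "Acme", "score": "LOW", "angles": []}
)
```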


The Agentic Loop

Each agent follows the same pattern:

import json

from anthropic import AsyncAnthropic

class BaseAgent:
    def __init__(self, client: AsyncAnthropic):
        self.client = client
        self.model = "claude-opus-4-5-20251101"

    async def run(self, input_data: dict) -> dict:
        messages = [{"role": "user", "content": self._format_prompt(input_data)}]

        while True:
            response = await self.client.messages.create(
                model=self.model,
                max_tokens=4096,
                tools=self._get_tool_definitions(),
                messages=messages
            )

            # Done when Claude stops asking for tools
            if response.stop_reason != "tool_use":
                return self._parse_final_response(response)

            # Execute every tool call in this turn
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = await self._handle_tool_call(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result)
                    })

            # Feed the results back and continue the conversation
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})

The key insight: Claude decides when it's done. The loop continues until stop_reason isn't tool_use. This means:

  • The Lead Finder might call Apollo 5 times to refine a search
  • The Enricher might skip LinkedIn for leads with obviously fake profiles
  • The Copywriter generates all three emails in one shot

No hardcoded step counts. The agent adapts to the situation.
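In practice I'd still cap the loop: an agent that never stops calling tools burns tokens. A framework-agnostic sketch of the same loop with a safety valve (the cap value and the injected callables are my assumptions, chosen so the sketch runs without API credentials):

```python
def run_agent_loop(call_model, handle_tools, max_turns=20):
    """Bounded version of the agentic loop.

    call_model(messages) -> (stop_reason, content) and
    handle_tools(content) -> tool_result blocks are injected
    so this stays runnable without an API key.
    """
    messages = []
    for _ in range(max_turns):
        stop_reason, content = call_model(messages)
        if stop_reason != "tool_use":
            return content  # the model decided it is done
        messages.append({"role": "assistant", "content": content})
        messages.append({"role": "user", "content": handle_tools(content)})
    raise RuntimeError(f"agent did not finish within {max_turns} turns")

# Demo: a fake model that needs two tool rounds before finishing
state = {"turns": 0}
def fake_model(messages):
    state["turns"] += 1
    return ("tool_use", [{"tool": "x"}]) if state["turns"] < 3 else ("end_turn", "done")

result = run_agent_loop(fake_model, lambda content: [{"type": "tool_result"}])
```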


Tool Definitions

Tools are defined as JSON schemas that Claude understands:

def _get_tool_definitions(self) -> list:
    return [
        {
            "name": "apollo_search",
            "description": "Search Apollo.io for companies matching criteria",
            "input_schema": {
                "type": "object",
                "properties": {
                    "industry": {
                        "type": "string",
                        "description": "Industry vertical (e.g., 'SaaS', 'E-commerce')"
                    },
                    "employee_count_min": {
                        "type": "integer",
                        "description": "Minimum employee count"
                    },
                    "employee_count_max": {
                        "type": "integer",
                        "description": "Maximum employee count"
                    },
                    "technologies": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Technologies the company uses"
                    }
                },
                "required": ["industry"]
            }
        },
        # ... more tools
    ]

Tool handlers execute the actual API calls:

async def _handle_tool_call(self, name: str, tool_input: dict) -> dict:
    # "tool_input" avoids shadowing the built-in input()
    if name == "apollo_search":
        return await self.apollo_client.search_companies(**tool_input)
    elif name == "apollo_get_contacts":
        return await self.apollo_client.get_contacts(**tool_input)
    elif name == "clarify_check":
        return await self.clarify_client.check_exists(tool_input["email"])
    # ... more handlers
    else:
        return {"error": f"unknown tool: {name}"}
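As the tool count grows, a chain of if/elif branches gets unwieldy. A registry-style dispatcher is one alternative (my variation, not the repo's code):

```python
import asyncio

class ToolRouter:
    """Name -> async handler registry; an alternative to an if/elif chain."""

    def __init__(self):
        self._handlers = {}

    def register(self, name):
        def decorator(fn):
            self._handlers[name] = fn
            return fn
        return decorator

    async def dispatch(self, name, tool_input):
        handler = self._handlers.get(name)
        if handler is None:
            # Return the error to the model instead of crashing the loop
            return {"error": f"unknown tool: {name}"}
        return await handler(**tool_input)

router = ToolRouter()

@router.register("apollo_search")
async def apollo_search(industry, **kwargs):
    return {"companies": [], "industry": industry}  # stub handler for the demo

found = asyncio.run(router.dispatch("apollo_search", {"industry": "SaaS"}))
missing = asyncio.run(router.dispatch("nope", {}))
```

Returning the error as a tool result lets Claude recover (retry with a valid tool) rather than killing the whole run.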

Personalization by Score

The copywriter's behavior changes based on lead score:

HIGH Score (relevant recent posts):

Subject: Your post on AI automation resonated

Hi Sarah,

I saw your post last week about automating customer support workflows -
especially the point about maintaining quality while scaling. That's exactly
the problem we're solving at Cheerful.

We help e-commerce brands automate email responses while keeping the human
touch. Based on what you shared, I think you'd find our approach interesting.

Would you be open to a 15-minute call to see if there's a fit?

LOW Score (no recent activity):

Subject: Quick question about [Company] support

Hi Sarah,

I noticed [Company] is growing quickly in the e-commerce space.
Curious how you're handling customer support volume as you scale?

We help brands like yours automate email responses without sacrificing quality.

Worth a quick chat?

The HIGH version references specific content and demonstrates genuine research. The LOW version is still personalized (company, role, industry) but doesn't pretend to know things we don't.

This honesty matters. Fake personalization ("I loved your recent post!" when there is no post) destroys trust.
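This can also be enforced mechanically: a post-generation check that flags post-level claims in emails written for leads whose enrichment found no posts. The phrase list is illustrative; a real check might be another model pass:

```python
FAKE_RESEARCH_SIGNALS = ("your post", "you wrote", "saw your article", "your recent piece")

def violates_honesty(email_body: str, score: str) -> bool:
    """Flag emails implying research the enrichment didn't actually find."""
    if score == "HIGH":
        return False  # post references are backed by real posts
    body = email_body.lower()
    return any(signal in body for signal in FAKE_RESEARCH_SIGNALS)
```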


Pipeline Orchestration

The pipeline coordinator runs the agents in sequence:

import logging
import os

logger = logging.getLogger(__name__)

class Pipeline:
    def __init__(self):
        self.lead_finder = LeadFinderAgent()
        self.enrichment = EnrichmentAgent()
        self.copywriter = CopywriterAgent()
        self.instantly = InstantlyClient()
        self.slack = SlackClient()

    async def run(self, icp_criteria: dict) -> PipelineResult:
        # Phase 1: Find leads
        leads = await self.lead_finder.run(icp_criteria)
        logger.info(f"Found {len(leads)} leads")

        # Phase 2: Enrich leads
        enriched_leads = []
        for lead in leads:
            enriched = await self.enrichment.run(lead)
            enriched_leads.append(enriched)

        # Phase 3: Generate sequences
        sequences = []
        for lead in enriched_leads:
            sequence = await self.copywriter.run(lead)
            sequences.append((lead, sequence))

        # Phase 4: Push to Instantly
        for lead, sequence in sequences:
            await self.instantly.add_lead(
                email=lead.email,
                campaign_id=os.environ["INSTANTLY_CAMPAIGN_ID"],
                variables={
                    "first_name": lead.first_name,
                    "company": lead.company,
                    "email_1": sequence.emails[0],
                    "email_2": sequence.emails[1],
                    "email_3": sequence.emails[2],
                }
            )

        # Phase 5: Slack digest
        await self.slack.post_digest(leads, enriched_leads, sequences)

        return PipelineResult(
            leads_found=len(leads),
            leads_enriched=len(enriched_leads),
            sequences_generated=len(sequences)
        )
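Phases 2 and 3 run serially, which is simple but slow for large batches. If latency matters, bounded concurrency is a straightforward upgrade. A sketch, with the limit sized as a guess at a polite rate for the LinkedIn/Apify calls underneath:

```python
import asyncio

async def enrich_all(leads, enrich_one, max_concurrent=5):
    """Run enrichment concurrently with a cap (max_concurrent is an assumption)."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(lead):
        async with semaphore:
            return await enrich_one(lead)

    # gather preserves input order, so results line up with leads
    return await asyncio.gather(*(bounded(lead) for lead in leads))

# Demo with a fake enricher
async def fake_enrich(lead):
    await asyncio.sleep(0)           # stand-in for network I/O
    return {**lead, "score": "LOW"}

enriched = asyncio.run(enrich_all([{"email": "a@x.com"}, {"email": "b@x.com"}], fake_enrich))
```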

Scheduling

The pipeline runs Mon-Thu at 9am ET:

import asyncio

from apscheduler.schedulers.asyncio import AsyncIOScheduler

def main():
    scheduler = AsyncIOScheduler()

    @scheduler.scheduled_job('cron', day_of_week='mon-thu', hour=9, timezone='US/Eastern')
    async def run_pipeline():
        pipeline = Pipeline()
        result = await pipeline.run(ICP_CRITERIA)
        logger.info(f"Pipeline complete: {result}")

    scheduler.start()
    asyncio.get_event_loop().run_forever()  # keep the loop alive for the scheduler

Or run immediately with --now:

python -m cheerful_gtm.main --now
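The flag handling is a few lines of argparse. A sketch matching that invocation:

```python
import argparse

def parse_args(argv=None):
    """Parse the CLI; only --now is shown here, matching the usage above."""
    parser = argparse.ArgumentParser(prog="cheerful_gtm")
    parser.add_argument("--now", action="store_true",
                        help="run the pipeline once immediately instead of scheduling")
    return parser.parse_args(argv)

args_now = parse_args(["--now"])
args_default = parse_args([])
```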

Results and Iteration

After running for a few weeks:

  • Lead quality improved: The enrichment scoring accurately identifies engaged prospects
  • Response rates up: Personalized sequences based on actual LinkedIn activity perform better than generic templates
  • Time saved: What used to take 2-3 hours daily now runs automatically
  • Better data: Every lead goes into Clarify CRM with enrichment data, even if they don't respond

The Slack digest keeps me informed without requiring daily attention:

📊 Cheerful GTM Pipeline - January 4, 2025

Leads Found: 47
  - New: 32
  - Skipped (CRM): 8
  - Skipped (active campaign): 7

Enrichment:
  - HIGH score: 8 (25%)
  - MEDIUM score: 14 (44%)
  - LOW score: 10 (31%)

Sequences Generated: 32
Campaign Updated: cheerful-q1-outbound

[View in Instantly] [View in Clarify]

Lessons Learned

1. Let agents decide loop iterations

Hardcoding "search Apollo exactly 3 times" is fragile. Letting Claude decide based on results is more robust. Sometimes one search is enough. Sometimes five isn't.

2. Score-based personalization prevents fakeness

The temptation is to always maximize personalization. But fake personalization is worse than none. Scoring leads and adjusting depth accordingly keeps emails honest.

3. Dedupe early and often

Checking against CRM and existing campaigns before enrichment saves API calls and prevents embarrassing duplicate outreach.

4. The Slack digest builds trust

Seeing daily stats helps me trust the system. If something looks off (0 leads found, unusual score distribution), I can investigate before emails go out.
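That investigation can be front-loaded: cheap sanity checks on the digest numbers before anything ships. The thresholds here are invented for illustration:

```python
def digest_anomalies(stats: dict) -> list[str]:
    """Return human-readable warnings for a daily digest; thresholds illustrative."""
    warnings = []
    if stats["leads_found"] == 0:
        warnings.append("no leads found (check Apollo search criteria)")
    scored = stats["high"] + stats["medium"] + stats["low"]
    if scored and stats["high"] / scored > 0.8:
        warnings.append("HIGH-score ratio above 80% (scoring may be too loose)")
    return warnings

ok = digest_anomalies({"leads_found": 32, "high": 8, "medium": 14, "low": 10})
bad = digest_anomalies({"leads_found": 0, "high": 0, "medium": 0, "low": 0})
```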

5. Multi-agent > monolithic

Breaking the pipeline into specialized agents made each one easier to build, test, and improve. The Lead Finder doesn't need to know about email writing. The Copywriter doesn't need to know about Apollo's API.


Try It Yourself

The core pattern:

  1. Define your ICP as structured criteria
  2. Build specialized agents for each pipeline stage
  3. Use tool-calling to give agents capabilities
  4. Let Claude manage loops - check stop_reason
  5. Connect outputs to inputs between agents
  6. Add observability (logs, Slack digests)

The specific tools (Apollo, LinkedIn, Instantly) can be swapped for whatever you use. The agent architecture stays the same.


Built with Python, Claude (Opus 4.5), Apollo.io, Apify, Clarify CRM, and Instantly. Runs on a schedule via APScheduler.