How [sage]PROSPECTS works
No black box. Here's exactly what runs when you click "Run Report," what data we touch, and how an action plan gets generated.
The pipeline — 6 steps, ~1–2 minutes
- 1
Competitor discovery
We build 3–5 search queries from your business profile (type, location, services) and query Google's SERP (search engine results page) through DataForSEO. From the top 10 organic results we strip directories (Healthgrades, Yelp), social profiles, and aggregators, leaving direct competitor websites.
- 2
Competitor website scraping
For each competitor domain we fetch the homepage and up to 5 service pages, then extract title tags, meta descriptions, headers, body copy, and internal link structure. We parse the static HTML with Cheerio, which is fast; 80% of local-business sites need no JS rendering.
- 3
Keyword extraction + enrichment
We extract 1-to-4-word phrases from competitor copy, weighted by source (title=5, h1=4, body=1) and boosted by cross-domain frequency. Then we send the candidate list to DataForSEO's Keywords Data API for monthly search volume, CPC (cost per click) estimates, and competition level. We also pull 50 related keywords for expansion.
- 4
Intent classification (AI)
Every keyword gets classified as High / Mid / Informational intent by an AI model. The system prompt is calibrated per business type (different rules for doctors vs lawyers vs gyms). We discard Informational keywords from the report. A structured tool-use schema guarantees machine-readable output.
- 5
Pre-ad validation (4 checks)
Before recommending any keyword, we verify:
- Competitor ad validation: is at least one competitor actively bidding on this keyword right now? (Via DataForSEO Labs ranked_keywords.)
- Search Console: if you've connected your Google Search Console (GSC), we cross-reference organic impressions.
- Landing page alignment: does your website have a page that matches each keyword?
- Google reviews: is your Business Profile rating ≥ 4.2★ with ≥ 20 reviews? Below that, ads don't convert.
- 6
Report generation
Every keyword gets a priority tier (1 = ready, 2 = build a page first, 3 = test cautiously, 4 = skip). We compute a recommended test budget based on actual CPC data from your keywords. The report lands as both human-readable markdown and machine-readable JSON for monthly diffing.
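To make the weighting in step 3 concrete, here is a minimal Python sketch. The source weights (title=5, h1=4, body=1) come from the description above; everything else is an assumption for illustration, including the function names, the input shape, and the choice to express the cross-domain boost as a multiplier by the number of distinct domains. The production scoring may differ.

```python
import re
from collections import defaultdict

# Weights quoted in step 3. Unknown sources fall back to 1 (assumption).
SOURCE_WEIGHTS = {"title": 5, "h1": 4, "body": 1}


def ngrams(text, max_words=4):
    """Yield every 1-to-4-word phrase in the text, lowercased."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def score_keywords(pages):
    """Score phrases from (domain, source, text) tuples.

    Hypothetical formula: sum of source weights across all occurrences,
    multiplied by the number of distinct domains that use the phrase.
    """
    weighted = defaultdict(int)
    domains = defaultdict(set)
    for domain, source, text in pages:
        weight = SOURCE_WEIGHTS.get(source, 1)
        for phrase in ngrams(text):
            weighted[phrase] += weight
            domains[phrase].add(domain)
    return {p: weighted[p] * len(domains[p]) for p in weighted}


pages = [
    ("a.example", "title", "Primary Care Queens"),
    ("b.example", "body", "primary care near you"),
]
scores = score_keywords(pages)
```

In this toy input, "primary care" appears in a title on one domain and in body copy on another, so it outranks "queens", which appears on only one site.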
What data sources we use
We don't make up numbers. Every metric in the report comes from one of these sources:
DataForSEO
All Google data
Keyword volume, CPC, competition level, SERP results, competitor ad activity, Google Business listing data. We never scrape Google directly; DataForSEO is the only sanctioned path.
Cheerio
Competitor websites
Direct HTML scraping of competitor service pages. Standard user-agent, respects rate limits, doesn't pull from anything except top-of-funnel marketing pages.
LLM provider
Intent classification
A large language model with a calibrated system prompt and a structured tool-use schema. Each keyword gets a category (High / Mid / Informational) and a short reason. Only the keyword phrase itself leaves the app; no business data is sent.
Google Search Console (optional)
Organic performance
When connected, we cross-reference paid keyword recommendations against your actual organic impressions to find high-volume, low-CTR (click-through rate) opportunities worth targeting with paid.
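For readers curious what "structured tool-use schema" means for intent classification, the Python sketch below shows one plausible shape. The field names (keyword, category, reason) and the JSON-Schema-style layout are assumptions; only the three categories, the short reason, and the rule that Informational keywords are dropped come from the description above. No real LLM API is called here.

```python
# Hypothetical schema for one classification result (JSON Schema style).
INTENT_TOOL_SCHEMA = {
    "type": "object",
    "properties": {
        "keyword": {"type": "string"},
        "category": {"type": "string", "enum": ["High", "Mid", "Informational"]},
        "reason": {"type": "string"},
    },
    "required": ["keyword", "category", "reason"],
}


def validate(result):
    """Raise ValueError if a model result does not match the schema."""
    for field in INTENT_TOOL_SCHEMA["required"]:
        if not isinstance(result.get(field), str):
            raise ValueError(f"missing or non-string field: {field}")
    allowed = INTENT_TOOL_SCHEMA["properties"]["category"]["enum"]
    if result["category"] not in allowed:
        raise ValueError(f"unknown category: {result['category']}")
    return result


def keep_for_report(results):
    """Validate every result, then drop Informational keywords."""
    return [r for r in map(validate, results) if r["category"] != "Informational"]


results = [
    {"keyword": "primary care doctor queens", "category": "High", "reason": "ready to book"},
    {"keyword": "what is direct primary care", "category": "Informational", "reason": "research"},
]
kept = keep_for_report(results)
```

Because the category is constrained to a fixed set of values, downstream code can filter and group keywords without parsing free-form text.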
The validation gate — why this is different from a keyword tool
Most keyword tools dump a list of phrases ranked by search volume and call it a day. That's where most ad budgets die: high-volume keyword, no validation that it actually converts, dollars spent, nothing booked.
[sage]PROSPECTS refuses to recommend a keyword unless a competitor is already paying for it. If five competing practices in Queens are bidding on "cash pay primary care queens," the keyword is market-validated. If nobody's bidding, it's either an untapped niche (small chance) or a dud (most likely), so we mark it Tier 3 or Tier 4 and you either test it carefully or skip it.
The same logic applies to your site, your reviews, and your landing pages. If anything blocks the path from "click on ad" to "book the appointment," we surface it as a flag before ad spend starts.
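One way to picture the gate is as a small decision function. The Python sketch below is a guess at the shape of that logic, not the product's actual rules. The tier meanings (1 = ready, 2 = build a page first, 3 = test cautiously, 4 = skip) and the review thresholds (4.2★, 20 reviews) come from this page. The mapping from checks to tiers, the use of intent to split Tier 3 from Tier 4, and all names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class KeywordChecks:
    competitor_bidding: bool  # at least one competitor is bidding right now
    has_landing_page: bool    # your site has a page matching the keyword
    high_intent: bool         # classified as High intent (hypothetical tie-breaker)


def assign_tier(checks):
    """Map check outcomes to a priority tier (assumed mapping)."""
    if checks.competitor_bidding:
        # Market-validated: ready if a page exists, otherwise build one first.
        return 1 if checks.has_landing_page else 2
    # Nobody is bidding: test cautiously or skip. The split is an assumption.
    return 3 if checks.high_intent else 4


def reviews_flag(rating, review_count):
    """Return True when reviews fall below the stated thresholds."""
    return rating < 4.2 or review_count < 20


tier = assign_tier(KeywordChecks(competitor_bidding=True, has_landing_page=False, high_intent=True))
```

Here the example keyword is market-validated but has no matching page, so it lands in Tier 2. A weak review profile is reported as a separate flag and does not change the tier in this sketch.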