New Venue Data
Engineering

Building a real-time prospecting pipeline on license data

A practical architecture for turning raw Florida license filings into scored, routed, CRM-ready leads — using webhooks, enrichment, and a dead-simple scoring model your reps will actually trust.

Priya Nair · Staff Engineer · May 12, 2026 · 11 min read

There is a meaningful gap between "we have access to license data" and "our reps get a scored lead in Slack ninety seconds after a relevant business files." This post walks through the pipeline we recommend to customers building the second thing. None of it is exotic; the value is in the sequencing and in resisting the urge to over-engineer.

Stage one: ingest via webhooks, not polling

The first instinct is usually to schedule a nightly job that pulls the day's filings. It works, but it bakes in up to twenty-four hours of latency on a signal whose entire value is freshness. Register a webhook instead. Filter at the source — by county, license type, and event type — so your endpoint only ever wakes up for filings you actually care about.

  • Subscribe to new_filing and ownership_transfer events; skip renewals unless you sell retention or compliance services.
  • Scope the subscription to your territories at registration time so you are not paying to process Panhandle filings your Miami team will never touch.
  • Return a 200 immediately and push the payload onto a queue. Never do enrichment or scoring inside the webhook handler — a slow handler causes retries and duplicate processing.
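The acknowledge-then-enqueue pattern from the last bullet can be sketched without any web framework. This is a minimal illustration, not a reference implementation: the payload field name `event_type` and the in-process `queue.Queue` are assumptions, and a production system would more likely use a durable queue.

```python
import json
import queue

# Event names come from the bullets above; the "event_type" payload key is an
# assumed field name, not a documented schema.
WANTED_EVENTS = {"new_filing", "ownership_transfer"}

# In-process stand-in for a durable queue (assumption for the sketch).
work_queue = queue.Queue()


def handle_webhook(raw_body: bytes, q: queue.Queue = work_queue) -> int:
    """Parse, enqueue, and return the HTTP status code to send back.

    No enrichment or scoring happens here; a separate worker drains the queue.
    """
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return 400
    if not isinstance(payload, dict):
        return 400

    # The subscription should already be filtered at the source; this is a
    # cheap second check, not a replacement for scoping the subscription.
    if payload.get("event_type") in WANTED_EVENTS:
        q.put(payload)
    return 200
```

Whatever HTTP framework you use, the route handler would call `handle_webhook(request_body)` and return its status code straight away, leaving the slow work to the queue consumer.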

Stage two: deduplicate before you do anything else

Public records are messy. The same business can surface as a pending filing, then an approval, then a status change, generating three events for one real-world lead. Deduplicate on a stable key — the license record ID, falling back to a normalized business-name-plus-address hash. Collapsing these early prevents your reps from getting pinged three times about the same restaurant, which is the fastest way to lose their trust in the system.
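One way to build that key, as a sketch: prefer the record ID when present, otherwise hash a normalized name and address. The field names (`license_record_id`, `business_name`, `address`) are hypothetical, and the in-memory `seen` set stands in for whatever persistent store you would actually use.

```python
import hashlib
import re


def _normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def dedup_key(event: dict) -> str:
    """Prefer the license record ID; fall back to a name-plus-address hash."""
    record_id = event.get("license_record_id")  # hypothetical field name
    if record_id:
        return f"id:{record_id}"
    basis = (
        _normalize(event.get("business_name", ""))
        + "|"
        + _normalize(event.get("address", ""))
    )
    return "hash:" + hashlib.sha256(basis.encode("utf-8")).hexdigest()


def is_duplicate(event: dict, seen: set) -> bool:
    """Return True if this lead was already seen; otherwise record it."""
    key = dedup_key(event)
    if key in seen:
        return True
    seen.add(key)
    return False
```

With this in place, a pending filing and its later approval share one key as long as they carry the same record ID, so only the first event reaches a rep.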

Stage three: enrich, but only with what changes the decision

Enrichment is where pipelines go to die. The temptation is to bolt on every data source you can find. Resist it. Add only the fields that change whether a rep should act and how fast; everything else is latency you are paying for with no return. In practice that is a short list: a website or phone where available, a NAICS or category code to confirm the segment, and the surrounding-cluster context that tells you whether this is an isolated opening or part of a wave.
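A simple way to enforce that discipline in code is an explicit allowlist, so a new data source cannot quietly add fields to the lead. The sketch below assumes hypothetical key names (`website`, `phone`, `naics_code`, `cluster_size`) and a `lookup` dict standing in for whatever your enrichment sources return.

```python
# Only fields that change a rep's decision; key names are assumptions.
DECISION_FIELDS = ("website", "phone", "naics_code", "cluster_size")


def enrich(lead: dict, lookup: dict) -> dict:
    """Copy only allowlisted, non-empty fields from a lookup result."""
    enriched = dict(lead)  # leave the original lead untouched
    for field in DECISION_FIELDS:
        value = lookup.get(field)
        if value not in (None, ""):
            enriched[field] = value
    return enriched
```

Anything a source returns beyond the allowlist is dropped, which keeps the payload small and makes adding a field a deliberate decision.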

Stage four: a scoring model your reps will trust

You do not need machine learning here, and using it early will actively hurt you because reps cannot reason about why a black box ranked one lead above another. Start with a transparent additive score everyone can read off a single page:

  • License type weight: for a beverage distributor, a full-liquor license (SRX or 4COP) scores higher than a beer-and-wine license at a cafe.
  • Recency: a filing from today outscores one from three weeks ago, because the buying window is open widest right now.
  • Territory fit: in-territory beats adjacent beats out-of-region.
  • Cluster bonus: a filing inside an active opening cluster gets a lift, since corridors in turnover convert better.
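The four components above could be combined along these lines. Every weight, the 21-day recency window, and the field names are illustrative placeholders rather than recommended values; the point is that the function returns its reasons alongside the number.

```python
from datetime import date

# All numbers below are illustrative placeholders; tune them to your business.
LICENSE_WEIGHTS = {"4COP": 40, "SRX": 40, "2COP": 15}
TERRITORY_WEIGHTS = {"in_territory": 30, "adjacent": 15, "out_of_region": 0}
CLUSTER_BONUS = 10
MAX_RECENCY_POINTS = 20
RECENCY_WINDOW_DAYS = 21


def score_lead(lead: dict, today: date) -> tuple[int, list[str]]:
    """Return an additive score and the readable reasons behind it."""
    reasons = []

    license_pts = LICENSE_WEIGHTS.get(lead["license_type"], 0)
    reasons.append(f"license {lead['license_type']}: +{license_pts}")

    # Linear decay from full points on filing day to zero at the window edge.
    age_days = max((today - lead["filed_on"]).days, 0)
    remaining = max(RECENCY_WINDOW_DAYS - age_days, 0)
    recency_pts = MAX_RECENCY_POINTS * remaining // RECENCY_WINDOW_DAYS
    reasons.append(f"filed {age_days} days ago: +{recency_pts}")

    territory_pts = TERRITORY_WEIGHTS.get(lead["territory_fit"], 0)
    reasons.append(f"territory {lead['territory_fit']}: +{territory_pts}")

    cluster_pts = CLUSTER_BONUS if lead.get("in_active_cluster") else 0
    if cluster_pts:
        reasons.append(f"active opening cluster: +{cluster_pts}")

    total = license_pts + recency_pts + territory_pts + cluster_pts
    return total, reasons
```

Because each component is a lookup or a single line of arithmetic, a rep can check any score by hand from the weights table.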

The point of a legible model is not accuracy in the abstract; it is adoption. A rep who understands why a lead scored 84 will work it. A rep handed a mysterious "AI score" of 0.71 will quietly ignore the whole feed within two weeks.

Stage five: route to where the work already happens

The final stage is delivery, and the rule is simple: meet reps where they already work. Push high-scoring leads into your CRM as records and into the channel your team lives in — a Slack message, an email digest, a task in the sales tool — with the score and the two or three reasons it scored that way. Do not build a new dashboard nobody logs into. The pipeline's job ends the moment a human sees the right lead in the place they were already looking.
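As a sketch of the delivery step, the function below turns a scored lead into a message body carrying the score and its top reasons. The threshold of 70 and the lead field names are assumptions; the single-key `{"text": ...}` shape matches what Slack incoming webhooks accept, but the actual sending (and the CRM write) is left out.

```python
from typing import Optional

ROUTE_THRESHOLD = 70  # illustrative cutoff, not a recommended value


def build_notification(
    lead: dict,
    score: int,
    reasons: list,
    threshold: int = ROUTE_THRESHOLD,
) -> Optional[dict]:
    """Return a message payload for high-scoring leads, else None."""
    if score < threshold:
        return None
    why = "; ".join(reasons[:3])  # the two or three reasons, no more
    text = (
        f"New lead: {lead['business_name']} ({lead['county']}) "
        f"scored {score}. Why: {why}"
    )
    return {"text": text}
```

Returning `None` below the threshold keeps the filtering decision in one place, so the channel only ever sees leads worth a rep's attention.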

What good looks like

When this is wired correctly, the elapsed time from a business filing its license to a named rep seeing a scored, contextualized lead is measured in minutes, not days. That latency is the whole game. Every hour you shave off is an hour of head start on the next vendor who is still waiting for the restaurant to physically open before they notice it exists.
