Public records are inconsistent: names are spelled differently, addresses are formatted a dozen ways, and the same business can appear under several entities. Our job is to clean that up so you do not have to.
Normalization
Before matching, we standardize names and addresses so that comparisons are like for like. This covers casing, punctuation, entity suffixes such as LLC and Inc, and individual address components.
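As a rough illustration of what this kind of standardization can look like, here is a minimal Python sketch. It is not our production code. The function names, the suffix list, and the street abbreviation table are all assumptions made for the example; the only suffixes taken from this article are LLC and Inc.

```python
import re

# Hypothetical suffix list for illustration; the article names only LLC and Inc.
SUFFIXES = {"llc", "inc"}

# Hypothetical abbreviation table for address components.
STREET_ABBREVIATIONS = {"street": "st", "avenue": "ave", "suite": "ste"}


def normalize_name(raw: str) -> str:
    """Lowercase, remove punctuation, collapse whitespace, drop a trailing suffix."""
    # Removing punctuation outright turns "L.L.C." into "llc".
    cleaned = re.sub(r"[^\w\s]", "", raw.lower())
    tokens = cleaned.split()
    # Keep the last token if it is the whole name (a business literally named "Inc").
    if len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens = tokens[:-1]
    return " ".join(tokens)


def normalize_address(raw: str) -> str:
    """Lowercase, replace punctuation with spaces, abbreviate common components."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    return " ".join(STREET_ABBREVIATIONS.get(token, token) for token in cleaned.split())
```

With these rules, "Acme Widgets, L.L.C." and "ACME  WIDGETS LLC" both reduce to the same name, and "123 Main Street, Suite 4" and "123 MAIN ST. STE 4" reduce to the same address string.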
Matching
We then link records that describe the same business using a combination of license number, normalized name, and geocoded address. The result is one stable id that follows a business across renewals and status changes.
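The article does not spell out how the three signals are weighted, so the following Python sketch shows only one plausible arrangement: prefer the license number when a record has one, and otherwise fall back to the normalized name plus a rounded geocode. The `Record` type, the `match_key` function, the four-decimal rounding, and the hashing step are all assumptions for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    # Hypothetical record shape; field names are assumptions.
    license_number: Optional[str]
    normalized_name: str
    lat: float
    lon: float


def match_key(record: Record) -> str:
    """Derive a deterministic key so records for the same business share an id."""
    if record.license_number:
        # Assumption: a license number identifies the business on its own.
        basis = f"license:{record.license_number.strip().upper()}"
    else:
        # Assumption: rounding to 4 decimals (roughly 10 m) absorbs geocoder jitter.
        basis = (
            f"name_geo:{record.normalized_name}"
            f"|{round(record.lat, 4)}|{round(record.lon, 4)}"
        )
    return hashlib.sha256(basis.encode("utf-8")).hexdigest()[:16]
```

A known limitation of this simplified version is that a record with a license number and a record without one never share a key, even when they describe the same business. Linking those would need a second pass that compares name and address across the two groups.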
Deduplication
- Exact duplicate filings collapse into a single record.
- Repeat events on the same business become distinct, ordered events under one identity.
- When a source corrects a filing, we update the record and keep the original sourceUrl for auditing.
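The three rules above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a description of our pipeline: the `Filing` type and its fields are hypothetical, and it assumes a corrected filing arrives with the same `filing_id` as the one it replaces. The `source_url` field stands in for sourceUrl.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Filing:
    # Hypothetical filing shape; field names are assumptions.
    filing_id: str
    business_id: str
    event_type: str
    event_date: date
    status: str
    source_url: str


def deduplicate(filings: list[Filing]) -> dict[str, list[Filing]]:
    """Collapse duplicates, apply corrections, and order events per business."""
    by_filing: dict[str, Filing] = {}
    for filing in filings:
        existing = by_filing.get(filing.filing_id)
        if existing is None:
            by_filing[filing.filing_id] = filing
        elif filing != existing:
            # Correction: take the new field values, keep the original source URL.
            by_filing[filing.filing_id] = replace(filing, source_url=existing.source_url)
        # Otherwise it is an exact duplicate and the stored copy is kept as-is.

    timelines: dict[str, list[Filing]] = {}
    for filing in by_filing.values():
        timelines.setdefault(filing.business_id, []).append(filing)
    for events in timelines.values():
        # Distinct events stay separate and are ordered by date.
        events.sort(key=lambda f: f.event_date)
    return timelines
```

In this sketch, two identical copies of a filing produce one entry, an issuance and a later renewal remain two ordered entries under the same business, and a corrected filing overwrites the stored fields while the first-seen source URL is preserved for auditing.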