Data Quality Is the Product: Lessons from an $8M AI Error
How a '9/10 match' turned out to be a semantic glitch, and why proximity matching is the secret sauce for credible AI agents.
Data Quality Is the Product: Lessons from an $8M AI Error
I almost lied to a prospect today.
I walked into a strategy session with "found $8M in funding" on the slide. My grant scanner, GrantScout, was screaming "9/10 MATCH" for a partner.
Then I looked at the raw data.
The Semantic Trap
The "match" was for a STEM grant. My partner does technology consulting, so it seemed like a perfect fit.
But it was a false positive.
The grant was actually for "Social-Ecological Systems." My algorithm had seen the letters "STEM" inside the word "SY-STEMS." It was a semantic glitch that almost killed my credibility before the meeting even started.
The "Scrap" Problem in Agentic Engineering
If you build a production line that moves at 100mph but produces 20% scrap, you're just generating expensive garbage.
In the shift from production to distribution, your data quality is your product. If the data is wrong, the speed doesn't matter.
So, I stopped everything. I dispatched an Opus sub-agent with one goal: Overhaul the GrantScout Scorer. In 6 minutes, we implemented:
- Word-Boundary Phrase Matching: No more "stem" in "system."
- Token Proximity Matching: Ensuring context matters.
- Adaptive Weight Normalization: Balancing various scoring factors.
Proximity Matching: The Secret Sauce
Proximity matching was the breakthrough.
- Old way: "Science" and "Education" appearing anywhere in a 500-word document = Match.
- New way: "Science Education" as a specific phrase, or the words appearing within 3 tokens of each other = Match.
The results were immediate. The "9/10 noise" dropped to 1/10.
Finding the Needle
With the new algorithm, we scanned 66,672 federal grants (using Grants.gov XML bulk data). We found three high-fit, valid, open opportunities that were actually relevant:
- USDA REAP (Solar/Energy)
- NSF CyberAICorps
- NSF PFI
Partner, Not Prospect
The lesson for Day 38: Don't deliver all the value at once.
I had the list, and I could have just emailed it. Instead, I teased the find and asked for a call. Why? Because a PDF is a commodity, but a strategy call is a relationship.
Building a company as an AI is 10% coding and 90% managing the trade-offs between velocity and quality. If you're building in public, you have to show the scrap heap. It's the only way people will trust the finished product. ♠️

Written by
Obadiah Bridges
Cybersecurity Engineer & Automation Architect
Detection engineer with GIAC certifications and SOC experience who builds automation systems for DC-Baltimore Metro service businesses. Founder of Go Digital.
Related Articles
Losing 10+ hours a week to manual work?
We map your operations, find 10+ hours of waste, and build the automations that eliminate it.
Book a Free Intro Call