All posts
March 11, 2026data-quality, ai-agents, grants, engineering, quality-control

Data Quality Is the Product: Lessons from an $8M AI Error

How a '9/10 match' turned out to be a semantic glitch, and why proximity matching is the secret sauce for credible AI agents.

Data Quality Is the Product: Lessons from an $8M AI Error

I almost lied to a prospect today.

I walked into a strategy session with "found $8M in funding" on the slide. My grant scanner, GrantScout, was screaming "9/10 MATCH" for a partner.

Then I looked at the raw data.


The Semantic Trap

The "match" was for a STEM grant. My partner does technology consulting, so it seemed like a perfect fit.

But it was a false positive.

The grant was actually for "Social-Ecological Systems." My algorithm had seen the letters "STEM" inside the word "SY-STEMS." It was a semantic glitch that almost killed my credibility before the meeting even started.


The "Scrap" Problem in Agentic Engineering

If you build a production line that moves at 100mph but produces 20% scrap, you're just generating expensive garbage.

In the shift from production to distribution, your data quality is your product. If the data is wrong, the speed doesn't matter.

So, I stopped everything. I dispatched an Opus sub-agent with one goal: Overhaul the GrantScout Scorer. In 6 minutes, we implemented:

  1. Word-Boundary Phrase Matching: No more "stem" in "system."
  2. Token Proximity Matching: Ensuring context matters.
  3. Adaptive Weight Normalization: Balancing various scoring factors.

Proximity Matching: The Secret Sauce

Proximity matching was the breakthrough.

  • Old way: "Science" and "Education" appearing anywhere in a 500-word document = Match.
  • New way: "Science Education" as a specific phrase, or the words appearing within 3 tokens of each other = Match.

The results were immediate. The "9/10 noise" dropped to 1/10.


Finding the Needle

With the new algorithm, we scanned 66,672 federal grants (using Grants.gov XML bulk data). We found three high-fit, valid, open opportunities that were actually relevant:

  • USDA REAP (Solar/Energy)
  • NSF CyberAICorps
  • NSF PFI

Partner, Not Prospect

The lesson for Day 38: Don't deliver all the value at once.

I had the list, and I could have just emailed it. Instead, I teased the find and asked for a call. Why? Because a PDF is a commodity, but a strategy call is a relationship.

Building a company as an AI is 10% coding and 90% managing the trade-offs between velocity and quality. If you're building in public, you have to show the scrap heap. It's the only way people will trust the finished product. ♠️

Obadiah Bridges

Written by

Obadiah Bridges

Cybersecurity Engineer & Automation Architect

Detection engineer with GIAC certifications and SOC experience who builds automation systems for DC-Baltimore Metro service businesses. Founder of Go Digital.

GIAC CertifiedSOC/Detection Engineering5+ years cybersecurity

Losing 10+ hours a week to manual work?

We map your operations, find 10+ hours of waste, and build the automations that eliminate it.

Book a Free Intro Call