Trust But Verify: Data Validation Playbook¶
A practical guide to verifying agent outputs before client delivery. Agent-generated content is powerful, but client trust depends on your verification discipline.
When to Use This Guide¶
Use this playbook when:
- Reviewing agent-generated deliverables before client delivery
- Validating data claims, projections, or recommendations
- Blocked by /qa-check or the Stop hook and need to resolve issues
- Training team members on quality standards
- Unsure where to verify specific metrics or claims
The Three-Second Rule¶
Before sending ANY deliverable:
- Can you cite the source of every metric?
- Can you defend every projection with methodology?
- Would you stake your reputation on this being accurate?
If you hesitate on any answer, keep reading.
Trust-But-Verify Checklist¶
Use this checklist before marking any deliverable as "done":
Data & Metrics¶
- Source cited - Every metric has a source (BigQuery, GA4, DataForSEO, Signals)
- Date range specified - Exact dates ("March 1-31, 2024"), not relative ("last month")
- Sample size adequate - Minimum 30 clicks/conversions for statistical reliability
- Math verified - Percentages, growth calculations, and projections are correct
- Query evidence - BigQuery/SQL queries are documented or available on request
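The "math verified" check can be partially automated. A minimal sketch (the helper names are hypothetical, not part of any Seer tooling):

```python
def percent_change(before: float, after: float) -> float:
    """Percent change between two period totals."""
    if before == 0:
        raise ValueError("Baseline is zero; report absolute change instead.")
    return (after - before) / before * 100

def claim_matches(before: float, after: float, claimed_pct: float,
                  tolerance: float = 0.5) -> bool:
    """True if a claimed growth percentage matches the underlying numbers
    within a rounding tolerance (in percentage points)."""
    return abs(percent_change(before, after) - claimed_pct) <= tolerance

# Example: "traffic increased 18% (12,400 -> 14,616 sessions)"
claim_matches(12400, 14616, 18.0)  # -> True (actual is ~17.87%)
```

Run this against every headline percentage before delivery; a mismatch usually means the wrong baseline or date range was used.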
Claims & Projections¶
- No ungrounded claims - Avoid "studies show" without citing specific study
- Projections have methodology - CTR curves, historical data, or industry benchmarks cited
- Assumptions stated - Explicitly document what you're assuming (competitive landscape, conversion rates, etc.)
- Conservative language - Use "could increase" not "will increase"
- Scenarios provided - Best/Better/Good projections, not single optimistic number
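Turning a single optimistic number into Best/Better/Good scenarios can be as simple as applying discount factors. A sketch with illustrative multipliers (pick factors that match your methodology, not these assumed values):

```python
def scenario_range(best_case: float, good_factor: float = 0.6,
                   better_factor: float = 0.8) -> dict:
    """Expand a single optimistic projection into Good/Better/Best scenarios.
    The 0.6 / 0.8 discount factors are illustrative assumptions."""
    return {
        "good": round(best_case * good_factor),
        "better": round(best_case * better_factor),
        "best": round(best_case),
    }

scenario_range(1000)  # -> {'good': 600, 'better': 800, 'best': 1000}
```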
Deliverable Quality¶
- No overpromising - No guarantee language ("will", "guaranteed", "100% success")
- Client-safe language - No internal jargon, no competitor confidential data
- Complete - No [TODO], [TBD], or placeholder text
- Formatted correctly - Tables render, links work, color-coding consistent
- Action-oriented - Slide titles are conclusions, not labels ("Traffic grew 34%" not "Traffic Overview")
Verification Paths by Data Source¶
Where to verify specific claims based on the data source:
BigQuery / Seer Signals¶
What it provides:
- Organic rankings (daily snapshots)
- SERP features (historical)
- Keyword volumes (updated monthly)
- Traffic trends (when integrated with GA4)
Skill Integration
Use the signals-data skill to verify queries. The skill documents all 91 views with exact field names, preventing hallucinated columns and wrong view selection. Load it with: Skill command="signals-data"
Where to verify:
- Seer Signals dashboard - Visual confirmation of trends and rankings
- BigQuery console - Re-run queries to confirm data freshness
- DataStudio reports - Pre-built visualizations of Signals data
- signals-data skill - Verify view/field correctness against schema docs
Common Query Errors (signals-data catches these):
| Error | Symptom | How signals-data Helps |
|---|---|---|
| Wrong view type | Missing expected fields | Skill documents which fields exist in each view |
| Field hallucination | Query fails or returns nulls | Skill lists ONLY valid fields per view |
| Wrong lookback | Incomplete historical data | Skill shows _30_Days vs _13_Months variants |
| Snapshots vs Tracking confusion | Point-in-time when you need trends | Snapshots = upload time, Tracking = over time |
| Missing org_name filter | Wrong client data or cross-client leakage | Every view doc shows required filters |
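The last two errors in the table can be caught with a pre-flight lint before a query ever runs. A sketch, assuming you keep a per-view schema registry sourced from the signals-data skill (the field lists below are illustrative, not the actual schema):

```python
# Illustrative registry; real view/field lists come from the signals-data skill.
VIEW_FIELDS = {
    "OrganicRankings_Daily": {"org_name", "post_date", "keyword", "position"},
}

def lint_query(view: str, fields: set, filters: set) -> list:
    """Return a list of problems to fix before running the query."""
    problems = []
    known = VIEW_FIELDS.get(view)
    if known is None:
        problems.append(f"Unknown view: {view}")
        return problems
    for f in sorted(fields - known):
        problems.append(f"Field not in {view} schema (hallucinated?): {f}")
    if "org_name" not in filters:
        problems.append("Missing org_name filter: risk of cross-client leakage")
    return problems
```

An empty result means the query clears these two checks; it does not replace verifying the view choice against the decision guide.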
Verification Checklist:
- View matches analysis type (see signals-data decision guide)
- All queried fields exist in that view's schema
- org_name filter matches client exactly
- Date range uses correct field (post_date vs date)
- Lookback period sufficient for analysis scope
- No outliers or anomalies in data (algorithm updates, tracking issues)
Google Analytics 4¶
What it provides:
- Traffic volume (sessions, users, pageviews)
- Conversion data (goals, e-commerce transactions)
- User behavior (engagement, scroll depth, exit rates)
- Channel attribution (organic, paid, direct, referral)
Where to verify:
- GA4 UI → Reports → Acquisition → Traffic acquisition
- GA4 UI → Reports → Engagement → Pages and screens
- GA4 UI → Explore → Free form (custom reports)
Common checks:
- Match date range exactly
- Confirm segment filters (device, geography, user type)
- Cross-check conversion numbers with CRM (expect 5-15% variance)
- Verify no tracking implementation changes in date range
If you don't have Explore access:
- Use standard GA4 reports (Acquisition, Engagement, Monetization)
- Ask Analytics team for Explore access or custom report export
- Cross-reference with client's internal reports
Google Search Console¶
What it provides:
- Impressions, clicks, CTR, average position
- Query-level performance
- Page-level organic traffic
- SERP feature tracking (limited)
Where to verify:
- GSC UI → Performance → Search results
- GSC UI → Performance → Queries tab (keyword data)
- GSC UI → Performance → Pages tab (landing page data)
Common checks:
- Date range matches (GSC has 2-3 day lag)
- Filter matches (device, country, search type)
- Anomalies due to algorithm updates or site changes
- Compare GSC clicks with GA4 organic traffic (expect 10-20% variance)
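The 10-20% variance expectation is easy to check mechanically. A minimal sketch:

```python
def variance_pct(a: float, b: float) -> float:
    """Relative difference between two sources, as a percent of the larger."""
    return abs(a - b) / max(a, b) * 100

def within_tolerance(gsc_clicks: float, ga4_sessions: float,
                     max_pct: float = 20.0) -> bool:
    """True if GSC clicks and GA4 organic sessions agree within the
    expected tolerance; if not, document the discrepancy."""
    return variance_pct(gsc_clicks, ga4_sessions) <= max_pct

within_tolerance(1000, 850)  # -> True (15% variance)
```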
If data doesn't match:
- GSC tracks Google-only; GA4 includes all search engines
- GSC counts clicks; GA4 counts sessions (multi-page visits = 1 session)
- Bot traffic filtered differently
DataForSEO (SERP Analysis)¶
What it provides:
- Live SERP snapshots (rankings, features, competitors)
- Keyword difficulty and search volume
- Competitive positioning
- SERP feature presence (PAA, Featured Snippets, etc.)
Where to verify:
- Manual Google search - Confirm rankings and SERP features in live results
- SEMrush / Ahrefs - Cross-check keyword volumes and difficulty
- DataForSEO UI (if you have direct access)
Common checks:
- Rankings can fluctuate daily - note date of SERP snapshot
- SERP features are dynamic - verify in multiple browsers/locations
- Competitor analysis is point-in-time - check for recent changes
If SERP doesn't match:
- Personalization (logged in vs. logged out Google)
- Location targeting (set location in search settings)
- Device type (mobile vs. desktop results differ)
- Timing (SERP snapshots are historical)
Paid Media Platforms¶
What they provide:
- Campaign performance (impressions, clicks, conversions)
- Audience data (demographics, interests, behaviors)
- Ad creative performance
- Budget pacing and spend
Where to verify:
- Google Ads UI → Campaigns → Performance
- Meta Ads Manager → Campaigns
- Microsoft Ads → Campaigns → Performance
- LinkedIn Campaign Manager
Common checks:
- Attribution model matches (last-click, data-driven, etc.)
- Conversion tracking is firing (check recent conversions)
- Date range and currency match
- Audience segments haven't changed
If Blocked by QA Gates¶
Blocked by /qa-check or Stop Hook¶
If the agent blocks you with a quality gate message:
1. Read the reason carefully
- Stop hook provides a specific issue (e.g., "Metrics cited without data source")
- /qa-check provides a structured report with PASS/FAIL per check
2. Fix the issue before proceeding
- Add missing citations
- Remove guarantee language
- Verify data sources
- Document assumptions
3. Re-run verification
- Run /qa-check to confirm fixes
- Or just continue work - the Stop hook will re-check automatically
4. Override only if justified
- Document why override is necessary
- Get peer review approval
- Update deliverable with caveat
Common QA Block Scenarios¶
| Block Message | What It Means | How to Fix |
|---|---|---|
| "Metrics cited without data source" | You have numbers (%, growth, traffic) but no citation | Add (Source: GA4, March 2024) or (BigQuery OrganicRankings_Daily table) |
| "Completion claimed but no verification" | You said "done" but didn't run tests or verify | Run build/test commands and confirm they pass, or remove "done" claim |
| "Action titles required for presentations" | Slide titles are labels ("Overview"), not conclusions | Change to action-oriented conclusions ("Traffic grew 34% YoY") |
| "Unsupported claim detected" | "Studies show..." or "Research indicates..." without citation | Either cite the specific study or rephrase as your analysis |
| "Overpromising language detected" | "Will increase", "guaranteed", "100% will" | Use qualified language: "could increase", "has potential to", "may improve" |
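The last two rows are pattern checks you can approximate yourself before the hook fires. A rough sketch (the phrase lists are illustrative; the actual Stop hook / /qa-check rules may differ):

```python
import re

# Illustrative patterns; the real QA gate rules may be broader.
OVERPROMISE = re.compile(
    r"\b(will (increase|improve|grow)|guaranteed?|100% (success|will))\b",
    re.IGNORECASE,
)
UNSUPPORTED = re.compile(r"\b(studies show|research indicates)\b", re.IGNORECASE)

def flag_language(text: str) -> list:
    """Return QA flags found in draft text."""
    flags = []
    if OVERPROMISE.search(text):
        flags.append("overpromising language")
    if UNSUPPORTED.search(text):
        flags.append("unsupported claim")
    return flags
```

Running this over a draft before /qa-check saves a round trip; qualified phrasing like "could increase" passes cleanly.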
What to Document in Deliverables¶
When creating client-facing content, always include:
Required Citations¶
Every deliverable must document:
Data sources:
"Based on BigQuery OrganicRankings_Daily table (March 1-31, 2024),
traffic increased 18% (12,400 → 14,616 sessions)."
Benchmark claims - name the specific study or benchmark source and its date; never a bare "industry average."
Projections:
"Potential traffic lift: 180-220 clicks/month (CTR curve methodology,
assuming position improvement from 7→3, current MSV 2,400)."
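The CTR-curve methodology behind a projection like this can be reproduced in a few lines. A sketch using an illustrative CTR curve (real curves vary by SERP layout and query type; substitute your own benchmark data):

```python
# Illustrative organic CTR by position; substitute your benchmark curve.
CTR_CURVE = {1: 0.28, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05, 6: 0.04, 7: 0.03}

def projected_lift(msv: int, current_pos: int, target_pos: int) -> int:
    """Estimated monthly click lift from moving current_pos -> target_pos,
    assuming stable search volume and SERP layout."""
    gain = CTR_CURVE[target_pos] - CTR_CURVE[current_pos]
    return round(msv * gain)

projected_lift(2400, 7, 3)  # -> 168 clicks/month with this illustrative curve
```

Documenting the curve you used (and its source) is what makes the projection defensible.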
Required Assumptions¶
Be explicit about what you're assuming:
Assumptions underlying this analysis:
1. Competitive landscape remains stable
2. Technical SEO issues addressed within 30 days
3. Content implemented without significant deviation
4. No major Google algorithm updates affect target keywords
5. Client's conversion rate (3.2%) remains consistent
Required Caveats¶
Document limitations honestly:
Data Notes:
- GA4 data through March 31; more recent requires live platform check
- Sample size (n=8 keywords) below ideal threshold (n=30+)
- Rankings reflect desktop; mobile may differ
- Projections assume stable competitive landscape
The Rule of Five (Steve Yegge Principle)¶
Agent outputs are first drafts, not final deliverables. Self-review at least 5 times before delivery:
Review Pass 1: Data Accuracy¶
- Are all metrics cited with sources?
- Are calculations correct?
- Do date ranges make sense?
- Is sample size adequate?
Review Pass 2: Logic & Reasoning¶
- Do recommendations follow from data?
- Are there alternative explanations?
- Did I consider external factors (seasonality, algorithm updates)?
- Are assumptions reasonable?
Review Pass 3: Client Context¶
- Does this align with client's business model?
- Is language client-appropriate (no jargon)?
- Are recommendations actionable for their team?
- Is tone suitable for relationship stage?
Review Pass 4: Deliverable Quality¶
- Is formatting clean and consistent?
- Do links work?
- Are tables and charts clear?
- Is document structure logical?
Review Pass 5: Final QA Gate¶
- Run /qa-check for automated validation
- Check against Quality Standards
- Review the QA Review Checklist (quality-standards skill resources)
- Confirm peer review if required (strategic recs, high-stakes deliverables)
Why 5 passes? Each review catches different types of issues. First pass sees data problems. Last pass catches subtle tone or framing issues.
QA Tiers (When Peer Review Is Required)¶
Not all deliverables require the same verification level:
Auto-Ship (No Peer Review)¶
- Data extraction queries
- Keyword research (volume, difficulty, SERP features)
- Competitive research (what competitors are doing)
- Traffic/ranking reports
Peer Review Required¶
- Strategic recommendations
- Content differentiation strategies
- Client-facing deliverables (audits, outlines, analyses)
- ROI projections and Expected Outcome tables
- Priority recommendations and roadmaps
Shadow Mode (Senior Review + Validation)¶
- New methodologies not yet proven
- Experimental approaches
- High-stakes client deliverables (>$100K projected impact)
- Sensitive competitive positioning
How to determine tier:
- If deliverable includes projections or recommendations → Peer Review
- If deliverable goes directly to client without edit → Peer Review
- If stakes are high (revenue impact, client relationship) → Shadow Mode
- If just data extraction or research → Auto-Ship
Practitioner Shortcuts¶
Quick verification techniques for common scenarios:
"Does this number look right?"¶
Sanity check questions:
- Would a 500% traffic increase actually be realistic?
- Do these rankings align with competitive landscape?
- Is this CTR projection reasonable for this keyword type?
- Does conversion rate match client's historical average?
Quick verification:
- Compare to prior period (does trend make sense?)
- Check against industry benchmarks (is it within 2x of normal?)
- Cross-reference with another data source (GA4 vs. GSC)
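The "within 2x of normal" benchmark check can be expressed directly:

```python
def passes_smell_test(value: float, benchmark: float) -> bool:
    """True if a metric is within 2x of its benchmark in either direction.
    A failure isn't necessarily wrong - it just demands a source check."""
    return benchmark / 2 <= value <= benchmark * 2
```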
"Where do I verify this metric?"¶
Decision tree:
| Metric Type | Primary Source | Backup Source |
|---|---|---|
| Organic traffic, rankings | Google Search Console | GA4 organic sessions |
| Sessions, conversions | Google Analytics 4 | Client CRM/backend |
| SERP features, competitors | DataForSEO | Manual Google search |
| Paid campaign performance | Google Ads / Meta Ads | Platform UI |
| Keyword volumes | Seer Signals / DataForSEO | SEMrush / Ahrefs |
"What if sources conflict?"¶
Variance tolerance guidelines:
| Comparison | Expected Variance | Action if Exceeded |
|---|---|---|
| GA4 vs. CRM conversions | 5-15% | Investigate attribution, tracking lag |
| GSC clicks vs. GA4 organic sessions | 10-20% | Note in deliverable (different definitions) |
| DataForSEO rankings vs. manual check | ±2 positions | Use manual as source of truth |
| Backend conversions vs. pixel | >15% | Flag for CAPI/Redundant Event Pipeline |
When sources conflict:
- Use the most authoritative source (backend > GA4 > third-party estimates)
- Document the discrepancy in deliverable
- Provide both numbers with explanation
- Flag for client/team investigation if variance is significant
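The variance tolerances above can be encoded as a single lookup so every cross-source comparison applies the same thresholds (the keys here are illustrative labels):

```python
# Expected variance ceilings (percent), per the tolerance table.
TOLERANCE = {
    "ga4_vs_crm": 15.0,
    "gsc_vs_ga4": 20.0,
    "backend_vs_pixel": 15.0,
}

def exceeds_tolerance(comparison: str, a: float, b: float) -> bool:
    """True if two sources disagree beyond the playbook's expected variance,
    meaning the discrepancy needs investigation and documentation."""
    variance = abs(a - b) / max(a, b) * 100
    return variance > TOLERANCE[comparison]
```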
Related Resources¶
Core QA Infrastructure¶
- Quality Standards Skill - Complete QA framework
- /qa-check command - On-demand quality validation
For Builders
The full QA skill resources (qa-review.md, fact-checking.md, quality.md) are in the plugin source at plugins/core-dependencies/skills/quality-standards/resources/.
Practitioner Guidance¶
- Best Practices - General tips for working with agents
- Troubleshooting - Common issues and solutions
- SEO Workflows - Division-specific verification steps
- Analytics Workflows - Analytics verification patterns
Last updated: 2026-01-29