
Trust But Verify: Data Validation Playbook

A practical guide to verifying agent outputs before client delivery. Agent-generated content is powerful, but client trust depends on your verification discipline.


When to Use This Guide

Use this playbook when:

  • Reviewing agent-generated deliverables before client delivery
  • Validating data claims, projections, or recommendations
  • Resolving a block from /qa-check or the Stop hook
  • Training team members on quality standards
  • Deciding where to verify specific metrics or claims

The Three-Second Rule

Before sending ANY deliverable:

  1. Can you cite the source of every metric?
  2. Can you defend every projection with methodology?
  3. Would you stake your reputation on this being accurate?

If you hesitate on any answer, keep reading.


Trust-But-Verify Checklist

Use this checklist before marking any deliverable as "done":

Data & Metrics

  • Source cited - Every metric has a source (BigQuery, GA4, DataForSEO, Signals)
  • Date range specified - Exact dates ("March 1-31, 2024"), not relative ("last month")
  • Sample size adequate - Minimum of 30 clicks/conversions for statistical reliability
  • Math verified - Percentages, growth calculations, and projections are correct
  • Query evidence - BigQuery/SQL queries are documented or available on request
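The "math verified" item can be reduced to a quick recomputation. This is a minimal sketch (the function names and 0.5-point tolerance are illustrative, not part of any official tooling), using the 12,400 → 14,616 sessions example cited later in this guide:

```python
def growth_pct(before: int, after: int) -> float:
    """Percentage change between two period totals."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (after - before) / before * 100

def matches_cited(before: int, after: int, cited_pct: float, tol: float = 0.5) -> bool:
    """True if the cited percentage is within `tol` points of the recomputed value."""
    return abs(growth_pct(before, after) - cited_pct) <= tol

# 12,400 -> 14,616 sessions, cited as "+18%"
print(round(growth_pct(12_400, 14_616), 1))  # 17.9
print(matches_cited(12_400, 14_616, 18.0))   # True
```

Thirty seconds of recomputation like this catches transposed digits and inverted baselines before a client does.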

Claims & Projections

  • No ungrounded claims - Avoid "studies show" without citing specific study
  • Projections have methodology - CTR curves, historical data, or industry benchmarks cited
  • Assumptions stated - Explicitly document what you're assuming (competitive landscape, conversion rates, etc.)
  • Conservative language - Use "could increase" not "will increase"
  • Scenarios provided - Good/Better/Best projections, not a single optimistic number

Deliverable Quality

  • No overpromising - No guarantee language ("will", "guaranteed", "100% success")
  • Client-safe language - No internal jargon, no competitor confidential data
  • Complete - No [TODO], [TBD], or placeholder text
  • Formatted correctly - Tables render, links work, color-coding consistent
  • Action-oriented - Slide titles are conclusions, not labels ("Traffic grew 34%" not "Traffic Overview")

Verification Paths by Data Source

Where to verify specific claims based on the data source:

BigQuery / Seer Signals

What it provides:

  • Organic rankings (daily snapshots)
  • SERP features (historical)
  • Keyword volumes (updated monthly)
  • Traffic trends (when integrated with GA4)

Skill Integration

Use the signals-data skill to verify queries. The skill documents all 91 views with exact field names, preventing hallucinated columns and wrong view selection. Load it with: Skill command="signals-data"

Where to verify:

  • Seer Signals dashboard - Visual confirmation of trends and rankings
  • BigQuery console - Re-run queries to confirm data freshness
  • DataStudio reports - Pre-built visualizations of Signals data
  • signals-data skill - Verify view/field correctness against schema docs

Common Query Errors (signals-data catches these):

| Error | Symptom | How signals-data Helps |
|---|---|---|
| Wrong view type | Missing expected fields | Skill documents which fields exist in each view |
| Field hallucination | Query fails or returns nulls | Skill lists ONLY valid fields per view |
| Wrong lookback | Incomplete historical data | Skill shows _30_Days vs _13_Months variants |
| Snapshots vs Tracking confusion | Point-in-time data when you need trends | Snapshots = upload time, Tracking = over time |
| Missing org_name filter | Wrong client data or cross-client leakage | Every view doc shows required filters |

Verification Checklist:

  • View matches analysis type (see signals-data decision guide)
  • All queried fields exist in that view's schema
  • org_name filter matches client exactly
  • Date range uses correct field (post_date vs date)
  • Lookback period sufficient for analysis scope
  • No outliers or anomalies in data (algorithm updates, tracking issues)
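The org_name and date-field items on this checklist can be caught with a cheap string-level lint before a query ever runs. A sketch under stated assumptions: the lint rules, function name, and client names ('Acme', 'Globex') are hypothetical, and the signals-data skill remains the source of truth for view schemas:

```python
import re

def lint_signals_query(sql: str, client: str) -> list[str]:
    """String-level pre-flight checks for a Signals query (illustrative only)."""
    issues = []
    if not re.search(r"\borg_name\s*=", sql, re.IGNORECASE):
        issues.append("missing org_name filter (cross-client leakage risk)")
    elif client not in sql:
        issues.append(f"org_name filter does not match client '{client}'")
    if not re.search(r"\b(post_date|date)\b", sql, re.IGNORECASE):
        issues.append("no date field referenced -- check post_date vs date")
    return issues

sql = ("SELECT keyword, rank FROM OrganicRankings_Daily "
       "WHERE org_name = 'Acme' AND post_date >= '2024-03-01'")
print(lint_signals_query(sql, "Acme"))    # []
print(lint_signals_query(sql, "Globex"))  # flags the org_name mismatch
```

A lint like this only checks strings; it cannot confirm that a field actually exists in a view — that is what the schema docs in signals-data are for.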

Google Analytics 4

What it provides:

  • Traffic volume (sessions, users, pageviews)
  • Conversion data (goals, e-commerce transactions)
  • User behavior (engagement, scroll depth, exit rates)
  • Channel attribution (organic, paid, direct, referral)

Where to verify:

  • GA4 UI → Reports → Acquisition → Traffic acquisition
  • GA4 UI → Reports → Engagement → Pages and screens
  • GA4 UI → Explore → Free form (custom reports)

Common checks:

  • Match date range exactly
  • Confirm segment filters (device, geography, user type)
  • Cross-check conversion numbers with CRM (expect 5-15% variance)
  • Verify no tracking implementation changes in date range

If you don't have Explore access:

  • Use standard GA4 reports (Acquisition, Engagement, Monetization)
  • Ask Analytics team for Explore access or custom report export
  • Cross-reference with client's internal reports

Google Search Console

What it provides:

  • Impressions, clicks, CTR, average position
  • Query-level performance
  • Page-level organic traffic
  • SERP feature tracking (limited)

Where to verify:

  • GSC UI → Performance → Search results
  • GSC UI → Performance → Queries tab (keyword data)
  • GSC UI → Performance → Pages tab (landing page data)

Common checks:

  • Date range matches (GSC data has a 2-3 day lag)
  • Filter matches (device, country, search type)
  • Anomalies due to algorithm updates or site changes
  • Compare GSC clicks with GA4 organic traffic (expect 10-20% variance)

If data doesn't match:

  • GSC tracks Google-only; GA4 includes all search engines
  • GSC counts clicks; GA4 counts sessions (multi-page visits = 1 session)
  • Bot traffic filtered differently

DataForSEO (SERP Analysis)

What it provides:

  • Live SERP snapshots (rankings, features, competitors)
  • Keyword difficulty and search volume
  • Competitive positioning
  • SERP feature presence (PAA, Featured Snippets, etc.)

Where to verify:

  • Manual Google search - Confirm rankings and SERP features in live results
  • SEMrush / Ahrefs - Cross-check keyword volumes and difficulty
  • DataForSEO UI (if you have direct access)

Common checks:

  • Rankings can fluctuate daily - note date of SERP snapshot
  • SERP features are dynamic - verify in multiple browsers/locations
  • Competitor analysis is point-in-time - check for recent changes

If SERP doesn't match:

  • Personalization (logged in vs. logged out Google)
  • Location targeting (set location in search settings)
  • Device type (mobile vs. desktop results differ)
  • Timing (SERP snapshots are historical)

Paid Media Platforms

What they provide:

  • Campaign performance (impressions, clicks, conversions)
  • Audience data (demographics, interests, behaviors)
  • Ad creative performance
  • Budget pacing and spend

Where to verify:

  • Google Ads UI → Campaigns → Performance
  • Meta Ads Manager → Campaigns
  • Microsoft Ads → Campaigns → Performance
  • LinkedIn Campaign Manager

Common checks:

  • Attribution model matches (last-click, data-driven, etc.)
  • Conversion tracking is firing (check recent conversions)
  • Date range and currency match
  • Audience segments haven't changed

If Blocked by QA Gates

Blocked by /qa-check or Stop Hook

If the agent blocks you with a quality gate message:

1. Read the reason carefully

  • Stop hook provides specific issue (e.g., "Metrics cited without data source")
  • /qa-check provides structured report with PASS/FAIL per check

2. Fix the issue before proceeding

  • Add missing citations
  • Remove guarantee language
  • Verify data sources
  • Document assumptions

3. Re-run verification

  • /qa-check to confirm fixes
  • Or just continue work - Stop hook will re-check automatically

4. Override only if justified

  • Document why override is necessary
  • Get peer review approval
  • Update deliverable with caveat

Common QA Block Scenarios

| Block Message | What It Means | How to Fix |
|---|---|---|
| "Metrics cited without data source" | You have numbers (%, growth, traffic) but no citation | Add (Source: GA4, March 2024) or (BigQuery OrganicRankings_Daily table) |
| "Completion claimed but no verification" | You said "done" but didn't run tests or verify | Run build/test commands and confirm they pass, or remove the "done" claim |
| "Action titles required for presentations" | Slide titles are labels ("Overview"), not conclusions | Change to action-oriented conclusions ("Traffic grew 34% YoY") |
| "Unsupported claim detected" | "Studies show..." or "Research indicates..." without citation | Either cite the specific study or rephrase as your own analysis |
| "Overpromising language detected" | "Will increase", "guaranteed", "100% will" | Use qualified language: "could increase", "has potential to", "may improve" |
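A first-pass scan for risky language is easy to run locally before invoking the gate. The patterns below are illustrative placeholders, not the actual /qa-check rules:

```python
import re

# Illustrative patterns only -- the real rules live in the /qa-check gate.
FLAGS = {
    "overpromising": r"\b(will increase|guaranteed?|100% (?:will|success))\b",
    "unsupported claim": r"\b(studies show|research indicates)\b",
    "placeholder": r"\[(TODO|TBD)\]",
}

def qa_scan(text: str) -> list[str]:
    """Return the categories of risky language found in a draft."""
    return [name for name, pat in FLAGS.items()
            if re.search(pat, text, re.IGNORECASE)]

print(qa_scan("Traffic will increase by 40%; studies show this works."))
# ['overpromising', 'unsupported claim']
print(qa_scan("Traffic could increase 30-40% (Source: GA4, March 2024)."))
# []
```

Pattern matching only flags wording; it cannot judge whether a claim is actually supported — that part stays manual.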

What to Document in Deliverables

When creating client-facing content, always include:

Required Citations

Every deliverable must document:

Data sources:

"Based on BigQuery OrganicRankings_Daily table (March 1-31, 2024), 
traffic increased 18% (12,400 → 14,616 sessions)."

Benchmark claims:

"Position 1 CTR averages 27.6% (Advanced Web Ranking, 2024 CTR Study)."

Projections:

"Potential traffic lift: 180-220 clicks/month (CTR curve methodology, 
assuming position improvement from 7→3, current MSV 2,400)."
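The projection above can be reproduced with a small CTR-curve calculation. The curve values here are illustrative placeholders (only the 27.6% position-1 figure comes from the CTR study cited above); substitute your own benchmark or GSC data:

```python
# Illustrative CTR curve -- replace with values from your cited study or GSC data.
CTR_BY_POSITION = {1: 0.276, 2: 0.157, 3: 0.110, 5: 0.063, 7: 0.035, 10: 0.025}

def projected_lift(msv: int, pos_from: int, pos_to: int) -> int:
    """Monthly click lift if a keyword moves between positions, assuming
    monthly search volume (msv) and the CTR curve both hold steady."""
    delta = CTR_BY_POSITION[pos_to] - CTR_BY_POSITION[pos_from]
    return round(msv * delta)

# The worked example above: MSV 2,400, position 7 -> 3
print(projected_lift(2_400, 7, 3))  # 180
```

Showing the curve and the arithmetic in the deliverable is exactly what "projections have methodology" means: a reviewer can swap in their own CTR assumptions and re-derive the range.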

Required Assumptions

Be explicit about what you're assuming:

Assumptions underlying this analysis:
1. Competitive landscape remains stable
2. Technical SEO issues addressed within 30 days
3. Content implemented without significant deviation
4. No major Google algorithm updates affect target keywords
5. Client's conversion rate (3.2%) remains consistent

Required Caveats

Document limitations honestly:

Data Notes:
- GA4 data through March 31; more recent requires live platform check
- Sample size (n=8 keywords) below ideal threshold (n=30+)
- Rankings reflect desktop; mobile may differ
- Projections assume stable competitive landscape

The Rule of Five (Steve Yegge Principle)

Agent outputs are first drafts, not final deliverables. Self-review at least 5 times before delivery:

Review Pass 1: Data Accuracy

  • Are all metrics cited with sources?
  • Are calculations correct?
  • Do date ranges make sense?
  • Is sample size adequate?

Review Pass 2: Logic & Reasoning

  • Do recommendations follow from data?
  • Are there alternative explanations?
  • Did I consider external factors (seasonality, algorithm updates)?
  • Are assumptions reasonable?

Review Pass 3: Client Context

  • Does this align with client's business model?
  • Is language client-appropriate (no jargon)?
  • Are recommendations actionable for their team?
  • Is tone suitable for relationship stage?

Review Pass 4: Deliverable Quality

  • Is formatting clean and consistent?
  • Do links work?
  • Are tables and charts clear?
  • Is document structure logical?

Review Pass 5: Final QA Gate

  • Run /qa-check for automated validation
  • Check against Quality Standards
  • Review the QA Review Checklist (quality-standards skill resources)
  • Confirm peer review if required (strategic recs, high-stakes deliverables)

Why 5 passes? Each review catches different types of issues. First pass sees data problems. Last pass catches subtle tone or framing issues.


QA Tiers (When Peer Review Is Required)

Not all deliverables require the same verification level:

Auto-Ship (No Peer Review)

  • Data extraction queries
  • Keyword research (volume, difficulty, SERP features)
  • Competitive research (what competitors are doing)
  • Traffic/ranking reports

Peer Review Required

  • Strategic recommendations
  • Content differentiation strategies
  • Client-facing deliverables (audits, outlines, analyses)
  • ROI projections and Expected Outcome tables
  • Priority recommendations and roadmaps

Shadow Mode (Senior Review + Validation)

  • New methodologies not yet proven
  • Experimental approaches
  • High-stakes client deliverables (>$100K projected impact)
  • Sensitive competitive positioning

How to determine tier:

  • If deliverable includes projections or recommendations → Peer Review
  • If deliverable goes directly to client without edit → Peer Review
  • If stakes are high (revenue impact, client relationship) → Shadow Mode
  • If just data extraction or research → Auto-Ship
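The tier rules above collapse into a tiny decision helper — a sketch of the logic for training purposes, not an official tool:

```python
def qa_tier(has_projections: bool, direct_to_client: bool, high_stakes: bool) -> str:
    """Map the decision rules above to a QA tier; high stakes outranks the rest."""
    if high_stakes:
        return "Shadow Mode"
    if has_projections or direct_to_client:
        return "Peer Review"
    return "Auto-Ship"

print(qa_tier(False, False, False))  # Auto-Ship (data extraction / research)
print(qa_tier(True, False, False))   # Peer Review
print(qa_tier(True, True, True))     # Shadow Mode
```

Note the ordering: a deliverable with projections that is also high-stakes lands in Shadow Mode, not Peer Review.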

Practitioner Shortcuts

Quick verification techniques for common scenarios:

"Does this number look right?"

Sanity check questions:

  • Would a 500% traffic increase actually be realistic?
  • Do these rankings align with competitive landscape?
  • Is this CTR projection reasonable for this keyword type?
  • Does conversion rate match client's historical average?

Quick verification:

  • Compare to prior period (does trend make sense?)
  • Check against industry benchmarks (is it within 2x of normal?)
  • Cross-reference with another data source (GA4 vs. GSC)

"Where do I verify this metric?"

Decision tree:

| Metric Type | Primary Source | Backup Source |
|---|---|---|
| Organic traffic, rankings | Google Search Console | GA4 organic sessions |
| Sessions, conversions | Google Analytics 4 | Client CRM/backend |
| SERP features, competitors | DataForSEO | Manual Google search |
| Paid campaign performance | Google Ads / Meta Ads | Platform UI |
| Keyword volumes | Seer Signals / DataForSEO | SEMrush / Ahrefs |

"What if sources conflict?"

Variance tolerance guidelines:

| Comparison | Expected Variance | Action if Exceeded |
|---|---|---|
| GA4 vs. CRM conversions | 5-15% | Investigate attribution, tracking lag |
| GSC clicks vs. GA4 organic sessions | 10-20% | Note in deliverable (different definitions) |
| DataForSEO rankings vs. manual check | ±2 positions | Use manual check as source of truth |
| Backend conversions vs. pixel | >15% | Flag for CAPI/Redundant Event Pipeline |
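The percentage bands above are mechanical to check once you agree on a definition of variance. A minimal sketch, assuming variance is measured as a fraction of the larger value (the function names and band keys are illustrative):

```python
# Max acceptable variance per comparison, from the tolerance guidelines above.
MAX_VARIANCE = {
    "ga4_vs_crm": 0.15,   # GA4 vs. CRM conversions (5-15% band)
    "gsc_vs_ga4": 0.20,   # GSC clicks vs. GA4 organic sessions (10-20% band)
}

def variance(a: float, b: float) -> float:
    """Relative difference between two sources, as a fraction of the larger value."""
    return abs(a - b) / max(a, b)

def within_tolerance(comparison: str, a: float, b: float) -> bool:
    """True if the two numbers agree within the expected band; if not,
    document the discrepancy and investigate before shipping."""
    return variance(a, b) <= MAX_VARIANCE[comparison]

# GA4 reports 212 conversions, the CRM reports 240: ~11.7% variance -- acceptable
print(within_tolerance("ga4_vs_crm", 212, 240))  # True
```

An in-band variance still belongs in the deliverable's data notes; the check only tells you whether it needs investigation first.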

When sources conflict:

  1. Use the most authoritative source (backend > GA4 > third-party estimates)
  2. Document the discrepancy in deliverable
  3. Provide both numbers with explanation
  4. Flag for client/team investigation if variance is significant

Core QA Infrastructure

For Builders

The full QA skill resources (qa-review.md, fact-checking.md, quality.md) are in the plugin source at plugins/core-dependencies/skills/quality-standards/resources/.


Last updated: 2026-01-29