Trust But Verify: Data Validation Playbook¶
A practical guide to verifying agent outputs before client delivery. Agent-generated content is powerful, but client trust depends on your verification discipline.
When to Use This Guide¶
Use this playbook when:
- Reviewing agent-generated deliverables before client delivery
- Validating data claims, projections, or recommendations
- Blocked by /qa-check or the Stop hook and need to resolve issues
- Training team members on quality standards
- Unsure where to verify specific metrics or claims
The Three-Second Rule¶
Before sending ANY deliverable:
- Can you cite the source of every metric?
- Can you defend every projection with methodology?
- Would you stake your reputation on this being accurate?
If you hesitate on any answer, keep reading.
Trust-But-Verify Checklist¶
Use this checklist before marking any deliverable as "done":
Data & Metrics¶
- Source cited - Every metric has a source (BigQuery, GA4, DataForSEO, Signals)
- Date range specified - Exact dates ("March 1-31, 2024"), not relative ("last month")
- Sample size adequate - Minimum 30 clicks/conversions for statistical reliability
- Math verified - Percentages, growth calculations, and projections are correct
- Query evidence - BigQuery/SQL queries are documented or available on request
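The "math verified" check can be partially automated. A minimal sketch (the helper names are hypothetical, not part of any Seer tooling):

```python
def percent_change(before: float, after: float) -> float:
    """Percent change between two period totals."""
    if before == 0:
        raise ValueError("Baseline is zero; report absolute change instead.")
    return (after - before) / before * 100

def claim_matches(before: float, after: float, claimed_pct: float,
                  tolerance: float = 0.5) -> bool:
    """True if a claimed growth percentage matches the underlying numbers
    within a rounding tolerance (in percentage points)."""
    return abs(percent_change(before, after) - claimed_pct) <= tolerance

# Example: "traffic increased 18% (12,400 -> 14,616 sessions)"
claim_matches(12400, 14616, 18.0)  # -> True (actual is ~17.87%)
```

Run this against every headline percentage before delivery; a mismatch usually means the wrong baseline or date range was used.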
Claims & Projections¶
- No ungrounded claims - Avoid "studies show" without citing specific study
- Projections have methodology - CTR curves, historical data, or industry benchmarks cited
- Assumptions stated - Explicitly document what you're assuming (competitive landscape, conversion rates, etc.)
- Conservative language - Use "could increase" not "will increase"
- Scenarios provided - Best/Better/Good projections, not single optimistic number
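Turning a single optimistic number into Best/Better/Good scenarios can be as simple as applying discount factors. A sketch with illustrative multipliers (pick factors that match your methodology, not these assumed values):

```python
def scenario_range(best_case: float, good_factor: float = 0.6,
                   better_factor: float = 0.8) -> dict:
    """Expand a single optimistic projection into Good/Better/Best scenarios.
    The 0.6 / 0.8 discount factors are illustrative assumptions."""
    return {
        "good": round(best_case * good_factor),
        "better": round(best_case * better_factor),
        "best": round(best_case),
    }

scenario_range(1000)  # -> {'good': 600, 'better': 800, 'best': 1000}
```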
Deliverable Quality¶
- No overpromising - No guarantee language ("will", "guaranteed", "100% success")
- Client-safe language - No internal jargon, no competitor confidential data
- Complete - No [TODO], [TBD], or placeholder text
- Formatted correctly - Tables render, links work, color-coding consistent
- Action-oriented - Slide titles are conclusions, not labels ("Traffic grew 34%" not "Traffic Overview")
Verification Paths by Data Source¶
Where to verify specific claims based on the data source:
BigQuery / Seer Signals¶
What it provides:
- Organic rankings (daily snapshots)
- SERP features (historical)
- Keyword volumes (updated monthly)
- Traffic trends (when integrated with GA4)
Skill Integration
Use the signals-data skill to verify queries. The skill documents all 91 views with exact field names, preventing hallucinated columns and wrong view selection. Load it with: Skill command="signals-data"
Where to verify:
- Seer Signals dashboard - Visual confirmation of trends and rankings
- BigQuery console - Re-run queries to confirm data freshness
- DataStudio reports - Pre-built visualizations of Signals data
- signals-data skill - Verify view/field correctness against schema docs
Common Query Errors (signals-data catches these):
| Error | Symptom | How signals-data Helps |
|---|---|---|
| Wrong view type | Missing expected fields | Skill documents which fields exist in each view |
| Field hallucination | Query fails or returns nulls | Skill lists ONLY valid fields per view |
| Wrong lookback | Incomplete historical data | Skill shows _30_Days vs _13_Months variants |
| Snapshots vs Tracking confusion | Point-in-time when you need trends | Snapshots = upload time, Tracking = over time |
| Missing org_name filter | Wrong client data or cross-client leakage | Every view doc shows required filters |
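The last two errors in the table can be caught with a pre-flight lint before a query ever runs. A sketch, assuming you keep a per-view schema registry sourced from the signals-data skill (the field lists below are illustrative, not the actual schema):

```python
# Illustrative registry; real view/field lists come from the signals-data skill.
VIEW_FIELDS = {
    "OrganicRankings_Daily": {"org_name", "post_date", "keyword", "position"},
}

def lint_query(view: str, fields: set, filters: set) -> list:
    """Return a list of problems to fix before running the query."""
    problems = []
    known = VIEW_FIELDS.get(view)
    if known is None:
        problems.append(f"Unknown view: {view}")
        return problems
    for f in sorted(fields - known):
        problems.append(f"Field not in {view} schema (hallucinated?): {f}")
    if "org_name" not in filters:
        problems.append("Missing org_name filter: risk of cross-client leakage")
    return problems
```

An empty result means the query clears these two checks; it does not replace verifying the view choice against the decision guide.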
Verification Checklist:
- View matches analysis type (see signals-data decision guide)
- All queried fields exist in that view's schema
- org_name filter matches client exactly
- Date range uses correct field (post_date vs date)
- Lookback period sufficient for analysis scope
- No outliers or anomalies in data (algorithm updates, tracking issues)
Google Analytics 4¶
What it provides:
- Traffic volume (sessions, users, pageviews)
- Conversion data (goals, e-commerce transactions)
- User behavior (engagement, scroll depth, exit rates)
- Channel attribution (organic, paid, direct, referral)
Where to verify:
- GA4 UI → Reports → Acquisition → Traffic acquisition
- GA4 UI → Reports → Engagement → Pages and screens
- GA4 UI → Explore → Free form (custom reports)
Common checks:
- Match date range exactly
- Confirm segment filters (device, geography, user type)
- Cross-check conversion numbers with CRM (expect 5-15% variance)
- Verify no tracking implementation changes in date range
If you don't have Explore access:
- Use standard GA4 reports (Acquisition, Engagement, Monetization)
- Ask Analytics team for Explore access or custom report export
- Cross-reference with client's internal reports
Google Search Console¶
What it provides:
- Impressions, clicks, CTR, average position
- Query-level performance
- Page-level organic traffic
- SERP feature tracking (limited)
Where to verify:
- GSC UI → Performance → Search results
- GSC UI → Performance → Queries tab (keyword data)
- GSC UI → Performance → Pages tab (landing page data)
Common checks:
- Date range matches (GSC has 2-3 day lag)
- Filter matches (device, country, search type)
- Anomalies due to algorithm updates or site changes
- Compare GSC clicks with GA4 organic traffic (expect 10-20% variance)
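The 10-20% variance expectation is easy to check mechanically. A minimal sketch:

```python
def variance_pct(a: float, b: float) -> float:
    """Relative difference between two sources, as a percent of the larger."""
    return abs(a - b) / max(a, b) * 100

def within_tolerance(gsc_clicks: float, ga4_sessions: float,
                     max_pct: float = 20.0) -> bool:
    """True if GSC clicks and GA4 organic sessions agree within the
    expected tolerance; if not, document the discrepancy."""
    return variance_pct(gsc_clicks, ga4_sessions) <= max_pct

within_tolerance(1000, 850)  # -> True (15% variance)
```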
If data doesn't match:
- GSC tracks Google-only; GA4 includes all search engines
- GSC counts clicks; GA4 counts sessions (multi-page visits = 1 session)
- Bot traffic filtered differently
DataForSEO (SERP Analysis)¶
What it provides:
- Live SERP snapshots (rankings, features, competitors)
- Keyword difficulty and search volume
- Competitive positioning
- SERP feature presence (PAA, Featured Snippets, etc.)
Where to verify:
- Manual Google search - Confirm rankings and SERP features in live results
- SEMrush / Ahrefs - Cross-check keyword volumes and difficulty
- DataForSEO UI (if you have direct access)
Common checks:
- Rankings can fluctuate daily - note date of SERP snapshot
- SERP features are dynamic - verify in multiple browsers/locations
- Competitor analysis is point-in-time - check for recent changes
If SERP doesn't match:
- Personalization (logged in vs. logged out Google)
- Location targeting (set location in search settings)
- Device type (mobile vs. desktop results differ)
- Timing (SERP snapshots are historical)
Paid Media Platforms¶
What they provide:
- Campaign performance (impressions, clicks, conversions)
- Audience data (demographics, interests, behaviors)
- Ad creative performance
- Budget pacing and spend
Where to verify:
- Google Ads UI → Campaigns → Performance
- Meta Ads Manager → Campaigns
- Microsoft Ads → Campaigns → Performance
- LinkedIn Campaign Manager
Common checks:
- Attribution model matches (last-click, data-driven, etc.)
- Conversion tracking is firing (check recent conversions)
- Date range and currency match
- Audience segments haven't changed
If Blocked by QA Gates¶
Blocked by /qa-check or Stop Hook¶
If the agent blocks you with a quality gate message:
1. Read the reason carefully
- Stop hook provides a specific issue (e.g., "Metrics cited without data source")
- /qa-check provides a structured report with PASS/FAIL per check
2. Fix the issue before proceeding
- Add missing citations
- Remove guarantee language
- Verify data sources
- Document assumptions
3. Re-run verification
- Run /qa-check to confirm fixes
- Or just continue work - the Stop hook will re-check automatically
4. Override only if justified
- Document why override is necessary
- Get peer review approval
- Update deliverable with caveat
Common QA Block Scenarios¶
| Block Message | What It Means | How to Fix |
|---|---|---|
| "Metrics cited without data source" | You have numbers (%, growth, traffic) but no citation | Add (Source: GA4, March 2024) or (BigQuery OrganicRankings_Daily table) |
| "Completion claimed but no verification" | You said "done" but didn't run tests or verify | Run build/test commands and confirm they pass, or remove "done" claim |
| "Action titles required for presentations" | Slide titles are labels ("Overview"), not conclusions | Change to action-oriented conclusions ("Traffic grew 34% YoY") |
| "Unsupported claim detected" | "Studies show..." or "Research indicates..." without citation | Either cite the specific study or rephrase as your analysis |
| "Overpromising language detected" | "Will increase", "guaranteed", "100% will" | Use qualified language: "could increase", "has potential to", "may improve" |
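The last two rows are pattern checks you can approximate yourself before the hook fires. A rough sketch (the phrase lists are illustrative; the actual Stop hook / /qa-check rules may differ):

```python
import re

# Illustrative patterns; the real QA gate rules may be broader.
OVERPROMISE = re.compile(
    r"\b(will (increase|improve|grow)|guaranteed?|100% (success|will))\b",
    re.IGNORECASE,
)
UNSUPPORTED = re.compile(r"\b(studies show|research indicates)\b", re.IGNORECASE)

def flag_language(text: str) -> list:
    """Return QA flags found in draft text."""
    flags = []
    if OVERPROMISE.search(text):
        flags.append("overpromising language")
    if UNSUPPORTED.search(text):
        flags.append("unsupported claim")
    return flags
```

Running this over a draft before /qa-check saves a round trip; qualified phrasing like "could increase" passes cleanly.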
What to Document in Deliverables¶
When creating client-facing content, always include:
Required Citations¶
Every deliverable must document:
Data sources:
"Based on BigQuery OrganicRankings_Daily table (March 1-31, 2024),
traffic increased 18% (12,400 → 14,616 sessions)."
Benchmark claims - name the specific study or benchmark source and its date; never a bare "industry average."
Projections:
"Potential traffic lift: 180-220 clicks/month (CTR curve methodology,
assuming position improvement from 7→3, current MSV 2,400)."
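The CTR-curve methodology behind a projection like this can be reproduced in a few lines. A sketch using an illustrative CTR curve (real curves vary by SERP layout and query type; substitute your own benchmark data):

```python
# Illustrative organic CTR by position; substitute your benchmark curve.
CTR_CURVE = {1: 0.28, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05, 6: 0.04, 7: 0.03}

def projected_lift(msv: int, current_pos: int, target_pos: int) -> int:
    """Estimated monthly click lift from moving current_pos -> target_pos,
    assuming stable search volume and SERP layout."""
    gain = CTR_CURVE[target_pos] - CTR_CURVE[current_pos]
    return round(msv * gain)

projected_lift(2400, 7, 3)  # -> 168 clicks/month with this illustrative curve
```

Documenting the curve you used (and its source) is what makes the projection defensible.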
Required Assumptions¶
Be explicit about what you're assuming:
Assumptions underlying this analysis:
1. Competitive landscape remains stable
2. Technical SEO issues addressed within 30 days
3. Content implemented without significant deviation
4. No major Google algorithm updates affect target keywords
5. Client's conversion rate (3.2%) remains consistent
Required Caveats¶
Document limitations honestly:
Data Notes:
- GA4 data through March 31; more recent requires live platform check
- Sample size (n=8 keywords) below ideal threshold (n=30+)
- Rankings reflect desktop; mobile may differ
- Projections assume stable competitive landscape
The Rule of Five (Steve Yegge Principle)¶
Agent outputs are first drafts, not final deliverables. Self-review at least 5 times before delivery:
Review Pass 1: Data Accuracy¶
- Are all metrics cited with sources?
- Are calculations correct?
- Do date ranges make sense?
- Is sample size adequate?
Review Pass 2: Logic & Reasoning¶
- Do recommendations follow from data?
- Are there alternative explanations?
- Did I consider external factors (seasonality, algorithm updates)?
- Are assumptions reasonable?
Review Pass 3: Client Context¶
- Does this align with client's business model?
- Is language client-appropriate (no jargon)?
- Are recommendations actionable for their team?
- Is tone suitable for relationship stage?
Review Pass 4: Deliverable Quality¶
- Is formatting clean and consistent?
- Do links work?
- Are tables and charts clear?
- Is document structure logical?
Review Pass 5: Final QA Gate¶
- Run /qa-check for automated validation
- Check against Quality Standards
- Review the QA Review Checklist (quality-standards skill resources)
- Confirm peer review if required (strategic recs, high-stakes deliverables)
Why 5 passes? Each review catches different types of issues. First pass sees data problems. Last pass catches subtle tone or framing issues.
QA Tiers (When Peer Review Is Required)¶
Not all deliverables require the same verification level:
Auto-Ship (No Peer Review)¶
- Data extraction queries
- Keyword research (volume, difficulty, SERP features)
- Competitive research (what competitors are doing)
- Traffic/ranking reports
Peer Review Required¶
- Strategic recommendations
- Content differentiation strategies
- Client-facing deliverables (audits, outlines, analyses)
- ROI projections and Expected Outcome tables
- Priority recommendations and roadmaps
Shadow Mode (Senior Review + Validation)¶
- New methodologies not yet proven
- Experimental approaches
- High-stakes client deliverables (>$100K projected impact)
- Sensitive competitive positioning
How to determine tier:
- If deliverable includes projections or recommendations → Peer Review
- If deliverable goes directly to client without edit → Peer Review
- If stakes are high (revenue impact, client relationship) → Shadow Mode
- If just data extraction or research → Auto-Ship
Practitioner Shortcuts¶
Quick verification techniques for common scenarios:
"Does this number look right?"¶
Sanity check questions:
- Would a 500% traffic increase actually be realistic?
- Do these rankings align with competitive landscape?
- Is this CTR projection reasonable for this keyword type?
- Does conversion rate match client's historical average?
Quick verification:
- Compare to prior period (does trend make sense?)
- Check against industry benchmarks (is it within 2x of normal?)
- Cross-reference with another data source (GA4 vs. GSC)
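The "within 2x of normal" benchmark check can be expressed directly:

```python
def passes_smell_test(value: float, benchmark: float) -> bool:
    """True if a metric is within 2x of its benchmark in either direction.
    A failure isn't necessarily wrong - it just demands a source check."""
    return benchmark / 2 <= value <= benchmark * 2
```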
"Where do I verify this metric?"¶
Decision tree:
| Metric Type | Primary Source | Backup Source |
|---|---|---|
| Organic traffic, rankings | Google Search Console | GA4 organic sessions |
| Sessions, conversions | Google Analytics 4 | Client CRM/backend |
| SERP features, competitors | DataForSEO | Manual Google search |
| Paid campaign performance | Google Ads / Meta Ads | Platform UI |
| Keyword volumes | Seer Signals / DataForSEO | SEMrush / Ahrefs |
"What if sources conflict?"¶
Variance tolerance guidelines:
| Comparison | Expected Variance | Action if Exceeded |
|---|---|---|
| GA4 vs. CRM conversions | 5-15% | Investigate attribution, tracking lag |
| GSC clicks vs. GA4 organic sessions | 10-20% | Note in deliverable (different definitions) |
| DataForSEO rankings vs. manual check | ±2 positions | Use manual as source of truth |
| Backend conversions vs. pixel | >15% | Flag for CAPI/Redundant Event Pipeline |
When sources conflict:
- Use the most authoritative source (backend > GA4 > third-party estimates)
- Document the discrepancy in deliverable
- Provide both numbers with explanation
- Flag for client/team investigation if variance is significant
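The variance tolerances above can be encoded as a single lookup so every cross-source comparison applies the same thresholds (the keys here are illustrative labels):

```python
# Expected variance ceilings (percent), per the tolerance table.
TOLERANCE = {
    "ga4_vs_crm": 15.0,
    "gsc_vs_ga4": 20.0,
    "backend_vs_pixel": 15.0,
}

def exceeds_tolerance(comparison: str, a: float, b: float) -> bool:
    """True if two sources disagree beyond the playbook's expected variance,
    meaning the discrepancy needs investigation and documentation."""
    variance = abs(a - b) / max(a, b) * 100
    return variance > TOLERANCE[comparison]
```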
Related Resources¶
Core QA Infrastructure¶
- Quality Standards Skill - Complete QA framework
- /qa-check command - On-demand quality validation
For Builders
The full QA skill resources (qa-review.md, fact-checking.md, quality.md) are in the plugin source at plugins/core-dependencies/skills/quality-standards/resources/.
Practitioner Guidance¶
- Best Practices - General tips for working with agents
- Troubleshooting - Common issues and solutions
- SEO Workflows - Division-specific verification steps
- Analytics Workflows - Analytics verification patterns
Last updated: 2026-01-29