19 — Troubleshooting Protocol | Dr J's Binding Protocol

1. The Prime Directive

NEVER change code based on assumptions about how data flows. READ THE CODE FIRST. Every time. No exceptions.

Every troubleshooting failure in this system has followed the same pattern:

1. Assumption made about where data comes from or what field names are used
2. Code changed based on that assumption without reading the actual source
3. Something breaks because the assumption was wrong
4. Wrong diagnosis of the breakage (blaming caching, database, etc.) instead of re-reading the code
5. More changes based on more assumptions, compounding the original error

This protocol exists to break that cycle at step 1. You do not touch code until you have read and understood the complete data flow from source to render.

Companion: Doc 20 — Code Confidence Protocol. Before editing any code, also check the confidence annotations (@confidence GREEN/YELLOW/RED) and the site's CODE-CONFIDENCE.md registry. GREEN code requires full diagnostic proof before touching. Doc 19 tells you how to diagnose. Doc 20 tells you what you're looking at and what's at stake.

2. Pre-Fix Diagnostic Sequence (Mandatory)

Before writing a single line of code to fix any bug, complete all five steps in order. Do not skip steps. Do not combine steps. Each step produces a concrete finding that informs the next.

Step 1: Identify the Symptom Location

What page/endpoint shows the problem? What file renders it? Open and read that file. Not "I think it's this file" — actually read it.

Step 2: Trace the Data Source

In the rendering file, find exactly where the data comes from. Is it a fetch() call to an API? A server-rendered PHP variable? A <script src="..."> JS file? Read the code — don't guess.

Step 3: Read the Data Source

Open the API endpoint / JS file / PHP template that produces the data. Read the SELECT query or the object construction. Write down the exact field names it returns.

Step 4: Match Fields to Rendering

Compare the field names from step 3 to the field names used in the rendering code from step 1. They must match exactly. retail_price_cents is NOT priceCents. full_description is NOT description.

Step 5: Identify the Actual Bug

Only now, with full knowledge of the data flow and field names, can you identify what's actually wrong. Write down the bug in one sentence before touching any code.

Gate check: If you cannot state the bug in one sentence that references specific field names and specific files, you have not completed the diagnostic sequence. Go back to step 1.
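Step 4 (matching fields to rendering) can be partly mechanized. The following is a minimal sketch, assuming the rendering code reads product fields through a single loop variable such as `p`; the helper names and the sample snippet are hypothetical, and the regex is a rough heuristic that will also pick up method calls. It does not replace reading the code.

```python
import re


def referenced_fields(rendering_code: str, var: str = "p") -> set[str]:
    """Collect every `var.name` property access found in the rendering code."""
    pattern = re.compile(r"\b" + re.escape(var) + r"\.([A-Za-z_][A-Za-z0-9_]*)")
    return set(pattern.findall(rendering_code))


def unmatched_fields(rendering_code: str, source_fields: set[str], var: str = "p") -> set[str]:
    """Fields the rendering code reads that the data source never returns."""
    return referenced_fields(rendering_code, var) - source_fields


# Hypothetical rendering snippet and field list, for illustration only.
rendering = "card.innerHTML = p.name + formatPrice(p.priceCents);"
api_fields = {"name", "retail_price_cents", "short_description"}
print(sorted(unmatched_fields(rendering, api_fields)))  # ['priceCents']
```

Any name in the result is a candidate mismatch to state in the one-sentence bug description, after confirming it by reading both files.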

3. Data Flow Tracing

Every page in this system gets its data from one of these sources. You must identify which one BEFORE making changes.

Data Source	How to Identify	Field Names Come From
API fetch	`fetch('...api/endpoint.php')` in JS	The PHP file's `SELECT` query column aliases
Static JS file	`<script src="js/products-data.js">`	The PHP generator that builds the JS file
Server-rendered PHP	PHP variables embedded in HTML (`<?= $var ?>`)	The PHP query at the top of the same file
Inline JSON	`JSON.parse()` of embedded `<script>` data	The PHP that generates the JSON block

Critical: Pages Can Use Multiple Sources

A single page may use DIFFERENT data sources for different parts of the page. For example:

product.php uses the API (public-products.php) for the main product detail, but uses products-data.js (static JS) for variant selectors and related products
shop.html uses the API (public-products.php) for the product grid, but components.js loads products-data.js for search autocomplete

These sources use DIFFERENT field names for the same data. The API returns retail_price_cents. The JS file has priceCents. You cannot use replace_all to swap one for the other without understanding which source each piece of code reads from.
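As a first pass before reading a page in full, the "How to Identify" markers from the table above can be searched for line by line. This is a sketch under the assumption that those four markers appear literally in the page source; the function name and sample page are hypothetical, and a hit only tells you where to start reading.

```python
import re

# Patterns taken from the "How to Identify" column; rough heuristics, not a parser.
SOURCE_PATTERNS = {
    "API fetch": re.compile(r"\bfetch\s*\("),
    "Static JS file": re.compile(r"<script[^>]*\bsrc\s*=", re.IGNORECASE),
    "Server-rendered PHP": re.compile(r"<\?="),
    "Inline JSON": re.compile(r"\bJSON\.parse\s*\("),
}


def identify_data_sources(page_source: str) -> dict[str, list[int]]:
    """Map each data source type to the 1-based line numbers where it appears."""
    found: dict[str, list[int]] = {}
    for lineno, line in enumerate(page_source.splitlines(), start=1):
        for label, pattern in SOURCE_PATTERNS.items():
            if pattern.search(line):
                found.setdefault(label, []).append(lineno)
    return found


# Hypothetical page, for illustration only.
page = """<script src="js/products-data.js"></script>
<script>
fetch('api/public-products.php?action=list').then(r => r.json());
</script>"""
print(identify_data_sources(page))
# {'Static JS file': [1], 'API fetch': [3]}
```

More than one key in the result means the page mixes sources, so each field reference has to be traced to its own source individually.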

4. The Three-Tier Verification

When a product data issue occurs, check all three tiers in order. The bug is wherever the data diverges from what's expected.

Tier	Table	How to Check	What Lives Here
1. Portal	`portal_products`	Direct DB query or Portal UI	Source of truth — data is born here
2. Hub	`thr_products`	Hub admin API or DB query	Distribution copy — admin APIs read here
3. Site DB	`ths_products`, `pc_products`	Site public API or DB query	Website reads — ALL public pages read here

Tier Verification Steps

1. Query the product in portal_products. Is the data correct?
2. Query the product in thr_products. Does it match portal? If not, the sync is broken.
3. Query the product in ths_products (or pc_products). Does it match Hub? If not, deploy is broken.
4. Call the public API. Does it return correct data? If not, the API query is broken.
5. Check the rendered page. Does it display the API data correctly? If not, the rendering code is broken.

The bug is at whichever step the data goes wrong. Fix THAT step. Do not change downstream code to compensate for an upstream failure.
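The "first step where the data goes wrong" rule can be sketched as a comparison over ordered snapshots. This assumes you have already queried each tier and normalized the values to the same keys (the tiers may use different column names); the function name, the `price` key, and the sample values are hypothetical.

```python
from typing import Any, Optional


def first_divergence(
    stages: list[tuple[str, dict[str, Any]]],
    expected: dict[str, Any],
) -> Optional[tuple[str, str]]:
    """Return (stage name, field) for the first stage whose value differs
    from the expected value, or None if every stage agrees."""
    for name, snapshot in stages:
        for field, want in expected.items():
            if snapshot.get(field) != want:
                return name, field
    return None


# Hypothetical snapshots of one product, listed in tier order.
stages = [
    ("portal_products", {"price": 4999}),
    ("thr_products", {"price": 4999}),
    ("ths_products", {"price": None}),
    ("public API", {"price": None}),
]
print(first_divergence(stages, {"price": 4999}))  # ('ths_products', 'price')
```

In this example the data is already wrong in `ths_products`, so the fix belongs in deploy, not in the API or the rendering code.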

5. Field Name Reconciliation

This system has two parallel naming conventions for product data. They are NOT interchangeable.

API Response (public-products.php)	Static JS (products-data.js)	Notes
`retail_price_cents`	`priceCents`	API uses SQL column name; JS uses camelCase
`short_description`	`description`	JS `description` is populated from `short_description`, with a fallback
`category_type`	`categoryType`	camelCase in JS
`price_type`	`priceType`	camelCase in JS
`compare_price_cents`	(not included)	Only in API response

NEVER use replace_all to swap field names across an entire file unless you have verified that the file uses exactly one data source. If the file mixes API data and JS data, a blanket replacement will break one of them.

Before Any Field Name Change

Identify every location in the file that references the field
For each location, determine which data source it reads from
Only change the references that are actually wrong
Leave correct references untouched
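The four steps above amount to a targeted replacement instead of a blanket one. A minimal sketch, assuming you have already decided by reading the code which line numbers hold the wrong references; both helper names and the sample source are hypothetical.

```python
import re


def _whole_name(field: str):
    """Match `field` only as a complete identifier, not as part of a longer one."""
    return re.compile(r"(?<![A-Za-z0-9_])" + re.escape(field) + r"(?![A-Za-z0-9_])")


def find_references(text: str, field: str) -> list[tuple[int, str]]:
    """List (line number, line) for every line that references the field."""
    pattern = _whole_name(field)
    return [
        (n, line.strip())
        for n, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


def replace_on_lines(text: str, old: str, new: str, line_numbers: set[int]) -> str:
    """Rename `old` to `new` only on the listed lines; leave all others untouched."""
    pattern = _whole_name(old)
    out = []
    for n, line in enumerate(text.splitlines(keepends=True), start=1):
        out.append(pattern.sub(lambda _m: new, line) if n in line_numbers else line)
    return "".join(out)


# Hypothetical file: line 1 reads API data, line 2 reads the static JS data.
source = "grid.render(p.priceCents);\nsearch.add(item.priceCents);\n"
print(find_references(source, "priceCents"))
print(replace_on_lines(source, "priceCents", "retail_price_cents", {1}))
```

Only line 1 is changed; the reference on line 2, which was already correct for its data source, is left alone.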

6. When a Fix Doesn't Work

When the user says "that didn't fix it" — do not guess at the cause. Follow the diagnostic sequence below to determine whether the problem is in code, data, deployment, or caching. Let the evidence lead you.

Mandatory Response Sequence

Work through these steps in order. Each step either identifies the problem or eliminates a possibility. Do not skip ahead.

1. Re-read the file you changed: verify your edit is actually present and correct in the local file.
2. Check the served version: curl the production URL to confirm the deployed code matches your edit. If it doesn't match, the deploy didn't land, and that's the problem.
3. Test the data endpoint: curl the API or JS file to verify the data is correct. If data is wrong, the problem is upstream (API query, deploy, sync).
4. Compare field names: does the rendering code reference the exact field names the data source returns? If there's a mismatch, that's the bug. Fix the rendering code.
5. If steps 1–4 all check out (the server is returning correct code with correct data and correct field names), NOW caching is a legitimate suspect. Proceed to the Cache Diagnostic below.
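The ordering of this sequence matters: caching is only reached when every earlier check has passed. A small sketch of that decision order, assuming each check has been performed by hand and recorded as a boolean; the `Evidence` class and `next_suspect` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    local_edit_present: bool   # step 1: edit is in the local file
    served_matches_edit: bool  # step 2: production serves the edited code
    data_correct: bool         # step 3: API or JS file returns correct data
    field_names_match: bool    # step 4: rendering code uses the source's field names


def next_suspect(e: Evidence) -> str:
    """Return the first failed check; caching is reached only if all four pass."""
    if not e.local_edit_present:
        return "edit missing or wrong in the local file"
    if not e.served_matches_edit:
        return "deploy did not land"
    if not e.data_correct:
        return "upstream data (API query, deploy, sync)"
    if not e.field_names_match:
        return "field name mismatch in rendering code"
    return "caching (run the Cache Diagnostic)"


print(next_suspect(Evidence(True, True, True, False)))
# field name mismatch in rendering code
```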

Cache Diagnostic

Caching is a real issue that has caused real problems in this system. But it can only be diagnosed through evidence, not assumption. If steps 1–4 above all pass (server is correct), work through these questions:

What are the Cache-Control headers? — curl -I the resource. Long max-age values (e.g., 604800 = 7 days) on JS/CSS files mean browsers WILL serve stale versions.
Did the filename change? — Cache-busting via query string (?v=123) or filename change (products-data-v2.js) forces browsers to fetch fresh. Was this done?
Is there a CDN or proxy layer? — Nginx, Cloudflare, or other reverse proxies may cache independently of the browser.
Can you reproduce with a fresh request? — curl with Cache-Control: no-cache header, or test in a private/incognito window.

When caching IS the problem: Confirm it with evidence (server returns correct content, browser shows old content, cache headers explain why), then apply the appropriate fix: cache-busting query string, filename change, or header adjustment. Tell the user what you found and why you're confident it's cache.
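Two pieces of the cache diagnostic lend themselves to a short sketch: reading `max-age` out of a `Cache-Control` header value, and adding a `?v=` cache-busting parameter to a URL. The helper names are hypothetical, and the header string is assumed to have been copied from `curl -I` output.

```python
import re
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract the max-age directive from a Cache-Control value, if present."""
    match = re.search(r"\bmax-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def with_cache_buster(url: str, version: str) -> str:
    """Return the URL with its `v` query parameter set to the given version."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "v"]
    query.append(("v", version))
    return urlunsplit(parts._replace(query=urlencode(query)))


age = max_age_seconds("public, max-age=604800")
print(age, age / 86400)                                  # 604800 7.0
print(with_cache_buster("js/products-data.js", "123"))  # js/products-data.js?v=123
```

A `max-age` of several days on a file you just changed is the kind of evidence that supports a cache diagnosis; a missing or short `max-age` points back to steps 1–4.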

What NOT to Do

Violation: Jumping to Caching Without Evidence

"It must be a browser cache issue. Try clearing your cache." — This skips the entire diagnostic sequence. If the code is actually wrong, telling the user to clear cache wastes their time and erodes trust. Caching is a valid diagnosis only AFTER you've confirmed the server-side code is correct.

Violation: Repeating "Cache" After User Pushes Back

If the user says "it's not caching" or "I checked in three browsers" — that is strong evidence against caching. Go back to steps 1–4 and re-examine the code. The user is closer to the symptom than you are.

Correct: Evidence-Based Diagnosis

"I curl'd the production shop.html and the served code reads p.priceCents, but the API returns retail_price_cents. The field name doesn't match the data source. That's the bug — not caching." — This is diagnosis backed by evidence.

Correct: Evidence-Based Cache Diagnosis

"I curl'd the production JS file and it has the fix. The Cache-Control header is max-age=604800 (7 days). Your browser is serving a cached copy from before the fix. Changing the filename from products-data.js to products-data-v2.js will force a fresh fetch." — This is a cache diagnosis backed by evidence.

7. Forbidden Assumptions

These assumptions have caused production failures. They are explicitly prohibited. When you catch yourself making one, stop and verify.

Forbidden Assumption	Reality	How to Verify
"This page uses products-data.js"	shop.html uses the API; product.php uses both	Read the `<script>` block — find the `fetch()` call
"All files use the same field names"	API uses snake_case; JS uses camelCase	Read the API response AND the JS generator
"The column names match across all site tables"	pc_products and ths_products have different schemas	Run `SHOW COLUMNS FROM {table}`
"It's a caching issue" (without evidence)	Caching is real but requires evidence — complete steps 1–4 of Section 6 first	`curl` the URL, verify server content is correct, THEN check `Cache-Control` headers
"replace_all is safe for this change"	It breaks when a file uses multiple data sources	Count data sources in the file BEFORE using replace_all
"This fix only affects one thing"	Changing a field name affects every reference in the file	Search for ALL occurrences before editing

8. Pre-Deploy Verification

Before running devhub-deploy push, complete this checklist:

Data source verified: You know exactly where each page gets its data
Field names verified: Rendering code uses the same field names as the data source
API tested: curl the API endpoint and confirm correct data with correct field names
Multi-source pages checked: If a page uses both API and JS file data, both sets of field names are correct
No blind replace_all: If you used replace_all, verify it didn't change references that were already correct
Deploy columns checked: If modifying deploy.php, verify target table columns with SHOW COLUMNS
Read the diff: Review every line you changed — does each change make logical sense?

One extra minute of verification costs almost nothing. A broken production deploy costs the user's trust and revenue.

9. Post-Deploy Verification

After deploying, verify the fix is working on production. Do not rely on "the deploy succeeded" as proof.

Mandatory Post-Deploy Checks

curl the API: Verify the endpoint returns correct data with correct field names
curl the page: Verify the HTML/JS contains the correct rendering code (not the old broken version)
Spot-check a product: Pick a specific product, verify its price/data appears correctly in the API response
Test the rendering logic: Mentally trace one product through the rendering code — would it display the price or "Contact for Price"?

# Post-deploy verification example

# 1. Check API returns prices
curl -s 'https://thehighroadmanufacturing.com/api/public-products.php?action=list' | \
  python3 -c "import json,sys; d=json.load(sys.stdin); \
  print(f'Priced: {sum(1 for p in d[\"products\"] if p.get(\"retail_price_cents\"))}'); \
  print(f'Contact: {sum(1 for p in d[\"products\"] if not p.get(\"retail_price_cents\"))}')"

# 2. Check page uses correct field name
curl -s 'https://thehighroadmanufacturing.com/shop.html' | grep -o 'retail_price_cents\|priceCents'

# 3. Check a specific product
curl -s 'https://thehighroadmanufacturing.com/api/public-products.php?action=get&slug=test-product' | \
  python3 -c "import json,sys; p=json.load(sys.stdin); print(p.get('name'), p.get('retail_price_cents'))"
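The fourth check (tracing one product through the rendering logic) can be rehearsed on a sample product. The real rendering code is not reproduced here; this sketch assumes it falls back to "Contact for Price" whenever the field it reads is missing or empty, and `price_label` is a hypothetical stand-in for it.

```python
def price_label(product: dict, field: str = "retail_price_cents") -> str:
    """Mimic the assumed rendering rule: show a price, else 'Contact for Price'."""
    cents = product.get(field)
    if not cents:
        return "Contact for Price"
    dollars, remainder = divmod(cents, 100)
    return f"${dollars:,}.{remainder:02d}"


# Hypothetical product as the API would return it.
product = {"name": "Test Product", "retail_price_cents": 4999}
print(price_label(product))                      # $49.99
print(price_label(product, field="priceCents"))  # Contact for Price
```

The second call shows why a wrong field name is indistinguishable from a missing price on the rendered page: the lookup silently returns nothing.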

10. Escalation Protocol

When a fix has failed twice, stop making more changes. The pattern of "try something, it breaks, try something else" is how small bugs become site-wide outages.

After Two Failed Fix Attempts

Stop editing files. Do not make a third attempt without completing the full diagnostic sequence from Section 2.
Read every file in the data chain from database query to rendered output.
Write down the complete data flow with exact field names at each stage.
Identify the exact point of failure — which field name, in which file, doesn't match its data source?
State the fix in one sentence before writing any code.
If still unclear, ask the user for more information about what they're seeing. Do not guess.

Three consecutive failed fixes on the same issue is a protocol violation. If you reach this point, the diagnostic sequence was not properly completed. Go back to Section 2 and start from step 1 with fresh eyes.

11. Case Studies: Real Failures

These are real incidents from this system. They are documented here so the same mistakes are never repeated.

Case 1: The Price Display Disaster (March 2026)

Symptom

Many products on THR showed "Contact for Price" instead of actual prices.

Root Cause

The ths_products table has both price_cents and retail_price_cents columns. Deploy was writing to price_cents but the API was reading retail_price_cents, which was NULL.

Correct Fix

Add COALESCE(retail_price_cents, price_cents) AS retail_price_cents to the API query. Also fix deploy to populate retail_price_cents.
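The effect of the COALESCE expression can be seen on a toy table. This sketch uses an in-memory SQLite database purely for illustration (an assumption; the production database engine is not SQLite), and the three rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ths_products (slug TEXT, price_cents INTEGER, retail_price_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO ths_products VALUES (?, ?, ?)",
    [
        ("a", 4999, None),   # deploy wrote price_cents only
        ("b", 1000, 1200),   # both columns populated
        ("c", None, None),   # no price at all
    ],
)

# COALESCE returns the first non-NULL argument, so retail_price_cents wins
# when present and price_cents is the fallback.
rows = conn.execute(
    "SELECT slug, COALESCE(retail_price_cents, price_cents) AS retail_price_cents "
    "FROM ths_products ORDER BY slug"
).fetchall()
print(rows)  # [('a', 4999), ('b', 1200), ('c', None)]
conn.close()
```

Row `a` is the case from this incident: without COALESCE the API would have returned NULL for it.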

What Went Wrong

After the correct API fix, an incorrect assumption was made that shop.html reads from products-data.js. A replace_all changed retail_price_cents to priceCents in shop.html. But shop.html actually fetches from the API, which returns retail_price_cents. Result: every single product showed "Contact for Price" because p.priceCents was undefined.

Compounding Errors

When the user reported it wasn't fixed, caching was blamed instead of re-reading the code
The JS file was renamed to bust cache — irrelevant since shop.html doesn't use the JS file for rendering
Multiple deploy cycles pushed broken code to production
The user explicitly said "this is not a caching issue" and was ignored

Protocol Violations

Section 2 violated: Data source was never identified before making changes
Section 3 violated: Did not trace whether shop.html uses the API or the JS file
Section 5 violated: Used replace_all without verifying data sources
Section 6 violated: Blamed caching instead of re-reading code after failed fix
Section 7 violated: Multiple forbidden assumptions made simultaneously

Cost

Approximately 2 hours of downtime for pricing on a live e-commerce site. Multiple production deploys of broken code. Complete loss of user trust in the troubleshooting process.

Case 2: The Deploy Column Mismatch (March 2026)

Symptom

Deploy to PC site failed: "Unknown column 'client_id' in field list."

Root Cause

deploy.php had a hardcoded column list matching ths_products schema. pc_products has a different schema (no client_id, no mfg_* columns).

Correct Fix

Dynamic column detection via SHOW COLUMNS FROM {table}. Build field map, filter to columns that exist in the target table.
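The filtering idea can be sketched as follows. This is not the deploy.php implementation (which is PHP and not shown here); it is a Python illustration that assumes the target column set has already been read with SHOW COLUMNS, that the table name comes from a trusted list, and that the database driver uses `%s` placeholders. The function names and sample row are hypothetical.

```python
def filter_to_target_columns(row: dict, target_columns: set[str]) -> dict:
    """Keep only the fields that exist as columns in the target table."""
    return {col: val for col, val in row.items() if col in target_columns}


def build_insert(table: str, row: dict, target_columns: set[str]) -> tuple[str, list]:
    """Build a parameterized INSERT restricted to the target table's columns."""
    filtered = filter_to_target_columns(row, target_columns)
    columns = ", ".join(filtered)
    placeholders = ", ".join(["%s"] * len(filtered))
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, list(filtered.values())


# Hypothetical source row and a target schema without client_id.
row = {"slug": "test-product", "client_id": 7, "retail_price_cents": 4999}
pc_columns = {"slug", "retail_price_cents"}
print(build_insert("pc_products", row, pc_columns))
# ('INSERT INTO pc_products (slug, retail_price_cents) VALUES (%s, %s)', ['test-product', 4999])
```

Because the column list is derived from the target table at deploy time, a schema difference drops the extra field instead of failing with "Unknown column".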

What Should Have Been Done First

Before writing the hardcoded column list originally, the schema of ALL target tables should have been compared. The assumption "all site tables have the same columns" should have been verified.
