MCS · Design

How we publish.

Two non-negotiable rules. Five chained stages. Every artifact is a public PR.

Section 1 · 2 non-negotiable rules

The two rules.

  • Rule #1

    Specs come from a hand-curated YAML catalog.

    The LLM cannot emit numbers. Every numeric token in MDX prose must trace back to content/_specs/<slug>.yaml or the build fails.

    CI evidence ↗
    • validator·scripts/validate-specs.ts
    • doctrine·content/_specs/_README.md
    • schema·content/_specs/_SCHEMA.md
    • CI step·.github/workflows/content-gate.yml · validate:specs
  • Rule #2

    The LLM is forbidden from voice-bearing sections.

    Voice-bearing headings (Verdict, Intro, Recommendation) may appear only in documents whose frontmatter author is on the human allowlist. The LLM does not write voice; the validator catches anything that slips past review.

    CI evidence ↗
    • validator·scripts/validate-authorship.ts
    • CI step·.github/workflows/content-gate.yml · validate:authorship
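
A Rule #1 check can be sketched as follows. This is a minimal illustration, not the contents of scripts/validate-specs.ts: the function names, the regular expression that defines a "numeric token", and the idea of flattening the YAML catalog into a plain list of values are all assumptions made for the example.

```typescript
// Hypothetical sketch of a Rule #1 check; not the real scripts/validate-specs.ts.

/** Pull numeric tokens (integers or decimals) out of MDX prose. */
function extractNumericTokens(prose: string): string[] {
  // Assumption: a "numeric token" is a run of digits with an optional decimal part.
  return prose.match(/\d+(?:\.\d+)?/g) ?? [];
}

/** Return every numeric token in the prose that is absent from the catalog values. */
function findUntracedNumbers(
  prose: string,
  catalogValues: Array<string | number>,
): string[] {
  // Compare as strings so that 24 (number) and "24" (string) are the same value.
  const allowed = new Set<string>(catalogValues.map((v) => String(v)));
  return extractNumericTokens(prose).filter((token) => !allowed.has(token));
}

// Illustrative values, standing in for what content/_specs/<slug>.yaml might hold.
const specValues: Array<string | number> = [24, 936, "2.5"];
const untraced = findUntracedNumbers("The card has 24 GB and draws 350 W.", specValues);

// A build step would exit non-zero whenever `untraced` is non-empty.
console.log(untraced);
```

In this sketch the failure is the list of offending tokens rather than a boolean, so a CI log could name each number that has no catalog source.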

Section 2 · 5 stages

The pipeline, stage by stage.

Every stage runs on a Vercel cron. Each one writes only its declared row type and holds to a single boundary it never crosses.

Catch (hourly) → Cluster (09:00 UTC daily) → Strip (10:00 UTC daily) → Quantify (11:00 UTC daily) → Gate (11:00 UTC daily) → Public PR (on green)

Every exit is a public PR · CI gates · auto-merge on green

  • Stage 1

    Catch

    hourly

    Reads
    HN Algolia, Phoronix RSS, NVIDIA Developer Blog RSS, r/LocalLLaMA, arXiv cs.AR
    Writes
    mcs_signal_candidate (deduped by source + source_id)
    Never
    persists Reddit body text — only title, score, permalink, url, created_utc (ToS perimeter).
  • Stage 2

    Cluster

    09:00 UTC daily

    Reads
    unprocessed mcs_signal_candidate rows in a 48-hour window
    Writes
    mcs_signal_cluster + mcs_signal_cluster_member (3-shingle Jaccard ≥ 0.45)
    Never
    uses an LLM or embeddings — pure JS, lead-only centroid, deterministic.
  • Stage 3

    Strip

    10:00 UTC daily

    Reads
    mature clusters (closed_at NOT NULL OR member_count ≥ 3)
    Writes
    a heuristic topic_label + mcs_signal_claim rows (kind: spec | benchmark | event | price)
    Never
    writes voice — no Verdict, Intro, or Recommendation prose. Subject vocab is allowlisted.
  • Stage 4

    Quantify

    11:00 UTC daily (paired with Gate)

    Reads
    unverified mcs_signal_claim rows + the spec catalog
    Writes
    verdict on each claim: confirmed | conflict | novel | weak
    Never
    writes copy — verdicts are advisory inputs to Gate, never published prose.
  • Stage 5

    Gate

    11:00 UTC daily (paired with Quantify)

    Reads
    ready clusters meeting threshold (≥1 conflict OR ≥3 novels OR ≥3 cluster members)
    Writes
    a single GitHub PR per topic — changelog stub, conflict YAML, novel-spec stub
    Never
    merges. CI runs the gates; on green the PR auto-merges, on red it sits.
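
Stage 2's similarity test (3-shingle Jaccard ≥ 0.45, compared against the cluster lead only) can be sketched in plain TypeScript. The details here are assumptions for illustration: shingles are taken over lowercase words rather than characters, short titles fall back to a single shingle, and "lead" is read as the first member of a cluster. None of the function names come from the real codebase.

```typescript
// Hypothetical sketch of Stage 2's similarity test; tokenization is an assumption.

const THRESHOLD = 0.45;

/** Build the set of word shingles (3 words each by default) for a title. */
function shingles(title: string, size = 3): Set<string> {
  const words = title.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  const out = new Set<string>();
  if (words.length < size) {
    // Short titles: fall back to one shingle holding the whole title.
    if (words.length > 0) out.add(words.join(" "));
    return out;
  }
  for (let i = 0; i + size <= words.length; i++) {
    out.add(words.slice(i, i + size).join(" "));
  }
  return out;
}

/** Jaccard similarity: |A ∩ B| / |A ∪ B|; defined as 0 for two empty sets. */
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((s) => {
    if (b.has(s)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

/** Group titles in input order, comparing each title only to each cluster's lead. */
function clusterTitles(titles: string[]): string[][] {
  const clusters: { lead: Set<string>; members: string[] }[] = [];
  for (const title of titles) {
    const sh = shingles(title);
    let home: { lead: Set<string>; members: string[] } | undefined;
    for (const c of clusters) {
      if (jaccard(c.lead, sh) >= THRESHOLD) {
        home = c;
        break;
      }
    }
    if (home) home.members.push(title);
    else clusters.push({ lead: sh, members: [title] });
  }
  return clusters.map((c) => c.members);
}
```

Because the comparison uses no randomness and always checks clusters in creation order, the same input order yields the same clusters, which is one way to read "deterministic" in the Stage 2 description.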

Section 3 · Topology

One picture.

  Phoronix ─┐
   NVIDIA ──┤
  r/LocalL ─┤
        HN ─┼──→ Catch ──→ Cluster ──→ Strip ──→ Quantify ──→ Gate ──→ PUBLIC PR
     arXiv ─┤      │          │          │           │          │         │
TechCrunch ─┤      │          │          │           │          │         ▼
  Verge AI ─┘      │          │          │           │          │     CI · gates
                   │          │          │           │          │         │
                   │          │          │           │          │     auto-merge
                   └──────────┴──────────┴───────────┴──────────┘         │
                          read-only admin views (/feed, /clusters)        ▼
                                                                       /signal
                                                                       /changelog

Five sources fan in. Five stages chain. The only way out is a public PR; CI auto-merges on green.
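
The step from Gate to a public PR depends on the Stage 5 threshold (≥1 conflict OR ≥3 novels OR ≥3 cluster members). A minimal sketch of that predicate follows; the `ClusterSummary` shape and the function name are hypothetical, and only the verdict values and the three conditions come from the stage descriptions.

```typescript
// Hypothetical sketch of the Stage 5 threshold; type and function names are assumptions.

type Verdict = "confirmed" | "conflict" | "novel" | "weak";

interface ClusterSummary {
  memberCount: number; // number of candidates in the cluster
  verdicts: Verdict[]; // one Quantify verdict per claim
}

/** True when a cluster qualifies for a PR: ≥1 conflict OR ≥3 novels OR ≥3 members. */
function meetsGateThreshold(cluster: ClusterSummary): boolean {
  const conflicts = cluster.verdicts.filter((v) => v === "conflict").length;
  const novels = cluster.verdicts.filter((v) => v === "novel").length;
  return conflicts >= 1 || novels >= 3 || cluster.memberCount >= 3;
}

// A single conflict is enough, even in a one-member cluster.
console.log(meetsGateThreshold({ memberCount: 1, verdicts: ["conflict"] })); // true
// Two novels in a two-member cluster fall short of every condition.
console.log(meetsGateThreshold({ memberCount: 2, verdicts: ["novel", "novel"] })); // false
```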

Section 4 · The boundary

What the LLM is and isn't allowed to do.

  • Allowed

    Recommend rigs from the catalog (The Specifier).

    Forbidden

    Emit numbers anywhere in MDX prose.

    Enforced by

    scripts/validate-specs.ts (Rule #1)

  • Allowed

    Summarize an article in a PR description.

    Forbidden

    Write a verdict, intro, or recommendation paragraph.

    Enforced by

    scripts/validate-authorship.ts (Rule #2)

  • Allowed

    Render specs from YAML the human committed.

    Forbidden

    Persist Reddit body text into mcs_signal_candidate.raw_payload.

    Enforced by

    lib/catch/sources/reddit-localllama.ts perimeter (ToS, operational)
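
The Rule #2 boundary above can be sketched as a heading scan plus an allowlist lookup. This is an illustration only, not the contents of scripts/validate-authorship.ts: the function names, the assumption that headings use `#`-style syntax, and passing the author and allowlist as plain arguments (rather than parsing frontmatter) are all choices made for the example.

```typescript
// Hypothetical sketch of a Rule #2 check; not the real scripts/validate-authorship.ts.

const VOICE_HEADINGS = ["verdict", "intro", "recommendation"];

/** Return the voice-bearing headings found in a Markdown/MDX body. */
function findVoiceHeadings(body: string): string[] {
  const found: string[] = [];
  for (const line of body.split("\n")) {
    // Assumption: headings are written as "#" through "######" followed by text.
    const match = /^#{1,6}\s+(.+?)\s*$/.exec(line);
    if (match && VOICE_HEADINGS.indexOf(match[1].toLowerCase()) !== -1) {
      found.push(match[1]);
    }
  }
  return found;
}

/** A document passes if it has no voice headings or its author is allowlisted. */
function passesAuthorship(
  author: string,
  body: string,
  humanAllowlist: string[],
): boolean {
  const hasVoice = findVoiceHeadings(body).length > 0;
  return !hasVoice || humanAllowlist.indexOf(author) !== -1;
}

// Placeholder author names, for illustration only.
const allowlist = ["alice"];
console.log(passesAuthorship("bot", "## Verdict\nSome text.", allowlist)); // false
console.log(passesAuthorship("alice", "## Verdict\nSome text.", allowlist)); // true
```

Documents without any voice-bearing heading pass regardless of author, which matches the rule as stated: the restriction applies to where those headings appear, not to every document.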