
Building a Claude Code skill to fact-check scientific claims

Published on March 5, 2026 · 11 min read
François GUERLEZ
Disclaimer

This skill relies on an LLM (Claude) to orchestrate searches and synthesize results. Despite cross-referencing multiple sources (PubMed, Semantic Scholar, web) and built-in verification steps, an LLM can still produce factual errors, shortcuts, or omissions. The generated reports are not a substitute for professional medical advice or an expert-led literature review. Use them as a starting point, not as absolute truth.

Health claims are everywhere and nobody checks them

"Spirulina is a superfood." "Coffee causes cancer." "Intermittent fasting cures everything."

We all have that friend who drops these at dinner. And honestly, fact-checking any of them is real work. You need PubMed, meta-analyses, regulatory positions from the FDA or EFSA, checking whether studies are industry-funded... That's 2 hours per claim. At least.

So I automated it. Type /science-check spirulina is a superfood in Claude Code, wait about 2 minutes, get a full report. Verdicts, evidence levels, risks, verified sources. The works.

I burned a weekend iterating on this (and an entire evening on an allowed-tools bug I'll get to later). Here's how I built it and what I learned.

What it does

A slash command /science-check that:

  1. Launches 3 research agents in parallel (meta-analyses, risks, critical analysis)
  2. Searches PubMed, Semantic Scholar, Google Scholar via MCP servers
  3. Validates key studies (retractions, funding, sample sizes)
  4. Produces a structured report with per-claim verdicts

Here's the full terminal output from /science-check is spirulina a "superfood"?:

- [x] Claim recovered and translated to English
- [x] ORIENTATION phase: 3 parallel searches launched
- [x] DEEPENING phase: reference sources fetched
- [x] VALIDATION phase: key studies verified
- [x] SELF-CHECK phase: quality checklist validated
- [x] SYNTHESIS phase: report generated

══════════════════════════════════════════════════════════════════════
 SCIENCE CHECK
══════════════════════════════════════════════════════════════════════

Claim: Spirulina is a "superfood"
Date: March 5, 2026
Overall verdict: PARTIALLY CONFIRMED

── Scientific consensus ───────────────────────────────────────────

Spirulina has documented, real benefits — primarily on lipid profiles
and blood pressure — supported by multiple meta-analyses of RCTs.
However, the term "superfood" has NO official scientific definition
— it's a marketing term. Confirmed benefits are modest and targeted,
far from the miracle food image pushed by marketing.

── Per-claim breakdown ────────────────────────────────────────────

Claim                             │ Verdict              │ Evidence level
──────────────────────────────────┼──────────────────────┼──────────────────────────────────
Lowers cholesterol/LDL/TG         │ CONFIRMED            │ Meta-analysis 20 RCTs (n=1076)
Lowers blood pressure             │ CONFIRMED            │ Meta-analysis of RCTs
Helps with weight loss            │ Partially confirmed  │ Meta-analysis 17 RCTs: −1.07 kg
Glycemic control (T2 diabetes)    │ Partially confirmed  │ Meta-analysis 8 studies
Reduces inflammation (CRP, IL-6)  │ Partially confirmed  │ Meta-analyses: modest effect
Rich in protein ("60%")           │ OVERSTATED           │ 3 g serving = 2 g protein
Vitamin B12 source                │ DEBUNKED             │ Inactive pseudo-B12
Liver detox                       │ UNPROVEN             │ No proven mechanism
General immune boost              │ UNPROVEN             │ Zero quality RCTs
Anti-cancer                       │ PREMATURE            │ In vitro only
"Superfood" (term)                │ NOT APPLICABLE       │ Purely a marketing term

── Risks and side effects ─────────────────────────────────────────

Risk                                  │ Severity         │ Population
──────────────────────────────────────┼──────────────────┼───────────────────────────────
Heavy metal contamination             │ Moderate-high    │ All (94% products contaminated)
Microcystins (cyanotoxins)            │ High             │ Chronic use ≥ 4 g/day
Autoimmune diseases                   │ Moderate-high    │ Lupus, MS, vitiligo, RA
Anticoagulant interactions            │ Moderate         │ On warfarin/aspirin
Immunosuppressant interactions        │ Moderate         │ On azathioprine/cyclosporine
Phenylketonuria (PKU)                 │ Very high        │ Absolute contraindication
Allergic reactions                    │ Variable         │ Iodine/seafood allergy
Pseudo-B12 masking deficiency         │ Low              │ Vegans

── Official positions ─────────────────────────────────────────────

ANSES (France) : Safe at moderate doses. CI: PKU, allergies.
                 Heavy metal contamination alert. (2017)
FDA (USA)      : GRAS status. Approved as food coloring.
                 Minimal supplement regulation.
EFSA (EU)      : REJECTED diabetes health claims (2013).
                 No full assessment completed.
WHO            : No specific position.

── Red flags ──────────────────────────────────────────────────────

⚠ "Superfood" = no official scientific definition
⚠ Claims-to-evidence ratio ~10:1 (50+ claims, <5 proven)
⚠ $630M → $1.4B market (likely publication bias)
⚠ 1 retracted study: "Spirulina Unleashed" (MDPI, 2024)
⚠ Weasel language: 36x "may/might/suggest", 0x "proven"
⚠ 94% of samples positive for microcystins

── Sources (14 consulted) ─────────────────────────────────────────

 [1] Spirulina & lipid profile — Meta-analysis 20 RCTs (2023)
 [2] Spirulina & cardiometabolic health — Meta-analysis (2025)
 [3] Spirulina & blood pressure — Meta-analysis RCTs (2021)
 [4] Spirulina & body composition — Meta-analysis 17 RCTs (2025)
 [5] Spirulina & type 2 diabetes — Meta-analysis (2021)
 [6] Spirulina & CRP — GRADE meta-analysis (2025)
 [7] Spirulina & inflammation — Meta-analysis RCTs (2025)
 [8] Examine.com — Evidence-based review (2025)
 [9] ANSES — Regulatory position (2017)
[10] Rubio et al. — Heavy metals (2021)
[11] Autoimmune reactions — Case reports (2025)
[12] Umbrella review — Meta-analyses (2026)
[13] EFSA — Rejected claims (2013)
[14] "Spirulina Unleashed" — Retracted (2024/2025)

══════════════════════════════════════════════════════════════════════

Bottom line: real but modest benefits (cholesterol, blood pressure)
confirmed by 15+ meta-analyses. "Superfood" = pure marketing hype.
Contamination risks are non-trivial.

Not bad for 2 minutes of waiting.

Setting up the MCP servers

The skill uses two MCP servers to hit scientific databases directly. Without them it falls back to WebSearch - still works, just less precise.

PubMed MCP (mcp-simple-pubmed)

Direct access to NCBI's Entrez API. Free, just needs an email.

# Test it works
uvx mcp-simple-pubmed --help

Paper Search MCP (paper-search-mcp)

Multi-source search: PubMed, arXiv, bioRxiv, medRxiv, Semantic Scholar, Google Scholar. The Swiss army knife for academic search.

# Test it
uvx --from paper-search-mcp python -m paper_search_mcp.server --help

The ~/.claude/mcp.json

Drop both servers into your global MCP config:

{
  "mcpServers": {
    "pubmed": {
      "command": "uvx",
      "args": ["mcp-simple-pubmed"],
      "env": {
        "PUBMED_EMAIL": "your@email.com"
      }
    },
    "paper-search": {
      "command": "uvx",
      "args": ["--from", "paper-search-mcp", "python", "-m", "paper_search_mcp.server"],
      "env": {
        "SEMANTIC_SCHOLAR_API_KEY": ""
      }
    }
  }
}

A few things to know:

  • PUBMED_EMAIL: NCBI's Entrez API needs an email. No API key, just an email to identify requests. Use yours.
  • SEMANTIC_SCHOLAR_API_KEY: optional. Works without one but with lower rate limits. Grab a free key at semanticscholar.org/product/api.
  • Both use uvx (the uv runner). Don't have uv? curl -LsSf https://astral.sh/uv/install.sh | sh.

You need to restart Claude Code after editing mcp.json: MCP servers load at startup, they don't hot-reload.
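
To confirm both servers registered after the restart, the Claude Code CLI can list them; if the config was picked up, you should see pubmed and paper-search in the output:

# List configured MCP servers and their connection status
claude mcp list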

4 files, not one big blob

The skill lives in ~/.claude/skills/science-check/ (browse on GitHub):

science-check/
├── SKILL.md              # Main instructions (105 lines)
├── REPORT_TEMPLATE.md    # Report template
├── TRUSTED_SOURCES.md    # Trusted sources by tier
└── EVIDENCE_HIERARCHY.md # Evidence levels grid

My first version? One 250-line file. Claude kept losing track, mixing up workflow phases, forgetting sections of the report. Frustrating.

Here's the thing: Claude Code loads the entire SKILL.md when the skill triggers. Every token competes with conversation history. Moving references to separate files is "progressive disclosure" - Claude only loads them when it actually needs them. That one change fixed most of my issues.
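
In practice that just means SKILL.md points at the reference files instead of inlining them; Claude reads each one with the Read tool only when it reaches the phase that needs it. A minimal sketch of the pattern (hypothetical wording, not the actual file):

## Synthesis
Read REPORT_TEMPLATE.md and follow its structure exactly.
Assign each verdict using the grid in EVIDENCE_HIERARCHY.md.
Prioritize sources according to TRUSTED_SOURCES.md.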

The SKILL.md: where it all happens

Here's the full file. The YAML frontmatter tells Claude when to trigger the skill, and the markdown body defines how to execute it:

name: science-check
description: 'Verifies a scientific or health claim by cross-referencing
  PubMed, Semantic Scholar, and the web...'
user-invocable: true
argument-hint: '[claim to verify]'
allowed-tools:
  - Agent
  - Bash
  - Read
  - WebSearch
  - WebFetch
  - AskUserQuestion
  - Write
  - mcp__pubmed__search_pubmed
  - mcp__pubmed__get_paper_fulltext
  - mcp__paper-search__search_pubmed
  - mcp__paper-search__search_arxiv
  - mcp__paper-search__search_google_scholar
  - mcp__paper-search__search_biorxiv
  - mcp__paper-search__search_medrxiv
  - mcp__paper-search__read_pubmed_paper
  - mcp__paper-search__read_biorxiv_paper
  - mcp__paper-search__read_medrxiv_paper

The full 6-phase workflow and all rules are in the SKILL.md on GitHub.

What I learned about this frontmatter (the hard way)

Make the description "pushy". Claude tends to under-trigger skills. You ask "does magnesium help with sleep?" and Claude just answers instead of using the skill. By explicitly listing "nutrition, dietary supplements, medication, therapy" in the description, you nudge the triggering. Took me 4 or 5 rewrites before it triggered consistently.
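
Roughly the difference in spirit, with hypothetical wording on both sides (not my real iterations):

# Under-triggers: too abstract
description: 'Fact-checks scientific claims.'

# Triggers reliably: names the domains people actually ask about
description: 'Verifies a scientific or health claim by cross-referencing
  PubMed, Semantic Scholar, and the web. Use for claims about nutrition,
  dietary supplements, medication, or therapy.'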

allowed-tools needs the full MCP names. This is where I lost an evening. First test: I run /science-check, agents launch, everything looks fine... except they never use PubMed. No error in the logs. Just... silence. Turns out the format is mcp__<server_name>__<tool_name> and I hadn't declared the MCP tools in allowed-tools. Claude simply didn't have permission to use them inside the skill context. No error, no warning. Nothing. Maddening.

Agent in the list. That's what enables 3 parallel searches instead of sequential. ~1 minute instead of ~3. Worth it.

The 6-phase workflow

Phase 1: Translate the claim

Nothing fancy. Scientific databases are in English. "Spirulina is good for health" becomes "spirulina health benefits evidence".

Phase 2: Orientation - 3 parallel agents

Three sub-agents are launched simultaneously, each with a different research angle:

  • Agent A searches for meta-analyses and systematic reviews (highest evidence level)
  • Agent B searches for risks and side effects (the counterpart often missing from benefit-oriented searches)
  • Agent C searches for critical analyses and debunking (confirmation bias reduction)

Each agent has access to WebSearch and the PubMed/Paper Search MCPs, provided they are declared in allowed-tools.
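
In the SKILL.md this is a single instruction block; something in this spirit (hypothetical wording, not the actual file):

## Orientation
Launch three agents IN PARALLEL (one message, three Agent tool calls):
- Agent A: meta-analyses and systematic reviews on the claim
- Agent B: risks, side effects, contraindications, drug interactions
- Agent C: critical analyses, debunkings, retractions

The "one message" part matters: Agent calls issued in separate turns run one after another, not concurrently.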

Phase 3: Deep dive

Claude fetches the best sources found, following a reliability ranking defined in TRUSTED_SOURCES.md:

  • Tier 1: Cochrane Library, PubMed, Examine.com
  • Tier 2: EFSA, FDA, ANSES, WHO
  • Tier 3: Harvard Health, Mayo Clinic, McGill OSS, NHS
  • Tier 4: Retraction Watch, Semantic Scholar (citation counts)

The paper-search MCP can pull citation counts directly from Semantic Scholar, providing a signal on a study's real-world impact.

Phase 4: Cross-validation

For each key study, Claude checks (see the sketch after this list):

  • Sample size (n=?)
  • Study type (RCT, observational, animal, in vitro)
  • Funding source (industry = potential bias)
  • Replication of results
  • Retraction status via Retraction Watch
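
As SKILL.md instructions, those checks might read like this (hypothetical wording, not the actual file):

## Validation
For each study cited in support of a verdict:
- Extract n, study type, and funding source from the abstract
- WebSearch "<first author> <year> retraction" and check Retraction Watch
- Flag industry funding and small samples in the report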

Phase 5: Self-verification

A quality checklist is evaluated before writing the report:

  • At least 3 independent sources consulted
  • At least 1 meta-analysis or systematic review found (otherwise flagged in the report)
  • No conclusion based on a single study
  • Risks and side effects identified

If any criterion fails, Claude relaunches targeted searches before proceeding to synthesis. Without this phase, I found Claude would sometimes conclude "CONFIRMED" from a single RCT with 30 participants.
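
Written out, the gate might look something like this (hypothetical wording):

## Self-check
Before synthesis, verify ALL of the following; if any check fails,
run targeted searches to fix it before continuing:
- [ ] At least 3 independent sources consulted
- [ ] At least 1 meta-analysis or systematic review (else flag it in the report)
- [ ] No verdict rests on a single study
- [ ] Risks and side effects identified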

Phase 6: Synthesis with ultrathink

The word ultrathink in the SKILL.md activates Claude's extended thinking. Synthesis requires weighing contradictory evidence (positive vs negative meta-analyses, divergent opinions between EFSA and FDA, etc.) and producing a weighted overall verdict. The report is generated following the template in REPORT_TEMPLATE.md, using the evidence grid from EVIDENCE_HIERARCHY.md.

The reference files

EVIDENCE_HIERARCHY.md

Evidence grid used to assign verdicts:

See the full grid on GitHub.
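
To give a flavor, the grid maps each verdict to the evidence needed to earn it. An illustrative excerpt, paraphrased from the verdicts used in the report above rather than quoted from the file:

CONFIRMED            : concordant meta-analyses of RCTs
Partially confirmed  : meta-analyses with modest or inconsistent effects
OVERSTATED           : the effect exists but is far smaller than claimed
UNPROVEN             : no quality RCTs, no established mechanism
PREMATURE            : in vitro or animal data only
DEBUNKED             : quality evidence contradicts the claim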

TRUSTED_SOURCES.md

Trusted sources ranked by consultation priority:

See the full ranking on GitHub.

REPORT_TEMPLATE.md

The template Claude follows to generate the final report:

See the full template on GitHub.

Try it

Restart Claude Code, then:

/science-check intermittent fasting helps with weight loss

Claude shows a progress checklist, launches 3 background agents (you see notifications as they complete), runs cross-validations, and produces the full report. About 1-2 minutes depending on topic complexity.

What I learned

The description is 80% of the work. I spent more time tweaking those 3 lines of YAML than writing the entire workflow. If Claude doesn't trigger the skill, nothing else matters.

Sub-agents are a game changer. Sequential searches -> 3 parallel agents = 3x faster, with better quality because each agent has its own dedicated angle. Catch: without Agent in allowed-tools, it silently falls back to sequential. No warning.

Self-verification isn't optional. I almost removed it to save tokens. Bad idea. It's the phase that stops Claude from concluding "CONFIRMED" based on an in vitro study with 12 mice.

The silent MCP debugging trap. When an MCP tool is missing from allowed-tools, there's no error. Claude just... doesn't use it. Burned an evening on this. Check your declared tools.

Limits

This doesn't replace a doctor. The report is only as good as what's available online, and Claude can misread a study. But for a first filter - "is this worth bringing up with my doctor?" - it's become a reflex.

Next up: formal evals with Anthropic's skill-creator framework, a cache to avoid hammering PubMed with duplicate queries, and maybe a web version via the Agent SDK for non-devs.

If you work in a field where you need to verify claims - health, nutrition, but also finance, law, tech - the pattern transfers: parallel multi-angle search, cross-validation, self-verification, structured report. The skill changes, the skeleton stays.

