TL;DR

Skill: webapp-testing (Anthropic official). Gives Claude Code control of a real Chromium instance via the Playwright Python SDK. Drive the browser in natural language: manual login → handoff to Claude → test authenticated flows, screenshots, DOM inspection, console logs. Only official skill capable of real browser testing without HTTP mocks.
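Safe installation: Dedicated venv embedded in ~/.claude/skills/webapp-testing/.venv/, never sudo pip install or misuse pipx. PEP 668 blocks system-wide Python installs on modern distros (good thing). Python deps stay inside the skill; the Chromium binaries (~280 MB download) land in Playwright's shared cache, ~/.cache/ms-playwright/. Clean uninstall: rm -rf ~/.claude/skills/webapp-testing removes the skill and its venv, rm -rf ~/.cache/ms-playwright removes the browsers. Survives routine system Python updates; after a jump to a new minor version (3.12 → 3.13), recreate the venv.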
Installation workflow: Sparse-checkout skill from anthropics/skills → create dedicated venv → pip install playwright → playwright install --with-deps chromium (sudo once for system libs) → patch SKILL.md to point to venv Python → smoke test. 6 steps, ~20 min total.
Use case: Test UI locally while coding. OAuth login, 2FA, authenticated routes, hydration checks, visual regression, console error detection: everything static analysis can't touch. Workflow: you log in manually, Claude takes control, writes Playwright scripts, executes, reports. Zero flaky Selenium.
Trade-off: Chromium only (no Firefox/Safari via this skill). Setup requires sudo once (system libs). Best practice: isolated venv per skill, never global pipx. Works with /goal for unattended automation.

A skill that drives Chromium

Saturday evening, I stumble onto a discussion about a recent Anthropic skill that gives Claude Code control of a real browser. Not a simulation, not an HTML parser: a Chromium instance it drives via Playwright to test your app locally while you code.

It's called webapp-testing. It's official, it's in the anthropics/skills repo, and according to recent blogs it racks up ~117k installs a week. It's one of the most-used official skills right now.

The concept is simple: instead of writing Playwright scripts by hand, you describe what you want to test in natural language, and Claude writes + runs the script in a real browser. You can log in by hand, then hand control back to it. It takes screenshots, reads the DOM, captures console logs, debugs authenticated flows. The kind of thing no static code analysis can do.

I wanted to install it. And here's the first surprise: not a single blog I read gives the right install method. They all propose something that pollutes the system or that'll work for two weeks before breaking.

Here's how I installed it cleanly, with a dedicated venv, no sudo pip install, no misused pipx.

The problem with the usual methods

For Playwright on Python in 2026, the official consensus is clear:

Never pip install system-wide. PEP 668 blocks this by default on Ubuntu, Debian, Fedora and most modern distros. You get hit with an error: externally-managed-environment, and good thing too — nothing gets polluted.
Not pipx. pipx is designed for standalone CLIs. But Playwright is used as an imported library (from playwright.sync_api import ...). pipx isolates things so well that you can't import it from an external script. Classic mistake, plenty of people make it.
Always --with-deps on Linux when installing the browsers. Otherwise Chromium crashes with cryptic errors about missing .so files - audio libs, fonts, rendering. --with-deps runs apt install under the hood (so it asks for sudo), but that's the right route.

The best practice for a classic Playwright project is: one venv per project. But for a global Claude Code skill, venv-per-project doesn't work - Claude invokes the skill from any directory, it has no way to know which venv to activate.
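A quick way to see the problem for yourself is to ask a given interpreter whether it can import Playwright at all. This is a minimal sketch using only the standard library; the function name `interpreter_report` is mine, not something the skill provides.

```python
import importlib.util
import sys


def interpreter_report(module: str = "playwright") -> dict:
    """Describe the running interpreter and whether it can import `module`."""
    return {
        "executable": sys.executable,
        # Inside a venv, sys.prefix points at the venv and differs from base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # find_spec returns None when the top-level module cannot be found.
        "can_import": importlib.util.find_spec(module) is not None,
    }


print(interpreter_report())
```

Run it once with the system python3 and once with a venv's python: on a PEP 668 distro the first should report can_import as False, and so would a pipx-installed Playwright, since pipx keeps the package in its own private environment.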

The solution: a dedicated venv inside the skill folder

The pattern that holds up: a venv embedded directly inside the skill folder, and a patch in SKILL.md to tell Claude "use THIS python, not the system one".

With this setup:

The skill is self-contained, it ships its own stack.
Zero pollution of the system Python.
If I uninstall the skill (rm -rf ~/.claude/skills/webapp-testing), the skill and its whole Python stack go away cleanly. Only the Chromium binaries remain, in ~/.cache/ms-playwright/, and one more rm -rf takes care of those.
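It's unaffected by whatever the distro installs or removes in the system site-packages, and it survives routine system Python updates. After a jump to a new minor version (3.12 → 3.13), recreate the venv; it takes a minute.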

Here's how to set it up.

Installation workflow

Step 1: get the skill

The official anthropics/skills repo contains plenty of other stuff (PDF/DOCX/PPTX document skills, MCP server generator, etc.). We just want webapp-testing. Sparse checkout so we don't clone everything:

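cd /tmp
# Partial clone: only fetch the files that sparse-checkout selects
git clone --depth 1 --filter=blob:none --sparse \
  https://github.com/anthropics/skills.git anthropics-skills-tmp
cd anthropics-skills-tmp
git sparse-checkout set skills/webapp-testing
# Make sure the target directory exists before copying
mkdir -p ~/.claude/skills
cp -r skills/webapp-testing ~/.claude/skills/
cd .. && rm -rf anthropics-skills-tmp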

The ~/.claude/skills/webapp-testing/ folder now contains:

webapp-testing/
├── SKILL.md         # Main instructions
├── LICENSE.txt
├── examples/        # console_logging.py, element_discovery.py, static_html_automation.py
└── scripts/         # with_server.py (multi-server lifecycle)

At this point, Claude Code detects the skill on its next startup. But it doesn't work yet - Playwright isn't installed.

Step 2: dedicated venv inside the skill

python3 -m venv ~/.claude/skills/webapp-testing/.venv
~/.claude/skills/webapp-testing/.venv/bin/pip install --upgrade pip
~/.claude/skills/webapp-testing/.venv/bin/pip install playwright

Everything stays contained in ~/.claude/skills/webapp-testing/.venv/. Zero impact on the system Python. If I type python3 in a terminal, it's still my pristine OS Python.

On my machine that gives Python 3.12.3, pip 26.1, Playwright 1.59.0 (released late April 2026).
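If you want to double-check that the venv really is cut off from the system site-packages, the answer is in its pyvenv.cfg file. Here's a small sketch that reads it; the helper names `parse_pyvenv_cfg` and `is_isolated` are my own, and I'm assuming the usual `key = value` layout that `python3 -m venv` writes.

```python
from pathlib import Path


def parse_pyvenv_cfg(text: str) -> dict:
    """Parse the simple `key = value` lines of a pyvenv.cfg file."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '=' sign
            settings[key.strip()] = value.strip()
    return settings


def is_isolated(venv_dir: Path) -> bool:
    """True when the venv does not fall back to the system site-packages."""
    cfg = parse_pyvenv_cfg((venv_dir / "pyvenv.cfg").read_text())
    return cfg.get("include-system-site-packages", "false").lower() == "false"
```

Calling `is_isolated(Path.home() / ".claude/skills/webapp-testing/.venv")` should return True for a venv created with the commands above. The same file also records the Python version the venv was built from, which is handy after a system upgrade.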

Step 3: Chromium + system dependencies

~/.claude/skills/webapp-testing/.venv/bin/playwright install --with-deps chromium

--with-deps is critical on Linux. It runs a sudo apt-get install under the hood (so it targets Debian/Ubuntu-family distros) to lay down the shared libs Chromium needs: libnss3, libatk1.0, libxkbcommon, libgbm, and about twenty others. Without it, the downloaded browser crashes on launch with an error like error while loading shared libraries: libnss3.so.

The download is ~280 MB:

Chrome for Testing 147.0.7727.15 (170 MB) → ~/.cache/ms-playwright/chromium-1217/
Chrome Headless Shell (112 MB) → ~/.cache/ms-playwright/chromium_headless_shell-1217/

Note: the Chromium binary is stored in ~/.cache/ms-playwright/, not in the venv. That's the Playwright default and it's a good thing - if tomorrow you want to use Playwright in another project, you can reuse the same browser cache.
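To see what actually sits in that cache and how much disk it takes, something like the following works. It's a sketch with helper names of my own choosing; it assumes the Linux default location and honors the PLAYWRIGHT_BROWSERS_PATH environment variable when it holds a directory path (special values of that variable are not handled here).

```python
import os
from pathlib import Path


def browser_cache_dir() -> Path:
    """Playwright's browser cache: PLAYWRIGHT_BROWSERS_PATH if set, else the Linux default."""
    override = os.environ.get("PLAYWRIGHT_BROWSERS_PATH")
    return Path(override) if override else Path.home() / ".cache" / "ms-playwright"


def dir_size_mb(path: Path) -> float:
    """Total size of the regular files under `path`, in megabytes."""
    total = sum(
        f.stat().st_size
        for f in path.rglob("*")
        if f.is_file() and not f.is_symlink()
    )
    return total / 1_000_000


def list_browsers(cache: Path) -> dict:
    """Map each browser folder in the cache to its size on disk (MB)."""
    if not cache.is_dir():
        return {}
    return {d.name: round(dir_size_mb(d), 1) for d in sorted(cache.iterdir()) if d.is_dir()}


for name, size in list_browsers(browser_cache_dir()).items():
    print(f"{name}: {size} MB")
```

The sizes printed are the unpacked folders on disk, so expect them to be larger than the download figures.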

Step 4: patch SKILL.md

This is the step nobody mentions in online guides, and yet it's the one that makes the install actually work.

By default, SKILL.md tells Claude "run the scripts with python3". But python3 is the system Python, which doesn't have Playwright. So Claude calls python3 scripts/with_server.py, it crashes with ModuleNotFoundError: No module named 'playwright', and it spends half the session figuring out why.

I add an IMPORTANT section at the top of SKILL.md pointing to the right interpreter:

**IMPORTANT — Python interpreter to use**:
This skill ships with its own dedicated venv at
`~/.claude/skills/webapp-testing/.venv` with Playwright + Chromium
pre-installed. **Always invoke scripts with this interpreter**, never
the system `python3` (which won't have Playwright):

\`\`\`bash
~/.claude/skills/webapp-testing/.venv/bin/python scripts/with_server.py --help
~/.claude/skills/webapp-testing/.venv/bin/python /tmp/your_automation.py
\`\`\`

When the helper `with_server.py` invokes child commands, also pass this
interpreter explicitly (e.g. `... -- ~/.claude/skills/webapp-testing/.venv/bin/python your_automation.py`).

Claude reads SKILL.md every time the skill triggers. This note lands straight in its context, and it consistently uses the right python. No more ModuleNotFoundError.
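If you'd rather script the patch than paste it by hand, here's a minimal sketch. It assumes SKILL.md starts with a YAML front matter block delimited by `---` lines; in that case "at the top" has to mean just below the front matter, so the metadata stays parseable. The function name `add_interpreter_note` is hypothetical, and the function is written to be safe to run twice.

```python
def add_interpreter_note(skill_md: str, note: str) -> str:
    """Insert `note` right after the YAML front matter, unless it is already present."""
    if note.strip() in skill_md:
        return skill_md  # already patched, leave the file untouched

    lines = skill_md.splitlines(keepends=True)
    insert_at = 0  # default: very top of the file (no front matter found)
    if lines and lines[0].strip() == "---":
        # Look for the closing '---' of the front matter block.
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                insert_at = i + 1
                break

    body = "\n" + note.rstrip("\n") + "\n\n"
    return "".join(lines[:insert_at]) + body + "".join(lines[insert_at:])
```

To apply it, read ~/.claude/skills/webapp-testing/SKILL.md, pass its text and the note above through the function, and write the result back. Keep a copy of the original file first; if you later re-copy the skill from the repo, the patch has to be applied again.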

Step 5: smoke test

Before declaring victory, verify that Chromium actually launches:

~/.claude/skills/webapp-testing/.venv/bin/python -c "
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    b = p.chromium.launch(headless=True)
    page = b.new_page()
    page.set_content('<h1>hello</h1>')
    print('h1:', page.locator('h1').inner_text())
    b.close()
print('OK')
"

Expected output:

h1: hello
OK

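If you see that, the install is complete. If you get a crash on shared libs, it means --with-deps didn't run correctly (often: sudo denied or apt unavailable). Re-run just the dependency step with explicit sudo: sudo ~/.claude/skills/webapp-testing/.venv/bin/playwright install-deps chromium. That way only the system packages are installed as root and the browsers stay in your own cache.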
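When the crash message only names the first missing library, it helps to list all of them in one go. The sketch below parses `ldd` output for entries marked "not found"; the function names are mine, and the path to the browser binary is something you supply yourself (inside a `sync_playwright()` block, `p.chromium.executable_path` gives it).

```python
import subprocess


def missing_libs(ldd_output: str) -> list:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "\tlibnss3.so => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(binary_path: str) -> list:
    """Run ldd on a binary and list its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=False
    )
    return missing_libs(result.stdout)
```

An empty list from `check_binary(...)` means the dynamic linker resolves everything, so the problem is elsewhere.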

Why not the `npx skills add` method

Several blogs suggest:

npx skills add https://github.com/anthropics/skills --skill webapp-testing

Or with the marketplace plugin:

/plugin marketplace add anthropics/skills
/plugin install example-skills@anthropic-agent-skills

It lays down the skill (basically what my sparse git clone does), but it doesn't handle the Playwright install. You end up with a skill detected by Claude Code that crashes on first use because Python can't find Playwright. And then you read Stack Overflow for 30 minutes.

The "git sparse + dedicated venv + patch SKILL.md" method takes 5 minutes and works on the first try.

Bonus: alternative with `uv`

uv (by Astral) has become the go-to for managing Python in 2026. Faster than pip (10x on installs), native venv management, auto-dependency scripts via PEP 723. If you already have it installed:

curl -LsSf https://astral.sh/uv/install.sh | sh   # if not already installed

uv venv ~/.claude/skills/webapp-testing/.venv
uv pip install --python ~/.claude/skills/webapp-testing/.venv/bin/python playwright
~/.claude/skills/webapp-testing/.venv/bin/playwright install --with-deps chromium

The result is identical; the install step is simply faster (about twice as fast in this case). The SKILL.md patch still needs doing either way.

How to use it now

Once it's set up, you call the skill in natural language. A few examples that work for me:

"Start my Next.js dev server on port 3000 and use webapp-testing to check
that the homepage loads with no console errors."

"Run webapp-testing on localhost:5173, take a screenshot of the login
page, then try to log in with test@example.com/password and check that
we land on the dashboard."

"Using webapp-testing, navigate to the checkout form and list all the
input field selectors - I want to write an E2E test."

Claude automatically detects the skill (it shows up in the internal list at startup), it calls ~/.claude/skills/webapp-testing/.venv/bin/python thanks to the SKILL.md patch, and it writes + runs the Playwright script on the fly.

The pattern the skill recommends

The official SKILL.md gives a clear decision tree:

Static app (pure HTML) → read the HTML file directly to identify selectors, then a simple Playwright script.
Dynamic app, server already running → "recon → action" pattern: page.goto(url), page.wait_for_load_state('networkidle'), screenshot/inspection, selector identification, execution.
Dynamic app, server to be started → use the scripts/with_server.py helper that manages the multi-server lifecycle (frontend + backend in parallel, for example).

The most important rule: always call wait_for_load_state('networkidle') before inspecting the DOM on a dynamic app. Otherwise Claude reads a half-rendered DOM and writes a test that fails 1 time in 3.
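To make the "recon → action" pattern concrete, here's roughly what a reconnaissance pass can look like with the Playwright sync API. This is my own sketch, not the script shipped in the skill: `ReconReport`, `selector_for` and `recon` are hypothetical names, and the screenshot path is an arbitrary default.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReconReport:
    """What one reconnaissance pass collects about a page."""
    url: str
    title: str = ""
    console_errors: list = field(default_factory=list)
    selectors: list = field(default_factory=list)


def selector_for(tag: str, el_id: Optional[str], name: Optional[str]) -> str:
    """Build a simple CSS selector, preferring id, then the name attribute."""
    if el_id:
        return f"{tag}#{el_id}"
    if name:
        return f'{tag}[name="{name}"]'
    return tag


def recon(url: str, screenshot_path: str = "/tmp/recon.png") -> ReconReport:
    # Imported lazily so this file loads even where Playwright is absent.
    from playwright.sync_api import sync_playwright

    report = ReconReport(url=url)

    def on_console(msg):
        if msg.type == "error":
            report.console_errors.append(msg.text)

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.on("console", on_console)  # register before navigating
        page.goto(url)
        page.wait_for_load_state("networkidle")  # let the app finish rendering
        report.title = page.title()
        page.screenshot(path=screenshot_path, full_page=True)
        for el in page.locator("input, button").all():
            tag = el.evaluate("e => e.tagName.toLowerCase()")
            report.selectors.append(
                selector_for(tag, el.get_attribute("id"), el.get_attribute("name"))
            )
        browser.close()
    return report
```

With a dev server already running, `recon("http://localhost:3000")` returns the page title, any console errors and a list of candidate selectors; the "action" script is then written against those selectors instead of guessed ones.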

What I learned

The official Playwright docs recommend venv but don't cover the "global skill" case. I had to adapt the pattern myself. The docs talk about venv-per-project, which is the right reflex in classic dev. But for a skill invoked from anywhere, the venv embedded in the skill itself is the only clean way.

The SKILL.md patch is 90% of the install's stability. Without that note, Claude falls back on the system python3 and crashes. With it, it consistently uses the right interpreter. A note of a dozen lines changes everything.

The Chromium binaries in ~/.cache/ms-playwright/ are shared. If tomorrow you do another project with Playwright (regardless of install method), it'll reuse the same cache. You download Chromium only once on your machine.

--with-deps is non-negotiable. I tried skipping the sudo once to go faster, I spent 20 minutes debugging an error on libnspr4.so. Don't do it. --with-deps runs a single apt install, it's fast.

This skill replaces 80% of my MCP Playwright needs. I previously had a setup with @playwright/mcp configured as an MCP server. It's more powerful for interactive exploration (persistent browser state, accessibility tree in the response), but heavier to set up and more expensive in tokens. For "check that this flow works" tests, the webapp-testing skill is more direct.

Limits

It's a development aid, not an E2E test framework for production. If you're building a real E2E test suite with CI/CD, sharding, retries, Allure reports, use pytest-playwright in a dedicated project. The Claude Code skill is for the everyday: "I changed this component, quickly check that nothing's broken on flow X".

And of course, everything Claude sees in the browser (DOM, console output, form data) goes to the Anthropic API. Use it on dev environments with test data, not on production with real customer data.

Otherwise, it's become my reflex whenever I touch frontend. I code, I say "check the login flow", it verifies in 30 seconds by driving Chromium. It's what E2E tests should have been from the start.
