What we need from engineering to run full E2E tests on staging
Right now we can already run streamer + operator tests side-by-side. The plumbing exists:
- playwright.config.ts declares one project per role (public, streamer, operator)
- Each role project loads its own auth/<role>.<env>.json storage state
- npm run test:stage runs all of them in one process; workers: 1 keeps them from racing on shared accounts
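For readers less familiar with the setup, the per-role wiring can be sketched roughly as below. This is an illustration, not a copy of our playwright.config.ts: the helper name roleProjects and the env value "stage" are assumptions. The resulting object has the shape Playwright accepts for its projects and workers options.

```typescript
// Hypothetical helper (not the actual config): one project entry per role,
// each pointing at its own auth/<role>.<env>.json storage state.
type Role = "public" | "streamer" | "operator";

interface RoleProject {
  name: Role;
  use: { storageState: string };
}

const ROLES: Role[] = ["public", "streamer", "operator"];

function roleProjects(env: string): RoleProject[] {
  return ROLES.map((role): RoleProject => ({
    name: role,
    use: { storageState: `auth/${role}.${env}.json` },
  }));
}

// Shape of what would be handed to defineConfig() in playwright.config.ts.
// "stage" is an assumed env name, inferred from `npm run test:stage`.
const stageConfig = {
  workers: 1, // serialise so role projects do not race on shared accounts
  projects: roleProjects("stage"),
};
```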
What's holding us back from a clean, hands-off, repeatable run on every staging deploy is auth + fixtures. Here's the prioritised ask.
1. Test-only authentication endpoint (the unblocker)
Without this we either rely on storage-state files that expire (manual re-capture every few days), or we drive Kick OAuth + email PIN in CI (brittle and slow). With it, every test starts authenticated in <50 ms.
Contract proposal:
POST /api/test/session
Headers:
X-Test-Secret: <rotating shared secret; CI/local env only>
Body:
{ "as": "streamer" | "operator", "fixture": "e2e-streamer-1" }
Response 200:
Set-Cookie: <same session cookie a real login produces>
{ "userId": "...", "role": "streamer" }
Response 401: bad secret
Response 403: endpoint disabled in this build
Response 404: unknown fixture
Security mitigations to ship together (all three, not just one):
- Build-time exclusion - wrap the route in a build flag so the production binary literally doesn't contain it. Env vars alone are too easy to misconfigure.
- Fixture whitelist - endpoint only accepts a predefined list of fixture user IDs. Refuses arbitrary emails; refuses minting an admin/operator role unless the fixture has that role.
- Loud logging + alerts - every call logs WARN: test session minted for <fixture> from <ip> and alerts on any call from outside CI / the test secret's holder.
The companies that got burned by test backdoors had one of these, not three. We need all three.
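To make the contract concrete, here is a minimal sketch of the decision logic the endpoint could apply before minting a session. Everything in it is an assumption for discussion rather than a proposed implementation: the names decideTestSession, EndpointConfig and FixtureUser are hypothetical, the order of checks (disabled, then secret, then fixture) is our guess, and returning 404 when the requested role does not match the fixture's role is not specified in the contract above. The enabled field only stands in for the build flag; the actual ask is that the route is absent from the production build.

```typescript
type Role = "streamer" | "operator";

interface FixtureUser {
  userId: string;
  role: Role;
}

interface SessionRequest {
  secret: string | undefined; // value of the X-Test-Secret header
  as: Role;
  fixture: string;
}

interface EndpointConfig {
  enabled: boolean; // stand-in for the build flag (assumption)
  secret: string;
  fixtures: ReadonlyMap<string, FixtureUser>; // the fixture whitelist
}

type SessionResult =
  | { status: 200; body: { userId: string; role: Role } }
  | { status: 401 | 403 | 404 };

function decideTestSession(req: SessionRequest, cfg: EndpointConfig): SessionResult {
  if (!cfg.enabled) return { status: 403 }; // endpoint disabled in this build
  // A real handler would use a constant-time comparison here.
  if (req.secret !== cfg.secret) return { status: 401 }; // bad secret
  const user = cfg.fixtures.get(req.fixture);
  if (user === undefined) return { status: 404 }; // unknown fixture
  // Assumption: a role mismatch is reported like an unknown fixture,
  // so the endpoint never mints a role the fixture does not have.
  if (user.role !== req.as) return { status: 404 };
  // A real handler would also set the session cookie and log the WARN line.
  return { status: 200, body: { userId: user.userId, role: user.role } };
}
```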
2. Fixture accounts on staging
A small set of dedicated test users, never used for real interaction:
| Fixture | Role | State |
|---|---|---|
| e2e-streamer-fresh | streamer | Just signed up, KYC not started, no streams, no price-per-stream set |
| e2e-streamer-verified | streamer | KYC approved, price-per-stream set, no applications yet |
| e2e-streamer-applied | streamer | Has an existing application/offer on e2e-campaign-funded for negotiation tests |
| e2e-operator-fresh | operator | No campaigns |
| e2e-operator-with-funded | operator | Has one funded campaign (e2e-campaign-funded) that persists across runs - needed for F-04 funded-edit tests, OP-06, OP-14/15/16 |
| e2e-operator-with-deal-active | operator | Has a deal in the active state with a streamer fixture - exercises /operator/deals Active tab |
| e2e-operator-with-deal-completed | operator | Has a deal in the completed state - exercises /operator/deals Completed tab |
| e2e-operator-with-deal-refunded | operator | Has a deal in the refunded state - exercises /operator/deals Refunded tab |
| e2e-operator-with-deal-cancelled | operator | Has a deal in the cancelled state - exercises /operator/deals Cancelled tab |
| e2e-operator-with-pending-offer | operator | Has a pending offer/application from a streamer fixture - exercises /operator/offers and /operator/inbox populated states |
Stage data should be reset-friendly - i.e. not promoted to prod, no real money, no Kick OAuth that hits real Kick (use a fake-Kick test endpoint or seed the OAuth grants directly).
3. Reset endpoints
For idempotent test runs we need a way to wipe a fixture's mutable state between runs:
POST /api/test/streamer/<id>/reset
→ removes applications, offers, deals, featured videos, language selections; KYC kept
POST /api/test/operator/<id>/reset
→ removes draft campaigns; keeps the persistent funded fixture campaign
POST /api/test/campaign/<id>/reset
→ returns a campaign to a known state (refund + redraft, or reset application list)
Gated by the same X-Test-Secret + build-flag setup as #1.
Today our tests do best-effort cleanup (the [E2E] OP-08 <timestamp> naming + scavenger setup), but a hard reset is much more reliable and lets us drop the workers: 1 serialisation.
Priority bump (June 2026): the new staging shipped /operator/inbox, /operator/offers, /operator/negotiations, /operator/deals - these surfaces are cross-account by definition (an operator's offer is a streamer's application). Once we start exercising those flows end-to-end, scoped-naming cleanup stops scaling: a test that accepts an offer creates state on the streamer account too, and the streamer fixture can't be scavenged from the operator side. Reset endpoints move from "nice to have" to "needed before this lane goes green in CI."
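On the test side, consuming the reset endpoints could look roughly like the sketch below, which only builds the requests a setup step would send before a run. The function name buildResetRequests, the ResetTarget shape and the base URL in the usage note are hypothetical; the paths and the X-Test-Secret header follow the proposal above.

```typescript
type ResetKind = "streamer" | "operator" | "campaign";

interface ResetTarget {
  kind: ResetKind;
  id: string;
}

interface ResetRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: turns a list of fixtures into the reset calls
// a pre-run setup step would issue.
function buildResetRequests(
  baseUrl: string,
  secret: string,
  targets: ResetTarget[],
): ResetRequest[] {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return targets.map((target): ResetRequest => ({
    method: "POST",
    url: `${root}/api/test/${target.kind}/${encodeURIComponent(target.id)}/reset`,
    headers: { "X-Test-Secret": secret },
  }));
}
```

A setup step could then loop over the result and send each one with fetch(request.url, { method: request.method, headers: request.headers }), failing the run if any response is not OK. For example, buildResetRequests("https://staging.example.test", secret, [{ kind: "streamer", id: "e2e-streamer-applied" }]) yields a single POST to /api/test/streamer/e2e-streamer-applied/reset (the host is a placeholder).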
4. Persistent funded fixture campaign
Funding a campaign requires a real on-chain USDC deposit signed by MetaMask. We can't do that in CI without:
- A test wallet pre-loaded with test USDC on staging's chain
- MetaMask in the Playwright browser (via Synpress or similar)
- Automated signature acceptance
The easiest unblock: engineering creates and funds one campaign manually on the e2e-operator-with-funded fixture, and we treat it as a read-only fixture that persists across runs. Then OP-14/15/16 and the F-04 repro tests start running automatically. No MetaMask in CI needed.
5. Smaller asks that compound
- Feature flags exposed via a debug endpoint (GET /api/test/flags) so tests can skip when a flag is off rather than fail.
- A clean way to assert "this email is the most recent OTP" for operator email-PIN login. Mailosaur/MailSlurp on a dedicated domain would do it.
- Stable launch-date config - F-14 was caused by a hardcoded banner date; if the date comes from SiteLaunchContext already, we just need confirmation so we can rely on it.
- Webhook simulator for Kick stream events. Without it we can't deterministically test the streamer's "live → ended" state transitions (F-11).
- SPA hard-load routing: a direct goto('/operator/inbox') (and same for /offers, /negotiations, /deals) lands on /operator/campaigns instead of the requested route. The pages render fine via client-side sidebar click, so it looks like an SSR / hydration fallback that redirects to the operator's default landing on cold load. We work around it in tests by clicking the sidebar, but it's a real UX bug for anyone bookmarking or sharing these URLs.
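As an illustration of the feature-flag ask, a test could decide whether to skip along these lines. The response shape of GET /api/test/flags (a flat map of flag name to boolean), the flag names and the helper name disabledFlags are all assumptions, since none of them are defined yet.

```typescript
// Assumed response shape for GET /api/test/flags: { "<flag>": true | false }.
type FlagMap = Record<string, boolean>;

// Hypothetical helper: returns the required flags that are off or missing.
function disabledFlags(flags: FlagMap, required: string[]): string[] {
  return required.filter((name) => flags[name] !== true);
}

// Example with made-up flag names.
const sampleFlags: FlagMap = { operatorInbox: true, operatorDeals: false };
const off = disabledFlags(sampleFlags, ["operatorInbox", "operatorDeals", "negotiations"]);
```

Inside a spec, the result can feed Playwright's conditional skip, for example test.skip(off.length > 0, `flags off: ${off.join(", ")}`), so a disabled surface shows up as skipped rather than failed.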
What we can do today (no eng dependencies)
Even without all of the above, our existing setup handles a lot:
- Single-role tests using auth/<role>.<env>.json storage state - we capture once interactively (via the debug Chrome's CDP attach) and Playwright reuses it for days.
- The npm run test:stage and npm run test:prod commands run all role projects sequentially with their own auth files.
- Mutating tests are scoped to [E2E] * names and self-clean.
The cliff we hit when staging gets new flows (negotiations, deals, multi-role interactions) is that one operator's actions need to be visible to one streamer in the same test run. Doable in Playwright (browser.newContext() × 2 in a single test), but data-dependent and brittle without the fixture/reset endpoints above.
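The two-context pattern mentioned above can be sketched as follows. The helper name withOperatorAndStreamer is hypothetical, and the small interfaces are stand-ins meant to be structurally compatible with Playwright's Browser, BrowserContext and Page, so that the sketch stays self-contained; treat that compatibility as an assumption to verify against the real types.

```typescript
// Minimal structural stand-ins for the Playwright objects used here.
interface PageLike {
  goto(url: string): Promise<unknown>;
}
interface ContextLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}
interface BrowserLike {
  newContext(options?: { storageState?: string }): Promise<ContextLike>;
}

// Opens one isolated context per role in the same test, runs the body
// with both pages, and always closes the contexts afterwards.
async function withOperatorAndStreamer<T>(
  browser: BrowserLike,
  env: string,
  body: (operator: PageLike, streamer: PageLike) => Promise<T>,
): Promise<T> {
  const contexts: ContextLike[] = [];
  try {
    const pages: PageLike[] = [];
    for (const role of ["operator", "streamer"]) {
      const context = await browser.newContext({
        storageState: `auth/${role}.${env}.json`,
      });
      contexts.push(context);
      pages.push(await context.newPage());
    }
    return await body(pages[0], pages[1]);
  } finally {
    for (const context of contexts) await context.close();
  }
}
```

In a spec this would be called with the browser fixture, for example withOperatorAndStreamer(browser, "stage", async (operator, streamer) => { ... }), with the operator page making an offer and the streamer page asserting that it appears. The data dependency remains: both accounts need to start from a known state for the assertion to be stable.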
Recommended path forward:
- Ship #1 (test-only auth endpoint) + #2 (fixture accounts) first - by our estimate this removes the cause of about 95% of CI flakiness.
- Add #3 (reset endpoints) in lockstep with #2's expanded fixture set - the new offer/negotiation/deal surfaces make cross-account state unavoidable, and scoped-naming cleanup can't reach into the streamer side from operator tests.
- Treat #4 (funded fixture) as a one-off manual setup; only revisit when on-chain test infra becomes worth it.