What we need from engineering to run full E2E tests on staging

Right now we can already run streamer + operator tests side-by-side. The plumbing exists:

playwright.config.ts declares one project per role (public, streamer, operator)
Each role project loads its own auth/<role>.<env>.json storage state
npm run test:stage runs all of them in one process; workers: 1 keeps them from racing on shared accounts
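
A minimal sketch of how that per-role layout could be expressed. The helper names are hypothetical, the `stage`/`prod` values for `<env>` are inferred from the npm script names, and it assumes the `public` project runs without stored auth. The resulting object has the shape `playwright.config.ts` would pass to `defineConfig()`.

```typescript
// Hypothetical helpers; names are illustrative, not taken from the repo.
type Role = "public" | "streamer" | "operator";
type Env = "stage" | "prod"; // assumption: matches test:stage / test:prod

interface RoleProject {
  name: Role;
  use: { storageState?: string };
}

// Mirrors the auth/<role>.<env>.json naming convention.
function storageStatePath(role: Role, env: Env): string {
  return `auth/${role}.${env}.json`;
}

// One project per role. Assumption: "public" needs no storage state.
function roleProjects(env: Env): RoleProject[] {
  const roles: Role[] = ["public", "streamer", "operator"];
  return roles.map((role): RoleProject => ({
    name: role,
    use: role === "public" ? {} : { storageState: storageStatePath(role, env) },
  }));
}

// Shape of the object that would be handed to defineConfig().
const stageConfig = {
  workers: 1, // serialise so projects do not race on shared accounts
  projects: roleProjects("stage"),
};
```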

What's holding us back from a clean, hands-off, repeatable run on every staging deploy is auth + fixtures. Here's the prioritised ask.

1. Test-only authentication endpoint (the unblocker)

Without this we either rely on storage-state files that expire (manual re-capture every few days), or we drive Kick OAuth + email PIN in CI (brittle and slow). With it, every test starts authenticated in <50 ms.

Contract proposal:

POST /api/test/session
Headers:
  X-Test-Secret: <rotating shared secret; CI/local env only>
Body:
  { "as": "streamer" | "operator", "fixture": "e2e-streamer-verified" }
Response 200:
  Set-Cookie: <same session cookie a real login produces>
  { "userId": "...", "role": "streamer" }
Response 401: bad secret
Response 403: endpoint disabled in this build
Response 404: unknown fixture
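
From the test side, the proposed contract could be consumed roughly as follows. This is a sketch only: the helper names are hypothetical, the JSON `Content-Type` header is an assumption, and only the path, the `X-Test-Secret` header, the body fields and the status codes come from the proposal above.

```typescript
// Hypothetical client-side helpers for the proposed endpoint.
type TestRole = "streamer" | "operator";

interface SessionRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildSessionRequest(
  baseUrl: string,
  secret: string,
  role: TestRole,
  fixture: string,
): SessionRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/test/session`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json", // assumption: JSON request body
        "X-Test-Secret": secret,
      },
      body: JSON.stringify({ as: role, fixture }),
    },
  };
}

type SessionOutcome =
  | "ok"
  | "bad-secret"
  | "endpoint-disabled"
  | "unknown-fixture"
  | "unexpected";

// Maps the status codes listed in the contract to a readable outcome.
function interpretStatus(status: number): SessionOutcome {
  switch (status) {
    case 200: return "ok";
    case 401: return "bad-secret";
    case 403: return "endpoint-disabled";
    case 404: return "unknown-fixture";
    default: return "unexpected";
  }
}
```

A caller would pass `req.url` and `req.init` to `fetch`, then fail fast with a clear message when `interpretStatus` returns anything other than `"ok"`.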

Security mitigations to ship together (all three, not just one):

Build-time exclusion - wrap the route in a build flag so the production binary literally doesn't contain it. Env vars alone are too easy to misconfigure.
Fixture whitelist - endpoint only accepts a predefined list of fixture user IDs. Refuses arbitrary emails; refuses minting an admin/operator role unless the fixture has that role.
Loud logging + alerts - every call logs WARN: test session minted for <fixture> from <ip>, and any call that does not come from CI or the test secret's holder raises an alert.

The companies that got burned by test backdoors had one of these, not three. We need all three.
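
To make the layering concrete, here is a sketch of the decision logic a server-side guard might apply. Everything in it is hypothetical: the type and function names are invented, `enabledInBuild` is only a runtime stand-in for real build-time exclusion (where the route would not exist at all), and a real implementation would compare secrets in constant time. The contract does not say which status a role mismatch should return, so the sketch reports a reason rather than a status code.

```typescript
// Hypothetical guard logic; not the actual server implementation.
type Role = "streamer" | "operator";
type DenyReason = "disabled" | "bad-secret" | "unknown-fixture" | "role-mismatch";

interface GuardConfig {
  enabledInBuild: boolean;             // stand-in for the build flag
  secret: string;                      // rotating shared secret
  fixtures: ReadonlyMap<string, Role>; // whitelist: fixture ID -> role
}

interface MintRequest {
  secret: string;
  as: Role;
  fixture: string;
  ip: string;
}

type MintDecision =
  | { ok: true; role: Role; log: string }
  | { ok: false; reason: DenyReason };

function authorizeMint(config: GuardConfig, req: MintRequest): MintDecision {
  // Contract: 403 when the endpoint is disabled in this build.
  if (!config.enabledInBuild) return { ok: false, reason: "disabled" };
  // Contract: 401 on a bad secret. An empty configured secret never matches.
  if (config.secret === "" || req.secret !== config.secret) {
    return { ok: false, reason: "bad-secret" };
  }
  // Contract: 404 for a fixture that is not on the whitelist.
  const role = config.fixtures.get(req.fixture);
  if (role === undefined) return { ok: false, reason: "unknown-fixture" };
  // Refuse to mint a role the fixture does not have.
  if (role !== req.as) return { ok: false, reason: "role-mismatch" };
  return {
    ok: true,
    role,
    log: `WARN: test session minted for ${req.fixture} from ${req.ip}`,
  };
}
```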

2. Fixture accounts on staging

A small set of dedicated test users, never used for real interaction:

Fixture	Role	State
`e2e-streamer-fresh`	streamer	Just signed up, KYC not started, no streams, no price-per-stream set
`e2e-streamer-verified`	streamer	KYC approved, price-per-stream set, no applications yet
`e2e-streamer-applied`	streamer	Has an existing application/offer on `e2e-campaign-funded` for negotiation tests
`e2e-operator-fresh`	operator	No campaigns
`e2e-operator-with-funded`	operator	Has one funded campaign (`e2e-campaign-funded`) that persists across runs - needed for F-04 funded-edit tests, OP-06, OP-14/15/16
`e2e-operator-with-deal-active`	operator	Has a deal in the `active` state with a streamer fixture - exercises `/operator/deals` Active tab
`e2e-operator-with-deal-completed`	operator	Has a deal in the `completed` state - exercises `/operator/deals` Completed tab
`e2e-operator-with-deal-refunded`	operator	Has a deal in the `refunded` state - exercises `/operator/deals` Refunded tab
`e2e-operator-with-deal-cancelled`	operator	Has a deal in the `cancelled` state - exercises `/operator/deals` Cancelled tab
`e2e-operator-with-pending-offer`	operator	Has a pending offer/application from a streamer fixture - exercises `/operator/offers` and `/operator/inbox` populated states

Staging data should be reset-friendly and isolated - i.e. never promoted to prod, no real money, and no Kick OAuth that hits real Kick (use a fake-Kick test endpoint or seed the OAuth grants directly).

3. Reset endpoints

For idempotent test runs we need a way to wipe a fixture's mutable state between runs:

POST /api/test/streamer/<id>/reset
  → removes applications, offers, deals, featured videos, language selections; KYC kept
POST /api/test/operator/<id>/reset
  → removes draft campaigns; keeps the persistent funded fixture campaign
POST /api/test/campaign/<id>/reset
  → returns a campaign to a known state (refund + redraft, or reset application list)

Gated by the same X-Test-Secret + build-flag setup as #1.
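
On the test side, a setup hook could assemble the reset calls like this. The helper names are hypothetical; only the three paths and the `X-Test-Secret` header come from the proposal.

```typescript
// Hypothetical helpers for building reset calls before a test run.
type ResetKind = "streamer" | "operator" | "campaign";

interface ResetCall {
  method: "POST";
  path: string;
  headers: Record<string, string>;
}

// Builds /api/test/<kind>/<id>/reset, escaping the ID defensively.
function resetPath(kind: ResetKind, id: string): string {
  return `/api/test/${kind}/${encodeURIComponent(id)}/reset`;
}

// Turns a list of (kind, id) targets into ready-to-send call descriptions.
function resetPlan(secret: string, targets: Array<[ResetKind, string]>): ResetCall[] {
  return targets.map(([kind, id]): ResetCall => ({
    method: "POST",
    path: resetPath(kind, id),
    headers: { "X-Test-Secret": secret },
  }));
}
```

A cross-account test would typically reset both sides before it starts, for example the streamer fixture and the campaign it applies to, so neither run inherits state from the last one.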

Today our tests do best-effort cleanup (the [E2E] OP-08 <timestamp> naming + scavenger setup), but a hard reset is much more reliable and lets us drop the workers: 1 serialisation.

Priority bump (June 2026): the new staging shipped /operator/inbox, /operator/offers, /operator/negotiations, /operator/deals - these surfaces are cross-account by definition (an operator's offer is a streamer's application). Once we start exercising those flows end-to-end, scoped-naming cleanup stops scaling: a test that accepts an offer creates state on the streamer account too, and the streamer fixture can't be scavenged from the operator side. Reset endpoints move from "nice to have" to "needed before this lane goes green in CI."

4. Persistent funded fixture campaign

Funding a campaign requires a real on-chain USDC deposit signed by MetaMask. We can't do that in CI without:

A test wallet pre-loaded with test USDC on staging's chain
MetaMask in the Playwright browser (via Synpress or similar)
Automated signature acceptance

The easiest unblock: engineering creates and funds one campaign manually on the e2e-operator-with-funded fixture, and we treat it as a read-only fixture that persists across runs. Then OP-14/15/16 and the F-04 repro tests start running automatically. No MetaMask in CI needed.

5. Smaller asks that compound

Feature flags exposed via a debug endpoint (GET /api/test/flags) so tests can skip when a flag is off rather than fail.
A clean way to assert "this email is the most recent OTP" for operator email-PIN login. Mailosaur/MailSlurp on a dedicated domain would do it.
Stable launch-date config - F-14 was caused by a hardcoded banner date; if the date comes from SiteLaunchContext already, we just need confirmation so we can rely on it.
Webhook simulator for Kick stream events. Without it we can't deterministically test the streamer's "live → ended" state transitions (F-11).
SPA hard-load routing: a direct goto('/operator/inbox') (and same for /offers, /negotiations, /deals) lands on /operator/campaigns instead of the requested route. The pages render fine via client-side sidebar click, so it looks like an SSR / hydration fallback that redirects to the operator's default landing on cold load. We work around it in tests by clicking the sidebar, but it's a real UX bug for anyone bookmarking or sharing these URLs.
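
For the feature-flag ask in the list above, the skip logic on the test side could be as small as the sketch below. It assumes `GET /api/test/flags` returns a flat JSON object of flag name to boolean, which the proposal does not specify; the flag names and the helper name are hypothetical.

```typescript
// Assumed response shape of GET /api/test/flags: { "<flag>": true | false }.
type Flags = Record<string, boolean>;

// Returns the required flags that are off or absent.
function missingFlags(flags: Flags, required: string[]): string[] {
  return required.filter((name) => flags[name] !== true);
}
```

Inside a Playwright test this would feed `test.skip(missing.length > 0, ...)`, so a disabled flag shows up as a skip with a reason rather than a failure.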

What we can do today (no eng dependencies)

Even without all of the above, our existing setup handles a lot:

Single-role tests using auth/<role>.<env>.json storage state - we capture it once interactively (via the debug Chrome's CDP attach) and Playwright reuses it for days.
The npm run test:stage and npm run test:prod commands run all role projects sequentially with their own auth files.
Mutating tests are scoped to [E2E] * names and self-clean.

The cliff we hit when staging gets new flows (negotiations, deals, multi-role interactions) is that one operator's actions need to be visible to one streamer in the same test run. Doable in Playwright (browser.newContext() × 2 in a single test), but data-dependent and brittle without the fixture/reset endpoints above.
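
The two-context pattern could be wrapped in a helper along these lines. This is a sketch under assumptions: the helper name is hypothetical, the storage-state paths follow the auth/<role>.<env>.json convention, and the small interfaces stand in for Playwright's `Browser` and `BrowserContext` (which are expected to fit this shape) so the block has no dependencies.

```typescript
// Minimal structural stand-ins for Playwright's BrowserContext and Browser.
interface ContextLike<P> {
  newPage(): Promise<P>;
  close(): Promise<unknown>;
}

interface BrowserLike<P> {
  newContext(options: { storageState: string }): Promise<ContextLike<P>>;
}

// Opens one context per role, runs the test body, and always closes both.
async function withOperatorAndStreamer<P, T>(
  browser: BrowserLike<P>,
  env: string,
  body: (operator: P, streamer: P) => Promise<T>,
): Promise<T> {
  const operatorCtx = await browser.newContext({
    storageState: `auth/operator.${env}.json`,
  });
  try {
    const streamerCtx = await browser.newContext({
      storageState: `auth/streamer.${env}.json`,
    });
    try {
      const operatorPage = await operatorCtx.newPage();
      const streamerPage = await streamerCtx.newPage();
      return await body(operatorPage, streamerPage);
    } finally {
      await streamerCtx.close();
    }
  } finally {
    await operatorCtx.close();
  }
}
```

The helper only solves the plumbing. What the streamer page actually sees still depends on the state the operator left behind, which is why the fixture and reset endpoints matter for these flows.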

Recommended path forward:

Ship #1 (test-only auth endpoint) + #2 (fixture accounts) first - this eliminates 95% of CI flakiness.
Add #3 (reset endpoints) in lockstep with #2's expanded fixture set - the new offer/negotiation/deal surfaces make cross-account state unavoidable, and scoped-naming cleanup can't reach into the streamer side from operator tests.
Treat #4 (funded fixture) as a one-off manual setup; only revisit when on-chain test infra becomes worth it.
