Autonomous agents must run tests before pushing.

Pattern:
- Layer 1 (unit): make test (pure logic, <1s)
- Layer 2 (integration): make test-sqlite (real DB, real handlers)
- Layer 3 (E2E): npx playwright test (browser tests)

tinqs-git already has upstream Gitea test infra (better than ariki-game). Gap: agents don't run them. This handoff fixes that.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Handoff: Testing Strategy from ariki-game
Date: 2026-05-22
From: ariki-game (Godot 4, Claude Opus 4.6)
To: tinqs-git (Gitea fork, Go + TypeScript)
Why: Autonomous agents were shipping broken code without verification. We built a 3-layer test system. Same principles apply here.
What We Learned (ariki-game)
Multiple Claude and DeepSeek V4 agents built features autonomously (terrain, UI, characters, building system) over several sessions. None ran tests. When we finally built integration tests, they immediately caught:
- 13 unit tests silently broken after a refactor
- 3 compile errors in UI files nobody noticed
- Terrain texture format mismatches
- Missing method stubs blocking reconnection UI
Key insight: Autonomous agents MUST run tests before pushing. The test infrastructure must be trivial to invoke — one command, clear pass/fail output.
Three Test Layers (applies to any repo)
Layer 1: Unit Tests — fast, pure logic, no I/O
- ariki-game: game.sh test → 323 xUnit tests, <1s
- tinqs-git equivalent: make test → Go unit tests
- Rule: Extract pure logic into testable functions. Don't mock; isolate.
Layer 2: Integration Tests — real runtime, real dependencies
- ariki-game: game.sh itest → loads real Godot scene tree, queries terrain, validates nodes
- tinqs-git equivalent: make test-sqlite → real DB, real HTTP handlers
- Rule: Use the real thing. Don't mock the database, don't mock the API.
- tinqs-git already has this in tests/integration/ (100+ Go integration tests)
Layer 3: E2E Tests — full stack, browser/client
- ariki-game: game.sh run + Agent API (/screenshot, /click, /scene)
- tinqs-git equivalent: npx playwright test → tests/e2e/ TypeScript tests
- Rule: Test what the user sees. Screenshots for visual regression.
What tinqs-git Already Has (better than us)
Your test infra is upstream Gitea — already mature:
# Unit tests (Go)
TAGS="bindata sqlite sqlite_unlock_notify" make test
# Integration tests (real SQLite DB, real HTTP)
TAGS="bindata sqlite sqlite_unlock_notify" make test-sqlite
# E2E tests (Playwright, headless browser)
npx playwright test
# Lint
make lint
This is ahead of ariki-game. We had to build our integration test runner from scratch. You have Gitea's entire test suite.
What to Add for Agent Workflows
1. Agent pre-push hook
Any agent (Claude Code, DeepSeek V4, Cursor) should run before pushing:
export TAGS="bindata sqlite sqlite_unlock_notify"
make lint && make test && make test-sqlite
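The same fail-fast chain can be expressed as a small Go program that prints an explicit PASS or FAIL line per step, which may be easier for an agent to parse than raw make output. This is a minimal sketch, not something tinqs-git ships: the names `step`, `runner`, `runGate`, and `execStep` are hypothetical, and it assumes `TAGS` is already exported in the environment, since child processes inherit it.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// step is one command in the pre-push gate (hypothetical type for this sketch).
type step struct {
	name string
	args []string
}

// runner executes a step and reports failure as an error.
// It is a parameter so the gate logic can be tested without running make.
type runner func(s step) error

// runGate runs the steps in order and stops at the first failure,
// mirroring a shell `&&` chain. It returns the name of the failed step,
// or "" if every step passed.
func runGate(steps []step, run runner) string {
	for _, s := range steps {
		if err := run(s); err != nil {
			fmt.Printf("FAIL %s: %v\n", s.name, err)
			return s.name
		}
		fmt.Printf("PASS %s\n", s.name)
	}
	return ""
}

// execStep runs the command for real and streams its output to the terminal.
func execStep(s step) error {
	cmd := exec.Command(s.args[0], s.args[1:]...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	steps := []step{
		{"lint", []string{"make", "lint"}},
		{"unit", []string{"make", "test"}},
		{"integration", []string{"make", "test-sqlite"}},
	}
	// A non-zero exit code lets a git hook or an agent block the push.
	if failed := runGate(steps, execStep); failed != "" {
		os.Exit(1)
	}
}
```

Injecting the runner keeps the ordering and stop-on-first-failure behavior checkable with a fake, which follows the same "extract pure logic" rule as Layer 1.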
2. Tinqs-specific integration tests
Gitea upstream tests cover Gitea features. Add tests for Tinqs customizations:
- Custom Git Studio branding/UI
- S3 LFS backend configuration
- Custom API endpoints (if any)
- Runner/CI workflow execution
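As an illustration of the "real handlers" rule applied to a Tinqs customization, the sketch below drives a handler through its real router using only the Go standard library. Everything specific here is an assumption: the `/api/tinqs/branding` path, `brandingHandler`, `newMux`, and `callBranding` are hypothetical names, and the sketch does not use Gitea's own integration harness in tests/integration/, which a real test should prefer.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// brandingHandler is a hypothetical Tinqs endpoint, used only for illustration.
func brandingHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(map[string]string{"product": "Git Studio"}); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

// newMux registers routes the way the server would, so the test
// exercises routing as well as the handler body.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/tinqs/branding", brandingHandler)
	return mux
}

// callBranding sends a GET request through the full handler chain and
// returns the status code plus the decoded "product" field.
// Non-JSON bodies (such as a 404 page) yield an empty product.
func callBranding(h http.Handler, path string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)

	var payload map[string]string
	_ = json.NewDecoder(rec.Body).Decode(&payload)
	return rec.Code, payload["product"]
}

func main() {
	code, product := callBranding(newMux(), "/api/tinqs/branding")
	fmt.Println(code, product)
}
```

The handler and router are the real ones; only the network socket is replaced by an in-memory recorder, so nothing about the request handling is mocked.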
3. One-command test runner
make tinqs-test # runs all three layers relevant to Tinqs changes
Pattern: Extracting Testable Logic
When you have logic embedded in HTTP handlers or templates, extract to pure functions:
// Before (untestable: logic embedded in the handler)
func handleAPI(w http.ResponseWriter, r *http.Request) {
    result := complexCalculation(r.FormValue("input"))
    // ...
}

// After (testable)
func ComplexCalculation(input string) Result { ... } // pure, exported

func handleAPI(w http.ResponseWriter, r *http.Request) {
    result := ComplexCalculation(r.FormValue("input"))
    // ...
}

// Test (assert is assumed to be testify's assert package; expected is a placeholder)
func TestComplexCalculation(t *testing.T) {
    got := ComplexCalculation("test")
    assert.Equal(t, expected, got)
}
We did this in ariki-game by extracting:
- ModeCycle.Next() from InputManager.CycleMode()
- GroundPlaneProjection.Project() from VillageBuilder.RaycastForPlacement()
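To make the extraction concrete, here is a rough Go analogue of the ModeCycle split. It is a sketch under stated assumptions, not the ariki-game code (which is not Go): the mode names and the `Mode` type are invented for illustration, and only the shape (a pure `Next()` that a stateful `CycleMode()` delegates to) reflects the extraction described above.

```go
package main

import "fmt"

// Mode is a hypothetical input mode; the concrete names are illustrative only.
type Mode int

const (
	ModeSelect Mode = iota
	ModeBuild
	ModeTerraform
	modeCount // sentinel: number of modes, not a real mode
)

// Next is the extracted pure function: no I/O and no shared state,
// so it can be unit-tested in isolation.
func (m Mode) Next() Mode {
	return (m + 1) % modeCount
}

// String returns a readable name for valid modes.
func (m Mode) String() string {
	names := [...]string{"select", "build", "terraform"}
	return names[m]
}

// InputManager keeps the state and delegates the decision to Next.
type InputManager struct {
	current Mode
}

// CycleMode advances the current mode; it stays a thin wrapper.
func (im *InputManager) CycleMode() {
	im.current = im.current.Next()
}

func main() {
	// Table-driven check of the pure function, including the wrap-around case.
	cases := []struct{ in, want Mode }{
		{ModeSelect, ModeBuild},
		{ModeBuild, ModeTerraform},
		{ModeTerraform, ModeSelect},
	}
	for _, c := range cases {
		status := "PASS"
		if got := c.in.Next(); got != c.want {
			status = "FAIL"
		}
		fmt.Printf("%s %v -> %v\n", status, c.in, c.want)
	}
}
```

The wrap-around case is the kind of edge that is awkward to reach through an input handler but trivial to assert once the logic is a pure function.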
Summary
| Repo | Unit | Integration | E2E | Agent cmd |
|---|---|---|---|---|
| ariki-game | game.sh test (323, <1s) | game.sh itest (33, ~30s) | Agent API | Custom |
| tinqs-git | make test | make test-sqlite | npx playwright test | Upstream Gitea |
tinqs-git is already in a better position. The gap is: agents don't run the tests. Fix that and you're ahead of ariki-game.