# Planning

Notes and roadmap for taco-bot.

## Product Vision

Taco is an autonomous meeting assistant. It has its own email address, gets invited to meetings like any team member, joins at the scheduled time, listens, takes notes, and answers questions when asked.

## Core Requirements

### 1. Calendar-driven auto-join
- Taco has a Google Workspace email (e.g. `taco@tinqs.com`)
- Team members invite Taco to a Google Calendar meeting like any other attendee
- **Any meeting Taco is invited to, Taco joins. No exceptions, no opt-in toggle.** Invite = join.
- Taco reads its calendar and joins the Google Meet automatically at the scheduled time
- No manual intervention — invite it and it shows up

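The "invite = join" rule above can be sketched as a pure selection step over calendar events. This is a minimal sketch, not the bot's actual code: the `CalendarEvent` shape, the `meetingsToJoin` name, and the two-minute lead window are all assumptions made for illustration.

```typescript
// Hypothetical, trimmed-down view of a calendar event (not the Google API type).
interface CalendarEvent {
  id: string;
  start: Date;
  end: Date;
  meetUrl?: string; // absent when the event has no Google Meet link
}

// Returns the events Taco should join right now: every invited event with a
// Meet link that starts within `leadMs` (or is already in progress), minus
// the ones a bot task has already been launched for.
function meetingsToJoin(
  events: CalendarEvent[],
  now: Date,
  alreadyJoined: Set<string>,
  leadMs: number = 2 * 60 * 1000, // assumed lead time, matches a 1-2 min poll
): CalendarEvent[] {
  return events.filter((e) => {
    if (!e.meetUrl || alreadyJoined.has(e.id)) return false;
    const startsSoon = e.start.getTime() - now.getTime() <= leadMs;
    const notOver = e.end.getTime() > now.getTime();
    return startsSoon && notOver;
  });
}
```

There is deliberately no opt-in check in the filter: being on the invite list is the only condition, which mirrors the requirement above.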
### 2. Live meeting listening
- Taco joins the meeting and captures live audio/captions throughout
- Builds a running transcript of the entire meeting
- Identifies speakers by name where possible

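One way to build the running transcript from caption snapshots is sketched below. It assumes that live captions grow in place while a person is speaking, so each observed snapshot either extends the last entry or starts a new one; that behaviour, and the `TranscriptEntry` and `appendCaption` names, are assumptions for illustration only.

```typescript
interface TranscriptEntry {
  speaker: string;
  text: string;
}

// Folds one caption snapshot into the transcript without duplicating text.
// Returns a new array, or the same array when nothing changed.
function appendCaption(
  transcript: TranscriptEntry[],
  speaker: string,
  text: string,
): TranscriptEntry[] {
  const trimmed = text.trim();
  if (trimmed === "") return transcript;
  const last = transcript[transcript.length - 1];
  if (last && last.speaker === speaker && trimmed.startsWith(last.text)) {
    // Same utterance, longer snapshot: replace the last entry.
    return [...transcript.slice(0, -1), { speaker, text: trimmed }];
  }
  if (last && last.speaker === speaker && last.text.startsWith(trimmed)) {
    // Stale or repeated snapshot: ignore it.
    return transcript;
  }
  return [...transcript, { speaker, text: trimmed }];
}
```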
### 3. Meeting notes
- Taco takes structured notes as the meeting progresses
- Captures key topics, decisions, action items, and questions raised
- Notes are available after the meeting ends

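The four note categories listed above map naturally onto a small data structure. The sketch below is one possible shape, with a Markdown renderer for the post-meeting output; `MeetingNotes`, its field names, and `renderNotes` are hypothetical and not taken from the codebase.

```typescript
interface MeetingNotes {
  title: string;
  topics: string[];
  decisions: string[];
  actionItems: { owner?: string; task: string }[];
  questions: string[];
}

// Renders notes as Markdown, skipping any section that has no items.
function renderNotes(notes: MeetingNotes): string {
  const section = (heading: string, items: string[]): string[] =>
    items.length === 0 ? [] : [`## ${heading}`, ...items.map((i) => `- ${i}`), ""];
  const actions = notes.actionItems.map((a) =>
    a.owner ? `${a.owner}: ${a.task}` : a.task,
  );
  return [
    `# ${notes.title}`,
    "",
    ...section("Key topics", notes.topics),
    ...section("Decisions", notes.decisions),
    ...section("Action items", actions),
    ...section("Questions raised", notes.questions),
  ]
    .join("\n")
    .trimEnd();
}
```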
### 4. "Hey Taco" wake word
- When someone says "Hey Taco", the bot listens for what follows
- **Question** → Taco answers in the Google Meet chat
- **Task/action item** → Taco acknowledges and logs it as a follow-up
- Context-aware: Taco uses the current meeting conversation to understand what's being asked and why

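The question/task split above could start from a simple text heuristic before any LLM is involved. The sketch below is an assumption-heavy illustration: the regular expression, the list of question openers, and the `parseWakeWord` name are invented here, and a real classifier would likely use the meeting context as the last bullet describes.

```typescript
interface TacoRequest {
  kind: "question" | "task";
  text: string;
}

// Matches "hey taco" (any case, optional comma) plus trailing punctuation.
const WAKE_WORD = /\bhey,?\s+taco\b[\s,.:!-]*/i;

// Extracts what follows the wake word and makes a rough question/task guess.
// Returns null when the wake word is absent or nothing follows it yet.
function parseWakeWord(caption: string): TacoRequest | null {
  const match = WAKE_WORD.exec(caption);
  if (!match) return null;
  const text = caption.slice(match.index + match[0].length).trim();
  if (text === "") return null;
  const looksLikeQuestion =
    text.endsWith("?") ||
    /^(what|when|where|who|why|how|which|is|are|do|does|did|can|could|should|will)\b/i.test(text);
  return { kind: looksLikeQuestion ? "question" : "task", text };
}
```

For example, `parseWakeWord("Hey Taco, what is the release date?")` yields a `question`, while a caption such as "Hey Taco remind Ozan to add the DNS records" falls through to `task`.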
### 5. Docs repo search
- Taco can read the `tinqs-ltd/docs` repo on Git Studio (and eventually other repos)
- When answering questions, Taco searches relevant docs for grounded answers
- Combines meeting context + docs knowledge for accurate responses
- **Migration note:** Docs search currently uses GitHub Code Search API — must be migrated to Gitea API (`git.arikigame.com`)

### 6. Chat responses
- All Taco responses go to the Google Meet chat (not audio)
- Prefixed with "🌮 Taco:" for visibility
- Concise, actionable answers

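The prefix and conciseness rules above can be enforced in one formatting step before a reply is posted. This is a minimal sketch; the `formatChatReply` name and the 500-character default cap are assumptions, not a documented chat limit.

```typescript
const TACO_PREFIX = "🌮 Taco:";

// Collapses whitespace, trims the answer to fit `maxLength` (prefix included),
// and prepends the visibility prefix.
function formatChatReply(answer: string, maxLength: number = 500): string {
  const collapsed = answer.replace(/\s+/g, " ").trim();
  const room = maxLength - TACO_PREFIX.length - 1; // space left after "prefix "
  const body =
    collapsed.length > room
      ? collapsed.slice(0, room - 1).trimEnd() + "…"
      : collapsed;
  return `${TACO_PREFIX} ${body}`;
}
```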
## Architecture — AWS

Everything runs on AWS. No Vercel. Dockerfile already exists.

### AWS Services

| Service | Role |
|---|---|
| **ECS Fargate** | Runs the bot container — both the API and the meeting joiner. Long-running tasks for meetings, always-on service for the API. |
| **S3** | Already in use. Store meeting transcripts, notes, action items. Cheap, durable, no expiry. |
| **EventBridge** | Cron scheduler. Fires every 1-2 min to trigger the API service's Google Calendar check, which launches meeting join tasks. |
| **ALB or API Gateway** | Routes HTTP traffic to the ECS service (API endpoints). |
| **ECR** | Docker image registry. CI pipeline already pushes here. |
| **CloudWatch** | Logs, monitoring, alerts. |

### Two container modes, one image

The same Docker image runs in two modes:

1. **API service** (always-on ECS service) — handles `/api/meeting/*` endpoints, Claude Q&A, transcript storage, calendar polling
2. **Meeting bot** (on-demand ECS task) — launched per meeting, runs headless Chrome, joins Google Meet, captures captions, posts to chat. Task stops when meeting ends.

EventBridge cron triggers the API service to check the calendar. When a meeting is upcoming, the API service launches a Fargate task for that meeting.

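One common way to get two modes out of one image is to have the entrypoint branch on an environment variable. The sketch below assumes hypothetical variables `TACO_MODE`, `PORT`, and `MEET_URL`; none of these names come from the existing configuration.

```typescript
type Mode =
  | { mode: "api"; port: number }
  | { mode: "meeting"; meetUrl: string };

// Decides which mode the container should run in, based on its environment.
// The env map is passed in (e.g. process.env) so the function stays testable.
function resolveMode(env: Record<string, string | undefined>): Mode {
  const mode = env["TACO_MODE"] ?? "api"; // assumed default: always-on API
  if (mode === "api") {
    return { mode: "api", port: Number(env["PORT"] ?? "8080") };
  }
  if (mode === "meeting") {
    const meetUrl = env["MEET_URL"];
    if (!meetUrl) throw new Error("MEET_URL is required in meeting mode");
    return { mode: "meeting", meetUrl };
  }
  throw new Error(`Unknown TACO_MODE: ${mode}`);
}
```

Failing fast on an unknown mode or a missing Meet URL keeps a misconfigured on-demand task from idling until it is stopped.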
### Meeting join approach

#### Option A: Headless browser bot (Puppeteer/Playwright) — start here
- Launches headless Chrome, navigates to the Meet URL, joins as `taco@tinqs.com`
- Captures captions via DOM observation (similar to current Chrome extension approach)
- Posts to Meet chat via DOM injection
- **Pros:** Proven pattern (current extension already does this), works with Google Meet as-is
- **Cons:** Brittle (DOM selectors break), resource-heavy (headless Chrome per meeting), Google may block bot-like joins

#### Option B: Google Meet REST API + Media API (if available)
- Google has been rolling out a Meet REST API for bots/companions
- Would allow joining programmatically without a browser
- **Pros:** Official, stable, no DOM scraping
- **Cons:** API access may require Google Workspace Enterprise, limited availability, still maturing

#### Option C: LiveKit replacement (longer-term)
- Replace Google Meet entirely with self-hosted LiveKit
- Bot joins as a native LiveKit Agents participant
- **Pros:** Full control, proper bot SDK, no DOM hacks, Tailscale-native
- **Cons:** Team has to switch away from Google Meet

### Recommended: Option A now, Option C later

### Flow

```
Google Calendar (invite taco@tinqs.com)
  → EventBridge cron (every 1-2 min)
    → API service checks calendar
      → Upcoming meeting found
        → Launches Fargate task for that meeting:
            1. Headless Chrome starts
            2. Signs in as taco@tinqs.com
            3. Joins Google Meet
            4. Captures captions → writes to S3 (or posts to API service)
            5. Listens for "Hey Taco" triggers
            6. On trigger → calls Claude with transcript context
            7. Posts answer to Meet chat
            8. Meeting ends → generates notes → writes to S3
            9. Task stops
```

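The "launches Fargate task for that meeting" step could be expressed as an ECS `RunTask` request that overrides the container's environment per meeting. The sketch below only builds the request object; the `MeetingTaskConfig` shape, the `buildRunTaskInput` name, and the `TACO_MODE`, `MEETING_ID`, and `MEET_URL` variables are assumptions. The resulting object is intended to be passed to `RunTaskCommand` from `@aws-sdk/client-ecs`.

```typescript
// Hypothetical deployment settings; real values would come from config.
interface MeetingTaskConfig {
  cluster: string;
  taskDefinition: string;
  containerName: string;
  subnets: string[];
  securityGroups: string[];
}

// Builds the input for one on-demand meeting bot task.
function buildRunTaskInput(
  config: MeetingTaskConfig,
  meetingId: string,
  meetUrl: string,
) {
  return {
    cluster: config.cluster,
    taskDefinition: config.taskDefinition,
    launchType: "FARGATE" as const,
    count: 1, // one task per meeting
    networkConfiguration: {
      awsvpcConfiguration: {
        subnets: config.subnets,
        securityGroups: config.securityGroups,
      },
    },
    overrides: {
      containerOverrides: [
        {
          name: config.containerName,
          // Per-meeting settings are injected as environment variables.
          environment: [
            { name: "TACO_MODE", value: "meeting" },
            { name: "MEETING_ID", value: meetingId },
            { name: "MEET_URL", value: meetUrl },
          ],
        },
      ],
    },
  };
}
```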
## Integration with Tailscale

- ECS tasks can join the tailnet via a Tailscale sidecar container or with Tailscale installed in the Docker image
- API service accessible over tailnet for internal admin/dashboard
- Public API endpoints exposed via ALB or API Gateway (for Google Calendar webhooks if used)
- Meeting bot tasks communicate with API service over tailnet or AWS internal networking

## Google Meet Alternatives Research (2026-04-07)

Researched 10 platforms for the longer-term self-hosted path. Full analysis below.

### Recommendation: LiveKit

- **Self-host on Tailscale:** Single Go binary or Docker image. Zero external dependencies.
- **Chat:** First-class data channels via `lk.chat` topics.
- **Bot API:** Agents SDK (Python/TS/Go) — bot joins as a participant, reads/sends chat. Maps 1:1 to taco-bot.
- **Transcription:** Built-in STT pipeline (Deepgram, Whisper).
- **License:** Apache 2.0
- **Migration benefit:** Replaces brittle DOM scraping with proper SDK. No more breaking when Google changes their UI.

### Also considered

| Platform | Verdict |
|---|---|
| **Matrix/Element Call** | Best bot ecosystem for chat, but complex deploy. Uses LiveKit under the hood. |
| **Galene** | Ultra-lightweight (15 MB), MIT. DIY bot integration via WebSocket. |
| **Jitsi Meet** | Mature but weak bot API. |
| **Nextcloud Talk** | Good if running Nextcloud. P2P for small calls. |
| **Whereby / Daily.co** | Cloud-only — eliminated |
| **BigBlueButton** | Too heavy, education-focused |
| **RustDesk** | Remote desktop, not conferencing. Ozan uses it for remote access. |

## GitHub → Git Studio Migration

All repos have moved from GitHub to Git Studio (`git.arikigame.com`). The codebase still has GitHub references that need updating:

### Code changes required

| File | What to change |
|---|---|
| `lib/docs-search.ts` | Swap GitHub Code Search API → Gitea search API (`git.arikigame.com/api/v1/repos/{owner}/{repo}/contents` or Gitea's topic/code search endpoints) |
| `.env.example`, `.env.local.example` | `GITHUB_TOKEN` → `GITEA_TOKEN`, `GITHUB_DOCS_REPO` → `GITEA_DOCS_REPO`, add `GITEA_TOKEN_NAME` |
| `README.md` | Remove GitHub legacy notice, update all references to Git Studio |
| `NEXT-STEPS.md` | Update env var docs, deployment instructions |
| `RUNBOOK.md` | Update env var table, deployment references |
| `.cursor/SOUL.md` | Update "GitHub Code Search" → "Gitea code search" |
| `.cursor/MEMORY.md` | Update migration status (it's done now) |
| `package.json` | `repository.url` → `https://git.arikigame.com/tinqs-ltd/taco-bot.git` |
| `.github/workflows/agentic-pipeline.yml` | Either migrate to Gitea Actions or remove if CI moves to Gitea |
| `scripts/security-check.sh` | Keep `github_pat_` pattern in secret scan (still useful), add Gitea token pattern |

### Gitea search API

Gitea provides these endpoints to replace GitHub Code Search:

- **Repo code search:** `GET /api/v1/repos/{owner}/{repo}/topics` returns the repo's topic labels rather than code matches, so content search is done via raw file access and local matching
- **Repo file contents:** `GET /api/v1/repos/{owner}/{repo}/contents/{filepath}` — read files directly
- **Repo search:** `GET /api/v1/repos/search?q={query}` — search across repos
- Token auth: `Authorization: token {GITEA_TOKEN}` header
- Token name tracked in `GITEA_TOKEN_NAME` (currently `Taco-bot`) for audit/rotation purposes

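The contents endpoint and token header listed above could be combined with a local text match to stand in for code search. This is a sketch under stated assumptions: `buildContentsRequest` and `findMatchingLines` are hypothetical helpers, not the contents of `lib/docs-search.ts`, and the actual HTTP call (for example with `fetch`) is left out so the example stays self-contained.

```typescript
interface GiteaRequest {
  url: string;
  headers: Record<string, string>;
}

// Builds the URL and headers for reading one file via the contents endpoint.
function buildContentsRequest(
  baseUrl: string,
  owner: string,
  repo: string,
  filepath: string,
  token: string,
  ref?: string, // optional branch, tag, or commit
): GiteaRequest {
  const path = filepath.split("/").map((part) => encodeURIComponent(part)).join("/");
  const query = ref ? `?ref=${encodeURIComponent(ref)}` : "";
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/api/v1/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/contents/${path}${query}`,
    headers: { Authorization: `token ${token}`, Accept: "application/json" },
  };
}

// Case-insensitive line search over already-fetched file text.
function findMatchingLines(
  text: string,
  query: string,
): { line: number; content: string }[] {
  const needle = query.toLowerCase();
  return text
    .split("\n")
    .map((content, i) => ({ line: i + 1, content }))
    .filter((entry) => entry.content.toLowerCase().includes(needle));
}
```

Keeping request construction separate from matching means the matching logic can be reused unchanged if a dedicated search endpoint is adopted later.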
### Deploy pipeline

CI builds Docker image → pushes to ECR → deploys to ECS. Existing `agentic-pipeline.yml` already has ECR push steps. Migrate triggers from GitHub Actions to Gitea Actions (or keep as manual/webhook-triggered).

## Backlog

### Phase 0: Migration cleanup
- [ ] Migrate `lib/docs-search.ts` from GitHub API → Gitea API
- [ ] Update env vars across all config files (`GITHUB_*` → `GITEA_*`)
- [ ] Update README, NEXT-STEPS, RUNBOOK — remove GitHub references
- [ ] Update `.cursor/` files (SOUL.md, MEMORY.md)
- [ ] Migrate CI from GitHub Actions to Gitea Actions (ECR push + ECS deploy)
- [ ] Update `package.json` repo URL

### Phase 1: Calendar + auto-join
- [ ] Set up `taco@tinqs.com` Google Workspace account
- [ ] Google Calendar API OAuth for reading Taco's calendar
- [ ] Calendar polling cron on EventBridge (or push notifications via webhook)
- [ ] Headless browser meeting joiner (Puppeteer + Google auth)

### Phase 2: AWS infrastructure + meeting bot core
- [ ] ECS Fargate cluster + task definitions (API service + meeting bot task)
- [ ] ALB or API Gateway for API endpoints
- [ ] EventBridge cron rule for calendar polling
- [ ] ECR repository (may already exist from CI pipeline)
- [ ] Caption capture in headless mode (port content.js logic)
- [ ] Chat injection in headless mode (port reply logic)
- [ ] Meeting notes generation (Claude summary at meeting end)
- [ ] Transcript + notes storage in S3

### Phase 3: LiveKit (longer-term)
- [ ] Evaluate LiveKit self-hosted on Tailscale (Ozan)
- [ ] Prototype taco-bot agent using LiveKit Agents SDK

## Decisions

- **2026-04-07:** All repos moved from GitHub to Git Studio (`git.arikigame.com`). Codebase migration pending.
- **2026-04-07:** LiveKit recommended as long-term Google Meet replacement. Ozan to evaluate.
- **2026-04-07:** Product vision defined — calendar-driven auto-join bot. Full AWS deployment (ECS Fargate + S3 + EventBridge). No Vercel.
- **2026-04-07:** App Runner deployed. ECR repo created. Service live at `ytre5eznhp.eu-west-1.awsapprunner.com`.
- **2026-04-07:** LLM switched to Mistral Ministral 8B on Bedrock for testing (`LLM_PROVIDER=bedrock`). ~$0.0005/question vs Claude's ~$0.02/question. Anthropic API key commented out locally as safety measure.
- **2026-04-07:** Domain `taco.arikigame.com` CNAME switched from Vercel to App Runner. ACM cert validation pending (Ozan to add DNS records in Cloudflare).
- **2026-04-07:** Chrome extension caption detection fixed — poll-based on `.ygicle.VbkSUe`. Debounce tuning ongoing.