Overview
Flotilla is mission control for automation agencies that run n8n for many clients. The best-practice setup is one isolated n8n instance per client, which sounds clean until you have 15 clients. Then you hand-deploy the same workflow 15 times, recreate the credentials 15 times, re-apply every fix 15 times, and you usually find out a client’s automation broke from an angry email rather than a dashboard.
Flotilla is the missing ops layer. It connects to the public REST API of instances the clients own (clean under n8n’s Sustainable Use License), deploys one golden workflow template across the whole fleet with per-client credential injection, detects drift when someone hand-edits production, rolls out updates in one click, sends failure alerts to Telegram before the client notices, and generates the monthly hours-saved report agencies attach to their retainer invoice.
Challenge
Agencies are the fastest-growing channel in the n8n ecosystem, and they all hit the same wall. The need for a fleet layer is documented in n8n’s top-voted feature request, open for years, and in the weekly “how do you manage multiple clients?” threads across the forum and agency communities. The one adjacent commercial tool stops at read-only monitoring: no deploy, no credentials, no drift. The hard part, and the whole point, is owning the write path safely.
That meant building four things that are easy to get wrong:
- A defensive n8n API client that survives version differences and missing endpoints across the fleet, and never aborts every client because one instance has a quirk.
- Drift detection that can tell a real hand-edit apart from per-client variables and credential identity. Without that, every legitimate per-client difference reads as a false alarm.
- Credential injection that maps template placeholders to real per-client credentials and verifies them, without Flotilla ever storing the secret.
- A clean multi-tenant data model with per-account isolation.
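The "never aborts every client" requirement in the first item can be sketched as a fan-out that records each failure and moves on. This is a minimal illustration, not Flotilla's actual client: `InstanceResult` and `run_across_fleet` are hypothetical names, and the action callback stands in for whatever API call is being made.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class InstanceResult:
    """Outcome of one operation against one client instance (hypothetical shape)."""
    instance: str
    ok: bool
    error: Optional[str] = None


def run_across_fleet(
    instances: list[str], action: Callable[[str], None]
) -> list[InstanceResult]:
    """Apply `action` to every instance, recording failures instead of raising."""
    results = []
    for name in instances:
        try:
            action(name)
            results.append(InstanceResult(name, ok=True))
        except Exception as exc:
            # One quirky instance must not stop the rest of the fleet.
            results.append(
                InstanceResult(name, ok=False, error=f"{type(exc).__name__}: {exc}")
            )
    return results
```

Catching broadly is deliberate here: a missing endpoint, a timeout, and a schema rejection all end up as one failed row in the per-instance results rather than as an aborted run.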
Solution
Canonical-hash drift detection (the core)
Drift compares the checksum of a canonical form of the live workflow, with volatile fields (ids, timestamps, node positions, webhook ids) stripped and credential identity masked, against the checksum of what was actually deployed to that client. Per-client variables and credentials do not count as drift; only a genuine edit does. The output is a clean JSON diff and a one-click re-sync.
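A minimal sketch of the canonical-hash idea, under stated assumptions: the set of volatile key names and the `<masked>` marker are illustrative, not Flotilla's actual list, and a real implementation would need to avoid stripping meaningful keys that happen to share a name such as `id` inside node parameters.

```python
import hashlib
import json

# Assumed key names for the volatile fields the text describes
# (ids, timestamps, node positions, webhook ids).
VOLATILE_KEYS = {"id", "createdAt", "updatedAt", "versionId", "position", "webhookId"}


def canonicalize(value, parent_key=None):
    """Return a copy with volatile keys dropped and credential identity masked."""
    if isinstance(value, dict):
        if parent_key == "credentials":
            # Keep which credential types a node uses, not which credential.
            return {cred_type: "<masked>" for cred_type in sorted(value)}
        return {
            key: canonicalize(item, key)
            for key, item in sorted(value.items())
            if key not in VOLATILE_KEYS
        }
    if isinstance(value, list):
        return [canonicalize(item) for item in value]
    return value


def canonical_hash(workflow: dict) -> str:
    """SHA-256 over a deterministic JSON encoding of the canonical form."""
    payload = json.dumps(
        canonicalize(workflow), sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because the hash is taken after per-client variables have been substituted, a client's own values are part of its baseline. Only a later change to the stored workflow moves the hash.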
Fleet deploy with credential injection
A golden template references credentials by placeholder (TPL_slack) and variables by {{VAR:key}}. Deploying to any subset of instances substitutes each client’s variables, maps or creates the credential on the remote through the n8n API, then verifies it. Results come back per instance, and nothing aborts the fleet if one client fails.
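The substitution step might look roughly like the following. It is a sketch under assumptions: it treats any string value beginning with `TPL_` as a credential placeholder and expects the caller to have already resolved each placeholder to a credential id on the target instance. The function name and the shape of the mapping are hypothetical.

```python
import re

VAR_PATTERN = re.compile(r"\{\{VAR:([A-Za-z0-9_]+)\}\}")


def render_template(value, variables: dict, credential_ids: dict):
    """Return a copy of the template with variables and credential placeholders filled.

    Raises KeyError when a variable or credential placeholder has no mapping,
    so an incomplete client config fails loudly instead of deploying a stub.
    """
    if isinstance(value, dict):
        return {k: render_template(v, variables, credential_ids) for k, v in value.items()}
    if isinstance(value, list):
        return [render_template(v, variables, credential_ids) for v in value]
    if isinstance(value, str):
        if value.startswith("TPL_"):
            # Assumption: the placeholder occupies the whole string value.
            return credential_ids[value]
        return VAR_PATTERN.sub(lambda m: str(variables[m.group(1)]), value)
    return value
```

For example, rendering a node whose channel parameter is `{{VAR:channel}}` with `{"channel": "#ops"}` yields `#ops`, and the original template is left untouched so it can be rendered again for the next client.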
Built to survive real n8n
Flotilla is verified end-to-end against a live self-hosted n8n, not just a friendly mock. Real n8n differs in ways Flotilla now handles automatically: webhook trigger nodes need a generated webhookId or their production URL returns 404 even while the workflow reports active; credential data is schema-validated; POST/PUT on workflows is strict about which fields it accepts; and n8n enriches a workflow on save, so drift compares the canonical form of what n8n actually stored, which keeps that enrichment from registering as a false positive.
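Two of those quirks, the webhookId requirement and the strict write payload, can be handled in one preparation step before a workflow is sent. The sketch below is illustrative only: the list of writable fields is an assumption to check against the API version each instance runs, and `prepare_for_write` is a hypothetical name.

```python
import copy
import uuid

# Assumption: the top-level fields the workflow create/update endpoints accept.
WRITABLE_FIELDS = ("name", "nodes", "connections", "settings", "staticData")
WEBHOOK_NODE_TYPE = "n8n-nodes-base.webhook"


def prepare_for_write(workflow: dict) -> dict:
    """Build a write-safe payload without mutating the input workflow."""
    # Drop read-only or server-managed fields such as id and timestamps.
    payload = {
        field: copy.deepcopy(workflow[field])
        for field in WRITABLE_FIELDS
        if field in workflow
    }
    for node in payload.get("nodes", []):
        # A webhook trigger without a webhookId can report active while
        # its production URL still returns 404, so generate one if absent.
        if node.get("type") == WEBHOOK_NODE_TYPE and not node.get("webhookId"):
            node["webhookId"] = str(uuid.uuid4())
    return payload
```

An existing webhookId is preserved, so re-deploying an update does not change a client's production webhook identity.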
Monitoring, alerts, and AI triage
Cursor-based polling of executions across every instance feeds a failures view and per-client stats. Failures fire Telegram and webhook alerts with cooldown dedup per workflow and instance. An AI triage step clusters failures by normalized error signature and writes a one-line summary of what to fix first, with a heuristic fallback so it still works with no API key.
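The signature normalization and the cooldown dedup described above can each be sketched in a few lines. The specific normalization rules, the 15-minute default, and the names `error_signature` and `AlertCooldown` are assumptions for illustration. The heuristic fallback in Flotilla may group errors differently.

```python
import re
import time

UUID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")


def error_signature(message: str) -> str:
    """Collapse variable parts of an error message so similar failures cluster."""
    sig = message.lower()
    sig = UUID_RE.sub("<uuid>", sig)
    sig = re.sub(r"https?://\S+", "<url>", sig)
    sig = re.sub(r"\d+", "<n>", sig)
    return re.sub(r"\s+", " ", sig).strip()


class AlertCooldown:
    """Suppress repeat alerts for the same (instance, workflow) within a window."""

    def __init__(self, cooldown_seconds: float = 900.0, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock  # injectable so the window can be tested
        self._last_sent: dict[tuple[str, str], float] = {}

    def should_alert(self, instance: str, workflow_id: str) -> bool:
        key = (instance, workflow_id)
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._last_sent[key] = now
        return True
```

Keying the cooldown on instance and workflow together means a failure in one client's workflow never silences the same workflow failing for another client.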
The report that defends the retainer
A printable per-client report (runs, success rate, estimated hours saved, incidents caught, automations under management) attaches straight to the monthly invoice. This is the artifact that turns invisible maintenance into something a client can see.
License posture
Flotilla never hosts, bundles, or redistributes n8n. It only calls the public API of instances the customers own, which sits outside n8n’s Sustainable Use License restrictions, and it actively encourages the compliant instance-per-client setup.
Results
- A working v0.1 MVP, runnable today, covering fleet deploy, credential injection, drift diff and re-sync, monitoring, alerts, AI triage, and client ROI reports.
- 75-plus automated end-to-end checks passing against three mock n8n instances, plus an idempotent, self-cleaning smoke test that runs the whole loop against a real n8n and deletes everything it created (safe to point at production).
- Multi-tenant with per-account data isolation. Instance API keys are encrypted at rest with Fernet and never shown again, and credential secrets pass straight through to the instance, never persisted in Flotilla.
Honest limitations (v0.1, by design)
- OAuth2 credentials are created as shells, because the consent step is interactive. API-key and token credentials inject fully; OAuth ones need a manual finish.
- Executions are polled (30 seconds by default), not streamed.
- There is no RBAC within an account and no billing yet. A white-label client portal is on the roadmap.
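For the polled executions mentioned above, one poll cycle per instance might look like the sketch below. It assumes executions come back newest first, that ids are numeric, and that each page carries a next-page cursor. `fetch_page` is a hypothetical callable wrapping the instance's executions endpoint, not a real client method.

```python
from typing import Callable, Optional

# (executions on this page, cursor for the next page or None)
Page = tuple[list[dict], Optional[str]]


def poll_new_executions(
    fetch_page: Callable[[Optional[str]], Page], last_seen_id: int
) -> tuple[list[dict], int]:
    """Collect executions newer than `last_seen_id`; return them oldest first."""
    fresh: list[dict] = []
    cursor: Optional[str] = None
    done = False
    while not done:
        executions, cursor = fetch_page(cursor)
        for execution in executions:
            if int(execution["id"]) <= last_seen_id:
                done = True  # reached something already processed
                break
            fresh.append(execution)
        if cursor is None:
            done = True  # no more pages
    new_last_seen = max([last_seen_id] + [int(e["id"]) for e in fresh])
    fresh.reverse()
    return fresh, new_last_seen
```

Storing the returned id per instance keeps each cycle cheap: a quiet instance costs one page request, and a busy one is read only back to the point the previous poll reached.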
Why these choices
- FastAPI with server-rendered Jinja2, no SPA. The UI is operational, not flashy, and the whole thing ships as one deployable.
- Sync SQLAlchemy 2.x, SQLite for dev and Postgres for production via a single environment variable.
- Fernet for instance-key encryption. Secrets are passed through to the instance, never stored.
- APScheduler for the poll, drift, and health jobs.
- Optional Claude Haiku for the triage summary, with a heuristic fallback so the feature never depends on an API key being present.