Open-Source AI Playbook

Learn to Build AI Tools Like This

Every tool on this site was built by one person using multi-model AI agents. Here's exactly how.

Three ideas behind every tool

Understand these patterns and you can build anything on this site yourself.

⚔️

Multi-Agent Debates

Multiple AI models debate a question simultaneously from different expert angles. The disagreement often surfaces blind spots and trade-offs that a single model would miss.

🔀

Model Routing

Different models excel at different tasks. DeepSeek for parallel experts (cheap + smart). Gemini 2.5 Pro for synthesis (thinking mode). o3 for financial logic. Groq for speed.

🏛️

The Consilium Pattern

Experts debate → Synthesis model reads all views → Structured verdict. This pattern powers every tool here.
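The three stages above can be sketched in a few lines. This is a minimal illustration, not the site's actual engine: `ask_expert` and `ask_synthesis` are hypothetical callables standing in for real model API calls, and the experts run one after another here to keep the flow easy to read.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that takes a prompt and returns text.
# In a real tool this would wrap a model API call.
AskFn = Callable[[str], str]


def build_synthesis_prompt(question: str, opinions: Dict[str, str]) -> str:
    """Combine every expert opinion into one prompt for the synthesis model."""
    views = "\n\n".join(f"[{role}]\n{text}" for role, text in opinions.items())
    return (
        f"Question: {question}\n\n"
        f"Expert positions:\n{views}\n\n"
        "Weigh the positions above and return a structured verdict: "
        "decision, key risks, and confidence."
    )


def run_consilium(question: str, roles: List[str],
                  ask_expert: AskFn, ask_synthesis: AskFn) -> str:
    # Stage 1: each expert answers from its own angle.
    opinions = {
        role: ask_expert(f"You are an expert {role}. {question}")
        for role in roles
    }
    # Stage 2: the synthesis model reads all views in a single prompt.
    # Stage 3: its reply is treated as the verdict.
    return ask_synthesis(build_synthesis_prompt(question, opinions))
```

Passing the model calls in as plain functions keeps the pattern independent of any provider, so you can swap models per stage without touching the flow.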

From zero to your first consilium

Four steps to run a working multi-agent debate engine on your machine.

1. Install dependencies

Four packages. That's all you need to get started.

pip install httpx python-dotenv groq openai

2. Set API keys

Groq and Google offer free tiers. DeepSeek is paid but very cheap, which makes it a good default for expert agents.

# .env
DEEPSEEK_API_KEY=your-key   # deepseek.com (very cheap)
GROQ_API_KEY=your-key       # console.groq.com (free tier)
GOOGLE_API_KEY=your-key     # ai.google.dev (free tier)
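A quick sanity check before you run anything can save a confusing stack trace. The sketch below assumes the three key names from the `.env` file above; `missing_keys` is a hypothetical helper, not part of any library.

```python
import os

# Key names taken from the .env example above.
REQUIRED_KEYS = ["DEEPSEEK_API_KEY", "GROQ_API_KEY", "GOOGLE_API_KEY"]


def missing_keys(env=None):
    """Return the required keys that are absent or empty.

    `env` defaults to the process environment; pass a dict to test it.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Call `missing_keys()` right after `load_dotenv()` and stop early if the list is not empty.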

3. Run your first consilium

Copy this snippet. Pass your business question. Get a structured debate from 5 parallel experts.

import asyncio
import os

import httpx
from dotenv import load_dotenv

load_dotenv()  # read API keys from .env

EXPERTS = ["CTO", "CFO", "CMO", "COO", "Risk Officer"]
QUESTION = "Should we expand to the LatAm market next quarter?"


async def ask_expert(role, question):
    # One chat-completion request per expert role.
    async with httpx.AsyncClient(timeout=30) as c:
        r = await c.post(
            "https://api.deepseek.com/chat/completions",
            headers={"Authorization": f"Bearer {os.environ['DEEPSEEK_API_KEY']}"},
            json={
                "model": "deepseek-chat",
                "messages": [
                    {"role": "system", "content": f"You are an expert {role}. Be direct."},
                    {"role": "user", "content": question},
                ],
            },
        )
        r.raise_for_status()  # surface HTTP errors instead of a KeyError below
        return role, r.json()["choices"][0]["message"]["content"]


async def main():
    # Send all expert requests concurrently and wait for every reply.
    results = await asyncio.gather(*[ask_expert(e, QUESTION) for e in EXPERTS])
    for role, opinion in results:
        print(f"\n[{role}]\n{opinion}")


asyncio.run(main())
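With a plain `asyncio.gather`, one failed request raises and you lose the other answers. A hedged variation is to collect failures separately with `return_exceptions=True`. In this sketch `ask` is a hypothetical coroutine that takes `(role, question)` and returns the opinion text, which is a slightly different shape from `ask_expert` above.

```python
import asyncio


async def gather_opinions(ask, roles, question):
    """Query all roles concurrently; keep successes, record failures."""
    results = await asyncio.gather(
        *(ask(role, question) for role in roles),
        return_exceptions=True,  # exceptions come back as values, not raised
    )
    opinions, failures = {}, {}
    for role, result in zip(roles, results):
        if isinstance(result, Exception):
            failures[role] = repr(result)
        else:
            opinions[role] = result
    return opinions, failures
```

Results come back in the same order as `roles`, which is why a simple `zip` is enough to pair them up.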

4. Customize expert roles for your domain

Replace EXPERTS with roles relevant to your business: Legal, Security, UX, Sales, DevOps — whatever your decision needs. Fork the full debate-engine on GitHub for synthesis, rounds, and structured reports.
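One way to go beyond a bare role name is to attach a short brief to each role. The sketch below is an assumption about how you might structure this, not the debate-engine's actual code; the brief texts and the `build_messages` helper are made up for illustration.

```python
# Hypothetical briefs: one focus sentence per custom role.
ROLE_BRIEFS = {
    "Legal": "Focus on regulatory exposure and contract risk.",
    "Security": "Focus on attack surface and data protection.",
    "Sales": "Focus on pipeline impact and customer objections.",
}


def build_messages(role, question):
    """Build the chat messages for one expert, adding its brief if defined."""
    brief = ROLE_BRIEFS.get(role, "")
    system = f"You are an expert {role}. Be direct. {brief}".strip()
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

Roles without a brief fall back to the same system prompt the quick-start snippet uses.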

Tool-by-Tool Guides

Click any tool to see what it does, when to use it, the key code pattern, and an example output.

🏗️ Virtual CTO Architecture
Routes your tech question through 5 AI experts: CTO, Security, DevOps, Frontend, Backend — then synthesizes a verdict.
When to use: architecture trade-offs, technology choices, and scaling decisions before you hire a CTO.
# Submit your tech question: 5 experts debate in parallel
EXPERTS = ["CTO", "Security Engineer", "DevOps", "Frontend", "Backend"]
question = "Should we use microservices or monolith for our MVP?"
# → 5 experts debate → Gemini 2.5 Pro synthesizes the verdict

Example Output

[CTO] Start monolith. Ship faster, split later when you have real load data...
[Security] Monolith easier to secure at MVP stage. Microservices attack surface is large...
[DevOps] Microservices = 3x ops overhead. Only worth it above 50k req/day...

SYNTHESIS: Unanimous lean toward monolith for MVP. Revisit at 10k daily active users.
🧠 CEO Coach Bot Strategy
A Telegram bot powered by DeepSeek Reasoner that thinks through leadership challenges: hiring, strategy, conflict, culture.
When to use: difficult people decisions, strategic pivots, weekly prioritization, morale issues.
# How to use
# 1. /start in Telegram → describe your leadership challenge
# 2. DeepSeek Reasoner (R1) thinks step-by-step
# 3. Receive structured coaching advice + action items
# Model: deepseek-reasoner (chain-of-thought reasoning)

Example Output

Challenge: "My best engineer keeps missing deadlines but produces great work eventually."

Coach: This is a deadline-setting problem, not a performance problem. Three levers: (1) involve them in estimation, (2) break milestones smaller, (3) have a direct conversation about what "deadline" means to them vs you.
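If you build a bot like this yourself, you will probably want to send users the final advice and keep the step-by-step thinking for logs. The sketch below assumes the response carries the chain of thought in a `reasoning_content` field next to `content`, as DeepSeek's API documentation describes for `deepseek-reasoner`; check the current docs before relying on it. `split_reasoning` is a hypothetical helper.

```python
def split_reasoning(response: dict):
    """Separate the model's reasoning trace from its final answer.

    Assumes an OpenAI-style chat completion payload. The
    `reasoning_content` field is treated as optional.
    """
    message = response["choices"][0]["message"]
    reasoning = message.get("reasoning_content") or ""
    return reasoning, message["content"]
```

Only the second value would go back to the Telegram user; the first is useful for debugging how the coach arrived at its advice.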
📊 Financial Audit Bot Finance
Upload your P&L or KPI spreadsheet — get KPI analysis, anomaly detection, and runway calculation via OpenAI o3.
When to use: monthly reviews, investor prep, catching unexpected cost trends.
# Pattern: parse spreadsheet → o3 → structured verdict
# parse_spreadsheet is a placeholder for your own loader
data = parse_spreadsheet("p_and_l_q4.xlsx")
prompt = f"""Analyze this P&L. Return:
1. Top 3 anomalies vs prior period
2. Runway risk if trend continues
3. One actionable recommendation
Data: {data}"""
# model="o3": best for structured financial reasoning

Example Output

ANOMALY 1: COGS +18% while revenue grew only 6%: margin compression risk
ANOMALY 2: Marketing spend +31% with no corresponding lead volume increase
ANOMALY 3: Accounts receivable aging beyond 90d increased 2.3x
RUNWAY: ~8 months at current burn trajectory
RECOMMENDATION: Audit Q3 marketing contracts for vendor lock-in
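The numbers in a report like this come down to simple arithmetic, and it can be worth computing them yourself to cross-check the model. A minimal sketch, with hypothetical helper names and made-up figures:

```python
def pct_change(prior: float, current: float) -> float:
    """Period-over-period change in percent."""
    if prior == 0:
        raise ValueError("prior period value must be non-zero")
    return (current - prior) / prior * 100


def runway_months(cash: float, monthly_burn: float) -> float:
    """Months of cash left at the current net burn rate."""
    if monthly_burn <= 0:
        return float("inf")  # not burning cash, so no runway limit
    return cash / monthly_burn


# Made-up figures for illustration only.
cogs_change = pct_change(prior=100_000, current=118_000)
months_left = runway_months(cash=800_000, monthly_burn=100_000)
print(f"COGS change: {cogs_change:.0f}%  Runway: {months_left:.0f} months")
```

Feeding precomputed deltas into the prompt alongside the raw data gives the model less room to get the arithmetic wrong.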
👥 Hiring Assistant HR
Paste job description + CVs → 4 expert evaluators (HR, Technical, Culture Fit, Risk) score each candidate and return a ranked shortlist.
When to use: screening 10+ applicants, hiring for senior roles, reducing bias.
# 4 evaluators score each candidate independently
EVALUATORS = ["HR Specialist", "Technical Lead", "Culture Fit Evaluator", "Risk Assessor"]
# Each evaluator: score 1-10 on their dimension
# Final rank = weighted average across all 4

Example Output

CANDIDATE: Alexei M. | TOTAL: 8.4/10
HR: 9/10 — Strong progression, clear motivation
Technical: 8/10 — 5y Python, no async experience yet
Culture: 8/10 — Startup background matches pace
Risk: 8/10 — Clean exit from current role
RECOMMENDATION: Strong hire. Address async gap in onboarding.
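The weighted average behind the total can be sketched as follows. The actual weights are not stated here, so the values below are an assumption: weighting HR at 0.4 and the other three at 0.2 happens to reproduce the 8.4 in the example, but your own weights should reflect what matters for the role.

```python
# Assumed weights for illustration; the real tool's weights are not given.
WEIGHTS = {"HR": 0.4, "Technical": 0.2, "Culture": 0.2, "Risk": 0.2}


def weighted_score(scores: dict, weights: dict = WEIGHTS) -> float:
    """Weighted average of per-evaluator scores, rounded to one decimal."""
    total_weight = sum(weights[name] for name in scores)
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return round(weighted_sum / total_weight, 1)


def rank_candidates(candidates: dict) -> list:
    """Return (name, total) pairs sorted from highest to lowest total."""
    totals = {name: weighted_score(s) for name, s in candidates.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Dividing by the sum of the weights actually used means a candidate with a missing evaluator score is still ranked on a 1-10 scale.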
🏛️ Consilium Decision-Making
The general-purpose debate engine. 6 specialist experts debate your business question in parallel, then Gemini 2.5 Pro reads all positions and produces a structured verdict with confidence level.
When to use: any high-stakes decision, such as pricing, market entry, partnerships, pivots, or org changes.
# Full consilium: 6 experts + Gemini 2.5 Pro synthesis
python3 consilium.py \
  --question "Should we enter the Brazilian iGaming market in H1 2026?" \
  --experts market_analyst,legal,cfo,sales,risk,ops \
  --synthesis gemini-2.5-pro \
  --thinking

Example Output

DEBATE: 6 positions captured (Legal: regulatory risk; CFO: positive ROI; Market Analyst: bullish timing)

SYNTHESIS VERDICT: Conditional YES — enter H2 not H1. Regulatory prep requires 4-6 months. ROI positive at 12 months if existing LatAm clients used as anchor. Confidence: 74%.
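If you want a command line like the one above for your own engine, `argparse` covers it in a few lines. This is a sketch of how such flags could be parsed, not the actual `consilium.py`; the defaults and help texts are assumptions.

```python
import argparse


def parse_expert_list(value: str) -> list:
    """Split a comma-separated list such as 'legal,cfo,risk' into names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="consilium.py")
    parser.add_argument("--question", required=True,
                        help="the decision to debate")
    parser.add_argument("--experts", required=True, type=parse_expert_list,
                        help="comma-separated expert roles")
    parser.add_argument("--synthesis", default="gemini-2.5-pro",
                        help="model that writes the final verdict")
    parser.add_argument("--thinking", action="store_true",
                        help="enable the synthesis model's thinking mode")
    return parser
```

Calling `build_parser().parse_args()` then gives you `args.question`, `args.experts` as a list, `args.synthesis`, and `args.thinking` as a boolean.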

Right model for the right task

Using the wrong model wastes money and time. This routing table is extracted from production usage across 15+ tools.

| Use Case | Model | Provider | Why |
|---|---|---|---|
| Parallel expert agents | deepseek-chat | DeepSeek | Smart + very cheap ($0.001/1K tokens) |
| Final synthesis | gemini-2.5-pro | Google | Best reasoning + thinking mode |
| Financial / process logic | o3 | OpenAI | Best for structured reasoning |
| Fast / UX tasks | gemini-2.0-flash | Google | Free tier, fast |
| High volume / cheap | llama-3.3-70b | Groq | Free tier, very fast |
| ❌ Avoid for agentic loops | claude-* | Anthropic | Too expensive at scale |
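In code, a routing table like this can be a plain dictionary keyed by task type. The task labels below (`expert`, `synthesis`, and so on) are hypothetical names chosen for this sketch; the model and provider values are copied from the table.

```python
# Task label -> (provider, model). Labels are illustrative, not a standard.
ROUTES = {
    "expert": ("DeepSeek", "deepseek-chat"),
    "synthesis": ("Google", "gemini-2.5-pro"),
    "financial": ("OpenAI", "o3"),
    "fast": ("Google", "gemini-2.0-flash"),
    "bulk": ("Groq", "llama-3.3-70b"),
}


def pick_model(task: str) -> tuple:
    """Look up the (provider, model) pair for a task type."""
    try:
        return ROUTES[task]
    except KeyError:
        known = ", ".join(sorted(ROUTES))
        raise ValueError(f"Unknown task '{task}'. Known tasks: {known}") from None
```

Keeping the routes in one place means a pricing or quality change only requires editing a single entry.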

Everything you need to start building