Projects
SoundReverse — Multi-Agent Audio Mastering AI
Multi-agent system that reverse-engineers a song's mastering chain from its sonic fingerprint.
A LangGraph pipeline (MCP → Gateway → Musician → Analyst ⇄ Critic) where a pure-Python YAML
rules engine owns every EQ and compression number and Gemini owns only the wording, gated by
four physical-impossibility checks. Ships a downloadable Producer Session Pack — a 2-page PDF
blueprint, a JSON preset, and a public LangSmith trace of the full agent debate.
- Audio analysis runs in the Modal GPU build of Audio Sonic MCP (HTDemucs + CLAP, ~30–90s) — SoundReverse carries zero audio libraries
- Gateway: pure Python Pydantic v2 model_validate() — SignalSignature contract violations fail loudly here, never silently corrupt downstream math
- Musician: deterministically maps stem fundamentals to notes via equal-temperament math, derives tonal tags; Gemini writes the tonal-character line only under a string-only tool schema
- Analyst: rules.yaml evaluated in pure Python produces all EQ/compression/gain values; Gemini writes one grounded engineer justification per decision
- Critic: 4 hardcoded physical-impossibility checks; below 0.8 confidence → correction hints loop back to Analyst; 3-iteration cap
- Output node embeds the real public LangSmith trace URL (resolved post-graph) in every artifact
Stack: LangGraph, Gemini Flash, Modal MCP, Supabase, LangSmith, FastAPI
GitHub · Live demo
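The Musician's equal-temperament step can be illustrated with a minimal sketch. It assumes A4 = 440 Hz and standard MIDI numbering; the function name `frequency_to_note` and its return shape are hypothetical, not taken from the SoundReverse codebase.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(freq_hz: float, a4_hz: float = 440.0) -> tuple[str, float]:
    """Map a fundamental frequency to (note name with octave, deviation in cents).

    Hypothetical helper: assumes 12-tone equal temperament with A4 = a4_hz.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note 69 is A4; each semitone is a frequency ratio of 2 ** (1 / 12).
    midi_float = 69 + 12 * math.log2(freq_hz / a4_hz)
    midi = round(midi_float)
    # Distance from the nearest tempered pitch, in cents (100 cents per semitone).
    cents = (midi_float - midi) * 100
    name = NOTE_NAMES[midi % 12]
    octave = midi // 12 - 1  # MIDI convention: note 60 is C4
    return f"{name}{octave}", cents
```

For example, `frequency_to_note(82.41)` resolves to the note name `E2`, the low E of a guitar. Because the mapping is closed-form arithmetic, the same input always yields the same note, which is what lets the LLM be restricted to wording only.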
Zenic — Agentic RAG Health & Nutrition AI
Multi-turn agentic health and nutrition AI. LangGraph state machine with mandatory
safety gates, hybrid BM25 + vector retrieval, cross-encoder reranking, and a Ragas
evaluation suite enforcing hard quality thresholds.
- LangGraph directed graph — mandatory Safety Check node before routing to specialised workflows (Nutrition QA, Meal Planning, Workout Planning, Trend Analysis)
- Multi-query expansion via Llama 3.3 70B (Groq) to increase retrieval hit rate across mixed-vocabulary sources
- Hybrid retrieval: dense vector (BAAI/bge-small-en-v1.5, ChromaDB/Qdrant) + BM25Okapi with a custom max_per_source diversity cap
- Cross-encoder reranking (BAAI/bge-reranker-base) — top 30 candidates → precision pass
- Hard quality thresholds: Faithfulness > 0.85, Context Precision > 0.75 (Gemma 4 31B as judge)
- Deterministic Python handles all BMR/TDEE/macro math — LLM never touches numbers
Stack: LangGraph, Llama 3.3 70B, Qdrant, Ragas, BM25
GitHub
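A rough sketch of how a hybrid merge with a max_per_source cap might look. The project does not state how dense and BM25 results are combined, so reciprocal rank fusion is used here purely as an assumption; the function name, the `source_of` mapping, and the default values are all hypothetical.

```python
from collections import defaultdict


def fuse_with_source_cap(dense_ranked, bm25_ranked, source_of,
                         max_per_source=2, k=60, top_n=30):
    """Merge two ranked lists of doc ids, then limit how many docs one source contributes.

    Assumption: reciprocal rank fusion (score = sum of 1 / (k + rank)) stands in
    for whatever fusion the real pipeline uses.
    """
    scores = defaultdict(float)
    for ranked in (dense_ranked, bm25_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)

    # Highest fused score first; ties broken by doc id so the order is deterministic.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))

    kept, per_source = [], defaultdict(int)
    for doc_id in ordered:
        src = source_of[doc_id]
        if per_source[src] >= max_per_source:
            continue  # diversity cap: this source already filled its quota
        per_source[src] += 1
        kept.append(doc_id)
        if len(kept) == top_n:
            break
    return kept
```

The capped list is what would then be handed to the cross-encoder reranker, so a single verbose source cannot crowd out the other candidates before the precision pass.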
Audio Sonic MCP — Open-Source Audio Analysis Engine
Local-first audio-analysis engine whose analysis runs entirely on-device. Two front-ends, an
async FastMCP server for LLMs, agents, and IDEs and a CLI for musicians, turn any local file
or YouTube URL into a structured sonic signature: BPM, a section-by-section key map, a
512-dim CLAP embedding, vibe tags, and a production profile.
- Two front-ends, one engine: FastMCP server (Claude Desktop, Cursor, Windsurf, Cline) + local CLI — no API keys, no cloud, 100% on-device
- Async fire-and-forget jobs: get_sonic_signature returns a job_id immediately; client polls get_job_status
- 6-stage pipeline: ingestion/validation → yt-dlp download → FFmpeg → Demucs mdx_extra 4-stem separation → librosa analysis → LAION CLAP embedding
- CUDA detection: GPU path loads models into VRAM; CPU path peak RAM 1.5–2.5 GB (Demucs) / 2.1–2.5 GB (with CLAP)
- CONCURRENCY_LOCK serializes ML jobs to prevent OOM on consumer hardware
- Graceful degradation: HPSS + librosa feature matrices if CLAP stack not installed — same output shape, no crash
Stack: FastMCP, Demucs, LAION CLAP, Librosa, PyTorch, Docker
GitHub
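The fire-and-forget job pattern and the serializing lock can be sketched with plain asyncio. The method names mirror the get_sonic_signature and get_job_status tools named above; the `JobRunner` class, the status strings, and the placeholder `analyze` coroutine are assumptions for illustration, not the project's actual code.

```python
import asyncio
import uuid


async def analyze(path):
    """Placeholder for the real pipeline (separation, analysis, embedding)."""
    await asyncio.sleep(0.01)
    return {"source": path}


class JobRunner:
    def __init__(self):
        self.jobs = {}
        # Plays the role of CONCURRENCY_LOCK: only one heavy job runs at a time.
        self.lock = asyncio.Lock()
        self._tasks = set()  # keep references so tasks are not garbage-collected

    def get_sonic_signature(self, path):
        """Register the job, schedule it in the background, return its id at once."""
        job_id = uuid.uuid4().hex
        self.jobs[job_id] = {"status": "queued", "result": None}
        task = asyncio.create_task(self._run(job_id, path))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return job_id

    async def _run(self, job_id, path):
        async with self.lock:  # later jobs wait here instead of competing for RAM
            self.jobs[job_id]["status"] = "running"
            try:
                result = await analyze(path)
                self.jobs[job_id].update(status="done", result=result)
            except Exception as exc:
                self.jobs[job_id].update(status="failed", result=str(exc))

    def get_job_status(self, job_id):
        return self.jobs.get(job_id, {"status": "unknown", "result": None})


async def demo():
    runner = JobRunner()
    ids = [runner.get_sonic_signature(p) for p in ("a.wav", "b.wav")]
    # Client side: poll until every job has reached a terminal state.
    while any(runner.get_job_status(i)["status"] not in ("done", "failed") for i in ids):
        await asyncio.sleep(0.005)
    return [runner.get_job_status(i) for i in ids]
```

Running `asyncio.run(demo())` submits two jobs and polls them to completion. In a real server the blocking ML work would need to run off the event loop (for example via `asyncio.to_thread`), which this sketch leaves out.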
Audio Sonic MCP — Cloud Build (Modal GPU)
Serverless, GPU-accelerated Modal deployment of Audio Sonic MCP. A cheap always-on CPU
node handles MCP/SSE and polling REST traffic; each analysis job spawns a dedicated
NVIDIA A10G container, runs the full pipeline, and shuts down immediately. The production
MCP backend powering SoundReverse.
- CPU/GPU container split: Starlette ASGI stays always-on on CPU; each job fires a separate A10G container — expensive compute exists only for the duration of the job
- Inter-container state via modal.Dict (job status + payload) and modal.Volume (persistent model-weight cache)
- TRANSFORMERS_OFFLINE=1 / HF_HUB_OFFLINE=1: eliminates ~40–50s of HuggingFace network handshakes per cold start
- Direct browser-to-Modal upload: raw audio bytes never touch the SaaS backend; file deleted from container immediately after analysis
- Shared SignalSignature contract with the offline build — same Pydantic v2 schema, same Gateway validation in SoundReverse
- Timings: ~37s GPU compute · ~91s cold start · ~20s warm
Stack: Modal, FastMCP, NVIDIA A10G, Demucs, Starlette, Python
GitHub
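One way the offline flags could be applied is sketched below. `HF_HUB_OFFLINE`, `TRANSFORMERS_OFFLINE`, and `HF_HOME` are real Hugging Face environment variables; the helper name `force_offline_hf` and the idea that the cache directory sits on the mounted model-weight volume are assumptions.

```python
import os


def force_offline_hf(cache_dir: str) -> dict:
    """Set Hugging Face offline flags; call before importing transformers or huggingface_hub.

    Hypothetical helper. cache_dir is assumed to be a directory that already
    holds the downloaded weights (for example, a path on the persistent volume).
    """
    settings = {
        "HF_HUB_OFFLINE": "1",        # the hub client skips network calls
        "TRANSFORMERS_OFFLINE": "1",  # transformers resolves models from the cache only
        "HF_HOME": cache_dir,         # where the cached weights are looked up
    }
    os.environ.update(settings)
    return settings
```

These flags only help if the weights are already in the cache; with an empty cache, model loading fails instead of falling back to a download, so the cache has to be populated in a separate step.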
SHPSv2 — Predictive ML for Civil Infrastructure
End-to-end predictive asset management system. Transforms raw sensor telemetry
into actionable maintenance intelligence using a multi-model committee (XGBoost and LSTM)
that delivers calibrated 95% confidence intervals for structural longevity.
- Quantile regression committee: 3 XGBoost models at [0.025, 0.5, 0.975] → calibrated 95% CI for Remaining Useful Life (R² 0.98)
- LSTM sequence model for 25-year deterioration forecasting (R² 0.89)
- Physics-informed feature engineering: 9 raw inputs → 14 high-order features (Fatigue Index, Stress Ratio, Degradation Rate)
- Pydantic v2 physics guardrail rejects structurally impossible payloads before reaching the model
- SHAP latency fix: KMeans centroid compression (10,000 rows → 100 centroids) → SHAP inference under 50ms
- Multi-stage Docker, tensorflow-cpu, Gunicorn, non-root security config
Stack: XGBoost, Keras, SHAP, Flask, Docker, Pydantic v2
GitHub · Live demo on Hugging Face
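The idea behind the quantile committee can be shown with the pinball (quantile) loss that a quantile regressor minimizes, plus an empirical coverage check for the resulting interval. This is a generic pure-Python sketch: the function names are hypothetical and no XGBoost configuration from the project is implied.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss at quantile level q, with 0 < q < 1.

    Under-prediction is weighted by q and over-prediction by (1 - q), so a model
    trained at q = 0.975 is pushed to sit above about 97.5% of the targets.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def interval_coverage(y_true, lower, upper):
    """Fraction of targets that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)
```

With predictions from the 0.025 and 0.975 models as `lower` and `upper`, `interval_coverage` on held-out data should land near 0.95 if the interval is well calibrated. The 0.5 model supplies the point estimate.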