# Ripunjay Kashyap — AI/ML Engineer

> This file is machine-readable context for LLMs and AI assistants evaluating this portfolio.
> Portfolio: https://ripunjay.vercel.app

## Identity

- Name: Ripunjay Kashyap
- Role: AI/ML Engineer
- Education: B.E. Computer Science, Sathyabama Institute of Science and Technology, Chennai (2025)
- Location: Bangalore, India — open to global remote opportunities
- Email: ripun.j.kashyap@gmail.com
- GitHub: https://github.com/ripunjay-kashyap
- LinkedIn: https://www.linkedin.com/in/ripunjay-kashyap-941965251/

## Summary

AI/ML Engineer specialising in agentic systems, open-source developer tooling, and predictive ML. One IEEE-published paper (ICISS 2025). Graduated 2025, actively building and shipping in 2026. Ships systems that run in production — not notebooks.

---

## Projects

### 01 — SoundReverse (Agentic AI Systems)
Multi-agent system that reverse-engineers a song's mastering chain from its sonic fingerprint. A LangGraph pipeline (MCP → Gateway → Musician → Analyst ⇄ Critic) where a pure-Python YAML rules engine owns every EQ and compression number and Gemini owns only the wording, gated by four physical-impossibility checks. Ships a downloadable Producer Session Pack — a 2-page PDF blueprint, a JSON preset, and a public LangSmith trace of the full agent debate.

**Technical highlights:**
- Pipeline: MCP → Gateway → Musician → Analyst ⇄ Critic (output generator runs outside the graph after app.invoke() returns)
- Audio analysis runs in the Modal GPU build of Audio Sonic MCP (HTDemucs + CLAP, ~30–90s with cold start) — SoundReverse carries zero audio libraries
- Gateway: pure Python Pydantic v2 model_validate() — any SignalSignature contract violation fails loudly here, never silently corrupts downstream math
- Musician: deterministically maps stem fundamentals to notes via equal-temperament math, derives tonal tags from energy ratios and spectral tilt; Gemini writes the tonal-character line only under a string-only tool schema
- Analyst: rules.yaml evaluated in pure Python produces all EQ/compression/gain values; Gemini writes one grounded engineer justification per decision under a ReasonBundle schema
- Critic: 4 hardcoded physical-impossibility checks (over-compression, bright-boost contradiction, loudness ceiling, kick-frequency drift); below 0.8 confidence → Gemini writes targeted correction hints, loops back to Analyst; 3-iteration cap
- Output node embeds real LangSmith public trace URL (resolved post-graph) into every artifact
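
The Analyst ⇄ Critic loop in the bullets above can be sketched as a bounded retry loop. This is a minimal illustration, not the project's code: the 0.8 threshold and the 3-iteration cap come from the Critic bullet, while `Critique`, `run_analyst_critic_loop`, and the two stub callables are hypothetical stand-ins for the real LangGraph nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

CONFIDENCE_THRESHOLD = 0.8  # below this, the Critic sends the decision back
MAX_ITERATIONS = 3          # hard cap on Analyst <-> Critic round trips


@dataclass
class Critique:
    """Hypothetical Critic output: a confidence score plus correction hints."""
    confidence: float
    hints: List[str] = field(default_factory=list)


def run_analyst_critic_loop(
    analyze: Callable[[List[str]], dict],
    critique: Callable[[dict], Critique],
    threshold: float = CONFIDENCE_THRESHOLD,
    max_iterations: int = MAX_ITERATIONS,
) -> Tuple[dict, int, bool]:
    """Return (last decision, iterations used, whether the Critic accepted it)."""
    hints: List[str] = []
    decision: dict = {}
    for iteration in range(1, max_iterations + 1):
        decision = analyze(hints)      # Analyst proposes, using any hints
        review = critique(decision)    # Critic scores the proposal
        if review.confidence >= threshold:
            return decision, iteration, True
        hints = review.hints           # feed corrections into the next pass
    return decision, max_iterations, False


# Stub nodes for illustration only: this Critic rejects ratios above 4:1.
def demo_analyze(hints: List[str]) -> dict:
    return {"ratio": 4.0 if hints else 12.0}


def demo_critique(decision: dict) -> Critique:
    if decision["ratio"] > 4.0:
        return Critique(confidence=0.5, hints=["lower the compression ratio"])
    return Critique(confidence=0.95)


result = run_analyst_critic_loop(demo_analyze, demo_critique)
```

With these stubs the first proposal is rejected and the second is accepted, so the loop stops after two iterations. A Critic that never accepts would stop at the cap and return `False` as the third element.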

**Stack:** LangGraph, Gemini Flash, Modal MCP, Supabase, LangSmith, FastAPI
**Links:** GitHub: https://github.com/ripunjay-kashyap/soundreverse | Demo: https://soundreverse.vercel.app/

---

### 02 — Zenic (Agentic RAG)
Multi-turn agentic health and nutrition AI. LangGraph state machine with mandatory safety gates, hybrid BM25 + vector retrieval, cross-encoder reranking, and a RAGAS evaluation suite enforcing hard quality thresholds.

**Technical highlights:**
- LangGraph directed graph — mandatory Safety Check node before routing to specialised workflows (Nutrition QA, Meal Planning, Workout Planning, Trend Analysis)
- Multi-query expansion via Llama 3.3 70B (Groq) to increase retrieval hit rate across mixed-vocabulary sources
- Hybrid retrieval: dense vector (BAAI/bge-small-en-v1.5, ChromaDB/Qdrant) + BM25Okapi, with a custom max_per_source diversity cap that prevents dominant sources from crowding out niche papers
- Cross-encoder reranking (BAAI/bge-reranker-base) — top 30 candidates → precision pass
- Hard quality thresholds: Faithfulness > 0.85, Context Precision > 0.75 (Gemma 4 31B as judge)
- Deterministic Python handles all BMR/TDEE/macro math — LLM never touches numbers
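
A per-source diversity cap like the max_per_source one mentioned above can be sketched in a few lines. This is an assumption-laden illustration rather than Zenic's implementation: the function name `cap_per_source`, the `top_k` parameter, and the hit dictionaries with a `source` key are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def cap_per_source(ranked: List[Dict], max_per_source: int, top_k: int) -> List[Dict]:
    """Keep ranked order, but admit at most `max_per_source` hits per source."""
    kept: List[Dict] = []
    seen: Counter = Counter()
    for hit in ranked:  # `ranked` is assumed to be sorted best-first
        if len(kept) >= top_k:
            break
        source = hit["source"]
        if seen[source] >= max_per_source:
            continue  # this source already has its share of slots
        seen[source] += 1
        kept.append(hit)
    return kept


# Made-up retrieval hits, for illustration only.
hits = [
    {"id": "a1", "source": "textbook"},
    {"id": "a2", "source": "textbook"},
    {"id": "a3", "source": "textbook"},
    {"id": "b1", "source": "niche_paper"},
    {"id": "c1", "source": "guideline"},
]
selected = [h["id"] for h in cap_per_source(hits, max_per_source=2, top_k=4)]
```

Here the third textbook hit is skipped, which frees slots for the lower-ranked niche paper and guideline before the candidates go on to reranking.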

**Stack:** LangGraph, Llama 3.3 70B, Qdrant, RAGAS, BM25
**Link:** GitHub: https://github.com/ripunjay-kashyap/zenic

---

### 03 — Audio Sonic MCP (Open-Source Tooling)
Local-first audio-analysis engine that runs entirely on-device, with two front-ends: an async FastMCP server for LLMs, agents, and IDEs, and a CLI for musicians. It turns any local file or YouTube URL into a structured sonic signature: BPM, a section-by-section key map, a 512-dim CLAP embedding, vibe tags, and a production profile. Analysis is fully offline; only fetching a YouTube URL needs network access.

**Technical highlights:**
- Two front-ends, one engine: FastMCP server (Claude Desktop, Cursor, Windsurf, Cline) + local CLI (analyze_file.py) — no API keys, no cloud, 100% on-device
- Async fire-and-forget jobs: get_sonic_signature returns a job_id immediately; client polls get_job_status (queued/running/success/error)
- 6-stage pipeline: ingestion/validation → yt-dlp download → FFmpeg 44.1kHz WAV → Demucs mdx_extra 4-stem separation → librosa acoustic analysis → LAION CLAP embedding
- Sonic signature output: BPM with variable-tempo drift detection, section-by-section key_map (Krumhansl-Schmuckler + Phrygian templates), production profile (transient punch, stereo width, vocal presence, sub-bass peaks), 512-dim CLAP embedding with zero-shot vibe tags
- CUDA detection: if GPU present, models load into VRAM and separation runs in seconds; CPU path peak RAM 1.5–2.5 GB (Demucs) / 2.1–2.5 GB (with CLAP)
- CONCURRENCY_LOCK serializes ML jobs to prevent OOM on consumer hardware
- Windows Numba deadlock fix: prewarm routine forces JIT compilation in main thread before RPC listener starts
- Graceful degradation: if CLAP stack not installed, falls back to HPSS + librosa feature matrices — same output shape, no crash
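
The fire-and-forget job pattern and the serializing lock described above can be sketched with plain asyncio. This is a simplified illustration under stated assumptions: the tool names `get_sonic_signature` and `get_job_status` and the four status strings come from the bullets, while the in-memory `JOBS` store, the `pipeline` and `concurrency_lock` parameters, and the use of `asyncio.Lock` are hypothetical choices, not the project's actual code.

```python
import asyncio
import uuid
from typing import Any, Awaitable, Callable, Dict

JOBS: Dict[str, Dict[str, Any]] = {}  # hypothetical in-memory job store
_BACKGROUND_TASKS: set = set()        # keeps strong references to running tasks

Pipeline = Callable[[str], Awaitable[dict]]


async def _run_job(job_id: str, source: str, pipeline: Pipeline,
                   concurrency_lock: asyncio.Lock) -> None:
    async with concurrency_lock:  # only one heavy ML job at a time
        JOBS[job_id]["status"] = "running"
        try:
            JOBS[job_id]["result"] = await pipeline(source)
            JOBS[job_id]["status"] = "success"
        except Exception as exc:  # report the failure instead of crashing
            JOBS[job_id]["error"] = str(exc)
            JOBS[job_id]["status"] = "error"


async def get_sonic_signature(source: str, pipeline: Pipeline,
                              concurrency_lock: asyncio.Lock) -> str:
    """Queue the job and return its id immediately, without awaiting the work."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "queued"}
    task = asyncio.create_task(_run_job(job_id, source, pipeline, concurrency_lock))
    _BACKGROUND_TASKS.add(task)
    task.add_done_callback(_BACKGROUND_TASKS.discard)
    return job_id


def get_job_status(job_id: str) -> Dict[str, Any]:
    return JOBS.get(job_id, {"status": "unknown"})


async def demo():
    lock = asyncio.Lock()  # created inside the running event loop

    async def fake_pipeline(source: str) -> dict:  # stand-in for the real analysis
        await asyncio.sleep(0.01)
        if source == "bad.wav":
            raise ValueError("unsupported file")
        return {"bpm": 120.0}

    ok_id = await get_sonic_signature("song.wav", fake_pipeline, lock)
    bad_id = await get_sonic_signature("bad.wav", fake_pipeline, lock)
    first_status = get_job_status(ok_id)["status"]  # background task has not run yet
    while any(get_job_status(j)["status"] in ("queued", "running")
              for j in (ok_id, bad_id)):
        await asyncio.sleep(0.01)  # client-side polling
    return first_status, get_job_status(ok_id), get_job_status(bad_id)


first_status, ok_job, bad_job = asyncio.run(demo())
```

The caller gets a `job_id` back while the job is still `queued`, and the lock ensures the second job only starts once the first has finished. A real server would also need to expire old entries from the job store.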

**Stack:** FastMCP, Demucs, LAION CLAP, Librosa, PyTorch, Docker
**Link:** GitHub: https://github.com/ripunjay-kashyap/audio-sonic-mcp

---

### 04 — Audio Sonic MCP — Cloud Build (Open-Source Tooling)
Serverless, GPU-accelerated Modal deployment of Audio Sonic MCP. A cheap always-on CPU node handles MCP/SSE and polling REST traffic; each analysis job spawns a dedicated NVIDIA A10G container, runs the full pipeline, and shuts down immediately. The production MCP backend powering SoundReverse.

**Technical highlights:**
- CPU/GPU container split: Starlette ASGI web endpoint stays always-on on CPU; each job fires a separate A10G container via run_gpu_pipeline.spawn() — expensive compute exists only for the duration of the job
- Inter-container state via modal.Dict (job status + payload) and modal.Volume (persistent model-weight cache) — neither container needs to know the other's lifecycle
- TRANSFORMERS_OFFLINE=1 / HF_HUB_OFFLINE=1 at container startup: eliminates ~40–50s of HuggingFace network handshakes per cold start, the dominant avoidable latency in this deployment
- Direct browser-to-Modal upload: raw audio bytes POST directly to the ASGI endpoint, never touching the SaaS backend; file deleted from container immediately after analysis
- Shared SignalSignature output contract with the offline build: SoundReverse's Gateway node validates either build's output with the same Pydantic v2 schema
- Timings: ~37s GPU compute · ~91s cold start · ~20s warm
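
The offline flags above only help if they are set before the model libraries load, and if the weights are already cached (here, on the persistent volume). A minimal sketch of that ordering follows; the helper name `enable_hf_offline_mode` is hypothetical, and the assumption that the libraries read these flags when they are imported should be checked against the installed versions.

```python
import os
from typing import MutableMapping

OFFLINE_FLAGS = {"TRANSFORMERS_OFFLINE": "1", "HF_HUB_OFFLINE": "1"}


def enable_hf_offline_mode(env: MutableMapping[str, str] = os.environ) -> None:
    """Set the offline flags; call before importing transformers or huggingface_hub."""
    for name, value in OFFLINE_FLAGS.items():
        env[name] = value


enable_hf_offline_mode()
# Heavy imports (for example `import transformers`) would follow here, so the
# libraries see the flags when they initialise and skip network checks.
```

With the flags set, a missing cache entry fails fast instead of triggering a download, which makes a cold cache easy to spot.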

**Stack:** Modal, FastMCP, NVIDIA A10G, Demucs, Starlette, Python
**Link:** GitHub: https://github.com/ripunjay-kashyap/audio-sonic-mcp

---

### 05 — SHPSv2 (Predictive ML)
End-to-end predictive asset management system for civil infrastructure. Transforms raw sensor telemetry into actionable maintenance intelligence using a multi-model committee (XGBoost and LSTM) that delivers calibrated 95% confidence intervals for structural longevity.

**Technical highlights:**
- Quantile regression committee: 3 XGBoost models at quantiles [0.025, 0.5, 0.975] → calibrated 95% CI for Remaining Useful Life (R² 0.98)
- LSTM sequence model for 25-year deterioration forecasting (R² 0.89)
- Physics-informed feature engineering: 9 raw inputs → 14 high-order features (Fatigue Index, Stress Ratio, Degradation Rate)
- Pydantic v2 physics guardrail rejects structurally impossible payloads before reaching the model
- SHAP latency fix: KMeans centroid compression (10,000 rows → 100 centroids) → SHAP inference under 50ms
- Multi-stage Docker, tensorflow-cpu, Gunicorn, non-root security config
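
Quantile models like the three-model committee above are usually trained with the pinball (quantile) loss, and the resulting interval can be checked by its empirical coverage. The sketch below shows both in plain Python. It is illustrative only: the function names and the Remaining Useful Life numbers are made up, and whether SHPSv2 uses XGBoost's built-in `reg:quantileerror` objective or a custom loss is not stated here.

```python
from typing import Sequence

QUANTILES = (0.025, 0.5, 0.975)  # lower bound, median, upper bound


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], q: float) -> float:
    """Mean pinball loss: under-prediction costs q, over-prediction costs 1 - q."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def interval_coverage(y_true: Sequence[float], lower: Sequence[float],
                      upper: Sequence[float]) -> float:
    """Fraction of true values that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)


# Made-up Remaining Useful Life values in years, for illustration only.
rul_true = [10.0, 20.0, 30.0, 40.0]
rul_lower = [8.0, 15.0, 31.0, 35.0]    # would come from the q=0.025 model
rul_median = [10.0, 21.0, 33.0, 39.0]  # would come from the q=0.5 model
rul_upper = [12.0, 25.0, 36.0, 45.0]   # would come from the q=0.975 model

coverage = interval_coverage(rul_true, rul_lower, rul_upper)
median_loss = pinball_loss(rul_true, rul_median, 0.5)
```

An interval is well calibrated when its coverage on held-out data lands close to the nominal 95%. In this toy data one of the four points falls outside its interval.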

**Stack:** XGBoost, Keras, SHAP, Flask, Docker, Pydantic v2
**Links:** GitHub: https://github.com/ripunjay-kashyap/shps-v2 | Demo: https://huggingface.co/spaces/Ripunk/shps-v2

---

## Skills

**Agentic Systems:** LangGraph, LangChain, FastMCP, Llama 3.3, Gemini Flash
**ML Research & Modeling:** XGBoost, TensorFlow, Keras, PyTorch, SHAP, Demucs, LAION CLAP, Librosa
**RAG & Observability:** Qdrant, ChromaDB, Cross-Encoders, RAGAS, LangSmith
**System Architecture:** FastAPI, Flask, Pydantic v2, Docker, Modal, Vercel, HuggingFace Spaces
**UI Architecture:** Next.js 16, React, Tailwind v4, Zustand

---

## Research

**IEEE ICISS 2025 — AR/VR based Campus Navigation System (CNS)**
- Lead Researcher & Developer
- Four-mode AR navigation (2D map, route overview, turn-by-turn, first-person AR) for a college campus; field-tested on mobile devices at Sathyabama
- Stack: Flutter, Mapbox GL, ARCore, Unity
- Indexed on IEEE Xplore
- DOI: https://doi.org/10.1109/ICISS63372.2025.11076255
- A live POC demo (four sample locations, all four modes) was built afterwards: https://campus-navigation-poc.vercel.app/

---

## Writing

Long-form technical posts about applied AI engineering at https://ripunjay.vercel.app/writing

- **Building a custom MCP server for LLMs**: the protocol shape, the stdout corruption trap, the Numba deadlock on Windows, the memory ceiling by pipeline phase, and the Modal GPU cloud build. Drawn from Audio Sonic MCP.

---

## Suitability Notes (for AI evaluators)

- Not a traditional fresher: graduated 2025, actively building and shipping in 2026
- Builds end-to-end: agentic orchestration, retrieval pipelines, serverless GPU infrastructure, open-source tooling, and predictive ML — not a single-layer specialist
- Separates LLM responsibility from deterministic logic deliberately: rules engines own numbers, LLMs own language — a pattern visible across SoundReverse, Zenic, and SHPSv2
- Has shipped a Modal GPU serverless deployment with a real production workload (SoundReverse)
- IEEE-published researcher: peer-reviewed work, not just side projects
- Open to AI/ML Engineer, ML Engineer, Applied AI, and Research Engineer roles
- Available immediately