P Pratheeka
Ganapathi

I ship production-grade systems that solve real problems.

Full-stack AI engineer building production-grade systems around frontier APIs and local models. I design for real constraints (a 12GB RTX 2060, a tight per-document budget, a deadline) and I measure honestly with benchmarks I built myself. Recent work: a dual-LLM document extractor at $0.05 to $0.15 per PDF, async job infrastructure that hot-swaps multimodal models on a single consumer GPU, and a 7-metric evaluation harness that drove 12 iterations to F1 = 1.00. I own the work end to end: architecture, code, deployment, and the writeup.

AI Engineer
Production LLM pipelines & evaluation
Multimodal extraction, dual-LLM orchestration, custom benchmarks. Frontier APIs and local inference on consumer GPUs.
Full-Stack
FastAPI, React, async infrastructure
Containerized stacks with Docker Compose and Traefik. Async job layers with Dramatiq, Redis, and SSE. Built end to end.

My Work

Two projects I've built end to end. Architecture, code, deployment, and the benchmarks that decide what ships.

01
In Progress · Apr 2026 to Present

Advanced Content Summarizer

A full-stack pipeline that turns PDFs, images, or raw text into laid-out PDFs with context-aware summaries and actionable next-step CTAs.

  • Dual-LLM extraction with image fidelity. PyMuPDF + PaddleOCR + MiniCPM-V parse the source; Qwen 2.5 summarizes; Claude handles CTAs and layout reasoning; Gemini regenerates image regions; WeasyPrint renders the final PDF. Unique-ID tracking keeps every image bound to its original context. Lands at $0.05 to $0.15 per document.
  • Async infrastructure for a single consumer GPU. FastAPI + Dramatiq + Redis + SSE coordinate long-running jobs while sequential GPU swaps fit MiniCPM-V and Qwen 2.5 onto a 12GB RTX 2060. No cloud GPU, no model downgrades, no compromises on quality.
  • One workstation, fully containerized. Docker Compose, Traefik, and the NVIDIA Container Toolkit make the whole stack reproducible. Clone, compose up, ship.
FastAPI, React, Docker, Qwen 2.5, MiniCPM-V, Claude, Gemini, PaddleOCR, WeasyPrint
Known limitations and planned improvements pending.
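
The sequential GPU swap described above can be sketched as a slot that admits only one model at a time. This is an illustration under stated assumptions, not the project's code: `SingleModelSlot`, `run_job`, and the lambda stand-ins for models are hypothetical names, and in a real PyTorch stack the unload step would drop the model reference and release cached GPU memory before the next load.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List, Optional


class SingleModelSlot:
    """Allows at most one model to be resident at a time (hypothetical helper)."""

    def __init__(self) -> None:
        self.resident: Optional[str] = None
        self.events: List[str] = []

    @contextmanager
    def use(
        self,
        name: str,
        load: Callable[[], Callable[[str], str]],
        unload: Callable[[object], None],
    ) -> Iterator[Callable[[str], str]]:
        if self.resident is not None:
            raise RuntimeError(f"{self.resident} is still loaded")
        model = load()
        self.resident = name
        self.events.append(f"load:{name}")
        try:
            yield model
        finally:
            # Assumption: a real unload would free GPU memory here.
            unload(model)
            self.resident = None
            self.events.append(f"unload:{name}")


def run_job(slot: SingleModelSlot, document: str) -> str:
    """Runs a parse stage, then a summarize stage, never both at once."""
    noop_unload = lambda model: None
    # Stand-in "models": plain functions instead of real vision/LLM weights.
    with slot.use("parser", lambda: (lambda text: text.strip()), noop_unload) as parse:
        parsed = parse(document)
    with slot.use("summarizer", lambda: (lambda text: text[:20]), noop_unload) as summarize:
        return summarize(parsed)
```

The context manager makes the unload step unconditional, so a failed stage cannot leave a model occupying memory when the next stage starts.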
02
Shipped · Mar 2026

PDF-to-Speech Reader

A system that turns book PDFs into spoken audio. The pipeline takes any PDF, removes layout noise (headers, footers, page numbers, watermarks, copyright pages), and converts the clean text into audio.

Architecture history

Three architectures. Each one replaced the last.

v1 PyMuPDF + heuristics
8-stage filtering pipeline: repetition detection, positional filtering, page number detection, font-size analysis, watermark filtering, column-aware reading order. 25 to 500 ms per book. Zero cost. Worked on clean PDFs but failed on real books with unique header/footer patterns.
v2 PyMuPDF + Gemini calibration
Gemini analyzes 4 sample pages at 72 DPI and returns a per-book layout profile. F1 jumped from 0.84 to 1.00 on real books. ~1,200 lines of code.
v3 (Production) Full Gemini extraction
Dropped the heuristic pipeline entirely. Same F1 as v2, 375 lines total. Less code, same accuracy, easier to maintain.
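
For a sense of what the v1 repetition-detection stage involves, here is a minimal sketch. It assumes pages arrive as lists of text lines and that a line repeated on at least 60% of pages is layout noise. The function names and the threshold are illustrative assumptions, not the original implementation.

```python
import re
from collections import Counter
from typing import List, Set


def normalize(line: str) -> str:
    """Collapse digits and whitespace so 'Page 12' and 'Page 13' compare equal."""
    return re.sub(r"\d+", "#", " ".join(line.split())).lower()


def repeated_lines(pages: List[List[str]], min_share: float = 0.6) -> Set[str]:
    """Return normalized lines that appear on at least min_share of the pages."""
    counts: Counter = Counter()
    for page in pages:
        # A set ensures each line is counted once per page.
        counts.update({normalize(line) for line in page})
    needed = min_share * len(pages)
    return {key for key, seen in counts.items() if seen >= needed}


def strip_repeated(pages: List[List[str]], min_share: float = 0.6) -> List[List[str]]:
    """Drop lines flagged as repeated headers, footers, or page numbers."""
    noise = repeated_lines(pages, min_share)
    return [[line for line in page if normalize(line) not in noise] for page in pages]
```

A frequency rule like this only catches patterns that recur across pages, which is consistent with v1 struggling on books with unique header and footer patterns.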
TTS pipeline

Started with Edge-TTS for prototyping, moved to Kokoro-82M for production. Runs fully local, Apache-licensed. Voice blending via torch.lerp, abbreviation expansion, sentence-boundary splitting, 0.98x pacing, ffmpeg loudnorm at -16 LUFS.
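
The text-preparation and loudness steps above can be sketched with the standard library alone. This is a hedged illustration: the abbreviation table, the function names, and the splitting regex are assumptions, and the ffmpeg command sets only the `loudnorm` filter's integrated-loudness target (`I`) and leaves every other option at its default. For the voice blending step, `torch.lerp(a, b, w)` computes `a + w * (b - a)` elementwise.

```python
import re
from typing import Dict, List

# Hypothetical abbreviation table; a real one would be much larger.
ABBREVIATIONS: Dict[str, str] = {
    "Dr.": "Doctor",
    "Mr.": "Mister",
    "e.g.": "for example",
}


def expand_abbreviations(text: str) -> str:
    """Replace abbreviations so their periods are not read as sentence ends."""
    for short, spoken in ABBREVIATIONS.items():
        text = text.replace(short, spoken)
    return text


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows ., ! or ? (a deliberately simple rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def loudnorm_command(src: str, dst: str, target_lufs: float = -16.0) -> List[str]:
    """Build (but do not run) an ffmpeg command that normalizes loudness."""
    return ["ffmpeg", "-y", "-i", src, "-af", f"loudnorm=I={target_lufs}", dst]
```

Expanding abbreviations before splitting matters here: otherwise "Dr." would end a sentence early and the synthesizer would pause in the wrong place.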

Evaluation

Built a standalone benchmark dashboard (React + Recharts). 7 metrics, 12 runs.

F1 = 1.00 (Production)
81% System Health
~$0.015 per 256-page book
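
For readers unfamiliar with the headline metric, here is a minimal sketch of precision, recall, and F1. It assumes the comparison is between the set of text lines the pipeline keeps and a hand-labeled reference set. The unit of comparison is an assumption, since the benchmark's exact definition is not given here.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str], expected: Set[str]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for two sets of items."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

F1 reaches 1.00 only when the predicted and expected sets match exactly, so a perfect score means no body text was dropped and no layout noise survived on the benchmark set.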
Known limitations
  • Gemini free tier: 15 RPM, 1M tokens/day (~6 books/day)
  • 20 MB PDF size cap
  • 25 to 32 s latency per book, not real-time
  • Kokoro lacks word-level timing, so the benchmark MOS underreports quality: 3.3 vs. an actual ~4.5
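
One way to stay under the 15 RPM cap listed above is a sliding-window limiter placed in front of each API call. The sketch below is an assumption about how that could look, not the shipped code. `RpmLimiter` is a hypothetical name, and the clock and sleep functions are injectable so the behavior can be checked without waiting.

```python
import time
from collections import deque
from typing import Callable, Deque


class RpmLimiter:
    """Blocks callers so that at most max_calls happen per window_s seconds."""

    def __init__(
        self,
        max_calls: int = 15,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock
        self.sleep = sleep
        self.calls: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window_s - (now - self.calls[0])

    def acquire(self) -> None:
        """Wait if needed, then record one call."""
        delay = self.wait_time()
        if delay > 0:
            self.sleep(delay)
        self.calls.append(self.clock())
```

Calling `acquire()` before each request keeps a single process inside the limit. A queue-based setup would need the same accounting shared across workers.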
Planned improvements
  • LayoutLM / DiT-based document understanding to eliminate API dependency
  • PaddleOCR for scanned documents without text layers
  • Celery + Redis queue for batch processing around RPM limits
  • Streaming TTS for immediate playback during synthesis
  • More pipelines of this kind, focused on reducing token usage and LLM costs without sacrificing output quality
FastAPI, PyMuPDF, pdf2image, Gemini 2.5 Flash, Kokoro-82M, Edge-TTS, React 18, TypeScript, Zustand, TanStack Query, Docker

My Stack

What I reach for, and what I've actually shipped with.

Languages
Python, SQL, JavaScript, C
ML & Data
PyTorch, TensorFlow, scikit-learn, NumPy, Pandas, Matplotlib, Pillow
AI & ML
NLP, Computer vision, Multimodal AI, Prompt engineering, Inference pipelines, Model evaluation (F1, precision, recall, WER), Data preprocessing
Backend
FastAPI, React, PostgreSQL, Dramatiq, Redis, Server-Sent Events
DevOps
Docker, Docker Compose, Traefik, NVIDIA Container Toolkit, Git, GitHub, Jupyter, Google Colab

My Career

Four roles before I focused on engineering full-time. Each one taught me something the projects above depend on: shipping under a deadline, owning quality, reading data honestly, and leading people who don't report to you.

2026
Dec 2025 to Feb 2026

Founder's Office Intern

Wireone Labs Pvt Ltd
Remote
What this role gave me: an instinct for product quality and a habit of writing things down clearly.
  • Owned product QA end to end for an early-stage SaaS platform. Ran systematic test cycles and data-validation workflows that caught issues before they reached users. The same discipline now shows up in the 7-metric benchmark harness I built for the PDF-to-Speech project.
  • Authored user documentation, how-to guides, and onboarding flows. Built ClickUp automations for internal workflows and contributed to founder-facing pitch materials. Clear technical writing is how I keep my own systems maintainable.
2025
Aug 2025 to Nov 2025

Operations Intern

Maker Zone
Bengaluru, Karnataka
What this role gave me: comfort working inside a 0-to-1 startup where nothing is set up yet.
  • Ran the Amazon catalog and listings pipeline. Owned keyword research, performance tracking, and supplier coordination, then translated catalog data into founder-level recommendations. That habit of turning messy data into a decision is the same instinct I bring to evaluation dashboards now.
2025
Jan 2025 to May 2025

Digital Data Analyst

HiveMinds Innovative Market Solutions Pvt Ltd
Bengaluru, Karnataka
What this role gave me: Python, SQL, and the habit of measuring before deciding.
  • Analyzed large-scale marketing campaign datasets in Python and SQL. Surfaced patterns in audience segmentation, user behavior, and KPI performance for client teams.
  • Built interactive dashboards to track ROI and turned open-ended business questions into structured analyses that marketing and product teams used to make calls. The "measure honestly, then decide" approach that runs through my engineering work started here.
2022
Feb 2022 to May 2025

Executive Lead

ME-RIISE Foundation (Section 8 Company)
Hassan, Karnataka · previously Startup & Research Lead
What this role gave me: the long-haul muscle of running something across years, not weeks.
  • State-Level Event Coordinator for three consecutive years (2022 to 2024). Led a 30-person cross-functional team across technology, design, and operations.
  • Ran digital strategy and supported early-stage student startups through mentorship, resource allocation, and industry partnerships. Owning a system end to end, the same way I own my projects now, started with owning this org.

My Education

Dec 2021 to June 2025
Hassan, Karnataka
Malnad College of Engineering
Bachelor of Engineering, Information Science and Engineering
CGPA: 9.28 / 10.0