Pratheeka Ganapathi

I ship production-grade systems that solve real problems.

Full-stack AI engineer building production-grade systems around frontier APIs and local models. I design for real constraints (a 12GB RTX 2060, a tight per-document budget, a deadline) and measure honestly with benchmarks I built myself. Recent work: a dual-LLM document extractor at $0.05 to $0.15 per PDF, async job infrastructure that hot-swaps multimodal models on a single consumer GPU, and a 7-metric evaluation harness that drove 12 iterations to F1 = 1.00. I own the work end to end: architecture, code, deployment, and the writeup.

AI Engineer

Production LLM pipelines & evaluation

Multimodal extraction, dual-LLM orchestration, custom benchmarks. Frontier APIs and local inference on consumer GPUs.

Full-Stack

FastAPI, React, async infrastructure

Containerized stacks with Docker Compose and Traefik. Async job layers with Dramatiq, Redis, and SSE. Built end to end.

Contact LinkedIn GitHub

My Work

Two projects I've built end to end. Architecture, code, deployment, and the benchmarks that decide what ships.

In Progress Apr 2026 to Present

Advanced Content Summarizer

A full-stack pipeline that turns PDFs, images, or raw text into laid-out PDFs with context-aware summaries and actionable next-step CTAs.

Dual-LLM extraction with image fidelity. PyMuPDF + PaddleOCR + MiniCPM-V parse the source; Qwen 2.5 summarizes; Claude handles CTAs and layout reasoning; Gemini regenerates image regions; WeasyPrint renders the final PDF. Unique-ID tracking keeps every image bound to its original context. Lands at $0.05 to $0.15 per document.
Async infrastructure for a single consumer GPU. FastAPI + Dramatiq + Redis + SSE coordinate long-running jobs while sequential GPU swaps fit MiniCPM-V and Qwen 2.5 onto a 12GB RTX 2060. No cloud GPU, no model downgrades, no compromises on quality.
One workstation, fully containerized. Docker Compose, Traefik, and the NVIDIA Container Toolkit make the whole stack reproducible. Clone, compose up, ship.
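The sequential GPU swap described above can be sketched as a context manager that keeps at most one model resident at a time. This is a minimal illustration of the pattern, not the project's actual code: `load_vision_model`, `load_summarizer`, and the `events` log are hypothetical stand-ins, and a real version would free GPU memory where the comment indicates.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List

# Hypothetical log of load/unload events so the swap order is observable.
events: List[str] = []

Model = Callable[[str], str]


@contextmanager
def resident(name: str, loader: Callable[[], Model]) -> Iterator[Model]:
    """Load one model, yield it, and release it before the next one loads."""
    events.append(f"load {name}")
    model = loader()
    try:
        yield model
    finally:
        # Assumption: a real implementation would drop references and free
        # GPU memory here (for example `del model; torch.cuda.empty_cache()`).
        events.append(f"unload {name}")


def load_vision_model() -> Model:
    # Stand-in for loading a multimodal parser onto the GPU.
    return lambda page: f"parsed({page})"


def load_summarizer() -> Model:
    # Stand-in for loading a summarization model onto the GPU.
    return lambda text: f"summary({text})"


def run_job(page: str) -> str:
    """Run both stages back to back, never holding both models at once."""
    with resident("vision", load_vision_model) as vision:
        parsed = vision(page)
    with resident("summarizer", load_summarizer) as summarize:
        return summarize(parsed)
```

The point of the pattern is that the first model is fully released before the second one loads, so peak memory is bounded by the larger of the two rather than their sum.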

FastAPI React Docker Qwen 2.5 MiniCPM-V Claude Gemini PaddleOCR WeasyPrint

Known limitations and planned improvements are still being documented.

Shipped Mar 2026

PDF-to-Speech Reader

A system that turns book PDFs into spoken audio. The pipeline takes any PDF, removes layout noise (headers, footers, page numbers, watermarks, copyright pages), and converts the clean text into audio.

Architecture history

Three architectures. Each one replaced the last.

v1 PyMuPDF + heuristics

8-stage filtering pipeline, including repetition detection, positional filtering, page number detection, font-size analysis, watermark filtering, and column-aware reading order. 25 to 500 ms per book. Zero cost. Worked on clean PDFs but failed on real books with unique header/footer patterns.
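As an illustration of the repetition-detection stage, here is a minimal sketch that drops lines recurring across most pages. It assumes each page is already a list of text lines; the `min_share` threshold of 0.6 and the digit-collapsing normalization are assumptions for this example, not values taken from the project.

```python
import re
from collections import Counter
from typing import List


def normalize(line: str) -> str:
    """Collapse digits and whitespace so 'Page 12' and 'Page 13' compare equal."""
    return re.sub(r"\d+", "#", " ".join(line.split())).lower()


def strip_repeated_lines(pages: List[List[str]], min_share: float = 0.6) -> List[List[str]]:
    """Drop lines whose normalized form appears on at least `min_share` of pages."""
    seen: Counter = Counter()
    for page in pages:
        # Count each distinct line once per page, not once per occurrence.
        seen.update({normalize(line) for line in page})
    threshold = min_share * len(pages)
    repeated = {key for key, count in seen.items() if count >= threshold}
    return [[line for line in page if normalize(line) not in repeated] for page in pages]


pages = [
    ["My Book Title", "Chapter one begins here.", "Page 1"],
    ["My Book Title", "The story continues.", "Page 2"],
    ["My Book Title", "It ends quietly.", "Page 3"],
]
cleaned = strip_repeated_lines(pages)
```

A frequency rule like this handles running headers and page numbers, but it cannot catch headers that change on every page, which is consistent with the failure mode noted above.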

v2 PyMuPDF + Gemini calibration

Gemini analyzes 4 sample pages at 72 DPI and returns a per-book layout profile. F1 jumped from 0.84 to 1.00 on real books. ~1,200 lines of code.

v3 · Production Full Gemini extraction

Dropped the heuristic pipeline entirely. Same F1 as v2, 375 lines total. Less code, same accuracy, easier to maintain.

TTS pipeline

Started with Edge-TTS for prototyping, then moved to Kokoro-82M for production. It runs fully locally and is Apache-licensed. Voice blending via torch.lerp, abbreviation expansion, sentence-boundary splitting, 0.98x pacing, ffmpeg loudnorm at -16 LUFS.
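The text-preparation and loudness steps can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the abbreviation table is hypothetical and tiny, the sentence splitter is a naive regex, and the function names are invented for this example. Voice blending and pacing are left out because they depend on the TTS model itself.

```python
import re
from typing import Dict, List

# Hypothetical abbreviation table; a real one would be much longer.
ABBREVIATIONS: Dict[str, str] = {
    "Dr.": "Doctor",
    "Mr.": "Mister",
    "e.g.": "for example",
}


def expand_abbreviations(text: str) -> str:
    """Replace known abbreviations so their periods are not read as sentence ends."""
    for short, spoken in ABBREVIATIONS.items():
        text = text.replace(short, spoken)
    return text


def split_sentences(text: str) -> List[str]:
    """Split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def loudnorm_command(src: str, dst: str, target_lufs: float = -16.0) -> List[str]:
    """Build an ffmpeg argument list that applies the loudnorm filter."""
    return ["ffmpeg", "-y", "-i", src, "-af", f"loudnorm=I={target_lufs}", dst]


sentences = split_sentences(expand_abbreviations("Dr. Rao arrived. She sat down!"))
command = loudnorm_command("chapter.wav", "chapter_normalized.wav")
```

Expanding abbreviations before splitting matters here: without it, the period in "Dr." would be treated as a sentence boundary and the synthesized audio would pause in the wrong place.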

Evaluation

Built a standalone benchmark dashboard (React + Recharts). 7 metrics, 12 runs.

F1 = 1.00 (production)

81% system health

~$0.015 per 256-page book
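For reference, the F1 figure above is the harmonic mean of precision and recall. The sketch below shows one way to compute it, assuming the extractor's output and the ground truth are compared as sets of kept text lines; how the actual harness defines a matching item is not stated here, so treat the set-based comparison as an assumption.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str], expected: Set[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for two sets of extracted items."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


expected = {"line a", "line b", "line c", "line d"}
partial = precision_recall_f1({"line a", "line b", "line c", "stray header"}, expected)
perfect = precision_recall_f1(set(expected), expected)
```

In this toy case the partial run keeps one stray header and misses one real line, so precision, recall, and F1 all come out to 0.75, while the perfect run scores 1.0.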

Known limitations

Gemini free tier: 15 RPM, 1M tokens/day (~6 books/day)
20 MB PDF size cap
25 to 32 s latency per book, not real-time
Kokoro lacks word-level timing, so the benchmark MOS underreports at 3.3 versus an actual ~4.5

Planned improvements

LayoutLM / DiT-based document understanding to eliminate API dependency
PaddleOCR for scanned documents without text layers
Celery + Redis queue for batch processing around RPM limits
Streaming TTS for immediate playback during synthesis
More pipelines of this kind, focused on reducing token usage and LLM costs without sacrificing output quality
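To make the idea of batching around RPM limits concrete, here is a minimal in-process sliding-window limiter. It is not the planned Celery + Redis design; it only illustrates the pacing logic a worker would need. The class name, the injectable `clock` and `sleep` parameters, and the defaults (15 calls per 60 seconds, matching the free-tier limit listed above) are assumptions for this sketch.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any `period` seconds."""

    def __init__(
        self,
        max_calls: int = 15,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls: Deque[float] = deque()

    def acquire(self) -> None:
        """Block until another call fits inside the window, then record it."""
        now = self.clock()
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()  # forget calls that have left the window
        if len(self.calls) >= self.max_calls:
            # Wait until the oldest recorded call ages out of the window.
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self.calls.popleft()
        self.calls.append(now)
```

A worker would call `acquire()` before each API request. With several workers, the timestamps would have to live in shared storage such as Redis rather than in process memory.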

FastAPI PyMuPDF pdf2image Gemini 2.5 Flash Kokoro-82M Edge-TTS React 18 TypeScript Zustand TanStack Query Docker

My Stack

What I reach for, and what I've actually shipped with.

Languages

Python SQL JavaScript C

ML & Data

PyTorch TensorFlow scikit-learn NumPy Pandas Matplotlib Pillow

AI & ML

NLP Computer vision Multimodal AI Prompt engineering LLMs RAG Hugging Face Transformers LangChain Inference pipelines Model evaluation Data preprocessing

Backend

FastAPI React PostgreSQL Dramatiq Redis Server-Sent Events

DevOps

Docker Docker Compose Traefik NVIDIA Container Toolkit Git GitHub Jupyter Google Colab

My Career

Four roles before I focused on engineering full-time. Each one taught me something the projects above depend on: shipping under a deadline, owning quality, reading data honestly, and leading people who don't report to you.

2026

Dec 2025 to Feb 2026

Founder's Office Intern

Wireone Labs Pvt Ltd

Remote

What this role gave me: an instinct for product quality and a habit of writing things down clearly.

Owned product QA end to end for an early-stage SaaS platform. Ran systematic test cycles and data-validation workflows that caught issues before they reached users. The same discipline now shows up in the 7-metric benchmark harness I built for the PDF-to-Speech project.
Authored user documentation, how-to guides, and onboarding flows. Built ClickUp automations for internal workflows and contributed to founder-facing pitch materials. Clear technical writing is how I keep my own systems maintainable.

2025

Aug 2025 to Nov 2025

Operations Intern

Maker Zone

Bengaluru, Karnataka

What this role gave me: comfort working inside a 0-to-1 startup where nothing is set up yet.

Ran the Amazon catalog and listings pipeline. Owned keyword research, performance tracking, and supplier coordination, then translated catalog data into founder-level recommendations. That habit of turning messy data into a decision is the same instinct I bring to evaluation dashboards now.

2025

Jan 2025 to May 2025

Digital Data Analyst

HiveMinds Innovative Market Solutions Pvt Ltd

Bengaluru, Karnataka

What this role gave me: Python, SQL, and the habit of measuring before deciding.

Analyzed large-scale marketing campaign datasets in Python and SQL. Surfaced patterns in audience segmentation, user behavior, and KPI performance for client teams.
Built interactive dashboards to track ROI and turned open-ended business questions into structured analyses that marketing and product teams used to make calls. The "measure honestly, then decide" approach that runs through my engineering work started here.

2022

Feb 2022 to May 2025

Executive Lead

ME-RIISE Foundation (Section 8 Company)

Hassan, Karnataka · previously Startup & Research Lead

What this role gave me: the long-haul muscle of running something across years, not weeks.

State-Level Event Coordinator for three consecutive years (2022 to 2024). Led a 30-person cross-functional team across technology, design, and operations.
Ran digital strategy and supported early-stage student startups through mentorship, resource allocation, and industry partnerships. Owning a system end to end, the same way I own my projects now, started with owning this org.

My Education

Dec 2021 to June 2025

Hassan, Karnataka

Malnad College of Engineering

Bachelor of Engineering, Information Science and Engineering

CGPA: 9.28 / 10.0
