Chapter I  ·  The Struggler

Hitendra Kawale

"If you're a struggler — struggle, endure, contend." — The Skull Knight

Open to AI engineering roles

AI Engineer

MSc Artificial Intelligence (University of Surrey). I build production-grade AI systems — LLM platforms, RAG pipelines, and interactive 3D vision — backend-heavy, deployment-aware, and obsessed with what holds up in the real world. I'm also a polyglot: I work comfortably in English and Spanish, hold a conversation in Mandarin, carry Hindi and Marathi as mother tongues, and read or speak a half-dozen more. Global teams and international users feel like home.

Polyglot: English · Spanish · Mandarin (conversational) · Hindi · Marathi · Japanese, German, Norwegian (basics)

Arsenal

Python · PyTorch · Hugging Face · LangGraph · RAG · 3D Gaussian Splatting · Ollama · FastAPI · PostgreSQL · pgvector · Qdrant · LanceDB · Redis · SQL · Docker · AWS · GitHub Actions · Prometheus · Grafana · Next.js · TypeScript

Built · AI-powered log anomaly-detection platform

Log Guardian

Microservice platform that ingests application logs and scores each one for anomalies with a Random Forest model (ROC-AUC 0.89), backed by a versioned model registry and a human-feedback retraining loop with score-drift detection. Synchronous REST and Kafka-backed streaming share one scoring pipeline, observed through OpenTelemetry traces (propagated through Kafka), Prometheus/Alertmanager, and Grafana dashboards. Covered by 49 tests and CI, load-tested with k6 at ~220 req/s (p95 ~24 ms), and shipped via Docker Compose and Kubernetes.

Python · FastAPI · Kafka · PostgreSQL · OpenTelemetry · Prometheus · Grafana · Kubernetes
Built · Self-hosted LLM platform with OpenAI-compatible APIs

Mini OpenAI Platform

Microservices LLM platform — API gateway, RAG, embedding, and inference services — with an embedding-based semantic cache cutting latency ~80× on cache hits, a difficulty-based router dispatching prompts across Ollama model tiers, a CI quality gate on retrieval metrics (recall@k, MRR), and 16 Prometheus/Grafana panels covering latency, token economics, and answer quality.
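As a rough illustration of what a quality gate on recall@k and MRR can look like, here is a minimal sketch. The function names, the thresholds, and the document IDs are all hypothetical and are not taken from the project; the metric definitions themselves are the standard ones.

```python
from typing import List, Sequence, Set, Tuple

# One evaluation case: (ranked retrieved doc IDs, set of relevant doc IDs).
Run = Tuple[Sequence[str], Set[str]]


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Run]) -> float:
    """Average reciprocal rank over all evaluation cases."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def passes_gate(runs: List[Run], k: int, min_recall: float, min_mrr: float) -> bool:
    """Hypothetical CI gate: fail the build if either mean metric drops too low."""
    if not runs:
        return False
    mean_recall = sum(recall_at_k(r, rel, k) for r, rel in runs) / len(runs)
    return mean_recall >= min_recall and mean_reciprocal_rank(runs) >= min_mrr


# Made-up evaluation set, for illustration only.
runs: List[Run] = [
    (["d1", "d2", "d3"], {"d2"}),
    (["d4", "d5", "d6"], {"d4", "d9"}),
]
```

With this toy data, `passes_gate(runs, k=2, min_recall=0.7, min_mrr=0.7)` returns `True`, since both mean recall@2 and MRR come out at 0.75. A CI job could call such a function and exit non-zero when it returns `False`.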

Python · FastAPI · React · Qdrant · Ollama · Docker Compose · Prometheus · Grafana
Built · Turn AI chat transcripts into a searchable knowledge base

Chat2Study

Full-stack RAG app converting long AI chat transcripts into knowledge bases, study notes, and concept maps. A 10-node LangGraph pipeline orchestrates Playwright capture, artifact persistence, chunking, and embedding. An async job re-architecture cut ingestion API response times from 30–90 s to ~10 ms, and pgvector retrieval runs at ~50 ms. The stack also includes a provider-agnostic LLM factory, JWT auth, and full CI.
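The drop in response time comes from a general pattern: the API accepts the request, hands the slow work to a background worker, and returns a job ID right away. The sketch below shows that pattern with an in-process thread pool from the standard library. This is an assumption made for illustration, since the project's actual queue or worker setup is not described here, and `JobQueue` and `slow_ingest` are hypothetical names.

```python
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Optional


class JobQueue:
    """Runs tasks in background threads and tracks them by job ID."""

    def __init__(self, max_workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, task: Callable[[], str]) -> str:
        """Schedule the task and return its job ID without waiting for it."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(task)
        return job_id

    def status(self, job_id: str) -> str:
        """Report 'pending', 'done', or 'failed' for a known job ID."""
        future = self._jobs[job_id]
        if not future.done():
            return "pending"
        return "failed" if future.exception() else "done"

    def result(self, job_id: str, timeout: Optional[float] = None) -> str:
        """Block until the job finishes (or the timeout expires)."""
        return self._jobs[job_id].result(timeout=timeout)


def slow_ingest() -> str:
    """Stand-in for a long ingestion pipeline (hypothetical)."""
    time.sleep(0.5)
    return "ingested"
```

A request handler built on this would call `submit(slow_ingest)` and return the job ID to the client, which then polls `status(job_id)`. In a multi-process deployment the in-memory dictionary would need to be replaced by a shared store, so treat this as a single-process sketch only.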

Next.js · TypeScript · FastAPI · LangGraph · PostgreSQL · pgvector · MinIO · Docker