Projects

Building production-ready AI systems and tools

TitanCompute

agkavin/TitanCompute

Distributed LLM inference engine with a zero-proxy streaming architecture and multi-criteria decision analysis (MCDA) scheduling across heterogeneous consumer devices

Golang, Python, gRPC, Docker
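
As a rough illustration of what MCDA scheduling can look like, here is a minimal weighted-sum sketch in Python. It is not taken from the TitanCompute repository: the `Node` fields, the criteria, and the weights are all hypothetical, and the real scheduler may use a different MCDA method entirely.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of one device in the cluster."""
    name: str
    free_vram_gb: float  # higher is better
    latency_ms: float    # lower is better
    queue_depth: int     # lower is better


# Assumed criteria weights (illustrative only); they sum to 1.0.
WEIGHTS = {"free_vram_gb": 0.5, "latency_ms": 0.3, "queue_depth": 0.2}
# Criteria where a smaller raw value is preferable.
LOWER_IS_BETTER = {"latency_ms", "queue_depth"}


def normalize(values, invert):
    """Min-max scale values to [0, 1]; flip the scale for cost criteria."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [1.0 - s for s in scaled] if invert else scaled


def score_nodes(nodes):
    """Return a weighted-sum score per node name."""
    scores = {n.name: 0.0 for n in nodes}
    for criterion, weight in WEIGHTS.items():
        raw = [float(getattr(n, criterion)) for n in nodes]
        norm = normalize(raw, criterion in LOWER_IS_BETTER)
        for node, value in zip(nodes, norm):
            scores[node.name] += weight * value
    return scores


def pick_node(nodes):
    """Pick the node with the highest aggregate score."""
    scores = score_nodes(nodes)
    return max(nodes, key=lambda n: scores[n.name])


nodes = [
    Node("laptop", free_vram_gb=4.0, latency_ms=40.0, queue_depth=1),
    Node("desktop", free_vram_gb=12.0, latency_ms=25.0, queue_depth=3),
    Node("mini-pc", free_vram_gb=2.0, latency_ms=15.0, queue_depth=0),
]
best = pick_node(nodes)
print(best.name, score_nodes(nodes))
```

With these made-up numbers the desktop wins: its large free VRAM outweighs its longer queue. Changing the weights shifts that trade-off.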

RAG Pipelines

agkavin/RAG-Pipelines

Modular agentic RAG framework with autonomous tool-based reasoning, web-search retrieval, and multi-source document querying capabilities

LangChain, ChromaDB, Tavily, Gemini

FloatChat

agkavin/FloatChat-Backend

Production-grade NL-to-SQL analytics platform with hybrid RAG architecture, automated anomaly detection, and scientific PDF report generation

FastAPI, Vanna AI, PostgreSQL, Agents

Speech X

Real-time conversational avatar system optimized by preloading models into RAM, applying latent space optimizations in the VAE, and caching intermediate results to minimize recomputation

FastAPI, Whisper, TTS, Optimization

SmartRail Triage

agkavin/Railway-Complaint-Management

Multimodal complaint classification system using Vision Transformers and OCR for automated ticket metadata extraction and severity-based routing

Python, Azure Vision, Multi-Modal, OCR

llm-finetuning

agkavin/llm-finetuning

Fine-tunes Phi-3-mini-4k-instruct on an arXiv math dataset via QLoRA, targeting domain-specific mathematical queries, with local inference support

LLM, Fine-Tuning, QLoRA, Math Dataset

Tokenizer From Scratch

agkavin/Tokenizer-From-Scratch

Toolkit for building custom tokenizers from scratch, covering fundamental text tokenization techniques without pre-built libraries

Tokenizer, NLP, Python
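
To give a flavor of library-free tokenization, the sketch below trains a tiny byte-pair-encoding (BPE) style tokenizer using only the Python standard library. BPE is chosen here purely as an example of a fundamental technique; the repository may cover different or additional methods, and the function names (`train_bpe`, `encode`) are hypothetical.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for symbols, freq in words.items():
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn an ordered list of merges from whitespace-split text."""
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        counts = get_pair_counts(words)
        if not counts:
            break
        best = max(counts, key=counts.get)  # ties: first pair seen wins
        merges.append(best)
        words = merge_pair(words, best)
    return merges


def encode(word, merges):
    """Apply the learned merges, in training order, to a single word."""
    symbols = list(word)
    for pair in merges:
        i = 0
        while i + 1 < len(symbols):
            if (symbols[i], symbols[i + 1]) == pair:
                symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
            else:
                i += 1
    return symbols


merges = train_bpe("low low low lower lowest", num_merges=2)
tokens = encode("lowest", merges)
print(merges, tokens)
```

On this toy corpus the two learned merges build the symbol `low`, so `lowest` is split into `low`, `e`, `s`, `t`. A real tokenizer would also handle word boundaries, unknown characters, and a mapping from tokens to integer IDs.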

Transformer From Scratch

agkavin/Transformer-From-Scratch

From-scratch PyTorch implementation of the Transformer, covering the core architecture from 'Attention Is All You Need', including multi-head self-attention and positional encoding

PyTorch, Transformer, Deep Learning
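
For reference, the sinusoidal positional encoding from 'Attention Is All You Need' sets even dimensions to sin(pos / 10000^(2i/d_model)) and odd dimensions to the matching cosine. The sketch below computes it in plain Python rather than PyTorch so that it runs without dependencies; the function name `positional_encoding` is hypothetical and the code is not taken from the repository.

```python
import math


def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        # `i` walks the even dimension indices, so i / d_model equals
        # the paper's 2k / d_model exponent for dimension pair k.
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


pe = positional_encoding(seq_len=4, d_model=8)
for row in pe:
    print([round(x, 3) for x in row])
```

Position 0 always encodes to alternating 0 and 1, since sin(0) = 0 and cos(0) = 1. In a PyTorch model this table would typically be precomputed once as a tensor and added to the token embeddings.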