Explore my latest work in AI and Data Engineering
Built an LLM-powered Retrieval-Augmented Generation (RAG) system that allows users to ask natural language questions about 100+ universities. Integrated data from Quora and Reddit using scalable crawlers. Designed chunking, metadata tagging, and vector search using LangChain and OpenAI to return accurate, context-aware responses.
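As a rough illustration of the chunking and metadata tagging step, here is a minimal sketch in plain Python. It assumes word-based windows with a fixed overlap; the names `Chunk` and `chunk_text`, the metadata fields, and the default sizes are hypothetical and are not taken from the project, which relied on LangChain and OpenAI for the real pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of source text plus the metadata used to filter search results."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_text(text, source, university, chunk_size=200, overlap=50):
    """Split text into overlapping word windows and tag each with metadata.

    chunk_size and overlap are counted in words (an assumption for this
    sketch; token-based splitting is also common).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        chunks.append(
            Chunk(
                text=" ".join(window),
                metadata={
                    "source": source,          # e.g. "reddit" or "quora"
                    "university": university,  # lets a query filter by school
                    "start_word": start,       # position in the original post
                },
            )
        )
        # Stop once this window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk would then be embedded and stored in a vector index, with the metadata kept alongside it so a question about one university can be restricted to chunks tagged with that school.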
Created an end-to-end machine learning pipeline to classify search queries into relevant marketing categories. Implemented text preprocessing, fine-tuned a BERT model for classification, and deployed real-time prediction APIs on Google Cloud. The predictions are used for search analytics and content tagging.
Fine-tuned a lightweight local LLM to support inventory-based Q&A for a tech store. Enabled natural language queries about stock levels, pricing, accessories, and restocking via a chatbot interface. Focused on domain-specific vocabulary and retrieval consistency.
Built robust, multi-threaded crawlers for platforms like Quora, Reddit, LinkedIn, and Twitter. Collected millions of records with error handling, deduplication, and real-time storage into MongoDB. These datasets fuel various NLP and analytics pipelines.
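The combination of threading, error handling, and deduplication can be sketched as follows. This is a simplified illustration under stated assumptions, not the crawlers themselves: `DedupStore`, `crawl`, and the `fetch` callable are hypothetical names, records are assumed to be dicts with a `text` field, and an in-memory list stands in for MongoDB.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


class DedupStore:
    """Thread-safe in-memory store that drops records with duplicate text."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()
        self.records = []

    def add(self, record):
        # Normalize case and whitespace so trivially different copies collide.
        normalized = " ".join(record["text"].lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            self.records.append(record)
            return True


def crawl(urls, fetch, store, max_workers=8):
    """Fetch URLs in parallel, store unique records, return the failed URLs.

    `fetch` is any callable that takes a URL and returns a record dict;
    it is supplied by the caller and is an assumption of this sketch.
    """

    def worker(url):
        try:
            store.add(fetch(url))
            return None
        except Exception:
            # A failed page is reported instead of stopping the whole run.
            return url

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return [url for url in pool.map(worker, urls) if url is not None]
```

In a real deployment the same deduplication is often pushed into the database, for example with a unique index on the content hash, so that it survives restarts and works across several crawler processes.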