Projects

Explore my latest work in AI and Data Engineering

University Q&A System with LLM & RAG

Built an LLM-powered Retrieval-Augmented Generation (RAG) system that lets users ask natural-language questions about 100+ universities. Integrated data from Quora and Reddit using scalable crawlers. Designed chunking, metadata tagging, and vector search using LangChain and OpenAI to return accurate, context-aware responses.

LangChain, RAG, OpenAI, ChromaDB, Python, MongoDB, Vector Search
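
As a rough illustration of the chunking, metadata tagging, and vector search steps described for this project, here is a minimal sketch. It is not the project's code: the function names, the sample posts, and the university names are hypothetical, and a toy bag-of-words vector stands in for the OpenAI embeddings and ChromaDB index so the example runs without any dependencies.

```python
import math
import re
from collections import Counter


def chunk_text(text, source, university, max_words=40, overlap=10):
    """Split text into overlapping word windows, each tagged with metadata."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append({
            "text": " ".join(window),
            "metadata": {"source": source, "university": university, "start_word": start},
        })
        # Stop once a window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query, chunks, university=None, top_k=2):
    """Rank chunks by similarity to the query, optionally filtering on metadata."""
    q = embed(query)
    candidates = [c for c in chunks
                  if university is None or c["metadata"]["university"] == university]
    ranked = sorted(candidates, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:top_k]


# Hypothetical crawled posts: (source, university, text).
posts = [
    ("reddit", "Example University A",
     "The computer science program has small classes and strong research labs."),
    ("quora", "Example University B",
     "Tuition is high but scholarships cover housing for many students."),
]
index = [c for source, uni, text in posts for c in chunk_text(text, source, uni)]
best = search("Are there scholarships for housing?", index)[0]
print(best["metadata"]["university"], "|", best["text"])
```

The retrieved chunk and its metadata would then be placed into the LLM prompt as context. The metadata filter in `search` shows why tagging matters: a question about one university can be restricted to chunks from that university before ranking.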

Search Query Trend Prediction Platform

Engineered a data pipeline to process 100M+ monthly Google SERP records and forecast keyword trends using clustering and regression models. Handled large-scale data ingestion, transformation, and analytics for business insights.

ETL, BigQuery, Dataflow, ML, Python, Scikit-learn, GCP
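
To make the regression part of the trend forecasting concrete, here is a minimal sketch, assuming each keyword has a short series of monthly search volumes. The keywords, volumes, and the 5% threshold are invented for illustration, and the fit is written in plain Python rather than Scikit-learn so it runs without dependencies. The clustering step is not shown.

```python
def fit_line(ys):
    """Ordinary least squares fit of y = slope * t + intercept for t = 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, y_mean - slope * t_mean


def forecast(ys, steps_ahead=1):
    """Extrapolate the fitted line beyond the last observed month."""
    slope, intercept = fit_line(ys)
    return slope * (len(ys) - 1 + steps_ahead) + intercept


def label_trend(ys, threshold=0.05):
    """Label a series by its monthly slope relative to its mean volume."""
    slope, _ = fit_line(ys)
    mean = sum(ys) / len(ys)
    relative = slope / mean if mean else 0.0
    if relative > threshold:
        return "rising"
    if relative < -threshold:
        return "falling"
    return "flat"


# Hypothetical monthly search volumes per keyword.
monthly_volume = {
    "hypothetical keyword a": [100, 120, 140, 160],
    "hypothetical keyword b": [500, 495, 505, 500],
}
for keyword, series in monthly_volume.items():
    print(keyword, label_trend(series), round(forecast(series), 1))
```

Normalizing the slope by the mean volume keeps high-volume and low-volume keywords comparable, which matters when one threshold is applied across many queries.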

ML-Based Topic Categorization Engine

Created an end-to-end machine learning pipeline to classify search queries into relevant marketing categories. Implemented text preprocessing, model training using fine-tuned BERT, and deployed real-time prediction APIs on Google Cloud. Used for search analytics and content tagging.

ML, Text Classification, BERT, Python, Google Cloud, APIs

LLM-Powered Resume Builder using Agents

Developed a multi-agent LLM system to generate customized candidate resumes based on job descriptions and career goals. Orchestrated different agents for formatting, tone control, and ATS optimization using LangChain.

LangChain, Agents, OpenAI, LLMs, Python, Prompt Engineering

LLM Fine-Tuning for Tech Store Inventory Support

Fine-tuned a lightweight local LLM to support inventory-based Q&A for a tech store. Enabled natural-language queries about stock levels, pricing, accessories, and restocking via a chatbot interface. Focused on domain-specific vocabulary and retrieval consistency.

LLM Fine-Tuning, Quantized Models, LangChain, Vector Search, RAG, Python

Kafka-Based Warehouse Notification System

Designed a real-time Kafka-based notification system to monitor warehouse inventory updates and send alerts across services. Integrated multiple data sources using producers and consumers with fault-tolerant event handling.

Kafka, Streaming, Python, Google Cloud, ETL, Notifications

Scalable Web Crawlers for Business Intelligence

Built robust, multi-threaded crawlers for platforms like Quora, Reddit, LinkedIn, and Twitter. Collected millions of records with error handling, deduplication, and real-time storage into MongoDB. These datasets fuel various NLP and analytics pipelines.

Scrapy, Selenium, MongoDB, Python, Error Handling, Web Crawling
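
The combination of multi-threading, error handling, and deduplication mentioned for the crawlers can be sketched as follows. This is an assumption-laden illustration, not the crawlers themselves: `FAKE_PAGES`, `fetch`, and the URLs are hypothetical stand-ins for real HTTP requests, and a plain dictionary stands in for the MongoDB collection.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical in-memory pages standing in for real HTTP responses.
FAKE_PAGES = {
    "https://example.com/a": "Great campus, tough exams.",
    "https://example.com/b": "great campus,   tough exams.",  # near-identical repost
    "https://example.com/c": "Housing is expensive.",
}


def fetch(url):
    """Stand-in for a network fetch; raises for unknown URLs like a failed request."""
    if url not in FAKE_PAGES:
        raise ConnectionError(f"cannot reach {url}")
    return FAKE_PAGES[url]


def fingerprint(text):
    """Hash of case- and whitespace-normalized text, used as the dedup key."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def crawl(urls, max_workers=4):
    """Fetch URLs concurrently, keep one record per fingerprint, collect failures."""
    store = {}   # fingerprint -> record (stand-in for a database collection)
    errors = []  # (url, message) for every failed fetch
    lock = threading.Lock()

    def work(url):
        text = fetch(url)
        key = fingerprint(text)
        with lock:
            # setdefault keeps the first record seen for a given fingerprint.
            store.setdefault(key, {"url": url, "text": text})

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(work, u): u for u in urls}
        for fut in as_completed(futures):
            try:
                fut.result()
            except Exception as exc:
                # One failed page is recorded and does not stop the crawl.
                errors.append((futures[fut], str(exc)))
    return store, errors


store, errors = crawl(list(FAKE_PAGES) + ["https://example.com/missing"])
print(len(store), "unique records;", len(errors), "failed fetch")
```

In a database-backed version, one common approach is to put a unique index on the fingerprint field so that duplicates are rejected at write time instead of being tracked in memory.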

QA System for CSV & Excel Files (LLM-Based)

Built a question-answering system that enables users to ask questions about large structured datasets (CSV/XLSX) without using RAG. Handled preprocessing, indexing, and dynamic summarization using OpenAI models with context window optimizations.

OpenAI, Python, Pandas, Data Parsing, NLP, LLMs
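
One way to answer questions about a large table without retrieval is to send the model a compact profile of the data instead of the raw rows. The sketch below illustrates that idea only; it is an assumption about how such summarization could look, not the project's method. The sample table, the function names, and the character budget are hypothetical, the standard `csv` module replaces Pandas so the example has no dependencies, and the actual model call is left out.

```python
import csv
import io


def profile_csv(text, max_samples=3):
    """Summarize each column: numeric columns get min/max/mean, others get samples."""
    rows = list(csv.DictReader(io.StringIO(text)))
    profile = {"row_count": len(rows), "columns": {}}
    if not rows:
        return profile
    for name in rows[0].keys():
        values = [r[name] for r in rows if r[name] not in ("", None)]
        try:
            nums = [float(v) for v in values]
        except ValueError:
            nums = None  # at least one value is not a number
        if nums:
            profile["columns"][name] = {
                "type": "numeric",
                "min": min(nums),
                "max": max(nums),
                "mean": sum(nums) / len(nums),
            }
        else:
            distinct = list(dict.fromkeys(values))  # de-duplicate, keep order
            profile["columns"][name] = {
                "type": "text",
                "distinct": len(distinct),
                "samples": distinct[:max_samples],
            }
    return profile


def build_prompt(profile, question, max_chars=1000):
    """Render the profile as text, cut to a character budget, and append the question."""
    lines = [f"The table has {profile['row_count']} rows."]
    for name, info in profile["columns"].items():
        if info["type"] == "numeric":
            lines.append(f"- {name}: numeric, min={info['min']:g}, "
                         f"max={info['max']:g}, mean={info['mean']:g}")
        else:
            lines.append(f"- {name}: text, {info['distinct']} distinct values, "
                         f"e.g. {', '.join(info['samples'])}")
    summary = "\n".join(lines)[:max_chars]
    return f"{summary}\n\nQuestion: {question}"


# Hypothetical table contents.
SAMPLE = "product,price,stock\nLaptop,1200,5\nMouse,25,40\nMonitor,300,12\n"
profile = profile_csv(SAMPLE)
prompt = build_prompt(profile, "Which product is the most expensive?")
print(prompt)
```

The profile grows with the number of columns rather than the number of rows, which is what keeps the prompt inside the context window for large files. Questions that need row-level detail would require an extra step, such as having the model propose a filter that is then run against the data.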

LLM-Based Stock Sentiment Analyzer

Developed a sentiment analysis tool for stock-related discussions using LLMs such as GPT. Extracted and analyzed user emotions from Reddit and Twitter posts using prompt-engineered queries. Translated sentiment trends into actionable investment signals for analysts.

OpenAI, LangChain, Sentiment Analysis, Python, Prompt Engineering, LLMs
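
The step from per-post sentiment to a trend signal can be sketched as below. This is a simplified assumption, not the analyzer's actual logic: the posts, the ticker `XYZ`, the dates, the label set, and the 0.3 threshold are all hypothetical, and the labels are supplied directly where the real tool would obtain them from an LLM prompt.

```python
from collections import defaultdict

# Numeric score for each label an LLM classifier is assumed to return.
SCORES = {"positive": 1, "neutral": 0, "negative": -1}


def daily_sentiment(posts):
    """Average the label scores per (ticker, day)."""
    buckets = defaultdict(list)
    for post in posts:
        buckets[(post["ticker"], post["day"])].append(SCORES[post["label"]])
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


def signal(ticker, averages, threshold=0.3):
    """Compare the latest day's average with the earliest one for a ticker."""
    days = sorted(day for (t, day) in averages if t == ticker)
    if len(days) < 2:
        return "insufficient data"
    change = averages[(ticker, days[-1])] - averages[(ticker, days[0])]
    if change > threshold:
        return "improving"
    if change < -threshold:
        return "deteriorating"
    return "stable"


# Hypothetical labeled posts; ISO dates sort correctly as plain strings.
posts = [
    {"ticker": "XYZ", "day": "2024-01-01", "label": "negative"},
    {"ticker": "XYZ", "day": "2024-01-01", "label": "neutral"},
    {"ticker": "XYZ", "day": "2024-01-02", "label": "positive"},
    {"ticker": "XYZ", "day": "2024-01-02", "label": "positive"},
]
averages = daily_sentiment(posts)
print(signal("XYZ", averages))
```

Averaging per day before comparing smooths out single noisy posts, and the threshold keeps small day-to-day fluctuations from being reported as a change in sentiment.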