Welcome to my Portfolio

Full Stack Developer / Data Engineering / ML Ops / Data Analysis

> const skills = ['TypeScript', 'Python', 'Ruby', 'Java', 'Go'];
> const interests = ['Data', 'ML Ops', 'AI'];
> console.log("Let's get to work!");
"Let's get to work!"

Front End Dev

Project 1

SaaS Analytics Dashboard

A multi-tenant SaaS analytics dashboard that allows businesses to visualize key performance metrics from third-party APIs like Google Analytics or Stripe. Users can log in, connect their accounts, and view charts, tables, and trends for their KPIs.

React TypeScript TailwindCSS Node.js Express PostgreSQL Docker
Project 2

E-Commerce Platform

A full-featured e-commerce web app where users can browse products, add to cart, and pay via Stripe. Admins can add/edit products through a secure dashboard. Includes user auth, order tracking, and search functionality.

Next.js TailwindCSS Django MongoDB Stripe API Heroku
Project 3

Real-time Chat App

Real-time chat application where users can join public or private rooms, send messages, and see others typing live. Includes emoji support, file sharing, and presence tracking.

Vue.js Node.js Socket.io Express MongoDB
Project 4

Blog CMS Platform

A full-stack content management system that allows users to write, format, and publish blog posts using Markdown or WYSIWYG editors. Includes image upload, category tags, post drafts, and user roles (writer, editor, admin).

Vue.js Django MySQL
Project 5

Real Estate Explorer

An interactive tool to explore properties based on pricing and location. Users can view property details, historical pricing trends, and nearby amenities. It features secure authentication, user favorites, and a dashboard for saved searches. Real Estate Explorer is ideal for showcasing data visualization, geolocation, API integration, and CRUD functionality in a real-world context.

Next.js Python/FastAPI GeoPandas Folium
Project 6

AI Assistant App

The AI Assistant app is a full-stack application designed to provide contextual, personalized, real-time assistance to users. It features a chat-style interface with Markdown rendering, code highlighting, and tool integration panels. The assistant supports both text and voice input, with speech-to-text handled via the Whisper API. OpenAI's GPT-4 API serves as the core LLM engine for natural language processing and generation.

Next.js Python/FastAPI PostgreSQL Whisper API OpenAI's GPT-4 API LangChain

Backend Dev

Project 1

Serverless Image Processing Pipeline

A backend using AWS Lambda, S3, and API Gateway to process uploaded images: generating thumbnails, extracting metadata, and storing results. Uses AWS Batch for orchestration and CloudWatch for monitoring. Ideal for demonstrating event-driven, serverless architectures.

Python AWS Lambda AWS Batch
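
As a rough illustration of the thumbnail step, the sketch below shows two pieces such a Lambda handler might contain: reading bucket and key pairs from an S3 event notification, and computing thumbnail dimensions that preserve aspect ratio. The function names and the 256-pixel limit are assumptions for illustration, not taken from the project.

```python
from urllib.parse import unquote_plus


def thumbnail_size(width, height, max_side=256):
    """Scale (width, height) down so the longer side fits max_side; never upscale."""
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def uploaded_objects(event):
    """Extract (bucket, key) pairs from an S3 event notification payload.

    Object keys arrive URL-encoded in S3 events, so they are decoded here.
    """
    return [
        (record["s3"]["bucket"]["name"], unquote_plus(record["s3"]["object"]["key"]))
        for record in event.get("Records", [])
    ]
```

The actual image resize and the write back to S3 are left out; they would depend on the imaging library and bucket layout the project uses.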
Project 2

Real-Time Chat Analytics

Built with Node.js, Firestore (GCP), and Pub/Sub, this app collects and analyzes chat messages in real time. Cloud Functions compute metrics and push updates to a dashboard via WebSockets. Emphasizes GCP eventing and NoSQL.

Node.js Firestore Pub/Sub
Project 3

Secure File Storage API

Developed with FastAPI, GCS (Google Cloud Storage), and Google Identity Platform. Users can upload, download, and manage files with signed URLs. Includes Cloud Logging for audit trails and fine-grained IAM for access control.

FastAPI GCS Google Identity
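
To show the general idea behind signed URLs, here is a simplified sketch using an HMAC over the path and an expiry timestamp. This is not the scheme Google Cloud Storage uses (GCS has its own signing process); `SECRET_KEY`, `sign_url`, and `verify_url` are hypothetical names chosen for illustration.

```python
import hashlib
import hmac
from urllib.parse import urlencode

SECRET_KEY = b"demo-secret"  # hypothetical; a real service would load this from a secret store


def sign_url(path, expires_at, key=SECRET_KEY):
    """Return the path with an expiry timestamp and an HMAC-SHA256 signature over both."""
    message = f"{path}:{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires_at, 'sig': signature})}"


def verify_url(path, expires_at, signature, now, key=SECRET_KEY):
    """Accept the request only if the link has not expired and the signature matches."""
    if now > expires_at:
        return False
    expected = hmac.new(key, f"{path}:{expires_at}".encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)
```

Because the expiry is part of the signed message, a client cannot extend a link's lifetime by editing the query string.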
Project 4

Event-Driven PDF Generator

Users submit data via a REST API (FastAPI); jobs are queued in AWS SQS, processed by ECS Fargate, and output PDFs are saved to S3. CloudWatch Alarms notify on failures. Demonstrates async job pipelines on AWS.

FastAPI AWS SQS ECS Fargate
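
The queue-and-worker pattern described above can be sketched locally with the standard library. In this sketch an in-memory `queue.Queue` stands in for SQS, a thread stands in for the Fargate worker, and two plain containers stand in for S3 and the failure alarms; `render_pdf` is a hypothetical placeholder for the real renderer.

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for the SQS queue
results = {}           # stand-in for the S3 output bucket
failures = []          # stand-in for failure notifications


def render_pdf(payload):
    """Hypothetical renderer: returns a fake PDF body or raises on bad input."""
    if "title" not in payload:
        raise ValueError("missing title")
    return f"%PDF-stub {payload['title']}"


def worker():
    """Pull jobs until a None sentinel arrives; record each success or failure."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        try:
            results[job["id"]] = render_pdf(job["data"])
        except ValueError:
            failures.append(job["id"])
        finally:
            jobs.task_done()


def run(submitted):
    """Enqueue the submitted jobs and block until the worker has drained the queue."""
    thread = threading.Thread(target=worker)
    thread.start()
    for job in submitted:
        jobs.put(job)
    jobs.put(None)
    jobs.join()
    thread.join()
```

The API call returns as soon as the job is enqueued, which is the point of the design: slow rendering never blocks the request path.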
Project 5

IoT Data Aggregator

Collects temperature and sensor data from IoT devices via AWS IoT Core, processes data using Lambda, and stores it in DynamoDB. Includes a monitoring dashboard with metrics via CloudWatch and alerts via SNS.

AWS IoT AWS Lambda AWS DynamoDB
Project 6

Resume Ranking API

Uses GCP Cloud Run, Vertex AI Embeddings, and BigQuery to parse and rank resumes against job descriptions. Data is uploaded via Cloud Storage and results are cached in Memorystore. Showcases AI integration with scalable GCP services.

GCP Cloud Run Vertex AI BigQuery
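
Ranking by embeddings usually comes down to comparing vectors. The sketch below ranks resumes by cosine similarity to a job description, assuming the embeddings have already been computed elsewhere. The short vectors in the usage note are toy values, not real model output, and `rank_resumes` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_resumes(job_vector, resume_vectors):
    """Return (name, score) pairs sorted from most to least similar to the job."""
    scored = [(name, cosine(job_vector, vec)) for name, vec in resume_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, `rank_resumes([1, 0, 1], {"a": [1, 0, 1], "b": [0, 1, 0], "c": [1, 1, 0]})` orders the candidates a, c, b.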

Data Engineering

ELT Pipeline with Fivetran + dbt + BigQuery

Fivetran / dbt / BigQuery

Built a fully automated ELT pipeline using Fivetran to extract and load data from Stripe, HubSpot, and Google Analytics into BigQuery. Transformation logic is managed with dbt, including custom models, incremental materializations, and data quality tests. Used dbt Cloud for scheduled runs and Slack alerts. Designed semantic layers and documented the DAG in dbt docs for BI teams.

Fivetran dbt BigQuery Looker
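
In dbt, incremental materializations are written in SQL, typically guarded by the `is_incremental()` macro. The Python sketch below only illustrates the underlying logic: process rows newer than the target's high-water mark and upsert them by key. The column names `id` and `updated_at` are assumptions.

```python
def incremental_merge(target, source, key="id", cursor="updated_at"):
    """Upsert into target only the source rows newer than the target's high-water mark.

    target: dict mapping key value -> row dict (the already-built table)
    source: iterable of row dicts (the freshly loaded raw data)
    """
    high_water = max((row[cursor] for row in target.values()), default=None)
    for row in source:
        if high_water is None or row[cursor] > high_water:
            target[row[key]] = row
    return target
```

On a first run the target is empty, so every source row is loaded; later runs touch only rows that changed since the previous build.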

Reverse ETL with Hightouch + Snowflake

Hightouch / Snowflake

Implemented a reverse ETL project to sync customer segmentation data from Snowflake to marketing platforms like HubSpot and Facebook Ads using Hightouch. Defined dynamic models in dbt, including customer LTV and churn risk scores. The synced data enabled personalized campaigns and improved ad targeting, reducing CPA by 22%.

Hightouch Snowflake dbt HubSpot Facebook Ads

Real-Time Streaming Pipeline with Kafka + dbt + BigQuery

dbt / Kafka / BigQuery

Designed a real-time event ingestion pipeline where user activity events are streamed from applications via Kafka, processed with Kafka Streams, and loaded into BigQuery for downstream analytics. Applied transformations and aggregations using dbt, with BigQuery scheduled queries for time-window metrics. Monitored performance and freshness using Monte Carlo or Datafold.

Kafka BigQuery dbt Airflow
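
A time-window metric of the kind mentioned above can be illustrated with a tumbling-window count. This sketch assumes events arrive as `(epoch_seconds, event_type)` pairs and uses a 60-second window. In the pipeline itself this logic would live in Kafka Streams or SQL rather than Python.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Count events per (window_start, event_type) using fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, event_type in events:
        # Align each timestamp to the start of the window that contains it.
        window_start = timestamp - timestamp % window_seconds
        counts[(window_start, event_type)] += 1
    return dict(counts)
```

Each event belongs to exactly one window, which keeps the aggregation simple to recompute or backfill.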

Data Warehouse Migration: Redshift to Snowflake

Snowflake / Redshift / dbt

Led a data warehouse migration project from Redshift to Snowflake, including schema redesign, refactoring of legacy SQL transformations into modular dbt models, and optimization for Snowflake’s virtual warehouse architecture. Validated the migration using Great Expectations for data quality checks. Improved query performance by 40% and reduced storage costs by 25%.

Snowflake Redshift dbt Airflow
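
One common way to validate a migration is to compare row counts and an order-independent checksum between the source and target tables. The sketch below shows that idea in plain Python. It is a stand-in for illustration, not the Great Expectations API, and `table_fingerprint` is a hypothetical helper.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    Per-row hashes are summed modulo 2**64, so row order does not matter
    and duplicate rows still change the result.
    """
    checksum = 0
    count = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode()).digest()
        checksum = (checksum + int.from_bytes(digest[:8], "big")) % 2**64
        count += 1
    return count, checksum


def tables_match(source_rows, target_rows):
    """True when both tables have the same row count and checksum."""
    return table_fingerprint(source_rows) == table_fingerprint(target_rows)
```

A check like this catches dropped, duplicated, or altered rows cheaply; column-level expectations then narrow down where a mismatch came from.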

Data Analysis

Customer Churn Analysis

Analyzed telecom customer data to predict churn using logistic regression and decision trees. Cleaned and preprocessed data with Pandas, visualized trends with Seaborn, and built a predictive model in Scikit-learn. Identified key churn drivers and presented actionable insights through an interactive dashboard using Plotly and Streamlit.

E-commerce Sales Trend Forecasting

Built time-series models (ARIMA, Prophet) to forecast monthly sales for an online retailer. Used Pandas for feature engineering, handled seasonality and promotions, and visualized forecasts using Matplotlib. Delivered insights through a Streamlit dashboard and generated reports with Jupyter Notebook and PDF export.

Airbnb Price Prediction

Scraped Airbnb listings and performed exploratory data analysis to identify price influencers. Used correlation heatmaps and hypothesis testing to derive features. Built regression models (Linear, XGBoost) to predict optimal pricing. Presented findings via Tableau and deployed the model via Flask for live inference.

COVID-19 Impact Analysis

Explored global COVID-19 datasets to analyze trends in infections, deaths, and recoveries. Merged multiple data sources using Pandas and visualized heatmaps and line plots with Plotly. Applied clustering (KMeans) to group countries by response patterns. Shared insights in a published data story using ObservableHQ or Power BI.

AI

Quality Inspection in Manufacturing

Automated quality assurance by using computer vision models to detect defects in real-time on a production line. High-resolution images of products are analyzed by deep learning algorithms trained on thousands of defect and non-defect samples.

Chatbot for Customer Support Using NLP

An NLP-based AI chatbot that handles customer service inquiries 24/7 across web, mobile, and social platforms. Built on GPT-4, a transformer-based model.

Predictive Maintenance in Industrial IoT

AI-driven predictive maintenance using machine learning to forecast equipment failures before they occur. By analyzing real-time sensor data—such as temperature, vibration, pressure, or motor speed—ML models identify patterns and anomalies indicating wear or potential breakdown.
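
A very simple form of the anomaly detection described above is a rolling z-score: flag a reading when it sits far from the mean of the readings just before it. The sketch below assumes a flat list of readings from one sensor. The window size and threshold are illustrative defaults, and a production model would likely be more sophisticated.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings more than `threshold` standard deviations
    away from the mean of the preceding `window` readings."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged
```

With temperature readings `[70.0, 70.5, 69.8, 70.2, 70.1, 70.3, 95.0, 70.0]`, only the spike at index 6 is flagged.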

Personalized Recommendation Systems in E-Commerce

AI-powered recommendation engines analyzing user behavior, purchase history, and browsing patterns to deliver personalized product suggestions. Using collaborative filtering, content-based filtering, or hybrid models, the system predicts items a customer is likely to buy next.
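
As a minimal sketch of collaborative filtering, the example below scores items a user has not bought by how similar the buyers of those items are to that user, using Jaccard similarity over purchase sets. The data shape (a dict of user to item set) and the `recommend` helper are assumptions for illustration.

```python
from collections import defaultdict


def recommend(purchases, user, top_n=3):
    """Suggest items for `user` based on overlap with other users' purchases.

    purchases: dict mapping user -> set of purchased items
    """
    owned = purchases[user]
    scores = defaultdict(float)
    for other, items in purchases.items():
        if other == user:
            continue
        union = owned | items
        similarity = len(owned & items) / len(union) if union else 0.0
        # Each unseen item inherits the similarity of the user who bought it.
        for item in items - owned:
            scores[item] += similarity
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, score in ranked[:top_n] if score > 0]
```

Items bought only by users with no overlap score zero and are dropped, so the function can return fewer than `top_n` suggestions.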

MLOps

Automated Model Training and Deployment Pipeline

An MLOps pipeline enabling continuous integration and continuous deployment (CI/CD) of ML models. A new code or dataset push triggers automated pipelines that preprocess data, train models, validate performance, and deploy production-ready models to staging or production environments.

Edge Deployment of ML Models Using AWS Greengrass

Deployed ML models to edge devices via AWS Greengrass, enabling real-time, low-latency inference near data sources. The pipeline packages models as Greengrass components, automates deployment and model versioning, and monitors edge device health and inference metrics.

Model Monitoring and Drift Detection

Deployed MLOps frameworks to track model performance metrics and data distributions in real time to detect data or concept drift. Alerts notify engineers when model accuracy degrades or input data shifts beyond thresholds.
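
One widely used way to quantify a shift in input data is the Population Stability Index (PSI), which compares how a feature is distributed across bins in a baseline sample versus a recent sample. The sketch below is a plain-Python illustration; the bin count and the small floor for empty bins are assumptions, and the source does not say which drift metric the project used.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline sample and a recent sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(values):
        counts = [0] * bins
        for value in values:
            index = int((value - lo) / width)
            # Values outside the baseline range are clamped into the edge bins.
            counts[min(max(index, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(count / len(values), 1e-6) for count in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A PSI of zero means the two samples fall into the bins identically. A frequently quoted rule of thumb treats values above roughly 0.25 as a significant shift, though the right alert threshold depends on the feature.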

Collaborative Experiment Tracking and Reproducibility

Supports reproducibility by managing experiments, model versions, datasets, and hyperparameters in a centralized system. Teams log experiments using tools like MLflow or Weights & Biases, enabling easy comparison and auditing of model development cycles.

Get In Touch

Email: kuma98310@gmail.com
GitHub: @kuma9831