With 3 years of experience analyzing data and developing insights, I leverage my skills in Python, Excel, Power BI, Databricks, R, and SQL to deliver value to clients. Take a look at some of my work below.
Currently pursuing a Master of Science in Applied Data Science at Syracuse University, I stay on top of the latest techniques and technologies in order to provide innovative solutions tailored to each client's unique needs. My educational background combined with real-world experience enables me to be an effective data science consultant who transforms complex data into understandable and usable information that drives strategic decision-making.
Whether it's wrangling data, creating compelling visualizations, or developing robust analytical solutions, I bring a unique blend of practical experience and academic rigor to the table. Join me on this data-driven adventure, where we turn raw information into meaningful narratives and empower businesses to make informed decisions. Let's explore the boundless possibilities that data has to offer!
B.Tech. in Electronics & Communication Engineering
9.15/10
Snagged a full tuition scholarship 💰 for grad school
Badges Earned
These badges show my ongoing efforts to keep learning and growing. Each one represents a new skill I’ve picked up or a course I’ve completed. They’re not just symbols; they’re proof of my dedication to staying updated and improving in my field. From learning new tools to exploring advanced analytics, these badges highlight my passion for continuous improvement.
As a Jr. AI/ML Engineer (Intern) at Inferenz, I have been actively involved in cutting-edge AI and machine learning projects, with a focus on natural language processing and conversational AI. This role has provided me with valuable opportunities to apply and expand my expertise in large language models, prompt engineering, and AI-driven business solutions.
Working closely with the Inferenz team, I have contributed to the development of advanced AI applications that enhance user engagement and drive business growth for our clients.
Intelligent Chatbot Development:
Designed and implemented a chatbot leveraging Large Language Models (LLMs) and LangChain, incorporating agents to manage complex user interactions and execute multi-step tasks, significantly enhancing user engagement.
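To make the agent setup concrete, here is a minimal sketch using LangChain's classic agent interface; the tool, model name, and query are illustrative placeholders (exact imports vary across LangChain versions), not the production implementation.

# Illustrative LangChain agent sketch; tool, model name, and query are placeholders.
from langchain.chat_models import ChatOpenAI
from langchain.agents import Tool, AgentType, initialize_agent

def lookup_order_status(order_id: str) -> str:
    # Placeholder for a real backend call (database, REST API, etc.).
    return f"Order {order_id} is out for delivery."

tools = [
    Tool(
        name="order_status",
        func=lookup_order_status,
        description="Look up the delivery status of an order by its ID.",
    )
]

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# A ReAct-style agent decides when to call the tool while handling a multi-step request.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("Where is order 18342, and when should I expect it?")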
Prompt Engineering and Optimization:
Engineered and fine-tuned prompt chains to guide the chatbot's conversational flow, resulting in more contextually relevant and coherent responses. This optimization led to a 40% reduction in response time and a 30% decrease in token usage while maintaining high-quality interactions.
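A hedged sketch of the prompt-chaining idea: a first call condenses the running conversation, and only the condensed context is passed to the second call, which trims tokens without losing the thread. The model name, prompts, and inputs below are placeholders, not the deployed chain.

# Illustrative two-step prompt chain; model name, prompts, and inputs are placeholders.
from openai import OpenAI

client = OpenAI()

def chat(system: str, user: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return response.choices[0].message.content

conversation_history = "...full chat transcript so far..."        # placeholder
new_user_message = "Can you remind me what we agreed on pricing?"  # placeholder

# Step 1: compress the running conversation into a short, factual summary.
summary = chat(
    "Summarize the conversation below in under 80 tokens, keeping names, dates, and open questions.",
    conversation_history,
)

# Step 2: answer the new question against the compressed context only.
answer = chat(f"You are a support assistant. Known context: {summary}", new_user_message)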
Recommendation System Implementation:
Developed and integrated a recommendation system using collaborative filtering techniques, working in tandem with the chatbot to provide personalized product suggestions. This approach resulted in a 17% increase in sales of promoted products and a 25% boost in cross-sell opportunities.
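A minimal sketch of item-based collaborative filtering of the kind described here, using cosine similarity over a toy user-item interaction matrix; the data and product names are made up for illustration.

# Toy item-based collaborative filtering sketch; data is illustrative only.
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Rows = users, columns = products, values = interaction strength (e.g. purchases).
interactions = pd.DataFrame(
    {
        "shampoo":     [5, 0, 3, 0],
        "conditioner": [4, 0, 4, 1],
        "soap":        [0, 5, 0, 4],
        "detergent":   [0, 4, 1, 5],
    },
    index=["user_a", "user_b", "user_c", "user_d"],
)

# Item-to-item similarity matrix.
similarity = pd.DataFrame(
    cosine_similarity(interactions.T),
    index=interactions.columns,
    columns=interactions.columns,
)

def recommend(user: str, top_n: int = 2) -> list[str]:
    owned = interactions.loc[user]
    # Score each unseen item by its similarity to items the user already engaged with.
    scores = similarity.mul(owned, axis=0).sum() / (similarity.mul(owned.gt(0), axis=0).sum() + 1e-9)
    return scores[owned.eq(0)].sort_values(ascending=False).head(top_n).index.tolist()

print(recommend("user_a"))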
Performance Optimization:
Implemented advanced prompt engineering techniques, including context pruning, dynamic generation, and compression algorithms, to optimize the chatbot's performance. These enhancements improved efficiency while maintaining the high quality of responses.
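A small sketch of the context-pruning idea, assuming tiktoken for token counting: older turns are dropped until the history fits a token budget, and the system prompt is always kept. The encoding name and budget are illustrative assumptions.

# Context pruning sketch: keep the newest turns that fit the token budget.
# The encoding name and budget are illustrative assumptions.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

def count_tokens(message: dict) -> int:
    return len(encoding.encode(message["content"]))

def prune_history(messages: list[dict], budget: int = 1500) -> list[dict]:
    system, turns = messages[0], messages[1:]
    kept, used = [], count_tokens(system)
    # Walk backwards so the most recent turns survive pruning.
    for message in reversed(turns):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return [system] + list(reversed(kept))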
Tredence Inc.
As an Analytics Consultant at Tredence Inc., I spearheaded impactful initiatives in data analytics and visualization, supporting a dynamic, data-informed decision-making environment. In this role, I honed my skills in advanced analytics tools and methodologies and contributed to the organization's success through strategic data utilization and continuous improvement.
Collaborating closely with Unilever, I played a pivotal role in implementing these strategies and delivering valuable insights that supported Unilever's objectives.
Performance Dashboard Development:
Developed and maintained a comprehensive performance dashboard in Power BI, employing DAX calculations, incremental refresh strategies, and row-level security protocols. This instrumental dashboard offered actionable insights into sales operations, enhancing the overall understanding of key performance indicators.
Sales Performance Enhancement:
Utilized data visualizations and harnessed Q&A natural language queries to pinpoint improvement opportunities, leading to a notable 7% increase in sales performance. The success was attributed to a strategic focus on data-driven decision-making.
Data Quality Automation:
Automated data quality control workflows on Databricks using Power Query M code, Dataflows, and PySpark scripts. This transformative automation significantly increased data accuracy to over 95%, achieved through systematic profiling, validation, and cleaning processes.
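A hedged PySpark sketch of the kind of automated profiling and cleaning described here: null counts per column, duplicate removal, and a simple validity rule. Table names, columns, and thresholds are placeholders, not the client pipeline.

# Illustrative data-quality pass on Databricks; table and column names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.table("sales_raw")  # placeholder source table

# Profile: null counts per column.
null_counts = df.select(
    [F.count(F.when(F.col(c).isNull(), c)).alias(c) for c in df.columns]
)
null_counts.show()

# Clean: drop exact duplicates and rows failing a basic validity rule.
clean = (
    df.dropDuplicates()
      .filter(F.col("net_sales") >= 0)             # placeholder validation rule
      .withColumn("order_date", F.to_date("order_date"))
)

clean.write.mode("overwrite").saveAsTable("sales_validated")  # placeholder target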
Continuous Monitoring and Root Cause Analysis:
Implemented continuous data quality monitoring by developing standardized metrics and dashboards using Python scripts, ensuring ongoing vigilance over data accuracy and integrity.
Conducted root cause analysis on data anomalies and errors utilizing SQL queries and advanced data profiling tools.
Systemic Issue Resolution:
Identified and implemented fixes to resolve systemic data issues, leading to substantial improvements in overall data integrity. The proactive measures addressed challenges at their root, preventing recurring discrepancies.
Interactive Data Quality Dashboard:
Created an interactive Power BI dashboard dedicated to data quality, featuring dynamic DAX measures and providing enhanced visibility into key metrics. The dashboard achieved an impressive 98% data coverage, reinforcing a culture of data-driven excellence.
Cognizant
During my internship at Cognizant, I had the opportunity to work on various projects that allowed me to enhance my skills and gain hands-on experience in the field of technology. Here are some highlights of my responsibilities:
Database Management:
Developed and maintained efficient database systems to support project requirements.
Implemented data retrieval and manipulation processes, ensuring optimal performance.
Worked with SQL across PostgreSQL and MySQL databases.
Web Development:
Designed and created dynamic web applications using HTML and Cascading Style Sheets (CSS).
Implemented responsive design principles to ensure compatibility across various devices.
Anomaly Detection in Auxiliary Power Unit (APU) of Metro Trains
This project focuses on detecting anomalies in the Auxiliary Power Unit (APU) of metro trains using sensor data. The goal is to enable predictive maintenance, enhance system reliability, and minimize downtime by identifying potential failures early.
Technologies Used:
Dataset: MetroPT dataset with 1,516,948 rows and 17 columns (February to August 2020).
Data Preprocessing: Schema definition, data cleaning, and exploratory data analysis (EDA) using correlation heatmaps and temporal analysis.
Anomaly Detection Techniques:
K-Means Clustering: Identified normal (Cluster 0) and anomalous (Cluster 1) operations.
LSTM Autoencoder: Detected anomalies based on reconstruction error (95th percentile threshold); a minimal sketch follows this list.
Key Results: Anomalies peaked during early morning hours (2 AM - 5 AM) and aligned with recorded failure events.
Challenges: Implementing Isolation Forest and One-Class SVM with PySpark and determining anomaly thresholds.
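A minimal sketch of the LSTM-autoencoder thresholding step referenced above, assuming sensor readings already windowed into fixed-length sequences; the window length, feature count, and layer sizes are illustrative, not the tuned project model.

# LSTM autoencoder anomaly sketch; window length, features, and layer sizes are illustrative.
import numpy as np
from tensorflow.keras import layers, models

TIMESTEPS, FEATURES = 30, 8          # assumed window shape, not the project's exact values
X_train = np.random.rand(1000, TIMESTEPS, FEATURES).astype("float32")  # stand-in for scaled APU windows

model = models.Sequential([
    layers.Input(shape=(TIMESTEPS, FEATURES)),
    layers.LSTM(32),                          # encoder compresses the window
    layers.RepeatVector(TIMESTEPS),
    layers.LSTM(32, return_sequences=True),   # decoder reconstructs it
    layers.TimeDistributed(layers.Dense(FEATURES)),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, X_train, epochs=5, batch_size=64, verbose=0)

# Flag windows whose reconstruction error exceeds the 95th percentile of training error.
errors = np.mean((model.predict(X_train) - X_train) ** 2, axis=(1, 2))
threshold = np.percentile(errors, 95)
anomalies = errors > threshold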
COMPASS is an AI-powered university guidance system that helps international students find and track university programs, living expenses, and career opportunities in the United States. The system provides personalized recommendations based on user preferences and maintains an interactive chat interface for queries about universities, costs, weather, and job prospects.
Technologies Used:
Personalized University Recommendations: Based on field of study, budget, location, and weather preferences (a small filtering sketch follows this list).
Interactive Chat Interface: Ask about university programs, living expenses, weather conditions, and job market trends.
Application Tracking: Manage applications, deadlines, and document requirements with downloadable templates.
Resource Generation: Generate application checklists in DOCX format and CSV templates for tracking.
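A small sketch of the preference-filtering idea behind the recommendations; the columns, catalog rows, and preference keys are hypothetical, not the actual COMPASS data model.

# Hypothetical preference filter; columns and data are illustrative only.
import pandas as pd

programs = pd.DataFrame(
    [
        {"university": "Univ A", "field": "Data Science", "annual_cost": 52000, "state": "NY", "climate": "cold"},
        {"university": "Univ B", "field": "Data Science", "annual_cost": 38000, "state": "TX", "climate": "hot"},
        {"university": "Univ C", "field": "Computer Science", "annual_cost": 45000, "state": "CA", "climate": "mild"},
    ]
)

prefs = {"field": "Data Science", "max_budget": 50000, "climate": "hot"}

matches = programs[
    (programs["field"] == prefs["field"])
    & (programs["annual_cost"] <= prefs["max_budget"])
    & (programs["climate"] == prefs["climate"])
]
print(matches[["university", "annual_cost", "state"]])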
This project aims to build a cloud-based data pipeline using Azure services to analyze and visualize the 2021 Tokyo Olympics dataset. The pipeline integrates data ingestion, transformation, and visualization to unlock insights into athlete demographics, country performance, and event participation.
Technologies Used:
Data Ingestion (Azure Data Factory): Automated extraction of data from a GitHub-hosted CSV file into Azure.
Data Storage (Azure Data Lake Gen2): Scalable and secure storage for raw and processed data.
Data Transformation (Azure Databricks): Cleansing and processing data using Spark (a minimal transformation sketch follows this list).
Data Analysis (Azure Synapse Analytics): SQL-based querying and advanced analytics.
Visualization (Power BI): Interactive dashboards displaying insights and performance metrics.
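A hedged sketch of the Databricks transformation step, assuming the raw CSV lands in an ADLS Gen2 container; the storage account, container, and column names are placeholders.

# Illustrative Databricks transformation; storage account, container, and columns are placeholders.
# `spark` is the SparkSession that Databricks notebooks provide automatically.
from pyspark.sql import functions as F

raw_path = "abfss://raw@<storage_account>.dfs.core.windows.net/tokyo-olympics/athletes.csv"

athletes = (
    spark.read.option("header", "true").csv(raw_path)
         .withColumnRenamed("PersonName", "person_name")   # placeholder column names
         .withColumn("Country", F.initcap(F.trim("Country")))
         .dropDuplicates()
)

athletes.write.mode("overwrite").parquet(
    "abfss://transformed@<storage_account>.dfs.core.windows.net/tokyo-olympics/athletes/"
)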
LEAP is a web application that generates a personalized learning path based on users' educational background, skills, and career goals. It leverages AI-driven models to create customized plans that include key concepts, curated resources, and estimated timelines, making it easier for users to achieve their learning objectives.
Technologies Used:
Built a web application using Python, Streamlit, and the GROQ LLM API to design personalized learning paths (a sketch of the generation and export steps follows this list).
Integrated AI to recommend curated resources, breaking down complex transitions into actionable steps.
Developed a feature to generate downloadable .docx files for users, enabling structured offline access to their learning plans.
Provided estimated completion timelines and progress tracking, optimizing the user's learning experience.
Designed an intelligent recommendation system to suggest resources from trusted platforms, enhancing learning efficiency.
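A minimal sketch of the generation and export steps, assuming the groq Python client and python-docx; the model name, prompt, and file name are illustrative, not the production LEAP code.

# Illustrative learning-path generation and .docx export; model, prompt, and filename are assumptions.
from groq import Groq
from docx import Document

client = Groq()  # reads GROQ_API_KEY from the environment

response = client.chat.completions.create(
    model="llama3-70b-8192",  # example Groq-hosted model
    messages=[
        {"role": "system", "content": "You are a career-planning assistant."},
        {"role": "user", "content": "Create a 12-week learning path from data analyst to ML engineer."},
    ],
)
plan_text = response.choices[0].message.content

# Export the plan as a downloadable Word document.
doc = Document()
doc.add_heading("Personalized Learning Path", level=1)
for paragraph in plan_text.split("\n"):
    doc.add_paragraph(paragraph)
doc.save("learning_path.docx")

In a Streamlit front end, the resulting file can then be offered to the user via st.download_button.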
Sage is a health chatbot developed using Python, LangChain, and Streamlit, designed to diagnose injuries and provide probable precautions based on user input. Leveraging advanced natural language processing and machine learning techniques, Sage offers accurate and timely health advice, ensuring users receive relevant information and guidance for their symptoms.
Technologies Used:
Developed a health-focused chatbot using Python, integrating LangChain for natural language processing and Streamlit for the user interface.
Implemented advanced NLP techniques to accurately interpret user-reported symptoms and health concerns.
Integrated Large Language Models (LLMs) to enhance the chatbot's language understanding and response generation capabilities.
Created an interactive, conversational user experience that provides real-time health advice and injury diagnosis.
Designed the system to offer tailored recommendations based on individual user inputs, ensuring personalized health guidance.
Received recognition through the Wolfram Award, highlighting the project's innovation and potential impact in the health tech space.
EqualEyes aims to advance image captioning technology by combining recent advances in image recognition and language modeling to generate rich and detailed descriptions beyond simple object identification. Through inclusive design and training on diverse datasets, the project seeks to create a system accessible to all users, particularly benefiting individuals with visual impairments. Stakeholders include visually impaired individuals, educators, and developers.
Technologies Used:
Developed an image captioning system that combines advanced image recognition and language modeling techniques to generate rich, descriptive captions that go beyond simply naming objects.
Implemented data preprocessing pipelines, including image augmentation, text tokenization, and vectorization to prepare diverse datasets for model training.
Explored and evaluated multiple state-of-the-art model architectures like CNN Encoder-Decoder, Vision Transformers (ViT-GPT2), and BLIP for image encoding and caption generation (a BLIP inference sketch follows this list).
Conducted extensive data exploration and analysis on the image-caption dataset, examining image size/orientation distributions, caption lengths, word frequencies, and image quality assessments.
Implemented evaluation metrics focused on measuring how well generated captions capture the full context of images beyond just object presence.
Developed a working web application that takes images as input, processes them through the trained captioning model, and generates descriptive captions with audio output for accessibility.
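A hedged sketch of caption generation with a pretrained BLIP checkpoint from Hugging Face, plus a simple text-to-speech step for the audio output; the checkpoint, image path, and the choice of gTTS are illustrative assumptions rather than the project's exact stack.

# Illustrative BLIP captioning with audio output; checkpoint, image path, and TTS library are assumptions.
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image
from gtts import gTTS

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

output_ids = model.generate(**inputs, max_new_tokens=40)
caption = processor.decode(output_ids[0], skip_special_tokens=True)
print(caption)

# Speak the caption for accessibility.
gTTS(caption).save("caption.mp3")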
Analyzing Austin Animal Center Data for Enhanced Adoption Strategies
This project involves a comprehensive analysis of data from the Austin Animal Center to understand trends in animal intakes, outcomes, and stray locations. By merging and analyzing multiple structured datasets, the project aims to identify factors contributing to stray animal cases and develop strategies to address the issue. The analysis includes exploratory data analysis, preprocessing, and actionable insights to improve adoption rates and animal welfare.
Technologies Used:
Preprocessed and integrated three datasets (Austin Animal Center Intakes, Outcomes, and Stray Map) by handling missing values, removing duplicates, and performing inner joins to create a unified dataset for analysis (a minimal join sketch follows this list).
Conducted exploratory data analysis on the intake dataset to examine distributions of animal types, intake conditions, sexes, ages, and breeds, identifying trends and potential areas of focus.
Analyzed outcome data to determine common outcomes (adoption, transfer, euthanasia) across different animal types, ages, and assessed top breeds for targeted adoption efforts.
Performed geospatial analysis on the stray animal map data, pinpointing urban hotspots and frequent locations for stray animal findings to guide targeted interventions and resource allocation.
Investigated correlations between animal age at intake and outcome to derive insights for optimizing adoption strategies based on age groups and tailoring marketing/fostering approaches.
Developed visualizations, including bar charts, heatmaps, and geographic maps, to effectively communicate key findings and patterns related to intake sources, outcome distributions, and stray locations.
Synthesized analysis results to propose data-driven recommendations for the Austin Animal Center, such as sterilization programs, adoption campaigns, resource allocation, and improvements to recordkeeping and identification practices.
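A small pandas sketch of the preprocessing and join step referenced above; the file names and the "Animal ID" join key follow the public Austin Animal Center exports, but treat the exact column names as assumptions.

# Illustrative preprocessing and inner join of the intake and outcome exports;
# file names and column names are assumptions based on the public datasets.
import pandas as pd

intakes = pd.read_csv("Austin_Animal_Center_Intakes.csv")
outcomes = pd.read_csv("Austin_Animal_Center_Outcomes.csv")

def tidy(df: pd.DataFrame) -> pd.DataFrame:
    # Keep one record per animal for this simplified sketch and drop rows missing the key.
    df = df.drop_duplicates(subset="Animal ID", keep="last")
    return df.dropna(subset=["Animal ID"])

merged = tidy(intakes).merge(
    tidy(outcomes),
    on="Animal ID",
    how="inner",
    suffixes=("_intake", "_outcome"),
)

# Example exploratory cut: most common outcomes per animal type.
print(merged.groupby("Animal Type_intake")["Outcome Type"].value_counts().head(10))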
Data Analysis For Energy Consumption & Conservation Strategies For eSC
In this project, we spearheaded a comprehensive analysis of energy consumption patterns with a keen focus on peak demand during the hot summer months, particularly in July. Leveraging a robust toolkit that included RStudio, Shiny app development, and advanced data cleaning and merging techniques, we delved into the intricacies of energy data to derive meaningful insights.
Technologies Used:
Conducted a meticulous analysis of energy usage data, employing data cleaning and merging techniques to ensure the integrity and accuracy of the dataset.
Utilized RStudio to identify key drivers of high demand during peak periods, specifically in July, shedding light on the factors contributing to increased energy consumption at critical times.
Developed predictive models using linear modeling, decision trees, and random forest algorithms. These models were instrumental in forecasting future energy demand scenarios, providing a quantitative basis for understanding the potential impact of conservation initiatives.
Formulated strategic recommendations for the Energy Services Company (eSC) aimed at managing demand during peak periods. Explored alternative approaches beyond the traditional method of building additional power plants, considering innovative conservation initiatives.
Presented the comprehensive analysis and strategic plan to key stakeholders, highlighting the findings and recommendations. A customized Shiny app dashboard provided an interactive and intuitive platform for stakeholders to engage with the insights.
In the creation of Harmony Hub, a Database Management System (DBMS) tailored for a Music Streaming Service, I led the design and development of a robust and comprehensive solution that seamlessly organized and stored data related to tracks, artists, and streaming history.
Technologies Used:
Designed and developed an end-to-end music streaming database solution using SQL and Microsoft Data Studio. This encompassing solution provided a structured and efficient platform for organizing a vast array of data, ensuring optimal performance.
Engineered optimized table schemas to efficiently ingest streaming data from source systems. These schemas were meticulously designed to transform raw streaming data into analysis-ready datasets, laying the foundation for detailed usage analytics.
Implemented a streamlined process for ingesting streaming data from various source systems, ensuring the continuous flow of information into the database. This facilitated real-time updates and maintained the integrity of the dataset.
Utilized the power of SQL queries and stored procedures to grant key stakeholders self-service access to streaming analytics. This empowerment enabled decision-makers to delve into usage patterns, contributing to data-driven decision-making in areas such as artist payments and content recommendations.
The implementation of self-service analytics played a pivotal role in enhancing decision-making processes related to artist payments and content recommendations. Stakeholders could navigate and extract insights independently, fostering a more agile and responsive approach to business strategies.