Leveraging 2+ years of experience with Python, R, SQL, Excel, Power BI, and Databricks to untangle data threads and weave insights into innovative solutions
Intro
With over two years of experience analyzing data and developing insights, I apply my skills in Python, Excel, Power BI, Databricks, R, and SQL to deliver value to clients. Take a look at my work below.
Currently pursuing a Master of Science in Applied Data Science at Syracuse University, I stay current with the latest techniques and technologies to provide innovative solutions tailored to each client's unique needs. My educational background, combined with real-world experience, enables me to work as a data science consultant who transforms complex data into understandable, usable information that drives strategic decision-making.
Whether it's wrangling data, creating compelling visualizations, or developing robust analytical solutions, I bring a unique blend of practical experience and academic rigor to the table. Join me on this data-driven adventure, where we turn raw information into meaningful narratives and empower businesses to make informed decisions. Let's explore the boundless possibilities that data has to offer!
As a Consultant at Tredence Inc., I led impactful initiatives in data analytics and visualization, supporting a dynamic, data-informed decision-making environment. In this role, I honed my skills with advanced analytics tools and methodologies, contributing to the organization's success through strategic data use and continuous improvement.
Collaborating closely with Unilever, I played a pivotal role in implementing these analytics initiatives and delivering insights that supported Unilever's objectives.
Performance Dashboard Development:
Developed and maintained a comprehensive performance dashboard in Power BI, employing DAX calculations, incremental refresh strategies, and row-level security protocols. The dashboard surfaced actionable insights into sales operations and improved visibility into key performance indicators.
Sales Performance Enhancement:
Used data visualizations and Power BI's Q&A natural-language queries to pinpoint improvement opportunities, contributing to a 7% increase in sales performance through a strategic focus on data-driven decision-making.
Data Quality Automation:
Automated data quality control workflows on Databricks using Power Query M code, Dataflows, and PySpark scripts, raising data accuracy above 95% through systematic profiling, validation, and cleaning.
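The quality-control workflow above can be sketched in miniature. The real pipeline ran on Databricks with PySpark and Power Query M; this stdlib-only Python version only illustrates the profile-validate-clean pattern, and the field names are hypothetical:

```python
# Minimal sketch of automated data-quality checks: profiling, validation,
# and cleaning. Record fields ("sku", "units") are illustrative only.

def profile(rows):
    """Count null or empty values per field across a list of record dicts."""
    nulls = {}
    for row in rows:
        for field, value in row.items():
            if value is None or value == "":
                nulls[field] = nulls.get(field, 0) + 1
    return nulls

def clean(rows, required):
    """Drop records missing any required field, then de-duplicate."""
    valid = [r for r in rows if all(r.get(f) not in (None, "") for f in required)]
    seen, deduped = set(), []
    for r in valid:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            deduped.append(r)
    return deduped

def accuracy(rows, cleaned):
    """Share of incoming records that survive validation and cleaning."""
    return len(cleaned) / len(rows) if rows else 0.0

records = [
    {"sku": "A1", "units": 10},
    {"sku": "A1", "units": 10},   # exact duplicate
    {"sku": "", "units": 5},      # missing required field
    {"sku": "B2", "units": 7},
]
cleaned = clean(records, required=["sku", "units"])
print(profile(records))               # {'sku': 1}
print(len(cleaned), accuracy(records, cleaned))
```

In the production version, the same profile/validate/clean stages were expressed as PySpark transformations so they could scale to full datasets.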
Continuous Monitoring and Root Cause Analysis:
Implemented continuous data quality monitoring by developing standardized metrics and dashboards using Python scripts, ensuring ongoing vigilance over data accuracy and integrity.
Conducted root cause analysis on data anomalies and errors utilizing SQL queries and advanced data profiling tools.
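As a hedged illustration of that SQL-driven root-cause analysis, the query below, run against an in-memory SQLite table with made-up source systems and values, ranks sources by their share of null or out-of-range records:

```python
# Illustrative root-cause query: which source system contributes the most
# anomalies? Table and column names ("sales", "source", "amount") are
# hypothetical, not the actual production schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (source TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('erp',  120.0), ('erp', 80.0),
        ('feed', NULL),  ('feed', -5.0), ('feed', 60.0);
""")
# Flag NULL or negative amounts as bad, and compute a per-source bad rate.
rows = conn.execute("""
    SELECT source,
           SUM(CASE WHEN amount IS NULL OR amount < 0 THEN 1 ELSE 0 END) * 1.0
               / COUNT(*) AS bad_rate
    FROM sales
    GROUP BY source
    ORDER BY bad_rate DESC
""").fetchall()
print(rows)  # the 'feed' source surfaces first with the highest bad rate
```

Sorting sources by anomaly rate like this turns a vague "the data looks wrong" into a concrete lead on which upstream system to investigate.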
Systemic Issue Resolution:
Identified and implemented fixes to resolve systemic data issues, leading to substantial improvements in overall data integrity. The proactive measures addressed challenges at their root, preventing recurring discrepancies.
Interactive Data Quality Dashboard:
Created an interactive Power BI dashboard dedicated to data quality, featuring dynamic DAX measures and improved visibility into key metrics. The dashboard achieved 98% data coverage, reinforcing a culture of data-driven excellence.
Cognizant
During my internship at Cognizant, I had the opportunity to work on various projects that allowed me to enhance my skills and gain hands-on experience in the field of technology. Here are some highlights of my responsibilities:
Database Management:
Developed and maintained efficient database systems to support project requirements.
Implemented data retrieval and manipulation processes, ensuring optimal performance.
Worked with SQL across PostgreSQL and MySQL database systems.
Web Development:
Designed and built web pages using HTML and Cascading Style Sheets (CSS).
Implemented responsive design principles to ensure compatibility across various devices.
Posts
The Truth Behind the Shine: Fake Offers in Today’s Job Market
EqualEyes: Inclusive Image Captioning
EqualEyes aims to advance image captioning by combining recent advances in image recognition and language modeling to generate rich, detailed descriptions that go beyond simple object identification. Through inclusive design and training on diverse datasets, the project seeks to create a system accessible to all users, particularly benefiting individuals with visual impairments. Stakeholders include visually impaired individuals, educators, and developers.
Developed an image captioning system that generates rich, descriptive captions going beyond naming objects by combining advanced image recognition and language modeling techniques.
Implemented data preprocessing pipelines, including image augmentation, text tokenization, and vectorization to prepare diverse datasets for model training.
Explored and evaluated multiple state-of-the-art model architectures like CNN Encoder-Decoder, Vision Transformers (ViT-GPT2), and BLIP for image encoding and caption generation.
Conducted extensive data exploration and analysis on the image-caption dataset, examining image size/orientation distributions, caption lengths, word frequencies, and image quality assessments.
Implemented evaluation metrics focused on measuring how well generated captions capture the full context of images beyond just object presence.
Developed a working web application that takes images as input, processes them through the trained captioning model, and generates descriptive captions with audio output for accessibility.
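The text-preparation step mentioned above, tokenization and vectorization of captions, can be sketched as follows. This is a simplified, stdlib-only illustration of bag-of-words vectorization, not the project's actual pipeline (which fed tokens into ViT-GPT2 and BLIP models):

```python
# Sketch of caption-side preprocessing: tokenize captions into words,
# build an integer vocabulary, and produce term-count vectors.
import re

def tokenize(caption):
    """Lowercase a caption and split it into word tokens."""
    return re.findall(r"[a-z']+", caption.lower())

def build_vocab(captions):
    """Map each distinct token to an integer index, in order of appearance."""
    vocab = {}
    for caption in captions:
        for tok in tokenize(caption):
            vocab.setdefault(tok, len(vocab))
    return vocab

def vectorize(caption, vocab):
    """Turn a caption into a term-count vector over the vocabulary."""
    vec = [0] * len(vocab)
    for tok in tokenize(caption):
        if tok in vocab:
            vec[vocab[tok]] += 1
    return vec

captions = ["A dog runs on the beach", "A dog sleeps"]
vocab = build_vocab(captions)
print(vectorize("The dog runs", vocab))
```

Real captioning models use subword tokenizers and learned embeddings rather than raw counts, but the tokenize-then-index structure is the same.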
Analyzing Austin Animal Center Data for Enhanced Adoption Strategies
This project involves a comprehensive analysis of data from the Austin Animal Center to understand trends in animal intakes, outcomes, and stray locations. By merging and analyzing multiple structured datasets, the project aims to identify factors contributing to stray animal cases and develop strategies to address the issue. The analysis includes exploratory data analysis, preprocessing, and actionable insights to improve adoption rates and animal welfare.
Preprocessed and integrated three datasets (Austin Animal Center Intakes, Outcomes, and Stray Map) by handling missing values, removing duplicates, and performing inner joins to create a unified dataset for analysis.
Conducted exploratory data analysis on the intake dataset to examine distributions of animal types, intake conditions, sexes, ages, and breeds, identifying trends and potential areas of focus.
Analyzed outcome data to determine common outcomes (adoption, transfer, euthanasia) across different animal types, ages, and assessed top breeds for targeted adoption efforts.
Performed geospatial analysis on the stray animal map data, pinpointing urban hotspots and frequent locations for stray animal findings to guide targeted interventions and resource allocation.
Investigated correlations between animal age at intake and outcome to derive insights for optimizing adoption strategies by age group and tailoring marketing and fostering approaches.
Developed visualizations, including bar charts, heatmaps, and geographic maps, to effectively communicate key findings and patterns related to intake sources, outcome distributions, and stray locations.
Synthesized analysis results to propose data-driven recommendations for the Austin Animal Center, such as sterilization programs, adoption campaigns, resource allocation, and improvements to recordkeeping and identification practices.
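The preprocessing-and-merge step described above can be sketched like this. The record fields and values are hypothetical stand-ins for the Austin Animal Center columns, and the join is done in plain Python rather than the analysis stack actually used:

```python
# Illustrative preprocessing: drop records missing the join key, then
# inner-join intakes to outcomes on animal ID to build a unified dataset.

def inner_join(intakes, outcomes, key="animal_id"):
    """Inner-join two lists of record dicts on a shared key."""
    outcomes_by_id = {o[key]: o for o in outcomes if o.get(key)}
    joined = []
    for row in intakes:
        if row.get(key) in outcomes_by_id:
            merged = dict(row)                     # keep intake fields
            merged.update(outcomes_by_id[row[key]])  # add outcome fields
            joined.append(merged)
    return joined

intakes = [
    {"animal_id": "A1", "animal_type": "Dog", "intake": "Stray"},
    {"animal_id": "A2", "animal_type": "Cat", "intake": "Surrender"},
    {"animal_id": None, "animal_type": "Dog", "intake": "Stray"},  # dropped
]
outcomes = [{"animal_id": "A1", "outcome": "Adoption"}]
print(inner_join(intakes, outcomes))
```

An inner join deliberately keeps only animals that appear in both datasets, which is what makes intake-to-outcome comparisons valid downstream.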
Data Analysis For Energy Consumption & Conservation Strategies For eSC
In this project, we conducted a comprehensive analysis of energy consumption patterns, focusing on peak demand during the hot summer months, particularly July. Using a toolkit that included RStudio, Shiny app development, and advanced data cleaning and merging techniques, we examined the energy data in detail to derive meaningful insights.
Conducted a meticulous analysis of energy usage data, employing data cleaning and merging techniques to ensure the integrity and accuracy of the dataset.
Utilized RStudio to identify key drivers of high demand during peak periods, specifically July, shedding light on the factors behind increased energy consumption.
Developed predictive models using linear modeling, decision trees, and random forest algorithms. These models were instrumental in forecasting future energy demand scenarios, providing a quantitative basis for understanding the potential impact of conservation initiatives.
Formulated strategic recommendations for the Energy Services Company (eSC) aimed at managing demand during peak periods. Explored alternative approaches beyond the traditional method of building additional power plants, considering innovative conservation initiatives.
Presented the comprehensive analysis and strategic plan to key stakeholders, highlighting the findings and recommendations. A customized Shiny dashboard provided an interactive, intuitive platform for stakeholders to engage with the insights.
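The original modeling was done in R (linear models, decision trees, random forests). As a minimal sketch of the simplest of the three, here is an ordinary least-squares fit of peak demand against temperature, using hypothetical July figures rather than the project's actual data:

```python
# Ordinary least-squares sketch: fit demand = a + b * temperature and use
# the fitted line to forecast demand on a hotter-than-observed day.
# Temperatures (F) and demand values (MWh) below are made up.

def ols(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    return mean_y - b * mean_x, b

temps  = [88, 92, 95, 99, 103]       # hypothetical daily peak temperatures
demand = [410, 452, 480, 520, 561]   # hypothetical daily peak demand

a, b = ols(temps, demand)
forecast = a + b * 105               # expected demand on a 105 F day
print(round(b, 1), round(forecast))
```

Tree-based models relax the linearity assumption, but even this simple fit shows how a per-degree demand slope turns temperature forecasts into demand forecasts for conservation planning.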
In the creation of Harmony Hub, a Database Management System (DBMS) tailored for a Music Streaming Service, I led the design and development of a robust and comprehensive solution that seamlessly organized and stored data related to tracks, artists, and streaming history.
Designed and developed an end-to-end music streaming database solution using SQL and Microsoft Data Studio. This encompassing solution provided a structured and efficient platform for organizing a vast array of data, ensuring optimal performance.
Engineered optimized table schemas to efficiently ingest streaming data from source systems. These schemas were meticulously designed to transform raw streaming data into analysis-ready datasets, laying the foundation for detailed usage analytics.
Implemented a streamlined process for ingesting streaming data from various source systems, ensuring the continuous flow of information into the database. This facilitated real-time updates and maintained the integrity of the dataset.
Utilized the power of SQL queries and stored procedures to grant key stakeholders self-service access to streaming analytics. This empowerment enabled decision-makers to delve into usage patterns, contributing to data-driven decision-making in areas such as artist payments and content recommendations.
The implementation of self-service analytics played a pivotal role in enhancing decision-making processes related to artist payments and content recommendations. Stakeholders could navigate and extract insights independently, fostering a more agile and responsive approach to business strategies.
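A miniature version of the schema idea can be built in SQLite for portability. The table and column names below are illustrative, not the production schema, and the per-artist play count is the kind of figure that would feed artist-payment decisions:

```python
# Sketch of a music-streaming schema: artists, tracks, and a streaming
# history table, queried for plays per artist. Schema is illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tracks  (track_id  INTEGER PRIMARY KEY, title TEXT,
                          artist_id INTEGER REFERENCES artists(artist_id));
    CREATE TABLE streams (stream_id INTEGER PRIMARY KEY,
                          track_id  INTEGER REFERENCES tracks(track_id),
                          played_at TEXT);

    INSERT INTO artists VALUES (1, 'Artist A'), (2, 'Artist B');
    INSERT INTO tracks  VALUES (10, 'Song X', 1), (11, 'Song Y', 2);
    INSERT INTO streams VALUES (100, 10, '2024-01-01'),
                               (101, 10, '2024-01-02'),
                               (102, 11, '2024-01-02');
""")
# Self-service style analytics query: total plays per artist.
plays = conn.execute("""
    SELECT a.name, COUNT(*) AS plays
    FROM streams s
    JOIN tracks  t ON t.track_id  = s.track_id
    JOIN artists a ON a.artist_id = t.artist_id
    GROUP BY a.name
    ORDER BY plays DESC
""").fetchall()
print(plays)  # [('Artist A', 2), ('Artist B', 1)]
```

Keeping streams in a narrow fact table joined to track and artist dimensions is what makes queries like this cheap to run as the history grows.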