
Intro
With over 4 years of experience in data analytics and applied machine learning, I specialize in transforming complex data into meaningful insights that inform strategy and drive impact. My work spans various domains, supporting teams in solving business challenges through data.
I hold a Master of Science in Applied Data Science from Syracuse University, where I focused on data-driven problem solving, AI systems, and scalable analytical solutions. My background combines consulting, research, and product-oriented work, enabling me to bridge the gap between data science and real-world outcomes.
I'm particularly interested in how generative AI and machine learning can be applied responsibly and effectively across industries, from building smarter tools to enabling better decisions. I'm always open to opportunities where data, innovation, and impact intersect.
| Institution | Degree | Year | GPA |
|---|---|---|---|
| Syracuse University | M.S. Applied Data Science | May 2025 | 3.97 / 4.0 |
| SRM University | B.Tech. Electronics & Communication Engineering | May 2021 | 9.15 / 10 |
Graduate Student Excellence Award — Applied Data Science, Syracuse University
Awarded annually to one graduating student for academic excellence and research contributions. Recognized for impactful, innovation-driven work at the intersection of data science and real-world problem solving. Also recipient of a full tuition scholarship.
Ongoing commitment to staying current across tools, analytics, and AI. Each certification represents a new capability acquired.
Work
Currently working at Ernst & Young as a Senior Data Engineering Consultant, contributing to enterprise-scale data transformation and analytics initiatives.
Contributed to scalable, production-ready AI applications focused on NLP and generative AI. Optimized conversational systems and integrated Retrieval-Augmented Generation (RAG) with enterprise-grade infrastructure.
- Redesigned chatbot architecture using async API calls and parallel processing in Snowflake.
- Applied context pruning and advanced prompt design to reduce token consumption.
- Deployed a RAG pipeline combining OpenAI with a Snowflake-hosted vector store.
Grew from Data Analyst to Analytics Consultant, partnering with Unilever to deliver data-driven strategies that improved market expansion, store performance, and pipeline scalability across 16 markets.
- Built the foundation for large-scale data pipeline development, dashboarding, and cross-functional stakeholder reporting.
- Developed and maintained SQL-based data workflows for downstream analytics teams.
- Forecasted demand and identified optimal locations for 1,200 Unilever stores using geospatial analytics and Power BI — contributing $1M+ in annual revenue gains.
- Designed scalable ETL pipelines in Databricks and PySpark, integrating 52+ CSV data sources.
- Automated data validation, reducing manual checks by 80%; built Power BI dashboards achieving 98% data coverage.
- Reduced pipeline latency by 35% using Azure Data Factory and Databricks.
Contributed to internal tools through relational database management (PostgreSQL, MySQL) and responsive web interface development.
Led and mentored 100+ undergraduates in "Information Technologies," delivering weekly sessions and serving as a liaison between students and faculty.
Appointed to review student conduct cases and uphold institutional values of fairness, integrity, and accountability.
Projects
Winning team of the Orange Hoops Data Science Challenge. Predicts injury risk in basketball players by analyzing performance and physiological metrics across 2,604 records. The Random Forest model achieved an AUC of 0.90 and a recall of 0.98 for injured players.
Detects anomalies in the Auxiliary Power Unit of metro trains using 1.5M rows of sensor data to enable predictive maintenance. Anomalies peaked between 2 and 5 AM and aligned with recorded failure events.
AI-powered university guidance system for international students. Personalized recommendations with interactive chat, application tracking, and resource generation based on field of study, budget, and location preferences.
End-to-end cloud data pipeline analyzing the 2021 Tokyo Olympics dataset across the full Azure data stack — from ingestion to Power BI dashboards revealing athlete demographics and country performance insights.
Health chatbot diagnosing injuries and providing precautions based on user input. Leverages LangChain for natural language processing with real-time conversational health advice. Recognized with the Wolfram Award for innovation in health tech.
Generates personalized learning paths based on educational background, skills, and career goals. AI-driven plans include curated resources and estimated timelines — with downloadable .docx output.
Advances image captioning beyond simple object identification by combining image recognition and language modeling for rich, context-aware descriptions. Designed for accessibility with audio output for visually impaired users.
Posts

Get in Touch
Let's talk about data, AI, and building things that matter.
- LinkedIn: in/shashankguda
- GitHub: gudashashank
- Medium: @shashankguda
- YouTube: @GudaShashank
- Twitter: @gudashash
Open to consulting engagements, full-time opportunities, and collaborations where data and innovation intersect.
