Data Engineering · Gen-AI/ML · Analytics

Shashank Guda

Building the infrastructure for intelligent data.
01 — About

Intro

With over 4 years of experience in data analytics and applied machine learning, I specialize in transforming complex data into meaningful insights that inform strategy and drive impact. My work spans various domains, supporting teams in solving business challenges through data.

I hold a Master of Science in Applied Data Science from Syracuse University, where I focused on data-driven problem solving, AI systems, and scalable analytical solutions. My background combines consulting, research, and product-oriented work, enabling me to bridge the gap between data science and real-world outcomes.

I'm particularly interested in how Gen AI and machine learning can be applied responsibly and effectively across industries, from building smarter tools to enabling better decisions. I'm always open to opportunities where data, innovation, and impact intersect.

Python · SQL · Data Engineering · LLMs / RAG · Machine Learning · Databricks · Snowflake · Azure · Power BI · PySpark · Prompt Engineering
Education
Institution | Degree | Year | GPA
Syracuse University | M.S. Applied Data Science | May 2025 | 3.97 / 4.0
SRM University | B.Tech. Electronics & Communication Engineering | May 2021 | 9.15 / 10
🏆 Graduate Student Excellence Award: Applied Data Science, Syracuse University

Awarded annually to one graduating student for academic excellence and research contributions. Recognized for impactful, innovation-driven work at the intersection of data science and real-world problem solving. Also a recipient of a full-tuition scholarship.

"Badges are not just symbols — they're proof of dedication to continuous growth."
Certifications & Badges

Ongoing commitment to staying current across tools, analytics, and AI. Each certification represents a new capability acquired.

02 — Experience

Work

Oct 2025 – Present
Sr. Consultant — Data Engineering

Currently working at Ernst & Young as a Senior Data Engineering Consultant, contributing to enterprise-scale data transformation and analytics initiatives.

May 2024 – Aug 2024
Jr. AI/ML Engineer (Intern)

Contributed to scalable, production-ready AI applications focused on NLP and generative AI. Optimized conversational systems and integrated Retrieval-Augmented Generation (RAG) with enterprise-grade infrastructure.

AI Chatbot & RAG
  • Redesigned chatbot architecture using async API calls and parallel processing in Snowflake.
  • Applied context pruning and advanced prompt design to reduce token consumption.
  • Deployed a RAG pipeline combining OpenAI with a Snowflake-hosted vector store.
40% Response Time ↓
28% Token Usage ↓
25% Accuracy ↑
RAG in Prod
Jun 2021 – Jun 2023
Analytics Consultant (promoted from Data Analyst)

Grew from Data Analyst to Analytics Consultant, partnering with Unilever to deliver data-driven strategies that improved market expansion, store performance, and pipeline scalability across 16 markets.

Data Analyst (Jun 2021 – Jan 2023)
  • Built the foundation for large-scale data pipeline development, dashboarding, and cross-functional stakeholder reporting.
  • Developed and maintained SQL-based data workflows for downstream analytics teams.
Analytics Consultant (Jan 2023 – Jun 2023)
  • Forecasted demand and identified optimal locations for 1,200 Unilever stores using geospatial analytics and Power BI — contributing $1M+ in annual revenue gains.
  • Designed scalable ETL pipelines in Databricks and PySpark, integrating 52+ CSV data sources.
  • Automated data validation, reducing manual checks by 80%; built Power BI dashboards achieving 98% data coverage.
  • Reduced pipeline latency by 35% using Azure Data Factory and Databricks.
$1M+ Revenue Gained
80% Manual QA ↓
35% Latency ↓
15% Profitability ↑
Jan 2021 – Jun 2021
Data Engineer (Intern)

Contributed to internal tools through relational database management (PostgreSQL, MySQL) and responsive web interface development.

Campus Leadership
iSchool, SU
2023–2025
Recitation Lead — IST 195

Led and mentored 100+ undergraduates in "Information Technologies," delivering weekly sessions and bridging students to faculty.

SU
2023–2025
Board Member — University Conduct Board

Appointed to review student conduct cases and uphold institutional values of fairness, integrity, and accountability.

03 — Academic

Projects

Project 01 — 🏆 Award Winner
Python · Random Forest · EDA · Sports Analytics

Winning team of the Orange Hoops Data Science Challenge. Predicts injury risk in basketball players by analyzing performance and physiological metrics across 2,604 records. The Random Forest model achieved an AUC of 0.90 and a recall of 0.98 for injured players.

iHoop Insights
Project 02
LSTM Autoencoder · K-Means · PySpark · Time Series

Detects anomalies in the Auxiliary Power Unit of metro trains using 1.5M rows of sensor data to enable predictive maintenance. Anomalies peaked between 2 and 5 AM and aligned with recorded failure events.

Anomaly Detection
Project 03
OpenAI GPT-4 · ChromaDB · Streamlit · RAG

AI-powered university guidance system for international students. Personalized recommendations with interactive chat, application tracking, and resource generation based on field of study, budget, and location preferences.

COMPASS
Project 04
Azure Data Factory · Databricks · Synapse Analytics · Power BI

End-to-end cloud data pipeline analyzing the 2021 Tokyo Olympics dataset across the full Azure data stack — from ingestion to Power BI dashboards revealing athlete demographics and country performance insights.

Tokyo Olympics
Project 05 — 🏆 Award Winner
LangChain · LLMs · Streamlit · NLP

Health chatbot diagnosing injuries and providing precautions based on user input. Leverages LangChain for natural language processing with real-time conversational health advice. Recognized with the Wolfram Award for innovation in health tech.

Sage
Project 06
GROQ API · Streamlit · Python

Generates personalized learning paths based on educational background, skills, and career goals. AI-driven plans include curated resources and estimated timelines — with downloadable .docx output.

LEAP
Project 07
ViT-GPT2 · BLIP · CNN · Vision Transformers

Advances image captioning beyond simple object identification by combining image recognition and language modeling for rich, context-aware descriptions. Designed for accessibility with audio output for visually impaired users.

EqualEyes
Project 08
Python · Geospatial · EDA · Data Viz

Comprehensive analysis of animal intakes, outcomes, and stray locations from Austin Animal Center. Identifies urban hotspots and offers data-driven recommendations for sterilization programs, adoption campaigns, and resource allocation.

Austin Animal Center
04 — Writing

Posts

All articles on Medium ↗ 24
Engineering Data Across Modalities: Architectures, Techniques, and Practice
📖 19 min read
The Agentic Data Shift: Architecting Self Healing Data Ecosystems for the Financial Systems
📖 25 min read
Unstructured Data Management in Finance: Challenges, Architecture & Modern Tools
📖 21 min read
Inside Microsoft's AI Tour 2025 @ Chicago: Building Agentic Apps, Unified Data Estates, and Next-Gen AI Agents
📖 29 min read
Data Reconciliation with GenAI: Can LLMs Solve a Billion Dollar Banking Headache?
📖 47 min read
Lakehouse, Agent Bricks & Lakeflow: Learnings from Databricks DevConnect Chicago 2025
📖 26 min read
Why Your Data Lake Became a Swamp & How Data Contracts Can Save It
📖 36 min read
Parameter Efficient Fine Tuning (PEFT): Fine Tune LLMs Without Breaking the Bank
📖 52 min read
Context Is the New Code: Engineering Intelligence at Scale
📖 25 min read
LLM Deployments Aren't Plug & Play: Building for Scale and Efficiency
📖 47 min read
AI is making your refrigerator louder!: The Unconventional Side Effects of AI
📖 21 min read
LLMs Are Great, Until You Talk to Them Twice!: Why Chatbots Struggle in Multi-Step Dialogues
📖 23 min read
Perception Language Models: When AI Can See and Speak
📖 22 min read
The Hidden Thoughts of AI: When Chain-of-Thought Doesn't Tell the Whole Truth
📖 21 min read
The State of AI Models: Scaling, Reasoning, and Agentic Intelligence
📖 22 min read
Challenges & Criticisms of LangChain
📖 15 min read
Understanding LLM-as-a-Judge: The Future of Automated Evaluation
📖 7 min read
Beyond Tokens: Large Concept Models in AI
📖 7 min read
05 — Contact

Get in Touch

Let's talk about data, AI, and building things that matter.

Ready to connect?

Open to consulting engagements, full-time opportunities, and collaborations where data and innovation intersect.
