
Pavan Yellathakota
I'm a Data Analyst
Data-Driven Innovator
Dynamic Data & Analytics professional with 3+ years of experience designing end-to-end data pipelines, cloud-native platforms, and advanced analytics solutions. Passionate about delivering high-impact, insight-driven systems that accelerate smarter business decisions.
Currently pursuing an M.S. in Applied Data Science at Clarkson University, I specialize in building scalable ETL pipelines, interactive dashboards, and machine learning models using Python, SQL, PySpark, and cloud technologies like AWS and Snowflake.
My expertise spans marketing analytics, financial modeling, and data governance, with a proven track record of reducing reporting errors by 90%, boosting sales through data-driven strategies, and achieving 51% returns on a $650K investment portfolio.
I thrive at the intersection of data engineering, analytics, and machine learning, creating solutions that drive measurable business impact. Actively seeking opportunities in Data Analytics, Machine Learning, and Cloud Data Engineering. Let's transform data into decisions!

Data Analyst & Analytics Engineer
Combining 3+ years of experience with advanced training in Applied Data Science to deliver scalable, insight-driven solutions across analytics, machine learning, and cloud platforms.
- Website: www.pye.pages.dev
- Email: pavan.yellathakota.ds@gmail.com
- City: Austin, TX, USA
- Degree: M.S. in Applied Data Science
- Experience: 3+ years
- Got interesting ideas? Let's connect!
At HAVK Mladost, I built Power BI and Google Data Studio dashboards, reducing manual reporting by 40% and cutting costs by 15%. I also consolidated diverse data sources into a PostgreSQL warehouse, doubling analytics speed.
As a Graduate Research Analyst at Clarkson University, I developed platforms like CU-Next and BingeMax, leveraging collaborative filtering and FastAPI to serve thousands of users with low-latency, data-driven recommendations.
Technical Skills
As a data enthusiast, I'm passionate about uncovering insights and building solutions with data. I've built proficiency in a range of tools used in roles such as Data Analyst, Analytics Engineer, and Cloud Data Specialist.
Programming & Scripting

Data Science & ML
Data Visualization & BI





Cloud & Data Engineering



Development & Deployment

Collaboration & Tools

Portfolio
Discover a curated selection of my projects, where I apply advanced data science, machine learning, and full-stack development to create innovative solutions that drive meaningful business outcomes.
- Featured
- Data & Business Analytics
- Data Science & ML
- Python Fullstack Web Dev
Resume
Relentlessly pursuing excellence through continuous learning and unwavering dedication, because results are earned, not given.
Summary
PAVAN YELLATHAKOTA
A data-driven problem solver with an M.S. in Applied Data Science, combining domain knowledge with hands-on ML engineering and analytics expertise. Proficient in building ETL pipelines, scalable dashboards, and ML workflows using Python, SQL, Tableau, and cloud technologies. Passionate about converting complex datasets into actionable insights that drive real-world decisions.
- Austin, TX, USA
Education
M.S. in Applied Data Science
August 2023 - May 2025
Clarkson University, Potsdam, NY, USA
Gaining deep expertise across statistical modeling, machine learning, data visualization, and financial analytics. Developed production-grade projects including "Detoxify-Telugu," a BERT-based toxicity classifier, and built full-stack web apps using Flask and Streamlit. Coursework includes Portfolio Management, ML, Data Mining, and Visualization using Tableau, R-Shiny, and Plotly.
B.Tech in Computer Science
August 2016 - December 2020
Yogi Vemana University, Kadapa, AP, India
Completed comprehensive study in CS fundamentals, including Data Structures, Algorithms, Web Development, and Operating Systems. Built a capstone project titled "Fake News Classifier" applying NLP and data mining to identify misinformation across news articles.
Research
Graduate Research Analyst – Applied Statistics & Advanced Analytics
October 2023 - April 2025
Clarkson University, Potsdam, NY
- Developed CU-Next, a career analytics platform with a suggestion engine using collaborative filtering (KNN, cosine similarity); served 3,000+ student queries/month via a web dashboard.
- Built BingeMax, a recommendation engine (ALS on 20M+ ratings) for Clarkson Movie Night, matching users to genres and sending automated email alerts; deployed using FastAPI (<500ms latency), MLflow, Streamlit, and Airflow for retraining.
- Created AutoEval, an XGBoost-based used car price prediction app with region-specific fairness adjustments; deployed via Flask + Streamlit, delivering pricing bands, margin insights (25–30%), and 5% lower MSE.
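For readers unfamiliar with the collaborative filtering approach mentioned for CU-Next (KNN with cosine similarity), here is a minimal sketch of the general technique. It is not the production code: the `ratings` data, the user and item names, and both function signatures are hypothetical and chosen only for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(target, ratings, k=2, top_n=3):
    """Rank items the target has not rated, using the k most similar users.

    Each unseen item gets the similarity-weighted average of the
    neighbours' ratings for it.
    """
    sims = [
        (cosine_similarity(ratings[target], other), user)
        for user, other in ratings.items()
        if user != target
    ]
    neighbours = sorted(sims, reverse=True)[:k]

    scores, weights = {}, {}
    for sim, user in neighbours:
        if sim <= 0:
            continue  # ignore users with no overlap
        for item, rating in ratings[user].items():
            if item in ratings[target]:
                continue  # only suggest items the target has not rated
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim

    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:top_n]]


# Hypothetical interest ratings (1-5) for career tracks.
ratings = {
    "ana": {"data_analyst": 5, "ml_engineer": 3},
    "ben": {"data_analyst": 5, "ml_engineer": 3, "analytics_engineer": 4},
    "cy": {"ml_engineer": 1, "quant": 5},
}

print(recommend("ana", ratings, k=1))
```

With `k=1`, only the single most similar user contributes, so the suggestion comes from the neighbour whose ratings overlap most with the target's.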
Equity Research Analyst – SMIF
August 2024 - May 2025
Clarkson University, Potsdam, NY
- Co-managed a $650K fund, achieving a 51% annual return and outperforming the S&P 500 by 26% through valuation and risk analysis.
- Built valuation models (DCF, WACC, Multiples) in Excel and Python to assess high-growth and value stocks.
- Developed Power BI dashboards tracking EPS, sector weights, and valuation metrics to inform weekly reviews.
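The DCF and WACC models mentioned above follow standard textbook formulas. The sketch below shows the arithmetic only, assuming a Gordon-growth terminal value; the function names and all input figures are hypothetical and are not taken from the fund's actual models.

```python
def wacc(equity, debt, cost_of_equity, cost_of_debt, tax_rate):
    """Weighted average cost of capital: E/V * Re + D/V * Rd * (1 - t)."""
    total = equity + debt
    return (equity / total) * cost_of_equity + (debt / total) * cost_of_debt * (
        1 - tax_rate
    )


def dcf_value(cash_flows, discount_rate, terminal_growth):
    """Present value of forecast cash flows plus a Gordon-growth terminal value.

    cash_flows[0] is the cash flow one year out, cash_flows[1] two years out, etc.
    """
    if discount_rate <= terminal_growth:
        raise ValueError("discount_rate must exceed terminal_growth")
    pv_forecast = sum(
        cf / (1 + discount_rate) ** t for t, cf in enumerate(cash_flows, start=1)
    )
    terminal = cash_flows[-1] * (1 + terminal_growth) / (
        discount_rate - terminal_growth
    )
    pv_terminal = terminal / (1 + discount_rate) ** len(cash_flows)
    return pv_forecast + pv_terminal


# Hypothetical inputs: 60% equity at 10%, 40% debt at 5%, 25% tax rate.
rate = wacc(equity=600, debt=400, cost_of_equity=0.10, cost_of_debt=0.05, tax_rate=0.25)
value = dcf_value([100.0, 110.0, 120.0], discount_rate=rate, terminal_growth=0.02)
print(round(rate, 4), round(value, 2))
```

The terminal value usually dominates the result, so the valuation is most sensitive to the gap between the discount rate and the terminal growth rate.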
Professional Experience
Data Analytics Engineer – ETL, Marketing & Cloud Pipelines
October 2023 - April 2025
HAVK Mladost
- Built interactive campaign and financial dashboards in Power BI and Google Data Studio, reducing manual reporting by 40% and cutting costs by 15%.
- Consolidated ticketing, CRM, sponsorship, and ad data into a PostgreSQL warehouse; validated pipelines with PySpark in AWS Glue, reducing reporting errors by 90% and doubling analytics speed.
- Conducted channel-level attribution using GA4 and Meta Ads; built SQL-based RFM and cohort models, boosting ticket sales by 20% and increasing merchandise revenue by 22%.
- Collaborated across teams to align 12+ KPIs and implemented a shared metric dictionary, ensuring consistent reporting.
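The RFM models referenced above were built in SQL. As an illustration of the same idea, here is a small Python sketch that derives recency, frequency, and monetary values per customer. The `orders` records, customer IDs, and the `rfm_table` function are hypothetical.

```python
from collections import defaultdict
from datetime import date


def rfm_table(orders, as_of):
    """Compute recency (days), frequency (order count), and monetary (total spend).

    orders: iterable of (customer_id, order_date, amount) tuples.
    as_of:  reference date used to measure recency.
    """
    last_order = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in orders:
        last_order[customer] = max(last_order.get(customer, day), day)
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        customer: {
            "recency_days": (as_of - last_order[customer]).days,
            "frequency": frequency[customer],
            "monetary": monetary[customer],
        }
        for customer in last_order
    }


# Hypothetical ticket purchases.
orders = [
    ("a", date(2024, 1, 1), 50.0),
    ("a", date(2024, 3, 1), 30.0),
    ("b", date(2024, 2, 1), 20.0),
]
table = rfm_table(orders, as_of=date(2024, 3, 31))
print(table["a"])
```

In practice each of the three values is typically bucketed into score bands before segmenting customers; that step is omitted here to keep the sketch short.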
Business Data Analyst – Retail & Supply Chain
November 2022 - April 2023
eAppSys Limited, Hyderabad, India
- Processed 2M+ daily POS records using AWS Glue and Snowflake, cutting data lag from 5h to under 2h; refactored pipelines using dbt macros to improve query speed by 85%.
- Built Amazon QuickSight dashboards to monitor stock health; improved stockout detection by 25% and reduced missed sales.
- Tuned Snowflake compute and enforced IAM + RBAC policies, projecting $100K/year in savings.
- Developed QA scripts and unit tests for SKU/store-level validation, reducing anomalies by 80%.
Data Analyst – Customer & Campaign Insights
September 2021 - May 2022
Kantar GDC India Ltd., Pune, India
- Built Looker and Tableau dashboards to track churn, conversion, and acquisition for FMCG campaigns; improved executive reporting cycles by 30%.
- Conducted A/B tests and VADER sentiment analysis, increasing CTR by 18% via region-specific message targeting.
- Automated reporting using SQL + dbt in Snowflake, saving 12 analyst hours/month and improving metric freshness.
- Led KPI standardization and implemented lineage documentation using dbt docs and Notion, enhancing data transparency across teams.
My Expertise
Data professional with hands-on experience across analytics, machine learning, cloud engineering, and full-stack deployment—focused on building scalable, insight-driven solutions.
Data Insights & Business Analytics
Delivered insights through Power BI, Tableau, and Looker dashboards, reducing manual reporting by 40% and boosting ticket sales by 20% via RFM and cohort analysis.
Data Science & Machine Learning
Built recommendation systems and predictive models using Scikit-learn, XGBoost, and VADER; deployed low-latency solutions with FastAPI and Streamlit.
Cloud & Data Engineering
Engineered ETL pipelines with AWS Glue, Snowflake, and dbt, cutting data lag by 60% and saving $100K/year through compute optimization.
SQL & Database Management
Optimized PostgreSQL and Snowflake schemas, automated QA with PySpark, and reduced reporting errors by 90%.
Model Deployment & Full-Stack Development
Deployed ML models and dashboards using Flask, FastAPI, and Streamlit, achieving <500ms latency for real-time applications.