Pavan Yellathakota

I'm a Data Analyst

Data-Driven Innovator

Dynamic Data & Analytics professional with 3+ years of experience designing end-to-end data pipelines, cloud-native platforms, and advanced analytics solutions. Passionate about delivering high-impact, insight-driven systems that accelerate smarter business decisions.


Currently pursuing an M.S. in Applied Data Science at Clarkson University, I specialize in building scalable ETL pipelines, interactive dashboards, and machine learning models using Python, SQL, PySpark, and cloud technologies like AWS and Snowflake.


My expertise spans marketing analytics, financial modeling, and data governance, with a proven track record of reducing reporting errors by 90%, boosting sales through data-driven strategies, and achieving 51% returns on a $650K investment portfolio.


I thrive at the intersection of data engineering, analytics, and machine learning, creating solutions that drive measurable business impact. Actively seeking opportunities in Data Analytics, Machine Learning, and Cloud Data Engineering. Let's transform data into decisions!


Data Analyst & Analytics Engineer

Combining 3+ years of experience with advanced training in Applied Data Science to deliver scalable, insight-driven solutions across analytics, machine learning, and cloud platforms.

  • Website: www.pye.pages.dev
  • Email: pavan.yellathakota.ds@gmail.com
  • City: Austin, TX, USA
  • Degree: M.S. in Applied Data Science
  • Experience: 3+ years
  • Got interesting ideas? Let's connect!

At HAVK Mladost, I built Power BI and Google Data Studio dashboards, reducing manual reporting by 40% and cutting costs by 15%. I also consolidated diverse data sources into a PostgreSQL warehouse, doubling analytics speed.

As a Graduate Research Analyst at Clarkson University, I developed platforms such as CU-Next and BingeMax, using collaborative filtering and FastAPI to deliver low-latency (<500ms), data-driven recommendations, with CU-Next serving 3,000+ student queries per month.

Technical Skills

As a data enthusiast, I’m passionate about uncovering insights and building solutions with data. I work with a range of tools suited to roles such as Data Analyst, Analytics Engineer, and Cloud Data Specialist.

Programming & Scripting

Python
R
SQL
PySpark

Data Science & ML

Pandas
NumPy
Scikit-Learn
Jupyter Notebooks
PyTorch

Data Visualization & BI

Power BI
Tableau
Looker
Amazon QuickSight
Google Data Studio

Cloud & Data Engineering

AWS
Snowflake
PostgreSQL
Apache Airflow
dbt

Development & Deployment

Flask
FastAPI
Streamlit
MLflow
Git

Collaboration & Tools

Jira
Confluence
Slack

Portfolio

Discover a curated selection of my projects, where I apply advanced data science, machine learning, and full-stack development to create innovative solutions that drive meaningful business outcomes.

  • Featured
  • Data & Business Analytics
  • Data Science & ML
  • Python Fullstack Web Dev

Detoxify-Telugu: Toxicity Classifier

Fintech Sales GAP Analysis

Supply Chain Optimization Dashboard

Netflix Content Strategy Dashboard

Gen Z Career Preferences Report

Customer Acquisition Cost Analysis

Web Page Theme A/B Testing: Light vs Dark Theme Website Performance

Plotly Interactive Dashboards

Programming Languages Dashboard: Popular Programming Languages 2004-2024

Stock Market Portfolio Analytics

Fake News Classifier

Advanced Text Analysis using NLP

Markit Media Group Website: Digital Marketing

Resume

Relentlessly pursuing excellence through continuous learning and unwavering dedication, because results are earned, not given.

Summary

PAVAN YELLATHAKOTA

A data-driven problem solver with an M.S. in Applied Data Science, combining domain knowledge with hands-on ML engineering and analytics expertise. Proficient in building ETL pipelines, scalable dashboards, and ML workflows using Python, SQL, Tableau, and cloud technologies. Passionate about converting complex datasets into actionable insights that drive real-world decisions.

Education

M.S. in Applied Data Science

August 2023 - May 2025

Clarkson University, Potsdam, NY, USA

Gaining deep expertise across statistical modeling, machine learning, data visualization, and financial analytics. Developed production-grade projects including "Detoxify-Telugu," a BERT-based toxicity classifier, and built full-stack web apps using Flask and Streamlit. Coursework includes Portfolio Management, ML, Data Mining, and Visualization using Tableau, R-Shiny, and Plotly.

B.Tech in Computer Science

August 2016 - December 2020

Yogi Vemana University, Kadapa, AP, India

Completed comprehensive study in CS fundamentals, including Data Structures, Algorithms, Web Development, and Operating Systems. Built a capstone project titled "Fake News Classifier" applying NLP and data mining to identify misinformation across news articles.

Research

Graduate Research Analyst – Applied Statistics & Advanced Analytics

October 2023 - April 2025

Clarkson University, Potsdam, NY

  • Developed CU-Next, a career analytics platform with a suggestion engine using collaborative filtering (KNN, cosine similarity); served 3,000+ student queries/month via a web dashboard.
  • Built BingeMax, a recommendation engine (ALS on 20M+ ratings) for Clarkson Movie Night, matching users to genres and sending automated email alerts; deployed using FastAPI (<500ms latency), MLflow, Streamlit, and Airflow for retraining.
  • Created AutoEval, an XGBoost-based used car price prediction app with region-specific fairness adjustments; deployed via Flask + Streamlit, delivering pricing bands, margin insights (25–30%), and 5% lower MSE.
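
For readers unfamiliar with the technique named in the CU-Next bullet, here is a minimal sketch of KNN-style collaborative filtering with cosine similarity. It is an illustration only, not the CU-Next code: the `ratings` data, the user and item names, and the `cosine` and `recommend` helpers are all hypothetical.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data, for illustration only.
ratings = {
    "ana": {"data_analyst": 5, "ml_engineer": 3, "bi_developer": 4},
    "ben": {"data_analyst": 4, "ml_engineer": 2, "bi_developer": 5, "data_engineer": 4},
    "cy": {"ml_engineer": 5, "research_scientist": 4},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user, ratings, k=2, top_n=3):
    """Rank items the user has not rated by the similarity-weighted
    ratings of the k most similar users."""
    target = ratings[user]
    neighbours = sorted(
        ((cosine(target, r), name) for name, r in ratings.items() if name != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, name in neighbours:
        for item, rating in ratings[name].items():
            if item not in target:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Calling `recommend("ana", ratings)` ranks the items that the most similar users rated and "ana" has not. A production system would typically use a library implementation and a sparse matrix rather than nested dicts.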

Equity Research Analyst – SMIF

August 2024 - May 2025

Clarkson University, Potsdam, NY

  • Co-managed a $650K fund, achieving a 51% annual return and outperforming the S&P 500 by 26% through valuation and risk analysis.
  • Built valuation models (DCF, WACC, Multiples) in Excel and Python to assess high-growth and value stocks.
  • Developed Power BI dashboards tracking EPS, sector weights, and valuation metrics to inform weekly reviews.

Professional Experience

Data Analytics Engineer – ETL, Marketing & Cloud Pipelines

October 2023 - April 2025

HAVK Mladost

  • Built interactive campaign and financial dashboards in Power BI and Google Data Studio, reducing manual reporting by 40% and cutting costs by 15%.
  • Consolidated ticketing, CRM, sponsorship, and ad data into a PostgreSQL warehouse; validated pipelines with PySpark in AWS Glue, reducing reporting errors by 90% and doubling analytics speed.
  • Conducted channel-level attribution using GA4 and Meta Ads; built SQL-based RFM and cohort models, boosting ticket sales by 20% and increasing merchandise revenue by 22%.
  • Aligned 12+ KPIs across teams and implemented a shared metric dictionary, ensuring consistent reporting.
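
As a rough illustration of the RFM (recency, frequency, monetary) modeling mentioned above, here is a minimal sketch. The models described in the bullet were SQL-based; this sketch uses plain Python instead, and the `orders` records, the rank-based scoring, and all function names are hypothetical rather than the actual implementation.

```python
from datetime import date

# Hypothetical order records: (customer_id, order_date, amount).
orders = [
    ("c1", date(2024, 5, 1), 120.0),
    ("c1", date(2024, 6, 10), 80.0),
    ("c2", date(2024, 1, 15), 40.0),
    ("c3", date(2024, 6, 20), 300.0),
    ("c3", date(2024, 6, 25), 150.0),
    ("c3", date(2024, 3, 2), 60.0),
]


def rfm_table(orders, as_of):
    """Return {customer: (recency_days, frequency, monetary)}."""
    table = {}
    for customer, day, amount in orders:
        last, freq, total = table.get(customer, (day, 0, 0.0))
        table[customer] = (max(last, day), freq + 1, total + amount)
    return {c: ((as_of - last).days, f, m) for c, (last, f, m) in table.items()}


def rank_score(values, reverse=False):
    """Map each key to a 1..n rank score, where a higher score is better."""
    ordered = sorted(values, key=values.get, reverse=reverse)
    return {key: rank + 1 for rank, key in enumerate(ordered)}


def rfm_scores(rfm):
    """Sum the three rank scores; a lower recency in days earns a higher score."""
    r = rank_score({c: v[0] for c, v in rfm.items()}, reverse=True)
    f = rank_score({c: v[1] for c, v in rfm.items()})
    m = rank_score({c: v[2] for c, v in rfm.items()})
    return {c: r[c] + f[c] + m[c] for c in rfm}
```

Customers with the highest combined score (recent, frequent, high-spend) are the natural targets for retention or upsell campaigns. Real RFM models usually bucket into quantiles rather than using raw ranks.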

Business Data Analyst – Retail & Supply Chain

November 2022 - April 2023

eAppSys Limited, Hyderabad, India

  • Processed 2M+ daily POS records using AWS Glue and Snowflake, cutting data lag from 5h to under 2h; refactored pipelines using dbt macros to improve query speed by 85%.
  • Built Amazon QuickSight dashboards to monitor stock health; improved stockout detection by 25% and reduced missed sales.
  • Tuned Snowflake compute and enforced IAM + RBAC policies, projecting $100K/year in savings.
  • Developed QA scripts and unit tests for SKU/store-level validation, reducing anomalies by 80%.

Data Analyst – Customer & Campaign Insights

September 2021 - May 2022

Kantar GDC India Ltd., Pune, India

  • Built Looker and Tableau dashboards to track churn, conversion, and acquisition for FMCG campaigns; improved executive reporting cycles by 30%.
  • Conducted A/B tests and VADER sentiment analysis, increasing CTR by 18% via region-specific message targeting.
  • Automated reporting using SQL + dbt in Snowflake, saving 12 analyst hours/month and improving metric freshness.
  • Led KPI standardization and implemented lineage documentation using dbt docs and Notion, enhancing data transparency across teams.

My Expertise

Data professional with hands-on experience across analytics, machine learning, cloud engineering, and full-stack deployment, focused on building scalable, insight-driven solutions.

Data Insights & Business Analytics

Delivered insights through Power BI, Tableau, and Looker dashboards, reducing reporting time by 40% and boosting sales by 20% via RFM and cohort analysis.

Data Science & Machine Learning

Built recommendation systems and predictive models using Scikit-learn, XGBoost, and VADER; deployed low-latency solutions with FastAPI and Streamlit.

Cloud & Data Engineering

Engineered ETL pipelines with AWS Glue, Snowflake, and dbt, cutting data lag by 60% and saving $100K/year through compute optimization.

SQL & Database Management

Optimized PostgreSQL and Snowflake schemas, automated QA with PySpark, and reduced reporting errors by 90%.

Model Deployment & Full-Stack Development

Deployed ML models and dashboards using Flask, FastAPI, and Streamlit, achieving <500ms latency for real-time applications.

Contact

Got a question, idea, or just want to say hi? Let's make it happen!

Austin, TX, USA