ABHINAYAREDDY PISATI

DATA SCIENTIST
M.S. STATISTICAL DATA SCIENCE

About Me

Data Scientist with 4 years of experience building machine learning models and data pipelines that drive business impact. I specialize in developing end-to-end ML solutions, from data engineering to model deployment, with expertise in deep learning, NLP, and computer vision.

I am currently pursuing my M.S. in Statistical Data Science at San Francisco State University. At companies like Clusteratech and Suguna Media Network, I have delivered measurable results through scalable ML systems, automated pipelines, and data-driven insights that influence product strategy and business decisions.

Education

M.S. Statistical Data Science

San Francisco State University

GPA: 3.56/4.0

Aug 2024 – Dec 2026 (Expected)

Advanced Machine Learning, Deep Learning, Big Data with Spark, NLP, Computer Vision

Programming Languages

Python, R, SQL, C++, JavaScript

Machine Learning

Supervised Learning, Unsupervised Learning, Deep Learning, NLP, Computer Vision, Reinforcement Learning, Statistical Modeling, Predictive Analytics

Frameworks & Libraries

PyTorch, TensorFlow, Scikit-learn, Keras, XGBoost, Pandas, NumPy, OpenCV, NLTK

Big Data & Cloud

Apache Spark, Hadoop, Snowflake, Databricks, Hive, AWS, GCP, Azure, Docker, Kubernetes

Data Visualization

Tableau, Power BI, Looker, Matplotlib, Seaborn, Plotly

Developer Tools

Git, GitHub Actions, Airflow, Linux, Jupyter, VS Code

Work Experience

Data Science Intern

Clusteratech

May 2025 – Aug 2025

Milpitas, CA

Developed and optimized machine learning models and ETL pipelines processing millions of records, resulting in an 8% increase in product engagement and improved user retention metrics. Built automated, scalable dashboards using Python, SQL, and Tableau to track acquisition, retention, and conversion funnel metrics, enabling data-driven decision making across product teams. Partnered with cross-functional engineering and product teams to translate complex data insights into actionable recommendations, influencing feature development and product strategy.

Python, SQL, Machine Learning, ETL Pipelines, Tableau, Product Analytics, A/B Testing

Data Science Intern

Suguna Media Network

Feb 2024 – Jul 2024

Hyderabad, India

Engineered scalable data pipelines and real-time dashboards for marketing campaign performance monitoring, processing data from multiple channels and touchpoints. Applied causal inference methods and advanced ML models to optimize multi-channel campaign targeting strategies, achieving 12% higher engagement rates and improved ROI. Delivered comprehensive statistical analysis and visualizations to business stakeholders, facilitating data-driven decisions on marketing spend and channel optimization.

Python, SQL, Causal Inference, ML Models, Real-time Dashboards, Marketing Analytics

Data Scientist

Headlines Media Group

Jan 2021 – Jan 2024

Hyderabad, India

Automated large-scale data workflows and ETL processes using Python and SQL, reducing analytics reporting time by 40% and improving data reliability for business intelligence. Created interactive Tableau and Power BI dashboards for comprehensive user behavior analysis and content engagement monitoring across web and mobile platforms. Built and deployed predictive models for churn prediction and retention forecasting, driving data-informed content strategy and personalized user experiences.

Python, SQL, ETL Automation, Tableau, Power BI, Predictive Modeling, Churn Analysis

Featured Projects

End-to-end machine learning solutions delivering measurable business impact through advanced analytics and automation.

Customer Churn Prediction System
Machine Learning Pipeline

Developed end-to-end predictive models using XGBoost and logistic regression achieving 87% accuracy in churn detection. Engineered features from user behavior data and built automated monitoring dashboards enabling proactive retention strategies that reduced churn by 18%.

87% Accuracy, 18% Churn Reduction, Automated Monitoring
Python, XGBoost, Logistic Regression, Feature Engineering, Dashboards

Multi-Channel Marketing Attribution
Analytics Framework

Created comprehensive attribution models utilizing clustering and regression techniques on large-scale datasets. Improved marketing ROI by 18% through accurate channel performance measurement and budget reallocation decisions.

18% ROI Improvement, Multi-Channel Analysis, Budget Optimization
Python, Clustering, Regression, Attribution Modeling, Analytics

Deep Learning Image Classification
Computer Vision Application

Designed and trained CNN models using TensorFlow and PyTorch for complex image recognition tasks, achieving 90% accuracy. Implemented distributed GPU training in AWS and GCP environments, reducing training time by 60% through optimized pipeline architecture.

90% Accuracy, 60% Faster Training, GPU Optimization
TensorFlow, PyTorch, CNN, AWS, GCP, Distributed Training

Additional Projects

More data science and machine learning projects showcasing diverse technical capabilities.

Privacy Risk Detection System
Data Engineering

Automated Spark and Airflow pipelines to monitor and alert on privacy incidents across diverse data sources. Cut incident detection time by 50% with real-time validation and anomaly detection.

Python, Spark, Airflow, API Integration

Agent-Based Schema Validation
AI Automation

Orchestrated CrewAI and Haystack agents for schema checks and PII flagging in ETL. Enabled automated data quality checks reducing errors by 30% across pipelines.

Python, CrewAI, Haystack, ETL

GDPR-Compliant Data Architecture
Data Governance

Designed pipelines with pseudonymization and audit logs meeting GDPR and global privacy standards. Embedded continuous compliance testing within data transformations.

Spark, Airflow, Delta Lake, Compliance

Real-Time Privacy Dashboard
Business Intelligence

Built Power BI dashboards with automated alerts consolidating privacy KPIs for governance. Accelerated incident resolution by 20% through actionable insights.

Power BI, Python, SQL, ETL

NQL Query Understanding System
Natural Language Processing

Built a natural language query system that translates user questions into SQL queries using NLP techniques, improving query accuracy by 85%.

Python, NLP, BERT, SQL

Click-Through Rate (CTR) & Conversion Rate (CVR) Prediction
Predictive Analytics

Developed machine learning models to predict CTR and CVR for digital advertising campaigns, achieving 88% accuracy and improving ad ROI by 22%.

Python, XGBoost, Feature Engineering, A/B Testing

Personalized Local Services Search Ranking
Recommendation Systems

Created a personalized search ranking system for local services using collaborative filtering and location-based features, increasing user engagement by 35%.

Python, Elasticsearch, Collaborative Filtering, Geospatial

Urban Insights - City Explorer & Route Optimizer
Optimization

Built an urban analytics platform with route optimization algorithms, reducing travel time by 30% and providing data-driven city exploration insights.

Python, Graph Algorithms, Optimization, Mapping APIs

Supply Chain Optimization with Demand Forecasting
Time Series Forecasting

Developed time series forecasting models for supply chain demand prediction, reducing inventory costs by 25% and improving fulfillment rates by 18%.

Python, ARIMA, Prophet, Optimization

AI Audit Algorithm Development
Machine Learning

Developed machine learning algorithms to detect bias and ensure fairness in AI systems, improving model transparency by 40%.

Python, TensorFlow, Scikit-learn, Pandas

Flight Delay Prediction System
Predictive Analytics

Built predictive models to forecast flight delays using historical data, achieving 85% accuracy in delay predictions.

Python, Random Forest, XGBoost, Matplotlib

LLM Research Tool
NLP Research

Created a comprehensive research tool for analyzing large language models, streamlining research workflows for academic teams.

Python, Transformers, PyTorch, Streamlit

E-commerce Recommendation Engine
Recommendation Systems

Implemented collaborative filtering algorithms to provide personalized product recommendations, increasing sales by 25%.

Python, Collaborative Filtering, Apache Spark, Redis

Certifications

Professional certifications validating expertise in data science, machine learning, and cloud technologies.

🏆

Google Professional Data Analytics Certificate

Google

Comprehensive program covering data analysis, visualization, and statistical methods using industry-standard tools.

🏆

TensorFlow Developer Certificate

TensorFlow

Professional certification demonstrating proficiency in building and training neural networks using TensorFlow.

🏆

Databricks Certified Associate Developer for Apache Spark

Databricks

Certification validating expertise in building scalable data pipelines and processing big data with Apache Spark.

🏆

AWS Certified Machine Learning - Specialty

Amazon Web Services

Advanced certification demonstrating ability to design, implement, and maintain ML solutions on AWS.

Leadership & Involvement

Committed to fostering growth in the data science community through mentorship, leadership, and active participation in professional organizations.

👥

Graduate Teaching Assistant

Aug 2025 – Present

San Francisco State University

Mentored 100+ students on privacy analytics and data engineering best practices. Created assignments linking statistical theory to practical data science applications. Enhanced communication and teamwork through cross-departmental mentoring initiatives.

Impact: Improved student comprehension and practical skills in privacy-focused data science

👥

Vice President, Analytics Club

2024 – 2025

San Francisco State University

Led SQL/Tableau workshops for 100+ attendees enhancing data literacy. Organized technical workshops and industry speaker events connecting students with data science professionals.

Impact: Increased club membership by 40% and established partnerships with 5 tech companies

👥

Participant - Women in Data Science (WiDS)

2025

WiDS San Francisco 2025

Active participant in the global Women in Data Science conference, engaging with cutting-edge research presentations and contributing to diversity initiatives in data science.

Impact: Built network of 50+ female data scientists and promoted diversity and inclusion

Get In Touch

Let's Connect

I'm always interested in discussing new opportunities, collaborating on data science projects, or sharing insights about the latest trends in machine learning and analytics.

📧 abhinayapisati@gmail.com
📞 +1 628 290 7240
📍 San Francisco, CA