Abhinaya Reddy

Data Scientist & Statistical Analyst specializing in advanced analytics, machine learning, and data-driven decision making.

Work Experience

My professional journey in data science and analytics

Sept 2024 – May 2025

Research Assistant

San Francisco State University | San Francisco, CA

  • Guided simulations in statistical modeling and hypothesis testing using Monte Carlo methods.
  • Developed curriculum modules with a focus on experimental design and forecasting methods.
  • Developed case-based forecasting exercises in Business Forecasting using time series and ARIMA models.
  • Led hands-on simulation labs in Computer Simulation using Monte Carlo and discrete-event modeling techniques.
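For illustration, the kind of Monte Carlo hypothesis test covered in these simulation labs can be sketched as a permutation test for a difference in means. This is a minimal sketch and not the actual lab material: the function name `permutation_p_value`, its parameters, and the sample data are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_sims=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means by Monte Carlo.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose absolute mean difference is at least as large as the
    observed one.
    """
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_sims):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_sims + 1)


if __name__ == "__main__":
    # Hypothetical measurements for two groups.
    control = [12.1, 11.8, 12.4, 12.0, 11.9]
    treated = [12.9, 13.1, 12.7, 13.0, 12.8]
    print(permutation_p_value(control, treated, n_sims=5_000))
```

A small p-value suggests the observed difference would be unlikely if the group labels were interchangeable. Raising `n_sims` tightens the estimate at the cost of run time.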

Jan 2023 – Jan 2024

Data Analyst

Headlines Media Group Of Publications | Hyderabad, India

  • Analyzed high-volume user behavior data (millions of records) using Python and SQL to extract actionable insights and identify system-level engagement patterns, aiding product optimization initiatives.
  • Developed automated data pipelines and real-time dashboards using SQL and Power BI to monitor KPIs, improving data availability and reporting efficiency by 40%.
  • Conducted statistical trend analysis to uncover anomalies and performance dips, leading to timely interventions and operational adjustments.
  • Partnered with editorial, design, and engineering teams to translate insights into UX improvements, echoing Amazon's data-driven decision-making principles.
  • Applied A/B testing methodologies and exploratory data analysis to validate content strategies and predict audience engagement outcomes.

Projects

A showcase of my data science and analytics projects

Responsible AI Audit on Hiring Algorithm

A hands-on project demonstrating the use of Fairlearn, SHAP, and Aequitas to audit and mitigate bias in machine learning models, particularly in hiring scenarios. Created a synthetic hiring dataset with 1,000 candidates and 12 features to study systematic discrimination and bias measurement in AI systems. Addresses intersectionality and provides ground truth for bias detection and mitigation.
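To illustrate the kind of bias measurement such an audit relies on, here is a minimal pure-Python sketch of per-group selection rates and the demographic parity difference, meaning the gap between the highest and lowest selection rate across groups. It is an illustration only, not the project's code: the function names and the toy data are hypothetical, and libraries such as Fairlearn provide this metric ready-made.

```python
from collections import defaultdict


def selection_rates(groups, decisions):
    """Return the share of positive decisions (1 = hired) for each group."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, decision in zip(groups, decisions):
        totals[group] += 1
        selected[group] += int(decision)
    return {group: selected[group] / totals[group] for group in totals}


def demographic_parity_difference(groups, decisions):
    """Gap between the highest and lowest group selection rate (0 = parity)."""
    rates = selection_rates(groups, decisions)
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    # Hypothetical toy data: group label and model decision per candidate.
    groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
    decisions = [1, 1, 1, 0, 1, 0, 0, 0]
    print(selection_rates(groups, decisions))
    print(demographic_parity_difference(groups, decisions))
```

A value near zero means the groups are selected at similar rates. Larger values flag a disparity worth investigating, for example across intersections of several attributes.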

Tools Used:

Python, Jupyter Notebook, Fairlearn, SHAP, Aequitas, Pandas, Scikit-learn

Flight Delay Prediction Using Weather and Schedule Data

Built a comprehensive machine learning pipeline to predict flight departure delays by analyzing flight and weather data. Applied classification, regression, and feature engineering to model delay likelihood and duration. Achieved 73% accuracy in binary classification and submitted results in Kaggle-ready format. Includes data integration, exploratory analysis, and model optimization with cross-validation.

Tools Used:

Python, Jupyter Notebook, Pandas, Scikit-learn, Matplotlib, Seaborn

LLM-GenAI Research Tool with OpenAI, FAISS & LangChain

A comprehensive research tool designed to streamline the analysis of news articles using cutting-edge AI technologies. Integrates Large Language Models with OpenAI's APIs, utilizing FAISS for efficient vector storage and LangChain for seamless data handling. Enables fast semantic search and automated summarization, transforming unstructured data into actionable insights.

Tools Used:

Python, OpenAI API, FAISS, LangChain, Natural Language Processing

E-Commerce Recommendation using GenAI

Engineered a GenAI-powered e-commerce recommendation system that reduced irrelevant product suggestions by 75% using semantic search, RAG techniques, and dynamic user input processing, enabling highly personalized shopping experiences.

Tools Used:

Python, Streamlit, LangChain, OpenAI API, FAISS, Pandas, Seaborn, Matplotlib

Customer Churn Analysis

Increased churn prediction accuracy by 15% through advanced feature engineering and model optimization, enabling targeted retention strategies. Built and evaluated multiple classification models, translating customer behavior data into actionable business insights.

Tools Used:

Python, Pandas, Scikit-learn, Matplotlib, Seaborn

Supply Chain Optimization with Demand Forecasting

Improved sales forecast accuracy by 20% with a custom Random Forest model, enabling proactive inventory decisions and reducing stockouts. Built a feature-rich forecasting pipeline that captured seasonality, promotions, and regional variations, contributing to a data-driven supply chain strategy aligned with business goals.

Tools Used:

Python, Scikit-learn, Pandas, Plotly


Open Source Contributions

Explore my code repositories and contributions on GitHub

Abhinayareddy18

Data Science & Machine Learning Projects

Featured Repository

LLM-GenAI Research Tool - Advanced AI research tool combining OpenAI, FAISS, and LangChain

Python

Technologies

Python, Machine Learning, OpenAI, LangChain, FAISS, Streamlit, Pandas, Scikit-learn

Technical Skills & Expertise

Programming Languages & Tools

Python, SQL, R, Bash, Git, GitHub, OOP in Python, Linux (bash, cron)

Machine Learning & AI

Scikit-learn, HuggingFace, LangChain, OpenAI API, Anomaly Detection, Survival Analysis, Regression, Classification, Hypothesis Testing, SHAP, Fairlearn

Big Data & Engineering

Apache Spark, PySpark, Airflow, ETL Pipelines, FAISS, REST APIs, PostgreSQL, MySQL, Pinecone

Data Visualization & Analytics

Power BI, Dash, Plotly, Tableau, Seaborn, Matplotlib, Statistical Analysis, A/B Testing

Cloud & Platforms

AWS (EC2, S3), Databricks, Jupyter, Google Colab, Apache Spark, Time Series Forecasting

Development & Operations

CI/CD, Agile (Jira, Confluence), Process Optimization, Model Deployment, Production Pipelines

Specialized Technical Expertise

Deep Learning & NLP

Keras, TensorFlow, LSTM, Prophet, ARIMA, Transformers

MLOps & Production

Model Deployment, Pipeline Automation, Performance Monitoring, Scalability

Specialized Analytics

Time Series Forecasting, Survival Analysis, Anomaly Detection, Bias Detection

Business Intelligence

KPI Monitoring, Real-time Dashboards, Statistical Trend Analysis, Operational Analytics

Key Technical Achievements

92%

Model AUC achieved in churn prediction with 10M+ records

12-min

MAE in flight delay forecasting using time series models

30%

Reduction in QA triage time with LLM-powered diagnostics

Education

Master of Science in Statistical Data Science

San Francisco State University | San Francisco, CA

GPA: 3.9/4.0

Duration: Sept 2024 – May 2025

Relevant Coursework:

  • Statistical Methods for Data Analysis
  • Data Mining
  • Machine Learning
  • Applied Multivariate Analysis
  • Experimental Design and Analysis
  • Programming for Data Analysis
  • Natural Language Processing
  • Deep Learning

Focused on advanced statistical methods and machine learning techniques for data-driven decision making and predictive analytics.

Certifications

Google Professional Data Analytics Certificate

Comprehensive certification covering data analysis techniques, tools, and methodologies used in professional data analytics.

Get in Touch

Location

San Francisco, CA

Connect with me