Abhinaya Reddy
Data Scientist & Statistical Analyst specializing in advanced analytics, machine learning, and data-driven decision making.

Work Experience
My professional journey in data science and analytics
Research Assistant
San Francisco State University | San Francisco, CA
- Guided simulations in statistical modeling and hypothesis testing using Monte Carlo methods.
- Developed curriculum modules with a focus on experimental design and forecasting methods.
- Developed case-based forecasting exercises in Business Forecasting using time series and ARIMA models.
- Led hands-on simulation labs in Computer Simulation using Monte Carlo and discrete-event modeling techniques.
Data Analyst
Headlines Media Group Of Publications | Hyderabad, India
- Analyzed high-volume user behavior data (millions of records) using Python and SQL to extract actionable insights and identify system-level engagement patterns, aiding product optimization initiatives.
- Developed automated data pipelines and real-time dashboards using SQL and Power BI to monitor KPIs, improving data availability and reporting efficiency by 40%.
- Conducted statistical trend analysis to uncover anomalies and performance dips, leading to timely interventions and operational adjustments.
- Partnered with editorial, design, and engineering teams to translate insights into UX improvements, in the spirit of Amazon's data-driven decision-making principles.
- Applied A/B testing methodologies and exploratory data analysis to validate content strategies and predict audience engagement outcomes.
Projects
A showcase of my data science and analytics projects

Responsible AI Audit on Hiring Algorithm
A hands-on project demonstrating the use of Fairlearn, SHAP, and Aequitas to audit and mitigate bias in machine learning models, particularly in hiring scenarios. Created a synthetic hiring dataset with 1,000 candidates and 12 features to study systematic discrimination and bias measurement in AI systems. Addresses intersectionality and provides ground truth for bias detection and mitigation.
Tools Used:
Python, Jupyter Notebook, Fairlearn, SHAP, Aequitas, Pandas, Scikit-learn
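As a rough illustration of the kind of group-fairness check such an audit reports, the sketch below computes per-group selection rates and the demographic parity difference in plain Python. It is not the project's code: the function names, the toy predictions, and the choice of this particular metric are assumptions made for the example. Fairlearn exposes a comparable `demographic_parity_difference` metric.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the fraction of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rates (0 means parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy, made-up data: 1 = "advance candidate", 0 = "reject".
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = selection_rates(preds, groups)
gap = demographic_parity_difference(preds, groups)
print(rates, gap)
```

Group labels can be any hashable value, so passing tuples such as `("female", "over_40")` would give intersectional selection rates with the same functions.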

Flight Delay Prediction Using Weather and Schedule Data
Built a comprehensive machine learning pipeline to predict flight departure delays by analyzing flight and weather data. Applied classification, regression, and feature engineering to model delay likelihood and duration. Achieved 73% accuracy in binary classification and submitted results in Kaggle-ready format. Includes data integration, exploratory analysis, and model optimization with cross-validation.
Tools Used:
Python, Jupyter Notebook, Pandas, Scikit-learn, Matplotlib, Seaborn

LLM-GenAI Research Tool with OpenAI, FAISS & LangChain
A research tool designed to streamline the analysis of news articles. Integrates large language models through OpenAI's APIs, using FAISS for efficient vector storage and LangChain for data handling. Enables fast semantic search and automated summarization, turning unstructured article text into actionable insights.
Tools Used:
Python, OpenAI API, FAISS, LangChain, Natural Language Processing
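The semantic search idea behind this tool can be sketched without any external services. The example below is a brute-force stand-in for what a vector index such as FAISS does at scale: rank stored vectors by cosine similarity to a query vector. The document IDs and three-dimensional vectors are invented for illustration; in a real pipeline they would come from an embedding model, and this is not the tool's actual code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the IDs of the k documents most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy "embeddings" for three news articles.
docs = {
    "markets": [0.9, 0.1, 0.0],
    "sports": [0.0, 0.2, 0.9],
    "economy": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]

print(top_k(query, docs, k=2))
```

An exhaustive scan like this is fine for a handful of articles; a dedicated index becomes worthwhile once the collection grows large enough that comparing against every vector is too slow.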

E-Commerce Recommendation using GenAI
Engineered a GenAI-powered e-commerce recommendation system that reduced irrelevant product suggestions by 75% using semantic search, RAG techniques, and dynamic user input processing, enabling highly personalized shopping experiences.
Tools Used:
Python, Streamlit, LangChain, OpenAI API, FAISS, Pandas, Seaborn, Matplotlib

Customer Churn Analysis
Increased churn prediction accuracy by 15% through advanced feature engineering and model optimization, enabling targeted retention strategies. Built and evaluated multiple classification models, translating customer behavior data into actionable business insights.
Tools Used:
Python, Pandas, Scikit-learn, Matplotlib, Seaborn

Supply Chain Optimization with Demand Forecasting
Improved sales forecast accuracy by 20% through a custom Random Forest model, allowing proactive inventory decisions and reducing stockouts. Built a feature-rich forecasting pipeline that captured seasonality, promotions, and regional variations, contributing to a data-driven supply chain strategy aligned with business goals.
Tools Used:
Python, Scikit-learn, Pandas, Plotly
Open Source Contributions
Explore my code repositories and contributions on GitHub
Abhinayareddy18
Data Science & Machine Learning Projects
Featured Repository
LLM-GenAI Research Tool: Advanced AI research tool combining OpenAI, FAISS, and LangChain
Technologies
Technical Skills & Expertise
Programming Languages & Tools
Python (including OOP), SQL, R, Bash, Git, GitHub, Linux (cron)
Machine Learning & AI
Scikit-learn, Hugging Face, LangChain, OpenAI API, Anomaly Detection, Survival Analysis, Time Series Forecasting, Regression, Classification, Hypothesis Testing, SHAP, Fairlearn
Big Data & Engineering
Apache Spark, PySpark, Airflow, ETL Pipelines, FAISS, REST APIs, PostgreSQL, MySQL, Pinecone
Data Visualization & Analytics
Power BI, Dash, Plotly, Tableau, Seaborn, Matplotlib, Statistical Analysis, A/B Testing
Cloud & Platforms
AWS (EC2, S3), Databricks, Jupyter, Google Colab
Development & Operations
CI/CD, Agile (Jira, Confluence), Process Optimization, Model Deployment, Production Pipelines
Specialized Technical Expertise
Deep Learning & NLP
MLOps & Production
Specialized Analytics
Business Intelligence
Key Technical Achievements
Model AUC achieved in churn prediction with 10M+ records
MAE in flight delay forecasting using time series models
Reduction in QA triage time with LLM-powered diagnostics
Education
Master of Science in Statistical Data Science
San Francisco State University | San Francisco, CA
GPA: 3.9/4.0
Duration: Sept 2024 – May 2025
Relevant Coursework:
- Statistical Methods for Data Analysis
- Data Mining
- Machine Learning
- Applied Multivariate Analysis
- Experimental Design and Analysis
- Programming for Data Analysis
- Natural Language Processing
- Deep Learning
Focused on advanced statistical methods and machine learning techniques for data-driven decision making and predictive analytics.
Certifications
Google Data Analytics Professional Certificate
Comprehensive certification covering data analysis techniques, tools, and methodologies used in professional data analytics.
Download My Resume
Want to learn more about my qualifications? Download my complete resume to see my full experience, education, and skills.