Open to opportunities

Pranay Rishith Bondugula

Data Scientist with AI / ML Engineering & MLOps Skills

I build end-to-end intelligent systems — from raw data pipelines and model training to production deployment and monitoring — across the full ML lifecycle.

Python · PySpark · SQL · PyTorch · TensorFlow · XGBoost · Scikit-learn · LangChain · RAG · MLflow · Kubernetes · Docker · AWS · FastAPI · NLP · Computer Vision

4 yrs at Accenture · Harman International · UNT M.S. Data Science

About

Building with Data, End to End

I'm a Data Scientist with 4 years of hands-on experience across the full AI/ML lifecycle — from collecting and transforming raw data, to training and evaluating models, to shipping them into production where they handle real workloads.

At Accenture, I worked on GenAI systems powered by large language models — designing retrieval systems, running experiments to improve output quality, and monitoring their behavior in production. At Harman International, I built data pipelines and ML models that processed high-volume sensor data and produced real-time predictions across a large fleet of connected devices.

I'm most effective when I can work across the whole problem: preparing data, building models, and making sure those models actually run and stay healthy in production. I hold an M.S. in Data Science from the University of North Texas.

📊

Data & Analysis

Large-scale data pipelines
Feature engineering
Statistical modeling
A/B experimentation
EDA & visualization

🧠

Modeling & AI

Supervised & unsupervised ML
Deep learning & NLP
GenAI & LLMs
RAG & fine-tuning
Model evaluation & iteration

🚀

Deployment & Scale

Production ML systems
Model monitoring & reliability
API development
Cloud infrastructure
End-to-end ML lifecycle

Experience

Where I've Done the Work

4 years across two production environments — spanning data, modeling, and deployment.

AI / ML Engineer

Accenture

Jan 2025 — Present

Designed and deployed a GenAI system using LLMs and retrieval-augmented generation, serving large volumes of user queries in production
Applied prompt engineering, fine-tuning, and evaluation frameworks to improve model output quality and reliability
Collaborated on the full model lifecycle — from data preparation and experimentation to deployment and performance monitoring
Conducted experimentation and analysis to measure the impact of system changes on end-user outcomes

LangChain · LangGraph · Python · MLflow · FastAPI · AWS

M.S. Data Science

University of North Texas

Aug 2023 — May 2025

Graduate program covering machine learning, statistical modeling, distributed systems, and applied AI

Machine Learning · Statistics · Data Engineering · Research

Data Scientist

Harman International

Jan 2021 — Jul 2023

Built scalable data pipelines to process high-volume, real-time sensor telemetry from a large fleet of connected devices
Developed and evaluated machine learning models for anomaly detection, achieving significant accuracy improvements through iterative experimentation
Performed feature engineering, exploratory data analysis, and model selection across structured and time-series datasets
Optimized and deployed trained models into production environments, ensuring reliability and performance at scale

PySpark · PyTorch · XGBoost · Scikit-learn · SQL · AWS

Capabilities

Technical Toolkit

A consolidated view of the tools and frameworks I use across the ML lifecycle.

🧠

Machine Learning & AI

Model training, deep learning, and generative AI pipelines.

PyTorch · TensorFlow · XGBoost · Scikit-learn · LangChain · Transformers · RAG Pipelines · NLP · Computer Vision

📊

Data & Analytics

Processing high-volume streaming and batch data at scale.

Python · SQL · PySpark · Pandas · Pinecone · Data Modeling · Feature Engineering

⚙

MLOps & Infrastructure

Containerization, orchestration, and model deployment.

Kubernetes · Docker · AWS · GCP · MLflow · Airflow · FastAPI · Prometheus · CI/CD

Projects

Selected Work

A sample of projects — from data pipelines and model training to production AI systems.

PERSONAL PROJECT

GenAI · RAG Pipeline

Legal Document RAG System

94% Accuracy

<1s Response

5K+ Documents

100+ Queries/sec

LangChain · Pinecone · Claude · FastAPI · Docker

PRODUCTION

Edge AI · Data Pipeline

IoT Anomaly Detection at Scale

84% Accuracy

2 min ETL

1 TB+ Data/day

50K+ Devices

PySpark · XGBoost · CNNs · TF Lite · Edge

PRODUCTION

GenAI · Agentic System

AI Agent for Multi-Step Reasoning

3x Speed

85% Automation

10+ Tools

92% Accuracy

LangGraph · LangChain · OpenAI · RAG · FastAPI

LinkedIn Shorts

Quick Thoughts

Bite-sized notes on engineering, machine learning pipelines, and scale.

Data Engineering · Recent · 2 min read

Day 16: Explaining ML's Neglected Concepts - Batch vs. Stream Processing

Most tutorials teach you batch. Most jobs eventually need stream. They're not just different speeds. They're different assumptions about when data shows up.
