RUTURAJ DIXIT
Data Scientist | AI/ML Engineer | Product Analytics
609-***-**** | **************@*****.*** | LinkedIn | GitHub | New York, NY
PROFESSIONAL SUMMARY
AI-native data scientist with hands-on experience designing ML systems, product analytics pipelines, and AI-driven user insights for enterprise clients. Built and deployed scalable data workflows at Tata Consultancy Services, achieving 40% efficiency gains and 80–90% model accuracy across manufacturing and industrial clients. Currently advancing research in LLM inference, RAG workflows, and transformer-based architectures at the Pace University AI Lab. Passionate about converting messy behavioral signals into product strategy, experimenting rigorously, and operating at the intersection of AI and product. Daily practitioner of AI tools; first to adopt, and first to build durable workflows with them.
CORE COMPETENCIES FOR THIS ROLE
Product Analytics: KPI frameworks, funnel analysis, activation & retention metrics, time-to-value modeling
Experimentation: A/B testing design, causal inference, rollout evaluation, statistical significance, LLM output quality metrics
SQL & Python: Expert SQL (complex joins, window functions, aggregations), Python (Pandas, PySpark, Scikit-learn, LightGBM, TensorFlow)
ML & AI Systems: Supervised/unsupervised learning, RAG pipelines, LangChain, Hugging Face, embedding-based retrieval, time-series forecasting
Data Engineering: ETL/ELT pipelines, Apache Spark, Airflow, Kafka, Snowflake, AWS S3, Azure Data Lake, Databricks
Cloud & MLOps: AWS (SageMaker, Textract, S3, EC2), Azure ML, GCP, Docker, FastAPI, CI/CD, REST APIs
Storytelling: Power BI, Matplotlib, executive-level insight communication, roadmap influence through data
AI-Native Operator: Daily user of LLMs, prompt engineering, agentic tools; builds durable AI-first workflows in research & production
WORK EXPERIENCE
Graduate Teaching Assistant (AI, ML & Python) | Pace University | New York, NY | Oct 2025 – Present
• Designed coursework and graded student projects on RAG pipelines, NLP models, ML pipelines, and data mining, evaluating model quality using AUC, F1 score, confusion matrices, and precision-recall metrics.
• Conducted research in the Pace AI Lab on the ADAPT architecture, vision-language model (VLM) fine-tuning, VGGT, Depth Anything (DA3), and Meta SAM3D, measuring model performance and validating evaluation frameworks.
• Applied an experimentation mindset to model evaluation: tested multiple computer vision and deep learning approaches, synthesized findings, and communicated results to faculty stakeholders.
• Served as a hands-on AI practitioner: daily user of LLM-based tools such as ChatGPT and of AI research workflows to accelerate lab output and improve student feedback loops.
AI / ML Engineer | Tata Consultancy Services | Pune, India | Oct 2022 – Aug 2024
Delivered AI/ML solutions for enterprise clients (Tokyo Electron, Lamb Weston, and Stellantis) in manufacturing and industrial domains.
• Owned end-to-end model development and deployment: built deep learning models (YOLO, Detectron2, ResNet, X3D, ANN) for object detection, anomaly detection, and activity recognition — achieving 80–90% accuracy in production.
• Designed and operated OCR data pipelines using AWS Textract with custom pre/post-processing logic, reducing document processing time by ~40% and improving data extraction quality for downstream analytics.
• Built product-facing insights: translated model outputs into operational efficiency recommendations delivered directly to client product and engineering stakeholders, influencing manufacturing workflow decisions.
• Architected real-time video processing systems (OpenCV, Django) and optimized SQL-based data pipelines for large-scale dataset handling across distributed environments.
• Leveraged Azure Cloud & AutoML for scalable model training and deployment; collaborated cross-functionally with client PMs and engineers to define success metrics and iterate on model performance.
• Established model monitoring and validation standards across multiple client engagements, defining what 'good' looked like for each use case and building documentation for reproducibility.
PROJECTS & RESEARCH
OpInfer: Open-Source Python Package for Optimized VLM Inference (github.com/ruturajdixit99/Opinfer), released v1.0.0
• Built and published a Python package that optimizes video inference performance for Vision Transformers (ViTs) and Vision-Language Models (VLMs) — achieving 50%+ reduction in computational overhead via adaptive motion gating and intelligent frame batching.
• Designed an automated parameter optimization system that analyzes video characteristics (motion patterns, lighting, scene stability) and self-tunes inference thresholds — supporting ViT/DeiT classifiers and OWL-ViT detectors across diverse real-world scenarios.
• Packaged with a full CI/CD pipeline (GitHub Actions), test suite, benchmarking framework, and PyPI-ready distribution, demonstrating production-grade software engineering alongside ML research.
PrimeMarket AI: Time-Series Forecasting Platform (primemarketai.com)
• Built and deployed time-series forecasting pipelines for intraday financial market prediction, combining gradient-boosted trees (XGBoost), recurrent neural networks (RNNs), and feature engineering, with full production deployment including the infrastructure and API layer.
• Designed automated feature engineering for noisy market signals; applied model evaluation frameworks to validate signal quality and reduce false positives in prediction outputs.
Financial Retention Behavior Modeling: Customer Churn Analytics
• Built a supervised ML model to identify customers likely to churn for financial institutions; engineered behavioral features and produced actionable retention strategy recommendations from model outputs.
Financial Transaction Risk Analyzer: Fraud Detection & Anomaly Modeling
• Simulated card transactions, built risk features, and trained both a supervised classifier and an unsupervised anomaly detector to flag high-risk customers, mirroring real-world product trust & safety pipelines.
Healthcare Agentic AI: Scheduling Agent (LLM / Agentic Workflow)
• Designed an LLM-powered scheduling agent enabling appointment booking and doctor-patient query routing; gained early hands-on experience with agentic AI product flows and human-AI interaction design.
Pattern Recognition in Financial Charts: Research Publication (ResearchGate)
• Applied image processing and statistical analysis to detect patterns in financial time-series visuals; published methodology and evaluation metrics on ResearchGate.
NLP for Banking Support Automation: Transformer / Hugging Face
• Built a support ticket classifier using transformer embeddings (Hugging Face) to route queries across categories, applied to a real-world B2B support automation use case.
EDUCATION
M.S. in Data Science, Pace University, New York | Expected May 2026
B.Tech in Electronics Engineering, Shivaji University | July 2022
CERTIFICATIONS
AWS Certified Machine Learning Engineer – Associate
Databricks Fundamentals Certified
PCAP: Python Programming Essentials