Data Scientist with Production Data Systems Expertise

Location:

Nashville, TN

Salary:

70000

Posted:

April 21, 2026


Resume:

Jignasu Pathak

540-***-**** ********@*****.*** linkedin.com/in/jignasu-pathak github.com/Mister-JP

Professional Summary

Full-stack software engineer with 2+ years of experience building production web applications, backend services, AWS-based data platforms, and operational decision-support systems at Amazon. Strong record of owning features across React front ends, Python and Java services, workflow/state modeling, deployment pipelines, and live-system troubleshooting. Experienced in translating end-user and business-process needs into maintainable software that improves reliability, usability, and operational efficiency at scale.

Selected Impact Highlights

• Built and scaled a React- and Java-based operational platform from early pilot to 86 warehouses, growing usage to 1,250 weekly active users while saving an estimated 90 labor hours per day through workflow automation and decision-support improvements.

• Delivered AWS-backed backend systems and data workflows using Lambda, DynamoDB, S3, SQS, API integrations, and Spark/EMR processing over 1.6B records per day, supporting low-latency recommendations and production-grade operational visibility.

• Owned production reliability improvements across live services through CI/CD automation, staged deployments, health checks, alerting, rollback controls, and incident troubleshooting, reducing failed release impact by 70% and improving operational resilience.

Experience

Amazon Dec 2023 – Jan 2026

Software Development Engineer I Tennessee, USA

Full-Stack Compliance and Decision-Support Platform for Inventory Intake

• Built and scaled a React- and Java-based application used during inbound inventory intake before stow, guiding warehouse associates through packaging, safety, and legal-compliance checks; expanded adoption from 4 to 86 fulfillment sites, 12 to 1,250 weekly active users, and 50 to 1,100 daily cases.

• Owned end-to-end feature development across the user interface, backend integrations, and workflow-state design, combining item images, barcode and catalog signals, seller metadata, weight details, workflow history, and policy-configured logic into a single decision surface that made high-risk decisions more consistent and traceable.

• Modeled workflow state in DynamoDB and built asynchronous service integrations that retrieved case context, pre-populated fields, and surfaced next-step guidance, turning a fragmented manual process into a structured, data-driven workflow with clearer operator actions.

• Built a backend diagnostics and recommendation pipeline on Apache Spark running on AWS EMR that processed 1.6B warehouse records per day from multiple data sources, compared similar products to identify likely root causes behind item-identification and compliance failures, and powered a rule-based policy engine returning actionable guidance with 80–90 ms query latency.

• Reduced cognitive load on users, cutting user clicks by 85% and saving an estimated 90 labor hours per day at peak volume by eliminating repetitive lookups and manual data entry while improving the consistency of intake decisions.

• Improved workflow completion rates from 70% to 95% by incorporating pilot-site feedback, running A/B tests, and redesigning the UX with clearer decision cues, color-coded states, site-aware override paths, and defensive friction controls in a high-churn operational environment.

• Delivered a reliable production system with 250–450 ms user-facing latency, 99.9% uptime, real-time policy/configuration updates, and a 40% reduction in downstream damage events through earlier and more consistent packaging-compliance decisions.

Production Backend, Data, and Deployment Systems

• Engineered a production computer-vision pipeline that processed 4M images per day with 99.9% availability and 580 ms P95 inference latency, supporting near real-time operational decisions in a live warehouse environment.

• Architected an AWS-based edge-to-cloud data platform across 185 industrial devices at 40 sites using AWS IoT Core, S3, SQS, Lambda, and DynamoDB, synchronizing image metadata, embeddings, OCR-derived attributes, and telemetry into downstream searchable workflows.

• Built asynchronous Python ETL and indexing workflows that joined inference outputs with camera, weight, Lidar, lighting, and process metadata into a 1.3B-record searchable corpus, making suspicious cases queryable within 10 minutes of capture and retrievable in 20–40 ms for investigation and monitoring.

• Designed and productionized a metadata-driven ownership index over 350M+ records using DynamoDB and S3, improving auditability and reducing GDPR-related retrieval and deletion turnaround time by 76% through automated record-linkage validation.

• Built and maintained CI/CD workflows for containerized services, including security scans, integration tests, deployment gates, and artifact publishing, then orchestrated canary and staged deployments with health checks, CloudWatch alerting, and rollback triggers that reduced failed release impact by 70%.

• Wrote SQL and analytical queries to validate operational datasets, troubleshoot missing or inconsistent records, and support incident response while partnering with scientists, operations teams, and engineers to convert production pain points into measurable pipeline and data-quality improvements.

Virginia Tech Jan 2023 – Dec 2023

Graduate Research Assistant Blacksburg, Virginia

• Designed and executed a research study on explainability in human-AI collaboration, evaluating how different explanation styles influenced trust, decision quality, and user reliance across multiple experimental conditions.

• Built Python-based data processing and analysis workflows to clean, organize, and evaluate experimental results, reducing analysis turnaround time by 50% and improving reproducibility of research outputs.

• Analyzed participant behavior across structured experiments and translated findings into a published master’s thesis for academic and non-technical audiences, demonstrating strong written communication and evidence-based reasoning.

• Collaborated with research advisors to define hypotheses, evaluate results, and refine experimental methodology in a cross-functional research environment.

• Tech: Python, Pandas, NumPy, Data Analysis, Statistical Reasoning, Experimentation, Research Writing

Amazon May 2022 – Aug 2022

Software Development Engineer Intern Arlington, Virginia

• Built a serverless operational dashboard using React, AWS Lambda, API Gateway, and DynamoDB to centralize IoT deployment error data and reduce time spent manually collecting troubleshooting information across tools.

• Designed backend data flows to aggregate deployment failures, surface the highest-priority issues, and give engineers a clearer operational view of rollout health across environments.

• Reduced incident response time from 90 minutes to 15 minutes by transforming scattered deployment logs and service data into an accessible dashboard used by engineering teams for faster diagnosis.

• Improved observability of rollout failures by exposing recurring error patterns and deployment-status signals, enabling quicker identification of systemic issues and reducing repeat investigation work.

• Partnered with engineers and stakeholders to refine requirements, prioritize the most actionable metrics, and improve usability of the reporting interface.

• Tech: React, AWS Lambda, API Gateway, DynamoDB, JavaScript, Operational Reporting

Education

Virginia Tech Dec 2023

Master of Science in Computer Science, GPA: 3.77/4.00 Blacksburg, Virginia

Vellore Institute of Technology Jul 2021

Bachelor of Technology in Electrical and Electronics Engineering, GPA: 8.36/10 Vellore, India

Skills

Programming & Querying: Python, SQL, Java, TypeScript, JavaScript

Frontend & APIs: React, Next.js, HTML, CSS, REST APIs, Dashboards, Operational Reporting

Cloud & Backend: AWS Lambda, DynamoDB, S3, SQS, API Gateway, ECS/Fargate, ECR, SageMaker, IoT Greengrass, IoT Jobs, Distributed Systems

Data & Processing: ETL Pipelines, Data Modeling, Workflow Automation, Data Validation, Metadata Systems, Reporting Pipelines, Apache Spark, EMR

ML & Analytics: Pandas, NumPy, PyTorch, TensorFlow, Model Evaluation, Experimentation Pipelines, Statistical Analysis

Engineering Practices: CI/CD, Docker, Monitoring, Alerts, Runbooks, Agile Development, Production Troubleshooting
