DARSHANA CHOUDHARY
********.*********.***@*****.*** | +1-551-***-**** | linkedin/darshana-choudhary/
WORK EXPERIENCE
American Technology Initiative Inc. California, USA
Data Engineer Jan 2025 - Jun 2025
• Implemented a knowledge search platform using Azure OpenAI, AI Search, and Azure ML Prompt Flow to summarize enterprise documents, PDFs, and emails, optimizing retrieval with vector indexes and LLM integration
• Developed an API to track and monitor SQL queries, improving reporting reliability by 20%
Bank of New York Mellon New York, USA
Big Data Engineer Dec 2023 - Dec 2024
• Engineered scalable batch data pipelines using GCP, Python, and Airflow, migrating legacy Pig scripts to BigQuery
• Analyzed 4 PB of clickstream data across platforms, identifying trends and targeted investment strategies. Led advanced statistical analysis, data validation, and root cause analysis, cutting issue resolution time by 15%
• Reduced pipeline latency by eliminating Hadoop clusters, saving $100,000 in monthly cloud costs. Decreased average job runtime from 1 hour to 18 minutes, improving SLA by 2+ hours
• Collaborated with stakeholders to design Looker dashboards, data models, and ETL pipelines, driving actionable insights on performance monitoring, risk indicators, and KPIs for leadership
• Developed ML time-series models to forecast cash positions and detect anomalies with 87% accuracy
Larsen and Toubro Infotech Maharashtra, India
Data Engineer Analyst Oct 2020 - Sep 2022
• Migrated data from multiple sources into Snowflake and Synapse Analytics using Azure Data Factory pipelines; performed advanced analysis with predictive insights, achieving 25% client cost savings
• Automated large-scale data processing with Databricks and PySpark transformations. Leveraged Azure Logic Apps and Key Vault for data security and workflow troubleshooting, boosting pipeline reliability by 70%
• Conducted A/B tests and conversion-rate analyses to evaluate the results of business initiatives
• Optimized complex SQL queries through indexing, join optimization, and partitioning, reducing data retrieval time by 60%
• Built 20+ Power BI reports and dashboards on 30+ TB of data, empowering data-driven decisions
• Deployed CI/CD pipelines and designed a Snowflake schema with Slowly Changing Dimension Type 2 (SCD2) scripts
• Collaborated directly with clients and SMEs to gather requirements and translate them into actionable BI solutions
PROJECTS
Payment Fraud Trend Analytics Using Hadoop Ecosystem
Analyzed 5 million payment records using Google Cloud Dataproc and HiveQL to identify trends and detect fraud. Designed an HBase schema for fast lookups, enabling insights that drove 10% ticket recovery
Customer Sentiment Analysis Using Gemini Multimodal Model (GCP)
Implemented a real-time analysis pipeline using Gemini's AI and NLP capabilities to classify customer feedback as positive or negative. Built a data pipeline using Python, Dataflow, Pub/Sub, Functions, and BigQuery
EDUCATION
Pace University New York, NY
Master of Science in Data Science, GPA: 3.88/4 May 2024
SKILLS
Languages: Python, C, C++, SQL, T-SQL, PL/SQL, PySpark, PowerShell Script
Databases: Snowflake, BigQuery, Oracle, MySQL, PostgreSQL, CloudSQL, AlloyDB, MongoDB
Big Data & Cloud: Azure, GCP, Databricks, Hadoop, Hive, HBase, HDFS, Pig, Kafka
Machine Learning: Pandas, PyTorch, TensorFlow, Scikit-Learn, Time-series models, Neural Network, MLOps
BI & Visualization: Power BI, Looker, Tableau, Qlik, DAX
Tools: Airflow, Jupyter, Jira, MS Excel (pivot tables, etc.), Version Control (Git, GitHub)
LEADERSHIP AND CERTIFICATES
• Spearheaded 3 large-scale events for the Quality Engineering Foundation (a non-profit organization for Quality & AI)
• Served as lead developer, coordinating and mentoring a team of 7 new associates at Larsen and Toubro Infotech
• Microsoft Certified: Azure Data Scientist Associate (DP-100), Azure AI Fundamentals (AI-900), Azure Data Fundamentals (DP-900), Azure Fundamentals (AZ-900)