Sai Nikhil Dunuka
Data Analyst
+1-551-***-**** ******@****.*** New Jersey, United States LinkedIn GitHub
SUMMARY
Experienced Data Analyst skilled in SQL, Python, ETL, machine learning (Scikit-learn, XGBoost), and big data processing with AWS and Apache Spark.
Proficient in data modeling, fraud detection, and data visualization using Tableau and Power BI. Strong background in regulatory compliance (SOX, GDPR) and optimizing workflows in Agile environments.
WORK EXPERIENCE
Junior Data Analyst, BNP PARIBAS, Chennai, India Aug 2022 - Aug 2023
Analyzed high-volume transactions using SQL and Python (Pandas, NumPy), enhancing transaction approval processes.
Managed ETL pipelines with SQL, Pandas, PySpark, and AWS S3, improving data processing efficiency by 15%.
Developed and implemented fraud detection models using Scikit-learn and XGBoost, increasing fraud detection accuracy by 20% and reducing manual processing time by 20%.
Cleared halted transactions and identified bottlenecks, reducing transaction processing time by 25%.
Collaborated with Data Science, Risk, and Operations teams to optimize SQL queries, enhancing overall transaction approval workflows.
Applied Agile methodologies, participating in sprints and daily stand-ups to ensure collaborative improvements and efficient project delivery.
Developed Power BI and Tableau dashboards for real-time insights and ensured compliance with SOX and GDPR standards.
Data Analyst Intern, BNP PARIBAS, Chennai, India Mar 2022 - Aug 2022
Developed a Custody Management System using Python (Pandas, NumPy) and SQL, automating reconciliation and reducing processing time by 20%.
Optimized SQL queries and ETL workflows, cutting data processing time by 30% and ensuring regulatory compliance.
Applied SVM and PCA for anomaly detection, and automated compliance reporting with Python, improving fraud detection accuracy and reducing manual effort by 40%.
EDUCATION
New Jersey Institute of Technology, Newark, United States - Master of Science, Data Science - GPA: 3.75 Sep 2023 - Dec 2024
PROJECTS
Plants Classification using Deep Neural Networks
Developed a deep learning model using custom CNN and pre-trained architectures (ResNet50, InceptionV3) for plant image classification.
Improved accuracy by 15% through data preprocessing, augmentation, and hyperparameter tuning techniques.
Utilized TensorFlow and Keras, with GPU acceleration for model training.
Wine Quality Prediction using Docker Container and S3 Bucket
Built a parallel machine learning model on Apache Spark using AWS for wine quality prediction.
Deployed with Docker for scalability and achieved 90% accuracy using F1 score for evaluation.
Stored data on AWS S3 for easy access and management during deployment.
Flight Data Analysis Project
Analyzed flight schedules and airport operations with SQL, Python (Pandas, NumPy), and Hadoop. Identified delay factors, reducing delays by 20%.
Set up a 4-VM Hadoop cluster for distributed data processing and enhanced big data capabilities.
SKILLS
Data Analysis: SQL, Python (Pandas, NumPy), Data Processing, Data Modeling
ETL Processes: Data Extraction, Transformation, Loading (ETL Pipelines), Data Warehousing
Machine Learning Techniques: Scikit-learn, XGBoost, SVM, PCA, Anomaly Detection, Fraud Detection
Big Data Technologies: Apache Spark, Hadoop, Distributed Data Processing
Cloud and Visualization Tools: AWS (S3, EC2), Docker, Power BI, Tableau