Data Python

Buffalo, NY
August 01, 2020

Buffalo, NY - *****+1-716-***-****

Summary

Multi-skilled, data-driven professional with 5 years of experience building data-intensive solutions, including Hadoop, ETL, data warehousing, reporting, and dashboarding. Implemented projects in both Agile and Waterfall settings. Experienced in team and client management. AWS Certified Solutions Architect – Associate.


Data Modeling, Data Management, Data Warehousing, Data Engineering, Data Analytics, Web Analytics, Statistical Modeling and Machine Learning.

Big Data: Hadoop (HDP), Hive, Pig, HBase, HDFS, Ambari, Sqoop, Spark, Presto, Drill, Zeppelin.

Data Analytics and Machine Learning: Python, Regression, Classification, Clustering, Text Analytics (NLP), Web Analytics.

ETL Tools: Informatica PowerCenter, SSIS, Alteryx.

Reporting Tools: SAP Business Objects, Tableau, Power BI, JIRA, Looker, Google Analytics.

Database and tools: Oracle 11g/12c (SQL, PL/SQL), PostgreSQL, MySQL, SQL Server, SQLite, SQL Developer, Erwin.

Cloud: Amazon Web Services (AWS), Microsoft Azure.

Experience

Delta Technology, India

Data Engineer April 2017 – May 2019

• Implemented a Hadoop ecosystem and worked closely with data scientists to serve data mining and machine learning applications such as fraud detection, customer segmentation, and service personalization.

• Performed data quality checks, data integration, and provisioning of data to downstream teams. Used Hive, Pig, Sqoop, and Zeppelin for data wrangling and preliminary analysis, reducing the data science team's effort by 20%. Optimized existing queries using Spark.

• Developed a predictive model in Python using a Support Vector Regressor to predict commodity (food) prices based on retail and wholesale prices of food products. Achieved an accuracy of 61.89% through model tuning and optimization.

• Collected and integrated data using AWS. Performed data cleansing and preprocessing, and created visualizations using Python. Analyzed the data to establish the relationship between wholesale and retail prices of various food products.

• Modeled an IT solution to track cyber incidents and fraud cases in order to mitigate banks' financial losses. Performed A/B testing. Enhanced analysis and monitoring of cyber incidents by 20%. Helped the client create more robust guidelines and decrease cyber fraud by 8%.

• Created a data pipeline using SSIS to collect, clean, and integrate data from JSON and XML sources, storing data from various financial institutions in a more useful format. Performed data analysis and created Tableau dashboards to assist active monitoring and cyber analytics.

Business Intelligence Engineer November 2014 – March 2017

• Designed data warehouse architecture and performed data modeling. Used Informatica for data integration, validation and manipulation based on business logic to perform risk assessment of the banks.

• Created stored procedures for risk profiling and scoring of banks based on their compliance with the Basel Accords and their exposure to different types of risk. Designed BO reports and interactive Power BI dashboards on key metrics for executive reporting.

• Improved the efficiency of bank health monitoring by 36%, leading to enhanced risk management and compliance.

• Leveraged SSIS to provide a data warehousing and analytics solution for portfolio management, covering NAV and returns calculations, investment tracking, and brokerage management. Used Power BI for customer and market trend analysis.

• Optimized SQL queries. Developed stored procedures, triggers, and cursors for automation, and used advanced functions like pivot, rollup, and cube to implement MTD, YTD, and YTY calculations. Reduced error rate by 11% and increased efficiency by 16%.

Academic Projects

Credit Card Fraud Detection (Python, Flask, Amazon S3, Machine Learning)

• Built a web application using the Flask framework in which a file could be uploaded, or a single entry submitted, to identify the nature of a transaction.

• Trained Logistic Regression, SVM, Random Forest, and Naive Bayes classifiers and pickled the models to run them against the test data set, which was fetched from AWS S3. The application applied the best-performing model on the fly. Achieved an overall accuracy of 75.63%.

Referral pattern analysis of Doctors (Amazon Redshift, Amazon S3, Tableau, SAS Enterprise Miner, Python, Machine Learning)

• Extracted and integrated the doctors' demographic data with shared patient data from AWS S3 to generate a primary dataset of 69 million rows. Used Redshift for data warehousing and running analytical queries.

• Created Tableau-based visualizations to identify general trends and patterns among the doctors. Data will be fed to SAS Enterprise Miner to perform network analysis. Results will be verified with a Python-based implementation.

Theme extraction using unsupervised learning (Python, Natural Language Processing, MS-Excel)

• Scraped a health forum on Sexually Transmitted Diseases and Infections (STDs and STIs) using Selenium to collect inputs such as posts, replies, likes, shares, and support to create an initial dataset.

• Performed data wrangling, text mining, and sentiment analysis using Python to identify the general audience, their emotional state, and their needs.

• Implemented topic modeling using an unsupervised method, Latent Dirichlet Allocation (LDA), to identify the key topics. Themes and personas of individuals were created from these topics to support the adaptive website and chatbot.

Education

• University at Buffalo, The State University of New York

Master of Science in Management Information Systems June 2020

• University of Mumbai

Bachelor of Engineering in Electronics and Telecommunications June 2014
