PHANINDRAKUMAR CHINTAPALLI
***********.*@************.*** 857-***-**** www.linkedin.com/in/phanindrakumar Availability: July 2021
EDUCATION
Northeastern University, Boston, MA July 2021
Master of Science in Computer Engineering (Data Science & Machine Learning) GPA: 3.8/4
Coursework: Machine Learning, Database Management Systems, Data Management & Preprocessing, Data Visualization
TECHNICAL SKILLS
Statistical & programming languages: Python, R
Database Query Languages: T-SQL, MySQL, NoSQL, PostgreSQL, Oracle SQL
Big Data and Cloud: Snowflake, Google Cloud Platform, BigQuery, Cloud Functions, App Engine, Cloud Run, Apache Spark, Hive
ETL and visualization Tools: Tableau, Power BI, Data Studio, SSIS (ETL), Talend (ETL), Elasticsearch, Kibana, Google Analytics
Machine learning & Deep learning packages: Scikit-learn, SciPy stack, GBM, XGBoost, Keras, TensorFlow, NLTK
Operating systems and others: Windows, Linux, GitHub, Bash, Jira, Confluence, Selenium WebDriver, REST API, Docker, Containers
PROFESSIONAL EXPERIENCE
Data Engineer Intern, American Tire Distributors, USA May 2020 – Jan 2021
• Built a robust ETL data pipeline to load customer data into BigQuery and visualized the results in Kibana, which helped the digital marketing team save about 2 million dollars.
• Created an efficient data pipeline covering data extraction, cleaning, filtering, merging, transformation, and reporting using SSIS, Python, SQL, and Power BI as part of developing a labor planning tool that replaced 150+ Excel sheets across 140 warehouses, saving more than 50,000 hours and 10 million dollars for the company.
• Developed a dashboard for real-time monitoring of product, customer, and inbound-outbound units across all hubs and spokes using Tableau and Data Studio.
• Designed and deployed a machine learning model for inbound forecasting of warehouse units from the manufacturers and predicted outbound units from a distribution center to customers with an accuracy of 86% using BigQuery ML.
Data Analyst, Tata Consultancy Services, India Dec 2017 – Jul 2019
• Performed data validation to ensure data quality, including smoothing the data with running totals, moving averages, weighted moving averages, and exponential moving averages, using SQL window functions, bucketing, and correlation calculations.
• Supported the client in making practical decisions using visualization tools such as Tableau and Power BI, which contributed to revenue growth of 6% in one year.
ACADEMIC / PERSONAL PROJECTS Project links
Unsupervised Domain Adaptation for Synthetic to Real Images (Python, PyTorch) Feb 2020 – Mar 2020
• Implemented domain adaptation on the DomainNet dataset (10 classes; source: sketches of objects, target: real images of objects) using the Reverse Gradient (RevGrad) architecture based on generative adversarial networks (GANs).
• Iterated on the architecture with various modifications. The best architecture (ResNet-101 with RevGrad) achieved an accuracy of 79.47% on the target distribution, compared to 64% for the baseline (ResNet-50) model on the DomainNet dataset.
Music generation using deep learning (Python, TensorFlow, Keras) Dec 2019 – Jan 2020
• Used many-to-many character-level recurrent neural networks, where the number of outputs equals the number of inputs, to generate new music automatically.
• Achieved an accuracy of 90% with a three-layer LSTM architecture by running the model for 100 epochs.
Human activity recognition (Python, Keras) Oct 2019 – Nov 2019
• Built a model that predicts human activities such as walking, walking downstairs, walking upstairs, sitting, standing, and laying down from data recorded by accelerometer and gyroscope sensors.
• Achieved an accuracy of 90.09% with a simple two-layer LSTM architecture using the raw time series data.
Personalized Cancer Diagnosis (Python, Scikit-learn) Sep 2019 – Oct 2019
• Conducted data cleaning, imputed missing values, and created new features to improve model performance.
• Tested multiple classification models such as Random Forest, Logistic Regression, and Gradient Boosting Decision Trees.
• Achieved the best result among all the machine learning models tested, a log loss of 1.0944, using logistic regression with class balancing.
Hospital Evaluator (SQL, Java) Jan 2020 – Mar 2020
• Designed an ER model to load data on hospital ratings/reviews, performance, and finances in the United States at the state and federal level from heterogeneous sources.
• Used JDBC for the data access layer and JSP to perform CRUD operations on the data.
• Used CloverDX to create ETL workflows for data extraction and transformation on external data sources.
• Developed a dashboard in Tableau to visualize the business insights from the project.