Data Python

Posted:

August 03, 2020

Resume:

Harshitha Sanikommu

404-***-**** Boston, MA ******************@*****.*** linkedin.com/in/harshitha-sanikommu https://github.com/SanikommuHarshitha

EDUCATION

Northeastern University, Boston, MA Expected Aug 2020

Master of Science in Information Systems GPA: 3.5/4.0

VR Siddhartha College, Vijayawada, India May 2016

Bachelor of Technology in Information Technology GPA: 8.45/10.0

Relevant Courses: Data Science, Big Data, Python, Relational Databases (DB), Machine Learning, Data Warehousing and Business Intelligence, Artificial Intelligence (AI), Natural Language Processing (NLP), Data Structures and Algorithms, R Programming, Cloud Computing, Data Mining

TECHNICAL SKILLS

Programming Languages: Python, R, SAS, Java, Scala, JavaScript

Machine Learning Packages: NumPy, Pandas, Matplotlib, Scikit-learn, Keras, TensorFlow, NLTK, Selenium

Databases: MySQL, Oracle, PL/SQL, PostgreSQL, T-SQL, NoSQL, MongoDB, Cassandra

AWS Cloud Services: S3, Redshift, EC2, Lambda, Kubernetes, SageMaker, EKS

ETL Pipelines: Apache Beam, Apache Airflow, Metaflow (Netflix), Kafka, Dask, DAG

Data Visualization Tools (BI): Power BI, Tableau, ER Studio, Talend, Alteryx, Looker, PivotTables

Big Data: PySpark, Hadoop, MapReduce, Hive, HBase, Pig, Apache Spark, Oozie

Other Tools/Technologies: Jupyter Notebook, Docker, GitHub, Slack, Trello, Turnilo, PyCharm, A/B testing

PROFESSIONAL EXPERIENCE

Data Engineer Co-op July 2019 – June 2020

Shah Family Foundation, Boston, USA

• Built AWS data pipelines and the company's MySQL server from the ground up, improving query performance by 92%

• Engineered a $30M-budget project and increased profit by 10% by improving the efficiency of Boston Public Schools

• Cut the time spent transferring and wrangling raw data by 92% with a custom-built ETL application that automates the preparation of unruly data for machine learning models

• Used Spark, Scala, Hadoop, HBase, Cassandra, MongoDB, Kafka, Spark Streaming, MLlib, and Python for data storage and analysis

Data Scientist July 2016 – June 2018

EdgeVerve, Bangalore, India

• Conducted statistical data analysis using logistic regression, KNN, decision tree classification, and random forest models to increase accuracy (P-scores) by 30%

• Accelerated statistical and analytical insights by 40% for effective strategic positioning using PySpark

• Expanded recurrent business among land financial specialists by 25%

ACADEMIC PROJECTS

Dockerize Sentiment Analysis Model using Metaflow Data Pipeline Feb 2020

• Designed ETL pipelines using Airflow and Apache Beam to scrape, preprocess, and label data from company earnings calls and store the retrieved data on S3

• Predicted the sentiment of sentences using Python, Metaflow (Netflix), NLP, Docker, and a Flask app

• Designed a BERT model on a GPU using NVIDIA CUDA and increased accuracy to 93%

• Incorporated the Amazon Comprehend API into the algorithms to label data for sentiment analysis

• Developed a Flask app that runs the TensorFlow model and created Docker containers

Quora Question Pairs (Python, TensorFlow, LSTM, Word2Vec, Log Loss) March 2019

• Built a Long Short-Term Memory (LSTM) model that identifies duplicate questions on Quora

• Used Keras and TensorFlow to build a model optimized for log loss and increased accuracy by 73%

• Ranked 37th on Kaggle's public leaderboard

Customer Relationship Management Using Hadoop March 2019

• Led a team of 4 to classify customers as valued or not using MapReduce jobs

• Triggered and monitored workflows in Oozie using Linux commands and developed HiveQL queries
