Sindhu Samudrala
*******.*********@*****.*** +1-669-***-**** LinkedIn Github
EDUCATION
Master of Science in Data Analytics Jan 2020 - Expected Dec 2021
San Jose State University, San Jose, California GPA: 3.5/4
Bachelor of Technology in Information Technology Aug 2013 - May 2017
Jawaharlal Nehru Technological University, India GPA: 3.65/4
TECHNICAL SKILLS
Programming Languages: Python, Java, C, Scala, R, JavaScript, MySQL, SQL, PL/SQL, MongoDB, Neo4j
Big Data: Hadoop, Pig, Hive, Apache Spark, Databricks
Tools and Packages: Tableau, Git, Maven, JCR, OSGi, Apache Sling
Cloud Technologies: AWS Cloud (EC2, S3, RDS), GCP
Certifications: Hadoop Certified Spark Developer (HDPCD/Spark), Oracle Certified Java Programmer (OCJP)
PROFESSIONAL EXPERIENCE
Virtusa, India- Data Engineer (JPMC, Cisco) Aug 2017 to Jan 2020
• Developed an application that extracts text from identity documents to validate customer data.
• Provided data integration best practices by understanding and resolving data integration issues.
• Automated the data ingestion patterns as a framework developed using Spark and Scala.
• Conducted workshops with various business teams, and gathered and documented requirements for collecting user feedback during Live Events.
• Developed specifications, performed gap/fit analysis, and identified the data sources and data integrations required across multiple systems to generate the key insights needed to evaluate product performance and user feedback.
• Integrated the live web pages with the Google Cloud Platform (GCP) Firestore database.
• Created 15+ management dashboards and an insights template to analyze customer feedback from Live Events.
KCM Petrotech, India- Data Analyst/Report Developer Jan 2017 to July 2017
• Developed more than 10 reports to support business requests and assisted in planning activities, measuring sales performance against the plan.
• Extracted data from various data sources and generated periodic and ad-hoc reports to analyze business trends and the impact of new sales initiatives on revenue, and to facilitate the decision-making process.
• Responsible for data preparation, transformation, loading and quality checks; ensured that reporting and analytics applications operated seamlessly.
• Participated in database design, creating tables, views, and stored procedures for an order processing application.
ACADEMIC PROJECTS
Prediction of Kickstarter Projects (Random Forest, KNN, XGBoost, Ensemble Techniques, NLP)
• Built a machine learning model to predict whether a project would succeed or fail.
• Performed sentiment analysis on the blurb (description) of each project, considered the number of days, fund amount, country, and category, and achieved an accuracy of 72%.
Recruitment Insights on Data Science Jobs (Web Scraping, Python, Tableau)
• Identified the data sources and the data collection and wrangling process required to generate the insights, helping students identify recruitment trends, key skills, and regional patterns.
• Developed impactful storytelling data visualizations using Tableau features such as data blending, filters, quick filters, global/context filters, hierarchies, sets, groups, calculated fields, LOD expressions, geo maps, and dual-axis charts.
Classification of Disaster Tweets (Logistic Classification, Decision Tree Classifier, SVM, NLP)
• Applied machine learning and NLP techniques to perform sentiment analysis and classify tweets as related or unrelated to real disasters.
• The Decision Tree Classifier achieved the highest accuracy among all models, at 90%.
New York Energy Consumption Prediction (Plotly, Fbprophet)
• Identified and wrangled the data, and used Plotly to visualize energy consumption trends.
• Used Fbprophet to predict energy consumption and performed cross-validation to check the results.