
Data Science, python, sql, statistics

Cambridge, MA
January 29, 2019



Soumya Shalini


** ********* ******, *** ***, Quincy, MA 02169

+1-704-***-**** Soumyas0406 soumyas0406

Professional Summary

• 6 years of comprehensive experience in object oriented programming, relational database management, data warehousing and business intelligence, providing solutions for applications involving huge influx of real-time as well as batch data

• Actively participated in all phases of the project life cycle: requirements analysis and design, maintaining ETL workflows, database management, application development, deployment and monitoring, data analysis and reporting

• Masters in Data Science and Business Analytics, with well amalgamated coursework from schools of Business, Computer Science and Statistics

• Experience working in machine learning including exploratory analysis, data cleaning, feature engineering, algorithm selection, cross validation, hyper-parameter tuning, regularization, model training and selection

• Expertise in data mining, statistical modeling, Bayesian analysis, information visualization, text analytics and basic intuitive knowledge in application of deep learning algorithms for information retrieval and natural language processing

• Working knowledge of the Hadoop ecosystem using Hive, MapReduce and Spark

• Commendable communication skills, professional outlook towards knowledge sharing and team work

Work Experience

Benefits Science Technologies Boston, MA

DATA ANALYST Aug. 2018 - Present

• Performed data integration with health insurance benefits data, using Pentaho DI tool

• Developed DI jobs and transformations for data preparation and storage in AWS environment

• Handled relational modelling, development of database scripts using AWS Redshift and BI reporting using Tableau

ProLytics LLC Charlotte, NC

DATA SCIENCE INTERN Jan. 2018 - Jun. 2018

• Performed advanced statistical analysis to implement machine learning algorithms on NBA Games data

• Applied autoML to model NBA Draft Projections using h2o

• Fetched player tweets using Twitter API and conducted sentiment analysis using python to determine if a player has had a positive or a negative event that may affect his performance on a given day

• Visualized data and analysis results as radar charts in a dashboard

HCL Technologies Bengaluru, India


• Application Development: Developed EJB module for J2EE web applications, performed data manipulation, handled database persistence to support portfolio management and trading desks

• Data Analysis: Analyzed data for visual presentation; Collaborated with user experience strategist and data visualization designer to generate dashboards

Infosys Limited Bhubaneswar, India

SYSTEMS ENGINEER Oct. 2011 – Mar. 2016

• Application Development and Maintenance: Prepared functional specifications; Developed and maintained J2EE applications; Developed producer and consumer web services (SOAP and RESTful); Analyzed application performance for vulnerability detection

• Data Engineering: Maintained ETL Workflows for storing semi-structured and unstructured data (flat files, csv, json, natcha, xml etc) in proper format for downstream querying and analysis

• Database Management: Developed and maintained stored procedures, functions, views, triggers, and cursors for hosted web applications; Developed software applications to facilitate data extraction tasks; Provided support for database modules

• Data Analytics and Reporting: Identified fraudulent transactions using Anomaly Detection; Created ad-hoc reports using complex SQL queries, scripts and stored procedures summarizing payments transactions; Automated reporting processes; Implemented data dashboards to consolidate and display the key indicators of payments process

Education

University of North Carolina at Charlotte Charlotte, NC

MASTERS in Data Science and Business Analytics Jan. 2017 - May 2018

• Emphasis on Data Mining, Applied Machine Learning and Deep Learning, Natural Language Processing, Database Systems, Advanced Business Analytics, Data Visualizations and Data Warehousing concepts

Biju Patnaik University of Technology Bhubaneswar, India

BACHELORS in Engineering (Electrical and Electronics) Jul. 2007 - Jul. 2011

• Emphasis on Object Oriented Programming, Data Structures, Relational Database Management System, Network Theory, Digital Signal Processing, Control Systems and Communications Engineering

Technical Skills

• PROGRAMMING: python, Spark, Java (Core, Spring, Web services (SOAP and REST))

• STATISTICAL ANALYSIS: Hypothesis Testing (t-test, chi square test), ANOVA

• MACHINE LEARNING: Classification, Regression, Clustering and Association rule mining and Anomaly Detection Algorithms, Dimensionality Reduction, Time Series Analysis, Ensemble Models and Hyperparameter Tuning, Regularization, Recurrent Neural Networks, Convolutional Neural Networks, Topic Modelling (LDA), Word2Vec, Sentiment Analysis

• VISUALIZATION: Tableau, D3, JavaScript, CSS

• DATABASES: Oracle, Microsoft SQL Server, MySQL, AWS Redshift

• TOOLS AND TECHNOLOGIES: Pentaho Data Integration, AWS S3, Jupyter, h2o, Anaconda, WebStorm, Hadoop Ecosystem, SAS E Miner, SAS E Guide, Eclipse, Toad, Websphere Application Server, SVN, Git

Academic Projects

• Text Analytics for Brand comparison using Amazon Reviews (python, beautifulsoup, scikit-learn, nltk, SAS, Tableau): Data scraping, preprocessing, POS tagging, stemming and lemmatization, tfidf calculation, sentiment analysis, topic modeling on review texts to perform brand comparison

• Prediction of Loan Status and Defaults for Lending Club (python, scikit-learn, Tableau): Exploratory analysis, feature engineering, handling imbalanced classes, feature selection, hyper parameter tuning using cross-validation, training and performance evaluation of various classifiers, utilizing 180000+ data points

• Determining Drug-Drug Interactions (DDI) (Azure, Google API, SOLR): Built an information retrieval system to fetch documents relevant to DDI and created indexes listing the drug pairs that cannot be given to patients at the same time

• Price recommendation for Airbnb Properties in Boston (SAS, python, Tableau): Exploratory analysis and built a prediction model to recommend prices for new listings for various localities in Boston; Analyzed key influencers for their impact on the prices

• Data Mining for Big Mart Retail Chain following CRISP-DM model (R, Tableau): Data Exploration, Missing value imputation, feature engineering, predictive modeling using linear regression and decision tree, confirmatory data analysis through hypotheses testing, association mining to group frequent item sets, using 1000+ data points, for business decision making and optimizing overall sales

• Hire Heroes Client Management (Teradata Competition) (SAS, R, Tableau): Extensive data cleaning, exploratory analysis, Bayesian analysis to compare conditional probabilities of events, using stepwise regression and decision trees for variable selection, thus recommending measures to increase efficiency using historical data (60000+ rows and 300+ columns)

• Visual Analytics for United Nations Development Program (D3.js, CSS, JavaScript, Tableau): Analysis of 5000+ data points and rendering of an interactive visualization system for projects in the 2015 Human Development Report (HDR) for 188 countries

Assistantships and Volunteering

• Graduate Teaching Assistant for Database Management Systems Course, UNC Charlotte: Provided outside-class guidance to 25+ students on database concepts using MySQL and Neo4j; handled assignments and grading; prepared course materials

• Volunteer for Analytics Frontiers Conference, Charlotte: Assisted event organizers and speakers (senior executives, entrepreneurs, software engineers, data scientists, and other influencers) for proper conduct of sessions on Artificial Intelligence and Deep Learning

• Active member in Data Science Initiative Club, UNC Charlotte
