SREEJA SINHA ROY
email@example.com www.linkedin.com/in/sreejasinharoy 716-***-**** www.github.com/sreeja50
SKILLS AND CERTIFICATIONS
Certifications: Tableau Desktop Specialist (November 2019), AWS Solutions Architect Associate (March 2020)
Machine Learning: Regression (Linear Regression, Polynomial Regression), Classification (Logistic Regression, Support Vector Machine), Clustering (K-means, Hierarchical), Anomaly Detection (One-class SVM), Regularization (LASSO, Ridge, ElasticNet)
Data Analytics: Descriptive Analytics, Predictive Analytics, Data Cleaning, Exploratory Data Analysis, Feature Extraction
Programming Languages and Tools: Python, SSIS, Tableau, SQL
Database: Oracle 11g/12c (SQL), MySQL
Environment: AWS, Linux
University at Buffalo, The State University of New York
Master of Science in Management Information Systems (STEM Designated) June 2020
●Relevant Coursework: Database Management Systems, Predictive Analytics, Statistics, Data Visualization, Web Analytics
West Bengal University of Technology
Bachelor of Engineering in Electronics and Telecommunications June 2015
Tata Consultancy Services Kolkata, India
Data Analyst (System Engineer) August 2015 – May 2019
Client - Électricité de France (EDFE)
●Developed a platform from scratch that business analysts used to build customer profiles, track the customer journey, and predict customer churn. Collected data from sources such as SAP, Oracle GRM, and Excel sheets.
●Helped build the Customer Data Platform using SQL and ETL tools (Talend).
●Generated compliance reports in Tableau and submitted them to the government regulator Ofgem (Office of Gas and Electricity Markets).
●Used Amazon S3 as a landing zone for loading data and Amazon Redshift to perform ETL operations.
●Installed Talend servers on several Amazon EC2 instances to pull data from S3 and process it.
Client - PepsiCo
●Developed stored procedures and triggers for automation and used SQL queries to reduce error rates by 15% and increase efficiency by 20%.
●Collaborated with cross-functional teams to understand requirements and communicated them to technical teams.
●Created Informatica ETL components (mappings, sessions, workflows) to load pharmaceutical data into the EDW staging and core layers from multiple source systems such as SAP and JDE; helped maintain a master data repository for the client data.
●Performed system integration testing of the data after executing the ETL components of the EDW staging and core layers.
●Performed root cause analysis on production issues and implemented permanent fixes for more than 100 recurring issues.
●Provided consultative input to the business through periodic presentations and statistical performance reports; created interactive Tableau visualisations and dashboards that gave clients insights to support strategic decisions.
●Worked in a cross-functional team on ad-hoc requests. Completed several change requests as part of support activities to enhance system performance by 90%.
ANALYTICS AND MACHINE LEARNING PROJECTS
Buffalo, New York Neighbourhood Analysis Third Estate Analytics July 2019 - present
●Met with the client weekly to understand the problem statement of ranking Buffalo neighbourhoods compared to second-tier cities.
●Scraped and cleaned data (2.3 million records) using pandas; merged datasets and labelled the data.
●Created geographical plots of parking violations (2008-2019) for each street across Buffalo using Tableau and GeoPandas.
●Improved the model’s predictions with a random forest classifier that ranks neighbourhoods by increase in parking violations, and checked whether the effect holds across all cities, which informed the client’s real estate investments in developing neighbourhoods.
Buffalo Region Sexual Crimes Analysis Based On Region and Time December 2019
●Analysed the Buffalo crime dataset to understand which regions are safer for women.
●Cleaned data (over 1 million records) using pandas and analysed crimes by region and time.
●Created interactive visualisations using matplotlib to understand crime trends over the years and across regions.
●Performed latitude and longitude computations to divide the whole Buffalo region into four parts and analysed the overall sexual crimes in each region over the years.
Analysis and Forecasting of Thyroid Dataset Predictive Analysis Tableau December 2019
●Analysed the federal government’s official thyroid dataset. Performed data quality assurance activities such as cleaning the data for analysis.
●Performed anomaly detection with a one-class Support Vector Machine (SVM): trained the model on features corresponding to the majority class in the thyroid dataset, achieving a recall of 96%, a precision of 3%, and an F1-score of 6%.