C S (Candidate Initials) Email : **@**-*****.***
Corp to Corp Only Phone # 973-***-****
Cannot work with multiple layers
SUMMARY
10 years of overall IT experience, with hands-on experience as a Data Scientist, Software Test Engineer, and Database Administrator, covering analytics, data mining, and statistical analysis.
Experience with biological data sets (genomic, transcriptomic, microbiome, etc.).
Implemented various methods for the analysis and interpretation of genomic data sets. Worked on DNA and RNA sequencing data analysis for different human and animal diseases.
Solid background in data analytics and data warehousing.
Gleaned insights from data using computational tools (Python) and machine learning algorithms.
Performed data analysis using Python libraries such as Pandas and Matplotlib.
Profound knowledge of Machine Learning algorithms such as Linear, Non-linear, and Logistic Regression, Natural Language Processing, Random Forests, Ensemble Methods, Decision Trees, Gradient Boosting, k-NN, SVM, Naïve Bayes, Clustering (k-means), and Deep Learning.
Intermediate-level R analytics expertise (exploratory analysis using base graphics and the ggplot2 package).
Strong skills in data extraction, data cleaning, data loading, statistical data analysis, exploratory data analysis, data wrangling, and predictive modelling using R and Python, and in data visualization using Power BI.
Statistical analysis and visualization skills gained through a graduate assistantship at IU hospital and a Master's-level university program.
Performed Python scripting for high-definition plots and graphics.
Good experience using various Python libraries (Beautiful Soup, NumPy, SciPy, Matplotlib, Pandas).
Effective interpersonal skills to interact professionally with a diverse group, including executives, managers, and subject matter experts.
Hands-on experience with Git, AWS, and Jira.
Great team player, able to work collaboratively and independently as required.
SKILLS
Operating Systems – Windows, UNIX, Linux, Mac OS X.
Data Modelling – ER Studio, Star Schema, Snowflake Schema.
Methodologies – Waterfall/Agile Software Development
Statistics – Hypothesis testing, estimation, probability theory, time-series analysis, statistical modeling.
Machine Learning – Algorithms for Regression (Linear, Logistic), Classification (Decision Trees, Random forest, XGBoost, SVM, Naïve Bayes, k-NN), PCA, Clustering (k-means, Hierarchical).
R - Implemented ML Algorithms, used packages like dplyr, glmnet, ggplot2, caret, Boruta, missForest, mice, dummies.
Python - ML Algorithms, using packages like NumPy, Pandas, SciPy, Scikit-Learn, statsmodels, Matplotlib, Seaborn.
Natural Language Processing – Text Processing, Web Scraping, Sentiment Analysis, Regular Expressions, NLTK.
Deep Learning – Neural Networks, RNNs, CNNs, LSTMs; TensorFlow, Keras.
SQL - Performing Basic Queries, Sub-queries, Joins, Aggregation, Statistical Functions.
Qlik Sense and Tableau – Data Visualization, Business Intelligence, Forecasts, Tables, Charts, Dashboards
Big Data – Basic knowledge of Hadoop, MapReduce, and Spark, along with other tools in the ecosystem.
EXPERIENCE
MAY 2017 TO PRESENT
DATA SCIENTIST, PHARMA LYNX, PRINCETON, NJ
Performed data analysis using Python libraries such as Pandas and Matplotlib.
Applied intermediate-level R analytics to genomic datasets (exploratory analysis using base graphics and the ggplot2 package).
Applied Machine Learning algorithms for classification and regression, including k-NN, Decision Tree, Naïve Bayes, Logistic Regression, and SVM models.
Used Atlassian suite (JIRA) for defect life cycle management of all reports.
Worked with data formats such as JSON and XML, and implemented machine learning algorithms in Python.
Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization; also performed gap analysis.
Performed data visualization with RStudio and Python, and generated dashboards to present the findings.
Used the Pandas library for statistical analysis.
Communicated results to the operations team to support decision-making.
Gathered data needs and requirements by interacting with other departments.
MARCH 2016 TO MAY 2017
DATA SCIENTIST, NEXTGEN SCIENCES INC, ANN ARBOR, MI
Communicated and coordinated with other departments to gather business requirements.
Gathered the required data from multiple data sources and created the datasets used in analysis.
Performed Exploratory Data Analysis and Data Visualizations using R and Python.
In the preprocessing phase, used Pandas and Scikit-Learn to remove or impute missing values, detect outliers, scale features, and apply feature selection (filtering) to eliminate irrelevant features.
Conducted Exploratory Data Analysis using Matplotlib and Seaborn in Python to identify underlying patterns and correlations between features.
Used Python (NumPy, SciPy, Pandas, Scikit-Learn, Seaborn) to develop a variety of models and algorithms for analytic purposes.
MARCH 2010 TO JULY 2016
DATABASE ADMINISTRATOR, ALIS STORES 3 INC, PANAMA CITY
•Responsible for all telecommunications systems and computer equipment, and the interaction between the two.
•Installed updated software and supported the company’s network segment and internet system.
•Responsible for maintaining the network hardware and software, including analyzing problems and monitoring the network to ensure availability to all system users.
•Made recommendations for upgrades to hardware and software.
•Used software applications to develop, maintain, administer, troubleshoot, tune, and upgrade the databases used in retail store chains, and devised backup and disaster recovery strategies.
•Defined and developed the production and distribution of business-related reports for company use.
•Ensured that incoming data reporting was complete and comprehensive for the end user.
JANUARY 2006 TO APRIL 2008
SOFTWARE TEST ENGINEER, PANTALOON RETAIL (INDIA) LIMITED, INDIA
•Worked on white and black box testing and sanity testing.
•Developed testing programs that address areas such as database impacts, software scenarios, regression testing, negative testing, error or bug retests, or usability.
•Documented software defects, using a bug tracking system, and reported defects to software developers.
•Planned test schedules or strategies in accordance with project scope or delivery dates.
•Reviewed software documentation to ensure technical accuracy, compliance, or completeness to mitigate risks.
EDUCATION
Herguan University, California 2008-2010
Master’s in Computer Science
Andhra University 2000-2004
Bachelor of Engineering, Electronics and Communications Engineering