Snehith Chelamallu
Data Analyst
+1-940-***-**** **********@*****.***
PROFESSIONAL SUMMARY
Data Analyst with four years of experience in data analysis, data science, machine learning, and data visualization. Demonstrated success in building scalable data pipelines, developing predictive models, and deriving insights from large datasets. Proficient in statistical analysis, data preprocessing, and advanced SQL scripting to improve business operations and efficiency. Looking to apply Python skills, machine learning knowledge, and data visualization expertise in a progressive organization.
TECHNICAL SKILLS
Programming Languages: Python, SQL (advanced), SAS, C++, Java
Machine Learning: Regression, Classification, Clustering, Ensemble Learning, Neural Networks, Deep Learning, NLP, Generative AI
Data Manipulation: NumPy, Pandas, PySpark
Visualization Tools: Tableau, Power BI, QuickSight, Matplotlib, Seaborn, Plotly, Excel (PivotTables, VLOOKUP, INDEX/MATCH, HLOOKUP)
Big Data & Cloud: AWS (S3, Redshift, Glue, SageMaker, IAM), Hadoop, Hive, Spark, Databricks, Snowflake
Databases: MySQL, PostgreSQL, MongoDB, Cassandra, SQLite
DevOps & CI/CD: Docker, Kubernetes, Jenkins, Git
Build Tools and IDEs: VS Code, Visual Studio, Eclipse
API Documentation Tools: Postman
Other Skills: Data Structures and Algorithms, Linux, A/B Testing, Statistical Hypothesis Testing, Time Series Analysis, Feature Engineering, Data Wrangling, Predictive Modeling, Dimensionality Reduction, Text Analytics
EDUCATION
Master of Science in Information Systems and Technologies, GPA: 3.9 Jan 2022 – Dec 2023 University of North Texas
Coursework: Big Data Analytics, Data Visualization for Analytics, Enterprise Data Warehousing, Data Mining, Enterprise Applications of Business Intelligence, Predictive Analytics & Business Forecasting, Foundations of Database Management, Programming Languages for Business Analytics.
PROFESSIONAL EXPERIENCE
Data Analyst JP Morgan Chase Feb 2024 – Present
Spearheaded the development of ETL pipelines using Python, AWS Redshift, Glue, and S3, ensuring seamless integration of loan repayment data across systems.
Developed advanced SQL queries and views for real-time reporting in Tableau, enabling the creation of dynamic dashboards that reduced stakeholder interpretation time by 20%.
Applied machine learning algorithms (Random Forest, Logistic Regression) for customer behavior prediction, achieving a 15% improvement in accuracy and increasing quarterly revenue by $50,000.
Extracted and transformed data from SQL databases and AWS (S3, Redshift), enabling advanced analysis of loan repayment strategies and improving collection rates by 12%.
Executed predictive modeling for loan default risks using Python (Scikit-learn) and Tableau, achieving a precision score of 92%.
Tech Stack: Python, SQL, Tableau, AWS (Redshift, Glue, S3), Scikit-learn, Pandas, NumPy, Plotly, Seaborn, Matplotlib, PySpark
Data Analyst Sun Micro Info solutions Jul 2020 – Oct 2021
Conducted detailed analysis of sales trends, inventory data, and regional performance using Python libraries (Pandas, NumPy, Matplotlib), enabling a 15% increase in annual revenue.
Built advanced models using Random Forests to predict CI/CD pipeline behavior, improving release reliability by 20%.
Implemented predictive models to refine pricing strategies, boosting revenue growth by 12%.
Produced reports on inventory levels and customer satisfaction scores, using Tableau and Python for visualization and SQL for querying databases.
Spearheaded a predictive maintenance system with Random Forest models and AWS SageMaker, improving on-time maintenance execution by 25% and reducing downtime by 20%.
Tech Stack: Python, SQL, Tableau, Scikit-learn, AWS SageMaker, Pandas, Matplotlib
Data Analyst Ample technologies Jun 2018 – Jul 2020
Designed and implemented a modular solution to identify erratic test cases within the CI/CD pipeline, improving testing reliability by 20%.
Developed advanced Python scripts to analyze test case execution history, leading to a 20% increase in testing accuracy and reliability.
Created informative reports and dashboards in Tableau, providing actionable insights to project teams and stakeholders.
Developed machine learning models to analyze test case behavior and predict intermittency, leading to more efficient CI/CD pipelines.
Led data cleaning and feature scaling efforts, ensuring data integrity and accuracy in predictive modeling.
Optimized database performance, resulting in a 15% reduction in query execution time and improved data accessibility.
PERSONAL PROJECTS
Chronic Kidney Disease Prediction
Developed a predictive model to assist in diagnosing chronic kidney disease using supervised learning algorithms, including Logistic Regression, Random Forest, and SVM.
Carried out extensive data preprocessing with Pandas to handle missing values, normalize data, and encode categorical variables.
Optimized the model's performance using feature engineering, hyperparameter tuning with GridSearchCV, and cross-validation to achieve an accuracy improvement of 25%.
Combined the top-performing models into an ensemble using a single-neuron perceptron, further boosting prediction efficiency by 20%.
Visualized insights into the dataset and feature importance using Matplotlib and Seaborn.
Environment: Python, Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn, Jupyter Notebook
IPL Score and Performance Analysis Dashboard
Launched an interactive Tableau dashboard of IPL match statistics, integrating data points that supported more informed discussions in team strategy meetings and improved performance review processes.
Aggregated and cleaned data using SQL and Python, ensuring accuracy and consistency for effective visualization.
Generated advanced filters and interactive charts, allowing users to explore data by teams, players, venues, and match types.
Highlighted metrics like points tables, player contributions, and dismissal types, enabling stakeholders to identify key insights and patterns.
Environment: Tableau, Python, Pandas, SQL
Customer Churn Prediction
Engineered a machine learning model to predict customer churn for subscription-based services using Logistic Regression and Random Forest.
Executed comprehensive exploratory data analysis (EDA) to uncover trends in customer behaviors and their correlation with churn.
Enhanced prediction accuracy by implementing feature selection, data normalization, and hyperparameter tuning.
Created a user-friendly Tableau dashboard to visualize customer segments and risk scores, providing actionable insights for churn reduction strategies.
Environment: Python, Pandas, NumPy, Scikit-learn, Seaborn, Tableau