Prabhas Raj Tadesetti

+1-470-***-**** – ad0sgf@r.postjobfree.com – Alpharetta, GA 30004

GitHub: https://github.com/PrabhasRajT/Machine-Learning-models.git – LinkedIn: www.linkedin.com/in/prabhas-raj-t-59b47a263

Professional Summary:

A dedicated Data Analyst with over 4 years of experience in applied machine learning, data analysis, algorithm design, and statistical analysis. Experienced in developing ML models and conducting comprehensive data analysis, with a proven ability to derive insights from complex datasets and enhance business performance.

Proficient in using Python, Spark, and R programming languages.

Proficient in T-SQL, SSRS, SSIS, SPSS, SAS, Visual Basic (VBA), DAX expressions.

Proficient in designing, implementing, and operating large-scale distributed systems.

Strong knowledge of internal data fields and table structures; able to create automated reporting and dashboards and to perform normalization and query optimization.

Designs and develops low-complexity programs and tools to support the ingestion, curation, and provisioning of enterprise data for analytics and reporting using Crystal Reports.

Works effectively in group settings on the design of data collection instruments, database management, data analysis, and the creation of reports and dashboards.

Proficient in data-driven Python development with modules such as TensorFlow, Keras, NumPy, SciPy, Pandas, Matplotlib, PyTorch, and Scikit-Learn, as well as data visualization and reporting tools such as Power BI and Tableau.

Proven ability with statistical methods and advanced modeling techniques such as regression, decision trees, random forests, clustering, NLP, and text mining.

Experience working with large datasets and data warehouses (DW); particularly data mining, manipulation, modeling, and exploration.

Experience working with internal and external datasets and client reference data; provides analysis supporting the development of statistical, financial, and/or econometric models for asset performance, securities data, derivative pricing, risk exposure, and other sophisticated concepts.

Excellent written and verbal communication skills; able to present work to peers, cross-functional businesses, and senior management with a positive attitude and a hunger for learning.

Academic Qualification:

- Master of Science in Computer Science.

University of Illinois Springfield, Springfield, Illinois. January 2022 – May 2023. GPA: 3.9/4.0

Technical Skills:

Applied Machine Learning: Proficient in Python (NumPy, Pandas, Scikit-Learn, TensorFlow, PyTorch, XGBoost, Plotly, Seaborn) and R (qdap, caret, ggplot).

Algorithm Design and Fine-Tuning: Demonstrated ability to design and fine-tune ML and deep learning algorithms and to develop data wrangling and validation strategies. Proficient in Git.

ML Model Development: Experienced in developing, training, and testing ML models to solve complex business challenges.

Statistical Analysis: Proficient in running A/B tests, performing statistical analysis with SPSS and SAS Studio, and measuring ML model impact.

Natural Language Processing (NLP): Strong expertise in text representation (TF-IDF, GloVe, Word2Vec), algorithm selection, and NLP task implementation (see the TF-IDF sketch after this list).

Data Processing and Modeling: Skilled in building ETL pipelines using Spark, Scala, Hadoop, and SQL for data processing and modeling; experienced with SQL and NoSQL databases.

Experiment Design and A/B Testing: Familiar with designing experiments and building CI/CD pipelines.

Cloud Technologies: AWS, Azure, and GCP.

Experience working with databases such as Cassandra, MongoDB, or Teradata.

Familiar with Azure DevOps, Agile development, and Git.
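
As referenced in the NLP item above, a minimal illustration of the TF-IDF representation using Scikit-Learn; the toy corpus is made up for this sketch:

    # Toy TF-IDF sketch with Scikit-Learn; the three documents are hypothetical.
    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = [
        "the leaf shows brown spots",
        "healthy green leaf with no spots",
        "brown rot spreads across the leaf",
    ]

    vectorizer = TfidfVectorizer()           # tokenizes and computes tf-idf weights
    matrix = vectorizer.fit_transform(docs)  # sparse (n_docs, n_terms) matrix

    # Each row is a document vector; cosine similarity between rows can rank matches.
    print(vectorizer.get_feature_names_out())
    print(matrix.toarray().round(2))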

Academic Projects:

1) CNN based Leaf Disease Identification and Remedy Recommendation System

Developed an application using a Convolutional Neural Network (CNN) that recommends remedies for specific leaf diseases to farmers (users). Data is fetched from a .csv file containing a list of 100,000 leaf samples.

Convolutional and pooling layers repeatedly extract feature maps, and the network eventually outputs a label indicating the predicted class; a minimal sketch of such a network follows.
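
A hedged sketch of this kind of classifier in Keras; the 128x128 RGB input shape, layer widths, and num_classes are illustrative assumptions, not the project's actual architecture:

    # Hypothetical CNN classifier in Keras; all sizes are assumptions.
    from tensorflow.keras import layers, models

    num_classes = 10  # assumed number of leaf-disease labels

    model = models.Sequential([
        layers.Input(shape=(128, 128, 3)),
        layers.Conv2D(32, 3, activation="relu"),  # extract low-level feature maps
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),  # deeper feature maps
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),  # one probability per class
    ])

    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_images, train_labels, epochs=10)  # data loading not shown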

2) An Examination System Automation Using Natural Language Processing

Used the NLTK Natural Language Processing (NLP) toolkit with Python to build a working prototype of an examination system automation. Data is pre-designed in the form of questions and answers and compared against user-input answers.

The user data passes through several processing stages to feed a neural network model, which produces a test analysis of the input against the predefined dataset (a sketch of the answer-comparison step follows). Publication: https://jicrjournal.com/index.php/volume-13-issue-vi-june-2021/
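
A minimal sketch of the answer-comparison step using NLTK; the keyword-overlap scoring shown here is an illustrative simplification, not the project's actual neural pipeline:

    # Keyword-overlap scoring of a user answer against a reference answer.
    import nltk
    from nltk.corpus import stopwords

    nltk.download("punkt", quiet=True)      # tokenizer data
    nltk.download("stopwords", quiet=True)

    def keyword_set(text):
        stop = set(stopwords.words("english"))
        tokens = nltk.word_tokenize(text.lower())
        return {t for t in tokens if t.isalpha() and t not in stop}

    def score(reference, answer):
        ref, ans = keyword_set(reference), keyword_set(answer)
        return len(ref & ans) / len(ref) if ref else 0.0  # share of key terms matched

    print(score("Photosynthesis converts light energy into chemical energy",
                "It converts light into chemical energy"))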

3) TensorFlow Object Detection using TensorFlow 2.0.

Developed a detection model for identifying objects in a camera's video stream using TensorFlow, OpenCV, and Anaconda. Data is selected from the COCO dataset, which contains around 330K labeled images.

Implemented with the 'ssd_mobilenet_v2_320x320_coco17_tpu-8' model and developed in a Jupyter notebook using Python 3.7; a sketch of the inference loop follows.
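
A hedged sketch of such an inference loop with TensorFlow 2 and OpenCV; the model path is a placeholder pointing at the extracted TF2 Model Zoo download, and the confidence threshold and drawing details are illustrative rather than taken from the project:

    # Webcam inference with a TF2 detection SavedModel.
    import cv2
    import numpy as np
    import tensorflow as tf

    MODEL_DIR = "ssd_mobilenet_v2_320x320_coco17_tpu-8/saved_model"  # placeholder
    detect_fn = tf.saved_model.load(MODEL_DIR)

    cap = cv2.VideoCapture(0)  # default camera
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # The model expects a batched uint8 tensor [1, H, W, 3] in RGB order.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        out = detect_fn(tf.convert_to_tensor(rgb[np.newaxis, ...], dtype=tf.uint8))
        boxes = out["detection_boxes"][0].numpy()   # normalized [ymin, xmin, ymax, xmax]
        scores = out["detection_scores"][0].numpy()
        h, w = frame.shape[:2]
        for box, s in zip(boxes, scores):
            if s < 0.5:                             # illustrative confidence cutoff
                continue
            y1, x1, y2, x2 = (box * [h, w, h, w]).astype(int)
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.imshow("detections", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()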

Professional Experience:

Data Analyst

Sol IT Systems, Irving, TX. July 2023 – Present

Responsibilities:

Importing data and generating reports; performing data profiling, data validation, and cleansing with IC4.

Developing Python scripts for loading, extracting, and transforming data.

Improving the speed and efficiency of existing Hadoop algorithms.

Monitoring, diagnosing, and troubleshooting data pipeline issues to ensure data availability and reliability.

Developing comprehensive data reports that include research methodologies, key findings, and actionable recommendations.

Building and architecting multiple data pipelines (end-to-end ETL processes) for data ingestion and transformation in AWS Athena, and coordinating tasks among the team.

Scheduling all jobs using Airflow scripts written in Python, adding different tasks to DAGs and AWS Lambda (see the Airflow sketch after this list).

Utilizing Spark streaming APIs to perform on-the-fly transformations and actions on the data models (see the streaming sketch after this list).

Performing data transformation and data analysis tasks using Google Cloud BigQuery, Dataflow, and Compute Engine.

Assisting in the testing and debugging of modified programs, ensuring all logic paths are tested and predetermined results are verified.

Generating data visualization reports using Looker.

Involved in preparing project workflows, designs, requirements gathering, and code flow using requirement documents.

Building a program with Python and Apache Beam and executing it in Cloud Dataflow to run data validation between raw source files and BigQuery tables (see the Beam sketch after this list).

Monitoring BigQuery, Dataproc, and Cloud Dataflow jobs via Stackdriver across all environments.
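
As referenced in the Airflow item above, a minimal DAG sketch in Python; the dag_id, schedule, and task bodies are placeholders, not the production jobs:

    # Minimal Airflow 2.x DAG with two ordered Python tasks.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pull raw data")        # stands in for the real extract step

    def transform():
        print("clean and reshape")    # stands in for the real transform step

    with DAG(
        dag_id="example_etl",
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_extract >> t_transform      # extract runs before transform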
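
As referenced in the Spark streaming item above, a sketch of an on-the-fly transformation with Structured Streaming; the Kafka source, broker address, and topic name are assumptions for illustration:

    # Structured Streaming sketch: read a hypothetical Kafka topic, transform
    # and aggregate on the fly, print running counts to the console.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("stream-demo").getOrCreate()

    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder
              .option("subscribe", "events")                        # placeholder topic
              .load())

    # Kafka values arrive as bytes: cast to string, then group and count.
    counts = (events
              .select(F.col("value").cast("string").alias("raw"))
              .groupBy("raw")
              .count())

    query = (counts.writeStream
             .outputMode("complete")
             .format("console")
             .start())
    query.awaitTermination()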
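
As referenced in the Beam item above, a sketch of a Beam pipeline that compares row counts between a raw file and a BigQuery table; the project, bucket, table, and region names are all placeholders:

    # Beam pipeline comparing row counts between a raw CSV and a BigQuery table.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(runner="DataflowRunner", project="my-project",
                              region="us-central1",
                              temp_location="gs://my-bucket/tmp")  # placeholders

    with beam.Pipeline(options=options) as p:
        src = (p
               | "ReadCsv" >> beam.io.ReadFromText("gs://my-bucket/raw.csv",
                                                   skip_header_lines=1)
               | "CountSrc" >> beam.combiners.Count.Globally())
        bq = (p
              | "ReadBQ" >> beam.io.ReadFromBigQuery(table="my-project:my_ds.my_table")
              | "CountBQ" >> beam.combiners.Count.Globally())
        # Merge the two single-element counts and flag any mismatch.
        ((src, bq)
         | "Merge" >> beam.Flatten()
         | "ToList" >> beam.combiners.ToList()
         | "Check" >> beam.Map(lambda c: "OK" if len(set(c)) == 1
                               else "MISMATCH: %s" % c)
         | "Report" >> beam.Map(print))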

Data Analyst

Illinois Department of Revenue, Springfield, IL. August 2022 – May 2023.

Responsibilities:

Collaborated on algorithm design, fine-tuning, and enhancement of machine learning-driven features.

Worked on SQL scripts, data processing, and performance enhancement using PySpark and Spark SQL.

Gathered, analyzed, and drew conclusions from large, diverse data sets to identify problems and contribute to decision-making in service of secure, stable application development.

Leveraged advanced Transact-SQL (T-SQL) proficiency to design and optimize complex database queries, facilitating streamlined data retrieval and manipulation for informed decision-making.

Worked with resilient distributed datasets (RDDs) using Python and PySpark to transform, filter, load, and validate data as required (see the RDD sketch after this list).

Built ETL pipelines for data transformation and for loading data into Redshift (see the load-step sketch after this list).

Developed database queries to acquire needed data and optimized SQL queries to extract, transform, and load data from various sources.

Performed reconciliation tasks between qualified leads stored in the program CRM solution and the project database.

Identified opportunities to optimize forecasting models, the program database, and budget accruals; maintained business workflow documents.

Developed dashboards and reports to assist in solving business problems.

Assisted in documenting limited business and technical requirements and in coding system components.

Generated comprehensive tables and graphs using SAS Studio, documented robust data-integrity protocols, and provided regular updates to the team.

Documented and communicated required information for deployment, maintenance, support, and business functionality.

Designed dashboards and various reports using Tableau.

Executed procedures employing SSMS, SSIS packages, and containers to facilitate seamless data transformation.
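
As referenced in the RDD item above, a minimal PySpark sketch that transforms, filters, and validates records; the input path and the assumed three-field row layout (key, name, amount) are illustrative:

    # RDD-style transform / filter / validate / aggregate.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rdd-demo").getOrCreate()
    sc = spark.sparkContext

    lines = sc.textFile("input.csv")                   # placeholder path
    rows = lines.map(lambda l: l.split(","))           # transform: parse fields
    valid = rows.filter(lambda r: len(r) == 3 and r[2].isdigit())  # validate rows
    totals = (valid.map(lambda r: (r[0], int(r[2])))   # (key, amount) pairs
                   .reduceByKey(lambda a, b: a + b))   # aggregate per key

    for key, total in totals.collect():
        print(key, total)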
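
As referenced in the Redshift item above, a hedged sketch of one possible load step: a Redshift COPY from staged S3 data issued via psycopg2. The host, credentials, table, bucket, and IAM role are all placeholders:

    # Redshift load step: COPY staged S3 data into a table via psycopg2.
    import psycopg2

    conn = psycopg2.connect(host="example-cluster.redshift.amazonaws.com",
                            port=5439, dbname="analytics",
                            user="etl_user", password="REDACTED")  # placeholders
    with conn, conn.cursor() as cur:   # 'with conn' commits the transaction
        cur.execute("""
            COPY sales_stage
            FROM 's3://my-bucket/transformed/sales/'
            IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopy'
            FORMAT AS CSV IGNOREHEADER 1;
        """)
    conn.close()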

Tools: AWS Athena, EMR, S3, Redshift, SQL, MS SQL Server, Airflow, SAS Access, Tableau, MS Office Suite, Visual Studio.

Data Analyst Intern

SNS Solutions. July 2019 – December 2021

Responsibilities:

Wrote technical specifications for new software development based on user requirements.

Developed Python scripts tailored to specific data analysis tasks.

Designed and implemented backend models and API endpoints for complex scientific workflows.

Collected, cleaned, and analyzed customer data using various tools and techniques to identify patterns, trends, and correlations.

Served as an organizational consultant on matters relating to databases, providing expertise to help users meet their needs.

Built and architected multiple data pipelines (end-to-end ETL processes) for data ingestion and transformation in AWS Athena, and coordinated tasks among the team.

Assisted with pilots for Business Intelligence tool upgrades, innovative tool evaluations, and configuration of metadata for Business Intelligence tools.

Employed SPSS and Excel to construct dashboards, data dictionaries, and charts for in-depth data analysis.

Conducted A/B tests and refined models based on the results of qualitative and quantitative data analysis (see the sketch after this list).

Translated user requirements into well-defined software and system designs.

Collaborated with the QA team, resolved issues, and created performance dashboards in Tableau.

Debugged and tested (unit and functional) developed software to minimize the cost of errors.

Prepared diagrams and documentation to illustrate application functionality.

Scheduled and monitored tasks using Apache Airflow.
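
As referenced in the A/B testing item above, a sketch of a two-proportion z-test with statsmodels; the conversion counts and sample sizes are made-up illustrative numbers:

    # Two-proportion z-test for an A/B experiment; counts are hypothetical.
    from statsmodels.stats.proportion import proportions_ztest

    conversions = [412, 380]   # successes in variants A and B (hypothetical)
    samples = [5000, 5000]     # users exposed to each variant (hypothetical)

    stat, p_value = proportions_ztest(conversions, samples)
    print("z = %.2f, p = %.4f" % (stat, p_value))
    if p_value < 0.05:
        print("difference is significant at the 5% level")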

Tools: AWS Athena, EMR, S3, Redshift, MS SQL Server, Tableau, SAS Enterprise Miner, MS Office Suite.


