UDAY RAM V
***********@*****.*** https://www.linkedin.com/in/uday-ram-v
CAREER SUMMARY
A data science and programming enthusiast aiming to automate and optimize every recurrent process possible while working on something worthwhile, with a strong foundation in software development and proficiency in Python, C, C++, Java, SQL, and NoSQL. Experienced in building high-dimensional statistical machine learning models, data analysis, data preprocessing, data visualization, automation applications, and web applications.
TECHNICAL SKILLS
Programming Languages: Python, Java, C, C++, R, JavaScript, Shell Script.
Databases: MySQL, PL/SQL, Neo4j (graph database), NoSQL, Postgres, Snowflake (SnowPro certified).
Machine Learning and Deep Learning: SAS programming, Pandas, NumPy, Scikit-Learn, Keras, TensorFlow, PyTorch, LLMs, NLP, FCN, CNN, DNN, MATLAB, Llama, Mistral.
Frameworks and Libraries: Keras, TensorFlow, scikit-learn, PyTorch, Hugging Face, GPT, BERT, Flask, Django, Pytest, PyUnit.
Cloud Services: GCP Clusters, GCP BigQuery, GCP Dataproc, AWS Data Lake, AWS Lambda, AWS Key Vault, Azure Data Lake, Azure Key Vault.
Big Data Technologies: Hive, Hadoop, Spark, Kafka, Data Structures.
Software Tools: Tableau, OpenRefine, RapidMiner, Power BI, OpenCV, Microsoft Office, Google Docs/Sheets, Bitbucket, Confluence, Jira, Docker, Jenkins, IntelliJ IDEA, PyCharm, VS Code, Git, Airflow, Vision, Graylog.
EDUCATION
University of North Texas, Denton, Texas Aug 2022 – Dec 2023
Master of Science in Data Science, GPA: 3.8
Relevant Coursework: Applied Machine Learning, Statistical models, Data Modeling, Data Analysis, Data Mining, Data Visualization.
SRM Institute of Science and Technology, India June 2015 – July 2019
Bachelor of Technology in Computer Science and Engineering, GPA: 3.5
Relevant Coursework: Python Programming, Advanced Data Structures, Machine Learning Algorithm Implementation.
PROFESSIONAL EXPERIENCE
ML Engineer/Data Analyst at Ford, TX, USA Aug 2023 – Present
Technologies: Python, Data Analysis, Machine Learning Models, Deep Learning, Tableau, Deployment
Worked on datasets with data pre-processing, data preparation, and data transformations using Pandas, NumPy, Scikit-learn, Seaborn, Matplotlib, and Stats.
Engineered features from structured and unstructured data sources, leveraging domain knowledge and techniques such as text vectorization, image augmentation, and time series decomposition.
Conducted comprehensive data analysis and preprocessing techniques, including missing value imputation, outlier detection, and feature scaling, using Pandas and NumPy.
Developed end-to-end machine learning solutions, from data preprocessing and feature engineering to model training and evaluation, spanning classification, regression, clustering, reinforcement learning, convolutional neural networks (CNNs), natural language processing (NLP), and LLMs, using Python with scikit-learn, Keras, TensorFlow, and PyTorch.
Applied cutting-edge AI technology to solve real-world problems using the OpenAI APIs together with GPT (Generative Pre-trained Transformer) and BERT (Bidirectional Encoder Representations from Transformers) models for tasks such as language translation, text generation, and summarization.
Designed and developed Large Language Models (LLMs) using state-of-the-art deep learning techniques, leveraging TensorFlow, PyTorch, and Hugging Face Transformers to build custom language models tailored to specific business needs.
Worked on image segmentation using FCN architectures, enabling accurate pixel-level predictions for tasks such as autonomous driving scene interpretation.
Deployed machine learning models into a variety of environments using Docker and Kubernetes for scalability and reproducibility, and integrated models into CI/CD workflows for validation and deployment using Jenkins.
ML Engineer at NielsenIQ Company Pvt Ltd, India June 2018 – July 2022
Technologies: Python, SQL, Neo4j (graph database), Machine Learning, Snowflake, Postgres.
Worked in Python, utilizing libraries such as scikit-learn, TensorFlow, and Keras to design, build, and deploy predictive models on large datasets.
Developed a churn prediction model with scikit-learn and PyTorch by training machine learning algorithms including logistic regression, random forest, and gradient boosting on a large dataset to identify customers at risk of churn, and deployed the model using Flask for real-time predictions.
Evaluated model performance using cross-validation and metrics such as accuracy, precision, recall, F1-score, and ROC-AUC, and optimized hyperparameters via grid search, random search, and Bayesian optimization.
Fine-tuned pre-trained language models (LLMs) including BERT, GPT, and RoBERTa.
Developed and deployed end-to-end natural language processing (NLP) pipelines using the LangChain framework with models such as Llama and Mistral and the OpenAI APIs, integrating pre-trained language models (LLMs) for tasks such as text generation, sentiment analysis, and named entity recognition (NER).
Developed and deployed performant statistical models for various applications, including time series forecasting and real-time data analysis.
Utilized techniques such as Integrated Gradients, LIME (Local Interpretable Model-Agnostic Explanations), and SHAP (SHapley Additive exPlanations) to interpret BERT model predictions and provide explanations for decision-making processes.
Leveraged AWS cloud services such as Amazon SageMaker, Amazon EC2, and Amazon S3 for scalable and cost-effective machine learning model training and deployment.
Hands-on experience with unstructured data processing, leveraging Natural Language Processing (NLP) techniques for text data analysis.
Explored Generative AI models, including Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs), for generating synthetic data and enhancing enterprise applications.
Familiar with setting up and configuring CI/CD pipelines for Python applications using Terraform, Jenkins, Docker images, and Kubernetes.
Created DAGs and REST APIs for triggering dataset ingestion, import, and fetch operations using a Python Flask application in Airflow and Vision.
Set up, built, and deployed applications in various environments with automated shell scripts using Jenkins.
Owned and developed an automation Flask application from scratch to seamlessly implement RESTful APIs, specializing in large-scale data operations including data collection, preparation, transformation, and ingestion using Python with Snowflake and Postgres.
Developed and managed an automated testing framework for all the REST APIs and basic JWT token generation features.
Data Analyst Intern at NielsenIQ Company Pvt Ltd, India Feb 2018 – May 2019
Technologies: Python, Data Analysis, Machine Learning Models, Automation Framework
Worked on datasets with data pre-processing, data preparation, and data transformations using Pandas, NumPy, Scikit-learn, Seaborn, Matplotlib, and Stats.
Implemented and applied various machine learning algorithms/models, including linear and logistic regression, SVM, k-NN, and NLP techniques, to find patterns in the prepared data.
Used data visualizations to present the patterns found by the various models applied to the data.
Developed an automation testing framework in Python for testing a large number of APIs and the fundamental operations of an entire application.
PROJECT EXPERIENCE
Product Review Aspect Ranking
Aim – to produce a consolidated and meaningful opinion from all the reviews of a product.
Implemented a model using text classification, semantic analysis, tokenization, and other NLP algorithms to identify the aspects of product reviews, using Python, Scikit-Learn, Pandas, and NumPy.
Applied the model to the reviews of a few products for classification, achieving an 87% success rate.
Image Recognition
Aim – to identify an image based on a set of training images of animals.
Performed image preprocessing to convert all images to single-color images with a proper outline, using Python and Scikit-Learn.
Implemented the K-Means algorithm on the preprocessed dataset, resulting in an 89% accuracy rate for the model.
Exploratory Data Analysis on GDP Dataset
Aim – to perform analysis and identify various patterns involved in the dataset.
Conducted thorough data exploration on a GDP dataset from the World Development Indicators, revealing insights and presenting them with various visualizations using Python.
Data Analysis in Google Cloud Platform (GCP) and Tableau
Created VM instances in GCP Clusters to upload and use the dataset from BigQuery for data analysis.
Initiated Spark and Hive instances and performed queries for data analysis on the imported dataset using Spark SQL and HiveQL.
Preprocessed the data in the OpenRefine tool using diverse options and visualized it in various plots and graphs in Tableau.
INTERESTS
Programming, Applied Machine Learning, Automation, Data Analysis, Data Engineering
ACHIEVEMENTS
Successfully built an iris recognition model using Daugman's algorithm with 80% accuracy.
Improved business logic and performance, saving the organization's budget by reducing a large-scale data ingestion process from 12 hours to 2:40 through an automated PL/SQL and Python pipeline that chunks, transforms, and ingests the data.
Developed end-to-end automation of the data ingestion process from scratch: it fetches data from ADLS and Postgres, performs transformations using Pandas, ingests into Snowflake to perform pivot operations, and then uploads the results back to ADLS.