Pavani Pulipelli
217-***-**** ******.*@************.*** Charleston, IL
Summary
Data Scientist with 3+ years of experience in data analysis, machine learning, statistical modeling, and predictive analytics, leveraging advanced algorithms and data-driven insights to optimize business strategies, enhance decision-making, and drive innovation.
Expertise in Linear Regression, Logistic Regression, Decision Trees, and Random Forests for developing predictive models, optimizing algorithms, and extracting actionable insights from complex datasets.
Skilled in CNN, LangChain, Hugging Face Transformers (BERT, GPT-3), NLP, Generative AI, and Large Language Models, as well as AI-driven solutions, natural language processing applications, and deep learning models for advanced analytics.
Proficient in AWS, Azure, Tableau, and Power BI for cloud-based data analysis, statistical hypothesis testing, and data visualizations that drive informed decision-making.
Experienced in Matplotlib, Scikit-learn, XGBoost, and SQL Server. Skilled in data visualization, machine learning model development, and efficient data management for predictive analytics.
Skills
Languages: Python, R, SQL
IDEs: Jupyter Notebook, Google Colab
Machine Learning: Linear Regression, Logistic Regression, Decision Trees, Random Forests, Naive Bayes, SVM
Deep Learning: CNN, ANN, RNN, LSTM, LangChain, Hugging Face Transformers (BERT, GPT-3), NLP
AI Technology: Generative AI, Large Language Model (LLM)
Cloud/Visualizations: AWS (SQS, SNS, S3, EC2, CloudWatch, CloudFormation), Azure (Azure DevOps, Azure App Service, Azure Functions, Azure Kubernetes Service (AKS)), Tableau, Power BI, Excel, Looker
Statistical Techniques: Hypothesis Testing, Data Visualization, Data Modelling, A/B testing, Model Evaluation
Packages and Frameworks: NumPy, Pandas, Matplotlib, Scikit-learn, Seaborn, TensorFlow, Keras, NLTK, XGBoost, PyTorch
Database: MySQL, PostgreSQL, MongoDB, SQL Server
Education
Master of Science in Computer Technology, Eastern Illinois University, Charleston, IL
Bachelor of Technology - Computer Science and Engineering, Jawaharlal Nehru Technological University, Hyderabad, India
Experience
MetLife, IL Jan 2024 – Present
Data Scientist
Applied machine learning techniques to analyze large datasets, uncovering hidden patterns and insights that drove significant positive outcomes, including a 25% improvement in key performance indicators (KPIs).
Deployed high-accuracy CNN models for image classification and object detection as part of an automated quality control system, increasing defect detection rates and reducing false positives by 65%.
Optimized AI-powered chatbots using advanced NLP techniques and deep learning frameworks such as TensorFlow, PyTorch, and spaCy, improving chatbot response accuracy by 45%.
Designed scalable LangChain pipelines on Azure, integrating with Azure Cognitive Services, Azure OpenAI, and Azure Synapse Analytics, reducing data retrieval time by 30% and improving model inference efficiency.
Crafted deep learning models (BERT, GPT-3) to enhance the performance of NLP applications, including text generation, achieving a 20% improvement in overall model accuracy.
Zensar Technologies, India Aug 2020 – Nov 2022
Jr. Data Scientist
Improved XGBoost model performance by 25% through meticulous hyperparameter tuning, using grid search, random search, and cross-validation.
Enhanced data analysis efficiency by 30% through the strategic design and implementation of optimized data models within Power BI, leveraging DAX (Data Analysis Expressions) for complex calculations and real-time analytics.
Established a suite of high-performing machine learning models by leveraging a diverse set of Scikit-learn algorithms, including Decision Trees, Random Forests, and Gradient Boosting techniques.
Implemented a real-time data pipeline using AWS services, including Lambda and S3, to streamline data ingestion, transformation, and storage and to automate data processing workflows, reducing latency and improving data processing speed by 20%.
Developed high-performing deep learning models using TensorFlow for time series forecasting, leveraging advanced architectures to capture complex temporal dependencies.
Performed exploratory data analysis (EDA) using interactive, highly customizable Matplotlib visualizations, enabling stakeholders to identify key data trends 40% faster.
Streamlined data extraction from PostgreSQL by developing and executing complex SQL queries, significantly improving query performance and data retrieval speed.
Developed complex SQL queries to efficiently extract, transform, and load (ETL) data from SQL Server databases, optimizing data processing workflows for analytical purposes.