Data Scientist Machine Learning

Location:

Richmond, VA

Posted:

May 26, 2025


Resume:

Pranay Reddy Anthareddy

Data Scientist

****************@*****.*** +1-301-***-**** Centerton, AR

SUMMARY

Data Scientist with around 5 years of experience in designing and deploying advanced machine learning and NLP models using Python, HuggingFace Transformers, and BERT for real-world applications such as ESG scoring and recommender systems. Proficient in data ingestion, feature engineering, and model evaluation leveraging tools like SpaCy, Selenium, Streamlit, and Azure Functions. Skilled in statistical analysis, text classification, and recommendation algorithms with strong expertise in cloud platforms including AWS, Azure, and GCP. Experienced in collaborating with cross-functional teams to align technical solutions with business goals, ensuring scalable and high-impact data-driven insights. Adept at leveraging SQL, Spark, and visualization tools like Tableau and Power BI to deliver actionable analytics and drive strategic decision-making.

SKILLS

Methodologies: SDLC, Agile, Waterfall

Language: Python, R, SQL, SAS, Java

IDEs/Database: Visual Studio Code, PyCharm, Jupyter Notebook, MySQL, SQL Server, Oracle, MongoDB

Statistical Methods: Hypothesis Testing, ANOVA, Time Series

Machine Learning: Regression Analysis, Bayesian Methods, Decision Trees, Random Forests, Support Vector Machines (SVM), Neural Networks, Sentiment Analysis, K-Means Clustering, KNN, Classification, Naive Bayes, Natural Language Processing (NLP), LLM, CNN, XGBoost, Deep Learning

Packages: NumPy, Pandas, Matplotlib, SciPy, ggplot2, Scikit-Learn, PyTorch, TensorFlow, Keras, Spark

Visualization/Other Tools: Tableau, Power BI, Jira, Microsoft Excel

Cloud Technologies: AWS (S3, EC2, ECR, SageMaker, Redshift, CloudWatch), GCP, Azure (Blob Storage, Virtual Machines, AI)

Software/Other Skills: Data Cleaning, Data Wrangling, Critical Thinking, Communication Skills, Presentation Skills, Problem-solving, Decision Making, EDA, Databricks, Data Visualization, Predictive Analytics, Pattern Recognition, JMP, Data Integrity, Quantitative Data, Data Science, Statistics, Statistical Analysis, Data Analytics, Data Modeling, BigQuery, Snowflake, Data Analysis, Data Mining, SAP, Mathematics, Computer Science, Programming

Operating System: Windows, Linux

WORK EXPERIENCE

JPMorgan Chase & Co., USA Data Scientist June 2024 - Current

• Developed and deployed a Natural Language Processing (NLP) pipeline to extract ESG-related indicators from unstructured company disclosures, news articles, and regulatory filings using BERT and SpaCy.

• Designed and trained a multi-label text classification model using HuggingFace Transformers to identify Environmental, Social, and Governance (ESG) factors aligned with SASB and UN PRI frameworks.

• Implemented data ingestion workflows using Selenium and BeautifulSoup for large-scale web scraping of SEC filings, sustainability reports, and ESG news data.

• Engineered features from text corpora using Named Entity Recognition (NER) and TF-IDF vectorization, improving ESG signal precision by 22%.

• Integrated ESG scoring outputs into JPMorgan’s investment analytics platform, enabling portfolio managers to assess ESG risk and impact during asset selection.

• Automated ESG report generation through Azure Functions, enabling scalable ESG data pipelines and real-time score updates for institutional investors.

• Collaborated with the investment strategy team to align the ESG model with ESG regulatory reporting standards, supporting sustainable finance initiatives.

• Conducted model performance evaluation using F1-score, Hamming Loss, and Precision@K, ensuring high accuracy and interpretability in ESG tagging.

Byteworks Solutions, India Data Scientist Jan 2020 - July 2023

• Designed and deployed a hybrid recommender system combining collaborative filtering (SVD) and content-based filtering using TF-IDF to personalize product suggestions for e-commerce users.

• Developed user-item matrix factorization models using the Surprise library to improve recommendation accuracy and reduce RMSE by 18%.

• Built a lightweight, interactive Streamlit app for real-time recommendation demos and stakeholder validation, enabling faster product feedback cycles.

• Performed A/B testing and user behavior analysis to assess recommendation effectiveness in increasing click-through rate (CTR) and conversion rate.

• Implemented cold-start handling strategies using user profile similarity and metadata features, improving new-user coverage by 26%.

• Conducted feature engineering and similarity scoring using TF-IDF vectors and cosine similarity to improve relevancy in content-based recommendations.

• Collaborated with product and engineering teams to align business objectives with ML recommendation logic, ensuring data-driven feature releases.

• Optimized model performance using cross-validation and grid search, deploying the solution on a scalable architecture suitable for integration with e-commerce platforms.

EDUCATION

Master in Management Information Systems University of Maryland, College Park, Maryland
