Software Engineer Data Analytics

Location:

San Jose, CA

Posted:

September 11, 2024


Resume:

SOWMYA KURUBA

510-***-**** | ******.******@****.*** | sowmya-kuruba | sowmyakuruba20 | portfolio

EDUCATION

M.Sc. in Data Analytics, San Jose State University (GPA: 3.9, top 1%) Jan 2023 – Dec 2024

• Research Assistant: Presented “Enhancing Bay Area Rapid Transit Utilization using Neo4j and XGBoost” for ridership prediction and route optimization at the NODES 2023 Conference (Conference) (GitHub).

• Teaching Assistant: Mentored over 240 students in Mathematics and Statistics, Databases, NoSQL, and Big Data.

B.Eng. in Computer Science, Visvesvaraya Technological University

EXPERIENCE

Walmart Global Tech, Personalization Sep 2021 – Oct 2022

Software Engineer III

• Collaborated cross-functionally to integrate Favorites personalization features into Walmart’s digital experiences, increasing recommendation-driven click-through and conversion rates by 20%.

• Engineered a high-performance data ingestion pipeline for Walmart using Apache Spark, GCP, and Apache Airflow, reducing processing time by over 75% and enabling concurrent execution of 13 jobs.

• Built 5+ interactive analytical dashboards with KPIs and insightful reports using Tableau and Splunk, leading to improved data-driven decision-making and reduced bug resolution time.

• Streamlined application deployment processes by integrating Kubernetes CI/CD pipeline configurations with Java, in collaboration with a cross-functional DevOps team, which resulted in a 50% reduction in manual deployment errors.

• Developed and implemented comprehensive JUnit testing and sanity checks for the Favorites microservices domain, enhancing code quality and reliability while reducing bug occurrence by 30%.

• Enhanced Favorites microservices architecture using software design patterns, and actively participated in Agile, Software Development Life Cycle and Scrum methodologies, reducing development time and boosting team productivity.

Abyeti Technologies, Precise Jul 2018 – Sep 2021

Technical Associate

• Designed and optimized a critical data process through Snowflake query optimization and in-memory caching with C++ data structures, achieving a reduction of over 90% in processing time and cost, the most efficient solution to date.

• Spearheaded a high-performance asynchronous multi-threaded API solution for parallel data processing, resulting in a 20% boost in application responsiveness and user interaction speed.

• Revamped a legacy Flash UI using JavaScript, increasing customer satisfaction with the user experience by 70%.

• Provided exceptional client support by communicating effectively and resolving 30 critical software issues.

PROJECTS

• Built a Personal Financial Stock Analyzer using NVIDIA’s NIM generative AI model and LangChain, delivering real-time stock portfolio analysis through Yahoo Finance data and enhancing investment decision-making (Demo).

• Implemented A/B testing and hypothesis testing methodologies in the Movie Recommendation System project, comparing different filtering algorithms and conducting statistical analyses (e.g., t-tests, chi-square tests) to optimize recommendation accuracy and user engagement (GitHub).

• Led the development of a scalable ETL data pipeline on AWS (Glue, Redshift, EMR, QuickSight) that analyzed 6.9M+ Yelp reviews with Natural Language Processing (NLP) techniques, deriving actionable insights for restaurant improvement (Medium).

• Conducted Sentiment Analysis on Customer Reviews using Natural Language Processing and machine learning models, addressing class imbalance through resampling techniques (Github).

• Performed Customer Segmentation using K-means clustering and hierarchical clustering on retail data to identify distinct customer groups and optimize marketing strategies (Github).

• Built a Customer Churn Prediction model using PySpark, implementing machine learning algorithms including Logistic Regression, Random Forest, and Gradient Boosted Trees on telecom customer data (GitHub).

• Developed a machine learning model for Fraud Detection in financial transactions using Google Cloud, leveraging BigQuery for data analysis and model training.

TECHNICAL SKILLS

• Programming Languages: Python, C++, Java, JavaScript, SQL, CUDA.

• Frameworks: PyTorch, PySpark, Flask, TensorFlow, Keras, NumPy, scikit-learn, Pandas, Matplotlib, OpenCV.

• Database: MySQL, PostgreSQL, Oracle, Neo4j, Cassandra, Cosmos DB.

• Big Data: Amazon Web Services (AWS), Google Cloud Platform, Microsoft Azure, Spark, Hadoop, Kafka, MLflow.

• Other Tools: Tableau, Power BI, Excel, Git, GitHub, Maven, Perforce, Docker, Kubernetes, Linux.

CERTIFICATION

• IBM Machine Learning in Python, Meta Database Engineering, Google Generative AI
