Vishesh Paka
Boston, MA ***** ****.*@************.*** +1-857-***-**** LinkedIn GitHub
EDUCATION
Northeastern University, Boston, MA May 2024
Master's in Data Analytics Engineering (GPA: 4.0/4.0)
Courses: Database Management & Design, Computation and Visualization, Data Mining, Cloud Computing (AWS)
VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Aug 2021
Bachelor's in Computer Science (GPA: 3.8/4.0)
Courses: Data Structures & Algorithms, Probability Statistics, Big Data & Predictive Analytics, Artificial Intelligence, NLP
TECHNICAL SKILLS
Programming Languages: Python (Pandas, NumPy, Scikit-learn, SciPy), PySpark, SparkSQL, R, Cypher, C, C++, Java
Databases: SQL (MySQL, Oracle, PostgreSQL), NoSQL (MongoDB)
ETL & Data Visualization Tools: Informatica, Snowflake, Tableau, Power BI, Flourish, Datawrapper, MS Excel
Tools & Technologies: MATLAB, Git, Apache Spark, Hive, Hadoop, Airflow, Databricks, AWS, GCP
Specializations: ETL, EDA (Data Analysis), Data Modelling, Data Mining, Statistics, Data Wrangling
PROFESSIONAL EXPERIENCE
Accenture, Bangalore, India Sep 2021 - Sep 2022
Data Engineer
Developed end-to-end data pipelines using Informatica to extract, transform, and load datasets from multiple sources
(CSV, JSON, flat, Parquet) into a Snowflake data warehouse, resulting in a 50% reduction in data processing time
Designed and implemented data mapping and transformation logic that reduced data inconsistency issues by 30% across multiple data sources, resulting in improved data accuracy and streamlined data processing
Collaborated with 5+ cross-functional teams to design and implement multiple data-driven solutions, resulting in a 25% improvement in data integration efficiency and a 15% reduction in data integration errors
Created 5 Tableau dashboards to help analysts visualize results and make data-driven business decisions, leading to a 15% increase in data accessibility and adoption
Cognizant, Hyderabad, India Mar 2021 - Aug 2021
Full Stack Engineer Intern
Worked with a team as an intern developer to build a web application for a pharmacy medicine supply management system
Automated the schedule-generation logic, resulting in a 20% increase in operational efficiency. Gained experience in agile software development methodologies and version control tools such as Git
PROJECTS
Time Series Analysis for Human Activity Data Dec 2022
Performed time series analysis on data from 15 subjects, applying natural and horizontal visibility graphs (NVG and HVG) to extract features for human activities including walking, running, climbing up, and climbing down, with the goal of forecasting fall events in elderly individuals
Created scatter plots and calculated network topology metrics such as average degree, network diameter, and average path length to identify underlying relationships and patterns
Store Management System Nov 2022
Designed an Entity-Relationship (ER) diagram capturing 10 key entities, 12 relationships, and over 50 attributes in the system, and implemented a relational database using MySQL Workbench
Enhanced the system's performance and usability by querying its tables with SQL for analytical tasks, generating reports and insights into the organization's business operations
Analysis on Global Carbon Emissions Oct 2022
Designed a dynamic Google website, showcasing interactive visualizations and analysis of global carbon emissions data
Used Tableau, Flourish, and Datawrapper to build the website's visualizations, resulting in a 20% increase in user engagement
Visual Questioning and Answering May 2021
Developed a deep learning model in TensorFlow combining a pre-trained VGG-16 network with LSTMs to answer questions about medical images, including MRI scans and X-rays
Performed extensive data preprocessing and model tuning, raising the BLEU score to 0.393
Disease Prediction Using Symptoms Nov 2020
Developed a Random Forest machine learning model to predict a disease from a set of reported symptoms, achieving an accuracy of over 90%
Conducted feature engineering and model tuning to optimize model performance, leading to a 30% decrease in misdiagnosis rates and improved patient outcomes
CERTIFICATIONS
Certified in Machine Learning by Stanford University and in Python for Data Science and AI & ML by IBM