Roshni Vadiraja
Data Engineer
617-***-**** **************@*****.*** MA Tableau GitHub
SUMMARY
• 4+ years of experience as a Data Analyst with a strong foundation in Python, SQL, AWS, Snowflake, Tableau, and Git. Adept at designing and implementing efficient data pipelines, optimizing data storage and retrieval, and ensuring data quality and accuracy. Proven track record in handling large-scale datasets, data modeling, ETL processes, and data visualization. Proficient in leveraging cloud platforms and tools to build scalable, robust, and cost-effective data solutions. Seeking opportunities to apply this data engineering expertise and contribute to innovative projects in a dynamic and collaborative environment.
EDUCATION
Masters in Analytics (Applied Machine Intelligence), Northeastern University, Boston, MA
Bachelor of Engineering in Information Sciences, Visvesvaraya Technological University, Bangalore, India
SKILLS
Methodology: SDLC, Agile, Waterfall
Programming Languages: Scala, Python, R, Java, SQL
IDEs: PyCharm, Jupyter Notebook
Cloud Technologies: AWS, Azure, Google Cloud Platform
Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow, Kafka, R Shiny, statsmodels, PyTorch, Keras, Tkinter
AWS Tools: EC2, S3, IAM, Glue, QuickSight, Athena
Reporting Tools: Tableau, Power BI, SSRS
Database: MS SQL Server, Snowflake, MySQL, Oracle
Other Tools: Git, GitHub, MS Office
Operating Systems: Windows, Linux
WORK EXPERIENCE
Fractal.ai Boston, MA - Data Engineer Oct 2021 – Oct 2023
• Developed scalable and efficient backend systems using Python, Scala, and Java, leveraging frameworks such as Flask and Django to create RESTful APIs for seamless integration with front-end applications.
• Implemented data processing pipelines using PySpark, Kafka, and AWS Glue to ingest, transform, and load large volumes of data from various sources into cloud-based storage solutions such as AWS S3 and Google Cloud Storage.
• Used NumPy, Pandas, Matplotlib, and SciPy for data analysis and visualization, enabling stakeholders to gain actionable insights from complex datasets.
• Integrated third-party tools and services such as Tableau, Power BI, and SSRS with backend systems to enable real-time monitoring and reporting of key performance metrics, enhancing decision-making processes.
• Developed desktop applications using Tkinter and PyQt for data visualization and analysis, providing user-friendly interfaces for data exploration and manipulation.
• Conducted data analysis, identified anomalies and outliers, and enhanced data science components by recalibrating metric calculations.
• Implemented data cleaning and preprocessing procedures in Python, ensuring data accuracy and consistency, and reducing data errors by 25%.
IBM (Client: Rockwell Automation), Bengaluru, India - Applications Analyst Jan 2017 – Aug 2018
• Developed custom Python scripts to automate repetitive tasks, reducing manual workload by 30% and improving overall data processing efficiency.
• Conducted training sessions for the team on best practices in data engineering, contributing to a 15% improvement in team productivity and efficiency.
• Implemented scalable and cost-effective data solutions on AWS, resulting in a 30% reduction in infrastructure costs while handling a 50% increase in data volume.
• Streamlined Git workflows and branching strategies, resulting in a 20% reduction in code conflicts.
Healthcare Global Enterprises Limited, India - Data Engineer Jun 2016 – Nov 2016
• Implemented data cleaning and preprocessing procedures in Python, ensuring data accuracy and consistency, and reducing data errors by 25%.
• Developed and implemented a comprehensive data pipeline architecture, integrating diverse healthcare datasets from multiple sources, resulting in a 30% increase in data accessibility and efficiency.
• Developed and deployed a Tableau-based data visualization solution, providing actionable insights to stakeholders and aiding in informed decision-making, leading to a 35% improvement in operational efficiency.
• Optimized data storage and retrieval processes by leveraging AWS services, resulting in a 30% improvement in data access times and a 20% reduction in storage costs.
CERTIFICATIONS
Azure Data Engineer Associate (Microsoft)
Fractal Data Engineer Certificate (Fractal.ai)
Python for Everybody Specialization (Coursera)
IBM Data Science Professional Certificate (Coursera)