
Data Scientist & Analytics Engineer with Cloud Experience

Location:
Indianapolis, IN
Posted:
November 16, 2025


Resume:

Meghana Avadhanam

812-***-**** ********@**.*** Bloomington, IN

[LinkedIn] [GitHub] [Portfolio]

EDUCATION & CERTIFICATIONS

● Master of Science in Data Science, Indiana University Bloomington (Aug 2022 - May 2024)
Coursework: Statistics, Machine Learning, Information Visualization, Database Technologies, Big Data, Music Data Mining

● Bachelor of Technology in Computer Science, SRM University (Jun 2017 - May 2021)
Coursework: Data Structures & Algorithms, Data Analytics & Big Data, Web Design & Development, OS, Networks

● AWS Cloud Practitioner

SKILLS

Languages & Databases: Python, R, SQL, Scala, YAML, C++; MongoDB, MySQL, Cassandra, Neo4j (Graph database)

Data Visualization: Tableau, Salesforce Datorama, Power BI, MS Excel, Google Looker Studio, Google Analytics

Cloud: AWS (Amazon Web Services), GCP (Google Cloud Platform), Microsoft Azure, ETL

Libraries: Keras, TensorFlow, PyTorch, SKLearn, NumPy, Pandas, Matplotlib, Selenium, Statsmodels, Librosa, Spotipy, PRAW

Data Frameworks: Hadoop, PySpark, Spark, Azure Databricks, Datafactory, Snowflake, Kafka, Hive, Alteryx, Airflow, DBT

Statistics: A/B Testing, Hypothesis Testing, Regression Analysis, ANOVA, Statistical Software (R, Python), Chi-Square testing

PROFESSIONAL EXPERIENCE

Data Analyst - Indiana University Mar 2023 - Present

● Developed and maintained 30+ interactive, KPI-driven data visualizations for the Analytics and Paid Media teams using Google Analytics (GA4), Looker Studio, Salesforce Marketing Cloud (Datorama), and Tableau, improving campaign assessment by 30%.

● Crafted SQL functions to query and extract data from 7+ contrasting data sets (tables) using unique identifiers and joins.

● Conducted data analysis to support data storytelling for 6 campaigns, ensuring data accuracy and translating the results into effective data models.

● Managed and optimized data storage, maintaining data quality within Datorama, achieving a 97% reduction in storage costs.

● Leveraged familiarity with a range of RDBMS and NoSQL databases, such as MongoDB, Cassandra, and Redis, to optimize data storage and retrieval, significantly enhancing critical data operations' efficiency.

● Led optimization of the analytics data pipeline, reducing costs by $50k, by developing an end-to-end ETL system for data migration, ingestion, transformation, and integration using Dataflow, Pub/Sub, and BigQuery as the data warehouse.

● Delved into cutting-edge research on large language models and generative AI, resulting in 60% rise in the team's grasp of NLP.

Data Analytics Engineer - Fractal Analytics, India Jun 2021 - Jul 2022

● Created 17+ Python containers and Scala components, following Agile practices, for Airflow DAGs that orchestrate and automate data processing from SFTP to Google Cloud Storage (GCS) and BigQuery. Used Terraform for infrastructure as code (IaC).

● Reduced error rates by 35% through effective code composition and deployment strategies, using Dockerfiles for containerization and rapidly resolving data workflow errors in GitHub CI/CD.

● Transformed 3 Alteryx workflows into GCP-native Apache-compatible workflows, ensuring optimal data processing.

● Applied distributed computing technologies, including HDFS, Hadoop MapReduce, Scala/Spark-SQL, and YARN/MR2 for resource management and job execution, and incorporated messaging via Kafka, leading to a 30% reduction in data processing time.

Data Scientist Intern - Pannini.ai, India Feb 2021 - Jun 2021

● Performed image processing of 25+ tax forms extracted from PDF using OpenCV and Python Imaging Library (PIL) in PyTorch.

● Built a table detection algorithm using Hough line transform, skew correction, erosion, dilation, and Gaussian blur methods.

● Developed a table data extraction algorithm using Optical Character Recognition (OCR) with PyTesseract, achieving a precision rate of >75%.

● Performed time series analysis of W-2 and 1094 form data to uncover seasonal trends and anomalies for financial forecasting.

PROJECTS & PARTICIPATIONS

● Speech Recognition & Emotion Analysis using Convolutional Neural Network & Natural Language Processing: Utilized Librosa, Sounddevice, Keras and TensorFlow Python modules to recognize target words from speech files & analyze sentiment.

● Employee DB AWS Web Application: Designed & implemented the architecture for deploying a scalable web app on AWS, with EC2 for hosting, S3 for file storage, RDS for relational database, and CloudFormation for infrastructure management.

● Automatic Risk Identification of COVID-19 patients with comorbidities: Performed predictive modeling on 100+ chest X-rays, CT scans, and clinical (immunological) data of patients with comorbidities to identify their mortality risk from COVID-19.

● Global Customer Revenue Analysis: Utilized Tableau to conduct comprehensive customer analysis within a corporate organization, with a specific focus on global customer revenue patterns.

● US University & College Ancestry Visualizations: Analyzed the distribution and pattern of universities across the United States over the past 4 decades, performing temporal, geospatial, and topical analysis.

● Cross Cultural Music Recommendation using Audio Features: Scraped around 1000 songs by worldwide artists from the Spotify API using the ‘SpotiPy’ module. Implemented KMeans clustering on audio features to group similar songs from different cultures.

● CRNY Data Visualization Competition: Illustrated 5+ creative data visualizations such as Geospatial Heat Map, Proportional Symbol Maps, Bubble charts, Demographic Charts for CRNY foundation to support individual artists in New York.

● Hackathon: Leveraged Random Forest and Polynomial Regression algorithms and adjusted R-squared error to optimize the prediction of CAPEX compliance for upcoming fiscal year projects for data-driven decision-making. Placed 4th.


