PRAVALLIKA AVULA
+1-571-***-**** • *****************@*****.*** • linkedin.com/in/pavula04/
Data Analyst with over 4 years of experience delivering insights through statistical analysis, machine learning, and data visualization. Proficient in SQL, Python, and R with expertise in ETL pipelines, Azure Data Factory, and data warehousing with Snowflake. Hands-on experience with cloud platforms including Azure and AWS, using tools like Apache Airflow, AWS Glue, and Power Automate. Skilled in ad-hoc analysis, hypothesis testing, and regression. Experienced in building interactive dashboards with Tableau and Power BI. Strong communicator with a proven ability to translate complex data into actionable insights.
SKILLS
Programming Languages: Python, R, HTML, Databricks, SQL
Frameworks and Libraries: Pandas, NumPy, Matplotlib, Seaborn, scikit-learn, Keras, SciPy, TensorFlow, ggplot2, Plotly, PySpark, Django
Big Data Technologies: HDFS, Hadoop, MapReduce, HBase, Spark, Apache Kafka, Snowflake, Cassandra, Delta Lake
Databases: MySQL, SQL Server, Oracle, NoSQL, MongoDB, DynamoDB, RDS, Teradata, PostgreSQL
Cloud Technologies: AWS (Kubernetes, Lambda, IAM, S3, Route 53, Glue, Redshift, Athena, Lake Formation), Azure (Databricks, Synapse, Data Factory, Blob Storage), GCP (BigQuery, Dataproc, Pub/Sub), Snowflake (Snowpipe, Snowpark)
Tools: Tableau, SAS, Power BI, Alteryx, GitHub, Jira, SharePoint, Looker, MS Excel, Confluence
Others: ETL Development, Data Modeling (Relational and Dimensional), Data Warehousing, Data Processing, Statistical Analysis, Predictive Modeling, Business Intelligence, A/B Testing, Clustering, Regression, Classification
Environments: SDLC, Agile, Scrum, Waterfall, Windows, Mac OS, Linux
PROFESSIONAL EXPERIENCE
Data Analyst Jan 2024 – Present
Clairvoyant, USA
Designed and implemented ETL/ELT processes using Python, AWS Glue, and Apache Airflow to ensure seamless data flow and integration across systems.
Developed scalable cloud data warehouses using AWS Redshift and Azure Synapse, improving query performance by 45% for financial reporting.
Wrote Python scripts to automate ETL/ELT workflows, cutting manual intervention by 50% and improving data validation efficiency.
Analyzed A/B test results using R and prepared 25+ Tableau dashboards to monitor financial KPIs, sourcing data from diverse systems including AWS S3 and RDS.
Built statistical and machine learning models using Python (scikit-learn, TensorFlow, PySpark) for predictive analytics and decision-making.
Designed and delivered 15+ Power BI dashboards, embedding interactive visualizations into PowerApps for enhanced analytics and user experience.
Managed relational and NoSQL databases (PostgreSQL, Oracle, MongoDB), ensuring 99.9% uptime with automated backups and monitoring.
Optimized SQL code in Snowflake to support automation and improve complex queries, reducing execution time by 60% on large datasets.
Built real-time streaming data pipelines using Kafka, AWS Kinesis, and Spark Structured Streaming, enabling immediate insights for financial analytics.
Provided serverless analytics solutions using AWS Athena and Lambda, improving agility and reducing query costs by 35% for critical financial insights.
Leveraged Git, Jira, Confluence and Agile methodologies (SDLC) across projects, enhancing delivery efficiency and reducing code conflicts by 30%.
Data Analyst Sep 2019 – Sep 2022
Avenir Technology
Created interactive visualizations and 30+ dashboards in Power BI and Tableau, clearly communicating insights for key healthcare analytics initiatives.
Streamlined distributed ETL workflows for healthcare data using Apache Spark and refined SQL queries, reducing processing times by 35%.
Enabled edge computing solutions with Azure IoT Edge, providing rapid, low-latency analytics and immediate insights from IoT-enabled healthcare devices.
Managed and streamlined AWS cloud infrastructure using Kubernetes, CloudFormation, and Route 53, achieving 99.5% availability and cost efficiency.
Engineered data ingestion pipelines integrating Cassandra databases to process 500M daily events, supporting advanced analytics within marketing operations.
Adopted Azure Blob Storage for efficient data storage of over 5TB of data and Azure SQL Database for enhanced querying and analysis.
Optimized performance of Snowflake data warehouses through indexing, caching, and partitioning, reducing query runtime by 50%.
Integrated data governance and compliance controls (Azure Purview, Apache Ranger) to adhere strictly to regulatory requirements (HIPAA compliance).
Collaborated on a machine learning model using Python and Azure ML Studio to predict patient readmission rates, integrating results into ETL pipelines.
Created 50+ SQL tables with referential integrity and developed advanced queries using stored procedures, functions, and indexing for efficient data retrieval.
Participated in Business Analysis, engaging with stakeholders to determine 20+ entities and attributes for a scalable and accurate Data Model.
Deployed Docker-based container orchestration on Kubernetes, enabling rapid scaling and simplified management of healthcare data-driven applications.
EDUCATION
George Mason University 2022 – 2024
Master of Data Analytics Engineering
Chaitanya Bharathi Institute of Technology 2016 – 2020
Bachelor of Electronics and Communication Engineering
CERTIFICATIONS
AWS Certified Data Engineer – Associate
Career Essentials in Data Analysis by Microsoft and LinkedIn