Charitha Reddy Mandli
Data Engineer
+1-562-***-**** ********.*@*********.*** Long Beach, CA LinkedIn
SUMMARY
• Data Engineer with around 4 years of experience architecting data-intensive applications leveraging Cloud Data Engineering, Data Warehousing, Big Data Analytics, Data Visualization, and Data Quality solutions.
• Harnessing AWS services (EC2, S3, RDS, Lambda, Glue, Athena, AWS Data Pipeline, Redshift) and Azure services (Azure Data Factory, Azure DevOps, Databricks, Azure Data Lake, Azure Stream Analytics, Azure Synapse) for scalable and cost-effective big data analytics alongside Teradata.
• Specialized in crafting interactive dashboards using AWS QuickSight, Power BI, and Tableau, visualizing key performance indicators (KPIs) from Amazon Redshift data for business stakeholders.
EXPERIENCE
Citigroup, TX Sep 2023 – Present
Data Engineer
• Designed and executed ETL pipelines using AWS Glue to move and transform data between sources (S3, RDS) and target data warehouses (Redshift, Snowflake).
• Orchestrated the development and optimization of ETL pipelines using SQL, Python (NumPy, pandas, Dask), and Apache Spark to ensure high-fidelity migration of critical financial data with minimal disruption.
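A minimal sketch of the kind of row-level cleanup step such an ETL pipeline might apply, in plain Python for illustration (the record fields and rules here are hypothetical, not from the actual pipeline):

```python
# Hypothetical ETL transform step: drop malformed rows and normalize
# amounts/currencies before loading to the warehouse.
def clean_transactions(records):
    """Return cleaned copies of records that have a valid numeric amount."""
    cleaned = []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric amounts
        cleaned.append({
            "amount": round(amount, 2),
            "currency": rec.get("currency", "USD").upper(),
        })
    return cleaned

rows = [
    {"amount": "10.5", "currency": "usd"},
    {"amount": None},                       # dropped: invalid amount
    {"amount": "3.333", "currency": "EUR"},
]
print(clean_transactions(rows))
```

In a real pipeline the same logic would typically run inside a Glue job or a Spark transformation rather than a plain loop.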
• Deployed Kafka for real-time data streaming, achieving a measurable reduction in data latency and improving the processing speed of customer transactions by 20%.
• Leveraged Snowflake's cloud-based architecture to facilitate collaboration between data analysts and data scientists, contributing to a 30% increase in data-driven project completion rate.
• Engineered a highly scalable DynamoDB data store to handle a million daily requests, reducing latency by 35% and improving system responsiveness.
• Integrated Apache Airflow with AWS to monitor multi-stage ML workflows, with tasks running on Amazon SageMaker, reducing workflow execution time by 35%.
• Constructed and maintained dbt models and macros, focusing on incremental model design and modularization, resulting in a 20% increase in data pipeline efficiency and a 35% reduction in execution time.
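dbt expresses incremental models in SQL/Jinja; the underlying pattern is a high-water-mark filter, which the following plain-Python sketch illustrates (field names and timestamps are hypothetical):

```python
# Sketch of the incremental-load pattern behind dbt incremental models:
# only rows newer than the current high-water mark are (re)processed.
def incremental_rows(source_rows, high_water_mark):
    """Return rows whose updated_at is strictly after the high-water mark."""
    return [r for r in source_rows if r["updated_at"] > high_water_mark]

rows = [
    {"id": 1, "updated_at": "2023-09-01"},
    {"id": 2, "updated_at": "2023-09-15"},
]
# ISO-8601 date strings compare correctly as plain strings.
print(incremental_rows(rows, "2023-09-10"))  # only id 2 is reloaded
```

In dbt itself this filter lives inside an `is_incremental()` block in the model SQL, with the warehouse doing the comparison.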
• Leveraged AWS Lambda to automate processing of daily sales data feeds, reducing processing time by approximately 25%.
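A hedged sketch of what such a feed-processing Lambda handler might look like; a production version would fetch the object from S3 via boto3, while this illustration works only with the event payload (bucket and key names are hypothetical):

```python
# Hypothetical AWS Lambda handler for a daily sales feed, triggered by
# S3 object-created events. Downstream steps (parse CSV, load to
# Redshift) are omitted; only the event-handling shape is shown.
def lambda_handler(event, context):
    records = event.get("Records", [])
    keys = [r["s3"]["object"]["key"] for r in records]
    return {"statusCode": 200, "processed": len(keys), "keys": keys}

fake_event = {"Records": [{"s3": {"object": {"key": "sales/2023-09-01.csv"}}}]}
print(lambda_handler(fake_event, None))
```

The `event`/`context` signature matches the standard Python Lambda handler contract.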
• Created interactive Tableau and QuickSight dashboards for trend analysis and anomaly detection, and used SSRS for financial reports to monitor key metrics in real time, enabling 25% faster data-driven decisions.
• Spearheaded Infrastructure as Code (Terraform) implementation and automated deployments using GitHub Actions for CI/CD.
Fusion Software Technologies, India Jan 2020 – Jul 2022
Data Engineer
• Optimized Hive queries through partitioning and bucketing, improving query speed by 25% on large datasets, and implemented MapReduce data processing jobs to handle workloads efficiently.
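Hive bucketing assigns each row to a bucket by hashing the clustering key modulo the bucket count, which is why equality joins and sampling on that key get faster. A plain-Python sketch of the idea (Hive uses its own hash function; CRC32 stands in here for determinism):

```python
# Illustration of Hive-style bucketing: bucket = hash(key) % num_buckets.
# zlib.crc32 is used as a stable stand-in for Hive's internal hash.
import zlib

def bucket_for(key, num_buckets=4):
    """Deterministically map a clustering key to one of num_buckets."""
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets

customers = ["cust_1", "cust_2", "cust_3"]
print({k: bucket_for(k) for k in customers})
```

Because the mapping is deterministic, two bucketed tables clustered the same way can be joined bucket-by-bucket without a full shuffle.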
• Employed SQL scripts to create and optimize stored procedures, facilitating efficient data modifications and retrieval.
• Engineered and automated ETL pipelines with Azure Data Factory, achieving a 30% reduction in data processing time by streamlining ingestion processes across six diverse data sources and destinations for improved reporting efficiency.
• Enhanced Azure Synapse Analytics queries by 25% with indexing, partitioning, and query optimization techniques.
• Constructed and optimized highly scalable data pipelines on Databricks using PySpark, ingesting terabytes of data per day from various sources (relational databases, log files, APIs).
• Utilized Snowflake's User-Defined Functions (UDFs) and stored procedures for complex data transformations, achieving a 10% improvement in data processing efficiency.
• Built interactive Power BI reports leveraging advanced DAX and Power Query modeling techniques, improving reporting speed by 20% to unlock deeper data insights and facilitate informed decision-making.
• Used Postman for API testing, ensuring the reliability and functionality of APIs during development and integration.
• Created and deployed Apache Flink applications for processing high-volume streaming data, achieving a 25% improvement in processing speed over batch processing methods.
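Flink's speedup over batch comes largely from aggregating events in fixed (tumbling) windows as they arrive instead of reprocessing full datasets. The following plain-Python sketch shows the tumbling-window aggregation idea (event timestamps in seconds; all values hypothetical):

```python
# Sketch of a tumbling-window aggregation as a Flink job might perform,
# written in plain Python for illustration only.
from collections import defaultdict

def tumbling_window_sums(events, window_size=60):
    """Sum event values per fixed-size window, keyed by window start time."""
    windows = defaultdict(float)
    for ts, value in events:
        window_start = (ts // window_size) * window_size
        windows[window_start] += value
    return dict(windows)

events = [(5, 1.0), (30, 2.0), (65, 3.0)]
print(tumbling_window_sums(events))  # {0: 3.0, 60: 3.0}
```

In Flink proper this would be a keyed stream with `TumblingEventTimeWindows` and watermarks handling late data; the sketch ignores both.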
SKILLS & CERTIFICATION
• Programming Languages: Python, Scala, R, SQL
• Big Data Ecosystem: HDFS, MapReduce, Apache Kafka, Apache Spark, Flink, Databricks
• Cloud: AWS (EC2, S3, Lambda, Glue, Athena, AWS Data Pipeline, DynamoDB, Redshift), Azure (Data Lake, Data Factory, Databricks)
• ETL & Tools: SSRS, Data Pipelines, dbt, Airflow, Tableau, Power BI, Excel, Docker, Terraform
• Packages & Data Processing: NumPy, pandas, Dask, Matplotlib, Seaborn, scikit-learn, TensorFlow, PySpark
• Version Control & Database: Git, SQL Server, PostgreSQL, MongoDB, MySQL, Snowflake, HiveQL
• Certification: AWS Data Engineer Associate, The Ultimate Hands-On Hadoop (Udemy)
EDUCATION
Master of Science (MS) in Computer Science, California State University, Long Beach, California
Bachelor's in Computer Science, Jawaharlal Nehru Technological University, Hyderabad, India