Akhil Bollu
*************@*****.*** +1-316-***-**** LINKEDIN
Experience
Data Engineer - Cisco - San Jose, CA Jan 2023 – Present
Responsibilities:
Designed predictive models and analytics dashboards using Python, SQL, and Power BI, enhancing decision-making efficiency by 30%.
Orchestrated ETL data pipelines with Airflow, Python, dbt, Stitch Data, and GCP services, cutting processing time by 40%.
Constructed data pipelines using HDFS, Hive, HBase, Pig, Spark, Scala, Control-M, and StreamSets, handling 1 TB of data daily.
Crafted interactive Power BI dashboards, improving reporting accuracy by 25% and reducing generation time by 50%.
Deployed storage and analytics services on Azure using UNIX shell scripting, reducing costs by 15%.
Improved fraud detection model performance by 25% while reducing false positives by 15%.
Engineered ETL workflows in Talend to synchronize data into Azure SQL Data Warehouse, reducing manual input by 40% and increasing data accuracy and reliability by 30%.
Streamlined data workflows by 50% utilizing Azure Data Factory, SQL API, and Mongo API.
Supervised Spark clusters, transitioning log storage to Cosmos DB, boosting query performance by 40%.
Adopted Agile and DevOps practices, reducing the development cycle time by 20%.
Environment: Python, R, SQL, Power BI, Airflow, dbt, GCP, HDFS, Hive, HBase, Pig, Scala, Control-M, ETL, Talend, T-SQL, MS SQL Server, Azure, Azure SQL, Azure Data Factory, Spark, UNIX Shell, SQL API, Mongo API, MongoDB, RDBMS, Ambari Web UI, Cosmos DB, Agile, DevOps, CI/CD.
Data Engineer - Sonata Soft - India Jan 2019 – Dec 2021
Responsibilities:
Utilized Databricks and Spark for data extraction, cleansing, and transformation, increasing efficiency by 30%.
Automated data flow with Control-M, cutting manual intervention by 40%.
Architected robust PySpark solutions for data validation, cleansing, and aggregation, improving data accuracy by 20% and streamlining data processing workflows to boost productivity and the reliability of data insights.
Collaborated on machine learning projects, enhancing model accuracy by 15%.
Managed DynamoDB and queried AWS S3 with Athena, processing 1 million records daily.
Deployed AWS EMR Spark, accelerating processing speed by 20%.
Developed scalable REST APIs using Flask and orchestrated ETL processes in AWS Glue, migrating 1 TB of data to AWS Redshift, which improved data accessibility and reduced query times by 40%.
Boosted query performance by 20% with Snowflake and Redis cache optimization.
Created Power BI dashboards for effective reporting and analytics.
Developed ETL workflows and managed Redshift using Talend.
Environment: AWS, Azure, API Gateway, Lambda, PySpark, DynamoDB, AWS EMR, AWS S3, AWS Redshift, Athena, REST API, Flask, Databricks, Machine Learning, Spark, Control-M, Pandas, NumPy, Scala, Power BI, ETL, Talend, Snowflake, Apache Spark SQL, Redis, Data Factory, Blob Storage.
Education
Wichita State University - Wichita, KS
Master of Science in Data Science
PSCMR College of Engineering & Technology - India
Bachelor of Technology in Electronics & Communication Engineering
Certifications
Microsoft Certified: Azure Data Engineer Associate - 2024
Snowflake Certified - 2024
AWS Certified Solutions Architect - 2024
Skills
Big Data Technologies: Hadoop, MapReduce, Sqoop, Hive, Spark, Kafka, Scala, Cloudera, Databricks, ML (TensorFlow, PyTorch, scikit-learn), Deep Learning, Snowflake
Languages: Python, PySpark, R, SAS, SQL, Scala, HiveQL, C, C++
Cloud Platforms: AWS (EMR, EC2, RDS, S3, Lambda, Glue, Redshift), Azure (Data Factory, SQL DB, Synapse, Data Lake Storage), GCP
Databases: HBase, Cassandra, MongoDB, DynamoDB, MySQL, PostgreSQL, Oracle
Reporting: Power BI, Tableau
Tools: Salesforce, Apache Spark/Spark Streaming, Data Structures and Algorithms, Git, Talend, Terraform, Control-M, Flask, Jenkins, Ansible, Spring Boot, Splunk, Jupyter, Anaconda, Jira, Confluence, Informatica, Docker, Kubernetes, Linux