Akhil Apuri
Data Engineer
+1-901-***-**** *****.*@***********.*** https://www.linkedin.com/in/akhilapuri/
SUMMARY
Data engineer with 3 years of experience using AWS and Azure to build and deploy scalable data solutions. Experienced with big data technologies such as Hadoop and Apache Airflow, and proficient in Python, R, and SQL. Experienced in building interactive Power BI and Tableau dashboards that enhance engagement, as well as data warehouses and ETL pipelines that improve data-driven decision-making. Knowledgeable about optimizing data processing and storage and migrating on-premises data to cloud platforms.
SKILLS
Methodologies: Agile, Waterfall
Languages: Python, R, SQL
ML Algorithms: Linear Regression, Logistic Regression, Supervised Learning, Unsupervised Learning, Classification, SVM, Random Forests, Naive Bayes, KNN
Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, TensorFlow
Databases: Oracle, MySQL, PostgreSQL, MongoDB
Tools: Visual Studio, Power BI, Tableau, Git, MATLAB, Eclipse, Jupyter Notebook, CloudEndure, Android Studio, Kafka, Jira
Big Data Technologies: Hadoop, Apache Airflow
Cloud Technologies/Services: AWS, Azure, Azure Databricks, Snowflake, SaaS, GCP, BigQuery
Operating Systems: Windows, Linux
EXPERIENCE
Data Engineer, Freddie Mac, USA July 2023 – Present
• Built and deployed large-scale data processing systems using Apache Spark and Apache Airflow, cutting processing time by 40%.
• Implemented multi-sourced ELT processes to extract data from PostgreSQL databases, legacy SQL Server systems, and external APIs, transforming and loading into Azure Synapse Analytics and Google BigQuery, increasing data processing efficiency by 40% and reducing data loading time by 30%.
• Facilitated database migration strategy from on-premises SQL Server and MySQL to cloud platforms, leveraging Azure Database services for staging and Redshift for final data warehousing, implementing optimized indexing strategies throughout the transition.
• Optimized ETL practices to extract, transform, and load data from sources into data warehouses using Python and SQL.
• Used Power BI's DAX (Data Analysis Expressions) language to perform complex calculations and create custom measures and KPIs.
• Managed security groups and resources on Azure, focusing on high availability, fault tolerance, and auto-scaling using Terraform with Azure Resource Manager templates, improving system reliability by 20% and scalability by 30%.
Data Engineer, Informative Web Solutions, India March 2021 – August 2022
• Established a data platform from scratch, informing the project's requirements gathering and analysis phase, and documented the business requirements.
• Performed data mining in R and SAS, running statistical tests including hypothesis testing, to improve data accuracy.
• Conducted machine learning experiments and finalized predictive models, uncovering insights that increased business forecasting accuracy by 25%.
• Transitioned on-premises infrastructure to AWS using both lift-and-shift and re-architecture approaches, enhancing scalability by 30% and reducing operational costs by 20%.
• Delivered technical guidance and support to cross-functional teams on AWS architecture and best practices, resulting in a 15% improvement in team efficiency.
• Developed Power BI dashboards integrated with Microsoft Fabric, implementing multi-sourced ELT processes from web APIs and databases for real-time analytics.
• Employed Hadoop-based solutions for processing and storing large datasets, enhancing data scalability by 30% and reducing data processing time by 30%.
Data Engineer, Sage Softtech, India July 2020 – February 2021
• Worked on an end-to-end machine learning workflow, writing Python code to collect data from Snowflake on AWS and to handle data preprocessing, feature extraction, modeling, model evaluation, and deployment.
• Wrote Python code for exploratory data analysis using NumPy, Matplotlib, and Pandas Profiling.
• Experienced with machine learning algorithms such as linear regression, logistic regression, decision trees, and K-means clustering.
• Familiar with modern machine learning frameworks like PyTorch, Scikit-learn, and TensorFlow.
• Developed and deployed Tableau dashboards by transforming and integrating complex datasets, enabling data-driven decision-making and enhancing business intelligence reporting.
EDUCATION
Master of Science in Data Science, University of Memphis, Tennessee, United States August 2022 – May 2024
Bachelor of Science in Computer Science, JNTU, Hyderabad, India August 2017 – September 2021