
Data Engineer Azure

Location:
Ashburn, VA
Posted:
October 03, 2023


Resume:

SRI CHENNAM

Data Engineer | Email: adz44z@r.postjobfree.com | Contact: +1-346-***-****

Professional Summary

Proficient Data Engineer with over 4 years of experience designing, developing, and implementing complex data solutions. Builds scalable, high-performance data pipelines supporting cloud-native applications. Experienced in Python, Spark, Hadoop, Hive, SQL, and multiple cloud platforms, and in architecting scalable data platforms on cloud infrastructure such as AWS, Azure, and Snowflake with containerization technologies like Docker and Kubernetes. Strong understanding of machine learning algorithms and predictive modeling, with experience implementing these models in Python using scikit-learn and TensorFlow, along with exposure to Big Data ecosystems and Agile delivery.

Education:

● Master's in Computer Science, University of Dayton, Ohio

● Bachelor's degree from TKR, Hyderabad, India

Skill matrix:

Programming Languages - Python, SQL, Scala, Java, Hive

Big Data Ecosystem - Hadoop, HDFS, Hue, MapReduce, Pig, Hive, Oozie, HBase, Sqoop, Impala, Zookeeper, Flume, Kafka, YARN, PySpark, Airflow, Informatica, Snowflake, Databricks, Cloudera

Cloud Services - AWS Glue, S3, Redshift, EC2, Athena, IAM, EMR, DynamoDB, Data Lake, AWS Lambda, SageMaker, CloudWatch, Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, Azure Analytical Services, HDInsight, Azure Databricks, BigQuery, GCS, Dataproc, Dataflow, Google Cloud Platform (GCP)

Database - Oracle, MySQL, Microsoft SQL Server, MongoDB, Cosmos DB, PostgreSQL, Cassandra, Teradata

DevOps - Jenkins, Terraform

IDE Tools - Eclipse, IntelliJ, PyCharm.

Visualization Tools - Tableau, Power BI, Ms Excel

Containerization - Kubernetes and Docker

Software Development Methodology - Agile, Scrum, Waterfall.

Professional Experience

Client: Food Lion, Charlotte, NC

Data Engineer, Sep 2022- Present

Responsibilities:

● Designed and implemented large-scale data processing systems using cloud-based technologies such as AWS and Google Cloud Platform (GCP).

● Developed ETL pipelines to ingest, transform, and load large and complex datasets using Python and Apache Spark, and scheduled them using Apache Airflow.

● Developed data pipelines using Spark, Hive, Pig, Python, Impala, and HBase to ingest customer data.

● Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.

● Documented all Extract, Transform, and Load (ETL) work; designed, developed, validated, and deployed the Data Warehouse team's Talend ETL processes using Pig and Hive.

● Performed data cleansing and interpreted and analyzed the resulting data.

● Designed and implemented data modeling and schema design for both SQL and NoSQL databases, and ensured their performance and scalability.

● Involved in the end-to-end project lifecycle, from requirements gathering and design through development, testing, and deployment.

● Created AWS Glue jobs to process data from AWS S3 and stored the transformed data in AWS Redshift (a minimal sketch of this pattern follows this list).

● Analyzed large and critical datasets using EMR, Glue and Spark.

● Implemented Spark using PySpark and SparkSQL for faster testing and processing of data.

● Defined database schema and optimized SQL queries for improved performance in the data warehouse using AWS Redshift.

● Built data visualizations and dashboards using Tableau, enabling stakeholders to visualize data trends and insights in real-time.

● Adept at designing, developing, and implementing robust data solutions using modern technologies and frameworks.

● Worked on data migration from Teradata to a Snowflake environment on AWS using Python and BI tools like Alteryx.

● Used Flume, Kafka, and Spark Streaming to ingest real-time or near-real-time data into HDFS.

● Analyzed and processed complex data sets using advanced querying, visualization and analytics tools.

● Managed and optimized AWS Glue jobs for efficient data processing, reducing processing time and costs.

● Experienced in data development, streamlining data processing, and data automation.

● Capable of designing and applying optimized automation solutions to enhance compliance and streamline regulatory reporting processes.

● Utilized AWS services and optimized BigQuery by implementing partitioning and clustering on large datasets, improving query execution time, and conducted complex SQL queries and aggregations on large datasets to uncover valuable insights and patterns in the data.

● Proficient with container systems like Docker and container orchestration platforms like Amazon ECS (EC2 Container Service) and Kubernetes.

● Proficient in planning and executing seamless data migration processes, ensuring smooth transitions and preserving data integrity during system upgrades or migrations.
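
The Glue/PySpark bullets above describe an ingest-transform-load flow from S3 into Redshift, scheduled with Airflow. The snippet below is a minimal, illustrative PySpark sketch of that pattern only; every bucket, table, column, cluster, and credential value is a hypothetical placeholder, and the JDBC write is one assumed loading route (a Glue job would more typically go through GlueContext and a Glue connection).

# Minimal PySpark ETL sketch: read raw CSV from S3, cleanse it, and load it into Redshift.
# All bucket, table, cluster, and credential values are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("s3_to_redshift_etl").getOrCreate()

# Ingest: raw order data landed in S3 (hypothetical path)
raw = spark.read.option("header", "true").csv("s3a://example-raw-bucket/orders/2023/")

# Transform: basic cleansing and a daily aggregate
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount") > 0)
)
daily = clean.groupBy(F.to_date("order_ts").alias("order_date")).agg(
    F.sum("amount").alias("total_amount"),
    F.countDistinct("customer_id").alias("unique_customers"),
)

# Load: write the aggregate to Redshift over JDBC (driver and connection details are assumptions)
(
    daily.write.format("jdbc")
    .option("url", "jdbc:redshift://example-cluster:5439/analytics")
    .option("dbtable", "public.daily_order_summary")
    .option("user", "etl_user")
    .option("password", "***")
    .option("driver", "com.amazon.redshift.jdbc42.Driver")
    .mode("append")
    .save()
)

In practice a job like this would be packaged as a Glue job or EMR step and triggered from an Airflow DAG, as described in the bullets above.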

Client: Yash Technologies Pvt Ltd, Hyderabad, India

Data Engineer, July 2018 - Aug 2021

Responsibilities:

● Involved in building a modern data platform that enables reliable and timely data ingestion, processing, and consumption using Apache Kafka, Azure Event Hubs, Azure Databricks, and Google Cloud Platform (GCP); a minimal streaming ingestion sketch appears after this list.

● Actively participated in systems analysis activities, analyzing current systems and processes to identify areas for improvement and optimization.

● Optimized data processing and storage using Azure Data Lake Storage (ADLS) and worked with Azure Cognitive Services for machine learning applications.

● Designed and built data ingestion frameworks for various data sources using Azure Databricks and developed custom connectors using Azure Logic Apps and Azure Event Grid to integrate data sources and destinations.

● Automated data pipelines using Azure Logic Apps, Azure Functions, and Google Cloud Functions, and employed DevOps CI/CD for continuous integration and deployment.

● Created and maintained ETL and ELT workflows for data transformation and integration using Azure Databricks and tools like Azure Data Factory and Azure Synapse Analytics.

● Designed and implemented scalable data architectures using Azure services like Azure SQL Database, Cosmos DB, and Blob Storage.

● Utilized Azure Synapse Analytics as a centralized data warehousing solution for scalable, high-performance data querying and analysis. Designed and implemented data models for business intelligence and data visualization using Azure Synapse.

● Developed and maintained ETL workflows to extract, transform, and load data from various sources using Azure services like Data Factory and Databricks.

● Worked with NoSQL databases like Cosmos DB and MongoDB to store unstructured data, including designing data models, indexing data, and optimizing query performance.
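
Several bullets above describe streaming ingestion with Kafka/Event Hubs into Databricks and ADLS. Below is a minimal, illustrative Structured Streaming sketch of that path, assuming the Event Hubs Kafka-compatible endpoint; the namespace, hub, storage account, container, and event schema are hypothetical placeholders, and the connection string would normally come from a Databricks secret scope rather than being inlined.

# Minimal Databricks Structured Streaming sketch: consume events from Azure Event Hubs
# via its Kafka-compatible endpoint and land them as a Delta table in ADLS Gen2.
# Namespace, hub, storage account, container, and schema below are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("eventhubs_to_adls").getOrCreate()

EH_NAMESPACE = "example-ns"         # assumed Event Hubs namespace
EH_NAME = "orders"                  # assumed event hub (topic) name
EH_CONN_STR = "<connection-string>" # would come from a secret scope in practice

kafka_options = {
    "kafka.bootstrap.servers": f"{EH_NAMESPACE}.servicebus.windows.net:9093",
    "kafka.security.protocol": "SASL_SSL",
    "kafka.sasl.mechanism": "PLAIN",
    "kafka.sasl.jaas.config": (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="$ConnectionString" password="{EH_CONN_STR}";'
    ),
    "subscribe": EH_NAME,
    "startingOffsets": "latest",
}

# Expected event payload (assumed JSON structure)
schema = StructType([
    StructField("order_id", StringType()),
    StructField("customer_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_ts", TimestampType()),
])

# Read the stream, parse the JSON payload, and flatten it into columns
events = (
    spark.readStream.format("kafka").options(**kafka_options).load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Append the parsed stream to a Delta location on ADLS Gen2 (paths are hypothetical)
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "abfss://bronze@examplelake.dfs.core.windows.net/_chk/orders")
    .outputMode("append")
    .start("abfss://bronze@examplelake.dfs.core.windows.net/orders")
)
query.awaitTermination()

The kafkashaded login-module class reflects Databricks' shaded Kafka client; on plain Apache Spark the unshaded org.apache.kafka class name would be used instead.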


