Data Engineer – SQL Server

Location: Edison, NJ
Posted: April 22, 2024

Resume:

SHAREEF

Phone: 216-***-**** | Email: ad46uq@r.postjobfree.com

Professional Summary:

•Certified AWS Data Engineer with over 8 years of IT experience and expertise in cloud engineering, data warehousing, and data visualization.

•Extensive experience with Amazon Web Services (AWS) cloud services such as EC2, VPC, S3, IAM, EBS, RDS, ELB, Route 53, OpsWorks, DynamoDB, Auto Scaling, CloudFront, CloudTrail, CloudWatch, CloudFormation, Elastic Beanstalk, AWS SNS, AWS SQS, AWS SES, AWS SWF, and AWS Direct Connect.

•Experience designing, building, and operating virtualized solutions on private, hybrid, and public cloud platforms.

•Exposed to all aspects of the Software Development Life Cycle (SDLC), including analysis, planning, development, testing, implementation, and post-production analysis, using methodologies such as Agile, Scrum, and Waterfall.

•Extensive experience in developing web pages using HTML/HTML5, XML, DHTML, CSS/CSS3, SASS, JavaScript, React JS, Redux, Flex, Angular JS (1.x), jQuery, JSON, Node.js, Ajax, and Bootstrap.

•Designed, built, and deployed a multitude of applications utilizing much of the AWS stack (including EC2, Route 53, S3, RDS, HSM, DynamoDB, SQS, IAM, and EMR), focusing on high availability, fault tolerance, and auto-scaling.

•Experience in designing and creating RDBMS tables, views, user-defined data types, indexes, stored procedures, cursors, triggers, and transactions.

•Expert in designing ETL data flows, creating mappings/workflows to extract data from SQL Server, and performing data migration and transformation from Oracle/Access/Excel sheets using SQL Server SSIS.

•Building and productionizing predictive models on large datasets by utilizing advanced statistical modeling, machine learning, or other data mining techniques.

•Hands-on experience architecting legacy data migration projects, such as Teradata-to-Redshift migration and on-premises to AWS Cloud migration.

•Experience in designing, installing, and implementing the Ansible configuration management system for managing web applications, environment configuration files, users, mount points, and packages.

•Developed intricate algorithms based on deep-dive statistical analysis and predictive data modeling, used to deepen relationships, strengthen longevity, and personalize interactions with customers.

Technical skills:

Databases: Oracle, MySQL, SQL Server, MongoDB, Cassandra, DynamoDB, PostgreSQL, Teradata, Cosmos DB.

Programming: Python, Spark, Scala, Java, C, C++, Shell script, Perl script, SQL, HTML.

Cloud Technologies: AWS, Microsoft Azure

Tools: PyCharm, Eclipse, Visual Studio, SQL*Plus, SQL Developer, SQL Server Management Studio, SQL Assistant, Postman

Versioning tools: SVN, Git, GitHub

Operating Systems: Windows 7/8/XP/2008/2012, Ubuntu Linux, macOS

Database Modeling: Dimensional Modeling, ER Modeling, Star Schema Modeling, Snowflake Data Modeling.

Monitoring Tool: Apache Airflow

Visualization / Reporting: Tableau, ggplot2, matplotlib, SSRS and Power BI

Machine Learning Techniques: Linear & Logistic Regression, Classification and Regression Trees, Random Forest, Associative rules, NLP, and Clustering.

PROFESSIONAL EXPERIENCE

General Electric – Boston, MA

JOB ROLE: AWS DATA ENGINEER (NOV 2022 – Present)

Responsibilities:

•Designed and implemented highly efficient ETL workflows using Apache Airflow on AWS, enhancing data orchestration and automation capabilities (a minimal illustrative sketch follows at the end of this section).

•Optimized ETL jobs for superior performance, leveraging AWS resources effectively and reducing load on source databases.

•Conducted thorough unit testing and integration testing to ensure data accuracy and consistency across tables.

•Managed data pipeline scheduling and monitoring using AWS Airflow, ensuring timely execution and resource optimization.

•Developed and implemented complex ETL code utilizing AWS Data Services for Type II Slowly Changing Dimensions with Surrogate keys.

•Created and executed test plans and schedules, collaborating with project managers and development teams for successful code migration.

•Designed and deployed Data Lake infrastructure on AWS to support diverse data analytics, processing, and reporting needs.

•Implemented a robust Security Framework using AWS Lambda and DynamoDB to provide fine-grained access control to objects in AWS S3.

•Conducted end-to-end architecture and implementation assessments of various AWS services like Amazon EMR, Redshift, and S3.

•Integrated Apache Airflow with AWS services to monitor and manage multistage ML workflows, enhancing scalability and efficiency.

•Utilized Spark SQL through its Scala and Python interfaces to automate conversion of case-class RDDs into schema RDDs, streamlining data processing tasks.

•Developed machine learning algorithms in Python to predict user order quantities, leveraging Kinesis Firehose and S3 Data Lake for real-time data analysis.

•Implemented SQL Server scripts and optimized Informatica mappings for improved ETL performance in SQL Server environments.

•Collaborated with data modelers and ETL developers to adapt software code to changing business requirements throughout the SDLC.

•Designed and executed data extraction, transformation, and loading processes using Informatica for seamless data migration.

•Conducted code and design reviews with clients to ensure adherence to best practices and quality standards.

•Prepared comprehensive technical documents including specifications, gap analysis, and deployment guides for code releases.

•Automated application processes using Control-M and UNIX Shell scripts, ensuring continuous functionality during software maintenance and testing phases.

•Enhanced ETL processes with SQL Server procedures/functions and session partitioning techniques in Informatica for optimal performance.
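The following is a minimal, illustrative sketch of the kind of Airflow orchestration described in the first bullet of this section. It is not the actual project code: the DAG, task, and callable names are hypothetical placeholders, and the extract/load bodies are stubs standing in for the real source-to-warehouse steps.

    # Hypothetical Airflow DAG sketch: a daily extract step followed by a load step,
    # with retries configured and backfill disabled. All names are placeholders.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract_orders(**context):
        # Stub: the real task would pull incremental rows from the source database
        # and stage them in S3 for the given execution date.
        print("extracting orders for", context["ds"])


    def load_to_warehouse(**context):
        # Stub: the real task would load the staged files into the warehouse tables.
        print("loading staged files for", context["ds"])


    with DAG(
        dag_id="orders_etl",  # hypothetical DAG name
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
        load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)

        extract >> load  # the load task only runs after a successful extract

Daily scheduling, retry policies, and explicit task ordering of this shape are what the Airflow orchestration and monitoring bullets above refer to.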

FIS Serve – Jacksonville, FL

JOB ROLE: AZURE DATA ENGINEER (SEP 2021 – NOV 2022)

Responsibilities:

•Architected and implemented cutting-edge data storage solutions on Azure using services like Azure Blob Storage, Azure SQL Database, and Azure Data Lake Storage Gen2, optimizing data capture, management, and processing at scale.

•Designed and maintained complex ETL pipelines using Azure Data Factory, ensuring seamless data integration from diverse sources into Azure-based storage systems while enforcing stringent data validation and cleansing processes for data integrity.

•Ensured compliance with data security and privacy regulations by implementing robust encryption and access control measures within Azure environments, safeguarding sensitive data during on-premises to cloud migrations.

•Leveraged Azure Functions and Azure Logic Apps for building serverless applications, tightly integrated with Azure Cosmos DB, to facilitate efficient data processing and task automation.

•Implemented Dremio's SQL acceleration features within Azure environments to optimize performance for complex analytical queries, improving response times and enhancing data processing efficiency.

•Led meticulous migration strategies from on-premises to cloud environments for diverse sectors, including healthcare and finance, ensuring smooth transitions and minimal disruptions to operations.

•Orchestrated the construction and activation of Azure Data Factory and Azure Data Lake Storage-based integration pipelines, facilitating seamless data exchange between on-premises and cloud systems for multiple clients.

•Demonstrated expertise in feature engineering techniques to enhance data quality inputs for machine learning models, resulting in more accurate predictions and actionable insights.

•Proficient in machine learning frameworks and libraries such as Azure Machine Learning, scikit-learn, TensorFlow, and PyTorch for developing, training, and fine-tuning machine learning models on Azure.

•Implemented comprehensive data governance practices, including data masking, access controls, and auditing, to ensure the security and privacy of sensitive data stored in Azure environments.

•Conducted exhaustive testing and validation of Azure Data Factory pipelines, Azure Data Lake Storage, and Azure Synapse Analytics, ensuring data precision, completeness, and adherence to industry standards.

•Collaborated seamlessly with cross-functional teams, including data engineers, administrators, and analysts, to design and implement robust data warehousing and analytics solutions using Azure Synapse Analytics.

•Streamlined data processing and analytics workflows using Azure Databricks integrated with Azure Synapse Analytics, resulting in significant improvements in processing efficiency and data accuracy.

•Developed bespoke MapReduce programs using Hadoop within Azure environments, enabling clients to analyze vast datasets and derive actionable insights to drive business performance.

•Conducted cross-functional workshops to educate team members on AI and machine learning concepts, fostering a collaborative environment and ensuring alignment with project objectives.

•Utilized Spark expertise to optimize data processing speeds and security across Azure environments, leading to reduced processing durations and improved operational efficiency.

•Implemented Spark applications with PySpark and Spark SQL within Azure environments to extract, transform, and aggregate data from diverse sources, uncovering valuable insights into customer behaviors and preferences (a brief sketch follows at the end of this section).

•Contributed to business growth and success by effectively managing and analyzing customer data using Azure services, driving informed decision-making and strategic planning for enhanced customer satisfaction and revenue generation.
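The sketch below illustrates the kind of PySpark/Spark SQL aggregation referred to in the Spark applications bullet above. It is a minimal example rather than client code; the storage paths and column names are hypothetical.

    # Hypothetical PySpark sketch: aggregate customer transactions two ways,
    # once with the DataFrame API and once with Spark SQL. Paths and columns
    # are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("customer-behavior-agg").getOrCreate()

    # Read raw transaction data from a (hypothetical) data lake location.
    transactions = spark.read.parquet("/mnt/datalake/raw/transactions")

    # DataFrame API: total spend and distinct order count per customer.
    summary = (
        transactions.groupBy("customer_id")
        .agg(
            F.sum("amount").alias("total_spend"),
            F.countDistinct("order_id").alias("order_count"),
        )
    )

    # The same aggregation expressed with Spark SQL.
    transactions.createOrReplaceTempView("transactions")
    summary_sql = spark.sql("""
        SELECT customer_id,
               SUM(amount)              AS total_spend,
               COUNT(DISTINCT order_id) AS order_count
        FROM transactions
        GROUP BY customer_id
    """)

    # Persist the curated summary for downstream analytics and reporting.
    summary.write.mode("overwrite").parquet("/mnt/datalake/curated/customer_summary")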

Cyient – Hyderabad, India

JOB ROLE: DATA ENGINEER (SPARK) (MAY 2017 – JULY 2021)

Responsibilities:

•Managed the migration of data from on-premises environments to Azure using AzCopy, ensuring seamless and secure transfer of large volumes of data.

•Leveraged Databricks to create PySpark notebooks for efficient data transformation, processing, and analysis, enhancing data insights and decision-making capabilities.

•Designed and executed Azure Data Factory (ADF) pipelines to orchestrate data processing tasks, facilitating smooth data movement and transformation from diverse sources to destination systems.

•Implemented Azure Synapse Analytics to store and analyze data across multiple nodes, integrating with Azure Blob Storage and Azure Data Lake Storage for comprehensive data storage solutions.

•Integrated Power BI with Azure Synapse for creating interactive dashboards and visualizations, enabling stakeholders to gain actionable insights from the data stored in Azure.

•Developed and maintained data pipelines in Azure Data Factory, incorporating Dataflows and Datasets for efficient data integration and ETL tasks.

•Customized Azure Data Factory functionality using Azure Functions and Logic Apps, extending its capabilities for tailored data processing requirements.

•Designed and implemented Azure Monitor solutions to monitor the health and performance of Azure resources, ensuring optimal operation and reliability.

•Configured and deployed Azure Monitor components such as Log Analytics, Metrics, and Activity Logs, providing comprehensive monitoring and alerting functionalities.

•Integrated on-premises and cloud data sources into Azure Data Factory pipelines, enabling seamless data integration and processing across environments.

•Implemented batch data integration solutions to meet business requirements, ensuring timely and accurate data processing and delivery.

•Automated infrastructure provisioning and deployment processes using Terraform and GitHub Actions, streamlining development and deployment workflows.

•Utilized Spark for interactive queries, streaming data processing, and integration with NoSQL databases, enabling efficient handling of large volumes of data and complex analytics tasks.

•Developed Python programs with Apache Beam and executed them in Cloud Dataflow for data validation between raw source files and BigQuery tables, ensuring data accuracy and consistency.
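As a minimal illustration of the Apache Beam validation described in the last bullet, the sketch below compares row counts between a raw source file and a BigQuery table. The bucket, project, and table names are hypothetical, and the real pipelines were more involved.

    # Hypothetical Apache Beam sketch: count rows in a raw source file and in the
    # corresponding BigQuery table so the two can be reconciled. Runs locally with
    # the default runner; pass Dataflow options to execute on Cloud Dataflow.
    # (Header-line handling and schema checks are omitted for brevity.)
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    def run():
        options = PipelineOptions()  # e.g. --runner=DataflowRunner for Dataflow
        with beam.Pipeline(options=options) as p:
            # Row count of the raw source file (placeholder path).
            source_count = (
                p
                | "ReadSource" >> beam.io.ReadFromText("gs://example-bucket/raw/orders.csv")
                | "CountSource" >> beam.combiners.Count.Globally()
            )

            # Row count of the loaded BigQuery table (placeholder table).
            target_count = (
                p
                | "ReadTarget" >> beam.io.ReadFromBigQuery(
                    query="SELECT order_id FROM `example-project.staging.orders`",
                    use_standard_sql=True,
                )
                | "CountTarget" >> beam.combiners.Count.Globally()
            )

            # Emit both counts; a mismatch is what a downstream check would flag.
            source_count | "LogSource" >> beam.Map(lambda n: print("source rows:", n))
            target_count | "LogTarget" >> beam.Map(lambda n: print("bigquery rows:", n))


    if __name__ == "__main__":
        run()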

EDUCATION

Muffakham Jah College of Engineering & Technology

Bachelor of Computer Science & Engineering.


