
Senior Data Engineer & Analyst (Remote, High Pay)

Location:
Richmond, TX
Salary:
13000
Posted:
June 10, 2026


Resume:

Syed Abdul

Lead Data Engineer/Data Analyst/Modeler

Email: ****.**********@*****.***

Phone: 214-***-****

Professional Summary:

Over 13 years of industry experience as a Big Data professional, with a solid understanding of data modelling and evaluating data sources, and a strong understanding of Data Lake/Data Warehouse/Data Mart design, ETL, BI, OLAP, OLTP, client/server applications and cloud.

•Expert in writing SQL queries and optimizing the queries in Oracle, SQL Server and Teradata.

•Excellent understanding of the Software Development Life Cycle (SDLC), with good working knowledge of testing methodologies, disciplines, tasks, resources and scheduling.

•Strong development skills with Azure Data Lake, Azure Data Factory, SQL Data Warehouse, Azure Blob, Synapse, Azure Storage Explorer.

•Experience with data flow diagrams, data dictionaries, data governance, database normalization theory and techniques, and entity-relationship modelling and design techniques.

•Performed data analysis and data profiling using complex SQL on various sources systems including SQL Server, Oracle and Teradata.

•Exposure to data modeling tools such as ER/Studio and MS Visio, and to techniques such as relational and dimensional modeling.

•Excellent experience with Teradata SQL queries, Teradata indexes, and utilities such as MLoad, TPump, FastLoad and FastExport.

•Experienced in migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks and Azure SQL Data Warehouse, controlling and granting database access, and migrating on-premises databases to Snowflake using Azure Data Factory.

•Experience in ER & Dimensional Data Modeling to deliver normalized ER & Star/Snowflake schemas using Erwin, ER Studio, EA Sybase power designer, SQL Server Enterprise manager and Oracle designer.

•Strong experience in using Excel and MS Access to dump the data and analyse based on business needs.

•A data vault is a database modeling method designed to provide long-term historical storage of data coming from multiple operational systems. It aims to offer flexibility and scalability to accommodate changes over time.

•Experienced working with Excel Pivot and VBA macros for various business scenarios.

•Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export through the use of multiple ETL tools such as SSIS and Informatica PowerCenter

•Experience in testing and writing SQL and PL/SQL statements - Stored Procedures, Functions, Triggers and packages.

•Experience in automating and scheduling Informatica jobs using UNIX shell scripting and configuring cron jobs for Informatica sessions.

•Excellent experience in analyzing huge volumes of data in industries such as Finance, Healthcare & Retail.

•Excellent knowledge on creating reports on Power BI, SAP Business Objects, Webi reports for multiple data providers.

•Proficiency in IDQ development around data profiling, cleansing, parsing, standardization, verification, matching and data quality exception monitoring and handling.

•Extensive knowledge and experience in producing tables, reports, graphs and listings using various procedures and handling large databases to perform complex data manipulations.

•Excellent experience in Data mining with querying and mining large datasets to discover transition patterns and examine financial data.

•Good exposure to working in an offshore/onsite model, with the ability to understand and create functional requirements while working with clients; good experience in requirement analysis and in generating test artifacts from requirements documents.

Education

•Bachelor's in computer applications from Osmania University.

TECHNICAL SKILLS:

Operating Systems: Microsoft Windows, LINUX.

Languages and Development Platforms: Microsoft Business Intelligence Development Studio (BIDS), SQL, ETL, Python, PySpark, Pandas, SAS, Azure databricks, Synapse, Azure data lake, ER/Studio, MS Visio, Azure Synapse, Storage Blob, Collibra, SSIS.

Databases: SQL Server 2010/2012, Oracle 10g/12c, MySQL, Sybase, SAP HANA Studio

Reporting Tools: Power BI, Business Objects, SSRS.

Data Visualization: MS PowerBI

Other Software: MS-Project, MS-Office, Word, Excel, Power Point, Access

ITSM Tools: BMC Remedy, HP Service Manager, Service Now

Cloud: Azure, Snowflake.

Banner Health, Mar ’22 – Till Date

Lead Data Engineer/Sr.Data Modeler

Roles & Responsibilities:

Involved in preparing High-Level Design and Low-Level Design documents based on the project's functional and business requirements; created and maintained detailed support documentation for all ETL processes and developed solutions, including detailed flow designs and drafts.

Involved in extracting, transforming and loading data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL and U-SQL (Azure Data Lake Analytics); ingested data into one or more services (Azure Data Lake, Azure Storage, Azure SQL, Snowflake) and processed the data in Azure Databricks.

Responsible for full data loads from production to AWS Redshift staging environment and worked on migrating of EDW to AWS using EMR and various other technologies.

Develop logical and physical data models that represent the data structures required for applications, databases, and data warehouses.

Use data modeling tools to create models, diagrams, and schemas.

Data Lake is a centralized repository designed to store vast amounts of raw and structured data. Managing a data lake involves ensuring its governance, security, and optimization for efficient data processing.

Managed data ingestion pipelines, ensuring seamless and scalable ingestion of data from various sources (batch and real-time) into the data lake.

A lakehouse combines data lakes and data warehouses, offering a unified architecture that supports both large-scale data processing and real-time analytics. Managing a lakehouse involves ensuring smooth data flow, governance, and optimizing both unstructured and structured data for analytical workloads.

Unity Catalog, part of Databricks, is a centralized governance solution for managing data assets across an organization.

Design conceptual, logical, and physical data models to support business requirements.

Define entities, attributes, relationships, and data flow.

Work with business stakeholders, data analysts, and engineers to understand data needs and translate them into models

Create efficient and scalable database structures for operational and analytical systems (OLTP/OLAP).

Collaborate with DBAs for performance tuning and indexing

Maintain data dictionaries, ER diagrams, and documentation of models.

Ensure models align with data governance policies, naming conventions, and modeling standards.

Work with data architects, data engineers, and BI teams to ensure models support integration, analytics, and reporting.

Use modeling tools like ER/Studio, ERwin, PowerDesigner, or even dbt for modern data stacks.

Manage versions of models and handle changes as business requirements evolve.

Unity Catalog provides fine-grained controls over data access, collaboration, and the management of assets in multi-cloud environments.

Design and structure databases, ensuring they are efficient, scalable, and meet the organization's requirements.

- Ensure that data is organized in a way that it can be easily retrieved, updated, and maintained.

Analyze existing data systems and requirements to identify inefficiencies, inconsistencies, or opportunities for improvement.

- Collaborate with stakeholders (e.g., business analysts, data architects) to gather requirements and understand business needs.

Establish and enforce data governance policies and standards to ensure data quality, integrity, and security.

- Monitor and resolve issues related to data quality, ensuring that data is accurate.

Create and maintain comprehensive documentation for data models, including entity-relationship diagrams (ERDs), metadata, and data dictionaries.

- Ensure that all changes to the data models are accurately recorded and tracked.

Connected to AWS RedShift through Tableau to extract live data for real time analysis and worked on Normalization and De-normalization concepts and design methodologies like Ralph Kimball and Bill Inmon's Data Warehouse methodology.

Creating the ETL mappings using various Informatica transformations: Source qualifier, Data Quality, Lookup, Expression, Filter, Router, Sorter, Aggregator etc.

Assessed the insurance provider's network to identify the number of doctors, their specialties, and geographic distribution to ensure network adequacy. Analyze claims data to understand the utilization of healthcare services by patients, including the frequency and types of medical procedures performed by doctors.

Analyzed patient demographics and utilization patterns to understand the healthcare needs of different populations and regions.

Used predictive analytics to forecast future claims, estimate costs, and plan for potential spikes in healthcare demand.

Analyzed the cost and utilization differences between in-network and out-of-network healthcare services for insured individuals.

Imported data from RDBMS environment into HDFS using Sqoop for report generation and visualization purpose using Tableau and worked in Loading and transforming large sets of structured, semi structured, and unstructured data.

Developed Power BI data visualizations and dashboards in support of Confidential, QA teams, & Confidential live platform to provide performance benchmarks & market insights to Confidential studios & corporate leadership.

Designed, implemented, and maintained data pipelines on Google Cloud Platform (GCP) to ingest, transform, and store large volumes of data for analytics purposes.

Monitored data pipelines using GCP's monitoring and alerting tools to ensure high availability and reliability.

Utilized GCP services such as Google Cloud Storage, Google BigQuery, and Google Dataflow to build scalable and cost-effective ETL processes.

Developed Cosmos (Azure Data Lake) streams with Scope to prepare queries and create data pipelines that drive historical data analysis and enable insightful data visualizations with validated data engineering.

Regularly used Azure SQL & PowerShell to schedule, transform, & prepare data

Maintained a complex cross-workspace Power BI report via advanced Excel ODC connections in order to analyse usage metadata on hundreds of reports across dozens of workspaces, allowing Confidential BI administrators to efficiently measure and optimize their reporting value by trimming or improving disused reports

Development and maintenance of data pipeline on Azure Analytics platform using Azure Databricks, PySpark, Python, Pandas and NumPy libraries.

Developed an MDM integration plan and hub architecture for customers, products and vendors; designed an MDM solution for the three domains.

Involved in the design of Data-warehouse using Star-Schema methodology and converted data from various sources to SQL tables.

Extract and transform data from Excel to help with the data migrations and making mass changes in SAP.

Extensively worked with Avro and Parquet files and converted data between the two formats; parsed semi-structured JSON data and converted it to Parquet using DataFrames in Spark.

Involved in Data Migration using SQL, SQL Azure, Azure Storage, and Azure Data Factory, SSIS, and PowerShell. Created processes to load data from Azure Storage blob to Azure SQL, to load from web API to Azure SQL and scheduled web jobs for daily loads.

Generated periodic reports based on the statistical analysis of the data from various time frame and division using Power BI.

Worked with ETL team for developing various mappings and workflows in SSIS as required by the design specifications.

Perform both record-level and large-scale manual additions, adjustments and corrections to continuously ensure overall data quality and integrity.

Involved in requirement gathering and database design and implementation of star-schema, snowflake schema/dimensional data warehouse using Erwin

Designed both 3NF data models for ODS and OLTP systems and dimensional data models using star and snowflake schemas.

Implemented Copy activity, Custom Azure Data Factory Pipeline Activities.

Environment: SQL, ETL, Azure Cloud (SQL, DW, Data Factory, Synapse, Storage Blob), GCP, SQL Server, Power BI, PySpark, Collibra, ER/Studio and MDM, Erwin/ER Studio, SAP, Snowflake.

HM Health, Aug ’20 – Feb ’22

Sr Data Modeler/ Data Engineer

Roles & Responsibilities:

•Analysis of functional and non-functional categorized data elements for data profiling and mapping from source to target data environment. Developed working documents to support findings and assign specific tasks.

•Responsible for all metadata relating to the EDW's overall data architecture, descriptions of data objects, access methods and security requirements and developed and automated multiple departmental Reports using Power BI and MS Excel.

•Responsible for full data loads from production to AWS Redshift staging environment and worked on migrating of EDW to AWS using EMR and various other technologies.

•Connected to AWS RedShift through Tableau to extract live data for real time analysis and worked on Normalization and De-Normalization concepts and design methodologies like Ralph Kimball and Bill Inmon's Data Warehouse methodology.

•Creating the ETL mappings using various Informatica transformations: Source qualifier, Data Quality, Lookup, Expression, Filter, Router, Sorter, Aggregator etc.

•Imported data from RDBMS environment into HDFS using Sqoop for report generation and visualization purpose using Tableau and worked in Loading and transforming large sets of structured, semi structured, and unstructured data.

•Examined the relationships and interactions between providers to understand referral patterns and network structures.

•Created profiles for each healthcare provider, including information on specialties, locations, years of experience, affiliations, and credentials.

•Data Vault methodology is utilized in various ways to address the complexities of modern data warehousing and enterprise data management.

•Key duties of a data vault include data integration, historical data storage, scalability and flexibility, data lineage and auditability, separation of concerns, etc.

•Analyzed patient demographics served by each provider to identify disparities and ensure equitable access to care.

•Designed and developed ETL integration patterns using Python on Spark; developed a framework for converting existing ETL mappings to PySpark (Python and Spark) jobs; created a PySpark frame to bring data from DB2 to Azure Storage Blob; and optimized the PySpark jobs to run on a Kubernetes cluster for faster data processing.

•Worked on migrating data from Teradata to Snowflake for consumption on Databricks, and worked on Teradata SQL queries, Teradata indexes, and utilities such as MLoad, TPump, FastLoad and FastExport.

•Worked with the DW architect to prepare the ETL design document and developed transformation logic to cleanse the source data of inconsistencies during the source to stage loading.

•Extensively used SSIS transformations such as Lookup, Derived column, Data conversion, Aggregate, Conditional split, SQL task, Script task and Send Mail task etc.

•Developed JSON Scripts for deploying the Pipeline in Azure Data Factory (ADF) that process the data using the SQL Activity.

•Provided extensive Production Support for Data Warehouse for internal and external data flows to Oracle DBMS from ETL servers via remote servers and Used heterogeneous data sources XML Files and Flat Files as source also imported stored procedures from Oracle for transformations.

•Used the Agile Scrum methodology to build the different phases of Software development life cycle.

•Performed Reverse Engineering of the legacy application using DDL scripts in ER Studio and developed Logical and Physical data models for Central Model consolidation.

•Working with data ingestions from multiple sources into the Azure SQL data warehouse

•Extract data from multiple systems, conduct detailed data analysis, and load data into SAP-MDM

•Created a complex semantic model by designing and developing Azure Analysis Services cubes over Azure SQL DW.

•Defining data governance process: processes for data quality rules definition, review process, communication plan, templates, etc. using Collibra.

•Closely worked with BI team to write DAX queries for building the Cube in Azure Analysis Services.

•Analyzed, Designed, and Developed OBIEE Metadata repository (RPD) that consists of Physical Layer, Business Mapping and Model Layer and Presentation Layer.

•Developing purging scripts and routines to purge data on Azure SQL Server and Azure Blob storage.

•Involved with data profiling for multiple sources and answered complex business questions by providing data to business users.

•Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.

•Worked on all types of transformations available in the Power BI query editor and wrote calculated columns and measures in Power BI Desktop to support sound data analysis.

•Performed data mining on Claims data using very complex SQL queries and discovered claims pattern.

•Generated DDL (Data Definition Language) scripts using ER Studio and assisted the DBA in the physical implementation of data models.

•Extensively used ETL methodology for supporting data extraction, transformations and loading processing, in a complex EDW using SSIS.

•Written complex SQL queries for validating the data against different kinds of reports generated by Power BI.

•Extensively used MS Access to pull data from various databases, integrate the data, and support metrics reporting, data mining and trend analysis in a helpdesk environment.

•Performed data analysis and data profiling using complex SQL on various sources systems including Oracle and Teradata.

•Identify & record defects with required information for issue to be reproduced by development team.

Environment: ER Studio, MS Visio, Oracle 11g, SAP, Data Vault, Oracle Designer, CRM, Hadoop, Power BI, Teradata, GIT, SQL Server 2010, SQL, PL/SQL, Hive, JIRA, ERP, and UNIX, Azure, Azure SQL and NoSQL Data Base, and SSIS ETL Tool.

Paccar - Bellevue, WA Feb ’18 – Jul ’20

Sr. Data Modeler/Analyst

Responsibilities:

•Worked with business users on requirements gathering, business analysis and project coordination, and translated business needs into data models supporting underwriting workstation services.

•Worked with the Application Development team to implement data strategies, build data flows and develop data models and designed and developed Use Cases, Activity Diagrams, Sequence Diagrams, OOD (Object oriented Design) using UML and Visio.

•Transformed Logical Data Model to Physical Data Model ensuring the Primary Key and Foreign key relationships in PDM, Consistency of definitions of Data Attributes and Primary Index considerations.

•Involved in Teradata utilities (BTEQ, Fast Load, Fast Export, Multiload, and Tpump) in both Windows and Mainframe platforms.

•Involved with Full Data warehouse Lifecycle Implementation upgrading the existing Legacy Data Warehouse to Enterprise Data warehouse using the Kimball’s Four Fixes approach by conforming the Non-conformed Dimensions, creating surrogate keys, delivering the atomic details and reducing redundancies and also designing from the scratch.

•Involved in the entire data Migration process from analyzing the existing data, cleansing, validating, translating tables, converting and subsequent upload into new platform.

•Generated DDL (Data Definition Language) scripts using ER Studio and assisted the DBA in the physical implementation of data models.

•Involved in writing T-SQL, working on SSIS, SSRS, SSAS, Data Cleansing, Data Scrubbing and Data Migration.

•Developed Conceptual, Logical and Physical data models for central model consolidation and used Normalization (1NF, 2NF & 3NF) and de-normalization techniques for effective performance in OLTP and OLAP systems.

•Generated ad hoc reports in Excel Power Pivot and shared them using Power BI to the decision makers for strategic planning and involved in developing Power BI reports and dashboards from multiple data sources using data blending.

•Worked on Performance Tuning of the database which includes indexes, optimizing SQL Statements and conducted data modeling JAD sessions and communicated data-related standards.

•Developed SQL Queries to fetch complex data from different tables in remote databases using joins, database links and Bulk collects.

•Used SSRS for generating Reports from Databases and Generated Sub-Reports, Drill down reports, Drill through reports and parameterized reports using SSRS.

•Used Model Mart of ER Studio for effective model management of sharing, dividing and reusing model information and design for productivity improvement.

•Implemented Forward engineering to create tables, views and SQL scripts and mapping documents and worked on PL/SQL programming Stored Procedures, Functions, Packages and Triggers.

•Wrote DDL and DML statements for creating, altering tables and converting characters into numeric values.

•Involved in development and implementation of SSIS, SSRS and SSAS application solutions for various business units across the organization.

Environment: ER Studio, OLTP, Power BI, ETL, Cognos, JIRA, SQL, PL/SQL, T-SQL, DB2, SSIS, SSRS, SSAS, Oracle, Alteryx, Teradata, Excel and Netezza

Key Bank - Cleveland, OH Apr ’16 – Jan ’18

Data Analyst

Responsibilities

•Involved in Data mapping specifications to create and execute detailed system test plans. The data mapping specifies what data will be extracted from an internal data warehouse, transformed and sent to an external entity.

•Analysed business requirements, system requirements, data mapping requirement specifications, and responsible for documenting functional requirements and supplementary requirements in Quality Center.

•Setting up of environments to be used for testing and the range of functionalities to be tested as per technical specifications.

•Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.

•Responsible for different Data mapping activities from Source systems to Teradata

•Created the test environment for Staging area, loading the Staging area with data from multiple sources.

•Responsible for analysing various data sources such as flat files, ASCII Data, EBCDIC Data, Relational Data (Oracle, DB2 UDB, MS SQL Server) from various heterogeneous data sources.

•Delivered files in various formats (e.g. Excel, tab-delimited text, comma-separated text, pipe-delimited text).

•Performed ad hoc analyses as needed.

•Designed SAP HANA attribute views, analytic views, and calculation views.

•Involved in testing the XML files and checked whether data is parsed and loaded to staging tables.

•Executed SAS jobs in batch mode through UNIX shell scripts and created remote SAS sessions to run the jobs in parallel, cutting down extraction time because the datasets were generated simultaneously.

•Reviewed and modified SAS Programs, to create customized ad-hoc reports, processed data for publishing business reports.

•Responsible for creating test cases to make sure the data originating from the source reaches the target correctly and in the right format.

•Tested several stored procedures and wrote complex SQL using CASE, HAVING, CONNECT BY, etc.

•Developed procedures and CDS table functions to implement the code-to-data paradigm and benefit from the capabilities of SAP HANA.

•Involved in Teradata SQL development, unit testing and performance tuning, and ensured testing issues were resolved based on defect reports.

•Tested the ETL process for both before data validation and after data validation process. Tested the messages published by ETL tool and data loaded into various databases

•Ensuring onsite to offshore transition, QA Processes and closure of problems & issues.

•Tested the database to check field size validation, check constraints, stored procedures and cross verifying the field size defined within the application with metadata.

Environment: SSIS, SSRS, Data Flux, Oracle 10g, SAP HANA, Quality Center, SQL, TOAD, PL/SQL, Flat Files, Oracle, SQL Server, UNIX shell scripts.

Charles Schwab, SFO, CA Feb ’12 – Mar ’16

DW Analyst

Responsibilities:

•Designed & Created Test Cases based on the Business requirements (Also referred Source to Target Detailed mapping document & Transformation rules document).

•Involved in extensive DATA validation using SQL queries and back-end testing

•Used SQL for Querying the database in UNIX environment

•Developed separate test cases for ETL process (Inbound & Outbound) and reporting

•Involved with Design and Development team to implement the requirements.

•Developed and Performed execution of Test Scripts manually to verify the expected results

•Design and development of ETL processes using Informatica ETL tool for dimension and fact file creation

•Involved in Manual and Automated testing using QTP and Quality Center.

•Conducted black-box testing (functional, regression and data-driven) and white-box testing (unit and integration), covering positive and negative scenarios.

•Tracked defects and reviewed, analyzed and compared results using Quality Center.

•Participated in MR/CR review meetings to resolve issues and defined the scope for System and Integration Testing.

•Prepared and submitted summarized audit reports and took corrective actions.

•Involved in Uploading Master and Transactional data from flat files and preparation of Test cases, Sub System Testing.

•Document and publish test results, troubleshoot and escalate issues

•Involved in test scheduling and milestones with dependencies, and in functionality testing of email notifications for ETL job failures, aborts or data issues.

•Identified, assessed and communicated potential risks associated with testing scope, product quality and schedule.

•Created and executed test cases for ETL jobs to upload master data to repository.

•Responsible to understand and train others on the enhancements or new features developed

•Conduct load testing and provide input into capacity planning efforts.

•Provide support to client with assessing how many virtual user licenses would be needed for performance testing, specifically load testing using Load Runner

•Create and execute test scripts, cases, and scenarios that will determine optimal system performance according to specifications.

•Modified the automated scripts from time to time to accommodate the changes/upgrades in the application interface.

•Tested the database to check field size validation, check constraints, stored procedures and cross verifying the field size defined within the application with metadata.

Environment: Windows XP, Informatica PowerCenter, QTP 9.2, Test Director, LoadRunner, Oracle 10g, UNIX AIX 5.2, PERL, Shell Scripting, SQL, SQL Server, Business Objects.


