
Senior Data Engineer

Location:
Chicago, IL
Posted:
October 27, 2023

Resume:

NAME: Sushma K

EMAIL: ad0nrk@r.postjobfree.com

CONTACT: +1-331-***-****

LINKEDIN: http://linkedin.com/in/sushma-r-674b12281s

SR. DATA ENGINEER

BACKGROUND SUMMARY:

●IT professional with 9 years of experience and a strong background in end-to-end enterprise data warehousing and big data projects.

●Excellent hands-on experience in requirement analysis and in designing, developing, testing, and maintaining complete data management and processing systems, including process documentation and ETL technical and design documents.

●Experience in designing and building the data management lifecycle, covering data ingestion, integration, consumption, and delivery, including reporting, analytics, and system-to-system integration.

●Proficient in big data environments, with hands-on experience utilizing Hadoop ecosystem components for large-scale processing of structured and semi-structured data.

●Strong experience with all phases including Requirement Analysis, Design, Coding, Testing, Support, and Documentation.

●Extensive experience with Azure cloud technologies such as Azure Data Lake Storage, Azure Data Factory, Azure SQL, Azure SQL Data Warehouse, Azure Synapse Analytics, Azure Analysis Services, Azure HDInsight, and Databricks.

●Solid knowledge of AWS services such as EMR, Redshift, S3, and EC2, including configuring servers for auto scaling and elastic load balancing.

●Hands-on experience with GCP services including Dataproc, Compute Engine (VMs), BigQuery, Dataflow, Cloud Functions, Pub/Sub, Composer, Secret Manager, and Cloud Storage.

●Hands-on experience with Databricks features including notebooks, Delta tables, SQL endpoints, Unity Catalog, secrets, and clusters.

●Extensive experience in IT data analytics projects, with hands-on experience migrating on-premises data and data processing pipelines to the cloud, including AWS and Azure.

●Experienced in dimensional modeling (star schema, snowflake schema), transactional modeling, and slowly changing dimensions (SCD).

●Expertise in transforming business resources and requirements into manageable data formats and analytical models, designing algorithms, building models, developing data mining and reporting solutions that scale across a massive volume of structured and unstructured data.

●Expertise in resolving production issues, with hands-on experience in all phases of the software development life cycle (SDLC).

●Aggregated data through Kafka, HDFS, Hive, Scala, and Spark Streaming in AWS.

●Well versed with big data on AWS cloud services, i.e., EC2, EMR, S3, Glue, DynamoDB, and Redshift.

●Developed, deployed, and managed event-driven and scheduled AWS Lambda functions, triggered in response to events from various AWS sources (including logging, monitoring, and security-related events) and invoked on a schedule to take backups.

●Used AWS Data Pipeline for data extraction, transformation, and loading from homogeneous or heterogeneous data sources, and built various graphs for business decision-making using the Python Matplotlib library. Worked with Cloudera and Hortonworks distributions.

●Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining, data acquisition, data preparation, data manipulation, feature engineering, machine learning, validation, visualization, and reporting solutions that scale across massive volumes of structured and unstructured data.

●Excellent performance in building and publishing customized interactive reports and dashboards with custom parameters and user filters, including tables, graphs, and listings, using tools such as Tableau.

●Experience in developing MapReduce programs using Apache Hadoop for analyzing big data as per requirements. Practical understanding of data modeling (dimensional and relational) concepts such as star-schema modeling, snowflake-schema modeling, and fact and dimension tables.

●Good knowledge of integrating Spark Streaming with Kafka for real time processing of streaming data.

●Experience in Designing end to end scalable architecture to solve business problems using various Azure Components like HDInsight, Data Factory, Data Lake, Storage and ML Studio.

●Created batch processing workflows in GCS to process large datasets, enabling efficient data analysis and reporting.

●Experience in developing customized UDFs in Python to extend Hive and Pig Latin functionality.

●Experience in Data Analysis, Data Profiling, Data Integration, Migration, Data governance and Metadata Management, Master Data Management (MDM) and Configuration Management.

●Scheduled jobs in Databricks using Databricks workflows.

●Used Spark Streaming APIs to perform transformations on the fly to build a common data model that gets data from Confluent Kafka in real time and persists it to Snowflake; a minimal sketch of this pattern follows this summary.

●Proficiency in SQL across several dialects (MySQL, PostgreSQL, SQL Server, and Oracle).

●Experience in working with NoSQL databases like HBase, DynamoDB.

●Hands-on use of Spark and Scala APIs to compare the performance of Spark with Hive and SQL, and Spark SQL to manipulate DataFrames in Scala.

●Good Experience in implementing and orchestrating data pipelines using Oozie and Airflow.

●Developed transformation logic using Snowpipe. Hands-on experience working with Snowflake utilities such as SnowSQL and Snowpipe.

●Created data transformation processes using ETL tools or custom scripts to preprocess and cleanse data before indexing it into Apache Solr.

●Extensively worked with the Teradata utilities FastExport and MultiLoad to export and load data to/from different source systems, including flat files.

●Developed ETL pipelines in and out of the data warehouse using a combination of Python and SnowSQL.

●Experience in working with Flume and NiFi for loading log files into Hadoop.

●Experience in creating Teradata SQL scripts using OLAP functions such as RANK and RANK() OVER to improve query performance when pulling data from large tables.

●Strong experience working with databases like Teradata and proficiency in writing complex SQL and PL/SQL for creating tables, views, indexes, stored procedures, and functions.

●Knowledge and experience with continuous integration and continuous deployment using Docker containers and Jenkins.

●Excellent working experience in Agile/Scrum development and Waterfall project execution methodologies.
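The Kafka-to-Snowflake streaming pattern referenced in the summary above can be outlined in a short PySpark sketch. This is a minimal illustration rather than project code: the broker address, topic, event schema, and Snowflake connection options are hypothetical placeholders, and it assumes the Spark Kafka source and the Snowflake Spark connector are available on the cluster.

```python
# Minimal sketch: consume events from Kafka with Structured Streaming and
# persist each micro-batch to Snowflake via the Snowflake Spark connector.
# Broker, topic, schema, and connection options are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kafka-to-snowflake-sketch").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_ts", TimestampType()),
])

# Read the raw Kafka stream (hypothetical broker and topic names).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "events")
       .load())

# Parse the JSON payload into the columns of the common data model.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), event_schema).alias("e"))
          .select("e.*"))

# Assumed Snowflake connector options; replace with real account settings.
sf_options = {
    "sfURL": "account.snowflakecomputing.com",
    "sfUser": "etl_user",
    "sfPassword": "***",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "ETL_WH",
}

def write_to_snowflake(batch_df, batch_id):
    # Append each micro-batch to the target table.
    (batch_df.write
     .format("net.snowflake.spark.snowflake")
     .options(**sf_options)
     .option("dbtable", "EVENTS")
     .mode("append")
     .save())

query = (events.writeStream
         .foreachBatch(write_to_snowflake)
         .option("checkpointLocation", "/tmp/checkpoints/events")
         .start())
```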

TECHNICAL SKILLS

Big Data Ecosystem:

HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Kafka, Flume, Cassandra, Impala, Oozie, Zookeeper, MapR, Amazon Web Services (AWS), EMR

Machine Learning:

Classification algorithms: Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors (KNN), Gradient Boosting Classifier, Extreme Gradient Boosting Classifier, Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayes Classifier, Extra Trees Classifier, Stochastic Gradient Descent, etc.

Cloud Technologies:

AWS, Azure, Google cloud platform (GCP)

IDEs:

IntelliJ, Eclipse, Spyder, Jupyter

Ensemble and Stacking:

Averaged Ensembles, Weighted Averaging, Base Learning, Meta Learning, Majority Voting, Stacked Ensemble, AutoML - Scikit-Learn, MLJAR, etc.

Databases:

Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server, HBASE

Programming:

Java, SQL, Python (Pandas, NumPy, SciPy, Scikit-Learn, Seaborn, Matplotlib, NLTK), NoSQL, PySpark, PySpark SQL, SAS, R, RStudio, PL/SQL, Linux shell scripts, Scala.

Data Engineer:

Big Data Tools / Cloud / Visualization / Other Tools: Databricks, Hadoop Distributed File System (HDFS), Hive, Pig, Sqoop, MapReduce, Spring Boot, Flume, YARN, Hortonworks, Cloudera, Oozie, Airflow, Zookeeper, etc.; AWS, Azure Databricks, Azure Data Explorer, Azure HDInsight, Salesforce, GCP, Google Cloud Shell, Linux, PuTTY, Bash shell, Unix, etc.; Tableau, Power BI, SAS, Web Intelligence, Crystal Reports, Dashboard Design.

PROFESSIONAL EXPERIENCE

CLIENT: Thomson Reuters, Chicago, IL

DURATION: Feb 2022 – Present

ROLE: Azure Data Engineer

Responsibilities:

●Worked with business and user groups to gather requirements and worked on the creation and development of pipelines.

●Proficiently managed version control repositories in Bitbucket, including Git and Mercurial, ensuring codebase integrity and collaboration among development teams.

●Experience in integrating Matillion with major cloud platforms like AWS, Azure, or Google Cloud to leverage cloud-native services for data processing.

●Worked on creating Azure Data Factory pipelines, managing policies for Data Factory, and utilizing Blob Storage for storage and backup on Azure.

●Developed processes to ingest data into the Azure cloud from web services and load it into Azure SQL DB.

●Experience in designing and developing data pipelines using Azure Synapse and related technologies.

●Proficient in designing, implementing, and managing data warehousing solutions using Snowflake on cloud platforms like AWS, Azure, or Google Cloud.

●Understanding of Synapse's integration with other Azure services, such as Azure Data Lake Storage and Azure Stream Analytics.

●Designed and developed end-to-end ETL (Extract, Transform, Load) workflows using Azure Data Factory to efficiently ingest, transform, and load data from diverse sources into data lakes and data warehouses.

●Demonstrated expertise in integrating data from diverse sources, such as databases, cloud storage, APIs, and flat files, using Matillion for seamless data flow.

●Expert Python programmer with a solid grasp of contemporary libraries and frameworks, able to use Python to create scalable and effective solutions for a range of applications.

●Proven track record in Azure Synapse performance tuning and optimization, including enhancing Spark workloads and SQL queries.

●Integrated Confluence with Jira to link documentation and project-related information, enabling seamless navigation between the two tools.

●Configured and managed Apache ZooKeeper clusters to ensure high availability and reliability for distributed applications.

●Experience designing and developing data models using Azure data modeling tools such as Azure SQL Database, Azure Synapse Analytics, and Azure Cosmos DB.

●Worked with Spark applications in Java and Scala in a distributed environment to load high-volume files with different schemas into DataFrames and processed them to reload into Azure SQL DB tables.

●Installed Kafka on the Hadoop cluster and configured producers and consumers in Java to establish a connection from the source to HDFS, filtering on popular hashtags.

●Used Apache Flink for data processing tasks, including online scraping and batch processing of data, as part of pipeline administration.

●Implemented and optimized an AWS Redshift data warehouse, improving query performance and enabling real-time analytics for business stakeholders.

●Designed and developed pipelines using Databricks and automated them for the ETL processes and the ongoing maintenance of the workloads.

●Worked on creating ETL packages using SSIS to extract data from various data sources like Access database, Excel spreadsheet, and flat files, and maintain the data using SQL Server.

●Worked with ETL operations in Azure Databricks by connecting to different relational databases using Kafka and used Informatica for creating, executing, and monitoring sessions and workflows.

●Proficiency in continuous integration/continuous deployment (CI/CD) practices and tools like Jenkins, GitLab CI, or CircleCI.

●Knowledge of database management systems such as PostgreSQL, MySQL, or NoSQL databases like MongoDB.

●Managed AWS EC2 instances, including provisioning, scaling, and monitoring, to ensure the availability and performance of applications.

●Used Databricks, Scala, and Spark for creating the data workflows and capturing the data from Delta tables in Delta Lakes.

●Developed triggers and user-defined functions in T-SQL to enforce data constraints and automate data-related actions.

●Worked with Delta Lake for consistent unification of streaming and batch processing and worked on ACID transactions using Apache Spark.

●Worked with Azure Blob Storage and developed the framework for the implementation of the huge volume of data and the system files.

●Implemented data transformation tasks within Azure Data Factory, including data cleansing, enrichment, and aggregation, ensuring data quality and consistency.

●Worked with PowerShell scripting for maintaining and configuring the data. Automated and validated the data using Apache Airflow.

●Worked on optimization of Hive queries using best practices and right parameters and using Hadoop, YARN, Java, and Spark.

●Used Sqoop to extract the data from Teradata into HDFS and export the patterns analyzed back to Teradata.

●Created DA specs and mapping data flows and provided the details to developers along with HLDs.

●Created Build definition and Release definition for Continuous Integration (CI) and Continuous Deployment (CD).

●Created Application Interface Document for the downstream to create new interface to transfer and receive the files through Azure Data Share.

●Created pipelines, data flows, and complex data transformations and manipulations using ADF and PySpark with Databricks (a condensed sketch follows this section).

●Integrated Google Cloud Storage with BigQuery to efficiently load and export data, enabling seamless data transfer and analysis.

●Ingested data in mini-batches and performed RDD transformations on those mini-batches using Spark Streaming to perform streaming analytics in Databricks.

●Created and provisioned the different Databricks clusters needed for batch and continuous streaming data processing and installed the required libraries on the clusters. Integrated Azure Active Directory authentication into every Cosmos DB request sent and demoed the feature to stakeholders.

●Implemented structured streaming solutions in Databricks to process and analyze real-time data from various sources, ensuring timely insights.

●Designed and developed a new solution to process near-real-time (NRT) data using Azure Stream Analytics, Azure Event Hubs, and Service Bus queues. Created a linked service to land data from an SFTP location into Azure Data Lake.

●Experienced Python developer with a track record of designing and implementing complex algorithms, streamlining processes, and contributing to successful software projects.

●Designed and implemented ETL processes using T-SQL to extract, transform, and load data from various sources into data warehouses or reporting systems.

●Working with complex SQL, Stored Procedures, Triggers, and packages in large databases from various servers.

●Developed complex SQL queries using stored procedures, Common table expressions, temporary table to support Power BI reports.

●Worked on Kafka to bring the data from data sources and keep it in HDFS systems for filtering.

●Implemented and maintained CI/CD pipelines using Jenkins to automate the build, test, and deployment processes for applications.

●Designed and modeled datasets with Power BI Desktop based on the measures and dimensions requested by customers and dashboard needs.

●Developed AWS Lambda functions to automate data processing tasks, reducing manual workload and enhancing system reliability through event-driven serverless architecture.

●Deployed applications in Kubernetes clusters, managing containerized workloads and ensuring high availability and scalability.

●Worked with Tableau for generating reports and created Tableau dashboards, pie charts, and heat maps according to the business requirements.

●Worked with all phases of Software Development Life Cycle and used Agile methodology for development.

Environment: Java, SQL, Cassandra DB, Azure Data Lake Storage Gen 2, Azure Data Factory, Azure SQL DB, Spark, Databricks, SSIS, SQL Server, Kafka, Informatica, Apache Spark, Delta Lake, Azure Event Hubs, Stream Analytics, Azure Blob Storage, PowerShell, Apache Airflow, Hadoop, YARN, PySpark, Hive, Teradata, Sqoop, HDFS, Spark, Agile.
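A condensed sketch of the Databricks transformation pattern referenced in the pipelines-and-data-flows bullet above: read raw files from ADLS Gen2, cleanse and aggregate them, and write a curated Delta table. The storage account, container paths, and column names are hypothetical placeholders, and it assumes a Databricks cluster with Delta Lake and ADLS access already configured; the real pipelines were orchestrated from ADF.

```python
# Minimal Databricks-style PySpark sketch: ingest raw CSVs from ADLS Gen2,
# cleanse and aggregate them, and persist the result as a Delta table.
# The storage account, container, paths, and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("adls-to-delta-sketch").getOrCreate()

raw_path = "abfss://raw@examplestorage.dfs.core.windows.net/sales/"

# Read the raw landing-zone files (schema inferred for brevity).
raw_df = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv(raw_path))

# Basic cleansing: drop records without keys, normalize types, deduplicate.
clean_df = (raw_df
            .dropna(subset=["order_id", "order_date"])
            .withColumn("order_date", F.to_date("order_date"))
            .withColumn("amount", F.col("amount").cast("double"))
            .dropDuplicates(["order_id"]))

# Simple aggregation for the curated layer.
daily_sales = (clean_df
               .groupBy("order_date", "region")
               .agg(F.sum("amount").alias("total_amount"),
                    F.count("order_id").alias("order_count")))

# Write the curated output as a Delta table, partitioned by date.
(daily_sales.write
 .format("delta")
 .mode("overwrite")
 .partitionBy("order_date")
 .save("abfss://curated@examplestorage.dfs.core.windows.net/daily_sales/"))
```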

CLIENT: Dish TV, Denver, CO

DURATION: Aug 2019 – Jan 2022

ROLE: AWS Data Engineer

Responsibilities:

●Worked in complete Software Development Life Cycle (SDLC) process by analyzing business requirements and understanding the functional workflow of information from source systems to destination systems.

●Utilizing analytical, statistical, and programming skills to collect, analyze and interpret large data sets to develop data-driven and technical solutions to difficult business problems using tools such as SQL, and Python.

●Worked on designing AWS EC2 instance architecture to meet high availability application architecture and security parameters.

●Created AWS S3 buckets and managed policies for S3 buckets and Utilized S3 buckets and Glacier for storage and backup.

●Built data processing pipelines using Google Cloud Dataflow, Apache Beam, or similar technologies to process and transform data stored in GCS.

●Utilized GCP BigQuery to process and analyze multi-terabyte datasets, delivering actionable insights to support strategic decision-making, resulting in a 20% increase in operational efficiency.

●Worked with different file formats like Parquet files and Impala using PySpark for accessing the data and performed Spark Streaming with RDDs and Data Frames.

●Performed the aggregation of log data from different servers and used them in downstream systems for analytics using Apache Kafka.

●Worked on designing and developing the SSIS Packages to import and export data from MS Excel, SQL Server, and Flat files.

●Worked on Data Integration for extracting, transforming, and loading processes for the designed packages.

●Designed event-driven architectures leveraging AWS Lambda to process events from various AWS services such as S3, DynamoDB, and API Gateway.

●Designed and deployed automated ETL workflows using AWS lambda, organized and cleansed the data in S3 buckets using AWS Glue, and processed the data using Amazon Redshift.

●Added query optimizer improvements to the ETL architecture to boost performance.

●Made use of BigQuery's partitioned and clustered tables to arrange and streamline data storage for quicker query execution.

●Implemented extraction of data using Spark and Hive and handled large datasets using HDFS.

●Worked on streaming data transfer from different data sources into HDFS and NoSQL databases.

●Created ETL Mapping with Talend Integration Suite to pull data from Source, apply transformations, and load data into the target database.

●Worked on scripting with Python in Spark for transforming the data from various files like Text files, CSV, and JSON.

●Loaded the data from different relational databases like MySQL and Teradata using Sqoop jobs.

●Worked on processing the data and testing using Spark SQL and on real-time processing by Spark Streaming and Kafka using Python.

●Scripted using Python and PowerShell for setting up baselines, branching, merging, and automation processes across the process using GIT.

●Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources like S3 (ORC/Parquet/text files) into AWS Redshift.

●Used various Spark transformations and actions for cleansing the input data.

●Used Jira for ticketing and tracking issues and Jenkins for continuous integration and continuous deployment.

●Enforced standards and best practices around data catalog and data governance efforts.

●Created DataStage jobs using different stages like Transformer, Aggregator, Sort, Join, Merge, Lookup, Data Set, Funnel, Remove Duplicates, Copy, Modify, Filter, Change Data Capture, Change Apply, Sample, Surrogate Key, Column Generator, Row Generator, Etc.

●Expertise in creating, debugging, scheduling, and monitoring jobs using Airflow for ETL batch processing to load into Snowflake for analytical processes (a minimal DAG sketch follows this section).

●Worked on building ETL pipelines for data ingestion, data transformation, and data validation on the AWS cloud, working along with data stewards under data compliance.

●Worked on scheduling jobs using Airflow scripts in Python, adding different tasks to DAGs, including AWS Lambda invocations.

●Used PySpark for extracting, filtering, and transforming data in data pipelines.

●Skilled in monitoring servers using Nagios and CloudWatch and using the ELK stack (Elasticsearch and Kibana).

●Used dbt (Data Build Tool) for transformations in the ETL process, along with AWS Lambda and AWS SQS.

●Worked on scheduling all jobs using Airflow scripts in Python, adding different tasks to DAGs and defining dependencies between the tasks.

●Documented data validation processes, test cases, and results, ensuring that they are easily accessible and understandable to stakeholders and team members.

●Conducted training sessions for team members on data validation techniques and best practices, resulting in improved data quality awareness and knowledge across the organization.

●Experience in developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats for analyzing and transforming the data to uncover insights into customer usage patterns.

●Responsible for estimating the cluster size and for monitoring and troubleshooting the Spark Databricks cluster.

●Created Unix Shell scripts to automate the data load processes to the target Data Warehouse.

●Worked on the implementation of the ETL architecture for enhancing the data and optimized workflows by building DAGs in Apache Airflow to schedule the ETL jobs, using additional Apache Airflow components such as pools, executors, and multi-node functionality.

●Used various transformations in SSIS Data Flow and Control Flow, including For Loop containers and Fuzzy Lookup.

●Worked on creating SSIS packages for Data Conversion using data conversion transformation and producing the advanced extensible reports using SQL Server Reporting Services.

Environment: Python, SQL, AWS EC2, AWS S3 buckets, Hadoop, PySpark, AWS lambda, AWS Glue, Amazon Redshift, Spark Streaming, Apache Kafka, SSIS, Informatica, ETL, Hive, HDFS, NoSQL, Talend, MySQL, Teradata, Sqoop, PowerShell, GIT, Apache Airflow.
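The Airflow scheduling described in the bullets above can be illustrated with a minimal DAG. This is a sketch under assumptions, not production code: the DAG id, schedule, and task bodies are placeholders, the extract/transform/load callables only hint at the boto3 and Snowflake calls they would wrap, and the operator import assumes Airflow 2.x.

```python
# Minimal Airflow DAG sketch: a daily batch job that extracts a file from S3,
# transforms it, and loads it into Snowflake. Bucket, table, and connection
# details are placeholders; the operator import assumes Airflow 2.x.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull the day's extract from S3 (e.g., with boto3) to a
    # local staging path and hand that path to the downstream tasks.
    return "/tmp/staging/daily_extract.csv"


def transform(**context):
    # Placeholder: cleanse and reshape the staged file (e.g., with pandas
    # or PySpark) and write the curated output back to staging.
    pass


def load(**context):
    # Placeholder: load the curated file into Snowflake, e.g., through the
    # Snowflake connector or a COPY INTO statement.
    pass


default_args = {
    "owner": "data-engineering",
    "retries": 1,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_snowflake_batch_load",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract_task = PythonOperator(task_id="extract_from_s3", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load_to_snowflake", python_callable=load)

    # Task dependencies: extract -> transform -> load.
    extract_task >> transform_task >> load_task
```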

CLIENT: BCBS, Chicago, IL

DURATION: Jan 2018 – July 2019

ROLE: Data Engineer

Responsibilities:

●Migrated data from on-premises systems to AWS storage buckets.

●Developed a Python script to transfer data from on-premises systems to AWS S3.

●Developed a Python script to call REST APIs and extract data to AWS S3 (a minimal sketch follows this section).

●Worked on Ingesting data by going through cleansing and transformations and leveraging AWS Lambda, AWS Glue and Step Functions.

●Created YAML files for each data source, including Glue table stack creation.

●Worked on a Python script to extract data from Netezza databases and transfer it to AWS S3.

●Developed Lambda functions and assigned IAM roles to run Python scripts along with various triggers (SQS, EventBridge, SNS).

●Designed and implemented Sqoop for the incremental job to read data from DB2 and load to Hive tables and connected to Tableau for generating interactive reports using Hive server2.

●Created a Lambda Deployment function and configured it to receive events from S3 buckets.

●Wrote UNIX shell scripts to automate the jobs and scheduled cron jobs for job automation using crontab.

●Developed various Mappings with the collection of all Sources, Targets, and Transformations using Informatica Designer.

●Developed mappings using transformations like Expression, Filter, Joiner, and Lookup for better data massaging and to migrate clean and consistent data.

●Used Sqoop to channel data from different sources of HDFS and RDBMS.

●Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats.

●Used Spark Streaming to receive real-time data from Kafka and stored the stream data in HDFS using Python and in NoSQL databases such as HBase and Cassandra.

●Collected data using Spark Streaming from an AWS S3 bucket in near real time and performed the necessary transformations and aggregations on the fly to build the common learner data model, persisting the data in HDFS.

●Used Apache NiFi to copy data from local file system to HDP.

●Worked on Dimensional and Relational Data Modeling using Star and Snowflake Schemas, OLTP/OLAP system, Conceptual, Logical and Physical data modeling using Erwin.

●Automated data processing with Oozie to load data into the Hadoop Distributed File System (HDFS).

Environment: Big Data 3.0, Hadoop 3.0, Oracle 12c, PL/SQL, Scala, Spark SQL, PySpark, Python, Kafka 1.1, SAS, Azure SQL, MDM, Oozie 4.3, SSIS, T-SQL, ETL, HDFS, Cosmos, Pig 0.17, Sqoop 1.4, MS Access.
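A minimal sketch of the REST-API-to-S3 extraction scripts mentioned above, using requests and boto3. The endpoint URL, bucket, and key prefix are hypothetical placeholders, and AWS credentials are assumed to come from the environment or an attached IAM role.

```python
# Minimal sketch: call a REST API and land the JSON response in an S3 bucket.
# The endpoint URL, bucket, and key prefix are hypothetical placeholders.
import json
from datetime import datetime, timezone

import boto3
import requests

API_URL = "https://api.example.com/v1/records"   # placeholder endpoint
BUCKET = "example-raw-zone"                      # placeholder bucket
PREFIX = "rest_api/records"


def extract_to_s3():
    # Pull one page of records from the REST API.
    response = requests.get(API_URL, params={"limit": 1000}, timeout=30)
    response.raise_for_status()
    records = response.json()

    # Write the raw payload to S3, keyed by the extraction timestamp.
    run_ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    key = f"{PREFIX}/extract_{run_ts}.json"
    s3 = boto3.client("s3")
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(records).encode("utf-8"))
    return key


if __name__ == "__main__":
    print(f"Wrote extract to s3://{BUCKET}/{extract_to_s3()}")
```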

CLIENT: Experian Technologies, Bangalore, India

DURATION: Apr 2016- Oct 2017

ROLE: Database Developer

Responsibilities:

●Coordinated with the front-end design team to provide them with the necessary stored procedures and packages and the necessary insight into the data.

●Worked on SQL*Loader to load data from flat files obtained from various facilities every day.

●Created and modified several UNIX shell scripts according to the changing needs of the project and client requirements.

●Wrote UNIX shell scripts to process the files on a daily basis, such as renaming files, extracting dates from file names, unzipping files, and removing junk characters before loading them into the base tables (a Python sketch of this file prep follows this section).

●Involved in the continuous enhancements and fixing of production problems.

●Generated server-side PL/SQL scripts for data manipulation and validation and materialized views for remote instances.

●Developed PL/SQL triggers and master tables for automatic creation of primary keys.

●Created PL/SQL stored procedures, functions, and packages for moving the data from the staging area to the data mart.

●Created scripts to create new tables, views, and queries for new enhancements in the application using TOAD.

●Created indexes on the tables for faster retrieval of the data to enhance database performance.

●Involved in data loading using PL/SQL and SQL*Loader calling UNIX scripts to download and manipulate files.

●Performed SQL and PL/SQL tuning and Application tuning using various tools like EXPLAIN PLAN, SQL*TRACE, and auto trace.

●Extensively involved in using hints to direct the optimizer to choose an optimum query execution plan.

●Implemented business logic using SQL Navigator.

●Involved in creating UNIX shell scripts; performed defragmentation of tables, partitioning, compression, and indexing for improved performance and efficiency. Involved in table redesign with the implementation of partitioned tables and partitioned indexes to make the database faster and easier to maintain.

●Experience in Database Application Development, Query Optimization, Performance Tuning, and DBA solutions and implementation experience in complete System Development Life Cycle.

●Used SQL Server SSIS tool to build high-performance data integration solutions including extraction, transformation, and load packages for data warehousing. Extracted data from the XML file and loaded it into the database.

●Designed and developed Oracle forms & reports generating up to 60 reports.

●Performed modifications on existing form as per change request and maintained it.

●Used Crystal Reports to track logins, mouse overs, click-throughs, session durations, and demographical comparisons with SQL database of customer information.

●Used standard packages like UTL_FILE, DBMS_SQL, and PL/SQL Collections, and used BULK binding.

●Involved in writing database procedures, functions, and packages for the front-end module.

●Used principles of normalization to improve performance. Involved in writing ETL code using PL/SQL to meet requirements for extraction, transformation, cleansing, and loading of data from source to target data structures.

●Designed, implemented, and tuned interfaces and batch jobs using PL/SQL.

●Involved in data replication and high availability design scenarios with Oracle Streams. Developed UNIX Shell scripts to automate repetitive database processes.

●Developed code fixes and enhancements for inclusion in future code releases and patches.

Environment: PL/SQL, SQL, SSIS, Database, Oracle, TOAD, ETL, Crystal Reports, Unix Scripts
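The daily file preparation ahead of the SQL*Loader runs described above can be sketched as follows; the original automation used UNIX shell scripts, so this Python version is only an illustration kept in the same language as the other examples. Directory paths, file-name patterns, the control file, and the connect string are hypothetical placeholders.

```python
# Minimal sketch: daily file prep (unzip, strip junk characters, rename with
# the extract date) followed by a SQL*Loader invocation. Paths, patterns, and
# credentials are placeholders; the original jobs were UNIX shell scripts.
import gzip
import re
import shutil
import subprocess
from datetime import date
from pathlib import Path

INBOX = Path("/data/inbox")        # placeholder landing directory
STAGING = Path("/data/staging")    # placeholder staging directory


def prepare_file(src: Path) -> Path:
    """Unzip, clean, and rename an incoming facility file."""
    # Extract a YYYYMMDD date embedded in the incoming file name, if any.
    match = re.search(r"(\d{8})", src.name)
    file_date = match.group(1) if match else date.today().strftime("%Y%m%d")

    # Unzip .gz files into staging; copy plain files as-is.
    staged = STAGING / f"facility_{file_date}.dat"
    if src.suffix == ".gz":
        with gzip.open(src, "rb") as fin, open(staged, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    else:
        shutil.copy(src, staged)

    # Remove junk (non-printable control) characters before loading.
    junk = bytes(range(0, 9)) + bytes(range(11, 32))
    staged.write_bytes(staged.read_bytes().translate(None, junk))
    return staged


def run_sqlldr(data_file: Path) -> None:
    # Invoke SQL*Loader with a pre-built control file (placeholder credentials).
    subprocess.run(
        ["sqlldr", "userid=app_user/***@ORCL",
         f"control={STAGING}/facility.ctl",
         f"data={data_file}",
         f"log={data_file.with_suffix('.log')}"],
        check=True,
    )


if __name__ == "__main__":
    for incoming in sorted(INBOX.glob("facility_*")):
        run_sqlldr(prepare_file(incoming))
```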

CLIENT: Accord Solutions, Hyderabad, India

DURATION: Jun 2014 – Mar 2016

ROLE: Data Analyst

Responsibilities:

●Gathered requirements from business users and created technical specification documents.

●Designed the database tables and reviewed new report standards to ensure optimized performance under the new reporting service SSRS.

●Planned, designed, and documented the optimum usage of space requirements and distribution of space for the Data Warehouse.

●Worked with MS SQL Server and managed the programs using MS SQL Server Setup.

●Designed and developed packages for data warehousing and Data Migration projects using Integration services and SSIS on MS SQL Server.

●Extracted the data from Oracle, Flat Files, transformed and implemented required Business Logic, and Loaded into the target Data warehouse using SSIS.

●Created OLAP cubes on top of the data warehouse based on various fact and dimension tables for analysis purposes using SQL Server Analysis Services.

●Worked with the setup and implementation of the reporting servers, written T-SQL queries, and Stored Procedures, and used them to build


