Data Analyst

Location:

Rancho Cordova, CA

Posted:

July 11, 2024

Resume:

Sriharshini A

Data Engineer

Contact Number: 773-***-****

Email ID: ********.********@*****.***

Professional Summary:

• 4+ years of industry experience as a Data Engineer, with a solid understanding of data modeling, data source evaluation, Data Warehouse/Data Mart design, ETL, BI, OLAP, and client/server applications.

• Strong working knowledge and experience in Agile-Scrum environments

• Excellent understanding of business operations and analytics tools for effective analysis of data.

• Experience in validating and analyzing Hadoop log files.

• Excellent experience with DW concepts such as Star Schema, Snowflake Schema, Fact and Dimension tables, and physical and logical data modelling.

• Used Python and Django for creating graphics, XML processing, data exchange, and business logic implementation.

• Strong skills in visualization tools Power BI, MS Excel - formulas, Pivot Tables, Charts and DAX Commands.

• Created and deployed reports and dashboards in Power BI web services, Tableau, and Azure Analytics Services, visualizing key performance indicators and trends to facilitate data-driven decision-making.

• Experienced working in cloud-based data warehousing, ADF, Azure Databricks, Spark, and AWS.

• Prepared reports in Excel using pivot tables, VLOOKUPs, and macros.

• Experience in validating MapReduce jobs to support distributed processing using Java, Hive, and Pig.

• Worked with the AWS Cloud platform and its features, including EC2, VPC, RDS, EBS, S3, CloudWatch, CloudTrail, CloudFormation, and Auto Scaling.

• Working experience in Data modeling, SQL, ETL, CI/CD pipelines, task automation using scripting and Data Warehousing

• Experience in loading multiple large datasets into HDFS and processing them using Hive and Pig.

• Experienced in building Cloud Data solutions with Snowflake, Reporting Data Lake, AWS Cloud and Looker

• Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export through the use of multiple ETL tools such as Ab Initio and Informatica PowerCenter.

• Experience in testing and writing SQL and PL/SQL statements: Stored Procedures, Functions, Triggers, and Packages.

• Experience writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).

• Skilled in defect management using Quality Center/ALM, ClearQuest and JIRA.

• Experience in the MapReduce programming model for analyzing data stored in HDFS.

• Proficient in Functional, Regression, Integration, End to End and User Acceptance (UAT) testing.

• An excellent team player and technically strong professional, able to work with business users, project managers, team leads, architects, and peers while maintaining a healthy project environment.

Technical Skills:

Languages: T-SQL, PL/SQL, SQL, C, C++, XML, HTML, DHTML, HTTP, MATLAB, DAX, Python, R

Statistical Analysis: R, Python, SAS E-miner 7.1, SAS Programming, MATLAB, JMP 8.0, Minitab

Databases: SQL Server, MS Access, Oracle 9i/10g/11g/12c, Teradata, big data, Hadoop

Cloud: AWS and Azure

DWH / BI Tools: Microsoft Power BI, Tableau, SSIS, SSRS, SSAS, Business Intelligence Development Studio (BIDS), Visual Studio, Crystal Reports, Informatica 6.1

Database Design Tools and Data Modelling: MS Visio, ERWIN 4.5/4.0, Star Schema/Snowflake Schema modelling, Fact & Dimension tables, physical & logical data modelling, Normalization and De-normalization techniques, Kimball & Inmon methodologies, and Informatica

Tools and Utilities: SQL Server Management Studio, SQL Server Enterprise Manager, SQL Server Profiler, Import & Export Wizard, Microsoft Management Console, Visual SourceSafe 6.0, DTS, Crystal Reports, Power Pivot, ProClarity, Microsoft Office, Excel Power Pivot, Excel Data Explorer, Tableau

Project Experience:

Tata Consultancy Services Jan 2022 – Sep 2022

Data Engineer

• Responsible for gathering requirements from business analysts and operational analysts and identifying the data sources required for the reports.

• Managed Amazon Redshift clusters, including launching clusters with specified nodes and running data analysis queries.

• Updated Python scripts to match training data with our database stored in Amazon CloudSearch, so that each document could be assigned a response label for further classification.

• Tuned the SQL queries for optimum performance.

• Identified relationships between metadata tables to determine how a selected table relates to all other tables in the data source.

• Implemented Python modules to send automated emails to clients at regular intervals.

• Conducted User Acceptance Testing (UAT) and worked with the users and the vendor who built the system.

• Participated in all phases of data mining, data cleaning, data collection, developing models, validation, and visualization and performed Gap analysis.

• Developed scripts to connect to different data sources and migrate data to Snowflake.

• Designed and developed Databricks Python scripts to extract data from One lake as an automated, scheduled job.

• Pulled data from various sources (Snowflake, Redshift, MSSQL, Oracle, and APIs).

• Loaded data files into the AWS environment and performed analysis using SQL on Amazon Redshift and Snowflake.

• Analysed large data sets to find patterns by extracting the data, cleaning outliers using Pandas, and publishing the results as tables and graphs.

• Loaded data into Redshift from S3 buckets.

• Communicated with business users and analysts on business requirements. Gathered and documented technical and business Metadata about the data.

• Good understanding of geographical maps, bar charts, bubble charts, and line graphs, helping end users get a clear view of the data through visualizations.

• Worked with data governance, data quality, data lineage, and data architecture teams to design various models and processes.

• Designed automation scripts and batch jobs to create data pipelines between multiple data sources, a Spark-based analytics platform (Databricks), and Amazon S3.

• Created data quality scripts using SQL and Hive to validate successful data loads and the quality of the data.

• Involved in all phases of analytics using Python and Jupyter Notebook.

• Used Python scripts to update database content and manipulate files.

• Developed complex metrics required for daily business reports in Spark SQL and Snowflake SQL.

• Performed Exploratory Data Analysis and Data Visualizations using Tableau.

• Created views in Tableau Desktop that were published to internal team for review and further data analysis and customization using filters and actions.

• Created interactive dashboards with filters in Tableau Desktop 2019.2.15.

• Created repositories in GitHub and developed wrapper scripts to download the processes stored in GitHub.

Environment: SQL, Python, Data Governance, AWS S3, Redshift, Hive, Tableau, Databricks, Teradata, JIRA, UNIX, Snowflake.

Tata Consultancy Services Jan 2020 – Jan 2022

Data Analyst

• Effectively led multiple client projects that made heavy use of Python, SQL, Tableau, and data modelling.

• Created Power BI reports and upgraded Power Pivot reports to Power BI.

• Created detailed reports for management.

• Reported daily on returned survey data and thoroughly communicated survey progress statistics, data issues, and their resolution.

• Involved in data analysis and quality checks.

• Migrated data from on-premises data sources to Azure Databricks.

• Extracted, transformed, and loaded source data to generate CSV data files using Python and SQL queries.

• Stored and retrieved data from data-warehouses using Amazon Redshift.

• Worked on datasets of various file types, including HTML, Excel, PDF, and Word, and on conversions between them.

• Mined and analyzed data from company databases to drive optimization and improvement of product development, marketing techniques, and business strategies.

• Extensively created data pipelines in the cloud using Azure Data Factory.

• Worked with Azure Data Factory (ADF), a managed cloud service, to compose and orchestrate Azure data services.

• Developed a master data flowchart that was used to measure the completion of study objectives.

• Served as primary contact for the acceptance or rejection of surveys where unique or rare issues were involved.

• Utilized Power Query in Power BI to pivot and unpivot the data model for data cleansing and data massaging.

• Designed and developed business intelligence dashboards, analytical reports and data visualizations using Power BI by creating multiple measures using DAX expressions for user groups.

• Performed database and ETL development per new requirements and was actively involved in improving overall system performance by optimizing slow-running and resource-intensive queries.

• Developed data mapping documentation to establish relationships between source and target tables including transformation processes using SQL.

• Worked extensively with Tableau Business Intelligence tool to develop various dashboards.

• Created the source-to-target mapping spreadsheet detailing the source and target data structures and the transformation rules around them.

• Collaborated with stakeholders to define business requirements and translate them into technical specifications for data analysis and reporting solutions, ensuring alignment with SLAs and data governance policies.

• Wrote Python scripts to parse files and load the data into the database, extract weekly information from the files, and clean the raw data.

• Participated in data modeling discussions and provided input on both logical and physical data modelling.

• Raised risks and issues facing the project and worked towards resolving them.

• Created a master data workbook representing the ETL requirements, such as mapping rules, physical data element structures, and their descriptions.

Environment: Teradata, UNIX Shell Scripts, Azure, Azure Data Factory, Databricks, Power BI, MS Excel, MS PowerPoint, Python, SQL, Hadoop, Spark

Cyient, India May 2018 – Dec 2019

ETL Developer

• Obtained customer data from different systems and aggregated it within the data warehouse using Informatica.

• Documented the functional flows using MS Visio.

• Developed and tested stored procedures, functions, and packages in PL/SQL for data ETL.

• Implemented slowly changing dimensions to maintain current and historical information in dimension tables.

• Wrote unit tests for the developed scripts so they passed quality checks before deployment.

• Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.

• Developed ETL jobs per the requirements to update data in the staging database (Postgres) from various data sources and REST APIs.

• Wrote documentation describing program development, logic, coding, testing, changes, and corrections.

• Created complex Cognos reports using calculated data items and multiple lists in a single report.

• Prepared functional and technical documentation of the reports for future reference.

• Worked with data modelers to prepare logical and physical data models and to add or delete fields as needed using Erwin.

Environment: ETL, Cognos, Informatica, PL/SQL, SQL, MS Office, MS Excel, Windows.
