
Business Intelligence Data Engineering

Location:
Memphis, TN
Posted:
April 09, 2025


Resume:

Nandini Vidadala

901-***-**** ******************@*****.*** LinkedIn

SUMMARY

Results-driven Data Engineering Professional with 3.5+ years of experience specializing in building data pipelines, optimizing data workflows, and delivering actionable insights. Proficient in Python, SQL, and Tableau, with hands-on experience in developing automated reporting solutions and enhancing business intelligence capabilities. Skilled in utilizing ETL tools, cloud platforms, and data visualization tools to support business functions across a range of domains, including Wholesale, Direct-to-Consumer (D2C), and Financial Services. Adept at improving data accuracy, reducing processing time, and ensuring compliance with data security and governance standards.

AREAS OF EXPERTISE

Data Engineering & Analysis: Data Ingestion, ETL Development, Data Pipeline Automation, Data Transformation, Data Modeling, Data Migration, Data Quality Assurance

Business Intelligence & Reporting: Data Visualization, Dashboard Creation, Automated Reporting, Predictive Analysis, Statistical Analysis

Tools & Technologies: SQL, Python, PySpark, Tableau, Power BI, Apache Airflow, Databricks, Apache Nifi, Jenkins, Informatica

Cloud Platforms: AWS (S3, Redshift, EC2), Azure (Blob Storage, Synapse Analytics, Data Factory)

Databases & Data Management: SQL Server, PostgreSQL, MySQL, MongoDB, Snowflake, Oracle

Version Control & Automation: Git, Jenkins, CI/CD, Docker, Ansible, Chef, Puppet

TECHNICAL SKILLS

•Analytical Tools: SQL, SAP Analytics Cloud, Teradata, HiveQL

•Programming & Scripting: SQL, Python, Scala, PySpark, R, NLP (Natural Language Processing), Java

•Data Engineering: Data Ingestion, Pipeline Development, Pipeline Automation, ETL, Data Migration

•Databases: Oracle, SQL Server, MySQL, MongoDB, Redshift, Snowflake, MS Access, PL/SQL

•ETL Tools: Informatica, Oracle Data Integrator (ODI), Azure Data Factory, Databricks

•BI Tools & Practices: MS Excel, Tableau, Power BI, Data Governance, Data Quality Improvement, Data Visualization Solutions

•Cloud Services: AWS (EC2, S3, Redshift, QuickSight), Azure (Blob Storage, Data Factory, Azure SQL)

•Vertical Exposure: Marketing, Sales, Customer Service, Financial Services

•Data Engineering Tools: Apache Spark, Kafka, Hadoop (HDFS), Apache Nifi, Databricks

•Testing & Automation: Jira, Shell Scripting, Test Cases

WORK EXPERIENCE

August 2020 – December 2022, Data Engineer, Fujitsu Consulting India Private Ltd, Chennai, India

Project: Financial Data Engineering and Reporting Optimization

Built and managed over 50 data pipelines using Python and PySpark, ensuring seamless data flow across Fujitsu’s systems, which enhanced data processing speed by 30%.

Assisted in the migration of on-prem SQL databases to Azure Synapse Analytics via Azure Data Factory, achieving high data availability and reducing system downtime by 25%.

Developed real-time visualizations in Power BI, automating data refresh processes using Python for accurate reporting.

Implemented automated workflows with Apache Airflow, minimizing manual intervention and increasing workflow efficiency, leading to a 40% reduction in processing time.
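
For illustration, a minimal sketch of the kind of Airflow DAG described above might look like the following (DAG name, schedule, and task logic are assumptions, not the original workflow):

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Pull raw records from the source system (placeholder logic).
    ...


def transform():
    # Apply business rules and cleansing (placeholder logic).
    ...


def load():
    # Write curated data to the reporting warehouse (placeholder logic).
    ...


with DAG(
    dag_id="finance_reporting_pipeline",  # hypothetical name
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load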

Configured Kinesis Firehose to automatically load streaming data into Amazon S3 and Redshift, enabling efficient storage and querying for downstream processing.
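
A minimal boto3 sketch of creating such a Firehose delivery stream, shown here with an S3 destination (stream name, IAM role, bucket, and buffering values are hypothetical; a Redshift destination is configured analogously):

import boto3

firehose = boto3.client("firehose")
firehose.create_delivery_stream(
    DeliveryStreamName="orders-to-s3",  # hypothetical stream name
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
        "BucketARN": "arn:aws:s3:::example-landing-bucket",
        "Prefix": "raw/orders/",
        # Buffer records into larger objects before delivery
        "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
    },
)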

Wrote KornShell (KSH) scripts to automate data pipelines, file transfers, and scheduled tasks, reducing manual intervention and ensuring reliable data flow across the platform.

Designed and implemented logical and physical data models for relational databases (SQL), ensuring efficient storage, retrieval, and consistency of data for various business applications.

Conducted performance tuning of AWS resources (Kinesis, Lambda, EMR) and SQL queries (PL/SQL and SparkSQL) to ensure optimal performance and cost efficiency.

Used CloudWatch and custom logging to monitor Lambda executions, Kinesis stream throughput, and EMR cluster resource utilization.

Implemented graph traversal algorithms to identify shortest paths, centrality measures, and community structures within the data, enabling advanced analytics.
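
A small illustrative sketch of these graph analytics using networkx (the library and the sample entities are assumptions; the original system may have used different tooling):

import networkx as nx
from networkx.algorithms import community

# Hypothetical customer/product interaction graph
G = nx.Graph()
G.add_edges_from([
    ("customer_1", "product_A"),
    ("customer_2", "product_A"),
    ("customer_2", "product_B"),
    ("customer_3", "product_B"),
])

path = nx.shortest_path(G, source="customer_1", target="customer_3")  # shortest path
centrality = nx.degree_centrality(G)                                  # centrality measures
communities = community.greedy_modularity_communities(G)              # community structure

print(path, centrality, list(communities))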

Enforced data governance policies, ensuring data privacy and compliance with regulations like GDPR and HIPAA.

Gained expertise in configuration management and automation using Chef (including Chef with Jenkins), Puppet, Ansible, and Docker; configured Docker containers for branch-based builds and deployed them using AWS Elastic Beanstalk.

Contributed to the creation of Tableau and Power BI visualizations, providing actionable insights that supported key business decisions and increased data accessibility for stakeholders.

Implemented AWS Lambda functions to process incoming data in real-time from Kinesis Streams and trigger further operations like data transformation and analytics.
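
A minimal Lambda handler sketch for such Kinesis-triggered processing (the record schema and the transformation rule are illustrative assumptions):

import base64
import json


def lambda_handler(event, context):
    transformed = []
    for record in event["Records"]:
        # Kinesis delivers payloads base64-encoded inside the event
        payload = base64.b64decode(record["kinesis"]["data"])
        item = json.loads(payload)
        # Hypothetical transformation: normalize amounts to cents
        item["amount_cents"] = int(float(item.get("amount", 0)) * 100)
        transformed.append(item)
    # Downstream, the records would be forwarded for analytics or storage
    return {"processed": len(transformed)}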

Implemented a cloud-based data warehouse using Amazon Redshift, Google BigQuery, or Snowflake to integrate large volumes of structured and semi-structured data.

Set up IAM roles and policies within AWS, Google Cloud, or Azure to ensure appropriate access control to the data warehouse and other services.
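
A hedged boto3 sketch of the AWS variant of this access control (role name, bucket, and policy scope are hypothetical):

import json

import boto3

iam = boto3.client("iam")

# Role that the warehouse (e.g., Redshift) can assume
iam.create_role(
    RoleName="redshift-s3-read-role",
    AssumeRolePolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "redshift.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }),
)

# Inline policy granting read-only access to a single landing bucket
iam.put_role_policy(
    RoleName="redshift-s3-read-role",
    PolicyName="read-landing-bucket",
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-landing-bucket",
                "arn:aws:s3:::example-landing-bucket/*",
            ],
        }],
    }),
)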

Implemented real-time data flow using stream-processing technologies (e.g., Apache Kafka, Apache Flink) to manage continuous data ingestion, processing, and storage, ensuring low-latency and high-throughput data pipelines.

Used a variety of techniques and scripting languages (e.g., SQL, Python, Shell scripting) to cleanse and manipulate raw data, removing duplicates, handling missing values, and standardizing data formats for analysis.
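
A simple pandas sketch of this kind of cleansing (file and column names are hypothetical):

import pandas as pd

df = pd.read_csv("raw_transactions.csv")  # assumed raw extract

df = df.drop_duplicates()                                          # remove exact duplicates
df["amount"] = df["amount"].fillna(0)                              # handle missing values
df["txn_date"] = pd.to_datetime(df["txn_date"], errors="coerce")   # standardize dates
df["region"] = df["region"].str.strip().str.upper()                # standardize text values

df.to_csv("clean_transactions.csv", index=False)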

Set up an AWS EMR cluster with Apache Spark to process large datasets, allowing for distributed data analysis and fast computation.

Performed data extraction, aggregation, and consolidation using PySpark.
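
A PySpark sketch of such an aggregation and consolidation job (paths and columns are assumptions):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("consolidation_sketch").getOrCreate()

orders = spark.read.parquet("s3://example-bucket/raw/orders/")  # hypothetical source

daily_summary = (
    orders
    .filter(F.col("status") == "COMPLETED")
    .groupBy("order_date", "region")
    .agg(
        F.sum("amount").alias("total_amount"),
        F.countDistinct("customer_id").alias("unique_customers"),
    )
)

daily_summary.write.mode("overwrite").parquet("s3://example-bucket/curated/daily_summary/")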

Automated data processes using Python, improving efficiency in data extraction, transformation, and analysis.

Integrated MongoDB and Redis into data solutions, enhancing data scalability and performance in NoSQL environments.

Designed data models for OLAP reporting using MDX to generate multidimensional queries and dashboards for business intelligence tools.

Worked closely with the DevOps team to deploy and maintain the system in AWS, using CI/CD pipelines and Terraform for infrastructure as code (IaC).

Designed and implemented graph data models using nodes, edges, and properties to represent entities and relationships in the system, such as customers, products, transactions, and their interconnections.

Conducted normalization and denormalization based on performance and reporting requirements, ensuring both efficient storage and fast query response times.

Built logical data models to translate business requirements into data structures and created physical data models for actual implementation on relational databases.

Developed and optimized PL/SQL scripts for batch data processing and complex queries within the Oracle database.

Created and maintained database structures, including tables, indexes, and views, using DDL commands to ensure efficient data storage and retrieval.

Developed shell scripts for automated file search, cleanup, and transformation tasks, improving file organization and freeing up server space in large-scale environments.

Optimized both relational and graph queries by fine-tuning indexes, query plans, and using graph-specific optimizations (e.g., traversals with indexed properties).

Implemented data security protocols and monitored data quality, supporting data governance and compliance efforts, which safeguarded sensitive information and ensured regulatory compliance.

Installed, configured, and maintained Jenkins and Hudson for Continuous Integration (CI) and end-to-end automation of all builds and deployments, and implemented CI/CD for databases using Jenkins.

Assisted in building data warehouse structures, including facts, dimensions, and aggregate tables, enabling efficient data analysis and retrieval, which enhanced data query performance.

Environment: Azure Data Factory, Azure Synapse Analytics, Azure Data Lake, AWS (S3, Redshift, EC2), SQL Server, MySQL, PostgreSQL, MongoDB, PySpark, Apache Airflow, Jenkins, Git, Ansible, Tableau, Power BI, Linux.

February 2019 – April 2020, Big Data Engineering Intern, Accenture

Assisted in the development and optimization of data pipelines, supporting the extraction, transformation, and loading (ETL) processes for client data across multiple platforms.

Optimized Hadoop clusters, improving data processing performance by adjusting configurations and tuning MapReduce jobs.

Conducted data analysis and built interactive dashboards using Tableau and Power BI, providing actionable insights to support key business decisions.

Developed custom data filtering and conditioning logic to handle edge cases, ensure the accuracy of datasets, and maintain consistency across different data sources.

Developed scripts and processes for data synchronization between different systems (on-premises and cloud-based), ensuring data consistency and availability across various platforms.

Utilized Spark’s MLlib for machine learning applications, building and training models for predictive analytics and data-driven decision-making.
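
An illustrative Spark MLlib pipeline of the sort described above (feature columns, label, and data location are hypothetical):

from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("mllib_sketch").getOrCreate()
training = spark.read.parquet("s3://example-bucket/features/")  # assumed training data

assembler = VectorAssembler(
    inputCols=["tenure_months", "order_count", "avg_order_value"],
    outputCol="features",
)
lr = LogisticRegression(featuresCol="features", labelCol="churned")

model = Pipeline(stages=[assembler, lr]).fit(training)
predictions = model.transform(training)
predictions.select("churned", "prediction", "probability").show(5)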

Integrated Scala with Apache Spark for large-scale data processing, reducing processing time and ensuring efficient resource utilization.

Worked with Hadoop ecosystem components (HDFS, MapReduce, Hive, Pig, etc.) to process and manage large datasets, optimizing data storage and retrieval.

Worked with Spark SQL to run complex queries on large datasets, integrating Spark with Hive and HDFS for enhanced query capabilities.
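
A short Spark SQL sketch of querying a Hive-managed table (database, table, and columns are assumptions):

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("spark_sql_sketch")
    .enableHiveSupport()  # exposes Hive metastore tables to Spark SQL
    .getOrCreate()
)

result = spark.sql("""
    SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM sales_db.orders
    WHERE order_date >= '2019-01-01'
    GROUP BY region
    ORDER BY revenue DESC
""")
result.show()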

Integrated Hadoop with cloud platforms (AWS, Azure) to enable cost-effective data storage and processing, supporting large-scale data solutions.

Supported the creation of data models for reporting solutions, reducing query time by 15% for large datasets.

Conducted data quality checks and validation, improving data accuracy and consistency by 25% across client systems.

Developed automated reporting systems that decreased manual reporting efforts by 20%, allowing for more timely insights for business leaders.

Utilized Scala’s advanced features such as immutability, higher-order functions, and pattern matching to build clean, modular, and reusable code.

Worked on creating real-time data pipelines using Apache Kafka and AWS services (S3, Redshift), ensuring seamless and efficient data processing.
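
A hedged sketch of such a Kafka-to-S3 loader using kafka-python and boto3 (topic, broker, bucket, and batching policy are illustrative assumptions):

import json

import boto3
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders",  # hypothetical topic
    bootstrap_servers="broker:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
s3 = boto3.client("s3")

batch = []
for message in consumer:
    batch.append(message.value)
    if len(batch) >= 500:  # flush a micro-batch of 500 records to S3
        key = f"raw/orders/offset={message.offset}.json"
        s3.put_object(
            Bucket="example-landing-bucket",
            Key=key,
            Body=json.dumps(batch).encode("utf-8"),
        )
        batch = []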

Gained experience with Git and Jenkins for version control and continuous integration of data engineering workflows.

Environment: AWS (S3, Redshift), Azure (Blob Storage, Data Factory), Python, Apache Spark, Hadoop, Scala, Kafka, SQL, Tableau, Power BI, Jenkins, Git.

EDUCATION

•Master’s Degree (Information Systems), University of Memphis, Tennessee, December 2024

•Bachelor’s Degree (B.Tech), Jawaharlal Nehru Technological University, Kakinada, India, July 2020

CERTIFICATIONS

•Google Data Analytics Professional

•Microsoft Power BI

•Microsoft Excel


