
Data Engineer Management

Location:
Dallas, TX, 75225
Posted:
June 25, 2024


Resume:

Aditya Udutha

Dallas, Texas ******.******@*****.*** 903-***-**** https://github.com/aditya-adi14 https://www.linkedin.com/in/aditya-udutha/

Experience:

Broadridge Financial Solutions, India

Role: Data Engineer Jan 2019 – Jul 2022

• Worked on a regulatory project CAT (Consolidated Audit Trail), creating views of the components in the UI using the Message Automation Workbench ETL tool.

• Developed ETL pipelines that extracted data from AWS S3 buckets, transformed it into a usable format, and loaded it into the target database tables.

• Developed ETL pipelines for data integration and data warehousing to move and transform data between systems.

• Implemented ETL processes to integrate data from diverse sources, including databases, APIs, and external data providers, ensuring data accuracy, consistency, and quality.

• Created new actions and workflows using the Message Automation Workbench ETL tool, enabling end users to edit and assign data before submission and improving data management and user efficiency.

• Developed and maintained data management processes to ensure data integrity and availability.

• Designed and implemented data models for efficient storage and retrieval of information.

• Designed, implemented, and tuned complex data warehouse solutions in Oracle and SQL Server environments.

• Improved data retrieval speed by optimizing queries, resulting in a 30% decrease in query response times.

• Reduced active schema table data by 40% by implementing an archival script, leading to significant performance improvements.

• Worked with a variety of data technologies, including relational databases, data warehousing, big data platforms (e.g., Hadoop, Spark), and data streaming (e.g., Kafka).

• Proficient in cloud services, with a focus on Google Cloud Platform (GCP), including BigQuery, Dataflow, Data Fusion, Dataproc, Cloud Composer, Pub/Sub, and Google Cloud Storage.

• Collated data from diverse sources including Salesforce Cloud, SQL Server, Oracle, Flat files, S3 files, and REST APIs to ensure comprehensive data integration.

• Applied Continuous Integration/Continuous Deployment (CI/CD) practices to code releases, automated deployments, source code management, and version control using Git-Stash, SVN, or TFS.

• Familiar with cloud-based data warehousing platforms such as Snowflake and BigQuery; experienced with Kafka for real-time data processing.

• Designed and implemented complex data warehouse solutions, including high-volume data movement and collation from various sources, using a mix of traditional RDBMS (Oracle, PostgreSQL) and NoSQL and distributed databases (DynamoDB, Elasticsearch).

• Developed front-end applications using Python, JSON, and jQuery.

• Managed MySQL databases, optimized performance, and implemented high-availability solutions.

• Designed and developed APIs using Python Flask.

• Migrated a Django database from SQLite to MySQL and then to PostgreSQL.

• Developed interactive reports and dashboards using Python and Tableau.

• Utilized Git for version control in a team-based environment.

Projects:

Color and Texture Based Approach for Web Image Retrieval Using Deep Learning Techniques: Retrieved similar images from a database using MATLAB image processing, applying unified color and intensity level matrices and grey-level matrices.

Vehicle Number Plate Recognition Software Using Machine Learning: Extracted the number plate from an input image using machine learning algorithms.

Skills:

• Programming Languages: C, Java, Scala, JavaScript, Python, PySpark, SparkSQL

• Databases: PostgreSQL, MySQL, PL/SQL, T-SQL, NoSQL (MongoDB), CI/CD practices in database releases.

• Big Data Technologies: Spark, Hive, Sqoop, Kafka, Kinesis, Hadoop, HDFS, Airflow.

• Visualization: Tableau, Google Data Studio, Microsoft Power BI.

• ETL: Message Automation Workbench, Informatica, Matillion, Fivetran, Coalesce, dbt.

• Cloud Platform: Azure (Data Lake, Data Factory, Databricks, SQL Database, Synapse Analytics, Blob Storage, Functions, Cosmos DB, HDInsight), AWS (S3, EC2, Redshift, Lambda, Kinesis, QuickSight, IAM Policies, Athena, DynamoDB, Elasticsearch).

• Machine Learning: NumPy, Scikit-learn, Curves and Regularization, SVM, K-means and PCA.

• Version Control Systems: GitHub, SVN, Git.

• Methodologies: Agile SDLC, Scrum.

• Build and CI Tools: Docker, Kubernetes, Maven, Jenkins.

Education:

Texas A&M University-Commerce, Texas

Master of Science in Computer Science Aug 2022 - May 2024


