Aditya Udutha
Dallas, Texas ******.******@*****.*** 903-***-**** https://github.com/aditya-adi14 https://www.linkedin.com/in/aditya-udutha/
Experience:
Broadridge Financial Solutions, India
Role: Data Engineer Jan 2019 – Jul 2022
• Worked on a regulatory project CAT (Consolidated Audit Trail), creating views of the components in the UI using the Message Automation Workbench ETL tool.
• Developed ETL pipelines to extract data from AWS S3 buckets, transform it into a usable format, and load it into the target database tables.
• Developed ETL pipeline for data integration and data warehousing to move and transform data between systems.
• Implemented ETL processes to integrate data from diverse sources, including databases, APIs, and external data providers, ensuring data accuracy, consistency, and quality.
• Created new actions and workflows using the Message Automation Workbench ETL tool, enabling end-users to edit and assign data before submission and improving data management and user efficiency.
• Developed and maintained data management processes to ensure data integrity and availability.
• Designed and implemented data models for efficient storage and retrieval of information.
• Designed, implemented, and tuned complex data warehouse solutions in Oracle and SQL Server environments.
• Improved data retrieval speed by optimizing queries, resulting in a 30% decrease in query response times.
• Reduced active schema table data by 40% through Archival Script implementation, leading to significant performance enhancements.
• Worked with a variety of data technologies, including relational databases, data warehousing, big data platforms (e.g., Hadoop, Spark), and data streaming (e.g., Kafka).
• Proficient in cloud services, with a focus on Google Cloud Platform (GCP), including BigQuery, Dataflow, Data Fusion, Dataproc, Cloud Composer, Pub/Sub, and Google Cloud Storage.
• Collated data from diverse sources including Salesforce Cloud, SQL Server, Oracle, Flat files, S3 files, and REST APIs to ensure comprehensive data integration.
• Applied Continuous Integration/Continuous Deployment (CI/CD) practices to code releases, automated deployments, source code management, and version control using GIT-Stash, SVN, or TFS.
• Familiar with cloud-based data warehousing platforms such as Snowflake and BigQuery; experienced with Kafka for real-time data processing.
• Designed and implemented complex data warehouse solutions, including high-volume data movement and collation from various sources, using a mix of traditional RDBMS (Oracle, PostgreSQL) and NoSQL and distributed databases (DynamoDB, Elasticsearch).
• Developed front-end applications using Python, JSON, and jQuery.
• Managed MySQL databases, optimized performance, and implemented high-availability solutions.
• Designed and developed APIs using Python Flask.
• Migrated a Django database from SQLite to MySQL and then to PostgreSQL.
• Developed interactive reports and dashboards using Python and Tableau.
• Utilized Git for version control in a team-based environment.
Projects:
• Color and Texture Based Approach for Web Image Retrieval Using Deep Learning Techniques: Retrieved similar images from a database using MATLAB image processing, applying unified color and intensity level matrices and grey level matrices methods.
• Vehicle Number Plate Recognition Software Using Machine Learning: Extracted the number plate from an input image using machine learning algorithms.
Skills:
• Programming Languages: C, Java, Scala, JavaScript, Python, PySpark, SparkSQL
• Databases: PostgreSQL, MySQL, PL/SQL, T-SQL, NoSQL (MongoDB), CI/CD practices in database releases.
• Big Data Technologies: Spark, Hive, Sqoop, Kafka, Kinesis, Hadoop, HDFS, Airflow.
• Visualization: Tableau, Google Data Studio, Microsoft Power BI.
• ETL: Message Automation Workbench, Informatica, Matillion, Fivetran, Coalesce, dbt.
• Cloud Platform: Azure (Data Lake, Data Factory, Databricks, SQL Database, Synapse Analytics, Blob Storage, Functions, Cosmos DB, HDInsight), AWS (S3, EC2, Redshift, Lambda, Kinesis, QuickSight, IAM Policies, Athena, DynamoDB, Elasticsearch).
• Machine Learning: NumPy, Scikit-learn, Curves and Regularization, SVM, K-means and PCA.
• Version Control Systems: GitHub, SVN, Git.
• Methodologies: Agile SDLC, Scrum.
• Build and CI Tools: Docker, Kubernetes, Maven, Jenkins.
Education:
Texas A&M University Commerce, Texas
Master of Science in Computer Science Aug 2022 - May 2024