Data Architect

Location: Boston, MA
Posted: November 25, 2024

Resume:

Himanshu Yadav

**************@****.***

704-***-**** Boston, MA

Senior Data Architect

16+ years of experience architecting and implementing high-performance Analytics, Big Data, and Distributed Systems solutions. Specialized in designing and delivering Data Warehouse and Data Lake solutions, end-to-end data pipelines, and infrastructure that drive business value across multiple stakeholders.

Skills: Data Warehouse, Data Modeling, Data Governance, Master Data Management, Python, Pandas, ETL, Data Lake Architecture, RESTful Services, Hadoop ecosystem, Apache Spark, SQL (various flavors), Cloud Platforms (Azure, AWS, GCP), Docker, CI/CD

Senior Data Architect, Acadian Asset Management Sep 2022 – Present

Spearheading data infrastructure modernization efforts at a quantitative investment management firm with $100B+ AUM, optimizing data pipelines critical for investment strategy execution and risk management.

●Architected and developed SQLPylot, a Python-based application enabling version control and automated deployment of SSIS packages, SSRS reports, and SQL Server schema changes across 78 delivery teams.

●Implemented SQLPylot, resulting in a 40% reduction in deployment time and a 60% decrease in deployment-related incidents, significantly enhancing operational efficiency.

●Led the migration of 78 SQL Server repositories to SQLPylot, leveraging GitLab CI pipelines and Docker for seamless CI/CD integration, improving code quality and reducing time-to-market for data-driven features.

●Engineered the migration of critical RefreshAutomation jobs from ActiveBatch to AWS Lambda, utilizing Python and pymssql, resulting in a 30% cost reduction and improved scalability of data refresh processes.

●Collaborated with quant researchers and portfolio managers to optimize data delivery pipelines, reducing latency in market data processing by 25% and enabling more timely investment decisions.

Technologies: Data Warehouse, Data Governance, SQL Server, Python, pyodbc, pymssql, Docker, GitLab, JIRA, CI/CD

Senior Data Architect, Commonwealth, DFML Apr 2020 – Sep 2022

Led the design and implementation of secure, compliant data architecture solutions for a state-level agency managing paid medical leave programs, processing sensitive healthcare data for millions of constituents.

●Worked with the Case Appeals Team on claimant wage analysis, using PII data extracted from another department via AWS Glue

●Established KPIs to measure the effectiveness of business decisions and to show the overall health of the organization at a given time.

●Produced analysis and insight reports for employer outreach campaigns, call center engagement, and overall department performance.

●Designed and architected Python ETL pipelines and data warehouse solutions for the AWS environment. These pipelines ingest data from multiple sources and load it into a Redshift data warehouse.

●Designed and architected Python frameworks for publishing daily data extracts (delta or cumulative) to different teams

Technologies: Data Lake, Data Warehouse, Data Modeling, BI, Scala, Spark, AWS, Azure, Pandas, Postgres, AWS Redshift, Qlik, GitHub, Shell Scripting, CI/CD, Terraform

Lead Data Engineer, Wayfair May 2018 - Apr 2020

Led, coached, and mentored a team of 7 engineers and 2 product managers to process 10+ terabytes per day of web data for an e-commerce store with 120M unique visitors.

●Wrote Python ETL that processes 2-3 TB of web analytics events received from wayfair.com and transforms them in near real time using Spark, Kafka, and Hadoop.

●Architected an ETL framework in Python with self-service datasets for 50+ ML and data engineers, reducing their reporting time from 1-2 weeks to a single day

●Launched a real-time analytics dashboard for a 40+ person marketing team in under 6 months, which helped automate real-time marketing based on campaign analytics (email, social, etc.)

●Identified and fixed a Hadoop cluster performance issue caused by slow processing of small Parquet files.

●Established a new team of 5 engineers and 1 product manager to supply company-wide data analytics

●Collaborated with an 80-person infrastructure team to upgrade the Hadoop cluster, successfully migrating 13 months of data (2 petabytes) with nearly 100% uptime

Technologies: Data Lake, Data Warehouse, Data Modeling, Spark, Java, Scala, Apache Kafka, Hadoop, Hive, Presto, Airflow, GitHub, Docker, Jenkins, CI/CD

Senior Architect, Saylent Technologies Apr 2016 - May 2018

Delivered a new cloud-based platform that improved customer engagement and marketing campaign ROI. Case studies: https://saylent.com/case-studies/

●Designed and architected the microservices-based backend API layer.

●Architected the successful migration of on-prem infrastructure to AWS cloud

●Transformed the company's engineering vision from a Microsoft-based tech stack to an open-source, data-driven, microservices architecture

●Launched a new platform in a record time of 11 months and successfully onboarded 6 financial institutions with more than 300,000 customers altogether

●Responsible for developing the ETL pipeline that imports data from more than 30 financial institutions into different datasets used for machine learning and reporting use cases.

●Designed and architected the platform, including a secure cloud infrastructure for storing bank data that contains customers' sensitive PII.

●Architected and built the microservices architecture in which services written in Java 8, Scala, and Python communicate with each other.

●Developed a token-based authentication service to serve as a gateway for incoming external requests.

Technologies: Data Lake, Pandas, Data Warehouse, RESTful Services, AWS Redshift, Docker, Apache Kafka, GitHub, Shell Scripting, Jenkins, CI/CD

Tech Lead, Next Step Living Jul 2015 - Apr 2016

●Led a team of 2 backend engineers and collaborated with the UI and QA teams to deliver backend services for a green energy company

●Interfaced directly with the product owner to build a data collection Android tablet application, launching the beta version to 60+ adopters

●Developed an application adaptor layer to read data from the Canvas mobile app and feed it into the application's database

●Developed REST APIs and backend services to support UI features.

Technologies: Java 8, RESTful Services, GitHub, Shell Scripting, Jenkins, CI/CD, Postgres

Tech Lead, Optaros Jun 2014 - Jul 2015

●Delivered in-house API solutions, used by more than 25 engineers

●Served as the team's data expert and helped develop ETL pipelines for different product tables in the production environment.

●Designed and developed the Controller layer that exposes REST endpoints to web-service clients.

●Developed auditing framework using AspectJ.

●Developed its ORM layer, which provides out-of-the-box mapping between the Controller layer and the database.

●Coordinated between Macy’s tech team based in San Francisco and the offshore team in Brazil

●Evangelized the new REST API solution, successfully onboarding different engineering teams on the new platform

Technologies: Java 6, Hibernate, GitHub, Shell Scripting, Jenkins, CI/CD, SQL Server

Senior Software Engineer, Mastec Feb 2012 - Jun 2014

●Delivered bug fixes for the PAS system, enabling BAs to change policy-related configurations with minimal developer involvement.

●Developed features and fixed bugs for the PAS system.

●Supported onboarding and change requests for one of the clients.

Technologies: Java 6, Hibernate, GitHub, Shell Scripting, Jenkins, CI/CD, SQL Server, Apache Tomcat

Tech Lead, Bank of America Jul 2006 - Feb 2012

●Managed a team of 15 onshore and offshore engineers to deliver Bank of America’s first mobile application for B2B solutions

●Increased collaboration between the onshore PM and the offshore development and QA teams

●Developed a Velocity template that renders the UI page for 300+ products

Technologies: Java 6, Hibernate, GitHub, Shell Scripting, Jenkins, CI/CD, SQL Server, WebSphere, WebLogic, MQs


