Data Architect Big

Location:
Princeton, NJ
Salary:
$250,000
Posted:
April 23, 2024

Professional Summary

Big Data architect with AI/ML experience (especially LLMs) and a strong background in the design, development, and implementation of data warehousing. Strong experience with data warehouses, Informatica (8.6, 9.1, 10.0, 10.1), Tableau, Snowflake, Python, Oracle, Sybase, Perl, Java, IBM MQ Series, SQL, and Unix. Independent team leader with strong ETL design and mapping development skills. Proven history of building large-scale data processing systems and serving as an expert in data warehousing solutions while working with a variety of database technologies. Experienced in architecting highly scalable, distributed systems using open source tools, and in designing and optimizing large, multi-terabyte data warehouses. Able to integrate state-of-the-art Big Data technologies into the overall architecture and lead a team of developers through the construction, testing, and implementation phases. Recently implemented an alert system that applies LLMs to news data.

Areas of Expertise

Databases and Tools:

Data Warehouses: Snowflake, Redshift, Teradata, Netezza, Greenplum.

RDBMS: MS SQL Server, Oracle, DB2.

NoSQL: HBase, DynamoDB, SAP HANA, HDFS, Cassandra, MongoDB, CouchDB, Vertica, Greenplum.

Agile Tools: Jira, Confluence, GitHub.

Technical Skills: DevOps on Linux; Big Data using Scala and PySpark; Snowflake; Redshift; Informatica 8.6, 9.1, 10; Informatica IDQ; Hadoop technologies (HDFS, Hive, Impala, Spark); AWS; WSRR architecture; SOA; Oracle (PL/SQL); Perl; Shell; Java; UML; Unix.

Professional Experience

London Stock Exchange, NYC, NY September 2021 – Current

Senior Data Manager/Engineer (Research Manager)

AWS and Azure Cloud Architect for data migration

The following tasks are being implemented:

1. Configuration of AWS services is managed through Terraform.

2. Big Data framework set up for the data lake, with data ingestion into Snowflake using a dedicated EC2 instance (Linux).

3. Lambda functions are used to handle events such as the processing of new files and alerts from CloudWatch and CloudTrail (see the sketch after this list).

4. Set up a Bitbucket repository and integrated it with Jenkins for CI/CD.

5. Set up EMR services for Big Data processing (Scala and PySpark).

6. Glue is used as the ETL tool.
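As an illustration of item 3, here is a minimal sketch of a Python Lambda handler reacting to S3 object-created events; the bucket names, the event wiring, and the downstream staging step are assumptions for illustration, not the actual production code.

# Hedged sketch: a Lambda handler invoked by S3 "ObjectCreated" notifications.
# Bucket, key handling, and the downstream step are illustrative placeholders.
import urllib.parse
import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        obj = s3.get_object(Bucket=bucket, Key=key)
        size = len(obj["Body"].read())
        # Hypothetical downstream step: hand the new file off for ingestion into Snowflake.
        print(f"Processing new file s3://{bucket}/{key} ({size} bytes)")
    return {"status": "ok"}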

AI/ML Specialist

Developing models using ML and deep learning methods (tensors) for mortgage pre-payments.

Successfully completed more than four POCs so far using cloud technology from AWS and Azure.

Working on creating alerts for possible foreclosures by applying LLMs to news content from different sources.

TD Ameritrade, Jersey City, NJ November 2017 – August 2021

Solution Architect for Big Data Platform (Risk Surveillance System)

Led DevOps for the machine learning platform.

1. Used AWS CloudFormation services to automate the deployment of infrastructure.

2. Set up the Big Data platform for the ML platform.

3. Migrated the on-premise Spark framework (Cloudera) to the AWS cloud.

4. Configured dev, UAT, and prod EMR clusters in AWS.

5. Configured Linux-based containers and EC2 instances for automated runs.

Set up the Big Data platform for machine learning.

Developed alert models for the Compliance Department.

Used Databricks in the Azure cloud to set up the data ingestion pipeline.

Set up the Big Data platform using Hortonworks on Linux clusters in the Azure cloud.

Provided leadership to the AI/ML team and mentored them.

Implemented a real-time process using Kafka to load data into Snowflake (a sketch follows below).
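A minimal sketch of such a Kafka-to-Snowflake loader, assuming the kafka-python and snowflake-connector-python packages; the topic, table, and connection parameters are placeholders, and the production pipeline may equally have used Snowpipe or the Kafka connector.

# Hedged sketch: consume JSON events from Kafka and micro-batch them into Snowflake.
# Broker, credentials, topic, and table names are illustrative only.
import json
from kafka import KafkaConsumer          # pip install kafka-python
import snowflake.connector               # pip install snowflake-connector-python

consumer = KafkaConsumer(
    "trade-events",                      # hypothetical topic
    bootstrap_servers=["broker1:9092"],
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

conn = snowflake.connector.connect(
    user="ETL_USER", password="***", account="my_account",
    warehouse="LOAD_WH", database="RISK", schema="STAGING",
)
cur = conn.cursor()

batch = []
for msg in consumer:
    batch.append((msg.value["trade_id"], json.dumps(msg.value)))
    if len(batch) >= 500:                # flush in micro-batches
        cur.executemany(
            "INSERT INTO trade_events_raw (trade_id, payload) VALUES (%s, %s)", batch
        )
        conn.commit()
        batch.clear()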

Designed and implemented a framework using Scala and Unix shell to process large files for ingestion into the Risk Surveillance System.

Migrated the data from Oracle and Netezza to Snowflake.

Migrated the machine learning platform to the AWS cloud.

Developed a real-time process using AWS Kinesis and Scala on EMR.

AI/ML Specialist:

Set up the machine learning platform in the cloud.

Implemented alert scenarios for trades using AI/ML.

Used a deep learning tensor framework to train models for the alert system.

Ram Associates Inc. (part time), Mercerville, NJ February 2019 – May 2019

Migration of data in the AWS cloud

Created Linux instances to run the data migration. Developed scripts using the AWS CLI to migrate 50 TB of data from Amazon EBS volumes to S3.

Implemented a Lambda function to load documents into Amazon Redshift (a sketch follows below).
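A minimal sketch of such a Lambda, assuming the documents land in S3 first and are loaded with Redshift's COPY command over a psycopg2 connection; the cluster endpoint, credentials, IAM role, and table name are placeholders.

# Hedged sketch: Lambda triggered by a new S3 object, issuing a Redshift COPY.
# Endpoint, credentials, IAM role ARN, and table name are illustrative only.
import psycopg2

def lambda_handler(event, context):
    rec = event["Records"][0]["s3"]
    s3_path = f"s3://{rec['bucket']['name']}/{rec['object']['key']}"
    conn = psycopg2.connect(
        host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
        port=5439, dbname="analytics", user="loader", password="***",
    )
    with conn, conn.cursor() as cur:
        cur.execute(
            f"""
            COPY documents_raw
            FROM '{s3_path}'
            IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
            FORMAT AS JSON 'auto';
            """
        )
    conn.close()
    return {"loaded": s3_path}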

Implemented an AI/ML-based search algorithm using NLP on news content. This was a pilot project for high school interns.

UBS, NYC, NY October 2015 – October 2017

Solution Architect for CCAR (Comprehensive Capital Analysis and Review)

Worked on the CCAR project for UBS. Developed the extraction process for CCAR data using Informatica PowerCenter 9.6.

Set up the Big Data environment to process CCAR data using the Snowflake data warehouse on AWS.

Provided leadership to the whole team and delivered the product on time.

Used Scala and Python to process CCAR data.

Designed the Data Quality framework (Informatica IDQ) for CCAR.

Developed a monitoring tool for the CCAR dashboard using Python.

Developed an API consistent with Layer 7 load balancing for CCAR applications.

Implemented the complete infrastructure for the CCAR system.

Wrote a framework using Unix shell, Python, and Perl for CCAR.

Presented a POC for a Big Data warehouse using AWS.

Implemented ER modeling and dimensional modeling for CCAR data.

Developed the Data Quality report using Tableau.

Used Jira, Confluence, and GitHub for source code and tracking.

Integrated ERP with CCAR.

Pulled CRM data from Salesforce and developed IDQ test cases for data quality.

Israel Discount Bank of New York, NYC, NY June 2014 – September 2015

Data Architect (Compliance Data Warehouse)

Successfully implemented dimensional modeling using Star and Snowflake schemas for reporting on compliance violations over historical transactions.

Led the whole compliance team and mentored them.

Integrated Informatica ETL capability with the MANTAS application to implement the Trusted Pair feature.

Successfully set up Dev and UAT environments for MANTAS (which uses Informatica PowerCenter).

Designed and implemented a recon tool in Python for the Compliance Recon process (a sketch follows below).
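A minimal sketch of the kind of check such a recon tool performs, comparing keys between a source extract and the compliance warehouse copy; the file names and the key column are hypothetical.

# Hedged sketch: reconcile a source extract against the warehouse copy by primary key.
# File names and the key column are placeholders for illustration.
import csv

def load_keys(path, key_col):
    with open(path, newline="") as f:
        return {row[key_col] for row in csv.DictReader(f)}

source = load_keys("source_transactions.csv", "txn_id")
target = load_keys("warehouse_transactions.csv", "txn_id")

missing_in_target = source - target
unexpected_in_target = target - source

print(f"source rows: {len(source)}, target rows: {len(target)}")
print(f"missing in target: {len(missing_in_target)}, unexpected in target: {len(unexpected_in_target)}")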

Worked on the compliance monitoring system for CTR (Currency Transaction Reports), money laundering, watch lists, and risk scoring.

Set up the Informatica environment to support the Oracle MANTAS application.

Designed the data quality tool using Informatica PowerCenter to validate the compliance database.

Designed Oracle partitioning for historical data.

Implemented AWS cloud infrastructure for Salesforce-based recon applications.

Used Jira and Confluence for project management.

Developed CRM synchronization processes with Salesforce using IICS.

Morgan Stanley, NYC, NY February 2014 – May 2014

Informatica Architect

Enhancement and maintenance of an existing mortgage application

Introduced exception handling in the object-oriented Perl framework that processes feeds.

Worked on a data quality framework using Informatica IDQ.

Enhanced the extraction of large volumes of mortgage data using Informatica PowerCenter 9.6.

Wrote stored procedures in Sybase and Oracle to support the mortgage application.

Presented dimensional modeling concepts for the current data warehouse.

Provided production support for CBR applications.

Used Jira and Confluence for status tracking.

TD Ameritrade, Jersey City, NJ February 2013 – January 2014

Informatica Architect (Risk and Compliance Dept.)

Worked on the ETL for the MANTAS application.

Worked on the dimensional modeling (Star Schema, Snowflake Schema) of the Risk and Compliance Data Warehouse.

Implemented mappings for SCD Type 1 and SCD Type 2 using Informatica PowerCenter (a sketch of the Type 2 pattern follows below).
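The SCD logic itself was built as Informatica mappings; the following is only a minimal, self-contained Python/SQLite illustration of the Type 2 pattern (expire the current row, insert a new version), with a made-up customer dimension.

# Hedged sketch: Type 2 slowly changing dimension logic, shown with SQLite for illustration.
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE dim_customer (
        customer_id TEXT,
        address     TEXT,
        eff_date    TEXT,
        end_date    TEXT,
        is_current  INTEGER
    )""")

def scd2_upsert(customer_id, address, as_of=None):
    """Expire the current row if the tracked attribute changed, then insert a new version."""
    as_of = as_of or str(date.today())
    cur.execute("SELECT address FROM dim_customer WHERE customer_id=? AND is_current=1",
                (customer_id,))
    row = cur.fetchone()
    if row and row[0] == address:
        return  # no change, nothing to do
    if row:
        # close out the existing current version
        cur.execute("""UPDATE dim_customer SET end_date=?, is_current=0
                       WHERE customer_id=? AND is_current=1""", (as_of, customer_id))
    cur.execute("INSERT INTO dim_customer VALUES (?, ?, ?, '9999-12-31', 1)",
                (customer_id, address, as_of))
    conn.commit()

scd2_upsert("C001", "12 Main St", "2013-06-01")
scd2_upsert("C001", "99 Oak Ave", "2013-09-15")   # expires the old row and adds a new current one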

Worked on capacity planning for Informatica 9.5.

Worked on the installation of Informatica 9.5 on a Linux cluster.

Worked on the upgrade of the 8.6 repository to 9.5 and fixed mapping issues after the upgrade.

Defined naming convention standards and deployment procedures in PowerCenter.

Developed Informatica mappings to load data for the MANTAS compliance platform.

Worked on product integration from different sources and participated in production support.

Worked on setting up UAT and Dev environments.

Proposed the grid design to support multiple Informatica nodes.

Worked with the infrastructure team on the DR site for the Informatica server.

Developed an object-oriented framework in Perl to support data integration with Informatica.

Developed an object-oriented testing tool in Perl that supports multiple QA tasks.

Developed a watchdog in Perl that scans all log files and sends alerts color-coded by severity (red for critical, amber for warning, etc.); a rough Python analogue is sketched below.
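The watchdog described above was written in Perl; the sketch below is only a rough Python analogue of the same idea, scanning log files for severity keywords and mapping them to red/amber alerts. Paths, patterns, and the alert transport are placeholders.

# Hedged sketch (Python analogue of the Perl watchdog): scan logs and classify alerts.
import glob
import re

SEVERITY = [
    (re.compile(r"FATAL|ERROR", re.I), "RED"),    # critical
    (re.compile(r"WARN", re.I), "AMBER"),         # warning
]

def scan(log_glob="/var/log/etl/*.log"):          # hypothetical log location
    alerts = []
    for path in glob.glob(log_glob):
        with open(path, errors="replace") as f:
            for lineno, line in enumerate(f, 1):
                for pattern, color in SEVERITY:
                    if pattern.search(line):
                        alerts.append((color, path, lineno, line.strip()))
                        break
    return alerts

for color, path, lineno, text in scan():
    # In the real tool this would be an email or pager alert rather than a print.
    print(f"[{color}] {path}:{lineno}: {text}")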

Tuned the mappings to handle large volumes of data. Proposed a design for creating parameter files at run time.

Barclays Capital, Jersey City, NJ April 2011 – February 2013

Informatica Architect

Data warehouse setup, cost basis reporting, and trading and reporting applications.

Presented the concept of a dimensional data warehouse to Barclays Capital.

Implemented Star and Snowflake schemas for the Cost Basis Data Warehouse.

Configured the grid for the Dev, UAT, and Prod Informatica PowerCenter platforms.

Worked on naming convention standards and deployment strategies.

Developed a framework in Perl to automate SCM for Informatica deployments.

Developed a framework in Perl and shell to support dynamic updates of Informatica parameter files.

Responsible for automating the whole Informatica process using Perl and shell.

Designed and implemented publish/subscribe data sharing using real-time web services through Informatica.

Designed and implemented XML parsing using Informatica PowerCenter and laid out the steps to be followed for XML parsing.

Developed a mapplet for cross-referencing equity and fixed income data.

Processed mainframe feeds (VSAM) in Informatica to load balance and position data.

Developed a Perl program to process the LDAP directory service, which an Informatica process used to convert unstructured data to structured format.

Set up QA testing environments with respect to database servers and wrote scripts to keep product data in QA in sync with Prod.

Coordinated effectively with the QA team on data and technical issues.

Wrote an automated tool using Expect on UNIX to connect to UNIX servers based on DNS.

Wrote a remote execution tool using Expect and TCL (a rough Python analogue is sketched below).
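The tools above were written with Expect/TCL; as a rough analogue in Python, a pexpect-based remote execution helper might look like the following. The host, prompt pattern, and command are placeholders, not details from the original tool.

# Hedged sketch (Python/pexpect analogue of the Expect-based remote execution tool).
import pexpect   # pip install pexpect

def run_remote(host, user, password, command, prompt=r"\$ "):
    child = pexpect.spawn(f"ssh {user}@{host}", timeout=30)
    child.expect(r"[Pp]assword:")
    child.sendline(password)
    child.expect(prompt)
    child.sendline(command)
    child.expect(prompt)
    output = child.before.decode()   # everything printed before the next prompt
    child.sendline("exit")
    return output

# Example (placeholder host resolved via DNS, as in the original tool):
# print(run_remote("etl-node-01", "svc_etl", "***", "df -h /data"))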

Wrote a framework in UNIX shell for Informatica ETL.

Wrote wrappers in Bash and KSH for all batch jobs.

Provided production support for all the above applications.

Used Jira for reporting work status.

Albridge, Lawrenceville, NJ September 2010 – March 2011

Senior Data Integrator

Product cross-reference model, product data warehouse

Worked on the architecture of data marts built from the data warehouse using Informatica 8.6.

Worked on setting up the data hub and data marts using Informatica 8.6.

Used Informatica partitioning along with Oracle partitions effectively in data transfers.

Used PL/SQL stored procedures and SQL in Oracle effectively.

Used a dimensional Star Schema model for the data warehouse and data marts.

Provided production support for data applications.

Wrote deployment scripts to deploy code from SCM to prod.

Worked on Informatica mappings to handle large quantities of position and balance data.

Developed mappings to load dimensions (Type 1 and Type 2 SCD).

Tuned Informatica mappings to run effectively on large data sets.

Designed mappings to load different equity dimensions, such as product and ticker.


Credit Suisse, NYC, NY August 2002 – March 2009

Senior Programmer Analyst

Implemented interfaces to the data warehouse in Perl. Contributed to a common framework developed in Perl that provides all the functionality of ETL.

Designed ERP for prime brokerage at Credit Suisse.

Wrote stored procedures and triggers in the Sybase database.

Designed the interfaces using UML. Designed the Perl classes to match the right patterns for each interface.

Designed the position and balance data model for different counterparties across equity, fixed income, options, futures, and forwards.

Implemented a cross-reference model in Perl for equity, fixed income, derivative options, and futures, using various design patterns.

Wrote a web service client to pull LIBOR (interest rates) and Fed open rates.

Wrote an application in Perl to pull FX rates.

Wrote a daemon process behind a web interface for manual loading of interest rates by business users, using the DBI module to load the data into Sybase.

Wrote an application using the LWP module in Perl to load index rates.

Used stored procedures and triggers via Perl DBI.

Used SAX and DOM parsers in Perl to parse XML files (a rough Python analogue is sketched below).
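The parsing was done with Perl's SAX/DOM modules; an equivalent DOM-style parse using Python's standard library would look roughly like this. The feed structure shown is invented purely for illustration.

# Hedged sketch (Python analogue of the Perl DOM parsing): extract position records from XML.
import xml.etree.ElementTree as ET

sample = """
<positions>
  <position><account>A100</account><symbol>IBM</symbol><qty>250</qty></position>
  <position><account>A200</account><symbol>CSGN</symbol><qty>-40</qty></position>
</positions>
"""

root = ET.fromstring(sample)
for pos in root.findall("position"):
    account = pos.findtext("account")
    symbol = pos.findtext("symbol")
    qty = int(pos.findtext("qty"))
    print(account, symbol, qty)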

Implemented loads for CDS (derivatives), swaps, repos (repurchase agreements), and corporate actions using Informatica.

Wrote loaders in Perl and shell to load fixed income, futures, and options products.

Used XSLT and XPath to parse XML files coming from the message queue (MQ Series).

Provided production support for all the above applications.

Installed new Perl modules in production.

Contributed to the price reporting website at Credit Suisse.

Contributed to portal development for the portfolio manager at Credit Suisse.

Developed a remote execution tool using sockets.

Developed a multicast alert email system for SWIFT messages using the UDP protocol (a sketch of the multicast send follows below).
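A minimal sketch of the UDP multicast side of such an alert sender, written here in Python; the multicast group, port, and message are placeholders (the real system handled SWIFT message events and email delivery).

# Hedged sketch: send an alert to a UDP multicast group (group, port, and message are placeholders).
import socket

MCAST_GRP, MCAST_PORT = "224.1.1.10", 5007

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)   # keep packets on the local network
sock.sendto(b"ALERT: SWIFT message failed validation", (MCAST_GRP, MCAST_PORT))
sock.close()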

Installed a new ODBC driver for a third-party vendor database named MetaMatrix. Configured Informatica on a UNIX box (Solaris) to use the ODBC driver.

Added new repositories and users for UAT on development servers.

Helped Informatica admins tune Informatica memory allocation on Sun machines.

Designed a common framework in shell to run Informatica workflows.

Set up QA testing environments with respect to database servers and wrote scripts to keep product data in QA in sync with Prod.

Coordinated effectively with the QA team on data and technical issues.

Wrote an automated tool using Expect on UNIX to connect to UNIX servers based on DNS.

Wrote a remote execution tool using Expect and TCL.

Wrote a framework in UNIX shell for Informatica ETL.

Wrote wrappers in Bash and KSH for all batch jobs.

Implemented classes in Perl to load prices from files, MQ Series messages, and remote databases.

Wrote stored procedures and SQL queries to update prices in the data warehouse.

Implemented interfaces to deliver the prices to clients; the interfaces were developed in Bash and Perl.

Developed a web-based tool to upload prices.

Used XSLT to create HTML output for reports.

Implemented a Perl module that creates PDF files on the fly, along with a Java command line tool for Formatting Objects (XSL-FO). The XML files are created by Java using a DOM parser. Used XSLT to parse XML files for loading into the Sybase database.

Installed XSL-FO and XSLT libraries from the Apache website.

Installed DBI modules from CPAN on the UNIX server.

Installed the XML generator Perl module from CPAN.

Installed the CGI.pm module from CPAN.

Configured the Linux server for SSL-related issues.

Installed encryption software on the Linux server.

Education

MS in Computer Engineering - University of Bridgeport, CT

BS in Electronics and Communication - University of Mysore, India


