Sr. Cloud BigData Architect

Location:
Chicago, IL
Salary:
165/hr on C2C
Posted:
August 01, 2018

Resume:

RYAN RODRIGUES

CHICAGO, IL *****

H*B Visa Holder. Available on C2C basis. Please call/email 646-***-**** anthony.reddy at itmmi dot com

OBJECTIVE

Seeking a position as a Sr. Cloud BigData Architect with an opportunity to architect & design Big Data solutions on Amazon AWS.

QUALIFICATIONS

18 years of experience: 8 years in Big Data and 10 years in Java/J2EE.

Post Graduate Diploma in Advanced Computing.

MSc Engineering (Real Time Embedded Systems - Computer Science).

SKILLS

Strong analytical and communication skills. Quick learner and good team player, with a proven ability to lead teams. Detail-oriented, with excellent problem-solving skills, high dedication and a strong customer-support focus.

Operating Systems: iOS, Linux, Windows.

Tools: Apache Kafka, Amazon EMR, CDH4.2, Hadoop, Oozie, Flume, Sqoop, Hue, IntelliJ, Eclipse, GIT, BitBucket, Source Tree, JIRA, RAD7, WAS6, IBM Heap Analyzer, JMeter, Visio, Rational Rose, Clear Case, Clear Quest, Synergy, SVN.

Languages: Kafka Connect, Kafka Streams, Spark Streaming, Spark ETL, Spark SQL, Java MapReduce, Pig, Hive, Impala, Hue, UML Modelling, Use Case Analysis, Design Patterns (Core & J2EE), OOAD, Java 8, J2EE (JSP, Servlets, EJB), Web Services, Ant, Python.

Frameworks: Confluent Platform, Stratio BigData, Apache Spark, Amazon AWS, DataStax Enterprise, Hadoop (MapReduce/HDFS), Spring 3.0, Hibernate, Struts, Kerberos.

Data Types: CSV, JSON, Avro & Parquet.

AWS Services: API Gateway, Lambda, EMR, Kinesis, IAM, EC2, S3, EBS, Data Pipeline, VPC, Glacier & Redshift.

Databases: Cassandra, MongoDB, SQL Server, MySQL, Oracle 9i/10g.

Data Analytics: Tableau, Mahout, RevolutionR, Talend

Domain: Medical, Lifestyle, Banking, Brokerage and Hospitality.

PROFESSIONAL EXPERIENCE

Ryder Systems Inc – USA March 2018 – Aug 2018

Senior Big Data Architect

Architecting, Designing and Implementing (100% hands-on) the Big Data Platform for Ryder. The architecture involves designing the batch and real-time ingestion processes into the data lake, processing data through successive stages (raw, pre-processed, cleansed, detailed, aggregated), and exploring and visualizing the data in Power BI.

Responsible for building the data lake in Amazon AWS, ingesting structured shipment and master data from Azure Service Bus through AWS API Gateway, Lambda, and Kinesis Firehose into S3 buckets.

As the data is cleansed and processed it flows through the stages via a staging process in Avro and Parquet file formats, ultimately becoming ready for interactive queries, operational analytics, data mining and reporting. The data moves through the stages using Hive, with all ETL (cleansing, transformations & aggregations) done in Spark.

Technologies used: Hive, Tez, Oozie, Spark SQL, Spark Streaming, Hue, Amazon S3, Data Pipeline, VPC, EMR, Kinesis Firehose/Streams, API Gateway, Lambda, Jenkins, Git, Bitbucket, VSTS.
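
The following is an illustrative sketch in Java (not the actual Ryder code) of the ingestion step described above: an AWS Lambda handler behind API Gateway that forwards an incoming shipment event to a Kinesis Firehose delivery stream, which buffers it into S3. The delivery-stream name and payload handling are assumptions.

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import com.amazonaws.services.kinesisfirehose.AmazonKinesisFirehose;
import com.amazonaws.services.kinesisfirehose.AmazonKinesisFirehoseClientBuilder;
import com.amazonaws.services.kinesisfirehose.model.PutRecordRequest;
import com.amazonaws.services.kinesisfirehose.model.Record;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ShipmentIngestHandler
        implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    // Illustrative delivery-stream name; Firehose buffers records and writes them to S3.
    private static final String DELIVERY_STREAM = "shipment-events-to-s3";
    private final AmazonKinesisFirehose firehose = AmazonKinesisFirehoseClientBuilder.defaultClient();

    @Override
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent request, Context context) {
        // Forward the raw JSON body from API Gateway straight to the Firehose stream.
        Record record = new Record()
                .withData(ByteBuffer.wrap(request.getBody().getBytes(StandardCharsets.UTF_8)));
        firehose.putRecord(new PutRecordRequest()
                .withDeliveryStreamName(DELIVERY_STREAM)
                .withRecord(record));
        return new APIGatewayProxyResponseEvent().withStatusCode(202);
    }
}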

DMD Marketing – USA – Feb 2017 – March 2018

Senior Big Data Architect

•Architecting, Designing and Implementing (100% hands-on) the Big Data Platform for DMD. Responsible for building the data lake in Amazon AWS, ingesting structured master data from various SQL databases and terabytes of transactional data via Amazon Kinesis Firehose/Streams. All data resides in Amazon S3 buckets within the data lake in Avro/Parquet formats.

•As the data is cleansed and processed it flows through various stages via a staging process, ultimately becoming ready for interactive queries, operational analytics, data mining and reporting. The data lake is currently being built to ingest historical data (PageView, TagDetected & EmailsOpens) and Health Care Professional profile data using Sqoop.

•Future use cases are being designed to ingest transactional data from the AIM application via Kinesis Firehose. The data moves through the stages using Hive, with all ETL (cleansing, transformations & aggregations) done in Spark.

•Performance tuning to build scalable solutions on the EMR clusters, Hive ETLs and Spark jobs that process more than 1 billion records (terabytes of data), using techniques such as auto scaling, Hive-on-Spark, partitioning and Oozie sub-workflows (a sketch follows below). Since the EDW is the source for all Health Care Professional data, the transactional applications receive feeds of HCP data via the HCP Sync process. Technologies used: Sqoop, Hive, Tez, Oozie, Spark SQL, Spark Streaming, Hue, Amazon S3, Data Pipeline, VPC, EMR, Kinesis Firehose/Streams, Jenkins, Git, Bitbucket, Source Tree, JIRA.
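
An illustrative Spark job in Java (not the actual DMD code) showing the kind of stage-to-stage ETL and partitioning described above: read Avro from a raw S3 prefix, apply a simple cleansing rule, and write date-partitioned Parquet for the next stage. The bucket paths and column names (hcp_id, event_id, event_date) are assumptions, and the "avro" data source assumes the spark-avro package is on the classpath.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;

public class PageViewCleanseJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("pageview-cleanse")          // submitted to the EMR cluster
                .getOrCreate();

        // Raw stage: Avro files landed by Kinesis Firehose / Sqoop (illustrative path).
        Dataset<Row> raw = spark.read()
                .format("avro")
                .load("s3://datalake-raw/pageview/");

        // Cleansed stage: drop rows without a profile key and de-duplicate events.
        Dataset<Row> cleansed = raw
                .filter(col("hcp_id").isNotNull())
                .dropDuplicates("event_id");

        // Partition by event date so downstream Hive/Spark SQL queries can prune partitions.
        cleansed.write()
                .mode("overwrite")
                .partitionBy("event_date")
                .parquet("s3://datalake-cleansed/pageview/");

        spark.stop();
    }
}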

BBVA Compass – USA – Sept 2016 – Nov 2016

Senior Big Data Architect

•Architecting, Designing and Implementing the Big Data Platform for BBVA Compass Bank. The platform is built around the use case of analyzing marketing campaign spend within the bank. The aim is to build a Big Data lake ingesting terabytes of data from Omniture, Google/Bing, affiliates, bank databases, mainframe, etc. The platform is built using frameworks such as Confluent, Kafka, Schema Registry and Stratio CrossData to stream data into Hadoop HDFS and analyze it with Tableau. Languages/Technologies used: Java 8, Python, Kafka Streams API, Spark APIs, Kafka Connect, Stratio CrossData & HDFS.
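
An illustrative Kafka Streams topology in Java (not the BBVA implementation) showing the streaming pattern described above: filter campaign-tagged clickstream records and re-key them by campaign id so a downstream Kafka Connect HDFS sink can land them for analysis. Topic names, the JSON field and the string parsing are assumptions; the real pipeline would parse Avro against the Schema Registry.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import java.util.Properties;

public class CampaignSpendTopology {

    private static final String CAMPAIGN_FIELD = "\"campaignId\":\"";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "campaign-spend-enricher"); // illustrative
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> clicks = builder.stream("omniture-clickstream"); // assumed topic

        // Keep only campaign-tagged events and re-key by campaign id; a Kafka Connect
        // HDFS sink on the output topic would land the records for analysis.
        clicks
            .filter((key, value) -> value != null && value.contains(CAMPAIGN_FIELD))
            .selectKey((key, value) -> extractCampaignId(value))
            .to("campaign-clicks-enriched");

        new KafkaStreams(builder.build(), props).start();
    }

    // Naive string extraction for the sketch; a real topology would use Avro deserialization.
    private static String extractCampaignId(String json) {
        int start = json.indexOf(CAMPAIGN_FIELD) + CAMPAIGN_FIELD.length();
        return json.substring(start, json.indexOf('"', start));
    }
}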

Weight Watchers International USA May 2014 – Nov 2015

Designation – Senior Big Data Architect.

•Designing and Implementing the core-data-pipeline using Spark Streaming on EMR (a sketch follows this entry). This is a near real-time streaming application that enables multiple teams to ingest journal data via consumers registered to the data pipeline. Each stream of data flows through RabbitMQ, which is integrated with the pipeline and fans the data out to multiple heterogeneous systems registered via their consumers. Currently the activity data is aggregated to 20 steps/day/user, which translates into 400 records/sec of journal data (tens of TB/yr); next year the intent is to ingest the raw un-aggregated data (10,000 steps/day/user), which translates into 150K msgs/sec (hundreds of TB/yr).

•Worked on Architecting, Designing and Implementing GADS (Global Analytic Datastore) on Amazon AWS. This is a Big Data platform implemented using a hybrid Big Data architecture to hold 5-10 years of historical data from various sources (hundreds of terabytes), providing a customer- and product-centric view of the data. It enables analysts to visualize customer behaviour and journeys across products, customer retention, customer interactions, etc., and provides the business intelligence needed for predictive and statistical analysis using Big Data technologies.

•The GADS platform was designed following the hybrid architecture on Amazon cloud technologies: structured data hosted on Redshift with the ETL pipeline built in Talend, and semi-structured/unstructured data hosted in S3 buckets using transient EMR clusters, Data Pipeline, Oozie, Hive and Impala, later migrated to Spark & Spark SQL. The idea was to centralize the data in the cloud for further analytics using Tableau and RevolutionR.

•Implemented various Tableau visualizations to identify customer interactions, retention, bookings, engagements, etc. across various dimensions.

•Architected and Implemented the Tableau Server architecture to distribute the Tableau dashboards across the organization.

•Implemented the VPC architecture on Amazon AWS with the infrastructure teams to deploy instances in the Dev and Prod environments. This architecture ensured that all security controls were in place to meet HIPAA requirements and protect sensitive member PII and PHI.

•Worked on integrating various datasets into both the data warehouse and the Big Data platform, bringing in data from DOTCOM (online data), CHAMP (meeting data), SMV (MDM data), WELLO (coaching data), Teletech (chat & call data), Exact Target (mail data), Reflexis (workforce data), ClickTools (satisfaction data), etc.
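
An illustrative Spark Streaming sketch in Java of the per-batch aggregation described in the core-data-pipeline bullet above: journal events arrive as "userId,steps" lines and are summed per user in each micro-batch. A plain socket source stands in for the RabbitMQ consumer used in the real pipeline, and the record layout is an assumption.

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import scala.Tuple2;

public class ActivityJournalStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("core-data-pipeline-sketch");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(10));

        // Stand-in source; the production pipeline registered a RabbitMQ consumer here.
        JavaDStream<String> journal = ssc.socketTextStream("localhost", 9999);

        // "userId,steps" -> (userId, steps), then sum steps per user for each micro-batch.
        JavaPairDStream<String, Long> stepsPerUser = journal
                .mapToPair(line -> {
                    String[] parts = line.split(",");
                    return new Tuple2<>(parts[0], Long.parseLong(parts[1]));
                })
                .reduceByKey(Long::sum);

        // Downstream systems registered on the pipeline would consume this instead of printing.
        stepsPerUser.print();

        ssc.start();
        ssc.awaitTermination();
    }
}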

Bank of America, New York USA Feb 2013 – May 2014

Senior Big Data Architect.

•Working for Fraud Technologies, architecting, designing and implementing a big data solution for fraud detection and analytics. The product, called HULC, is intended to hold 13 months of historical data from various sources (hundreds of terabytes), providing a consolidated view of a customer's products across the bank and giving business analysts the business intelligence to perform analytics using big data technologies.

•The other aspect of the product, called ELECTRO, performs ETL transformations on the raw data before it is processed for scoring and alert detection in the bank.

•Responsible for designing the Cassandra data models for the Venom, DFP & Flash projects and integrating them into the application design (an illustrative table sketch follows this entry). The Venom data model holds the monetary & non-monetary transactions, DFP holds the online login transactions, and Flash holds the alerts generated by HULC.

•Responsible for Architecting the solution, defining the integration points with the fraud-scoring engine, capacity planning, deciding key technologies and designing and implementing the solutions.

•Responsible for introducing Big Data tools and technologies into the bank, presenting and implementing POCs for Tableau, Mahout, RevolutionR, Impala, Pentaho, etc.

•The above-mentioned projects were implemented using big data technologies such as Cloudera Hadoop CDH4.2, Java MapReduce, Pig, Hive, Oozie, Flume, Cassandra, Sqoop and Solr.
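
Illustrative only, not the bank's schema: a possible Cassandra table shape for the customer-transaction model described for Venom, created through the DataStax Java driver. The keyspace, table and column names are assumptions, and the "fraud" keyspace is assumed to already exist.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class VenomSchemaSketch {
    public static void main(String[] args) throws Exception {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {

            // Partition by customer and cluster by transaction time so the most recent
            // monetary/non-monetary transactions for a customer are read sequentially.
            session.execute(
                "CREATE TABLE IF NOT EXISTS fraud.customer_transactions ("
              + "  customer_id text,"
              + "  txn_time timestamp,"
              + "  txn_id text,"
              + "  txn_type text,"      // monetary vs non-monetary
              + "  amount decimal,"
              + "  channel text,"
              + "  PRIMARY KEY ((customer_id), txn_time, txn_id)"
              + ") WITH CLUSTERING ORDER BY (txn_time DESC, txn_id ASC)");
        }
    }
}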

Citigroup, New York USA Aug 2012 – Nov 2012

Senior Solutions Architect

•SME on MongoDB for the Big Data initiatives, responsible for architecting and designing scalable big data solutions for teams across the organization.

•Part of the architecture group responsible for setting standards and best practices, building POCs, reviewing program-level initiatives, vendor management, etc.

Morgan Stanley – New York USA – Aug 2010 – Aug 2012

Designation – Technical Architect

•Participating in the Distributed Computing Initiative using Hadoop, Hive & Pig. The initiative was to build a Data Fabric platform within the organization to enable parallel computation and analysis of large files coming from the trading desks.

•As part of the emerging technology initiative, setting up a Hadoop distributed cluster on the Amazon AWS cloud and building a POC implementing MapReduce jobs in Java (a sketch follows this entry), monitoring them through the web UI in fully distributed mode, and implementing Pig Latin scripts for parallel data processing.

•Responsible for driving the Cloud Computing practices at Morgan Stanley. Executed a comparative study of Amazon, Azure & MS Private Cloud, building and deploying a Java & .NET application and exploring cloud features such as elastic computing, cloud storage services, identity & access management, load balancing & auto scaling, etc.
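
An illustrative Hadoop MapReduce job in Java, in the spirit of the POC described above: count records per trading desk in pipe-delimited trade files. The field layout and input/output paths are assumptions.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import java.io.IOException;

public class TradesPerDesk {

    public static class DeskMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        @Override
        protected void map(LongWritable key, Text line, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split("\\|");
            ctx.write(new Text(fields[0]), ONE);   // assume the desk id is the first field
        }
    }

    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text desk, Iterable<LongWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable c : counts) total += c.get();
            ctx.write(desk, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "trades-per-desk");
        job.setJarByClass(TradesPerDesk.class);
        job.setMapperClass(DeskMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}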

CITI – Citigroup – New Jersey USA Apr 2009 – Aug 2010

Designation - Project Lead

•Responsible for the Design/Development of the Customer Activation System for onboarding clients/users, enabling them to use the Citi group of products and services. This admin product goes beyond client & user onboarding with services such as Admin Agent Management, Service Management, Contacts & Reports.

•The application is developed with the UI in .NET, interacting with application web services developed in Java, and is integrated with the end systems/applications using the Provisioning product.

•The application was developed with technologies such as .NET, Web Services, Spring, Hibernate, XML and JMS, using tools and products like RAD7, WebSphere, TIBCO, TFS and Oracle 10g.

CIBC – Canadian Imperial Bank of Commerce – Toronto Canada – Mar 2008 to Feb 2009

Designation - Project Lead

•Responsible for the Design/Development of the Service Request Management (SRM) project, whose main goal was to decommission Remedy and migrate all of its functionality onto the GOW platform.

•The application was developed on the Struts framework, interacting with remote services developed in EJB 2.1.

Technologies involved: Struts, EJB 2.1, Oracle 9i, WLI server, Eclipse.

Charles Schwab – SFO USA Jan 2004 to Jan 2008

Designation - Project Lead

•Responsible for the Design/Development of the Quarterly Portfolio Profile, a reporting tool that provides clients with a quarterly performance snapshot of the assets in their accounts. Also worked with the architects on performance tuning and load testing of the tool, prepared HLD/LLD documents, and was responsible for impact analysis, effort estimation and sizing.

Walt Disney – Florida USA Sep 2002 to Nov 2003

Senior Software Engineer

•A web-based system to administer the hospitality services of Walt Disney, USA. The system is essentially a sales and booking client for booking hotels/resorts, air tickets, theme park tickets, etc.

Technology: UML, JSP, EJB, Design Patterns (Core and J2EE), SQL, WebSphere, Oracle, Rational Rose, IntelliJ.

General Electric Power Systems – Noida India Sep 2001 to Sep 2002

Senior Software Engineer

•This application was the GEPS (General Electric Power Systems) intranet site, written using the J2EE architecture. It was developed on CASPER, a J2EE framework developed by GE based on the MVC design model.

Technology: Java, JDBC, JSP, XML, HTML, JavaScript, UML, Rational Rose, Oracle 8i, WebLogic 6.0.

Jalva Media India Pvt Ltd – Mumbai India Sep 2000 to Jul 2001

Senior Executive – Projects

•The application was developed to customize database objects in the database layer and the front-end layer through EJBs in the middle tier. Clients in the US and India are provided their own customized versions of the database objects and of the front-end look and feel, in the form of JSPs that create dynamic forms. The product makes it easy to deliver customized media management applications to clients.

Technology: Java, J2EE, EJB, JBuilder, Oracle 8, WebLogic 5.1.

EDUCATION

MSC ENGINEERING – COMPUTER SCIENCE 2004 - 2007

M.S. Ramaiah School of Adv. Studies through Coventry University - UK

• Majored in Real Time Embedded Systems.

• Graduated with Distinction and Gold Medal.

DIPLOMA IN ADVANCED COMPUTING 1999

Center for Development of Advanced Computing

•Studied topics in advanced computing such as Advanced C, Operating Systems, OOP using C++, MS programming, Business Computing, Software Engineering, Java, VC++, VB, and Data Communication and Networking.

•Passed with Merit.

ADVANCED DIPLOMA IN ROBOTICS AND AUTOMATION 1994 – 1995

Victoria Jubilee Technical Institute (VJTI)

• Majored in Robotics and Computing Technology in Industrial Automation.

• Passed with Merit.

DIPLOMA IN ELECTRONICS AND RADIO ENGINEERING 1992 – 1994

Board of Technical Education.

•Majored in Electronic Communication.

•Passed with Merit.


