
Data Engineer

Richmond, Virginia, United States
November 16, 2017




Agile Scrum

More than ** years of experience in diverse technologies including data analysis, data science, performance improvement, advanced query tuning, and software development and testing.

Looking for challenging roles in data science, data analysis and business intelligence domains.

Proficient with database management tools such as MySQL and Oracle. Worked on data warehousing: Extraction, Transformation and Loading (ETL), and Business Intelligence (BI) tools such as Tableau.

Experience in validation, extraction, transformation, loading and reporting using Teradata, Oracle, MySQL, SQL Assistant, Tableau and Unica Affinium Campaign.

Expertise in writing large, complex SQL queries using joins, views, volatile tables, subqueries, stored procedures, indexes, constraints and OLAP functions.

Proficient in performance analysis with SQL query tuning using Explain Plan, Collect

Hands-on experience writing Python scripts for data extraction, data transfer/load, and data analysis with the NumPy and Pandas libraries.

Created IPython notebooks to develop data analysis reports starting from data munging, data quality check, creating and explaining graphs in Python using Matplotlib, Seaborn.

Applied Machine Learning techniques to build tools that predict customer preferences and behavior.

Hands on experience with Machine Learning algorithms like Linear Regression, Logistic Regression, K-means Clustering, Neural Networks, Recommender Systems, Anomaly Detection, Support Vector Machine.

Conduct exploratory data analysis and hypothesis testing on attributes of customer clusters using Python.

Create custom reports from large-scale data, algorithmically extracted and processed.

Build models to predict customers purchasing a new brand. This includes feature extraction, feature engineering, and implementing classification models (logistic regression, tree ensemble methods and SVM) using Python. The selected model is used to boost conversion rates of marketing campaigns.

Proficient with descriptive and inferential statistical modeling and experimental design, Hypothesis testing, Chi-Squared test, F-test, t-test, ANOVA, Mann-Whitney.

Strong hands-on experience in developing dashboards, stories, data visualizations, and analytics using Tableau. Familiarity with best practices around visualization and design.

Good experience in creating metadata and field mappings. Good experience in designing logical and physical data models using Visio.

Experienced in writing Korn shell and Bash scripts to run batch jobs in UNIX.

Extensively worked with Teradata utilities like BTEQ, Fast Load, Multi Load, TPump to export and load data to/from different source systems including flat files.

Extensively used various reporting objects like Action filters, Calculated fields, Sets, Groups, Parameters etc., in Tableau.

Expert level capability in Tableau calculations and applying complex, compound calculations to large, complex data sets.

Experience in producing ad-hoc reports using Teradata, SSRS, and Tableau.

Designed marketing campaigns using UNICA Affinium Campaign Management Tool. Campaigns involved Direct mails, Cross sell campaigns, Email campaigns, Telemarketing, in statement letters and messages.

Experience in maintaining and developing large weekly, monthly and quarterly financial reports.

Proficient in preparing PowerPoint presentations, and graphs, tables and charts using Excel.

Hands on Experience in Agile Scrum Working Environment.



Teradata 14/13/12, Oracle 10g, SQL Server 2008, Hadoop, MapReduce, HDFS, NoSQL (MongoDB), AWS (S3, EC2, Redshift, DynamoDB, CloudFormation)

Database tools

SQL Developer, TOAD, Teradata SQL Assistant 13, Redshift


C/C#, SQL, PL/SQL, VBA, Python, R, UNIX Korn Shell scripting, BTEQ

Data Science Algorithms and Libraries:

Machine Learning, Reinforcement Learning, Data Mining, Predictive Modeling, Linear/Logistic/Lasso Regression, Decision Trees, Random Forest, Support Vector Machines, Clustering, XGBoost, TensorFlow, Scikit-Learn, Pandas, Matplotlib, Seaborn, NumPy, ggplot2


Affinium Campaign (V6.4, V7.2, V8.2), Tableau 8.2, BOBJ 4.0, SAS, Informatica, DataStage, Spyder, RStudio

Job Function

Data Analysis, Agile Scrum, Data Validation, Business Analysis, System Analysis and Design

Operating Systems

UNIX, Linux, MS-DOS, Windows XP/10, Win NT


Bachelor of Engineering in Instrumentation Technology, Delhi University

Masters (MBA) in Finance and Analytics from University of Notre Dame


AWS Certified Associate (Developer) – 96%

Data Analyst Nanodegree from Udacity

Machine Learning Engineer Nanodegree from Udacity (pursuing)


Capital One Financial, Richmond, VA

Marketing Analytics (Consultant)/Data Engineer Nov 2015 – present

Responsible for design, development and administration of transactional and analytical data constructs/structures

Created database objects like tables, views and functions using advanced Teradata SQL queries to provide definition, structure and to maintain data efficiently

Fine-tuned SQL queries to maximize efficiency and performance, drawing on strong working knowledge of RDBMS in SQL Server and Oracle and advanced SQL scripting skills such as multiple joins and subqueries

Highly efficient in Data Wrangling techniques: Data cleaning and transforming data in usable format, data quality, data profiling, data organization

Collaborate with business and technology leaders to identify key business requirements to enable enterprise Business Intelligence strategy

Development of business intents as automated SQL scripts based on requirements of business analysts and operational analysts

Designed and delivered targeted card marketing campaigns using Unica Affinium, performing segmentation and targeting of product offerings

Performed spend habit analysis to carry out test campaigns among control and test populations for a major partner by targeting them with reward points offers based on purchase transaction volumes

Performed customer management based on KYC (know your customer) regulatory requirement, involving complex segmentation of target population for multiple customer management activities

Validated data against the business intent and helped resolve errors and gaps between the understanding of the business intent and its implementation

Documented project reporting requirements and translated them into functional specifications

Developed Tableau workbooks from multiple data sources using Data Blending

Developed Tableau visualizations and dashboards using Tableau Desktop to visualize and compare enterprise data metrics on a weekly basis

Environment: Teradata SQL, UNIX, Fast Load, TPump, BTEQ scripting in Korn Shell, Affinium campaign, Git

Capital One Financial, Richmond, VA

Machine Learning Analyst (Consultant) Jan 2017 – present

Designed, built and implemented marketing models for a major luxury retail partner in the US.

Developing a Machine Learning model to analyze the spending habits of customer segments for the luxury retailer from a population of 2 million customers.

Working closely with the business intent owners and creative owners to understand and implement the intent as a machine learning model

Created the complete model pipeline from data import, data munging and exploration, feature selection, model building, validation and model scoring

Performed data import using Teradata SQL and improved data quality through standardization and/or normalization of features

Involved in feature selection, feature creation and feature pruning (partial dependency plot)

Built an extendable model pipeline and tuned hyperparameters (downhill simplex)

Split the data into train and test sets, plotted ROC, gain and lift curves, and extracted the feature importance list. Used Decision Trees and Gradient Boosting classifiers such as XGBoost

Evaluated and scored models using metrics such as precision, recall and F1 score across a range of input parameter values

The project is being executed in a Scrum Agile environment

Environment: Python, Teradata SQL, Spark (PySpark), AWS (S3, CloudFormation, EC2), IPython, Machine Learning model building, model validation, predictive modeling using (XGBoost, TensorFlow, Scikit-Learn, Pandas, Matplotlib, Seaborn, NumPy), Git

Wells Fargo, San Francisco, CA Apr 2014 – Sep 2015

Data Analyst (Consultant)/ Data Engineer

Gathered requirements from Business Analysts and Operational Analysts and identified the data sources required for the requests.

Worked on data verification and validation to evaluate whether the data generated according to the requirements was appropriate and consistent.

Imported and exported large amounts of data between files and Teradata.

Worked on data profiling, data analysis and validating the reports.

Designed and developed Ad-hoc weekly, monthly Tableau reports as per business analyst, operation analyst, and project manager data requests.

Worked on Teradata and spreadsheets as data sources for designing Tableau Reports and Dashboards.

Developed Tableau data visualization using Dual Graphs, Scatter Plots, Geographic Map, Pie Charts and Bar Charts, Stacked Bar Charts.

Distributed Tableau reports using techniques like Packaged Workbooks, PDF etc.

Created dashboards and data visualizations using action filters, calculated fields, sets, groups, parameters etc. in Tableau

Created pivot tables and Graphs in MS Excel by getting data from Teradata tables.

Responsible for collecting the data and loading it into the database.

Extensively used ETL methodology for supporting data extraction, transformations and loading processing, in a complex EDW using Informatica.

Developed reports with Custom SQL and views to support business requirements.

Wrote complex SQL queries using joins and OLAP functions such as CSUM, COUNT and RANK.

Worked on Set, Multiset, Derived and Volatile Temporary tables.

Extracted data from existing data source and performed ad-hoc queries.

Performance tuned and optimized various complex SQL queries.

Used PROC EXPORT to export data from SAS to spreadsheets

Applied simple statistical procedures such as PROC MEANS and PROC FREQ in analyzing data

Participated in Sprint Planning and Daily Status Meetings

Environment: UNIX, Teradata, Tableau, Informatica, Business Objects, SAS, Teradata SQL Assistant

Samsung Electronics, Delhi, India

Senior Database Engineer Apr 2010 – Aug 2013

Designed and developed the existing ETL framework to support high-availability, high-transaction OLTP enterprise application systems, including customer support and marketing systems.

Worked on the reporting infrastructure of one of the most important groups at Samsung, which relies on Business Intelligence tools such as SSRS and SSIS to extract data for senior management.

Created ad-hoc reports using SSRS as well as MS Excel by pointing to appropriate data sources.

Made changes to the existing extraction, transformation and loading logic that brings in data from a normalized Software Quality Management data warehouse.

Developed cubes which act as data sources for various reports used within the group and queried data from the multi-dimensional data warehouses using MDX.

Wrote complex stored procedures using T-SQL that form the basis of the business logic underlying these reports.

Actively involved in project planning and estimation under an Agile development methodology, following Scrum planning to scope development efforts for each sprint.

Performed web service development, debugging and deployment

Environment: SQL Server 2008/2012, SSRS, SSIS, C#, ADO.NET, Visual Studio 2008, Report Builder, Report Designer and Report Modeler

Wipro Technologies, Bangalore, India

Technical Project Leader Oct 2005 – Jan 2009

Supervised a team of 6 to develop database projects worth more than $10 million for a major banking client

Responsible for developing Event Triggered Managed Campaign stored procedures and other database objects for a web-based SaaS application

Tuned and created the extraction job from the legacy data warehouse to our OLTP-based database, an ETL process that runs daily and brings in new data.

Developed NT batch processes which act as a wrapper for these Event Triggered Campaigns.

Generated ad-hoc SQL reports using SSIS and T-SQL for the Business Marketing Group, enabling them to send out coupons to qualified customers.

Responsible for implementing changes on .NET-based APIs.

Maintained and developed C# code on the intermediate business logic servers, which are coded in C#/C++

Generated the entire workflow of processes using Windows workflow management in C#.

Developed release and functional code lines in parallel using Perforce as the main configuration management tool.

Extracted data from Sybase data sources to our destination servers after flattening and applying various transformations.

Environment: SQL Server 2008/2012, SSRS, SSIS, C#, ADO.NET, Visual Studio 2008, Report Builder, Report Designer and Report Modeler

Wipro Technologies, Bangalore, India

Consultant/Engineer Sep 2000 – Sep 2005

Implemented automated test framework: led a team of 4 to develop a mobile software testing framework, leading to a 40% reduction in testing cost for Nokia Mobile

Automated system performance measurements (network transfer, memory usage, frame buffer and X write, file-system read and write).

Tested the D-Bus and GConf open source components, and created the test plan and test tools for unit testing of these modules. Testing was done on x86 as well as on the target platform after cross-compilation of test binaries in the Scratchbox environment, with test results and metrics stored in MySQL.

Analyzed the performance data in MySQL. Generated ad-hoc reports using reporting services of MySQL to analyze the performance metrics and focus performance improvement of modules

C, Shell Scripting, Linux (Debian), Scratchbox, Gnu-toolchain, Bugzilla, gcov, gprof, valgrind, SQL

Coordinated development of MRI scanning software according to DICOM standard worth $2 million across India & Japan

Studied the existing applications, coded in C, prepared an object-oriented design of the new applications, coded them in Java, and tested them in a UNIX-simulated environment on Windows XP using the NuTCRACKER tool

The applications include Storage/Query-Retrieve/MPPS/MWM/Print

Networking apps utilized to interact with an image SQL Server database.

Tools - C/C++/Java, SGI Irix, NuTCRACKER, DICOM, SQL Server, dbx, Lex/Yacc, socket programming

Supported account management team by writing project proposals for Toshiba and Motorola leading to project wins worth $5 million

Enhanced and maintained the TCP/IP protocol stack for IBM server product independently

Role involved detailed study of the TCP/IP stack code, fixing the reported bugs, and testing the stack with the fixes

Also developed Bluetooth Serial Port profile for Wipro’s HomeNet Bluetooth stack on Linux Kernel (2.4.6) as a kernel module

Tools – C, Linux 2.4.6 kernel, Bluetooth, kdb
