Data Engineer

Virginia Square, Virginia, United States
June 06, 2019

Vienna, VA *****


Programmer Analyst with 4+ years of experience in the data warehousing domain. Able to manage several databases and create BI reports.

Completed the Udacity Data Analyst Nanodegree.

Hold a patent for a project on irrigation automation using wireless sensors.

Expertise in BI/Data Warehousing concepts, designs, and project implementations.

Possess strong knowledge of database management, including writing complex SQL queries, stored procedures, and database tuning.

Expertise in analysis, design and implementation of data warehousing applications using Informatica, Oracle and SQL Server technologies.

Strong influence-management, reporting, and analytical/problem-solving skills with attention to detail.

Experience in creating and implementing interactive charts, graphs, and other user interface elements.



C++, Java 8, Python 3.6 (Anaconda), SQL, XML, R, PySpark


MySQL 5.7, SQLite, Oracle, MongoDB, Atom, Excel, Access, IoT, Jira

Big Data Tools:

Jupyter, Tableau, RapidMiner, Alteryx, R, Hadoop (basics), Cognos, Power BI, Databricks

Cloud Computing:

AWS EC2 (basics), Snowflake

Version Control:



Data Engineer, Capital One, VA

Feb 2019 – Present

Designing, architecting and testing various credit card application models and integrating them based on different business rules for decision processing.

Upgrading and automating database scripts in Hadoop PySpark to increase effectiveness and productivity.

Providing post-release data validation and working with the project team and internal/external stakeholders to improve existing database applications in Snowflake.

Researching opportunities for data acquisition and new uses for existing data.

Working with data analysts to re-engineer applications and translate requirements into functional and/or technical specifications.

Creating and updating various databases and scripts in Databricks to support teams in their daily activities.

Integrating new data management technologies and software engineering tools into existing structures.

Coordinating and guiding support teams for daily activities in Databricks.

Using Git to update existing versions of the Hadoop PySpark scripts to the new model.

Developing and maintaining model data from various sources, debugging, and improving solution feasibility.

Conducting business meetings to gather functional and technical details of a requirement.

Exposure to Databricks, generating PySpark scripts to automate reports.

Data Analyst, Express Analytics, CA

April 2018 – Feb 2019

Programmed in Python (Jupyter) to create clusters of customer records to analyze and create a model.

Worked in Alteryx to validate customer details using the CASS and Customer View Matching tools.

Used Alteryx to parse information from a third-party API for validation.

Interpreted data, analyzed results using statistical techniques, and provided ongoing reports to the organization.

Analyzed and mined business data in Tableau to identify patterns and correlations among the various data points.

Mapped and traced data from system to system to solve business and system problems.

Performed statistical analysis of business data and reported findings to management to support decision making.

Designed Alteryx workflows and created data reports from reporting tools to support management decision making.

Documented in MySQL the types and structure of the business data required for the project.

Picked up assigned tasks from Jira and followed agile methodology.

Worked with both supervised and unsupervised algorithms to predict and cluster data based on customer needs.

Created different models, fit them to the training dataset, and tested them on different data to calculate efficiency.

Used the Alteryx service to standardize addresses from USPS data, which helped determine customers' current addresses.

Data Analyst, Cognizant Technology Solutions Pvt. Ltd

Jan 2016 – Dec 2016

Responsible for creating reports with the Cognos reporting tool based on data received from Aetna Health Care System.

Redesigned the data after data cleaning and data mining, and produced results used to create reports and dashboards in Tableau and Cognos.

Worked on multiple projects in the healthcare and telecommunications industries and gained a fundamental knowledge of data warehousing and big data.

Developed and maintained applications and databases by evaluating client needs, analyzing requirements, and developing software systems on HDFS.

Contributed to team effort by accomplishing related results as needed.

Confirmed program operation by conducting tests and modifying program sequence and/or code, which was later moved to continuous automation.

Generated statistics and wrote reports for management and/or team members on the status of the programming process.

Designed, ran, and monitored software performance tests on new and existing programs using SQL to correct errors, isolate areas for improvement, and perform general debugging.

Consistently wrote, translated, and coded software programs and applications according to given specifications.

Gained an understanding of the healthcare and telecommunication sectors with respect to data analysis.

Data Analyst, Annapuranaa Industrial Needs Pvt. Ltd

Jan 2013 – Jan 2016

Responsible for maintaining stock and reporting.

Designed and created database structures based on the existing information and predicted future data using Microsoft Excel.

Interacted with business clients on organizational statistics which helped to reduce losses and increase margins.

Interpreted data and analyzed results using statistical techniques in Tableau.

Developed and implemented data analysis, data collection systems and other strategies that optimize statistical efficiency and quality.

Acquired data from primary or secondary data sources and maintained databases.

Developed and implemented databases, data collection systems, data analytics and other strategies that optimize statistical efficiency and quality.

Filtered and “cleaned” data by reviewing computer reports, printouts, and performance indicators to locate and correct code problems.

Worked with management to prioritize business and information needs.

Located and defined new process improvement opportunities.

Provided support for analytics process monitoring and troubleshooting.

Supported business users by answering complex business questions via ad-hoc SQL queries.


Machine Learning Algorithms (Python, R, Jupyter, Statistics):

Studied different machine learning algorithms in Python. Performed various analyses on the given dataset using supervised and unsupervised learning.

Worked with supervised learning algorithms such as Naïve Bayes, Support Vector Machines, Decision Trees, K-Nearest Neighbors, Linear Regression, and Logistic Regression.

Worked with unsupervised learning algorithms such as hierarchical clustering, and with neural networks.

Data Prediction (RapidMiner, Excel, Statistics):

Predicted Apple's stock price by analyzing the stocks of other electronics giants such as Samsung, Dell, Panasonic, HP, and Lenovo.

Used RapidMiner to find the relationship between the stocks using linear, binomial, or logistic regression.

Created a model based on historical data and tested it on a different dataset.

Analyzed and found the hidden relationship between the stocks and predicted the future of Apple stock with 85% efficiency.

Data Visualization (Tableau, Python, Jupyter):

Gathered data from Twitter with Python, using the tweet IDs of #WeRateDogs.

Used data wrangling techniques to analyze and clean data; in the end, used Tableau to organize the data, visualize it and create a story about it.

Studied different machine learning algorithms and used libraries such as NumPy and pandas to visualize data.

Used data mining techniques to find interesting patterns and hidden knowledge that supported decisions to increase the organization's market value.

