Neha Ambekar
Address: **** **** ***** **, *****: *******.******@*****.***
San Jose, CA 95134 Phone: 720-***-****
OBJECTIVE
To work as a data engineer/database developer in a fast-paced and challenging work environment.
SUMMARY
●6+ years of IT experience focused on data profiling, data integration, data migration, and ETL tools and processes
●Experience in design and development of complex queries and mappings to extract data from diverse sources such as flat files, unstructured data, RDBMS tables, and legacy systems using Talend Open Studio and Pentaho
●In-depth knowledge of Data Warehousing concepts, Relational Database Management Systems and Dimensional modeling using ERWIN and Visio
●Experienced in database extraction and loading of large volumes of data from diverse sources using Pentaho, Talend Open Studio and SSIS
●Expertise in creating documentation for requirements gathering and unit testing
●Excellent problem-solving and troubleshooting skills, and ability to work individually as well as in teams
WORK EXPERIENCE
Senior ETL developer
Scry Analytics Nov 2014 - Feb 2016
Project name: Cisco Connected Analytics for Call Center Client: Cisco Systems
Role: Senior ETL developer / Data Analyst
●Worked as a data engineer to design an analytics software and services solution for Cisco’s Contact Center Enterprise (CCE) to improve agent effectiveness and reduce customer effort during the call process
●Worked closely with business owners to translate existing business logic and needs into actionable requirements
●Profiled, fetched and transformed relevant data from databases (MS SQL, PostgreSQL and Greenplum) and loaded it into PostgreSQL using Talend Open Studio
●Used Agile methodologies to meet project deadlines and Subversion for version control
●Extracted data from flat files/legacy databases and applied business logic to load them into staging databases.
●Designed ETL jobs and packages using Talend Open Studio (TOS) and Pentaho
●Implemented workflows to populate slowly changing dimensions to maintain current and historical information in warehouse tables with change data capture (CDC).
●Created complex mappings in Talend 5.6 using tMap, tJoin, tReplicate, tAggregateRow, tLogCatcher and tUniqueRow.
●Used tStatsCatcher, tDie and tLogRow to create a generic joblet to store processing stats.
●Created Talend and Pentaho mappings to populate the data into dimensions and fact tables.
●Created Talend ETL jobs using tPop, tFileList and tFileInputMail to retrieve attachment files from POP email servers, load the attachment data into the database, and archive the files.
ETL developer
Persistent Systems Limited July 2010 - Oct 2014
Project name: CaTissue Suite Data Migration Client: Dana Farber Cancer Institute
Role: Oracle Developer / Analyst Mar 2012 - Oct 2014
caTissue Suite is a modular, open-source biorepository tool for biospecimen inventory management, tracking, and annotation. This tool permits users to track the collection, storage, quality assurance, and distribution of bio-specimens as well as the data related to the participant.
Responsibilities:
●Involved in full development cycle of Planning, Analysis, Design, Development, Testing, and Implementation.
●Designed logical and physical data models for star and snowflake schemas using Enterprise Architect.
●Designed Talend and Pentaho jobs to create analysis-ready data tables. The ETL process included profiling, fetching and transforming data from various databases (Oracle, MS SQL and PostgreSQL), and loading the processed data into PostgreSQL.
●Fine-tuned SQL queries for maximum efficiency using rule-based optimization.
●Generated EXPLAIN plans for slow-running queries and modified the queries accordingly.
Project name: OnCore-CaTissue Integration Client: Dana Farber Cancer Institute
Role: ETL developer Oct 2010 - Feb 2012
OnCore is a clinical trials management system (CTMS) to automate and integrate clinical trials operations and reporting.
Responsibilities:
●Used Talend and Pentaho as ETL tools to load data from flat files received daily from various facilities.
●Developed PL/SQL stored procedures, triggers and master tables for automatic creation of primary keys.
●Created Talend and Pentaho scripts to build new tables, views and queries for application enhancements.
●Used PL/SQL bulk collections (BULK COLLECT) for better performance and easy retrieval of data, by reducing context switching between SQL and PL/SQL engines.
Project name: caTissue Suite Client: Cancer Biomedical Informatics Grid
Role: Java Developer Aug 2010 - Oct 2010
caTissue Suite is a tool for managing biospecimens collected in support of basic and clinical research.
Responsibilities:
●Identified the use cases
●Set up databases
●Developed the application using Java, JSP, JBoss, SQL Server, LDAP Server
●Tested the application
Systems Engineer Jan 2010 - May 2010
Infosys Limited
Project name: AMEX Banking Client: American Express
Role: Java and Oracle Developer
This system keeps track of day-to-day bank operations such as deposits, withdrawals, demand drafts, and different types of mortgages for customers as well as employees.
Responsibilities:
●Involved in full development cycle of Planning, Analysis, Design, Development, Testing and Implementation.
●Developed the application using Java, JSP, JBoss, SQL Server
●Designed logical and physical data models for star and snowflake schemas using Enterprise Architect.
EDUCATIONAL QUALIFICATION
Bachelor of Engineering in Computer Science 2005-2009
Bhilai Institute of Technology, India
Courses: Data Structures, Java, C, Computer Architecture, Unix
TECHNICAL SKILLS
●Databases: Oracle 9i/10g/11g, MS SQL Server, MySQL, PostgreSQL
●Tuning Tools: TKPROF, EXPLAIN PLAN, Statspack
●Reporting Tools: Oracle Reports 9i, Oracle Discoverer 10g
●ETL Tools: Pentaho, Talend, SSIS
●Data Modeling: Erwin and Visio
●Operating Systems: UNIX/Linux, Microsoft Windows Server, Mac OS X
●Programming Languages: SQL, PL/SQL, Java, Python
REFERENCES
Available upon request.