

Rochester, Minnesota, United States
February 21, 2018

Contact this candidate

Soham Salvi


Expertise in developing data warehousing and BI solutions for more than 5 years in OLTP environments. Expert in writing SQL, SSIS packages, and SSRS reports. Proficient in analyzing healthcare-domain data from systems such as Epic, Cerner, and McKesson Star.

Knowledge of healthcare data involving patient encounters, appointments, provider schedules, patient charges and transactions, hospital supplies, and ICD-9 and ICD-10 diagnosis and procedure codes.

Worked in agile/Scrum processes to analyze, design, deploy, and support data warehousing solutions. Collaborated with DB architects and DBAs on database modeling, capacity planning, gap analysis, data retention policy, and code reviews.

Expert in data conversion and data migration, using SSIS packages to transfer data from Oracle, DB2, Teradata, and flat files to SQL Server.

Thorough knowledge of machine learning and of writing algorithms such as linear and logistic regression, neural networks, k-means, SVM, PCA, anomaly detection, and collaborative filtering.


B.S. in Information Science & Technology and Minor in Business - Penn State University, May 2012

B.Sc. in Information Technology - Mumbai University, April 2010

Machine Learning, Stanford University (Coursera Certification) - Jan 2018

Demonstrated the ability to understand and apply machine learning algorithms to achieve the assignment objectives. Wrote the algorithms in Octave, using matrix datatypes to vectorize the code. Some of the models developed are:

Used neural networks with forward and backward propagation to predict handwritten digits, and tested the implementation with gradient checking to make sure the cost function's gradients were computed correctly.
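
The gradient-checking step can be sketched in Python (a minimal stand-in, not the original Octave assignment code; the quadratic cost function here is invented for illustration):

```python
import numpy as np

def numerical_gradient(cost_fn, theta, eps=1e-4):
    """Approximate the gradient of cost_fn at theta by central differences."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        grad[i] = (cost_fn(theta + step) - cost_fn(theta - step)) / (2 * eps)
    return grad

# Stand-in cost with a known analytic gradient: J(theta) = sum(theta^2), grad = 2*theta.
cost = lambda t: np.sum(t ** 2)
analytic = lambda t: 2 * t

theta = np.array([1.0, -2.0, 0.5])
num_grad = numerical_gradient(cost, theta)

# The numerical and analytic gradients should agree to a tiny relative difference.
diff = np.linalg.norm(num_grad - analytic(theta)) / np.linalg.norm(num_grad + analytic(theta))
assert diff < 1e-7
```

In practice the same check is run once against the backpropagation gradients and then switched off, since the numerical version is far too slow for training.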

Built a spam email classifier using a support vector machine with an RBF (radial basis function) kernel, i.e., a Gaussian kernel.
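
The Gaussian (RBF) kernel behind that classifier is a one-liner; a Python sketch with made-up input vectors:

```python
import numpy as np

def gaussian_kernel(x1, x2, sigma=1.0):
    """RBF similarity: exp(-||x1 - x2||^2 / (2 * sigma^2)); equals 1 when the points coincide."""
    return np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma ** 2))

a = np.array([1.0, 2.0, 1.0])
b = np.array([0.0, 4.0, -1.0])
assert gaussian_kernel(a, a) == 1.0                       # identical points: maximal similarity
assert abs(gaussian_kernel(a, b, 2.0) - 0.324652) < 1e-4  # ||a-b||^2 = 9, exp(-9/8)
```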

Used the k-means algorithm to compress images by reducing the number of colors that occur in the image. Randomly selected the initial cluster centroids, then minimized the cost function by repeatedly assigning each pixel to its nearest centroid and recomputing each centroid as the mean of its cluster. Improved the model's performance by optimizing the number of clusters.
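
A minimal Python sketch of that assign-and-update loop, run on invented two-color pixel data rather than a real image:

```python
import numpy as np

def kmeans(points, k, iters=10, seed=0):
    """Plain k-means: random centroid init, then alternate assign/update steps."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

# Toy "image": RGB pixels drawn from two distinct colors, compressed to k=2.
rng = np.random.default_rng(1)
pixels = np.vstack([
    rng.normal([0.9, 0.1, 0.1], 0.02, (50, 3)),   # reddish pixels
    rng.normal([0.1, 0.1, 0.9], 0.02, (50, 3)),   # bluish pixels
])
centroids, labels = kmeans(pixels, k=2)
compressed = centroids[labels]  # each pixel replaced by its cluster's color
assert compressed.shape == pixels.shape
```

For real compression, only the k centroid colors plus a per-pixel label need to be stored, instead of three channels per pixel.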

Used an anomaly detection algorithm to detect anomalous pings in server computer data. Optimized the algorithm by estimating the Gaussian distribution parameters and selecting the detection threshold based on the F1 score.
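
A sketch of the F1-based threshold selection in Python (synthetic data standing in for the server metrics; all names here are invented):

```python
import numpy as np

def gaussian_pdf(X, mu, var):
    """Per-example density under independent Gaussians fit to each feature."""
    p = np.exp(-((X - mu) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var)
    return p.prod(axis=1)

def select_epsilon(p_val, y_val):
    """Pick the density threshold that maximizes F1 on a labeled validation set."""
    best_eps, best_f1 = 0.0, 0.0
    for eps in np.linspace(p_val.min(), p_val.max(), 1000):
        pred = p_val < eps  # flagged as anomalous when density falls below threshold
        tp = np.sum(pred & (y_val == 1))
        fp = np.sum(pred & (y_val == 0))
        fn = np.sum(~pred & (y_val == 1))
        if tp == 0:
            continue
        prec, rec = tp / (tp + fp), tp / (tp + fn)
        f1 = 2 * prec * rec / (prec + rec)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1

# Toy server-metrics data: normal pings cluster tightly; anomalies sit far away.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, (200, 2))
mu, var = X.mean(axis=0), X.var(axis=0)

X_val = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(6.0, 0.5, (5, 2))])
y_val = np.array([0] * 50 + [1] * 5)
eps, f1 = select_epsilon(gaussian_pdf(X_val, mu, var), y_val)
assert f1 > 0.9  # the far-off points are cleanly separable here
```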

Built a PCA algorithm to reduce the dimensionality of a face-image dataset from 1024 to 100 dimensions while retaining 95% of the variance in the dataset.
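
The variance-retention criterion can be sketched as follows (Python rather than the original Octave, with synthetic data standing in for the 1024-dimensional face images):

```python
import numpy as np

def pca_reduce(X, var_retained=0.95):
    """Project X onto the smallest number of principal components
    that retains the requested fraction of total variance."""
    Xc = X - X.mean(axis=0)                    # center the data first
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)      # variance fraction per component
    k = int(np.searchsorted(np.cumsum(explained), var_retained)) + 1
    return Xc @ Vt[:k].T, k

# Toy stand-in for the face data: 20 informative directions embedded in
# 200 dimensions, so far fewer than 200 components carry 95% of the variance.
rng = np.random.default_rng(0)
latent = rng.normal(0, 1, (300, 20))
mixing = rng.normal(0, 1, (20, 200))
X = latent @ mixing + rng.normal(0, 0.01, (300, 200))

Z, k = pca_reduce(X)
assert k <= 20          # 95% of variance lives in the 20 informative directions
assert Z.shape == (300, k)
```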

Introduction to Data Science, University of Michigan (Coursera Certification) - Dec 2017

Completed basic Python programs using strings, functions, lists, dictionaries, and the Series and DataFrame data structures.

Used the pandas library for math functions; importing and manipulating data files; indexing and querying; merge and join operations; and slicing and dicing DataFrames.

Analyzed data with group-by and applied aggregate functions to DataFrames.
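
A small pandas sketch of the group-by/aggregate and merge patterns above, on an invented mini-dataset:

```python
import pandas as pd

# Hypothetical mini-dataset in the spirit of the coursework exercises.
orders = pd.DataFrame({
    "customer": ["ann", "ann", "bob", "bob", "cat"],
    "amount":   [10.0,  20.0,  5.0,   15.0,  7.5],
})
regions = pd.DataFrame({
    "customer": ["ann", "bob", "cat"],
    "region":   ["east", "west", "east"],
})

# Group-by with an aggregate function, then a join back to another frame.
totals = orders.groupby("customer", as_index=False)["amount"].sum()
merged = totals.merge(regions, on="customer", how="left")

assert merged.loc[merged.customer == "ann", "amount"].iloc[0] == 30.0
assert merged.loc[merged.customer == "bob", "region"].iloc[0] == "west"
```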

Interpreted data to evaluate hypothesis tests, such as testing a null hypothesis.

Healthcare Skills

• Healthcare Standards

HL7, CCDA, UDMH framework

Healthcare Tools

Epic Clarity, Epic Hyperspace, Cerner, McKesson Star

Technical Skills


R, Python, C#, SQL, Unix Scripting, C, C++, Core Java

Tools & Database

SQL Server 2014/2012, BIDS 2010, Oracle 9i/10g, DB2 10.5, Performance Dashboard, SQL execution plans, SQL Server Profiler, RapidSQL, Octave, Jupyter, RStudio, WinSCP API

Management Tools

IBM IDA (InfoSphere Data Architect), JIRA, Git repository, MS Excel, FastTrack

Professional Experience

CitiusTech Healthcare Technology, Rochester, MN Feb 2016 – Present

Designation: Technical Lead

Project: Mayo Clinic, Epic Retrofit (Supply Plus & Population Health) June 2017 – Present

Technologies: DB2 10.5, RapidSQL, FastTrack, DataStage 9.5, Epic Clarity and Person system, Epic Hyperspace, Business Objects reports


Monitoring and improving data quality for hospital supplies and access management Epic data. Validating data, analyzing metrics, and finding solutions to outstanding data issues.

Analyzing report metrics to develop data validation queries for the hospital supplies database sourced from Epic's Clarity database. Validated data from the old SIMS (supply inventory management system) to replicate a similar grain for the new Clarity data coming from Epic (Chronicles).

Developing source queries that involve supplies, manufacturers, supply types/charges, locations, departments, sites, specialties, providers, and procedures from the Epic database.

Coordinating with the client to analyze gaps in the reports and find their source, and with ETL developers to fix the gaps.

Analyzing access management data, which involves patient appointments, encounters, provider care teams, sites, and departments. Analysis involves monitoring staff on-floor time, specialty/procedure time, inpatient hospital time, offsite care time, and personal time.

Monitoring data quality of provider visit types and unavailable-reason free text coming from Epic templates and Cerner schedule blocks. Also coding missing visit types to their correct time slots.

Also monitoring data for providers who use specialty visit types or blocks in primary care departments. Provided monthly reports to management, which helped train operational staff on using the correct visit types.

Project: DaVita Medical Group (NDW using UDMH) Jan 2017 – May 2017

Technologies: Oracle 11g, Informatica 10.1, Unified Data Model for Healthcare (UDMH), IBM Info-Sphere Data Architect 9.1 (IDA), JIRA


Coordinated with the client and SMEs to create source-to-target mapping sheets and the database design using UDMH and IDA. Data analysis included identifying the correct business keys, lookups, business logic, and expressions to map data correctly using the UDMH framework.

Provided walkthrough sessions to hand over the model design and requirements to the ETL and QA teams. The design included entities such as anchor, detail, bridge, and array tables.

Project: Geisinger Subscription Notification for KeyHIE Feb 2016 – Jan 2017

Technologies: MS SQL 2014 Standard Edition, SSIS, Visual Studio 2013, T-SQL, UDFs, JIRA, Java, FTPS using WinSCP API, Eclipse Mars, Kafka, Storm, Git


Worked as lead database developer, creating the transactional model for Geisinger Health System; setting data lineage across the database to handle data load failures; implementing archiving and indexing strategies; and monitoring and improving database performance.

Developed a transactional data model for CCDA (Consolidated Clinical Document Architecture) documents and for storing logs, messages, and subscription notifications, involving 150 tables covering 2,500 attributes.

Performed database and data migration (23 million rows totaling 75 GB, including 36 GB of BLOB data) using scripts to move the project from phase 2 to phase 3 in production. Base and delta scripts were versioned in sync with every application bug-fix release.

Optimized database performance by implementing table partitioning and tuning archiving stored procedures and UI queries to handle up to 33 GB of data daily. Improved application performance with indexing, fill-factor settings, and data-page compression, which reduced storage by 17%; tuned application queries to use index seeks.

Created SSIS packages to extract flat files and send them over FTPS; created a POC model to load delimited files into the database using SSIS packages.

Monitored database performance during volume testing at a peak application load of 35,000 messages/hour and an archiving peak load of 18,000 messages/hour.

Med-Metrix, Parsippany, NJ Jan 2013 – Jan 2015

Designation: Database Developer II, Database Developer I

Project: Patient Billing data Integration for MPower Suite

Technologies: MS SQL Server 2014/2012, BIDS 2010 (SSIS, SSRS), C# scripting, SSIS expression builder


Monitored and supported the data warehousing and ETL processes loading daily data for various hospitals. Managed database integration issues, including migration between disparate databases, integration, maintenance/conversion, capacity planning, and new applications.

Created SSIS packages to load data from flat files and Oracle data sources into staging tables and then into transactional tables. Performed data quality checks and scheduled the packages in production.

Implemented logging tables that recorded the time taken by control tasks and stored procedures across all packages. This helped tremendously in tracking procedure run times and optimizing them.

Scheduled productive sessions with the team to share new SQL Server built-in functions, use of available control flow tasks for data transformation, code-quality improvements, interpretation of query execution plans, and query performance tuning with indexes.

Used CTEs that improved the performance of complex logic by over 30% in runtime. Developed complex queries that validated anomalies in charges and transactions.

Used various SSIS objects in data loading packages, such as control flow components, data flow components, connection managers, error logging, and configuration files.

Scheduled and maintained packages using SQL Server Agent. Created database maintenance packages using shrink-database and rebuild-index tasks for all databases.

Implemented error-handling logic in all SSIS packages to route bad data to error flows and avoid package failures.

Zodiac Solutions, Malvern, PA Sep 2012 – Dec 2012

Designation: SQL Developer

Project: Healthcare Contract Management

Technologies: MS SQL Server 2008/2005, BIDS 2005 (SSIS, SSRS), C#, Teradata V2R6


Fetched data from Teradata using the Microsoft Attunity connector and parsed report files (non-delimited files). Performed data quality checks in staging tables to ensure correct data was loaded for analytics.

Created a C# script to iterate over flat files and parse date, time, and decimal values, converting them to ASCII characters from report files provided by the client.

Created ad-hoc queries to check for missing master codes, counts and duplicates in transactional tables, min/max date ranges for missing data files, and null percentages in columns.
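
Those checks translate directly into pandas as well as T-SQL; an illustrative sketch with invented table and column names:

```python
import pandas as pd

# Invented sample of a transactional table with typical quality problems.
txns = pd.DataFrame({
    "txn_id":      [1, 2, 2, 3, 4],
    "master_code": ["A10", None, None, "B20", "ZZZ"],
    "txn_date":    pd.to_datetime(["2014-01-02", "2014-01-05", "2014-01-05",
                                   "2014-02-01", None]),
})
master = pd.DataFrame({"master_code": ["A10", "B20"]})

# Duplicate keys, null percentage per column, date range, and missing master codes.
dup_count = txns["txn_id"].duplicated().sum()
null_pct = txns.isna().mean() * 100
date_min, date_max = txns["txn_date"].min(), txns["txn_date"].max()
matched = txns.merge(master, on="master_code", how="left", indicator=True)
unmatched = (matched["_merge"] == "left_only").sum()

assert dup_count == 1
assert null_pct["master_code"] == 40.0
assert unmatched == 3   # the two null codes plus the unknown code "ZZZ"
```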

Developed stored procedures for client reports. Built various drill-down, parameterized, and linked reports using SSRS.

Created data model documentation, such as ER diagrams showing cardinalities and table and attribute definitions.

Auberle – Pittsburgh, PA Sept 2011 – Apr 2012

Designation: SQL Developer

Project: Data Management for Students Information System

Technologies: MS SQL Server 2005, SSRS, Teradata V2R6


Created and maintained databases for reporting on student data, built class and teacher schedules, and provided statistical reports.

Exported and imported data between text files, Excel, and the SQL Server database using BULK INSERT and the BCP utility.

Optimized query performance by modifying T-SQL queries, normalizing tables, establishing joins, and creating indexes where necessary.

Found bottlenecks in queries and stored procedures and fine-tuned them. Monitored and tuned SQL scripts. Created upgrade scripts for the production database and supported them.


Machine Learning, Stanford University (Coursera), Jan 2018

Introduction to Data Science, University of Michigan (Coursera), Dec 2017

R Programming, Johns Hopkins University (Coursera), Aug 2017

Oracle Certified Associate (OCA), Oracle, July 2009

Oracle 9i, DBA Fundamentals II, Oracle, Dec 2009

C Programming, NIIT, April 2008
