Hemal Patel
***********@*****.***
Big Data / Data Engineer
Designer, builder and manager of Big Data infrastructures
A collaborative engineering professional with substantial experience designing and executing solutions for complex business problems involving large-scale data warehousing, real-time analytics and reporting. Known for using the right tools when and where they make sense and for creating intuitive architectures that help organizations effectively analyze and process terabytes of structured and unstructured data.
Competency Synopsis
Data Warehousing
Proven history of building large-scale data processing systems and serving as an expert in data warehousing solutions while working with a variety of database technologies. Experience architecting highly scalable, distributed systems using different open source tools, as well as designing and optimizing large, multi-terabyte data warehouses. Able to integrate state-of-the-art Big Data technologies into the overall architecture and lead a team of developers through the construction, testing and implementation phases.
Databases and Tools: MySQL, MS SQL Server, Oracle, DB2, SAP HANA, Vertica, Greenplum and Teradata; NoSQL: HBase, MongoDB; HDFS; Pentaho.
Data Analysis
Consulted with business partners and made recommendations to improve the effectiveness of Big Data, descriptive analytics and prescriptive analytics systems. Integrated new tools and developed technology frameworks/prototypes to accelerate the data integration process and empower the deployment of predictive analytics. Working knowledge of machine learning and predictive modeling.
Tools: Hive, Pig, Hadoop Streaming, MapReduce, Spark and Kafka.
Data Transformation
Experience designing, reviewing, implementing and optimizing data transformation processes in the Hadoop and Informatica ecosystems. Able to consolidate, validate and cleanse data from a vast range of sources – from applications and databases to files and Web services.
Tools: Ascential Software’s DataStage, Informatica’s PowerCenter RT, Pentaho Kettle, SSIS and shell scripting; Linux/UNIX commands.
Data Collection
Capable of extracting data from an existing database, Web sources or APIs. Experience designing and implementing fast and efficient data acquisition using Big Data processing techniques and tools.
Tools: APIs and SDKs.
Professional Experience
MedeAnalytics Inc., Emeryville, CA, August 2015 to present
Big Data Engineer/Architect
Highlights:
●Designed a large data warehouse using star and snowflake schemas.
●Designed and developed a Big Data analytics platform for processing customer viewing preferences using Scala, Hadoop, Spark and Hive.
●Gathered data for analytics across different client sectors, such as retail and employee reporting.
●Integrated Hadoop with Teradata, accelerating the extraction, transformation and loading of massive structured and unstructured data.
●Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
●Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
●Loaded the aggregate data into a relational database for reporting, dashboarding and ad hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming.
●Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation and queries, writing data back into OLTP systems.
●Implemented best income logic using Pig scripts.
●Implemented test scripts to support test-driven development and continuous integration; responsible for managing data coming from different sources.
●Installed and configured Hive and wrote Hive UDFs.
●Tuned Spark application performance by setting the right batch interval, the correct level of parallelism and appropriate memory configuration.
●Created reports and dashboards using structured and unstructured data in different tools like Tableau.
●Designed and developed Spark jobs in Scala to compare the performance of Spark with Hive and SQL.
●Developed Spark scripts in the Scala shell as per requirements.
●Generated dashboards with quick filters, parameters and sets to handle views more efficiently.
●Generated context filters and data source filters while handling huge volumes of data.
●Built dashboards for measures with forecasts, trend lines and reference lines.
●Created different visualizations using bars, lines, pies, maps, scatter plots, Gantt charts, bubbles, histograms, bullet charts, heat maps and highlight tables.
Delta Dental, San Francisco, CA, October 2013 to July 2015
Big Data Engineer/Architect
Highlights:
●Strong understanding of data warehouse concepts, ETL, star schema, snowflake schema, and physical and logical data models.
●Used Informatica IDQ to perform ETL.
●Configured a SQL database to store Spark metadata.
●Designed a data analysis platform for the business using Python, Hadoop and Spark.
●Loaded unstructured data into the Hadoop Distributed File System (HDFS) and Teradata.
●Wrote MapReduce jobs using Pig Latin.
●Wrote Hive queries for data analysis to meet business requirements.
●Created Hive tables and worked with them using HiveQL.
●Imported and exported data into HDFS and Hive using Sqoop.
●Performed advanced procedures such as text analytics and processing, using Spark's in-memory computing capabilities with Python.
●Handled large datasets using partitions, Spark's in-memory capabilities, broadcasts, and effective and efficient joins and transformations during the ingestion process itself.
●Experience with scripting on the LAMP stack (Linux, Apache, MySQL, PHP).
●Created reports and dashboards using structured and unstructured data in different tools like Tableau.
●Generated dashboards with quick filters, parameters and sets to handle views more efficiently.
●Generated context filters and data source filters while handling huge volumes of data.
●Built dashboards for measures with forecasts, trend lines and reference lines.
●Created different visualizations using bars, lines, pies, maps, scatter plots, Gantt charts, bubbles, histograms, bullet charts, heat maps and highlight tables.
Pacific Gas and Electric, Concord, CA, February 2010 to September 2013
Data Architect / Developer, Database and Enterprise Architect
Responsible for the architecture and design of the data storage tier for this third-party provider of data warehousing, data mining and analysis.
Highlights:
●Gathered business requirements, defined and designed data sourcing and data flows, performed data quality analysis, and worked with the data warehouse architect on the development of logical data models.
●Used Erwin for data modeling.
●Created complex stored procedures, triggers, functions, indexes, tables, views and other T-SQL code and SQL joins for applications.
●Implemented database standards and naming convention for the database objects.
●Established data granularity standards, and designed and built star and snowflake dimensional models.
●Developed Informatica workflows to extract, transform and load (ETL) data into the data warehouse from heterogeneous databases and data sources.
●Used the Java Transformation and wrote Java code for XML parsing, writing to files, etc.
●Designed star schema models, creating facts, dimensions, measures and cubes, and optimized data connections, data extracts, background task schedules and incremental refreshes for the weekly and monthly dashboard reports on Tableau Server.
●Used Excel sheets, flat files and CSV files to generate ad hoc Tableau reports; created calculated fields, mappings and hierarchies.
●Generated context filters and used performance actions while handling huge volumes of data; built Tableau sales dashboards with forecasts and reference lines. Proficient in working with large databases in DB2, Oracle, Teradata and SQL Server.
●Strong understanding of data warehouse concepts, ETL, star schema, snowflake schema, and physical and logical data models.
●In-depth knowledge of normalization and of fact and dimension tables.
●Created complex stored procedures, triggers, cursors, tables, views and other SQL joins and statements for reporting application development.
●Built queries to retrieve data into Tableau from Oracle and developed SQL (ETL) statements for loading data into the target schema.
●Strong analytical, problem-solving and SQL debugging skills in Tableau; drove informed decisions by analyzing business and product performance.
Bank of America, Los Angeles, CA, January 2009 to January 2010
Data Developer
Built business intelligence infrastructure to support the acquisition and maintenance of private banking customers, including the storage and retrieval of market intelligence and the execution of predictive and prescriptive analytics.
Highlights:
●Gathered business requirements, defined and designed data sourcing and data flows, performed data quality analysis, and worked with the data warehouse architect on the development of logical data models.
●Used Erwin for data modeling.
●Created complex stored procedures, triggers, functions, indexes, tables, views and other T-SQL code and SQL joins for applications.
●Implemented database standards and naming convention for the database objects.
●Established data granularity standards, and designed and built star and snowflake dimensional models.
●Developed SSIS packages to extract, transform and load (ETL) data into the data warehouse from heterogeneous databases and data sources.
●Designed star schema models, creating facts, dimensions, measures and cubes in SSAS.
●Developed drill-down and drill-through reports from multidimensional objects such as star and snowflake schemas using SSRS and SharePoint Server.
●Designed aggregations and pre-calculations in SSAS.
Education and Training
University of Findlay, Master of Business Administration, May 2010
IGNOU (Indira Gandhi National Open University), Master of Computer Applications, June 2006
Sandra Patel University, Bachelor of Science, June 2002