Engineer Data

Location:
Wesley Chapel, FL
Posted:
February 07, 2020


ARNOLD DAJAO

Tampa, FL C: 813-***-**** adbnzu@r.postjobfree.com

PROFESSIONAL SUMMARY

Highly analytical Big Data Engineer/Developer/Architect with extensive experience in Big Data/Hadoop development and Scala programming. Enthusiastic team player with over 20 years of experience designing and developing innovative solutions to unusual and difficult problems. Passionate, innovative and eager to work on emerging and open-source Big Data technologies.

TECHNICAL SKILLS

Big Data: Spark, Impala, Hive, Hadoop/HDFS, Tez, ML/AI, AWS (EMR, S3, EC2)

Streaming: Spark Streaming, Kafka, Flume, AWS Kinesis

NoSQL: HBase, DynamoDB, Cassandra

RDBMS: MySQL, Oracle, SQL Server, SQLite, Access, AWS RDS

Business Intelligence Tools: Tableau, Zoomdata

Programming Languages: Scala, Python, Java, SQL, Perl, Shell

Others: Jira, Git, Jenkins CI/CD, Docker, IntelliJ

EDUCATION

11/2017 - 11/2020 Master of Science: Computer Science/Data Science

Colorado Technical University: Colorado Springs, Colorado, USA

06/1989 - 03/1995 Bachelor of Science: Electronics and Communications Engineering

Mapua Institute of Technology: Manila, Philippines

EXPERIENCE

04/2019 to 01/2020 Sr Big Data Developer/Engineer (Remote Big Data Support)

Kelly Mitchell for Bayer – Maryland Heights, MO

Developed and maintained Big Data ETL/ELT pipelines to ingest large data from heterogeneous data sources through AWS EMR with Hive and Spark, improving the data ingestion process by almost 50%.

Improved the process for several data engineering tasks to de-duplicate, transform, cleanse and enrich data for the Data Science team.

Successfully advocated the migration of Apache Hive data processing to Apache Spark with Scala, improving data processing run time by as much as 90%.

Significantly improved overall code quality, code coverage and code reusability by adopting TDD and Clean Code principles and processes.

10/2018 to 04/2019 Sr Big Data Developer/Engineer

Iris Software for Citi – Tampa, FL

Developed and maintained a Hadoop ETL/ELT framework to ingest large data from disparate data sources using Hive and Spark, reducing run time by 25%.

Successfully transitioned to Scala and Spark to improve productivity and testability, producing concise, testable code that adheres to Clean Code principles and reducing bugs by up to 80%.

Evangelized TDD and Clean Code principles for Scala and Spark to improve productivity in developing concise, testable, sustainable code.

10/2016 to 10/2018 Sr Software/Data Engineer (Remote Big Data Support)

T-Mobile – Bellevue, WA

Successfully evangelized TDD and Clean Code practices in Scala and Spark, doubling (2x) development productivity, with fewer bugs reaching production once 80% code coverage was achieved.

Improved data delivery to internal customers, including data scientists, by as much as tenfold (10x) on the Fastdata Platform, using the SMACK stack to process multi-petabyte data.

Developed and maintained a Hadoop ETL/ELT framework to ingest large data from disparate data sources using Hive, Tez and Spark, improving run times by up to 50% over the existing code.

Developed machine learning and AI applications in the Azure and AWS clouds for self-service data delivery, increasing productivity by 30% and reducing service desk manpower by 40%.

06/2015 to 09/2016 Big Data Architect/Engineer

Tata Consultancy Services for Nielsen Media Research - Tampa, FL

Developed a Hadoop ETL/ELT framework to ingest large data from disparate data sources using Hive, Impala, Sqoop and Spark, improving run times by up to 200% over the legacy ETL.

Designed and developed cost-efficient, faster cloud data ingestion using AWS EMR reserved and spot instances, handling twice (2x) the data of the on-premises cluster with 10% faster ingestion.

Refactored the existing architecture into a more scalable and fault-tolerant Lambda Architecture that separates the batch, serving and speed layers.

04/2014 to 04/2015 Sr Software/Big Data Developer V (Remote Big Data Resource)

JMA International for T-Mobile - Snoqualmie, WA

Designed, developed, tested and implemented Java and Scala applications to ingest large data from different sources using the Hadoop Distributed File System (HDFS), MapReduce, Spark and Impala, improving data processing by 40% over the legacy Perl ETL framework.

Implemented Spark, Impala, Hive, HBase and HDFS performance tuning to achieve as much as 100% improvement in query and MapReduce job execution time.

Performed system monitoring and management of Hadoop cluster, RHEL Linux collection cluster and Apache Web server.

Created a prototype/proof-of-concept application for the Business Intelligence (BI) platform with Tableau and Zoomdata.

10/2011 to 04/2014 Sr Software/Development Engineer

Worldlink/Nextgen/Glotel - Frisco, TX

Resolved application limitations by moving from the existing application to a more robust Java implementation, reducing execution time by 75%.

Designed, developed and implemented cloud computing services to build a fully responsive application for a Performance, Configuration and Fault Management System, resulting in a 20% reduction in overhead cost.

Designed and developed a POC and POT for an ETL platform that ingests large amounts of data into a Hadoop data warehouse.

Updated a legacy application to a more responsive and dynamic web application using HTML5, JavaScript, jQuery, AngularJS, Node.js, AWS DynamoDB, AWS EC2, AWS EMR and Elastic Beanstalk.

Managed an Agile team that designed, developed and implemented a responsive application for an Operation and Maintenance (O&M), Performance, Configuration and Fault Management (Alarm) System using open-source software such as Java, Perl, Bootstrap and MySQL.

03/1996 to 07/2011 Developer/UTRAN/BSS/Transport/OSS Engineer

Ericsson/AT&T/Nokia/Globe Telecom – Various Locations

Designed, developed and implemented Java, Perl and SQL applications for a Performance, Configuration and Fault Management System, supporting network rollout and automated operation and maintenance (O&M) tools to improve execution time and streamline process workflow.


