Alvaro Mercado

Address

**** **** * ******* ****., Apt 1210

Philadelphia, PA 19103

Cell: 310-***-****

ad2c64@r.postjobfree.com

RECENT RELEVANT WORK EXPERIENCE

Blockchain NYC, New York, NY

ML & AI Consultant 6/2023-10/2023

- Worked with the founder on the launch of their new crypto platform on Lever.io, supporting the effort to add GenAI features through prompt engineering of crypto marketing information

- Used the OpenAI API and Jupyter notebooks to develop and test the LLM for data quality and for customized GenAI responses to input prompts on the startup's crypto marketing text (see the sketch after this list)

- Worked with the web application development team on embedding the prompt features with the GPT-4 model or a similar LLM to enhance the application's UX/UI requirements

- Implemented the H2O.ai h2oGPT platform for prompt tuning and benchmark testing of open-source LLMs, from Llama 2 to Falcon, for performance
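
A minimal, hypothetical sketch of the notebook prompt-testing loop described above. It assumes the pre-v1 openai Python SDK that was current in 2023; the prompts, model choice, and API key are illustrative placeholders, not the startup's actual code.

```python
# Hypothetical prompt-quality check against the OpenAI API
# (pre-v1 openai SDK, circa 2023; prompts and key are placeholders).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

test_prompts = [
    "Summarize this crypto marketing copy in two sentences: ...",
    "Rewrite this platform announcement for a retail audience: ...",
]

for prompt in test_prompts:
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,  # low temperature for repeatable quality checks
    )
    print(response["choices"][0]["message"]["content"])
```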

Jefferson Frank & Missioncloud, El Segundo, CA

Cloud and Data Consultant 4/2023-8/2023

- Worked through the Frank Group to support replacing the legacy BI and data pipeline services of the healthcare client, CPS, with equivalent AWS managed services

- Determined the AWS configuration for Athena and QuickSight to replicate the dashboard analytical insights offered by GCP Looker

- Created the architecture for migrating Looker data model files to native SQL in Athena

- Determined the best AWS infrastructure setup for deploying a Pentaho DI Docker image for ETL pipelines with AWS RDS

Nalyzer LLC, Boston, MA

Cloud DevOps Consultant 1/2023-3/2023

- Supported the lead engineer on new infrastructure deployments for life sciences research groups on GCP and AWS

- Created Dockerfiles and images for deploying new applications on serverless cloud platforms such as GCP Cloud Run and AWS ECS with Fargate

- Developed scripts to automate Git operations such as branching, committing new files, and pushing to the main repositories for developer teams using a cloud IDE

- Provisioned cloud infrastructure such as VM instances and database systems using the Pulumi framework in Python for IaC (see the sketch below)
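
A minimal sketch of the Pulumi-in-Python IaC pattern referenced in the last bullet; the resource names, zone, machine type, and database tier are illustrative assumptions, not the actual deployment code.

```python
# Hypothetical Pulumi program provisioning a VM and a managed Postgres
# instance on GCP; all names and sizes are placeholders.
import pulumi
import pulumi_gcp as gcp

# Small VM instance for a research workload
vm = gcp.compute.Instance(
    "research-vm",
    machine_type="e2-medium",
    zone="us-east1-b",
    boot_disk=gcp.compute.InstanceBootDiskArgs(
        initialize_params=gcp.compute.InstanceBootDiskInitializeParamsArgs(
            image="debian-cloud/debian-11",
        ),
    ),
    network_interfaces=[gcp.compute.InstanceNetworkInterfaceArgs(network="default")],
)

# Managed PostgreSQL database system
db = gcp.sql.DatabaseInstance(
    "research-db",
    database_version="POSTGRES_14",
    region="us-east1",
    settings=gcp.sql.DatabaseInstanceSettingsArgs(tier="db-f1-micro"),
)

pulumi.export("vm_name", vm.name)
pulumi.export("db_connection", db.connection_name)
```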

StreamSets, San Francisco, CA

AWS MLOps Integration Engineer 6/2022-12/2022

- Promoted AWS SageMaker Canvas for deployment of ML classification algorithms for specific business use cases

- Integrated StreamSets Data Collector with Canvas for an ML modeling pipeline of laboratory data in a life sciences application

- Ran dataset cleansing and preparation with Alteryx's Trifacta Cloud Wrangler for the ML model, comparing S3 storage data quality against the StreamSets Data Collector output

- Reviewed rejected rows of data from the training set with Trifacta Wrangler

- Drafted a blog post on model deployment using the AWS SageMaker platform, with no-code testing and training for data processed with AWS Pipelines or the StreamSets platform

- Provided feedback to PeerSpot on the experience of running the StreamSets Transformer and Collector modules

Creative Intelligence Services, New York, NY

Cloud Prediction Analytics Engineer 5/2021-6/2022

- Served on a data engineering and analytics team handling the global data feeds for the retail/e-commerce client (Estée Lauder), managing data insights for brand managers

- Worked on a data pipeline modernization project, updating a cloud-based distributed ETL architecture built on Airflow to incorporate serverless functions on GCP

- Extended an open-source BI dashboard (Apache Superset) to integrate with an AI engine for ad hoc analytical predictions and reports comparable to Looker

- Updated and developed GCP Composer service code to ingest SAP and social media marketing data into GCP BigQuery for the historical queries required by data analytics teams

- Updated and developed GCP Composer service code to connect SAP and third-party marketing data through APIs to GCP Cloud Storage for the dbt tasks, analytics, and ML predictions required by data analytics teams

- Advanced the data pipeline modernization project by adding new Kubernetes-driven tasks to the existing Airflow source code, leveraging container images in GCP Container Registry and Cloud Source Repositories for Python-based functional applications (see the sketch below)
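
A minimal sketch of the Kubernetes-driven Airflow task pattern from the last bullet, assuming an Airflow 2.x Composer environment with the cncf.kubernetes provider; the DAG id, image path, and schedule are hypothetical.

```python
# Hypothetical Airflow DAG running a containerized Python transform
# pulled from GCP Container Registry; names and schedule are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)

with DAG(
    dag_id="marketing_ingest",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    transform = KubernetesPodOperator(
        task_id="transform_marketing_data",
        name="transform-marketing-data",
        namespace="default",
        image="gcr.io/my-project/marketing-transform:latest",  # placeholder
        arguments=["--date", "{{ ds }}"],  # Airflow-templated run date
        get_logs=True,
    )
```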

DitoWeb, New York, NY

Cloud ML & Data Engineer 4/2020-4/2021

- Worked on data migration and modernization projects to create a cloud-based distributed SQL database and AI platform service to replace older enterprise legacy databases and analytical systems

- Developed a managed service to query and migrate data from another cloud data lake (AWS S3) or an on-premise environment (VMware) to GCP Cloud Storage, supporting ML model deployment for the clients' data science teams

- Developed a data flow process using Alteryx partner Trifacta (aka Cloud Dataprep) for feature engineering of the dataset, compared against Python code in Airflow DAGs (aka Cloud Composer) on similar data

- Ran the Trifacta data process recipe for the ML model as a job and tested it for outlier detection in newly ingested data

- Developed a managed service to migrate GitHub source code and AWS container images to GCP Cloud Source Repositories and Container Registry for microservices software development teams

- Produced best practices assessments for the biopharma client (Parker Cancer Research Group) transitioning to GKE Kubernetes clusters for their application management

- Worked on ETL processes in Python to ingest and migrate data and tables from PostgreSQL to a distributed SQL engine in Cloud Spanner for a proof of concept

- Set up the SAS cloud analytics platform on a GCP instance for testing with the managed services

- Created predictive models for SAS using the Python interfaces to SAS packages

- Set up a benchmarking demo to compare SAS performance with GCP ML services

- Worked on building business value insights services by implementing Google Cloud AutoML and other no-code Cloud AI platforms with cloud data warehouses

- Served on the migration team transferring SQL Server warehouses to cloud infrastructure for a SAS client (Satellites Unlimited), developing Python scripts for file migration and SQL Server database backup images for Cloud SQL restoration

Mondo, New York, NY

Data Science Engineer 2/2019-3/2020

- Developed web app components in Python (Dash/Plotly) for data warehouses (BigQuery) as part of a microservices architecture orchestrated by Kubernetes on GCP

- Tested the web component integration with backend SQL and NoSQL databases and the API endpoints on GCP App Engine and GCP Cloud Run

- Migrated data from Cloud Storage into BigQuery (GCP) for the food service client (Aramark) and extracted transformed data to evaluate app development and dashboard applications on GCP Data Studio versus Looker (see the sketch after this list)

- Developed dashboards in Google Data Studio from BigQuery tables and views for insights into competing food establishments' offerings and price tiers in markets served by the client's food services

- Developed recommender models for the e-commerce client (Bloomingdale's) using AI frameworks (Gluon/MXNet) in Python (collaborative filtering) and PySpark (ALS) that could be deployed as a containerized application or run on the Azure Databricks platform

- Developed Docker container files (GCP) and pushed images to the platform to test containerized machine learning applications

- Wrote Python scripts to process sales and user data for machine learning models on the H2O Driverless AI VM as well as for Azure Spark applications and data lake storage

- Orchestrated workflows using tools such as Visual Studio and serverless apps for deployment on Azure

- Tested the Docker containers and Logic App workflows on Azure resources for job scheduling with existing workflows

- Extracted big datasets through SQL Server queries for object storage jobs for ML models and the on-premise data lake
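
A minimal sketch of the Cloud Storage-to-BigQuery load step referenced above, using the google-cloud-bigquery client; the project, dataset, table, and bucket names are placeholders.

```python
# Hypothetical CSV load from Cloud Storage into BigQuery;
# table and bucket URIs are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

table_id = "my-project.food_service.menu_prices"  # placeholder
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,  # skip the header row
    autodetect=True,      # infer the schema from the files
)

load_job = client.load_table_from_uri(
    "gs://my-bucket/exports/menu_prices_*.csv",  # placeholder
    table_id,
    job_config=job_config,
)
load_job.result()  # block until the load completes
print(f"Loaded {client.get_table(table_id).num_rows} rows")
```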

Access, New York, NY

AWS & Alteryx Engineer - GTO 11/2018-1/2019

- Wrote ETL scripts in R and Python for the digital agency (Greaterthanone, or GTO, with Jazz Pharma), handling digital pharma marketing data for addition to the client's machine learning model and data lake

- Set up the Alteryx application to run in the AWS cloud with on-premise data repositories on the client's MS server drive

- Set up an ETL pipeline with Alteryx GUI nodes for data preparation for the Tableau BI platform

- Integrated the R language and wrote R scripts for custom data transformations in Alteryx workflows

- Aggregated the data files from different e-commerce and digital publishers (e.g., Medscape) and set up the pipeline for visualization with the current Tableau dashboard and other BI tools

- Tested the transformed data by developing Tableau worksheets for specified graph visualizations

- Tested the ETL scripts in the Alteryx Designer workflow manager for job scheduling with existing workflows

- Set up an EC2 instance on AWS to aggregate incoming data into the AWS S3 data repository, mirroring the on-premise data lake

Retention Inc, New York, NY

AWS Cloud Engineer 7/2018-10/2018

- Leveraged AWS services to provide a demo data flow solution to media and pharma clients that were migrating toward a cloud environment

- Set up a streaming data analytics use case with AWS Kinesis and a cloud-based Spark cluster (Databricks) for data ingestion through API calls (website user events, news, and sentiment) using the latest Spark libraries (see the sketch after this list)

- Configured AWS Redshift (Postgres-based) and data modeling for marketing data as well as for logs in DynamoDB (NoSQL)

- Developed a use case with cloud-based (Google Cloud, Azure) Jupyter IDEs and on-premise AutoML libraries (extreme gradient-boosted trees in H2O AI, and TPOT) for pharmaceutical datasets

- Developed a baseline model with the AWS SageMaker ML service for the same pharmaceutical datasets using the AWS AutoGluon framework

- Conducted the baseline model deployment with the container orchestrator on the AWS SageMaker cloud stack to test the tabular predictor with the S3-ingested data

- Tested migration tools for code builds, testing, and deployments using the Dynatrace DevOps-as-a-Service platform on AWS
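
A minimal sketch of the event-ingestion side of the streaming use case above, using boto3; the stream name, region, and event shape are assumptions for illustration.

```python
# Hypothetical producer pushing website user events into Kinesis;
# stream name, region, and event fields are placeholders.
import json

import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

event = {"user_id": "u-123", "action": "page_view", "url": "/products/widget"}

kinesis.put_record(
    StreamName="web-events",                # placeholder stream
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["user_id"],          # shard by user to keep per-user order
)
```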

OTHER WORK EXPERIENCE

EDI, Stamford, CT

Big Data Analyst & Data Scientist – Henkel NA

8/2017-6/2018

Worked with the on-premise enterprise data warehouse on MS SQL Server, developing queries against historical sales data for pipelines on the JVM in KNIME

Wrangled big datasets for HDFS storage on a Hadoop cluster running Java APIs for job scheduling and for data pipelining onto a server hosting the business intelligence visualization platform application

Worked with the on-premise enterprise data warehouse on MS SQL Server, developing dashboards on BI platforms including Tableau and Power BI

Extracted and transformed datasets for loading into Hive using Python libraries such as NumPy, pandas, SciPy, and scikit-learn, along with Spark SQL

Prepared pipelines using Jython orchestration combined with a Python execution engine for data transformations on the JVM for dashboard tools and apps

Created data models and schemas for testing with sample retail datasets on a MySQL instance on GCP before any migration

Implemented forecasting models (ARIMA, Prophet) and supervised learning models (linear models in libraries such as PyStan) for retail stores to predict inventory and sales for supply chain planning

Developed ML models (gradient-boosted trees, random forests) for pricing optimization using tools such as H2O AI that could be deployed on a Hadoop cluster with the JVM

Began developing Spark-based apps (Azure/Databricks) to handle data transformation and loading of new big sales data from various groups onto their own data lake (see the sketch after this list)

Ran ETL tests on financial statement datasets to compare with Databricks ETL performance
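
A minimal PySpark sketch of the Spark transform-and-load pattern referenced above; the paths and column names are illustrative, not the client's schema.

```python
# Hypothetical Spark job aggregating raw sales data for the BI layer;
# paths and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("sales-etl").getOrCreate()

# Read raw sales data from the data lake
sales = spark.read.parquet("/mnt/datalake/raw/sales/")  # placeholder path

# Roll up revenue by region and month
summary = (
    sales.withColumn("month", F.date_trunc("month", F.col("order_date")))
    .groupBy("region", "month")
    .agg(F.sum("revenue").alias("total_revenue"))
)

summary.write.mode("overwrite").parquet("/mnt/datalake/curated/sales_summary/")
```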

Techvanguard, New York, NY

Big Data Analytics Support (Remote – E-Commerce/Energy/Fintech)

10/2013-7/2017

Ran preliminary deep learning model testing (long short-term memory, LSTM) for an investment advisory group on a financial analysis use case (forecasting cryptocurrency prices) for new investment strategies

Optimized a feed-forward deep learning model (linear regression in Keras/TensorFlow) for IoT energy applications with a regressor predictor, for comparison with other models (boosted regression trees, wide feed-forward NN) under implementation in Azure ML

Reviewed chiller power plant data from the power engineering group (Smith Engineering) for sensor anomalies in feature values

Led a team of junior analysts on dynamic pricing analysis of e-commerce data in Python and Spark for the client

Worked with the e-commerce client team at Caprice Electronics on the architecture for a new consumer site using AWS Kinesis, DynamoDB, and Redshift to handle the data warehouse and data feeds

Migrated web-extracted data and supplier data to AWS S3 and extracted views of sales data from the client's ERP platform with SQL Server on the backend

Ran Spark SQL for data analytics on the different datasets and exported filtered views back to the client's platform on SQL Server or to S3 for querying with AWS Athena

Updated and prepared e-commerce pricing reports from raw source data (Excel, CSV) using filtering and joins (the dplyr library in R) for the e-commerce client's (Caprice Electronics) products, ranging from industrial to computer peripheral products

Worked on web data extraction of e-commerce site listings of client and competitor products (Python and Selenium) through a web scraping application for price monitoring

Reshaped price and web data and checked for statistical outliers in R prior to regression analysis

Implemented Spark SQL jobs on a Databricks Spark cluster for querying big datasets of retail financial intelligence data from the website

Wrote pandas scripts to replace legacy Excel functions (e.g., VLOOKUP) for historical product data received from existing supplier database files and competitor reports (see the sketch after this list)

Converted raw data from Excel to SQL (R and sqldf) to update product vendor SKU files and price-test new e-commerce sale files

Developed sales reports from the data to support the sales manager in supplier acquisition and managed the relationships with new suppliers
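
A minimal sketch of replacing an Excel VLOOKUP with a pandas left merge, as referenced above; the file and column names are hypothetical.

```python
# Hypothetical VLOOKUP replacement: join supplier prices onto products;
# file and column names are placeholders.
import pandas as pd

products = pd.read_excel("products.xlsx")             # placeholder file
supplier_prices = pd.read_csv("supplier_prices.csv")  # placeholder file

# VLOOKUP(sku, supplier_prices, price) becomes a keyed left join
merged = products.merge(
    supplier_prices[["sku", "price"]],
    on="sku",
    how="left",  # keep every product row, matched or not
)

# Unmatched SKUs surface as NaN, the analogue of VLOOKUP's #N/A
missing = merged[merged["price"].isna()]
print(f"{len(missing)} products missing supplier prices")
```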

DecisionTrend, Mountain View, CA

Remote NY Technology Market Research & Data Analyst

4/2010-9/2013

Conducted primary market research surveys for data acquisition by telephone, interviewing respondents who used analytical instrumentation

Conducted market research surveys by telephone, interviewing respondents who used Agilent's analytical instrumentation for petrochemical, materials, geological, and environmental applications

Conducted market research analysis by telephone, surveying the needs of scientists using analytical instrumentation for life sciences applications (MALDI-TOF mass spectrometry) in proteomics analysis

Conducted market research analysis of software, such as MRI medical software used for imaging and analysis applications

Communicated with customers about product pain points and preferred solutions for any model upgrades

Identified research objectives based on the client's secondary market research reports

Analyzed the client's secondary market research reports to develop a competitive matrix of the products offered by competing companies

Trained the market research team, through conferences and webinars, on the technical aspects of the hardware and software

Conducted quantitative data analysis of questionnaire results (SurveyMonkey), along with the project manager, using Excel, Access, and sentiment text analysis of transcriptions

Analyzed the data to extract statistical trends in specific user requirements and presented the findings to the technology instrumentation client company

Princeton Review, New York, NY

Part Time Instructor 6/2008-3/2010

Taught and tutored the physical sciences (chemistry and physics) in review courses for the MCAT and SAT II exams

Reviewed the latest version of the AP Physics exam workbook for content accuracy

Chemsil Inc, Chatsworth, CA

Chemometrics Quality Assurance Lead 2/2003-5/2008

Maintained the ERP system and data warehouse for active ingredients and raw material products distributed by chemical manufacturers (e.g., BASF, Rhône-Poulenc, Ciba) for cosmetic, medical device, and topical over-the-counter (OTC) applications

Developed a database application in Sybase SQL for monitoring production and lab quality control data for excipient polysiloxane and polyglycol polymers distributed to clients

Managed and maintained all laboratory systems used in outgoing product testing for calibration and performance reliability under 21 CFR 211

Reviewed the stored data and recommended changes to formulation and production as part of GMP guidelines for Out-of-Specification (OOS) incidents and Corrective and Preventive Action (CAPA) changes

Managed the external testing of new products in support of customer service, preparing material safety data sheets and technical brochures

Led the internal audit team and electronic systems validation group ahead of periodic client reviews of operations

Managed and maintained laboratory systems such as the ATR-FTIR analyzer (Thermo Fisher) used in testing for calibration and performance reliability as part of GLP protocols

Implemented laboratory method validation protocols and an ELN for instrumentation used in research on copolymer and emulsion products, employing ATR-FTIR spectroscopy, optical scattering, and chemometrics techniques

Integrated data acquisition from instruments such as a Thermo Fisher Magna FT-IR spectrometer into the LabTrack electronic laboratory notebook (ELN) as part of the LIMS setup for QC test monitoring


DATA TOOLKIT & TECHNOLOGY STACK

Project Management – Cloud Coach, Salesforce CRM, Trello

Relational Database Management Systems – BigQuery, AWS RDS, Oracle, MS SQL Server, MySQL, PostgreSQL, Spark SQL

Cloud Services, BI & Office Applications – Tableau, Cloudera (Hadoop, Hive, Hue), Alteryx, Pulumi, Jira, AWS (S3, DynamoDB, Athena, RDS, Glue, Redshift), Azure (Blob Storage, Databricks, Logic Apps, Data Studio), GCP Trifacta Wrangler (Dataprep), MS (Azure ML, Power BI, Excel, Visual Studio, Container Service, Office), Google Cloud (GCE, GKE, Cloud Storage, BigQuery), NoSQL (MongoDB Atlas)

Scripting and Programming Stack – Python (pandas, scikit-learn), AWS SageMaker, GCP AI Platform, Python AI frameworks (Keras/TensorFlow), H2O.ai, R (Tidyverse, MXNet), Apache Arrow, SQL, Spark ML, Deeplearning4j, NLP (Hugging Face), Unix/Linux

EDUCATIONAL BACKGROUND

https://github.com/amercado-chemistsclub/ML_science_applications

Completed Developer Certificates & University Degrees

Amazon Web Services, Seattle, WA 2020

- AWS Certified Machine Learning Engineer

https://www.youracclaim.com/badges/dffd4352-9daf-42a2-9885-65a04897c5e1/public_url

Google, Mountain View, CA 2020

- Google Certified Professional Cloud Architect

https://www.credential.net/d74599e3-9410-4df6-a178-99fb6d2fc29c

- Google Professional Data Engineer 2020

https://www.credential.net/739ae68c-90e4-4faa-9fbb-000b45f8d654?

Skymind, Mountain View, CA 2017

- Enterprise Deep Learning with Deeplearning4j

University of Rochester, Rochester, NY 1990

M.S. in Optics

Areas of Concentration: Optical and Digital Image Processing

Stevens Institute of Technology, Hoboken, NJ 1988

B.S. in Engineering

Areas of Concentration: Engineering Physics

REFERENCES

Justin Meloni, Pacbasin Technology, ad2c64@r.postjobfree.com, Tel: 212-***-****

Ahmad Juma, FloatingPointe Inc., ad2c64@r.postjobfree.com, Tel: 212-***-****

Thomas Herlihy, Namely, ad2c64@r.postjobfree.com, Tel: 855-***-****


