


Jeevan K Chevula

ML Engineer | Senior Data Engineer | Data Analyst

I am a professional with ten years of experience in data analysis, business analysis, and data engineering. I am targeting new data engineering assignments where I can apply my skills and experience to transform data into actionable insights and drive decision-making. I am looking to join a forward-thinking environment where I can continue to leverage my existing capabilities and embrace new challenges that enable professional growth while making meaningful contributions to data-centric roles.

Professional Summary

10 years of IT industry experience as a Data Analyst and Data Engineer with working exposure in domains such as Supply Chain Management and Banking and Finance.

Experience in migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse; controlling and granting database access; and migrating on-premises databases to Azure Data Lake Store using Azure Data Factory.

Experience in developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
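
As a minimal illustration of that kind of Databricks job (the mount paths, file names, and the customer_id/amount columns are all invented for the sketch):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("usage-patterns").getOrCreate()

    # Read two hypothetical sources in different file formats
    events = spark.read.json("/mnt/raw/events.json")
    sales = spark.read.parquet("/mnt/raw/sales.parquet")

    # Join and aggregate with Spark SQL to surface usage patterns
    events.join(sales, "customer_id").createOrReplaceTempView("activity")
    usage = spark.sql("""
        SELECT customer_id, COUNT(*) AS touches, SUM(amount) AS total_spend
        FROM activity
        GROUP BY customer_id
    """)
    usage.write.mode("overwrite").parquet("/mnt/curated/usage_patterns")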

Good understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver and worker nodes, stages, executors, and tasks.

Good understanding of business requirements, Warehouse schemas and mapping rules for the implementation of simple to complex ETL designs.

Expert in implementing various business rules for data Extraction, Transformation, and Loading (ETL) between homogeneous and heterogeneous systems using Azure Data Factory (ADF).

Conducted AI research in partnership with various AI collaborators and Adobe's AI/ML team to assess the optimal AI product mix, including the evaluation of Large Language Models (LLMs) for the Firefall architecture stack.

Extensive experience in developing tabular and multidimensional SSAS cubes, aggregations, KPIs, measures, cube partitioning, and data mining models, and in deploying and processing SSAS objects.

Scheduled daily, weekly, and monthly reports on sales and marketing information for various categories and regions, based on business needs, using the Power BI reporting tool.

Experienced in designing and developing cutting-edge generative AI Retrieval-Augmented Generation (RAG) agents for efficient data retrieval and enhanced conversational AI capabilities.
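
As an illustration only, the skeleton of a RAG loop looks like the sketch below; TF-IDF stands in for a real embedding model, the docs list is a toy knowledge base, and in a real agent the final prompt would be sent to an LLM:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = [  # stand-in knowledge base
        "Azure Data Factory orchestrates ETL pipelines.",
        "Snowpipe continuously loads files from S3 into Snowflake.",
    ]

    def retrieve(question, k=1):
        # TF-IDF retrieval stands in for an embedding model here
        vec = TfidfVectorizer().fit(docs + [question])
        scores = cosine_similarity(vec.transform([question]), vec.transform(docs))[0]
        return [docs[i] for i in scores.argsort()[::-1][:k]]

    def answer(question):
        context = "\n".join(retrieve(question))
        # In the real agent, this prompt would go to an LLM
        return f"Answer using only this context:\n{context}\n\nQ: {question}"

    print(answer("How does data get from S3 into Snowflake?"))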

Ability to work independently and as part of a team to accomplish critical business objectives, as well as good decision-making skills in high-pressure, complex scenarios.

Competency with ETL procedures, data loading, and integration in Snowflake.

Exposure to extracting raw data from different sources, establishing connections to various databases, and preparing analysis reports on different parameters. Also possess knowledge of ETL.

Involved in importing data from various source systems and creating dashboards using Power BI.

Good Understanding of Azure Big data technologies like Azure Data Lake Analytics, Azure Data Lake Store, Azure Data Factory.

I have a solid understanding of statistics, probability, sampling techniques, and A/B testing, allowing me to build analytical, pattern-recognition, and predictive models, causal inference models, and risk management models using statistical methods and complex AI algorithms. I am experienced in algorithm design and testing techniques.
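
For instance, a two-sample comparison of the kind used in A/B testing can be run with SciPy; the conversion-rate arrays below are fabricated:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    control = rng.normal(0.10, 0.02, 1000)  # baseline metric (fabricated)
    variant = rng.normal(0.11, 0.02, 1000)  # treatment metric (fabricated)

    # Welch's t-test: does the variant move the metric?
    t_stat, p_value = stats.ttest_ind(variant, control, equal_var=False)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    if p_value < 0.05:
        print("Reject the null: the variant likely moved the metric.")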

Quick learner with the zeal to pick up new technologies and absorb domain and business knowledge based on business needs.

Skilled in developing comprehensive Generative AI web applications utilizing Python and Streamlit alongside Large Language Models (LLMs).

Leading the way in sector-specific advancements within data modernization and generative AI, integrating technologies such as OpenAI, Google Bard, and Azure.

IT Skills

Azure Technology: Azure Databricks (ADB), Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), Service Principal, Azure Key Vault

Database: SQL Server, Azure SQL Database, Azure SQL Data Warehouse

Machine Learning: Supervised & unsupervised learning; predictive & preventive ML models; regression, classification, neural network, bagging and boosting models; scikit-learn, statsmodels.

Deep Learning: TensorFlow, PyTorch, neural networks, computer vision, generative AI, LLMs, ChatGPT, BERT, GANs

Languages: Python, PySpark, SQL and Spark SQL

Reporting Tools: Power BI, Tableau and MS Excel

Versioning Tools: Git, GitHub

MLOps/AIOps Architecture: Technology development using Agile, DevOps, MLOps, CI/CD, KPIs, ROIs, Jenkins, Snowflake, Databricks, RDBMS, SQL, AWS, Azure, GCP, SageMaker.

ETL Architecture: Medallion Architecture

Certification

DP-203 - Azure Data Engineering (Certification No.: 149EAM-E823AF)

Academic Details

MS in Computer Science from Northwest Missouri State University, MO, USA

Professional Experience

Optum, New Jersey June 2023 to Present

GenAI/Data Engineer

Responsibilities:

Implemented and fine-tuned GPT-3.5-turbo, text-davinci-003, and DALL-E 2/3 using the OpenAI API for diverse applications.
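
A hedged sketch of a basic call through the openai Python client, of the kind this bullet refers to; the model name, prompts, and settings are placeholders, and error handling is omitted:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[
            {"role": "system", "content": "You are a concise data assistant."},
            {"role": "user", "content": "Summarize this claims dataset in one line."},
        ],
        temperature=0.2,
    )
    print(response.choices[0].message.content)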

Contributed to Hugging Face open-source projects, showcasing collaborative skills in model development and enhancement.

Applied RAG/connector techniques for improved information retrieval, effectively integrating generative and retrieval-based approaches.

Utilized LangChain to enhance language technologies, showcasing adaptability and proficiency with cutting-edge advancements.

Develop, train, and optimize large language models (LLMs) and GPT-like models using extensive datasets to achieve the highest performance and accuracy.

Apply various data science techniques to extract, cleanse, and reshape data, prioritizing its quality and integrity throughout the process.

Spearheaded the design and construction of a cutting-edge Retrieval-Augmented Generation (RAG) agent using generative AI (GPT).

Proficient in constructing and executing fine-tuning pipelines tailored for Large Language Models (LLMs) on specific datasets.

Skilled in adapting LLMs to address challenges in datasets with extensive data volumes.

Experienced in crafting intricate LLM workflows suited for diverse client requirements.

Proficiently integrate and optimize prompts to improve LLM outputs.

Conducted cutting-edge research in the fields of GEN-AI, LLM, and NLP, contributing to advancements in language processing techniques.

Design and enhance workflows using the RAG method.

Develop sophisticated guardrail workflows to integrate seamlessly into LLM pipelines.

Capable of evaluating cutting-edge tools and integrating them into LLM pipeline designs.

Proficient in discussing LLM design and solutions, and leading technical training sessions.

Seasoned in assessing LLM vulnerabilities, robustness, and performance characteristics.

Skilled in mitigating LLM issues such as hallucinations, biases, and low-quality data.

Lead the conceptualization, design, and development of finely tuned LLM capabilities as well as engineered solutions derived from LLMs.

Contribute to the development of language programming capabilities, statistical analysis, and Natural Language Processing (NLP) techniques.

Develop innovative LLM pipelines optimized for complex client scenarios.

Implement optimal models, prompts, tools, and infrastructure components aligned with specific use case requirements.

Apply responsible AI methodologies to ensure unbiased and trustworthy application of LLMs and mitigate associated risks.

Foster enterprise-wide understanding of LLM guardrails based on emerging research and insights from open-source and commercial models.

Introduce novel optimizations to enhance LLM performance in applied use cases.

Skillfully apply LLM outputs to complex scenarios.

Expertly integrate data retrieval tools within a RAG framework to effectively utilize LLMs for advanced client scenarios.

Provide guidance and mentorship to junior LLM engineers on concepts, methods, and best practices.

Contribute to the design and evaluation of automated tools for integrating guardrails into LLM workflows.

Educate clients on optimal LLM use cases, best practices for prompts, RAG workflows, and guardrail implementation.

Rapidly iterate on data processing and model improvements to enhance LLM pipeline performance for specific use cases post initial testing.

Process datasets for fine-tuning LLMs for targeted use cases.
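
Dataset processing for fine-tuning typically ends in a JSONL file of chat-format records; a minimal sketch, where the pairs list is a hypothetical source:

    import json

    pairs = [  # hypothetical prompt/response source records
        ("What is a deductible?", "The amount you pay before insurance pays."),
    ]

    with open("train.jsonl", "w") as f:
        for question, answer in pairs:
            record = {
                "messages": [
                    {"role": "user", "content": question},
                    {"role": "assistant", "content": answer},
                ]
            }
            f.write(json.dumps(record) + "\n")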

Technology Stack: NLP techniques, LLMs (RAG, connectors, semantic analysis), Python, Postgres, Snowflake on AWS, Alation data catalog, SnowSQL, AWS EC2, S3, AWS Lambda, AWS Secrets Manager, AWS SQS, Adobe Analytics, Linux, scikit-learn, SciPy, NumPy, Pandas, Matplotlib, Seaborn, JIRA, GitHub, Agile/Scrum.

Net Health – Richardson, Texas Dec 2022 to June 2023

ML Engineer/ Python

Responsibilities:

Worked on the development of an internal testing-tool framework written in Python.

Developed a GUI using Python and Django for dynamically displaying block documentation and other features of Python code via a web browser.

Wrote scripts in Python for extracting data from HTML files.
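
A small sketch of that pattern with BeautifulSoup, assuming a local report.html whose table layout is purely illustrative:

    from bs4 import BeautifulSoup

    with open("report.html", encoding="utf-8") as f:
        soup = BeautifulSoup(f, "html.parser")

    # Pull every row of the first table into a list of cell-text lists
    rows = []
    for tr in soup.find("table").find_all("tr"):
        rows.append([td.get_text(strip=True) for td in tr.find_all(["td", "th"])])
    print(rows)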

Developed views and templates with Python and Django's view controller and templating language to create a user-friendly website interface.

Used JavaScript and JSON to update a portion of a webpage.

Troubleshot, fixed, and deployed many bug fixes for applications that were a main source of data for both customers and the internal customer service team.

Handled potential points of failure (database, communication points and file system errors) through error handling and communication of failure.

Proficient in data visualization tools such as Tableau, Python Matplotlib, and Python Seaborn to create visually powerful and actionable interactive reports and dashboards.

Extensive knowledge of UNIX and LINUX operating systems, storage environments, network protocols and file systems.

Experience using Linux for activities such as installation, system integration, image development, patching, and upgrades.

Strong knowledge of regression/linear analysis; familiarity with deep learning, generalized linear models, k-means, and naive Bayes.

Created visualizations, reports, and technical documentation appropriate to data and reporting requirements.

Utilized machine learning techniques such as SVM, naive Bayes, decision trees, random forest, XGBoost, and gradient boosting to classify data.
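
For example, several of those classifiers can be compared on a toy dataset with scikit-learn (XGBoost would slot in the same way via xgboost.XGBClassifier):

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Fit each classifier and report held-out accuracy
    for model in (SVC(), GaussianNB(), RandomForestClassifier(), GradientBoostingClassifier()):
        model.fit(X_train, y_train)
        print(type(model).__name__, round(model.score(X_test, y_test), 3))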

Proficient in core Java concepts such as the Collections Framework, multithreading, generics, and annotations.

Created CloudFormation templates for different environments (dev/stage) to automate infrastructure (ELB, CloudWatch alarms, SNS, etc.) at the click of a button.

Responsible for debugging issues in the project, tracked in JIRA (Agile).

Wrote Python scripts to parse JSON documents and load the data into a database.
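
A runnable sketch of that load pattern, with SQLite standing in for the production database and an invented records.json shape (a list of objects with id and name keys):

    import json
    import sqlite3

    conn = sqlite3.connect("app.db")  # sqlite3 stands in for MySQL here
    conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")

    with open("records.json") as f:
        records = json.load(f)  # assumed: [{"id": ..., "name": ...}, ...]

    conn.executemany(
        "INSERT OR REPLACE INTO users (id, name) VALUES (:id, :name)", records
    )
    conn.commit()
    conn.close()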

Worked on RESTful APIs to access third-party data for internal analysis and statistical representation of data.

Used Python and Django to interface with the jQuery UI and manage the storage and deletion of content.

Automated data flow and data validations on input and output data to simplify the testing process, using shell scripting and SQL.

Built SQL queries for performing CRUD operations (create, read, update, and delete).

Developed entire frontend and backend modules using Python with Django and the Tastypie web framework, with version control in Git.

Validated Sqoop jobs and shell scripts, and performed data validation to check that data was loaded correctly without any discrepancy.

Created a database using MySQL and wrote several queries to extract data from it.

Worked on simple queries in a NoSQL database and wrote stored procedures for normalization and denormalization.

Knowledge of advanced-level programming in C/C++, including thread synchronization, multithreading, multiprocessing, concurrency, and TCP/IP socket programming.

Set up automated cron jobs to upload data into the database, generate graphs and bar charts, upload the charts to a wiki, and back up the database.
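
Illustratively, the chart-generation half of such a cron job might look like this; the crontab line, database, and query are invented:

    # crontab entry (hypothetical): 0 2 * * * /usr/bin/python3 /opt/jobs/nightly_chart.py
    import sqlite3

    import matplotlib
    matplotlib.use("Agg")  # headless rendering for cron
    import matplotlib.pyplot as plt

    conn = sqlite3.connect("app.db")
    rows = conn.execute("SELECT name, COUNT(*) FROM users GROUP BY name").fetchall()
    labels, counts = zip(*rows) if rows else ([], [])

    plt.bar(labels, counts)
    plt.title("Nightly user counts")
    plt.savefig("/tmp/user_counts.png")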

Developed Merge jobs in Python to extract and load data into MySQL database.

Successfully migrated the Django database from SQLite to MySQL, and later to PostgreSQL, with complete data integrity.

Designed the front-end UI using HTML, Bootstrap, Node.js, Underscore.js, AngularJS, CSS, and JavaScript.

Followed AGILE development methodology to develop the application.

Involved in database design and normalization and de-normalization techniques.

Involved in User Acceptance Testing and prepared UAT Test Scripts.

Used a test-driven development (TDD) approach for developing services required for the application.

Implemented Integration test cases.

Used Git for version control and conflict resolution while working on Python and portlet code.

Technology Stack: Python 2.7, Django 1.4, HTML5, CSS, XML, MySQL, JavaScript, AngularJS, jQuery, Bootstrap, Eclipse, Git, GitHub, Big Data, Linux, Shell Scripting.

Vanguard/Capgemini Oct 2021 to July 2022

Sr. Azure Data Engineer

Project Summary:

As an Azure Data Engineer focused on designing, implementing, optimizing, and maintaining data solutions on Microsoft Azure, I have working exposure to data modeling, ETL, data transformation, building and managing data pipelines, data storage, and data processing to enable data-driven decision-making solutions using the Medallion architecture.

Responsibilities:

Analyzed, designed, and built modern data solutions using Azure PaaS services to support visualization of data. Understood the current production state of the application and determined the impact of new implementations on existing business processes.

Competency with ETL procedures, data loading, and integration in Snowflake.

Extracted data with SnowSQL from the internal stage to Snowflake tables.

Used object-oriented programming (Python), Unix scripting, and related programming languages.

Used libraries such as Pandas, NumPy, SciPy, NLTK, and spaCy for data wrangling.

Built ML models such as random forest, XGBoost, LightGBM, and other ensemble modeling algorithms.

Used the COPY, LIST, PUT, and GET commands to validate internal stage files.

Performed imports and exports from the external stage (AWS S3) to the internal stage (Snowflake).

Used the Snowflake Cloud Data Warehouse and wrote intricate SnowSQL scripts for reporting and business analysis.

Accomplished continuous data import from the S3 bucket using Snowpipe.
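
A hedged sketch of the PUT/LIST/COPY workflow via the snowflake-connector-python package; the account, credentials, and table names are placeholders:

    import snowflake.connector

    conn = snowflake.connector.connect(
        user="ANALYST", password="...", account="myorg-myaccount",  # placeholders
        warehouse="LOAD_WH", database="SALES", schema="RAW",
    )
    cur = conn.cursor()
    cur.execute("PUT file:///tmp/orders.csv @%ORDERS")          # upload to the table stage
    cur.execute("LIST @%ORDERS")                                # validate staged files
    cur.execute("COPY INTO ORDERS FILE_FORMAT = (TYPE = CSV)")  # load from the table stage
    cur.close()
    conn.close()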

Created Snowflake stored procedures for branching and looping operations.

Analyzed customer reviews and performed sentiment analysis using NLP. Also worked on text summarization and chatbots using NLP.
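
As a toy version of that review analysis, NLTK's VADER analyzer scores each review's sentiment; the review strings are invented:

    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon fetch
    sia = SentimentIntensityAnalyzer()

    for review in ["Great fund performance!", "Fees are way too high."]:
        print(review, "->", sia.polarity_scores(review)["compound"])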

Extracted, transformed, and loaded data from source systems to Azure Data Storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics). Ingested data to one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.

Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data between different sources such as Azure SQL, Blob storage, and Azure SQL Data Warehouse, including write-back in the reverse direction.

Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.

Developed JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process data using the SQL activity.
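
ADF pipeline JSON of the kind described here follows a standard shape; the sketch below is a simplified, hand-written illustration (activity, procedure, and linked-service names invented), not an exact ADF schema:

    import json

    pipeline = {  # simplified, illustrative ADF pipeline definition
        "name": "LoadDailySales",
        "properties": {
            "activities": [
                {
                    "name": "RunSalesProc",
                    "type": "SqlServerStoredProcedure",
                    "typeProperties": {"storedProcedureName": "dbo.usp_load_sales"},
                    "linkedServiceName": {
                        "referenceName": "AzureSqlLinkedService",
                        "type": "LinkedServiceReference",
                    },
                }
            ]
        },
    }
    print(json.dumps(pipeline, indent=2))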

Involved in gathering product business and functional requirements with the team, updating user comments in JIRA 6.4, and maintaining documentation in Confluence.

Collaborated with internal and client BAs to understand the requirements and architect a data flow system.

Collaborated with the infrastructure, network, database, application, and BI teams to ensure data quality and availability.

Developed T-SQL stored procedures, Triggers, Constraints, and indexes using various DDL and DML commands.

Performed SQL tuning and optimized long-running report queries in SQL Server.

Worked on scheduling the SQL jobs using SQL Server Agent service.

Troubleshot potential problems, created and submitted test reports, and recommended database improvements.

Worked with front-end development team to integrate the modules.

Technology Stack: Python, ML, NLP, Flask, deep learning, SciPy, TensorFlow, Keras, spaCy, LLMs, generative AI, LangChain, BERT, ChatGPT, PaLM, RDS, computer vision, PostgreSQL, Airflow, Tableau, Azure Data Factory, Databricks, Synapse, SQL Server, PySpark

Spsoft Pvt Ltd, India April 2015 to Sep 2021

Data Engineer

Project Summary:

As a Data Engineer working mainly with the Medallion architecture, I enable data-driven decision-making solutions with expertise in data modeling, ETL, data transformation, pipeline building and management, data processing, and data storage. My primary responsibilities as an Azure Data Engineer are developing, implementing, refining, and managing data solutions on Microsoft Azure.

Responsibilities:

Applying Azure PaaS services, analyzed, designed, and developed modern data solutions that facilitate data visualization. Assessed the current production state of the application and how new implementations would affect existing business processes.

Using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics), extracted, transformed, and loaded data from source systems to Azure Data Storage services, then processed the data in Azure Databricks following ingestion to one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, and Azure DW).

Wrote multiple Python/API scripts to connect to Tableau and refresh or extract data.
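
One common way to script such a refresh is the tableauserverclient package; a sketch with placeholder server URL, credentials, and datasource name (exact calls vary by server version):

    import tableauserverclient as TSC

    auth = TSC.TableauAuth("analyst", "secret", site_id="finance")  # placeholders
    server = TSC.Server("https://tableau.example.com", use_server_version=True)

    with server.auth.sign_in(auth):
        datasources, _ = server.datasources.get()
        target = next(ds for ds in datasources if ds.name == "DailyPositions")
        server.datasources.refresh(target)  # kicks off an extract refresh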

Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data from many sources, including Azure SQL, Blob storage, and Azure SQL Data Warehouse, and to write back in the reverse direction.

Built Spark applications for data extraction, transformation, and aggregation from various file formats using PySpark and Spark SQL, then analyzed and transformed the data to reveal insights into client usage trends.

Experience in object-oriented programming (Scala, Python), Unix scripting, and related programming languages, with exposure to Python's ML ecosystem (NumPy, Pandas, scikit-learn, TensorFlow, etc.).

Participated in gathering team input for product business and functional requirements, updated user comments in JIRA 6.4, and maintained Confluence documentation.

Practical expertise in data cleansing: eliminating mistakes, inconsistencies, and duplication to prepare data for analysis. Used exploratory data analysis (EDA) to understand the properties of the data and spot patterns and trends.

Good experience with ODBC, JDBC, and other connection properties for data extraction from various data sources, including databases, flat files, and web scraping.

Worked with the infrastructure, network, database, application, and business intelligence teams to guarantee data availability and quality.

Technology Stack: Azure Data Factory, Databricks, Azure Synapse, Data Lake, Blob Storage, ETL, Python, PySpark, TensorFlow

Prokarma, India April 2013 to Mar 2015

Data Analyst

Project Summary:

As a Data Analyst focusing mainly on data cleansing, data transformation, visuals, and dashboards, I helped the organization drive insights from its data. My role primarily focused on building Power BI dashboards, MS Excel reports, and partly Tableau reports. I performed requirement gathering, data exploration, advanced analytics, and project management, and contributed to data-driven decision-making processes.

Responsibilities:

Worked with stakeholders to gather business requirements for data cleansing, data transformation, data analysis, data visualization, and dashboarding using Power BI and MS Excel.

Participated in requirements meetings and Scrum Calls to understand the Report Requirements.

Identified and documented detailed business rules. Used Power Query (Query Editor) for data transformation as per the reporting needs. Created a data model in Power Pivot by establishing relationships between the tables.

Created new columns and new measures over data models using DAX to enhance Power BI visuals. Developed two to three visual options for each report for discussion with stakeholders and customers.

Created Microsoft Power BI dashboards in the Power BI Service using reports. Provided continued maintenance and bug fixes for existing and new Power BI reports.

Hands-on experience in data cleansing, preparing data for analysis by removing errors, inconsistencies, and duplicates, and performing exploratory data analysis (EDA) to understand the characteristics of the data and identify patterns and trends.
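
A compact pandas version of that cleansing/EDA pass, with a made-up sales.csv as input and the region/amount columns as assumptions:

    import pandas as pd

    df = pd.read_csv("sales.csv")            # hypothetical input file

    df = df.drop_duplicates()                # remove duplicate rows
    df = df.dropna(subset=["region"])        # drop rows missing a key field
    df["amount"] = df["amount"].fillna(0)    # patch missing numeric values

    print(df.describe())                     # quick EDA: distribution summary
    print(df.groupby("region")["amount"].sum())  # pattern check by region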

Good exposure to data extraction from various data sources such as databases, flat files, and web scraping, using ODBC, JDBC, and various other connection properties.

Building data models using Power BI to organize and structure the data for analysis. Creating visualizations using Power BI to communicate insights to stakeholders clearly and concisely. Developing dashboards and reports using Power BI to track key performance indicators (KPIs) and monitor business performance.

Developed T-SQL stored procedures, Triggers, Constraints, and indexes using various DDL and DML commands.

Performed SQL tuning and optimized long-running report queries in SQL Server.

Worked on scheduling the SQL jobs using SQL Server Agent service.

Troubleshot potential problems, created and submitted test reports, and recommended database improvements.

Worked with the front-end development team to integrate the modules.

Technology Stack: Power BI, SQL, SQL Server, MS-Excel

ad35jb@r.postjobfree.com | www.linkedin.com/in/jeevan-c-a9548a2a7 | +1-303-***-****


