
Data Engineer AWS Cloud

Location:
Rensselaer, NY
Posted:
May 03, 2025


Resume:

Sandhya Nunemunthala

Email: **.*******@*****.***

Mobile: 518-684-5357
LinkedIn: https://www.linkedin.com/in/sandyndegenai/

PROFESSIONAL SUMMARY:

• 10+ years of experience in designing, developing, and maintaining solutions on AWS (Amazon Web Services), GCP (Google Cloud Platform) and Microsoft Azure.
• Certified in both AWS and GCP:
Þ AWS Certified Solution Architect Professional – Certification Link
Þ Google Cloud Certified Professional Data Engineer – Certification Link

• Designed and developed POC Solutions for Gen AI Data Pipeline using AWS Cloud and GCP Cloud.

• Designed and developed multi-tenant data applications in AWS and GCP Cloud supporting Data Warehouse, Data Lake and Data Mart.
Þ In AWS Cloud I have used services like: Redshift, S3, RDS (Oracle, MySQL, MSSQL), Glue (Jobs, Connections, Notebooks, Crawlers & Catalog), EMR Cluster, EC2, Step Function & Airflow, SNS Topic & SQS, Lambda (Functions & Layers), IAM (Roles & Policies), etc.
Þ In GCP Cloud I have used services like: BigQuery, Cloud Storage, Cloud SQL (Oracle, MySQL), Data Flow, Data Proc, Compute Engine, Cloud Composer, Pubsub, Cloud Function, IAM (Roles & Service Accounts), etc.
• Hands-on experience designing and implementing Generative AI pipelines and LLM-based solutions across AWS and GCP cloud ecosystems to support chatbots, agentic workflows, and context-aware automation.
• Designed and developed Proof-of-Concepts (POCs) for Gen AI applications using Amazon Bedrock, SageMaker JumpStart, Titan Embeddings, and Vertex AI with Gemini models to explore use cases in customer service, search augmentation, and document intelligence.
• Created intelligent agents and workflow automations using Bedrock Agents, Amazon Lex, Comprehend, and AWS Step Functions to deliver context-aware, goal-driven task orchestration across Generative AI solutions and cloud-native services.
• Designed and developed highly scalable real-time and batch data applications to support Machine Learning and Data Science use cases in AWS and GCP Cloud.
Þ In AWS Cloud I have used services like: Glue, Lambda, OpenSearch, SageMaker, S3, DynamoDB
Þ In GCP Cloud I have used services like: Cloud Run, Cloud Function, AlloyDB, Vertex AI, Document AI, Gemini, BigQuery
• Designed and developed Model Monitoring solutions to validate model performance and model metrics in AWS Cloud.
• Designed and developed a feature store for Machine Learning and Data Science use cases in AWS Cloud.
• Designed and developed real-time and batch reports and Dashboards in both AWS and GCP Cloud.


Þ In AWS Cloud I have used services like: S3, Lambda, Glue, SNS, Cloud Watch Event, StepFunction and QuickSight
Þ In GCP Cloud I have used services like: Cloud Storage, Cloud Function, Pubsub, Cloud Scheduler, Workflow, Cloud Composer, BigQuery and Power BI
• Extensive work experience implementing continuous integration and continuous deployment (CI/CD) using GitLab, Bit Bucket and Jenkins; also worked extensively with UCD (UrbanCode Deploy).
• Extensive experience in writing Infrastructure as Code (IaC) in Terraform and AWS CloudFormation.
• Designed and developed purpose-built layers using AWS services like RDS (Oracle, MySQL), Redshift, DMS (Data Migration Service), S3 and Cloud Watch alarms.
• Developed complex mappings and loaded data from various sources into the Data Warehouse, using different transformations/stages like Joiner, Transformer, Aggregator, Update Strategy, Rank, Lookup, Filter, Sorter, Source Qualifier, Stored Procedure transformation, etc.
• Extensive working experience in the Apache stack: HDFS, Hadoop, Hive, MapReduce, Spark, Kafka, Zookeeper, Airflow, Flume and Superset.
• Scheduled and orchestrated data applications for both real-time event-triggered jobs and batch jobs in both AWS and GCP Cloud.
Þ In AWS Cloud I have used services like: Cloud Watch Event, Lambda, SNS Topic, Airflow, Step Function, S3, Glue (Jobs, Connections, Notebooks, Crawlers & Catalog), Redshift and Cloud Watch logs
Þ In GCP Cloud I have used services like: Cloud Scheduler, Workflow, Cloud Composer, Pub-Sub, DataFlow, BigQuery and Cloud SQL
• Used QlikView, Kibana, Tableau and Apache Superset for visualization.
• Migrated legacy data applications from the Hadoop ecosystem to AWS Cloud and GCP Cloud.

PROFESSIONAL EXPERIENCE:

JP Morgan Chase & Co, Albany, NY Sep 2021 – Till Date
Role: Senior Data Engineer

Description: As a team we developed a Data Lake and Data Warehouse for Analytics on AWS services using cutting edge technologies. Also designed and developed an ETL pipeline, a Machine Learning platform of Data pipelines and a feature repository to validate model performance, so machine learning engineers, Data Scientists and Data analysts can focus on model building instead of recreating features; this has decreased a significant amount of the time they spent recreating features. Also developed analytical Dashboards on a monthly, daily and hourly basis.


Roles & Responsibilities:
• Involved with project team members to deliver data models to meet data requirements and create all documentation and deliverables in accordance with enterprise guidelines and standards; worked on Metadata transfer amongst various proprietary systems.
• Partnered with ML Engineers and Data Scientists to build AI-driven solutions including chatbots, RAG pipelines, and embedding workflows leveraging AWS Gen AI services such as Amazon Bedrock, Amazon Titan models, Amazon Kendra for semantic search, and SageMaker JumpStart.
• Designed and maintained Amazon Bedrock-hosted LLMs for response generation.
• Designed and implemented scalable data pipelines supporting Retrieval-Augmented Generation (RAG) architecture using Amazon SageMaker + FAISS, enabling efficient vector-based semantic search across structured and unstructured data (see the retrieval sketch after this list).
• Developed intelligent chatbot frameworks integrated with Amazon Lex, Comprehend and Bedrock to power intelligent, multilingual, context-aware conversational agents with real-time contextual knowledge.
• Created semantic search and embedding pipelines to support context-aware conversations.
• Designed and developed feature stores and data pipelines for the Gen AI (Generative Artificial Intelligence) and machine learning platform.
• Designed and developed data pipelines using Docker, Kubernetes, Logstash, Hadoop and the AWS stack.
• Automated daily ETL processes using Apache Airflow and Crontab.
• Created Terraform resources and modules to deploy AWS services like S3 buckets, Lambda Functions, Glue Jobs, EMR clusters, Step Functions, IAM roles and Policies, Athena and QuickSight.

• Created a data pipe-line system using Kafka to collect data from the application and store critical data in AWS.
• Created YML files for Docker images and configured a GitLab CI/CD pipeline running in Docker and Kubernetes with Terraform.
• Moved a large amount of data from Hadoop to AWS S3 buckets, Redshift and the data warehouse using Glue jobs.
• Analysed large datasets using Spark and Spark SQL for faster testing and processing of data.
• Experienced in writing Spark-SQL (Data pipeline) in PySpark to read parquet data out of S3 and create hive tables using the Scala API.
• Coordinated the daily, weekly, bi-weekly, monthly, and quarterly oversight and monitoring of activities.
• Developed ETL pipelines in and out of the data warehouse using a mix of Python and Scala in AWS Glue, EMR and Databricks.
• Implemented Real-time Processing using Spark Streaming with Kafka.
• Consumed data from Kafka using Apache Spark in AWS EMR and Glue.
• Used SageMaker Notebooks and Databricks for building models.
• Built dashboards and reports using AWS QuickSight.


• Worked on migrating SQL scripts from Redshift and Athena.
• Experience with container systems like Docker and container orchestration like EC2 Container Service and Kubernetes; worked with Terraform.
• Proficient in building CI/CD pipelines for testing and production environments using Terraform.
• Used AWS EMR to transform and move large amounts of data into and out of other AWS data stores and databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon DynamoDB.
• Worked on a POC with Apache Spark using Scala to implement Spark in the project.
• Done some POCs on new technologies; suggested and implemented new technologies. Followed agile methodology.
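
Illustrative sketch of the RAG retrieval flow referenced in the list above: embed text with an Amazon Titan embedding model through Bedrock, index the vectors in FAISS, and fetch the closest passages for a question. The model ID, region and sample passages are assumptions for illustration only, not details taken from the project.

import json
import boto3
import faiss
import numpy as np

# Bedrock runtime client; the region is an assumption.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def titan_embed(text):
    # Call the Titan text-embedding model (assumed model id) and return the vector.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    payload = json.loads(resp["body"].read())
    return np.array(payload["embedding"], dtype="float32")

# Hypothetical document chunks; in practice these come from the data pipeline.
chunks = ["refund policy text ...", "card activation steps ...", "dispute process ..."]
vectors = np.stack([titan_embed(c) for c in chunks])

# Build an in-memory FAISS index and retrieve the top-2 chunks for a question.
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)
_, ids = index.search(titan_embed("How do I dispute a charge?").reshape(1, -1), 2)
context = [chunks[i] for i in ids[0]]  # passed to a Bedrock-hosted LLM as grounding context
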

RCG, Atlanta, GA Jan 2018 – Aug 2021
Role: Data Engineer

Description: As part of the Data and Analytics Team, responsibility is to enable data sources for analytical teams following best practices.

Roles & Responsibilities:
• Designed, developed and deployed a Datawarehouse (Data Lake and Data Warehouse) and databases for analytics using AWS Redshift.
• Designed, developed and deployed a DataLake to store Realtime and batch data from applications using Confluent Kafka and AWS S3.

• Designed, developed and deployed highly available and scalable batch data pipelines using SNS, StepFunction, Lambda, Glue, EMR, EC2, Athena, SageMaker, QuickSight, Redshift etc. (see the Glue job sketch after this list).
• Built self-service APIs and data pipelines for internal data teams using API Gateway, SNS, Lambda, Glue, EMR and SageMaker.
• Developed self-service Machine Learning pipelines and deployed Machine Learning model end points and inference services with SageMaker, IAM, Lambda and CloudWatch to ensure secure, auditable model access.
• Used AWS Fargate and ECS for deployment of containerized workloads.
• Optimized data readiness, query performance and access patterns for storing the tables, enabling fast data delivery.
• Used QlikView, Kibana, Tableau and Apache Superset for visualization.
• Designed database architecture based on the requirement.
• Designed and developed data pipelines and a self-service data platform in AWS Cloud.

Email: **.*******@*****.***

Mobile: LinkedIn: 518-https:684-//5357 www.linkedin.com/in/sandyndegenai/

• Created Terraform resources and modules to deploy AWS services like S3 buckets, Lambda Functions, Glue Jobs, EMR clusters, Step Functions, IAM roles and Policies, Athena and QuickSight.
• Developed and automated data pipelines using AWS S3, EC2, Lambda, Glue, EMR, Step Function and RDS.
• Developed a process to generate missing data from source files.
• Created a data cooking model using Python libraries to normalize and transform raw data.
• Developed a POC on the Maria DB Audit Plugin, identified a few performance issues and escalated them to the Maria DB research and development team.
• Developed a Topic extraction (Topic Modelling) model using LDA and other customized NLP techniques.
• Designed a complex data model to normalize transaction data between application and database.
• Gathered requirements from SMEs, Project Managers and Business analysts.
• Established an AWS Bastion host to secure AWS resources.
• Represented the team in a few summits.
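
A minimal AWS Glue (PySpark) job sketch in the spirit of the Glue-based batch pipelines above: read raw parquet from S3, apply a simple transform, and write curated, partitioned output back to S3. Bucket names and columns are hypothetical.

import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext
from pyspark.sql import functions as F

# Standard Glue job bootstrap.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the raw zone (hypothetical path), keep valid rows, stamp a load date.
raw = spark.read.parquet("s3://example-raw-bucket/transactions/")
curated = raw.filter(F.col("amount") > 0).withColumn("load_date", F.current_date())

# Write partitioned parquet to the curated zone for downstream Athena/Redshift queries.
(curated.write.mode("overwrite")
    .partitionBy("load_date")
    .parquet("s3://example-curated-bucket/transactions/"))

job.commit()
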

S&P Global Charlottesville, VA Feb 2016- Dec 2017

Role: Data Engineer

Description: Involved in S&P Global's prestigious 360-degree profiles assembling project: analysing multi-dimensional data, conducting Network optimization, Predictive analytics, and Social Analytics.

Roles & Responsibilities:

• Extensive experience with Google Cloud services like Google Cloud Storage, Cloud Function, DataFlow, BigQuery, Cloud Composer (Airflow), DataProc, Compute Engine, Cloud Run etc.
• Designed, developed and deployed DataLake, DataMart and Datawarehouse in Google Cloud Platform.
• Designed, developed and deployed batch and Realtime data pipelines using GCP services.
• Designed, developed and deployed Realtime streaming pipelines using Apache Kafka and Confluent Kafka to ingest Realtime and batch data from Relational Databases, DataLakes, IOT applications and on-prem data.
• Designed and developed DAGs, plugins and custom hooks in Airflow.
• Developed various streaming pipelines using PubSub and Dataflow.
• Designed, developed and deployed various batch data pipelines using Airflow.
• Migrated Hadoop ecosystem DataLake, Datawarehouse and Data Pipelines to Google Cloud Platform.
• Created Realtime consumers to read data from Kafka, load data into the DataLake (Cloud Storage) and transform data using Spark in a DataProc cluster (see the streaming sketch after this list).


• Designed and developed streaming data pipelines using Apache Kafka, Flume, Hive and Pig.
• Built complex ETL processes and stored processed data in Hive and HDFS.
• Transformed raw text data from blogs and social networking sites, provided by third party vendors, for sentiment analysis, information extraction and information retrieval.
• Performed data profiling to learn the behaviour of various features such as traffic pattern, location, time, date and time etc.
• Good understanding of machine learning techniques like supervised and unsupervised learning.
• Developed a database containing Tables, Stored Procedures, Functions, Views, Triggers and Indexes in SQL Server, connecting to the existing CMS system.
• Created a POC on an image classification model using machine learning techniques.
• Understanding data from third party applications.
• Extracted sales data from the past few years and conducted data preprocessing and data mining using R and Python.
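
A minimal PySpark Structured Streaming sketch for the Kafka-to-Data-Lake flow mentioned above: consume a Kafka topic and land the events in Cloud Storage as parquet. Broker, topic and bucket names are hypothetical, and the Kafka source package and GCS connector are assumed to be available on the cluster (as on Dataproc).

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-to-gcs").getOrCreate()

# Subscribe to the raw events topic.
events = (spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "raw-events")
    .option("startingOffsets", "latest")
    .load())

# Kafka delivers key/value as binary; keep the payload as a string plus the event timestamp.
decoded = events.select(
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp").alias("event_ts"),
)

# Append micro-batches to the Data Lake (Cloud Storage) as parquet.
query = (decoded.writeStream.format("parquet")
    .option("path", "gs://example-datalake/raw-events/")
    .option("checkpointLocation", "gs://example-datalake/_checkpoints/raw-events/")
    .trigger(processingTime="1 minute")
    .start())

query.awaitTermination()
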

IBM, Hyderabad, India Jul 2014 – Dec 2015

Role: Data Analyst

Description: As a Data Mining team, we are responsible for providing data-specific IT services that successfully meet customers' IT challenges, focusing on the delivery of the very latest technologies and methodologies. Our organization has expanded its resources and knowledge base across multiple domains, industries and verticals.

Roles & Responsibilities:

• Participated in analysis, design, development, testing and implementation of various financial systems.
• This system consists of ETL (Extracting, Transforming and Loading) jobs loading data from source systems to various databases using transformation and mapping logic.
• Defined the database structure, mapping and transformation logic.
• Developed new and modified existing functional modules using Oracle Developer and PL/SQL (packages, database triggers, stored procedures and other code modules) in support of business requirements.
• Wrote UNIX Shell Scripts to run ETL jobs on the server side, and created External Table scripts for loading the data.
• Worked with various functional experts to implement their functional knowledge into business rules, in turn as working code modules like procedures and functions.


TOOLS AND TECHNOLOGIES:

Languages: Shell, SQL, Python, PySpark and R
Generative AI & LLMs: Gemini (Google), GPT-4 / GPT-3.5 (OpenAI), Claude (Anthropic), BLOOM, Llama 2 / 3 (Meta), Falcon, Mixtral, DeepSeek, Cohere
Embedding & Vector DBs: Amazon Titan, SentenceTransformers, Hugging Face Embeddings, FAISS, ChromaDB, Pinecone, Weaviate, Milvus, OpenSearch
RAG & Agentic Frameworks: LangChain, Llama Index, Haystack, CrewAI, AutoGen, PromptLayer
Packages: NumPy, Pandas, Scikit-learn, Scikit-image, Spacy, TensorFlow, PyTorch, Matplotlib, Seaborn, Lime, Bokeh, H2O, Beautiful Soup, Selenium, Pillow, Psycopg, PyOd, SQLAlchemy, Flask and more
Container technologies: AWS EKS, Docker and Kubernetes
Cloud Platforms: AWS and GCP

EDUCATION:

Master of Computer Applications Jun 2014

Osmania University, Telangana, India

Bachelor of Science B.Sc (Mathematics, Physics, Chemistry) Jun 2010
Kakatiya University, Telangana, India


