Lavanya Thota
DATA ENGINEERING MANAGER
Current Location: Florida, USA
Contact: +1-407-***-****
Mail ID: adwhrq@r.postjobfree.com
LinkedIn: https://www.linkedin.com/in/lavanya-thota/

Career Summary
I am a Data Engineering Manager at Accenture with 10+ years of overall experience. I have a deep interest in and good knowledge of large-scale data analysis, machine learning, natural language processing, artificial intelligence, knowledge discovery, data modeling, ETL, reporting, and web application development. I have an acumen for solving business problems and providing data-driven solutions that increase customer satisfaction.
Certified in and hands-on experience with the GCP platform
Good exposure to cloud computing (AWS, Databricks)
Good knowledge of machine learning techniques such as LTV, CV, relevancy, propensity, LDA, random forest, decision trees, and clustering
Worked on automated CI/CD builds, Git, scheduled builds, and upstream/downstream jobs using Jenkins
Certified MongoDB DBA
Excellent experience in Pig and Hive using CLI mode (Hive shell) and Sqoop
Hands-on experience in Core Java and Ruby for MapReduce concepts and HDFS
Developed Oozie workflows for scheduling and orchestrating ETL processes
Hands-on experience in Python and Spark implementation for performance optimization
Excellent experience in shell scripting
Special expertise in APIs and web services (SOAP and REST)
Knowledge of software engineering concepts such as object orientation, MVC, unit testing, and Agile
Good knowledge of PostgreSQL, MySQL, and RDBMS concepts
Flexible and ready to take on new challenges
Good presentation, data analysis, coordination, and team skills

Technical Skills
Languages: Python, Spark, Hive, Sqoop, Oozie, Java, Hadoop (Pig Latin), Ruby, R, Shell Scripting
Web stack & DB: jQuery, AngularJS, Node.js/Bower/Grunt, JSP, BigQuery, MongoDB, PostgreSQL, HBase, MySQL
Cloud Platforms: Google (GCP), Amazon (AWS), Microsoft (Azure), Databricks
Machine Learning Techniques: LTV, CV - RFM, Relevancy, Propensity, LDA, Random forest, Decision tree, Clustering
Professional Experience
Data Engineering Manager, Accenture India Private Limited, Hyderabad (Dec 2021 – Dec 2022)
Data Engineering Consultant, Accenture India Private Limited, Hyderabad (Dec 2018 – Dec 2021)
Senior Big Data Engineer, Kogentix Technologies Pvt. Ltd (acquired by Accenture), Hyderabad (Apr 2017 – Dec 2018)
Big Data Engineer, Kogentix Technologies Pvt. Ltd (acquired by Accenture), Hyderabad (Oct 2015 – Apr 2017)
Technical Associate, Sears Holdings India Private Limited, Pune (Jul 2012 – Oct 2015)
Project #9: Serum Analytics Hub
Customer: McDonald's (https://www.mcdonalds.com/)
Duration: February 2021 to December 2022
Tools & Technologies: GCP, Python, BigQuery, GCS, App Engine, AI Platform, AWS S3, AWS Elastic Beanstalk
Database: Databricks & BigQuery
Location: Accenture, Hyderabad, India (offshore)
Role: Manager - Data Engineering
Team Size: 10
Project Summary: Serum AHUB Analytics for McDonald's. Serum Flow is the user interface dashboard for marketers to learn about audience data and ultimately create a test-and-learn experiment to deploy for a group of customers within a segment.
My Contributions and Responsibilities
• Collaborating with product owners and providing solutions
• Design and architecture of tools
• Enhancing new features as per client requirements using Python APIs
• Loading tables from AWS S3 into the Databricks environment
• Analyzing data and running models in Databricks and GCP environments
• Creating, scheduling, and monitoring jobs in Databricks
• Setting up CI/CD pipelines for application deployment in AWS and GCP environments
• Deploying applications in AWS Elastic Beanstalk and GCP App Engine
• Coordinating with other team members and mentoring the team
• Providing a secure and user-friendly platform for retail clients

Project #8: INTIENT
Customer: Accenture Internal Product (https://www.accenture.com/)
Duration: April 2020 to February 2021
Tools & Technologies: GCP, Java, BigQuery
Database: PostgreSQL
Location: Accenture, Hyderabad, India (offshore)
Role: Team Lead - Data Engineer
Team Size: 5
Project Summary: INTIENT Analytical Studio is an end-to-end platform, supporting the full INTIENT product suite, that explores, transforms, models, and visualizes data to support the realization of business value for the life sciences industry, deployed on GCP. INTIENT tools are stitched together to harmoniously enable an analytical platform, from exploration to operation.
My Contributions and Responsibilities
• Collaborating with product owners and providing solutions
• Design and architecture of tools
• Exposed REST APIs using Java and involved in UI integration
• Worked on unit test cases using PowerMockito
• Coordinating with other team members and mentoring the team
• Providing a secure and user-friendly platform for life sciences clients

Project #7: Duke Energy Utilities – Pilot Project
Customer: Duke Energy (https://www.duke-energy.com/)
Duration: July 2019 to March 2020
Tools & Technologies: GCP (Cloud Functions and Vision APIs), Python
Database: MySQL
Location: Accenture, Hyderabad, India (offshore)
Role: Team Lead - Data Engineer
Team Size: 10
Client Profile: Duke Energy is one of the largest electric power holding companies in the United States, supplying power to retail customers.
Project Summary: Upon finishing construction, Duke Energy and/or Contractor crews take a variety of geo-tagged pictures of the constructed asset. AI/ML inspects the images to identify any irregularities; remote Duke Energy inspectors assess construction based on imagery with their insights used to incrementally train algorithms. Once these assets are digitized, there are innumerable possible applications for further monitoring for condition and vegetation growth.
My Contributions and Responsibilities
• As the lead, designed the solution and provided a detailed architecture right from the back end to the front end
• Pre-processing images (renaming, resizing, orientation)
• Creating datasets for object detection and annotating images using Vision ML
• Working on Cloud SQL for data storage
• Worked on Cloud Functions to trigger HTTP services
• Coordinating with iOS and web app teams to integrate the front-end and back-end applications
• Updating the Firestore database to notify the web app team
• Took complete ownership of cloud development and mentored the team to complete deliverables on short deadlines
Project #6: HR Capital Analytics (HCA)
Customer: Accenture Internal Product (https://www.accenture.com/)
Duration: December 2018 to July 2019
Tools & Technologies: R, text analysis using machine learning techniques
Location: Accenture, Hyderabad, India (offshore)
Role: Data Engineer
Team Size: 5
Project Summary: HCA is a global end-to-end human capital analytics service provider, delivering critical insights about our people, their preferences, what makes them most effective, and their contribution to business success, in order to plan, hire, develop, engage, and retain key talent. HR analytics covered the UKI dashboard, exit survey analysis, new joiner survey analysis, and performance driver analysis.
My Contributions and Responsibilities
• Collaborating with data providers.
• Storing data on the AWS S3 platform.
• Developing ML models for text analytics on the AWS platform using R.
• Providing insights from statistical and text analysis for HR analytics.
• Coordinating with other team members and mentoring the team.
• Delivering projects end to end.
• Prepared PowerPoint decks on various analyses to present to leadership

Project #5: ADVANCING DBS with AI (ADA) Platform
Customer: Development Bank of Singapore (DBS) (https://www.dbs.com/)
Duration: August 2017 to December 2018
Tools & Technologies: Hadoop, Spark, Scala, Hive, HBase, Alluxio, Airflow, Talend, Jenkins
Database: HBase
Location: DBS Asia Hub 2, Hyderabad, India (onsite)
Role: Senior Developer
Team Size: 7
Client Profile: DBS Bank Ltd is a Singaporean multinational banking and financial services corporation. It has market-dominant positions in consumer banking, treasury and markets, asset management, securities brokerage, and equity and debt fund-raising in Singapore, Hong Kong, and Taiwan.
Project Summary: The ADVANCING DBS with AI (ADA) Platform is a data-first and data-centric platform that provides analytical, data science, and business intelligence (BI) functions across DBS' banking products and domains.
My Contributions and Responsibilities
• Worked as a designer and developer, closely with domain experts
• Contributed heavily to R&D, architecture design, preparing prototypes, and presenting to clients and customers for validation
• Developed the platform on AWS and cloud with different technologies and verified compatibility
• Worked on Spark optimization using the Spark context, bucketing techniques, memory management, and data locality
• Providing integration test results for each platform
• Analyzing performance benchmarks
• Worked on Airflow to schedule the tasks
• Worked on development and continuous integration of jobs with Jenkins
• Created support and transition documents
Project #4: DMLE – Digital Matching Learning Environment
Customer: Nielsen, Chicago (http://www.nielsen.com)
Duration: October 2015 to August 2017
Tools & Technologies: Java, AngularJS, JUnit, web services (REST, SOAP), Python, Hadoop, Spark
Database: Postgres
Location: Kogentix Technologies Pvt. Ltd, Hyderabad, India (offshore)
Role: Java Developer
Team Size: 7
Client Profile: Nielsen invented an approach to measuring competitive sales results that made the concept of "market share" a practical management tool. Their deep knowledge of consumer shopper behavior helps clients understand the "why" behind the buy so they can enhance their marketing approach at retail.
Project Summary: Digital Matching and Machine Learning Environment (DMLE) automates the coding, classification, and matching of Nielsen data. A machine learning process matches data to identify unique products, with confidence levels assigned, to reduce the manually intensive coding process.
My Contributions and Responsibilities
• Worked as lead developer
• Implemented multiple ML models for text mapping using Spark, Databricks, and AWS
• Performed resource optimization using the Spark context
• Developed a generic tool for monitoring jobs
• Automated job creation and monitoring using Oozie
• Automated CI/CD builds and upstream/downstream jobs using Jenkins
• Guided team members and trained them for further development of the product
• Contributed to all-round development of the product and did R&D for several modules
• Worked as a team member and tracked development progress
• Designed a UI for easy operation by high-level management
• Discussed the technical approach to development with onsite and offshore team leads
• End-to-end testing of application
• Closely working with clients for reviews and approvals
• Responsible for preparing technical specs and analyzing functional specs

Project #3: IMPACT Back Traffic
Customer: Sears Holdings Corporation, Chicago (https://searsholdings.com/)
Duration: September 2013 to October 2015
Tools & Technologies: Java, ETL (IBM DataStage), Linux, Teradata, DB2, and shell scripting
Location: Sears Holdings India, Pune, India (offshore)
Role: Pig, Hive, Java Developer
Team Size: 10
Client Profile
Sears Holdings Corporation is an integrated retailer and the parent company of Kmart Holding Corporation (Kmart) and Sears, Roebuck and Co. (Sears). The Company operates through two segments, Kmart and Sears Domestic. It is a home appliance retailer and also offers tools, lawn and garden, fitness equipment, and automotive repair and maintenance, and it operates a range of websites under the sears.com and kmart.com banners.
Project Summary: The main goal of this module is to send Sears prices to its stores on a daily basis. Product prices may change from time to time based on the items and on other sellers of the item, so this module runs business rules on those products and sends the updated prices to the stores.
My Contributions and Responsibilities
• Responsible for preparing technical specs and analyzing functional specs for DataStage
• Importing data from DB2 to Hadoop using Sqoop
• Replicating the legacy mainframe and DataStage logic in the Hadoop framework using Pig and Hive
• End-to-end testing of the data warehouse
• Scheduled jobs using Control-M
Project #2: Executive Information System (EIS)
Customer: Sears Holdings Corporation, Chicago (https://searsholdings.com/)
Duration: November 2012 to September 2013
Tools & Technologies: Hadoop (Pig Latin), Java, Ruby, UV (Vancouver Utilities), and Linux
Database: Oracle
Location: Sears Holdings India, Pune, India (offshore)
Role: Hadoop-Ruby Developer
Team Size: 30
Client Profile: Sears Holdings Corporation (see Project #3).
Project Summary: This project relates to Executive Information System report generation based on sales. All business logic available in the mainframe was translated into Hadoop without change or addition, with no visible impact on the business users of the applications.
My Contributions and Responsibilities
• Analyzing the low-level design (LLD) document and understanding the program logic
• Preparing sample code in the Hadoop environment
• Reviewing code to maintain client standards
• Involved in impact analysis and design
• Involved in unit testing, integration testing, and production parallel testing
• Preparing issue logs to explain to clients
Project #1: Best Vendor Pack (BVP)
Customer: Sears Holdings Corporation, Chicago (https://searsholdings.com/)
Duration: August 2012 to October 2013
Tools & Technologies: Hadoop (Pig Latin), Java, PuTTY, and Control-M
Database: DB2
Location: Sears Holdings India, Pune, India (offshore)
Role: Hadoop-Ruby Developer
Team Size: 10
Client Profile: Sears Holdings Corporation (see Project #3).
Project Summary: The goal of Best Vendor Pack (BVP) is to build a database with the best vendor pack for all KSN and store combinations. The database carries all KSN/LOCN combinations even though some stores may not carry the KSN. Initially this project was developed on the legacy mainframe; we later migrated it to Hadoop. This open-source technology eliminated the legacy system and avoided licensed-technology costs, directly saving the organization millions of dollars.
My Contributions and Responsibilities
• Understanding mainframe JCLs
• Involved in Pig script development for the corresponding mainframe JCLs
• Writing Sqoop jobs for moving data between relational databases and HDFS
• Involved in enhancement of jobs as per the given requirements
• Involved in unit testing and system integration testing for the Hadoop jobs
• Designing the jobs in the Control-M automated scheduling tool
• Providing support for the migrated jobs
Awards
1. Appreciated by the client for automating the manual deployment process in the Serum project.
2. Received a "Spot Award" for performance on the Duke project at Accenture.
3. Received the Best Team Member award for managing the full DMLE project from offshore at Kogentix.
4. Won an enterprise-level award for designing and developing a log analyzer tool for IMPACT Back Traffic at Sears Holdings.
Education
G. Narayanamma Institute of Science and Technology, JNTU, Hyderabad, India
Bachelor of Technology in Computer Science, 2012