NAME: Naveen Reddy Lankela
MAIL ID: **********************@*****.***
CONTACT: 860-***-****
LINKEDIN: Naveen Reddy Lankela
Professional Summary:
Analytical and results-oriented Data Analyst with 4+ years of experience supporting business decision-making through data-driven insights, advanced analytics, and scalable reporting solutions. Strong background in analysing large, complex datasets using Python and SQL, with hands-on experience in data extraction, transformation, and validation across multiple structured and semi-structured data sources. Proven ability to ensure data accuracy, completeness, and consistency by implementing automated data quality checks and reconciliation processes. Experienced in designing and maintaining analytical data models, including star and snowflake schemas, to support efficient querying and reporting. Skilled in building end-to-end ETL pipelines using Python and cloud-based tools, enabling near real-time data availability for analytics and dashboards. Adept at collaborating with data engineers and business stakeholders to define KPIs, align metrics with business definitions, and ensure parity between legacy and modern reporting systems. Highly proficient in business intelligence and visualization tools such as Power BI and Tableau, with hands-on experience creating interactive dashboards, optimized DAX measures, and performance-tuned data models. Familiar with implementing role-level security, scheduled refreshes, and automated monitoring to deliver secure, reliable, and production-ready dashboards for both operational and executive reporting.
Possesses a solid foundation in statistical analysis and machine learning concepts, including exploratory data analysis, hypothesis testing, regression, classification, and time-series analysis. Comfortable applying predictive techniques and integrating analytical outputs into dashboards and reporting workflows to provide forward-looking insights. Experienced working in cloud-based analytics environments on AWS and GCP, integrating data from databases, APIs, and external sources. Strong communicator and collaborative team player with experience working in Agile/Scrum environments, translating business requirements into technical solutions, documenting data lineage and transformation logic, and supporting end-user adoption through training and clear documentation.
Technical Skills:
Data Analysis & Querying: Strong experience in SQL for complex queries, joins, CTEs, and performance optimization; proficient in Python for data analysis and automation using Pandas and NumPy.
Data Visualization & BI: Hands-on experience building interactive dashboards and reports using Power BI and Tableau, including DAX measures, Power Query transformations, and performance tuning.
Statistics & Analytics: Solid foundation in exploratory data analysis, hypothesis testing, regression analysis, correlation analysis, and basic time-series techniques to support data-driven decisions.
ETL & Data Pipelines: Experience developing and maintaining ETL pipelines using Python and SQL, performing data cleansing, validation, and incremental refresh for reliable analytics.
Databases & Data Modelling: Worked with relational databases such as MySQL, PostgreSQL, SQL Server, and Redshift; skilled in designing star and snowflake schema models for analytical reporting.
Cloud & Platforms: Practical exposure to AWS services including S3, RDS, Redshift, Athena, and Glue; working knowledge of GCP BigQuery for analytics use cases.
Automation & Integration: Integrated REST APIs and external data sources; implemented Python-based automation for reporting, monitoring, and data quality checks.
Governance & Reporting: Experience defining KPIs, validating metrics, documenting data lineage, and implementing role-based security in BI tools.
Tools & Methodologies: Git, GitHub, Jupyter Notebook, VS Code; Agile/Scrum methodology.
Education:
Master’s in Computer Science, University of Bridgeport, December 2024
Bachelor’s in Computer Science, May 2021
Professional Experience:
Client: Fiserv July 2023 – Present
Role: Sr. Data Analyst, GA
Responsibilities:
Involved in requirement analysis, application development, application migration, and maintenance using Software Development Lifecycle (SDLC) and Python technologies.
Developed MLOps pipelines on AWS using SageMaker Pipelines, Lambda, and Step Functions to orchestrate training, tuning, and deployment.
Adapted existing Retrieval-Augmented Generation (RAG) pipelines to leverage Google Vertex AI and Google LLM APIs for scalable LLM deployments.
Designed hybrid search strategies combining vector similarity search and keyword-based search using Google Vector Search to improve document retrieval accuracy.
Set up model monitoring and alerting workflows to ensure ongoing model performance using CloudWatch and SageMaker Model Monitor.
Deployed LLM-based workflows for call transcript summarization using GPT-4, LangChain, and vector databases (RAG architecture).
Fine-tuned domain-specific LLMs using LoRA/PEFT and created performance benchmarks using accuracy, BLEU, perplexity, and human feedback.
Developed end-to-end ML pipelines using AWS SageMaker Pipelines, Step Functions, and Lambda.
Integrated Google LLMs and Google Vector Search into NLP workflows for intelligent document search and hybrid search solutions.
Built RAG (Retrieval Augmented Generation) pipelines utilizing LangChain and custom embedding models.
Deployed and monitored fine-tuned LLMs (GPT-4, Claude) using LoRA and PEFT techniques.
Built REST APIs using Flask and deployed on Docker containers in hybrid AWS/Google Cloud environments.
Optimized model training costs through advanced instance management and parallel processing.
Designed, implemented, and monitored ML solutions ensuring high performance and low latency.
Built Support Vector Machine algorithms for detecting fraudulent and dishonest customer behavior using Python packages such as Scikit-learn, NumPy, SciPy, Matplotlib, and Pandas.
Used AWS S3, DynamoDB, AWS Lambda, and AWS EC2 for data storage and model deployment; worked extensively with AWS services such as SageMaker, Lambda, Lex, EMR, S3, and Redshift.
Used Amazon Transcribe to obtain call transcripts and performed text processing (cleaning, tokenization, and lemmatization).
Participated in feature engineering such as feature interaction generation, feature normalization, and label encoding with Scikit-learn preprocessing.
Designed the data marts in dimensional data modeling using Snowflake schemas.
Generated executive summary dashboards to display performance monitoring measures with Power BI.
Developed and implemented predictive models using Artificial Intelligence/Machine Learning algorithms such as linear regression, classification, multivariate regression, Naive Bayes, Random Forest, K-means clustering, KNN, PCA, and regularization for data analysis.
Leveraged AWS SageMaker to build, train, tune, and deploy state-of-the-art Artificial Intelligence/Machine Learning and Deep Learning models.
Built classification models including Logistic Regression, SVM, Decision Tree, and Random Forest.
Used the Pandas API to put data into time-series and tabular formats for easy timestamp data manipulation and retrieval.
Worked on creating ETL specification documents, flowcharts, process workflows, and data flow diagrams.
Designed both 3NF data models for OLTP systems and dimensional data models using star and snowflake schemas.
Worked on snowflaking the dimensions to remove redundancy.
Created reports utilizing Excel services and Power BI.
Applied Deep Learning (RNN) to find the optimal route for guiding the tree trim crew.
Used the XGBoost algorithm to predict storms under different weather conditions and used Deep Learning to analyze the severity of post-storm effects on the power lines and circuits.
Worked with Snowflake SaaS for cost-effective data warehouse implementation on the cloud.
Developed data mapping, transformation, and cleansing rules for the Master Data Management architecture involving OLTP, ODS, and OLAP.
Produced A/B test readouts to drive launch decisions for search algorithms including query refinement, topic modeling, and signal boosting and machine-learned weights for ranking signals.
Implemented an image recognition anomaly detector using convolutional neural networks (CNN) and SVM to identify fraudulent purchases.
Designed and developed Power BI graphical and visualization solutions with business requirement documents and plans for creating interactive dashboards.
Environment: SDLC, Python, Scikit-learn, NumPy, SciPy, Matplotlib, Pandas, AWS S3, DynamoDB, AWS Lambda, AWS EC2, SageMaker, Lex, EMR, Redshift, Snowflake, RNN, Machine Learning, Deep Learning, OLAP, ODS, OLTP, 3NF, Naive Bayes, Random Forest, K-means clustering, KNN, PCA, Power BI.
Client: Wipro India March 2022 – Feb 2023
Role: Sr. Data Analyst
Responsibilities:
Involved in Data Analysis, Data Validation, Data Cleansing, Data Verification, and identifying data mismatches. Performed data imputation using the Scikit-learn package in Python.
Built advanced GenAI capabilities leveraging Hugging Face Transformers and LLaMA for multi-label classification on architectural documentation.
Integrated NLP pipelines with LangChain and vector search components for document similarity scoring, semantic search, and clustering.
Applied fine-tuning techniques like LoRA and PEFT to improve model accuracy on specialized architectural text datasets and implemented GraphRAG for structured output generation.
Built several predictive models using machine learning algorithms such as Logistic Regression, Linear Regression, Lasso Regression, K-Means, Decision Tree, Random Forest, Naïve Bayes, Social Network Analysis, Cluster Analysis, Neural Networks, XGBoost, KNN, and SVM.
Built detection and classification models using Python, TensorFlow, Keras, and scikit-learn.
Used Amazon Web Services (AWS) provisioning, with good knowledge of AWS services like EC2, S3, Redshift, Glacier, Bamboo, API Gateway, ELB (Load Balancers), RDS, SNS, SWF, and EBS.
Integrated NLP pipelines with LangChain and Google Vector Search for semantic document clustering and hybrid retrieval.
Built fine-tuned models on architectural datasets using Hugging Face Transformers and Google Cloud Vertex AI.
Implemented Flask-based microservices deployed on Docker for scalable ML model inference.
Developed monitoring dashboards using AWS CloudWatch, Google Cloud Monitoring, and QuickSight.
Led initiatives enhancing communication across distributed teams using agile methodologies.
Developed the required data warehouse model using a Snowflake schema for the generalized model.
Worked on processing the collected data using Python Pandas and Numpy packages for statistical analysis.
Used Cognitive Science in Artificial Intelligence/Machine Learning for neurofeedback training, which is essential for intentional control of brain rhythms.
Worked on data cleaning and ensured data quality, consistency, and integrity using Pandas and NumPy.
Developed star and snowflake schema-based dimensional models to build the data warehouse.
Used NumPy, SciPy, Pandas, NLTK (Natural Language Toolkit), and Matplotlib to build models.
Involved in text analytics, generating data visualizations using Python and creating dashboards using tools like Power BI.
Performed Naïve Bayes, KNN, Logistic Regression, Random Forest, SVM, and boosting to identify whether a loan would default or not.
Managed database design and implemented a comprehensive Snowflake Schema with shared dimensions.
Applied various Artificial Intelligence (AI)/machine learning algorithms and statistical modeling techniques such as decision trees, text analytics, natural language processing (NLP), supervised and unsupervised learning, and regression models.
Implemented an ensemble of Ridge, Lasso Regression, and XGBoost to predict the potential loan default loss.
Performed data cleaning and feature selection using the MLlib package in PySpark and worked with deep learning frameworks.
Involved in scheduling refreshes of Power BI reports, hourly and on-demand.
Environment: SDLC, Python, Scikit-learn, NumPy, SciPy, Matplotlib, Pandas, AWS S3, DynamoDB, AWS Lambda, AWS EC2, SageMaker, NLTK, Lex, EMR, Redshift, Machine Learning, Deep Learning, Snowflake, OLAP, OLTP, Naive Bayes, Random Forest, K-means clustering, KNN, PCA, PySpark, XGBoost, TensorFlow, Keras, Power BI.
Client: Mphasis India June 2021 – Feb 2022
Role: Data Analyst
Responsibilities:
Performed Data Analysis, Data Migration, and Data Preparation useful for Customer Segmentation and Profiling.
Implemented analytical calculations in Python using Pandas, NumPy, Seaborn, SciPy, Matplotlib, Scikit-learn, and NLTK.
Implemented Data Warehousing and Data Modelling procedures to build ETL pipelines that extract and transform data across multiple sources.
Architected scalable algorithms in Python, capable of performing Data Mining and Predictive Modelling using a wide range of statistical algorithms as required.
Utilized ETL tooling to build, template, and rapidly deploy new pipelines for gathering and cleaning data.
Developed multivariate data validation scripts in Python for equity, derivative, currency, and commodity-related data, improving pipeline efficiency by 17%.
Used Predictive Analysis to develop and design sample methodologies and analyzed data for pricing of client's products.
Involved in optimizing the ETL process of Alteryx to Snowflake.
Used data visualization tools such as Tableau, advanced MS Excel (macros, INDEX, conditional lists, arrays, pivots, and lookups), Alteryx Designer, and Modeler.
Used Data Analytics and Data Automation, and coordinated custom visualization tools using Python, Mahout, Hadoop, and MongoDB.
Performed all necessary day-to-day Git support for different projects; responsible for design and maintenance of the Git repositories and the access control strategies.
Fostered teamwork, communication, and collaboration while managing competing weekly, bi-weekly, monthly, and quarterly priorities.
Worked extensively on ER/Studio in several projects in both OLAP and OLTP applications.
Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from legacy Oracle and SQL Server database systems.
Analysed business requirements and updated functional specifications while conducting testing on multiple versions and resolving critical bugs to improve the functionality of the Learning Management System.
Built and deployed a UI/UX e-learning web application using jQuery, JavaScript, HTML, and Node.js for various courses.
Cleaned and transformed the data using Python, developed dashboards and visual KPI reports using Tableau.
Involved in publishing various live, interactive data visualizations, dashboards, reports, and workbooks from Tableau Desktop to Tableau Server.