
Data Analyst Machine Learning

Location:
New York City, NY
Posted:
July 29, 2025


Resume:

Sai Krishna Nagulapati

Data Analyst

551-***-**** ***********************@*****.*** VA LinkedIn

SUMMARY

Data Analyst with 4+ years of experience in the finance industry, leveraging advanced analytics to drive strategic business decisions. Proficient in Python, SQL, and AWS, with a strong track record of transforming complex data into actionable insights. Skilled in developing real-time Power BI and Tableau dashboards that enhance stakeholder visibility and decision-making. Implemented large-scale machine learning models that improved forecasting accuracy by 30%. Adept at managing large-scale data pipelines, implementing data governance, and automating processes to improve data integrity and reporting efficiency. Experienced in cloud-based data storage and processing using AWS (S3, Redshift, Athena) and Azure, ensuring scalable and cost-effective data solutions.

EDUCATION

Master in Data Analytics George Mason University, Virginia Aug 2022 – May 2024

Coursework: Machine Learning Algorithms, Programming with Python and R Studio, Natural Language Processing, Spark

SKILLS

Languages: Python, SQL (JOIN, CTE, Window functions), R, SAS, Scala

Visualization Tools: Tableau (Calculated Fields, Dual-Axis Graphs, Sets, Dashboards, Stories), Power BI (DAX Functions, Power Query, Bookmarks), Excel (Pivot Tables, VLOOKUP, XLOOKUP)

IDEs: Jupyter Notebook, Databricks, PyCharm, Visual Studio

Packages: NumPy, Pandas, Matplotlib, Scikit-learn, TensorFlow, Seaborn, dplyr, ggplot2, SQLAlchemy, psycopg2

Machine Learning: Linear Regression, Logistic Regression, Decision Trees, Supervised Learning, Unsupervised Learning, Classification, SVM, Random Forests, Naive Bayes, KNN, k-means, CNN

Database: MySQL, SQL Server, NoSQL, Collibra, MongoDB

Data Warehousing: Snowflake, Redshift, Synapse, BigQuery

Cloud Platforms: AWS (S3, EC2, Lambda), Azure (Blob Storage, Data Factory)

Other Skills: Data Cleaning, Critical Thinking, Strong Communication, Presentation, and Problem-Solving

EXPERIENCE

Citigroup, VA Data Analyst Aug 2023 – Current

• Facilitated Agile Scrum practices, driving cross-functional collaboration between finance, data, and engineering teams to improve workflows, enhance efficiency, and accelerate delivery of actionable insights.

• Utilized Python and SQL to analyze and forecast credit risk and loan default rates, enabling data-driven customer segmentation and process improvements that reduced processing errors and enhanced operational efficiency.

• Automated financial data sourcing with Python, enhancing data integrity by 25% and enabling more accurate analysis of transaction histories and loan records for improved customer trend insights.

• Monitored KPIs such as loan approval rates and customer creditworthiness using Power BI dashboards, delivering actionable insights from trends across 50+ customer segments to drive strategic financial decisions.

• Performed advanced data extractions on transactional records across three key financial domains using SQL (window functions, CTEs), reducing report generation time by approximately five hours each week.

• Optimized data management by leveraging AWS services, including Amazon S3 for scalable storage and Redshift for data warehousing, ensuring secure, efficient, and cost-effective handling of sensitive customer financial information.

• Developed ETL pipelines in Databricks on Azure using PySpark, improving data extraction, transformation, and loading into a centralized data warehouse and enhancing analysis and reporting efficiency by 30% across Citi’s departments.

• Leveraged Python (scikit-learn) and SQL to identify demographic correlations with credit delinquency, informing targeted strategies that improved customer retention by 25%.

• Developed and enforced data governance protocols, including data classification and access controls, to enhance data security and streamline reporting processes, leading to a 15% reduction in data-related risks.

Capgemini, India Data Analyst Jan 2020 – July 2022

• Implemented structured project planning and task tracking in JIRA to unify collaboration across technical and business teams, consistently delivering financial analytics solutions with a 20% improvement in reporting accuracy.

• Developed SQL queries on Snowflake to retrieve and structure data from 10+ relational databases, enabling accurate decisioning and preserving data integrity for high-quality business reporting.

• Explored datasets with 1M+ records through EDA using Plotly and Seaborn to visualize correlations between income levels, age demographics, and credit scores, providing data insights that enhanced model accuracy by 15%.

• Developed dynamic Tableau dashboards and visualizations, turning complex data into actionable insights and empowering business leaders to make informed, data-driven decisions in real time.

• Developed and fine-tuned classification models with Scikit-learn (Logistic Regression, Decision Trees, Random Forest) to assess loan risk and forecast defaults, boosting prediction accuracy by 18%.

• Optimized feature extraction and transformation for large-scale financial datasets by building efficient data pipelines using Pandas and NumPy, leading to a 30% reduction in model training time and accelerating the overall modeling workflow.

• Utilized Apache Spark for real-time processing of large financial transaction datasets, facilitating immediate detection of high-risk loan applications and improving fraud detection accuracy by 20%.

• Conducted A/B testing on two loan approval algorithms, adjusting customer demographic and loan history features, which led to a 15% increase in approval accuracy and enhanced the overall risk model.

• Used advanced Excel functions such as VLOOKUP, INDEX-MATCH, and conditional formatting to build interactive financial dashboards, enabling real-time analysis of key performance indicators and enhancing decision-making efficiency.

• Enhanced data accuracy by 20% across 1M+ customer records by automating data validation, cleaning, and reconciliation using SQL and Python (Pandas, NumPy), ensuring compliance with data governance policies.

PROJECTS

Loan Approval Risk Analysis (SQL, Python, Power BI)

• Extracted and analyzed loan applicant data using SQL and Python (Pandas, NumPy) to identify key risk factors affecting loan approval outcomes.

• Analyzed correlations between applicant demographics and loan approval outcomes, uncovering significant patterns in income, credit score, and employment stability that influenced risk levels.

• Built interactive dashboards in Power BI to visualize approval trends, risk factors, and applicant demographics, enabling data-backed decision-making for financial institutions.


