Harivardhana Naga Naidu Polireddi
408-***-**** LinkedIn GitHub **************@*****.*** San Jose, CA 95112
Education
Master of Science in Data Analytics San Jose State University, United States of America Aug 2023 – Dec 2024
Coursework: Database Systems for Analytics, Data Visualization, Data Mining, Big Data, Probability and Statistics, Machine Learning, Deep Learning, Python Programming
Experience
Data Analysis and IT Intern Santa Clara Valley Water District San Jose, CA Sep 2024 – May 2025
• Developed a robust anomaly detection system using Python, Pandas, NumPy, and Scikit-learn to process 1.6M data points, boosting detection accuracy by 30% through advanced data cleansing, feature engineering, and Isolation Forest modeling.
• Conducted product research on employee in/out tools by analyzing 6+ competitors with SQL, Excel, and PowerBI, then developed AttendanceBot using Streamlit, Python, and RESTful APIs, with real-time status visualization and daily task updates, streamlining operations by 25%.
• Redeveloped the RESU database in Oracle APEX by resolving 20+ critical error points with SQL scripts, Python automation, and ETL processes, reducing discrepancies by 40% and filling data gaps through rigorous validation and transformation. Integrated a comprehensive user guide and coordinated cross-functional teams with agile methodologies to enhance data integrity by 95% and cut turnaround time by 50%.
Data Analysis and GIS Intern Santa Clara Valley Water District San Jose, CA May 2024 – Aug 2024
• Enhanced the RESU database by resolving 20+ critical errors using SQL, Python automation, and ETL processes, reducing discrepancies by 40% and boosting reliability with advanced anomaly detection and robust validation.
• Designed fee and easement maps with ArcGIS and AutoCAD, integrating spatial datasets into RESU to cut mapping errors by 25% and deliver dynamic, actionable visualizations through geospatial analysis.
• Implemented an automated daily update via Jupyter Notebook and scheduling tools, reducing manual intervention by 70% with Python scripting and continuous integration for real-time GIS data refreshes.
Financial Data Analyst XL Dynamics Pvt. Ltd Chennai, India Feb 2021 – Jul 2023
• Developed dashboards in Tableau and PowerBI to visualize financial trends and KPIs, improving mortgage funding decisions by 12.5%. Leveraged SQL and Excel for data preprocessing to cleanse and model large datasets for real-time insights.
• Automated data extraction using Python, OCR, and regex to integrate unstructured mortgage data into a CMS, reducing funding decision time by 4 hours. Optimized ingestion workflows with custom scripting and validation.
• Enhanced dashboards with advanced SQL, refined Excel models, and dynamic PowerBI visuals for mortgage tracking and demographic analysis, increasing profitability by 20%. Leveraged data aggregation, pivot analysis, and KPI monitoring to provide actionable insights for strategic planning.
• Performed EDA with Python and SQL to build data marts and define critical KPIs, achieving a 30% improvement in operations. Applied statistical analysis and R techniques to optimize workflows and drive efficiency.
Junior Analyst XL Dynamics Pvt. Ltd Chennai, India Oct 2020 – Feb 2021
• Automated mortgage data processing with Python and SQL, ensuring timely preparation of Closing Disclosure and Loan Estimate documents for US-based clients.
• Prepared and visualized financial data using Excel and PowerBI, ensuring 100% on-time delivery for critical documents through effective data wrangling and BI tool usage.
• Developed a comprehensive task checklist with advanced Excel formulas and pivot tables, collaborating with the funding review team to improve process yield by 25%.
Projects
Blockchain Anomaly Detection
• Developed an end-to-end anomaly detection system for Bitcoin transactions using unsupervised learning methods (Isolation Forest, CBLOF, PCA, K-means, and autoencoders) with TensorFlow, PyTorch, and Scikit-learn, achieving up to 87% accuracy in detecting fraudulent activities.
• Preprocessed and engineered features from the Bitcoin Transaction Network Metadata (2011-2013) using normalization and log transformation, resulting in a 40% increase in feature reliability. Evaluated models with precision, recall, F1 score, and ROC-AUC, achieving up to 84% ROC-AUC for autoencoder-based fraud detection.
Stack Overflow Analysis
• Extracted and processed 85.6GB of unstructured XML data using Python, Boto3, and EC2, applying data wrangling, transfers, and data quality checks for scalable ingestion.
• Built predictive models using clustering, decision trees, ensemble, and pattern recognition methods to analyze customer metadata and uncover hidden patterns.
• Designed an optimized data warehousing and visualization pipeline in AWS S3, Redshift, Glue, Tableau, and QuickSight, improving query execution time by 50% and reporting efficiency by 30%.
Technical Skills
Languages and Software: Python, SQL, Django, Streamlit, R, MATLAB, MS Azure, Git, JIRA, Docker
Databases: MySQL, Oracle, Google Cloud, MongoDB, Postgres, Snowflake, Redshift, HubSpot
Data Visualization: PowerBI, Tableau, Splunk, Matplotlib, Seaborn, Kibana, Beats, GIS, d3.js
Machine Learning: Pandas, NumPy, Scikit-learn, OpenCV, PyTorch, Keras, TensorFlow
Statistics and Modelling: Time Series Analysis, Regression, Clustering, Statistical Modelling, A/B Testing, Pattern Recognition