Praveen Allu
Newark, California *****
847-***-**** | **************@*****.*** | praveen-allu
Education
USC Viterbi School of Engineering Jan 2022 – Dec 2023
Master of Science in Applied Data Science Los Angeles, CA
Manipal Institute of Higher Education Aug 2016 – Aug 2020
Bachelor of Technology in Computer Science and Engineering Manipal, India
Technical Skills
Programming/Scripting: Python (NumPy, Pandas, Scikit-Learn), R, SQL, Shell
Frameworks/Tools: PyTorch, TensorFlow, Keras, Flask, Power BI, Docker, AWS, Git
Data Science: Data Analysis, Machine Learning, Deep Learning, NLP, Computer Vision
Visualization: Matplotlib, Seaborn, Plotly
Experience
Software Concepts LLC Feb 2024 – Present
Data Analytics Intern California
• Developed high-impact Power BI dashboards to track wind farm performance and energy production, enabling data-driven decision-making for stakeholders.
• Built and deployed medium-term forecasting pipelines using GRUs and Graph Neural Networks to account for environmental factors and turbine interactions.
• Enhanced model robustness with Empirical Mode Decomposition (EMD) for noise reduction, achieving an MAE of 0.031 and an RMSE of 0.045.
• Utilized TensorFlow, Kafka, Docker, and AWS for scalable model training and real-time data ingestion.
Samsung R&D Institute Jan 2020 – Aug 2020
Data Science Intern Bangalore, India
• Implemented on-device document classification and topic modeling using LDA2VEC, integrated into the Samsung My Files application.
• Optimized memory and inference speed for on-device usage, achieving 92% accuracy on primary topics and 89% on sub-topics.
• Led data preprocessing and feature engineering efforts, collaborating with cross-functional teams to ensure robust NLU services.
• Key Technologies: Python (LDA, NMF), PySpark, Anaconda, Jupyter
Projects
Smart Summarizer for Long Documents Node.js, React.js, MongoDB, NLTK, LangChain Spring 2024
• Developed a scalable full-stack application to summarize PDFs (1000+ pages) by combining LDA for topic extraction and GPT-3.5 for summary generation.
• Created a responsive React.js interface for seamless display and navigation of multi-level summaries.
• Deployed Node.js microservices to handle file uploads, API calls, and data processing, reducing latency and operational costs.
• Improved user engagement and reduced manual reading time with concise, high-level document insights.
Air Quality and Traffic Monitoring (AQI) Dashboard Vue.js, D3.js, GeoJSON, Power BI Fall 2023
• Created a real-time AQI dashboard in Vue.js with Power BI integration, enabling interactive exploration of air quality trends.
• Leveraged D3.js for custom visualizations and drill-down analytics, revealing a strong correlation (90%+) between traffic data and AQI.
• Provided actionable insights for city planners by correlating pollution spikes with traffic congestion patterns and other environmental metrics.
Handling Imbalanced Parkinson’s Disease Data with Deep Learning Spring 2023
• Addressed severe class imbalance by implementing novel data augmentation techniques and advanced deep neural networks (stacked LSTM, TapNet, Transformers).
• Achieved a classification accuracy of 84.9% and F1 score of 85.3%, surpassing standard baselines for Parkinson’s Disease detection.
• Optimized model inference time and memory footprint, proving feasibility for larger clinical datasets.