Data Analyst Scientist

Location:

Hyderabad, Telangana, India

Posted:

September 10, 2025

Resume:

AKHILESH AKOJU

USA 667-***-**** ****************@*****.*** LinkedIn

PROFESSIONAL SUMMARY

Data Analyst with 4+ years of experience developing design solutions, test scenarios, and end-to-end integrations in agile environments. Skilled in PL/SQL, Oracle Exadata, and Microsoft Office, consistently refining requirements and accurately estimating work to mitigate risks. Proven track record of translating technical details into actionable insights and collaborating cross-functionally to drive business impact.

TECHNICAL SKILLS

• Programming & Analytics: Python, SQL, R, MATLAB, Java, C

• Data Science: ML algorithms, Data mining, Data modeling, NLP, Data visualization, Data warehousing, Deep learning

• ML Libraries & Frameworks: Scikit-learn, PyTorch, TensorFlow, NumPy, Pandas, Transformers, Hugging Face, OpenCV

• Big Data & Cloud: Apache Spark, Hadoop, Databricks, AWS, GCP, Docker, Git, Kubernetes

• Data Engineering: ETL Pipelines, MySQL, PostgreSQL, MongoDB, OracleDB, BigQuery, Hive, Oracle Exadata

• Visualization & BI: Tableau, Power BI, Matplotlib, Seaborn, Streamlit

• Methodologies: A/B Testing, Agile Development, Causal Inference, Feature Engineering, Model Deployment

• Business & Soft Skills: Microsoft Office, Analytical Thinking, Innovative Thinking

PROFESSIONAL EXPERIENCE

Ontel Products Corporation. Feb 2024 - Present

Data Scientist New Jersey, USA

• Automated real-time data pipelines using Python, Apache Spark, and SQL, improving project tracking accuracy by 99% across telecom deployments. Demonstrated design solutions development and integration of robust, well-tested interfaces.

• Built anomaly detection models with Scikit-learn to identify QA issues in PIM/fiber test data, reducing rework by 30% while contributing to story refinement and clear definition of requirements.

• Developed interactive dashboards in Power BI and Seaborn to visualize crew check-ins, equipment usage, and site performance, supporting effective test scenario creation and validation.

• Created predictive ML models (Random Forest, Logistic Regression) to forecast site delays with 90%+ accuracy, enhancing work estimation and mitigating risk through proof of concept initiatives.

• Collaborated with DevOps, field engineers, and QA analysts to deliver data-driven reporting tools in an Agile environment, ensuring successful integration into overall applications and systems.

Accenture Plc. Jan 2022 - Jan 2023

Data Scientist 1 Hyderabad, India

• Engineered SQL-Python ETL pipelines to process over 1M daily records, reducing data retrieval time by 30% and eliminating 20% of quality issues via automated checks, leveraging strong PL/SQL skills.

• Built and deployed XGBoost and Random Forest models for cross-sell and profitability prediction, resulting in $500K cost savings and a 15% revenue uplift while contributing to accurate work estimation.

• Created real-time Tableau dashboards and A/B testing frameworks to support data-driven decisions, increasing customer satisfaction by 25% across five teams and validating design solutions.

• Developed Java microservices and HTML/CSS interfaces for scalable claim processing, integrated with MySQL for backend reporting and workflow management, and participated in Agile story refinement sessions.

• Partnered with QA, DevOps, and product teams to deliver end-to-end services using Git, CI/CD, and Agile methodology in client-facing projects, illustrating an end-to-end view and clear communication across technical and non-technical audiences.

• Conducted data analysis and feature engineering on structured and unstructured datasets to support predictive modeling, improving model interpretability and aligning with comprehensive test scenario development.

Virtu Tech Solutions Feb 2020 - Dec 2021

Associate Data Scientist Hyderabad, India

• Deployed a fraud detection system achieving 93% accuracy on a 50K+ transaction dataset by applying advanced feature engineering (PCA, SHAP) and ensemble methods (LightGBM), demonstrating strong ML model development skills.

• Led a 3-person team through the complete project lifecycle including data preprocessing, model selection, validation, and deployment planning, reinforcing collaborative and independent problem-solving capabilities.

• Implemented explainable AI techniques to provide stakeholders with interpretable model insights for regulatory compliance.

• Optimized model latency by 35% through quantization and ONNX runtime integration, enabling real-time fraud detection for high-frequency transactions.

• Reduced false positives by 20% by refining anomaly detection thresholds using ROC curve analysis, thereby enhancing operational efficiency for the fraud investigation teams.

• Designed an automated retraining pipeline using Airflow and MLflow to refresh models bi-weekly, maintaining accuracy drift below 2% over 12 months and exemplifying strong MLOps practices.

• Collaborated with DevOps to containerize models with Docker, reducing deployment time by 50% and scaling operations to handle a five-fold increase in transaction volume.

EDUCATION

University of Maryland, Baltimore County Dec 2024

Master of Science, Data Science

• GPA: 3.93/4.0

• Achievements: Dean's List recognition for maintaining 3.9+ GPA throughout graduate studies

• Coursework: Advanced Machine Learning, Mathematics for ML, Big Data Technologies, Statistical Inference

Sreenidhi Institute of Science & Technology Aug 2021

Bachelor of Technology, Electronics & Communication Engineering

• GPA: 3.8/4.0

• Coursework: Statistics & Probability, Linear Algebra, Signal Processing, Programming Fundamentals

KEY PROJECTS

AgriSmart: Intelligent Crop Recommendation System Sep 2024

• Achieved 98% prediction accuracy using ensemble modeling (Random Forest, SVM) and k-nearest neighbors (K-NN) on 10K+ soil/weather data points.

• Developed farmer-friendly interface with Streamlit and Power BI dashboards providing top 5 crop recommendations and yield optimization insights.

• Integrated real-time weather APIs to provide dynamic recommendations based on current environmental conditions.

FinSight: Predictive Stock Market Analytics Engine May 2024

• Generated 22% profitability improvement for investment analysis using LSTM neural networks on 2-year historical market data (500K+ records).

• Built scalable data pipeline using Hadoop/PySpark on Databricks, processing real-time market feeds through financial APIs.

• Deployed production system on AWS with automated retraining capabilities and interactive Tableau dashboards for portfolio managers.

CO2 Emission Forecasting System Mar 2024

• Developed PyTorch regression model achieving 95% accuracy in predicting vehicle emissions using multivariate analysis of 15+ vehicle parameters.

• Analyzed 25K+ vehicle records to identify key emission correlates and generate regulatory compliance reports for automotive industry standards.

• Created automated reporting pipeline for continuous emissions monitoring and trend analysis.

NanoGPT Domain Fine-tuning Jan 2024

• Fine-tuned GPT-2 model on 50K+ domain-specific text samples, improving output relevance by 40% using custom training datasets.

• Implemented efficient training pipeline with gradient accumulation and learning rate scheduling, reducing training time by 25%.

• Optimized model performance through hyperparameter tuning and regularization techniques.

CERTIFICATIONS & PUBLICATIONS

• Machine Learning with Python: Internshala (2021)
