MIHIRA GUDIMETLA
Data Science Analyst
USA 734-***-**** *********@*****.*** linkedin.com/in/mihirag2k github.com/Mihira20
PROFESSIONAL PROFILE
• Results-driven Data Scientist with 2+ years of experience developing and deploying machine learning models, statistical analysis, and data visualization solutions. Proficient in Python, Scikit-learn, TensorFlow, and SQL for building end-to-end AI workflows.
• Proven ability to translate ambiguous business problems into scalable data science solutions. Strong communicator who thrives in fast-paced, unstructured environments. Adept at stakeholder collaboration, prototyping, and operationalizing AI products in real-world settings.
• Demonstrates deep expertise in Python for complex data analysis, including writing efficient, scalable code, as well as SQL for effective database management and querying.
• Skilled in leveraging Python libraries such as NumPy and Pandas for advanced numerical computing and data manipulation.
• Experienced in using Matplotlib for intricate data visualization tasks.
• Proficient in various database management systems including MySQL, SQL Server, and MongoDB, ensuring optimal data storage and retrieval.
• Skilled in data cleaning, wrangling, mining, and statistical analysis, applying critical thinking for accurate and insightful data interpretation.
• Experienced in applying algorithms such as Random Forest, SVM, and k-means clustering for predictive modeling and pattern recognition.
• Skilled in creating insightful data visualizations and dashboards using Tableau and Power BI.
• Proficient in AWS for data storage, processing, and analysis, utilizing services like EC2, S3, RDS, and Lambda for efficient cloud-based operations.
• Experienced with GCP services such as Virtual Machines (VMs), Google Kubernetes Engine (GKE), Virtual Private Cloud (VPC), Identity and Access Management (IAM), and Key Management Service (KMS).
• Knowledgeable in statistical analysis tools such as SPSS and SAS, as well as machine learning frameworks like TensorFlow and Scikit-learn for predictive analytics and data modeling.
• Proficient in designing and implementing ETL (Extract, Transform, and Load) processes using tools like SSIS (SQL Server Integration Services) for efficient data integration and transformation.
• Familiar with Agile, SDLC, and Waterfall methodologies for project management; proficient in version control using Git, with strong skills in MS Office for documentation and productivity.
• Managed multiple concurrent analytics projects under tight deadlines, demonstrating strong time management and organizational skills. Demonstrated responsiveness and ownership when resolving critical issues in real time.
TECHNICAL SKILLS
• Languages: Python, SQL, R
• Databases: MySQL, NoSQL, Oracle, MongoDB
• Data Visualization: Tableau, Power BI, MS Excel (Advanced), Looker, Snowflake (familiar)
• Cloud Technologies: AWS (EC2, S3, Lambda), MS Azure, Google Cloud (VMs, GKE, VPC, IAM, KMS)
• Python Libraries: Pandas, Matplotlib, NumPy, Seaborn
• Version Control Tools: Git, GitHub
• Methodologies: SDLC, Agile, Waterfall
• ML Algorithms: Linear Regression, Random Forest, Support Vector Machines, k-means clustering
• ML Frameworks and Statistical Methods: TensorFlow, Scikit-learn, PyTorch, ANOVA, Multivariate Regression
• Analytical Skills: Data Wrangling, Data Preprocessing, Data Profiling, Data Mining, Data Analysis, Data Visualization
EDUCATION
Illinois Institute of Technology IL, USA
Master's in Computer Science Jan 2023 – Dec 2024
Jawaharlal Nehru Technological University Hyderabad, India
Bachelor of Engineering (B.E.) in Computer Science & Engineering Jun 2018 – Apr 2022
PROFESSIONAL EXPERIENCE
Data Science Analyst Mar 2025 – Present
Molina Healthcare, VA
• Led data science initiatives focused on patient readmission trends and claims optimization using statistical modeling, enabling predictive insights and improving care workflows.
• Developed and optimized ETL pipelines in GCP Dataflow, integrating claims, EHR, and pharmacy datasets across systems to support real-time analytics and reporting on care quality metrics.
• Managed streaming data pipelines using Spark Streaming to monitor patient discharge summaries, lab results, and follow-up compliance, allowing clinical teams to take timely action.
• Executed advanced SQL queries on Google Cloud SQL to extract patient history and provider performance data, facilitating deep dives into healthcare utilization patterns and cost drivers.
• Utilized MongoDB to manage semi-structured patient survey data and created aggregation pipelines for behavioral pattern analysis related to preventive health programs.
• Developed and deployed interactive Tableau dashboards and Python-based visualizations, providing real-time feedback loops for 5+ operational teams and enabling business teams to track KPIs such as HEDIS scores, appointment no-shows, and chronic disease trends, improving outreach efficiency by 25%.
• Built and deployed ML models using Random Forest and SVM to identify high-risk patients; integrated them into production workflows using GCP services.
• Employed Google Cloud Storage and Compute Engine to manage large-scale model training tasks and securely store longitudinal health data used for population health analytics.
• Facilitated Agile-based sprint planning and reviews with cross-functional teams, translating business objectives into analytics deliverables for reporting and strategic planning.
Data Analyst Intern Mar 2025 – Apr 2025
Changing The Present, VA
• Collaborated with program leads to analyze 10K+ donor records and campaign data using Python, resulting in a 15% improvement in donor retention.
• Cleaned and organized donation records and volunteer activity logs to support accurate reporting and grant proposals, ensuring data consistency and integrity across multiple sources.
• Contributed to dashboard creation using Excel and Power BI, providing visual insights into fundraising performance and donor retention patterns for board presentations.
Data Analyst Sep 2021 – Dec 2022
Trigent Software, India
• Supported enterprise software clients by developing custom analytics modules integrated within SaaS applications, enhancing user dashboards and reporting capabilities across finance and operations domains.
• Created automated ETL workflows using SSIS and SQL, enabling seamless integration between client CRM systems and backend data lakes, streamlining data access for reporting layers.
• Conducted extensive data wrangling and transformation using Python (Pandas, NumPy), converting raw system logs and transactional data into clean, analysis-ready formats for performance diagnostics.
• Designed and deployed Power BI dashboards to monitor product KPIs (usage, SLA, incident trends), reducing support time by 20% and increasing visibility across engineering and operations teams.
• Built automated ETL workflows (SSIS, SQL) to power reporting dashboards, streamlining monthly financial reconciliation and client invoicing by 30%.
• Utilized Matplotlib and Seaborn to produce high-fidelity visualizations for internal stakeholder reports, guiding enhancements in UX and system performance tuning.
• Collaborated with product and QA teams during sprint cycles to validate data consistency, support user acceptance testing (UAT), and document data logic within feature release notes.
• Integrated AWS S3 storage for log archival and leveraged AWS Lambda for processing high-volume telemetry data from client-facing applications, enhancing traceability and uptime diagnostics.
• Conducted root cause analysis (RCA) on reporting discrepancies by tracing pipeline failures, SQL logic errors, and schema mismatches across development and production environments.
• Documented end-to-end data pipelines, dashboard logic, and API integrations, enabling smooth handovers between engineering, analytics, and DevOps teams.
Data Analyst Mar 2021 – Aug 2021
Cipla, India
• Collaborated with cross-functional teams to analyze pharmaceutical CRM and sales data using Python (Pandas, NumPy) and SQL, identifying trends in HCP engagement and improving customer satisfaction metrics through data-backed campaign strategies.
• Conducted in-depth statistical analysis (ANOVA, multivariate regression) to identify productivity drivers, resulting in a 15% improvement in workforce planning and allocation.
• Created insightful visualizations using Matplotlib and Seaborn, highlighting relationships between product usage metrics and customer behavior across therapeutic categories.
• Performed data wrangling on commercial datasets ranging from 4,000 to 25,000 records, ensuring high-quality inputs for downstream analysis in regulatory and compliance reporting.
• Supported fraud risk modeling by analyzing historical transaction and order data, helping establish preventive backstops and mitigation strategies against potential anomalies in distribution channels.
• Assisted in integrating multiple internal data sources to create consolidated reports for the Sales & Marketing Analytics team, improving visibility into KPI performance for branded generic lines.
• Applied Agile principles to track project milestones and deliverables, working in sprints to prioritize tasks and ensure on-time delivery of analytics artifacts.
PROJECTS
Uber Trip Data Engineering Pipeline GCP, Mage-AI, Looker
• Built ETL pipelines to process trip and fare data into BigQuery; visualized ride trends and cost patterns using Looker.
GitHub Repository Analysis and Forecasting TensorFlow, Prophet, Statsmodels
• Built and deployed time-series forecasting models using Prophet and TensorFlow to analyze GitHub repository activity; improved model accuracy by 30% through iterative tuning.
NLP-Driven Stock Price Prediction Python, NLP, Power BI
• Built an NLP model with 72% accuracy analyzing 6,000+ financial articles; designed a Power BI dashboard for real-time insights.
Drug-Target Interaction Search System BERT, RAG, TF-IDF
• Designed real-time visualization tool for CTA ridership metrics using PySpark and Flask caching to optimize performance.