Name: Parveen
Email: *********.****@*****.***
Contact: 503-***-****
LinkedIn: linkedin.com/in/parveen-a-73818432
Senior Data Scientist/AIML Engineer
SUMMARY
●Over 10 years of experience in data analytics and operations leadership.
●Expertise in driving data-driven strategies within complex business environments.
●Proven ability to lead end-to-end analytics projects from concept to completion.
●Management of cross-functional teams to achieve strategic business goals.
●Skilled in leveraging data to influence strategic decisions across organizations.
●Adept at overseeing data products and prioritizing development initiatives.
●Track record of identifying opportunities and solving business problems using analytical approaches.
●Proficient in SQL, Python, and R for data analysis and modeling.
●Expertise in data visualization tools like Tableau and Power BI.
●Understanding of Data Warehousing and data mining.
●Develop and implement advanced algorithms for generating synthetic data to simulate real-world IoT scenarios.
●Strong experience in image recognition and Big Data technologies such as Spark 1.6, Spark SQL, PySpark, Hadoop 2.x, HDFS, and Hive.
●Fine-tune AI/ML models using synthetic and real IoT data to enhance performance, accuracy, and scalability.
●Leverage vector databases for efficient storage, retrieval, and management of high-dimensional IoT data.
●Demonstrated experience in writing ad-hoc SQL statements.
●Proven experience with ERD/Data Modelling tools (Toad or others).
●Strong background in Natural Language Processing (NLP).
●Experience with cloud platforms like Snowflake and Azure for data storage and processing.
●Familiarity with AI tools such as ChatGPT, Claude AI, and Perplexity AI.
●Understanding of modeling strategies (dimensional, snowflake, relational, unstructured).
●Extensive experience in Machine Learning and predictive modeling.
●Experienced in developing AI chatbots to interact with customers to resolve outstanding issues.
●Expertise in implementing machine learning models to analyze effectiveness of marketing campaigns on revenue.
●Proficient in ETL processes and data engineering.
●Expert at developing revenue prediction models using machine learning algorithms and feature engineering.
●Skilled in customer segmentation and sales forecasting.
●Experienced in serving as a technical data steward across business and technology partners to understand and document data appropriately.
●Apply innovative AI solutions to improve cloud-based IoT platforms, optimizing system performance and reliability.
●Experience designing and delivering data mapping specifications for large reporting platforms. Writing and maintaining business rules using SQL logic.
●Strong knowledge of Agile methodologies including Scrum and Kanban.
●Project and product management experience in data-driven environments.
●Proficiency in Salesforce and JIRA for project tracking and CRM.
●Expertise in data storytelling to communicate insights effectively.
●Strong statistical knowledge for data analysis and interpretation.
●Experience with AWS and Google Cloud Platform (GCP) for cloud computing.
●Proficient in Excel and VBA for advanced data manipulation and automation.
●Skilled in Alteryx for data blending and advanced analytics.
●Experience with Cursor AI and Replit for AI development and deployment.
●Proactive leader with a proven ability to innovate and drive operational excellence.
●Expertise in A/B testing and feature experimentation to improve product features.
●Skilled at tuning model hyperparameters to prepare machine learning models for deployment.
TECHNICAL SKILLS
Tableau: Data visualization and dashboard creation.
Power BI: Business intelligence and data visualization.
NLP (Natural Language Processing): Text analysis and language processing.
Programming and Technologies: R, Python, SQL
Statistical Skills: Regression, Decision Trees, Time Series Forecasting
Data Analysis and Visualization: Tableau, Power BI
Data Warehousing and Modeling: MySQL, Hadoop, Spark
ChatGPT: Conversational AI and chatbot development.
AI: Machine learning, deep learning, and model deployment.
Project Management: Agile methodologies, planning, and execution.
Cloud Platforms: AWS, Azure, GCP
Version Control and Collaboration Tools: GitHub
Advanced Excel Skills: Pivot Tables
SQL: Database management, querying, and data manipulation.
Python: Scripting, automation, data analysis, and machine learning.
R: Statistical analysis and data visualization.
Snowflake: Cloud data warehousing and analytics.
Azure: Cloud computing and machine learning services.
EDUCATION
Master of Science in Analytics - Statistics, Predictive Modelling, and Big Data, Northeastern University, Boston, MA
Bachelor of Engineering in Computer Science and Engineering, RV College of Engineering (VTU), Bangalore, India
Certifications:
AWS, Azure, and Tableau certified
PROFESSIONAL EXPERIENCE
Novelis, GA Oct 2021 – Present
Senior Data Scientist/AIML Engineer
Responsibilities:
●Led the development of a predictive maintenance model using Neural Networks and Azure, achieving a 10% reduction in downtime and a 15% decrease in maintenance costs.
●Develop and implement advanced algorithms for generating synthetic data to simulate real-world IoT scenarios.
●Created and fine-tuned customer segmentation models using Large Language Modeling techniques, achieving a 15% increase in marketing campaign ROI and a 12% improvement in customer retention rates.
●Develop innovative solutions for natural language processing and generative modeling tasks using NLP, Generative AI, and LLMs.
●Strong technical skills in data engineering, machine learning, and cloud computing.
●Managed the end-to-end model lifecycle, from feature extraction through monitoring of deployed machine learning models, using modern tools and technologies.
●Worked across Generative AI, deep learning, machine learning, and LLMOps.
●Applied predictive analytics techniques including regression analysis and machine learning.
●Maintain and improve data pipeline using Machine Learning infrastructure and existing Machine Learning models.
●Strong hands-on Python expertise.
●Design and deploy scalable data pipelines and machine learning models on cloud platforms.
●Six years of experience with Python and Scala, with working knowledge of notebook environments.
●Built and scaled Generative AI applications using frameworks such as LangChain, pgvector, Pinecone, and Azure ML.
●Hands-on experience with Generative AI and LLMs.
●Experienced in software development of Python applications.
●Developed and deployed an AI-driven chatbot in Python, working cross-functionally with data science and product teams to resolve customer issues efficiently, achieving a 23% reduction in average response time.
●Experience designing and delivering data mapping specifications for large reporting platforms. Writing and maintaining business rules using SQL logic.
●Extracted data from HDFS using Hive and Presto, performed data analysis using Spark with Scala, PySpark, and Redshift, carried out feature selection, and created nonparametric models in Spark.
●Apply innovative AI solutions to improve cloud-based IoT platforms, optimizing system performance and reliability.
●Led a team to consolidate global procurement data from 10,000 vendors and 15 ERP systems into a unified reporting solution, ensuring accuracy and generating valuable insights.
●Applied machine learning and NLP techniques within the Azure cloud environment, utilizing Python, PySpark, and SQL for data preprocessing, feature engineering, and model deployment, leading to the successful implementation and deployment of ML/NLP projects aimed at enhancing customer data.
●Demonstrated experience in writing ad-hoc SQL statements.
●Led a project to optimize vendor selection using inferential statistics and clustering algorithms, resulting in over $25 million in procurement cost savings for Novelis.
●Served as a technical data steward across business and technology partners to understand and document data appropriately.
●Performed data cleaning and feature selection using the MLlib package in PySpark, and worked with deep learning frameworks such as Caffe and Keras.
●Proven experience with ERD/Data Modelling tools (Toad or others).
●Utilized natural language processing (NLP) techniques to analyze customer reviews and feedback, providing actionable insights that improved product development and customer satisfaction by 20%.
●Increased product adoption by 15% by implementing A/B testing on new features during product launches, providing data-driven insights that improved feature rollout strategies and user engagement.
●Utilize AI frameworks and cloud services to create scalable and intelligent IoT systems.
●Stay current with emerging trends in AI, IoT, and cloud technologies, incorporating new tools and methodologies as appropriate.
●Drive the development of smart IoT ecosystems through the integration of AI-driven decision-making processes.
●Crafted SQL queries in Snowflake and deployed them in Alteryx to set up ETL/ELT processes for daily transactional data, reducing data processing time by 30% and enhancing data availability for real-time analytics.
●Enhanced model accuracy by 20% through iterative improvements and hyperparameter tuning of XGBoost models, resulting in a 25% increase in predictive accuracy for customer churn management.
●Performed data fixes for data quality issues and ad-hoc data update requests per business needs.
●Experienced with provisioning and configuration management tools such as Puppet, Rundeck, and Kickstart.
●Worked on Spring frameworks (Spring IoC, Spring Boot, Spring Cloud) and third-party libraries.
●Worked extensively with Helm chart creation, PowerShell, Chef, Python, Groovy, SOC, GCP, C#, Puppet, Salt, and Ansible in production environments.
●Implemented Atlassian Stash along with Git to host central repositories for source code across products, facilitating code reviews and login audits for security compliance.
●Wrote SQL queries on Oracle DB to generate reports per business requirements.
Environment: Python, Neural Networks, XGBoost, Prophet, PySpark, Azure, Machine Learning, Power BI, Tableau, Large Language Models, ETL/ELT, Customer Insights, Marketing Mix Modeling, Alteryx, Snowflake
Acuity Eye Group, MA Feb 2019 – Oct 2021
Senior Data Scientist/AIML Engineer
Responsibilities:
●Developed and deployed a customer classification model using Neural Networks, which improved classification accuracy by 15% and resulted in a 20% increase in targeted marketing efficiency.
●Engineered and deployed scalable machine learning solutions on AWS, which improved data throughput by 35% and reduced latency by 25%, contributing to enhanced real-time analytics.
●Developed a revenue forecasting model in Python, leveraging advanced statistical techniques and machine learning algorithms to predict future revenue trends with 95% accuracy, informing business planning and resource allocation.
●Programming experience in Python, R, Java, and C#.
●Strong understanding of and experience with NLP, Generative AI, and LLMs.
●Served as a key stakeholder in designing and architecting end-to-end ML solutions as an ML engineering specialist.
●Proficient in Python for data analysis and visualization, with experience working with APIs, Linux, databases, big data technologies, and cloud services.
●Develop and implement advanced algorithms for generating synthetic data to simulate real-world IoT scenarios.
●Developed smart IoT ecosystems through the integration of AI-driven decision-making processes.
●Developed and optimized machine learning pipelines with Python and Azure Machine Learning, leading to a 30% improvement in model training speed and a 20% reduction in data preprocessing time.
●Monitored and improved model efficiency by integrating advanced performance metrics and optimization techniques, resulting in a 25% increase in model prediction speed and a 20% decrease in computational resource usage.
●Engineered end-to-end data pipelines and feature engineering processes, utilizing SQL and Apache Spark, which streamlined data processing and enhanced model performance by 30%.
●Served as a technical data steward across business and technology partners to understand and document data appropriately.
●Worked with Sales Operations teams to define sales strategies and cross-sell/upsell solutions based on market research and historical customer purchase trends, resulting in 5% revenue growth year-over-year.
●Spearheaded the integration of AWS and Snowflake data warehouses into a combined Snowflake instance, collaborating with the Operations and Data Engineering teams on data modeling design considerations.
●Developed a sales pipeline dashboard in Power BI, visualizing key metrics such as pipeline velocity and deal stage progression, enabling sales teams to prioritize high-value opportunities and improve sales conversion rates by 20%.
●Developed and maintained financial models using Azure SQL and Cosmos DB, enhancing credit risk assessment and loan decision-making processes.
●Implemented continuous integration and deployment pipelines using Azure DevOps, Jenkins, and Git, ensuring efficient code management and deployment.
●Utilized Azure Service Bus and Event Hub to facilitate seamless data integration and real-time data streaming across banking applications.
●Configured and managed Azure VMs and Azure Function Apps to support scalable, serverless computing for complex data processing tasks.
●Employed Azure Key Vault to secure sensitive data, ensuring compliance with financial regulations and data protection standards.
●Performed predictive analytics and data mining with Python to identify trends and insights in customer spending and payment behaviors.
●Managed database upgrades and migrations using Azure SQL Managed Instance (Azure SQL MI), optimizing performance and scalability.
●Automated infrastructure configuration and provisioning using Ansible and shell scripting within a Red Hat Linux environment.
●Developed secure access controls and network configurations using Azure AD and SSH, enhancing system security and data integrity.
●Created dynamic dashboards and reports in Power BI connected to Azure SQL databases, providing real-time insights into financial metrics.
●Integrated multiple data sources into a centralized data lake using Azure Data Lake Storage, enabling advanced analytics and data discovery.
●Conducted advanced statistical analyses to support marketing strategies and customer segmentation, employing Azure Machine Learning Service for model training and evaluation.
Environment: Python, Neural Networks, Prophet, Azure, Machine Learning, PySpark, Power BI, Tableau, AWS, Classification Models, Azure SQL, Azure SQL MI, SSH, YAML, WebLogic
Allstate India Pvt Ltd, Bengaluru Nov 2016 – Aug 2017
Senior Data Scientist
Responsibilities:
●Optimized data structure efficiency by incorporating dictionary data structures in Python, which decreased data processing time by 40% and improved overall model efficiency.
●Leveraged GPU acceleration for model training using TensorFlow, reducing training times by 50% and enabling faster experimentation and iteration on complex Neural Networks.
●Implemented GPU-based parallel processing techniques to optimize large-scale data computations, resulting in a 60% improvement in model training efficiency and a 45% reduction in time-to-insight for real-time analytics.
●Designed and implemented a recommendation system using advanced Neural Networks for an Ecommerce client, which increased user engagement by 18% and boosted revenue by 22% through personalized recommendations.
●Design and maintain databases using Python and developed Python based API (RESTful Webservice) using Flask, SQL Alchemy and PostgreSQL.
●Conducted A/B testing and performance evaluation of machine learning models, using metrics like precision, recall, and F1-score, which improved model robustness and reliability by 18%.
●Built a revenue dashboard for a SaaS B2B client in Power BI that tracked sales and pipeline velocity while analyzing large and complex data sets, resulting in a 4% reduction in average lead conversion time YoY.
●Championed rigorous code review and quality assurance within the development team, enhancing machine learning model quality by resolving over 30% of potential issues pre-deployment and thereby accelerating project timelines.
●Created a Microsoft SharePoint repository that stored a portfolio of data products (dashboards, advanced SQL queries, metrics), identifying gaps and prioritizing future development.
●Developed PySpark framework to process JSON data to data files & load them in Redshift.
●Wrote UNIX shell scripting for automation.
●Responsible for debugging the project monitored on JIRA (Agile).
●CI/CD experience including Jenkins and migration to GitLab.
ENVIRONMENT: Neural Networks, Python, Machine Learning, Pandas, SQL, Snowflake, Prophet, XGBoost, Data Structures, PyTorch, TensorFlow, GPU Acceleration, Parallel Processing, SharePoint, SaaS B2B, Tableau, Power BI, Data Pipelines, Sales Pipeline Design, PostgreSQL, Flask, JSON, Redshift, UNIX Scripting, Jira, GitHub, CI/CD
Tech Mahindra Pvt Ltd, Bengaluru Mar 2015 – Nov 2016
Data Scientist
Responsibilities:
●Designed Business Intelligence Tableau dashboards for a CPG client, which were adopted by 5 departments, driving a 15% increase in sales by optimizing product performance insights across various markets and consumer segments.
●Designed scalable data pipelines using advanced SQL techniques for a software company, enhancing data extraction and analysis capabilities and enabling 50% larger dataset processing and 40% faster processing.
●Collaborated with cross-functional teams to integrate machine learning solutions into production systems, leading to a 20% boost in operational efficiency and a 10% increase in customer satisfaction through improved personalization features.
●Developed and deployed machine learning models for predictive analytics using Python and scikit-learn, leading to a 20% improvement in forecast accuracy and a 15% increase in actionable insights for business decision-making.
●Developed and deployed predictive models for inventory management and demand forecasting using time-series analysis and machine learning algorithms, leading to a 20% reduction in stockouts and a 15% decrease in excess inventory costs.
●Implemented customer segmentation and personalized recommendation systems using clustering algorithms and collaborative filtering, which increased customer engagement by 25% and boosted sales by 18%.
●Engineered and optimized machine learning pipelines for real-time pricing and dynamic promotions, resulting in a 10% increase in conversion rates and a 12% uplift in average order value.
Environment: Python, Tableau, Clustering Algorithms, Scikit-Learn, Inventory Management, Demand Forecasting, Time-Series Analysis
Cognizant Technology Solutions, Bengaluru Sep 2010 – Apr 2014
Data Analyst
Responsibilities:
●Developed sophisticated queries in SQL Server and Oracle 9i to extract and analyze data, significantly improving business decision-making.
●Created automated reporting systems with Business Objects, delivering timely and precise business insights to stakeholders.
●Managed extensive data migrations and integrations using ER Studio and XML technologies, ensuring consistency and precision in data handling.
●Utilized HDFS and MapReduce for processing large data sets, enabling the identification of significant patterns and insights.
●Developed predictive models in Spark and RStudio to project market trends and support strategic business planning.
●Configured and maintained Hive databases, facilitating advanced data analysis and business intelligence initiatives.
●Automated data quality assurance processes using Informatica, upholding rigorous data integrity and reliability standards.
●Utilized Pig scripts for data transformation and enrichment, optimizing the data preparation phase for analytical work.
●Managed data security and compliance, applying best practices in AWS and database management to safeguard information.
●Conducted thorough data audits and performance optimizations on Teradata 14.1 systems, enhancing overall system efficiency and resource management.
●Created dynamic data visualizations with RStudio and Business Objects, facilitating easy interpretation of complex data sets.
●Designed ETL processes integrating Informatica and Hadoop (HDFS), supporting timely data integration and real-time reporting.
●Continuously learned and applied the latest advancements in AI/ML, including attending conferences, participating in research, and implementing cutting-edge techniques in projects.
Environment: Redshift, Power Designer, PL/SQL, Oracle 11g, Python, Navigator, UNIX, Microsoft Access, Teradata, R Scripting, Netezza, Teradata SQL Assistant, Microsoft Visio, IBM DB2, SQL Server 2008, Tableau, MS PowerPoint, Microsoft Excel