Poonam Gillurkar
Data Manager, Provenance & Lineage
Python, SQL, Plotly Dash, NetworkX
I help companies track, visualize, and manage their data pipelines
Plano, TX 75075
***************@*****.***
• Detail-oriented data professional with a background in Computer Science and 3 years of industry experience.
• Strong problem-solving skills with a record of designing, developing, and launching dashboard applications.
• Motivated self-starter and fast learner, comfortable with a steep learning curve.
• Developed and implemented the Data Lineage logic and corresponding graphical visualization on a dashboard using the Plotly Dash framework and NetworkX.
• Followed agile software development practices: pair programming, test-driven development, and scrum status meetings.
• Effectively interacted with team members and business users from different regions for requirements capture and analysis.
• Prepared graphical unit test cases and reviewed the test results.
• Automated the testing process.
• Experience in data science technologies geared towards NLP.
Willing to relocate: Anywhere
Authorized to work in the US for any employer
Work Experience
Data Manager
Bayer
September 2021 to Present
• Contributed to the development of the Data Provenance tracking Python package under the Governance subteam of the company’s Data Quality team.
• Complied with the data quality standards, cleansing techniques, and data security guidelines.
• Used internal packages to interact with the various Data Stores to capture the metadata behind the organization’s Data Pipelines in a JSON format.
• Programmed logic to determine the Data Lineage of a dataset and visualize it on a dashboard as a network graph using NetworkX and Pyvis.
• Built an interactive dashboard for the package using Plotly's Dash framework.
• Leveraged Python's data manipulation libraries, such as Pandas and NumPy, to clean and process data returned from SQL table queries.
• Incorporated Atlas, instance-specific, and topmost views of the lineage.
• Designed graphical test cases and automated unit testing with Pytest.
• Added caching of data, user inputs, and results for faster turnaround time.
• Enhanced the dashboard by adding features like looking up provenance reports, identifying orphan datasets, displaying package use statistics and metrics over time, and generating data inventory.
• Improved the dashboard's user interface to enhance the user experience.
Data Scientist
ProfessorBob.ai
March 2021 to September 2021
• Developed the chatbot's sentence autocompletion: vectorized user input with the SentenceTransformer paraphrase identification model and the Google Cloud Platform (GCP) Natural Language API, returned the most similar questions from the Elasticsearch index by cosine similarity, and clustered vectors with the DBSCAN algorithm.
• Transformed text into knowledge graphs using NetworkX, generating a network of semantically connected domain-specific entities and keywords with the PageRank algorithm and GCP's NL API.
• Implemented clustering algorithms to group closely related vectors using cosine similarity and BM25.
Data Analyst
Believe in Me
July 2020 to March 2021
• Created multiple interactive dashboards in Google Data Studio to visually evaluate the performance of the organization's social media marketing campaigns, using data from Supermetrics and Salesforce.
Software Engineer Intern
Cerner Corporation
January 2016 to July 2016
• Enhanced technical skills and adapted to the Agile methodology by participating in the DevAcademy program.
• Developed a web application to render JSON data in a suitable user interface using a web technology stack, and became acquainted with software management tools such as Git, JIRA, Crucible, Maven, and Jenkins.
• Learned Cerner Command Language (CCL), a wrapper around SQL, to query the Millennium database and analyze patient medical data for annual reports.
Education
Master's degree in Computer Science
The University of Texas at Dallas
May 2020
Bachelor's degree in Information Technology
VIT University - Vellore, Tamil Nadu
May 2017
Skills
• Python
• R
• SQL
• MySQL
• Spark
• Elasticsearch
• NoSQL
• Git
• GitLab
• Jira
• Pandas
• NumPy
• SciPy
• Plotly
• Dash
• NetworkX
• Data governance
• Data Lineage
• Metadata
• JSON
• HTML5
• CSS
• AJAX
• JavaScript
• Google Data Studio
• Agile
• Maven
• Java
• User Interface (UI)
• Unit Testing
• Natural language processing
• Waterfall
• Graph databases
• Data Provenance
• Domino
• JupyterLab
• IntelliJ
• PyCharm
• RStudio
• MySQL Workbench
• Data Quality
Links
https://www.linkedin.com/in/poonam-gillurkar/