Anandita Bodas
Data Scientist
I conceptualise and build data science solutions for business requirements using clean, efficient, production-friendly, and scalable methods.
********.*****@*****.*** +97-156******* / +91-967******* Dubai, UAE - Pune, India linkedin.com/in/ananditabodas
WORK EXPERIENCE
Lead Data Scientist
Riverus
06/2019 - Present, Pune, India
Riverus is a legaltech company started in 2017. It provides data-driven insights from legal documents in the area of tax law and from contracts.
Currently leading a team of data scientists to understand and deliver business and product requirements.
End-to-end involvement in crucial projects for the company platform, working cross-functionally with CXOs, Product/Project Managers, Developers, and QA Engineers.
Part of several investor and potential partnership meetings as a representative of the data science team.
Involved in recruitment, resource planning, project management, training, and mentoring of fresh recruits.
Building pipelines and database architecture for several products and features to suit product and data science requirements.
Extraction of structured and unstructured data points from various legal documents.
Creating models and processes from scratch where ready libraries failed for the legal domain.
NLP Engineer
Riverus
11/2017 - 05/2019, Pune, India
Riverus is a legaltech company started in 2017. It provides data-driven insights from legal documents in the area of tax law and from contracts.
Extracting structured and unstructured data from case laws, working closely with domain experts for training and testing the data.
Designing APIs using Python Flask.
Complex database querying to provide relevant data for analysis and content generation to the Product and Marketing team.
Linking cases based on citations across 300,000+ files with 80% accuracy using Pattern Matching and Regular Expressions.
Generation of scheduled data reports to check data completeness and accuracy.
Identification of similar cases.
CERTIFICATIONS
NLP with Classification and Vector Spaces
by deeplearning.ai (Coursera); License - ZXF8GCKX9YNF
TECHNOLOGIES AND FRAMEWORKS
Python, TensorFlow, spaCy, Keras,
PyTorch, Transformers, NLTK, PostgreSQL,
AWS Lambda Functions, scikit-learn, Python Flask,
Google News Vector, NumPy, Pandas, PowerBI
SKILLS
Outcome-oriented
Ability to produce results in a fast-paced work environment independently and alongside a team
Curious and Efficient
Quick learner who can efficiently grasp the domain subject matter
Analytical thinking and problem-solving
Ability to break down a problem analytically and offer holistic, scalable, and out-of-the-box solutions
Adaptive to business limitations
Extracting data points with little to no training data, condensed timelines, and limited resources
Understanding business requirements
Capable of anticipating the business requirements and extracting relevant data in advance
Team Player
Inclusive and collaborative work style
PROJECTS
Riverus Umbrella - a Contract Management and Analysis Solution (04/2020 - Present)
Building an end-to-end pipeline to preprocess contracts, extract and store legal clauses and business intelligence in the database in a developer-friendly format.
Leveraging AWS's Lambda Functions to increase the processing speed by 10X.
Designing a database and run time pipeline for an in-built user feedback mechanism on the extracted data points.
Extraction of structured and unstructured data from contracts, forms, and other legal documents.
Segmentation and hierarchy-building for the text structure.
Master to supplementary document linking.
Understanding the product UX/UI to deliver data efficiently.
CERTIFICATIONS
Microsoft Technology Associate in Software Development Fundamentals
License - F196:1144
EDUCATION
Bachelor of Engineering - Computer (B.E.)
MIT College of Engineering, Pune (affiliated to University of Pune, India)
2013 - 2017, First Class with Distinction (76%)
AISSCE, CBSE (12th Grade)
Royale Concorde International School, Bangalore, India
2011 - 2013, First Class with Distinction (89.4%)
AISSE, CBSE (10th Grade)
Our Own English High School, Dubai, UAE
2011, Gold Medalist (10/10 CGPA)
WORK ELIGIBILITY
UAE Residence Visa, Indian Passport
PROJECTS
Named Entity Recognition (NER)
Extraction of precedents mentioned in case laws.
Extraction of semi-structured data such as list of lawyers, judges, PAN numbers, party names, etc. from a judgement.
Capturing comparable company names with similar tax issues using Spacy’s pre-trained model.
Classification Models
Identifying texts from case laws as issues, arguments, holdings, and outcomes.
Transfer Learning using Google’s BERT model for case outcome classification with 95% accuracy.
Identification of Industry using Google News Vector.
Division of text into Header-Body-Footer as part of preprocessing.
Keyphrase Extraction
Built an algorithm from scratch to extract domain-specific key phrases using a POS (part-of-speech) tagger.
Significantly improved the case clustering as ready-to-use key phrase extraction tools like Rake failed for the legal domain.
User Analytics and Recommendations
Analysis of user behavior for product development and marketing on PowerBI and developing an interface to display this for internal stakeholders.
Capturing and calculating average session duration, most-clicked pages, search terms, onboarding preference selection, filters and tags applied, etc.
Built a Recommendation Engine to suggest relevant legal issues to a user based on their preferences.
LANGUAGES
English
Proficient
Hindi
Proficient
Marathi
Proficient
French
Intermediate
Arabic
Basic