Data Engineer

Location:
Dubai, United Arab Emirates
Posted:
September 27, 2020

Resume:

Anandita Bodas

Data Scientist

I conceptualise and build data science solutions for business requirements using clean, efficient, production-friendly, and scalable methods.

********.*****@*****.*** +97-156******* / +91-967******* Dubai, UAE - Pune, India linkedin.com/in/ananditabodas

WORK EXPERIENCE

Lead Data Scientist

Riverus

06/2019 - Present, Pune, India

Riverus is a legaltech company started in 2017. It provides data-driven insights from legal documents in the area of tax law and from contracts.

Currently leading a team of data scientists to understand and deliver business and product requirements.

End-to-end involvement in crucial projects for the company platform, working cross-functionally with CXOs, Product/Project Managers, Developers, and QA Engineers.

Participated in several investor and potential partnership meetings as a representative of the data science team.

Involved in recruitment, resource planning, project management, training, and mentoring of fresh recruits.

Building pipelines and database architecture for several products and features to suit product and data science requirements.

Extraction of structured and unstructured data points from various legal documents.

Creating models and processes from scratch where ready libraries failed for the legal domain.

NLP Engineer

Riverus

11/2017 - 05/2019, Pune, India

Riverus is a legaltech company started in 2017. It provides data-driven insights from legal documents in the area of tax law and from contracts.

Extracting structured and unstructured data from case laws, working closely with domain experts for training and testing the data.

Designing APIs using Python Flask.

Complex database querying to provide relevant data for analysis and content generation to the Product and Marketing team.

Linking cases based on citations across 300,000+ files with 80% accuracy using Pattern Matching and Regular Expressions.

Generation of scheduled data reports to check data completeness and accuracy.

Identification of similar cases.

CERTIFICATIONS

NLP with Classification and Vector Spaces

by deeplearning.ai (Coursera); License - ZXF8GCKX9YNF

TECHNOLOGIES AND FRAMEWORKS

Python, TensorFlow, spaCy, Keras

PyTorch, Transformers, NLTK, PostgreSQL

AWS Lambda Functions, SKLearn, Python Flask

Google News Vector, NumPy, Pandas, PowerBI

SKILLS

Outcome-oriented

Ability to produce results in a fast-paced work environment independently and alongside a team

Curious and Efficient

Quick learner who can efficiently grasp the domain subject matter

Analytical thinking and problem-solving

Ability to break down a problem analytically and offer holistic, scalable, and out-of-the-box solutions

Adaptive to business limitations

Extracting data points with little to no training data, condensed timelines, and limited resources

Understanding business requirements

Capable of anticipating the business requirements and extracting relevant data in advance

Team Player

Inclusive and collaborative work style

PROJECTS

Riverus Umbrella - a Contract Management and Analysis Solution (04/2020 - Present)

Building an end-to-end pipeline to preprocess contracts, extract and store legal clauses and business intelligence in the database in a developer-friendly format.

Leveraging AWS Lambda Functions to increase processing speed by 10x.

Designing a database and run time pipeline for an in-built user feedback mechanism on the extracted data points.

Extraction of structured and unstructured data from contracts, forms, and other legal documents.

Segmentation and hierarchy-building for the text structure.

Master-to-supplementary document linking.

Understanding the product UX/UI to deliver data efficiently.

CERTIFICATIONS

Microsoft Technology Associate in Software Development Fundamentals

License - F196:1144

EDUCATION

Bachelor of Engineering - Computer (B.E.)

MIT College of Engineering, Pune (affiliated to University of Pune, India)

2013 - 2017, First Class with Distinction (76%)

AISSCE, CBSE (12th Grade)

Royale Concorde International School, Bangalore, India

2011 - 2013, First Class with Distinction (89.4%)

AISSE, CBSE (10th Grade)

Our Own English High School, Dubai, UAE

2011, Gold Medalist (10/10 CGPA)

WORK ELIGIBILITY

UAE Residence Visa, Indian Passport

PROJECTS

Named Entity Recognition (NER)

Extraction of precedents mentioned in case laws.

Extraction of semi-structured data such as list of lawyers, judges, PAN numbers, party names, etc. from a judgement.

Capturing comparable company names with similar tax issues using spaCy’s pre-trained model.

Classification Models

Identifying texts from case laws as issues, arguments, holdings, and outcomes.

Transfer Learning using Google’s BERT model for case outcome classification with 95% accuracy.

Identification of Industry using Google News Vector.

Division of text into Header-Body-Footer as part of preprocessing.

Keyphrase Extraction

Built an algorithm from scratch to extract domain-specific key phrases using a POS (part-of-speech) tagger.

Significantly improved case clustering, as ready-to-use key phrase extraction tools like RAKE failed for the legal domain.

User Analytics and Recommendations

Analysis of user behavior for product development and marketing on PowerBI and developing an interface to display this for internal stakeholders.

Capturing and calculating average session duration, most-clicked pages, search terms, onboarding preference selection, filters and tags applied, etc.

Built a Recommendation Engine to suggest relevant legal issues to a user based on their preferences.

LANGUAGES

English: Proficient

Hindi: Proficient

Marathi: Proficient

French: Intermediate

Arabic: Basic


