Rohith Pillai
858-***-**** adlrte@r.postjobfree.com La Jolla
EDUCATION
University of California San Diego B.S. Data Science GPA: 3.82 (06/2019 - 06/2022)
- Relevant Coursework:
- Practice of Data Science
- Ethics of Data Science
- Advanced Statistics
- Principles of Data Science
- Data Structures (Simple and Advanced)
- Business Analytics
- Theoretical Foundations of Data Science
- Database Management (SQL)
- Honors: TMC Honors Program, Provost Honors, Eta Kappa Nu invitee (Top 20% of major class)
- Organizations: ACM @ UCSD, MannMukti @ UCSD (Board Member), ISA (Board Member)
SKILLS
- Python, Pandas, R, Java, SQL, scikit-learn, Microsoft Excel, Statistics, Tableau, AWS
EXPERIENCE
Data Analyst - Scripps Institution of Oceanography (01/2021 - Present)
- Developed Python scripts that communicated with the company API to extract updated geodata from various types of back-end database files, using different text- and file-processing tools, and subsequently patched the updates to the company website as GeoJSON. Currently developing a Python package to automate descriptive text file creation as data from research vessels are made live by SIO.
Data Analyst Intern - Fleetilla LLC (06/2020 - 08/2020)
- Used Python to clean and analyze large volumes of data from source and target software systems to check for discrepancies and trace them across algorithmic lines. Wrote several detailed reports explaining the discrepancies and examining where the data pipeline failed to follow the algorithm, which assisted in the testing of the software.
Director of Membership - ACM Cyber at UCSD (06/2020 - 03/2021)
- Led a team to host socials, moderate at events, engage current and potential membership, carry out outreach, and more.
PROJECTS
Text Data Analysis (11/2020)
- Using regex canonicalization, collected all hashtags in 90,000 tweets likely posted by the IRA, computed statistics on the presence of specific hashtags across these tweets, and identified potentially significant trends.
- Using the TF-IDF statistic, found a single word that “best summarizes the review” for ~200,000 customer reviews of phones and phone accessories on Amazon.
Web Crawlers (11/2020)
- Built a crawler to scrape and parse data from FinancialModelPrep’s public API to investigate stock price fluctuations and compute total transaction volumes over a month and year of choice for historical stock data.
- Built a web crawler for Hacker News’ public API that used a recursive algorithm to collect all attributes of comments and sub-comments in a news story of choice into an organized pandas DataFrame.
NYPD Complaints Dataset Analysis (11/2020)
- Carried out exploratory data analysis on a public dataset of complaints against the NYPD since 1985. After cleaning, missingness analysis, bivariate analysis, merging, grouping, and other exploratory techniques, developed a hypothesis test to answer the question: Are non-white cops complained against more by black civilians?
ML Predictive Model (Titanic Data) (12/2020)
- Using scikit-learn, created an ML pipeline that used a Decision Tree Classifier to predict whether a passenger aboard the Titanic survived given their age, honorific title, ticket fare, and sex. The model consistently exceeded 85% accuracy on a cross-validation train-test holdout assessment.