AISHWARYA NITIN KAPSE
******@***.*** 949-***-**** https://github.com/aishwaryakapse https://www.linkedin.com/in/aishwaryakapse 3655 Pruneridge Avenue, #21, Santa Clara - 95051
EDUCATION:
Master of Science – Computer Engineering (Computer Software) GPA: 3.74 Jun 2017
University of California, Irvine, CA
Relevant Courses: Information Storage, Middleware and Distributed Systems, Next Generation Search Systems, Projects in Databases and Web Applications, Design and Analysis of Algorithms, Advanced System Software, Computer Networks
EXPERIENCE:
Software Engineer (Contract) – Python, Scala, Spark, Bash, PySpark May 2017 – Aug 2017
MapR Technologies – 350 Holger Way, San Jose, CA
Developed proactive support for MapR customers resulting in case deflection and resource conservation.
• Constructed a data lake in MapR File System from customer logs using the rsync Linux utility, SFTP, and Bash scripting.
• Indexed logs from the data lake in Elasticsearch using Fluentd for visualization in Kibana.
• Calculated aggregate statistics by running SQL queries on the indexed data using Spark SQL, PySpark, and Hive.
• Built a proof of concept that stores logs from MapR Streams in MapR-DB using Spark Streaming and the Kafka API.
Informatics Intern – Python, Django Framework, Docker, Jenkins Oct 2016 – Mar 2017
Zymo Research Corp – Irvine, CA
Automated data collection and filtering. Implemented storage of hierarchical data for faster access.
• Collected and filtered DNA data from sources such as NCBI and ArrayExpress.
• Stored data in Amazon S3 and in HBase on Amazon EMR.
• Implemented nested set model to store hierarchical bioinformatics data in MySQL using Django Framework.
Graduate Student Researcher May 2016 – Sep 2016
Donald Bren School of Information and Computer Sciences – University of California, Irvine, CA
Crawled crawler-unfriendly, AJAX-generated websites for the Cloudberry project.
• Extracted dynamic content related to the Zika virus from domains such as “healthmap.org” and “promedmail.org”.
• Employed open source crawlers such as Crawljax and Scrapy with Splash.
• Collected live Twitter feeds using the Twitter Streaming API and Apache Kafka.
Movie Recommendations from MovieLens Data Set – Scala, Apache Spark Mar 2016 – Jun 2016
• Generated recommendations using Item-Based Collaborative Filtering and Cosine Similarity on one million ratings.
• Achieved better performance than the Alternating Least Squares model built into MLlib.
Persistent Storage of Access Logs – Scala, Apache Spark Mar 2016 – Jun 2016
• Simulated real-time generation of access logs using the netcat utility and a large log file, integrated with Apache Kafka.
• Extracted and stored information using Spark Streaming, regex, and the Cassandra database.
Search Engine for UCI ICS Domain using Java Jan 2016 – Mar 2016
• Crawled the content on the ics.uci.edu domain using crawler4j and built an inverted index over the data.
• Ranked the results based on term frequency, inverse document frequency, HTML tags, URL data, and length.
E-commerce movie shopping website – Java, JavaScript Jan 2016 – Mar 2016
• Designed an e-commerce website with support for add, delete, search, update, and shopping cart.
• Utilized HTML, CSS, JavaScript, Servlets, JSPs, and a MySQL database. Deployed the website on AWS.
• Optimized performance using prepared statements, load-inline functions and extended DB using SAX parsing.
TECHNICAL SKILLS:
Programming: Java, Python, Scala, SQL, Bash
Operating Systems: Linux, Unix
Big Data Technologies: Apache Spark, Hadoop 2.0, MapReduce, Spark Streaming, MapR File System, MapR Streams, Hive
Databases: MySQL, Cassandra, HBase, MapR-DB
Tools: Amazon S3, Amazon EC2, Amazon EMR, MLlib, Jenkins, Docker, Elasticsearch, Fluentd, Eclipse, IntelliJ
Web Crawlers: Crawljax (for AJAX - Java), Scrapy with Splash (Python)
Web Development: HTML5, CSS3, JSP, Servlets, JavaScript, Django Framework