RISHI SHAH
Email: firstname.lastname@example.org
Tel: 917-***-****
**** Riverside Station Blvd, Secaucus, NJ 07094
Highly competent, hard-working, proactive developer with a keen interest in software development. Combines proven technical & analytical problem-solving skills with the ability to learn quickly and deliver effective solutions.

Education:
MS in Computer Science, Syracuse University, Syracuse, NY, December 2018, GPA: 3.8/4.0
BE in Computer Engineering, LDRP Institute of Tech & Research, Gujarat, India, May 2016, GPA: 3.9/4.0

Technical Skills:
Languages : Python (conda), Java, C/C++, PL/SQL
OpenSource : Spark (Python/SQL), Hadoop/MapReduce, HBase
Database : SQL Server, Oracle
OS/Tools : Linux, Windows, Git

Software Engineer, Data Science Platform at Barclays, New York, NY, Feb 2019 – Present
Tech: Python, PySpark, SQL, HDFS, Hadoop
Equities Data Science Platform:
• Designed & developed a standard framework for loading data from various sources & extended its API to run analytical workloads (cleaning, deduplication, aggregation), streamlining the process.
• Developed a generic client reporting engine to deliver T0/T1 reports. This includes handling real-time data from execution systems and data from HDFS (Impala)/SQL, and delivering customized metrics.
• Developed a generic trends tool to deliver weekly and quarterly trends to clients.
• Designed & developed a data quality microservice to ensure product reliability. Calculated various benchmarks to make the service configurable, easy to use & easy to scale, helping minimize production surprises before stats reach the client’s inbox.
• Actively participated in rotational support with other team members to ensure business continuity & effective communication with all stakeholders & the trading desk about business requirements. Also led discussions around data engineering with the broader team.

Software Developer Intern at Logical Software Solutions (Capital Novus), Gujarat, India, Jan 2016 – May 2016
Tech: Python, Java, Apache Hadoop, MS SQL, NoSQL (HBase), Maven
eZAnalytics – Solving BigData problems in E-Discovery Domain
• Designed a system to filter the required text data & metadata from various forms of input, such as images & compressed files.
• Gathered requirements for an E-Discovery system to process, store & index terabytes of data.
• Designed a NoSQL schema to store this large footprint of data for processing.
• Developed a second-generation smart analytical tool using algorithms such as key phrase extraction, entity extraction, clustering & near-duplicate identification.

Projects:
• Sentiment Analysis of Amazon reviews data: As part of Natural Language Processing coursework, performed data processing & cleanup using tokenization, sentence creation, regular expression processing, stop word filtering etc. Used n-gram word features as a baseline to improve classification accuracy. Also utilized a subjectivity/sentiment lexicon with scores & scikit-learn classifiers.
• Analysis of National Science Foundation dataset (NLP): Extracted key information about the research work in the dataset & analyzed it over time to produce helpful metrics.
• Metadata Analysis: Worked with the Amazon API to analyze various metadata, providing greater statistical power. The analyzed data quantifies general public sentiment or reactions toward certain products & reveals the contextual polarity of the information. Utilized various graphs to spot trends.