Basu Sanjay Chinivar (http://www.basuchinivar.com)
607-***-**** ********@**********.*** https://www.linkedin.com/in/basuchinivar/ https://github.com/basuchinivar
Education:
Binghamton University, State University of New York, Watson School of Engineering Expected: May 2018
Master's in Computer Science
SJB Institute of Technology, Bangalore, India
Bachelor of Engineering in Computer Science Graduated: June 2014
Technical Skills:
Hadoop ecosystem, Regression, Classification, Kafka
Languages: Java, Python, R (beginner), Oracle SQL, HTML, MapReduce, Hive, Pig
Libraries: Pandas (intermediate), OpenCV (intermediate), NumPy (intermediate)
Tools: Eclipse, SQL Developer, PuTTY, Anaconda Spyder, IPython Notebook, Informatica, Bitbucket, Apache Thrift, Google Protocol Buffers, LIBSVM, Weka.
Technical courses: Design and Analysis of Algorithms, Data Structures, Programming Languages (Haskell and Prolog), Operating Systems, Computer Architecture, Machine Learning, Distributed Systems, Design Patterns.
Online courses: Machine Learning A-Z™: Python & R In Data Science, Python for Data Science & Machine Learning Bootcamp.
Certifications and Achievements:
• CCDH-410 (Cloudera Certified Developer for Apache Hadoop).
• MIT Professional Education: Data Science: Data to Insights
• Pig and Hive from Big Data University.
Achievements:
• Received the 'Pat-On-The-Back' award for ensuring data integrity while transforming data from 7 different database instances (Teradata, Oracle). Designed an Informatica mapping that atomically transformed data from all the instances concurrently.
• IEVOLVE Kaizen certificate from IGATE for facilitating GemFire XD data modelling, data loading, and synchronization using ETL and GXD tools.
Professional Experience:
Software Engineer, Capgemini (IGATE Global Solutions), Bangalore, India June 2014 – June 2016
Client: BIG DATA COE
Product Name: IV3
• The IV3 platform is a Hadoop-based technology platform with a set of out-of-the-box business value solutions and reusable components developed for specific use cases across multiple industry domains.
Roles/Responsibility:
• Built components for ‘IV3’ on the Big Data platform. Focused on large-volume data transmission and migration into the in-memory database GemFire XD.
Languages: Java MapReduce.
Client: GE Aviation
• Worked extensively on efficient data transformation (ETL) from Oracle and Teradata source systems using Informatica, optimizing session run-time and decreasing data traffic across the network (transfers of millions of rows reduced to thousands). This was achieved by evaluating the integrity of the incoming data and identifying changes at the source by comparing hash values. Result: 19 times faster performance.
Roles/Responsibility:
• Handled large data sets and delivered faster solutions across the data transformation lifecycle. Worked well in highly agile, rapidly changing situations; adept across the SDLC.
Tools: Oracle SQL Developer, Teradata Manager, Informatica.
Intern, Software Developer, Sequel Consulting, Bangalore, India January 2014 – April 2014
• Worked on lossless compression of JPEG images for feature extraction using self-organizing maps, supporting the company’s presence in e-learning platforms. Language: Java.
Academic Projects:
• Cursor Control through Gesture Recognition (Language and Library: Python and OpenCV) Dec 2016 – Jan 2017
  Mapped the mouse pointer to the fingertip detected via webcam to control all mouse functions.
• Custom Classifier Object Detection (Language and Library: Python and OpenCV) Jan 2017 – Feb 2017
  A cascade classifier trained to detect a variety of day-to-day objects.
• Traffic Sign Recognition (Language and Library: Python and OpenCV) Feb 2017
  A set of classifiers that detect American traffic signs in real time.
• Recommendation System (Language: Python) Apr 2017 – May 2017
  Recommends movies to a user through content-based filtering on the Netflix dataset.
• Chord Distributed Hash Table (Language and server: Java, Thrift server) Jun 2017 – Jul 2017
  Implementation of a fully functional DHT using SHA-256 hashing.
• The Snapshot Algorithm (Language: Java, Communication: Google Protocol Buffers) Aug 2017 – Sep 2017
  A multi-threaded program that implements the Chandy-Lamport snapshot algorithm.
• Decision Tree for Text Classification (Language: Python) Aug 2017 – Sep 2017
  Implementation of a decision tree whose results were as accurate as those of the scikit-learn library.
• Eventually Consistent Key-Value Store (Language: Java, Communication: ProtoBuf) Sep 2017
  Uses Cassandra-like read repair and hinted handoff to achieve eventual consistency.
• Naïve Bayes Text Classification (Language: Python) Sep 2017
  Classified more than 2000 text files as spam or ham with an accuracy of 95.3%.