Data Engineering

Location:
Logan, UT
Salary:
100000
Posted:
November 10, 2020

Resume:

Bishal Sainju Albany, NY, *****

https://www.linkedin.com/in/bishal-sainju-73a898b1/ Cell: 775-***-**** Email: adhqaj@r.postjobfree.com

Skills

Languages: Python, Java, R

Big Data: HDFS, Sqoop, Hive, HBase, Flume, Spark, Kafka, Impala, Tableau

Web: HTML, CSS, JavaScript, ReactJS, NodeJS, Express

Machine Learning: PyTorch, scikit-learn, pandas, NumPy, TensorFlow, Keras, OpenCV

Databases: MySQL, NoSQL (MongoDB, Cassandra)

Technologies: Git, LaTeX, Linux bash, MATLAB, IntelliJ, PyCharm, MySQL Workbench

Platforms: Mac, Windows, Linux

Cloud Platform: AWS

Education

Utah State University, 3.95 Logan, UT

Master of Science in Computer Science Jan 2019 - ongoing

(Expected Graduation: Dec 2020)

Thesis: Job satisfaction and employee turnover determinants in Fortune 50 companies: Insights from employee reviews from Indeed.com

Courses Taken: Data Mining, Data Science, Machine Learning, Computer Vision, Advanced Algorithm, Data Visualization, Data Science Incubator, Rapid Problem Solving

Institute of Engineering, Tribhuvan University Kathmandu, Nepal

Bachelor of Engineering in Computer Nov 2012 - Nov 2016

Experience 4 years 9 months

Graduate Research Assistant Logan, UT

Utah State University Jan 2019 – ongoing

(Business and Computer Science Dept. Collaboration)

Analyzed and extracted latent satisfaction aspects in employee reviews of Fortune 50 companies from Indeed.com using K-means clustering, Latent Dirichlet Allocation (LDA), and Structural Topic Modeling (STM).

Used the NLTK, spaCy, Gensim, SciPy, NumPy, pandas, tidyverse, dplyr, and stm libraries in Python and R to perform natural language processing and topic modeling.

After obtaining approval, scraped employee reviews from Indeed.com using the Beautiful Soup package in Python.

Applied hierarchical clustering to the discovered topics to group them further into a hierarchy.

Determined factors contributing to employee turnover and provided a way to compare topic contributions between sectors and among the companies within each sector.

Used D3 and JavaScript to build topic-term link diagrams that made it easy to understand each satisfaction aspect and the terms used to describe it.

Used Matplotlib, seaborn, and Bokeh for data visualization.

Used React, D3, JavaScript, HTML, and CSS to build a website to display the results.

Software Engineer Kathmandu, Nepal

Toggle Corp Solutions Private Limited Aug 2015 – July 2018

Integrated applications with existing software, converting data to present seamless results.

Designed and developed a data pipeline to ingest structured and semi-structured data from online (website) and offline (retail outlet) sources.

Used Spark's APIs for the FP-growth and collaborative filtering algorithms.

Used Spark for interactive queries, stream processing, and integration with HBase for large volumes of data.

Moved log files generated by various sources into HDFS through Flume for further processing.

Implemented Spark jobs in Python and used the DataFrame and Dataset APIs for faster data processing.

Worked with HDFS file formats such as Avro and SequenceFile and compression codecs such as Snappy and gzip.

Developed a data pipeline using Flume, Sqoop, and MapReduce to ingest customer behavioral data.

Developed and executed queries against large datasets for analysis and data validation.

Used Hive queries, Pig scripts, and MapReduce programs for data analysis.

Built dashboards using Tableau for interactive data analysis.

Achievements:

Built an Inventory Management System (order prediction software), Customer Experience Analytics (sentiment analysis and knowledge extraction from real-time streaming data), and a Recommendation System (product recommendation).

Maintained the existing Office Automation System through user interface design and optimization.

Publications

John M Edwards, Joseph Ditton, Bishal Sainju and Joshua Dawson, “Different Assignments as Different Contexts: Predictors Across Assignments and Outcome Measures in CS1”, 2020 Intermountain Engineering, Technology and Computing (IETC), Orem, UT

Analyzed programming events like keystrokes, text pastes, task switches, and run attempts of students to identify which programming behaviors generalize well as predictors across programming assignments and outcome measures.

Manil Vaidhya, Bikash Shrestha, Bishal Sainju, Kiran Khaniya and Aman Shakya, "Personality Traits Analysis from Facebook Data," 2017 21st International Computer Science and Engineering Conference (ICSEC), Bangkok, 2017, pp. 1-5

Used NLTK and spaCy for text mining, and KNN and linear SVM to classify a person's personality along the five dimensions of the Big Five model.

Obtained precision of up to 65% on one of the personality dimensions (Conscientiousness).

Projects

Microsoft Malware Prediction

Worked on feature engineering, feature encoding, model building, and evaluation on a large dataset (8M records, 83 attributes).

Used Decision Tree and LightGBM and obtained a ROC AUC score of 71.71% on the test dataset.

A machine learning based study on pedestrian-type classification in a heterogeneous flow involving individuals with disabilities

Demonstrated that the walking behaviors of different individuals (with and without disabilities) differ at the microscopic level and can be classified successfully.

Used SVM, kNN, Decision Tree, and ANN, and obtained 95% accuracy using kNN.

Analysis of Cryptocurrency Transactions

Built a data pipeline for generating, processing, and visualizing a stream of data from a public dataset.

Used Ripple API, Hadoop, Spark, Kafka, Spark SQL, Hive, Impala, and Tableau.

Interactive Data Visualization for Topic Modeling Algorithms

Generated an interactive visualization using D3 and JavaScript for visualizing the topic-term distributions of LDA and STM models.

Extra-curricular

Represented Nepal in Dana Cup, Denmark and Jawaharlal Nehru Cup Soccer Tournament, India

Best Representative Award in Today’s Youth Asia Leadership Programme


