Venu K Tangirala
adlf77@r.postjobfree.com linkedin.com/in/venuktan/ 510-***-****
Summary
** ***** *********** ********** **** 9 years in Data Science and Big Data.
Extensive experience in data analytics and machine learning of various data sets.
Extensive experience in feature selection and data modeling.
Extensive experience in training Deep Learning models.
Extensive experience in Machine Learning with Python and R.
Extensive experience in Convolutional Neural Networks (CNN).
Extensive experience in Data Visualization with Python and R.
Expertise in Deep Learning with TensorFlow, Keras and Theano.
Extensive experience in using big data tools like Spark, Hadoop, Cassandra and Elasticsearch.
In-depth understanding of HDFS and the MapReduce framework.
Worked with Apache Spark in AWS, local and Databricks deployments.
Extensive experience with Cassandra and Elasticsearch data modelling.
Experience in deploying scalable Hadoop clusters on cloud services such as Amazon AWS (S3, EMR, EC2).
Set up time series analysis on Cassandra.
Ability to work with large data sets on a variety of platforms.
Experience in designing and developing applications spanning the full software development life cycle (SDLC), from writing functional specifications through design, documentation, unit testing and support.
SKILLS
Programming
Python, R, Java, Scala, C++, C
Datastores
MySQL, Hive, Cassandra, DynamoDB, Elasticsearch
ML Big Data Tools
TensorFlow, PyTorch, Apache Spark, Deep Learning, Turi/GraphLab, Theano, Keras, Hadoop, MapReduce, Cassandra, Kafka, Hive, ZooKeeper, EMR, Caffe, Natural language processing (NLP), Ray, Modin, Kubeflow
Misc
S3, VMware, IntelliJ, Maven, SBT, Machine learning, Recommendation Engine, GraphX, AWS, Unix, Git, EC2, Distributed computing, Object-oriented programming (OOP)
EXPERIENCE
Adaptive Insights-Workday, CA Staff Data Scientist
Python, Deep Learning, TensorFlow, Keras, Docker, Kubeflow, Kubernetes, Airflow, Java April 2018-Present
Adaptive Insights has a Financial planning and Sales planning product
Apply machine learning to time series financial and sales data
Built ETL pipelines with Airflow and Apache Spark
Build LSTM time series models based on historical data and make predictions for the future
Write Dockerfiles to productionize the code
Compose Kubernetes YAML manifests for QA and production deployments
Build Temporal Convolutional Network (TCN) models for time series prediction
Time series based anomaly detection for financial data
Time series data visualization with Streamlit in Python
Cyngn, CA Perception Lead, AI Scientist
Python, TensorFlow, Deep Learning, Spark, ROS Feb 2017-March 2018
Cyngn is in the autonomous driving space
Built models for recognizing objects in images; worked with VGG-16 based models like YOLO, Single Shot Detector, SqueezeDet and MultiNet
Worked with image segmentation models like SegNet, Mask R-CNN and MultiNet
Lane detection with classical computer vision techniques
Deployed these models for edge processing on Nvidia TX2
Leeo, CA Data Scientist
Python, Machine Learning, Spark, Hadoop, R, TensorFlow, AWS, OpenTSDB July 2014-Feb 2017
Leeo is in the IoT space building devices for home automation
Built an IoT device that listens for the audio of smoke, CO and water alarms.
Used a random forest classifier as an audio predictor.
Used FFT and audio spectrogram features as inputs to the machine learning model.
Built an image similarity and classification system with deep learning in TensorFlow and Keras.
Used Convolutional Neural Networks (CNN) for model building on dual Titan X GPUs.
Used pre-built VGG-16 CNN model in Keras for image classification.
Built a temperature and humidity calibration system.
Used Elasticsearch with Spark for searching and indexing logs.
Used Linear regression model for data calibration.
Used OpenTSDB for time series data logging and visualization.
nFlate, CA Co-Founder, Data Science Architect
TensorFlow, Keras, Spark, Python, Machine Learning, Turi/GraphLab, AWS, DynamoDB Dec 2013-Mar 2018
Built an image retrieval service based on similar images for searching products
Applied ResNet transfer learning from an ImageNet-pretrained model to a custom dataset.
Used Approximate Nearest Neighbors (ANN) to find the items most similar to a given item.
Responsible for building algorithms such as "Frequently bought together", "Similar items" and "People who bought this also bought" based on collaborative filtering.
Built these recommendations on Apache Spark with Alternating Least Squares (ALS).
Extracted the dominant colors in an image with k-means clustering.
Ran the dominant color extractor code on Spark for scalability.
Used Elasticsearch for searching products by color.
Made sure these were running on a daily batch basis in the AWS cloud with EMR.
Statistical simulation of data with R.
Cloudwick, CA Data Engineer
Spark, Hadoop, Mahout, Python, Java, Amazon’s Elastic MapReduce, R Apr 2013-July 2014
Used Mahout for k-means clustering of scattered photon data points.
Implemented NameNode High Availability (HA).
Wrote MapReduce code to clean and transform loaded data in HDFS.
Imported data from open data sources on S3 and from private clusters.
Built an automated data pipeline to import and pre-process data.
Architected Cassandra tables based on primary key and clustering key.
Used Hive data warehouse tool to analyze the unified historic data in HDFS to identify issues and behavioral patterns.
Created Hive tables as internal or external tables per requirements, defined with static and dynamic partitions for efficiency.
Used the RegEx, JSON and Avro SerDes packaged with Hive for serialization and de-serialization to parse the contents of streamed log data.
Northwestern Polytechnic University, CA Doctoral Student
A Recommendation System based on Ratings and Textual Reviews July 2010-April 2015
Built a Collaborative Filtering Based Recommendation System with reviews.
Used textual features to feed to the CF algorithm for better prediction.
Used Latent Semantic Indexing to obtain lower-dimensional latent features of the textual reviews.
Used Python NLTK for text cleaning.
Used ALS for Matrix Factorization.
Ran this algorithm on a 10-node cluster to test for scalability.
Lawrence Berkeley National Lab, Berkeley, CA Research Assistant
Parallel Computing (C++, Python, Boost Python, Linux, Inter Process Communication, MPI) Sep 2010-Apr 2012
Analyzed massive data in real time from the Linac Coherent Light Source (LCLS) at the SLAC National Accelerator Laboratory, home of the world’s longest linear accelerator, by parallelizing processes.
Simulation and analysis of data from beam lines and understanding various applications.
IBM, Boulder, CO
Shared ID Boarding Tool - SIBT (Java, UNIX, DB2, Servlets, JSP, RAD, CMVC) Aug 2009-Sep 2010
Geometrics Inc., San Jose, CA
Fourier Transform (Java, JDBC, SQL, AWT) Dec 2007-Dec 2008
Database Management (ACT! 6.0, Microsoft BCM, SQL 2005)
EDUCATION
Doctorate in Computer Engineering (2015)
Northwestern Polytechnic University, CA
M.S. in Computer Science (2008)
Northwestern Polytechnic University, CA
B.Tech. Electronics & Communications (2007)
Jawaharlal Nehru Technological University, Hyderabad
Cloudera Certified Hadoop Developer (2013)
Cloudera
Cloudera Certified Hadoop Administrator (2013)
Cloudera
Datastax Certified Cassandra Developer (2013)
Datastax
Elastic Search Developer (2016)
Elastic
PATENTS
Tangirala, Venu. 2018. Calibrating an environmental monitoring device. U.S. Patent Application 10026304, filed January 2015.
Tangirala, Venu. 2019. Prediction model training using detected anomalies. U.S. Patent Application 16/601,309, filed September 2019. Patent Pending
BLOG POSTS AND PUBLICATIONS
Venu Tangirala; Vectorized intersection over union (iou) in numpy and tensorflow; 03/02/2018; https://venuktan.wordpress.com/2018/03/02/vectorized-intersection-over-union-iou-in-numpy-and-tensor-flow/
Venu Tangirala; Setting up mahout and running recommender job; 12/28/2012; https://venuktan.wordpress.com/2012/12/28/setting-up-mahout-and-running-recommender-job/
Venu Tangirala; Running jobs on emr with data on s3; 12/27/2012; https://venuktan.wordpress.com/2012/12/27/running-jobs-on-emr-with-data-on-s3/
Venu Tangirala; Wordcount mapreduce from command line; 11/19/2012; https://venuktan.wordpress.com/2012/11/19/wordcount-mapreduce-from-command-line/
Venu Tangirala; Wordcount map reduce on Hadoop eclipse plugin; 11/19/2012; https://venuktan.wordpress.com/2012/11/19/wordcount-mapreduce-hadoop-eclipse-plugin/