Lakshay Malhotra
Website: lakshaymalhotra***.wixsite.com/portfolio
512-***-****
GitHub: https://github.com/lakshaymalhotra123
******************@*****.***
LinkedIn: linkedin.com/in/lakshay-malhotra/
EDUCATION
The University of Texas at Dallas (GPA 3.7/4.0)
M.S., Business Analytics (with a concentration in Data Science) MAY 2020
Bharati Vidyapeeth University, India (GPA 3.8/4.0)
B.Tech., Information Technology JULY 2015
TECHNICAL SKILLS
Programming Skills/BI Tools: R, Python, Tableau, MS Excel, Shell Scripting, SAP BI, SAP Lumira, Power BI
Database/Data Warehousing: Informatica, Oracle, DB2, MySQL, NoSQL, MongoDB, PL/SQL, ERP, PostgreSQL, Snowflake
Big Data: Hadoop, MapReduce, Flume, Sqoop, Hive, Pig, PySpark
Tools: RStudio, Jupyter Notebook, Crystal Reports, Linux, SAP BO, SAS, Git
Certificates: DataCamp - Python Certification, Udemy - Tableau Certification, Coursera - Python Visualization
AWS/Business: EFS, CloudWatch, CloudFront, Kinesis Firehose, S3, Insurance, Microfinance
Machine Learning: Logistic Regression, DNN, FB Prophet, K-means clustering, SVM, KNN, XGBoost, LightGBM
PROFESSIONAL EXPERIENCE
DATA MIGRATION SPECIALIST, OMNNA, TEXAS SEP 2019-JAN 2020
Legacy Data Warehouse Migration (Python, Excel, AWS, Linux)
• Redesigned and developed ETL pipeline to migrate data from multiple legacy systems to AWS, decreasing migration time by 70%
• Led a team of engineers and coordinated with vendors to ensure data migration of 1.5 million products
• Analyzed and reported data inconsistencies within the transaction and product data, thereby assuring data integrity
DATA SCIENTIST, DATA ONE GLOBAL (SYSTEMATIC TRADING START-UP), TEXAS JUN 2019-AUG 2019
ETF S&P500 Forecasting (Python, SQL, Oracle)
• Extracted trade information from FIX logs to understand executed trades and strategized allocation of company funds to profitable ETFs
• Developed pipeline that scrapes ETF dividend information from the web, stores it in the database, and provides insights into trading
• Deployed Twitter streaming pipeline and used NLP techniques to conduct sentiment analysis and model market fluctuations
• Developed forecasting models estimating ETF S&P500 prices over time, increasing trading revenue by $120K
Daily File Automation (Python)
• Automated downloading and storing of daily updated stock exchange data on the server, eliminating data lags
DATA ANALYST, TATA CONSULTANCY SERVICES, INDIA JAN 2016-JUN 2018
Insurance Premium Price Analysis (Python, SQL, Tableau, Excel, Oracle, Agile, Power BI)
• Performed statistical modeling and hypothesis testing on a complex customer dataset to identify KPIs for premium charges
• Predicted premium charges with 79.81% regression accuracy, optimizing the premium charged and increasing turnover by $1.30 million
• Developed Tableau/Power BI dashboards to provide insights on variation of insurance premium charges for 5 different clients
North America Database Migration (PostgreSQL, SSIS, MySQL, Excel)
• Modified Informatica mapping to load flat files from multiple MySQL databases, incorporating new business requirements
• Designed ETL pipeline to migrate legacy data to the warehouse and optimized SQL procedures, decreasing load time by 25%
• Optimized SQL scripts to validate Accounts transaction table for data inconsistencies, reducing validation time by 20%
Multi-Project Bug Analysis (Excel)
• Performed bug analysis and designed a weekly dashboard, using macros to ensure smooth monitoring for 8 projects
PROJECTS
WALMART RETAIL GOODS FORECASTING (JUPYTER NOTEBOOK, TENSORFLOW, PYTHON) MAY 2020-JUN 2020
• Expedited data manipulation and analytical operations on a dataset of 60M+ rows using memory optimization techniques
• Transformed time series data into a supervised learning problem and forecasted retail goods prices for the next 28 days using ML techniques
MOVIE REVIEW SENTIMENT ANALYSIS (NLP, R, KERAS, KNIT, RMD) MAR 2020-APR 2020
• Applied LDA transformation to the movie review text to reduce dimensionality, decreasing model training time by 15%
• Transformed cleaned data into vectors and built a neural network to classify the reviews as good or bad with 85.32% accuracy
AWS CLOUD HACK PREVENTION (AWS LAMBDA, EC2, CLOUDWATCH, KINESIS, WAF) SEP 2019-NOV 2019
• Hosted PHP website on AWS server and used web application firewall (WAF) to secure the database from unauthorized access
• Auto-blocked IP addresses for 4 hrs using AWS Lambda and set up an alarm that emails the security team on SQL injection
LENDING CLUB ANALYSIS (SAS, TABLEAU, PYTHON, JUPYTER NOTEBOOK) MAR 2019-MAY 2019
• Conducted exploratory data analysis (EDA) to understand statistical distributions and guide feature selection
• Built statistical models to classify loan defaulters with up to 96.4% accuracy, saving an average loss of $6,156 per defaulter