RAGHVENDRA SINGH
716-***-**** Ellery St, Cambridge, MA
LinkedIn GitHub Email
TECHNICAL SKILLS
Programming Languages: Scala, Python, Java
Database: MS-SQL, PostgreSQL, AWS S3, AWS Aurora, Exasol
Data Visualization: Tableau, Microsoft-SSRS, Matplotlib, Seaborn
Big Data: Spark (Scala, Python), Presto, Redis, Hive, HDFS
Other Tools: Docker, Jenkins, Apache Airflow, Databricks, GitHub, Microsoft SSIS
PROFESSIONAL EXPERIENCE
SecurityScorecard - SSC is a late-stage funded cybersecurity ratings startup that helps organizations monitor risks to their cloud-based security in real time and analyzes the third-party vendors that companies work with.
Senior Software Engineer Jan 2021 – Present (Boston)
• Created a data workflow that generates alerts on score changes and writes the events to an SQS queue, allowing customers to be notified whenever a scorecard they follow is updated
• Migrated platform data storage from a legacy MongoDB database to AWS Aurora (Postgres), implementing a snapshot design via Spark to allow a decoupled database access pattern
Software Engineer Feb 2020 – Dec 2020 (Boston)
• Worked with Scala, Spark, and Postgres, in addition to Airflow, Databricks, and AWS EMR/S3, as part of the core scoring team, helping score more than a million domains daily
• Implemented REST API endpoints using the Scala akka-http module for newly added measurements, helping customers easily integrate the data points into their analytics
• Replaced default hash partitioning in Spark with custom field-based partitioning to resolve the skew between customers with large and small datasets, resulting in equal write times across measurements and reducing pipeline time from 4-5 hrs to 2.5-3 hrs
• Improved data pipeline efficiency in the Airflow DAG, implemented new features, and fixed bugs arising in the daily run
• Contributed continuously to SSC’s GitHub repo, implementing the migration from HDFS to Postgres and reviewing changes
Data Engineering Intern Jun 2019 – Dec 2019 (NYC, US)
• Implemented a POC to replace the Hadoop Distributed File System (HDFS) with AWS Aurora (Postgres) as the serving layer
• Worked with AWS EMR and AWS S3 to support the scoring directed acyclic graph (DAG), which processes 4 TB of data daily through the analytics pipeline
• Created a data pipeline publishing metrics, such as the number of domains scored in the past week and new malware found per domain, to a public S3 bucket, ultimately showcased at trust.securityscorecard.com
McKinsey & Company – Data Engineer Jan 2018 – Aug 2018 (Gurugram, IN)
• Developed a Python module for pre-ETL testing, quantifying the accuracy of data files against the source of truth
• Automated the integration process, bringing the execution time down from 8 hours to 4 hours
• Designed Tableau reports over derived data, highlighting product performance across geographies
• Created stored procedures, functions, triggers, and complex SQL joins to load and manipulate data from SQL Server
Infosys – Data Engineer Jul 2015 – Dec 2017 (Bangalore, IN)
• Developed code in MS-SQL and used MS-SSIS to automate the entire data integration process, reducing workflow time from 4 days to half a day
• Formulated an ETL pipeline & reporting module for mortgage data, identifying potential leads for the marketing team
• Designed an interactive dashboard using MS-SSRS for stakeholders to measure store performance across the North Pacific region
PROJECTS
Implement Super Resolution GAN (SR-GAN) using Python Deep learning framework - GitHub
• Implemented a generative adversarial network (GAN) with the vanilla residual blocks replaced by Inception-A blocks to generate higher-resolution images from low-resolution originals, as part of the Master’s thesis
Apply Rate Prediction of Job Applicants (Boosting Classifier, Linear Regression, scikit-learn, Python) - GitHub
• Worked with a heavily imbalanced dataset and predicted whether a candidate applied to a job using a gradient boosting classifier, achieving an AUC of 0.85
• Imputed around 25 percent of missing values using linear regression and cleansed duplicate applicants
EDUCATION
University at Buffalo Sep 2018 - Feb 2020
Master’s in Computer Science Buffalo, NY
Jaypee University of Information and Technology Jul 2011 - May 2015
Bachelor’s in Computer Science and Engineering H.P, India