Senior Software Engineer

Location:
Cambridge, MA
Posted:
March 31, 2021

Resume:

RAGHVENDRA SINGH

716-***-**** Ellery St, Cambridge, MA

LinkedIn GitHub Email

TECHNICAL SKILLS

Programming Languages: Scala, Python, Java

Database: MS-SQL, PostgreSQL, AWS S3, AWS Aurora, Exasol

Data Visualization: Tableau, Microsoft-SSRS, Matplotlib, Seaborn

Big Data: Spark (Scala, Python), Presto, Redis, Hive, HDFS

Other Tools: Docker, Jenkins, Apache Airflow, Databricks, GitHub, Microsoft SSIS

PROFESSIONAL EXPERIENCE

SecurityScorecard - SSC is a late-stage funded cybersecurity ratings startup that helps organizations monitor risks to their cloud-based security in real time and analyzes the third-party vendors they work with.

Senior Software Engineer Jan 2021 – Present (Boston)

• Created a data workflow that generates alerts on score changes and writes the events to an SQS queue, allowing customers to be notified whenever a scorecard they follow is updated

• Migrated platform data storage from a legacy MongoDB database to AWS Aurora (Postgres), implementing a snapshot design via Spark to allow a decoupled database access pattern

Software Engineer Feb 2020 – Dec 2020 (Boston)

• Worked with Scala, Spark, and Postgres, in addition to Airflow, Databricks, and AWS EMR/S3, as part of the core scoring team, helping score more than a million domains daily

• Implemented REST API endpoints for newly added measurements using the Scala akka-http module, helping customers easily integrate the data points into their analytics

• Replaced default hash partitioning in Spark with custom field-based partitioning to resolve skew between customers with large and small datasets, equalizing write times across measurements and reducing pipeline time from 4-5 hrs to 2.5-3 hrs

• Improved the efficiency of the data pipeline's Airflow DAG, implemented new features, and fixed bugs that cropped up in the daily run

• Contributed continuously to SSC’s GitHub repo, implementing the migration from HDFS to Postgres and reviewing changes

Data Engineering Intern Jun 2019 – Dec 2019 (NYC, US)

• Implemented a POC to replace the Hadoop Distributed File System (HDFS) with AWS Aurora (Postgres) as a serving layer

• Worked with AWS EMR and AWS S3 to facilitate the scoring directed acyclic graph (DAG), which processes 4 TB of data daily through the analytics pipeline

• Created a data pipeline publishing metrics, such as the number of domains scored in the past week and new malware found per domain, to a public S3 bucket, ultimately showcased at trust.securityscorecard.com

McKinsey & Company – Data Engineer Jan 2018 – Aug 2018 (Gurugram, IN)

• Developed a Python module for pre-ETL testing, quantifying the accuracy of data files against the source of truth

• Automated the integration process, bringing execution time down from 8 hours to 4 hours

• Designed Tableau reports over derived data, highlighting product performance across geographies

• Created stored procedures, functions, triggers, and complex SQL joins to load and manipulate data from SQL Server

Infosys – Data Engineer Jul 2015 – Dec 2017 (Bangalore, IN)

• Developed code in MS-SQL and used MS-SSIS to automate the entire data integration process, reducing workflow time from 4 days to half a day

• Formulated an ETL pipeline & reporting module for mortgage data, identifying potential leads for the marketing team

• Designed an interactive dashboard using MS-SSRS for stakeholders to measure store performance across the North Pacific region

PROJECTS

Super-Resolution GAN (SR-GAN) implementation using a Python deep learning framework - GitHub

• Implemented a generative adversarial network (GAN), with the vanilla residual blocks replaced by Inception-A blocks, to generate higher-resolution images from low-resolution originals as part of the master’s thesis

Apply-rate prediction for job applicants (Boosting classifier, Linear Regression, SK-learn, Python) - GitHub

• Worked with a heavily imbalanced dataset and predicted whether a candidate applied to a job using a gradient boosting classifier, achieving an AUC of 0.85

• Imputed around 25 percent of missing values using linear regression and cleansed duplicate applicants

EDUCATION

University at Buffalo Sep 2018 - Feb 2020

Master’s in Computer Science Buffalo, NY

Jaypee University of Information and Technology Jul 2011 - May 2015

Bachelor’s in Computer Science and Engineering H.P, India

