Praketa Saxena | Software Engineer | Work Experience: ** yrs | Jersey City, NJ, USA | 650-***-****
*************@*****.*** | LinkedIn | GitHub Repo | Portfolio
Certifications
●GCP Professional Data Engineer
●GCP Professional Cloud Security Engineer
●GCP Associate Cloud Engineer
●Google Cloud Digital Leader
●Base SAS
Executive Summary
●Software Engineer with 11 years of experience
●Proficient in developing, deploying, and maintaining software applications and distributed data pipelines on-prem and on AWS Cloud, Databricks, Airflow, and big data workflows
●Proficient in Python, Java, SQL, Scala, Spark, PySpark, Hadoop, Cloudera, Oracle Exadata, MS SQL Server, and SSIS
Projects
●Credit Risk Data Pipeline: Developed a robust data pipeline framework for Bank of America's Credit Risk Platform using Python, Hadoop, and SQL.
●Network Monitoring Tool (NMT): Developed backend data automation, in Hadoop and Python, for an in-house full-stack Java/Angular application monitoring Samsung network cell nodes.
●Supply Chain Logistics Pipeline: Created and optimized data pipelines for Nike's supply chain logistics using Python, NiFi, Pandas, and PySpark.
●Web Scraping Automation: Implemented web scraping tasks across projects using Python, Selenium, Requests, and proxy-hopping techniques (a minimal sketch follows this list).
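A minimal sketch of the Requests-based scraping pattern with proxy hopping mentioned above; the proxy endpoints, URL, and retry count are placeholders, not values from the original projects.

```python
import random
import requests

# Hypothetical proxy pool; the real values lived in project config.
PROXIES = ["http://proxy1.example.com:8080", "http://proxy2.example.com:8080"]

def fetch_with_proxy_hopping(url, retries=3):
    """Fetch a page, hopping to a different proxy on each attempt."""
    for _ in range(retries):
        proxy = random.choice(PROXIES)
        try:
            resp = requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10)
            resp.raise_for_status()
            return resp.text
        except requests.RequestException:
            continue  # hop to another proxy on failure
    raise RuntimeError(f"all {retries} attempts failed for {url}")

html = fetch_with_proxy_hopping("https://example.com/listings")
```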
Professional Experience
Morgan Stanley
Software Engineer NYC, USA Feb 24 - Present
●Designed and implemented a distributed K-Means clustering system using Python, Scala, Spark, HDFS, and Hive; extracted transaction-level model features (type, amount, currency, timestamp, destination, channel) and correlated them with originator profiles and external third-party risk sources (PEP and sanction-screening data, adverse media feeds, blacklist data) to improve high-risk originator alerts for SEC daily feeds (a simplified sketch follows this list).
●Developed and exposed clustering functionality via Flask REST APIs, supporting real-time predictions on originator IDs with daily alert flagging and daily model retraining on T-1 data and features.
●Engineered and benchmarked a custom Python-based K-Means framework against rule-based models, reducing false alerts and supporting incremental training logic.
●Automated deployment pipelines using Jenkins, Autosys, Groovy, and YAML, managing build and release workflows across QA, UAT, and PROD environments on RHEL with end-to-end monitoring.
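A minimal PySpark sketch of the daily clustering step described in the first bullet; the table, column names, and parameters (such as k=8) are illustrative assumptions, not the production configuration.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans

spark = (SparkSession.builder.appName("originator-clustering")
         .enableHiveSupport().getOrCreate())

# T-1 transaction-level features from Hive; assumes categorical fields such as
# txn_type/channel were index-encoded upstream. Table and columns are invented.
txns = spark.sql("""
    SELECT originator_id, amount, txn_type_idx, channel_idx, hour_of_day
    FROM risk.transactions
    WHERE ds = date_sub(current_date(), 1)
""")

features = VectorAssembler(
    inputCols=["amount", "txn_type_idx", "channel_idx", "hour_of_day"],
    outputCol="features",
).transform(txns)

# Retrained daily on T-1 data; k is arbitrary here.
model = KMeans(k=8, seed=42, featuresCol="features", predictionCol="cluster").fit(features)
scored = model.transform(features)  # cluster assignments feed the alert-flagging step
```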
Bank of America
Application Programmer NJ, USA Feb 23 - Feb 24
●Built and deployed large-scale data pipeline frameworks using Python, Scala, SQL, PL/SQL, and Oracle Exadata, supporting credit and market risk analysis across diverse financial instruments. Developed Oracle stored procedures and triggers to automate validation, enrichment, and regulatory reporting logic directly within the database layer, optimizing compute-intensive workflows.
●Designed and developed JVM-based batch applications in Scala to ingest and transform data across Cloudera Hadoop, Impala, and Oracle Exadata. Integrated Red Hat AMQP-based messaging pipelines to orchestrate data movement from SFTP landing zones into Hive/HDFS via automated Spark or Sqoop jobs. Leveraged PL/SQL stored procedures and database triggers for downstream transformation and audit tracking within Exadata.
●Automated and orchestrated end-to-end risk workflows using Autosys (JIL scripting), Jenkins (Gradle/Groovy pipelines), and optimized Spark jobs (caching, partitioning, delta logic) to ensure timely execution of risk models and data pipelines under stringent SLAs. Incorporated Exadata-specific features like smart scans, storage indexes, and parallel execution, alongside PL/SQL-based error handling and notification logic via triggers.
●Processed historical and real-time market data (pricing, volatility, rates) using Spark, Impala, and PL/SQL routines on Exadata to compute VaR, PnL, and sensitivities (delta, gamma). Enabled high-performance analytics via partitioned tables, materialized views, and trigger-driven refresh mechanisms, supporting daily exposure and stress testing across multiple market portfolios (a simplified VaR sketch follows this list).
●Integrated third-party credit feeds and internal financial data, leveraging Python, internal Scala Spark frameworks, and PL/SQL APIs to streamline ETL pipelines for counterparty risk, default probability, and collateral analytics. Built Oracle stored procedures to encapsulate business rules and audit logic; deployed database triggers to enforce data integrity, manage state transitions, and initiate downstream actions. All code and schema changes were tracked in JIRA and version-controlled in Bitbucket.
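A toy illustration of the VaR computation referenced above, using historical simulation in pandas rather than the Spark/Impala/PL/SQL stack the team used; the PnL series and confidence level are invented.

```python
import numpy as np
import pandas as pd

# Synthetic daily PnL series standing in for a portfolio's historical PnL.
rng = np.random.default_rng(0)
pnl = pd.Series(rng.normal(loc=0.0, scale=1_000_000, size=500), name="daily_pnl")

# Historical-simulation VaR: the loss at the chosen tail percentile of PnL.
confidence = 0.99
var_99 = -pnl.quantile(1 - confidence)
print(f"1-day 99% VaR: ${var_99:,.0f}")
```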
Samsung
Senior Tech Lead Plano, TX, USA Apr 21 - Feb 23
●Built and deployed batch data pipelines using Python, Hadoop, and SQL to process Samsung network cell node data for operational insights and performance analysis.
●Developed Spark-based ETL workflows to automate data extraction, transformation (via Pandas), and loading into HDFS for telecom data aggregation and reporting.
●Automated web data scraping and transformation using Python and Selenium to collect structured datasets from public and internal web sources (a headless-browser sketch follows this list).
●Integrated Nagios monitoring to track system resource usage, pipeline execution status, and trigger alerts on failures across distributed Hadoop infrastructure.
●Streamlined infrastructure tasks using Python scripting and Rsync for file synchronization, with project tracking and collaboration handled via JIRA.
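A minimal sketch of the Selenium-based scraping described above, assuming headless Chrome; the URL and CSS selectors are placeholders, not Samsung's internal sources.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")  # run without a display on a pipeline host

driver = webdriver.Chrome(options=options)
try:
    driver.get("https://example.com/cell-node-status")  # placeholder URL
    rows = driver.find_elements(By.CSS_SELECTOR, "table#nodes tr")
    records = [[cell.text for cell in row.find_elements(By.TAG_NAME, "td")]
               for row in rows]
finally:
    driver.quit()
```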
Nike
Senior Tech Lead Portland, Oregon, USA Apr 20 - Apr 21
●Designed, built, and deployed end-to-end applications for real-time inventory, demand, and supply visibility across global vendors, optimizing procurement and stock management.
●Migrated supply chain workflows from SAP-based systems to AWS S3, implementing batched ETL pipelines with Python, Airflow, and Amazon EMR for scalable data processing.
●Designed and implemented a production-ready RESTful API using Flask to expose critical supply chain metrics (a minimal endpoint sketch follows this list).
●Developed data transformation and validation NiFi workflows using Pandas and PySpark, ensuring high-quality inputs for business intelligence and reporting.
●Programmatically managed AWS resources such as S3 and EC2 using Boto3, and maintained project tracking and resolution via JIRA, supporting agile delivery and operational continuity.
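A minimal Flask sketch of the supply chain metrics API; the route, payload fields, and stubbed data loader are invented for illustration.

```python
from flask import Flask, jsonify

app = Flask(__name__)

def load_inventory_metrics(vendor_id):
    # Stub standing in for the real supply chain data store query.
    return {"vendor_id": vendor_id, "on_hand": 1200, "in_transit": 300}

@app.route("/metrics/inventory/<vendor_id>")
def inventory_metrics(vendor_id):
    return jsonify(load_inventory_metrics(vendor_id))

if __name__ == "__main__":
    app.run(port=5000)
```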
Fledging Electronics
Full Stack Developer Intern Birmingham, AL, USA Apr 19 - Sep 19
●Developed and integrated Python-based Airflow data workflows on AWS EC2, transforming backend datasets and visualizing KPIs through Amazon QuickSight dashboards (a minimal DAG sketch follows this list).
●Designed and deployed secure AWS VPC architectures, configuring IAM roles/policies and managing RDS instances and S3 buckets for data storage and access control.
●Implemented automated ETL pipelines using standard Python libraries (pandas, boto3), scheduled to deliver daily, weekly, and quarterly reports with minimal manual intervention.
●Built custom web crawlers and data scrapers to ingest external data sources into internal pipelines, supporting business intelligence and trend analysis.
●Streamlined in-house reporting workflows by orchestrating Python automation scripts on EC2, reducing report generation time and improving data freshness.
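A minimal Airflow 2-style sketch of the scheduled pandas/boto3 reporting pipeline described above; the bucket, keys, columns, and schedule are placeholders.

```python
from datetime import datetime
import io

import boto3
import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_and_report():
    # Pull a raw extract from S3, aggregate, and write the report back.
    s3 = boto3.client("s3")
    body = s3.get_object(Bucket="company-data", Key="raw/sales.csv")["Body"].read()
    df = pd.read_csv(io.BytesIO(body))
    summary = df.groupby("region")["revenue"].sum()
    s3.put_object(Bucket="company-data", Key="reports/daily_summary.csv",
                  Body=summary.to_csv().encode())

with DAG(
    dag_id="daily_kpi_report",
    start_date=datetime(2019, 4, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="extract_and_report", python_callable=extract_and_report)
```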
Dept. of E.C.E U.A.B
Graduate Assistant Birmingham, AL, USA Dec 18 - Sep 19
●Designed a front-end visualization of matrix multiplication using HTML, CSS, and JavaScript.
●Benchmarked and optimized the matrix multiplication backing the visualization using standard C++ libraries (a benchmarking sketch follows this list).
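The original benchmark used standard C++ libraries; purely to illustrate the benchmarking approach, here is a Python sketch timing a naive triple-loop multiply against an optimized library routine.

```python
import timeit
import numpy as np

def naive_matmul(a, b):
    """Textbook triple-loop matrix multiply over Python lists."""
    n, p, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for k in range(p):
            aik = a[i][k]
            for j in range(m):
                out[i][j] += aik * b[k][j]
    return out

n = 100
a, b = np.random.rand(n, n), np.random.rand(n, n)
al, bl = a.tolist(), b.tolist()

print("naive:", timeit.timeit(lambda: naive_matmul(al, bl), number=3))
print("numpy:", timeit.timeit(lambda: a @ b, number=3))
```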
MasterCard
Associate Analyst New York, USA Dec 15 - Jul 18
●Designed and deployed SSIS (.dtsx) packages with C# script components for end-to-end ETL workflows for banking clients: extracting from flat files, transforming datasets, and loading into MS SQL Server staging and production environments (a Python analogue of the staging load follows this list).
●Automated Excel-based data validation, formatting, and reporting using advanced VBA scripting integrated with SQL queries and ADO connections.
●Developed reusable SSIS components, including Data Flow and Script Tasks, to implement business logic and perform lookup-based enrichment.
●Enabled non-technical users to trigger ETL jobs from Excel through macro-driven Shell calls to SSIS packages, streamlining business reporting pipelines.
●Optimized SQL Server stored procedures and views used by both SSIS packages and Excel dashboards to reduce query time and support real-time insights.
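The original load ran inside SSIS Data Flows; as a rough Python analogue of the flat-file-to-staging step, assuming a hypothetical stg.client_extract table and a pipe-delimited extract:

```python
import pandas as pd
import pyodbc

# Connection string, table, and columns are placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sqlhost;DATABASE=staging;Trusted_Connection=yes;"
)
cursor = conn.cursor()
cursor.fast_executemany = True  # batch inserts for large flat files

df = pd.read_csv("clients_extract.txt", sep="|")  # pipe-delimited flat file
df["loaded_at"] = pd.Timestamp.now()              # simple enrichment step

cursor.executemany(
    "INSERT INTO stg.client_extract (client_id, balance, loaded_at) VALUES (?, ?, ?)",
    list(df[["client_id", "balance", "loaded_at"]].itertuples(index=False, name=None)),
)
conn.commit()
conn.close()
```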
DesignTech Systems
Automation Engineer, CAD/CAM Pune, Maharashtra, India Apr 14 - Dec 15
●Designed and developed solutions for CAD object automation and validation, using inventory BOMs and Autodesk Inventor CAD metadata to drive the associated Product Margin and Inventory Profiling reports in Base SAS.
●Developed back end infrastructure and data models to manage CAD data for Autodesk Vault and MS SQL Server Express.
●Designed and developed Excel-based Product Margin and Inventory Profiling reports using advanced formulas, pivot tables, and VBA-driven data transformations (a pandas pivot analogue follows this list).
●Built automated reporting workflows with VBA macros and standard templates to generate and distribute weekly Excel deliverables with minimal manual input.
●Developed relational backend structures within Excel (using named ranges, structured tables, and Power Query) to manage and analyze Q.S.R. operations data.
●Streamlined business reporting by integrating dynamic VBA modules for data refresh, formatting, validation, and PDF export.
●Standardized recurring Excel reporting tasks to improve consistency, accuracy, and delivery speed across the reporting lifecycle.
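A pandas sketch of the pivot-style Product Margin summary described above; the sample data and output filename are invented (the real reports were built in Excel with VBA and Power Query).

```python
import pandas as pd

# Invented sample data standing in for the CAD/inventory extracts.
orders = pd.DataFrame({
    "product": ["A", "A", "B", "B"],
    "region":  ["East", "West", "East", "West"],
    "revenue": [1000, 1500, 800, 1200],
    "cost":    [600, 900, 500, 700],
})
orders["margin"] = orders["revenue"] - orders["cost"]

# Pivot-table-style margin summary, exported as an Excel deliverable.
report = orders.pivot_table(index="product", columns="region",
                            values="margin", aggfunc="sum", margins=True)
report.to_excel("product_margin_report.xlsx", sheet_name="Margins")
```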
Education
●Master's in Electrical and Computer Engineering U.A.B Birmingham, Alabama, USA Sep 2018 - Dec 2019
●Bachelor's in Aerospace Engineering Amity University, India Sep 2010 - Dec 2014
Activities & Hobbies
●Singapore Patent: Secure mobile transaction authentication via NFC Binding
●Badminton, Motorcycling, Reading, Music