Software Engineer Data Scientist

Location:

Troy, MI

Posted:

April 03, 2024

Resume:

A Sakshira Reddy

SOFTWARE ENGINEER / DATA SCIENTIST

Michigan, USA Mobile: 908-***-**** Email: **************@*****.***

SUMMARY

• With three years of professional experience as a Software Engineer, I specialize in Database Development, ETL Development, Data Modeling, Report Development, and Big Data Technologies, primarily in the Financial and Healthcare sectors.

• Proficient in Python, SQL, and R programming languages, utilizing libraries like NumPy, Pandas, TensorFlow, Scikit-Learn, and Matplotlib for analyzing and visualizing data.

• Skilled in managing databases across various platforms including MySQL, PostgreSQL, MongoDB, and Oracle.

• Well-versed in cloud technologies such as Microsoft Azure, AWS, and Redshift, focusing on optimizing data delivery within Agile methodologies.

• Specializing in Big Data tools like Hadoop, MapReduce, HDFS, Sqoop, Hive, NiFi, Kafka, and Apache Spark.

SKILLS

Languages: Python, SQL, R

Packages: NumPy, Pandas, Matplotlib, TensorFlow, Scikit-Learn

Database Management: MySQL, PostgreSQL, MongoDB, Oracle

Cloud Technologies: Microsoft Azure, Amazon Web Services (AWS), Redshift

Big Data Technologies: Hadoop, MapReduce, HDFS, Sqoop, Hive, NiFi, Kafka, Apache Spark

IDEs: Visual Studio Code, PyCharm, Jupyter Notebook

Visualization Tools: Tableau, Power BI, Advanced Excel (Pivot Tables, VLOOKUP, Logical Functions, INDEX & MATCH, VBA Macros)

Machine Learning Algorithms: Supervised Learning (Linear Regression, Logistic Regression, Decision Tree, Random Forest, SVM, Classification), Unsupervised Learning (Clustering, KNN, Factor Analysis, PCA)

Other Technical Skills: Data Management, Marketing Analytics, Information Technology Strategy, Jira, Digital Innovation, ETL/ELT Process Innovation & Management, Big Data Technology, Microsoft SQL Server, SSIS, SSRS, SSAS, Kubernetes, Snowflake, Informatica, Data Warehouse, Data Architecture Design, Data Security, Data Modeling, Data Integration

Version Control Tools: Git, GitHub, SVN, Bitbucket

Operating Systems: Windows, Linux, macOS

EXPERIENCE

RDIC Inc, NJ Jul 2022 – Present

Software Engineer

• Developed and applied data warehouse and processing algorithms for Pharmacy Benefits Management (PBM) projects, resulting in a 20% boost in data delivery efficiency within the Agile framework.

• Created, managed, and optimized reusable data pipelines using PySpark, Hadoop, and other modern technologies.

• Enhanced ETL/ELT development efficiency by 25% through PowerShell and CLI tools, improving performance, scalability, and reliability.

• Implemented Parquet files and global tables for streamlined storage and performance in MongoDB NoSQL databases.

• Provided support for existing Snowflake and AWS cloud Data Management systems, focusing on performance enhancement and feature implementation.

• Automated DAG tasks and workflow management with Apache Airflow, reducing manual intervention by 30%.

• Conducted data analysis using SQL, Python (Pandas, NumPy), and AWS services for efficient data processing and storage.

EDoor Solutions, India Jun 2020 – Jul 2021

Data Scientist

• Crafted a robust data pipeline employing Spark, Hive, and HBase, boosting the efficiency of integrating customer behavioral data and financial histories into the Hadoop cluster by 25% for analysis.

• Designed and implemented partitioned tables in Databricks with Spark, optimizing the data model in the Azure Snowflake data warehouse, and integrating NiFi with Snowflake to improve client session performance by 25%.

• Performed data labeling, exploration, and cleaning to improve accuracy and reduce loss during training, and implemented an ML model that achieved 92% precision.

• Utilized Azure Databricks to structure data into notebooks for streamlined visualization via dashboards.

• Effectively managed Spark Databricks by diagnosing issues, estimating needs, and monitoring clusters, resulting in a 20% increase in overall system reliability.

• Executed Data Aggregation, Validation, and Azure HDInsight operations using Python-based Spark scripts, cutting down data processing time by 15%.

• Implemented Azure Stream Analytics for real-time processing of Geo-Spatial data for targeted sales campaigns based on location.

• Oversaw the monitoring and administration of the Hadoop cluster using Azure HDInsight.

• Able to create prototype models and solutions in Python with scikit-learn.

• Experience utilizing both qualitative analysis (e.g., content analysis, phenomenology, hypothesis testing) and quantitative analysis techniques (e.g., clustering, regression, pattern recognition, descriptive and inferential statistics).

• Collaborate with engineers and other data scientists to evaluate and improve.

• Working knowledge of image and video processing techniques and algorithms.

• Experience with data visualization tools.

• Work in a high-performing team to identify new revenue opportunities and extract insights (patterns, correlations, etc.) from large data sets using dashboards, visualizations, and other reporting solutions, including SAP Business Objects, Tableau, and other tools.

EDUCATION

Master of Science in Information Systems

New Jersey Institute of Technology, Newark, New Jersey, USA

Bachelor of Engineering in Information Technology

Malla Reddy Engineering College For Women, Hyderabad, India
