Manikanta J
Data Engineer
Email: *************@*****.*** | LinkedIn: linkedin.com/in/Manikanta | Mobile: +1-816-***-**** | United States
Summary
Experienced Data Engineer with *+ years of expertise in developing and optimizing data pipelines, enhancing system scalability, and improving data processing efficiency. Proficient in Python, SQL, Apache Spark, and cloud platforms including AWS, GCP, and Azure. Applied advanced ETL techniques and data transformations with Apache Spark, boosting data processing speed by 20%. Designed and maintained data models in AWS Redshift, supporting real-time analytics and enhancing reporting accuracy. Automated data extraction and integration processes, reducing manual work. Developed interactive dashboards with Tableau and Power BI to improve reporting efficiency.
Technical Skills
Programming Languages: Python, C, C++, Java, TypeScript, Scala, R, Bash
Web Technologies: HTML, CSS, Grafana, Yarn, Bootstrap, Sass, JavaScript, jQuery
Frontend Frameworks & Libraries: React.js, Angular.js, Next.js, OAuth, VBScript
Backend Frameworks & Technologies: Django, Flask, FastAPI, Node.js, Spring, Shell Scripting
Databases: SQL, MySQL, PostgreSQL, SQL Server, NoSQL, MongoDB, Cassandra
Software Development Methodologies: Agile, SCRUM, Waterfall
Cloud Platforms & Services: AWS (EC2, S3, Lambda, ECS, ECR, CloudFront, CloudWatch, CloudFormation), Azure, GCP, Salesforce, IaaS, PaaS, Dynatrace, SAP
DevOps & CI/CD: Git, GitHub, Jenkins, Docker, Kubernetes, CircleCI, Bitbucket
Messaging & Streaming: Redis, Kafka
Testing Frameworks: Jest
Data Science & Machine Learning Libraries: Pandas, NumPy, Matplotlib, Scikit-learn, SciPy, Plotly, Seaborn, Deep Learning, Databricks, Data Lakes, TensorFlow
Data Visualization Tools: Tableau, Power BI, Pega, Qlik, SAS, Terraform, BI Tools
Big Data Technologies: Apache Spark, Hadoop, Looker, Apache Kafka, Semantic
Other: Software Development Life Cycle (SDLC), Natural Language Processing, Excel, Statistics, Mathematics, Change Management, Technical Writing
Operating Systems: Linux, UNIX
Professional Experience
Voltas Consulting Inc | Data Engineer | United States | February 2024 – Present
Technologies: Python, SQL, Java, Django, Flask, HTML5, CSS3, JavaScript, TypeScript, Git, Apache Spark, AWS (EC2, S3, Lambda, ECS, EMR), Azure, AWS Redshift
Developed ETL and data pipelines in Python and SQL to process healthcare data, ensuring smooth ingestion, integration, and high-quality data flow.
Applied data transformations and aggregation logic in Apache Spark, leveraging efficient data structures to improve processing performance by 20% on large-scale healthcare datasets, while managing tasks in an Agile (Kanban) workflow.
Engineered cloud-based data solutions and pipelines using Azure and AWS services (S3, Lambda, EMR), improving system scalability, reliability, and real-time data ingestion, while managing source code throughout the software development life cycle.
Automated ingestion and extraction of medical claims data from multiple sources using PowerShell scripting, streamlining data management and improving reporting accuracy.
Designed optimized data models in AWS Redshift to support real-time analytics and reporting, integrating structured data pipelines into the data warehouse architecture and often utilizing T-SQL for complex queries and procedures.
Collaborated with teams to build web-based applications using Django and Flask, streamlining data access and leveraging clean data structures to improve operational efficiency by 10%.
I-Sparrow | Data Engineer | India | February 2021 – June 2023
Technologies: Google Cloud Platform (GCP), REST and SOAP APIs, JSON, Hadoop, Hive, PL/SQL, Snowflake, Jenkins, GitHub
Established and optimized ETL pipelines using Hadoop and Hive, reducing data processing time and enhancing storage efficiency for business intelligence applications.
Incorporated REST and SOAP APIs for seamless data extraction, transformation, and loading, improving system performance by 15% and enhancing scalability.
Managed large-scale data workflows and automated processes with Jenkins and GitHub Actions, improving deployment speed and consistency, while overseeing source code throughout the software development life cycle.
Used Snowflake to design data models and optimize queries, reducing query response time by 30% and improving data access reliability for analytics teams, leveraging SQL and T-SQL for complex data manipulations.
Developed interactive dashboards in Tableau and Power BI to provide key stakeholders with real-time business insights, increasing reporting efficiency.
Collaborated in an Agile environment, including Kanban, to deliver high-quality, data-driven solutions while adhering to project timelines and meeting sprint goals.
Projects
ETL Pipeline with Scheduling and Monitoring April 2021 – October 2021
Constructed an end-to-end ETL pipeline, automating data extraction from APIs and databases, reducing manual processing.
Improved data transformation processes, increasing data handling speed by 50%, and ensured seamless loading into AWS Redshift.
Configured and managed Apache Airflow for pipeline scheduling, automating daily data jobs and minimizing downtime.
Implemented robust error monitoring and logging, reducing data processing failures by 30% and enhancing system reliability.
Education
Master of Science in Computer Science August 2023 – May 2025 University of Central Missouri
Certificates & Achievements
Attained AWS Data Engineer - Associate Certification.
Earned AWS Developer - Associate Certification.
Acquired Oracle Database SQL Certified Associate Certification.