Data Engineer - AI & Geospatial Analytics Expert

Location:
Houston, TX
Posted:
February 03, 2026

Resume:

Manikanta Guduru

Data Engineer Ph: +1-913-***-**** Email: **************@*****.*** LinkedIn

Professional Summary:

Data Scientist with 4+ years of experience designing and deploying scalable, cloud-native AI and data solutions. Proven track record of delivering machine learning models, real-time pipelines, and geospatial analytics across the transportation, healthcare, and life sciences industries. Expert in Python, Spark, SQL, Snowflake, and AWS/Azure platforms. Adept at cross-functional collaboration, driving business impact through applied AI, and leading initiatives across the full data product lifecycle.

TECHNICAL SKILLS:

Languages: Python, SQL, PySpark, Java, Shell.

Frameworks: Apache Spark, Spark Streaming, Hadoop, Airflow, NiFi.

ML/AI Tools: Scikit-learn, XGBoost, SVM, TensorFlow (basic), CNN, NLP (spaCy, TF-IDF, embeddings)

Data Platforms: Snowflake, Redshift, PostgreSQL, SQL Server, BigQuery, Oracle, Teradata.

Cloud & Tools: AWS (Glue, EMR, S3, Step Functions), Azure (ADF, Data Lake, HDInsight).

Visualization & GIS: Tableau, Power BI, ArcGIS, QGIS, spatial SQL, mapping APIs.

Streaming & Messaging: Kafka, Kinesis.

CI/CD & IaC: Terraform, Jenkins, Azure DevOps, GitHub, Bitbucket.

Certification: SnowPro Core Certified

Professional Experience:

Client: Texas Department of Transportation Role: Data Analyst Nov 2023 to Present

Responsibilities:

Built and maintained GIS data mapping pipelines to standardize roadway, asset, and project-location data, enabling consistent geocoding, spatial joins, and map-ready layers for planners.

Integrated shapefiles and feature layers into Snowflake and AWS storage, validating coordinate systems and geometry quality for statewide reporting.

Developed an AI/ML-driven cost estimation tool that predicts project cost ranges using historical engineer estimates, bid values, scope attributes, schedule and material features, and data from the FRED API.

Engineered training datasets using Snowflake, Python (Pandas, NumPy), and Spark, and built predictive models using Random Forest, XGBoost, and SVM baselines; experimented with CNN-based approaches for pattern learning on structured representations.

Applied NLP on unstructured project descriptions and notes to extract signal (keywords, topics, embeddings) and improve estimate accuracy and explainability for engineering stakeholders.

Implemented automated data quality checks (nulls, duplicates, referential integrity) and performance tuning (clustering keys, pruning-friendly modeling, warehouse sizing) to keep dashboards fast and trustworthy.

Client: Kaiser Permanente Role: Data Engineer Jan 2023 to Oct 2023

Responsibilities:

Built secure ingestion pipelines in Azure Data Factory and Azure Databricks (PySpark) for clinical and subscription datasets, implementing HIPAA-aligned access controls, encryption, and audit logging.

Developed reusable PySpark transformation frameworks for cleansing, deduplication, and aggregations to standardize pipeline patterns and accelerate new dataset onboarding.

Optimized Databricks transformations through partitioning and join strategies, and managed Delta tables and compaction for consistent performance.

Client: CRISPR Therapeutics Role: Data Engineer April 2020 to July 2021

Responsibilities:

Built scalable AWS Glue + Spark ETL pipelines orchestrated with Airflow to ingest, cleanse, and transform genomic and clinical trial datasets, including incremental loads and automated validation checks.

Designed star-schema dimensional models and curated analytics marts to support near real-time reporting, enabling fast self-service analysis for trial operations and outcome tracking.

Optimized large-scale Spark processing by tuning partitioning, join strategies, and file compaction, improving pipeline performance and reliability for the downstream BI team.

Education:

M.S. in Data Science — Wichita State University, Wichita, KS.

B.Tech. in Computer Science — KL University, India.


