KIRAN MAI MALLUVALASA
806-***-**** *****.***.***@*****.***
LinkedIn GitHub Medium
SUMMARY
5+ years of experience in the IT industry, specializing in data engineering, data analytics, cloud migration, and automation. Proven ability to design and deliver data-driven solutions that enhance operational efficiency and drive business insights. Proficient in Databricks, advanced SQL, and Python, and skilled in cloud-based architectures and DevOps tools such as Docker and CI/CD pipelines. Demonstrated ability to lead end-to-end projects from inception to deployment using Agile methodologies, including the development of data-driven interfaces and dashboards that empower users with actionable insights.
SKILLS:
• Frontend Development: HTML5, CSS, SASS, JavaScript, React, Next.js
• Backend Development: Node.js, Express.js
• Database Management: Oracle, SQL Server, MySQL, Teradata, DB2, HBase
• Cloud & DevOps: Azure, AWS, Docker, CI/CD pipelines.
• API Development: REST APIs, Postman
• Version Control: Git, CVS.
• Programming Languages: C, C++, C#, Java, Python.
• Agile Methodologies: Agile (Scrum, Kanban), Waterfall
• Security Protocols: OAuth, Active Directory, Firebase.
• ETL Tools: Informatica PowerCenter, Power Exchange, MDM, IDQ.
• Big Data: Hadoop, Hive, Pig, Sqoop, HDFS
• Project Management: SDLC, Process Improvement, Problem-solving.
EDUCATION:
Texas Tech University - Master's, Computer Science
GITAM University - Bachelor's, Computer Science
Certificates & Awards:
Microsoft Fundamentals (2021)
Power BI Data Analyst Associate (2024)
Cognizant Cheers Award (2022)
Silver Award (client appreciation for innovative problem-solving strategies) (2022)
PROFESSIONAL EXPERIENCE:
Texas Tech University (Non-Profit Organization) Lubbock, Texas Feb 24 – Present Data Engineer
• Conducted business requirements analysis for ORS, ORI, and RDMA, delivering data-driven solutions that enhanced operational efficiency by 25%.
• Designed and developed reliable ETL/ELT pipelines across OneLake, Azure Data Lake/Blob, Azure SQL, and Fabric, using Fabric Data Pipelines / Data Factory, Databricks/Spark SQL, and Python to transform high-volume data from SIS, LMS, HR, and finance applications.
• Led the migration of legacy SQL Server and departmental databases to Fabric Warehouse/Lakehouse (and Azure Synapse where required), designing schema mappings, validation checks, and cutover plans to ensure zero data loss and uninterrupted access for critical campus systems.
• Improved Fabric report and model performance by optimizing semantic models, enabling Direct Lake where applicable, implementing incremental refresh strategies, and refactoring inefficient measures, boosting dashboard responsiveness by 38%.
• Integrated academic and administrative data from multiple platforms, including Azure SQL, Snowflake, Azure Synapse, and legacy BI datasets, into Microsoft Fabric (Lakehouse/Warehouse), delivering a unified analytics view for academics, admissions, finance, and student services teams.
• Designed Power BI dashboards with advanced visualizations and DAX functions, improving decision-making efficiency by 30%.
• Created advanced SQL queries, enhancing report accuracy and quality by 20%.
• Led data migration to Azure and Databricks, modernizing infrastructure and increasing data availability by 15%.
• Automated workflows with Power Automate and Python, boosting productivity by 25%.
• Created and managed tasks in Asana and Jira, incorporating Agile methodology to track deliverables, prioritize backlog items, and enhance team collaboration.
• Utilized Snowflake for scalable data warehousing and analysis, integrating it with business intelligence tools to streamline reporting.
• Knowledgeable in SAS, R, Cayuse, BANNER, IBM Cognos Framework Manager, Report Studio, and F&A Reports for data analysis and reporting.
• Developed research metrics dashboards and reports for TTU, integrating diverse data sources like HERD and patents data.
• Implemented row-level security and managed user roles to ensure data accessibility and security.
• Utilized Power BI Gateway for real-time report updates and ensured consistency across schemas and databases.
• Maintained report logic, validated data integrity, and published metrics reports on the university portal.
• Implemented forecasting models using machine learning, specifically leveraging the PolyS2 model, to predict the number of awards, proposals, and funding trends, improving strategic planning and decision-making accuracy by 20%.
Texas Tech University Lubbock, Texas May 23 – Dec 23 Student Assistant
• Spearheaded the migration of a legacy system by redeveloping core modules using JavaScript and C#, resulting in a 40% improvement in system performance and increased reliability.
• Improved front-end user interaction by developing components in React.js, increasing user satisfaction scores by 20% through faster, more responsive navigation.
• Configured and managed cloud-based infrastructure using Microsoft Azure, leading to a 35% cost reduction in server infrastructure while maintaining peak performance.
• Implemented CI/CD pipelines, reducing deployment time by 50% and enabling faster, more frequent, and more reliable software releases.
• Reduced security risks by implementing industry-standard protocols such as Active Directory and OAuth, decreasing unauthorized access incidents by 15%.
• Designed and optimized database structures using SQL Server, enhancing query performance and reducing data retrieval times by 20% for large datasets.
• Collaborated with UI/UX teams to create more responsive web designs, improving mobile user experiences by 30% and increasing traffic by 18%.
• Developed custom automation tools and software, reducing manual data entry and processing by 15% and significantly improving departmental workflow efficiency.
• Mentored and provided code reviews for junior developers, improving coding standards and team efficiency by 15%, and reducing post-deployment bugs by 10%.
Cognizant (Client: AVEVA) Bangalore, India Jun 21 – Jul 22 Programmer Analyst
• Utilized Informatica PowerCenter to develop ETL mappings, enabling seamless data integration and improving data processing efficiency by 30%.
• Designed mappings to process data into landing tables and created physical data objects using SQL queries, increasing data accessibility for business users by 25%.
• Authored and maintained shell scripts to backup, rename, and transfer data files from S3 buckets to servers, streamlining workflows and reducing manual intervention by 40%.
• Integrated automation schedules using Informatica Scheduler and Crontab, improving operational efficiency by 35%.
• Conducted comprehensive testing of applications in MDM Hub Console, IDQ, and S3 Browser, identifying and resolving 95% of defects through collaboration with developers using Azure DevOps.
• Created detailed test plans, scenarios, and cases in Microsoft Excel, tracking and reporting issues to facilitate efficient debugging and enhance software quality by 20%.
Cognizant Chennai, India Feb 21 – Jun 21 Intern
• Completed training on Microsoft Azure, focusing on pipeline development, cloud deployment, and DevOps practices, contributing to a project that automated CI/CD pipelines, increasing software delivery efficiency by 40%. This included deploying robust pipelines to streamline builds, testing, and deployment workflows, reducing manual intervention and errors by 30%.
• Executed projects using Hadoop Hive, PySpark, and R, achieving a 50% improvement in data processing and transformation efficiency by optimizing queries and leveraging distributed computing for large-scale datasets. These efforts significantly accelerated data preparation for analytics and reporting.
• Earned Java programming certification, demonstrating proficiency in object-oriented programming and advanced algorithms. Applied problem-solving skills to real-world scenarios, reducing development time for key modules by 25%.
• Enhanced teamwork and professional communication skills, enabling seamless collaboration on cross-functional teams and increasing project delivery efficiency by 20% through effective stakeholder management and streamlined team coordination.
• Designed secure data pipelines by implementing Kerberos authentication for Hadoop clusters and enforcing fine-grained access control using AWS IAM, ensuring data privacy and HIPAA-compliant cloud operations.
• Automated daily ingestion workflows using Sqoop and Apache Oozie, extracting data from relational databases and loading it into Hive external tables with dynamic partitioning for efficient querying and downstream analytics.
• Developed data validation and quality-check utilities using Python and PySpark to ensure consistency across MySQL, Oracle, FTP, flat files, and Hive/HDFS, streamlining QA for compliance-driven pipelines.
PROJECTS
F1 Racing Data Integration and Transformation
• Led the integration of F1 racing telemetry data into Azure Databricks, enabling real-time data processing and analytics for race performance insights, boosting data accessibility by 35%.
• Transformed raw telemetry data using PySpark, achieving a 50% improvement in processing efficiency and ensuring data readiness for downstream analytics and machine learning workflows.
• Designed and implemented dynamic visualizations with Power BI, providing actionable insights into team and driver performance, which enhanced decision-making capabilities by 30%.
• Utilized Databricks Delta Lake to establish a unified data layer, improving data reliability and query performance by 25%.
• Developed optimized ETL pipelines to automate the ingestion, transformation, and storage of large datasets, reducing manual effort by 40%.
• Ensured platform scalability and performance, accommodating data streams from multiple sources and reducing latency by 20%.
• Collaborated with cross-functional teams to align technical implementations with business goals, achieving a 25% improvement in project delivery timelines.
• Documented workflows and best practices for future scalability and knowledge sharing.
• Link: GitHub - kiranmaimalluvalasa/Databricks
Cognizant (HDFS Data Management and Processing)
• Created sample data files (Salaries.csv and Percentages.csv) and uploaded them to a designated directory in HDFS, improving data storage management by 30%.
• Executed HDFS report commands to analyze block distribution, reducing data retrieval time by 25% and ensuring data integrity across distributed storage.
• Improved file retrieval efficiency from HDFS by 40% through optimized data access processes.
• Enhanced data processing efficiency by 20% through optimized file handling in HDFS for large datasets.
• Streamlined the file management process in HDFS, reducing overall data processing steps by 35%.