Praveen Kumar Polinati Data Engineer
San Jose, CA +1-669-***-**** ************@*********.***
SUMMARY
Innovative Senior Big Data Engineer with 4+ years of experience in designing, developing, and optimizing highly scalable and efficient data infrastructure that drives key business insights and supports large-scale analytics initiatives.
Proven expertise in building and maintaining high-performance data pipelines and infrastructure that scale with rapid business growth and change, utilizing a wide range of big data technologies and cloud platforms.
Established and maintained a robust data quality framework, implementing automated checks that improved data accuracy by 45% and ensured compliance with industry standards and regulatory requirements.
EDUCATION
San Jose State University ME in Data Engineering Aug 2022 – May 2024 San Jose, CA
BML Munjal University BE in Electrical, Electronics and Communications Engineering Aug 2016 – Jul 2020 Kapriwas, HR
TECHNICAL SKILLS
Programming Languages: Python, R, SQL, C++, JSON-RPC, Scala, Java, Unix Shell Scripting, React
Database Systems: SQL (MySQL, PostgreSQL), Redshift, MongoDB, Teradata, Netezza, BigQuery
NoSQL databases: DynamoDB, Neo4J, Titan, Elasticsearch, Storm
Performance Tuning: SQL Tuning, Execution Plan Analysis, Index Optimization
Big Data Technologies: Hadoop, Apache Spark, Apache HBase, Apache Hive, MapReduce, ClickHouse, Presto
Cloud Platforms: AWS (Redshift, EMR, IAM), Azure (SQL Database, ADLS2, Event Hub, Data Lake), Google Cloud
Machine Learning & NLP: TensorFlow, PyTorch, scikit-learn, JAX, JIT, NLTK, spaCy, Hugging Face, Transformers
Version Control & Collab Tools: Git, SVN, Jira, Confluence, Trello, GitHub, Docker, Kubernetes, ITIL framework
Other Skills: Documentation, Team Collaboration, Data Cleaning, Data Wrangling, Qlik, Troubleshooting, Data Warehousing, Data Analysis, Data Migration, Data Visualization
WORK EXPERIENCE
Morgan Stanley, USA Data Engineer Jan 2024 – Present
Architected comprehensive BI solutions using Power BI and Tableau, providing actionable insights that improved decision-making efficiency by 40% across multiple business units, including investment management and risk assessment teams.
Implemented robust statistical data quality procedures on new data sources, supporting Data Scientists in data preparation and visualization, resulting in 30% faster insight generation and a 25% increase in data reliability for critical financial models.
Utilized cutting-edge Generative AI techniques to automate data extraction and risk management processes, reducing manual effort by 60% and improving accuracy of insights by 25% in areas of market trend analysis and portfolio risk assessment.
Collaborated with Big Data Policy and Security teams to create comprehensive data policies and develop sophisticated interfaces for data synthesis and anonymization, ensuring 100% compliance with data protection regulations such as GDPR and CCPA.
Developed and implemented a Generative AI tool for rapid report generation and data visualization, increasing reporting efficiency by 45% and enabling real-time analysis of market trends and investment performance.
Optimized ETL processes for integrating diverse data sources, including financial market data, company performance metrics, and ESG indicators, improving data integration speed by 55% and enabling more comprehensive investment analysis.
Optimized large-scale data transformations using JAX, leveraging JIT compilation and vectorization for high-performance processing.
Gained expertise in the ISS Proxy Exchange platform, integrating it with Aladdin and other external systems to streamline voting operations and enhance data flow efficiency by 35%.
Supported the development of next-generation client reporting tools, automating manual processes and connecting stewardship activities to third-party data, resulting in a 50% reduction in reporting time.
KPIT Business Solutions, India Jr. Data Engineer Aug 2019 – Jun 2022
Designed and implemented sophisticated ETL processes using Python, SQL, and Apache Airflow, reducing data processing time by 50% for high-volume data integration tasks involving multiple financial data sources.
Developed and maintained robust data pipelines using Apache Airflow and Kafka, ensuring efficient, real-time data flow across various systems, including trading platforms and risk management systems.
Implemented advanced machine learning models using Python, TensorFlow, and scikit-learn, leading to a 25% increase in predictive accuracy for business forecasting and market trend analysis.
Collaborated with cross-functional teams to integrate ESG data into existing financial models, enabling more comprehensive risk assessment and investment decision-making processes.
Managed end-to-end Business Intelligence applications, resulting in a 30% increase in user satisfaction and a 25% reduction in time-to-insight for critical business processes, particularly in areas of financial reporting and performance analytics.
PROJECTS
• Real-Time Stock Market Examination with Kafka (EC2, Athena, Apache Kafka)
• Electric Vehicle Trends in San Francisco (AWS Glue, AWS S3, Redshift, Tableau)
• Uncovering Coordinated Twitter Campaigns Using Spark/AWS (PySpark, EMR, Redshift)