Job Role and Skills: Data Software Engineer - Spark, Python (AWS, Kafka, Azure Databricks, or GCP)
Job Type/Mode - PARTIALLY REMOTE
Job Description
5-12 years of experience in Big Data and related data technologies
Expert-level understanding of distributed computing principles
Expert-level knowledge of and experience with Apache Spark
Hands-on programming experience with Python
Proficiency with Hadoop v2, MapReduce, HDFS, Sqoop
Experience building stream-processing systems using technologies such as Apache Storm or Spark Streaming (see the illustrative sketch after this list)
Experience with messaging systems such as Kafka or RabbitMQ
Good understanding of Big Data querying tools such as Hive and Impala
Experience integrating data from multiple sources such as RDBMS (SQL Server, Oracle), ERP systems, and flat files
Good understanding of SQL queries, joins, stored procedures, and relational schemas
Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
Knowledge of ETL techniques and frameworks
Experience with performance tuning of Spark jobs
Experience with native cloud data services such as AWS or Azure Databricks
Ability to lead a team efficiently
Experience designing and implementing Big Data solutions
Practitioner of Agile methodology
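
For illustration only (not an additional requirement): a minimal sketch of the kind of Spark/Kafka stream-processing work described above, written in PySpark. The broker address, topic name, and event schema are hypothetical placeholders, and the Spark-Kafka connector package (spark-sql-kafka) is assumed to be available on the cluster.

# Minimal PySpark Structured Streaming sketch: read JSON events from Kafka,
# aggregate per time window, and write running counts to the console.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

# Hypothetical event schema carried as JSON in the Kafka message value.
event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("action", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                      # placeholder topic
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Windowed count of actions per 5-minute window, with a 10-minute watermark
# to bound state for late-arriving events.
counts = (
    events
    .withWatermark("event_time", "10 minutes")
    .groupBy(window(col("event_time"), "5 minutes"), col("action"))
    .count()
)

query = (
    counts.writeStream
    .outputMode("update")
    .format("console")
    .start()
)
query.awaitTermination()
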
LEVEL OF EXPERTISE
Apache Spark - 5 years
Python - 5 years
PySpark - 3 years
AWS - 3 years
GCP - 3 years
Azure - 3 years
Apache Kafka - 3 years