Job Description:
- Administer and maintain the Cloudera Data Platform (CDP) across all environments (dev/test/prod).
- Strong expertise in the Big Data ecosystem, including Spark, Hive, Sqoop, HDFS, MapReduce, Oozie, YARN, HBase, and NiFi.
- Develop and optimize complex Hive queries, including the use of analytical functions for reporting and data
transformation.
- Create custom UDFs in Hive to handle specific business logic and integration needs.
- Ensure efficient data ingestion and movement using Sqoop, NiFi, and Oozie workflows.
- Work with various data formats (CSV, TSV, Parquet, ORC, JSON, Avro) and compression techniques (Gzip, Snappy)
to optimize performance and storage usage.
- Monitor and tune the performance of YARN and Spark applications for optimal resource utilization.
- In-depth knowledge of the architecture of distributed systems and parallel computing.
- Good knowledge of Oracle PL/SQL and shell scripting.
- Strong problem-solving and analytical thinking.
- Effective communication and documentation skills.
- Ability to collaborate across multi-disciplinary teams.
- Self-driven with the ability to manage multiple priorities under tight timelines.
Experience:
5