This is an Ab Initio Administration Lead position and not a developer position.
We are seeking a highly skilled Ab Initio Admin with a robust background in Ab Initio and some experience with Cloudera and AWS to join our dynamic IT team.
The Lead Ab Initio ETL Administrator is responsible for leading all tasks involved in administering the ETL tool (Ab Initio) in the cloud.
This position also requires hands-on work.
The candidate will support the implementation of a Data Integration/Data Warehouse for the Data products on-prem and in AWS.
The position does not have direct reports but is expected to assist in guiding and mentoring less experienced staff.
May lead a team of matrixed resources.
Qualification Requirements:
Advanced (expert preferred) experience administering and engineering relational and NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB, RDS, DB2), Big Data systems (e.g., Cloudera Data Platform Private Cloud and Public Cloud), and automation tools (e.g., Ansible, Terraform, Bitbucket), plus experience working with cloud solutions (specifically data products on AWS), is required.
Prior experience with AWS Cloud is required.
At least 10 years of experience with all tasks involved in administering the ETL tool (Ab Initio).
At least 10 years of experience with, and advanced knowledge of, the Ab Initio Graphical Development Environment (GDE), Metadata Hub, and Operational Console.
Experience creating Big Data pipelines (ETL) from on-premises systems to Data Factories, Data Lakes, and cloud storage such as EBS or S3.
Advanced knowledge of UNIX, Linux, shell scripting, and SQL.
Experience working with and troubleshooting issues related to Hive, ICFF, and HDFS.
Experience managing the Metadata Hub (MDH) and Operational Console, and troubleshooting environmental issues that affect these components.
Experience with scripting and automation, such as designing and developing automated ETL processes and architecture, and unit testing ETL code.
Ability to troubleshoot potential issues with Kerberos, TLS/SSL, Models, and Experiments, as well as other workload issues that data scientists might encounter once the application is running.
Strong knowledge of Cloudera (Hadoop, HDFS, YARN, Hive, Spark) administration and management.