Location: Houston, TX 77077 - Onsite 4 days per week.
Overview of Role:
As Lead Data Engineer, you will help architect, implement, and scale new AI/ML initiatives at the enterprise level. You will join a brand-new team formed as our client's business grows. The role combines systems engineering, data analytics, and data integration.
Company Description:
Our client manages a large portfolio of companies and places strong emphasis on integrity and on being held accountable to the highest ethical standards. They are committed to fostering a culture of open communication, teamwork, and personal development.
Role Description:
Design and implement scalable, reliable data pipelines using technologies like Apache Spark, Hadoop, and Kafka.
Leverage AWS or Azure services (e.g., EC2, RDS, S3, Lambda, Azure Data Lake) for efficient data handling and processing.
Develop and optimize data models and storage solutions (SQL, NoSQL, Data Lakes) to ensure data quality and accessibility for operational and analytical applications.
Automate data workflows with ETL tools and frameworks (e.g., Apache Airflow, Talend) for efficient data integration and timely availability.
Collaborate with data scientists, providing the necessary infrastructure and tools for complex analytical models, using Python or R.
Ensure data governance and security compliance, implementing best practices in encryption, masking, and access controls within a cloud environment.
Skills and Experience:
Bachelor's degree in Computer Science, MIS, or equivalent education/experience.
Extensive background and experience in ETL/ELT pipelining.
3+ years of experience with big data technologies (Hadoop, Spark, Kafka) and with cloud services (AWS preferred; Azure, GCP) in a storage/processing capacity.
Experience with cloud computing environments (AWS, Azure, GCP) and Data/ML platforms (Databricks, Spark).
Relational database management system (RDBMS) experience, e.g., PostgreSQL.
ML model deployment experience is a plus!
Nice-to-have certifications: AWS Certified Solutions Architect, Azure Data Engineer Associate, Databricks Certified Associate Developer for Apache Spark.
Benefits:
Health, Dental, and Vision Insurance
15+ Days of PTO Annually
Educational Assistance available
Annual bonus (20%)
401(k) Matching Program
Please note: Candidates must be authorized to work in the United States to be considered at this time.