Lead Data Engineer at Harnham in Houston, Texas

Posted in Other about 2 hours ago.

Type: full-time

Job Description:

As a Lead Data Engineer you will:
  • Design and implement scalable, reliable data pipelines to ingest, process, and store diverse data, using technologies such as Apache Spark, Hadoop, and Kafka.
  • Work within cloud environments like AWS or Azure to leverage services including but not limited to EC2, RDS, S3, Lambda, and Azure Data Lake for efficient data handling and processing.
  • Develop and optimize data models and storage solutions (SQL, NoSQL, Data Lakes) to support operational and analytical applications, ensuring data quality and accessibility.
  • Utilize ETL tools and frameworks (e.g., Apache Airflow, Talend) to automate data workflows, ensuring efficient data integration and timely availability of data for analytics.
  • Collaborate closely with data scientists, providing the data infrastructure and tools needed for complex analytical models, leveraging Python or R for data processing scripts.
  • Ensure compliance with data governance and security policies, implementing best practices in data encryption, masking, and access controls within a cloud environment.
  • Monitor and troubleshoot data pipelines and databases for performance issues, applying tuning techniques to optimize data access and throughput.
  • Stay abreast of emerging technologies and methodologies in data engineering, advocating for and implementing improvements to the data ecosystem.

What We Need From You
  • Bachelor's degree in computer science, MIS, or another business discipline and 10+ years of experience in data engineering, with a proven track record in designing and operating large-scale data pipelines and architectures (required), or
  • Master's degree in computer science, MIS, or another business discipline and 5+ years of experience in data engineering, with a proven track record in designing and operating large-scale data pipelines and architectures (required)
  • Expertise in developing ETL/ELT workflows
  • Comprehensive knowledge of platforms and services like Databricks, Dataiku, and AWS native data offerings
  • Solid experience with big data technologies (Apache Spark, Hadoop, Kafka) and cloud services (AWS, Azure) related to data processing and storage
  • Strong experience in AWS and Azure cloud services, with hands-on experience in integrating cloud storage and compute services with Databricks
  • Proficient in SQL and programming languages relevant to data engineering (Python, Java, Scala)
  • Hands-on RDBMS experience (data modeling, analysis, programming, stored procedures)
  • Familiarity with machine learning model deployment and management practices is a plus
  • Strong communication skills, capable of collaborating effectively across technical and non-technical teams
  • AWS Certified Solutions Architect (preferred)
  • Databricks Certified Associate Developer for Apache Spark (preferred)
  • Azure Data Engineer Associate (preferred)
  • Other relevant certifications (preferred)
