Work closely with various Business, IT, Analyst and Data Science groups to collect business requirements.
Design, develop, deploy and support high performance data pipelines both inbound and outbound.
Optimize data pipelines for performance, scalability, and reliability.
Implement CI/CD pipelines to ensure continuous deployment and delivery of our data products.
Ensure the quality of critical data elements, prepare data quality remediation plans, and collaborate with business and system owners to fix quality issues at their root.
Document the design and support strategy of the data pipelines.
Capture, store and socialize data lineage and operational metadata.
Troubleshoot and resolve data engineering issues as they arise.
Develop REST APIs to expose data to other teams within the company.
Mentor and guide junior data engineers.
Qualifications:
Six (6) plus years of experience building data lakes and cloud data platforms
Experience working with GCP technologies such as BigQuery, Composer, GCS, Datastream, and Dataflow
Expert knowledge of SQL and Python programming
Experience working with Airflow as a workflow management tool and building operators to connect, extract, and ingest data as needed
Experience in tuning queries for performance and scalability
Experience in real-time data ingestion using GCP Pub/Sub, Kafka, Spark, or similar
Excellent organizational, prioritization and analytical abilities
Proven experience with incremental execution, demonstrated through successful launches.
Excellent problem-solving and critical-thinking skills to recognize and comprehend complex data issues affecting the business environment.
Additional Information:
Benefits include:
Health, dental, vision, life and disability insurance
401(k) Retirement Program + 6% employer match
Participation in Flexible Time Off (FTO) Policy
12 Paid Holidays
Keywords: GCP, Google Cloud Platform, BigQuery, Composer, SQL, Python, Data Engineering, Data Pipelines