Saga is an AI-powered anime production platform built for content creators. We are looking for an ambitious and highly skilled DevOps / Infrastructure Engineer to join our engineering team.
Responsibilities:
Design and maintain scalable infrastructure for our platform, ensuring performance, reliability, and security across all environments.
Collaborate with cross-functional teams (AI, design, product, etc.) to build and optimize cloud-based infrastructure that supports our AI-driven anime production tools.
Automate deployment and provisioning using CI/CD pipelines, ensuring streamlined workflows from development to production.
Monitor, troubleshoot, and optimize system performance, ensuring high availability and minimal downtime across our services.
Ensure cloud infrastructure security by implementing best practices, conducting regular security audits, and making improvements.
Implement monitoring and alerting systems to proactively detect and resolve issues in staging and production environments.
Manage databases and data pipelines within our GCP environment, optimizing for scalability and cost-effectiveness.
Support the integration of AI/ML models by ensuring the infrastructure is optimized to handle machine learning workloads and high-performance computing.
Qualifications:
Our platform is built with React, Python, and TypeScript, and hosted on Google Cloud Platform (GCP).
Proficiency in GCP, including services like Cloud Functions, Cloud Storage (GCS), and BigQuery.
Experience with infrastructure as code tools (Terraform or similar).
Familiarity with containerization and orchestration tools like Docker and Kubernetes.
Strong understanding of CI/CD pipelines, automation, and version control (Git).
Experience with databases, particularly PostgreSQL, and managing data storage solutions at scale.
Knowledge of cloud security best practices and experience with securing infrastructure and services.
Familiarity with serverless architectures, microservices, and modern API development practices (RESTful APIs, GraphQL).
Experience managing high-traffic, production-level applications and optimizing infrastructure for reliability and performance.
New grads and experienced engineers are welcome to apply.