As an ML Accelerator Compiler Developer, you will design and optimize compilers for Machine Learning (ML) accelerators. You will improve the performance, efficiency, and portability of ML models by developing compiler toolchains, optimizations, and code generation techniques tailored to specialized hardware architectures.
Responsibilities:
Develop and optimize compiler toolchains for ML accelerators, including front-end parsing, intermediate representation (IR) transformations, and backend code generation.
Implement and enhance ML-specific optimizations such as operator fusion, memory layout transformations, quantization-aware compilation, and scheduling.
Collaborate with hardware architects to co-design compiler optimizations aligned with accelerator capabilities.
Work with ML frameworks and model formats (PyTorch, ONNX) to integrate compiler passes for efficient execution on target hardware.
Improve performance through domain-specific optimizations, autotuning, and parallelization techniques.
Debug and analyze performance bottlenecks across software and hardware stacks.
Develop automated testing, benchmarking, and profiling tools for validating compiler optimizations.
Qualifications:
Strong proficiency in compiler development, including experience with LLVM, MLIR, TVM, or similar frameworks.
Expertise in Machine Learning model execution, optimization, and deployment.
Strong programming skills in C++ and Python, with experience in assembly-level optimization.
Knowledge of parallel computing, vectorization, and memory hierarchy optimizations.
Familiarity with deep learning frameworks and model formats (TensorFlow, PyTorch, ONNX).
Strong analytical skills for performance profiling and debugging.
Experience in graph optimizations, quantization, and code generation.
Preferred Qualifications:
Knowledge of heterogeneous computing, DSPs, and low-level hardware programming.
Familiarity with AI model deployment and inference optimization techniques.