As a Machine Learning Engineer (LLM Fine-tuning & Performance Specialist), you'll play an integral role in improving the accuracy and performance of fine-tuned Large Language Models (LLMs) for real-world applications. This groundbreaking opportunity lets you work with the latest ML technologies, collaborating closely with partners to drive innovation and ensure smooth integration and deployment of ML solutions. Your expertise will be essential in automating ML workflows and optimizing performance, making a lasting impact in the field of AI.
What you'll be doing:
Develop, implement, and own processes for optimizing LLMs with domain-specific data, improving the models' ability to complete structured tasks.
Develop and implement techniques to measure LLM performance, defining and monitoring metrics such as recall, F1, perplexity, BLEU, and ROUGE.
Use tools such as ONNX and TensorRT to optimize model inference on specialized hardware.
Collaborate with ISVs and IHVs to understand their performance requirements and ensure successful model integration.
Use C++ to improve ML model performance, specifically in performance-critical systems, and provide technical mentorship to junior engineers.
What we need to see:
8+ years of validated experience in system software or a related field.
M.S. or higher degree in Computer Science, Data Science, Engineering, or a related field, or equivalent experience.
Deep understanding of transformer architectures and large language models such as GPT, BERT, and T5.
Validated hands-on experience with fine-tuning LLMs for specific tasks and improving model performance using libraries like PyTorch.
Strong ability to assess and optimize model performance using relevant metrics and evaluation techniques.
Proficiency in crafting and automating ML workflows using tools such as Kubeflow, MLflow, or Airflow.
Excellent problem-solving skills, especially in debugging and improving LLM accuracy for real-world applications.
Proficiency in Python and knowledge of C++ for optimizing performance and developing system-level integrations.
Strong interpersonal skills for effective collaboration with internal teams and external partners.
Ways to stand out from the crowd:
Experience with LLM-based function and tool calling systems.
Understanding of distributed training for LLM fine-tuning and cloud platforms like NVIDIA's NVCF.
Familiarity with hardware acceleration for ML workloads, including GPU and specialized hardware optimizations.
The base salary range is 180,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.