Role Description:
The engineer will be responsible for the design and development of optimization tools for neural networks, transformers, and large language models (LLMs). This role involves applying post-training, training-aware, and other advanced optimization techniques to enhance model efficiency and performance.
Key responsibilities:
1. Develop optimization toolchains for computer vision models and large language models (LLMs).
2. Perform hardware-aware model optimization and porting for Ambarella platforms.
3. Research and evaluate emerging technologies, including pruning, quantization, and fine-tuning techniques for convolutional neural networks (CNNs), transformers, and LLMs.
4. Provide technical support and solutions to customers regarding model optimization and deployment.
Requirements:
1. Education background: Master's or Ph.D. degree
2. Minimum experience: At least one year of relevant work or academic experience
3. Related experience:
- Experience with model deployment would be a plus
- Experience with model optimization techniques such as pruning and quantization would be a plus
- Experience with LLM fine-tuning would be a plus
- Experience with model porting would be a plus
4. Skills:
- Background in machine learning-based computer vision or LLMs
- Familiarity with machine learning frameworks such as PyTorch, TensorFlow, or Hugging Face
- Proficiency in Python and C/C++ programming