The candidate will optimize onboard GPU performance. This includes profiling GPU applications to identify bottlenecks, improving GPU kernel implementations, and applying modern GPU features (such as memory allocation, data transfer, and resource constraints).

Requirements
Strong knowledge of CUDA
Hands-on experience with GPUDirect, Nsight Systems, and MIG

Good to have
Experience with TensorRT

Level: junior & senior