Our AI Technology Group enables state-of-the-art ML and DL model development across our hardware portfolio, using sophisticated model compression and acceleration techniques to bring previously impractical AI tasks to battery-powered environments. Our team identifies the neural architectures best suited to our customers’ needs, selects the models most amenable to deployment on our platform, trains them carefully while tuning for memory, compute, and energy tradeoffs, and deploys them using AI runtimes optimized for our hardware. Finally, we publish and socialize our findings via conferences, workshops, and publications.
Beyond a healthy obsession with computational efficiency, the successful candidate will be comfortable operating in a ‘version zero’ environment, marshaling internal, open-source, and third-party resources to solve our customers' problems quickly and elegantly.
Specific Responsibilities
Optimize embedded AI runtimes such as TensorFlow Lite for Microcontrollers to make efficient use of our hardware products.
Develop advanced inference performance profiling tools to help customers identify optimization targets and candidate solutions.
Develop novel ahead-of-time AI model inference compilers that improve power, latency, and memory performance by incorporating state-of-the-art pruning and quantization techniques.
Develop training-side tools and libraries to help AI developers identify neural architectures that run optimally on our platforms.
Publish and maintain these tools, including documentation and other assets our customers need to bootstrap their internal AI features.
Socialize these achievements via conferences, meetups, workshops, and publications.
Requirements
Education
A bachelor’s degree in computer science or a related field, plus at least 2 years of relevant experience. A master’s degree or PhD in a related field is highly desirable.
Required Skills/Abilities
Experience writing CPU kernels leveraging vector accelerators such as Arm Helium, Arm Neon, or Intel AVX. Past work with CUDA, OpenCL, or other low-level kernel development environments is a plus.
Experience with AI model performance profiling.
Experience with embedded C or C++.
Experience with Keras and TensorFlow (TFLite, TFLite for Microcontrollers).
Bonus Qualifications
Experience with compiler development
Experience developing for embedded NPUs
Past TinyML/EdgeAI experience or community involvement
Experience developing and optimizing for TFLite for Microcontrollers
Experience with model-to-binary compilers (IREE, MicroTVM, etc.)
Experience with ONNX, TOSA, Jax, LLVM, and/or MLIR
Experience with optimizing for heterogeneous AI compute (e.g., CPU+NPU+DSP)