Optimization with Olive and AIMET, plus tools

Over the last two weeks I’ve been researching open source tools, starting with optimization. I was particularly interested in revisiting the Olive announcement from Microsoft Build 2023 back in May, which promised a hardware-aware model optimization framework targeting any device. While ONNX is well known as the standard exchange format for machine learning models, Olive was presented as the recommended model optimization toolchain for ONNX Runtime, and I forked it on GitHub. I’m looking forward to running Olive end to end and seeing whether it really delivers on the promise made at Build. Several chip vendors, including Intel and Qualcomm, already have Execution Providers working with Olive, so it’s something I want to spend more time on.
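For a sense of what "running Olive end to end" involves: Olive is driven by a JSON workflow config that chains optimization passes over an input model. The fragment below is only an illustrative sketch from my reading of the project README, not something I've run yet, so treat the exact pass and field names as assumptions:

```json
{
  "input_model": {
    "type": "PyTorchModel",
    "config": { "model_path": "model.pt" }
  },
  "passes": {
    "conversion": { "type": "OnnxConversion" },
    "quantization": { "type": "OnnxQuantization" }
  }
}
```

The appeal is that the hardware-aware part (which passes to run, tuned for which Execution Provider) is expressed declaratively here rather than hand-written per target.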

Overall it is positive that the world of edge AI is moving away from opaque, esoteric manual development (especially in the compiler mid-end and backend) toward more open frameworks for common use. To illustrate, I found an interesting article on Medium by the Microsoft AI framework team that combines OpenAI’s Whisper models with ONNX Runtime and Olive for speech recognition… try it out?

Separately, my Twitter post on AI compilers led one of my followers, @LightSpeedEsper, to introduce me to Qualcomm’s open-sourced AIMET framework a few weeks ago. AIMET is an efficiency toolkit: it covers quantization and compression rather than optimization alone, but it looks promising.
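To make the quantization side of "efficiency" concrete, here is the basic idea a toolkit like AIMET automates, sketched in plain Python. This is not AIMET’s API, just the underlying transform: map float weights onto low-bit integers with a shared scale, trading a small reconstruction error for a much smaller model.

```python
# Symmetric 8-bit post-training quantization, the core idea behind
# quantization toolkits (illustrative sketch, not any library's API).

def quantize(weights, num_bits=8):
    """Map floats onto signed integer codes with a shared scale factor."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    scale = max(abs(w) for w in weights) / qmax
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]

weights = [0.12, -0.5, 0.33, 0.9, -0.07]
codes, scale = quantize(weights)
approx = dequantize(codes, scale)
# each recovered value is within scale / 2 of the original
```

Real toolkits add a lot on top of this (per-channel scales, calibration data, simulated-quantization fine-tuning), but the storage win is the same: each weight shrinks from 32 bits to 8.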

Also, tools! I’ve been looking into tools that help with the task of NN model simplification and debugging. I already knew of onnxsim, but yesterday I came across two interesting tools from NVIDIA’s TensorRT toolbox: Polygraphy and ONNX GraphSurgeon. They seem a bit dated (around two years old) but may be useful… has anyone tried them out?
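For anyone unfamiliar with what "model simplification" means here: a big part of what a tool like onnxsim does is constant folding, i.e. evaluating graph nodes whose inputs are all known constants ahead of time and replacing them with their results. A minimal sketch on a toy dict-based graph (not any tool’s real format):

```python
# Toy constant folding: nodes with all-constant inputs are computed
# once up front, leaving only the nodes that depend on runtime inputs.
import operator

OPS = {"add": operator.add, "mul": operator.mul}

def fold_constants(nodes, constants):
    """Return (remaining_nodes, known_values) after folding."""
    remaining = []
    values = dict(constants)
    for name, op, inputs in nodes:
        if all(i in values for i in inputs):
            a, b = (values[i] for i in inputs)
            values[name] = OPS[op](a, b)      # evaluated at "compile" time
        else:
            remaining.append((name, op, inputs))
    return remaining, values

# Graph: c = a * b (both constants), out = x + c ("x" is a runtime input)
nodes = [("c", "mul", ["a", "b"]), ("out", "add", ["x", "c"])]
remaining, values = fold_constants(nodes, {"a": 2.0, "b": 3.0})
# "c" folds to 6.0; only the node that needs runtime input "x" remains
```

The real tools do this over ONNX protobuf graphs and add shape inference, dead-node elimination, and so on, but the shape of the transformation is the same.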