Tuesday 08 April 2025
The quest for efficiency in artificial intelligence has led researchers to develop novel techniques for accelerating sparse deep neural networks on microcontrollers, small devices that are increasingly used in edge computing applications. These tiny computers are designed to perform tasks locally, without relying on cloud services or other remote infrastructure.
To achieve this goal, a team of scientists has created lightweight software kernels and hardware extensions that significantly speed up the execution of sparse neural networks on microcontrollers. The researchers focused on N:M pruning, a technique that keeps only N non-zero weights in every group of M consecutive weights, shrinking deep neural networks while largely preserving their accuracy.
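The core idea of N:M pruning can be illustrated with a short sketch: within each group of M consecutive weights, only the N largest-magnitude entries survive and the rest are zeroed. This is a hypothetical illustration of the selection rule only; the paper's actual training-time pruning procedure is more involved.

```python
import numpy as np

def prune_n_m(weights, n=1, m=8):
    """Zero out all but the n largest-magnitude weights in each group of m.

    Illustrative sketch of N:M structured pruning, not the authors' code.
    """
    w = weights.reshape(-1, m).copy()
    # In each group, find the m-n smallest magnitudes and zero them.
    drop = np.argsort(np.abs(w), axis=1)[:, : m - n]
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weights.shape)
```

With `n=1, m=8` (the 1:8 sparsity mentioned below), exactly one weight in every group of eight remains non-zero, which is what makes the compressed storage and the speedups possible.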
The new approach combines optimized software kernels with a custom-designed instruction set architecture (ISA) extension that accelerates specific operations required by the sparse neural networks. This hybrid solution enables a 3.21-fold speedup over traditional dense neural networks on a ResNet18 model, with an accuracy loss of less than 1.5%.
The team also demonstrated that their method can be applied to other deep learning models, such as Vision Transformers (ViTs), achieving comparable results. The increased efficiency is particularly important for edge devices, which are often limited by power consumption and memory resources.
The optimized software kernels target ultra-low-power, multicore RISC-V microcontrollers. On their own, these kernels achieve up to a 2.1-fold speedup over their dense counterparts at 1:8 sparsity, while reducing memory footprint by up to 79.59%.
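A sparse kernel can exploit the compressed layout directly: instead of scanning eight weights per group, it loads the one surviving value and uses its stored index to fetch the matching input. The storage scheme below (one value plus a group-local index) is an assumption for illustration; the paper's packed format and hand-tuned kernels are more elaborate.

```python
def sparse_dot_1_8(values, indices, x):
    """Dot product with a 1:8-sparse weight row in compressed form.

    values[g]  -- the single surviving weight of group g
    indices[g] -- its offset (0-7) within that group
    x          -- the dense input vector

    Hypothetical compressed layout, sketched for illustration.
    """
    acc = 0.0
    for g, (v, idx) in enumerate(zip(values, indices)):
        acc += v * x[8 * g + idx]  # indirect load of the matching input
    return acc
```

Only one multiply-accumulate per group of eight runs, which is where the speedup over a dense kernel comes from; the indirect `x[8 * g + idx]` load is exactly the operation the hardware extension below targets.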
The lightweight ISA extension, dubbed xDecimate, plays a crucial role in the acceleration of sparse neural networks. It allows for efficient decompression and indirect load operations required by the kernels, delivering an extra speedup of up to 1.9-fold at a minimal area overhead.
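In software terms, the decompression step expands a compressed row back into its dense positions so subsequent operations can address it normally. The function below is a software stand-in for that behavior; the actual xDecimate instruction semantics are not specified here and this layout is an assumption.

```python
def decompress_1_8(values, indices, m=8):
    """Expand a compressed 1:8-sparse row back to its dense layout.

    A software emulation of the kind of decompression xDecimate
    performs in hardware; the real instruction-level format is assumed.
    """
    dense = [0.0] * (len(values) * m)
    for g, (v, idx) in enumerate(zip(values, indices)):
        dense[m * g + idx] = v  # scatter each value to its group slot
    return dense
```

Doing this scatter (and the companion indirect loads) in a dedicated instruction removes per-element index arithmetic from the inner loop, which is how a small area investment buys the reported additional speedup.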
The success of this hybrid approach highlights the importance of collaboration between software and hardware designers in developing optimized solutions for edge computing applications. As deep learning models continue to grow in complexity, efficient execution on microcontrollers will become increasingly essential for widespread adoption in areas such as computer vision, natural language processing, and robotics.
The research has far-reaching implications for the development of autonomous devices, smart home appliances, and wearables, which rely on efficient processing of neural networks. The team’s findings demonstrate that it is possible to achieve high performance while minimizing power consumption and memory usage, paving the way for more widespread deployment of AI-enabled devices in various industries.
Cite this article: “Efficient Sparse Deep Neural Networks on Microcontrollers: A Lightweight Software-Hardware Co-Design Approach”, The Science Archive, 2025.
Artificial Intelligence, Deep Learning, Neural Networks, Microcontrollers, Edge Computing, Sparse Neural Networks, RISC-V, xDecimate, Instruction Set Architecture, Hybrid Approach