Towards a General Compilation Approach for On-device Training in Embedded Systems
I Topko, T Harbaum, J Becker - 2024 IEEE Nordic Circuits and …, 2024 - ieeexplore.ieee.org
2024 IEEE Nordic Circuits and Systems Conference (NorCAS), 2024 • ieeexplore.ieee.org
Machine learning (ML) compilers are becoming essential to hardware deployment as artificial intelligence algorithms are applied on an increasingly wide range of devices. Current ML frameworks support multi-level optimizations for diverse hardware back-ends, but focus only on the inference phase of the ML life cycle. However, deploying an ML model on hardware platforms in a real environment raises the well-known challenges of classical ML. Issues such as model performance degradation and distribution shift can be addressed on hardware platforms by applying domain adaptation techniques in the deployment environment. This paper demonstrates a general compilation approach based on Apache TVM for on-device training in embedded systems and presents first performance measurements for embedded GPU and CPU platforms and an AI accelerator. This compilation approach will simplify the deployment of deep neural networks with a built-in ability to train on the device.