An adiabatic method to train Binary Neural Networks

We proposed a new training method for realizing binary neural networks [1]. This method can train a conventional neural network into a binary one, in which the neuron activations and/or weights take the binary values +1 and -1.

This work turned out to lie completely in the field of computer science. It has little to do with physics, except that the adiabatic method is one often used in physics studies. In fact, our initial goal was to realize an artificial neural network in a physical system based on MTJ-based MRAM. But in order to simplify the hardware, it is far more efficient to binarize the network first.

Compared with a conventional neural network, which uses floating-point weights and neuron activations, a binary neural network uses binary values (+1 or -1, each stored as a single bit) for its weights and activations. Not only does the binarized network save most of the storage, it also speeds up inference significantly, because the heavy floating-point matrix multiplications can be replaced by simple bitwise XNOR operations.
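To make the speed-up concrete, here is a minimal Python sketch (illustrative, not from the paper) of how a dot product between two {-1, +1} vectors, packed bitwise into integers, reduces to one XNOR followed by a popcount; the function name and bit encoding are assumptions for this example.

```python
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two {-1, +1} vectors of length n, packed as integers
    where bit 1 encodes +1 and bit 0 encodes -1 (illustrative encoding)."""
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)  # bit is 1 where signs agree
    matches = bin(xnor).count("1")              # popcount of agreements
    return 2 * matches - n                      # +1 per match, -1 per mismatch

# Example: a = [+1, -1, +1, +1], w = [+1, +1, -1, +1], packed little-endian
assert binary_dot(0b1101, 0b1011, 4) == 0      # (+1) + (-1) + (-1) + (+1)
```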

The advantage of binarization is obvious. However, training a network with binary weights and activations is very challenging. One of the main obstacles is that binary values carry no gradient, so the widely used back-propagation algorithm cannot be applied directly. To solve this issue, we propose an adiabatic method to train a binary neural network, in which the real-valued weights and activations are tuned adiabatically toward binary values in a smooth fashion, thus avoiding the gradient problem. Our method is tested on several tasks: handwritten-digit recognition with a dense network (see the figure below, where the activations and weights take binary values at the end of training), spoken-digit and dog-vs-cat recognition with convolutional neural networks, and 10-class image recognition with ResNet-20 or VGG-Small networks. The performance of the resulting binary neural networks on all these tasks is nearly identical to that of the full-precision real-valued networks.

[Figure: activation and weight distributions of the dense network, converging to binary values at the end of training.]
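As a hedged sketch of the idea (not the paper's actual code), one plausible realization of the adiabatic tuning replaces the hard sign function with tanh(βx) and raises the sharpness β slowly over training, so the values drift smoothly toward {-1, +1} while gradients remain well defined. The class names below are hypothetical.

```python
import torch
import torch.nn as nn

class BinarizeAdiabatic(nn.Module):
    """Smooth surrogate for sign(): tanh(beta*x) -> sign(x) as beta -> inf."""
    def __init__(self, beta: float = 1.0):
        super().__init__()
        self.beta = beta  # sharpness, raised slowly ("adiabatically") in training

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.beta * x)

class DenseBNN(nn.Module):
    """Toy dense network for 28x28 handwritten-digit inputs."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 10)
        self.act = BinarizeAdiabatic()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Binarize the hidden activation; the weights could be squashed
        # through the same surrogate before use.
        return self.fc2(self.act(self.fc1(x)))
```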

In comparison with other binarization methods, the most essential advantage of this adiabatic training method is its simplicity: it requires no change to the network structure or size, and only a few lines of code added to an existing back-propagation implementation (as sketched below). We believe this method can be applied directly at industrial scale.
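To illustrate how small the modification is, here is a hypothetical training loop building on the sketch above: the only addition to standard back-propagation is a schedule that gradually raises β. The schedule form and the `loader` object (a standard batch iterator) are assumptions, not taken from the paper.

```python
import torch

model = DenseBNN()  # from the sketch above
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(100):
    model.act.beta = 1.1 ** epoch              # adiabatic schedule (assumed form)
    for x, y in loader:                        # `loader` yields (input, label) batches
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x.view(-1, 784)), y)
        loss.backward()                        # gradients are well defined at finite beta
        opt.step()
```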

This work was conducted mainly by our former group member, Mr. Yuansheng Zhao, who is currently pursuing his PhD at The University of Tokyo.

[1] Yuansheng Zhao and Jiang Xiao*, "An Adiabatic Method to Train Binarized Artificial Neural Networks", Scientific Reports 11, 19797 (2021).

