A Ternary Neural Network with Compressed Quantized Weight Matrix for Low Power Embedded Systems
Received: 14 January 2022 | Revised: 2 February 2022 | Accepted: 11 February 2022 | Online: 9 April 2022
Corresponding author: S. N. Truong
In this paper, we propose a method for transforming a real-valued matrix into a ternary matrix with controllable sparsity. The sparsity of the quantized weight matrices is controlled by adjusting the threshold during training and quantization. A 3-layer ternary neural network was trained on the MNIST dataset using the proposed adjustable dynamic threshold. As the sparsity of the quantized weight matrices was varied from 0.1 to 0.6, the recognition rate decreased from 91% to 88%. The sparse weight matrices were compressed using the compressed sparse row (CSR) format to speed up the ternary neural network, which can be deployed on low-power embedded systems such as the Raspberry Pi 3 board. With a weight-matrix sparsity of 0.1, the ternary neural network is 4.24 times faster than the same network with uncompressed weight matrices, and it becomes faster as the sparsity increases. At a sparsity of 0.6, the recognition rate degrades by 3%, but the network runs 9.35 times faster than the ternary neural network with uncompressed quantized weight matrices. A ternary neural network with compressed sparse weight matrices is therefore feasible for low-cost, low-power embedded systems.
Keywords: quantized neural network, ternary neural network, deep learning, image recognition
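The two ideas summarized in the abstract, threshold-based ternarization with a chosen sparsity and compressed sparse row storage, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes that "sparsity" means the fraction of zero entries, and it picks the threshold as a magnitude quantile, which is only one plausible way to reach a target sparsity. The function names `ternarize`, `to_csr`, and `csr_matvec` and the sample matrix are hypothetical.

```python
def ternarize(weights, target_sparsity):
    """Map real weights to {-1, 0, +1} so that roughly target_sparsity of them are 0.

    Assumption: the threshold is the k-th smallest magnitude (a quantile rule);
    the paper's dynamic threshold may be defined differently.
    """
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(target_sparsity * len(magnitudes))  # number of entries to zero out
    threshold = magnitudes[k - 1] if k > 0 else 0.0
    # Entries with |w| <= threshold become 0 (ties may zero a few extra entries).
    return [[0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in row]
            for row in weights]


def to_csr(matrix):
    """Store only the nonzero entries: values, column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # where the next row starts in `values`
    return values, col_idx, row_ptr


def csr_matvec(csr, x):
    """Compute y = W @ x, visiting only the stored nonzero weights."""
    values, col_idx, row_ptr = csr
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            # Weights are +1 or -1, so each term is an addition or a subtraction.
            acc += x[col_idx[p]] if values[p] > 0 else -x[col_idx[p]]
        y.append(acc)
    return y


# Hypothetical 2x3 weight matrix, quantized to a target sparsity of 0.5.
W = [[0.8, -0.05, 0.3], [-0.6, 0.1, -0.02]]
T = ternarize(W, 0.5)
print(T)
print(to_csr(T))
print(csr_matvec(to_csr(T), [1.0, 2.0, 3.0]))
```

Because the inner loop only touches stored nonzeros, the work per layer shrinks as more weights are quantized to zero, which is consistent with the trend reported in the abstract; the actual speedups depend on the authors' implementation and hardware.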
Copyright (c) 2022 S. N. Truong, Minh Le, Hien Huynh Thi Thu, Trang Dang Phuoc Hai
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.