Using Network-On-Chip Structure in Deep Neural Network Accelerator Design
Abstract
Deep Neural Networks (DNNs) have been widely adopted in various fields, such as image and speech recognition, natural language processing (NLP), and autonomous systems.
However, running these networks is often prohibitively expensive: the large number of communicating layers and neurons generates heavy data traffic and consumes a significant amount of energy.
Addressing these challenges requires new architectures that accelerate DNNs.
In this thesis, a Network-on-Chip (NoC)-based DNN accelerator is proposed, considering both fully connected and partially connected DNN models.
Two optimization methods, Integer Linear Programming (ILP) and the Simulated Annealing (SA) heuristic, are used to group the neurons so that the total volume of data exchanged among the groups is minimized. The resulting groups of neurons are then mapped onto a 2D mesh NoC fabric, again using ILP and SA, to minimize the system's total communication cost.
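As a sketch of the mapping objective (a common NoC-mapping formulation; the notation here is assumed rather than taken from the thesis): let v(g_i, g_j) denote the data volume exchanged between neuron groups g_i and g_j, and let pi assign each group to a tile of the 2D mesh. The mapping step then seeks

\[
\min_{\pi} \; \sum_{i \neq j} v(g_i, g_j) \cdot d_{\mathrm{hop}}\big(\pi(g_i), \pi(g_j)\big),
\]

where d_hop is the hop (Manhattan) distance between tiles on the mesh. The earlier grouping step reduces the v(g_i, g_j) terms themselves by co-locating heavily communicating neurons in the same group.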
The proposed design is novel in that it addresses the high data-communication demand of DNNs by exploiting the scalable, low-overhead, and energy-efficient NoC communication structure. Extensive experiments on various benchmarks and DNN models show an average reduction of 40% in communication cost.
The proposed design targets low-overhead DNN inference and training on edge devices in the Internet-of-Things (IoT) era, in combination with cloud computing.
This thesis thus provides a new approach to DNN acceleration, applicable to various fields such as edge computing, IoT, autonomous systems, computer vision, natural language processing, speech recognition, and cloud computing.