Proceedings of the 2025 2nd International Conference on Electrical Engineering and Intelligent Control (EEIC 2025)

Architectural Design of Artificial Intelligence Inference Accelerator

Authors
Haozhe Li¹,*
¹College of Information Science and Technology, Northwest University, Xi’an, China
*Corresponding author. Email: hl23886@essex.ac.uk
Corresponding Author
Haozhe Li
Available Online 23 October 2025.
DOI
10.2991/978-94-6463-864-6_5
Keywords
AI Inference Accelerator; RaPiD; Quickloop; FPNA; Low-Precision Computing
Abstract

This paper presents RaPiD, a low-precision accelerator that supports a wide range of precisions, from 16-bit floating point down to 2-bit fixed point, and is the first system to support both training and inference at very low precision without sacrificing performance. Built on 7nm EUV technology, RaPiD achieves 3.5 TFLOPS/W at FP8 precision and 16.5 TOPS/W at INT4 precision, which makes it well suited to AI computation that must be both fast and powerful. Quickloop, an accelerator design engine that uses reinforcement learning, speeds up AI accelerator design by reducing turnaround and exploration times. The Field Programmable Neural Array (FPNA) is also highlighted because it supports post-fabrication reconfigurability, which makes it a viable platform for edge AI. The paper further discusses optimization techniques for micro-AI platforms, including Neural Architecture Search, quantization, and compression, that enable cost-effective execution of DNNs on resource-limited systems. Together, FPNA, RaPiD, Quickloop, and the micro-AI optimization schemes contribute to AI hardware design and to intelligent, efficient, and resource-optimized AI implementations across industries.
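
To make the low-precision fixed-point formats mentioned in the abstract more concrete, the sketch below shows generic symmetric uniform quantization of a few weights to 4-bit and 2-bit signed integers. It is an illustration only and is not taken from the paper: the function names, the sample weights, and the per-tensor scaling rule are all assumptions, and RaPiD's actual number formats and rounding behaviour may differ.

```python
def quantize_symmetric(values, bits):
    """Map floats to signed integer codes of width `bits` (hypothetical helper).

    Assumes one scale per tensor, chosen so the largest magnitude maps to
    the largest positive code. This is a textbook scheme, not RaPiD's.
    """
    qmax = 2 ** (bits - 1) - 1      # 7 for 4-bit, 1 for 2-bit
    qmin = -(2 ** (bits - 1))       # -8 for 4-bit, -2 for 2-bit
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0
    # Round to the nearest code, then clamp into the representable range.
    codes = [min(qmax, max(qmin, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


# Illustrative weights (assumed values, not data from the paper).
weights = [0.7, -0.3, 0.1, -0.6]

for bits in (4, 2):
    codes, scale = quantize_symmetric(weights, bits)
    approx = dequantize(codes, scale)
    worst = max(abs(w - a) for w, a in zip(weights, approx))
    print(f"{bits}-bit codes: {codes}, worst-case error: {worst:.3f}")
```

With these sample weights the 4-bit codes track the inputs closely, while the 2-bit codes collapse the two small weights to zero. That loss of resolution is the accuracy cost that very-low-precision hardware and the accompanying quantization techniques have to manage.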

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the 2025 2nd International Conference on Electrical Engineering and Intelligent Control (EEIC 2025)
Series
Advances in Engineering Research
Publication Date
23 October 2025
ISBN
978-94-6463-864-6
ISSN
2352-5401
DOI
10.2991/978-94-6463-864-6_5

Cite this article

TY  - CONF
AU  - Haozhe Li
PY  - 2025
DA  - 2025/10/23
TI  - Architectural Design of Artificial Intelligence Inference Accelerator
BT  - Proceedings of the 2025 2nd International Conference on Electrical Engineering and Intelligent Control (EEIC 2025)
PB  - Atlantis Press
SP  - 32
EP  - 40
SN  - 2352-5401
UR  - https://doi.org/10.2991/978-94-6463-864-6_5
DO  - 10.2991/978-94-6463-864-6_5
ID  - Li2025
ER  -