The Basic Architecture of TPU and Analysis of Optimization Scenarios
- DOI
- 10.2991/978-94-6463-986-5_48
- Keywords
- TPU ASIC; Systolic Array; Energy Efficiency; Algorithm-Hardware
- Abstract
Over the past decade, transformer models have grown from 110 M to 540 B parameters, making Central Processing Unit (CPU) and Graphics Processing Unit (GPU) baselines infeasible for both training and latency-critical inference. Google’s Tensor Processing Unit (TPU) program, which spans four silicon generations from the 28 nm, 40 W TPU v1 to the 7 nm, 175 W TPU v4, has emerged as the first large-scale deployment of domain-specific ASICs purpose-built for dense and semi-structured tensor arithmetic. This paper provides a holistic, quantitative study of the TPU ecosystem. It dissects the deterministic, compiler-co-designed micro-architecture of TPU v3, highlighting how systolic arrays, scheduling, and compiler-managed scratchpads eliminate cache-miss variability and sustain MAC utilization above 90%. It then presents scenario analyses of four production workloads. Finally, it identifies the memory wall, dynamic-shape compilation stalls, and ecosystem breadth as the primary challenges, and maps emerging optimizations (weight-streaming, compiler-ahead sharding via JAX/pjit, optical-circuit switching, and the Pathways runtime) that together promise to extend TPU leadership.
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Zeyou Zhu
PY  - 2026
DA  - 2026/02/18
TI  - The Basic Architecture of TPU and Analysis of Optimization Scenarios
BT  - Proceedings of the 2025 International Conference on Electronics, Electrical and Grid Technology (ICEEGT 2025)
PB  - Atlantis Press
SP  - 465
EP  - 472
SN  - 2352-5401
UR  - https://doi.org/10.2991/978-94-6463-986-5_48
DO  - 10.2991/978-94-6463-986-5_48
ID  - Zhu2026
ER  -