TinyTinyTPU
A minimal 2×2 systolic-array TPU-style matrix-multiply unit, implemented in SystemVerilog and deployed on FPGA.
This project implements a complete TPU architecture, including:
2×2 systolic array (4 processing elements)
Full post-MAC pipeline (accumulator, activation, normalization, quantization)
UART-based host interface
Multi-layer MLP inference capability
FPGA deployment on Basys3 (Xilinx Artix-7)
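To make the 2×2 systolic array concrete, here is a small cycle-level software model in Python. It is only an illustrative sketch, not the project's RTL: it assumes a weight-stationary dataflow (each processing element holds one weight, activations move left to right, partial sums move top to bottom), which is typical of TPU-style designs but is not confirmed by this README. The function name `systolic_matmul_2x2` and the register names are hypothetical.

```python
def systolic_matmul_2x2(X, W):
    """Cycle-level model of an assumed weight-stationary 2x2 systolic array.

    X: list of M activation rows, each [x0, x1].
    W: 2x2 weight matrix; PE[k][n] is assumed to hold W[k][n].
    Returns X @ W as a list of M rows.
    """
    N = 2
    M = len(X)
    a_reg = [[0] * N for _ in range(N)]  # activation register at each PE output
    p_reg = [[0] * N for _ in range(N)]  # partial-sum register at each PE output
    Y = [[0] * N for _ in range(M)]

    # The last result (row M-1, column N-1) leaves the array at cycle M + 2N - 3.
    for t in range(M + 2 * (N - 1)):
        new_a = [[0] * N for _ in range(N)]
        new_p = [[0] * N for _ in range(N)]
        for k in range(N):
            for n in range(N):
                if n == 0:
                    # Left edge: array row k is fed X[m][k], skewed by k cycles.
                    m = t - k
                    a_in = X[m][k] if 0 <= m < M else 0
                else:
                    a_in = a_reg[k][n - 1]  # activation from the left neighbour
                p_in = p_reg[k - 1][n] if k > 0 else 0  # partial sum from above
                new_a[k][n] = a_in
                new_p[k][n] = p_in + a_in * W[k][n]  # multiply-accumulate
        a_reg, p_reg = new_a, new_p  # clock edge: all registers update together

        # Column n of the bottom row now holds the finished sum for input row m.
        for n in range(N):
            m = t - (N - 1) - n
            if 0 <= m < M:
                Y[m][n] = p_reg[N - 1][n]
    return Y


print(systolic_matmul_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# → [[19, 22], [43, 50]]
```

In this model the input skew (row k delayed by k cycles) is what lets each partial sum meet the matching activation as it moves down a column. A real design would then pass the bottom-row outputs to the post-MAC stages listed above.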
Resource Usage (Basys3 XC7A35T):
LUTs: ~1,000 (5% utilization)