BLITZEN: A highly integrated massively parallel machine
The design of BLITZEN, a highly integrated chip with 128 processing elements (PEs) is presented. The bit serial processing element is described, and some comparisons with the massively parallel processor (MPP) and the Connection Machine are provided. Local control features and methods for memory access are emphasized. The organization of PEs on the custom chip, with emphasis on interconnection and I/O schemes, is described. Details of the custom chip design and instruction pipeline are provided. An overview of system architecture concepts and software for BLITZEN is also given. Each PE has 1 Kb of static RAM and performs bit-serial processing with functional elements for arithmetic, logic, and shifting. Unique local control features include modification of the global memory address by data local to each PE and complementary operations based on a condition register. Fixed-point operations on 32-b data can exceed a rate of one billion operations per second. Since the processors are bit-serial devices, performance rates improve with shorter word lengths. The bus oriented I/O scheme can transfer data at 10,240 MB/s.