fpgaConvNet

SMOF: Streaming Modern CNNs on FPGAs with Smart Off-Chip Eviction

FCCM 2024

Petros Toupas, Zhewen Yu, Christos-Savvas Bouganis and Dimitrios Tzovaras

Partially offloading both weights and activations to off-chip memory without penalising the computation pipeline.
- preprint

AutoWS: Automate Weights Streaming in Layer-wise Pipelined DNN Accelerators

DATE 2024

Zhewen Yu and Christos-Savvas Bouganis

A memory management methodology for balancing the allocation of weights both on-chip and off-chip.
- preprint

SATAY: A Streaming Architecture Toolflow for Accelerating YOLO Models on FPGA Devices

FPT 2023

Alexander Montgomerie-Corcoran, Petros Toupas, Zhewen Yu, and Christos-Savvas Bouganis

Design space optimisations and hardware support for state-of-the-art object detection models.
- preprint

Mixed-TD: Efficient Neural Network Accelerator with Layer-Specific Tensor Decomposition

FPL 2023

Zhewen Yu and Christos-Savvas Bouganis

Tensor decomposition methods for approximating CNN parameters and improving accelerator performance.
- preprint

fpgaHART: A toolflow for throughput-oriented acceleration of 3D CNNs for HAR onto FPGAs

FPL 2023

Petros Toupas, Christos-Savvas Bouganis, Dimitrios Tzovaras

Optimising and mapping state-of-the-art 3D-CNN models for human action recognition (HAR) while aiming for throughput-oriented designs.
- preprint

PASS: Exploiting Post-Activation Sparsity in Streaming Architectures for CNN Acceleration

FPL 2023

Alexander Montgomerie-Corcoran*, Zhewen Yu*, Jianyi Cheng, Christos-Savvas Bouganis

This work addresses the challenges associated with exploiting post-activation sparsity for performance gains in streaming CNN accelerators.
- preprint

FMM-X3D: FPGA-based modelling and mapping of X3D for Human Action Recognition

ASAP 2023

Petros Toupas, Christos-Savvas Bouganis, Dimitrios Tzovaras

This study bridges the gap between recent developments in computer vision on videos and their deployment and applications on FPGAs by optimising and mapping a state-of-the-art 3D-CNN model (X3D) for human action recognition (HAR).
- preprint

ATHEENA: A Toolflow for Hardware Early-Exit Network Automation

FCCM 2023

Benjamin Biggs, Christos-Savvas Bouganis, George A. Constantinides

ATHEENA investigates the use of early-exit networks with fpgaConvNet and the challenges of handling early-exit branches.
- doi

HARFLOW3D: A Latency-Oriented 3D-CNN Accelerator Toolflow for HAR on FPGA Devices

FCCM 2023

Petros Toupas*, Alexander Montgomerie-Corcoran*, Christos-Savvas Bouganis, Dimitrios Tzovaras

This work develops a new architecture based on the design principles of fpgaConvNet, with the added benefit of runtime parameter reconfiguration. Because it avoids full bitstream reconfiguration, the architecture is suited to the low-latency operation that Human Action Recognition (HAR) requires.
- doi

SAMO: Optimised Mapping of Convolutional Neural Networks to Streaming Architectures

FPL 2022

Alexander Montgomerie-Corcoran*, Zhewen Yu* and Christos-Savvas Bouganis

This work builds on fpgaConvNet's design space exploration by proposing two optimisers (simulated annealing and rule-based) and by expanding the scope to other FPGA-based ML accelerator frameworks, namely FINN and HLS4ML.
- doi

StreamSVD: Low-rank Approximation and Streaming Accelerator Co-design

FPT 2021

Zhewen Yu and Christos-Savvas Bouganis

This work investigates compressing ML models with Singular Value Decomposition (SVD) to reduce the number of weight parameters required. fpgaConvNet is used to show how this method benefits ML accelerators.
- doi

DEF: Differential Encoding of Featuremaps for Low Power Convolutional Neural Network Accelerators

ASP-DAC 2021

Alexander Montgomerie-Corcoran and Christos-Savvas Bouganis

This paper introduces a novel coding scheme that exploits the properties of feature maps to reduce the activity along off-chip memory data buses, leading to lower system power consumption.
- doi

Power-Aware FPGA Mapping of Convolutional Neural Networks

FPT 2019

Alexander Montgomerie-Corcoran and Christos-Savvas Bouganis

This work explores the power consumption aspect of fpgaConvNet, producing high-level models of how design choices affect the power consumption of the accelerator.
- doi

fpgaConvNet: Mapping Regular and Irregular Convolutional Neural Networks on FPGAs

TNNLS 2018

Stylianos I. Venieris and Christos-Savvas Bouganis

The scope of fpgaConvNet is expanded to support more irregular network styles such as GoogLeNet and ResNet, by including element-wise and concatenation layers in the design. This work addresses the design space exploration aspect of including multiple branches in the model.
- doi

Latency-Driven Design for FPGA-based Convolutional Neural Networks

FPL 2017

Stylianos I. Venieris and Christos-Savvas Bouganis

This work explores methods of targeting latency for fpgaConvNet by constraining the design to a single bitstream, and making use of runtime weight parameter reconfiguration.
- doi

fpgaConvNet: A Framework for Mapping Convolutional Neural Networks on FPGAs

FCCM 2016

Stylianos I. Venieris and Christos-Savvas Bouganis

The original paper detailing the design principles of fpgaConvNet, such as the application of Synchronous Dataflow (SDF) performance modelling for streaming architectures, and high-level resource estimation. This work also explores the use of bitstream reconfiguration for high-throughput applications.
- doi
