Submitting Koios benchmarks #1753

Merged
merged 14 commits into from
Jun 14, 2021
Changes from 11 commits
67 changes: 67 additions & 0 deletions .github/kokoro/presubmit/nightly_test4.cfg
@@ -0,0 +1,67 @@
# Format: //devtools/kokoro/config/proto/build.proto

build_file: "vtr-verilog-to-routing/.github/kokoro/run-vtr.sh"

# 72 hours
timeout_mins: 4320

action {
  define_artifacts {
    # File types
    regex: "**/*.out"
    regex: "**/vpr_stdout.log"
    regex: "**/parse_results.txt"
    regex: "**/qor_results.txt"
    regex: "**/pack.log"
    regex: "**/place.log"
    regex: "**/route.log"
    regex: "**/*_qor.csv"
    regex: "**/*.out.gz"
    regex: "**/vpr_stdout.log.gz"
    regex: "**/parse_results.txt.gz"
    regex: "**/qor_results.txt.gz"
    regex: "**/pack.log.gz"
    regex: "**/place.log.gz"
    regex: "**/route.log.gz"
    regex: "**/*_qor.csv.gz"
    strip_prefix: "github/vtr-verilog-to-routing/"
  }
}

env_vars {
  key: "KOKORO_TYPE"
  value: "presubmit"
}

env_vars {
  key: "KOKORO_DIR"
  value: "vtr-verilog-to-routing"
}

env_vars {
  key: "VTR_DIR"
  value: "vtr-verilog-to-routing"
}

#Use default build configuration
env_vars {
  key: "VTR_CMAKE_PARAMS"
  value: ""
}

env_vars {
  key: "VTR_TEST"
  value: "vtr_reg_nightly_test4"
}

#Options for run_reg_test.py
# -show_failures: show tool failures in main log output
env_vars {
  key: "VTR_TEST_OPTIONS"
  value: "-show_failures"
}

env_vars {
  key: "NUM_CORES"
  value: "8"
}
2 changes: 1 addition & 1 deletion .github/kokoro/steps/vtr-test.sh
@@ -72,7 +72,7 @@ find . -type f -regex ".*\.tar\.\(gz\|xz\)" -delete
find vtr_flow/tasks/regression_tests/vtr_reg_nightly_test1/ -type f -print0 | xargs -0 -P $(nproc) gzip
find vtr_flow/tasks/regression_tests/vtr_reg_nightly_test2/ -type f -print0 | xargs -0 -P $(nproc) gzip
find vtr_flow/tasks/regression_tests/vtr_reg_nightly_test3/ -type f -print0 | xargs -0 -P $(nproc) gzip

find vtr_flow/tasks/regression_tests/vtr_reg_nightly_test4/ -type f -print0 | xargs -0 -P $(nproc) gzip

# Make sure working directory doesn't exceed disk space limit!
echo "Working directory size: $(du -sh)"
50 changes: 47 additions & 3 deletions doc/src/vtr/benchmarks.rst
@@ -38,9 +38,7 @@ They are suitable for FPGA architecture research and medium-scale CAD research.
stereovision0 Computer Vision
stereovision1 Computer Vision
stereovision2 Computer Vision
stereovision3 Computer Vision
tpu.32x32.int8 Deep Learning
tpu.16x16.int8 Deep Learning
stereovision3 Computer Vision
================ =================

The VTR benchmarks are provided as Verilog under: ::
@@ -66,6 +64,51 @@ The Titan benchmarks are suitable for large-scale FPGA CAD research, and FPGA ar

.. seealso:: :ref:`titan_benchmarks_tutorial`

Koios Benchmarks
-----------------
The Koios benchmarks :cite:`koios_benchmarks` are a set of Deep Learning (DL) benchmarks.
They are suitable for DL-related architecture and CAD research.
The suite contains 19 designs, including several medium-sized benchmarks and some large benchmarks.
The designs target different network types (CNNs, RNNs, MLPs, RL) and layer types (fully-connected, convolution, activation, softmax, reduction, eltwise).
Some of the designs are generated by HLS tools.
The designs use a range of precisions, including binary, fixed point (int8/16/32), brain floating point (bfloat16), and IEEE half-precision floating point (fp16).

.. _table_koios_benchmarks:

.. table:: The Koios Benchmarks.

    ================= ======================================
    Benchmark         Description
    ================= ======================================
    clstm_like        CLSTM-like accelerator
    dla_like          Intel-DLA-like accelerator
    lstm              LSTM engine
    tpu_like          Google-TPU-v1-like accelerator
    bnn               4-layer binary neural network
    tiny_darknet_like Accelerator for Tiny Darknet
    gemm_layer        20x20 matrix multiplication engine
    attention_layer   Transformer self-attention layer
    conv_layer        GEMM based convolution
    spmv              Sparse matrix vector multiplication
    robot_rl          Robot+maze application
    reduction_layer   Add/max/min reduction tree
    softmax           Softmax classification layer
    conv_layer_hls    Sliding window convolution
    eltwise_layer     Matrix elementwise add/sub/mult
    ================= ======================================

Koios benchmarks are compatible with the full VTR flow. Some Koios benchmarks use advanced DSP features that are available in only a few of the FPGA architectures provided with VTR. These benchmarks instantiate DSP macros to implement native FP16 multiplications or to use the hard dedicated chains, and these macros are architecture-specific. Users who want to target a different FPGA architecture file can replace the macro instantiations in the benchmarks with their equivalents from that architecture.

Alternatively, users can disable these advanced features with the ``complex_dsp`` macro. If ``complex_dsp`` is defined in a benchmark file (using ```define complex_dsp`` at the beginning of the file), the advanced DSP features mentioned above are used. To run a Koios benchmark on an FPGA architecture that lacks these advanced DSP features (for example, the flagship architectures: ``$VTR_ROOT/vtr_flow/arch/timing/k6_frac_N10_*_mem32K_40nm*``), remove the line that defines the ``complex_dsp`` macro. The benchmark then implements the same functionality in behavioral Verilog, which is mapped to the FPGA soft logic.
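
Removing the ``complex_dsp`` definition can be scripted. The following is a minimal sketch, not part of VTR: the function name ``disable_complex_dsp`` and the copy-then-edit approach are assumptions made for illustration, and the demo runs on a small stand-in file rather than a real Koios benchmark. It writes a copy of a Verilog file with every line that defines ``complex_dsp`` deleted and leaves the original untouched.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical, not shipped with VTR).

# disable_complex_dsp SRC DST
# Copy SRC to DST, deleting any line that defines the complex_dsp macro.
disable_complex_dsp() {
    local src="$1" dst="$2"
    # Match optional indentation, then `define complex_dsp as a whole word,
    # so that a macro such as complex_dsp_other is left alone.
    sed -E '/^[[:space:]]*`define[[:space:]]+complex_dsp([[:space:]]|$)/d' "$src" > "$dst"
}

# Demo on a stand-in file (assumed content, not a real benchmark).
demo_dir="$(mktemp -d)"
printf '%s\n' \
    '`define complex_dsp' \
    'module top(input a, output b);' \
    'assign b = a;' \
    'endmodule' > "$demo_dir/with_dsp.v"

disable_complex_dsp "$demo_dir/with_dsp.v" "$demo_dir/soft_logic.v"
cat "$demo_dir/soft_logic.v"
rm -rf "$demo_dir"
```

Writing to a separate file keeps the copy under ``$VTR_ROOT/vtr_flow/benchmarks/verilog/koios`` intact, so the same benchmark can still be run with the advanced DSP architectures.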

The Koios benchmarks are provided as Verilog (giving full flexibility to modify how the designs are implemented) under: ::

    $VTR_ROOT/vtr_flow/benchmarks/verilog/koios

The FPGA architectures with advanced DSP features that work out of the box with the Koios benchmarks are available under: ::

    $VTR_ROOT/vtr_flow/arch/COFFE_22nm/k6FracN10LB_mem20K_complexDSP_customSB_22nm.*

MCNC20 Benchmarks
-----------------
The MCNC benchmarks :cite:`mcnc_benchmarks` are a set of small and old (circa 1991) benchmarks.
@@ -114,3 +157,4 @@ where :math:`K=` ``<#>``.
spla 2278
tseng 1583
========= ========================================

8 changes: 8 additions & 0 deletions doc/src/z_references.bib
@@ -415,3 +415,11 @@ @ARTICLE{murray_micro_symbiflow
number={},
pages={1-1}
}

@inproceedings{koios_benchmarks,
title={Koios: A Deep Learning Benchmark Suite for FPGA Architecture and CAD Research},
author={Arora, Aman and Boutros, Andrew and Rauch, Daniel and Rajen, Aishwarya and Borda, Aatman and Damghani, Seyed A. and Mehta, Samidh and Kate, Sangram and Patel, Pragnesh and Kent, Kenneth B. and Betz, Vaughn and John, Lizy K.},
booktitle={International Conference on Field Programmable Logic and Applications (FPL)},
year={2021}
}