Skip to content

Adding a TPU-like design to the VTR benchmark suite #1573

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 8, 2020

Conversation

aman26kbm
Copy link
Contributor

Description

This PR adds a new design to the VTR benchmark suite. The design is a Google TPU-like design and is a good representative of Deep Learning (DL) applications.

There are two designs actually - one that uses a 32x32 matrix multiplication unit and another that uses a 16x16 matrix multiplication unit.

There is a large comment section on the top of each file, which mentions the FPGA architectures that these designs have been tested with, the clock frequency/critical path obtained on each architecture and also the resource usage.

Related Issue

No issue exists for this. This was discussed in email threads with @vaughnbetz .

Motivation and Context

In the current VTR benchmark suite, we don't have any benchmarks that are DL/ML related. Adding a DL/ML design will be helpful for researchers aiming to explore FPGA architectures for DL/ML applications.

How Has This Been Tested?

Adding these doesn't modify an existing functionality. So, I haven't run any existing tests. I've run these VTR flow with these designs on the FPGA architectures specified in the files.

Types of changes

  • Bug fix (change which fixes an issue)
  • [ x] New feature (change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • [ x] My change requires a change to the documentation
  • [ x] I have updated the documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed

…nchmarking.

The two designs use the same skeleton. The main difference is that one of them
has a 16x16 matrix multiplication unit, while the other one has a 32x32 unit.
@aman26kbm
Copy link
Contributor Author

@vaughnbetz , please review this PR and provide feedback.

@jeanlego
Copy link
Contributor

jeanlego commented Oct 17, 2020

This will break Odin full regression in its current state because of the wildcard we use on the VTR flow verilog directory

But full regression in only ran on buildbot, Kokoro doesnt run any Odin test as far as I can tell from the cfg, so it should not block you with the CI.

I'll generate the expected results for the Odin regression suite and do a PR onto yours to not break the flow.

@aman26kbm
Copy link
Contributor Author

Thanks, @jeanlego

@aman26kbm
Copy link
Contributor Author

Can someone approve this PR? Please let me know if there is something I need to do.

@jeanlego
Copy link
Contributor

Once kokoro run It cn be merge on my end (@vaughnbetz ?)

@jeanlego
Copy link
Contributor

@aman26kbm is it blocking anything?

@aman26kbm
Copy link
Contributor Author

Thanks for checking, @jeanlego . This isn't blocking anything. So, we can check it in later.

Copy link
Contributor

@vaughnbetz vaughnbetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look like very nice designs to me Aman -- thanks!
@jeanlego : I think the Odin-II tests are OK now -- is that right?
Aman, I see you've added new tests for these designs but I think they're not run in any of our current CI. Is that right?
Also, I see you're adding them as new VTR benchmarks in the documentation -- I have some questions below on what's the best way to "productize" these benchmarks.
I think these can be merged, as all my comments are really enhancements and questions on what's next.

I'm not sure what the right approach in terms of benchmark suite is.
0. Do you have a testbench for these designs? If so, useful to check it in too. We don't have testbenches for most circuits, but good to keep them if we have them.

  1. We could add these designs to the existing VTR suite. In that case I think we should add them to CI for VTR designs by adding them to the benchmark list, after confirming that they don't take forever to do a binary search place and route. If binary search place and route takes forever, we could add them to a new VTR at fixed channel width test.

  2. Or we could start a new ML benchmark suite, with these being the first entries. I tend to think 1 is better for now, as these designs go through ODIN-II and hence are more like the existing VTR designs (rather than the Titan flow designs).

// Matrix multiplication unit
////////////////////////////////////

//////////////////////////////////////////////////////////////////////////////////
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should either fill in as much of this as possible (engineer, company can be institution, etc.) or delete this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will edit and push a new change over the weekend

@vaughnbetz
Copy link
Contributor

@andrewboutros : adding Andrew.

@aman26kbm
Copy link
Contributor Author

aman26kbm commented Oct 29, 2020

Thanks for the review, @vaughnbetz . Please find comments/questions below.

#1

I see you've added new tests for these designs but I think they're not run in any of our current CI. Is that right?

Actually I didn't any tests. I assume you're referring to the changes in .json file in the PR? @jeanlego modified those because ODIN tests were picking up the new files.
I assume a "test" involves running the VTR flow with different settings/configurations/architectures. And adding a test for the design will "protect" the design from changes to ensure it always works with the flow. I am not familiar with the testing flow we have for VTR. I can read up on it and see which testlist we can add this to.

#2

Do you have a testbench for these designs? If so, useful to check it in too.

I do have a simple verilog testbench with a few tests that test the main features of the design. I can add that. Is there a recommended directory structure for this? I like to keep the design and the testbench together. So, then each benchmark would become a sub-directory containing design and testbench files.

#3

We could add these designs to the existing VTR suite. In that case I think we should add them to CI for VTR designs by adding them to the benchmark list, after confirming that they don't take forever to do a binary search place and route.

The time taken by the designs for the total VTR flow for 2 architectures is in the banner in the files. The 16x16 design should be okay to add to the CI. It takes <1 hour. Not sure what the current limits are, for something to be in the CI regression. The fixed-width part probably runs very quickly. I can find that time and report it here.

#4

Or we could start a new ML benchmark suite, with these being the first entries. I tend to think 1 is better for now, as these designs go through ODIN-II and hence are more like the existing VTR designs (rather than the Titan flow designs).

We can have a non-Titan flow ML benchmark suite as well. In fact, I think a non-Titan flow ML benchmark suite will have more general use, no?
For now, we only have one design (TPU-like design with 2 variants), but we'll have more soon. So, we can create a new directory called "ml" under "benchmarks", and put these designs there.

@vaughnbetz
Copy link
Contributor

vaughnbetz commented Oct 29, 2020

#1: For tests, I suggest

#2: if we make each benchmark a sub-directory I am pretty sure it will break the existing vtr_flow scripts that will run the regtests above. That could be fixed, but the faster way to check in the testbenches would be to make a /testbenches directory under the vtr_flow/benchmarks/verilog directory and put a README and testbenches (or subdirectories with testbenches) under it.

#3: Seems like these should be added to the benchmarks/verilog directory and run as part of the vtr suite during QoR runs then.

#4: I think we can make a benchmarks/ml directory and just put a README in it pointing to the designs in benchmarks/verilog and later titan/. That way we don't have to add a lot of new tests, and we can run these new designs alongside the existing Odin-II compatible designs (with our current scripts) and average QoR across the larger suite.
Basically this groups designs by the flow they go through, and then READMEs/lists define subsets and combinations of interest. I think that will be the more maintainable solution.

Adding @MohamedElgammal and @kmurray in case they have any thoughts.

@jeanlego
Copy link
Contributor

@jeanlego : I think the Odin-II tests are OK now -- is that right?

correct

I do believe they are a nice addition to the benchmarks, but I don't see them as essential to run with the Odin regression, I leave that up to you.

FYI we glob the VTR benchmark directory to make sure all the tests available synthesize as expected and there are no changes on the circuits, hence the JSON I generated.

@vaughnbetz
Copy link
Contributor

Discussed this with some other contributors (google etc.) in a VTR meeting today. They like the plan above (put in vtr_flow/benchmarks/verilog, add to list of benchmarks in the qor sweeps etc.).

@aman26kbm
Copy link
Contributor Author

Sounds good. I have added the sub-tasks here on my task list. Will post updates/ask questions as I work through them.

@vaughnbetz
Copy link
Contributor

I could merge this PR and then we could track the update tasks in #1 - 4 above in a separate PR if you like Aman. Let me know if you'd like that, or if you'd rather do everything on this PR.

@aman26kbm
Copy link
Contributor Author

I think let's merge this one and we can handle #1-#4 in another PR. I have added those to my task list. May be I can create an issue and track those tasks there?

@vaughnbetz
Copy link
Contributor

Sounds good. Yes, please go ahead and create a separate issue and track the updates there.

@vaughnbetz vaughnbetz merged commit a9402fc into verilog-to-routing:master Nov 8, 2020
@aman26kbm
Copy link
Contributor Author

Thanks, @vaughnbetz
I've filed #1590.
Please assign it to me if possible. It doesn't seem like I can do that myself.

@aman26kbm aman26kbm deleted the vtr_benchmark_tpu branch November 8, 2020 16:39
@aman26kbm aman26kbm restored the vtr_benchmark_tpu branch November 8, 2020 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants