Skip to content

Cluster attraction groups #1802

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jul 30, 2021
Merged

Cluster attraction groups #1802

merged 18 commits into from
Jul 30, 2021

Conversation

sfkhalid
Copy link
Contributor

Description

In order to place blocks in user-specified floorplan regions, the packing stage of vpr should pack primitives together when they are part of the same floorplan region. This is especially important if the specified floorplan regions are small, since it will be harder to make sure all the necessary primitives are placed within the region if they are packed into many different clustered blocks.

To achieve this goal during packing, attraction groups were introduced to the clustering process. The attraction groups are made up of atoms that are part of the same partition. During clustering, the molecule gain is increased for molecules that have the same attraction group as the cluster (if the cluster has an associated attraction group). This helps ensure that primitives in the same attraction group get packed together.

Related Issue

This pull request is part of the work related to issue #932.

@github-actions github-actions bot added the VPR VPR FPGA Placement & Routing Tool label Jul 15, 2021
@vaughnbetz vaughnbetz changed the title Cluster attraction groups WIP: Cluster attraction groups Jul 15, 2021
@vaughnbetz
Copy link
Contributor

Added WIP to the title as this isn't ready for merge yet.
Email from @luujason indicates our understanding of the pb_stats and gains seems correct -- only the top-level (root/cluster) pb_stats is used for gains. That means @sfkhalid should refactor the code to move the gains-related data out of the pb_stats data structure and into a new cl_optimize data structure (along with the other data we talked about today).
That could be done in a separate pr if it is easier to get the current code going with the redundant pb_stats loading code for attraction gains -- I suspect that is best as it will minimize the size of each PR.

@sfkhalid
Copy link
Contributor Author

sfkhalid commented Jul 27, 2021

As a note, I ran the sanitized basic tests locally, and they are failing locally as well. I ran the test on both this branch (cluster_attraction_groups) and the master branch, and there are failing tests on both branches. The tests are failing with the exit code 23.

@tangxifan
Copy link
Contributor

As a note, I ran the sanitized basic tests locally, and they are failing locally as well. I ran the test on both this branch (cluster_attraction_groups) and the master branch, and there are failing tests on both branch. The tests are failing with the exit code 23.

@sfkhalid Can you share the log files on the locally failed tests? I can look into it and figure it how to patch. Thanks!

@sfkhalid
Copy link
Contributor Author

sfkhalid commented Jul 27, 2021

As a note, I ran the sanitized basic tests locally, and they are failing locally as well. I ran the test on both this branch (cluster_attraction_groups) and the master branch, and there are failing tests on both branch. The tests are failing with the exit code 23.

@sfkhalid Can you share the log files on the locally failed tests? I can look into it and figure it how to patch. Thanks!

Sure, below are the log files of some of the failing tests as well as a summary of which tests failed.

master_ch_intrinsics_vpr.txt
master_diffeq1_vpr.txt
master_mkPktMerge_vpr.txt
master_single_ff_vpr.txt
master_single_wire_vpr.txt
master_run_summary.txt

These are all from the run done on the master branch. I've compared with the run that was done on the attraction groups branch and so far all the failures seem to be the same.

@tangxifan
Copy link
Contributor

@sfkhalid Just checked the reports. It seems to be a TBB issue. I double checked the log file of S:Basic test. The TBB is actually disabled in these tests. In my local tests, TBB was off as well.

If you can turn off TBB, we may bypass this issue and see if we see the same issue as the Github runner shows.
Or we may consider to add some tests for TBB

@sfkhalid
Copy link
Contributor Author

sfkhalid commented Jul 28, 2021

@tangxifan Did your local sanitized basic tests pass with TBB off? Also, where do you check to see if TBB is off in the test?

@tangxifan
Copy link
Contributor

@tangxifan Did your local sanitized basic tests pass with TBB off?

@sfkhalid Yes. That is what I see on my side.

@sfkhalid
Copy link
Contributor Author

@tangxifan Did your local sanitized basic tests pass with TBB off?

@sfkhalid Yes. That is what I see on my side.

Ok, I see, thank you. I'll try running without tbb on my side and see what happens

@vaughnbetz
Copy link
Contributor

There is also a suppression file for the sanitizers that the regtests use. Are you using that @sfkhalid ? I believe there are a few issues in the Intel TBB library (small memory leaks, some object accesses that the sanitizer complains about) that we can't fix, so we suppress those errors.

@sfkhalid
Copy link
Contributor Author

There is also a suppression file for the sanitizers that the regtests use. Are you using that @sfkhalid ? I believe there are a few issues in the Intel TBB library (small memory leaks, some object accesses that the sanitizer complains about) that we can't fix, so we suppress those errors.

@vaughnbetz No, I wasn't aware of that file. Where is it located/how do I enable it for the reg test?

@vaughnbetz
Copy link
Contributor

The leak sanitizer suppression file is at https://github.com/verilog-to-routing/vtr-verilog-to-routing/tree/master/vpr/lsan.supp

But looking at it, it doesn't have TBB suppression, just one suppression for graphics. There is also a valgrind suppression file, and it also doesn't have TBB suppresion (but perhaps doesn't need one). @kmurray : should we make an lsan.supp file with the TBB memory leaks and invalid object messages suppressed? If you already know those are spurious warnings it seems like we should make one.

@sfkhalid
Copy link
Contributor Author

sfkhalid commented Jul 29, 2021

A summary of some test cases:
Attraction Groups Testing.docx
These test cases were done to see whether using attraction groups during packing was giving any added benefit to helping VPR succeed with floorplan constraints.

Copy link
Contributor

@vaughnbetz vaughnbetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that not all the comments from our in-zoom review have been addressed, although the comments are marked resolved. Please address before merging.
Should update qor spreadsheet too so it checks the final code (after all updates).

@@ -2522,6 +2532,37 @@ void add_cluster_molecule_candidates_by_highfanout_connectivity(t_pb* cur_pb,
cur_pb->pb_stats->tie_break_high_fanout_net = AtomNetId::INVALID(); /* Mark off that this high fanout net has been considered */
}

void add_cluster_molecule_candidates_by_attraction_group(t_pb* cur_pb,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comment on what this function does. Should have one.
I think this is a file scope function so should be static (I don't think it is in the header).
Should apply the same logic to all added functions: comment their use, and make them static if their declaration is not in the .h file. Comment their use in the .h file if they are declared there, or in the .cpp file (at the definition) if declared static in the .cpp file.
Should also declare all static functions near the top of the .cpp file (where the other static functions are declared).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll add a comment to this function and the other add_cluster_molecule functions.
I made all of the add_cluster_molecule_candidates_by_x functions static and added the declaration for this new one at the top of the file, I'm not sure why that's not showing here. I can see it on this pull request if I go on under files changed though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Maybe I clicked on the wrong link and wasn't looking at the latest commit.

@sfkhalid
Copy link
Contributor Author

QoR spreadsheet for code comparison after all changes:
att_groups_qor_comparison_2.xlsx
No significant changes were noted across the categories between the master branch and the attraction groups branch.

@vaughnbetz
Copy link
Contributor

vaughnbetz commented Jul 30, 2021

Looks good; merging once CI goes green unless you tell me there is a reason to wait @sfkhalid

@sfkhalid
Copy link
Contributor Author

Looks good; merging once CI goes green unless you tell me there is a reason to wait @sfkhalid

Nope, should be good to merge once the tests pass.

@sfkhalid
Copy link
Contributor Author

@vaughnbetz The tests passed, I think this is good to merge now

@vaughnbetz vaughnbetz merged commit 110bae4 into master Jul 30, 2021
@vaughnbetz vaughnbetz deleted the cluster_attraction_groups branch July 30, 2021 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
VPR VPR FPGA Placement & Routing Tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants