Skip to content

Deploy RRGraphBuilder in place of rr_node_indices in routing resource graph builder functions #1747

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 1, 2021

Conversation

tangxifan
Copy link
Contributor

This pull request is a follow-up PR on the #1693

Description

  • Fully use RRGraphBuilder when building the lookup in routing resource graph. Eliminate the use of rr_node_indice
  • Cut off the dead loop between two header files: rr_graph2.h and rr_graph_storage.h. Previously, rr_graph2.h is included in the rr_graph_storage.h due to a shared data structure rr_edge_info_set. However, rr_graph_storage.h should be the primitive header files. Therefore, a new header file rr_edge.h is created to host the shared data structure rr_edge_info_set.
  • Added an API resize_nodes() to RRSpatialLookup to pre-allocate memory efficiently
  • Added an API mirror_nodes() to `RRSpatialLookup`` to enable batch copy of indices.

Related Issue

This is part of the PR request plan on routing resource graph, being a follow-up PR on the #1693

Motivation and Context

Motivated by @vaughnbetz 's suggestion, the top priority is to deploy the new data structure RRGraphBuilder in place of the legacy data structure rr_node_indices in routing resource graph builder functions.
This PR addresses the comment, where we fully remove the use of rr_node_indices in builder functions.
As a result, RRGraphBuilder data structure is the only one that builder functions are talking to.

How Has This Been Tested?

Should pass all the existing tests, since this PR is about refactoring.

Types of changes

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed

tangxifan added 2 commits May 28, 2021 16:44
…osit invalid node ids when allocating RRSpatialLookup; No long need complex sanity checks. The vectors in RRSpatialLookup can be well aligned and allocated just in need; Remove the dead loop between rr_graph2.h and rr_graph_storage.h
@github-actions github-actions bot added the VPR VPR FPGA Placement & Routing Tool label May 28, 2021
* Considering a bounding box (x, y)->(x + width, y + height) of a multi-height and multi-width grid,
* SOURCE and SINK nodes are indexable in any location inside the boundry.
*
* Note: currently this function only accept SOURCE/SINK nodes. May unlock for other depending on needs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small typos:
accept -> accepts.
other -> other types

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to make the RRGraphBuilder comments Doxygen compatible so we could build documentation of the rr-graph API off them eventually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for correction. I have adapted the comments about the APIs by following the style in vpr_context.h

* Considering a bounding box (x, y)->(x + width, y + height) of a multi-height and multi-width grid,
* SOURCE and SINK nodes are indexable in any location inside the boundry.
*
* Note: currently this function only accept SOURCE/SINK nodes. May unlock for other depending on needs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment later in this file about making a high-level API to duplicate nodes for large blocks. Part of this comment is about how the caller should use this function, which is a sign (in my opinion) that we should try to capture the higher-level use.

Also, does this function copy the whole vector of nodes at that [x][y][type]... location? That's memory-efficient, although there may not be enough large blocks for it to be a major issue. We could fix this in the future by making the find_node function smarter; it could just look up the node index based on the anchor-point of the large block, without ever explicitly storing the rr_node indices in multiple (x,y) spots for a large block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This function will copy a vector of node indices from a source [src_x][src_y][type][side] to a destination [des_x][des_y][type][side]. It is required when building the lookup for SOURCE and SINK nodes, which are supposed to be indexable at any location inside a large block (x,y)->(x+width, y+height).

I agree that we should have a high-level API to do the job, while this low-level API can be a private one. Here are my thoughts:

  • We can create a high-level API like expand_nodes() which will copy the vector from a source coordinate to all the coordinates in a bounding box.
  • I also think that it is better to rework the find_node() API so that we always search the lowest (x,y) corner when dealing with large blocks. But this may require the data structure to be dependent on DeviceGrid information (it needs to identify if a grid has height > 1 as well as width >1)

We can keep brainstorming on this front. I have left comments in the header file as a TODO item.

I have also added comments (a quick example) about how to use the API.


/* Resize three dimensions of the lookup under a given type of node to be memory efficient
* This function is called to expand the matrix when x, y or side
* when one or more of them beyond current capacity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final sentence doesn't quite connect. "when x, y, or side is large enough that it is beyond the current capacity of rr_node_indices".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also say exactly what this does. resizes the given 3 dimensions of the RRNodeLookup data structure for the given type. Any existing data in the lookup for this type is copied over (? I think?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. I have reworked the comment to be precise and also comment that any existing data will be maintained without modification.

for (int chan = 0; chan < num_chans - 1; ++chan) {
for (int seg = 1; seg < chan_len - 1; ++seg) {
/* Assign an inode to the starts of tracks */
int x = (type == CHANX ? seg : chan);
int y = (type == CHANX ? chan : seg);
const t_chan_seg_details* seg_details = chan_details[x][y].data();

for (unsigned track = 0; track < indices[type][chan][seg][0].size(); ++track) {
for (int track = 0; track < max_chan_width; ++track) {
if (seg_details[track].length() <= 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We likely should let the .length = 0 case go through eventually, to model muxes. Could comment that now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. Added this as a TODO item. I will check details about the t_chan_details data structure and see what we can do.

@@ -1132,8 +1098,14 @@ static void load_block_rr_indices(RRGraphBuilder& rr_graph_builder,
int root_x = x - width_offset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the whole double-for loop comtaining these calls be moved into RRGraphBuilder as a high-level API? Call it something like dupLargeBlkLookups(). Then mirror_nodes could be removed or made a (private if possible) helper function.

Right now the comment for mirror_node describes a good chunk of the body of this for loop, despite the for loop being in the caller. I think that's a sign we really want a higher-level API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is what I thought:

  • We can create a high-level API like expand_nodes() which will copy the vector from a source coordinate to all the coordinates in a bounding box.

I will keep thinking about what high-level API we need. I am positive for moving the for-loops into API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Let's land this and keep iterating on the higher-level api.

@vaughnbetz
Copy link
Contributor

Thanks Xifan. Looks mostly good, but I have some comments for you to consider.

@tangxifan
Copy link
Contributor Author

@vaughnbetz Thanks for the comments. I have addressed all of them and this PR is ready for your review.

Also, we can discuss if more should be added to this PR or leave them in next PRs:

  • Use find_node() API in place of the legacy function get_rr_node_index(). Then we can fully shadow rr_node_indices and remove it from the DeviceContext
  • Get rid of the legacy convention on swapping (x, y) for CHANX nodes when searching its index.
  • Develop high-level APIs in place of the mirror_nodes().

Copy link
Contributor

@vaughnbetz vaughnbetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! Let's leave further improvements for the follow-on PRs you mention.

@@ -1132,8 +1098,14 @@ static void load_block_rr_indices(RRGraphBuilder& rr_graph_builder,
int root_x = x - width_offset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Let's land this and keep iterating on the higher-level api.

@vaughnbetz
Copy link
Contributor

Will merge when CI goes green.

@tangxifan
Copy link
Contributor Author

@vaughnbetz Just the nightly test failed with an error:

13:10:43 up  4:06,  0 users,  load average: 3.04, 3.02, 3.00
++ free -h
              total        used        free      shared  buff/cache   available
Mem:           102G        5.5G         60G        8.6M         36G         96G
Swap:          1.0G          0B        1.0G
++ sleep 300
bash: : command not found


[ID: 7125009] Build finished after 14654 secs, exit value: 127


Warning: Permanently added 'localhost' (ED25519) to the list of known hosts.
[13:11:09] Collecting build artifacts from build VM
Build script failed with exit code: 127

But this happened after merging the latest master to my branch. I see similar failures in other PRs: #1728
Not sure why the error happens.

@vaughnbetz
Copy link
Contributor

This looks like another spurious kokoro failure; exit code 127 is apparently a command not found one. I'll merge this and hopefully kokoro recovers soon.

@vaughnbetz vaughnbetz merged commit c18b2f2 into master Jun 1, 2021
@vaughnbetz vaughnbetz deleted the rr_graph_builder branch June 1, 2021 21:00
@tangxifan
Copy link
Contributor Author

Thanks @vaughnbetz. I will keep an eye on the regression test and also start working on the next PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
VPR VPR FPGA Placement & Routing Tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants