Added cleaning of a pb after unsuccessful molecule packing #1656
Conversation
Force-pushed from 369f5d1 to 5c3070c.
This needs a test.
Force-pushed from af31ac8 to 86060ce.
@litghost Test added. I've run …
LGTM, waiting for other reviews.
LGTM. I think we need to re-run CI as the Nightly one errored out for some reason (link).
@@ -1179,6 +1179,64 @@ static void alloc_and_load_pb_stats(t_pb* pb, const int feasible_block_array_siz
}
/*****************************************/

/**
 * Cleans up a pb after unsuccessful molecule packing
 */
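The hunk above shows only the doc comment of the new routine. For orientation, here is a rough, self-contained sketch of what such a cleanup could do: recursively free the speculatively allocated child pbs and reset modes up the hierarchy. The `Pb` struct and every field name below are simplified assumptions for illustration, not VPR's actual `t_pb` or the code from this PR.

```cpp
#include <vector>

// Simplified stand-in for VPR's t_pb; the real struct carries much more
// state (pb_graph_node, pb_stats, name, ...). All fields here are
// illustrative assumptions, not the actual layout.
struct Pb {
    int mode = -1;                 // -1: no mode chosen yet
    Pb* parent = nullptr;
    std::vector<Pb*> children;     // child pbs allocated during atom placement
    bool holds_atom = false;       // true if this pb or a descendant still
                                   // holds an atom from an earlier success
};

// Recursively free the child subtrees that were allocated speculatively.
static void free_children(Pb* pb) {
    for (Pb* child : pb->children) {
        free_children(child);
        delete child;
    }
    pb->children.clear();
}

// Walk from the pb of the failed atom up toward the root CLB, freeing the
// pbs allocated for the failed attempt and resetting each mode so the next
// molecule sees no leftover mode constraints.
void cleanup_pb(Pb* pb) {
    while (pb != nullptr && !pb->holds_atom) {
        free_children(pb);
        pb->mode = -1;             // undo the mode set by the failed attempt
        pb = pb->parent;
    }
}
```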
It would be good to comment this routine, and how it works, as much as you can.
What exactly is it cleaning up? When is it safe to call?
LGTM too. My one request is to comment it as much as you can. I realize that may be limited if you don't have a full understanding of exactly how / why these structures are set up, but if you give as much insight as you can into why this routine is being called, what it does, and when it is safe to call it, that would be good.
vpr/src/pack/cluster.cpp (Outdated)
@@ -1358,6 +1416,10 @@ static enum e_block_pack_status try_pack_molecule(t_cluster_placement_stats* clu
        revert_place_atom_block(molecule->atom_block_ids[i], router_data, atom_molecules);
    }
}

/* Placement failed, clean the pb */
To clarify, is the cleanup_pb function supposed to be called when a molecule fails try_pack_molecule for any reason? Or only when it fails during atom placement (try_place_atom_block_rec)? I’m asking because in the issue it says “Add (or fix?) cleanup code that removes child_pbs created during atom placement if it fails for any reason.”
If it is meant to be called when a molecule fails for any reason, it should also be called if the atom fails the detailed routing legality check in try_pack_molecule (line 1305/1363).
If the cleanup function is only meant to be called when the atom fails during try_place_atom_block_rec, then maybe the call to cleanup_pb should be placed right after try_place_atom_block_rec is called. So right after the atom placement attempt, you could check block_pack_status and call the cleanup function if needed. Right now, where cleanup_pb is called, the molecule could have failed either the placement check or the pin feasibility filter check.
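As a rough illustration of that suggestion, the sketch below checks the status immediately after each placement attempt. The names `e_block_pack_status`, `BLK_PASSED`, `try_place_atom_block_rec`, and `cleanup_pb` come from this thread; the types, stub bodies, and the extra failure-status names are hypothetical stand-ins for the real code in `vpr/src/pack/cluster.cpp`.

```cpp
#include <vector>

// Hypothetical stand-ins for the VPR types discussed in this thread.
enum e_block_pack_status { BLK_PASSED, BLK_FAILED_FEASIBLE, BLK_FAILED_ROUTE };
struct Pb {};
struct Molecule { std::vector<int> atom_block_ids; };

// Stubs: the real routines live in vpr/src/pack/cluster.cpp.
static e_block_pack_status try_place_atom_block_rec(int /*atom_blk_id*/) {
    return BLK_PASSED;  // real code recursively places the atom in the pb tree
}
static void cleanup_pb(Pb* /*pb*/) {}  // the cleanup routine added by this PR

// Sketch of the suggested call site: run cleanup right after a failed
// try_place_atom_block_rec, rather than on any failure path, so it only
// fires when atom placement itself has allocated partial pb state.
e_block_pack_status try_pack_molecule_sketch(const Molecule& molecule, Pb* pb) {
    e_block_pack_status status = BLK_PASSED;
    for (size_t i = 0; i < molecule.atom_block_ids.size() && status == BLK_PASSED; ++i) {
        status = try_place_atom_block_rec(molecule.atom_block_ids[i]);
        if (status != BLK_PASSED) {
            cleanup_pb(pb);  // free child pbs created by the failed placement
        }
    }
    return status;
}
```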
You're right. It should be called whenever try_place_atom_block_rec fails. But if you look at the loop that invokes it for each atom, you'll see that it terminates immediately on failure:

vtr-verilog-to-routing/vpr/src/pack/cluster.cpp, line 1241 in 8e00e82:
for (i = 0; i < molecule_size && block_pack_status == BLK_PASSED; i++) {
Later in the flow there are calls to some atom placement reversal routines:

vtr-verilog-to-routing/vpr/src/pack/cluster.cpp, line 1350 in 8e00e82:
if (block_pack_status != BLK_PASSED) {
Diving into the low-level details: the problem I observed with the assertion failure in try_place_atom_block_rec was that, for some reason, the function was hitting already-allocated pbs with their modes set, causing a conflict with the mode required by the currently packed atom. Digging deeper, I discovered that the function is called for the pb_graph_node corresponding to that atom; it then recurses up the hierarchy, allocating pbs and setting their modes, until the root CLB is reached. But these pbs are never freed between subsequent molecule packing attempts. So if packing succeeds everything is fine, but when it fails there are leftover pbs set to specific modes. This may cause a mode mismatch if the mode required for the next molecule differs from the last one checked.
I typed "Add (or fix?)..." in the issue because I was surprised that the problem hadn't manifested itself until now. I'm still not sure whether I'm adding genuinely missing data-structure cleanup, or whether the cleanup is already there, needs fixing, and I've missed it.
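To make that failure mechanism concrete, here is a toy, runnable reconstruction of the mode conflict: a mode left over from a failed attempt blocks the next molecule until it is reset. Everything in it is an illustrative invention, not VPR code.

```cpp
#include <cstdio>

struct Pb {
    int mode = -1;  // -1: unset; otherwise the mode committed by some attempt
};

// Toy version of the mode check during atom placement: legal only if the pb
// has no mode yet or already uses the mode this atom requires.
static bool place_atom(Pb& pb, int required_mode) {
    if (pb.mode != -1 && pb.mode != required_mode) {
        return false;  // in VPR this conflict surfaced as an assertion failure
    }
    pb.mode = required_mode;
    return true;
}

int main() {
    Pb pb;

    // Molecule 1 needs mode 0: placement sets the mode, then the molecule
    // fails for some other reason. Without cleanup, pb.mode stays 0.
    place_atom(pb, /*required_mode=*/0);

    // Molecule 2 needs mode 1: the leftover mode causes a conflict.
    std::printf("second molecule: %s\n",
                place_atom(pb, /*required_mode=*/1) ? "placed" : "mode conflict");

    // After the mode is reset (what cleanup effectively does here), it works.
    pb.mode = -1;
    std::printf("after cleanup:   %s\n",
                place_atom(pb, /*required_mode=*/1) ? "placed" : "mode conflict");
    return 0;
}
```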
Ah I see, that makes sense. In that case LGTM.
Force-pushed from a676b2c to c065ae8.
I added more comments to the cleanup function that I added.
Looks good to me. Looks like it needs to be updated to match the base branch, and then it can be merged.
@vaughnbetz Updated.
Force-pushed from 33790e1 to c1411b1.
Signed-off-by: Maciej Kurc <[email protected]>
Signed-off-by: Alessandro Comodi <[email protected]>
Force-pushed from c1411b1 to 5fe4b9b.
vtr_reg_nightly got stuck again, but based on the testing and QoR data I'm confident this is OK and we should merge it. @litghost: seems like this is another case of a stuck nightly reg.
@vaughnbetz thanks.
Attempt to fix #1655