[AP][GlobalPlacment] Added Bound2Bound Solver #2949
Conversation
// Get the anchor weight based on the iteration number. We want the anchor
// weights to get stronger as we get later in global placement. Found that
// an exponential weight term worked well for this.
double coeff_pseudo_anchor = anchor_weight_mult_ * std::exp((double)iteration / anchor_weight_exp_fac_);
Is this something commonly done across analytical placers? Also, do you print this number in the output when showing information for each iteration?
This is one of two main techniques used for quadratic global placers (the other is to apply forces to moveable blocks directly). I do not print this information since it is specific to only this type of analytical solver. It would be a good debug thing to print in the future.
I agree; it would definitely be useful for debugging. I guess my question wasn’t so much about adding the link to the anchor, but more out of curiosity: is increasing the anchor weight a common practice in other analytical placers as well?
I believe increasing the weight over the GP iterations is necessary. Without increasing the weight, GP may never converge on a solution, since the forces pulling the blocks together would remain stronger than the forces pushing them apart.
Another benefit of increasing the anchor weights is that it guarantees convergence. As the anchor weights approach infinity, the solved solution approaches the mass-legalized solution, so the gap between them must go to zero. That is the theory, at least; in practice I have yet to find a circuit that did not converge.
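For illustration, the schedule in the quoted code can be sketched as a free function. The names `anchor_weight_mult_` and `anchor_weight_exp_fac_` come from the diff above, but the default values here are made up:

```cpp
#include <cassert>
#include <cmath>

// Sketch of the exponential anchor-weight schedule from the quoted code.
// The multiplier and exponential-factor defaults are illustrative only.
double anchor_weight(unsigned iteration,
                     double mult = 1.0,      // stands in for anchor_weight_mult_
                     double exp_fac = 5.0) { // stands in for anchor_weight_exp_fac_
    // The weight grows exponentially with the GP iteration, so later
    // iterations pull blocks more strongly toward their anchor positions.
    return mult * std::exp(static_cast<double>(iteration) / exp_fac);
}
```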
Makes sense! Thanks!
}
}

void B2BSolver::init_linear_system(PartialPlacement& p_placement) {
You mentioned before that building the matrices is pretty fast. Still, I think it would be useful to measure the time it takes to construct the matrices and include it as one of the columns in the AP output table.
I think this would be too much information to be shown each iteration of global placement. But I can include the total time spent constructing matrices in the print_statistics method! I think it would be useful to know!
for (size_t row_id_idx = 0; row_id_idx < num_moveable_blocks_; row_id_idx++) {
    // Since we are capping the number of iterations, it is likely that
    // the solver will overstep and give a negative number here. Just
    // clamp them to zero.
I think the comment is not accurate. The negative values are not caused by capping the iterations or "overstepping." In CG, negative values may appear because the solver doesn't enforce constraints, and clamping the output to zero isn't typically part of the standard CG flow. Also, a question: should we also be checking that the values don't exceed the width/height limits?
This is caused by capping the number of iterations. CG takes steps to move from a guess solution towards a better solution.
The solution MUST be somewhere within the bounds of the device since the fixed points are within the bounds of the device.
The guess is always within the chip by construction (it is between 0 and the W/H of the device). If CG returns a negative number, it means that the solution was very close to zero (imagine all fixed blocks happening to be at 0) and the solver took a step towards 0 and overstepped. Given enough iterations it would step sufficiently close to 0 (from either the negative or positive direction); the capping causes it to stop early on the negative side.
You are correct that CG does not enforce constraints, but there are implicit constraints on the solution imposed by the locations of the fixed blocks (the solution will be within the bounding box of all fixed blocks by problem construction). The CG solver may return a point outside of this bounding box if it did not fully converge.
Regarding checking the width/height: technically these should be checked; however, it is not as big of a problem. The valid locations range over [0, W) for the x dimension and [0, H) for the y dimension (technically [0, W - 1] and [0, H - 1], but we round down any number in (W - 1, W), for example). If CG does not converge and oversteps a solution which happens to be at W - 1, it is unlikely to overstep all the way to W + epsilon and cause a problem.
Long story short, the TODO still holds. This needs to be handled better. It may be possible to mathematically bound how far the step into the negative dimension may be given the iteration we stopped it. I am worried that this solution may allow bugs to slip through.
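As a purely illustrative sketch of the clamping being discussed (not the actual VPR code), a solved coordinate could be guarded on both ends of the device range:

```cpp
#include <cassert>

// Illustrative clamp of a solved coordinate into the device range [0, dim).
// A capped CG solve can stop slightly below 0 (and, much less likely, at or
// above dim), so both ends are guarded here.
double clamp_to_device(double coord, double dim) {
    if (coord < 0.0)
        return 0.0;
    // Values in (dim - 1, dim) are rounded down later anyway, so clamping
    // a high-end overstep to dim - 1 is sufficient.
    if (coord >= dim)
        return dim - 1.0;
    return coord;
}
```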
Overstepping may not be the best word to use here since we do not do the solving step ourselves in our code base. I will clean up the comment.
Thanks a lot for the explanation!
vpr/src/base/read_options.cpp
"Controls which Analytical Solver the Global Placer will use in the AP Flow.\n"
" * qp-hybrid: Solves for a placement that minimizes the quadratic HPWL of the flat placement using a hybrid clique/star net model.\n"
" * lp-b2b: Solves for a placement that minimizes the linear HPWL of the flat placement using the Bound2Bound net model.")
.default_value("lp-b2b")
Given the current tradeoff between runtime and QoR (similar QoR but significantly worse runtime), having B2B as the default option feels a bit odd to me. I’d suggest keeping qp-hybrid as the default until the tuning for lp-b2b is finalized.
Good point. I will reactivate it with my tunings.
@amin1377 Thank you so much for the comments. I have resolved them and I am running CI now. Do you have any further comments? After this is merged I plan to merge in my tunings.
@AlexandreSinger: Looks good to me. I don’t have any further comments. Feel free to merge it if you’ve made all the changes you wanted.
Thanks @amin1377 , I agree that the comments will need to be updated when that paper comes out! But I am not sure about arXiv; we'll see!
The Bound2Bound net model is a method to solve for the linear HPWL objective by iteratively solving a quadratic objective function.
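The core reweighting idea can be sketched as follows. This is a simplified illustration of the published Bound2Bound formulation (it omits the per-net 2/(P-1) scaling), and `b2b_weight` is a hypothetical helper, not VPR's API:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Simplified Bound2Bound reweighting: choosing the weight as 1/|x_i - x_j|
// at the current positions makes the quadratic term w * (x_i - x_j)^2 equal
// the linear distance |x_i - x_j|, so repeatedly re-solving the quadratic
// objective approximates the linear HPWL objective.
double b2b_weight(double xi, double xj, double epsilon = 1e-9) {
    // Guard against (near-)coincident pins, which would otherwise divide by
    // zero; tiny pin distances are one source of numerical stability trouble.
    return 1.0 / std::max(std::abs(xi - xj), epsilon);
}
```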
This method obtains a better-quality post-global-placement flat placement, at the expense of more computation.
I found that this solver also has numerical stability issues, which can prevent the CG solver from converging; it then hits the iteration limit of 2 * the number of moveable blocks, making the algorithm quadratic in the number of blocks in the netlist. To resolve this, I set a custom iteration limit. This works well on our benchmarks but may need to be revisited in the future.
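To illustrate why a capped solve can stop short of the exact answer, here is a toy conjugate-gradient solver on a 2x2 SPD system with a hard iteration cap (illustrative only; the AP flow uses an off-the-shelf CG solver rather than code like this):

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Toy CG on a 2x2 symmetric positive-definite system A x = b with a hard
// iteration cap. In exact arithmetic an n x n SPD system converges in at
// most n steps; capping earlier leaves a partially converged answer.
std::array<double, 2> cg_solve(const std::array<std::array<double, 2>, 2>& A,
                               const std::array<double, 2>& b,
                               int max_iters) {
    std::array<double, 2> x{0.0, 0.0}; // initial guess
    std::array<double, 2> r = b;       // residual b - A*x (x is zero)
    std::array<double, 2> p = r;       // search direction
    double rs_old = r[0] * r[0] + r[1] * r[1];
    for (int it = 0; it < max_iters && rs_old > 1e-12; ++it) {
        const std::array<double, 2> Ap{A[0][0] * p[0] + A[0][1] * p[1],
                                       A[1][0] * p[0] + A[1][1] * p[1]};
        const double alpha = rs_old / (p[0] * Ap[0] + p[1] * Ap[1]);
        for (int i = 0; i < 2; ++i) {
            x[i] += alpha * p[i]; // step along the search direction
            r[i] -= alpha * Ap[i];
        }
        const double rs_new = r[0] * r[0] + r[1] * r[1];
        for (int i = 0; i < 2; ++i)
            p[i] = r[i] + (rs_new / rs_old) * p[i]; // conjugate update
        rs_old = rs_new;
    }
    return x;
}
```

With a cap of 2 iterations the toy system is solved exactly; with a cap of 1 the returned point is still partway between the initial guess and the true solution, which is the same effect that produces slightly negative coordinates in the placer.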
I made B2B the default for the AP flow since I found that, although it takes more time, it achieves a better quality of results over using the hybrid net model.
I slightly tuned APPack to account for the improved global placement.
Quick results on Titan:
Although the 1% wirelength improvement does not seem like much, the 3.5x improvement in post-GP HPWL implies that with some more tuning of APPack we can achieve even better QoR!