
Commit 4f05fe7

[AP][GlobalPlacement] Added Bound2Bound Solver
The Bound2Bound (B2B) net model optimizes the linear HPWL objective by iteratively solving a quadratic objective function. It produces a higher-quality flat placement after global placement, at the cost of more computation. The solver also has numerical stability issues that can prevent the CG solver from converging, in which case it runs until it hits the default iteration limit of 2 * the number of moveable blocks. That makes the algorithm quadratic in the number of blocks in the netlist. To resolve this, a custom iteration limit is set. This seems to work well on our benchmarks but may need to be revisited in the future.
1 parent 53b0cad commit 4f05fe7
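
To illustrate the iteration cap described in the commit message, here is a minimal, self-contained sketch of a Conjugate Gradient loop with an explicit `max_iters` limit. It is not the Eigen solver that VPR uses; the names `conjugate_gradient`, `CgResult`, `Mat`, and `Vec` are hypothetical, and the dense matrix is only for illustration. The point it tries to show is that a solve which never reaches the tolerance costs exactly `max_iters` matrix-vector products, so a limit tied to the number of moveable blocks grows with the netlist, while a fixed limit does not.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>; // dense symmetric positive-definite matrix (illustration only)

// Hypothetical result type: the solution plus how the solve ended.
struct CgResult {
    Vec x;
    unsigned iterations;
    bool converged;
};

static double dot(const Vec& a, const Vec& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

static Vec mat_vec(const Mat& A, const Vec& v) {
    Vec out(A.size(), 0.0);
    for (std::size_t r = 0; r < A.size(); ++r)
        for (std::size_t c = 0; c < v.size(); ++c)
            out[r] += A[r][c] * v[c];
    return out;
}

// Solve A x = b starting from the guess x, stopping after max_iters
// iterations even if the relative residual is still above tol.
CgResult conjugate_gradient(const Mat& A, const Vec& b, Vec x,
                            unsigned max_iters, double tol = 1e-10) {
    Vec r = b;
    const Vec Ax = mat_vec(A, x);
    for (std::size_t i = 0; i < r.size(); ++i) r[i] -= Ax[i];
    Vec p = r;
    double rs_old = dot(r, r);
    const double b_norm = std::sqrt(dot(b, b));
    const double threshold = tol * (b_norm > 0.0 ? b_norm : 1.0);

    unsigned iter = 0;
    while (iter < max_iters && std::sqrt(rs_old) > threshold) {
        const Vec Ap = mat_vec(A, p);
        const double alpha = rs_old / dot(p, Ap);
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rs_new = dot(r, r);
        for (std::size_t i = 0; i < p.size(); ++i)
            p[i] = r[i] + (rs_new / rs_old) * p[i];
        rs_old = rs_new;
        ++iter;
    }
    return {std::move(x), iter, std::sqrt(rs_old) <= threshold};
}
```

A caller can inspect `converged` to decide whether a capped answer is acceptable; with a small `max_iters` the function returns the best iterate found so far rather than failing.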

35 files changed: 1005 additions, 169 deletions

doc/src/vpr/command_line_usage.rst

Lines changed: 30 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -1188,15 +1188,40 @@ Analytical Placement is generally split into three stages:
11881188

11891189
Analytical Placement is experimental and under active development.
11901190

1191-
.. option:: --ap_global_placer {quadratic-bipartitioning-lookahead | quadratic-flowbased-lookahead}
1191+
.. option:: --ap_analytical_solver {qp-hybrid | lp-b2b}
11921192

1193-
Controls which Global Placer to use in the AP Flow.
1193+
Controls which Analytical Solver the Global Placer will use in the AP Flow.
1194+
The Analytical Solver solves for a placement which optimizes some objective
1195+
function, ignorant of the FPGA legality constraints. This provides a "lower-
1196+
bound" solution. The Global Placer will legalize this solution and feed it
1197+
back to the analytical solver to make its solution more legal.
11941198

1195-
* ``quadratic-bipartitioning-lookahead`` Use a Global Placer which uses a quadratic solver and a bi-partitioning lookahead legalizer. Anchor points are used to spread the solved solution to the legalized solution.
1199+
* ``qp-hybrid`` Solves for a placement that minimizes the quadratic HPWL of
1200+
the flat placement using a hybrid clique/star net model. Uses the legalized solution
1201+
as anchor-points to pull the solution to a more legal solution.
11961202

1197-
* ``quadratic-flowbased-lookahead`` Use a Global Placer which uses a quadratic solver and a multi-commodity-flow-based lookahead legalizer. Anchor points are used to spread the solved solution to the legalized solution.
1203+
* ``lp-b2b`` Solves for a placement that minimizes the linear HPWL of the
1204+
flat placement using the Bound2Bound net model. Uses the legalized solution
1205+
as anchor-points to pull the solution to a more legal solution.
11981206

1199-
**Default:** ``quadratic-bipartitioning-lookahead``
1207+
**Default:** ``lp-b2b``
1208+
1209+
.. option:: --ap_partial_legalizer {bipartitioning | flow-based}
1210+
1211+
Controls which Partial Legalizer the Global Placer will use in the AP Flow.
1212+
The Partial Legalizer legalizes a placement generated by an Analytical Solver.
1213+
It is used within the Global Placer to guide the solver to a more legal
1214+
solution.
1215+
1216+
* ``bipartitioning`` Creates minimum windows around over-dense regions of
1217+
the device and bi-partitions the atoms in these windows such that the region
1218+
is no longer over-dense and the atoms are in tiles that they can be placed
1219+
into.
1220+
1221+
* ``flow-based`` Flows atoms from regions that are overfilled to regions that
1222+
are underfilled.
1223+
1224+
**Default:** ``bipartitioning``
12001225

12011226
.. option:: --ap_full_legalizer {naive | appack}
12021227

vpr/src/analytical_place/analytical_placement_flow.cpp

Lines changed: 2 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -130,7 +130,8 @@ static PartialPlacement run_global_placer(const t_ap_opts& ap_opts,
130130
return p_placement;
131131
} else {
132132
// Run the Global Placer
133-
std::unique_ptr<GlobalPlacer> global_placer = make_global_placer(ap_opts.global_placer_type,
133+
std::unique_ptr<GlobalPlacer> global_placer = make_global_placer(ap_opts.analytical_solver_type,
134+
ap_opts.partial_legalizer_type,
134135
ap_netlist,
135136
prepacker,
136137
atom_nlist,

vpr/src/analytical_place/analytical_solver.cpp

Lines changed: 362 additions & 6 deletions
Large diffs are not rendered by default.

vpr/src/analytical_place/analytical_solver.h

Lines changed: 230 additions & 14 deletions
Original file line number | Diff line number | Diff line change
@@ -9,6 +9,7 @@
99
#pragma once
1010

1111
#include <memory>
12+
#include "ap_flow_enums.h"
1213
#include "ap_netlist.h"
1314
#include "device_grid.h"
1415
#include "vtr_strong_id.h"
@@ -31,15 +32,6 @@
3132
class PartialPlacement;
3233
class APNetlist;
3334

34-
/**
35-
* @brief Enumeration of all of the solvers currently implemented in VPR.
36-
*
37-
* NOTE: More are coming.
38-
*/
39-
enum class e_analytical_solver {
40-
QP_HYBRID // A solver which optimizes the quadratic HPWL of the design.
41-
};
42-
4335
/**
4436
* @brief A strong ID for the rows in a matrix used during solving.
4537
*
@@ -68,7 +60,7 @@ class AnalyticalSolver {
6860
* Initializes the internal data members of the base class which are useful
6961
* for all solvers.
7062
*/
71-
AnalyticalSolver(const APNetlist& netlist);
63+
AnalyticalSolver(const APNetlist& netlist, int log_verbosity);
7264

7365
/**
7466
* @brief Run an iteration of the solver using the given partial placement
@@ -90,6 +82,14 @@ class AnalyticalSolver {
9082
*/
9183
virtual void solve(unsigned iteration, PartialPlacement& p_placement) = 0;
9284

85+
/**
86+
* @brief Print statistics on the analytical solver.
87+
*
88+
* This is expected to be called after global placement to collect cumulative
89+
* information on how the solver performed.
90+
*/
91+
virtual void print_statistics() = 0;
92+
9393
protected:
9494
/// @brief The APNetlist the solver is optimizing over. It is implied that
9595
/// the netlist is not being modified during global placement.
@@ -112,14 +112,18 @@ class AnalyticalSolver {
112112
/// APBlock it represents. Useful when getting the results from the
113113
/// solver.
114114
vtr::vector<APRowId, APBlockId> row_id_to_blk_id_;
115+
116+
/// @brief The verbosity of log messages in the Analytical Solver.
117+
int log_verbosity_;
115118
};
116119

117120
/**
118121
* @brief A factory method which creates an Analytical Solver of the given type.
119122
*/
120-
std::unique_ptr<AnalyticalSolver> make_analytical_solver(e_analytical_solver solver_type,
123+
std::unique_ptr<AnalyticalSolver> make_analytical_solver(e_ap_analytical_solver solver_type,
121124
const APNetlist& netlist,
122-
const DeviceGrid& device_grid);
125+
const DeviceGrid& device_grid,
126+
int log_verbosity);
123127

124128
// The Eigen library is used to solve matrix equations in the following solvers.
125129
// The solver cannot be built if Eigen is not installed.
@@ -263,14 +267,19 @@ class QPHybridSolver : public AnalyticalSolver {
263267
/// @brief The current guess for the y positions of the blocks.
264268
Eigen::VectorXd guess_y;
265269

270+
/// @brief The total number of CG iterations this solver has performed so far.
271+
unsigned total_num_cg_iters_ = 0;
272+
266273
public:
267274
/**
268275
* @brief Constructor of the QPHybridSolver
269276
*
270277
* Initializes internal data and constructs the initial linear system.
271278
*/
272-
QPHybridSolver(const APNetlist& netlist, const DeviceGrid& device_grid)
273-
: AnalyticalSolver(netlist) {
279+
QPHybridSolver(const APNetlist& netlist,
280+
const DeviceGrid& device_grid,
281+
int log_verbosity)
282+
: AnalyticalSolver(netlist, log_verbosity) {
274283
// Initializing the linear system only depends on the netlist and fixed
275284
// block locations. Both are provided by the netlist, allowing this to
276285
// be initialized in the constructor.
@@ -301,6 +310,213 @@ class QPHybridSolver : public AnalyticalSolver {
301310
* this object.
302311
*/
303312
void solve(unsigned iteration, PartialPlacement& p_placement) final;
313+
314+
/**
315+
* @brief Print statistics of the solver.
316+
*/
317+
void print_statistics() final;
318+
};
319+
320+
/**
321+
* @brief An Analytical Solver which tries to minimize the linear HPWL objective:
322+
* SUM((xmax - xmin) + (ymax - ymin)) over all nets.
323+
*
324+
* This is implemented using the Bound2Bound method, which iteratively sets up a
325+
* linear system of equations (similar to the QP Hybrid approach above) which
326+
* solves a quadratic objective function. For a net model, each block connects
327+
* to the current bounding blocks in the given dimension and the weight of this
328+
* connection is inversely proportional to the distance of the block to the bound.
329+
* After minimizing this system, the bounds are likely to change; so the system
330+
* needs to be reconstructed and solved iteratively.
331+
*
332+
* This technique was proposed in Kraftwerk2, where they proved that the B2B Net
333+
* Model will, in theory, converge on the linear HPWL solution.
334+
* https://doi.org/10.1109/TCAD.2008.925783
335+
*/
336+
class B2BSolver : public AnalyticalSolver {
337+
private:
338+
/**
339+
* @brief Enumeration for different initial placements that this class can
340+
* perform in the first iteration.
341+
*/
342+
enum class e_initial_placement_type {
343+
RandomNormal, ///< Randomly distribute blocks over the grid using a normal distribution.
344+
RandomUniform, ///< Randomly distribute blocks over the grid using a uniform distribution.
345+
LeastDense ///< Randomly place blocks as a uniform grid over the device.
346+
};
347+
348+
/// @brief Which initial placement algorithm to use in the first iteration.
349+
/// In the first iteration, we need some solution to initialize the
350+
/// bounds. Some papers have found that setting it to a random
351+
/// initial placement is the best approach.
352+
static constexpr e_initial_placement_type initial_placement_ty_ = e_initial_placement_type::LeastDense;
353+
354+
/// @brief Since the weights in the B2B model divide by the distance between
355+
/// blocks and their bounds, that distance may get very very close to
356+
/// 0. This causes the weight matrix to become numerically unstable.
357+
/// We can guard against this by clamping the distance to not be smaller
358+
/// than some epsilon.
359+
/// Decreasing this number may lead to more instability, but can yield
360+
/// a higher quality solution.
361+
static constexpr double distance_epsilon_ = 0.5;
362+
363+
/// @brief Max number of bound update / solve iterations. Increasing this
364+
/// number will yield better quality at the expense of runtime.
365+
static constexpr unsigned max_num_bound_updates_ = 6;
366+
367+
/// @brief Max number of iterations the Conjugate Gradient solver can perform.
368+
/// Due to the weights getting very large in the early iterations of
369+
/// Global Placement, the CG solver may take a very long time to
370+
/// converge; but the solution quality will not change much. By
371+
/// default the max iteration is set to 2 * num_moveable_blocks;
372+
/// which causes the first iteration of B2B to become quadratic in the
373+
/// number of moveable blocks if it cannot converge. Found through
374+
/// experimentation that this can be clamped to a much smaller number
375+
/// to prevent this behaviour and get good runtime.
376+
// TODO: Need to investigate this more to find a good number for this.
377+
// TODO: Should this be a proportion of the design size?
378+
static constexpr unsigned max_cg_iterations_ = 200;
379+
380+
// The following constants are used to configure the anchor weighting.
381+
// The weights of anchors grow exponentially each iteration by the following
382+
// function:
383+
// anchor_w = anchor_weight_mult_ * e^(iter / anchor_weight_exp_fac_)
384+
// The numbers below were empirically found to work well.
385+
386+
/// @brief Multiplier for the anchor weight. The smaller this number is, the
387+
/// weaker the anchors will be at the start.
388+
static constexpr double anchor_weight_mult_ = 0.01;
389+
390+
/// @brief Factor for controlling the growth of the exponential term in the
391+
/// weight factor function. Larger numbers will cause the anchor
392+
/// weights to grow slower.
393+
static constexpr double anchor_weight_exp_fac_ = 5.0;
394+
395+
public:
396+
B2BSolver(const APNetlist& ap_netlist,
397+
const DeviceGrid& device_grid,
398+
int log_verbosity)
399+
: AnalyticalSolver(ap_netlist, log_verbosity)
400+
, device_grid_width_(device_grid.width())
401+
, device_grid_height_(device_grid.height()) {}
402+
403+
/**
404+
* @brief Perform an iteration of the B2B solver, storing the result into
405+
* the partial placement object passed in.
406+
*
407+
* In the first iteration (iteration = 0), the partial placement object will
408+
* be ignored, and a random initial placement will be used to initially
409+
* construct the system of equations. In all other iterations, the previous
410+
* solved solution will be used.
411+
*
412+
* The B2B solver will then iteratively solve the system of equations and
413+
* update the system to achieve a good HPWL solution which is close to the
414+
* linear HPWL solution. Due to numerical issues with this algorithm, we will
415+
* likely not converge on the true minimum HPWL solution, but it should be
416+
* close.
417+
*
418+
* See the base class for more information.
419+
*
420+
* @param iteration
421+
* The current iteration of the Global Placer
422+
* @param p_placement
423+
* A "guess" solution. The result will be written into this object.
424+
* In all iterations other than the first, this solution will be used
425+
* as anchor-points in the system.
426+
*/
427+
void solve(unsigned iteration, PartialPlacement& p_placement) final;
428+
429+
/**
430+
* @brief Print overall statistics on this solver.
431+
*
432+
* This is expected to be called after all iterations of Global Placement
433+
* have completed.
434+
*/
435+
void print_statistics() final;
436+
437+
private:
438+
/**
439+
* @brief Run the B2B outer solving loop.
440+
*
441+
* The placement in p_placement should be initialized with the initial
442+
* positions of the blocks that the B2B algorithm should use to build the
443+
* first system of equations. This placement will be iteratively updated
444+
* with better and better solutions as B2B iterates.
445+
*
446+
* If iteration is 0, no anchor-blocks will be added to the system, otherwise
447+
* the solution in block_locs_legalized will be used as anchor-blocks.
448+
*/
449+
void b2b_solve_loop(unsigned iteration, PartialPlacement& p_placement);
450+
451+
/**
452+
* @brief Randomly distributes AP blocks using a normal distribution.
453+
*/
454+
void initialize_placement_random_normal(PartialPlacement& p_placement);
455+
456+
/**
457+
* @brief Randomly distributes AP blocks using a uniform distribution.
458+
*/
459+
void initialize_placement_random_uniform(PartialPlacement& p_placement);
460+
461+
/**
462+
* @brief Randomly distributes AP blocks as a uniform grid.
463+
*/
464+
void initialize_placement_least_dense(PartialPlacement& p_placement);
465+
466+
/**
467+
* @brief Initializes the linear system with the given partial placement.
468+
*
469+
* This will set the connectivity matrices (A) and constant vectors (b) to
470+
* be solved by B2B.
471+
*/
472+
void init_linear_system(PartialPlacement& p_placement);
473+
474+
/**
475+
* @brief Updates the linear system with anchor-blocks from the legalized
476+
* solution.
477+
*/
478+
void update_linear_system_with_anchors(PartialPlacement& p_placement,
479+
unsigned iteration);
480+
481+
// The following are variables used to store the system of equations to be
482+
// solved in the x and y dimensions. The equations are of the form:
483+
// Ax = b
484+
// There are two sets of matrices and vectors since the x and y dimensions
485+
// of the objective are independent and can be solved separately.
486+
// These are updated each iteration of the B2B loop.
487+
488+
/// @brief The coefficient / connectivity matrix for the x dimension.
489+
Eigen::SparseMatrix<double> A_sparse_x;
490+
/// @brief The coefficient / connectivity matrix for the y dimension.
491+
Eigen::SparseMatrix<double> A_sparse_y;
492+
/// @brief The constant vector in the x dimension.
493+
Eigen::VectorXd b_x;
494+
/// @brief The constant vector in the y dimension.
495+
Eigen::VectorXd b_y;
496+
497+
// The following is the solution of the previous iteration of this solver.
498+
// They are updated at the end of solve() and are used as the starting point
499+
// for the next call to solve.
500+
vtr::vector<APBlockId, double> block_x_locs_solved;
501+
vtr::vector<APBlockId, double> block_y_locs_solved;
502+
503+
// The following are the legalized solution coming into the analytical solver
504+
// (other than the first iteration). These are stored to be used as anchor
505+
// blocks during the solver.
506+
vtr::vector<APBlockId, double> block_x_locs_legalized;
507+
vtr::vector<APBlockId, double> block_y_locs_legalized;
508+
509+
/// @brief The width of the device grid. Used for randomly generating points
510+
/// on the grid.
511+
size_t device_grid_width_;
512+
/// @brief The height of the device grid. Used for randomly generating points
513+
/// on the grid.
514+
size_t device_grid_height_;
515+
516+
/// @brief The total number of CG iterations that this solver has performed
517+
/// so far. This can be a useful metric for the amount of work the
518+
/// solver performs.
519+
unsigned total_num_cg_iters_ = 0;
304520
};
305521

306522
#endif // EIGEN_INSTALLED
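
The net model and anchor schedule documented in the header above can be sketched in one dimension as follows. This is an illustration under stated assumptions, not the code in `analytical_solver.cpp` (whose diff is not rendered on this page): the weight formula `2 / ((P - 1) * distance)` is assumed from the Kraftwerk2 formulation of B2B, and the helper names `b2b_weight`, `build_b2b_connections`, `quadratic_cost`, and `anchor_weight` are hypothetical. Only the epsilon clamp and the expression `anchor_weight_mult_ * e^(iter / anchor_weight_exp_fac_)` come directly from the header comments.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// One two-pin connection of the B2B net model (indices into the pin list).
struct B2BConnection {
    std::size_t pin_a;
    std::size_t pin_b;
    double weight;
};

// Assumed Kraftwerk2-style weight. The distance is clamped to epsilon so the
// weight stays bounded when a block sits (almost) on top of a bound.
double b2b_weight(std::size_t num_pins, double distance, double epsilon) {
    const double clamped = std::max(std::abs(distance), epsilon);
    return 2.0 / (static_cast<double>(num_pins - 1) * clamped);
}

// Connect the two bounding pins to each other and every other pin to both
// bounds, using the current positions in a single dimension.
std::vector<B2BConnection> build_b2b_connections(const std::vector<double>& pos,
                                                 double epsilon) {
    std::vector<B2BConnection> conns;
    const std::size_t n = pos.size();
    if (n < 2) return conns;
    std::size_t lo = static_cast<std::size_t>(
        std::min_element(pos.begin(), pos.end()) - pos.begin());
    std::size_t hi = static_cast<std::size_t>(
        std::max_element(pos.begin(), pos.end()) - pos.begin());
    if (lo == hi) hi = (lo + 1) % n; // all pins coincide: pick any other pin

    conns.push_back({lo, hi, b2b_weight(n, pos[hi] - pos[lo], epsilon)});
    for (std::size_t i = 0; i < n; ++i) {
        if (i == lo || i == hi) continue;
        conns.push_back({i, lo, b2b_weight(n, pos[i] - pos[lo], epsilon)});
        conns.push_back({i, hi, b2b_weight(n, pos[i] - pos[hi], epsilon)});
    }
    return conns;
}

// Quadratic objective 1/2 * sum(w * d^2) evaluated at the given positions.
double quadratic_cost(const std::vector<double>& pos,
                      const std::vector<B2BConnection>& conns) {
    double cost = 0.0;
    for (const B2BConnection& c : conns) {
        const double d = pos[c.pin_a] - pos[c.pin_b];
        cost += 0.5 * c.weight * d * d;
    }
    return cost;
}

// Anchor weight schedule from the header comment: mult * e^(iter / exp_fac).
double anchor_weight(unsigned iteration, double mult, double exp_fac) {
    return mult * std::exp(static_cast<double>(iteration) / exp_fac);
}
```

With these weights, the quadratic cost evaluated at the placement the weights were built from equals the net's span (max minus min) as long as no distance is clamped. That is why the quadratic system approximates linear HPWL only locally and has to be rebuilt once the blocks move.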

vpr/src/analytical_place/ap_flow_enums.h

Lines changed: 19 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -8,15 +8,27 @@
88
#pragma once
99

1010
/**
11-
* @brief The type of a Global Placer.
11+
* @brief The type of an Analytical Solver.
1212
*
13-
* The Analytical Placement flow may implement different Global Placers. This
14-
* enum can select between these different Global Placers.
13+
* The Analytical Placement flow may implement different Analytical Solvers as
14+
* part of the Global Placer. This enum can select between these different
15+
* Analytical Solvers.
1516
*/
16-
enum class e_ap_global_placer {
17-
// Global placers based on the the SimPL paper.
18-
SimPL_BiParitioning, ///< Global Placer based on the SimPL technique to Global Placement. Uses a quadratic solver and a bi-partitioning Partial Legalizer.
19-
SimPL_FlowBased ///< Global Placer based on the SimPL technique to Global Placement. Uses a quadratic solver and a multi-commodity-flow-baed Partial Legalizer.
17+
enum class e_ap_analytical_solver {
18+
QP_Hybrid, ///< Analytical Solver which uses the hybrid net model to optimize the quadratic HPWL objective.
19+
LP_B2B ///< Analytical Solver which uses the B2B net model to optimize the linear HPWL objective.
20+
};
21+
22+
/**
23+
* @brief The type of a Partial Legalizer.
24+
*
25+
* The Analytical Placement flow may implement different Partial Legalizers as
26+
* part of the Global Placer. This enum can select between these different
27+
* Partial Legalizers.
28+
*/
29+
enum class e_ap_partial_legalizer {
30+
BiPartitioning, ///< Partial Legalizer which forms minimum windows around dense regions and uses bipartitioning to spread blocks over windows.
31+
FlowBased ///< Partial Legalizer which flows blocks from overfilled bins to underfilled bins.
2032
};
2133

2234
/**
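
As a rough illustration of how the two new command-line options relate to the two enums in this file, the sketch below maps the option strings listed in the documentation change (`qp-hybrid`, `lp-b2b`, `bipartitioning`, `flow-based`) to enum values. The enums are copied from the diff; the functions `parse_analytical_solver` and `parse_partial_legalizer` are hypothetical and are not VPR's actual option-parsing code, which is not shown on this page.

```cpp
#include <string>

// Copied from the diff of ap_flow_enums.h.
enum class e_ap_analytical_solver {
    QP_Hybrid,
    LP_B2B
};

enum class e_ap_partial_legalizer {
    BiPartitioning,
    FlowBased
};

// Hypothetical helper: convert an --ap_analytical_solver value to its enum.
// Returns false and leaves `out` untouched if the name is not recognized.
bool parse_analytical_solver(const std::string& name, e_ap_analytical_solver& out) {
    if (name == "qp-hybrid") {
        out = e_ap_analytical_solver::QP_Hybrid;
        return true;
    }
    if (name == "lp-b2b") {
        out = e_ap_analytical_solver::LP_B2B;
        return true;
    }
    return false;
}

// Hypothetical helper: convert an --ap_partial_legalizer value to its enum.
// Returns false and leaves `out` untouched if the name is not recognized.
bool parse_partial_legalizer(const std::string& name, e_ap_partial_legalizer& out) {
    if (name == "bipartitioning") {
        out = e_ap_partial_legalizer::BiPartitioning;
        return true;
    }
    if (name == "flow-based") {
        out = e_ap_partial_legalizer::FlowBased;
        return true;
    }
    return false;
}
```

Splitting the old single `e_ap_global_placer` choice into two independent enums means any solver can be paired with any partial legalizer, which a single combined enum could only express by listing every combination.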
