Skip to content

Add raw setup slack analysis to placement quench #1501

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Sep 9, 2020
Merged
Show file tree
Hide file tree
Changes from 16 commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
f4ea4a1
Added interface for mapping between CLB pins and setup slacks. Refact…
Bill-hbrhbr Jul 23, 2020
c024603
Refactored criticalities update in place.cpp and added setup slacks u…
Bill-hbrhbr Jul 23, 2020
cb6e9a6
Fixe up format and compilation errors
Bill-hbrhbr Jul 23, 2020
63db2e1
Merged 3 update routines into 1 single routine
Bill-hbrhbr Jul 31, 2020
aa7b233
Resolve merge conflicts
Bill-hbrhbr Jul 31, 2020
e8f73c6
Resolve more merge conflicts
Bill-hbrhbr Jul 31, 2020
d80de58
Changed crit_exponent to first_crit_exponent/state.crit_exponent
Bill-hbrhbr Jul 31, 2020
831df44
Created a setup slack matrix that copies data from the PlacerSetupSla…
Bill-hbrhbr Aug 6, 2020
d329911
Provided more complete explanation for the record_setup_slacks routine.
Bill-hbrhbr Aug 6, 2020
225870e
Added placement snapshot functions that facilitates the reversion of …
Bill-hbrhbr Aug 6, 2020
9056301
Merge the master branch into PlacerSetupSlacks (updating vtr flow
Bill-hbrhbr Aug 6, 2020
0e01ed7
Implemented do_setup_slack_cost_analysis: softmax of negative slacks
Bill-hbrhbr Aug 6, 2020
2e212dc
Added single move reversion for setup slack analysis(rather than taki…
Bill-hbrhbr Aug 11, 2020
96e65ba
Corrected the timing update and reversion of setup slack analysis dur…
Bill-hbrhbr Aug 14, 2020
112bde5
Moved four boolean global variables controlling the timing update int…
Bill-hbrhbr Aug 14, 2020
29b55a3
Added vpr option --place_quench_metric to turn on/off setup slack ana…
Bill-hbrhbr Aug 20, 2020
f641b66
Added the option --place_quench_metric to VPR documentation on VPR co…
Bill-hbrhbr Aug 28, 2020
d88a38d
Moved timing_update_mode boolean variables into PlacerSetupSlacks and…
Bill-hbrhbr Aug 28, 2020
df9db48
Removed quench metric option.
Bill-hbrhbr Sep 1, 2020
9b01e1f
Changed PATH_TIMING_DRIVEN_PLACE to CRITICALITY_TIMING_PLACE, and SET…
Bill-hbrhbr Sep 1, 2020
e0b70c4
Changed e_place_algorithm to t_place_algorithm, a wrapper class that …
Bill-hbrhbr Sep 1, 2020
cd79ff5
Added --place_quench_algorithm option to VPR options. Specifies the p…
Bill-hbrhbr Sep 1, 2020
dd2cfe2
Utilized the is_timing_driven() method provided by the t_place_algori…
Bill-hbrhbr Sep 1, 2020
270d1ef
Added regression test strong_place_quench_slack to check if running S…
Bill-hbrhbr Sep 1, 2020
23e8782
Fixed slack cost routine bug and updated golden results. Added code d…
Bill-hbrhbr Sep 4, 2020
f6e809e
Code review update: enhance more documentations. Added function level…
Bill-hbrhbr Sep 6, 2020
e3d7d65
Removed find_affected_sink_pins() to eliminate the placer runtime inc…
Bill-hbrhbr Sep 8, 2020
06c9cf6
Merge branch 'master' into QuenchSlackDraft
Bill-hbrhbr Sep 8, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions vpr/src/base/SetupVPR.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -570,6 +570,8 @@ static void SetupPlacerOpts(const t_options& Options, t_placer_opts* PlacerOpts)

PlacerOpts->effort_scaling = Options.place_effort_scaling;
PlacerOpts->timing_update_type = Options.timing_update_type;

PlacerOpts->place_quench_metric = Options.place_quench_metric;
}

static void SetupAnalysisOpts(const t_options& Options, t_analysis_opts& analysis_opts) {
Expand Down
46 changes: 46 additions & 0 deletions vpr/src/base/read_options.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -959,6 +959,41 @@ struct ParseTimingUpdateType {
}
};

struct ParsePlaceQuenchMetric {
ConvertedValue<e_place_quench_metric> from_str(std::string str) {
ConvertedValue<e_place_quench_metric> conv_value;
if (str == "auto")
conv_value.set_value(e_place_quench_metric::AUTO);
else if (str == "timing_cost")
conv_value.set_value(e_place_quench_metric::TIMING_COST);
else if (str == "setup_slack")
conv_value.set_value(e_place_quench_metric::SETUP_SLACK);
else {
std::stringstream msg;
msg << "Invalid conversion from '" << str << "' to e_place_quench_metric (expected one of: " << argparse::join(default_choices(), ", ") << ")";
conv_value.set_error(msg.str());
}
return conv_value;
}

ConvertedValue<std::string> to_str(e_place_quench_metric val) {
ConvertedValue<std::string> conv_value;
if (val == e_place_quench_metric::AUTO)
conv_value.set_value("auto");
if (val == e_place_quench_metric::TIMING_COST)
conv_value.set_value("timing_cost");
else {
VTR_ASSERT(val == e_place_quench_metric::SETUP_SLACK);
conv_value.set_value("setup_slack");
}
return conv_value;
}

std::vector<std::string> default_choices() {
return {"auto", "timing_cost", "setup_slack"};
}
};

argparse::ArgumentParser create_arg_parser(std::string prog_name, t_options& args) {
std::string description =
"Implements the specified circuit onto the target FPGA architecture"
Expand Down Expand Up @@ -1747,6 +1782,17 @@ argparse::ArgumentParser create_arg_parser(std::string prog_name, t_options& arg
.default_value("")
.show_in(argparse::ShowIn::HELP_ONLY);

place_timing_grp.add_argument<e_place_quench_metric, ParsePlaceQuenchMetric>(args.place_quench_metric, "--place_quench_metric")
.help(
"Controls which cost function the placer uses during the quench stage:\n"
" * auto: VPR decides\n"
" * timing_cost: The same cost formulation as the one used during\n"
" the annealing stage (more stable)\n"
" * setup_slack: Directly uses setup slacks (in combination with wiring)\n"
" to check if the block moves should be accepted\n")
.default_value("auto")
.show_in(argparse::ShowIn::HELP_ONLY);

auto& route_grp = parser.add_argument_group("routing options");

route_grp.add_argument(args.max_router_iterations, "--max_router_iterations")
Expand Down
1 change: 1 addition & 0 deletions vpr/src/base/read_options.h
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,7 @@ struct t_options {
argparse::ArgValue<PlaceDelayModelType> place_delay_model;
argparse::ArgValue<e_reducer> place_delay_model_reducer;
argparse::ArgValue<std::string> allowed_tiles_for_delay_model;
argparse::ArgValue<e_place_quench_metric> place_quench_metric;

/* Router Options */
argparse::ArgValue<bool> check_rr_graph;
Expand Down
10 changes: 9 additions & 1 deletion vpr/src/base/vpr_types.h
Original file line number Diff line number Diff line change
Expand Up @@ -851,7 +851,8 @@ struct t_annealing_sched {
* doPlacement: true if placement is supposed to be done in the CAD flow, false otherwise */
enum e_place_algorithm {
BOUNDING_BOX_PLACE,
PATH_TIMING_DRIVEN_PLACE
PATH_TIMING_DRIVEN_PLACE,
SETUP_SLACK_ANALYSIS_PLACE
};

enum e_place_effort_scaling {
Expand Down Expand Up @@ -884,6 +885,12 @@ enum class e_place_delta_delay_algorithm {
DIJKSTRA_EXPANSION,
};

enum class e_place_quench_metric {
TIMING_COST,
SETUP_SLACK,
AUTO
};

struct t_placer_opts {
enum e_place_algorithm place_algorithm;
float timing_tradeoff;
Expand Down Expand Up @@ -932,6 +939,7 @@ struct t_placer_opts {
std::string allowed_tiles_for_delay_model;

e_place_delta_delay_algorithm place_delta_delay_matrix_calculation_method;
e_place_quench_metric place_quench_metric;
};

/* All the parameters controlling the router's operation are in this *
Expand Down
691 changes: 511 additions & 180 deletions vpr/src/place/place.cpp

Large diffs are not rendered by default.

174 changes: 126 additions & 48 deletions vpr/src/place/timing_place.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,9 @@

#include "timing_info.h"

//Use an incremental approach to updaing criticalities?
//Use an incremental approach to updating criticalities and setup slacks?
constexpr bool INCR_UPDATE_CRITICALITIES = true;
constexpr bool INCR_UPDATE_SETUP_SLACKS = true;

/**************************************/

Expand All @@ -27,58 +28,29 @@ PlacerCriticalities::PlacerCriticalities(const ClusteredNetlist& clb_nlist, cons
, timing_place_crit_(make_net_pins_matrix(clb_nlist_, std::numeric_limits<float>::quiet_NaN())) {
}

/**************************************/
void PlacerCriticalities::update_criticalities(const SetupTimingInfo* timing_info, float crit_exponent) {
/* Performs a 1-to-1 mapping from criticality to timing_place_crit_.
* For every pin on every net (or, equivalently, for every tedge ending
* in that pin), timing_place_crit_ = criticality^(criticality exponent) */
void PlacerCriticalities::update_criticalities(const SetupTimingInfo* timing_info, float crit_exponent, bool recompute) {
//If the criticalities are not updated immediately after each time we call
//timing_info->update(), then timing_info->pins_with_modified_setup_criticality()
//cannot accurately account for all the pins that need to be updated.
//In this case, we pass in recompute=true to update all criticalities from scratch.
//
//If the criticality exponent has changed, we also need to update from scratch.

//Determine what pins need updating
if (INCR_UPDATE_CRITICALITIES) {
cluster_pins_with_modified_criticality_.clear();
if (crit_exponent != last_crit_exponent_) {
//Criticality exponent changed, must re-calculate criticalities for *all* sink pins
for (ClusterNetId net_id : clb_nlist_.nets()) {
for (ClusterPinId pin_id : clb_nlist_.net_sinks(net_id)) {
cluster_pins_with_modified_criticality_.insert(pin_id);
}
}

//Record new criticality exponent
last_crit_exponent_ = crit_exponent;
} else {
//Criticality exponent unchanged
//
//Collect the cluster pins which need to be updated based on the latest timing
//analysis
//
//Note we use the set of pins reported by the *timing_info* as having modified
//criticality, rather than those marked as modified by the timing analyzer.
//Since timing_info uses shifted/relaxed criticality (which depends on max
//required time and worst case slacks), additional nodes may be modified
//when updating the atom pin criticalities.

for (AtomPinId atom_pin : timing_info->pins_with_modified_setup_criticality()) {
ClusterPinId clb_pin = pin_lookup_.connected_clb_pin(atom_pin);

//Some atom pins correspond to connections which are completely
//contained within a cluster, and hence have no corresponding
//clustered pin.
if (!clb_pin) continue;

cluster_pins_with_modified_criticality_.insert(clb_pin);
}
}
if (!recompute && crit_exponent == last_crit_exponent_ && INCR_UPDATE_CRITICALITIES) {
incr_update_criticalities(timing_info);
} else {
//Non-incremental: all pins and nets need updating
for (ClusterNetId net_id : clb_nlist_.nets()) {
for (ClusterPinId pin_id : clb_nlist_.net_sinks(net_id)) {
cluster_pins_with_modified_criticality_.insert(pin_id);
}
}
recompute_criticalities();

//Record new criticality exponent
last_crit_exponent_ = crit_exponent;
}

//Update the effected pins
/* Performs a 1-to-1 mapping from criticality to timing_place_crit_.
* For every pin on every net (or, equivalently, for every tedge ending
* in that pin), timing_place_crit_ = criticality^(criticality exponent) */

// Update the effected pins
for (ClusterPinId clb_pin : cluster_pins_with_modified_criticality_) {
ClusterNetId clb_net = clb_nlist_.pin_net(clb_pin);
int pin_index_in_net = clb_nlist_.pin_net_index(clb_pin);
Expand All @@ -92,6 +64,41 @@ void PlacerCriticalities::update_criticalities(const SetupTimingInfo* timing_inf
}
}

void PlacerCriticalities::incr_update_criticalities(const SetupTimingInfo* timing_info) {
cluster_pins_with_modified_criticality_.clear();

//Collect the cluster pins which need to be updated based on the latest timing
//analysis
//
//Note we use the set of pins reported by the *timing_info* as having modified
//criticality, rather than those marked as modified by the timing analyzer.
//Since timing_info uses shifted/relaxed criticality (which depends on max
//required time and worst case slacks), additional nodes may be modified
//when updating the atom pin criticalities.

for (AtomPinId atom_pin : timing_info->pins_with_modified_setup_criticality()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code and the code to find the pins_with_modified_setup_slack is almost the same, except one uses the timing_info->pins_with_modified_setup_criticality() and the other uses timing_info->pins_with_modified_setup_slack().

  1. Do the timing_info->pins_with_modified_setup_criticality() and pins_with_modified_setup_slack() differ? Why? I expect they are close to the same; could we just use whichever one is the bigger set (maybe criticality, due to slack shifting) to track both slack and criticality pins that need updates? Would simplify the code and likely make it faster if we are using both criticalities and slacks, if indeed the pin update lists are usually the same (less work and less memory footprint).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can just use the bigger of pins_with_modified_criticality_ or pins_with_modified_setup_slack, some data members and some duplicated code can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recommend doing this since they are pre-computed (in slack_evaluation.cpp) and are not necessarily the same. The efforts that go into merging these two sets are not worth it in my opinion, and it might make the code less readable. I agree with your feeling that a lot of codes are duplicated, but I don't have a good way of resolving the issue without sacrificing code readability.

ClusterPinId clb_pin = pin_lookup_.connected_clb_pin(atom_pin);

//Some atom pins correspond to connections which are completely
//contained within a cluster, and hence have no corresponding
//clustered pin.
if (!clb_pin) continue;

cluster_pins_with_modified_criticality_.insert(clb_pin);
}
}

void PlacerCriticalities::recompute_criticalities() {
cluster_pins_with_modified_criticality_.clear();

//Non-incremental: all sink pins need updating
for (ClusterNetId net_id : clb_nlist_.nets()) {
for (ClusterPinId pin_id : clb_nlist_.net_sinks(net_id)) {
cluster_pins_with_modified_criticality_.insert(pin_id);
}
}
}

void PlacerCriticalities::set_criticality(ClusterNetId net_id, int ipin, float val) {
timing_place_crit_[net_id][ipin] = val;
}
Expand All @@ -100,6 +107,77 @@ PlacerCriticalities::pin_range PlacerCriticalities::pins_with_modified_criticali
return vtr::make_range(cluster_pins_with_modified_criticality_);
}

/**************************************/

/* Allocates space for the timing_place_setup_slacks_ data structure */
PlacerSetupSlacks::PlacerSetupSlacks(const ClusteredNetlist& clb_nlist, const ClusteredPinAtomPinsLookup& netlist_pin_lookup)
: clb_nlist_(clb_nlist)
, pin_lookup_(netlist_pin_lookup)
, timing_place_setup_slacks_(make_net_pins_matrix(clb_nlist_, std::numeric_limits<float>::quiet_NaN())) {
}

void PlacerSetupSlacks::update_setup_slacks(const SetupTimingInfo* timing_info, bool recompute) {
//If the setup slacks are not updated immediately after each time we call
//timing_info->update(), then timing_info->pins_with_modified_setup_slack()
//cannot accurately account for all the pins that need to be updated.
//In this case, we pass in recompute=true to update all setup slacks from scratch.
if (!recompute && INCR_UPDATE_SETUP_SLACKS) {
incr_update_setup_slacks(timing_info);
} else {
recompute_setup_slacks();
}

//Update the effected pins
for (ClusterPinId clb_pin : cluster_pins_with_modified_setup_slack_) {
ClusterNetId clb_net = clb_nlist_.pin_net(clb_pin);
int pin_index_in_net = clb_nlist_.pin_net_index(clb_pin);

float clb_pin_setup_slack = calculate_clb_net_pin_setup_slack(*timing_info, pin_lookup_, clb_pin);

timing_place_setup_slacks_[clb_net][pin_index_in_net] = clb_pin_setup_slack;
}
}

void PlacerSetupSlacks::incr_update_setup_slacks(const SetupTimingInfo* timing_info) {
cluster_pins_with_modified_setup_slack_.clear();

//Collect the cluster pins which need to be updated based on the latest timing analysis
//
//Note we use the set of pins reported by the *timing_info* as having modified
//setup slacks, rather than those marked as modified by the timing analyzer.
for (AtomPinId atom_pin : timing_info->pins_with_modified_setup_slack()) {
ClusterPinId clb_pin = pin_lookup_.connected_clb_pin(atom_pin);

//Some atom pins correspond to connections which are completely
//contained within a cluster, and hence have no corresponding
//clustered pin.
if (!clb_pin) continue;

cluster_pins_with_modified_setup_slack_.insert(clb_pin);
}
}

void PlacerSetupSlacks::recompute_setup_slacks() {
cluster_pins_with_modified_setup_slack_.clear();

//Non-incremental: all sink pins need updating
for (ClusterNetId net_id : clb_nlist_.nets()) {
for (ClusterPinId pin_id : clb_nlist_.net_sinks(net_id)) {
cluster_pins_with_modified_setup_slack_.insert(pin_id);
}
}
}

void PlacerSetupSlacks::set_setup_slack(ClusterNetId net_id, int ipin, float val) {
timing_place_setup_slacks_[net_id][ipin] = val;
}

PlacerSetupSlacks::pin_range PlacerSetupSlacks::pins_with_modified_setup_slack() const {
return vtr::make_range(cluster_pins_with_modified_setup_slack_);
}

/**************************************/

std::unique_ptr<PlaceDelayModel> alloc_lookups_and_criticalities(t_chan_width_dist chan_width_dist,
const t_placer_opts& placer_opts,
const t_router_opts& router_opts,
Expand Down
68 changes: 66 additions & 2 deletions vpr/src/place/timing_place.h
Original file line number Diff line number Diff line change
Expand Up @@ -63,9 +63,9 @@ class PlacerCriticalities {
pin_range pins_with_modified_criticality() const;

public: //Modifiers
//Incrementally updates criticalities based on the atom netlist criticalitites provied by
//Updates criticalities based on the atom netlist criticalitites provided by
//timing_info and the provided criticality_exponent.
void update_criticalities(const SetupTimingInfo* timing_info, float criticality_exponent);
void update_criticalities(const SetupTimingInfo* timing_info, float criticality_exponent, bool recompute);

//Override the criticality of a particular connection
void set_criticality(ClusterNetId net, int ipin, float val);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd change val to crit_val, and ipin to inet_pin_index (or add a comment on what ipin is -- I think it is an index on that net from 0 to fanout).

Copy link
Contributor Author

@Bill-hbrhbr Bill-hbrhbr Sep 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added assertion level 3 check with error message for ipin in timing_place.cpp.

Expand All @@ -81,6 +81,70 @@ class PlacerCriticalities {

//Set of pins with criticaltites modified by last call to update_criticalities()
vtr::vec_id_set<ClusterPinId> cluster_pins_with_modified_criticality_;

//Updates criticalities: incremental V.S. from scratch
void incr_update_criticalities(const SetupTimingInfo* timing_info);
void recompute_criticalities();
};

/* Usage
* =====
* PlacerSetupSlacks returns the clustered netlist connection setup slack used by
* the placer. This also serves to map atom netlist level slack (i.e. on AtomPinIds)
* to the clustered netlist (i.e. ClusterPinIds) used during placement.
*
* Setup slacks are calculated by calling update_setup_slacks(), which will
* update setup slacks based on the atom netlist connection setup slacks provided by
* the passed in SetupTimingInfo. This is done incrementally, based on the modified
* connections/AtomPinIds returned by SetupTimingInfo.
*
* The setup slacks of individual connections can then be queried by calling the
* setup_slack() member function.
*
* It also supports iterating via pins_with_modified_setup_slack() through the
* clustered netlist pins/connections which have had their setup slacks modified by
* the last call to update_setup_slacks().
*/
class PlacerSetupSlacks {
public: //Types
typedef vtr::vec_id_set<ClusterPinId>::iterator pin_iterator;
typedef vtr::vec_id_set<ClusterNetId>::iterator net_iterator;

typedef vtr::Range<pin_iterator> pin_range;
typedef vtr::Range<net_iterator> net_range;

public: //Lifetime
PlacerSetupSlacks(const ClusteredNetlist& clb_nlist, const ClusteredPinAtomPinsLookup& netlist_pin_lookup);
PlacerSetupSlacks(const PlacerSetupSlacks& clb_nlist) = delete;
PlacerSetupSlacks& operator=(const PlacerSetupSlacks& clb_nlist) = delete;

public: //Accessors
//Returns the setup slack of the specified connection
float setup_slack(ClusterNetId net, int ipin) const { return timing_place_setup_slacks_[net][ipin]; }

//Returns the range of clustered netlist pins (i.e. ClusterPinIds) which were modified
//by the last call to update_setup_slacks()
pin_range pins_with_modified_setup_slack() const;

public: //Modifiers
//Updates setup slacks based on the atom netlist setup slacks provided by timing_info
void update_setup_slacks(const SetupTimingInfo* timing_info, bool recompute);

//Override the setup slack of a particular connection
void set_setup_slack(ClusterNetId net, int ipin, float val);

private: //Data
const ClusteredNetlist& clb_nlist_;
const ClusteredPinAtomPinsLookup& pin_lookup_;

ClbNetPinsMatrix<float> timing_place_setup_slacks_; /* [0..cluster_ctx.clb_nlist.nets().size()-1][1..num_pins-1] */

//Set of pins with criticaltites modified by last call to update_criticalities()
vtr::vec_id_set<ClusterPinId> cluster_pins_with_modified_setup_slack_;

//Updates setup slacks: incremental V.S. from scratch
void incr_update_setup_slacks(const SetupTimingInfo* timing_info);
void recompute_setup_slacks();
};

/* Usage
Expand Down
Loading