diff --git a/doc/src/arch/example_arch.xml b/doc/src/arch/example_arch.xml index b11e950b1fd..8a319b2d0e4 100644 --- a/doc/src/arch/example_arch.xml +++ b/doc/src/arch/example_arch.xml @@ -80,7 +80,7 @@ - + @@ -95,7 +95,7 @@ - + @@ -105,7 +105,7 @@ - + @@ -114,7 +114,7 @@ - + diff --git a/doc/src/arch/reference.rst b/doc/src/arch/reference.rst index f98fee87b6b..aa9475721e8 100644 --- a/doc/src/arch/reference.rst +++ b/doc/src/arch/reference.rst @@ -799,7 +799,7 @@ Tile ~~~~ .. arch:tag:: - A tile refers to a placeable element within an FPGA architecture. + A tile refers to a placeable element within an FPGA architecture and describes its physical composition on the grid. The following attributes are applicable to each tile. The only required one is the name of the tile. @@ -1179,19 +1179,46 @@ The following tags are common to all ```` tags: .. arch:tag:: - Describes the Complex Blocks that can be placed within this tile. + .. seealso:: For a step-by-step walkthrough on describing equivalent sites, see :ref:`equivalent_sites_tutorial`. - .. arch:tag:: + Describes the Complex Blocks that can be placed within a tile. + Each physical tile can contain from 1 to N possible Complex Blocks, or ``sites``. + A ``site`` corresponds to a top-level Complex Block that must be placeable in at least one physical tile location. + + .. arch:tag:: :req_param pb_type: Name of the corresponding pb_type. - **Example: Equivalent Sites** + :opt_param pin_mapping: Specifies how the pins of the physical tile map onto the pins of the logical pb_type: - .. code-block:: xml + * ``direct``: the pin mapping does not need to be specified, as the tile pin definition is identical to the corresponding pb_type one; + * ``custom``: the pin mapping is user-defined. + + + **Default:** ``direct`` + + **Example: Equivalent Sites** + + .. code-block:: xml + + + + + + .. arch:tag:: + + Describes the mapping of a physical tile's port onto the logical block's (pb_type) port. 
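As a sketch of how these tags combine (the tile, site, and port names below are hypothetical, and a ``site`` with ``pin_mapping="direct"`` would simply omit the ``direct`` children):

```xml
<equivalent_sites>
    <site pb_type="SLICE_SITE" pin_mapping="custom">
        <!-- `from` names a physical tile port, `to` a logical block (pb_type) port -->
        <direct from="SLICE_TILE.IN0" to="SLICE_SITE.I0"/>
        <direct from="SLICE_TILE.OUT0" to="SLICE_SITE.O0"/>
    </site>
</equivalent_sites>
```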
+ ``direct`` is an optional sub-tag of ``site``. + + .. note:: This tag is needed only if the ``pin_mapping`` of the ``site`` is defined as ``custom``. + + Attributes: + - ``from`` is relative to the physical tile pins + - ``to`` is relative to the logical block pins + + .. code-block:: xml - - - + .. _arch_complex_blocks: diff --git a/doc/src/tutorials/arch/equivalent_sites.rst b/doc/src/tutorials/arch/equivalent_sites.rst new file mode 100644 index 00000000000..6c505773fda --- /dev/null +++ b/doc/src/tutorials/arch/equivalent_sites.rst @@ -0,0 +1,234 @@ +.. _equivalent_sites_tutorial: + +Equivalent Sites tutorial +========================= + +This tutorial describes how to model equivalent sites to enable ``equivalent placement`` in VPR. + +Equivalent site placement allows the user to define complex logical blocks (top-level pb_types) that can be used in multiple physical location types of the FPGA device grid. +In the same way, the user can define multiple physical tiles with different physical attributes that implement the same logical block. + +The first case (multiple physical grid location types for one complex logical block) is explained below. +Consider a device that provides two different Configurable Logic Blocks (CLBs), SLICEL and SLICEM. +In this case, the SLICEM CLB is a superset that implements additional features with respect to the SLICEL CLB. +Therefore, the user can model the architecture so that the SLICEL Complex Block can also be placed in a SLICEM physical tile, making SLICEM a valid grid location for it. +This can lead to better placement results, given that a Complex Logic Block is not bound to only one physical location type. + +Below is the implementation of this situation, starting from an example that does not make use of equivalent site placement: + +.. 
code-block:: xml + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + ... + + + + + + + + + + + + + + ... + + +As the user can see, ``SLICEL`` and ``SLICEM`` are treated as two different entities, even though they are similar to one another. +To allow VPR to choose a ``SLICEM`` location when placing a ``SLICEL_SITE`` pb_type, the user needs to change the ``SLICEM`` tile accordingly, as shown below: + +.. code-block:: xml + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +With the above description of the ``SLICEM`` tile, the ``SLICEL`` sites can now be placed in ``SLICEM`` physical locations. +Note that not all the pins of the ``SLICEL_SITE`` have been mapped. For instance, the ``WE`` and ``AI`` ports are absent from the ``SLICEL_SITE`` definition, hence they cannot appear in the pin mapping between the physical tile and the logical block. + +The second case described in this tutorial refers to the situation in which multiple different physical location types in the device grid are used by one complex logical block. +Imagine a device that has left and right I/O tile types with different pin locations, which therefore need to be defined in two different ways. +With equivalent site placement, the user doesn't need to define multiple different pb_types that implement the same functionality. + +Below is the implementation of this situation, starting from an example that does not make use of equivalent site placement: + +.. code-block:: xml + + + + + + + + + + + + + LEFT_IOPAD_TILE.INPUT + LEFT_IOPAD_TILE.OUTPUT + + + + + + + + + + + + + RIGHT_IOPAD_TILE.INPUT + RIGHT_IOPAD_TILE.OUTPUT + + + + + + + + + + ... + + + + + + ... 
+ + + +To avoid duplicating the complex logic blocks in the ``LEFT`` and ``RIGHT`` IOPADs, the user can describe the pb_type only once and add it to the equivalent sites tag of the two different tiles, as follows: + +.. code-block:: xml + + + + + + + + + + + + + LEFT_IOPAD_TILE.INPUT + LEFT_IOPAD_TILE.OUTPUT + + + + + + + + + + + + + RIGHT_IOPAD_TILE.INPUT + RIGHT_IOPAD_TILE.OUTPUT + + + + + + + + + + ... + + + +With this implementation, the ``IOPAD_SITE`` can be placed in both the ``LEFT`` and ``RIGHT`` physical location types. +Note that the ``pin_mapping`` is set to ``direct``, given that the physical tile and the logical block share the same IO pins. + +The two different cases can be mixed to obtain an N to M mapping of physical tiles/logical blocks. diff --git a/doc/src/tutorials/arch/index.rst b/doc/src/tutorials/arch/index.rst index f901d325b84..8a3a8d22482 100644 --- a/doc/src/tutorials/arch/index.rst +++ b/doc/src/tutorials/arch/index.rst @@ -30,6 +30,7 @@ Multiple examples of how this language can be used to describe different types o fracturable_multiplier configurable_memory xilinx_virtex_6_like + equivalent_sites **Modeling Guides:** diff --git a/libs/libarchfpga/src/arch_util.cpp b/libs/libarchfpga/src/arch_util.cpp index b16b5f9de83..27780c42ab4 100644 --- a/libs/libarchfpga/src/arch_util.cpp +++ b/libs/libarchfpga/src/arch_util.cpp @@ -252,10 +252,6 @@ void free_type_descriptors(std::vector& type_descriptors) vtr::free(type.is_pin_global); vtr::free(type.pin_class); - for (auto equivalent_site : type.equivalent_sites) { - vtr::free(equivalent_site.pb_type_name); - } - for (auto port : type.ports) { vtr::free(port.name); } diff --git a/libs/libarchfpga/src/echo_arch.cpp b/libs/libarchfpga/src/echo_arch.cpp index 292d53907df..8d1ea79971a 100644 --- a/libs/libarchfpga/src/echo_arch.cpp +++ b/libs/libarchfpga/src/echo_arch.cpp @@ -102,11 +102,23 @@ void EchoArch(const char* EchoFile, int index = Type.index; fprintf(Echo, "\tindex: %d\n", index); - if
(LogicalBlockTypes[Type.index].pb_type) { - PrintPb_types_rec(Echo, LogicalBlockTypes[Type.index].pb_type, 2); + + for (auto LogicalBlock : Type.equivalent_sites) { + fprintf(Echo, "\nEquivalent Site: %s\n", LogicalBlock->name); } fprintf(Echo, "\n"); } + + fprintf(Echo, "*************************************************\n\n"); + fprintf(Echo, "*************************************************\n"); + + for (auto& LogicalBlock : LogicalBlockTypes) { + if (LogicalBlock.pb_type) { + PrintPb_types_rec(Echo, LogicalBlock.pb_type, 2); + } + fprintf(Echo, "\n"); + } + fclose(Echo); } diff --git a/libs/libarchfpga/src/physical_types.h b/libs/libarchfpga/src/physical_types.h index 7743a395220..36901e6f0ed 100644 --- a/libs/libarchfpga/src/physical_types.h +++ b/libs/libarchfpga/src/physical_types.h @@ -38,6 +38,7 @@ #include "vtr_ndmatrix.h" #include "vtr_hash.h" +#include "vtr_bimap.h" #include "logic_types.h" #include "clock_types.h" @@ -55,7 +56,11 @@ struct t_port_power; struct t_physical_tile_port; struct t_equivalent_site; struct t_physical_tile_type; +typedef const t_physical_tile_type* t_physical_tile_type_ptr; struct t_logical_block_type; +typedef const t_logical_block_type* t_logical_block_type_ptr; +struct t_logical_pin; +struct t_physical_pin; struct t_pb_type; struct t_pb_graph_pin_power; struct t_mode; @@ -521,27 +526,6 @@ enum class e_sb_type { constexpr int NO_SWITCH = -1; constexpr int DEFAULT_SWITCH = -2; -/* Describes the type for a logical block - * name: unique identifier for type - * pb_type: Internal subblocks and routing information for this physical block - * pb_graph_head: Head of DAG of pb_types_nodes and their edges - * - * index: Keep track of type in array for easy access - * physical_tile_index: index of the corresponding physical tile type - */ -struct t_logical_block_type { - char* name = nullptr; - - /* Clustering info */ - t_pb_type* pb_type = nullptr; - t_pb_graph_node* pb_graph_head = nullptr; - - int index = -1; /* index of type 
descriptor in array (allows for index referencing) */ - - int physical_tile_index = -1; /* index of the corresponding physical tile type */ -}; -typedef const t_logical_block_type* t_logical_block_type_ptr; - /* Describes the type for a physical tile * name: unique identifier for type * num_pins: Number of pins for the block @@ -626,14 +610,55 @@ struct t_physical_tile_type { int index = -1; /* index of type descriptor in array (allows for index referencing) */ - int logical_block_index = -1; /* index of the corresponding logical block type */ + std::vector equivalent_sites; - std::vector equivalent_sites; + /* Unordered map indexed by the logical block index. + * tile_block_pin_directs_map[logical block index][logical block pin] -> physical tile pin */ + std::unordered_map> tile_block_pin_directs_map; /* Returns the indices of pins that contain a clock for this physical logic block */ std::vector get_clock_pins_indices() const; }; -typedef const t_physical_tile_type* t_physical_tile_type_ptr; + +/** A logical pin defines the pin index of a logical block type (i.e. a top level PB type) + * This structure wraps the int value of the logical pin to allow its storage in the + * vtr::bimap container. + */ +struct t_logical_pin { + int pin = -1; + + t_logical_pin(int value) { + pin = value; + } + + bool operator==(const t_logical_pin o) const { + return pin == o.pin; + } + + bool operator<(const t_logical_pin o) const { + return pin < o.pin; + } +}; + +/** A physical pin defines the pin index of a physical tile type (i.e. a grid tile type) + * This structure wraps the int value of the physical pin to allow its storage in the + * vtr::bimap container. 
+ */ +struct t_physical_pin { + int pin = -1; + + t_physical_pin(int value) { + pin = value; + } + + bool operator==(const t_physical_pin o) const { + return pin == o.pin; + } + + bool operator<(const t_physical_pin o) const { + return pin < o.pin; + } +}; /** Describes I/O and clock ports of a physical tile type * @@ -665,20 +690,26 @@ struct t_physical_tile_port { int index; int absolute_first_pin_index; int port_index_by_type; - int tile_type_index; }; -/** Describes the equivalent sites related to a specific tile type - * - * It corresponds to the tags in the FPGA architecture description +/* Describes the type for a logical block + * name: unique identifier for type + * pb_type: Internal subblocks and routing information for this physical block + * pb_graph_head: Head of DAG of pb_types_nodes and their edges * + * index: Keep track of type in array for easy access + * physical_tile_index: index of the corresponding physical tile type */ -struct t_equivalent_site { - char* pb_type_name; +struct t_logical_block_type { + char* name = nullptr; + + /* Clustering info */ + t_pb_type* pb_type = nullptr; + t_pb_graph_node* pb_graph_head = nullptr; + + int index = -1; /* index of type descriptor in array (allows for index referencing) */ - // XXX Variables to hold information on mapping between site and tile - // XXX as well as references to the belonging pb_type and tile_type - //t_logical_block_type* block_type; + std::vector equivalent_tiles; }; /************************************************************************************************* @@ -726,8 +757,9 @@ struct t_equivalent_site { * modes: Different modes accepted * ports: I/O and clock ports * num_clock_pins: A count of the total number of clock pins - * int num_input_pins: A count of the total number of input pins - * int num_output_pins: A count of the total number of output pins + * num_input_pins: A count of the total number of input pins + * num_output_pins: A count of the total number of output pins + 
* num_pins: A count of the total number of pins * timing: Timing matrix of block [0..num_inputs-1][0..num_outputs-1] * parent_mode: mode of the parent block * t_mode_power: ??? @@ -749,6 +781,8 @@ struct t_pb_type { int num_input_pins = 0; /* inputs not including clock pins */ int num_output_pins = 0; + int num_pins = 0; + t_mode* parent_mode = nullptr; int depth = 0; /* depth of pb_type */ @@ -861,6 +895,7 @@ struct t_port { int index; int port_index_by_type; + int absolute_first_pin_index; t_port_power* port_power; }; diff --git a/libs/libarchfpga/src/read_xml_arch_file.cpp b/libs/libarchfpga/src/read_xml_arch_file.cpp index c060945ecc0..32381facab6 100644 --- a/libs/libarchfpga/src/read_xml_arch_file.cpp +++ b/libs/libarchfpga/src/read_xml_arch_file.cpp @@ -52,6 +52,7 @@ #include "vtr_memory.h" #include "vtr_digest.h" #include "vtr_token.h" +#include "vtr_bimap.h" #include "arch_types.h" #include "arch_util.h" @@ -84,14 +85,16 @@ static void SetupPinLocationsAndPinClasses(pugi::xml_node Locations, static void LoadPinLoc(pugi::xml_node Locations, t_physical_tile_type* type, const pugiutil::loc_data& loc_data); -static std::pair ProcessCustomPinLoc(pugi::xml_node Locations, - t_physical_tile_type_ptr type, - const char* pin_loc_string, - const pugiutil::loc_data& loc_data); +template +static std::pair ProcessPinString(pugi::xml_node Locations, + T type, + const char* pin_loc_string, + const pugiutil::loc_data& loc_data); /* Process XML hierarchy */ static void ProcessTiles(pugi::xml_node Node, std::vector& PhysicalTileTypes, + std::vector& LogicalBlockTypes, const t_default_fc_spec& arch_def_fc, t_arch& arch, const pugiutil::loc_data& loc_data); @@ -106,7 +109,17 @@ static void ProcessTilePort(pugi::xml_node Node, const pugiutil::loc_data& loc_data); static void ProcessTileEquivalentSites(pugi::xml_node Parent, t_physical_tile_type* PhysicalTileType, + std::vector& LogicalBlockTypes, const pugiutil::loc_data& loc_data); +static void 
ProcessEquivalentSiteDirectConnection(pugi::xml_node Parent, + t_physical_tile_type* PhysicalTileType, + t_logical_block_type* LogicalBlockType, + const pugiutil::loc_data& loc_data); +static void ProcessEquivalentSiteCustomConnection(pugi::xml_node Parent, + t_physical_tile_type* PhysicalTileType, + t_logical_block_type* LogicalBlockType, + std::string site_name, + const pugiutil::loc_data& loc_data); static void ProcessPb_Type(pugi::xml_node Parent, t_pb_type* pb_type, t_mode* mode, @@ -212,9 +225,16 @@ e_side string_to_side(std::string side_str); static void link_physical_logical_types(std::vector& PhysicalTileTypes, std::vector& LogicalBlockTypes); -static void check_port_equivalence(t_physical_tile_type& physical_tile, t_logical_block_type& logical_block); +static void check_port_direct_mappings(t_physical_tile_type_ptr physical_tile, t_logical_block_type_ptr logical_block); static const t_physical_tile_port* get_port_by_name(t_physical_tile_type_ptr type, const char* port_name); +static const t_port* get_port_by_name(t_logical_block_type_ptr type, const char* port_name); + +static const t_physical_tile_port* get_port_by_pin(t_physical_tile_type_ptr type, int pin); +static const t_port* get_port_by_pin(t_logical_block_type_ptr type, int pin); + +template +static T* get_type_by_name(const char* type_name, std::vector& types); /* * @@ -298,14 +318,14 @@ void XmlReadArch(const char* ArchFile, ProcessSwitchblocks(Next, arch, loc_data); } - /* Process logical block types */ - Next = get_single_child(architecture, "tiles", loc_data); - ProcessTiles(Next, PhysicalTileTypes, arch_def_fc, *arch, loc_data); - /* Process logical block types */ Next = get_single_child(architecture, "complexblocklist", loc_data); ProcessComplexBlocks(Next, LogicalBlockTypes, *arch, timing_enabled, loc_data); + /* Process logical block types */ + Next = get_single_child(architecture, "tiles", loc_data); + ProcessTiles(Next, PhysicalTileTypes, LogicalBlockTypes, arch_def_fc, *arch, 
loc_data); + /* Link Physical Tiles with Logical Blocks */ link_physical_logical_types(PhysicalTileTypes, LogicalBlockTypes); @@ -626,7 +646,7 @@ static void SetupPinLocationsAndPinClasses(pugi::xml_node Locations, if (port.equivalent != PortEquivalence::NONE) { PhysicalTileType->class_inf[num_class].num_pins = port.num_pins; PhysicalTileType->class_inf[num_class].pinlist = (int*)vtr::malloc(sizeof(int) * port.num_pins); - PhysicalTileType->class_inf[num_class].equivalence = PhysicalTileType->ports[i].equivalent; + PhysicalTileType->class_inf[num_class].equivalence = port.equivalent; } for (k = 0; k < port.num_pins; ++k) { @@ -796,10 +816,10 @@ static void LoadPinLoc(pugi::xml_node Locations, for (int height = 0; height < type->height; ++height) { for (e_side side : {TOP, RIGHT, BOTTOM, LEFT}) { for (int pin = 0; pin < type->num_pin_loc_assignments[width][height][side]; ++pin) { - auto pin_range = ProcessCustomPinLoc(Locations, - type, - type->pin_loc_assignments[width][height][side][pin], - loc_data); + auto pin_range = ProcessPinString(Locations, + type, + type->pin_loc_assignments[width][height][side][pin], + loc_data); for (int pin_num = pin_range.first; pin_num < pin_range.second; ++pin_num) { VTR_ASSERT(pin_num < type->num_pins / type->capacity); @@ -827,10 +847,11 @@ static void LoadPinLoc(pugi::xml_node Locations, } } -static std::pair ProcessCustomPinLoc(pugi::xml_node Locations, - t_physical_tile_type_ptr type, - const char* pin_loc_string, - const pugiutil::loc_data& loc_data) { +template +static std::pair ProcessPinString(pugi::xml_node Locations, + T type, + const char* pin_loc_string, + const pugiutil::loc_data& loc_data) { int num_tokens; auto tokens = GetTokensFromString(pin_loc_string, &num_tokens); @@ -859,11 +880,14 @@ static std::pair ProcessCustomPinLoc(pugi::xml_node Locations, } auto port = get_port_by_name(type, token.data); - VTR_ASSERT(port != nullptr); + if (port == nullptr) { + archfpga_throw(loc_data.filename_c_str(), 
loc_data.line(Locations), + "Port %s for %s could not be found: %s\n", + type->name, token.data, + pin_loc_string); + } int abs_first_pin_idx = port->absolute_first_pin_index; - std::pair pins; - token_index++; // All the pins of the port are taken or the port has a single pin @@ -1393,6 +1417,8 @@ static void ProcessPb_Type(pugi::xml_node Parent, t_pb_type* pb_type, t_mode* mo /* process ports */ j = 0; + int absolute_port_first_pin_index = 0; + for (i = 0; i < 3; i++) { if (i == 0) { k = 0; @@ -1411,6 +1437,9 @@ static void ProcessPb_Type(pugi::xml_node Parent, t_pb_type* pb_type, t_mode* mo ProcessPb_TypePort(Cur, &pb_type->ports[j], pb_type->pb_type_power->estimation_method, is_root_pb_type, loc_data); + pb_type->ports[j].absolute_first_pin_index = absolute_port_first_pin_index; + absolute_port_first_pin_index += pb_type->ports[j].num_pins; + //Check port name duplicates ret_pb_ports = pb_port_names.insert(std::pair(pb_type->ports[j].name, 0)); if (!ret_pb_ports.second) { @@ -1443,6 +1472,8 @@ static void ProcessPb_Type(pugi::xml_node Parent, t_pb_type* pb_type, t_mode* mo } } + pb_type->num_pins = pb_type->num_input_pins + pb_type->num_output_pins + pb_type->num_clock_pins; + //Warn that max_internal_delay is no longer supported //TODO: eventually remove try { @@ -2929,6 +2960,7 @@ static void ProcessChanWidthDistrDir(pugi::xml_node Node, t_chan* chan, const pu static void ProcessTiles(pugi::xml_node Node, std::vector& PhysicalTileTypes, + std::vector& LogicalBlockTypes, const t_default_fc_spec& arch_def_fc, t_arch& arch, const pugiutil::loc_data& loc_data) { @@ -2941,7 +2973,6 @@ static void ProcessTiles(pugi::xml_node Node, */ t_physical_tile_type EMPTY_PHYSICAL_TILE_TYPE = SetupEmptyPhysicalType(); EMPTY_PHYSICAL_TILE_TYPE.index = 0; - EMPTY_PHYSICAL_TILE_TYPE.logical_block_index = 0; PhysicalTileTypes.push_back(EMPTY_PHYSICAL_TILE_TYPE); /* Process the types */ @@ -2953,6 +2984,8 @@ static void ProcessTiles(pugi::xml_node Node, t_physical_tile_type 
PhysicalTileType; + PhysicalTileType.index = index; + /* Parses the properties fields of the type */ ProcessTileProps(CurTileType, &PhysicalTileType, loc_data); @@ -3000,9 +3033,7 @@ static void ProcessTiles(pugi::xml_node Node, //Load equivalent sites infromation Cur = get_single_child(CurTileType, "equivalent_sites", loc_data, ReqOpt::REQUIRED); - ProcessTileEquivalentSites(Cur, &PhysicalTileType, loc_data); - - PhysicalTileType.index = index; + ProcessTileEquivalentSites(Cur, &PhysicalTileType, LogicalBlockTypes, loc_data); /* Type fully read */ ++index; @@ -3179,33 +3210,129 @@ static void ProcessTilePort(pugi::xml_node Node, static void ProcessTileEquivalentSites(pugi::xml_node Parent, t_physical_tile_type* PhysicalTileType, + std::vector& LogicalBlockTypes, const pugiutil::loc_data& loc_data) { pugi::xml_node CurSite; expect_only_children(Parent, {"site"}, loc_data); - if (count_children(Parent, "site", loc_data) != 1) { + if (count_children(Parent, "site", loc_data) < 1) { archfpga_throw(loc_data.filename_c_str(), loc_data.line(Parent), - "Zero or more than one sites corresponding to a tile.\n"); + "There are no sites corresponding to this tile: %s.\n", PhysicalTileType->name); } CurSite = Parent.first_child(); while (CurSite) { check_node(CurSite, "site", loc_data); - t_equivalent_site equivalent_site; - - expect_only_attributes(CurSite, {"pb_type"}, loc_data); + expect_only_attributes(CurSite, {"pb_type", "pin_mapping"}, loc_data); /* Load equivalent site name */ - auto Prop = get_attribute(CurSite, "pb_type", loc_data).value(); - equivalent_site.pb_type_name = vtr::strdup(Prop); + auto Prop = std::string(get_attribute(CurSite, "pb_type", loc_data).value()); + + auto LogicalBlockType = get_type_by_name(Prop.c_str(), LogicalBlockTypes); + + auto pin_mapping = get_attribute(CurSite, "pin_mapping", loc_data, ReqOpt::OPTIONAL).as_string("direct"); + + if (0 == strcmp(pin_mapping, "custom")) { + // Pin mapping between Tile and Pb Type is user-defined + 
ProcessEquivalentSiteCustomConnection(CurSite, PhysicalTileType, LogicalBlockType, Prop, loc_data); + } else if (0 == strcmp(pin_mapping, "direct")) { + ProcessEquivalentSiteDirectConnection(CurSite, PhysicalTileType, LogicalBlockType, loc_data); + } - PhysicalTileType->equivalent_sites.push_back(equivalent_site); + if (0 == strcmp(LogicalBlockType->pb_type->name, Prop.c_str())) { + PhysicalTileType->equivalent_sites.push_back(LogicalBlockType); + + check_port_direct_mappings(PhysicalTileType, LogicalBlockType); + } CurSite = CurSite.next_sibling(CurSite.name()); } } +static void ProcessEquivalentSiteDirectConnection(pugi::xml_node Parent, + t_physical_tile_type* PhysicalTileType, + t_logical_block_type* LogicalBlockType, + const pugiutil::loc_data& loc_data) { + int num_pins = PhysicalTileType->num_pins / PhysicalTileType->capacity; + + if (num_pins != LogicalBlockType->pb_type->num_pins) { + archfpga_throw(loc_data.filename_c_str(), loc_data.line(Parent), + "Pin definition differ between site %s and tile %s. 
User-defined pin mapping is required.\n", LogicalBlockType->pb_type->name, PhysicalTileType->name); + } + + vtr::bimap directs_map; + + for (int npin = 0; npin < num_pins; npin++) { + t_physical_pin physical_pin(npin); + t_logical_pin logical_pin(npin); + + directs_map.insert(logical_pin, physical_pin); + } + + PhysicalTileType->tile_block_pin_directs_map[LogicalBlockType->index] = directs_map; +} + +static void ProcessEquivalentSiteCustomConnection(pugi::xml_node Parent, + t_physical_tile_type* PhysicalTileType, + t_logical_block_type* LogicalBlockType, + std::string site_name, + const pugiutil::loc_data& loc_data) { + pugi::xml_node CurDirect; + + expect_only_children(Parent, {"direct"}, loc_data); + + if (count_children(Parent, "direct", loc_data) < 1) { + archfpga_throw(loc_data.filename_c_str(), loc_data.line(Parent), + "There are no direct pin mappings between site %s and tile %s.\n", site_name.c_str(), PhysicalTileType->name); + } + + vtr::bimap directs_map; + + CurDirect = Parent.first_child(); + while (CurDirect) { + check_node(CurDirect, "direct", loc_data); + + expect_only_attributes(CurDirect, {"from", "to"}, loc_data); + + std::string from, to; + // `from` attribute is relative to the physical tile pins + from = std::string(get_attribute(CurDirect, "from", loc_data).value()); + + // `to` attribute is relative to the logical block pins + to = std::string(get_attribute(CurDirect, "to", loc_data).value()); + + auto from_pins = ProcessPinString(CurDirect, PhysicalTileType, from.c_str(), loc_data); + auto to_pins = ProcessPinString(CurDirect, LogicalBlockType, to.c_str(), loc_data); + + // Checking that the number of pins is exactly the same + if (from_pins.second - from_pins.first != to_pins.second - to_pins.first) { + archfpga_throw(loc_data.filename_c_str(), loc_data.line(Parent), + "The number of pins specified in the direct pin mapping is " + "not equivalent for Physical Tile %s and Logical Block %s.\n", + PhysicalTileType->name, 
LogicalBlockType->name); + } + + int num_pins = from_pins.second - from_pins.first; + for (int i = 0; i < num_pins; i++) { + t_physical_pin physical_pin(from_pins.first + i); + t_logical_pin logical_pin(to_pins.first + i); + + auto result = directs_map.insert(logical_pin, physical_pin); + if (!result.second) { + archfpga_throw(loc_data.filename_c_str(), loc_data.line(Parent), + "Duplicate logical pin (%d) to physical pin (%d) mappings found for " + "Physical Tile %s and Logical Block %s.\n", + logical_pin.pin, physical_pin.pin, PhysicalTileType->name, LogicalBlockType->name); + } + } + + CurDirect = CurDirect.next_sibling(CurDirect.name()); + } + + PhysicalTileType->tile_block_pin_directs_map[LogicalBlockType->index] = directs_map; +} + /* Takes in node pointing to and loads all the * child type objects. */ static void ProcessComplexBlocks(pugi::xml_node Node, @@ -3222,7 +3349,6 @@ static void ProcessComplexBlocks(pugi::xml_node Node, */ t_logical_block_type EMPTY_LOGICAL_BLOCK_TYPE = SetupEmptyLogicalType(); EMPTY_LOGICAL_BLOCK_TYPE.index = 0; - EMPTY_LOGICAL_BLOCK_TYPE.physical_tile_index = 0; LogicalBlockTypes.push_back(EMPTY_LOGICAL_BLOCK_TYPE); /* Process the types */ @@ -4676,55 +4802,128 @@ e_side string_to_side(std::string side_str) { static void link_physical_logical_types(std::vector& PhysicalTileTypes, std::vector& LogicalBlockTypes) { - std::map check_equivalence; - for (auto& physical_tile : PhysicalTileTypes) { if (physical_tile.index == EMPTY_TYPE_INDEX) continue; - for (auto& equivalent_site : physical_tile.equivalent_sites) { - for (auto& logical_block : LogicalBlockTypes) { - if (logical_block.index == EMPTY_TYPE_INDEX) continue; + auto& equivalent_sites = physical_tile.equivalent_sites; - // Check the corresponding Logical Block - if (0 == strcmp(logical_block.pb_type->name, equivalent_site.pb_type_name)) { - physical_tile.logical_block_index = logical_block.index; - logical_block.physical_tile_index = physical_tile.index; + auto criteria = 
[physical_tile](const t_logical_block_type* lhs, const t_logical_block_type* rhs) { + int num_physical_pins = physical_tile.num_pins / physical_tile.capacity; - auto result = check_equivalence.emplace(&physical_tile, &logical_block); - if (!result.second) { - archfpga_throw(__FILE__, __LINE__, - "Logical and Physical types do not have a one to one mapping\n"); - } + int lhs_num_logical_pins = lhs->pb_type->num_pins; + int rhs_num_logical_pins = rhs->pb_type->num_pins; + + int lhs_diff_num_pins = num_physical_pins - lhs_num_logical_pins; + int rhs_diff_num_pins = num_physical_pins - rhs_num_logical_pins; + + return lhs_diff_num_pins < rhs_diff_num_pins; + }; - check_port_equivalence(physical_tile, logical_block); + std::sort(equivalent_sites.begin(), equivalent_sites.end(), criteria); + + for (auto& logical_block : LogicalBlockTypes) { + for (auto site : equivalent_sites) { + if (0 == strcmp(logical_block.name, site->pb_type->name)) { + logical_block.equivalent_tiles.push_back(&physical_tile); break; } } } } + + for (auto& logical_block : LogicalBlockTypes) { + if (logical_block.index == EMPTY_TYPE_INDEX) continue; + + auto& equivalent_tiles = logical_block.equivalent_tiles; + + if ((int)equivalent_tiles.size() <= 0) { + archfpga_throw(__FILE__, __LINE__, + "Logical Block %s does not have any equivalent tiles.\n", logical_block.name); + } + + std::unordered_map ignored_pins_check_map; + std::unordered_map global_pins_check_map; + + auto criteria = [logical_block](const t_physical_tile_type* lhs, const t_physical_tile_type* rhs) { + int num_logical_pins = logical_block.pb_type->num_pins; + + int lhs_num_physical_pins = lhs->num_pins / lhs->capacity; + int rhs_num_physical_pins = rhs->num_pins / rhs->capacity; + + int lhs_diff_num_pins = lhs_num_physical_pins - num_logical_pins; + int rhs_diff_num_pins = rhs_num_physical_pins - num_logical_pins; + + return lhs_diff_num_pins < rhs_diff_num_pins; + }; + + std::sort(equivalent_tiles.begin(), equivalent_tiles.end(), 
criteria); + + for (int pin = 0; pin < logical_block.pb_type->num_pins; pin++) { + for (auto& tile : logical_block.equivalent_tiles) { + auto direct_map = tile->tile_block_pin_directs_map.at(logical_block.index); + auto result = direct_map.find(t_logical_pin(pin)); + if (result == direct_map.end()) { + archfpga_throw(__FILE__, __LINE__, + "Logical pin %d not present in pin mapping between Tile %s and Block %s.\n", + pin, tile->name, logical_block.name); + } + + int phy_index = result->second.pin; + + bool is_ignored = tile->is_ignored_pin[phy_index]; + bool is_global = tile->is_pin_global[phy_index]; + + auto ignored_result = ignored_pins_check_map.insert(std::pair(pin, is_ignored)); + if (!ignored_result.second && ignored_result.first->second != is_ignored) { + archfpga_throw(__FILE__, __LINE__, + "Physical Tile %s has a different value for the ignored pin (physical pin: %d, logical pin: %d) " + "different from the corresponding pins of the other equivalent sites\n.", + tile->name, phy_index, pin); + } + + auto global_result = global_pins_check_map.insert(std::pair(pin, is_global)); + if (!global_result.second && global_result.first->second != is_global) { + archfpga_throw(__FILE__, __LINE__, + "Physical Tile %s has a different value for the global pin (physical pin: %d, logical pin: %d) " + "different from the corresponding pins of the other equivalent sites\n.", + tile->name, phy_index, pin); + } + } + } + } } -static void check_port_equivalence(t_physical_tile_type& physical_tile, t_logical_block_type& logical_block) { - auto pb_type = logical_block.pb_type; - auto pb_type_ports = pb_type->ports; +static void check_port_direct_mappings(t_physical_tile_type_ptr physical_tile, t_logical_block_type_ptr logical_block) { + auto pb_type = logical_block->pb_type; + + if (pb_type->num_pins > physical_tile->num_pins) { + archfpga_throw(__FILE__, __LINE__, + "Logical Block (%s) has more pins than the Physical Tile (%s).\n", + logical_block->name, physical_tile->name); + 
} + + auto& pin_direct_mapping = physical_tile->tile_block_pin_directs_map.at(logical_block->index); - if (pb_type->num_ports != (int)physical_tile.ports.size()) { + if (pb_type->num_pins != (int)pin_direct_mapping.size()) { archfpga_throw(__FILE__, __LINE__, - "Logical block (%s) and Physical tile (%s) have a different number of ports.\n", - logical_block.name, physical_tile.name); + "Logical block (%s) and Physical tile (%s) have a different number of pins.\n", + logical_block->name, physical_tile->name); } - for (auto& tile_port : physical_tile.ports) { - auto block_port = pb_type_ports[tile_port.index]; + for (auto pin_map : pin_direct_mapping) { + auto block_port = get_port_by_pin(logical_block, pin_map.first.pin); + auto tile_port = get_port_by_pin(physical_tile, pin_map.second.pin); + + VTR_ASSERT(block_port != nullptr); + VTR_ASSERT(tile_port != nullptr); - if (0 != strcmp(tile_port.name, block_port.name) - || tile_port.type != block_port.type - || tile_port.num_pins != block_port.num_pins - || tile_port.equivalent != block_port.equivalent) { + if (tile_port->type != block_port->type + || tile_port->num_pins != block_port->num_pins + || tile_port->equivalent != block_port->equivalent) { archfpga_throw(__FILE__, __LINE__, "Logical block (%s) and Physical tile (%s) do not have equivalent port specifications.\n", - logical_block.name, physical_tile.name); + logical_block->name, physical_tile->name); } } } @@ -4738,3 +4937,51 @@ static const t_physical_tile_port* get_port_by_name(t_physical_tile_type_ptr typ return nullptr; }
port.absolute_first_pin_index + port.num_pins) { + return &type->ports[port.index]; + } + } + + return nullptr; +} + +static const t_port* get_port_by_pin(t_logical_block_type_ptr type, int pin) { + auto pb_type = type->pb_type; + + for (int i = 0; i < pb_type->num_ports; i++) { + auto port = pb_type->ports[i]; + if (pin >= port.absolute_first_pin_index && pin < port.absolute_first_pin_index + port.num_pins) { + return &pb_type->ports[port.index]; + } + } + + return nullptr; +} + +template<typename T> +static T* get_type_by_name(const char* type_name, std::vector<T>& types) { + for (auto& type : types) { + if (0 == strcmp(type.name, type_name)) { + return &type; + } + } + + archfpga_throw(__FILE__, __LINE__, + "Could not find type: %s\n", type_name); +} diff --git a/utils/fasm/src/fasm.cpp b/utils/fasm/src/fasm.cpp index 361abdb1a4b..64607f6a6b8 100644 --- a/utils/fasm/src/fasm.cpp +++ b/utils/fasm/src/fasm.cpp @@ -42,6 +42,7 @@ void FasmWriterVisitor::visit_top_impl(const char* top_level_name) { void FasmWriterVisitor::visit_clb_impl(ClusterBlockId blk_id, const t_pb* clb) { auto& place_ctx = g_vpr_ctx.placement(); auto& device_ctx = g_vpr_ctx.device(); + auto& cluster_ctx = g_vpr_ctx.clustering(); current_blk_id_ = blk_id; @@ -54,7 +55,8 @@ void FasmWriterVisitor::visit_clb_impl(ClusterBlockId blk_id, const t_pb* clb) { int x = place_ctx.block_locs[blk_id].loc.x; int y = place_ctx.block_locs[blk_id].loc.y; int z = place_ctx.block_locs[blk_id].loc.z; auto &grid_loc = device_ctx.grid[x][y]; - blk_type_ = grid_loc.type; + physical_tile_ = grid_loc.type; + logical_block_ = cluster_ctx.clb_nlist.block_type(blk_id); blk_prefix_ = ""; clb_prefix_ = ""; @@ -94,11 +96,11 @@ void FasmWriterVisitor::visit_clb_impl(ClusterBlockId blk_id, const t_pb* clb) { VTR_ASSERT(value != nullptr); std::string prefix_unsplit = value->front().as_string(); std::vector<std::string> fasm_prefixes = vtr::split(prefix_unsplit, " \t\n"); - if(fasm_prefixes.size() != static_cast<size_t>(blk_type_->capacity)) { + if(fasm_prefixes.size() != 
static_cast<size_t>(physical_tile_->capacity)) { vpr_throw(VPR_ERROR_OTHER, __FILE__, __LINE__, "number of fasm_prefix (%s) options (%d) for block (%s) must match capacity(%d)", - prefix_unsplit.c_str(), fasm_prefixes.size(), blk_type_->name, blk_type_->capacity); + prefix_unsplit.c_str(), fasm_prefixes.size(), physical_tile_->name, physical_tile_->capacity); } grid_prefix = fasm_prefixes[z]; blk_prefix_ = grid_prefix + "."; @@ -122,7 +124,7 @@ void FasmWriterVisitor::check_interconnect(const t_pb_routes &pb_routes, int ino return; } - t_pb_graph_pin *prev_pin = pb_graph_pin_lookup_from_index_by_type_.at(blk_type_->index)[prev_node]; + t_pb_graph_pin *prev_pin = pb_graph_pin_lookup_from_index_by_type_.at(logical_block_->index)[prev_node]; int prev_edge; for(prev_edge = 0; prev_edge < prev_pin->num_output_edges; prev_edge++) { diff --git a/utils/fasm/src/fasm.h b/utils/fasm/src/fasm.h index 28ab1c79d7f..892dc6a83d7 100644 --- a/utils/fasm/src/fasm.h +++ b/utils/fasm/src/fasm.h @@ -86,7 +86,8 @@ class FasmWriterVisitor : public NetlistVisitor { t_pb_graph_node *root_clb_; bool current_blk_has_prefix_; - t_physical_tile_type_ptr blk_type_; + t_physical_tile_type_ptr physical_tile_; + t_logical_block_type_ptr logical_block_; std::string blk_prefix_; std::string clb_prefix_; std::map<const t_pb_graph_node*, std::string> clb_prefix_map_; diff --git a/utils/route_diag/src/main.cpp b/utils/route_diag/src/main.cpp index 0c8e095cd52..ee1c467d17c 100644 --- a/utils/route_diag/src/main.cpp +++ b/utils/route_diag/src/main.cpp @@ -154,7 +154,7 @@ static void profile_source(int source_rr_node, for (int sink_x = start_x; sink_x <= end_x; sink_x++) { for (int sink_y = start_y; sink_y <= end_y; sink_y++) { - if(device_ctx.grid[sink_x][sink_y].type == device_ctx.EMPTY_TYPE) { + if(device_ctx.grid[sink_x][sink_y].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE) { continue; } @@ -220,7 +220,7 @@ static t_chan_width setup_chan_width(t_router_opts router_opts, if (router_opts.fixed_channel_width == NO_FIXED_CHANNEL_WIDTH) { auto& 
device_ctx = g_vpr_ctx.device(); - auto type = physical_tile_type(find_most_common_block_type(device_ctx.grid)); + auto type = find_most_common_tile_type(device_ctx.grid); width_fac = 4 * type->num_pins; /*this is 2x the value that binary search starts */ diff --git a/vpr/src/base/SetupGrid.cpp b/vpr/src/base/SetupGrid.cpp index 6712196800b..982f313cafc 100644 --- a/vpr/src/base/SetupGrid.cpp +++ b/vpr/src/base/SetupGrid.cpp @@ -221,15 +221,33 @@ static DeviceGrid auto_size_device_grid(const std::vector& grid_layo } static std::vector<t_physical_tile_type_ptr> grid_overused_resources(const DeviceGrid& grid, std::map<t_logical_block_type_ptr, size_t> instance_counts) { + auto& device_ctx = g_vpr_ctx.device(); + std::vector<t_physical_tile_type_ptr> overused_resources; + std::unordered_map<t_physical_tile_type_ptr, size_t> min_count_map; + // Initialize min_count_map + for (const auto& physical_tile : device_ctx.physical_tile_types) { + min_count_map.insert(std::make_pair(&physical_tile, size_t(0))); + } + //Are the resources satisified? for (auto kv : instance_counts) { - t_physical_tile_type_ptr type; - size_t min_count; - std::tie(type, min_count) = std::make_pair(physical_tile_type(kv.first), kv.second); + t_physical_tile_type_ptr type = nullptr; - size_t inst_cnt = grid.num_instances(type); + size_t inst_cnt = 0; + for (auto& physical_tile : kv.first->equivalent_tiles) { + size_t tmp_inst_cnt = grid.num_instances(physical_tile); + + if (inst_cnt <= tmp_inst_cnt) { + type = physical_tile; + inst_cnt = tmp_inst_cnt; + } + } + + VTR_ASSERT(type); + size_t min_count = min_count_map.at(type) + kv.second; + min_count_map.at(type) = min_count; if (inst_cnt < min_count) { overused_resources.push_back(type); @@ -277,7 +295,7 @@ static DeviceGrid build_device_grid(const t_grid_def& grid_def, size_t grid_widt auto grid = vtr::Matrix<t_grid_tile>({grid_width, grid_height}); //Initialize the device to all empty blocks - auto empty_type = find_block_type_by_name(EMPTY_BLOCK_NAME, device_ctx.physical_tile_types); + auto empty_type = device_ctx.EMPTY_PHYSICAL_TILE_TYPE; VTR_ASSERT(empty_type != nullptr); for 
(size_t x = 0; x < grid_width; ++x) { for (size_t y = 0; y < grid_height; ++y) { @@ -290,7 +308,7 @@ static DeviceGrid build_device_grid(const t_grid_def& grid_def, size_t grid_widt for (const auto& grid_loc_def : grid_def.loc_defs) { //Fill in the block types according to the specification - auto type = find_block_type_by_name(grid_loc_def.block_type, device_ctx.physical_tile_types); + auto type = find_tile_type_by_name(grid_loc_def.block_type, device_ctx.physical_tile_types); if (!type) { VPR_FATAL_ERROR(VPR_ERROR_ARCH, @@ -531,7 +549,7 @@ static void set_grid_block_type(int priority, const t_physical_tile_type* type, VTR_ASSERT(grid_priorities[x][y] <= priority); if (grid_tile.type != nullptr - && grid_tile.type != device_ctx.EMPTY_TYPE) { + && grid_tile.type != device_ctx.EMPTY_PHYSICAL_TILE_TYPE) { //We are overriding a non-empty block, we need to be careful //to ensure we remove any blocks which will be invalidated when we //overwrite part of their locations @@ -566,8 +584,8 @@ static void set_grid_block_type(int priority, const t_physical_tile_type* type, // Note: that we explicitly check the type and offsets, since the original block // may have been completely overwritten, and we don't want to change anything // in that case - VTR_ASSERT(device_ctx.EMPTY_TYPE->width == 1); - VTR_ASSERT(device_ctx.EMPTY_TYPE->height == 1); + VTR_ASSERT(device_ctx.EMPTY_PHYSICAL_TILE_TYPE->width == 1); + VTR_ASSERT(device_ctx.EMPTY_PHYSICAL_TILE_TYPE->height == 1); #ifdef VERBOSE VTR_LOG("Ripping up block '%s' at (%d,%d) offset (%d,%d). 
Overlapped by '%s' at (%d,%d)\n", @@ -576,7 +594,7 @@ static void set_grid_block_type(int priority, const t_physical_tile_type* type, type->name, x_root, y_root); #endif - grid[x][y].type = device_ctx.EMPTY_TYPE; + grid[x][y].type = device_ctx.EMPTY_PHYSICAL_TILE_TYPE; grid[x][y].width_offset = 0; grid[x][y].height_offset = 0; @@ -664,7 +682,12 @@ float calculate_device_utilization(const DeviceGrid& grid, std::mapwidth * type->height; diff --git a/vpr/src/base/SetupVPR.cpp b/vpr/src/base/SetupVPR.cpp index 12d6638ce0b..4fd21004eb5 100644 --- a/vpr/src/base/SetupVPR.cpp +++ b/vpr/src/base/SetupVPR.cpp @@ -111,22 +111,43 @@ void SetupVPR(const t_options* Options, *library_models = Arch->model_library; /* TODO: this is inelegant, I should be populating this information in XmlReadArch */ - device_ctx.EMPTY_TYPE = nullptr; + device_ctx.EMPTY_PHYSICAL_TILE_TYPE = nullptr; for (const auto& type : device_ctx.physical_tile_types) { if (strcmp(type.name, EMPTY_BLOCK_NAME) == 0) { - VTR_ASSERT(device_ctx.EMPTY_TYPE == nullptr); - device_ctx.EMPTY_TYPE = &type; + VTR_ASSERT(device_ctx.EMPTY_PHYSICAL_TILE_TYPE == nullptr); + device_ctx.EMPTY_PHYSICAL_TILE_TYPE = &type; } else { - if (block_type_contains_blif_model(logical_block_type(&type), MODEL_INPUT)) { - device_ctx.input_types.insert(&type); + for (const auto& equivalent_site : type.equivalent_sites) { + if (block_type_contains_blif_model(equivalent_site, MODEL_INPUT)) { + device_ctx.input_types.insert(&type); + break; + } } - if (block_type_contains_blif_model(logical_block_type(&type), MODEL_OUTPUT)) { - device_ctx.output_types.insert(&type); + + for (const auto& equivalent_site : type.equivalent_sites) { + if (block_type_contains_blif_model(equivalent_site, MODEL_OUTPUT)) { + device_ctx.output_types.insert(&type); + break; + } } } } - VTR_ASSERT(device_ctx.EMPTY_TYPE != nullptr); + device_ctx.EMPTY_LOGICAL_BLOCK_TYPE = nullptr; + int max_equivalent_tiles = 0; + for (const auto& type : device_ctx.logical_block_types) { + 
if (0 == strcmp(type.name, EMPTY_BLOCK_NAME)) { + device_ctx.EMPTY_LOGICAL_BLOCK_TYPE = &type; + } + + max_equivalent_tiles = std::max(max_equivalent_tiles, (int)type.equivalent_tiles.size()); + } + + VTR_ASSERT(max_equivalent_tiles > 0); + device_ctx.has_multiple_equivalent_tiles = max_equivalent_tiles > 1; + + VTR_ASSERT(device_ctx.EMPTY_PHYSICAL_TILE_TYPE != nullptr); + VTR_ASSERT(device_ctx.EMPTY_LOGICAL_BLOCK_TYPE != nullptr); if (device_ctx.input_types.empty()) { VPR_ERROR(VPR_ERROR_ARCH, diff --git a/vpr/src/base/ShowSetup.cpp b/vpr/src/base/ShowSetup.cpp index 3b6c374af76..05501cc9c71 100644 --- a/vpr/src/base/ShowSetup.cpp +++ b/vpr/src/base/ShowSetup.cpp @@ -74,15 +74,20 @@ void printClusteredNetlistStats() { L_num_p_outputs = 0; for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { - num_blocks_type[cluster_ctx.clb_nlist.block_type(blk_id)->index]++; - auto type = physical_tile_type(blk_id); - if (is_io_type(type)) { - for (j = 0; j < type->num_pins; j++) { + auto logical_block = cluster_ctx.clb_nlist.block_type(blk_id); + auto physical_tile = pick_best_physical_type(logical_block); + num_blocks_type[logical_block->index]++; + if (is_io_type(physical_tile)) { + for (j = 0; j < logical_block->pb_type->num_pins; j++) { + int physical_pin = get_physical_pin(physical_tile, logical_block, j); + auto pin_class = physical_tile->pin_class[physical_pin]; + auto class_inf = physical_tile->class_inf[pin_class]; + if (cluster_ctx.clb_nlist.block_net(blk_id, j) != ClusterNetId::INVALID()) { - if (type->class_inf[type->pin_class[j]].type == DRIVER) { + if (class_inf.type == DRIVER) { L_num_p_inputs++; } else { - VTR_ASSERT(type->class_inf[type->pin_class[j]].type == RECEIVER); + VTR_ASSERT(class_inf.type == RECEIVER); L_num_p_outputs++; } } diff --git a/vpr/src/base/check_netlist.cpp b/vpr/src/base/check_netlist.cpp index 63fa3cd78f2..ee77a8b8fff 100644 --- a/vpr/src/base/check_netlist.cpp +++ b/vpr/src/base/check_netlist.cpp @@ -91,14 +91,18 @@ static int 
check_connections_to_global_clb_pins(ClusterNetId net_id, int verbosi int global_to_non_global_connection_count = 0; for (auto pin_id : cluster_ctx.clb_nlist.net_pins(net_id)) { ClusterBlockId blk_id = cluster_ctx.clb_nlist.pin_block(pin_id); - int pin_index = cluster_ctx.clb_nlist.pin_physical_index(pin_id); + auto logical_type = cluster_ctx.clb_nlist.block_type(blk_id); + auto physical_type = pick_best_physical_type(logical_type); - if (physical_tile_type(blk_id)->is_ignored_pin[pin_index] != net_is_ignored - && !is_io_type(physical_tile_type(blk_id))) { + int log_index = cluster_ctx.clb_nlist.pin_logical_index(pin_id); + int pin_index = get_physical_pin(physical_type, logical_type, log_index); + + if (physical_type->is_ignored_pin[pin_index] != net_is_ignored + && !is_io_type(physical_type)) { VTR_LOGV_WARN(verbosity > 2, "Global net '%s' connects to non-global architecture pin '%s' (netlist pin '%s')\n", cluster_ctx.clb_nlist.net_name(net_id).c_str(), - block_type_pin_index_to_name(physical_tile_type(blk_id), pin_index).c_str(), + block_type_pin_index_to_name(physical_type, pin_index).c_str(), cluster_ctx.clb_nlist.pin_name(pin_id).c_str()); ++global_to_non_global_connection_count; @@ -144,7 +148,7 @@ static int check_clb_conn(ClusterBlockId iblk, int num_conn) { /* This case should already have been flagged as an error -- this is * * just a redundant double check. 
*/ - if (num_conn > physical_tile_type(type)->num_pins) { + if (num_conn > type->pb_type->num_pins) { VTR_LOG_ERROR("logic block #%d with output %s has %d pins.\n", iblk, cluster_ctx.clb_nlist.block_name(iblk).c_str(), num_conn); error++; diff --git a/vpr/src/base/clock_modeling.cpp b/vpr/src/base/clock_modeling.cpp index 0e09f4092db..623eb3d7d6c 100644 --- a/vpr/src/base/clock_modeling.cpp +++ b/vpr/src/base/clock_modeling.cpp @@ -6,7 +6,7 @@ void ClockModeling::treat_clock_pins_as_non_globals() { auto& device_ctx = g_vpr_ctx.mutable_device(); for (const auto& type : device_ctx.physical_tile_types) { - if (logical_block_type(&type)->pb_type) { + if (!is_empty_type(&type)) { for (auto clock_pin_idx : type.get_clock_pins_indices()) { // clock pins should be originally considered as global when reading the architecture VTR_ASSERT(type.is_ignored_pin[clock_pin_idx]); diff --git a/vpr/src/base/clustered_netlist.cpp b/vpr/src/base/clustered_netlist.cpp index a2fc8c31daf..f0e031dd413 100644 --- a/vpr/src/base/clustered_netlist.cpp +++ b/vpr/src/base/clustered_netlist.cpp @@ -30,8 +30,8 @@ t_logical_block_type_ptr ClusteredNetlist::block_type(const ClusterBlockId id) c return block_types_[id]; } -ClusterNetId ClusteredNetlist::block_net(const ClusterBlockId blk_id, const int phys_pin_index) const { - auto pin_id = block_pin(blk_id, phys_pin_index); +ClusterNetId ClusteredNetlist::block_net(const ClusterBlockId blk_id, const int logical_pin_index) const { + auto pin_id = block_pin(blk_id, logical_pin_index); if (pin_id) { return pin_net(pin_id); @@ -50,11 +50,11 @@ int ClusteredNetlist::block_pin_net_index(const ClusterBlockId blk_id, const int return OPEN; } -ClusterPinId ClusteredNetlist::block_pin(const ClusterBlockId blk, const int phys_pin_index) const { +ClusterPinId ClusteredNetlist::block_pin(const ClusterBlockId blk, const int logical_pin_index) const { VTR_ASSERT_SAFE(valid_block_id(blk)); - VTR_ASSERT_SAFE_MSG(phys_pin_index >= 0 && phys_pin_index < 
physical_tile_type(block_type(blk))->num_pins, "Physical pin index must be in range"); + VTR_ASSERT_SAFE_MSG(logical_pin_index >= 0 && logical_pin_index < static_cast<int>(block_logical_pins_[blk].size()), "Logical pin index must be in range"); - return block_logical_pins_[blk][phys_pin_index]; + return block_logical_pins_[blk][logical_pin_index]; } bool ClusteredNetlist::block_contains_primary_input(const ClusterBlockId blk) const { @@ -75,17 +75,17 @@ bool ClusteredNetlist::block_contains_primary_output(const ClusterBlockId blk) c * Pins * */ -int ClusteredNetlist::pin_physical_index(const ClusterPinId id) const { - VTR_ASSERT_SAFE(valid_pin_id(id)); +int ClusteredNetlist::pin_logical_index(const ClusterPinId pin_id) const { + VTR_ASSERT_SAFE(valid_pin_id(pin_id)); - return pin_physical_index_[id]; + return pin_logical_index_[pin_id]; } -int ClusteredNetlist::net_pin_physical_index(const ClusterNetId net_id, int net_pin_index) const { +int ClusteredNetlist::net_pin_logical_index(const ClusterNetId net_id, int net_pin_index) const { auto pin_id = net_pin(net_id, net_pin_index); if (pin_id) { - return pin_physical_index(pin_id); + return pin_logical_index(pin_id); } return OPEN; //No valid pin found @@ -122,7 +122,7 @@ ClusterBlockId ClusteredNetlist::create_block(const char* name, t_pb* pb, t_logi block_types_.insert(blk_id, type); //Allocate and initialize every potential pin of the block - block_logical_pins_.insert(blk_id, std::vector<ClusterPinId>(physical_tile_type(type)->num_pins, ClusterPinId::INVALID())); + block_logical_pins_.insert(blk_id, std::vector<ClusterPinId>(get_max_num_pins(type), ClusterPinId::INVALID())); } //Check post-conditions: size @@ -135,20 +135,6 @@ ClusterBlockId ClusteredNetlist::create_block(const char* name, t_pb* pb, t_logi return blk_id; } -void ClusteredNetlist::set_pin_physical_index(const ClusterPinId pin, const int phys_pin_index) { - VTR_ASSERT_SAFE(valid_pin_id(pin)); - auto blk = pin_block(pin); - - int old_phys_pin_index = pin_physical_index(pin); - - 
//Invalidate old mapping - block_logical_pins_[blk][old_phys_pin_index] = ClusterPinId::INVALID(); - - //Update mappings - pin_physical_index_[pin] = phys_pin_index; - block_logical_pins_[blk][phys_pin_index] = pin; -} - ClusterPortId ClusteredNetlist::create_port(const ClusterBlockId blk_id, const std::string name, BitIndex width, PortType type) { ClusterPortId port_id = find_port(blk_id, name); if (!port_id) { @@ -169,7 +155,7 @@ ClusterPortId ClusteredNetlist::create_port(const ClusterBlockId blk_id, const s ClusterPinId ClusteredNetlist::create_pin(const ClusterPortId port_id, BitIndex port_bit, const ClusterNetId net_id, const PinType pin_type_, int pin_index, bool is_const) { ClusterPinId pin_id = Netlist::create_pin(port_id, port_bit, net_id, pin_type_, is_const); - pin_physical_index_.push_back(pin_index); + pin_logical_index_.push_back(pin_index); ClusterBlockId block_id = port_block(port_id); block_logical_pins_[block_id][pin_index] = pin_id; @@ -242,7 +228,7 @@ void ClusteredNetlist::clean_ports_impl(const vtr::vector_map& pin_id_map) { //Update all the pin values - pin_physical_index_ = clean_and_reorder_values(pin_physical_index_, pin_id_map); + pin_logical_index_ = clean_and_reorder_values(pin_logical_index_, pin_id_map); } void ClusteredNetlist::clean_nets_impl(const vtr::vector_map& net_id_map) { @@ -254,10 +240,10 @@ void ClusteredNetlist::rebuild_block_refs_impl(const vtr::vector_map& /*pin_id_map*/, const vtr::vector_map& /*port_id_map*/) { for (auto blk : blocks()) { - block_logical_pins_[blk] = std::vector<ClusterPinId>(physical_tile_type(blk)->num_pins, ClusterPinId::INVALID()); //Reset + block_logical_pins_[blk] = std::vector<ClusterPinId>(get_max_num_pins(block_type(blk)), ClusterPinId::INVALID()); //Reset for (auto pin : block_pins(blk)) { - int phys_pin_index = pin_physical_index(pin); - block_logical_pins_[blk][phys_pin_index] = pin; + int logical_pin_index = pin_logical_index(pin); + block_logical_pins_[blk][logical_pin_index] = pin; } } } @@ -283,7 +269,7 @@ void 
ClusteredNetlist::shrink_to_fit_impl() { block_logical_pins_.shrink_to_fit(); //Pin data - pin_physical_index_.shrink_to_fit(); + pin_logical_index_.shrink_to_fit(); //Net data net_is_ignored_.shrink_to_fit(); @@ -310,7 +296,7 @@ bool ClusteredNetlist::validate_port_sizes_impl(size_t /*num_ports*/) const { } bool ClusteredNetlist::validate_pin_sizes_impl(size_t num_pins) const { - if (pin_physical_index_.size() != num_pins) { + if (pin_logical_index_.size() != num_pins) { return false; } return true; diff --git a/vpr/src/base/clustered_netlist.h b/vpr/src/base/clustered_netlist.h index 2147c8557ed..53039bfdb15 100644 --- a/vpr/src/base/clustered_netlist.h +++ b/vpr/src/base/clustered_netlist.h @@ -68,12 +68,12 @@ * Pins * ---- * The only piece of unique pin information is: - * physical_pin_index_ + * logical_pin_index_ * - * Example of physical_pin_index_ + * Example of logical_pin_index_ * --------------------- - * Given a ClusterPinId, physical_pin_index_ will return the index of the pin within its block - * relative to the t_logical_block_type (physical description of the block). + * Given a ClusterPinId, logical_pin_index_ will return the index of the pin within its block + * relative to the t_logical_block_type (logical description of the block). * * +-----------+ * 0-->|O X|-->3 @@ -83,7 +83,7 @@ * * The index skips over unused pins, e.g. CLB has 6 pins (3 in, 3 out, numbered [0...5]), where * the first two ins, and last two outs are used. Indices [0,1] represent the ins, and [4,5] - * represent the outs. Indices [2,3] are unused. Therefore, physical_pin_index_[92] = 5. + * represent the outs. Indices [2,3] are unused. Therefore, logical_pin_index_[92] = 5. 
* * Nets * ---- @@ -134,8 +134,8 @@ class ClusteredNetlist : public Netlist vtr::vector_map<ClusterBlockId, t_pb*> block_pbs_; //Physical block representing the clustering & internal hierarchy of each CLB - vtr::vector_map<ClusterBlockId, t_logical_block_type_ptr> block_types_; //The type of physical block this user circuit block is mapped to - vtr::vector_map<ClusterBlockId, std::vector<ClusterPinId>> block_logical_pins_; //The logical pin associated with each physical block pin + vtr::vector_map<ClusterBlockId, t_logical_block_type_ptr> block_types_; //The type of logical block this user circuit block is mapped to + vtr::vector_map<ClusterBlockId, std::vector<ClusterPinId>> block_logical_pins_; //The logical pin associated with each physical tile pin //Pins - vtr::vector_map<ClusterPinId, int> pin_physical_index_; //The physical pin index (i.e. pin index //in t_logical_block_type) of logical pins + vtr::vector_map<ClusterPinId, int> pin_logical_index_; //The logical pin index of this block (i.e. pin index //in t_logical_block_type) corresponding //to the clustered pin //Nets vtr::vector_map<ClusterNetId, bool> net_is_ignored_; //Boolean mapping indicating if the net is ignored diff --git a/vpr/src/base/clustered_netlist_utils.cpp b/vpr/src/base/clustered_netlist_utils.cpp index a642405e82c..0e7f09a6fe8 100644 --- a/vpr/src/base/clustered_netlist_utils.cpp +++ b/vpr/src/base/clustered_netlist_utils.cpp @@ -17,7 +17,7 @@ void ClusteredPinAtomPinsLookup::init_lookup(const ClusteredNetlist& clustered_n clustered_pin_connected_atom_pins_.resize(clustered_pins.size()); for (ClusterPinId clustered_pin : clustered_pins) { auto clustered_block = clustered_netlist.pin_block(clustered_pin); - int phys_pin_index = clustered_netlist.pin_physical_index(clustered_pin); - clustered_pin_connected_atom_pins_[clustered_pin] = find_clb_pin_connected_atom_pins(clustered_block, phys_pin_index, pb_gpin_lookup); + int logical_pin_index = clustered_netlist.pin_logical_index(clustered_pin); + clustered_pin_connected_atom_pins_[clustered_pin] = find_clb_pin_connected_atom_pins(clustered_block, logical_pin_index, pb_gpin_lookup); } } diff --git a/vpr/src/base/device_grid.cpp b/vpr/src/base/device_grid.cpp index c37f3eed4a9..3be488d26dd 100644 
--- a/vpr/src/base/device_grid.cpp +++ b/vpr/src/base/device_grid.cpp @@ -13,7 +13,7 @@ DeviceGrid::DeviceGrid(std::string grid_name, vtr::Matrix grid, std } size_t DeviceGrid::num_instances(t_physical_tile_type_ptr type) const { - auto iter = instance_counts_.find(logical_block_type(type)); + auto iter = instance_counts_.find(type); if (iter != instance_counts_.end()) { //Return count return iter->second; @@ -36,7 +36,7 @@ void DeviceGrid::count_instances() { if (grid_[x][y].width_offset == 0 && grid_[x][y].height_offset == 0) { //Add capacity only if this is the root location - instance_counts_[logical_block_type(type)] += type->capacity; + instance_counts_[type] += type->capacity; } } } diff --git a/vpr/src/base/device_grid.h b/vpr/src/base/device_grid.h index 9247aac1e2d..6f0584c94db 100644 --- a/vpr/src/base/device_grid.h +++ b/vpr/src/base/device_grid.h @@ -37,7 +37,7 @@ class DeviceGrid { //traditional 2-d indexing to be used vtr::Matrix<t_grid_tile> grid_; - std::map<t_logical_block_type_ptr, size_t> instance_counts_; + std::map<t_physical_tile_type_ptr, size_t> instance_counts_; std::vector<t_logical_block_type_ptr> limiting_resources_; }; diff --git a/vpr/src/base/read_netlist.cpp b/vpr/src/base/read_netlist.cpp index cfbb78c384b..a44ad17e6b3 100644 --- a/vpr/src/base/read_netlist.cpp +++ b/vpr/src/base/read_netlist.cpp @@ -860,7 +860,6 @@ static void load_external_nets_and_cb(ClusteredNetlist& clb_nlist) { ext_nhash = alloc_hash_table(); - t_physical_tile_type_ptr tile_type; t_logical_block_type_ptr block_type; /* Assumes that complex block pins are ordered inputs, outputs, globals */ @@ -868,14 +867,13 @@ static void load_external_nets_and_cb(ClusteredNetlist& clb_nlist) { /* Determine the external nets of complex block */ for (auto blk_id : clb_nlist.blocks()) { block_type = clb_nlist.block_type(blk_id); - tile_type = physical_tile_type(block_type); const t_pb* pb = clb_nlist.block_pb(blk_id); ipin = 0; VTR_ASSERT(block_type->pb_type->num_input_pins + block_type->pb_type->num_output_pins + block_type->pb_type->num_clock_pins - == tile_type->num_pins / 
tile_type->capacity); + == block_type->pb_type->num_pins); int num_input_ports = pb->pb_graph_node->num_input_ports; int num_output_ports = pb->pb_graph_node->num_output_ports; @@ -951,14 +949,16 @@ static void load_external_nets_and_cb(ClusteredNetlist& clb_nlist) { * and blocks point back to net pins */ for (auto blk_id : clb_nlist.blocks()) { block_type = clb_nlist.block_type(blk_id); - tile_type = physical_tile_type(block_type); - for (j = 0; j < tile_type->num_pins; j++) { + auto tile_type = pick_best_physical_type(block_type); + for (j = 0; j < block_type->pb_type->num_pins; j++) { + int physical_pin = get_physical_pin(tile_type, block_type, j); + //Iterate through each pin of the block, and see if there is a net allocated/used for it clb_net_id = clb_nlist.block_net(blk_id, j); if (clb_net_id != ClusterNetId::INVALID()) { //Verify old and new CLB netlists have the same # of pins per net - if (RECEIVER == tile_type->class_inf[tile_type->pin_class[j]].type) { + if (RECEIVER == tile_type->class_inf[tile_type->pin_class[physical_pin]].type) { count[clb_net_id]++; if (count[clb_net_id] > (int)clb_nlist.net_sinks(clb_net_id).size()) { @@ -971,23 +971,23 @@ static void load_external_nets_and_cb(ClusteredNetlist& clb_nlist) { //Asserts the ClusterBlockId is the same when ClusterNetId & pin BitIndex is provided VTR_ASSERT(blk_id == clb_nlist.pin_block(*(clb_nlist.net_pins(clb_net_id).begin() + count[clb_net_id]))); //Asserts the block's pin index is the same - VTR_ASSERT(j == clb_nlist.pin_physical_index(*(clb_nlist.net_pins(clb_net_id).begin() + count[clb_net_id]))); - VTR_ASSERT(j == clb_nlist.net_pin_physical_index(clb_net_id, count[clb_net_id])); + VTR_ASSERT(j == clb_nlist.pin_logical_index(*(clb_nlist.net_pins(clb_net_id).begin() + count[clb_net_id]))); + VTR_ASSERT(j == clb_nlist.net_pin_logical_index(clb_net_id, count[clb_net_id])); // nets connecting to global pins are marked as global nets - if (tile_type->is_pin_global[j]) { + if 
(tile_type->is_pin_global[physical_pin]) { clb_nlist.set_net_is_global(clb_net_id, true); } - if (tile_type->is_ignored_pin[j]) { + if (tile_type->is_ignored_pin[physical_pin]) { clb_nlist.set_net_is_ignored(clb_net_id, true); } /* Error check performed later to ensure no mixing of ignored and non ignored signals */ } else { - VTR_ASSERT(DRIVER == tile_type->class_inf[tile_type->pin_class[j]].type); - VTR_ASSERT(j == clb_nlist.pin_physical_index(*(clb_nlist.net_pins(clb_net_id).begin()))); - VTR_ASSERT(j == clb_nlist.net_pin_physical_index(clb_net_id, 0)); + VTR_ASSERT(DRIVER == tile_type->class_inf[tile_type->pin_class[physical_pin]].type); + VTR_ASSERT(j == clb_nlist.pin_logical_index(*(clb_nlist.net_pins(clb_net_id).begin()))); + VTR_ASSERT(j == clb_nlist.net_pin_logical_index(clb_net_id, 0)); } } } @@ -999,8 +999,11 @@ static void load_external_nets_and_cb(ClusteredNetlist& clb_nlist) { for (auto pin_id : clb_nlist.net_sinks(net_id)) { bool is_ignored_net = clb_nlist.net_is_ignored(net_id); block_type = clb_nlist.block_type(clb_nlist.pin_block(pin_id)); - tile_type = physical_tile_type(block_type); - if (tile_type->is_ignored_pin[clb_nlist.pin_physical_index(pin_id)] != is_ignored_net) { + auto tile_type = pick_best_physical_type(block_type); + int logical_pin = clb_nlist.pin_logical_index(pin_id); + int physical_pin = get_physical_pin(tile_type, block_type, logical_pin); + + if (tile_type->is_ignored_pin[physical_pin] != is_ignored_net) { VTR_LOG_WARN( "Netlist connects net %s to both global and non-global pins.\n", clb_nlist.net_name(net_id).c_str()); diff --git a/vpr/src/base/read_place.cpp b/vpr/src/base/read_place.cpp index 7c81a3158f3..929192ec34c 100644 --- a/vpr/src/base/read_place.cpp +++ b/vpr/src/base/read_place.cpp @@ -160,7 +160,8 @@ void read_user_pad_loc(const char* pad_loc_file) { hash_table = alloc_hash_table(); for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { - if (is_io_type(physical_tile_type(blk_id))) { + auto logical_block = 
cluster_ctx.clb_nlist.block_type(blk_id); + if (is_io_type(pick_best_physical_type(logical_block))) { insert_in_hash_table(hash_table, cluster_ctx.clb_nlist.block_name(blk_id).c_str(), size_t(blk_id)); place_ctx.block_locs[blk_id].loc.x = OPEN; /* Mark as not seen yet. */ } @@ -266,7 +267,8 @@ void read_user_pad_loc(const char* pad_loc_file) { } for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { - auto type = physical_tile_type(blk_id); + auto logical_block = cluster_ctx.clb_nlist.block_type(blk_id); + auto type = pick_best_physical_type(logical_block); if (is_io_type(type) && place_ctx.block_locs[blk_id].loc.x == OPEN) { vpr_throw(VPR_ERROR_PLACE_F, pad_loc_file, 0, "IO block %s location was not specified in the pad file.\n", cluster_ctx.clb_nlist.block_name(blk_id).c_str()); diff --git a/vpr/src/base/read_route.cpp b/vpr/src/base/read_route.cpp index 6bdfeb2b1b3..9ec4069fe2c 100644 --- a/vpr/src/base/read_route.cpp +++ b/vpr/src/base/read_route.cpp @@ -378,7 +378,7 @@ static void process_global_blocks(std::ifstream& fp, ClusterNetId inet, const ch x, y, place_ctx.block_locs[bnum].loc.x, place_ctx.block_locs[bnum].loc.y); } - int pin_index = cluster_ctx.clb_nlist.net_pin_physical_index(inet, pin_counter); + int pin_index = net_pin_to_tile_pin_index(inet, pin_counter); if (physical_tile_type(bnum)->pin_class[pin_index] != atoi(tokens[7].c_str())) { vpr_throw(VPR_ERROR_ROUTE, filename, lineno, "The pin class %d of %lu net does not match given ", diff --git a/vpr/src/base/stats.cpp b/vpr/src/base/stats.cpp index 28eb70c45db..846185ee215 100644 --- a/vpr/src/base/stats.cpp +++ b/vpr/src/base/stats.cpp @@ -61,7 +61,7 @@ void routing_stats(bool full_stats, enum e_route_type route_type, std::vectorarea == UNDEFINED) { area += grid_logic_tile_area * type->width * type->height; } else { diff --git a/vpr/src/base/vpr_api.cpp b/vpr/src/base/vpr_api.cpp index 0854b644f8b..f2900d0d299 100644 --- a/vpr/src/base/vpr_api.cpp +++ b/vpr/src/base/vpr_api.cpp @@ -418,22 +418,35 @@ 
void vpr_create_device_grid(const t_vpr_setup& vpr_setup, const t_arch& Arch) { VTR_LOG("\n"); VTR_LOG("Resource usage...\n"); - for (const auto& type : device_ctx.physical_tile_types) { - VTR_LOG("\tNetlist %d\tblocks of type: %s\n", - num_type_instances[logical_block_type(&type)], type.name); - VTR_LOG("\tArchitecture %d\tblocks of type: %s\n", - device_ctx.grid.num_instances(&type), type.name); + for (const auto& type : device_ctx.logical_block_types) { + if (is_empty_type(&type)) continue; + + VTR_LOG("\tNetlist\n\t\t%d\tblocks of type: %s\n", + num_type_instances[&type], type.name); + + VTR_LOG("\tArchitecture\n"); + for (const auto equivalent_tile : type.equivalent_tiles) { + VTR_LOG("\t\t%d\tblocks of type: %s\n", + device_ctx.grid.num_instances(equivalent_tile), equivalent_tile->name); + } } VTR_LOG("\n"); float device_utilization = calculate_device_utilization(device_ctx.grid, num_type_instances); VTR_LOG("Device Utilization: %.2f (target %.2f)\n", device_utilization, target_device_utilization); for (const auto& type : device_ctx.physical_tile_types) { - float util = 0.; + if (is_empty_type(&type)) { + continue; + } + if (device_ctx.grid.num_instances(&type) != 0) { - util = float(num_type_instances[logical_block_type(&type)]) / device_ctx.grid.num_instances(&type); + float util = 0.; + VTR_LOG("\tPhysical Tile %s:\n", type.name); + for (auto logical_block : type.equivalent_sites) { + util = float(num_type_instances[logical_block]) / device_ctx.grid.num_instances(&type); + VTR_LOG("\tBlock Utilization: %.2f Logical Block: %s\n", util, logical_block->name); + } } - VTR_LOG("\tBlock Utilization: %.2f Type: %s\n", util, type.name); } VTR_LOG("\n"); @@ -880,7 +893,7 @@ static void get_intercluster_switch_fanin_estimates(const t_vpr_setup& vpr_setup //Build a dummy 10x10 device to determine the 'best' block type to use auto grid = create_device_grid(vpr_setup.device_layout, arch.grid_layouts, 10, 10); - auto type = 
physical_tile_type(find_most_common_block_type(grid)); + auto type = find_most_common_tile_type(grid); /* get Fc_in/out for most common block (e.g. logic blocks) */ VTR_ASSERT(type->fc_specs.size() > 0); diff --git a/vpr/src/base/vpr_context.h b/vpr/src/base/vpr_context.h index 7a72c7b2c97..b6a1f9859da 100644 --- a/vpr/src/base/vpr_context.h +++ b/vpr/src/base/vpr_context.h @@ -120,7 +120,10 @@ struct DeviceContext : public Context { /* Special pointers to identify special blocks on an FPGA: I/Os, unused, and default */ std::set input_types; std::set output_types; - t_physical_tile_type_ptr EMPTY_TYPE; + + /* Empty types */ + t_physical_tile_type_ptr EMPTY_PHYSICAL_TILE_TYPE; + t_logical_block_type_ptr EMPTY_LOGICAL_BLOCK_TYPE; /* block_types are blocks that can be moved by the placer * such as: I/Os, CLBs, memories, multipliers, etc @@ -129,6 +132,10 @@ struct DeviceContext : public Context { std::vector physical_tile_types; std::vector logical_block_types; + /* Boolean that indicates whether the architecture implements an N:M + * physical tiles to logical blocks mapping */ + bool has_multiple_equivalent_tiles; + /******************************************************************* * Routing related ********************************************************************/ @@ -247,6 +254,9 @@ struct PlacementContext : public Context { //Clustered block placement locations vtr::vector_map block_locs; + //Clustered pin placement mapping with physical pin + vtr::vector_map physical_pins; + //Clustered block associated with each grid location (i.e. 
inverse of block_locs) vtr::Matrix grid_blocks; //[0..device_ctx.grid.width()-1][0..device_ctx.grid.width()-1] diff --git a/vpr/src/base/vpr_types.h b/vpr/src/base/vpr_types.h index f1c74839f64..e6c010f7398 100644 --- a/vpr/src/base/vpr_types.h +++ b/vpr/src/base/vpr_types.h @@ -625,13 +625,11 @@ struct t_place_region { * x: x-coordinate * y: y-coordinate * z: occupancy coordinate - * is_fixed: true if this block's position is fixed by the user and shouldn't be moved during annealing - * nets_and_pins_synced_to_z_coordinate: true if the associated clb's pins have been synced to the z location (i.e. after placement) */ + * is_fixed: true if this block's position is fixed by the user and shouldn't be moved during annealing */ struct t_block_loc { t_pl_loc loc; bool is_fixed = false; - bool nets_and_pins_synced_to_z_coordinate = false; }; /* Stores the clustered blocks placed at a particular grid location */ diff --git a/vpr/src/draw/draw.cpp b/vpr/src/draw/draw.cpp index 1e099500fc1..d29ede76133 100644 --- a/vpr/src/draw/draw.cpp +++ b/vpr/src/draw/draw.cpp @@ -2654,15 +2654,18 @@ void draw_highlight_blocks_color(t_logical_block_type_ptr type, ClusterBlockId b t_draw_state* draw_state = get_draw_state_vars(); auto& cluster_ctx = g_vpr_ctx.clustering(); - for (k = 0; k < physical_tile_type(type)->num_pins; k++) { /* Each pin on a CLB */ + for (k = 0; k < type->pb_type->num_pins; k++) { /* Each pin on a CLB */ ClusterNetId net_id = cluster_ctx.clb_nlist.block_net(blk_id, k); if (net_id == ClusterNetId::INVALID()) continue; - iclass = physical_tile_type(type)->pin_class[k]; + auto physical_tile = physical_tile_type(blk_id); + int physical_pin = get_physical_pin(physical_tile, type, k); - if (physical_tile_type(type)->class_inf[iclass].type == DRIVER) { /* Fanout */ + iclass = physical_tile->pin_class[physical_pin]; + + if (physical_tile->class_inf[iclass].type == DRIVER) { /* Fanout */ if (draw_state->block_color[blk_id] == SELECTED_COLOR) { /* If block already 
highlighted, de-highlight the fanout. (the deselect case)*/ draw_state->net_color[net_id] = ezgl::BLACK; @@ -2712,7 +2715,8 @@ void deselect_all() { /* Create some colour highlighting */ for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { - draw_reset_blk_color(blk_id); + if (blk_id != ClusterBlockId::INVALID()) + draw_reset_blk_color(blk_id); } for (auto net_id : cluster_ctx.clb_nlist.nets()) @@ -2726,9 +2730,13 @@ void deselect_all() { } static void draw_reset_blk_color(ClusterBlockId blk_id) { + auto& clb_nlist = g_vpr_ctx.clustering().clb_nlist; + + auto logical_block = clb_nlist.block_type(blk_id); + t_draw_state* draw_state = get_draw_state_vars(); - draw_state->block_color[blk_id] = get_block_type_color(physical_tile_type(blk_id)); + draw_state->block_color[blk_id] = get_block_type_color(pick_best_physical_type(logical_block)); } /** @@ -3305,10 +3313,8 @@ static void draw_block_pin_util() { continue; } - t_pb_type* pb_type = logical_block_type(&type)->pb_type; - - total_input_pins[&type] = pb_type->num_input_pins + pb_type->num_clock_pins; - total_output_pins[&type] = pb_type->num_output_pins; + total_input_pins[&type] = type.num_input_pins + type.num_clock_pins; + total_output_pins[&type] = type.num_output_pins; } auto blks = cluster_ctx.clb_nlist.blocks(); @@ -3688,7 +3694,7 @@ static void highlight_blocks(double x, double y) { } } - if (clb_index == EMPTY_BLOCK_ID) { + if (clb_index == EMPTY_BLOCK_ID || clb_index == ClusterBlockId::INVALID()) { //Nothing found return; } diff --git a/vpr/src/draw/draw_types.cpp b/vpr/src/draw/draw_types.cpp index 7282aef73f9..5d6e4ca7a83 100644 --- a/vpr/src/draw/draw_types.cpp +++ b/vpr/src/draw/draw_types.cpp @@ -95,7 +95,7 @@ ezgl::rectangle t_draw_coords::get_absolute_clb_bbox(const ClusterBlockId clb_in ezgl::rectangle t_draw_coords::get_absolute_clb_bbox(int grid_x, int grid_y, int sub_block_index) { auto& device_ctx = g_vpr_ctx.device(); - return get_pb_bbox(grid_x, grid_y, sub_block_index, 
*logical_block_type(device_ctx.grid[grid_x][grid_y].type)->pb_graph_head); + return get_pb_bbox(grid_x, grid_y, sub_block_index, *pick_best_logical_type(device_ctx.grid[grid_x][grid_y].type)->pb_graph_head); } #endif // NO_GRAPHICS diff --git a/vpr/src/draw/intra_logic_block.cpp b/vpr/src/draw/intra_logic_block.cpp index 40eb8aa0ad9..a3771f7fed5 100644 --- a/vpr/src/draw/intra_logic_block.cpp +++ b/vpr/src/draw/intra_logic_block.cpp @@ -68,7 +68,7 @@ void draw_internal_alloc_blk() { draw_coords->blk_info.resize(device_ctx.logical_block_types.size()); for (const auto& type : device_ctx.logical_block_types) { - if (physical_tile_type(&type) == device_ctx.EMPTY_TYPE) { + if (&type == device_ctx.EMPTY_LOGICAL_BLOCK_TYPE) { continue; } @@ -92,10 +92,12 @@ void draw_internal_init_blk() { auto& device_ctx = g_vpr_ctx.device(); for (const auto& type : device_ctx.physical_tile_types) { /* Empty block has no sub_blocks */ - if (&type == device_ctx.EMPTY_TYPE) + if (is_empty_type(&type)) { continue; + } - pb_graph_head_node = logical_block_type(&type)->pb_graph_head; + auto logical_block = pick_best_logical_type(&type); + pb_graph_head_node = logical_block->pb_graph_head; int type_descriptor_index = type.index; int num_sub_tiles = type.capacity; @@ -129,7 +131,7 @@ void draw_internal_init_blk() { clb_bbox.width(), clb_bbox.height()); /* Determine the max number of sub_block levels in the FPGA */ - draw_state->max_sub_blk_lvl = std::max(draw_internal_find_max_lvl(*logical_block_type(&type)->pb_type), + draw_state->max_sub_blk_lvl = std::max(draw_internal_find_max_lvl(*logical_block->pb_type), draw_state->max_sub_blk_lvl); } } @@ -151,7 +153,7 @@ void draw_internal_draw_subblk(ezgl::renderer* g) { continue; /* Don't draw if tile is empty. This includes corners. 
*/ - if (device_ctx.grid[i][j].type == device_ctx.EMPTY_TYPE) + if (device_ctx.grid[i][j].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE) continue; int num_sub_tiles = device_ctx.grid[i][j].type->capacity; diff --git a/vpr/src/pack/cluster.cpp b/vpr/src/pack/cluster.cpp index 20086355b95..fa9d4d8f832 100644 --- a/vpr/src/pack/cluster.cpp +++ b/vpr/src/pack/cluster.cpp @@ -451,7 +451,7 @@ std::map do_clustering(const t_packer_opts& pa num_molecules = count_molecules(molecule_head); for (const auto& type : device_ctx.logical_block_types) { - if (device_ctx.EMPTY_TYPE == physical_tile_type(&type)) + if (is_empty_type(&type)) continue; cur_cluster_size = get_max_primitives_in_pb_type(type.pb_type); @@ -1963,8 +1963,16 @@ static void start_new_cluster(t_cluster_placement_stats* cluster_placement_stats //support the same primitive(s). std::stable_sort(candidate_types.begin(), candidate_types.end(), [&](t_logical_block_type_ptr lhs, t_logical_block_type_ptr rhs) { - float lhs_util = vtr::safe_ratio(num_used_type_instances[lhs], device_ctx.grid.num_instances(physical_tile_type(lhs))); - float rhs_util = vtr::safe_ratio(num_used_type_instances[rhs], device_ctx.grid.num_instances(physical_tile_type(rhs))); + int lhs_num_instances = 0; + int rhs_num_instances = 0; + // Count number of instances for each type + for (auto type : lhs->equivalent_tiles) + lhs_num_instances += device_ctx.grid.num_instances(type); + for (auto type : rhs->equivalent_tiles) + rhs_num_instances += device_ctx.grid.num_instances(type); + + float lhs_util = vtr::safe_ratio(num_used_type_instances[lhs], lhs_num_instances); + float rhs_util = vtr::safe_ratio(num_used_type_instances[rhs], rhs_num_instances); //Lower util first return lhs_util < rhs_util; }); @@ -2053,10 +2061,17 @@ static void start_new_cluster(t_cluster_placement_stats* cluster_placement_stats VTR_ASSERT(success); //Successfully create cluster - num_used_type_instances[clb_nlist->block_type(clb_index)]++; + auto block_type = 
clb_nlist->block_type(clb_index); + num_used_type_instances[block_type]++; /* Expand FPGA size if needed */ - if (num_used_type_instances[clb_nlist->block_type(clb_index)] > device_ctx.grid.num_instances(physical_tile_type(clb_index))) { + // Check used type instances against the possible equivalent physical locations + unsigned int num_instances = 0; + for (auto equivalent_tile : block_type->equivalent_tiles) { + num_instances += device_ctx.grid.num_instances(equivalent_tile); + } + + if (num_used_type_instances[block_type] > num_instances) { device_ctx.grid = create_device_grid(device_layout_name, arch->grid_layouts, num_used_type_instances, target_device_utilization); VTR_LOGV(verbosity > 0, "Not enough resources expand FPGA size to (%d x %d)\n", device_ctx.grid.width(), device_ctx.grid.height()); diff --git a/vpr/src/pack/cluster_placement.cpp b/vpr/src/pack/cluster_placement.cpp index 807908f3c57..36a78bec6e8 100644 --- a/vpr/src/pack/cluster_placement.cpp +++ b/vpr/src/pack/cluster_placement.cpp @@ -63,7 +63,7 @@ t_cluster_placement_stats* alloc_and_load_cluster_placement_stats() { cluster_placement_stats_list = (t_cluster_placement_stats*)vtr::calloc(device_ctx.logical_block_types.size(), sizeof(t_cluster_placement_stats)); for (const auto& type : device_ctx.logical_block_types) { - if (device_ctx.EMPTY_TYPE != physical_tile_type(&type)) { + if (!is_empty_type(&type)) { cluster_placement_stats_list[type.index].valid_primitives = (t_cluster_placement_primitive**)vtr::calloc( get_max_primitives_in_pb_type(type.pb_type) + 1, sizeof(t_cluster_placement_primitive*)); /* too much memory allocated but shouldn't be a problem */ diff --git a/vpr/src/pack/lb_type_rr_graph.cpp b/vpr/src/pack/lb_type_rr_graph.cpp index 8f2763d53df..cc600ca47f7 100644 --- a/vpr/src/pack/lb_type_rr_graph.cpp +++ b/vpr/src/pack/lb_type_rr_graph.cpp @@ -58,7 +58,7 @@ std::vector* alloc_and_load_all_lb_type_rr_graph() { for (const auto& type : device_ctx.logical_block_types) { int itype = 
type.index; - if (physical_tile_type(&type) != device_ctx.EMPTY_TYPE) { + if (&type != device_ctx.EMPTY_LOGICAL_BLOCK_TYPE) { alloc_and_load_lb_type_rr_graph_for_type(&type, lb_type_rr_graphs[itype]); /* Now that the data is loaded, reallocate to the precise amount of memory needed to prevent insidious bugs */ @@ -75,7 +75,7 @@ void free_all_lb_type_rr_graph(std::vector* lb_type_rr_graphs for (const auto& type : device_ctx.logical_block_types) { int itype = type.index; - if (physical_tile_type(&type) != device_ctx.EMPTY_TYPE) { + if (!is_empty_type(&type)) { int graph_size = lb_type_rr_graphs[itype].size(); for (int inode = 0; inode < graph_size; inode++) { t_lb_type_rr_node* node = &lb_type_rr_graphs[itype][inode]; @@ -133,7 +133,7 @@ void echo_lb_type_rr_graphs(char* filename, std::vector* lb_t auto& device_ctx = g_vpr_ctx.device(); for (const auto& type : device_ctx.logical_block_types) { - if (physical_tile_type(&type) != device_ctx.EMPTY_TYPE) { + if (!is_empty_type(&type)) { fprintf(fp, "--------------------------------------------------------------\n"); fprintf(fp, "Intra-Logic Block Routing Resource For Type %s\n", type.name); fprintf(fp, "--------------------------------------------------------------\n"); diff --git a/vpr/src/pack/output_clustering.cpp b/vpr/src/pack/output_clustering.cpp index 9987be6a32c..bb8a98ae5d0 100644 --- a/vpr/src/pack/output_clustering.cpp +++ b/vpr/src/pack/output_clustering.cpp @@ -63,18 +63,24 @@ static void print_stats() { /* Counters used only for statistics purposes. 
*/ for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { - auto type = physical_tile_type(blk_id); - for (ipin = 0; ipin < type->num_pins; ipin++) { + auto logical_block = cluster_ctx.clb_nlist.block_type(blk_id); + auto physical_tile = pick_best_physical_type(logical_block); + for (ipin = 0; ipin < logical_block->pb_type->num_pins; ipin++) { + int physical_pin = get_physical_pin(physical_tile, logical_block, ipin); + auto pin_class = physical_tile->pin_class[physical_pin]; + auto pin_class_inf = physical_tile->class_inf[pin_class]; + if (cluster_ctx.clb_nlist.block_pb(blk_id)->pb_route.empty()) { ClusterNetId clb_net_id = cluster_ctx.clb_nlist.block_net(blk_id, ipin); if (clb_net_id != ClusterNetId::INVALID()) { auto net_id = atom_ctx.lookup.atom_net(clb_net_id); VTR_ASSERT(net_id); nets_absorbed[net_id] = false; - if (type->class_inf[type->pin_class[ipin]].type == RECEIVER) { - num_clb_inputs_used[type->index]++; - } else if (type->class_inf[type->pin_class[ipin]].type == DRIVER) { - num_clb_outputs_used[type->index]++; + + if (pin_class_inf.type == RECEIVER) { + num_clb_inputs_used[logical_block->index]++; + } else if (pin_class_inf.type == DRIVER) { + num_clb_outputs_used[logical_block->index]++; } } } else { @@ -86,16 +92,16 @@ static void print_stats() { auto atom_net_id = pb->pb_route[pb_graph_pin_id].atom_net_id; if (atom_net_id) { nets_absorbed[atom_net_id] = false; - if (type->class_inf[type->pin_class[ipin]].type == RECEIVER) { - num_clb_inputs_used[type->index]++; - } else if (type->class_inf[type->pin_class[ipin]].type == DRIVER) { - num_clb_outputs_used[type->index]++; + if (pin_class_inf.type == RECEIVER) { + num_clb_inputs_used[logical_block->index]++; + } else if (pin_class_inf.type == DRIVER) { + num_clb_outputs_used[logical_block->index]++; } } } } } - num_clb_types[type->index]++; + num_clb_types[logical_block->index]++; } for (itype = 0; itype < device_ctx.logical_block_types.size(); itype++) { diff --git a/vpr/src/pack/pack.cpp 
b/vpr/src/pack/pack.cpp index 0dfd4349126..612d0df24d9 100644 --- a/vpr/src/pack/pack.cpp +++ b/vpr/src/pack/pack.cpp @@ -169,7 +169,12 @@ bool try_pack(t_packer_opts* packer_opts, } resource_reqs += std::string(iter->first->name) + ": " + std::to_string(iter->second); - resource_avail += std::string(iter->first->name) + ": " + std::to_string(grid.num_instances(physical_tile_type(iter->first))); + + int num_instances = 0; + for (auto type : iter->first->equivalent_tiles) + num_instances += grid.num_instances(type); + + resource_avail += std::string(iter->first->name) + ": " + std::to_string(num_instances); } VPR_FATAL_ERROR(VPR_ERROR_OTHER, "Failed to find device which satisifies resource requirements required: %s (available %s)", resource_reqs.c_str(), resource_avail.c_str()); @@ -274,14 +279,21 @@ static bool try_size_device_grid(const t_arch& arch, const std::map type_util; for (const auto& type : device_ctx.logical_block_types) { - auto physical_type = physical_tile_type(&type); + if (is_empty_type(&type)) continue; + auto itr = num_type_instances.find(&type); if (itr == num_type_instances.end()) continue; float num_instances = itr->second; float util = 0.; - if (device_ctx.grid.num_instances(physical_type) != 0) { - util = num_instances / device_ctx.grid.num_instances(physical_type); + + float num_total_instances = 0.; + for (const auto& equivalent_tile : type.equivalent_tiles) { + num_total_instances += device_ctx.grid.num_instances(equivalent_tile); + } + + if (num_total_instances != 0) { + util = num_instances / num_total_instances; } type_util[&type] = util; diff --git a/vpr/src/pack/pack_report.cpp b/vpr/src/pack/pack_report.cpp index a0f920b6df1..c571a2737d9 100644 --- a/vpr/src/pack/pack_report.cpp +++ b/vpr/src/pack/pack_report.cpp @@ -15,22 +15,22 @@ void report_packing_pin_usage(std::ostream& os, const VprContext& ctx) { auto& cluster_ctx = ctx.clustering(); auto& device_ctx = ctx.device(); - std::map total_input_pins; - std::map total_output_pins; - 
for (auto const& type : device_ctx.physical_tile_types) { + std::map total_input_pins; + std::map total_output_pins; + for (auto const& type : device_ctx.logical_block_types) { if (is_empty_type(&type)) continue; - t_pb_type* pb_type = logical_block_type(&type)->pb_type; + t_pb_type* pb_type = type.pb_type; total_input_pins[&type] = pb_type->num_input_pins + pb_type->num_clock_pins; total_output_pins[&type] = pb_type->num_output_pins; } - std::map> inputs_used; - std::map> outputs_used; + std::map> inputs_used; + std::map> outputs_used; for (auto blk : cluster_ctx.clb_nlist.blocks()) { - t_physical_tile_type_ptr type = physical_tile_type(blk); + t_logical_block_type_ptr type = cluster_ctx.clb_nlist.block_type(blk); inputs_used[type].push_back(cluster_ctx.clb_nlist.block_input_pins(blk).size() + cluster_ctx.clb_nlist.block_clock_pins(blk).size()); outputs_used[type].push_back(cluster_ctx.clb_nlist.block_output_pins(blk).size()); @@ -40,8 +40,8 @@ void report_packing_pin_usage(std::ostream& os, const VprContext& ctx) { os << std::fixed << std::setprecision(2); - for (auto const& physical_type : device_ctx.physical_tile_types) { - auto type = &physical_type; + for (auto const& logical_type : device_ctx.logical_block_types) { + auto type = &logical_type; if (is_empty_type(type)) continue; if (!inputs_used.count(type)) continue; diff --git a/vpr/src/pack/pb_type_graph.cpp b/vpr/src/pack/pb_type_graph.cpp index d5c7324a60d..8d60c3072a5 100644 --- a/vpr/src/pack/pb_type_graph.cpp +++ b/vpr/src/pack/pb_type_graph.cpp @@ -133,7 +133,7 @@ void alloc_and_load_all_pb_graphs(bool load_power_structures) { load_pin_classes_in_pb_graph_head(type.pb_graph_head); } else { type.pb_graph_head = nullptr; - VTR_ASSERT(physical_tile_type(&type) == device_ctx.EMPTY_TYPE); + VTR_ASSERT(&type == device_ctx.EMPTY_LOGICAL_BLOCK_TYPE); } } diff --git a/vpr/src/place/compressed_grid.cpp b/vpr/src/place/compressed_grid.cpp index b0f5ba39d13..9f3a6219eb4 100644 --- 
a/vpr/src/place/compressed_grid.cpp +++ b/vpr/src/place/compressed_grid.cpp @@ -6,19 +6,21 @@ std::vector create_compressed_block_grids() { auto& grid = device_ctx.grid; //Collect the set of x/y locations for each instance of a block type - std::vector>> block_locations(device_ctx.physical_tile_types.size()); for (size_t x = 0; x < grid.width(); ++x) { for (size_t y = 0; y < grid.height(); ++y) { const t_grid_tile& tile = grid[x][y]; if (tile.width_offset == 0 && tile.height_offset == 0) { - //Only record at block root location - block_locations[tile.type->index].emplace_back(x, y); + for (auto& block : tile.type->equivalent_sites) { + //Only record at block root location + block_locations[block->index].emplace_back(x, y); + } } } } - std::vector compressed_type_grids(device_ctx.physical_tile_types.size()); - for (const auto& type : device_ctx.physical_tile_types) { + std::vector compressed_type_grids(device_ctx.logical_block_types.size()); + for (const auto& type : device_ctx.logical_block_types) { compressed_type_grids[type.index] = create_compressed_block_grid(block_locations[type.index]); } diff --git a/vpr/src/place/initial_placement.cpp b/vpr/src/place/initial_placement.cpp new file mode 100644 index 00000000000..8a1f0e12e94 --- /dev/null +++ b/vpr/src/place/initial_placement.cpp @@ -0,0 +1,438 @@ +#include "vtr_memory.h" +#include "vtr_random.h" + +#include "globals.h" +#include "read_place.h" +#include "initial_placement.h" + +/* The maximum number of tries when trying to place a carry chain at a * + * random location before trying exhaustive placement - find the first * + * legal position and place it during initial placement. 
*/ +#define MAX_NUM_TRIES_TO_PLACE_MACROS_RANDOMLY 4 + +static t_pl_loc** legal_pos = nullptr; /* [0..device_ctx.num_block_types-1][0..type_tsize - 1] */ +static int* num_legal_pos = nullptr; /* [0..device_ctx.num_block_types-1] */ + +static void alloc_legal_placement_locations(); +static void load_legal_placement_locations(); + +static void free_legal_placement_locations(); + +static int check_macro_can_be_placed(t_pl_macro pl_macro, int itype, t_pl_loc head_pos); +static int try_place_macro(int itype, int ipos, t_pl_macro pl_macro); +static void initial_placement_pl_macros(int macros_max_num_tries, int* free_locations); + +static void initial_placement_blocks(int* free_locations, enum e_pad_loc_type pad_loc_type); +static void initial_placement_location(const int* free_locations, int& ipos, int itype, t_pl_loc& to); + +static t_physical_tile_type_ptr pick_placement_type(t_logical_block_type_ptr logical_block, + int num_needed_types, + int* free_locations); + +static void alloc_legal_placement_locations() { + auto& device_ctx = g_vpr_ctx.device(); + auto& place_ctx = g_vpr_ctx.mutable_placement(); + + legal_pos = new t_pl_loc*[device_ctx.physical_tile_types.size()]; + num_legal_pos = (int*)vtr::calloc(device_ctx.physical_tile_types.size(), sizeof(int)); + + /* Initialize all occupancy to zero. 
*/ + + for (size_t i = 0; i < device_ctx.grid.width(); i++) { + for (size_t j = 0; j < device_ctx.grid.height(); j++) { + place_ctx.grid_blocks[i][j].usage = 0; + + for (int k = 0; k < device_ctx.grid[i][j].type->capacity; k++) { + if (place_ctx.grid_blocks[i][j].blocks[k] != INVALID_BLOCK_ID) { + place_ctx.grid_blocks[i][j].blocks[k] = EMPTY_BLOCK_ID; + if (device_ctx.grid[i][j].width_offset == 0 && device_ctx.grid[i][j].height_offset == 0) { + num_legal_pos[device_ctx.grid[i][j].type->index]++; + } + } + } + } + } + + for (const auto& type : device_ctx.physical_tile_types) { + legal_pos[type.index] = new t_pl_loc[num_legal_pos[type.index]]; + } +} + +static void load_legal_placement_locations() { + auto& device_ctx = g_vpr_ctx.device(); + auto& place_ctx = g_vpr_ctx.placement(); + + int* index = (int*)vtr::calloc(device_ctx.physical_tile_types.size(), sizeof(int)); + + for (size_t i = 0; i < device_ctx.grid.width(); i++) { + for (size_t j = 0; j < device_ctx.grid.height(); j++) { + for (int k = 0; k < device_ctx.grid[i][j].type->capacity; k++) { + if (place_ctx.grid_blocks[i][j].blocks[k] == INVALID_BLOCK_ID) { + continue; + } + if (device_ctx.grid[i][j].width_offset == 0 && device_ctx.grid[i][j].height_offset == 0) { + int itype = device_ctx.grid[i][j].type->index; + legal_pos[itype][index[itype]].x = i; + legal_pos[itype][index[itype]].y = j; + legal_pos[itype][index[itype]].z = k; + index[itype]++; + } + } + } + } + free(index); +} + +static void free_legal_placement_locations() { + auto& device_ctx = g_vpr_ctx.device(); + + for (unsigned int i = 0; i < device_ctx.physical_tile_types.size(); i++) { + delete[] legal_pos[i]; + } + delete[] legal_pos; /* Free the mapping list */ + free(num_legal_pos); +} + +static int check_macro_can_be_placed(t_pl_macro pl_macro, int itype, t_pl_loc head_pos) { + auto& device_ctx = g_vpr_ctx.device(); + auto& place_ctx = g_vpr_ctx.placement(); + + // Every macro can be placed until proven otherwise + int macro_can_be_placed = 
true; + + // Check whether all the members can be placed + for (size_t imember = 0; imember < pl_macro.members.size(); imember++) { + t_pl_loc member_pos = head_pos + pl_macro.members[imember].offset; + + // Check whether the location could accept a block of this type + // Then check whether the location could still accommodate more blocks + // Also check whether the member position is valid, that is, the member's location is + // still within the chip's dimensions and the member_z is allowed at that location on the grid + if (member_pos.x < int(device_ctx.grid.width()) && member_pos.y < int(device_ctx.grid.height()) + && device_ctx.grid[member_pos.x][member_pos.y].type->index == itype + && place_ctx.grid_blocks[member_pos.x][member_pos.y].blocks[member_pos.z] == EMPTY_BLOCK_ID) { + // Can still accommodate blocks here, check the next position + continue; + } else { + // Can't be placed here - skip to the next try + macro_can_be_placed = false; + break; + } + } + + return (macro_can_be_placed); +} + +static int try_place_macro(int itype, int ipos, t_pl_macro pl_macro) { + auto& place_ctx = g_vpr_ctx.mutable_placement(); + + int macro_placed = false; + + // Choose a random position for the head + t_pl_loc head_pos = legal_pos[itype][ipos]; + + // If that location is occupied, do nothing. 
+ if (place_ctx.grid_blocks[head_pos.x][head_pos.y].blocks[head_pos.z] != EMPTY_BLOCK_ID) { + return (macro_placed); + } + + int macro_can_be_placed = check_macro_can_be_placed(pl_macro, itype, head_pos); + + if (macro_can_be_placed) { + // Place down the macro + macro_placed = true; + for (size_t imember = 0; imember < pl_macro.members.size(); imember++) { + t_pl_loc member_pos = head_pos + pl_macro.members[imember].offset; + + ClusterBlockId iblk = pl_macro.members[imember].blk_index; + place_ctx.block_locs[iblk].loc = member_pos; + + place_ctx.grid_blocks[member_pos.x][member_pos.y].blocks[member_pos.z] = pl_macro.members[imember].blk_index; + place_ctx.grid_blocks[member_pos.x][member_pos.y].usage++; + + // The randomiser could pick this location again, so a lazy removal is done instead: + // whenever a block that cannot be placed is encountered, + // it is removed from the legal_pos[][] array + + } // Finish placing all the members in the macro + + } // End of this choice of legal_pos + + return (macro_placed); +} + +static void initial_placement_pl_macros(int macros_max_num_tries, int* free_locations) { + int macro_placed; + int itype, itry, ipos; + ClusterBlockId blk_id; + + auto& cluster_ctx = g_vpr_ctx.clustering(); + auto& device_ctx = g_vpr_ctx.device(); + auto& place_ctx = g_vpr_ctx.placement(); + + auto& pl_macros = place_ctx.pl_macros; + + // Sort the macros so that the most constrained ones (fewest equivalent tiles) are placed first + std::vector sorted_pl_macros(pl_macros.begin(), pl_macros.end()); + + auto criteria = [&cluster_ctx](const t_pl_macro lhs, t_pl_macro rhs) { + auto lhs_logical_block = cluster_ctx.clb_nlist.block_type(lhs.members[0].blk_index); + auto rhs_logical_block = cluster_ctx.clb_nlist.block_type(rhs.members[0].blk_index); + + auto lhs_num_tiles = lhs_logical_block->equivalent_tiles.size(); + auto rhs_num_tiles = rhs_logical_block->equivalent_tiles.size(); + + return lhs_num_tiles < rhs_num_tiles; + }; + + if 
(device_ctx.has_multiple_equivalent_tiles) { + std::sort(sorted_pl_macros.begin(), sorted_pl_macros.end(), criteria); + } + + /* Macros are harder to place. Do them first */ + for (auto pl_macro : sorted_pl_macros) { + // No macro is placed at the beginning + macro_placed = false; + + // Assume that all the blocks in the macro are of the same type + blk_id = pl_macro.members[0].blk_index; + auto logical_block = cluster_ctx.clb_nlist.block_type(blk_id); + auto type = pick_placement_type(logical_block, int(pl_macro.members.size()), free_locations); + + if (type == nullptr) { + VPR_FATAL_ERROR(VPR_ERROR_PLACE, + "Initial placement failed.\n" + "Could not place macro length %zu with head block %s (#%zu); not enough free locations of type %s (#%d).\n" + "VPR cannot auto-size for your circuit, please resize the FPGA manually.\n", + pl_macro.members.size(), cluster_ctx.clb_nlist.block_name(blk_id).c_str(), size_t(blk_id), logical_block->name, logical_block->index); + } + + itype = type->index; + + // Try to place the macro at random positions first; retry up to macros_max_num_tries times + for (itry = 0; itry < macros_max_num_tries && macro_placed == false; itry++) { + // Choose a random position for the head + ipos = vtr::irand(free_locations[itype] - 1); + + // Try to place the macro + macro_placed = try_place_macro(itype, ipos, pl_macro); + + } // Finished all tries + + if (macro_placed == false) { + // if a macro still could not be placed after macros_max_num_tries times, + // go through the chip exhaustively to find a legal placement for the macro + // place the macro on the first location that is legal + // then set macro_placed = true; + // if there are no legal positions, error out + + // Exhaustive placement of carry macros + for (ipos = 0; ipos < free_locations[itype] && macro_placed == false; ipos++) { + // Try to place the macro + macro_placed = try_place_macro(itype, ipos, pl_macro); + + } // Exhausted all the legal placement positions for this macro + + // 
If macro could not be placed after exhaustive placement, error out + if (macro_placed == false) { + // Error out + VPR_FATAL_ERROR(VPR_ERROR_PLACE, + "Initial placement failed.\n" + "Could not place macro length %zu with head block %s (#%zu); not enough free locations of type %s (#%d).\n" + "Please manually size the FPGA because VPR can't do this yet.\n", + pl_macro.members.size(), cluster_ctx.clb_nlist.block_name(blk_id).c_str(), size_t(blk_id), device_ctx.physical_tile_types[itype].name, itype); + } + + } else { + // This macro has been placed successfully, proceed to place the next macro + continue; + } + } // Finish placing all the pl_macros successfully +} + +/* Place blocks that are NOT a part of any macro. + * We'll randomly place each block in the clustered netlist, one by one. */ +static void initial_placement_blocks(int* free_locations, enum e_pad_loc_type pad_loc_type) { + int itype, ipos; + auto& cluster_ctx = g_vpr_ctx.clustering(); + auto& device_ctx = g_vpr_ctx.device(); + auto& place_ctx = g_vpr_ctx.mutable_placement(); + + auto blocks = cluster_ctx.clb_nlist.blocks(); + + // Sorting blocks to place to have most constricted ones to be placed first + std::vector sorted_blocks(blocks.begin(), blocks.end()); + + auto criteria = [&cluster_ctx](const ClusterBlockId lhs, ClusterBlockId rhs) { + auto lhs_logical_block = cluster_ctx.clb_nlist.block_type(lhs); + auto rhs_logical_block = cluster_ctx.clb_nlist.block_type(rhs); + + auto lhs_num_tiles = lhs_logical_block->equivalent_tiles.size(); + auto rhs_num_tiles = rhs_logical_block->equivalent_tiles.size(); + + return lhs_num_tiles < rhs_num_tiles; + }; + + if (device_ctx.has_multiple_equivalent_tiles) { + std::sort(sorted_blocks.begin(), sorted_blocks.end(), criteria); + } + + for (auto blk_id : sorted_blocks) { + if (place_ctx.block_locs[blk_id].loc.x != -1) { // -1 is a sentinel for an empty block + // block placed. 
+ continue; + } + + auto logical_block = cluster_ctx.clb_nlist.block_type(blk_id); + + /* Don't do IOs if the user specifies IOs; we'll read those locations later. */ + if (!(is_io_type(pick_best_physical_type(logical_block)) && pad_loc_type == USER)) { + /* Randomly select a free location of the appropriate type for blk_id. + * We have a linearized list of all the free locations that can + * accommodate a block of that type in free_locations[itype]. + * Choose one randomly and put blk_id there. Then we don't want to pick + * that location again, so remove it from the free_locations array. + */ + + auto type = pick_placement_type(logical_block, 1, free_locations); + + if (type == nullptr) { + VPR_FATAL_ERROR(VPR_ERROR_PLACE, + "Initial placement failed.\n" + "Could not place block %s (#%zu); no free locations of type %s (#%d).\n", + cluster_ctx.clb_nlist.block_name(blk_id).c_str(), size_t(blk_id), logical_block->name, logical_block->index); + } + + itype = type->index; + + t_pl_loc to; + initial_placement_location(free_locations, ipos, itype, to); + + // Make sure that the position is EMPTY_BLOCK before placing the block down + VTR_ASSERT(place_ctx.grid_blocks[to.x][to.y].blocks[to.z] == EMPTY_BLOCK_ID); + + place_ctx.grid_blocks[to.x][to.y].blocks[to.z] = blk_id; + place_ctx.grid_blocks[to.x][to.y].usage++; + + place_ctx.block_locs[blk_id].loc = to; + + //Mark IOs as fixed if specifying a (fixed) random placement + if (is_io_type(pick_best_physical_type(logical_block)) && pad_loc_type == RANDOM) { + place_ctx.block_locs[blk_id].is_fixed = true; + } + + /* Ensure randomizer doesn't pick this location again, since it's occupied. Could shift all the + * legal positions in legal_pos to remove the entry (choice) we just used, but faster to + * just move the last entry in legal_pos to the spot we just used and decrement the + * count of free_locations. 
*/ + legal_pos[itype][ipos] = legal_pos[itype][free_locations[itype] - 1]; /* overwrite used block position */ + free_locations[itype]--; + } + } +} + +static void initial_placement_location(const int* free_locations, int& ipos, int itype, t_pl_loc& to) { + ipos = vtr::irand(free_locations[itype] - 1); + to = legal_pos[itype][ipos]; +} + +static t_physical_tile_type_ptr pick_placement_type(t_logical_block_type_ptr logical_block, + int num_needed_types, + int* free_locations) { + // Loop through the ordered map to get tiles in a decreasing priority order + for (auto& tile : logical_block->equivalent_tiles) { + if (free_locations[tile->index] >= num_needed_types) { + return tile; + } + } + + return nullptr; +} + +void initial_placement(enum e_pad_loc_type pad_loc_type, + const char* pad_loc_file) { + /* Randomly places the blocks to create an initial placement. We rely on + * the legal_pos array already being loaded. That legal_pos[itype] is an + * array that gives every legal value of (x,y,z) that can accommodate a block. + * The number of such locations is given by num_legal_pos[itype]. + */ + + // Loading legal placement locations + alloc_legal_placement_locations(); + load_legal_placement_locations(); + + int itype, ipos; + int* free_locations; /* [0..device_ctx.num_block_types-1]. + * Stores how many locations there are for this type that *might* still be free. + * That is, this stores the number of entries in legal_pos[itype] that are worth considering + * as you look for a free location. + */ + auto& device_ctx = g_vpr_ctx.device(); + auto& cluster_ctx = g_vpr_ctx.clustering(); + auto& place_ctx = g_vpr_ctx.mutable_placement(); + + free_locations = (int*)vtr::malloc(device_ctx.physical_tile_types.size() * sizeof(int)); + for (const auto& type : device_ctx.physical_tile_types) { + itype = type.index; + free_locations[itype] = num_legal_pos[itype]; + } + + /* We'll use the grid to record where everything goes. 
Initialize it so that no + * blocks are placed anywhere. + */ + for (size_t i = 0; i < device_ctx.grid.width(); i++) { + for (size_t j = 0; j < device_ctx.grid.height(); j++) { + place_ctx.grid_blocks[i][j].usage = 0; + itype = device_ctx.grid[i][j].type->index; + for (int k = 0; k < device_ctx.physical_tile_types[itype].capacity; k++) { + if (place_ctx.grid_blocks[i][j].blocks[k] != INVALID_BLOCK_ID) { + place_ctx.grid_blocks[i][j].blocks[k] = EMPTY_BLOCK_ID; + } + } + } + } + + /* Similarly, mark all blocks as not being placed yet. */ + for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { + place_ctx.block_locs[blk_id].loc = t_pl_loc(); + } + + initial_placement_pl_macros(MAX_NUM_TRIES_TO_PLACE_MACROS_RANDOMLY, free_locations); + + // All the macros are placed; update the legal_pos[][] array + for (const auto& type : device_ctx.physical_tile_types) { + itype = type.index; + VTR_ASSERT(free_locations[itype] >= 0); + for (ipos = 0; ipos < free_locations[itype]; ipos++) { + t_pl_loc pos = legal_pos[itype][ipos]; + + // Check if that location is occupied. 
If it is, remove from legal_pos + if (place_ctx.grid_blocks[pos.x][pos.y].blocks[pos.z] != EMPTY_BLOCK_ID && place_ctx.grid_blocks[pos.x][pos.y].blocks[pos.z] != INVALID_BLOCK_ID) { + legal_pos[itype][ipos] = legal_pos[itype][free_locations[itype] - 1]; + free_locations[itype]--; + + // After the move, I need to check this particular entry again + ipos--; + continue; + } + } + } // Finish updating the legal_pos[][] and free_locations[] array + + initial_placement_blocks(free_locations, pad_loc_type); + + if (pad_loc_type == USER) { + read_user_pad_loc(pad_loc_file); + } + + /* Restore legal_pos */ + load_legal_placement_locations(); + +#ifdef VERBOSE + VTR_LOG("At end of initial_placement.\n"); + if (getEchoEnabled() && isEchoFileEnabled(E_ECHO_INITIAL_CLB_PLACEMENT)) { + print_clb_placement(getEchoFileName(E_ECHO_INITIAL_CLB_PLACEMENT)); + } +#endif + free(free_locations); + free_legal_placement_locations(); +} diff --git a/vpr/src/place/initial_placement.h b/vpr/src/place/initial_placement.h new file mode 100644 index 00000000000..ec2ad38f326 --- /dev/null +++ b/vpr/src/place/initial_placement.h @@ -0,0 +1,9 @@ +#ifndef VPR_INITIAL_PLACEMENT_H +#define VPR_INITIAL_PLACEMENT_H + +#include "vpr_types.h" + +void initial_placement(enum e_pad_loc_type pad_loc_type, + const char* pad_loc_file); + +#endif diff --git a/vpr/src/place/move_utils.cpp b/vpr/src/place/move_utils.cpp index e9751f684d4..ef00c65cbbe 100644 --- a/vpr/src/place/move_utils.cpp +++ b/vpr/src/place/move_utils.cpp @@ -113,6 +113,8 @@ e_block_move_result record_single_block_swap(t_pl_blocks_to_be_moved& blocks_aff ClusterBlockId b_to = place_ctx.grid_blocks[to.x][to.y].blocks[to.z]; + t_pl_loc curr_from = place_ctx.block_locs[b_from].loc; + e_block_move_result outcome = e_block_move_result::VALID; // Check whether the to_location is empty @@ -121,6 +123,13 @@ e_block_move_result record_single_block_swap(t_pl_blocks_to_be_moved& blocks_aff outcome = record_block_move(blocks_affected, b_from, to); } else 
if (b_to != INVALID_BLOCK_ID) { + // Check whether block to is compatible with from location + if (b_to != EMPTY_BLOCK_ID && b_to != INVALID_BLOCK_ID) { + if (!(is_legal_swap_to_location(b_to, curr_from))) { + return e_block_move_result::ABORT; + } + } + // Sets up the blocks moved outcome = record_block_move(blocks_affected, b_from, to); @@ -253,10 +262,18 @@ e_block_move_result record_macro_macro_swaps(t_pl_blocks_to_be_moved& blocks_aff ClusterBlockId b_from = place_ctx.pl_macros[imacro_from].members[imember_from].blk_index; t_pl_loc curr_to = place_ctx.block_locs[b_from].loc + swap_offset; + t_pl_loc curr_from = place_ctx.block_locs[b_from].loc; ClusterBlockId b_to = place_ctx.pl_macros[imacro_to].members[imember_to].blk_index; VTR_ASSERT_SAFE(curr_to == place_ctx.block_locs[b_to].loc); + // Check whether block to is compatible with from location + if (b_to != EMPTY_BLOCK_ID && b_to != INVALID_BLOCK_ID) { + if (!(is_legal_swap_to_location(b_to, curr_from))) { + return e_block_move_result::ABORT; + } + } + if (!is_legal_swap_to_location(b_from, curr_to)) { log_move_abort("macro_from swap to location illegal"); return e_block_move_result::ABORT; @@ -417,11 +434,12 @@ bool is_legal_swap_to_location(ClusterBlockId blk, t_pl_loc to) { //(neccessarily) translationally invariant for an arbitrary macro auto& device_ctx = g_vpr_ctx.device(); + auto& cluster_ctx = g_vpr_ctx.clustering(); if (to.x < 0 || to.x >= int(device_ctx.grid.width()) || to.y < 0 || to.y >= int(device_ctx.grid.height()) || to.z < 0 || to.z >= device_ctx.grid[to.x][to.y].type->capacity - || (device_ctx.grid[to.x][to.y].type != physical_tile_type(blk))) { + || !is_tile_compatible(device_ctx.grid[to.x][to.y].type, cluster_ctx.clb_nlist.block_type(blk))) { return false; } return true; @@ -482,7 +500,7 @@ ClusterBlockId pick_from_block() { return ClusterBlockId::INVALID(); } -bool find_to_loc_uniform(t_physical_tile_type_ptr type, +bool find_to_loc_uniform(t_logical_block_type_ptr type, float rlim, const 
t_pl_loc from, t_pl_loc& to) { @@ -495,10 +513,6 @@ bool find_to_loc_uniform(t_physical_tile_type_ptr type, // //This ensures that such blocks don't get locked down too early during placement (as would be the //case with a physical distance rlim) - auto& grid = g_vpr_ctx.device().grid; - - auto grid_type = grid[from.x][from.y].type; - VTR_ASSERT(type == grid_type); //Retrieve the compressed block grid for this block type const auto& compressed_block_grid = g_vpr_ctx.placement().compressed_block_grids[type->index]; @@ -511,7 +525,7 @@ bool find_to_loc_uniform(t_physical_tile_type_ptr type, int cx_from = grid_to_compressed(compressed_block_grid.compressed_to_grid_x, from.x); int cy_from = grid_to_compressed(compressed_block_grid.compressed_to_grid_y, from.y); - //Determin the valid compressed grid location ranges + //Determine the valid compressed grid location ranges int min_cx = std::max(0, cx_from - rlim_x); int max_cx = std::min(compressed_block_grid.compressed_to_grid_x.size() - 1, cx_from + rlim_x); int delta_cx = max_cx - min_cx; @@ -605,14 +619,17 @@ bool find_to_loc_uniform(t_physical_tile_type_ptr type, to.x = compressed_block_grid.compressed_to_grid_x[cx_to]; to.y = compressed_block_grid.compressed_to_grid_y[cy_to]; + auto& grid = g_vpr_ctx.device().grid; + + auto to_type = grid[to.x][to.y].type; + //Each x/y location contains only a single type, so we can pick a random //z (capcity) location - to.z = vtr::irand(type->capacity - 1); + to.z = vtr::irand(to_type->capacity - 1); - auto& device_ctx = g_vpr_ctx.device(); - VTR_ASSERT_MSG(device_ctx.grid[to.x][to.y].type == type, "Type must match"); - VTR_ASSERT_MSG(device_ctx.grid[to.x][to.y].width_offset == 0, "Should be at block base location"); - VTR_ASSERT_MSG(device_ctx.grid[to.x][to.y].height_offset == 0, "Should be at block base location"); + VTR_ASSERT_MSG(is_tile_compatible(to_type, type), "Type must be compatible"); + VTR_ASSERT_MSG(grid[to.x][to.y].width_offset == 0, "Should be at block base 
location"); + VTR_ASSERT_MSG(grid[to.x][to.y].height_offset == 0, "Should be at block base location"); return true; } diff --git a/vpr/src/place/move_utils.h b/vpr/src/place/move_utils.h index ddf7e17c891..7f6b438509b 100644 --- a/vpr/src/place/move_utils.h +++ b/vpr/src/place/move_utils.h @@ -46,7 +46,7 @@ std::set determine_locations_emptied_by_move(t_pl_blocks_to_be_moved& ClusterBlockId pick_from_block(); -bool find_to_loc_uniform(t_physical_tile_type_ptr type, +bool find_to_loc_uniform(t_logical_block_type_ptr type, float rlim, const t_pl_loc from, t_pl_loc& to); diff --git a/vpr/src/place/place.cpp b/vpr/src/place/place.cpp index 15b922ebb70..8a2b32fd962 100644 --- a/vpr/src/place/place.cpp +++ b/vpr/src/place/place.cpp @@ -27,6 +27,7 @@ #include "place_macro.h" #include "histogram.h" #include "place_util.h" +#include "initial_placement.h" #include "place_delay_model.h" #include "move_transactions.h" #include "move_utils.h" @@ -59,11 +60,6 @@ using std::min; * variables round-offs check. */ #define MAX_MOVES_BEFORE_RECOMPUTE 500000 -/* The maximum number of tries when trying to place a carry chain at a * - * random location before trying exhaustive placement - find the fist * - * legal position and place it during initial placement. */ -#define MAX_NUM_TRIES_TO_PLACE_MACROS_RANDOMLY 4 - /* Flags for the states of the bounding box. * * Stored as char for memory efficiency. */ #define NOT_UPDATED_YET 'N' @@ -113,9 +109,6 @@ constexpr double MAX_INV_TIMING_COST = 1.e9; /* Cost of a net, and a temporary cost of a net used during move assessment. */ static vtr::vector net_cost, temp_net_cost; -static t_pl_loc** legal_pos = nullptr; /* [0..device_ctx.num_block_types-1][0..type_tsize - 1] */ -static int* num_legal_pos = nullptr; /* [0..num_legal_pos-1] */ - /* [0...cluster_ctx.clb_nlist.nets().size()-1] * * A flag array to indicate whether the specific bounding box has been updated * * in this particular swap or not. 
If it has been updated before, the code * @@ -285,23 +278,6 @@ static void alloc_and_load_for_fast_cost_update(float place_cost_exp); static void free_fast_cost_update(); -static void alloc_legal_placements(); -static void load_legal_placements(); - -static void free_legal_placements(); - -static int check_macro_can_be_placed(int imacro, int itype, t_pl_loc head_pos); - -static int try_place_macro(int itype, int ipos, int imacro); - -static void initial_placement_pl_macros(int macros_max_num_tries, int* free_locations); - -static void initial_placement_blocks(int* free_locations, enum e_pad_loc_type pad_loc_type); -static void initial_placement_location(const int* free_locations, ClusterBlockId blk_id, int& pipos, t_pl_loc& to); - -static void initial_placement(enum e_pad_loc_type pad_loc_type, - const char* pad_loc_file); - static double comp_bb_cost(e_cost_methods method); static void update_move_nets(int num_nets_affected); @@ -437,6 +413,7 @@ static void print_place_status(const float t, const float rlim, const float crit_exponent, size_t tot_moves); +static void print_resources_utilization(); /*****************************************************************************/ void try_place(const t_placer_opts& placer_opts, @@ -509,6 +486,12 @@ void try_place(const t_placer_opts& placer_opts, directs, num_directs); initial_placement(placer_opts.pad_loc_type, placer_opts.pad_loc_file.c_str()); + + // Update physical pin values + for (auto block_id : cluster_ctx.clb_nlist.blocks()) { + place_sync_external_block_connections(block_id); + } + init_draw_coords((float)width_fac); //Enables fast look-up of atom pins connect to CLB pins ClusteredPinAtomPinsLookup netlist_pin_lookup(cluster_ctx.clb_nlist, pb_gpin_lookup); @@ -829,6 +812,8 @@ void try_place(const t_placer_opts& placer_opts, report_aborted_moves(); + print_resources_utilization(); + free_placement_structs(placer_opts); if (placer_opts.place_algorithm == PATH_TIMING_DRIVEN_PLACE || 
placer_opts.enable_timing_computations) { @@ -960,8 +945,8 @@ static void placement_inner_loop(float t, /* Lines below prevent too much round-off error from accumulating * in the cost over many iterations (due to incremental updates). - * This round-off can lead to error checks failing because the cost - * is different from what you get when you recompute from scratch. + * This round-off can lead to error checks failing because the cost + * is different from what you get when you recompute from scratch. */ ++(*moves_since_cost_recompute); if (*moves_since_cost_recompute > MAX_MOVES_BEFORE_RECOMPUTE) { @@ -1239,15 +1224,15 @@ static e_move_result try_swap(float t, VTR_ASSERT(create_move_outcome == e_create_move::VALID); /* - * To make evaluating the move simpler (e.g. calculating changed bounding box), - * we first move the blocks to thier new locations (apply the move to - place_ctx.block_locs) and then computed the change in cost. If the move is + * To make evaluating the move simpler (e.g. calculating changed bounding box), + * we first move the blocks to their new locations (apply the move to + place_ctx.block_locs) and then compute the change in cost. If the move is * accepted, the inverse look-up in place_ctx.grid_blocks is updated (committing - the move). If the move is rejected the blocks are returned to their original + the move). If the move is rejected the blocks are returned to their original * positions (reverting place_ctx.block_locs to its original state). * * Note that the inverse look-up place_ctx.grid_blocks is only updated - after move acceptance is determined, and so should not be used when + after move acceptance is determined, and so should not be used when evaluating a move. 
*/ @@ -1405,7 +1390,7 @@ static void update_net_bb(const ClusterNetId net, } } else { //For large nets, update bounding box incrementally - int iblk_pin = cluster_ctx.clb_nlist.pin_physical_index(blk_pin); + int iblk_pin = tile_pin_index(blk_pin); t_physical_tile_type_ptr blk_type = physical_tile_type(blk); int pin_width_offset = blk_type->pin_width_offset[iblk_pin]; @@ -1511,8 +1496,8 @@ static float comp_td_point_to_point_delay(const PlaceDelayModel* delay_model, Cl ClusterBlockId source_block = cluster_ctx.clb_nlist.pin_block(source_pin); ClusterBlockId sink_block = cluster_ctx.clb_nlist.pin_block(sink_pin); - int source_block_ipin = cluster_ctx.clb_nlist.pin_physical_index(source_pin); - int sink_block_ipin = cluster_ctx.clb_nlist.pin_physical_index(sink_pin); + int source_block_ipin = cluster_ctx.clb_nlist.pin_logical_index(source_pin); + int sink_block_ipin = cluster_ctx.clb_nlist.pin_logical_index(sink_pin); int source_x = place_ctx.block_locs[source_block].loc.x; int source_y = place_ctx.block_locs[source_block].loc.y; @@ -1685,7 +1670,6 @@ static double comp_bb_cost(e_cost_methods method) { static void free_placement_structs(const t_placer_opts& placer_opts) { auto& cluster_ctx = g_vpr_ctx.clustering(); - free_legal_placements(); free_fast_cost_update(); if (placer_opts.place_algorithm == PATH_TIMING_DRIVEN_PLACE @@ -1716,7 +1700,6 @@ static void free_placement_structs(const t_placer_opts& placer_opts) { free_placement_macros_structs(); /* Frees up all the data structure used in vpr_utils. 
*/ - free_port_pin_from_blk_pin(); free_blk_pin_from_port_pin(); } @@ -1737,9 +1720,6 @@ static void alloc_and_load_placement_structs(float place_cost_exp, init_placement_context(); - alloc_legal_placements(); - load_legal_placements(); - max_pins_per_clb = 0; for (const auto& type : device_ctx.physical_tile_types) { max_pins_per_clb = max(max_pins_per_clb, type.num_pins); @@ -1824,7 +1804,7 @@ static void alloc_and_load_net_pin_indices() { continue; netpin = 0; for (auto pin_id : cluster_ctx.clb_nlist.net_pins(net_id)) { - int pin_index = cluster_ctx.clb_nlist.pin_physical_index(pin_id); + int pin_index = cluster_ctx.clb_nlist.pin_logical_index(pin_id); ClusterBlockId block_id = cluster_ctx.clb_nlist.pin_block(pin_id); net_pin_indices[block_id][pin_index] = netpin; netpin++; @@ -1861,7 +1841,7 @@ static void get_bb_from_scratch(ClusterNetId net_id, t_bb* coords, t_bb* num_on_ auto& grid = device_ctx.grid; ClusterBlockId bnum = cluster_ctx.clb_nlist.net_driver_block(net_id); - pnum = cluster_ctx.clb_nlist.net_pin_physical_index(net_id, 0); + pnum = net_pin_to_tile_pin_index(net_id, 0); VTR_ASSERT(pnum >= 0); x = place_ctx.block_locs[bnum].loc.x + physical_tile_type(bnum)->pin_width_offset[pnum]; y = place_ctx.block_locs[bnum].loc.y + physical_tile_type(bnum)->pin_height_offset[pnum]; @@ -1880,7 +1860,7 @@ static void get_bb_from_scratch(ClusterNetId net_id, t_bb* coords, t_bb* num_on_ for (auto pin_id : cluster_ctx.clb_nlist.net_sinks(net_id)) { bnum = cluster_ctx.clb_nlist.pin_block(pin_id); - pnum = cluster_ctx.clb_nlist.pin_physical_index(pin_id); + pnum = tile_pin_index(pin_id); x = place_ctx.block_locs[bnum].loc.x + physical_tile_type(bnum)->pin_width_offset[pnum]; y = place_ctx.block_locs[bnum].loc.y + physical_tile_type(bnum)->pin_height_offset[pnum]; @@ -2020,7 +2000,7 @@ static void get_non_updateable_bb(ClusterNetId net_id, t_bb* bb_coord_new) { auto& device_ctx = g_vpr_ctx.device(); ClusterBlockId bnum = cluster_ctx.clb_nlist.net_driver_block(net_id); - 
pnum = cluster_ctx.clb_nlist.net_pin_physical_index(net_id, 0); + pnum = net_pin_to_tile_pin_index(net_id, 0); x = place_ctx.block_locs[bnum].loc.x + physical_tile_type(bnum)->pin_width_offset[pnum]; y = place_ctx.block_locs[bnum].loc.y + physical_tile_type(bnum)->pin_height_offset[pnum]; @@ -2031,7 +2011,7 @@ static void get_non_updateable_bb(ClusterNetId net_id, t_bb* bb_coord_new) { for (auto pin_id : cluster_ctx.clb_nlist.net_sinks(net_id)) { bnum = cluster_ctx.clb_nlist.pin_block(pin_id); - pnum = cluster_ctx.clb_nlist.pin_physical_index(pin_id); + pnum = tile_pin_index(pin_id); x = place_ctx.block_locs[bnum].loc.x + physical_tile_type(bnum)->pin_width_offset[pnum]; y = place_ctx.block_locs[bnum].loc.y + physical_tile_type(bnum)->pin_height_offset[pnum]; @@ -2252,358 +2232,6 @@ static void update_bb(ClusterNetId net_id, t_bb* bb_coord_new, t_bb* bb_edge_new } } -static void alloc_legal_placements() { - auto& device_ctx = g_vpr_ctx.device(); - auto& place_ctx = g_vpr_ctx.mutable_placement(); - - legal_pos = new t_pl_loc*[device_ctx.physical_tile_types.size()]; - num_legal_pos = (int*)vtr::calloc(device_ctx.physical_tile_types.size(), sizeof(int)); - - /* Initialize all occupancy to zero. 
*/ - - for (size_t i = 0; i < device_ctx.grid.width(); i++) { - for (size_t j = 0; j < device_ctx.grid.height(); j++) { - place_ctx.grid_blocks[i][j].usage = 0; - - for (int k = 0; k < device_ctx.grid[i][j].type->capacity; k++) { - if (place_ctx.grid_blocks[i][j].blocks[k] != INVALID_BLOCK_ID) { - place_ctx.grid_blocks[i][j].blocks[k] = EMPTY_BLOCK_ID; - if (device_ctx.grid[i][j].width_offset == 0 && device_ctx.grid[i][j].height_offset == 0) { - num_legal_pos[device_ctx.grid[i][j].type->index]++; - } - } - } - } - } - - for (const auto& type : device_ctx.physical_tile_types) { - legal_pos[type.index] = new t_pl_loc[num_legal_pos[type.index]]; - } -} - -static void load_legal_placements() { - auto& device_ctx = g_vpr_ctx.device(); - auto& place_ctx = g_vpr_ctx.placement(); - - int* index = (int*)vtr::calloc(device_ctx.physical_tile_types.size(), sizeof(int)); - - for (size_t i = 0; i < device_ctx.grid.width(); i++) { - for (size_t j = 0; j < device_ctx.grid.height(); j++) { - for (int k = 0; k < device_ctx.grid[i][j].type->capacity; k++) { - if (place_ctx.grid_blocks[i][j].blocks[k] == INVALID_BLOCK_ID) { - continue; - } - if (device_ctx.grid[i][j].width_offset == 0 && device_ctx.grid[i][j].height_offset == 0) { - int itype = device_ctx.grid[i][j].type->index; - legal_pos[itype][index[itype]].x = i; - legal_pos[itype][index[itype]].y = j; - legal_pos[itype][index[itype]].z = k; - index[itype]++; - } - } - } - } - free(index); -} - -static void free_legal_placements() { - auto& device_ctx = g_vpr_ctx.device(); - - for (unsigned int i = 0; i < device_ctx.physical_tile_types.size(); i++) { - delete[] legal_pos[i]; - } - delete[] legal_pos; /* Free the mapping list */ - free(num_legal_pos); -} - -static int check_macro_can_be_placed(int imacro, int itype, t_pl_loc head_pos) { - auto& device_ctx = g_vpr_ctx.device(); - auto& place_ctx = g_vpr_ctx.placement(); - - // Every macro can be placed until proven otherwise - int macro_can_be_placed = true; - - auto& pl_macros = 
place_ctx.pl_macros; - - // Check whether all the members can be placed - for (size_t imember = 0; imember < pl_macros[imacro].members.size(); imember++) { - t_pl_loc member_pos = head_pos + pl_macros[imacro].members[imember].offset; - - // Check whether the location could accept block of this type - // Then check whether the location could still accommodate more blocks - // Also check whether the member position is valid, that is the member's location - // still within the chip's dimemsion and the member_z is allowed at that location on the grid - if (member_pos.x < int(device_ctx.grid.width()) && member_pos.y < int(device_ctx.grid.height()) - && device_ctx.grid[member_pos.x][member_pos.y].type->index == itype - && place_ctx.grid_blocks[member_pos.x][member_pos.y].blocks[member_pos.z] == EMPTY_BLOCK_ID) { - // Can still accommodate blocks here, check the next position - continue; - } else { - // Cant be placed here - skip to the next try - macro_can_be_placed = false; - break; - } - } - - return (macro_can_be_placed); -} - -static int try_place_macro(int itype, int ipos, int imacro) { - auto& place_ctx = g_vpr_ctx.mutable_placement(); - - int macro_placed = false; - - // Choose a random position for the head - t_pl_loc head_pos = legal_pos[itype][ipos]; - - // If that location is occupied, do nothing. 
- if (place_ctx.grid_blocks[head_pos.x][head_pos.y].blocks[head_pos.z] != EMPTY_BLOCK_ID) { - return (macro_placed); - } - - int macro_can_be_placed = check_macro_can_be_placed(imacro, itype, head_pos); - - if (macro_can_be_placed) { - auto& pl_macros = place_ctx.pl_macros; - - // Place down the macro - macro_placed = true; - for (size_t imember = 0; imember < pl_macros[imacro].members.size(); imember++) { - t_pl_loc member_pos = head_pos + pl_macros[imacro].members[imember].offset; - - ClusterBlockId iblk = pl_macros[imacro].members[imember].blk_index; - place_ctx.block_locs[iblk].loc = member_pos; - - place_ctx.grid_blocks[member_pos.x][member_pos.y].blocks[member_pos.z] = pl_macros[imacro].members[imember].blk_index; - place_ctx.grid_blocks[member_pos.x][member_pos.y].usage++; - - // Could not ensure that the randomiser would not pick this location again - // So, would have to do a lazy removal - whenever I come across a block that could not be placed, - // go ahead and remove it from the legal_pos[][] array - - } // Finish placing all the members in the macro - - } // End of this choice of legal_pos - - return (macro_placed); -} - -static void initial_placement_pl_macros(int macros_max_num_tries, int* free_locations) { - int macro_placed; - int itype, itry, ipos; - ClusterBlockId blk_id; - - auto& cluster_ctx = g_vpr_ctx.clustering(); - auto& device_ctx = g_vpr_ctx.device(); - auto& place_ctx = g_vpr_ctx.placement(); - - auto& pl_macros = place_ctx.pl_macros; - - /* Macros are harder to place. 
Do them first */ - for (size_t imacro = 0; imacro < place_ctx.pl_macros.size(); imacro++) { - // Every macro are not placed in the beginnning - macro_placed = false; - - // Assume that all the blocks in the macro are of the same type - blk_id = pl_macros[imacro].members[0].blk_index; - auto type = physical_tile_type(blk_id); - itype = type->index; - if (free_locations[itype] < int(pl_macros[imacro].members.size())) { - VPR_FATAL_ERROR(VPR_ERROR_PLACE, - "Initial placement failed.\n" - "Could not place macro length %zu with head block %s (#%zu); not enough free locations of type %s (#%d).\n" - "VPR cannot auto-size for your circuit, please resize the FPGA manually.\n", - pl_macros[imacro].members.size(), cluster_ctx.clb_nlist.block_name(blk_id).c_str(), size_t(blk_id), type->name, itype); - } - - // Try to place the macro first, if can be placed - place them, otherwise try again - for (itry = 0; itry < macros_max_num_tries && macro_placed == false; itry++) { - // Choose a random position for the head - ipos = vtr::irand(free_locations[itype] - 1); - - // Try to place the macro - macro_placed = try_place_macro(itype, ipos, imacro); - - } // Finished all tries - - if (macro_placed == false) { - // if a macro still could not be placed after macros_max_num_tries times, - // go through the chip exhaustively to find a legal placement for the macro - // place the macro on the first location that is legal - // then set macro_placed = true; - // if there are no legal positions, error out - - // Exhaustive placement of carry macros - for (ipos = 0; ipos < free_locations[itype] && macro_placed == false; ipos++) { - // Try to place the macro - macro_placed = try_place_macro(itype, ipos, imacro); - - } // Exhausted all the legal placement position for this macro - - // If macro could not be placed after exhaustive placement, error out - if (macro_placed == false) { - // Error out - VPR_FATAL_ERROR(VPR_ERROR_PLACE, - "Initial placement failed.\n" - "Could not place macro length 
%zu with head block %s (#%zu); not enough free locations of type %s (#%d).\n" - "Please manually size the FPGA because VPR can't do this yet.\n", - pl_macros[imacro].members.size(), cluster_ctx.clb_nlist.block_name(blk_id).c_str(), size_t(blk_id), device_ctx.physical_tile_types[itype].name, itype); - } - - } else { - // This macro has been placed successfully, proceed to place the next macro - continue; - } - } // Finish placing all the pl_macros successfully -} - -/* Place blocks that are NOT a part of any macro. - * We'll randomly place each block in the clustered netlist, one by one. */ -static void initial_placement_blocks(int* free_locations, enum e_pad_loc_type pad_loc_type) { - int itype, ipos; - auto& cluster_ctx = g_vpr_ctx.clustering(); - auto& place_ctx = g_vpr_ctx.mutable_placement(); - auto& device_ctx = g_vpr_ctx.device(); - - for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { - if (place_ctx.block_locs[blk_id].loc.x != -1) { // -1 is a sentinel for an empty block - // block placed. - continue; - } - - /* Don't do IOs if the user specifies IOs; we'll read those locations later. */ - if (!(is_io_type(physical_tile_type(blk_id)) && pad_loc_type == USER)) { - /* Randomly select a free location of the appropriate type for blk_id. - * We have a linearized list of all the free locations that can - * accommodate a block of that type in free_locations[itype]. - * Choose one randomly and put blk_id there. Then we don't want to pick - * that location again, so remove it from the free_locations array. 
- */ - itype = cluster_ctx.clb_nlist.block_type(blk_id)->index; - if (free_locations[itype] <= 0) { - VPR_FATAL_ERROR(VPR_ERROR_PLACE, - "Initial placement failed.\n" - "Could not place block %s (#%zu); no free locations of type %s (#%d).\n", - cluster_ctx.clb_nlist.block_name(blk_id).c_str(), size_t(blk_id), device_ctx.physical_tile_types[itype].name, itype); - } - - t_pl_loc to; - initial_placement_location(free_locations, blk_id, ipos, to); - - // Make sure that the position is EMPTY_BLOCK before placing the block down - VTR_ASSERT(place_ctx.grid_blocks[to.x][to.y].blocks[to.z] == EMPTY_BLOCK_ID); - - place_ctx.grid_blocks[to.x][to.y].blocks[to.z] = blk_id; - place_ctx.grid_blocks[to.x][to.y].usage++; - - place_ctx.block_locs[blk_id].loc = to; - - //Mark IOs as fixed if specifying a (fixed) random placement - if (is_io_type(physical_tile_type(blk_id)) && pad_loc_type == RANDOM) { - place_ctx.block_locs[blk_id].is_fixed = true; - } - - /* Ensure randomizer doesn't pick this location again, since it's occupied. Could shift all the - * legal positions in legal_pos to remove the entry (choice) we just used, but faster to - * just move the last entry in legal_pos to the spot we just used and decrement the - * count of free_locations. */ - legal_pos[itype][ipos] = legal_pos[itype][free_locations[itype] - 1]; /* overwrite used block position */ - free_locations[itype]--; - } - } -} - -static void initial_placement_location(const int* free_locations, ClusterBlockId blk_id, int& ipos, t_pl_loc& to) { - auto& cluster_ctx = g_vpr_ctx.clustering(); - - int itype = cluster_ctx.clb_nlist.block_type(blk_id)->index; - - ipos = vtr::irand(free_locations[itype] - 1); - to = legal_pos[itype][ipos]; -} - -static void initial_placement(enum e_pad_loc_type pad_loc_type, - const char* pad_loc_file) { - /* Randomly places the blocks to create an initial placement. We rely on - * the legal_pos array already being loaded. 
That legal_pos[itype] is an - * array that gives every legal value of (x,y,z) that can accommodate a block. - * The number of such locations is given by num_legal_pos[itype]. - */ - int itype, ipos; - int* free_locations; /* [0..device_ctx.num_block_types-1]. - * Stores how many locations there are for this type that *might* still be free. - * That is, this stores the number of entries in legal_pos[itype] that are worth considering - * as you look for a free location. - */ - auto& device_ctx = g_vpr_ctx.device(); - auto& cluster_ctx = g_vpr_ctx.clustering(); - auto& place_ctx = g_vpr_ctx.mutable_placement(); - - free_locations = (int*)vtr::malloc(device_ctx.physical_tile_types.size() * sizeof(int)); - for (const auto& type : device_ctx.physical_tile_types) { - itype = type.index; - free_locations[itype] = num_legal_pos[itype]; - } - - /* We'll use the grid to record where everything goes. Initialize to the grid has no - * blocks placed anywhere. - */ - for (size_t i = 0; i < device_ctx.grid.width(); i++) { - for (size_t j = 0; j < device_ctx.grid.height(); j++) { - place_ctx.grid_blocks[i][j].usage = 0; - itype = device_ctx.grid[i][j].type->index; - for (int k = 0; k < device_ctx.physical_tile_types[itype].capacity; k++) { - if (place_ctx.grid_blocks[i][j].blocks[k] != INVALID_BLOCK_ID) { - place_ctx.grid_blocks[i][j].blocks[k] = EMPTY_BLOCK_ID; - } - } - } - } - - /* Similarly, mark all blocks as not being placed yet. */ - for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { - place_ctx.block_locs[blk_id].loc = t_pl_loc(); - } - - initial_placement_pl_macros(MAX_NUM_TRIES_TO_PLACE_MACROS_RANDOMLY, free_locations); - - // All the macros are placed, update the legal_pos[][] array - for (const auto& type : device_ctx.physical_tile_types) { - itype = type.index; - VTR_ASSERT(free_locations[itype] >= 0); - for (ipos = 0; ipos < free_locations[itype]; ipos++) { - t_pl_loc pos = legal_pos[itype][ipos]; - - // Check if that location is occupied. 
If it is, remove from legal_pos - if (place_ctx.grid_blocks[pos.x][pos.y].blocks[pos.z] != EMPTY_BLOCK_ID && place_ctx.grid_blocks[pos.x][pos.y].blocks[pos.z] != INVALID_BLOCK_ID) { - legal_pos[itype][ipos] = legal_pos[itype][free_locations[itype] - 1]; - free_locations[itype]--; - - // After the move, I need to check this particular entry again - ipos--; - continue; - } - } - } // Finish updating the legal_pos[][] and free_locations[] array - - initial_placement_blocks(free_locations, pad_loc_type); - - if (pad_loc_type == USER) { - read_user_pad_loc(pad_loc_file); - } - - /* Restore legal_pos */ - load_legal_placements(); - -#ifdef VERBOSE - VTR_LOG("At end of initial_placement.\n"); - if (getEchoEnabled() && isEchoFileEnabled(E_ECHO_INITIAL_CLB_PLACEMENT)) { - print_clb_placement(getEchoFileName(E_ECHO_INITIAL_CLB_PLACEMENT)); - } -#endif - free(free_locations); -} - static void free_fast_cost_update() { auto& device_ctx = g_vpr_ctx.device(); @@ -2938,3 +2566,33 @@ static void print_place_status(const float t, VTR_LOG(" %6.3f\n", t / oldt); fflush(stdout); } + +static void print_resources_utilization() { + auto& place_ctx = g_vpr_ctx.placement(); + auto& cluster_ctx = g_vpr_ctx.clustering(); + auto& device_ctx = g_vpr_ctx.device(); + + //Record the resource requirement + std::map<t_logical_block_type_ptr, size_t> num_type_instances; + std::map<t_logical_block_type_ptr, std::map<t_physical_tile_type_ptr, size_t>> num_placed_instances; + for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { + auto block_loc = place_ctx.block_locs[blk_id]; + auto loc = block_loc.loc; + + auto physical_tile = device_ctx.grid[loc.x][loc.y].type; + auto logical_block = cluster_ctx.clb_nlist.block_type(blk_id); + + num_type_instances[logical_block]++; + num_placed_instances[logical_block][physical_tile]++; + } + + for (auto logical_block : num_type_instances) { + VTR_LOG("Logical Block: %s\n", logical_block.first->name); + VTR_LOG("\tInstances -> %d\n", logical_block.second); + + VTR_LOG("\tPhysical Tiles used:\n"); + for (auto physical_tile : num_placed_instances[logical_block.first]) {
+ VTR_LOG("\t\t%s: %d\n", physical_tile.first->name, physical_tile.second); + } + } +} diff --git a/vpr/src/place/place_macro.cpp b/vpr/src/place/place_macro.cpp index 4faeb1d9deb..5411e3223f8 100644 --- a/vpr/src/place/place_macro.cpp +++ b/vpr/src/place/place_macro.cpp @@ -77,11 +77,16 @@ static void find_all_the_macro(int* num_of_macro, std::vector& p num_macro = 0; for (auto blk_id : cluster_ctx.clb_nlist.blocks()) { - num_blk_pins = physical_tile_type(blk_id)->num_pins; + auto logical_block = cluster_ctx.clb_nlist.block_type(blk_id); + auto physical_tile = pick_best_physical_type(logical_block); + + num_blk_pins = cluster_ctx.clb_nlist.block_type(blk_id)->pb_type->num_pins; for (to_iblk_pin = 0; to_iblk_pin < num_blk_pins; to_iblk_pin++) { + int to_physical_pin = get_physical_pin(physical_tile, logical_block, to_iblk_pin); + to_net_id = cluster_ctx.clb_nlist.block_net(blk_id, to_iblk_pin); - to_idirect = f_idirect_from_blk_pin[cluster_ctx.clb_nlist.block_type(blk_id)->index][to_iblk_pin]; - to_src_or_sink = f_direct_type_from_blk_pin[cluster_ctx.clb_nlist.block_type(blk_id)->index][to_iblk_pin]; + to_idirect = f_idirect_from_blk_pin[physical_tile->index][to_physical_pin]; + to_src_or_sink = f_direct_type_from_blk_pin[physical_tile->index][to_physical_pin]; // Identify potential macro head blocks (i.e. 
start of a macro) // @@ -97,9 +102,11 @@ static void find_all_the_macro(int* num_of_macro, std::vector& p || (is_constant_clb_net(to_net_id) && !net_is_driven_by_direct(to_net_id)))) { for (from_iblk_pin = 0; from_iblk_pin < num_blk_pins; from_iblk_pin++) { + int from_physical_pin = get_physical_pin(physical_tile, logical_block, from_iblk_pin); + from_net_id = cluster_ctx.clb_nlist.block_net(blk_id, from_iblk_pin); - from_idirect = f_idirect_from_blk_pin[cluster_ctx.clb_nlist.block_type(blk_id)->index][from_iblk_pin]; - from_src_or_sink = f_direct_type_from_blk_pin[cluster_ctx.clb_nlist.block_type(blk_id)->index][from_iblk_pin]; + from_idirect = f_idirect_from_blk_pin[physical_tile->index][from_physical_pin]; + from_src_or_sink = f_direct_type_from_blk_pin[physical_tile->index][from_physical_pin]; // Confirm whether this is a head macro // @@ -129,8 +136,8 @@ static void find_all_the_macro(int* num_of_macro, std::vector& p next_blk_id = cluster_ctx.clb_nlist.net_pin_block(curr_net_id, 1); // Assume that the from_iblk_pin index is the same for the next block - VTR_ASSERT(f_idirect_from_blk_pin[cluster_ctx.clb_nlist.block_type(next_blk_id)->index][from_iblk_pin] == from_idirect - && f_direct_type_from_blk_pin[cluster_ctx.clb_nlist.block_type(next_blk_id)->index][from_iblk_pin] == SOURCE); + VTR_ASSERT(f_idirect_from_blk_pin[physical_tile->index][from_physical_pin] == from_idirect + && f_direct_type_from_blk_pin[physical_tile->index][from_physical_pin] == SOURCE); next_net_id = cluster_ctx.clb_nlist.block_net(next_blk_id, from_iblk_pin); // Mark down this block as a member of the macro @@ -455,6 +462,10 @@ static void write_place_macros(std::string filename, const std::vectorindex][pin_index]; diff --git a/vpr/src/place/timing_place_lookup.cpp b/vpr/src/place/timing_place_lookup.cpp index 48dee3549ff..4aa439aab16 100644 --- a/vpr/src/place/timing_place_lookup.cpp +++ b/vpr/src/place/timing_place_lookup.cpp @@ -268,7 +268,7 @@ static t_chan_width setup_chan_width(const 
t_router_opts& router_opts, if (router_opts.fixed_channel_width == NO_FIXED_CHANNEL_WIDTH) { auto& device_ctx = g_vpr_ctx.device(); - auto type = physical_tile_type(find_most_common_block_type(device_ctx.grid)); + auto type = find_most_common_tile_type(device_ctx.grid); width_fac = 4 * type->num_pins; /*this is 2x the value that binary search starts */ @@ -365,8 +365,8 @@ static void generic_compute_matrix( t_physical_tile_type_ptr src_type = device_ctx.grid[source_x][source_y].type; t_physical_tile_type_ptr sink_type = device_ctx.grid[sink_x][sink_y].type; - bool src_or_target_empty = (src_type == device_ctx.EMPTY_TYPE - || sink_type == device_ctx.EMPTY_TYPE); + bool src_or_target_empty = (src_type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE + || sink_type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE); bool is_allowed_type = allowed_types.empty() || allowed_types.find(src_type->name) != allowed_types.end(); @@ -471,7 +471,7 @@ static vtr::Matrix compute_delta_delays( for (y = 0; y < grid.height(); ++y) { auto type = grid[x][y].type; - if (type != device_ctx.EMPTY_TYPE) { + if (type != device_ctx.EMPTY_PHYSICAL_TILE_TYPE) { if (!allowed_types.empty() && allowed_types.find(std::string(type->name)) == allowed_types.end()) { continue; } @@ -501,7 +501,7 @@ static vtr::Matrix compute_delta_delays( for (x = 0; x < grid.width(); ++x) { auto type = grid[x][y].type; - if (type != device_ctx.EMPTY_TYPE) { + if (type != device_ctx.EMPTY_PHYSICAL_TILE_TYPE) { if (!allowed_types.empty() && allowed_types.find(std::string(type->name)) == allowed_types.end()) { continue; } @@ -867,8 +867,8 @@ void OverrideDelayModel::compute_override_delay_model( InstPort from_port = parse_inst_port(direct->from_pin); InstPort to_port = parse_inst_port(direct->to_pin); - t_physical_tile_type_ptr from_type = find_block_type_by_name(from_port.instance_name(), device_ctx.physical_tile_types); - t_physical_tile_type_ptr to_type = find_block_type_by_name(to_port.instance_name(), device_ctx.physical_tile_types); 
+ t_physical_tile_type_ptr from_type = find_tile_type_by_name(from_port.instance_name(), device_ctx.physical_tile_types); + t_physical_tile_type_ptr to_type = find_tile_type_by_name(to_port.instance_name(), device_ctx.physical_tile_types); int num_conns = from_port.port_high_index() - from_port.port_low_index() + 1; VTR_ASSERT_MSG(num_conns == to_port.port_high_index() - to_port.port_low_index() + 1, "Directs must have the same size to/from"); @@ -887,16 +887,16 @@ void OverrideDelayModel::compute_override_delay_model( std::set<std::pair<int, int>> sampled_rr_pairs; for (int iconn = 0; iconn < num_conns; ++iconn) { //Find the associated pins - int from_pin = find_pin(logical_block_type(from_type), from_port.port_name(), from_port.port_low_index() + iconn); - int to_pin = find_pin(logical_block_type(to_type), to_port.port_name(), to_port.port_low_index() + iconn); + int from_pin = find_pin(from_type, from_port.port_name(), from_port.port_low_index() + iconn); + int to_pin = find_pin(to_type, to_port.port_name(), to_port.port_low_index() + iconn); VTR_ASSERT(from_pin != OPEN); VTR_ASSERT(to_pin != OPEN); - int from_pin_class = find_pin_class(logical_block_type(from_type), from_port.port_name(), from_port.port_low_index() + iconn, DRIVER); + int from_pin_class = find_pin_class(from_type, from_port.port_name(), from_port.port_low_index() + iconn, DRIVER); VTR_ASSERT(from_pin_class != OPEN); - int to_pin_class = find_pin_class(logical_block_type(to_type), to_port.port_name(), to_port.port_low_index() + iconn, RECEIVER); + int to_pin_class = find_pin_class(to_type, to_port.port_name(), to_port.port_low_index() + iconn, RECEIVER); VTR_ASSERT(to_pin_class != OPEN); int src_rr = OPEN; diff --git a/vpr/src/place/uniform_move_generator.cpp b/vpr/src/place/uniform_move_generator.cpp index b29604188bf..6d39f3f3eaf 100644 --- a/vpr/src/place/uniform_move_generator.cpp +++ b/vpr/src/place/uniform_move_generator.cpp @@ -14,10 +14,11 @@ e_create_move
UniformMoveGenerator::propose_move(t_pl_blocks_to_be_moved& blocks t_pl_loc from = place_ctx.block_locs[b_from].loc; auto cluster_from_type = cluster_ctx.clb_nlist.block_type(b_from); auto grid_from_type = g_vpr_ctx.device().grid[from.x][from.y].type; - VTR_ASSERT(physical_tile_type(cluster_from_type) == grid_from_type); + VTR_ASSERT(is_tile_compatible(grid_from_type, cluster_from_type)); t_pl_loc to; - if (!find_to_loc_uniform(physical_tile_type(b_from), rlim, from, to)) { + + if (!find_to_loc_uniform(cluster_from_type, rlim, from, to)) { return e_create_move::ABORT; } diff --git a/vpr/src/power/power.cpp b/vpr/src/power/power.cpp index 94294cf55c2..d4e17c0f852 100644 --- a/vpr/src/power/power.cpp +++ b/vpr/src/power/power.cpp @@ -601,26 +601,34 @@ static void power_usage_blocks(t_power_usage* power_usage) { power_reset_tile_usage(); + t_logical_block_type_ptr logical_block; + /* Loop through all grid locations */ for (size_t x = 0; x < device_ctx.grid.width(); x++) { for (size_t y = 0; y < device_ctx.grid.height(); y++) { + auto physical_tile = device_ctx.grid[x][y].type; + if ((device_ctx.grid[x][y].width_offset != 0) || (device_ctx.grid[x][y].height_offset != 0) - || (device_ctx.grid[x][y].type == device_ctx.EMPTY_TYPE)) { + || is_empty_type(physical_tile)) { continue; } - for (int z = 0; z < device_ctx.grid[x][y].type->capacity; z++) { + for (int z = 0; z < physical_tile->capacity; z++) { t_pb* pb = nullptr; t_power_usage pb_power; ClusterBlockId iblk = place_ctx.grid_blocks[x][y].blocks[z]; - if (iblk != EMPTY_BLOCK_ID && iblk != INVALID_BLOCK_ID) + if (iblk != EMPTY_BLOCK_ID && iblk != INVALID_BLOCK_ID) { pb = cluster_ctx.clb_nlist.block_pb(iblk); + logical_block = cluster_ctx.clb_nlist.block_type(iblk); + } else { + logical_block = pick_best_logical_type(physical_tile); + } /* Calculate power of this CLB */ - power_usage_pb(&pb_power, pb, logical_block_type(device_ctx.grid[x][y].type)->pb_graph_head, iblk); + power_usage_pb(&pb_power, pb, 
logical_block->pb_graph_head, iblk); power_add_usage(power_usage, &pb_power); } } diff --git a/vpr/src/route/check_route.cpp b/vpr/src/route/check_route.cpp index 67a6782d83f..e9fa206736d 100644 --- a/vpr/src/route/check_route.cpp +++ b/vpr/src/route/check_route.cpp @@ -173,29 +173,25 @@ void check_route(enum e_route_type route_type) { /* Checks that this SINK node is one of the terminals of inet, and marks * * the appropriate pin as being reached. */ static void check_sink(int inode, ClusterNetId net_id, bool* pin_done) { - int i, j, ifound, ptc_num, iclass, iblk, pin_index; - ClusterBlockId bnum; - unsigned int ipin; - t_physical_tile_type_ptr type; auto& device_ctx = g_vpr_ctx.device(); auto& cluster_ctx = g_vpr_ctx.clustering(); auto& place_ctx = g_vpr_ctx.placement(); VTR_ASSERT(device_ctx.rr_nodes[inode].type() == SINK); - i = device_ctx.rr_nodes[inode].xlow(); - j = device_ctx.rr_nodes[inode].ylow(); - type = device_ctx.grid[i][j].type; + int i = device_ctx.rr_nodes[inode].xlow(); + int j = device_ctx.rr_nodes[inode].ylow(); + auto type = device_ctx.grid[i][j].type; /* For sinks, ptc_num is the class */ - ptc_num = device_ctx.rr_nodes[inode].ptc_num(); - ifound = 0; + int ptc_num = device_ctx.rr_nodes[inode].ptc_num(); + int ifound = 0; - for (iblk = 0; iblk < type->capacity; iblk++) { - bnum = place_ctx.grid_blocks[i][j].blocks[iblk]; /* Hardcoded to one cluster_ctx block*/ - ipin = 1; + for (int iblk = 0; iblk < type->capacity; iblk++) { + ClusterBlockId bnum = place_ctx.grid_blocks[i][j].blocks[iblk]; /* Hardcoded to one cluster_ctx block*/ + unsigned int ipin = 1; for (auto pin_id : cluster_ctx.clb_nlist.net_sinks(net_id)) { if (cluster_ctx.clb_nlist.pin_block(pin_id) == bnum) { - pin_index = cluster_ctx.clb_nlist.pin_physical_index(pin_id); - iclass = type->pin_class[pin_index]; + int pin_index = tile_pin_index(pin_id); + int iclass = type->pin_class[pin_index]; if (iclass == ptc_num) { /* Could connect to same pin class on the same clb more than once. 
Only * * update pin_done for a pin that hasn't been reached yet. */ @@ -225,27 +221,23 @@ static void check_sink(int inode, ClusterNetId net_id, bool* pin_done) { /* Checks that the node passed in is a valid source for this net. */ static void check_source(int inode, ClusterNetId net_id) { - t_rr_type rr_type; - t_physical_tile_type_ptr type; - ClusterBlockId blk_id; - int i, j, ptc_num, node_block_pin, iclass; auto& device_ctx = g_vpr_ctx.device(); auto& cluster_ctx = g_vpr_ctx.clustering(); auto& place_ctx = g_vpr_ctx.placement(); - rr_type = device_ctx.rr_nodes[inode].type(); + t_rr_type rr_type = device_ctx.rr_nodes[inode].type(); if (rr_type != SOURCE) { VPR_FATAL_ERROR(VPR_ERROR_ROUTE, "in check_source: net %d begins with a node of type %d.\n", size_t(net_id), rr_type); } - i = device_ctx.rr_nodes[inode].xlow(); - j = device_ctx.rr_nodes[inode].ylow(); + int i = device_ctx.rr_nodes[inode].xlow(); + int j = device_ctx.rr_nodes[inode].ylow(); /* for sinks and sources, ptc_num is class */ - ptc_num = device_ctx.rr_nodes[inode].ptc_num(); + int ptc_num = device_ctx.rr_nodes[inode].ptc_num(); /* First node_block for net is the source */ - blk_id = cluster_ctx.clb_nlist.net_driver_block(net_id); - type = device_ctx.grid[i][j].type; + ClusterBlockId blk_id = cluster_ctx.clb_nlist.net_driver_block(net_id); + auto type = device_ctx.grid[i][j].type; if (place_ctx.block_locs[blk_id].loc.x != i || place_ctx.block_locs[blk_id].loc.y != j) { VPR_FATAL_ERROR(VPR_ERROR_ROUTE, @@ -253,8 +245,9 @@ static void check_source(int inode, ClusterNetId net_id) { } //Get the driver pin's index in the block - node_block_pin = cluster_ctx.clb_nlist.net_pin_physical_index(net_id, 0); - iclass = type->pin_class[node_block_pin]; + auto physical_pin = net_pin_to_tile_pin_index(net_id, 0); + + int iclass = type->pin_class[physical_pin]; if (ptc_num != iclass) { VPR_FATAL_ERROR(VPR_ERROR_ROUTE, diff --git a/vpr/src/route/clock_connection_builders.cpp 
b/vpr/src/route/clock_connection_builders.cpp index 68adb5a4f62..e8fca69771b 100644 --- a/vpr/src/route/clock_connection_builders.cpp +++ b/vpr/src/route/clock_connection_builders.cpp @@ -225,11 +225,25 @@ void ClockToPinsConnection::create_switches(const ClockRRGraphBuilder& clock_gra } auto type = grid[x][y].type; + + // Skip EMPTY type + if (is_empty_type(type)) { + continue; + } + auto width_offset = grid[x][y].width_offset; auto height_offset = grid[x][y].height_offset; - // Ignore gird locations that do not have blocks - if (!logical_block_type(type)->pb_type) { + // Ignore grid locations that do not have blocks + bool has_pb_type = false; + for (auto logical_block : type->equivalent_sites) { + if (logical_block->pb_type) { + has_pb_type = true; + break; + } + } + + if (!has_pb_type) { continue; } diff --git a/vpr/src/route/route_common.cpp b/vpr/src/route/route_common.cpp index 396517ed1ba..6204dad984b 100644 --- a/vpr/src/route/route_common.cpp +++ b/vpr/src/route/route_common.cpp @@ -1034,8 +1034,6 @@ void reset_rr_node_route_structs() { static vtr::vector<ClusterNetId, std::vector<int>> load_net_rr_terminals(const t_rr_node_indices& L_rr_node_indices) { vtr::vector<ClusterNetId, std::vector<int>> net_rr_terminals; - int inode, i, j, node_block_pin, iclass; - auto& cluster_ctx = g_vpr_ctx.clustering(); auto& place_ctx = g_vpr_ctx.placement(); @@ -1049,19 +1047,19 @@ static vtr::vector<ClusterNetId, std::vector<int>> load_net_rr_terminals(const t int pin_count = 0; for (auto pin_id : cluster_ctx.clb_nlist.net_pins(net_id)) { auto block_id = cluster_ctx.clb_nlist.pin_block(pin_id); - i = place_ctx.block_locs[block_id].loc.x; - j = place_ctx.block_locs[block_id].loc.y; + int i = place_ctx.block_locs[block_id].loc.x; + int j = place_ctx.block_locs[block_id].loc.y; auto type = physical_tile_type(block_id); /* In the routing graph, each (x, y) location has unique pins on it * so when there is capacity, blocks are packed and their pin numbers * are offset to get their actual rr_node */ - node_block_pin = cluster_ctx.clb_nlist.pin_physical_index(pin_id);
+ int phys_pin = tile_pin_index(pin_id); - iclass = type->pin_class[node_block_pin]; + int iclass = type->pin_class[phys_pin]; - inode = get_rr_node_index(L_rr_node_indices, i, j, (pin_count == 0 ? SOURCE : SINK), /* First pin is driver */ - iclass); + int inode = get_rr_node_index(L_rr_node_indices, i, j, (pin_count == 0 ? SOURCE : SINK), /* First pin is driver */ + iclass); net_rr_terminals[net_id][pin_count] = inode; pin_count++; } @@ -1540,7 +1538,7 @@ void print_route(FILE* fp, const vtr::vector& traceba for (auto pin_id : cluster_ctx.clb_nlist.net_pins(net_id)) { ClusterBlockId block_id = cluster_ctx.clb_nlist.pin_block(pin_id); - int pin_index = cluster_ctx.clb_nlist.pin_physical_index(pin_id); + int pin_index = tile_pin_index(pin_id); int iclass = physical_tile_type(block_id)->pin_class[pin_index]; fprintf(fp, "Block %s (#%zu) at (%d,%d), Pin class %d.\n", diff --git a/vpr/src/route/rr_graph.cpp b/vpr/src/route/rr_graph.cpp index 3aa6afd94a9..df1bb8b0967 100644 --- a/vpr/src/route/rr_graph.cpp +++ b/vpr/src/route/rr_graph.cpp @@ -51,10 +51,10 @@ struct t_mux_size_distribution { }; struct t_clb_to_clb_directs { - t_logical_block_type_ptr from_clb_type; + t_physical_tile_type_ptr from_clb_type; int from_clb_pin_start_index; int from_clb_pin_end_index; - t_logical_block_type_ptr to_clb_type; + t_physical_tile_type_ptr to_clb_type; int to_clb_pin_start_index; int to_clb_pin_end_index; int switch_index; //The switch type used by this direct connection @@ -2654,94 +2654,107 @@ static void build_unidir_rr_opins(const int i, const int j, const e_side side, c * TODO: The function that does this parsing in placement is poorly done because it lacks generality on heterogeniety, should replace with this one */ static t_clb_to_clb_directs* alloc_and_load_clb_to_clb_directs(const t_direct_inf* directs, const int num_directs, int delayless_switch) { - int i, j; - unsigned int itype; + int i; t_clb_to_clb_directs* clb_to_clb_directs; - char *pb_type_name, *port_name; + char 
*tile_name, *port_name; int start_pin_index, end_pin_index; - t_pb_type* pb_type; + t_physical_tile_type_ptr physical_tile = nullptr; + t_physical_tile_port tile_port; auto& device_ctx = g_vpr_ctx.device(); clb_to_clb_directs = (t_clb_to_clb_directs*)vtr::calloc(num_directs, sizeof(t_clb_to_clb_directs)); - pb_type_name = nullptr; + tile_name = nullptr; port_name = nullptr; for (i = 0; i < num_directs; i++) { - pb_type_name = (char*)vtr::malloc((strlen(directs[i].from_pin) + strlen(directs[i].to_pin)) * sizeof(char)); + tile_name = (char*)vtr::malloc((strlen(directs[i].from_pin) + strlen(directs[i].to_pin)) * sizeof(char)); port_name = (char*)vtr::malloc((strlen(directs[i].from_pin) + strlen(directs[i].to_pin)) * sizeof(char)); // Load from pins // Parse out the pb_type name, port name, and pin range - parse_direct_pin_name(directs[i].from_pin, directs[i].line, &start_pin_index, &end_pin_index, pb_type_name, port_name); + parse_direct_pin_name(directs[i].from_pin, directs[i].line, &start_pin_index, &end_pin_index, tile_name, port_name); // Figure out which type, port, and pin is used - for (itype = 0; itype < device_ctx.logical_block_types.size(); ++itype) { - if (strcmp(device_ctx.logical_block_types[itype].name, pb_type_name) == 0) { + for (const auto& type : device_ctx.physical_tile_types) { + if (strcmp(type.name, tile_name) == 0) { + physical_tile = &type; break; } } - if (itype >= device_ctx.logical_block_types.size()) { - vpr_throw(VPR_ERROR_ARCH, get_arch_file_name(), directs[i].line, "Unable to find block %s.\n", pb_type_name); + if (physical_tile == nullptr) { + VPR_THROW(VPR_ERROR_ARCH, "Unable to find block %s.\n", tile_name); } - clb_to_clb_directs[i].from_clb_type = &device_ctx.logical_block_types[itype]; - pb_type = clb_to_clb_directs[i].from_clb_type->pb_type; + clb_to_clb_directs[i].from_clb_type = physical_tile; - for (j = 0; j < pb_type->num_ports; j++) { - if (strcmp(pb_type->ports[j].name, port_name) == 0) { + bool port_found = false; + for 
(const auto& port : physical_tile->ports) { + if (0 == strcmp(port.name, port_name)) { + tile_port = port; + port_found = true; break; } } - if (j >= pb_type->num_ports) { - vpr_throw(VPR_ERROR_ARCH, get_arch_file_name(), directs[i].line, "Unable to find port %s (on block %s).\n", port_name, pb_type_name); + + if (!port_found) { + VPR_THROW(VPR_ERROR_ARCH, "Unable to find port %s (on block %s).\n", port_name, tile_name); } if (start_pin_index == OPEN) { VTR_ASSERT(start_pin_index == end_pin_index); start_pin_index = 0; - end_pin_index = pb_type->ports[j].num_pins - 1; + end_pin_index = tile_port.num_pins - 1; } - get_blk_pin_from_port_pin(clb_to_clb_directs[i].from_clb_type->index, j, start_pin_index, &clb_to_clb_directs[i].from_clb_pin_start_index); - get_blk_pin_from_port_pin(clb_to_clb_directs[i].from_clb_type->index, j, end_pin_index, &clb_to_clb_directs[i].from_clb_pin_end_index); + + // Add clb directs start/end pin indices based on the absolute pin position + // of the port defined in the direct connection. The CLB is the source one. 
+ clb_to_clb_directs[i].from_clb_pin_start_index = tile_port.absolute_first_pin_index + start_pin_index; + clb_to_clb_directs[i].from_clb_pin_end_index = tile_port.absolute_first_pin_index + end_pin_index; // Load to pins // Parse out the pb_type name, port name, and pin range - parse_direct_pin_name(directs[i].to_pin, directs[i].line, &start_pin_index, &end_pin_index, pb_type_name, port_name); + parse_direct_pin_name(directs[i].to_pin, directs[i].line, &start_pin_index, &end_pin_index, tile_name, port_name); // Figure out which type, port, and pin is used - for (itype = 0; itype < device_ctx.logical_block_types.size(); ++itype) { - if (strcmp(device_ctx.logical_block_types[itype].name, pb_type_name) == 0) { + for (const auto& type : device_ctx.physical_tile_types) { + if (strcmp(type.name, tile_name) == 0) { + physical_tile = &type; break; } } - if (itype >= device_ctx.logical_block_types.size()) { - vpr_throw(VPR_ERROR_ARCH, get_arch_file_name(), directs[i].line, "Unable to find block %s.\n", pb_type_name); + if (physical_tile == nullptr) { + VPR_THROW(VPR_ERROR_ARCH, "Unable to find block %s.\n", tile_name); } - clb_to_clb_directs[i].to_clb_type = &device_ctx.logical_block_types[itype]; - pb_type = clb_to_clb_directs[i].to_clb_type->pb_type; + clb_to_clb_directs[i].to_clb_type = physical_tile; - for (j = 0; j < pb_type->num_ports; j++) { - if (strcmp(pb_type->ports[j].name, port_name) == 0) { + port_found = false; + for (const auto& port : physical_tile->ports) { + if (0 == strcmp(port.name, port_name)) { + tile_port = port; + port_found = true; break; } } - if (j >= pb_type->num_ports) { - vpr_throw(VPR_ERROR_ARCH, get_arch_file_name(), directs[i].line, "Unable to find port %s (on block %s).\n", port_name, pb_type_name); + + if (!port_found) { + VPR_THROW(VPR_ERROR_ARCH, "Unable to find port %s (on block %s).\n", port_name, tile_name); } if (start_pin_index == OPEN) { VTR_ASSERT(start_pin_index == end_pin_index); start_pin_index = 0; - end_pin_index = 
pb_type->ports[j].num_pins - 1; + end_pin_index = tile_port.num_pins - 1; } - get_blk_pin_from_port_pin(clb_to_clb_directs[i].to_clb_type->index, j, start_pin_index, &clb_to_clb_directs[i].to_clb_pin_start_index); - get_blk_pin_from_port_pin(clb_to_clb_directs[i].to_clb_type->index, j, end_pin_index, &clb_to_clb_directs[i].to_clb_pin_end_index); + // Add clb directs start/end pin indices based on the absolute pin position + // of the port defined in the direct connection. The CLB is the destination one. + clb_to_clb_directs[i].to_clb_pin_start_index = tile_port.absolute_first_pin_index + start_pin_index; + clb_to_clb_directs[i].to_clb_pin_end_index = tile_port.absolute_first_pin_index + end_pin_index; if (abs(clb_to_clb_directs[i].from_clb_pin_start_index - clb_to_clb_directs[i].from_clb_pin_end_index) != abs(clb_to_clb_directs[i].to_clb_pin_start_index - clb_to_clb_directs[i].to_clb_pin_end_index)) { vpr_throw(VPR_ERROR_ARCH, get_arch_file_name(), directs[i].line, @@ -2756,7 +2769,7 @@ static t_clb_to_clb_directs* alloc_and_load_clb_to_clb_directs(const t_direct_in //Use the delayless switch by default clb_to_clb_directs[i].switch_index = delayless_switch; } - free(pb_type_name); + free(tile_name); free(port_name); //We must be careful to clean-up anything that we may have incidentally allocated. 
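The hunks above replace the `get_blk_pin_from_port_pin()` lookup with direct arithmetic: the tile-wide pin index of a direct connection endpoint is the port's `absolute_first_pin_index` plus the pin's index within the port. A minimal standalone sketch of that computation follows; `TilePort` and `absolute_pin_index` are illustrative stand-ins for the actual VPR types (`t_physical_tile_port`), not the real API:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative stand-in for a physical tile port; field names mirror the
// patch's usage but the struct itself is an assumption for this sketch.
struct TilePort {
    std::string name;
    int num_pins;
    int absolute_first_pin_index; // offset of this port's pin 0 in the tile's flat pin numbering
};

// Resolve port_name[relative_pin] to a tile-wide pin index the same way the
// direct-connection loader now does: absolute_first_pin_index + relative pin.
int absolute_pin_index(const std::vector<TilePort>& ports,
                       const std::string& port_name,
                       int relative_pin) {
    for (const auto& port : ports) {
        if (port.name == port_name) {
            assert(relative_pin >= 0 && relative_pin < port.num_pins);
            return port.absolute_first_pin_index + relative_pin;
        }
    }
    return -1; // port not found on this tile
}
```

For a tile with ports I[6] starting at flat pin 0 and O[4] starting at flat pin 6, pin O[2] resolves to tile pin 6 + 2 = 8, which is the start/end index the patch stores in `clb_to_clb_directs`.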
@@ -2804,7 +2817,7 @@ static int get_opin_direct_connecions(int x, /* Iterate through all direct connections */ for (int i = 0; i < num_directs; i++) { /* Find matching direct clb-to-clb connections with the same type as current grid location */ - if (clb_to_clb_directs[i].from_clb_type == logical_block_type(curr_type)) { //We are at a valid starting point + if (clb_to_clb_directs[i].from_clb_type == curr_type) { //We are at a valid starting point if (directs[i].from_side != NUM_SIDES && directs[i].from_side != side) continue; @@ -2815,7 +2828,7 @@ static int get_opin_direct_connecions(int x, && y + directs[i].y_offset > 0) { //Only add connections if the target clb type matches the type in the direct specification t_physical_tile_type_ptr target_type = device_ctx.grid[x + directs[i].x_offset][y + directs[i].y_offset].type; - if (clb_to_clb_directs[i].to_clb_type == logical_block_type(target_type) + if (clb_to_clb_directs[i].to_clb_type == target_type && z + directs[i].z_offset < int(target_type->capacity) && z + directs[i].z_offset >= 0) { /* Compute index of opin with regards to given pins */ diff --git a/vpr/src/route/rr_graph2.cpp b/vpr/src/route/rr_graph2.cpp index 72bbe09a555..556a879229f 100644 --- a/vpr/src/route/rr_graph2.cpp +++ b/vpr/src/route/rr_graph2.cpp @@ -452,7 +452,7 @@ void obstruct_chan_details(const DeviceGrid& grid, if (!trim_obs_channels) continue; - if (grid[x][y].type == device_ctx.EMPTY_TYPE) + if (grid[x][y].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE) continue; if (grid[x][y].width_offset > 0 || grid[x][y].height_offset > 0) continue; @@ -491,22 +491,22 @@ void obstruct_chan_details(const DeviceGrid& grid, if ((x == 0) || (y == 0)) continue; } - if (grid[x][y].type == device_ctx.EMPTY_TYPE) { + if (grid[x][y].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE) { if ((x == grid.width() - 2) && is_io_type(grid[x + 1][y].type)) //-2 for no perim channels continue; if ((y == grid.height() - 2) && is_io_type(grid[x][y + 1].type)) //-2 for no perim 
channels continue; } - if (is_io_type(grid[x][y].type) || (grid[x][y].type == device_ctx.EMPTY_TYPE)) { - if (is_io_type(grid[x][y + 1].type) || (grid[x][y + 1].type == device_ctx.EMPTY_TYPE)) { + if (is_io_type(grid[x][y].type) || (grid[x][y].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE)) { + if (is_io_type(grid[x][y + 1].type) || (grid[x][y + 1].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE)) { for (int track = 0; track < nodes_per_chan->max; ++track) { chan_details_x[x][y][track].set_length(0); } } } - if (is_io_type(grid[x][y].type) || (grid[x][y].type == device_ctx.EMPTY_TYPE)) { - if (is_io_type(grid[x + 1][y].type) || (grid[x + 1][y].type == device_ctx.EMPTY_TYPE)) { + if (is_io_type(grid[x][y].type) || (grid[x][y].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE)) { + if (is_io_type(grid[x + 1][y].type) || (grid[x + 1][y].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE)) { for (int track = 0; track < nodes_per_chan->max; ++track) { chan_details_y[x][y][track].set_length(0); } @@ -1370,7 +1370,7 @@ int find_average_rr_node_index(int device_width, for (int x = 0; x < device_width; ++x) { for (int y = 0; y < device_height; ++y) { - if (device_ctx.grid[x][y].type == device_ctx.EMPTY_TYPE) + if (device_ctx.grid[x][y].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE) continue; if (is_io_type(device_ctx.grid[x][y].type)) continue; @@ -1429,7 +1429,7 @@ int get_track_to_pins(int seg, } /* PAJ - if the pointed to is an EMPTY then shouldn't look for ipins */ - if (device_ctx.grid[x][y].type == device_ctx.EMPTY_TYPE) + if (device_ctx.grid[x][y].type == device_ctx.EMPTY_PHYSICAL_TILE_TYPE) continue; /* Move from logical (straight) to physical (twisted) track index diff --git a/vpr/src/timing/PostClusterDelayCalculator.tpp b/vpr/src/timing/PostClusterDelayCalculator.tpp index 6d3325af8ef..f1daacd1dfb 100644 --- a/vpr/src/timing/PostClusterDelayCalculator.tpp +++ b/vpr/src/timing/PostClusterDelayCalculator.tpp @@ -263,7 +263,7 @@ inline tatum::Time 
PostClusterDelayCalculator::atom_net_delay(const tatum::Timin ClusterBlockId driver_block_id = cluster_ctx.clb_nlist.net_driver_block(net_id); VTR_ASSERT(driver_block_id == clb_src_block); - src_block_pin_index = cluster_ctx.clb_nlist.net_pin_physical_index(net_id, 0); + src_block_pin_index = cluster_ctx.clb_nlist.net_pin_logical_index(net_id, 0); tatum::Time driver_clb_delay = tatum::Time(clb_delay_calc_.internal_src_to_clb_output_delay(driver_block_id, src_block_pin_index, diff --git a/vpr/src/timing/clb_delay_calc.inl b/vpr/src/timing/clb_delay_calc.inl index e0ee78dd940..cabf04b620b 100644 --- a/vpr/src/timing/clb_delay_calc.inl +++ b/vpr/src/timing/clb_delay_calc.inl @@ -10,13 +10,11 @@ inline ClbDelayCalc::ClbDelayCalc() : intra_lb_pb_pin_lookup_(g_vpr_ctx.device().logical_block_types) {} inline float ClbDelayCalc::clb_input_to_internal_sink_delay(const ClusterBlockId block_id, const int pin_index, int internal_sink_pin, DelayType delay_type) const { - int pb_ipin = find_clb_pb_pin(block_id, pin_index); - return trace_delay(block_id, pb_ipin, internal_sink_pin, delay_type); + return trace_delay(block_id, pin_index, internal_sink_pin, delay_type); } inline float ClbDelayCalc::internal_src_to_clb_output_delay(const ClusterBlockId block_id, const int pin_index, int internal_src_pin, DelayType delay_type) const { - int pb_opin = find_clb_pb_pin(block_id, pin_index); - return trace_delay(block_id, internal_src_pin, pb_opin, delay_type); + return trace_delay(block_id, internal_src_pin, pin_index, delay_type); } inline float ClbDelayCalc::internal_src_to_internal_sink_delay(const ClusterBlockId clb, int internal_src_pin, int internal_sink_pin, DelayType delay_type) const { diff --git a/vpr/src/util/vpr_utils.cpp b/vpr/src/util/vpr_utils.cpp index 463a7273802..69e712d4999 100644 --- a/vpr/src/util/vpr_utils.cpp +++ b/vpr/src/util/vpr_utils.cpp @@ -6,6 +6,7 @@ #include "vtr_assert.h" #include "vtr_log.h" #include "vtr_memory.h" +#include "vtr_random.h" #include 
"vpr_types.h" #include "vpr_error.h" @@ -33,20 +34,9 @@ * while in the post-pack level, block pins are used. The reason block * * type is used instead of blocks is to save memories. */ -/* f_port_from_blk_pin array allow us to quickly find what port a block * - * pin corresponds to. * - * [0...device_ctx.logical_block_type.size()-1][0...blk_pin_count-1] * - * */ -static int** f_port_from_blk_pin = nullptr; - -/* f_port_pin_from_blk_pin array allow us to quickly find what port pin a* - * block pin corresponds to. * - * [0...device_ctx.logical_block_types.size()-1][0...blk_pin_count-1] */ -static int** f_port_pin_from_blk_pin = nullptr; - /* f_port_pin_to_block_pin array allows us to quickly find what block * * pin a port pin corresponds to. * - * [0...device_ctx.logical_block_types.size()-1][0...num_ports-1][0...num_port_pins-1] */ + * [0...device_ctx.physical_tile_types.size()-1][0...num_ports-1][0...num_port_pins-1] */ static int*** f_blk_pin_from_port_pin = nullptr; //Regular expressions used to determine register and logic primitives @@ -56,11 +46,6 @@ const std::regex LOGIC_MODEL_REGEX("(.subckt\\s+)?.*(lut|names|lcell).*", std::r /******************** Subroutine declarations ********************************/ -/* Allocates and loads f_port_from_blk_pin and f_port_pin_from_blk_pin * - * arrays. * - * The arrays are freed in free_placement_structs() */ -static void alloc_and_load_port_pin_from_blk_pin(); - /* Allocates and loads blk_pin_from_port_pin array. 
* * The arrays are freed in free_placement_structs() */ static void alloc_and_load_blk_pin_from_port_pin(); @@ -217,20 +202,17 @@ std::string block_type_pin_index_to_name(t_physical_tile_type_ptr type, int pin_ pin_name += "."; - t_pb_type* pb_type = logical_block_type(type)->pb_type; int curr_index = 0; - for (int iport = 0; iport < pb_type->num_ports; ++iport) { - t_port* port = &pb_type->ports[iport]; - - if (curr_index + port->num_pins > pin_index) { + for (auto const& port : type->ports) { + if (curr_index + port.num_pins > pin_index) { //This port contains the desired pin index int index_in_port = pin_index - curr_index; - pin_name += port->name; + pin_name += port.name; pin_name += "[" + std::to_string(index_in_port) + "]"; return pin_name; } - curr_index += port->num_pins; + curr_index += port.num_pins; } return ""; @@ -330,36 +312,42 @@ void swap(IntraLbPbPinLookup& lhs, IntraLbPbPinLookup& rhs) { //Returns the set of pins which are connected to the top level clb pin // The pin(s) may be input(s) or and output (returning the connected sinks or drivers respectively) -std::vector find_clb_pin_connected_atom_pins(ClusterBlockId clb, int clb_pin, const IntraLbPbPinLookup& pb_gpin_lookup) { +std::vector find_clb_pin_connected_atom_pins(ClusterBlockId clb, int logical_pin, const IntraLbPbPinLookup& pb_gpin_lookup) { std::vector atom_pins; + auto& clb_nlist = g_vpr_ctx.clustering().clb_nlist; + + auto logical_block = clb_nlist.block_type(clb); + auto physical_tile = pick_best_physical_type(logical_block); - if (is_opin(clb_pin, physical_tile_type(clb))) { + int physical_pin = get_physical_pin(physical_tile, logical_block, logical_pin); + + if (is_opin(physical_pin, physical_tile)) { //output - AtomPinId driver = find_clb_pin_driver_atom_pin(clb, clb_pin, pb_gpin_lookup); + AtomPinId driver = find_clb_pin_driver_atom_pin(clb, logical_pin, pb_gpin_lookup); if (driver) { atom_pins.push_back(driver); } } else { //input - atom_pins = find_clb_pin_sink_atom_pins(clb, 
clb_pin, pb_gpin_lookup); + atom_pins = find_clb_pin_sink_atom_pins(clb, logical_pin, pb_gpin_lookup); } return atom_pins; } //Returns the atom pin which drives the top level clb output pin -AtomPinId find_clb_pin_driver_atom_pin(ClusterBlockId clb, int clb_pin, const IntraLbPbPinLookup& pb_gpin_lookup) { +AtomPinId find_clb_pin_driver_atom_pin(ClusterBlockId clb, int logical_pin, const IntraLbPbPinLookup& pb_gpin_lookup) { auto& cluster_ctx = g_vpr_ctx.clustering(); auto& atom_ctx = g_vpr_ctx.atom(); - int pb_pin_id = find_clb_pb_pin(clb, clb_pin); - if (pb_pin_id < 0) { + if (logical_pin < 0) { //CLB output pin has no internal driver return AtomPinId::INVALID(); } const t_pb_routes& pb_routes = cluster_ctx.clb_nlist.block_pb(clb)->pb_route; - AtomNetId atom_net = pb_routes[pb_pin_id].atom_net_id; + AtomNetId atom_net = pb_routes[logical_pin].atom_net_id; + int pb_pin_id = logical_pin; //Trace back until the driver is reached while (pb_routes[pb_pin_id].driver_pb_pin_id >= 0) { pb_pin_id = pb_routes[pb_pin_id].driver_pb_pin_id; @@ -378,29 +366,27 @@ AtomPinId find_clb_pin_driver_atom_pin(ClusterBlockId clb, int clb_pin, const In } //Returns the set of atom sink pins associated with the top level clb input pin -std::vector find_clb_pin_sink_atom_pins(ClusterBlockId clb, int clb_pin, const IntraLbPbPinLookup& pb_gpin_lookup) { +std::vector find_clb_pin_sink_atom_pins(ClusterBlockId clb, int logical_pin, const IntraLbPbPinLookup& pb_gpin_lookup) { auto& cluster_ctx = g_vpr_ctx.clustering(); auto& atom_ctx = g_vpr_ctx.atom(); const t_pb_routes& pb_routes = cluster_ctx.clb_nlist.block_pb(clb)->pb_route; - VTR_ASSERT_MSG(clb_pin < physical_tile_type(clb)->num_pins, "Must be a valid top-level pin"); - - int pb_pin = find_clb_pb_pin(clb, clb_pin); + VTR_ASSERT_MSG(logical_pin < cluster_ctx.clb_nlist.block_type(clb)->pb_type->num_pins, "Must be a valid tile pin"); VTR_ASSERT(cluster_ctx.clb_nlist.block_pb(clb)); - VTR_ASSERT_MSG(pb_pin < 
cluster_ctx.clb_nlist.block_pb(clb)->pb_graph_node->num_pins(), "Pin must map to a top-level pb pin"); + VTR_ASSERT_MSG(logical_pin < cluster_ctx.clb_nlist.block_pb(clb)->pb_graph_node->num_pins(), "Pin must map to a top-level pb pin"); - VTR_ASSERT_MSG(pb_routes[pb_pin].driver_pb_pin_id < 0, "CLB input pin should have no internal drivers"); + VTR_ASSERT_MSG(pb_routes[logical_pin].driver_pb_pin_id < 0, "CLB input pin should have no internal drivers"); - AtomNetId atom_net = pb_routes[pb_pin].atom_net_id; + AtomNetId atom_net = pb_routes[logical_pin].atom_net_id; VTR_ASSERT(atom_net); - std::vector connected_sink_pb_pins = find_connected_internal_clb_sink_pins(clb, pb_pin); + std::vector connected_sink_pb_pins = find_connected_internal_clb_sink_pins(clb, logical_pin); std::vector sink_atom_pins; for (int sink_pb_pin : connected_sink_pb_pins) { - //Map the pb_pin_id to AtomPinId + //Map the logical_pin_id to AtomPinId AtomPinId atom_pin = find_atom_pin_for_pb_route_id(clb, sink_pb_pin, pb_gpin_lookup); VTR_ASSERT(atom_pin); @@ -519,68 +505,13 @@ std::tuple find_pb_route_clb_input_net_pin(ClusterBlockI return std::make_tuple(ClusterNetId::INVALID(), -1, -1); } - //To account for capacity > 1 blocks we need to convert the pb_pin to the clb pin - int clb_pin = find_pb_pin_clb_pin(clb, curr_pb_pin_id); - VTR_ASSERT(clb_pin >= 0); - - //clb_pin should be a top-level CLB input - ClusterNetId clb_net_idx = cluster_ctx.clb_nlist.block_net(clb, clb_pin); - int clb_net_pin_idx = cluster_ctx.clb_nlist.block_pin_net_index(clb, clb_pin); + //curr_pb_pin should be a top-level CLB input + ClusterNetId clb_net_idx = cluster_ctx.clb_nlist.block_net(clb, curr_pb_pin_id); + int clb_net_pin_idx = cluster_ctx.clb_nlist.block_pin_net_index(clb, curr_pb_pin_id); VTR_ASSERT(clb_net_idx != ClusterNetId::INVALID()); VTR_ASSERT(clb_net_pin_idx >= 0); - return std::tuple(clb_net_idx, clb_pin, clb_net_pin_idx); -} - -//Return the pb pin index corresponding to the pin clb_pin on block clb -// 
Given a clb_pin index on a this function will return the corresponding -// pin index on the pb_type (accounting for the possible z-coordinate offset). -int find_clb_pb_pin(ClusterBlockId clb, int clb_pin) { - auto& place_ctx = g_vpr_ctx.placement(); - - auto type = physical_tile_type(clb); - VTR_ASSERT_MSG(clb_pin < type->num_pins, "Must be a valid top-level pin"); - - int pb_pin = -1; - if (place_ctx.block_locs[clb].nets_and_pins_synced_to_z_coordinate) { - //Pins have been offset by z-coordinate, need to remove offset - - VTR_ASSERT(type->num_pins % type->capacity == 0); - int num_basic_block_pins = type->num_pins / type->capacity; - /* Logical location and physical location is offset by z * max_num_block_pins */ - - pb_pin = clb_pin - place_ctx.block_locs[clb].loc.z * num_basic_block_pins; - } else { - //No offset - pb_pin = clb_pin; - } - - VTR_ASSERT(pb_pin >= 0); - - return pb_pin; -} - -//Inverse of find_clb_pb_pin() -int find_pb_pin_clb_pin(ClusterBlockId clb, int pb_pin) { - auto& place_ctx = g_vpr_ctx.placement(); - - auto type = physical_tile_type(clb); - - int clb_pin = -1; - if (place_ctx.block_locs[clb].nets_and_pins_synced_to_z_coordinate) { - //Pins have been offset by z-coordinate, need to remove offset - VTR_ASSERT(type->num_pins % type->capacity == 0); - int num_basic_block_pins = type->num_pins / type->capacity; - /* Logical location and physical location is offset by z * max_num_block_pins */ - - clb_pin = pb_pin + place_ctx.block_locs[clb].loc.z * num_basic_block_pins; - } else { - //No offset - clb_pin = pb_pin; - } - VTR_ASSERT(clb_pin >= 0); - - return clb_pin; + return std::tuple(clb_net_idx, curr_pb_pin_id, clb_net_pin_idx); } bool is_clb_external_pin(ClusterBlockId blk_id, int pb_pin_id) { @@ -633,33 +564,23 @@ bool is_io_type(t_physical_tile_type_ptr type) { bool is_empty_type(t_physical_tile_type_ptr type) { auto& device_ctx = g_vpr_ctx.device(); - return type == device_ctx.EMPTY_TYPE; + return type == 
device_ctx.EMPTY_PHYSICAL_TILE_TYPE; } -t_physical_tile_type_ptr physical_tile_type(t_logical_block_type_ptr logical_block_type) { +bool is_empty_type(t_logical_block_type_ptr type) { auto& device_ctx = g_vpr_ctx.device(); - /* It is assumed that there is a 1:1 mapping between logical and physical types - * making it possible to use the same index to access the corresponding type - */ - return &device_ctx.physical_tile_types[logical_block_type->index]; + return type == device_ctx.EMPTY_LOGICAL_BLOCK_TYPE; } t_physical_tile_type_ptr physical_tile_type(ClusterBlockId blk) { - auto& cluster_ctx = g_vpr_ctx.clustering(); - - auto blk_type = cluster_ctx.clb_nlist.block_type(blk); - - return physical_tile_type(blk_type); -} - -t_logical_block_type_ptr logical_block_type(t_physical_tile_type_ptr physical_tile_type) { + auto& place_ctx = g_vpr_ctx.placement(); auto& device_ctx = g_vpr_ctx.device(); - /* It is assumed that there is a 1:1 mapping between logical and physical types - * making it possible to use the same index to access the corresponding type - */ - return &device_ctx.logical_block_types[physical_tile_type->index]; + auto block_loc = place_ctx.block_locs[blk]; + auto loc = block_loc.loc; + + return device_ctx.grid[loc.x][loc.y].type; } /* Each node in the pb_graph for a top-level pb_type can be uniquely identified @@ -715,7 +636,7 @@ void get_pin_range_for_block(const ClusterBlockId blk_id, *pin_high = (place_ctx.block_locs[blk_id].loc.z + 1) * (type->num_pins / type->capacity) - 1; } -t_physical_tile_type_ptr find_block_type_by_name(std::string name, const std::vector& types) { +t_physical_tile_type_ptr find_tile_type_by_name(std::string name, const std::vector& types) { for (auto const& type : types) { if (type.name == name) { return &type; @@ -746,7 +667,14 @@ t_logical_block_type_ptr infer_logic_block_type(const DeviceGrid& grid) { //Sort the candidates by the most common block type auto by_desc_grid_count = [&](t_logical_block_type_ptr lhs, 
t_logical_block_type_ptr rhs) { - return grid.num_instances(physical_tile_type(lhs)) > grid.num_instances(physical_tile_type(rhs)); + int lhs_num_instances = 0; + int rhs_num_instances = 0; + // Count number of instances for each type + for (auto type : lhs->equivalent_tiles) + lhs_num_instances += grid.num_instances(type); + for (auto type : rhs->equivalent_tiles) + rhs_num_instances += grid.num_instances(type); + return lhs_num_instances > rhs_num_instances; }; std::stable_sort(logic_block_candidates.begin(), logic_block_candidates.end(), by_desc_grid_count); @@ -765,34 +693,66 @@ t_logical_block_type_ptr infer_logic_block_type(const DeviceGrid& grid) { t_logical_block_type_ptr find_most_common_block_type(const DeviceGrid& grid) { auto& device_ctx = g_vpr_ctx.device(); + t_logical_block_type_ptr max_type = nullptr; + size_t max_count = 0; + for (const auto& logical_block : device_ctx.logical_block_types) { + size_t inst_cnt = 0; + for (const auto& equivalent_tile : logical_block.equivalent_tiles) { + inst_cnt += grid.num_instances(equivalent_tile); + } + + if (max_count < inst_cnt) { + max_count = inst_cnt; + max_type = &logical_block; + } + } + + if (max_type == nullptr) { + VTR_LOG_WARN("Unable to determine most common block type (perhaps the device grid was empty?)\n"); + } + + return max_type; +} + +t_physical_tile_type_ptr find_most_common_tile_type(const DeviceGrid& grid) { + auto& device_ctx = g_vpr_ctx.device(); + t_physical_tile_type_ptr max_type = nullptr; size_t max_count = 0; - for (const auto& type : device_ctx.physical_tile_types) { - size_t inst_cnt = grid.num_instances(&type); + for (const auto& physical_tile : device_ctx.physical_tile_types) { + size_t inst_cnt = grid.num_instances(&physical_tile); + if (max_count < inst_cnt) { max_count = inst_cnt; - max_type = &type; + max_type = &physical_tile; } } if (max_type == nullptr) { VTR_LOG_WARN("Unable to determine most common block type (perhaps the device grid was empty?)\n"); - return nullptr; } - 
return logical_block_type(max_type); + + return max_type; } InstPort parse_inst_port(std::string str) { InstPort inst_port(str); auto& device_ctx = g_vpr_ctx.device(); - auto blk_type = find_block_type_by_name(inst_port.instance_name(), device_ctx.physical_tile_types); - if (!blk_type) { + auto blk_type = find_tile_type_by_name(inst_port.instance_name(), device_ctx.physical_tile_types); + if (blk_type == nullptr) { VPR_FATAL_ERROR(VPR_ERROR_ARCH, "Failed to find block type named %s", inst_port.instance_name().c_str()); } - const t_port* port = find_pb_graph_port(logical_block_type(blk_type)->pb_graph_head, inst_port.port_name()); - if (!port) { + int num_pins = OPEN; + for (auto physical_port : blk_type->ports) { + if (0 == strcmp(inst_port.port_name().c_str(), physical_port.name)) { + num_pins = physical_port.num_pins; + break; + } + } + + if (num_pins == OPEN) { VPR_FATAL_ERROR(VPR_ERROR_ARCH, "Failed to find port %s on block type %s", inst_port.port_name().c_str(), inst_port.instance_name().c_str()); } @@ -800,55 +760,51 @@ InstPort parse_inst_port(std::string str) { VTR_ASSERT(inst_port.port_high_index() == InstPort::UNSPECIFIED); inst_port.set_port_low_index(0); - inst_port.set_port_high_index(port->num_pins - 1); + inst_port.set_port_high_index(num_pins - 1); } else { - if (inst_port.port_low_index() < 0 || inst_port.port_low_index() >= port->num_pins - || inst_port.port_high_index() < 0 || inst_port.port_high_index() >= port->num_pins) { + if (inst_port.port_low_index() < 0 || inst_port.port_low_index() >= num_pins + || inst_port.port_high_index() < 0 || inst_port.port_high_index() >= num_pins) { VPR_FATAL_ERROR(VPR_ERROR_ARCH, "Pin indices [%d:%d] on port %s of block type %s out of expected range [%d:%d]", inst_port.port_low_index(), inst_port.port_high_index(), inst_port.port_name().c_str(), inst_port.instance_name().c_str(), - 0, port->num_pins - 1); + 0, num_pins - 1); } } return inst_port; } //Returns the pin class associated with the specified 
pin_index_in_port within the port port_name on type -int find_pin_class(t_logical_block_type_ptr type, std::string port_name, int pin_index_in_port, e_pin_type pin_type) { +int find_pin_class(t_physical_tile_type_ptr type, std::string port_name, int pin_index_in_port, e_pin_type pin_type) { int iclass = OPEN; int ipin = find_pin(type, port_name, pin_index_in_port); if (ipin != OPEN) { - iclass = physical_tile_type(type)->pin_class[ipin]; + iclass = type->pin_class[ipin]; if (iclass != OPEN) { - VTR_ASSERT(physical_tile_type(type)->class_inf[iclass].type == pin_type); + VTR_ASSERT(type->class_inf[iclass].type == pin_type); } } return iclass; } -int find_pin(t_logical_block_type_ptr type, std::string port_name, int pin_index_in_port) { +int find_pin(t_physical_tile_type_ptr type, std::string port_name, int pin_index_in_port) { int ipin = OPEN; - - t_pb_type* pb_type = type->pb_type; - t_port* matched_port = nullptr; int port_base_ipin = 0; - for (int iport = 0; iport < pb_type->num_ports; ++iport) { - t_port* port = &pb_type->ports[iport]; + int num_pins = OPEN; - if (port->name == port_name) { - matched_port = port; + for (auto port : type->ports) { + if (port.name == port_name) { + num_pins = port.num_pins; break; } - port_base_ipin += port->num_pins; + port_base_ipin += port.num_pins; } - if (matched_port) { - VTR_ASSERT(matched_port->name == port_name); - VTR_ASSERT(pin_index_in_port < matched_port->num_pins); + if (num_pins != OPEN) { + VTR_ASSERT(pin_index_in_port < num_pins); ipin = port_base_ipin + pin_index_in_port; } @@ -1573,124 +1529,6 @@ void free_pb_stats(t_pb* pb) { * * ***************************************************************************************/ -void get_port_pin_from_blk_pin(int blk_type_index, int blk_pin, int* port, int* port_pin) { - /* These two mappings are needed since there are two different netlist * - * conventions - in the cluster level, ports and port pins are used * - * while in the post-pack level, block pins are used. 
The reason block * - * type is used instead of blocks is that the mapping is the same for * - * blocks belonging to the same block type. * - * * - * f_port_from_blk_pin array allow us to quickly find what port a * - * block pin corresponds to. * - * [0...device_ctx.logical_block_types.size()-1][0...blk_pin_count-1] * - * * - * f_port_pin_from_blk_pin array allow us to quickly find what port * - * pin a block pin corresponds to. * - * [0...device_ctx.logical_block_types.size()-1][0...blk_pin_count-1] */ - - /* If either one of the arrays is not allocated and loaded, it is * - * corrupted, so free both of them. */ - if ((f_port_from_blk_pin == nullptr && f_port_pin_from_blk_pin != nullptr) - || (f_port_from_blk_pin != nullptr && f_port_pin_from_blk_pin == nullptr)) { - free_port_pin_from_blk_pin(); - } - - /* If the arrays are not allocated and loaded, allocate it. */ - if (f_port_from_blk_pin == nullptr && f_port_pin_from_blk_pin == nullptr) { - alloc_and_load_port_pin_from_blk_pin(); - } - - /* Return the port and port_pin for the pin. */ - *port = f_port_from_blk_pin[blk_type_index][blk_pin]; - *port_pin = f_port_pin_from_blk_pin[blk_type_index][blk_pin]; -} - -void free_port_pin_from_blk_pin() { - /* Frees the f_port_from_blk_pin and f_port_pin_from_blk_pin arrays. * - * * - * This function is called when the file-scope arrays are corrupted. 
* - * Otherwise, the arrays are freed in free_placement_structs() */ - - unsigned int itype; - - auto& device_ctx = g_vpr_ctx.device(); - - if (f_port_from_blk_pin != nullptr) { - for (itype = 1; itype < device_ctx.logical_block_types.size(); itype++) { - free(f_port_from_blk_pin[itype]); - } - free(f_port_from_blk_pin); - - f_port_from_blk_pin = nullptr; - } - - if (f_port_pin_from_blk_pin != nullptr) { - for (itype = 1; itype < device_ctx.logical_block_types.size(); itype++) { - free(f_port_pin_from_blk_pin[itype]); - } - free(f_port_pin_from_blk_pin); - - f_port_pin_from_blk_pin = nullptr; - } -} - -static void alloc_and_load_port_pin_from_blk_pin() { - /* Allocates and loads f_port_from_blk_pin and f_port_pin_from_blk_pin * - * arrays. * - * * - * The arrays are freed in free_placement_structs() */ - - int** temp_port_from_blk_pin = nullptr; - int** temp_port_pin_from_blk_pin = nullptr; - unsigned int itype; - int iblk_pin, iport, iport_pin; - int blk_pin_count, num_port_pins, num_ports; - auto& device_ctx = g_vpr_ctx.device(); - - /* Allocate and initialize the values to OPEN (-1). */ - temp_port_from_blk_pin = (int**)vtr::malloc(device_ctx.logical_block_types.size() * sizeof(int*)); - temp_port_pin_from_blk_pin = (int**)vtr::malloc(device_ctx.logical_block_types.size() * sizeof(int*)); - for (const auto& type : device_ctx.logical_block_types) { - itype = type.index; - blk_pin_count = physical_tile_type(&type)->num_pins; - - temp_port_from_blk_pin[itype] = (int*)vtr::malloc(blk_pin_count * sizeof(int)); - temp_port_pin_from_blk_pin[itype] = (int*)vtr::malloc(blk_pin_count * sizeof(int)); - - for (iblk_pin = 0; iblk_pin < blk_pin_count; iblk_pin++) { - temp_port_from_blk_pin[itype][iblk_pin] = OPEN; - temp_port_pin_from_blk_pin[itype][iblk_pin] = OPEN; - } - } - - /* Load the values */ - for (const auto& type : device_ctx.logical_block_types) { - itype = type.index; - - /* itype starts from 1 since device_ctx.logical_block_types[0] is the EMPTY_TYPE. 
*/ - if (itype == 0) { - continue; - } - - blk_pin_count = 0; - num_ports = type.pb_type->num_ports; - - for (iport = 0; iport < num_ports; iport++) { - num_port_pins = type.pb_type->ports[iport].num_pins; - - for (iport_pin = 0; iport_pin < num_port_pins; iport_pin++) { - temp_port_from_blk_pin[itype][blk_pin_count] = iport; - temp_port_pin_from_blk_pin[itype][blk_pin_count] = iport_pin; - blk_pin_count++; - } - } - } - - /* Sets the file_scope variables to point at the arrays. */ - f_port_from_blk_pin = temp_port_from_blk_pin; - f_port_pin_from_blk_pin = temp_port_pin_from_blk_pin; -} - void get_blk_pin_from_port_pin(int blk_type_index, int port, int port_pin, int* blk_pin) { /* This mapping is needed since there are two different netlist * * conventions - in the cluster level, ports and port pins are used * @@ -1720,15 +1558,15 @@ void free_blk_pin_from_port_pin() { auto& device_ctx = g_vpr_ctx.device(); if (f_blk_pin_from_port_pin != nullptr) { - for (const auto& type : device_ctx.logical_block_types) { + for (const auto& type : device_ctx.physical_tile_types) { int itype = type.index; - // Avoid EMPTY_TYPE + // Avoid EMPTY_PHYSICAL_TILE_TYPE if (itype == 0) { continue; } - num_ports = type.pb_type->num_ports; + num_ports = type.ports.size(); for (iport = 0; iport < num_ports; iport++) { free(f_blk_pin_from_port_pin[itype][iport]); } @@ -1752,12 +1590,12 @@ static void alloc_and_load_blk_pin_from_port_pin() { auto& device_ctx = g_vpr_ctx.device(); /* Allocate and initialize the values to OPEN (-1). 
*/ - temp_blk_pin_from_port_pin = (int***)vtr::malloc(device_ctx.logical_block_types.size() * sizeof(int**)); - for (itype = 1; itype < device_ctx.logical_block_types.size(); itype++) { - num_ports = device_ctx.logical_block_types[itype].pb_type->num_ports; + temp_blk_pin_from_port_pin = (int***)vtr::malloc(device_ctx.physical_tile_types.size() * sizeof(int**)); + for (itype = 1; itype < device_ctx.physical_tile_types.size(); itype++) { + num_ports = device_ctx.physical_tile_types[itype].ports.size(); temp_blk_pin_from_port_pin[itype] = (int**)vtr::malloc(num_ports * sizeof(int*)); for (iport = 0; iport < num_ports; iport++) { - num_port_pins = device_ctx.logical_block_types[itype].pb_type->ports[iport].num_pins; + num_port_pins = device_ctx.physical_tile_types[itype].ports[iport].num_pins; temp_blk_pin_from_port_pin[itype][iport] = (int*)vtr::malloc(num_port_pins * sizeof(int)); for (iport_pin = 0; iport_pin < num_port_pins; iport_pin++) { @@ -1767,12 +1605,12 @@ static void alloc_and_load_blk_pin_from_port_pin() { } /* Load the values */ - /* itype starts from 1 since device_ctx.block_types[0] is the EMPTY_TYPE. */ - for (itype = 1; itype < device_ctx.logical_block_types.size(); itype++) { + /* itype starts from 1 since device_ctx.block_types[0] is the EMPTY_PHYSICAL_TILE_TYPE. 
*/ + for (itype = 1; itype < device_ctx.physical_tile_types.size(); itype++) { blk_pin_count = 0; - num_ports = device_ctx.logical_block_types[itype].pb_type->num_ports; + num_ports = device_ctx.physical_tile_types[itype].ports.size(); for (iport = 0; iport < num_ports; iport++) { - num_port_pins = device_ctx.logical_block_types[itype].pb_type->ports[iport].num_pins; + num_port_pins = device_ctx.physical_tile_types[itype].ports[iport].num_pins; for (iport_pin = 0; iport_pin < num_port_pins; iport_pin++) { temp_blk_pin_from_port_pin[itype][iport][iport_pin] = blk_pin_count; blk_pin_count++; @@ -1933,14 +1771,15 @@ static void mark_direct_of_ports(int idirect, int direct_type, char* pb_type_nam auto& device_ctx = g_vpr_ctx.device(); // Go through all the block types - for (itype = 1; itype < device_ctx.logical_block_types.size(); itype++) { + for (itype = 1; itype < device_ctx.physical_tile_types.size(); itype++) { + auto& physical_tile = device_ctx.physical_tile_types[itype]; // Find blocks with the same pb_type_name - if (strcmp(device_ctx.logical_block_types[itype].pb_type->name, pb_type_name) == 0) { - num_ports = device_ctx.logical_block_types[itype].pb_type->num_ports; + if (strcmp(physical_tile.name, pb_type_name) == 0) { + num_ports = physical_tile.ports.size(); for (iport = 0; iport < num_ports; iport++) { // Find ports with the same port_name - if (strcmp(device_ctx.logical_block_types[itype].pb_type->ports[iport].name, port_name) == 0) { - num_port_pins = device_ctx.logical_block_types[itype].pb_type->ports[iport].num_pins; + if (strcmp(physical_tile.ports[iport].name, port_name) == 0) { + num_port_pins = physical_tile.ports[iport].num_pins; // Check whether the end_pin_index is valid if (end_pin_index > num_port_pins) { @@ -1997,11 +1836,13 @@ void alloc_and_load_idirect_from_blk_pin(t_direct_inf* directs, int num_directs, auto& device_ctx = g_vpr_ctx.device(); /* Allocate and initialize the values to OPEN (-1). 
*/ - temp_idirect_from_blk_pin = (int**)vtr::malloc(device_ctx.logical_block_types.size() * sizeof(int*)); - temp_direct_type_from_blk_pin = (int**)vtr::malloc(device_ctx.logical_block_types.size() * sizeof(int*)); - for (const auto& type : device_ctx.logical_block_types) { + temp_idirect_from_blk_pin = (int**)vtr::malloc(device_ctx.physical_tile_types.size() * sizeof(int*)); + temp_direct_type_from_blk_pin = (int**)vtr::malloc(device_ctx.physical_tile_types.size() * sizeof(int*)); + for (const auto& type : device_ctx.physical_tile_types) { + if (is_empty_type(&type)) continue; + int itype = type.index; - num_type_pins = physical_tile_type(&type)->num_pins; + num_type_pins = type.num_pins; temp_idirect_from_blk_pin[itype] = (int*)vtr::malloc(num_type_pins * sizeof(int)); temp_direct_type_from_blk_pin[itype] = (int*)vtr::malloc(num_type_pins * sizeof(int)); @@ -2210,24 +2051,40 @@ void print_switch_usage() { */ void place_sync_external_block_connections(ClusterBlockId iblk) { - auto& cluster_ctx = g_vpr_ctx.mutable_clustering(); + auto& cluster_ctx = g_vpr_ctx.clustering(); + auto& clb_nlist = cluster_ctx.clb_nlist; auto& place_ctx = g_vpr_ctx.mutable_placement(); - VTR_ASSERT_MSG(place_ctx.block_locs[iblk].nets_and_pins_synced_to_z_coordinate == false, "Block net and pins must not be already synced"); - auto type = physical_tile_type(iblk); - VTR_ASSERT(type->num_pins % type->capacity == 0); - int max_num_block_pins = type->num_pins / type->capacity; + auto physical_tile = physical_tile_type(iblk); + auto logical_block = clb_nlist.block_type(iblk); + + VTR_ASSERT(physical_tile->num_pins % physical_tile->capacity == 0); + int max_num_block_pins = physical_tile->num_pins / physical_tile->capacity; /* Logical location and physical location is offset by z * max_num_block_pins */ - auto& clb_nlist = cluster_ctx.clb_nlist; for (auto pin : clb_nlist.block_pins(iblk)) { - int orig_phys_pin_index = clb_nlist.pin_physical_index(pin); - int new_phys_pin_index = 
orig_phys_pin_index + place_ctx.block_locs[iblk].loc.z * max_num_block_pins; - clb_nlist.set_pin_physical_index(pin, new_phys_pin_index); + int logical_pin_index = clb_nlist.pin_logical_index(pin); + int physical_pin_index = get_physical_pin(physical_tile, logical_block, logical_pin_index); + + int new_physical_pin_index = physical_pin_index + place_ctx.block_locs[iblk].loc.z * max_num_block_pins; + + auto result = place_ctx.physical_pins.find(pin); + if (result != place_ctx.physical_pins.end()) { + place_ctx.physical_pins[pin] = new_physical_pin_index; + } else { + place_ctx.physical_pins.insert(pin, new_physical_pin_index); + } } +} - //Mark the block as synced - place_ctx.block_locs[iblk].nets_and_pins_synced_to_z_coordinate = true; +int get_max_num_pins(t_logical_block_type_ptr logical_block) { + int max_num_pins = 0; + + for (auto physical_tile : logical_block->equivalent_tiles) { + max_num_pins = std::max(max_num_pins, physical_tile->num_pins); + } + + return max_num_pins; } int max_pins_per_grid_tile() { @@ -2241,6 +2098,75 @@ int max_pins_per_grid_tile() { return max_pins; } +bool is_tile_compatible(t_physical_tile_type_ptr physical_tile, t_logical_block_type_ptr logical_block) { + auto equivalent_tiles = logical_block->equivalent_tiles; + return std::find(equivalent_tiles.begin(), equivalent_tiles.end(), physical_tile) != equivalent_tiles.end(); +} + +/** + * This function returns the most common physical tile type given a logical block + */ +t_physical_tile_type_ptr pick_best_physical_type(t_logical_block_type_ptr logical_block) { + return logical_block->equivalent_tiles[0]; +} + +t_logical_block_type_ptr pick_best_logical_type(t_physical_tile_type_ptr physical_tile) { + return physical_tile->equivalent_sites[0]; +} + +int get_logical_pin(t_physical_tile_type_ptr physical_tile, + t_logical_block_type_ptr logical_block, + int pin) { + t_physical_pin physical_pin(pin); + + auto direct_map = physical_tile->tile_block_pin_directs_map.at(logical_block->index); 
+ auto result = direct_map.find(physical_pin); + + if (result == direct_map.inverse_end()) { + VTR_LOG_WARN( + "Couldn't find the corresponding logical pin of the physical pin %d." + "Physical Tile: %s, Logical Block: %s.\n", + pin, physical_tile->name, logical_block->name); + return OPEN; + } + + return result->second.pin; +} + +int get_physical_pin(t_physical_tile_type_ptr physical_tile, + t_logical_block_type_ptr logical_block, + int pin) { + t_logical_pin logical_pin(pin); + + auto direct_map = physical_tile->tile_block_pin_directs_map.at(logical_block->index); + auto result = direct_map.find(logical_pin); + + if (result == direct_map.end()) { + VTR_LOG_WARN( + "Couldn't find the corresponding physical pin of the logical pin %d." + "Physical Tile: %s, Logical Block: %s.\n", + pin, physical_tile->name, logical_block->name); + return OPEN; + } + + return result->second.pin; +} + +int net_pin_to_tile_pin_index(const ClusterNetId net_id, int net_pin_index) { + auto& cluster_ctx = g_vpr_ctx.clustering(); + + // Get the logical pin index of pin within it's logical block type + auto pin_id = cluster_ctx.clb_nlist.net_pin(net_id, net_pin_index); + + return tile_pin_index(pin_id); +} + +int tile_pin_index(const ClusterPinId pin) { + auto& place_ctx = g_vpr_ctx.placement(); + + return place_ctx.physical_pins[pin]; +} + void pretty_print_uint(const char* prefix, size_t value, int num_digits, int scientific_precision) { //Print as integer if it will fit in the width, other wise scientific if (value <= std::pow(10, num_digits) - 1) { diff --git a/vpr/src/util/vpr_utils.h b/vpr/src/util/vpr_utils.h index 5209db5aa00..d05dc88173e 100644 --- a/vpr/src/util/vpr_utils.h +++ b/vpr/src/util/vpr_utils.h @@ -26,11 +26,10 @@ bool is_input_type(t_physical_tile_type_ptr type); bool is_output_type(t_physical_tile_type_ptr type); bool is_io_type(t_physical_tile_type_ptr type); bool is_empty_type(t_physical_tile_type_ptr type); +bool is_empty_type(t_logical_block_type_ptr type); 
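The `get_logical_pin()` / `get_physical_pin()` pair above, and the z-offset applied in `place_sync_external_block_connections()`, can be sketched in isolation. This is a simplified model, not VPR's actual types: `PinDirectMap` stands in for the per-logical-block entry of `tile_block_pin_directs_map` (which in VPR is a bimap of `t_logical_pin`/`t_physical_pin`), and `physical_pin_with_z` is a hypothetical helper that only shows the `z * max_num_block_pins` offset arithmetic from the patch.

```cpp
#include <map>

// Illustrative stand-in for one logical block's entry in
// tile_block_pin_directs_map: a forward map (logical pin -> physical pin)
// plus its inverse, queried the way get_physical_pin()/get_logical_pin()
// query the bimap above. Returns -1 (OPEN) when no mapping exists.
struct PinDirectMap {
    std::map<int, int> logical_to_physical;
    std::map<int, int> physical_to_logical;

    void insert(int logical_pin, int physical_pin) {
        logical_to_physical[logical_pin] = physical_pin;
        physical_to_logical[physical_pin] = logical_pin;
    }

    int get_physical_pin(int logical_pin) const {
        auto it = logical_to_physical.find(logical_pin);
        return it == logical_to_physical.end() ? -1 : it->second;
    }

    int get_logical_pin(int physical_pin) const {
        auto it = physical_to_logical.find(physical_pin);
        return it == physical_to_logical.end() ? -1 : it->second;
    }
};

// Hypothetical helper showing only the offset arithmetic used when syncing
// block pins after placement: for a tile with capacity > 1, the pin index of
// the instance at sub-position z is shifted by z * pins_per_instance.
int physical_pin_with_z(int physical_pin, int z, int pins_per_instance) {
    return physical_pin + z * pins_per_instance;
}
```

A miss in either direction maps to `OPEN` (here `-1`), matching the warning-and-return-OPEN behavior of the two lookup functions in the patch.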
-//Returns the corresponding physical/logical type given the logical/physical type as parameter -t_physical_tile_type_ptr physical_tile_type(t_logical_block_type_ptr logical_block_type); +//Returns the corresponding physical type given the logical type as parameter t_physical_tile_type_ptr physical_tile_type(ClusterBlockId blk); -t_logical_block_type_ptr logical_block_type(t_physical_tile_type_ptr physical_tile_type); int get_unique_pb_graph_node_id(const t_pb_graph_node* pb_graph_node); @@ -76,32 +75,16 @@ class IntraLbPbPinLookup { }; //Find the atom pins (driver or sinks) connected to the specified top-level CLB pin -std::vector find_clb_pin_connected_atom_pins(ClusterBlockId clb, int clb_pin, const IntraLbPbPinLookup& pb_gpin_lookup); +std::vector find_clb_pin_connected_atom_pins(ClusterBlockId clb, int logical_pin, const IntraLbPbPinLookup& pb_gpin_lookup); //Find the atom pin driving to the specified top-level CLB pin -AtomPinId find_clb_pin_driver_atom_pin(ClusterBlockId clb, int clb_pin, const IntraLbPbPinLookup& pb_gpin_lookup); +AtomPinId find_clb_pin_driver_atom_pin(ClusterBlockId clb, int logical_pin, const IntraLbPbPinLookup& pb_gpin_lookup); //Find the atom pins driven by the specified top-level CLB pin -std::vector find_clb_pin_sink_atom_pins(ClusterBlockId clb, int clb_pin, const IntraLbPbPinLookup& pb_gpin_lookup); +std::vector find_clb_pin_sink_atom_pins(ClusterBlockId clb, int logical_pin, const IntraLbPbPinLookup& pb_gpin_lookup); std::tuple find_pb_route_clb_input_net_pin(ClusterBlockId clb, int sink_pb_route_id); -//Return the pb pin index corresponding to the pin clb_pin on block clb, -//acounting for the effect of 'z' position > 0. -// -// Note that a CLB pin index does not (neccessarily) map directly to the pb_route index representing the first stage -// of internal routing in the block, since a block may have capacity > 1 (e.g. 
IOs) -// -// In the clustered netlist blocks with capacity > 1 may have their 'z' position > 0, and their clb pin indicies offset -// by the number of pins on the type (c.f. post_place_sync()). -// -// This offset is not mirrored in the t_pb or pb graph, so we need to recover the basic pin index before processing -// further -- which is what this function does. -int find_clb_pb_pin(ClusterBlockId clb, int clb_pin); - -//Return the clb_pin corresponding to the pb_pin on the specified block -int find_pb_pin_clb_pin(ClusterBlockId clb, int pb_pin); - //Returns the port matching name within pb_gnode const t_port* find_pb_graph_port(const t_pb_graph_node* pb_gnode, std::string port_name); @@ -110,19 +93,22 @@ const t_pb_graph_pin* find_pb_graph_pin(const t_pb_graph_node* pb_gnode, std::st AtomPinId find_atom_pin(ClusterBlockId blk_id, const t_pb_graph_pin* pb_gpin); -//Returns the block type matching name, or nullptr (if not found) -t_physical_tile_type_ptr find_block_type_by_name(std::string name, const std::vector& types); +//Returns the physical tile type matching a given physical tile type name, or nullptr (if not found) +t_physical_tile_type_ptr find_tile_type_by_name(std::string name, const std::vector& types); -//Returns the block type which is most common in the device grid +//Returns the logical block type which is most common in the device grid t_logical_block_type_ptr find_most_common_block_type(const DeviceGrid& grid); +//Returns the physical tile type which is most common in the device grid +t_physical_tile_type_ptr find_most_common_tile_type(const DeviceGrid& grid); + //Parses a block_name.port[x:y] (e.g. 
LAB.data_in[3:10]) pin range specification, if no pin range is specified //looks-up the block port and fills in the full range InstPort parse_inst_port(std::string str); -int find_pin_class(t_logical_block_type_ptr type, std::string port_name, int pin_index_in_port, e_pin_type pin_type); +int find_pin_class(t_physical_tile_type_ptr type, std::string port_name, int pin_index_in_port, e_pin_type pin_type); -int find_pin(t_logical_block_type_ptr type, std::string port_name, int pin_index_in_port); +int find_pin(t_physical_tile_type_ptr type, std::string port_name, int pin_index_in_port); //Returns the block type which is most likely the logic block t_logical_block_type_ptr infer_logic_block_type(const DeviceGrid& grid); @@ -148,9 +134,6 @@ void free_pin_id_to_pb_mapping(vtr::vector& pin_id_to_pb float compute_primitive_base_cost(const t_pb_graph_node* primitive); int num_ext_inputs_atom_block(AtomBlockId blk_id); -void get_port_pin_from_blk_pin(int blk_type_index, int blk_pin, int* port, int* port_pin); -void free_port_pin_from_blk_pin(); - void get_blk_pin_from_port_pin(int blk_type_index, int port, int port_pin, int* blk_pin); void free_blk_pin_from_port_pin(); @@ -168,6 +151,24 @@ void print_usage_by_wire_length(); AtomBlockId find_memory_sibling(const t_pb* pb); void place_sync_external_block_connections(ClusterBlockId iblk); +int get_max_num_pins(t_logical_block_type_ptr logical_block); + +bool is_tile_compatible(t_physical_tile_type_ptr physical_tile, t_logical_block_type_ptr logical_block); +t_physical_tile_type_ptr pick_best_physical_type(t_logical_block_type_ptr logical_block); +t_logical_block_type_ptr pick_best_logical_type(t_physical_tile_type_ptr physical_tile); + +int get_logical_pin(t_physical_tile_type_ptr physical_tile, + t_logical_block_type_ptr logical_block, + int pin); +int get_physical_pin(t_physical_tile_type_ptr physical_tile, + t_logical_block_type_ptr logical_block, + int pin); + +//Returns the physical pin of the tile, related to the given 
ClusterNedId, and the net pin index +int net_pin_to_tile_pin_index(const ClusterNetId net_id, int net_pin_index); + +//Returns the physical pin of the tile, related to the given ClusterPinId +int tile_pin_index(const ClusterPinId pin); int max_pins_per_grid_tile(); diff --git a/vtr_flow/arch/equivalent_sites/equivalent.xml b/vtr_flow/arch/equivalent_sites/equivalent.xml new file mode 100644 index 00000000000..5e8fb3b55d8 --- /dev/null +++ b/vtr_flow/arch/equivalent_sites/equivalent.xml @@ -0,0 +1,195 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + io_tile.in io_tile.out + io_tile.in io_tile.out + io_tile.in io_tile.out + io_tile.in io_tile.out + + + + + + + + + + + + + + + + + + + + + + + + + + pass_through_tile.in pass_through_tile.out + pass_through_tile.in pass_through_tile.out + pass_through_tile.in pass_through_tile.out + pass_through_tile.in pass_through_tile.out + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 1 1 + 1 + + + diff --git a/vtr_flow/benchmarks/microbenchmarks/carry_chain.blif b/vtr_flow/benchmarks/microbenchmarks/carry_chain.blif index 7adc6d29f1f..0918a3e0110 100644 --- a/vtr_flow/benchmarks/microbenchmarks/carry_chain.blif +++ b/vtr_flow/benchmarks/microbenchmarks/carry_chain.blif @@ -2,14 +2,7 @@ .inputs \ clk .outputs \ - out[0] \ - out[1] \ - out[2] \ - out[3] \ - out[4] \ - out[5] \ - out[6] \ - out[7] + out .names $false @@ -126,52 +119,11 @@ .subckt CARRY \ CI=$auto$alumacc.cc:474:replace_alu$26.C[15] \ - S=counter0[15] \ + S=out \ CO_CHAIN=$auto$alumacc.cc:474:replace_alu$26.C[16] \ CO_FABRIC=__vpr__unconn6 \ O=$0\counter0[21:0][15] -.subckt CARRY0 \ - CI=$auto$alumacc.cc:474:replace_alu$26.C[16] \ - S=counter0[16] \ - CO_CHAIN=$auto$alumacc.cc:474:replace_alu$26.C[17] \ - CO_FABRIC=__vpr__unconn7 \ - 
O=$0\counter0[21:0][16] - -.subckt CARRY \ - CI=$auto$alumacc.cc:474:replace_alu$26.C[17] \ - S=counter0[17] \ - CO_CHAIN=$auto$alumacc.cc:474:replace_alu$26.C[18] \ - CO_FABRIC=__vpr__unconn8 \ - O=$0\counter0[21:0][17] - -.subckt CARRY \ - CI=$auto$alumacc.cc:474:replace_alu$26.C[18] \ - S=counter0[18] \ - CO_CHAIN=$auto$alumacc.cc:474:replace_alu$26.C[19] \ - CO_FABRIC=__vpr__unconn9 \ - O=$0\counter0[21:0][18] - -.subckt CARRY \ - CI=$auto$alumacc.cc:474:replace_alu$26.C[19] \ - S=counter0[19] \ - CO_CHAIN=$auto$alumacc.cc:474:replace_alu$26.C[20] \ - CO_FABRIC=__vpr__unconn10 \ - O=$0\counter0[21:0][19] - -.subckt CARRY0 \ - CI=$auto$alumacc.cc:474:replace_alu$26.C[20] \ - S=counter0[20] \ - CO_CHAIN=$auto$alumacc.cc:474:replace_alu$26.C[21] \ - CO_FABRIC=__vpr__unconn12 \ - O=$0\counter0[21:0][20] - -.subckt CARRY \ - CI=$auto$alumacc.cc:474:replace_alu$26.C[21] \ - S=out[0] \ - CO_CHAIN=__vpr__unconn13 \ - O=$0\counter0[21:0][21] - .subckt FDRE \ CE=$true \ D=$0\counter0[21:0][0] \ @@ -283,45 +235,3 @@ R=$false \ Q=counter0[15] \ C=clk - -.subckt FDRE \ - CE=$true \ - D=$0\counter0[21:0][16] \ - R=$false \ - Q=counter0[16] \ - C=clk - -.subckt FDRE \ - CE=$true \ - D=$0\counter0[21:0][17] \ - R=$false \ - Q=counter0[17] \ - C=clk - -.subckt FDRE \ - CE=$true \ - D=$0\counter0[21:0][18] \ - R=$false \ - Q=counter0[18] \ - C=clk - -.subckt FDRE \ - CE=$true \ - D=$0\counter0[21:0][19] \ - R=$false \ - Q=counter0[19] \ - C=clk - -.subckt FDRE \ - CE=$true \ - D=$0\counter0[21:0][20] \ - R=$false \ - Q=counter0[20] \ - C=clk - -.subckt FDRE \ - CE=$true \ - D=$0\counter0[21:0][21] \ - R=$false \ - Q=out[0] \ - C=clk diff --git a/vtr_flow/benchmarks/microbenchmarks/equivalent.blif b/vtr_flow/benchmarks/microbenchmarks/equivalent.blif new file mode 100644 index 00000000000..6292a60d433 --- /dev/null +++ b/vtr_flow/benchmarks/microbenchmarks/equivalent.blif @@ -0,0 +1,9 @@ +.model top +.inputs in +.outputs out +.names $false +.names $true +1 +.subckt IO_0 in=in 
out=out_1 +.subckt IO_1 in=out_1 out=out +.end diff --git a/vtr_flow/scripts/upgrade_arch.py b/vtr_flow/scripts/upgrade_arch.py index 2ea9f6510aa..88956eded55 100755 --- a/vtr_flow/scripts/upgrade_arch.py +++ b/vtr_flow/scripts/upgrade_arch.py @@ -41,6 +41,7 @@ def __init__(self): "upgrade_complex_sb_num_conns", "add_missing_comb_model_internal_timing_edges", "add_tile_tags", + "add_site_directs", ] def parse_args(): @@ -144,6 +145,11 @@ def main(): if result: modified = True + if "add_site_directs" in args.features: + result = add_site_directs(arch) + if result: + modified = True + if modified: if args.debug: root.write(sys.stdout, pretty_print=args.pretty) @@ -932,7 +938,7 @@ def swap_tags(tile, pb_type): if arch.findall('./tiles'): - return False + return False models = arch.find('./models') @@ -966,6 +972,57 @@ def swap_tags(tile, pb_type): return True +def add_site_directs(arch): + """ + This function adds the direct pin mappings between a physical + tile and a corresponding logical block. 
+
+    Note: the example below is for explanatory purposes only; the signal names are invented.
+
+    BEFORE:
+
+
+
+
+
+
+
+
+
+
+
+
+
+    AFTER:
+
+
+
+
+
+
+
+
+
+
+
+
+    """
+
+    top_pb_types = []
+    for pb_type in arch.iter('pb_type'):
+        if pb_type.getparent().tag == 'complexblocklist':
+            top_pb_types.append(pb_type)
+
+    sites = []
+    for site in arch.iter('site'):
+        sites.append(site)
+
+    for site in sites:
+        if 'pin_mapping' not in site.attrib:
+            site.attrib['pin_mapping'] = "direct"
+
+    return True

 if __name__ == "__main__":
     main()
diff --git a/vtr_flow/tasks/regression_tests/vtr_reg_strong/strong_equivalent_sites/config/config.txt b/vtr_flow/tasks/regression_tests/vtr_reg_strong/strong_equivalent_sites/config/config.txt
new file mode 100644
index 00000000000..c028818fe53
--- /dev/null
+++ b/vtr_flow/tasks/regression_tests/vtr_reg_strong/strong_equivalent_sites/config/config.txt
@@ -0,0 +1,28 @@
+##############################################
+# Configuration file for running experiments
+##############################################
+
+# Path to directory of circuits to use
+circuits_dir=benchmarks/microbenchmarks
+
+# Path to directory of architectures to use
+archs_dir=arch/equivalent_sites
+
+# Add circuits to list to sweep
+circuit_list_add=equivalent.blif
+
+# Add architectures to list to sweep
+arch_list_add=equivalent.xml
+
+# Parse info and how to parse
+parse_file=vpr_standard.txt
+
+# How to parse QoR info
+qor_parse_file=qor_standard.txt
+
+# Pass requirements
+pass_requirements_file=pass_requirements.txt
+
+# Script parameters
+#script_params=""
+script_params = -track_memory_usage -lut_size 1 -starting_stage vpr
diff --git a/vtr_flow/tasks/regression_tests/vtr_reg_strong/task_list.txt b/vtr_flow/tasks/regression_tests/vtr_reg_strong/task_list.txt
index 6f7c5af06ec..f1e9286c4d3 100644
--- a/vtr_flow/tasks/regression_tests/vtr_reg_strong/task_list.txt
+++ b/vtr_flow/tasks/regression_tests/vtr_reg_strong/task_list.txt
@@ -53,3 +53,4 @@ regression_tests/vtr_reg_strong/strong_sdc
 regression_tests/vtr_reg_strong/strong_timing_report_detail
 regression_tests/vtr_reg_strong/strong_route_reconverge
 regression_tests/vtr_reg_strong/strong_clock_buf
+regression_tests/vtr_reg_strong/strong_equivalent_sites
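The new helpers this patch declares (`is_tile_compatible`, `get_physical_pin`, `get_logical_pin`) appear above only as prototypes, so the equivalent-sites pin mapping they imply can be sketched in miniature. The sketch below is illustrative only: all class and variable names (`PhysicalTile`, `LogicalBlock`, `site_pin_maps`, the SLICEL/SLICEM pin counts) are invented for this example and do not reflect VTR's actual C++ data structures; it simply models a tile that lists its placeable logical blocks with either a `direct` or a `custom` pin mapping, mirroring the `<site pin_mapping="...">` semantics documented in this change.

```python
# Toy model of equivalent-site pin mapping (names invented, not the VTR API).

class LogicalBlock:
    def __init__(self, name, num_pins):
        self.name = name
        self.num_pins = num_pins

class PhysicalTile:
    def __init__(self, name, num_pins):
        self.name = name
        self.num_pins = num_pins
        # logical block name -> {logical pin index: physical pin index}
        self.site_pin_maps = {}

    def add_equivalent_site(self, block, pin_mapping="direct", custom_map=None):
        if pin_mapping == "direct":
            # direct: the tile pin definition matches the pb_type one pin-for-pin
            self.site_pin_maps[block.name] = {i: i for i in range(block.num_pins)}
        else:
            # custom: caller supplies the <direct from="..." to="..."/> pairs
            self.site_pin_maps[block.name] = dict(custom_map)

def is_tile_compatible(tile, block):
    # a block can be placed at a tile only if the tile lists it as a site
    return block.name in tile.site_pin_maps

def get_physical_pin(tile, block, logical_pin):
    # translate a logical (pb_type) pin index into the tile's pin index
    return tile.site_pin_maps[block.name][logical_pin]

# A SLICEL block placeable in both a SLICEL tile (direct mapping) and a
# larger SLICEM tile (custom mapping onto a subset of its pins):
slicel = LogicalBlock("SLICEL", 4)
slicel_tile = PhysicalTile("SLICEL_TILE", 4)
slicem_tile = PhysicalTile("SLICEM_TILE", 6)
slicel_tile.add_equivalent_site(slicel)
slicem_tile.add_equivalent_site(slicel, "custom", {0: 2, 1: 3, 2: 4, 3: 5})

print(is_tile_compatible(slicem_tile, slicel))   # True
print(get_physical_pin(slicel_tile, slicel, 1))  # 1
print(get_physical_pin(slicem_tile, slicel, 1))  # 3
```

The point of the two mappings is the one the reference documentation makes: `direct` needs no extra markup because tile and pb_type pins coincide, while `custom` lets one logical block target differently-pinned tiles, which is what makes placement in multiple grid location types possible.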