@@ -1757,6 +1757,55 @@ As part of the AMDGPU MC layer, AMDGPU provides the following target specific
=================== ================= ========================================================
+ Function Resource Usage
+ -----------------------
+
+ A function's resource usage depends on the resource usage of each of its
+ callees. The expressions used to denote resource usage reflect this by
+ propagating each callee's equivalent expressions. The compiler emits these
+ expressions as symbols when compiling to either assembly or object format;
+ they should not be overwritten or redefined.
+
+ The following describes all emitted function resource usage symbols:
+
+   .. table:: Function Resource Usage
+      :name: function-usage-table
+
+      ===================================== ========= ========================================= ===============================================================================
+      Symbol                                Type      Description                               Example
+      ===================================== ========= ========================================= ===============================================================================
+      <function_name>.num_vgpr              Integer   Number of VGPRs used by <function_name>,  .set foo.num_vgpr, max(32, bar.num_vgpr, baz.num_vgpr)
+                                                      worst case of itself and its callees'
+                                                      VGPR use
+      <function_name>.num_agpr              Integer   Number of AGPRs used by <function_name>,  .set foo.num_agpr, max(35, bar.num_agpr)
+                                                      worst case of itself and its callees'
+                                                      AGPR use
+      <function_name>.numbered_sgpr         Integer   Number of SGPRs used by <function_name>,  .set foo.numbered_sgpr, 21
+                                                      worst case of itself and its callees'
+                                                      SGPR use (without any of the implicitly
+                                                      used SGPRs)
+      <function_name>.private_seg_size      Integer   Total stack size required for             .set foo.private_seg_size, 16+max(bar.private_seg_size, baz.private_seg_size)
+                                                      <function_name>; the expression is the
+                                                      locally used stack size plus the
+                                                      worst-case callee stack size
+      <function_name>.uses_vcc              Bool      Whether <function_name>, or any of its    .set foo.uses_vcc, or(0, bar.uses_vcc)
+                                                      callees, uses vcc
+      <function_name>.uses_flat_scratch     Bool      Whether <function_name>, or any of its    .set foo.uses_flat_scratch, 1
+                                                      callees, uses flat scratch
+      <function_name>.has_dyn_sized_stack   Bool      Whether <function_name>, or any of its    .set foo.has_dyn_sized_stack, 1
+                                                      callees, has a dynamically sized stack
+      <function_name>.has_recursion         Bool      Whether <function_name>, or any of its    .set foo.has_recursion, 0
+                                                      callees, contains recursion
+      <function_name>.has_indirect_call     Bool      Whether <function_name>, or any of its    .set foo.has_indirect_call, max(0, bar.has_indirect_call)
+                                                      callees, contains an indirect call
+      ===================================== ========= ========================================= ===============================================================================
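+
+ As a hedged illustration of how these expressions resolve, assume a
+ hypothetical function ``foo`` that calls two leaf functions, ``bar`` and
+ ``baz``. The function names and values below are invented for this sketch and
+ are not actual compiler output:
+
+   .. code-block:: nasm
+
+      ; Hypothetical leaf functions: their symbols are plain constants.
+      .set bar.num_vgpr, 40
+      .set baz.num_vgpr, 24
+      .set bar.private_seg_size, 8
+      .set baz.private_seg_size, 32
+      .set bar.uses_vcc, 1
+
+      ; foo uses 32 VGPRs itself; the worst case over foo, bar, and baz is 40.
+      .set foo.num_vgpr, max(32, bar.num_vgpr, baz.num_vgpr)
+      ; foo uses 16 bytes of stack locally; the worst-case callee adds 32,
+      ; giving 48 in total.
+      .set foo.private_seg_size, 16+max(bar.private_seg_size, baz.private_seg_size)
+      ; foo does not use vcc itself, but bar does, so the result is 1.
+      .set foo.uses_vcc, or(0, bar.uses_vcc)
+
+ Because ``foo``'s symbols are defined in terms of its callees' symbols, they
+ can only be evaluated once the callee symbols are known.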
+
+ Furthermore, three symbols are emitted that describe the compilation unit's
+ worst case (i.e., maxima) ``num_vgpr``, ``num_agpr``, and ``numbered_sgpr``.
+ These symbols, ``amdgcn.max_num_vgpr``, ``amdgcn.max_num_agpr``, and
+ ``amdgcn.max_num_sgpr``, may be referenced by the aforementioned symbolic
+ expressions.
+
.. _amdgpu-elf-code-object:
ELF Code Object