-
Notifications
You must be signed in to change notification settings - Fork 414
Force operator< to be inlined to reduce place time #1265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together @xuqinziyue!
One additional thought. Have you tried the std::tie
based comparison operator in combination with always_inline
? If that produces similar performance, I think it would be preferable since the std::tie approach would be less brittle to future code changes (since it makes fewer assumptions).
If that isn't the case I've described how we can have the compiler check that the assumptions the proposed method depends are valid. I've also provided more detailed comments below.
#ifndef __has_attribute | ||
# define __has_attribute(x) 0 // Compatibility with non-clang compilers. | ||
#endif | ||
|
||
#if defined(COMPILER_GCC) && defined(NDEBUG) | ||
# define ALWAYS_INLINE inline __attribute__((__always_inline__)) | ||
#elif defined(COMPILER_MSVC) && defined(NDEBUG) | ||
# define ALWAYS_INLINE __forceinline | ||
#elif __has_attribute(always_inline) | ||
# define ALWAYS_INLINE __attribute__((always_inline)) // clang | ||
#else | ||
# define ALWAYS_INLINE inline | ||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be better to use C++ attributes (available since C++11) as a cleaner way to specify this.
For example:
[[gnu::always_inline]] friend bool operator<(const t_override& lhs, const t_override& rhs) {
That would avoid having to define the ALWAYS_INLINE macro in a cleaner cross-compiler way.
@mithro Any thoughts on this?
vpr/src/place/place_delay_model.h
Outdated
friend ALWAYS_INLINE bool operator<(const t_override& lhs, const t_override& rhs) { | ||
const short* left = reinterpret_cast<const short*>(&lhs); | ||
const short* right = reinterpret_cast<const short*>(&rhs); | ||
return std::lexicographical_compare(left, left + 6, right, right + 6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to explicitly state what assumptions this code is making. In particular, I think it is assuming:
- All data members of t_override are shorts
- There is no padding in t_override
- Based on the above, treating t_override's memory layout as an array of shorts is the same as a member-wise comparison
We should also encode these assumptions into some compile-time checks, otherwise we are setting ourselves up for some very challenging to find bugs in the future (when t_override changes).
I'd suggest something like:
static_assert(sizeof(t_override) = sizeof(t_override::from_type)
+ sizeof(t_override::to_type)
+ sizeof(t_override::from_class)
+ sizeof(t_override::to_class)
+ sizeof(t_override::delta_x)
+ sizeof(t_override::delta_y),
"Expect t_override to have a memory layout equivalent to an array of short (no padding)");
static_assert(sizeof(t_override::from_type) == sizeof(short), "Expect all t_override data members to be shorts");
static_assert(sizeof(t_override::to_type) == sizeof(short), "Expect all t_override data members to be shorts");
static_assert(sizeof(t_override::from_class) == sizeof(short), "Expect all t_override data members to be shorts");
static_assert(sizeof(t_override::to_class) == sizeof(short), "Expect all t_override data members to be shorts");
static_assert(sizeof(t_override::delta_x) == sizeof(short), "Expect all t_override data members to be shorts");
static_assert(sizeof(t_override::delta_y) == sizeof(short), "Expect all t_override data members to be shorts");
which would verify there is no padding in the t_override memory layout and that each data member is a short. This would catch potential bugs caused by changing the number of data members in t_override, or the data type of members in t_override.
We could then also modify the comparison to not be explicitly hard coded for 6 data members:
constexpr size_t NUM_T_OVERRIDE_MEMBERS = sizeof(t_override) / sizeof(short);
return std::lexicographical_compare(left, left + NUM_T_OVERRIDE_MEMBERS, right, right + NUM_T_OVERRIDE_MEMBERS);
I have tried the combination of std::tie and always_inline but based on my experiment, operator < is not inlined in the case. Thanks for the comments below, and I will update the pull request with the revisions you suggest shortly. |
@xuqinziyue The updates look good. Can you fix the code formatting ( |
@kmurray I have fixed the formatting issue and have done a rebase. Let me know if there is anything else I should change |
Thanks! |
Description
The change forces operator< to be inlined properly.
Related Issue
#1225
Motivation and Context
Original operator< is not inlined by compiler. Since the function is timing critical, force the function to be inlined can reduce place time by ~10% with xc7 tests and ~5% with QoR and Titan benchmark tests
How Has This Been Tested?
Run through xc7 tests like picosoc and murax and passed QoR and Titan Benchmarks.
Comparison shows a place time reduction.
Comparison also proves there is no impact on quality of results.
Types of changes
Checklist: