Change the TLS model from the default, global-dynamic (the default because we
build with -fPIC), to a faster one, initial-exec. global-dynamic calls
__tls_get_addr() for each TLS access (e.g., to current_thd); initial-exec caches
the value (per function) the first time it reaches the function. In other
words, on subsequent function calls, there is a predictable branch and then
just a read from %fs.
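
As a rough illustration only (not the change itself): with GCC or Clang, the
TLS model can be selected for a whole translation unit with
-ftls-model=initial-exec, or per variable with the tls_model attribute. The
sketch below uses the attribute; current_session_id, read_session_id() and
run_in_new_thread() are hypothetical stand-ins, not server code.

```cpp
// Build sketch (assumption): g++ -std=c++11 -fPIC -pthread example.cc
#include <cassert>
#include <thread>

// The attribute is a GCC/Clang extension; expand to nothing elsewhere.
#if defined(__GNUC__) && defined(__linux__)
#define TLS_INITIAL_EXEC __attribute__((tls_model("initial-exec")))
#else
#define TLS_INITIAL_EXEC
#endif

// Hypothetical stand-in for a server-wide thread-local such as current_thd.
thread_local int current_session_id TLS_INITIAL_EXEC = 0;

// Every access compiles to a thread-pointer-relative load or store.
int read_session_id() { return current_session_id; }

// Sets the variable in a fresh thread and returns what that thread read back.
int run_in_new_thread(int id) {
  int seen = -1;
  std::thread t([&] {
    assert(current_session_id == 0);  // each thread starts from the initializer
    current_session_id = id;
    seen = read_session_id();
  });
  t.join();
  return seen;
}
```

The calling thread's copy is untouched by run_in_new_thread(), since each
thread has its own instance of the variable regardless of the TLS model; the
model only changes how the address is computed.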
The disadvantage of initial-exec is that one cannot access “extern thread_local”
variables across modules that are dlopen()-ed (and not otherwise linked to each
other). This is not a problem for plugins that want to access current_thd or
*THR_MALLOC, since they are linked against the server, but it would affect
a hypothetical situation where plugin A has “thread_local int x;” and plugin B
has “extern thread_local int x;” (and was not linked to A; presumably, A would
need to be loaded before B). Plugins and components are already not supposed to
do this, and we know of no cases where it happens.
Point select throughput goes up 1.4-2.2%, depending on thread count. This is
on top of the gains we already got earlier, from changing to C++11 thread_local
instead of calling pthread_getspecific().
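
For context, a minimal sketch of the two access styles mentioned above, using
a hypothetical Session type and hypothetical helper names (none of this is
server code): the key-based path goes through a library call on every read,
while the thread_local path is a plain variable access.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical per-thread context type.
struct Session { int id; };

// Older style: a POSIX key, created once, looked up on every access.
static pthread_key_t session_key;
static pthread_once_t session_key_once = PTHREAD_ONCE_INIT;
static void make_session_key() { pthread_key_create(&session_key, nullptr); }

void set_session_via_key(Session *s) {
  pthread_once(&session_key_once, make_session_key);
  int rc = pthread_setspecific(session_key, s);
  assert(rc == 0);
  (void)rc;  // silence the unused warning when NDEBUG is defined
}

Session *get_session_via_key() {
  pthread_once(&session_key_once, make_session_key);
  return static_cast<Session *>(pthread_getspecific(session_key));
}

// C++11 style: the compiler and linker resolve the access directly.
thread_local Session *current_session = nullptr;

void set_session_via_tls(Session *s) { current_session = s; }
Session *get_session_via_tls() { return current_session; }
```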
Change-Id: Ic511636de6e0f3703cc811680596467d064146bf