[X86] Target feature implication mismatch with GCC #136209

sayantn · 2025-04-17T21:42:16Z

There is a mismatch between which other features avx512f and avx512fp16 imply.

For avx512f, GCC implies only avx2, where LLVM also implies fma and f16c(see here).
Also for avx512fp16, GCC implies only avx512bw, where LLVM also implies avx512dq and avx512vl (see here).

gcc -mavx512f -Q --help=target | grep enabled

  -m128bit-long-double                  [enabled]
  -m64                                  [enabled]
  -m80387                               [enabled]
  -malign-stringops                     [enabled]
  -mavx                                 [enabled]
  -mavx2                                [enabled]
  -mavx512f                             [enabled]
  -mcrc32                               [enabled]
  -mdirect-extern-access                [enabled]
  -mevex512                             [enabled]
  -mfancy-math-387                      [enabled]
  -mfp-ret-in-387                       [enabled]
  -mfxsr                                [enabled]
  -mglibc                               [enabled]
  -mhard-float                          [enabled]
  -mieee-fp                             [enabled]
  -mlong-double-80                      [enabled]
  -mmmx                                 [enabled]
  -mmwait                               [enabled]
  -mpartial-vector-fp-math              [enabled]
  -mpopcnt                              [enabled]
  -mpush-args                           [enabled]
  -mred-zone                            [enabled]
  -msse                                 [enabled]
  -msse2                                [enabled]
  -msse3                                [enabled]
  -msse4                                [enabled]
  -msse4.1                              [enabled]
  -msse4.2                              [enabled]
  -mssse3                               [enabled]
  -mstv                                 [enabled]
  -mtls-direct-seg-refs                 [enabled]
  -mxsave                               [enabled]

gcc -mavx512fp16 -Q --help=target | grep enabled

  -m128bit-long-double                  [enabled]
  -m64                                  [enabled]
  -m80387                               [enabled]
  -malign-stringops                     [enabled]
  -mavx                                 [enabled]
  -mavx2                                [enabled]
  -mavx512bw                            [enabled]
  -mavx512f                             [enabled]
  -mavx512fp16                          [enabled]
  -mcrc32                               [enabled]
  -mdirect-extern-access                [enabled]
  -mevex512                             [enabled]
  -mfancy-math-387                      [enabled]
  -mfp-ret-in-387                       [enabled]
  -mfxsr                                [enabled]
  -mglibc                               [enabled]
  -mhard-float                          [enabled]
  -mieee-fp                             [enabled]
  -mlong-double-80                      [enabled]
  -mmmx                                 [enabled]
  -mmwait                               [enabled]
  -mpartial-vector-fp-math              [enabled]
  -mpopcnt                              [enabled]
  -mpush-args                           [enabled]
  -mred-zone                            [enabled]
  -msse                                 [enabled]
  -msse2                                [enabled]
  -msse3                                [enabled]
  -msse4                                [enabled]
  -msse4.1                              [enabled]
  -msse4.2                              [enabled]
  -mssse3                               [enabled]
  -mstv                                 [enabled]
  -mtls-direct-seg-refs                 [enabled]
  -mxsave                               [enabled]

gcc --version

gcc (GCC) 14.2.1 20250207
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I do not know what Intel specifies about implies features, but I believe this mismatch needs to be corrected.

related: rust-lang/rust#138940
related: rust-lang/rust#111137

The text was updated successfully, but these errors were encountered:

llvmbot · 2025-04-17T21:43:37Z

@llvm/issue-subscribers-backend-x86

Author: Sayantan Chakraborty (sayantn)

There is a mismatch between which other features `avx512f` and `avx512fp16` imply.

For avx512f, GCC implies only avx2, where LLVM also implies fma and f16c(see here).
Also for avx512fp16, GCC implies only avx512bw, where LLVM also implies avx512dq and avx512vl (see here).

gcc -mavx512f -Q --help=target | grep enabled

  -m128bit-long-double                  [enabled]
  -m64                                  [enabled]
  -m80387                               [enabled]
  -malign-stringops                     [enabled]
  -mavx                                 [enabled]
  -mavx2                                [enabled]
  -mavx512f                             [enabled]
  -mcrc32                               [enabled]
  -mdirect-extern-access                [enabled]
  -mevex512                             [enabled]
  -mfancy-math-387                      [enabled]
  -mfp-ret-in-387                       [enabled]
  -mfxsr                                [enabled]
  -mglibc                               [enabled]
  -mhard-float                          [enabled]
  -mieee-fp                             [enabled]
  -mlong-double-80                      [enabled]
  -mmmx                                 [enabled]
  -mmwait                               [enabled]
  -mpartial-vector-fp-math              [enabled]
  -mpopcnt                              [enabled]
  -mpush-args                           [enabled]
  -mred-zone                            [enabled]
  -msse                                 [enabled]
  -msse2                                [enabled]
  -msse3                                [enabled]
  -msse4                                [enabled]
  -msse4.1                              [enabled]
  -msse4.2                              [enabled]
  -mssse3                               [enabled]
  -mstv                                 [enabled]
  -mtls-direct-seg-refs                 [enabled]
  -mxsave                               [enabled]

gcc -mavx512fp16 -Q --help=target | grep enabled

  -m128bit-long-double                  [enabled]
  -m64                                  [enabled]
  -m80387                               [enabled]
  -malign-stringops                     [enabled]
  -mavx                                 [enabled]
  -mavx2                                [enabled]
  -mavx512bw                            [enabled]
  -mavx512f                             [enabled]
  -mavx512fp16                          [enabled]
  -mcrc32                               [enabled]
  -mdirect-extern-access                [enabled]
  -mevex512                             [enabled]
  -mfancy-math-387                      [enabled]
  -mfp-ret-in-387                       [enabled]
  -mfxsr                                [enabled]
  -mglibc                               [enabled]
  -mhard-float                          [enabled]
  -mieee-fp                             [enabled]
  -mlong-double-80                      [enabled]
  -mmmx                                 [enabled]
  -mmwait                               [enabled]
  -mpartial-vector-fp-math              [enabled]
  -mpopcnt                              [enabled]
  -mpush-args                           [enabled]
  -mred-zone                            [enabled]
  -msse                                 [enabled]
  -msse2                                [enabled]
  -msse3                                [enabled]
  -msse4                                [enabled]
  -msse4.1                              [enabled]
  -msse4.2                              [enabled]
  -mssse3                               [enabled]
  -mstv                                 [enabled]
  -mtls-direct-seg-refs                 [enabled]
  -mxsave                               [enabled]

gcc --version

gcc (GCC) 14.2.1 20250207
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I do not know what Intel specifies about implies features, but I believe this mismatch needs to be corrected.

related: rust-lang/rust#138940
related: rust-lang/rust#111137

topperc · 2025-04-17T22:46:32Z

If I recall correctly, the avx512f implying fma and fp16 is intentional to make the assembler behavior legal. avx512f and fma/fp16 use the same mnemonic names for some instructions. The assembler will prioritize the VEX encoding from fma over the EVEX encoding from avx512f if only registers xmm0-xmm15 are used and no EVEX only things like masking are used. If the VEX encoding can't be assumed to exist with avx512f then the assembler is broken.

sayantn · 2025-04-17T23:06:52Z

Ok that makes sense, what about avx512fp16? Also, Intel doesn't specify anything about fma/f16c with avx512f right? So (theoretically) there can be CPUs that have avx512f, but not fma or f16c.

topperc · 2025-04-17T23:37:33Z

The line in X86.td you linked to for avx512fp16 has two FIXMEs above it that describe the reason those dependencies are there.

sayantn · 2025-04-17T23:54:21Z

Those comments seem strange. The types v8f16 or v8i64 should be guarded under sse2 and avx512f only (due to the register size), because the cpu doesn't care about the actual representation of the registers. Only the instructions should be guarded against some target features (e.g. vaddsh under avx512fp16).

If the same logic is followed, then _mm_add_round_ss should also be guarded by avx512vl (as it is EVEX encoded and has a 128-bit operand). But Intel specifies that it doesn't require avx512vl, only avx512f.

topperc · 2025-04-18T00:13:50Z

The types v8f16 or v8i64 should be guarded under sse2 and avx512f only (due to the register size), because the cpu doesn't care about the actual representation of the registers.

Making a type legal in llvm but not having operations available requires a bunch of changes in LLVM to emulate or scalarize those operations. So the v8f16 comment was correct at the time it was written in 2021. The X86 backend only made v8f16 legal with VLX+avx512fpf16. It was made legal with SSE2 in 2022 f187948

The avx512dq comment doesn't make sense to me. v8i64 should have always been legal with avx512f.

sayantn · 2025-04-18T00:17:24Z

So, if I am getting this right, atp both of them are fixed, and avx512fp16 should only imply avx512bw now? Then I believe this should be done soon

llvmbot added the new issue label Apr 17, 2025

EugeneZelenko added backend:X86 and removed new issue labels Apr 17, 2025

This was referenced Apr 18, 2025

Remove avx512dq and avx512vl implication for avx512fp16 rust-lang/rust#140002

Open

Require fma and f16c for avx512f in std_detect rust-lang/stdarch#1781

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[X86] Target feature implication mismatch with GCC #136209

[X86] Target feature implication mismatch with GCC #136209

sayantn commented Apr 17, 2025

llvmbot commented Apr 17, 2025

topperc commented Apr 17, 2025 •

edited

Loading

sayantn commented Apr 17, 2025

topperc commented Apr 17, 2025

sayantn commented Apr 17, 2025 •

edited

Loading

topperc commented Apr 18, 2025

sayantn commented Apr 18, 2025

[X86] Target feature implication mismatch with GCC #136209

[X86] Target feature implication mismatch with GCC #136209

Comments

sayantn commented Apr 17, 2025

llvmbot commented Apr 17, 2025

topperc commented Apr 17, 2025 • edited Loading

sayantn commented Apr 17, 2025

topperc commented Apr 17, 2025

sayantn commented Apr 17, 2025 • edited Loading

topperc commented Apr 18, 2025

sayantn commented Apr 18, 2025

topperc commented Apr 17, 2025 •

edited

Loading

sayantn commented Apr 17, 2025 •

edited

Loading