AMD, Intel, Red Hat, and SUSE have defined a set of "architecture levels" for x86-64 CPUs. For example, x86-64-v2 means that a CPU supports not only the basic x86-64 instruction set, but also other instructions such as SSE4.2, SSSE3 or POPCNT.

How can I check which architecture levels are supported by my CPU?

Vikki
gioele

8 Answers

This is based on gioele’s answer; the whole script might as well be written in AWK:

#!/usr/bin/awk -f

BEGIN {
    while (!/flags/) if (getline < "/proc/cpuinfo" != 1) exit 1
    if (/lm/&&/cmov/&&/cx8/&&/fpu/&&/fxsr/&&/mmx/&&/syscall/&&/sse2/) level = 1
    if (level == 1 && /cx16/&&/lahf/&&/popcnt/&&/sse4_1/&&/sse4_2/&&/ssse3/) level = 2
    if (level == 2 && /avx/&&/avx2/&&/bmi1/&&/bmi2/&&/f16c/&&/fma/&&/abm/&&/movbe/&&/xsave/) level = 3
    if (level == 3 && /avx512f/&&/avx512bw/&&/avx512cd/&&/avx512dq/&&/avx512vl/) level = 4
    if (level > 0) { print "CPU supports x86-64-v" level; exit level + 1 }
    exit 1
}

This also checks for the baseline (“level 1” here), only outputs the highest supported level, and exits with an exit code matching the first unsupported level.
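
Because the exit status is the detected level plus one, a caller can recover the level without parsing the printed text. A minimal sketch, assuming the AWK script is saved as ./cpu-level.awk (a hypothetical file name) and that a status of 1 means either the baseline was not met or /proc/cpuinfo could not be read; the helper name status_to_level is made up for illustration:

```shell
#!/bin/sh
# Map the AWK script's exit status to a level number.
# Status N in 2..5 means level N-1; anything else means no level was detected.
status_to_level() {
    case "$1" in
        2|3|4|5) echo "$(( $1 - 1 ))" ;;
        *)       return 1 ;;
    esac
}

# Usage (./cpu-level.awk is a hypothetical name for the AWK script):
#   ./cpu-level.awk >/dev/null; level=$(status_to_level "$?") && echo "level $level"
```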

Stephen Kitt
  • Some of the checking of earlier levels is redundant, but not a bad idea I guess. In real hardware, SSE4.2 for example already implies support for all previous (Intel) SSE versions (but not AMD SSE4a). In a virtual machine CPUID is virtualized so it's theoretically possible to indicate SSSE3 support without SSE3. Only in a software emulator could you make SSE3 instructions fault while SSSE3 instructions didn't. (BTW, you omitted /sse3/.) The de-facto standard is that runtime CPU dispatching only needs to check the highest SSE feature flag it depends on. – Peter Cordes Jan 27 '21 at 19:02
  • There are other de-facto standards like SSE4.2 implying popcnt, but that's good to check explicitly. And other non-SIMD extensions like BMI1 are fully independent of SIMD (although since some BMI1/2 instructions use VEX encoding, they're normally only found on CPUs that support AVX. And unfortunately Intel even disables BMI1/2 on their Pentium/Celeron CPUs, perhaps as a way of fully disabling AVX.). – Peter Cordes Jan 27 '21 at 19:08
  • BTW, level 2 = Nehalem and current Silvermont, and current-gen Pentium/Celeron. Also AMD Bulldozer family since even Excavator doesn't have BMI2, only AVX2 and FMA3. Level 3 = Haswell (and Zen), and includes most of the really good stuff. MacOS apparently can make fat binaries with baseline x86-64 and Haswell feature-level, allowing usage of BMI2 efficient shift instructions all over the place, and of AVX everywhere. Level 4 = -march=skylake-avx512. – Peter Cordes Jan 27 '21 at 19:12
  • @PeterCordes yes, there are a number of deficiencies and redundancies here (in particular, I should check full fields instead of using regexes, since for example /lm/ will match anything containing those characters). I followed the exhaustive level definitions as used in the first answer (that’s where /ssse3/ without /sse3/ came from), even though as you say many of them are redundant. (I’ve been following the discussions leading up to the definition of these levels.) – Stephen Kitt Jan 27 '21 at 19:29
  • TBH this was more an exercise in showing that all the checks could be done in AWK instead of a mixture of AWK and shell, rather than coming up with the best level checker ;-). – Stephen Kitt Jan 27 '21 at 19:32
  • lm is long mode; checking for level 1 is basically just a sanity check of CPUID flags if you're already running a 64-bit kernel because those are all baseline for x86-64. (Also, my comments aren't fully directed at your answer, some of it I just wanted to put somewhere on this page for future readers. Also: Are older SIMD-versions available when using newer ones? / Do the MMX registers always exist in modern processors? / Does a processor that supports SSE4 support SSSE3 instructions?) – Peter Cordes Jan 27 '21 at 19:37

Originally copied from https://gitlab.archlinux.org/archlinux/rfcs/-/merge_requests/2/diffs

With glibc 2.33 or later (Arch Linux, Debian 12, Ubuntu 21.04, Fedora 34, etc.), or a patched glibc (RHEL 8), you can see which architecture levels your CPU supports by running:

$ /lib/ld-linux-x86-64.so.2 --help

Subdirectories of glibc-hwcaps directories, in priority order:
  x86-64-v4
  x86-64-v3 (supported, searched)
  x86-64-v2 (supported, searched)

On Debian derivatives the path is different; run /lib64/ld-linux-x86-64.so.2 --help instead.

  • The AMD64 ABI remarks that /lib64/ld-linux-x86-64.so.2 is the standard place (on Linux) for the program interpreter. – Sam Morris Jan 17 '23 at 11:50
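
The highest supported level can also be pulled out of the --help output in a script. A minimal sketch, assuming glibc prints one x86-64-vN entry per line, highest level first, and marks usable ones with "(supported, searched)"; the function name highest_supported_level is made up for illustration:

```shell
#!/bin/sh
# Read `ld.so --help` output on stdin and print the first (highest-priority)
# level marked as supported, e.g. "x86-64-v3".
# Prints nothing and returns 1 if no level is marked as supported.
highest_supported_level() {
    awk '/^[ \t]*x86-64-v[0-9]+[ \t].*supported/ { print $1; found = 1; exit }
         END { exit !found }'
}

# Usage (the loader path varies by distribution):
#   /lib64/ld-linux-x86-64.so.2 --help | highest_supported_level
```

On a baseline-only CPU no level carries the "supported" marker, so the function prints nothing and returns a non-zero status.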

On Linux, one can check the CPU capabilities reported by /proc/cpuinfo against the requirements described in the x86-psABI documentation.

The following script automates that process (the exit code is the number of the first non-supported architecture level).

#!/bin/sh -eu

flags=$(cat /proc/cpuinfo | grep flags | head -n 1 | cut -d: -f2)

supports_v2='awk "/cx16/&&/lahf/&&/popcnt/&&/sse4_1/&&/sse4_2/&&/ssse3/ {found=1} END {exit !found}"'
supports_v3='awk "/avx/&&/avx2/&&/bmi1/&&/bmi2/&&/f16c/&&/fma/&&/abm/&&/movbe/&&/xsave/ {found=1} END {exit !found}"'
supports_v4='awk "/avx512f/&&/avx512bw/&&/avx512cd/&&/avx512dq/&&/avx512vl/ {found=1} END {exit !found}"'

echo "$flags" | eval $supports_v2 || exit 2 && echo "CPU supports x86-64-v2"
echo "$flags" | eval $supports_v3 || exit 3 && echo "CPU supports x86-64-v3"
echo "$flags" | eval $supports_v4 || exit 4 && echo "CPU supports x86-64-v4"

gioele
  • Instead of using a variable and evaling it, you could have used a function. – muru Jan 27 '21 at 08:48
  • As an FYI, my old AMD FX-6100 supports v2, but not v3 or v4. – RonJohn Jan 27 '21 at 18:47
  • @RonJohn: Yup, even Bulldozer-family is only "level 2", even though Excavator has AVX2 and FMA. It's missing BMI2 and movbe. (Piledriver / Steamroller have AVX1 and FMA; Bulldozer has AVX1 and FMA4 but not FMA3; Intel pulled the rug out from under AMD as late as they could. See Stop the instruction set war on Agner Fog's blog.) To be fair, having another level with AVX but not BMI2 would be of limited value, and BMI2 is quite nice for Intel CPUs: variable-count shifts with SHLX/SHRX are 1 uop instead of 3, and can use any reg instead of CL. – Peter Cordes Jan 27 '21 at 19:16
  • Level 3 = Haswell and Zen1. Level 4 = -march=skylake-avx512. – Peter Cordes Jan 27 '21 at 19:16
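
Following muru's suggestion, the eval'd strings can be replaced by a plain shell function. A minimal sketch, not the answer author's code: the function names supports and report_levels are made up for illustration, flags are matched as whole words rather than substrings (so the kernel's full flag name lahf_lm is used instead of lahf), and the status is returned rather than passed to exit so that the functions can be reused:

```shell
#!/bin/sh
# supports FLAGS NAME...: succeed only if every NAME appears in FLAGS as a whole word.
supports() {
    haystack=" $1 "
    shift
    for name in "$@"; do
        case "$haystack" in
            *" $name "*) ;;        # found, check the next name
            *) return 1 ;;         # missing flag
        esac
    done
}

# report_levels FLAGS: print one line per supported level and return the number
# of the first unsupported level (0 if everything up to v4 is supported).
report_levels() {
    supports "$1" cx16 lahf_lm popcnt sse4_1 sse4_2 ssse3 || return 2
    echo "CPU supports x86-64-v2"
    supports "$1" avx avx2 bmi1 bmi2 f16c fma abm movbe xsave || return 3
    echo "CPU supports x86-64-v3"
    supports "$1" avx512f avx512bw avx512cd avx512dq avx512vl || return 4
    echo "CPU supports x86-64-v4"
}

# Usage on Linux:
#   report_levels "$(grep '^flags' /proc/cpuinfo | head -n 1 | cut -d: -f2)"
```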

Here's a shell script to determine the x86_64 CPU architecture level on Linux. It's compatible with BusyBox. With the option -v, it shows which flags you're missing to reach the next level. See What do the flags in /proc/cpuinfo mean? for an explanation of the flags.

#!/bin/sh
set -e

verbose=
while getopts v OPTLET; do
  case "$OPTLET" in
    v) verbose=1;;
    \?) exit 2;;
  esac
done

flags=$(grep '^flags\b' </proc/cpuinfo | head -n 1)
flags=" ${flags#*:} "

has_flags () {
  for flag; do
    case "$flags" in
      *" $flag "*) :;;
      *)
        if [ -n "$verbose" ]; then
          echo >&2 "Missing $flag for the next level"
        fi
        return 1;;
    esac
  done
}

determine_level () {
  level=0
  has_flags lm cmov cx8 fpu fxsr mmx syscall sse2 || return 0
  level=1
  has_flags cx16 lahf_lm popcnt sse4_1 sse4_2 ssse3 || return 0
  level=2
  has_flags avx avx2 bmi1 bmi2 f16c fma abm movbe xsave || return 0
  level=3
  has_flags avx512f avx512bw avx512cd avx512dq avx512vl || return 0
  level=4
}

determine_level
echo "$level"

(Acknowledgement: I reused the list of flags from Stephen Kitt's answer, which in turn builds on gioele's answer.)

I've created an x86-64-level tool based on the suggestions here. Examples:

$ x86-64-level
3

$ level=$(x86-64-level)
$ echo "x86-64-v${level}"
x86-64-v3

Output an explanation to stderr:

$ x86-64-level --verbose
Identified x86-64-v3, because x86-64-v4 requires 'avx512f', which is not supported by this CPU [Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz]
3

If you want to assert that the current machine supports a certain x86-64 level in a shell script, add the following one-line gatekeeper:

x86-64-level --assert=4 || exit 1

This will be silent if the host supports x86-64-v4, otherwise it'll output:

The CPU [Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz] on this host ('dev2')
supports x86-64-v3, which is less than the required x86-64-v4

and exit with exit value 1.

The x86-64-level tool is a standalone Bash script that's available at https://github.com/HenrikBengtsson/x86-64-level.

HenrikB

One way is to use the Function Multiversioning feature in GCC: write a test program and see which version of the function it picks for your CPU.

The foo function from the program below creates multiple symbols in the binary, and the "best" version is picked at runtime:

$ nm a.out | grep foo
0000000000402236 T _Z3foov
000000000040224c T _Z3foov.arch_x86_64
0000000000402257 T _Z3foov.arch_x86_64_v2
0000000000402262 T _Z3foov.arch_x86_64_v3
000000000040226d T _Z3foov.arch_x86_64_v4
0000000000402290 W _Z3foov.resolver
0000000000402241 T _Z3foov.sse4.2
0000000000402290 i _Z7_Z3foovv

// multiversioning.c

#include <stdio.h>

__attribute__ ((target ("default")))
const char* foo () { return "default"; }

__attribute__ ((target ("sse4.2")))
const char* foo () { return "sse4.2"; }

__attribute__ ((target ("arch=x86-64")))
const char* foo () { return "x86-64-v1"; }

__attribute__ ((target ("arch=x86-64-v2")))
const char* foo () { return "x86-64-v2"; }

__attribute__ ((target ("arch=x86-64-v3")))
const char* foo () { return "x86-64-v3"; }

__attribute__ ((target ("arch=x86-64-v4")))
const char* foo () { return "x86-64-v4"; }

int main ()
{
    printf("%s\n", foo());
    return 0;
}

On my laptop, this prints

$ g++ multiversioning.c 
$ ./a.out 
x86-64-v3

Note that the use of g++ is intentional here.

If I used gcc to compile, it would fail with error: redefinition of ‘foo’.

user7610

The only thing that worked for me was to use gcc with its __builtin_cpu_supports feature. Since I ran it under MSYS, it is likely to work on Windows too. The same can be done in C++.

// test_cpu.c
#ifndef __GNUC__
#error "You must use gnu"
#endif

#include <stdio.h>

int main() {
    if (__builtin_cpu_supports("x86-64-v4"))
        puts("v=4");
    else if (__builtin_cpu_supports("x86-64-v3"))
        puts("v=3");
    else if (__builtin_cpu_supports("x86-64-v2"))
        puts("v=2");
    else
        puts("v=1");
}

Usage:

$ gcc /test_cpu.c -o /test_cpu

$ /test_cpu
v=3

Yuriy

On recent Fedora / Red Hat systems, run:

$ /usr/lib64/ld-linux-x86-64.so.2 --help
Usage: /usr/lib64/ld-linux-x86-64.so.2 [OPTION]... EXECUTABLE-FILE [ARGS-FOR-PROGRAM...]
You have invoked 'ld.so', the program interpreter for dynamically-linked
ELF programs.  Usually, the program interpreter is invoked automatically
when a dynamically-linked executable is started.

You may invoke the program interpreter program directly from the command line to load and run an ELF executable file; this is like executing that file itself, but always uses the program interpreter you invoked, instead of the program interpreter specified in the executable file you run. Invoking the program interpreter directly provides access to additional diagnostics, and changing the dynamic linker behavior without setting environment variables (which would be inherited by subprocesses).

  --list                       list all dependencies and how they are resolved
  --verify                     verify that given object really is a dynamically linked object we can handle
  --inhibit-cache              Do not use /etc/ld.so.cache
  --library-path PATH          use given PATH instead of content of the environment variable LD_LIBRARY_PATH
  --glibc-hwcaps-prepend LIST  search glibc-hwcaps subdirectories in LIST
  --glibc-hwcaps-mask LIST     only search built-in subdirectories if in LIST
  --inhibit-rpath LIST         ignore RUNPATH and RPATH information in object names in LIST
  --audit LIST                 use objects named in LIST as auditors
  --preload LIST               preload objects named in LIST
  --argv0 STRING               set argv[0] to STRING before running
  --list-tunables              list all tunables with minimum and maximum values
  --list-diagnostics           list diagnostics information
  --help                       display this help and exit
  --version                    output version information and exit

This program interpreter self-identifies as: /lib64/ld-linux-x86-64.so.2

Shared library search path:
  (libraries located via /etc/ld.so.cache)
  /lib64 (system search path)
  /usr/lib64 (system search path)

Subdirectories of glibc-hwcaps directories, in priority order:
  x86-64-v4
  x86-64-v3
  x86-64-v2 (supported, searched)

lzap