1. 15 Jul 2022, 9 commits
  2. 14 Jul 2022, 1 commit
    • iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE) · b0b0b77e
      Authored by Alexander Lobakin
      KASAN reports:
      
      [ 4.668325][ T0] BUG: KASAN: wild-memory-access in dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
      [    4.676149][    T0] Read of size 8 at addr 1fffffff85115558 by task swapper/0/0
      [    4.683454][    T0]
      [    4.685638][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc3-00004-g0e862838 #1
      [    4.694331][    T0] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
      [    4.703196][    T0] Call Trace:
      [    4.706334][    T0]  <TASK>
      [ 4.709133][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
      
      The splat appeared after converting the type of the first argument
      (@nr, the bit number) of arch_test_bit() from `long` to
      `unsigned long` [0].
      
      Under certain conditions (for example, when ACPI NUMA is disabled
      via command line), pxm_to_node() can return %NUMA_NO_NODE (-1).
      It is a valid 'magic' NUMA node number, but not a valid bit
      number to use in bitops.
      node_online() eventually descends to test_bit() without validating
      its input, assuming the caller does so (which might be good for
      performance-critical paths). There, -1 becomes %ULONG_MAX, which
      leads to an insane array index when calculating the bit position
      in memory.
      
      For now, add an explicit check for @node being not %NUMA_NO_NODE
      before calling test_bit(). The actual logic doesn't change here
      at all.
      
      [0] https://github.com/norov/linux/commit/0e862838f290147ea9c16db852d8d494b552d38d
      
      Fixes: ee34b32d ("dmar: support for parsing Remapping Hardware Static Affinity structure")
      Cc: stable@vger.kernel.org # 2.6.33+
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      b0b0b77e
  3. 12 Jul 2022, 2 commits
  4. 01 Jul 2022, 9 commits
    • lib: test_bitmap: add compile-time optimization/evaluations assertions · dc34d503
      Authored by Alexander Lobakin
      Add a function to the bitmap test suite, which will ensure that
      compilers are able to evaluate operations performed by the
      bitops/bitmap helpers to compile-time constants when all of the
      arguments are compile-time constants as well, or trigger a build
      bug otherwise. This should work on all architectures and all the
      optimization levels supported by Kbuild.
      The function doesn't perform any runtime tests and gets optimized
      out to nothing after passing the build assertions.
      Unfortunately, Clang for s390 is currently broken (up to the latest
      Git snapshots) -- see the comment in the code -- so for now there's
      a small workaround for it which doesn't alter the logic. Hopefully
      we'll be able to remove it one day (a bug report is on its way).
      Suggested-by: Yury Norov <yury.norov@gmail.com>
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      dc34d503
    • bitmap: don't assume compiler evaluates small mem*() builtins calls · 3e7e5baa
      Authored by Alexander Lobakin
      The Intel kernel test bot triggered a build bug on the ARC
      architecture, which in essence boils down to the following:
      
      	DECLARE_BITMAP(bitmap, BITS_PER_LONG);
      
      	bitmap_clear(bitmap, 0, BITS_PER_LONG);
      	BUILD_BUG_ON(!__builtin_constant_p(*bitmap));
      
      which can be expanded to:
      
      	unsigned long bitmap[1];
      
      	memset(bitmap, 0, sizeof(*bitmap));
      	BUILD_BUG_ON(!__builtin_constant_p(*bitmap));
      
      In most cases, a compiler is able to expand small/simple mem*()
      calls into plain assignments or bitops; in this case that would mean:
      
      	unsigned long bitmap[1] = { 0 };
      
      	BUILD_BUG_ON(!__builtin_constant_p(*bitmap));
      
      and on most architectures this works, but not on ARC, despite having
      -O3 for every build.
      So, to make this work, when the last bit to modify is still
      within the first long (small_const_nbits()), just use plain
      assignments in the rest of the bitmap_*() functions which still
      use mem*() but haven't received such compile-time optimizations
      yet.
      This doesn't have the same coverage as the compilers provide, but
      it's at least something to start with:
      
      text: add/remove: 3/7 grow/shrink: 43/78 up/down: 1848/-3370 (-1546)
      data: add/remove: 1/11 grow/shrink: 0/8 up/down: 4/-356 (-352)
      
      notably the cpumask_*() family when NR_CPUS <= BITS_PER_LONG:
      
      netif_get_num_default_rss_queues              38       4     -34
      cpumask_copy                                  90       -     -90
      cpumask_clear                                146       -    -146
      
      and the abovementioned assertion started passing.
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      3e7e5baa
    • net/ice: fix initializing the bitmap in the switch code · 2f7ee2a7
      Authored by Alexander Lobakin
      Kbuild spotted the following bug during the testing of one of
      the optimizations:
      
      In file included from include/linux/cpumask.h:12,
      [...]
                      from drivers/net/ethernet/intel/ice/ice_switch.c:4:
      drivers/net/ethernet/intel/ice/ice_switch.c: In function 'ice_find_free_recp_res_idx.constprop':
      include/linux/bitmap.h:447:22: warning: 'possible_idx[0]' is used uninitialized [-Wuninitialized]
        447 |                 *map |= GENMASK(start + nbits - 1, start);
            |                      ^~
      In file included from drivers/net/ethernet/intel/ice/ice.h:7,
                       from drivers/net/ethernet/intel/ice/ice_lib.h:7,
                       from drivers/net/ethernet/intel/ice/ice_switch.c:4:
      drivers/net/ethernet/intel/ice/ice_switch.c:4929:24: note: 'possible_idx[0]' was declared here
       4929 |         DECLARE_BITMAP(possible_idx, ICE_MAX_FV_WORDS);
            |                        ^~~~~~~~~~~~
      include/linux/types.h:11:23: note: in definition of macro 'DECLARE_BITMAP'
         11 |         unsigned long name[BITS_TO_LONGS(bits)]
            |                       ^~~~
      
      %ICE_MAX_FV_WORDS is 48, so bitmap_set() here was initializing only
      48 bits, leaving junk in the remaining 16.
      It was previously hidden because filling 48 bits makes
      bitmap_set() call the external __bitmap_set(); after making it use
      plain bit arithmetic on small bitmaps, compilers started seeing
      the issue. The code still worked only because those 16 bits weren't
      used anywhere anyway.
      bitmap_{clear,set}() are not really intended to initialize bitmaps,
      rather to modify already initialized ones, as they don't do anything
      past the passed number of bits. The correct function to do this in
      that particular case is bitmap_fill(), so use it here. It will do
      `*possible_idx = ~0UL` instead of `*possible_idx |= GENMASK(47, 0)`,
      not leaving anything in an undefined state.
      
      Fixes: fd2a6b71 ("ice: create advanced switch recipe")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      2f7ee2a7
    • bitops: let optimize out non-atomic bitops on compile-time constants · b03fc117
      Authored by Alexander Lobakin
      Currently, many architecture-specific non-atomic bitop
      implementations use inline asm or other hacks which are faster or
      more robust when working with "real" variables (i.e. fields from
      the structures etc.), but the compilers have no clue how to optimize
      them out when called on compile-time constants. As a result, the
      following code:
      
      	DECLARE_BITMAP(foo, BITS_PER_LONG) = { }; // -> unsigned long foo[1];
      	unsigned long bar = BIT(BAR_BIT);
      	unsigned long baz = 0;
      
      	__set_bit(FOO_BIT, foo);
      	baz |= BIT(BAZ_BIT);
      
      	BUILD_BUG_ON(!__builtin_constant_p(test_bit(FOO_BIT, foo)));
      	BUILD_BUG_ON(!__builtin_constant_p(bar & BAR_BIT));
      	BUILD_BUG_ON(!__builtin_constant_p(baz & BAZ_BIT));
      
      triggers the first assertion on x86_64, which means that the
      compiler is unable to evaluate it to a compile-time initializer
      when the architecture-specific bitop is used even if it's obvious.
      In order to let the compiler optimize out such cases, expand the
      bitop() macro to use the "constant" C non-atomic bitop
      implementations when all of the arguments passed are compile-time
      constants, which means that the result will be a compile-time
      constant as well, so that it produces more efficient and simpler
      code in 100% of cases compared to the architecture-specific
      counterparts.
      
      The savings are architecture, compiler and compiler flags dependent,
      for example, on x86_64 -O2:
      
      GCC 12: add/remove: 78/29 grow/shrink: 332/525 up/down: 31325/-61560 (-30235)
      LLVM 13: add/remove: 79/76 grow/shrink: 184/537 up/down: 55076/-141892 (-86816)
      LLVM 14: add/remove: 10/3 grow/shrink: 93/138 up/down: 3705/-6992 (-3287)
      
      and ARM64 (courtesy of Mark):
      
      GCC 11: add/remove: 92/29 grow/shrink: 933/2766 up/down: 39340/-82580 (-43240)
      LLVM 14: add/remove: 21/11 grow/shrink: 620/651 up/down: 12060/-15824 (-3764)
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      b03fc117
    • bitops: wrap non-atomic bitops with a transparent macro · e69eb9c4
      Authored by Alexander Lobakin
      In preparation for altering the non-atomic bitops with a macro, wrap
      them in a transparent definition. This requires prepending one more
      '_' to their names in order to be able to do that seamlessly. It is
      a simple change, given that all the non-prefixed definitions are now
      in asm-generic.
      sparc32 already has several triple-underscored functions, so I had
      to rename them ('___' -> 'sp32_').
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      e69eb9c4
    • bitops: define const_*() versions of the non-atomics · bb7379bf
      Authored by Alexander Lobakin
      Define const_*() variants of the non-atomic bitops to be used when
      the input arguments are compile-time constants, so that the compiler
      will be always able to resolve those to compile-time constants as
      well. Those are mostly direct aliases for generic_*() with one
      exception for const_test_bit(): the original one is declared
      atomic-safe and thus doesn't discard the `volatile` qualifier, so
      in order to let the compiler optimize the code, define it
      separately, disregarding the qualifier.
      Add them to the compile-time type checks as well just in case.
      Suggested-by: Marco Elver <elver@google.com>
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      bb7379bf
    • bitops: unify non-atomic bitops prototypes across architectures · 0e862838
      Authored by Alexander Lobakin
      Currently, there is a mess with the prototypes of the non-atomic
      bitops across the different architectures:
      
      ret	bool, int, unsigned long
      nr	int, long, unsigned int, unsigned long
      addr	volatile unsigned long *, volatile void *
      
      Thankfully, this doesn't provoke any bugs, but it can sometimes
      make the compiler angry when that's not handy at all.
      Adjust all the prototypes to the following standard:
      
      ret	bool				retval can be only 0 or 1
      nr	unsigned long			native; signed makes no sense
      addr	volatile unsigned long *	bitmaps are arrays of ulongs
      
      Next, some architectures don't define 'arch_' versions as they don't
      support instrumentation, others do. To make sure there is always the
      same set of callables present and to ease any potential future
      changes, make them all follow the rule:
       * architecture-specific files define only 'arch_' versions;
       * non-prefixed versions can be defined only in asm-generic files;
      and place the non-prefixed definitions into a new file in
      asm-generic to be included by non-instrumented architectures.
      
      Finally, add some static assertions in order to prevent people from
      making a mess in this room again.
      I also used the %__always_inline attribute consistently, so that
      they always get resolved to the actual operations.
      Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Yury Norov <yury.norov@gmail.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      0e862838
    • bitops: always define asm-generic non-atomic bitops · 21bb8af5
      Authored by Alexander Lobakin
      Move the generic non-atomic bitops from the asm-generic header,
      which gets included only when there are no architecture-specific
      alternatives, to a separate independent file to make them always
      available.
      Almost no actual code changes, only one comment added to
      generic_test_bit() saying that it's an atomic operation itself
      and thus `volatile` must always stay there with no cast-aways.
      
      Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> # comment
      Suggested-by: Marco Elver <elver@google.com> # reference to kernel-doc
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      21bb8af5
    • ia64, processor: fix -Wincompatible-pointer-types in ia64_get_irr() · e5a16a5c
      Authored by Alexander Lobakin
      test_bit(), as any other bitmap op, takes `unsigned long *` as a
      second argument (pointer to the actual bitmap), as any bitmap
      itself is an array of unsigned longs. However, the ia64_get_irr()
      code passes a ref to `u64` as a second argument.
      This works with the ia64 bitops implementation because they take
      `void *` as the second argument and cast it later on.
      It also works with the bitmap API itself because `unsigned long`
      has the same size on ia64 as `u64` (`unsigned long long`), but
      from the compiler's PoV those two are different types.
      Define @irr as `unsigned long` to fix that. This implies no
      functional changes. The issue had been hidden for 16 years!
      
      Fixes: a5878691 ("[IA64] avoid broken SAL_CACHE_FLUSH implementations")
      Cc: stable@vger.kernel.org # 2.6.16+
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Yury Norov <yury.norov@gmail.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      e5a16a5c
  5. 24 Jun 2022, 1 commit
  6. 20 Jun 2022, 1 commit
  7. 19 Jun 2022, 17 commits