提交 · f768b35a2371ccf85255f608444d234062a1b5c9 · openeuler / Kernel

20 3月, 2023 1 次提交

cpumask: introduce for_each_cpu_or · 1470afef

由 Dave Chinner 提交于 3月 15, 2023

Equivalent of for_each_cpu_and, except it ORs the two masks together
so it iterates all the CPUs present in either mask.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>

1470afef

12 3月, 2023 1 次提交

cpumask: relax sanity checking constraints · e7304080

由 Linus Torvalds 提交于 3月 12, 2023

The cpumask_check() was unnecessarily tight, and causes problems for the
users of cpumask_next().

We have a number of users that take the previous return value of one of
the bit scanning functions and subtract one to keep it in "range". But
since the scanning functions end up returning up to 'small_cpumask_bits'
instead of the tighter 'nr_cpumask_bits', the range really needs to be
using that widened form.

[ This "previous-1" behavior is also the reason we have all those
comments about /* -1 is a legal arg here. */ and separate checks for
that being ok. So we could have just made "small_cpumask_bits-1"
be a similar special "don't check this" value.

Tetsuo Handa even suggested a patch that only does that for
cpumask_next(), since that seems to be the only actual case that
triggers, but that all makes it even _more_ magical and special. So
just relax the check ]

One example of this kind of pattern being the 'c_start()' function in
arch/x86/kernel/cpu/proc.c, but also duplicated in various forms on
other architectures.

Reported-by: syzbot+96cae094d90877641f32@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=96cae094d90877641f32Reported-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Link: https://lore.kernel.org/lkml/c1f4cc16-feea-b83c-82cf-1a1f007b7eb9@I-love.SAKURA.ne.jp/
Fixes: 596ff4a0 ("cpumask: re-introduce constant-sized cpumask optimizations")
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e7304080

08 3月, 2023 1 次提交

cpumask: be more careful with 'cpumask_setall()' · 63355b98

由 Linus Torvalds 提交于 3月 07, 2023

Commit 596ff4a0 ("cpumask: re-introduce constant-sized cpumask
optimizations") changed cpumask_setall() to use "bitmap_set()" instead
of "bitmap_fill()", because bitmap_fill() would explicitly set all the
bits of a constant sized small bitmap, and that's exactly what we don't
want: we want to only set bits up to 'nr_cpu_ids', which is what
"bitmap_set()" does.

However, Yury correctly points out that while "bitmap_set()" does indeed
only set bits up to the required bitmap size, it doesn't _clear_ bits
above that size, so the upper bits would still not have well-defined
values.

Now, none of this should really matter, since any bits set past
'nr_cpu_ids' should always be ignored in the first place.  Yes, the bit
scanning functions might return them as a result, but since users should
always consider the ">= nr_cpu_ids" condition to mean "no more bits",
that shouldn't have any actual effect (see previous commit 8ca09d5f
"cpumask: fix incorrect cpumask scanning result checks").

But let's just do it right, the way the code was _intended_ to work.  We
have had enough lazy code that works but bites us in the *rse later
(again, see previous commit) that there's no reason to not just do this
properly.

It turns out that "bitmap_fill()" gets this all right for the complex
case, and really only fails for the inlined optimized case that just
fills the whole word.  And while we could just fix bitmap_fill() to use
the proper last word mask, there's two issues with that:

 - the cpumask case wants to do the _optimization_ based on "NR_CPUS is
   a small constant", but then wants to do the actual bit _fill_ based
   on "nr_cpu_ids" that isn't necessarily that same constant

 - we have lots of non-cpumask users of bitmap_fill(), and while they
   hopefully don't care, and probably would want the proper semantics
   anyway ("only set bits up to the limit"), I do not want the cpumask
   changes to impact other parts

So this ends up just doing the single-word optimization by hand in the
cpumask code.  If our cpumask is fundamentally limited to a single word,
just do the proper "fill in that word" exactly.  And if it's the more
complex multi-word case, then the generic bitmap_fill() will DTRT.

This is all an example of how our bitmap function optimizations really
are somewhat broken.  They conflate the "this is size of the bitmap"
optimizations with the actual bit(s) we want to set.

In many cases we really want to have the two be separate things:
sometimes we base our optimizations on the size of the whole bitmap ("I
know this whole bitmap fits in a single word, so I'll just use
single-word accesses"), and sometimes we base them on the bit we are
looking at ("this is just acting on bits that are in the first word, so
I'll use single-word accesses").

Notice how the end result of the two optimizations are the same, but the
way we get to them are quite different.

And all our cpumask optimization games are really about that fundamental
distinction, and we'd often really want to pass in both the "this is the
bit I'm working on" (which _can_ be a small constant but might be
variable), and "I know it's in this range even if it's variable" (based
on CONFIG_NR_CPUS).

So this cpumask_setall() implementation just makes that explicit.  It
checks the "I statically know the size is small" using the known static
size of the cpumask (which is what that 'small_cpumask_bits' is all
about), but then sets the actual bits using the exact number of cpus we
have (ie 'nr_cpumask_bits')

Of course, in a perfect world, the compiler would have done all the
range analysis (possibly with help from us just telling it that
"this value is always in this range"), and would do all of this for us.
But that is not the world we live in.

While we dream of that perfect world, this does that manual logic to
make it all work out.  And this was a very long explanation for a small
code change that shouldn't even matter.
Reported-by: NYury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/lkml/ZAV9nGG9e1%2FrV+L%2F@yury-laptop/Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

63355b98

07 3月, 2023 1 次提交

cpumask: Fix typo nr_cpumask_size --> nr_cpumask_bits · 80c16b2b

由 Andy Shevchenko 提交于 3月 06, 2023

The never used nr_cpumask_size is just a typo, hence use existing
redefinition that's called nr_cpumask_bits.
Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

80c16b2b

06 3月, 2023 1 次提交

cpumask: re-introduce constant-sized cpumask optimizations · 596ff4a0

由 Linus Torvalds 提交于 3月 04, 2023

Commit aa47a7c2 ("lib/cpumask: deprecate nr_cpumask_bits") resulted
in the cpumask operations potentially becoming hugely less efficient,
because suddenly the cpumask was always considered to be variable-sized.

The optimization was then later added back in a limited form by commit
6f9c07be ("lib/cpumask: add FORCE_NR_CPUS config option"), but that
FORCE_NR_CPUS option is not useful in a generic kernel and more of a
special case for embedded situations with fixed hardware.

Instead, just re-introduce the optimization, with some changes.

Instead of depending on CPUMASK_OFFSTACK being false, and then always
using the full constant cpumask width, this introduces three different
cpumask "sizes":

 - the exact size (nr_cpumask_bits) remains identical to nr_cpu_ids.

   This is used for situations where we should use the exact size.

 - the "small" size (small_cpumask_bits) is the NR_CPUS constant if it
   fits in a single word and the bitmap operations thus end up able
   to trigger the "small_const_nbits()" optimizations.

   This is used for the operations that have optimized single-word
   cases that get inlined, notably the bit find and scanning functions.

 - the "large" size (large_cpumask_bits) is the NR_CPUS constant if it
   is an sufficiently small constant that makes simple "copy" and
   "clear" operations more efficient.

   This is arbitrarily set at four words or less.

As a an example of this situation, without this fixed size optimization,
cpumask_clear() will generate code like

        movl    nr_cpu_ids(%rip), %edx
        addq    $63, %rdx
        shrq    $3, %rdx
        andl    $-8, %edx
        callq   memset@PLT

on x86-64, because it would calculate the "exact" number of longwords
that need to be cleared.

In contrast, with this patch, using a MAX_CPU of 64 (which is quite a
reasonable value to use), the above becomes a single

	movq $0,cpumask

instruction instead, because instead of caring to figure out exactly how
many CPU's the system has, it just knows that the cpumask will be a
single word and can just clear it all.

Note that this does end up tightening the rules a bit from the original
version in another way: operations that set bits in the cpumask are now
limited to the actual nr_cpu_ids limit, whereas we used to do the
nr_cpumask_bits thing almost everywhere in the cpumask code.

But if you just clear bits, or scan for bits, we can use the simpler
compile-time constants.

In the process, remove 'cpumask_complement()' and 'for_each_cpu_not()'
which were not useful, and which fundamentally have to be limited to
'nr_cpu_ids'.  Better remove them now than have somebody introduce use
of them later.

Of course, on x86-64 with MAXSMP there is no sane small compile-time
constant for the cpumask sizes, and we end up using the actual CPU bits,
and will generate the above kind of horrors regardless.  Please don't
use MAXSMP unless you really expect to have machines with thousands of
cores.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

596ff4a0

08 2月, 2023 1 次提交

cpumask: introduce cpumask_nth_and_andnot · 62f4386e

由 Yury Norov 提交于 1月 20, 2023

Introduce cpumask_nth_and_andnot() based on find_nth_and_andnot_bit().
It's used in the following patch to traverse cpumasks without storing
intermediate result in temporary cpumask.
Signed-off-by: NYury Norov <yury.norov@gmail.com>
Acked-by: NTariq Toukan <tariqt@nvidia.com>
Reviewed-by: NJacob Keller <jacob.e.keller@intel.com>
Reviewed-by: NPeter Lafreniere <peter@n8pjl.ca>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

62f4386e

13 1月, 2023 1 次提交

cpuidle, ACPI: Make noinstr clean · 6a123d6a

由 Peter Zijlstra 提交于 1月 12, 2023

objtool found cases where ACPI methods called out into instrumentation code:

vmlinux.o: warning: objtool: io_idle+0xc: call to __inb.isra.0() leaves .noinstr.text section
vmlinux.o: warning: objtool: acpi_idle_enter+0xfe: call to num_online_cpus() leaves .noinstr.text section
vmlinux.o: warning: objtool: acpi_idle_enter+0x115: call to acpi_idle_fallback_to_c1.isra.0() leaves .noinstr.text section

Fix this by: marking the IO in/out, acpi_idle_fallback_to_c1() and
num_online_cpus() methods as __always_inline.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Tested-by: NTony Lindgren <tony@atomide.com>
Tested-by: NUlf Hansson <ulf.hansson@linaro.org>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NFrederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195541.294846301@infradead.org

6a123d6a

17 10月, 2022 1 次提交

Revert "cpumask: fix checking valid cpu range". · 80493877

由 Tetsuo Handa 提交于 10月 16, 2022

This reverts commit 78e5a339 ("cpumask: fix checking valid cpu range").

syzbot is hitting WARN_ON_ONCE(cpu >= nr_cpumask_bits) warning at
cpu_max_bits_warn() [1], for commit 78e5a339 ("cpumask: fix checking
valid cpu range") is broken.  Obviously that patch hits WARN_ON_ONCE()
when e.g.  reading /proc/cpuinfo because passing "cpu + 1" instead of
"cpu" will trivially hit cpu == nr_cpumask_bits condition.

Although syzbot found this problem in linux-next.git on 2022/09/27 [2],
this problem was not fixed immediately.  As a result, that patch was
sent to linux.git before the patch author recognizes this problem, and
syzbot started failing to test changes in linux.git since 2022/10/10
[3].

Andrew Jones proposed a fix for x86 and riscv architectures [4].  But
[2] and [5] indicate that affected locations are not limited to arch
code.  More delay before we find and fix affected locations, less tested
kernel (and more difficult to bisect and fix) before release.

We should have inspected and fixed basically all cpumask users before
applying that patch.  We should not crash kernels in order to ask
existing cpumask users to update their code, even if limited to
CONFIG_DEBUG_PER_CPU_MAPS=y case.

Link: https://syzkaller.appspot.com/bug?extid=d0fd2bf0dd6da72496dd [1]
Link: https://syzkaller.appspot.com/bug?extid=21da700f3c9f0bc40150 [2]
Link: https://syzkaller.appspot.com/bug?extid=51a652e2d24d53e75734 [3]
Link: https://lkml.kernel.org/r/20221014155845.1986223-1-ajones@ventanamicro.com [4]
Link: https://syzkaller.appspot.com/bug?extid=4d46c43d81c3bd155060 [5]
Reported-by: NAndrew Jones <ajones@ventanamicro.com>
Reported-by: syzbot+d0fd2bf0dd6da72496dd@syzkaller.appspotmail.com
Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

80493877

06 10月, 2022 1 次提交

cpumask: Introduce for_each_cpu_andnot() · 5f75ff29

由 Valentin Schneider 提交于 10月 03, 2022

for_each_cpu_and() is very convenient as it saves having to allocate a
temporary cpumask to store the result of cpumask_and(). The same issue
applies to cpumask_andnot() which doesn't actually need temporary storage
for iteration purposes.

Following what has been done for for_each_cpu_and(), introduce
for_each_cpu_andnot().
Signed-off-by: NValentin Schneider <vschneid@redhat.com>

5f75ff29

02 10月, 2022 3 次提交

cpumask: fix checking valid cpu range · 78e5a339

由 Yury Norov 提交于 9月 19, 2022

The range of valid CPUs is [0, nr_cpu_ids). Some cpumask functions are
passed with a shifted CPU index, and for them, the valid range is
[-1, nr_cpu_ids-1). Currently for those functions, we check the index
against [-1, nr_cpu_ids), which is wrong.
Signed-off-by: NYury Norov <yury.norov@gmail.com>

78e5a339

lib/bitmap: introduce for_each_set_bit_wrap() macro · 4fe49b3b

由 Yury Norov 提交于 9月 19, 2022

Add for_each_set_bit_wrap() macro and use it in for_each_cpu_wrap(). The
new macro is based on __for_each_wrap() iterator, which is simpler and
smaller than cpumask_next_wrap().
Signed-off-by: NYury Norov <yury.norov@gmail.com>

4fe49b3b

cpumask: switch for_each_cpu{,_not} to use for_each_bit() · 33e67710

由 Yury Norov 提交于 9月 19, 2022

The difference between for_each_cpu() and for_each_set_bit()
is that the latter uses cpumask_next() instead of find_next_bit(),
and so calls cpumask_check().

This check is useless because the iterator value is not provided by
user. It generates false-positives for the very last iteration
of for_each_cpu().
Signed-off-by: NYury Norov <yury.norov@gmail.com>

33e67710

27 9月, 2022 2 次提交

cpumask: add cpumask_nth_{,and,andnot} · 944c417d

由 Yury Norov 提交于 9月 17, 2022

Add cpumask_nth_{,and,andnot} as wrappers around corresponding
find functions, and use it in cpumask_local_spread().
Signed-off-by: NYury Norov <yury.norov@gmail.com>

944c417d

lib/bitmap: add bitmap_weight_and() · 24291caf

由 Yury Norov 提交于 9月 17, 2022

The function calculates Hamming weight of (bitmap1 & bitmap2). Now we
have to do like this:
	tmp = bitmap_alloc(nbits);
	bitmap_and(tmp, map1, map2, nbits);
	weight = bitmap_weight(tmp, nbits);
	bitmap_free(tmp);

This requires additional memory, adds pressure on alloc subsystem, and
way less cache-friendly than just:
	weight = bitmap_weight_and(map1, map2, nbits);

The following patches apply it for cpumask functions.
Signed-off-by: NYury Norov <yury.norov@gmail.com>

24291caf

22 9月, 2022 1 次提交

drivers/base: Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES · d7f06bdd

由 Phil Auld 提交于 9月 06, 2022

As PAGE_SIZE is unsigned long, -1 > PAGE_SIZE when NR_CPUS <= 3.
This leads to very large file sizes:

topology$ ls -l
total 0
-r--r--r-- 1 root root 18446744073709551615 Sep  5 11:59 core_cpus
-r--r--r-- 1 root root                 4096 Sep  5 11:59 core_cpus_list
-r--r--r-- 1 root root                 4096 Sep  5 10:58 core_id
-r--r--r-- 1 root root 18446744073709551615 Sep  5 10:10 core_siblings
-r--r--r-- 1 root root                 4096 Sep  5 11:59 core_siblings_list
-r--r--r-- 1 root root 18446744073709551615 Sep  5 11:59 die_cpus
-r--r--r-- 1 root root                 4096 Sep  5 11:59 die_cpus_list
-r--r--r-- 1 root root                 4096 Sep  5 11:59 die_id
-r--r--r-- 1 root root 18446744073709551615 Sep  5 11:59 package_cpus
-r--r--r-- 1 root root                 4096 Sep  5 11:59 package_cpus_list
-r--r--r-- 1 root root                 4096 Sep  5 10:58 physical_package_id
-r--r--r-- 1 root root 18446744073709551615 Sep  5 10:10 thread_siblings
-r--r--r-- 1 root root                 4096 Sep  5 11:59 thread_siblings_list

Adjust the inequality to catch the case when NR_CPUS is configured
to a small value.

Fixes: 7ee951ac ("drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: stable@vger.kernel.org
Cc: feng xiangjun <fengxj325@gmail.com>
Reported-by: Nfeng xiangjun <fengxj325@gmail.com>
Signed-off-by: NPhil Auld <pauld@redhat.com>
Signed-off-by: NYury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20220906203542.1796629-1-pauld@redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

d7f06bdd

21 9月, 2022 1 次提交

lib/cpumask: add FORCE_NR_CPUS config option · 6f9c07be

由 Yury Norov 提交于 9月 05, 2022

The size of cpumasks is hard-limited by compile-time parameter NR_CPUS,
but defined at boot-time when kernel parses ACPI/DT tables, and stored in
nr_cpu_ids. In many practical cases, number of CPUs for a target is known
at compile time, and can be provided with NR_CPUS.

In that case, compiler may be instructed to rely on NR_CPUS as on actual
number of CPUs, not an upper limit. It allows to optimize many cpumask
routines and significantly shrink size of the kernel image.

This patch adds FORCE_NR_CPUS option to teach the compiler to rely on
NR_CPUS and enable corresponding optimizations.

If FORCE_NR_CPUS=y, kernel will not set nr_cpu_ids at boot, but only check
that the actual number of possible CPUs is equal to NR_CPUS, and WARN if
that doesn't hold.

The new option is especially useful in embedded applications because
kernel configurations are unique for each SoC, the number of CPUs is
constant and known well, and memory limitations are typically harder.

For my 4-CPU ARM64 build with NR_CPUS=4, FORCE_NR_CPUS=y saves 46KB:
add/remove: 3/4 grow/shrink: 46/729 up/down: 652/-46952 (-46300)
Signed-off-by: NYury Norov <yury.norov@gmail.com>

6f9c07be

20 9月, 2022 3 次提交

lib/cpumask: deprecate nr_cpumask_bits · aa47a7c2

由 Yury Norov 提交于 9月 05, 2022

Cpumask code is written in assumption that when CONFIG_CPUMASK_OFFSTACK
is enabled, all cpumasks have boot-time defined size, otherwise the size
is always NR_CPUS.

The latter is wrong because the number of possible cpus is always
calculated on boot, and it may be less than NR_CPUS.

On my 4-cpu arm64 VM the nr_cpu_ids is 4, as expected, and nr_cpumask_bits
is 256, which corresponds to NR_CPUS. This not only leads to useless
traversing of cpumask bits greater than 4, this also makes some cpumask
routines fail.

For example, cpumask_full(0b1111000..000) would erroneously return false
in the example above because tail bits in the mask are all unset.

This patch deprecates nr_cpumask_bits and wires it to nr_cpu_ids
unconditionally, so that cpumask routines will not waste time traversing
unused part of cpu masks. It also fixes cpumask_full() and similar
routines.

As a side effect, because now a length of cpumasks is defined at run-time
even if CPUMASK_OFFSTACK is disabled, compiler can't optimize corresponding
functions.

It increases kernel size by ~2.5KB if OFFSTACK is off. This is addressed in
the following patch.
Signed-off-by: NYury Norov <yury.norov@gmail.com>

aa47a7c2

lib/cpumask: delete misleading comment · 7102b3bb

由 Yury Norov 提交于 9月 05, 2022

The comment says that HOTPLUG config option enables all cpus in
cpu_possible_mask up to NR_CPUs. This is wrong. Even if HOTPLUG is
enabled, the mask is populated on boot with respect to ACPI/DT records.
Signed-off-by: NYury Norov <yury.norov@gmail.com>

7102b3bb

smp: add set_nr_cpu_ids() · 38bef8e5

由 Yury Norov 提交于 9月 05, 2022

In preparation to support compile-time nr_cpu_ids, add a setter for
the variable.

This is a no-op for all arches.
Signed-off-by: NYury Norov <yury.norov@gmail.com>

38bef8e5

09 9月, 2022 1 次提交

drivers/base: Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES · b9be19ee

由 Phil Auld 提交于 9月 06, 2022

As PAGE_SIZE is unsigned long, -1 > PAGE_SIZE when NR_CPUS <= 3.
This leads to very large file sizes:

topology$ ls -l
total 0
-r--r--r-- 1 root root 18446744073709551615 Sep  5 11:59 core_cpus
-r--r--r-- 1 root root                 4096 Sep  5 11:59 core_cpus_list
-r--r--r-- 1 root root                 4096 Sep  5 10:58 core_id
-r--r--r-- 1 root root 18446744073709551615 Sep  5 10:10 core_siblings
-r--r--r-- 1 root root                 4096 Sep  5 11:59 core_siblings_list
-r--r--r-- 1 root root 18446744073709551615 Sep  5 11:59 die_cpus
-r--r--r-- 1 root root                 4096 Sep  5 11:59 die_cpus_list
-r--r--r-- 1 root root                 4096 Sep  5 11:59 die_id
-r--r--r-- 1 root root 18446744073709551615 Sep  5 11:59 package_cpus
-r--r--r-- 1 root root                 4096 Sep  5 11:59 package_cpus_list
-r--r--r-- 1 root root                 4096 Sep  5 10:58 physical_package_id
-r--r--r-- 1 root root 18446744073709551615 Sep  5 10:10 thread_siblings
-r--r--r-- 1 root root                 4096 Sep  5 11:59 thread_siblings_list

Adjust the inequality to catch the case when NR_CPUS is configured
to a small value.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: stable@vger.kernel.org
Cc: feng xiangjun <fengxj325@gmail.com>
Fixes: 7ee951ac ("drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist")
Reported-by: Nfeng xiangjun <fengxj325@gmail.com>
Signed-off-by: NPhil Auld <pauld@redhat.com>
Signed-off-by: NYury Norov <yury.norov@gmail.com>

b9be19ee

16 8月, 2022 2 次提交

lib/cpumask: add inline cpumask_next_wrap() for UP · 2248ccd8

由 Sander Vanheule 提交于 8月 09, 2022

In the uniprocessor case, cpumask_next_wrap() can be simplified, as the
number of valid argument combinations is limited:
    - 'start' can only be 0
    - 'n' can only be -1 or 0

The only valid CPU that can then be returned, if any, will be the first
one set in the provided 'mask'.

For NR_CPUS == 1, include/linux/cpumask.h now provides an inline
definition of cpumask_next_wrap(), which will conflict with the one
provided by lib/cpumask.c.  Make building of lib/cpumask.o again depend
on CONFIG_SMP=y (i.e. NR_CPUS > 1) to avoid the re-definition.
Suggested-by: NYury Norov <yury.norov@gmail.com>
Signed-off-by: NSander Vanheule <sander@svanheule.net>
Signed-off-by: NYury Norov <yury.norov@gmail.com>

2248ccd8

cpumask: align signatures of UP implementations · be599244

由 Sander Vanheule 提交于 8月 09, 2022

Between the generic version, and their uniprocessor optimised
implementations, the return types of cpumask_any_and_distribute() and
cpumask_any_distribute() are not identical.  Change the UP versions to
'unsigned int', to match the generic versions.
Suggested-by: NYury Norov <yury.norov@gmail.com>
Signed-off-by: NSander Vanheule <sander@svanheule.net>
Signed-off-by: NYury Norov <yury.norov@gmail.com>

be599244

18 7月, 2022 3 次提交

cpumask: update cpumask_next_wrap() signature · 953257a9

由 Sander Vanheule 提交于 7月 02, 2022

The extern specifier is not needed for this declaration, so drop it.  The
function also depends only on the input parameters, and has no side
effects, so it can be marked __pure like other functions in cpumask.h.

Link: https://lkml.kernel.org/r/72ab755695b74bb5fbaa756ae4c0edd708d172f1.1656777646.git.sander@svanheule.netSigned-off-by: NSander Vanheule <sander@svanheule.net>
Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>

953257a9

cpumask: Fix invalid uniprocessor mask assumption · b81dce77

由 Sander Vanheule 提交于 7月 02, 2022

On uniprocessor builds, any CPU mask is assumed to contain exactly one CPU
(cpu0).  This assumption ignores the existence of empty masks, resulting
in incorrect behaviour.

cpumask_first_zero(), cpumask_next_zero(), and for_each_cpu_not() don't
provide behaviour matching the assumption that a UP mask is always "1",
and instead provide behaviour matching the empty mask.

Drop the incorrectly optimised code and use the generic implementations in
all cases.

Link: https://lkml.kernel.org/r/86bf3f005abba2d92120ddd0809235cab4f759a6.1656777646.git.sander@svanheule.netSigned-off-by: NSander Vanheule <sander@svanheule.net>
Suggested-by: NYury Norov <yury.norov@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>

b81dce77

cpumask: add UP optimised for_each_*_cpu versions · 4f099030

由 Sander Vanheule 提交于 7月 02, 2022

On uniprocessor builds, the following loops will always run over a mask
that contains one enabled CPU (cpu0):

    - for_each_possible_cpu
    - for_each_online_cpu
    - for_each_present_cpu

Provide uniprocessor-specific macros for these loops, that always run
exactly once.

Link: https://lkml.kernel.org/r/3a92869b902a075b97be5d1452c9c6badbbff0df.1656777646.git.sander@svanheule.netSigned-off-by: NSander Vanheule <sander@svanheule.net>
Acked-by: NYury Norov <yury.norov@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>

4f099030

15 7月, 2022 5 次提交

drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist · 7ee951ac

由 Phil Auld 提交于 7月 15, 2022

Using bin_attributes with a 0 size causes fstat and friends to return that
0 size. This breaks userspace code that retrieves the size before reading
the file. Rather than reverting 75bd50fa ("drivers/base/node.c: use
bin_attribute to break the size limitation of cpumap ABI") let's put in a
size value at compile time.

For cpulist the maximum size is on the order of
	NR_CPUS * (ceil(log10(NR_CPUS)) + 1)/2

which for 8192 is 20480 (8192 * 5)/2. In order to get near that you'd need
a system with every other CPU on one node. For example: (0,2,4,8, ... ).
To simplify the math and support larger NR_CPUS in the future we are using
(NR_CPUS * 7)/2. We also set it to a min of PAGE_SIZE to retain the older
behavior for smaller NR_CPUS.

The cpumap file the size works out to be NR_CPUS/4 + NR_CPUS/32 - 1
(or NR_CPUS * 9/32 - 1) including the ","s.

Add a set of macros for these values to cpumask.h so they can be used in
multiple places. Apply these to the handful of such files in
drivers/base/topology.c as well as node.c.

As an example, on an 80 cpu 4-node system (NR_CPUS == 8192):

before:

-r--r--r--. 1 root root 0 Jul 12 14:08 system/node/node0/cpulist
-r--r--r--. 1 root root 0 Jul 11 17:25 system/node/node0/cpumap

after:

-r--r--r--. 1 root root 28672 Jul 13 11:32 system/node/node0/cpulist
-r--r--r--. 1 root root  4096 Jul 13 11:31 system/node/node0/cpumap

CONFIG_NR_CPUS = 16384
-r--r--r--. 1 root root 57344 Jul 13 14:03 system/node/node0/cpulist
-r--r--r--. 1 root root  4607 Jul 13 14:02 system/node/node0/cpumap

The actual number of cpus doesn't matter for the reported size since they
are based on NR_CPUS.

Fixes: 75bd50fa ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI")
Fixes: bb9ec13d ("topology: use bin_attribute to break the size limitation of cpumap ABI")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Yury Norov <yury.norov@gmail.com> (for include/linux/cpumask.h)
Signed-off-by: NPhil Auld <pauld@redhat.com>
Link: https://lore.kernel.org/r/20220715134924.3466194-1-pauld@redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

7ee951ac

lib/cpumask: move some one-line wrappers to header file · f0dd891d

由 Yury Norov 提交于 7月 01, 2022

After moving gfp flags to a separate header, it's possible to move some
cpumask allocators into headers, and avoid creating real functions.
Signed-off-by: NYury Norov <yury.norov@gmail.com>

f0dd891d

lib/cpumask: move trivial wrappers around find_bit to the header · 9b2e7086

由 Yury Norov 提交于 7月 01, 2022

To avoid circular dependencies, cpumask keeps simple (almost) one-line
wrappers around find_bit() in a c-file.

Commit 47d8c156 ("include: move find.h from asm_generic to linux")
moved find.h header out of asm_generic include path, and it helped to fix
many circular dependencies, including some in cpumask.h.

This patch moves those one-liners to header files.
Signed-off-by: NYury Norov <yury.norov@gmail.com>

9b2e7086

lib/cpumask: change return types to unsigned where appropriate · 8b6b795d

由 Yury Norov 提交于 7月 01, 2022

Switch return types to unsigned int where return values cannot be negative.
Signed-off-by: NYury Norov <yury.norov@gmail.com>

8b6b795d

cpumask: change return types to bool where appropriate · cb32c285

由 Yury Norov 提交于 7月 01, 2022

Some cpumask functions have integer return types where return values
are naturally booleans.
Signed-off-by: NYury Norov <yury.norov@gmail.com>

cb32c285

13 2月, 2022 1 次提交

cpumask: Add a x86-specific cpumask_clear_cpu() helper · f5c54f77

由 Borislav Petkov 提交于 2月 04, 2022

Add a x86-specific cpumask_clear_cpu() helper which will be used in
places where the explicit KASAN-instrumentation in the *_bit() helpers
is unwanted.

Also, always inline two more cpumask generic helpers.

allyesconfig:

     text    data     bss     dec     hex filename
  190553143       159425889       32076404        382055436       16c5b40c vmlinux.before
  190551812       159424945       32076404        382053161       16c5ab29 vmlinux.after
Signed-off-by: NBorislav Petkov <bp@suse.de>
Acked-by: NMarco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20220204083015.17317-2-bp@alien8.de

f5c54f77

26 1月, 2022 1 次提交

cpumask: Always inline helpers which use bit manipulation functions · 1dc01aba

由 Borislav Petkov 提交于 1月 13, 2022

Former are always inlined so do that for the latter too, for
consistency.

Size impact is a whopping 5 bytes increase! :-)

   text    data     bss     dec     hex filename
22350551        8213184 1917164 32480899        1ef9e83 vmlinux.x86-64.defconfig.before
22350556        8213152 1917164 32480872        1ef9e68 vmlinux.x86-64.defconfig.after
Signed-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NMarco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20220113155357.4706-3-bp@alien8.de

1dc01aba

16 1月, 2022 2 次提交

cpumask: replace cpumask_next_* with cpumask_first_* where appropriate · 9b51d9d8

由 Yury Norov 提交于 8月 14, 2021

cpumask_first() is a more effective analogue of 'next' version if n == -1
(which means start == 0). This patch replaces 'next' with 'first' where
things look trivial.

There's no cpumask_first_zero() function, so create it.
Signed-off-by: NYury Norov <yury.norov@gmail.com>
Tested-by: NWolfram Sang <wsa+renesas@sang-engineering.com>

9b51d9d8

cpumask: use find_first_and_bit() · 93ba139b

由 Yury Norov 提交于 8月 14, 2021

Now we have an efficient implementation for find_first_and_bit(),
so switch cpumask to use it where appropriate.
Signed-off-by: NYury Norov <yury.norov@gmail.com>
Tested-by: NWolfram Sang <wsa+renesas@sang-engineering.com>

93ba139b

21 9月, 2021 1 次提交

cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf · c86a2d90

由 Tobias Klauser 提交于 9月 17, 2021

The changes in the patch series [1] introduced a terminating null byte
when reading from cpulist or cpumap sysfs files, for example:

  $ xxd /sys/devices/system/node/node0/cpulist
  00000000: 302d 310a 00                             0-1..

Before this change, the output looked as follows:

  $ xxd /sys/devices/system/node/node0/cpulist
  00000000: 302d 310a                                0-1.

Fix this regression by excluding the terminating null byte from the
returned length in cpumap_print_list_to_buf and
cpumap_print_bitmask_to_buf.

[1] https://lore.kernel.org/all/20210806110251.560-1-song.bao.hua@hisilicon.com/

Fixes: 1fae5629 ("cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list")
Acked-by: NBarry Song <song.bao.hua@hisilicon.com>
Acked-by: NYury Norov <yury.norov@gmail.com>
Signed-off-by: NTobias Klauser <tklauser@distanz.ch>
Link: https://lore.kernel.org/r/20210916222705.13554-1-tklauser@distanz.chSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c86a2d90

13 8月, 2021 1 次提交

cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list · 1fae5629

由 Tian Tao 提交于 8月 06, 2021

The existing cpumap_print_to_pagebuf() is used by cpu topology and other
drivers to export hexadecimal bitmask and decimal list to userspace by
sysfs ABI.

Right now, those drivers are using a normal attribute for this kind of
ABIs. A normal attribute typically has show entry as below:

static ssize_t example_dev_show(struct device *dev,
                struct device_attribute *attr, char *buf)
{
	...
	return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu);
}
show entry of attribute has no offset and count parameters and this
means the file is limited to one page only.

cpumap_print_to_pagebuf() API works terribly well for this kind of
normal attribute with buf parameter and without offset, count:

static inline ssize_t
cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
{
	return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask),
				       nr_cpu_ids);
}

The problem is once we have many cpus, we have a chance to make bitmask
or list more than one page. Especially for list, it could be as complex
as 0,3,5,7,9,...... We have no simple way to know it exact size.

It turns out bin_attribute is a way to break this limit. bin_attribute
has show entry as below:
static ssize_t
example_bin_attribute_show(struct file *filp, struct kobject *kobj,
             struct bin_attribute *attr, char *buf,
             loff_t offset, size_t count)
{
	...
}

With the new offset and count parameters, this makes sysfs ABI be able
to support file size more than one page. For example, offset could be
>= 4096.

This patch introduces cpumap_print_bitmask/list_to_buf() and their bitmap
infrastructure bitmap_print_bitmask/list_to_buf() so that those drivers
can move to bin_attribute to support large bitmask and list. At the same
time, we have to pass those corresponding parameters such as offset, count
from bin_attribute to this new API.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: "Ma, Jianpeng" <jianpeng.ma@intel.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Reviewed-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: NTian Tao <tiantao6@hisilicon.com>
Signed-off-by: NBarry Song <song.bao.hua@hisilicon.com>
Link: https://lore.kernel.org/r/20210806110251.560-2-song.bao.hua@hisilicon.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

1fae5629

09 7月, 2021 1 次提交

lib: fix spelling mistakes in header files · c23c8082

由 Zhen Lei 提交于 7月 07, 2021

Fix some spelling mistakes in comments found by "codespell":
Hoever ==> However
poiter ==> pointer
representaion ==> representation
uppon ==> upon
independend ==> independent
aquired ==> acquired
mis-match ==> mismatch
scrach ==> scratch
struture ==> structure
Analagous ==> Analogous
interation ==> iteration

And some were discovered manually by Joe Perches and Christoph Lameter:
stroed ==> stored
arch independent ==> an architecture independent
A example structure for ==> Example structure for

Link: https://lkml.kernel.org/r/20210609150027.14805-2-thunder.leizhen@huawei.comSigned-off-by: NZhen Lei <thunder.leizhen@huawei.com>
Cc: Christoph Lameter <cl@gentwo.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c23c8082

16 4月, 2021 2 次提交

cpumask: Introduce DYING mask · e40f74c5

由 Peter Zijlstra 提交于 1月 19, 2021

Introduce a cpumask that indicates (for each CPU) what direction the
CPU hotplug is currently going. Notably, it tracks rollbacks. Eg. when
an up fails and we do a roll-back down, it will accurately reflect the
direction.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NValentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210310150109.151441252@infradead.org

e40f74c5

cpumask: Make cpu_{online,possible,present,active}() inline · b02a4fd8

由 Peter Zijlstra 提交于 1月 25, 2021

Prepare for addition of another mask. Primarily a code movement to
avoid having to create more #ifdef, but while there, convert
everything with an argument to an inline function.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NValentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210310150109.045447765@infradead.org

b02a4fd8

06 3月, 2021 1 次提交

cpumask: Mark functions as pure · 291c4011

由 Nadav Amit 提交于 2月 20, 2021

cpumask_next_and() and cpumask_any_but() are pure, and marking them as
such seems to generate different and presumably better code for
native_flush_tlb_multi().
Signed-off-by: NNadav Amit <namit@vmware.com>
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Reviewed-by: NDave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20210220231712.2475218-8-namit@vmware.com

291c4011

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功