- 26 Sep, 2021: 40 commits
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add KEY_ALLOC_DOMAIN_* flags so that the key domain tag can be specified at key creation. This is done to separate the key domain setting from the key type. If applied to a keyring, the flag sets the requested domain tag for every key added to that keyring. IMA uses the existing key_type_asymmetric for appraisal, but also has to specify the key domain to bind the appraisal key to the ima namespace.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
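As a rough illustration of how a caller might request a domain tag at key-creation time under this series, a minimal sketch follows. KEY_ALLOC_DOMAIN_IMA is an assumed flag name based on the description above; keyring_alloc() and the other flags are existing kernel API.

    #include <linux/key.h>
    #include <linux/cred.h>
    #include <linux/err.h>

    /* Sketch only: KEY_ALLOC_DOMAIN_IMA is assumed from this patch series. */
    static struct key *ima_ns_alloc_keyring(void)
    {
            /* Keys added to this keyring later inherit the requested domain tag. */
            return keyring_alloc(".ima_ns", GLOBAL_ROOT_UID, GLOBAL_ROOT_GID,
                                 current_cred(),
                                 KEY_POS_SEARCH | KEY_POS_READ | KEY_USR_VIEW,
                                 KEY_ALLOC_NOT_IN_QUOTA | KEY_ALLOC_DOMAIN_IMA,
                                 NULL, NULL);
    }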
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add a domain tag to key_match_data. If set, check the domain tag in the default match function and in the asymmetric key match functions. This allows the key domain tag to be used as a search criterion for the iterative search, not only for the direct lookup that is based on the index key.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add keyring_search_tag(), a version of keyring_search() that allows the key domain tag to be specified.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
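For reference, the existing mainline helper is keyring_search(keyring, type, description, recurse); the tagged variant described above presumably adds a domain-tag argument along these lines (signature assumed for illustration, not taken from the patch):

    /* Assumed prototype, mirroring keyring_search(); the extra parameter
     * carries the key domain tag to match against. */
    key_ref_t keyring_search_tag(key_ref_t keyring,
                                 struct key_type *type,
                                 const char *description,
                                 bool recurse,
                                 struct key_tag *domain_tag);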
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

If a subject-based rule is added to the policy before the user namespace UID mapping is defined, the ID has to be recalculated. This can happen if the new user namespace is created alongside the new ima namespace: the default policy rules are loaded when the first process is born into the new ima namespace, so the user has no chance to define the mapping first. It can also happen for custom policy rules loaded from within the new ima namespace before the mapping is created.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add a function that checks whether the UID map is defined. It will be used by ima to check whether ID remapping in subject-based rules is necessary.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
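A minimal sketch of what such a check could look like. The helper name is hypothetical; uid_map and its extent count are existing fields of struct user_namespace, and an extent count of zero means no mapping has been written yet.

    #include <linux/user_namespace.h>

    /* Hypothetical helper name: returns true once a uid mapping has been
     * written for the namespace, i.e. at least one extent is installed. */
    static inline bool ima_ns_uid_map_defined(const struct user_namespace *user_ns)
    {
            return user_ns->uid_map.nr_extents > 0;
    }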
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Parse the per-namespace ima policy file. The path is passed, as for the root ima namespace, through the ima securityfs 'policy' entry.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add ima securityfs entries to configure, per ima namespace:
- the path to the x509 certificate
- the ima kernel boot parameters

The x509 certificate will be parsed and loaded when the first process is born into the new ima namespace; paths are not validated when written. Kernel boot parameters are pre-parsed and applied when the first process is born into the new namespace.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

It is possible that the user first unshares the ima namespace and then creates a new user namespace using clone3(). In that case the owning user namespace is the newly created one, because it is associated with the first process in the new ima namespace.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Violations are now tracked per namespace.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add "others" permissions to the namespaced ima securityfs entries. This is necessary so that root in the user namespace that is the parent of the given ima namespace has access to the ima-related data. The loosened DAC restrictions are compensated by an extra check for the SYS_ADMIN capability in the ima code. Access is given only to the namespaced data, e.g. the root user in the new ima namespace will see measurement list entries collected for that namespace and not for other existing namespaces. The only exception is made for the admin in the initial user namespace, who has access to all the data.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

To detect ToMToU violations, the reader counter of the given inode is checked. This is not enough, because the reader may exist in a different ima namespace. The per-inode reader counter tracks readers in all ima namespaces, whereas a per-namespace counter is necessary to avoid false positives. Add a new reader counter to the integrity inode cache entry.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Modify the ima securityfs interface so that only measurement list entries that belong to the given ima namespace are visible/counted. The initial ima namespace is an exception: its processes have access to all measurement list entries.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add a new ima-ns template: "d-ng|n-ng|ns".

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Compare the namespace id during the digest entry lookup. Remove digests from the hash table when the namespace is destroyed, but keep the global measurement list entry.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Maintain a per-namespace ima measurement list. It will be used to provide information about the namespace measurements in securityfs and to clean up hash table entries when the namespace is destroyed. The global measurement list remains and is not modified; it is necessary to keep it so that the PCR value can be recreated.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add the ima namespace id to ima_event_data and ima_template_entry. This is done so that template entries can be tracked per ima namespace. The following patches will add new templates that include the namespace id, but the namespace id has to be stored separately so that the namespace functionality is enabled for every template. After kexec, all entries from the old measurement list will be associated with the new root ima namespace. This prevents users in new ima namespaces from accessing the old entries if an ima namespace id is reused.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Set the ima policy per namespace and remove the global settings. Operations on objects may now have an impact in more than one ima namespace, so iterate over all active ima namespaces when necessary. Read-write violations can now happen across namespaces and should be checked in all namespaces for each relevant ima hook. Inform all concerned ima namespaces about actions on an object when the object is freed. E.g. if an object had been appraised in ima_ns_1 and is then modified in ima_ns_2, the appraised flag in ima_ns_1 is cleared and the object will be re-appraised in the ima_ns_1 namespace.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add an iint tree to the ima namespace. Each namespace should track operations on its objects separately. The per-namespace iint tree is not yet used; that will be done in the following patches.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

The inode integrity cache will be maintained per ima namespace. Add new functions that allow callers to specify which iint tree to use.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add an ima namespace pointer to the input parameters of the relevant functions. This is a preparation for policy namespacing; more functions may be modified later, when other aspects of ima are namespaced.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

The IMA subsystem is configured at boot time using kernel command-line parameters, e.g.: ima_policy=tcb|appraise_tcb|secure_boot. The same configuration options should be available for the new ima namespace. Add new functions to parse the configuration string and store the parsed data in the new policy data structures. Don't implement them yet; just add the dummy interface.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Collate the global variables describing the ima policy in one structure and add it to the ima namespace. Collate the setup data (parsed kernel boot parameters) in a separate structure. The per-namespace policy is not yet properly set and is not used; this will be done in the following patches.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

An IMA namespace reference will be required in ima_file_free() to check the policy and find inode integrity data for the correct ima namespace. ima_file_free() is called from __fput(), and __fput() may run after the namespaces have been released in exit_task_namespaces() in do_exit(), so the nsproxy reference cannot be used: it is already set to NULL. This is a preparation for namespacing the policy and the inode integrity data.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

Add a list of the installed ima namespaces. An IMA namespace is considered installed if there is at least one process born in that namespace. This list will be used to check read-write violations and to detect any object-related changes relevant across namespaces.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Krzysztof Struczynski

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49KW1
CVE: NA

--------------------------------

The IMA namespace wraps the global ima resources in an abstraction, to enable ima to work with containers. Currently, the ima namespace contains no useful data, only a dummy interface. IMA resources related to the different aspects of IMA, namely IMA-audit, IMA-measurement and IMA-appraisal, will be added in the following patches.

The way the ima namespace is created is analogous to the time namespace: the unshare(CLONE_NEWIMA) system call creates a new ima namespace but doesn't assign it to the current process. All children of the process will be born in the new ima namespace, or a process can use the setns() system call to join the new ima namespace. A call to clone3() with CLONE_NEWIMA creates a new namespace, which the new process joins instantly.

This scheme allows the new ima namespace to be configured before any process appears in it; see the userspace sketch below. If the user initially unshares the new ima namespace, ima can be configured using the ima entries in securityfs. If the user calls the clone3() system call directly, the new ima namespace can be configured using the clone arguments. To allow this, new securityfs entries have to be added, and the structures clone_args and kernel_clone_args have to be extended. Early configuration is crucial: the new ima policies must apply to the first process in the new namespace, and the appraisal key has to be loaded beforehand.

Add a new CONFIG_IMA_NS option to the kernel configuration that enables creating new IMA namespaces. IMA namespace functionality is disabled by default.

Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Reviewed-by: Zhang Tianxing <zhangtianxing3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
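A minimal userspace sketch of the unshare() flow described above. CLONE_NEWIMA is defined by this patch series' uapi headers, not by mainline headers, so the program only builds against a patched kernel tree; everything else is standard libc.

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void)
    {
            /* CLONE_NEWIMA must come from the patched <linux/sched.h>. */
            if (unshare(CLONE_NEWIMA) != 0) {
                    perror("unshare(CLONE_NEWIMA)");
                    return EXIT_FAILURE;
            }

            /* The caller stays in its original ima namespace; only children
             * are born into the new one, so this is the window to configure
             * it (e.g. via the namespaced securityfs entries) before fork. */
            pid_t child = fork();
            if (child == 0) {
                    execlp("sh", "sh", "-c", "echo running in the new ima namespace", (char *)NULL);
                    _exit(127);
            }
            waitpid(child, NULL, 0);
            return EXIT_SUCCESS;
    }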
-
Submitted by Chuck Lever

mainline inclusion
from mainline-5.14-rc2
commit 06147843
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZVL2
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=061478438d04779181c2ce4d7ffeeca343a70a98

-------------------------------------------------

The author of commit b3b64ebd ("mm/page_alloc: do bulk array bounds check after checking populated elements") was possibly confused by the mixture of return values throughout the function. The API contract is clear that the function "Returns the number of pages on the list or array." It does not list zero as a unique return value with a special meaning. Therefore zero is a plausible return value only if @nr_pages is zero or less.

Clean up the return logic to make it clear that the returned value is always the total number of pages in the array/list, not the number of pages that were allocated during this call.

The only change in behavior with this patch is the value returned if prepare_alloc_pages() fails. To match the API contract, the number of pages currently in the array/list is returned in this case.

The call site in __page_pool_alloc_pages_slow() also seems to be confused on this matter. It should be attended to by someone who is familiar with that code.

[mel@techsingularity.net: Return nr_populated if 0 pages are requested]

Link: https://lkml.kernel.org/r/20210713152100.10381-4-mgorman@techsingularity.net
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Cc: Zhang Qiang <Qiang.Zhang@windriver.com>
Cc: Yanfei Xu <yanfei.xu@windriver.com>
Cc: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 06147843)
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Reviewed-by: tong tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
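A caller-side sketch of the contract described above: the return value is the total number of populated entries in the array, whether they were filled by this call or were already present, so partial population and allocation failure are handled the same way (kernel context assumed, simplified).

    /* pages[] may already be partially populated by an earlier call. */
    static unsigned long fill_page_array(struct page **pages, unsigned long nr_pages)
    {
            unsigned long filled = alloc_pages_bulk_array(GFP_KERNEL, nr_pages, pages);

            /* 'filled' is the total populated count, not the count allocated
             * here; fall back to single-page allocations for the remainder. */
            while (filled < nr_pages) {
                    pages[filled] = alloc_page(GFP_KERNEL);
                    if (!pages[filled])
                            break;
                    filled++;
            }
            return filled;
    }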
-
Submitted by Yanfei Xu

mainline inclusion
from mainline-5.14-rc2
commit e5c15cea
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZVL2
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5c15cea339115edf99dc92282865f173cf84510

-------------------------------------------------

If the array passed in is already partially populated, we should return "nr_populated" even when failing at the argument-preparation stage.

Link: https://lkml.kernel.org/r/20210713152100.10381-3-mgorman@techsingularity.net
Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20210709102855.55058-1-yanfei.xu@windriver.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit e5c15cea)
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Reviewed-by: tong tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Mel Gorman

mainline inclusion
from mainline-5.14-rc2
commit 187ad460
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZVL2
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=187ad460b8413e863c951998cb321a117a717868

-------------------------------------------------

Syzbot is reporting potential deadlocks due to pagesets.lock when PAGE_OWNER is enabled. One example from Desmond Cheong Zhi Xi is as follows:

  __alloc_pages_bulk()
    local_lock_irqsave(&pagesets.lock, flags) <---- outer lock here
    prep_new_page():
      post_alloc_hook():
        set_page_owner():
          __set_page_owner():
            save_stack():
              stack_depot_save():
                alloc_pages():
                  alloc_page_interleave():
                    __alloc_pages():
                      get_page_from_freelist():
                        rmqueue():
                          rmqueue_pcplist():
                            local_lock_irqsave(&pagesets.lock, flags);
                            *** DEADLOCK ***

Zhang, Qiang also reported:

  BUG: sleeping function called from invalid context at mm/page_alloc.c:5179
  in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
  .....
  __dump_stack lib/dump_stack.c:79 [inline]
  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:96
  ___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:9153
  prepare_alloc_pages+0x3da/0x580 mm/page_alloc.c:5179
  __alloc_pages+0x12f/0x500 mm/page_alloc.c:5375
  alloc_page_interleave+0x1e/0x200 mm/mempolicy.c:2147
  alloc_pages+0x238/0x2a0 mm/mempolicy.c:2270
  stack_depot_save+0x39d/0x4e0 lib/stackdepot.c:303
  save_stack+0x15e/0x1e0 mm/page_owner.c:120
  __set_page_owner+0x50/0x290 mm/page_owner.c:181
  prep_new_page mm/page_alloc.c:2445 [inline]
  __alloc_pages_bulk+0x8b9/0x1870 mm/page_alloc.c:5313
  alloc_pages_bulk_array_node include/linux/gfp.h:557 [inline]
  vm_area_alloc_pages mm/vmalloc.c:2775 [inline]
  __vmalloc_area_node mm/vmalloc.c:2845 [inline]
  __vmalloc_node_range+0x39d/0x960 mm/vmalloc.c:2947
  __vmalloc_node mm/vmalloc.c:2996 [inline]
  vzalloc+0x67/0x80 mm/vmalloc.c:3066

There are a number of ways it could be fixed. The page owner code could be audited to strip GFP flags that allow sleeping, but that would impair the functionality of PAGE_OWNER if allocations fail. The bulk allocator could add a special case to release/reacquire the lock for prep_new_page and look up the PCP after the lock is reacquired, at the cost of performance. The pages requiring prep could be tracked using the least significant bit and looping through the array, although that is more complicated for the list interface. The options are relatively complex and the second one still incurs a performance penalty when PAGE_OWNER is active, so this patch takes the simple approach: disable bulk allocation if PAGE_OWNER is active. The caller will be forced to allocate one page at a time, incurring a performance penalty, but PAGE_OWNER is already a performance penalty. See the sketch of the resulting guard below.

Link: https://lkml.kernel.org/r/20210708081434.GV3840@techsingularity.net
Fixes: dbbee9d5 ("mm/page_alloc: convert per-cpu list protection to local_lock")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reported-by: "Zhang, Qiang" <Qiang.Zhang@windriver.com>
Reported-by: syzbot+127fd7828d6eeb611703@syzkaller.appspotmail.com
Tested-by: syzbot+127fd7828d6eeb611703@syzkaller.appspotmail.com
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Reviewed-by: tong tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
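The guard boils down to bailing out of the bulk path when page_owner is active. A sketch of the idea, close to but not necessarily verbatim the upstream hunk in __alloc_pages_bulk():

    #ifdef CONFIG_PAGE_OWNER
            /*
             * PAGE_OWNER may recurse into the allocator to save the stack while
             * pagesets.lock is held with IRQs disabled, so force the caller onto
             * the single-page path instead of the bulk allocator.
             */
            if (static_branch_unlikely(&page_owner_inited))
                    goto failed;
    #endif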
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit 18bb473e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

The number of deferred objects might wind up at an absurd number, and it results in clamping of slab objects. That is undesirable for sustaining the workingset.

So shrink deferred objects proportionally to priority and cap nr_deferred to twice the number of cache items; see the arithmetic sketch below. The idea is borrowed from Dave Chinner's patch:
https://lore.kernel.org/linux-xfs/20191031234618.15403-13-david@fromorbit.com/

Tested with kernel build and a vfs-metadata-heavy workload in our production environment; no regression has been spotted so far.

Link: https://lkml.kernel.org/r/20210311190845.9708-14-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
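In rough arithmetic, the behaviour described above amounts to scaling the carried-over work by reclaim priority and capping the carry-over. Variable names follow the general shape of do_shrink_slab() and are assumed for illustration, not copied from the diff:

    /* Only a priority-scaled share of the old deferred count contributes to
     * this round's scan target ... */
    total_scan = (nr >> priority) + delta;

    /* ... and whatever could not be scanned is carried over, but never more
     * than twice the number of freeable objects in the cache. */
    next_deferred = max_t(long, nr + delta - scanned, 0);
    next_deferred = min(next_deferred, 2 * freeable);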
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit a178015c
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

Now that the shrinker's nr_deferred is per-memcg for memcg-aware shrinkers, add it to the parent's corresponding nr_deferred when the memcg goes offline.

Link: https://lkml.kernel.org/r/20210311190845.9708-13-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit 476b30a0
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

Now that nr_deferred is available at the per-memcg level for memcg-aware shrinkers, there is no need to allocate shrinker->nr_deferred for such shrinkers anymore.

prealloc_memcg_shrinker() returns -ENOSYS if !CONFIG_MEMCG or if memcg is disabled by the kernel command line; in that case the shrinker's SHRINKER_MEMCG_AWARE flag is cleared. This makes the implementation of this patch simpler; a simplified sketch of the resulting flow follows below.

Link: https://lkml.kernel.org/r/20210311190845.9708-12-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
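A simplified sketch of the registration flow this enables, following the general shape of prealloc_shrinker() in mm/vmscan.c; it is not the verbatim diff, and prealloc_memcg_shrinker() is internal to that file:

    static int prealloc_shrinker_sketch(struct shrinker *shrinker)
    {
            unsigned int size = sizeof(atomic_long_t);

            if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
                    int err = prealloc_memcg_shrinker(shrinker);

                    if (err != -ENOSYS)
                            return err;             /* 0 on success, or a real error */

                    /* !CONFIG_MEMCG or memcg disabled: degrade to a plain shrinker. */
                    shrinker->flags &= ~SHRINKER_MEMCG_AWARE;
            }

            /* Only non-memcg-aware shrinkers still need shrinker-level counters. */
            if (shrinker->flags & SHRINKER_NUMA_AWARE)
                    size *= nr_node_ids;

            shrinker->nr_deferred = kzalloc(size, GFP_KERNEL);
            return shrinker->nr_deferred ? 0 : -ENOMEM;
    }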
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit 86750830
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

Use the per-memcg nr_deferred for memcg-aware shrinkers. The shrinker's own nr_deferred will still be used in the following cases:
1. Non-memcg-aware shrinkers
2. !CONFIG_MEMCG
3. memcg is disabled by boot parameter

See the helper sketch below.

Link: https://lkml.kernel.org/r/20210311190845.9708-11-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
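The case split above maps naturally onto a small helper. A sketch consistent with the description (function and helper names assumed, simplified):

    static long xchg_nr_deferred_sketch(struct shrinker *shrinker,
                                        struct shrink_control *sc)
    {
            int nid = sc->nid;

            if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
                    nid = 0;

            /* Per-memcg counter only for memcg-aware shrinkers with memcg on. */
            if (sc->memcg && (shrinker->flags & SHRINKER_MEMCG_AWARE))
                    return xchg_nr_deferred_memcg(nid, shrinker, sc->memcg);

            /* Cases 1-3 above: fall back to the shrinker-wide counter. */
            return atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
    }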
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit 3c6f17e6
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

Currently the number of deferred objects is per shrinker, but some slabs, for example the vfs inode/dentry cache, are per memcg; this results in poor isolation among memcgs. The deferred objects are typically generated by __GFP_NOFS allocations: one memcg with excessive __GFP_NOFS allocations may blow up deferred objects, and then other innocent memcgs may suffer from over-shrink, excessive reclaim latency, etc.

For example, two workloads run in memcgA and memcgB respectively, and the workload in B is a vfs-heavy workload. The workload in A generates excessive deferred objects, and then B's vfs cache might be hit heavily (dropping half of the caches) by B's limit reclaim or by global reclaim.

We observed this in our production environment, which was running a vfs-heavy workload, as shown in the tracing log below:

  <...>-409454 [016] .... 28286961.747146: mm_shrink_slab_start: super_cache_scan+0x0/0x1a0 ffff9a83046f3458: nid: 1 objects to shrink 3641681686040 gfp_flags GFP_HIGHUSER_MOVABLE|__GFP_ZERO pgs_scanned 1 lru_pgs 15721 cache items 246404277 delta 31345 total_scan 123202138
  <...>-409454 [022] .... 28287105.928018: mm_shrink_slab_end: super_cache_scan+0x0/0x1a0 ffff9a83046f3458: nid: 1 unused scan count 3641681686040 new scan count 3641798379189 total_scan 602 last shrinker return val 123186855

The vfs cache to page cache ratio was 10:1 on this machine, and half of the caches were dropped. This also resulted in a significant amount of page cache being dropped due to inode eviction.

Making nr_deferred per-memcg for memcg-aware shrinkers would solve the unfairness and bring better isolation.

The following patch will add nr_deferred to the parent memcg when the memcg goes offline. To preserve nr_deferred when reparenting memcgs to root, the root memcg needs shrinker_info allocated too.

When memcg is not enabled (!CONFIG_MEMCG or memcg disabled), the shrinker's nr_deferred is used. And non-memcg-aware shrinkers use the shrinker's nr_deferred all the time.

Link: https://lkml.kernel.org/r/20210311190845.9708-10-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit 41ca668a
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

Currently a registered shrinker is indicated by a non-NULL shrinker->nr_deferred. This approach is fine with nr_deferred at the shrinker level, but the following patches will move MEMCG_AWARE shrinkers' nr_deferred to the memcg level, so their shrinker->nr_deferred would always be NULL. This would prevent the shrinkers from unregistering correctly. Use a new SHRINKER_REGISTERED flag to indicate that a shrinker is registered, and remove SHRINKER_REGISTERING since we can check whether a shrinker was registered successfully via the new flag.

Link: https://lkml.kernel.org/r/20210311190845.9708-9-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit 468ab843
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

The shrinker_info is dereferenced in a couple of places via rcu_dereference_protected with different calling conventions: for example, using the mem_cgroup_nodeinfo helper or dereferencing memcg->nodeinfo[nid]->shrinker_info directly. A later patch will add more dereference sites. So extract the dereference into a helper to make the code more readable. No functional change.

[akpm@linux-foundation.org: retain rcu_dereference_protected() in free_shrinker_info(), per Hugh]

Link: https://lkml.kernel.org/r/20210311190845.9708-8-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
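The extracted helper is essentially a one-liner wrapping the lockdep-checked dereference; a sketch consistent with the description above (shrinker_rwsem is the existing registration lock in mm/vmscan.c):

    static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg,
                                                         int nid)
    {
            /* Only valid while shrinker_rwsem is held, as lockdep asserts. */
            return rcu_dereference_protected(memcg->nodeinfo[nid]->shrinker_info,
                                             lockdep_is_held(&shrinker_rwsem));
    }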
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit e4262c4f
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

The following patch is going to add nr_deferred into shrinker_map; with that change shrinker_map will no longer contain only the map, so rename it to "memcg_shrinker_info". This should make the patch adding nr_deferred cleaner and more readable and make review easier. Also remove the "memcg_" prefix.

Link: https://lkml.kernel.org/r/20210311190845.9708-7-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit 72673e86
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

Use kvfree_rcu() to free the old shrinker_maps instead of call_rcu(). We no longer have to define a dedicated callback for call_rcu().

Link: https://lkml.kernel.org/r/20210311190845.9708-6-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
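The before/after shape of the change described here (the old callback name and the rcu_head field name are assumed for the sketch; kvfree_rcu() itself is existing kernel API):

    /* Before: a dedicated callback had to be defined just to kvfree the map. */
    call_rcu(&old->rcu, free_shrinker_map_rcu);

    /* After: kvfree_rcu() frees the kvmalloc'ed map once a grace period has
     * elapsed, using the rcu_head embedded in the structure. */
    kvfree_rcu(old, rcu);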
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit a2fb1261
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

Both memcg_shrinker_map_size and shrinker_nr_max are maintained, but the map size can actually be calculated from shrinker_nr_max, so it seems unnecessary to keep both. Remove memcg_shrinker_map_size, since shrinker_nr_max is also used when iterating the bit map.

Link: https://lkml.kernel.org/r/20210311190845.9708-5-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit d27cf2aa
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

Since memcg_shrinker_map_size can only be changed while holding shrinker_rwsem exclusively, the read side can be protected by holding the read lock, so a dedicated mutex is superfluous. Kirill Tkhai suggested using the write lock since:

* We want the assignment to shrinker_maps to be visible to shrink_slab_memcg().
* The rcu_dereference_protected() dereference in shrink_slab_memcg() would not actually be protected if we used the READ lock in alloc_shrinker_maps().
* The READ lock makes alloc_shrinker_info() racy against memory allocation failure: alloc_shrinker_info()->free_shrinker_info() may free memory right after shrink_slab_memcg() dereferenced it. You may say shrink_slab_memcg()->mem_cgroup_online() protects us from it? Yes, sure, but this is not the thing we want to have to remember in the future, since it spreads modularity.

And a test with a heavy paging workload didn't show that the write lock makes things worse. A simplified sketch of the resulting locking follows below.

Link: https://lkml.kernel.org/r/20210311190845.9708-4-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
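In effect, map (re)allocation now nests under the rwsem that shrinker registration already takes, rather than a private mutex. A simplified sketch; the expand helper name is assumed for illustration:

    /* Writer side: expanding the per-memcg shrinker maps under the write
     * lock keeps shrink_slab_memcg()'s
     * rcu_dereference_protected(..., lockdep_is_held(&shrinker_rwsem))
     * check valid without a dedicated mutex. */
    down_write(&shrinker_rwsem);
    ret = expand_shrinker_maps(new_nr_max);
    up_write(&shrinker_rwsem);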
-
Submitted by Yang Shi

mainline inclusion
from mainline-v5.13-rc1
commit 2bfd3637
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I48N0H
CVE: NA

-------------------------------------------------

The shrinker map management is not purely memcg specific; it is at the intersection between memory cgroups and shrinkers. It is the allocation and assignment of a structure, and the only memcg-specific bit is that the map is stored in a memcg structure. So move the shrinker_maps handling code into vmscan.c for tighter integration with the shrinker code, and remove the "memcg_" prefix. There is no functional change.

Link: https://lkml.kernel.org/r/20210311190845.9708-3-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts: mm/memcontrol.c
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-