Commits · c45248db04f8e3aca4798d67a394fb9cc2168118 · openeuler / Kernel

24 Aug, 2022 · 2 commits

mm/slab_common: cleanup kmalloc_track_caller() · c45248db

Authored by Hyeonggon Yoo on Aug 17, 2022

Make kmalloc_track_caller() a wrapper of kmalloc_node_track_caller().
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

c45248db

mm/slab_common: remove CONFIG_NUMA ifdefs for common kmalloc functions · f78a03f6

Authored by Hyeonggon Yoo on Aug 17, 2022

Now that slab_alloc_node() is available for SLAB when CONFIG_NUMA=n,
remove CONFIG_NUMA ifdefs for common kmalloc functions.
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

f78a03f6

20 Jul, 2022 · 1 commit

mm/sl[au]b: use own bulk free function when bulk alloc failed · 2055e67b

Authored by Hyeonggon Yoo on Jun 15, 2022

There is no benefit to calling the generic bulk free function when
kmem_cache_alloc_bulk() fails. Use the allocator's own
kmem_cache_free_bulk() instead of the generic function.

Note that if kmem_cache_alloc_bulk() fails to allocate the first object
in SLUB, size is zero. So allow passing size == 0 to
kmem_cache_free_bulk(), as SLAB already does.
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

2055e67b
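
Illustration for 2055e67b: a minimal user-space sketch of the error path described above, where a failed bulk allocation cleans up with the allocator's own bulk free and that free must accept size == 0. All names here (toy_alloc_bulk(), toy_free_bulk(), the fail_at knob, the live-object counter) are hypothetical stand-ins and not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter of live objects, used only to make leaks visible. */
static size_t toy_live_objects;

/* Frees the first `size` entries of p; size == 0 is valid and does nothing. */
void toy_free_bulk(size_t size, void **p)
{
	assert(p != NULL || size == 0);
	for (size_t i = 0; i < size; i++) {
		free(p[i]);
		p[i] = NULL;
		toy_live_objects--;
	}
}

/*
 * All-or-nothing bulk allocation: returns nr on success, 0 on failure.
 * fail_at is a test knob: the allocation with that index fails
 * (pass fail_at >= nr for no injected failure).
 */
size_t toy_alloc_bulk(size_t nr, void **p, size_t fail_at)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		p[i] = (i == fail_at) ? NULL : malloc(32);
		if (!p[i])
			goto error;
		toy_live_objects++;
	}
	return nr;

error:
	/* i objects were allocated so far; i == 0 if the first one failed. */
	toy_free_bulk(i, p);
	return 0;
}

/* Accessor for the leak counter. */
size_t toy_live(void)
{
	return toy_live_objects;
}
```

If the free routine rejected size == 0, the failure of the very first allocation would need a special case in the error path; accepting zero keeps that path uniform.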

04 Jul, 2022 · 3 commits

mm: slab: optimize memcg_slab_free_hook() · b77d5b1b

Authored by Muchun Song on Apr 29, 2022

Most callers of memcg_slab_free_hook() already know the slab, which could
be passed to memcg_slab_free_hook() directly to avoid the overhead of
another call to virt_to_slab(). For bulk freeing of objects, the call to
slab_objcgs() in the loop in memcg_slab_free_hook() is redundant as well.
Rework memcg_slab_free_hook() and build_detached_freelist() to remove
this unnecessary overhead and let memcg_slab_free_hook() handle bulk
freeing in slab_free().

Move the call site of memcg_slab_free_hook() from do_slab_free() to
slab_free() for SLUB to make the code clearer, since the current logic is
awkward (e.g. the caller needs to judge whether it needs to call
memcg_slab_free_hook()). It is easy to make mistakes such as forgetting to
call memcg_slab_free_hook(), as shown by these fixes:

commit d1b2cf6c ("mm: memcg/slab: uncharge during kmem_cache_free_bulk()")
commit ae085d7f ("mm: kfence: fix missing objcg housekeeping for SLAB")

This optimization is mainly for bulk freeing of objects. The following
numbers are for freeing 16 objects.

                        before     after
  kmem_cache_free_bulk: ~430 ns    ~400 ns

The overhead is reduced by about 7% for 16-object freeing.
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Link: https://lore.kernel.org/r/20220429123044.37885-1-songmuchun@bytedance.com
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

b77d5b1b

mm/tracing: add 'accounted' entry into output of allocation tracepoints · b347aa7b

Authored by Vasily Averin on Jun 03, 2022

Slab caches marked with SLAB_ACCOUNT force accounting for every
allocation from this cache even if __GFP_ACCOUNT flag is not passed.
Unfortunately, at the moment this flag is not visible in ftrace output,
and this makes it difficult to analyze the accounted allocations.

This patch adds a boolean "accounted" entry to the trace output and
sets it to 'true' for calls that use the __GFP_ACCOUNT flag and
for allocations from caches marked with SLAB_ACCOUNT.
It is set to 'false' if accounting is disabled in the config.
Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Link: https://lore.kernel.org/r/c418ed25-65fe-f623-fbf8-1676528859ed@openvz.org
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

b347aa7b
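
Illustration for b347aa7b: the rule for the "accounted" field can be written as a small predicate. This is a user-space sketch only; the flag bit values, struct toy_acct_cache and the accounting_enabled parameter (standing in for the config option) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values; the real kernel flags differ. */
#define TOY_GFP_ACCOUNT  (1u << 0)
#define TOY_SLAB_ACCOUNT (1u << 1)

struct toy_acct_cache {
	unsigned int flags;
};

/*
 * accounted is false when accounting is compiled out; otherwise it is
 * true if the caller asked for accounting or the cache forces it.
 * s may be NULL for allocations that do not come from a cache.
 */
bool toy_trace_accounted(bool accounting_enabled, unsigned int gfp_flags,
			 const struct toy_acct_cache *s)
{
	if (!accounting_enabled)
		return false;
	if (gfp_flags & TOY_GFP_ACCOUNT)
		return true;
	return s != NULL && (s->flags & TOY_SLAB_ACCOUNT) != 0;
}

/* Exercises the three cases named in the commit message. */
void toy_accounted_demo(void)
{
	struct toy_acct_cache forced = { .flags = TOY_SLAB_ACCOUNT };
	struct toy_acct_cache plain = { .flags = 0 };

	assert(toy_trace_accounted(true, TOY_GFP_ACCOUNT, &plain));
	assert(toy_trace_accounted(true, 0, &forced));
	assert(!toy_trace_accounted(false, TOY_GFP_ACCOUNT, &forced));
}
```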

mm/slub: Simplify __kmem_cache_alias() · efb93527

Authored by Xiongwei Song on May 31, 2022

There is no need to do anything if sysfs_slab_alias() returns a nonzero
value after getting a mergeable cache.
Signed-off-by: Xiongwei Song <xiongwei.song@windriver.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Link: https://lore.kernel.org/all/e5ebc952-af17-321f-5343-bc914d47c931@suse.cz/
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

efb93527

13 Jun, 2022 · 2 commits

mm/slub: add missing TID updates on slab deactivation · eeaa345e

Authored by Jann Horn on Jun 08, 2022

The fastpath in slab_alloc_node() assumes that c->slab is stable as long as
the TID stays the same. However, two places in __slab_alloc() currently
don't update the TID when deactivating the CPU slab.

If multiple operations race the right way, this could lead to an object
getting lost; or, in an even more unlikely situation, it could even lead to
an object being freed onto the wrong slab's freelist, messing up the
`inuse` counter and eventually causing a page to be freed to the page
allocator while it still contains slab objects.

(I haven't actually tested these cases though, this is just based on
looking at the code. Writing testcases for this stuff seems like it'd be
a pain...)

The race leading to state inconsistency is (all operations on the same CPU
and kmem_cache):

 - task A: begin do_slab_free():
    - read TID
    - read pcpu freelist (==NULL)
    - check `slab == c->slab` (true)
 - [PREEMPT A->B]
 - task B: begin slab_alloc_node():
    - fastpath fails (`c->freelist` is NULL)
    - enter __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
    - take local_lock_irqsave()
    - read c->freelist as NULL
    - get_freelist() returns NULL
    - write `c->slab = NULL`
    - drop local_unlock_irqrestore()
    - goto new_slab
    - slub_percpu_partial() is NULL
    - get_partial() returns NULL
    - slub_put_cpu_ptr() (enables preemption)
 - [PREEMPT B->A]
 - task A: finish do_slab_free():
    - this_cpu_cmpxchg_double() succeeds
    - [CORRUPT STATE: c->slab==NULL, c->freelist!=NULL]

From there, the object on c->freelist will get lost if task B is allowed to
continue from here: It will proceed to the retry_load_slab label,
set c->slab, then jump to load_freelist, which clobbers c->freelist.

But if we instead continue as follows, we get worse corruption:

 - task A: run __slab_free() on object from other struct slab:
    - CPU_PARTIAL_FREE case (slab was on no list, is now on pcpu partial)
 - task A: run slab_alloc_node() with NUMA node constraint:
    - fastpath fails (c->slab is NULL)
    - call __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
    - c->slab is NULL: goto new_slab
    - slub_percpu_partial() is non-NULL
    - set c->slab to slub_percpu_partial(c)
    - [CORRUPT STATE: c->slab points to slab-1, c->freelist has objects
      from slab-2]
    - goto redo
    - node_match() fails
    - goto deactivate_slab
    - existing c->freelist is passed into deactivate_slab()
    - inuse count of slab-1 is decremented to account for object from
      slab-2

At this point, the inuse count of slab-1 is 1 lower than it should be.
This means that if we free all allocated objects in slab-1 except for one,
SLUB will think that slab-1 is completely unused, and may free its page,
leading to use-after-free.

Fixes: c17dda40 ("slub: Separate out kmem_cache_cpu processing from deactivate_slab")
Fixes: 03e404af ("slub: fast release on full slab")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220608182205.2945720-1-jannh@google.com

eeaa345e
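
Illustration for eeaa345e: the first interleaving in the commit message can be replayed in a single-threaded user-space model. This is a sketch under simplifying assumptions: struct toy_cpu, toy_cmpxchg_double() and toy_deactivate() are hypothetical stand-ins, the TID is advanced by 1 instead of the kernel's step, and preemption is modeled by simply calling task B's step between task A's reads and its compare-and-exchange.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the per-CPU state; field names follow the commit message. */
struct toy_cpu {
	void *freelist;
	unsigned long tid;
	void *slab;
};

/* Models this_cpu_cmpxchg_double(): swap only if both values still match. */
static bool toy_cmpxchg_double(struct toy_cpu *c,
			       void *old_freelist, unsigned long old_tid,
			       void *new_freelist, unsigned long new_tid)
{
	if (c->freelist != old_freelist || c->tid != old_tid)
		return false;
	c->freelist = new_freelist;
	c->tid = new_tid;
	return true;
}

/* Task B: deactivate the CPU slab, optionally updating the TID. */
static void toy_deactivate(struct toy_cpu *c, bool bump_tid)
{
	c->slab = NULL;
	c->freelist = NULL;
	if (bump_tid)
		c->tid++;
}

/*
 * Replays the race. Returns true if task A's stale fastpath free
 * succeeded and left c->slab == NULL with c->freelist != NULL.
 */
bool toy_race_corrupts(bool bump_tid)
{
	char slab_mem, object;
	struct toy_cpu c = { .freelist = NULL, .tid = 0, .slab = &slab_mem };

	/* task A: read TID, read freelist, check slab == c->slab */
	unsigned long tid = c.tid;
	void *freelist = c.freelist;
	assert(c.slab == &slab_mem);

	/* [PREEMPT A->B] task B deactivates the CPU slab */
	toy_deactivate(&c, bump_tid);

	/* [PREEMPT B->A] task A finishes its free */
	bool ok = toy_cmpxchg_double(&c, freelist, tid, &object, tid + 1);

	return ok && c.slab == NULL && c.freelist != NULL;
}
```

In this model, toy_race_corrupts(false) reaches the corrupt state because the freelist and TID that task A read still match, while toy_race_corrupts(true) makes the stale compare-and-exchange fail, so task A would have to retry with fresh values.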

mm/slub: Move the stackdepot related allocation out of IRQ-off section. · c4cf6785

Authored by Sebastian Andrzej Siewior on Jun 07, 2022

The set_track() invocation in free_debug_processing() happens with
slab_lock() held. The lock disables interrupts on PREEMPT_RT, which
forbids memory allocation, and stack_depot_save() allocates memory.

Split set_track() into two parts: set_track_prepare(), which allocates
memory, and set_track_update(), which only performs the assignment of the
trace data structure. Use set_track_prepare() before disabling
interrupts.

[ vbabka@suse.cz: make set_track() call set_track_update() instead of
  open-coded assignments ]

Fixes: 5cf909c5 ("mm/slub: use stackdepot to save stack trace in objects")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/Yp9sqoUi4fVa5ExF@linutronix.de

c4cf6785

02 May, 2022 · 1 commit

mm/slub: remove unused kmem_cache_order_objects max · 23587f7c

Authored by Miaohe Lin on Apr 29, 2022

The max field holds the largest slab order that was ever used for a slab
cache, but it is now unused. Remove it.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220429090545.33413-1-linmiaohe@huawei.com

23587f7c

21 Apr, 2022 · 1 commit

mm/slub: remove unneeded return value of slab_pad_check · a204e6d6

Authored by Miaohe Lin on Apr 19, 2022

The return value of slab_pad_check is never used. So we can make it return
void now.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220419120352.37825-1-linmiaohe@huawei.com

a204e6d6

16 Apr, 2022 · 1 commit

mm, kfence: support kmem_dump_obj() for KFENCE objects · 2dfe63e6

Authored by Marco Elver on Apr 14, 2022

Calling kmem_obj_info() via kmem_dump_obj() on KFENCE objects has been
producing garbage data due to the object not actually being maintained
by SLAB or SLUB.

Fix this by implementing __kfence_obj_info() that copies relevant
information to struct kmem_obj_info when the object was allocated by
KFENCE; this is called by a common kmem_obj_info(), which also calls the
slab/slub/slob specific variant now called __kmem_obj_info().

For completeness, kmem_dump_obj() now displays if the object was
allocated by KFENCE.

Link: https://lore.kernel.org/all/20220323090520.GG16885@xsang-OptiPlex-9020/
Link: https://lkml.kernel.org/r/20220406131558.3558585-1-elver@google.com
Fixes: b89fb5ef ("mm, kfence: insert KFENCE hooks for SLUB")
Fixes: d3fb45f3 ("mm, kfence: insert KFENCE hooks for SLAB")
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>	[slab]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2dfe63e6

13 Apr, 2022 · 3 commits

mm/slub: remove meaningless node check in ___slab_alloc() · 6b6efe23

Authored by JaeSang Yoo on Apr 09, 2022

node_match() with node=NUMA_NO_NODE always returns 1, so the
duplicate check via the goto statement is meaningless. Remove it.
Signed-off-by: JaeSang Yoo <jsyoo5b@gmail.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220409144239.2649257-1-jsyoo5b@gmail.com

6b6efe23

mm/slub: remove duplicate flag in allocate_slab() · 27c08f75

Authored by Jiyoup Kim on Apr 10, 2022

In allocate_slab(), the __GFP_NOFAIL flag is removed twice when trying
a higher-order allocation. Remove the duplicate.
Signed-off-by: Jiyoup Kim <lakroforce@gmail.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220409150538.1264-1-lakroforce@gmail.com

27c08f75

mm/slub: remove unused parameter in setup_object*() · c0f81a94

Authored by JaeSang Yoo on Apr 11, 2022

setup_object_debug() and setup_object() have an unused parameter,
"struct slab *slab". Remove it.

Commit 3ec09742 ("SLUB: Simplify debug code") introduced
setup_object_debug() to refactor code blocks that were previously in
setup_object(). The previous code used SlabDebug() to guard
init_object() and init_tracking(). SlabDebug() takes
"struct page *page" as its argument, whereas setup_object_debug() checks
the flags of "struct kmem_cache *s", which doesn't require
"struct page *page".
Commit bb192ed9 ("mm/slub: Convert most struct page to struct slab by
spatch") later changed struct page into struct slab, but the parameter
is still unused.
Suggested-by: Ohhoon Kwon <ohkwon1043@gmail.com>
Signed-off-by: JaeSang Yoo <jsyoo5b@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220411072534.3372768-1-jsyoo5b@gmail.com

c0f81a94

06 Apr, 2022 · 5 commits

mm/slub: sort debugfs output by frequency of stack traces · 553c0369

Authored by Oliver Glitta on May 21, 2021

Sort the output of debugfs alloc_traces and free_traces by the frequency
of allocation/freeing stack traces. Most frequently used stack traces
will be printed first, e.g. for easier memory leak debugging.
Signed-off-by: Oliver Glitta <glittao@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-and-tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>

553c0369
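
Illustration for 553c0369: sorting per-trace records so the most frequent stack trace comes first amounts to a descending sort on the count. The sketch below uses the C library's qsort() as a stand-in for whatever sort helper the kernel code uses; struct toy_loc and its fields are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: one unique stack trace and how often it was seen. */
struct toy_loc {
	unsigned long handle;	/* identifies the stack trace */
	unsigned long count;	/* number of allocations or frees */
};

/* qsort comparator: larger count sorts first. */
static int toy_cmp_count_desc(const void *a, const void *b)
{
	const struct toy_loc *la = a;
	const struct toy_loc *lb = b;

	if (la->count > lb->count)
		return -1;
	if (la->count < lb->count)
		return 1;
	return 0;
}

/* Orders locs[0..n) from most to least frequent. */
void toy_sort_by_frequency(struct toy_loc *locs, size_t n)
{
	assert(locs != NULL || n == 0);
	if (n > 1)
		qsort(locs, n, sizeof(*locs), toy_cmp_count_desc);
}
```

Comparing with explicit branches rather than subtracting the counts avoids overflow when the difference does not fit in an int.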

mm/slub: distinguish and print stack traces in debugfs files · 8ea9fb92

Authored by Oliver Glitta on May 21, 2021

Aggregate objects in a SLUB cache by unique stack trace in addition to
caller address when producing the contents of the debugfs files
alloc_traces and free_traces. Also add the stack traces to the debugfs
output. This makes it much more useful to e.g. debug memory leaks.
Signed-off-by: Oliver Glitta <glittao@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-and-tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

8ea9fb92

mm/slub: use stackdepot to save stack trace in objects · 5cf909c5

Authored by Oliver Glitta on Jul 07, 2021

Many stack traces are similar so there are many similar arrays.
Stackdepot saves each unique stack only once.

Replace field addrs in struct track with depot_stack_handle_t handle.  Use
stackdepot to save stack trace.

The benefits are smaller memory overhead and possibility to aggregate
per-cache statistics in the following patch using the stackdepot handle
instead of matching stacks manually.

[ vbabka@suse.cz: rebase to 5.17-rc1 and adjust accordingly ]

This was initially merged as commit 78869146 and reverted by commit
ae14c63a due to several issues, that should now be fixed.
The problem of unconditional memory overhead by stackdepot has been
addressed by commit 2dba5eb1 ("lib/stackdepot: allow optional init
and stack_table allocation by kvmalloc()"), so the dependency on
stackdepot will result in extra memory usage only when a slab cache
tracking is actually enabled, and not for all CONFIG_SLUB_DEBUG builds.
The build failures on some architectures were also addressed, and the
reported issue with xfs/433 test did not reproduce on 5.17-rc1 with this
patch.
Signed-off-by: Oliver Glitta <glittao@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-and-tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>

5cf909c5

mm/slub: move struct track init out of set_track() · 0cd1a029

Authored by Vlastimil Babka on Feb 04, 2022

set_track() either zeroes out the struct track or fills it, depending on
the addr parameter. This is unnecessary as there's only one place that
calls it for the initialization - init_tracking(). We can simply do the
zeroing there, with a single memset() that covers both TRACK_ALLOC and
TRACK_FREE as they are adjacent.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-and-tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>

0cd1a029
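
Illustration for 0cd1a029: a single memset() can clear both track records only because they sit next to each other in memory. The sketch below models that with a two-element array; struct toy_track, struct toy_object and the helper names are assumptions for the example, not the kernel's object layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_track_item { TOY_TRACK_ALLOC, TOY_TRACK_FREE, TOY_TRACK_NR };

struct toy_track {
	unsigned long addr;
	int cpu;
	int pid;
	unsigned long when;
};

/* Hypothetical object: the two track records are stored back to back. */
struct toy_object {
	char payload[64];
	struct toy_track tracks[TOY_TRACK_NR];
};

static struct toy_track *toy_get_track(struct toy_object *obj,
				       enum toy_track_item which)
{
	return &obj->tracks[which];
}

/* Zeroes the alloc and free records with one memset(). */
void toy_init_tracking(struct toy_object *obj)
{
	struct toy_track *p = toy_get_track(obj, TOY_TRACK_ALLOC);

	/* The single memset relies on FREE directly following ALLOC. */
	assert(toy_get_track(obj, TOY_TRACK_FREE) == p + 1);
	memset(p, 0, 2 * sizeof(struct toy_track));
}
```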

mm/slub, kunit: Make slub_kunit unaffected by user specified flags · a285909f

Authored by Hyeonggon Yoo on Apr 06, 2022

slub_kunit does not expect other debugging flags to be set when running
tests. When the SLAB_RED_ZONE flag is set globally, the tests fail because
the flag affects the number of errors reported.

To make slub_kunit unaffected by user-specified debugging flags,
introduce SLAB_NO_USER_FLAGS to ignore them. With this flag, only flags
specified in the code are used and others are ignored.
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/Yk0sY9yoJhFEXWOg@hyeyoo

a285909f
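
Illustration for a285909f: the effect of a "no user flags" marker can be sketched as a function that merges the flags given in code with the globally requested debug flags unless the marker is present. The bit values and toy_cache_flags() are hypothetical; the kernel's actual flag-resolution function handles more cases than this.

```c
#include <assert.h>

/* Hypothetical bit values; the real kernel flags differ. */
#define TOY_RED_ZONE      (1u << 0)
#define TOY_POISON        (1u << 1)
#define TOY_NO_USER_FLAGS (1u << 2)

/*
 * code_flags: flags passed by the code that creates the cache.
 * user_flags: debug flags requested globally by the user.
 */
unsigned int toy_cache_flags(unsigned int code_flags, unsigned int user_flags)
{
	if (code_flags & TOY_NO_USER_FLAGS)
		return code_flags;	/* ignore user-specified flags */
	return code_flags | user_flags;
}

/* A test cache keeps exactly its own flags; a normal cache inherits. */
void toy_flags_demo(void)
{
	assert(toy_cache_flags(TOY_POISON | TOY_NO_USER_FLAGS, TOY_RED_ZONE)
	       == (TOY_POISON | TOY_NO_USER_FLAGS));
	assert(toy_cache_flags(TOY_POISON, TOY_RED_ZONE)
	       == (TOY_POISON | TOY_RED_ZONE));
}
```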

23 Mar, 2022 · 1 commit

mm: introduce kmem_cache_alloc_lru · 88f2ef73

Authored by Muchun Song on Mar 22, 2022

We currently allocate scope for every memcg to be able to be tracked on
every superblock instantiated in the system, regardless of whether that
superblock is even accessible to that memcg.

These huge memcg counts come from container hosts where memcgs are
confined to just a small subset of the total number of superblocks that
are instantiated at any given point in time.

For these systems with huge container counts, list_lru does not need the
capability of tracking every memcg on every superblock.  What it comes
down to is adding the memcg to the list_lru at the first insert.
So introduce kmem_cache_alloc_lru to allocate objects and their list_lru.
In a later patch, we will convert all inode and dentry allocation from
kmem_cache_alloc to kmem_cache_alloc_lru.

Link: https://lkml.kernel.org/r/20220228122126.37293-3-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Fam Zheng <fam.zheng@bytedance.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kari Argillander <kari.argillander@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

88f2ef73

11 Mar, 2022 · 1 commit

mm: slub: Delete useless parameter of alloc_slab_page() · a485e1da

Authored by Xiongwei Song on Mar 10, 2022

The parameter @s is useless for alloc_slab_page(). It was added in 2014
by commit 5dfb4175 ("sl[au]b: charge slabs to kmemcg explicitly"). The
need for it was removed in 2020 by commit 1f3147b4 ("mm: slub: call
account_slab_page() after slab page initialization"). Let's delete it.

[willy@infradead.org: Added detailed history of @s]
Signed-off-by: Xiongwei Song <sxwjean@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220310140701.87908-3-sxwjean@me.com

a485e1da

09 Mar, 2022 · 3 commits

mm/slub: remove forced_order parameter in calculate_sizes · ae44d81d

Authored by Miaohe Lin on Mar 09, 2022

Since commit 32a6f409 ("mm, slub: remove runtime allocation order
changes"), forced_order is always -1. Remove this unneeded parameter
to simplify the code.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220309092036.50844-1-linmiaohe@huawei.com

ae44d81d

mm/slub: refactor deactivate_slab() · 6d3a16d0

Authored by Hyeonggon Yoo on Mar 07, 2022

Simplify deactivate_slab() by unlocking n->list_lock and retrying
cmpxchg_double() when cmpxchg_double() fails, and perform
add_{partial,full} only when it succeeds.

Releasing and taking n->list_lock again here is not harmful as SLUB
avoids deactivating slabs as much as possible.

[ vbabka@suse.cz: perform add_{partial,full} when cmpxchg_double()
  succeeds.

  count deactivating full slabs even if debugging flag is not set. ]
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220307074057.902222-3-42.hyeyoo@gmail.com

6d3a16d0

mm/slub: limit number of node partial slabs only in cache creation · 5182f3c9

Authored by Hyeonggon Yoo on Mar 07, 2022

SLUB sets the minimum number of partial slabs per node (min_partial)
using set_min_partial(). SLUB holds at least min_partial slabs even if
they're empty to avoid excessive use of the page allocator.

set_min_partial() limits the value of min_partial to between MIN_PARTIAL
and MAX_PARTIAL. As set_min_partial() can be called by
min_partial_store() too, only limit the value of min_partial in
kmem_cache_open() so that it can be changed to the value a user wants.

[ rientjes@google.com: Fold set_min_partial() into its callers ]
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220307074057.902222-2-42.hyeyoo@gmail.com

5182f3c9
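
Illustration for 5182f3c9: the change separates two paths, one that clamps the computed default at cache creation and one that stores the user's value unchanged. In the sketch below, the limit values 5 and 10, struct toy_cache and all function names are assumptions for the example; only the split between the two paths comes from the commit message.

```c
#include <assert.h>

/* Assumed limits for the example. */
#define TOY_MIN_PARTIAL 5UL
#define TOY_MAX_PARTIAL 10UL

struct toy_cache {
	unsigned long min_partial;
};

static unsigned long toy_clamp(unsigned long v, unsigned long lo,
			       unsigned long hi)
{
	assert(lo <= hi);
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Cache creation: the computed default is clamped to the limits. */
void toy_cache_open(struct toy_cache *s, unsigned long computed)
{
	s->min_partial = toy_clamp(computed, TOY_MIN_PARTIAL, TOY_MAX_PARTIAL);
}

/* User store path: the requested value is taken as-is. */
void toy_min_partial_store(struct toy_cache *s, unsigned long requested)
{
	s->min_partial = requested;
}
```

With this split, a user who writes a value outside the default range gets exactly that value instead of a silently clamped one.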

08 Mar, 2022 · 1 commit

mm/slub: use helper macro __ATTR_XX_MODE for SLAB_ATTR(_RO) · d1d28bd9

Authored by Lianjie Zhang on Mar 06, 2022

This allows more concise code, and VERIFY_OCTAL_PERMISSIONS() can help
validate any future change.
Signed-off-by: Lianjie Zhang <zhanglianjie@uniontech.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220306073818.15089-1-zhanglianjie@uniontech.com

d1d28bd9

06 Jan, 2022 · 15 commits

mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only when enabled · 9c01e9af

Authored by Vlastimil Babka on Nov 10, 2021

The fields 'next' and 'slabs' are only used when CONFIG_SLUB_CPU_PARTIAL
is enabled. We can put their definition under #ifdef to prevent accidental
use when disabled.

Currently show_slab_objects() and slabs_cpu_partial_show() contain code
accessing the slabs field that's effectively dead with
CONFIG_SLUB_CPU_PARTIAL=n through the wrappers slub_percpu_partial() and
slub_percpu_partial_read_once(), but to prevent a compile error, we need
to hide all this code behind #ifdef.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

9c01e9af

mm/kasan: Convert to struct folio and struct slab · 6e48a966

Authored by Matthew Wilcox (Oracle) on Oct 04, 2021

KASAN accesses some slab related struct page fields so we need to
convert it to struct slab. Some places are a bit simplified thanks to
kasan_addr_to_slab() encapsulating the PageSlab flag check through
virt_to_slab().  When resolving object address to either a real slab or
a large kmalloc, use struct folio as the intermediate type for testing
the slab flag to avoid unnecessary implicit compound_head().

[ vbabka@suse.cz: use struct folio, adjust to differences in previous
  patches ]
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <kasan-dev@googlegroups.com>

6e48a966

mm: Convert struct page to struct slab in functions used by other subsystems · 40f3bf0c

Authored by Vlastimil Babka on Nov 02, 2021

KASAN, KFENCE and memcg interact with SLAB or SLUB internals through
functions nearest_obj(), obj_to_index() and objs_per_slab() that use
struct page as parameter. This patch converts it to struct slab
including all callers, through a coccinelle semantic patch.

// Options: --include-headers --no-includes --smpl-spacing include/linux/slab_def.h include/linux/slub_def.h mm/slab.h mm/kasan/*.c mm/kfence/kfence_test.c mm/memcontrol.c mm/slab.c mm/slub.c
// Note: needs coccinelle 1.1.1 to avoid breaking whitespace

@@
@@

-objs_per_slab_page(
+objs_per_slab(
 ...
 )
 { ... }

@@
@@

-objs_per_slab_page(
+objs_per_slab(
 ...
 )

@@
identifier fn =~ "obj_to_index|objs_per_slab";
@@

 fn(...,
-   const struct page *page
+   const struct slab *slab
    ,...)
 {
<...
(
- page_address(page)
+ slab_address(slab)
|
- page
+ slab
)
...>
 }

@@
identifier fn =~ "nearest_obj";
@@

 fn(...,
-   struct page *page
+   const struct slab *slab
    ,...)
 {
<...
(
- page_address(page)
+ slab_address(slab)
|
- page
+ slab
)
...>
 }

@@
identifier fn =~ "nearest_obj|obj_to_index|objs_per_slab";
expression E;
@@

 fn(...,
(
- slab_page(E)
+ E
|
- virt_to_page(E)
+ virt_to_slab(E)
|
- virt_to_head_page(E)
+ virt_to_slab(E)
|
- page
+ page_slab(page)
)
  ,...)
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <kasan-dev@googlegroups.com>
Cc: <cgroups@vger.kernel.org>

40f3bf0c

mm/slub: Finish struct page to struct slab conversion · c2092c12

Authored by Vlastimil Babka on Nov 15, 2021

Update comments mentioning pages to mention slabs where appropriate.
Also some goto labels.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>

c2092c12

mm/slub: Convert most struct page to struct slab by spatch · bb192ed9

Authored by Vlastimil Babka on Nov 03, 2021

The majority of conversion from struct page to struct slab in SLUB
internals can be delegated to a coccinelle semantic patch. This includes
renaming of variables with 'page' in name to 'slab', and similar.

Big thanks to Julia Lawall and Luis Chamberlain for help with
coccinelle.

// Options: --include-headers --no-includes --smpl-spacing include/linux/slub_def.h mm/slub.c
// Note: needs coccinelle 1.1.1 to avoid breaking whitespace, and ocaml for the
// embedded script

// build list of functions to exclude from applying the next rule
@initialize:ocaml@
@@

let ok_function p =
  not (List.mem (List.hd p).current_element ["nearest_obj";"obj_to_index";"objs_per_slab_page";"__slab_lock";"__slab_unlock";"free_nonslab_page";"kmalloc_large_node"])

// convert the type from struct page to struct slab in all functions except the
// list from previous rule
// this also affects struct kmem_cache_cpu, but that's ok
@@
position p : script:ocaml() { ok_function p };
@@

- struct page@p
+ struct slab

// in struct kmem_cache_cpu, change the name from page to slab
// the type was already converted by the previous rule
@@
@@

struct kmem_cache_cpu {
...
-struct slab *page;
+struct slab *slab;
...
}

// there are many places that use c->page which is now c->slab after the
// previous rule
@@
struct kmem_cache_cpu *c;
@@

-c->page
+c->slab

@@
@@

struct kmem_cache {
...
- unsigned int cpu_partial_pages;
+ unsigned int cpu_partial_slabs;
...
}

@@
struct kmem_cache *s;
@@

- s->cpu_partial_pages
+ s->cpu_partial_slabs

@@
@@

static void
- setup_page_debug(
+ setup_slab_debug(
 ...)
 {...}

@@
@@

- setup_page_debug(
+ setup_slab_debug(
 ...);

// for all functions (with exceptions), change any "struct slab *page"
// parameter to "struct slab *slab" in the signature, and generally all
// occurrences of "page" to "slab" in the body - with some special cases.

@@
identifier fn !~ "free_nonslab_page|obj_to_index|objs_per_slab_page|nearest_obj";
@@
 fn(...,
-   struct slab *page
+   struct slab *slab
    ,...)
 {
<...
- page
+ slab
...>
 }

// similar to previous but the param is called partial_page
@@
identifier fn;
@@

 fn(...,
-   struct slab *partial_page
+   struct slab *partial_slab
    ,...)
 {
<...
- partial_page
+ partial_slab
...>
 }

// similar to previous but for functions that take pointer to struct page ptr
@@
identifier fn;
@@

 fn(...,
-   struct slab **ret_page
+   struct slab **ret_slab
    ,...)
 {
<...
- ret_page
+ ret_slab
...>
 }

// functions converted by previous rules that were temporarily called using
// slab_page(E) so we want to remove the wrapper now that they accept struct
// slab ptr directly
@@
identifier fn =~ "slab_free|do_slab_free";
expression E;
@@

 fn(...,
- slab_page(E)
+ E
  ,...)

// similar to previous but for another pattern
@@
identifier fn =~ "slab_pad_check|check_object";
@@

 fn(...,
- folio_page(folio, 0)
+ slab
  ,...)

// functions that were returning struct page ptr and now will return struct
// slab ptr, including slab_page() wrapper removal
@@
identifier fn =~ "allocate_slab|new_slab";
expression E;
@@

 static
-struct slab *
+struct slab *
 fn(...)
 {
<...
- slab_page(E)
+ E
...>
 }

// rename any former struct page * declarations
@@
@@

struct slab *
(
- page
+ slab
|
- partial_page
+ partial_slab
|
- oldpage
+ oldslab
)
;

// this has to be separate from previous rule as page and page2 appear at the
// same line
@@
@@

struct slab *
-page2
+slab2
;

// similar but with initial assignment
@@
expression E;
@@

struct slab *
(
- page
+ slab
|
- flush_page
+ flush_slab
|
- discard_page
+ slab_to_discard
|
- page_to_unfreeze
+ slab_to_unfreeze
)
= E;

// convert most of struct page to struct slab usage inside functions (with
// exceptions), including specific variable renames
@@
identifier fn !~ "nearest_obj|obj_to_index|objs_per_slab_page|__slab_(un)*lock|__free_slab|free_nonslab_page|kmalloc_large_node";
expression E;
@@

 fn(...)
 {
<...
(
- int pages;
+ int slabs;
|
- int pages = E;
+ int slabs = E;
|
- page
+ slab
|
- flush_page
+ flush_slab
|
- partial_page
+ partial_slab
|
- oldpage->pages
+ oldslab->slabs
|
- oldpage
+ oldslab
|
- unsigned int nr_pages;
+ unsigned int nr_slabs;
|
- nr_pages
+ nr_slabs
|
- unsigned int partial_pages = E;
+ unsigned int partial_slabs = E;
|
- partial_pages
+ partial_slabs
)
...>
 }

// this has to be split out from the previous rule so that lines containing
// multiple matching changes will be fully converted
@@
identifier fn !~ "nearest_obj|obj_to_index|objs_per_slab_page|__slab_(un)*lock|__free_slab|free_nonslab_page|kmalloc_large_node";
@@

 fn(...)
 {
<...
(
- slab->pages
+ slab->slabs
|
- pages
+ slabs
|
- page2
+ slab2
|
- discard_page
+ slab_to_discard
|
- page_to_unfreeze
+ slab_to_unfreeze
)
...>
 }

// after we simply changed all occurrences of page to slab, some usages need
// adjustment for slab-specific functions, or use slab_page() wrapper
@@
identifier fn !~ "nearest_obj|obj_to_index|objs_per_slab_page|__slab_(un)*lock|__free_slab|free_nonslab_page|kmalloc_large_node";
@@

 fn(...)
 {
<...
(
- page_slab(slab)
+ slab
|
- kasan_poison_slab(slab)
+ kasan_poison_slab(slab_page(slab))
|
- page_address(slab)
+ slab_address(slab)
|
- page_size(slab)
+ slab_size(slab)
|
- PageSlab(slab)
+ folio_test_slab(slab_folio(slab))
|
- page_to_nid(slab)
+ slab_nid(slab)
|
- compound_order(slab)
+ slab_order(slab)
)
...>
 }
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Luis Chamberlain <mcgrof@kernel.org>

bb192ed9

mm/slub: Convert pfmemalloc_match() to take a struct slab · 01b34d16

Committed by Matthew Wilcox (Oracle) on Oct 04, 2021

Preparatory for mass conversion. Use the new slab_test_pfmemalloc()
helper.  As it doesn't do VM_BUG_ON(!PageSlab()) we no longer need the
pfmemalloc_match_unsafe() variant.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

01b34d16

mm/slub: Convert __free_slab() to use struct slab · 4020b4a2

Committed by Vlastimil Babka on Oct 29, 2021

__free_slab() is on the boundary of distinguishing struct slab and
struct page so start with struct slab but convert to folio for working
with flags and folio_page() to call functions that require struct page.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

4020b4a2

mm/slub: Convert alloc_slab_page() to return a struct slab · 45387b8c

Committed by Vlastimil Babka on Oct 26, 2021

Preparatory, callers convert back to struct page for now.

Also move setting page flags to alloc_slab_page() where we still operate
on a struct page. This means the page->slab_cache pointer is now set
later than the PageSlab flag, which could theoretically confuse some pfn
walker assuming PageSlab means there would be a valid cache pointer. But
as the code had no barriers and used __set_bit() anyway, it could have
happened already, so there shouldn't be such a walker.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

45387b8c

mm/slub: Convert print_page_info() to print_slab_info() · fb012e27

Committed by Matthew Wilcox (Oracle) on Oct 04, 2021

Improve the type safety and prepare for further conversion. For flags
access, convert to folio internally.

[ vbabka@suse.cz: access flags via folio_flags() ]
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

fb012e27

mm/slub: Convert __slab_lock() and __slab_unlock() to struct slab · 0393895b

Committed by Vlastimil Babka on Oct 26, 2021

These functions operate on the PG_locked page flag, but make them accept
struct slab to encapsulate this implementation detail.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>

0393895b
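
The encapsulation described in the commit above can be illustrated with a minimal user-space sketch. Everything prefixed with demo_ is a hypothetical stand-in invented for this example (including the bit number chosen for DEMO_PG_LOCKED); it is not the kernel's implementation, which operates on the real page flags. The point is only that callers hold a slab pointer and never touch the flag word themselves.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit number for this sketch; not the kernel's value. */
#define DEMO_PG_LOCKED 0u

/* Simplified stand-ins: the real structures carry many more fields. */
struct demo_page { atomic_ulong flags; };
struct demo_slab { struct demo_page page; };

static inline void demo_slab_init(struct demo_slab *slab)
{
	atomic_init(&slab->page.flags, 0UL);
}

/* The only place that knows the lock lives in a page flag bit. */
static inline atomic_ulong *demo_slab_flags(struct demo_slab *slab)
{
	return &slab->page.flags;
}

/* Try to set the lock bit; returns true if this caller acquired it. */
static inline bool demo_slab_trylock(struct demo_slab *slab)
{
	unsigned long mask = 1UL << DEMO_PG_LOCKED;
	unsigned long old = atomic_fetch_or_explicit(demo_slab_flags(slab),
						     mask, memory_order_acquire);
	return !(old & mask);
}

/* Spin until the lock bit is acquired. */
static inline void demo_slab_lock(struct demo_slab *slab)
{
	while (!demo_slab_trylock(slab))
		;
}

/* Clear the lock bit; the caller must currently hold the lock. */
static inline void demo_slab_unlock(struct demo_slab *slab)
{
	unsigned long mask = 1UL << DEMO_PG_LOCKED;
	unsigned long old = atomic_fetch_and_explicit(demo_slab_flags(slab),
						      ~mask, memory_order_release);
	assert(old & mask);
	(void)old;
}
```

With this shape, moving the lock bit elsewhere would only require changing demo_slab_flags() and the mask, not the callers.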

mm/slub: Convert kfree() to use a struct slab · d835eef4

Committed by Matthew Wilcox (Oracle) on Oct 04, 2021

Convert kfree(), kmem_cache_free() and ___cache_free() to resolve object
addresses to struct slab, using folio as intermediate step where needed.
Keep passing the result as struct page for now in preparation for mass
conversion of internal functions.

[ vbabka@suse.cz: Use folio as intermediate step when checking for
  large kmalloc pages, and when freeing them - rename
  free_nonslab_page() to free_large_kmalloc() that takes struct folio ]
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>

d835eef4

mm/slub: Convert detached_freelist to use a struct slab · cc465c3b

Committed by Matthew Wilcox (Oracle) on Oct 04, 2021

This gives us a little bit of extra typesafety as we know that nobody
called virt_to_page() instead of virt_to_head_page().

[ vbabka@suse.cz: Use folio as intermediate step when filtering out
  large kmalloc pages ]
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>

cc465c3b
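
The type-safety argument in the commit above can be sketched in user-space C. All demo_ names are hypothetical and exist only for this illustration; the layout is a simplified assumption, not the kernel's struct page or struct slab. The idea is that the only way to obtain a slab pointer is a conversion that first resolves to the head page, so a function taking a slab pointer can never be handed a tail page.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in: a tail page points at its head page. */
struct demo_page {
	struct demo_page *compound_head;	/* NULL for a head page */
};

/* Distinct type for slab metadata; the page must stay the first member
 * so the pointer conversion below is well defined. */
struct demo_slab {
	struct demo_page page;
	unsigned int inuse;			/* objects currently allocated */
};

/* Resolve any page of a compound allocation to its head page. */
static inline struct demo_page *demo_head_page(struct demo_page *page)
{
	return page->compound_head ? page->compound_head : page;
}

/* The only conversion to a slab: it always goes through the head page. */
static inline struct demo_slab *demo_page_slab(struct demo_page *page)
{
	return (struct demo_slab *)demo_head_page(page);
}

/* Takes a slab, so the compiler rejects a bare struct demo_page *. */
static inline void demo_slab_account_free(struct demo_slab *slab)
{
	assert(slab->inuse > 0);
	slab->inuse--;
}
```

Calling demo_slab_account_free() with a struct demo_page * draws an incompatible-pointer-type diagnostic from the compiler, which is the "little bit of extra typesafety" the message refers to.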

mm: Convert check_heap_object() to use struct slab · 0b3eb091

Committed by Matthew Wilcox (Oracle) on Oct 04, 2021

Ensure that we're not seeing a tail page inside __check_heap_object() by
converting to a slab instead of a page.  Take the opportunity to mark
the slab as const since we're not modifying it.  Also move the
declaration of __check_heap_object() to mm/slab.h so it's not available
to the wider kernel.

[ vbabka@suse.cz: in check_heap_object() only convert to struct slab for
  actual PageSlab pages; use folio as intermediate step instead of page ]
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>

0b3eb091

mm: Use struct slab in kmem_obj_info() · 7213230a

Committed by Matthew Wilcox (Oracle) on Oct 04, 2021

All three implementations of slab support kmem_obj_info() which reports
details of an object allocated from the slab allocator.  By using the
slab type instead of the page type, we make it obvious that this can
only be called for slabs.

[ vbabka@suse.cz: also convert the related kmem_valid_obj() to folios ]
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Roman Gushchin <guro@fb.com>

7213230a

mm: Convert __ksize() to struct slab · 0c24811b

Committed by Matthew Wilcox (Oracle) on Oct 04, 2021

In SLUB, use folios, and struct slab to access slab_cache field.
In SLOB, use folios to properly resolve pointers beyond
PAGE_SIZE offset of the object.

[ vbabka@suse.cz: use folios, and only convert folio_test_slab() == true
  folios to struct slab ]
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Roman Gushchin <guro@fb.com>

0c24811b
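
The rule noted in the commit above (only folios that test as slab are converted to struct slab) can be sketched as follows. This is a user-space illustration under stated assumptions: the demo_ types and fields are hypothetical simplifications, and the real kernel derives these values from page flags and cache metadata rather than from plain struct members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a slab cache. */
struct demo_cache {
	size_t object_size;		/* usable bytes per object */
};

/* Hypothetical stand-in for a folio. */
struct demo_folio {
	bool is_slab;			/* stand-in for a slab flag test */
	size_t size;			/* total bytes covered by the folio */
	struct demo_cache *slab_cache;	/* meaningful only when is_slab */
};

/* Report the usable size of an allocation backed by this folio. */
static inline size_t demo_ksize(const struct demo_folio *folio)
{
	/* Large allocations are not slabs: report the whole folio and
	 * never read the slab-only field. */
	if (!folio->is_slab)
		return folio->size;

	assert(folio->slab_cache != NULL);
	return folio->slab_cache->object_size;
}
```

Checking the flag before touching slab_cache mirrors the ordering the bracketed note asks for: slab-specific fields are read only after the folio is known to be a slab.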

openeuler / Kernel: synced successfully more than 1 year ago