- 30 11月, 2021 40 次提交
-
-
由 Shakeel Butt 提交于
mainline inclusion from mainline-v5.16-rc1 commit fd25a9e0 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fd25a9e0e23b --------------------------------------------------------------------- The memcg stats can be flushed in multiple context and potentially in parallel too. For example multiple parallel user space readers for memcg stats will contend on the rstat locks with each other. There is no need for that. We just need one flusher and everyone else can benefit. In addition after aa48e47e ("memcg: infrastructure to flush memcg stats") the kernel periodically flush the memcg stats from the root, so, the other flushers will potentially have much less work to do. Link: https://lkml.kernel.org/r/20211001190040.48086-2-shakeelb@google.comSigned-off-by: NShakeel Butt <shakeelb@google.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: "Michal Koutný" <mkoutny@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Shakeel Butt 提交于
mainline inclusion from mainline-v5.16-rc1 commit 11192d9c category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=11192d9c124d ---------------------------------------------------------------------- At the moment, the kernel flushes the memcg stats on every refault and also on every reclaim iteration. Although rstat maintains per-cpu update tree but on the flush the kernel still has to go through all the cpu rstat update tree to check if there is anything to flush. This patch adds the tracking on the stats update side to make flush side more clever by skipping the flush if there is no update. The stats update codepath is very sensitive performance wise for many workloads and benchmarks. So, we can not follow what the commit aa48e47e ("memcg: infrastructure to flush memcg stats") did which was triggering async flush through queue_work() and caused a lot performance regression reports. That got reverted by the commit 1f828223 ("memcg: flush lruvec stats in the refault"). In this patch we kept the stats update codepath very minimal and let the stats reader side to flush the stats only when the updates are over a specific threshold. For now the threshold is (nr_cpus * CHARGE_BATCH). To evaluate the impact of this patch, an 8 GiB tmpfs file is created on a system with swap-on-zram and the file was pushed to swap through memory.force_empty interface. On reading the whole file, the memcg stat flush in the refault code path is triggered. With this patch, we observed 63% reduction in the read time of 8 GiB file. Link: https://lkml.kernel.org/r/20211001190040.48086-1-shakeelb@google.comSigned-off-by: NShakeel Butt <shakeelb@google.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Reviewed-by: N"Michal Koutný" <mkoutny@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-v5.16-rc1 commit 3c08b093 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c08b0931eed -------------------------------------------------------------- c3df5fb5 ("cgroup: rstat: fix A-A deadlock on 32bit around u64_stats_sync") made u64_stats updates irq-safe to avoid A-A deadlocks. Unfortunately, the conversion missed one in blk_cgroup_bio_start(). Fix it. Fixes: 2d146aa3 ("mm: memcontrol: switch to rstat") Cc: stable@vger.kernel.org # v5.13+ Reported-by: syzbot+9738c8815b375ce482a1@syzkaller.appspotmail.com Signed-off-by: NTejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/YWi7NrQdVlxD6J9W@slm.duckdns.orgSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Shakeel Butt 提交于
mainline inclusion from mainline-v5.15-rc3 commit 1f828223 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1f828223b799 ----------------------------------------------------------------------- Prior to the commit 7e1c0d6f ("memcg: switch lruvec stats to rstat") and the commit aa48e47e ("memcg: infrastructure to flush memcg stats"), each lruvec memcg stats can be off by (nr_cgroups * nr_cpus * 32) at worst and for unbounded amount of time. The commit aa48e47e moved the lruvec stats to rstat infrastructure and the commit 7e1c0d6f bounded the error for all the lruvec stats to (nr_cpus * 32) at worst for at most 2 seconds. More specifically it decoupled the number of stats and the number of cgroups from the error rate. However this reduction in error comes with the cost of triggering the slowpath of stats update more frequently. Previously in the slowpath the kernel adds the stats up the memcg tree. After aa48e47e, the kernel triggers the asyn lruvec stats flush through queue_work(). This causes regression reports from 0day kernel bot [1] as well as from phoronix test suite [2]. We tried two options to fix the regression: 1) Increase the threshold to trigger the slowpath in lruvec stats update codepath from 32 to 512. 2) Remove the slowpath from lruvec stats update codepath and instead flush the stats in the page refault codepath. The assumption is that the kernel timely flush the stats, so, the update tree would be small in the refault codepath to not cause the preformance impact. Following are the results of will-it-scale/page_fault[1|2|3] benchmark on four settings i.e. (1) 5.15-rc1 as baseline (2) 5.15-rc1 with aa48e47e and 7e1c0d6f reverted (3) 5.15-rc1 with option-1 (4) 5.15-rc1 with option-2. test (1) (2) (3) (4) pg_f1 368563 406277 (10.23%) 399693 (8.44%) 416398 (12.97%) pg_f2 338399 372133 (9.96%) 369180 (9.09%) 381024 (12.59%) pg_f3 500853 575399 (14.88%) 570388 (13.88%) 576083 (15.02%) From the above result, it seems like the option-2 not only solves the regression but also improves the performance for at least these benchmarks. Feng Tang (intel) ran the aim7 benchmark with these two options and confirms that option-1 reduces the regression but option-2 removes the regression. Michael Larabel (phoronix) ran multiple benchmarks with these options and reported the results at [3] and it shows for most benchmarks option-2 removes the regression introduced by the commit aa48e47e ("memcg: infrastructure to flush memcg stats"). Based on the experiment results, this patch proposed the option-2 as the solution to resolve the regression. Link: https://lore.kernel.org/all/20210726022421.GB21872@xsang-OptiPlex-9020 [1] Link: https://www.phoronix.com/scan.php?page=article&item=linux515-compile-regress [2] Link: https://openbenchmarking.org/result/2109226-DEBU-LINUX5104 [3] Fixes: aa48e47e ("memcg: infrastructure to flush memcg stats") Signed-off-by: NShakeel Butt <shakeelb@google.com> Tested-by: NMichael Larabel <Michael@phoronix.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Roman Gushchin <guro@fb.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Hillf Danton <hdanton@sina.com>, Cc: Michal Koutný <mkoutny@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org>, Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Miaohe Lin 提交于
mainline inclusion from mainline-v5.15-rc1 commit bec49c06 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bec49c067c67 ----------------------------------------------------------------------- Since commit 2d146aa3 ("mm: memcontrol: switch to rstat"), last user of memcg_stat_item_in_bytes() is gone. And since commit fa40d1ee ("mm: vmscan: memcontrol: remove mem_cgroup_select_victim_node()"), only the declaration of mem_cgroup_select_victim_node() is remained here. Remove them. Link: https://lkml.kernel.org/r/20210807082835.61281-2-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Acked-by: NRoman Gushchin <guro@fb.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Alex Shi <alexs@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Shakeel Butt 提交于
mainline inclusion from mainline-v5.15-rc1 commit aa48e47e category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aa48e47e3906 ------------------------------------------------------ At the moment memcg stats are read in four contexts: 1. memcg stat user interfaces 2. dirty throttling 3. page fault 4. memory reclaim Currently the kernel flushes the stats for first two cases. Flushing the stats for remaining two casese may have performance impact. Always flushing the memcg stats on the page fault code path may negatively impacts the performance of the applications. In addition flushing in the memory reclaim code path, though treated as slowpath, can become the source of contention for the global lock taken for stat flushing because when system or memcg is under memory pressure, many tasks may enter the reclaim path. This patch uses following mechanisms to solve these challenges: 1. Periodically flush the stats from root memcg every 2 seconds. This will time limit the out of sync stats. 2. Asynchronously flush the stats after fixed number of stat updates. In the worst case the stat can be out of sync by O(nr_cpus * BATCH) for 2 seconds. 3. For avoiding thundering herd to flush the stats particularly from the memory reclaim context, introduce memcg local spinlock and let only one flusher active at a time. This could have been done through cgroup_rstat_lock lock but that lock is used by other subsystem and for userspace reading memcg stats. So, it is better to keep flushers introduced by this patch decoupled from cgroup_rstat_lock. However we would have to use irqsafe version of rstat flush but that is fine as this code path will be flushing for whole tree and do the work for everyone. No one will be waiting for that worker. [shakeelb@google.com: fix sleep-in-wrong context bug] Link: https://lkml.kernel.org/r/20210716212137.1391164-2-shakeelb@google.com Link: https://lkml.kernel.org/r/20210714013948.270662-2-shakeelb@google.comSigned-off-by: NShakeel Butt <shakeelb@google.com> Tested-by: NMarek Szyprowski <m.szyprowski@samsung.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Roman Gushchin <guro@fb.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Conflict: include/linux/memcontrol.h Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Shakeel Butt 提交于
mainline inclusion from mainline-v5.15-rc1 commit 7e1c0d6f category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7e1c0d6f58207 ------------------------------------------------------------------------- The commit 2d146aa3 ("mm: memcontrol: switch to rstat") switched memcg stats to rstat infrastructure but skipped the conversion of the lruvec stats as such stats are read in the performance critical code paths and flushing stats may have impacted the performances of the applications. This patch converts the lruvec stats to rstat and later patches add mechanisms to keep the performance impact to minimum. The rstat conversion comes with the price i.e. memory cost. Effectively this patch reverts the savings done by the commit f3344adf ("mm: memcontrol: optimize per-lruvec stats counter memory usage"). However this cost is justified due to negative impact of the inaccurate lruvec stats on many heuristics. One such case is reported in [1]. The memory reclaim code is filled with plethora of heuristics and many of those heuristics reads the lruvec stats. So, inaccurate stats can make such heuristics ineffective. [1] reports the impact of inaccurate lruvec stats on the "cache trim mode" heuristic. Inaccurate lruvec stats can impact the deactivation and aging anon heuristics as well. [1] https://lore.kernel.org/linux-mm/20210311004449.1170308-1-ying.huang@intel.com/ Link: https://lkml.kernel.org/r/20210716212137.1391164-1-shakeelb@google.com Link: https://lkml.kernel.org/r/20210714013948.270662-1-shakeelb@google.comSigned-off-by: NShakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Roman Gushchin <guro@fb.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Michal Koutný <mkoutny@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johannes Weiner 提交于
mainline inclusion from mainline-v5.14-rc4 commit 30def935 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=30def93565e5b ------------------------------------------------------------------------ Dan Carpenter reports: The patch 2d146aa3: "mm: memcontrol: switch to rstat" from Apr 29, 2021, leads to the following static checker warning: kernel/cgroup/rstat.c:200 cgroup_rstat_flush() warn: sleeping in atomic context mm/memcontrol.c 3572 static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap) 3573 { 3574 unsigned long val; 3575 3576 if (mem_cgroup_is_root(memcg)) { 3577 cgroup_rstat_flush(memcg->css.cgroup); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is from static analysis and potentially a false positive. The problem is that mem_cgroup_usage() is called from __mem_cgroup_threshold() which holds an rcu_read_lock(). And the cgroup_rstat_flush() function can sleep. 3578 val = memcg_page_state(memcg, NR_FILE_PAGES) + 3579 memcg_page_state(memcg, NR_ANON_MAPPED); 3580 if (swap) 3581 val += memcg_page_state(memcg, MEMCG_SWAP); 3582 } else { 3583 if (!swap) 3584 val = page_counter_read(&memcg->memory); 3585 else 3586 val = page_counter_read(&memcg->memsw); 3587 } 3588 return val; 3589 } __mem_cgroup_threshold() indeed holds the rcu lock. In addition, the thresholding code is invoked during stat changes, and those contexts have irqs disabled as well. If the lock breaking occurs inside the flush function, it will result in a sleep from an atomic context. Use the irqsafe flushing variant in mem_cgroup_usage() to fix this. Link: https://lkml.kernel.org/r/20210726150019.251820-1-hannes@cmpxchg.org Fixes: 2d146aa3 ("mm: memcontrol: switch to rstat") Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NChris Down <chris@chrisdown.name> Reviewed-by: NRik van Riel <riel@surriel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NShakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-v5.14-rc6 commit c3df5fb5 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c3df5fb57fe8 ------------------------------------------------------------------------- 0fa294fb ("cgroup: Replace cgroup_rstat_mutex with a spinlock") added cgroup_rstat_flush_irqsafe() allowing flushing to happen from the irq context. However, rstat paths use u64_stats_sync to synchronize access to 64bit stat counters on 32bit machines. u64_stats_sync is implemented using seq_lock and trying to read from an irq context can lead to A-A deadlock if the irq happens to interrupt the stat update. Fix it by using the irqsafe variants - u64_stats_update_begin_irqsave() and u64_stats_update_end_irqrestore() - in the update paths. Note that none of this matters on 64bit machines. All these are just for 32bit SMP setups. Note that the interface was introduced way back, its first and currently only use was recently added by 2d146aa3 ("mm: memcontrol: switch to rstat"). Stable tagging targets this commit. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NRik van Riel <riel@surriel.com> Fixes: 2d146aa3 ("mm: memcontrol: switch to rstat") Cc: stable@vger.kernel.org # v5.13+ Conflict: block/blk-cgroup.c Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johannes Weiner 提交于
mainline inclusion from mainline-v5.13-rc1 commit 4bbcc5a4 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4bbcc5a41c54 ------------------------------------------------------------------ With memcg having switched to rstat, memory.stat output is precise. Update the cgroup selftest to reflect the expectations and error tolerances of the new implementation. Also add newly tracked types of memory to the memory.stat side of the equation, since they're included in memory.current and could throw false positives. Link: https://lkml.kernel.org/r/20210209163304.77088-9-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NMichal Koutný <mkoutny@suse.com> Acked-by: NRoman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johannes Weiner 提交于
mainline inclusion from mainline-v5.13-rc1 commit 2cd21c89 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2cd21c89800c ----------------------------------------------------------------------- There are two functions to flush the per-cpu data of an lruvec into the rest of the cgroup tree: when the cgroup is being freed, and when a CPU disappears during hotplug. The difference is whether all CPUs or just one is being collected, but the rest of the flushing code is the same. Merge them into one function and share the common code. Link: https://lkml.kernel.org/r/20210209163304.77088-8-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NShakeel Butt <shakeelb@google.com> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NRoman Gushchin <guro@fb.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johannes Weiner 提交于
mainline inclusion from mainline-v5.13-rc1 commit 2d146aa3 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d146aa3aa84 ---------------------------------------------------------------------- Replace the memory controller's custom hierarchical stats code with the generic rstat infrastructure provided by the cgroup core. The current implementation does batched upward propagation from the write side (i.e. as stats change). The per-cpu batches introduce an error, which is multiplied by the number of subgroups in a tree. In systems with many CPUs and sizable cgroup trees, the error can be large enough to confuse users (e.g. 32 batch pages * 32 CPUs * 32 subgroups results in an error of up to 128M per stat item). This can entirely swallow allocation bursts inside a workload that the user is expecting to see reflected in the statistics. In the past, we've done read-side aggregation, where a memory.stat read would have to walk the entire subtree and add up per-cpu counts. This became problematic with lazily-freed cgroups: we could have large subtrees where most cgroups were entirely idle. Hence the switch to change-driven upward propagation. Unfortunately, it needed to trade accuracy for speed due to the write side being so hot. Rstat combines the best of both worlds: from the write side, it cheaply maintains a queue of cgroups that have pending changes, so that the read side can do selective tree aggregation. This way the reported stats will always be precise and recent as can be, while the aggregation can skip over potentially large numbers of idle cgroups. The way rstat works is that it implements a tree for tracking cgroups with pending local changes, as well as a flush function that walks the tree upwards. The controller then drives this by 1) telling rstat when a local cgroup stat changes (e.g. mod_memcg_state) and 2) when a flush is required to get uptodate hierarchy stats for a given subtree (e.g. when memory.stat is read). The controller also provides a flush callback that is called during the rstat flush walk for each cgroup and aggregates its local per-cpu counters and propagates them upwards. This adds a second vmstats to struct mem_cgroup (MEMCG_NR_STAT + NR_VM_EVENT_ITEMS) to track pending subtree deltas during upward aggregation. It removes 3 words from the per-cpu data. It eliminates memcg_exact_page_state(), since memcg_page_state() is now exact. [akpm@linux-foundation.org: merge fix] [hannes@cmpxchg.org: fix a sleep in atomic section problem] Link: https://lkml.kernel.org/r/20210315234100.64307-1-hannes@cmpxchg.org Link: https://lkml.kernel.org/r/20210209163304.77088-7-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NRoman Gushchin <guro@fb.com> Acked-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NMichal Koutný <mkoutny@suse.com> Acked-by: NBalbir Singh <bsingharora@gmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Conflict: include/linux/memcontrol.h Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johannes Weiner 提交于
mainline inclusion from mianline-v5.13-rc1 commit dc26532a category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference : https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dc26532aed0ab --------------------------------------------------------------------- Current users of the rstat code can source root-level statistics from the native counters of their respective subsystem, allowing them to forego aggregation at the root level. This optimization is currently implemented inside the generic rstat code, which doesn't track the root cgroup and doesn't invoke the subsystem flush callbacks on it. However, the memory controller cannot do this optimization, because cgroup1 breaks out memory specifically for the local level, including at the root level. In preparation for the memory controller switching to rstat, move the optimization from rstat core to the controllers. Afterwards, rstat will always track the root cgroup for changes and invoke the subsystem callbacks on it; and it's up to the subsystem to special-case and skip aggregation of the root cgroup if it can source this information through other, cheaper means. This is the case for the io controller and the cgroup base stats. In their respective flush callbacks, check whether the parent is the root cgroup, and if so, skip the unnecessary upward propagation. The extra cost of tracking the root cgroup is negligible: on stat changes, we actually remove a branch that checks for the root. The queueing for a flush touches only per-cpu data, and only the first stat change since a flush requires a (per-cpu) lock. Link: https://lkml.kernel.org/r/20210209163304.77088-6-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NTejun Heo <tj@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johannes Weiner 提交于
mainline inclusion from mainline-v5.13-rc1 commit a7df69b8 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7df69b81aac -------------------------------------------------------------------- Rstat currently only supports the default hierarchy in cgroup2. In order to replace memcg's private stats infrastructure - used in both cgroup1 and cgroup2 - with rstat, the latter needs to support cgroup1. The initialization and destruction callbacks for regular cgroups are already in place. Remove the cgroup_on_dfl() guards to handle cgroup1. The initialization of the root cgroup is currently hardcoded to only handle cgrp_dfl_root.cgrp. Move those callbacks to cgroup_setup_root() and cgroup_destroy_root() to handle the default root as well as the various cgroup1 roots we may set up during mounting. The linking of css to cgroups happens in code shared between cgroup1 and cgroup2 as well. Simply remove the cgroup_on_dfl() guard. Linkage of the root css to the root cgroup is a bit trickier: per default, the root css of a subsystem controller belongs to the default hierarchy (i.e. the cgroup2 root). When a controller is mounted in its cgroup1 version, the root css is stolen and moved to the cgroup1 root; on unmount, the css moves back to the default hierarchy. Annotate rebind_subsystems() to move the root css linkage along between roots. Link: https://lkml.kernel.org/r/20210209163304.77088-5-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NRoman Gushchin <guro@fb.com> Reviewed-by: NShakeel Butt <shakeelb@google.com> Acked-by: NTejun Heo <tj@kernel.org> Reviewed-by: NMichal Koutný <mkoutny@suse.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johannes Weiner 提交于
mainline inclusion from mainline-v5.13-rc1 commit a18e6e6e category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a18e6e6e150a ------------------------------------------------------------------- There are no users outside of the memory controller itself. The rest of the kernel cares either about node or lruvec stats. Link: https://lkml.kernel.org/r/20210209163304.77088-4-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NRoman Gushchin <guro@fb.com> Acked-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NMichal Koutný <mkoutny@suse.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Conflict: include/linux/memcontrol.h Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johannes Weiner 提交于
mainline inclusion from mainline-v5.13-rc1 commit a3747b53 category: feature bugzilla: 185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a3747b53b1771 ---------------------------------------------------- No need to encapsulate a simple struct member access. Link: https://lkml.kernel.org/r/20210209163304.77088-3-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NRoman Gushchin <guro@fb.com> Acked-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NMichal Koutný <mkoutny@suse.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Conflict: mm/memcontrol.c Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johannes Weiner 提交于
mainline inclusion from mainline-5.13-rc1 commit a3d4c05a category: feature bugzilla:185803 https://gitee.com/openeuler/kernel/issues/I4JOG9?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a3d4c05a447486 --------------------------------------------------- Patch series "mm: memcontrol: switch to rstat", v3. This series converts memcg stats tracking to the streamlined rstat infrastructure provided by the cgroup core code. rstat is already used by the CPU controller and the IO controller. This change is motivated by recent accuracy problems in memcg's custom stats code, as well as the benefits of sharing common infra with other controllers. The current memcg implementation does batched tree aggregation on the write side: local stat changes are cached in per-cpu counters, which are then propagated upward in batches when a threshold (32 pages) is exceeded. This is cheap, but the error introduced by the lazy upward propagation adds up: 32 pages times CPUs times cgroups in the subtree. We've had complaints from service owners that the stats do not reliably track and react to allocation behavior as expected, sometimes swallowing the results of entire test applications. The original memcg stat implementation used to do tree aggregation exclusively on the read side: local stats would only ever be tracked in per-cpu counters, and a memory.stat read would iterate the entire subtree and sum those counters up. This didn't keep up with the times: - Cgroup trees are much bigger now. We switched to lazily-freed cgroups, where deleted groups would hang around until their remaining page cache has been reclaimed. This can result in large subtrees that are expensive to walk, while most of the groups are idle and their statistics don't change much anymore. - Automated monitoring increased. With the proliferation of userspace oom killing, proactive reclaim, and higher-resolution logging of workload trends in general, top-level stat files are polled at least once a second in many deployments. - The lifetime of cgroups got shorter. Where most cgroup setups in the past would have a few large policy-oriented cgroups for everything running on the system, newer cgroup deployments tend to create one group per application - which gets deleted again as the processes exit. An aggregation scheme that doesn't retain child data inside the parents loses event history of the subtree. Rstat addresses all three of those concerns through intelligent, persistent read-side aggregation. As statistics change at the local level, rstat tracks - on a per-cpu basis - only those parts of a subtree that have changes pending and require aggregation. The actual aggregation occurs on the colder read side - which can now skip over (potentially large) numbers of recently idle cgroups. === The test_kmem cgroup selftest is currently failing due to excessive cumulative vmstat drift from 100 subgroups: ok 1 test_kmem_basic memory.current = 8810496 slab + anon + file + kernel_stack = 17074568 slab = 6101384 anon = 946176 file = 0 kernel_stack = 10027008 not ok 2 test_kmem_memcg_deletion ok 3 test_kmem_proc_kpagecgroup ok 4 test_kmem_kernel_stacks ok 5 test_kmem_dead_cgroups ok 6 test_percpu_basic As you can see, memory.stat items far exceed memory.current. The kernel stack alone is bigger than all of charged memory. That's because the memory of the test has been uncharged from memory.current, but the negative vmstat deltas are still sitting in the percpu caches. The test at this time isn't even counting percpu, pagetables etc. yet, which would further contribute to the error. The last patch in the series updates the test to include them - as well as reduces the vmstat tolerances in general to only expect page_counter batching. With all patches applied, the (now more stringent) test succeeds: ok 1 test_kmem_basic ok 2 test_kmem_memcg_deletion ok 3 test_kmem_proc_kpagecgroup ok 4 test_kmem_kernel_stacks ok 5 test_kmem_dead_cgroups ok 6 test_percpu_basic === A kernel build test confirms that overhead is comparable. Two kernels are built simultaneously in a nested tree with several idle siblings: root - kernelbuild - one - two - three - four - build-a (defconfig, make -j16) `- build-b (defconfig, make -j16) `- idle-1 `- ... `- idle-9 During the builds, kernelbuild/memory.stat is read once a second. A perf diff shows that the changes in cycle distribution is minimal. Top 10 kernel symbols: 0.09% +0.08% [kernel.kallsyms] [k] __mod_memcg_lruvec_state 0.00% +0.06% [kernel.kallsyms] [k] cgroup_rstat_updated 0.08% -0.05% [kernel.kallsyms] [k] __mod_memcg_state.part.0 0.16% -0.04% [kernel.kallsyms] [k] release_pages 0.00% +0.03% [kernel.kallsyms] [k] __count_memcg_events 0.01% +0.03% [kernel.kallsyms] [k] mem_cgroup_charge_statistics.constprop.0 0.10% -0.02% [kernel.kallsyms] [k] get_mem_cgroup_from_mm 0.05% -0.02% [kernel.kallsyms] [k] mem_cgroup_update_lru_size 0.57% +0.01% [kernel.kallsyms] [k] asm_exc_page_fault === The on-demand aggregated stats are now fully accurate: $ grep -e nr_inactive_file /proc/vmstat | awk '{print($1,$2*4096)}'; \ grep -e inactive_file /sys/fs/cgroup/memory.stat vanilla: patched: nr_inactive_file 1574105088 nr_inactive_file 1027801088 inactive_file 1577410560 inactive_file 1027801088 === This patch (of 8): The memcg hotunplug callback erroneously flushes counts on the local CPU, not the counts of the CPU going away; those counts will be lost. Flush the CPU that is actually going away. Also simplify the code a bit by using mod_memcg_state() and count_memcg_events() instead of open-coding the upward flush - this is comparable to how vmstat.c handles hotunplug flushing. Link: https://lkml.kernel.org/r/20210209163304.77088-1-hannes@cmpxchg.org Link: https://lkml.kernel.org/r/20210209163304.77088-2-hannes@cmpxchg.org Fixes: a983b5eb ("mm: memcontrol: fix excessive complexity in memory.stat reporting") Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NRoman Gushchin <guro@fb.com> Reviewed-by: NMichal Koutný <mkoutny@suse.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Roman Gushchin <guro@fb.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Nadav Amit 提交于
stable inclusion from stable-5.10.82 commit 40bc831ab5f630431010d1ff867390b07418a7ee category: bugfix bugzilla: 185820 https://gitee.com/openeuler/kernel/issues/I4DDEL CVE: CVE-2021-4002 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=40bc831ab5f630431010d1ff867390b07418a7ee ----------------------------------------------- commit a4a118f2 upstream. When __unmap_hugepage_range() calls to huge_pmd_unshare() succeed, a TLB flush is missing. This TLB flush must be performed before releasing the i_mmap_rwsem, in order to prevent an unshared PMDs page from being released and reused before the TLB flush took place. Arguably, a comprehensive solution would use mmu_gather interface to batch the TLB flushes and the PMDs page release, however it is not an easy solution: (1) try_to_unmap_one() and try_to_migrate_one() also call huge_pmd_unshare() and they cannot use the mmu_gather interface; and (2) deferring the release of the page reference for the PMDs page until after i_mmap_rwsem is dropeed can confuse huge_pmd_unshare() into thinking PMDs are shared when they are not. Fix __unmap_hugepage_range() by adding the missing TLB flush, and forcing a flush when unshare is successful. Fixes: 24669e58 ("hugetlb: use mmu_gather instead of a temporary linked list for accumulating pages)" # 3.6 Signed-off-by: NNadav Amit <namit@vmware.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johan Hovold 提交于
stable inclusion from stable-5.10.79 commit 62424fe4c2cf38f27c1eef66cbfa226ffe233e90 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=62424fe4c2cf38f27c1eef66cbfa226ffe233e90 -------------------------------- commit 541fd20c upstream. USB control-message timeouts are specified in milliseconds and should specifically not vary with CONFIG_HZ. Use the common control-message timeout define for the five-second timeout. Fixes: dad0d04f ("rsi: Add RS9113 wireless driver") Cc: stable@vger.kernel.org # 3.15 Signed-off-by: NJohan Hovold <johan@kernel.org> Signed-off-by: NKalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211025120522.6045-5-johan@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Gustavo A. R. Silva 提交于
stable inclusion from stable-5.10.79 commit 8971158af1e0138bf0576c16f9aa153f151fb6b4 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8971158af1e0138bf0576c16f9aa153f151fb6b4 -------------------------------- commit a44f9d6f upstream. There is a wrong comparison of the total size of the loaded firmware css->fw->size with the size of a pointer to struct imgu_fw_header. Turn binary_header into a flexible-array member[1][2], use the struct_size() helper and fix the wrong size comparison. Notice that the loaded firmware needs to contain at least one 'struct imgu_fw_info' item in the binary_header[] array. It's also worth mentioning that "css->fw->size < struct_size(css->fwp, binary_header, 1)" with binary_header declared as a flexible-array member is equivalent to "css->fw->size < sizeof(struct imgu_fw_header)" with binary_header declared as a one-element array (as in the original code). The replacement of the one-element array with a flexible-array member also helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/109 Fixes: 09d290f0 ("media: staging/intel-ipu3: css: Add support for firmware management") Cc: stable@vger.kernel.org Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: NSakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johan Hovold 提交于
stable inclusion from stable-5.10.79 commit 1cf43e928954f5d511d3bb28c045ab430e9440a8 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1cf43e928954f5d511d3bb28c045ab430e9440a8 -------------------------------- commit 4cfa36d3 upstream. USB control-message timeouts are specified in milliseconds and should specifically not vary with CONFIG_HZ. Fixes: 8fc8598e ("Staging: Added Realtek rtl8192u driver to staging") Cc: stable@vger.kernel.org # 2.6.33 Acked-by: NLarry Finger <Larry.Finger@lwfinger.net> Signed-off-by: NJohan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211025120910.6339-2-johan@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johan Hovold 提交于
stable inclusion from stable-5.10.79 commit 9963ba5b9d495d05bf32f37dc42d69afed46639b bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9963ba5b9d495d05bf32f37dc42d69afed46639b -------------------------------- commit ce494052 upstream. USB control-message timeouts are specified in milliseconds and should specifically not vary with CONFIG_HZ. Fixes: 2865d42c ("staging: r8712u: Add the new driver to the mainline kernel") Cc: stable@vger.kernel.org # 2.6.37 Acked-by: NLarry Finger <Larry.Finger@lwfinger.net> Signed-off-by: NJohan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211025120910.6339-3-johan@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johan Hovold 提交于
stable inclusion from stable-5.10.79 commit 844b02496eaca61c3eb2e7430dfc25a06302bff3 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=844b02496eaca61c3eb2e7430dfc25a06302bff3 -------------------------------- commit a56d3e40 upstream. USB bulk and interrupt message timeouts are specified in milliseconds and should specifically not vary with CONFIG_HZ. Note that the bulk-out transfer timeout was set to the endpoint bInterval value, which should be ignored for bulk endpoints and is typically set to zero. This meant that a failing bulk-out transfer would never time out. Assume that the 10 second timeout used for all other transfers is more than enough also for the bulk-out endpoint. Fixes: 985cafcc ("Staging: Comedi: vmk80xx: Add k8061 support") Fixes: 951348b3 ("staging: comedi: vmk80xx: wait for URBs to complete") Cc: stable@vger.kernel.org # 2.6.31 Signed-off-by: NJohan Hovold <johan@kernel.org> Reviewed-by: NIan Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20211025114532.4599-6-johan@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johan Hovold 提交于
stable inclusion from stable-5.10.79 commit b7fd7f3387f070215e6be341e68eb5c087eeecc0 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b7fd7f3387f070215e6be341e68eb5c087eeecc0 -------------------------------- commit 78cdfd62 upstream. The driver is using endpoint-sized buffers but must not assume that the tx and rx buffers are of equal size or a malicious device could overflow the slab-allocated receive buffer when doing bulk transfers. Fixes: 985cafcc ("Staging: Comedi: vmk80xx: Add k8061 support") Cc: stable@vger.kernel.org # 2.6.31 Signed-off-by: NJohan Hovold <johan@kernel.org> Reviewed-by: NIan Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20211025114532.4599-5-johan@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johan Hovold 提交于
stable inclusion from stable-5.10.79 commit 33d7a470730dfe7c9bfc8da84575cf2cedd60d00 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=33d7a470730dfe7c9bfc8da84575cf2cedd60d00 -------------------------------- commit a23461c4 upstream. The driver uses endpoint-sized USB transfer buffers but up until recently had no sanity checks on the sizes. Commit e1f13c87 ("staging: comedi: check validity of wMaxPacketSize of usb endpoints found") inadvertently fixed NULL-pointer dereferences when accessing the transfer buffers in case a malicious device has a zero wMaxPacketSize. Make sure to allocate buffers large enough to handle also the other accesses that are done without a size check (e.g. byte 18 in vmk80xx_cnt_insn_read() for the VMK8061_MODEL) to avoid writing beyond the buffers, for example, when doing descriptor fuzzing. The original driver was for a low-speed device with 8-byte buffers. Support was later added for a device that uses bulk transfers and is presumably a full-speed device with a maximum 64-byte wMaxPacketSize. Fixes: 985cafcc ("Staging: Comedi: vmk80xx: Add k8061 support") Cc: stable@vger.kernel.org # 2.6.31 Signed-off-by: NJohan Hovold <johan@kernel.org> Reviewed-by: NIan Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20211025114532.4599-4-johan@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johan Hovold 提交于
stable inclusion from stable-5.10.79 commit ef143dc0c3defe56730ecd3a9de7b3e1d7e557c1 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ef143dc0c3defe56730ecd3a9de7b3e1d7e557c1 -------------------------------- commit 907767da upstream. The driver uses endpoint-sized USB transfer buffers but had no sanity checks on the sizes. This can lead to zero-size-pointer dereferences or overflowed transfer buffers in ni6501_port_command() and ni6501_counter_command() if a (malicious) device has smaller max-packet sizes than expected (or when doing descriptor fuzz testing). Add the missing sanity checks to probe(). Fixes: a03bb00e ("staging: comedi: add NI USB-6501 support") Cc: stable@vger.kernel.org # 3.18 Cc: Luca Ellero <luca.ellero@brickedbrain.com> Reviewed-by: NIan Abbott <abbotti@mev.co.uk> Signed-off-by: NJohan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211027093529.30896-2-johan@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Johan Hovold 提交于
stable inclusion from stable-5.10.79 commit 786f5b03450454557ff858a8bead5d7c0cbf78d6 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=786f5b03450454557ff858a8bead5d7c0cbf78d6 -------------------------------- commit 536de747 upstream. USB transfer buffers are typically mapped for DMA and must not be allocated on the stack or transfers will fail. Allocate proper transfer buffers in the various command helpers and return an error on short transfers instead of acting on random stack data. Note that this also fixes a stack info leak on systems where DMA is not used as 32 bytes are always sent to the device regardless of how short the command is. Fixes: 63274cd7 ("Staging: comedi: add usb dt9812 driver") Cc: stable@vger.kernel.org # 2.6.29 Reviewed-by: NIan Abbott <abbotti@mev.co.uk> Signed-off-by: NJohan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211027093529.30896-3-johan@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jan Kara 提交于
stable inclusion from stable-5.10.79 commit 86d4aedcbc69c0f84551fb70f953c24e396de2d7 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=86d4aedcbc69c0f84551fb70f953c24e396de2d7 -------------------------------- commit e96a1866 upstream. When isofs image is suitably corrupted isofs_read_inode() can read data beyond the end of buffer. Sanity-check the directory entry length before using it. Reported-and-tested-by: syzbot+6fc7fb214625d82af7d1@syzkaller.appspotmail.com CC: stable@vger.kernel.org Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Pavel Skripkin 提交于
stable inclusion from stable-5.10.79 commit c430094541a80575259a94ff879063ef01473506 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c430094541a80575259a94ff879063ef01473506 -------------------------------- commit c052cc1a upstream. Syzbot reported use-after-free in rtl8712_dl_fw(). The problem was in race condition between r871xu_dev_remove() ->ndo_open() callback. It's easy to see from crash log, that driver accesses released firmware in ->ndo_open() callback. It may happen, since driver was releasing firmware _before_ unregistering netdev. Fix it by moving unregister_netdev() before cleaning up resources. Call Trace: ... rtl871x_open_fw drivers/staging/rtl8712/hal_init.c:83 [inline] rtl8712_dl_fw+0xd95/0xe10 drivers/staging/rtl8712/hal_init.c:170 rtl8712_hal_init drivers/staging/rtl8712/hal_init.c:330 [inline] rtl871x_hal_init+0xae/0x180 drivers/staging/rtl8712/hal_init.c:394 netdev_open+0xe6/0x6c0 drivers/staging/rtl8712/os_intfs.c:380 __dev_open+0x2bc/0x4d0 net/core/dev.c:1484 Freed by task 1306: ... release_firmware+0x1b/0x30 drivers/base/firmware_loader/main.c:1053 r871xu_dev_remove+0xcc/0x2c0 drivers/staging/rtl8712/usb_intf.c:599 usb_unbind_interface+0x1d8/0x8d0 drivers/usb/core/driver.c:458 Fixes: 8c213fa5 ("staging: r8712u: Use asynchronous firmware loading") Cc: stable <stable@vger.kernel.org> Reported-and-tested-by: syzbot+c55162be492189fb4f51@syzkaller.appspotmail.com Signed-off-by: NPavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20211019211718.26354-1-paskripkin@gmail.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Petr Mladek 提交于
stable inclusion from stable-5.10.79 commit ab4af56ae2508df3bdabcb192874086f5f07d98c bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ab4af56ae2508df3bdabcb192874086f5f07d98c -------------------------------- commit 3cffa06a upstream. The commit 48021f98 ("printk: handle blank console arguments passed in.") prevented crash caused by empty console= parameter value. Unfortunately, this value is widely used on Chromebooks to disable the console output. The above commit caused performance regression because the messages were pushed on slow console even though nobody was watching it. Use ttynull driver explicitly for console="" and console=null parameters. It has been created for exactly this purpose. It causes that preferred_console is set. As a result, ttySX and ttyX are not used as a fallback. And only ttynull console gets registered by default. It still allows to register other consoles either by additional console= parameters or SPCR. It prevents regression because it worked this way even before. Also it is a sane semantic. Preventing output on all consoles should be done another way, for example, by introducing mute_console parameter. Link: https://lore.kernel.org/r/20201006025935.GA597@jagdpanzerIV.localdomainSuggested-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: NGuenter Roeck <linux@roeck-us.net> Tested-by: NGuenter Roeck <linux@roeck-us.net> Acked-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: NPetr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20201111135450.11214-3-pmladek@suse.com Cc: Yi Fan <yfa@google.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Todd Kjos 提交于
stable inclusion from stable-5.10.79 commit 07d1db141e478917600fa32be11e5b828ff9ed13 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=07d1db141e478917600fa32be11e5b828ff9ed13 -------------------------------- commit 32e9f56a upstream. When freeing txn buffers, binder_transaction_buffer_release() attempts to detect whether the current context is the target by comparing current->group_leader to proc->tsk. This is an unreliable test. Instead explicitly pass an 'is_failure' boolean. Detecting the sender was being used as a way to tell if the transaction failed to be sent. When cleaning up after failing to send a transaction, there is no need to close the fds associated with a BINDER_TYPE_FDA object. Now 'is_failure' can be used to accurately detect this case. Fixes: 44d8047f ("binder: use standard functions to allocate fds") Cc: stable <stable@vger.kernel.org> Acked-by: NChristian Brauner <christian.brauner@ubuntu.com> Signed-off-by: NTodd Kjos <tkjos@google.com> Link: https://lore.kernel.org/r/20211015233811.3532235-1-tkjos@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 James Buren 提交于
stable inclusion from stable-5.10.79 commit 42681b90c4db8bf29fc2cc93789ec093bb209296 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=42681b90c4db8bf29fc2cc93789ec093bb209296 -------------------------------- commit 05c8f1b6 upstream. These drive enclosures have firmware bugs that make it impossible to mount a new virtual ISO image after Linux ejects the old one if the device is locked by Linux. Windows bypasses this problem by the fact that they do not lock the device. Add a quirk to disable device locking for these drive enclosures. Acked-by: NAlan Stern <stern@rowland.harvard.edu> Signed-off-by: NJames Buren <braewoods+lkml@braewoods.net> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211014015504.2695089-1-braewoods+lkml@braewoods.netSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Viraj Shah 提交于
stable inclusion from stable-5.10.79 commit 1309753b7841db412063bcce3f2a0eddf1443e9d bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1309753b7841db412063bcce3f2a0eddf1443e9d -------------------------------- commit 21b5fcdc upstream. musb_gadget_queue() adds the passed request to musb_ep::req_list. If the endpoint is idle and it is the first request then it invokes musb_queue_resume_work(). If the function returns an error then the error is passed to the caller without any clean-up and the request remains enqueued on the list. If the caller enqueues the request again then the list corrupts. Remove the request from the list on error. Fixes: ea2f35c0 ("usb: musb: Fix sleeping function called from invalid context for hdrc glue") Cc: stable <stable@vger.kernel.org> Signed-off-by: NViraj Shah <viraj.shah@linutronix.de> Link: https://lore.kernel.org/r/20211021093644.4734-1-viraj.shah@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Geert Uytterhoeven 提交于
stable inclusion from stable-5.10.79 commit 27409143122f9c68f74da42ce987cc6badb40784 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=27409143122f9c68f74da42ce987cc6badb40784 -------------------------------- commit a0548b26 upstream. On 64-bit: drivers/usb/gadget/udc/fsl_qe_udc.c: In function ‘qe_ep0_rx’: drivers/usb/gadget/udc/fsl_qe_udc.c:842:13: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] 842 | vaddr = (u32)phys_to_virt(in_be32(&bd->buf)); | ^ In file included from drivers/usb/gadget/udc/fsl_qe_udc.c:41: drivers/usb/gadget/udc/fsl_qe_udc.c:843:28: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 843 | frame_set_data(pframe, (u8 *)vaddr); | ^ The driver assumes physical and virtual addresses are 32-bit, hence it cannot work on 64-bit platforms. Acked-by: NLi Yang <leoyang.li@nxp.com> Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20211027080849.3276289-1-geert@linux-m68k.org Cc: stable <stable@vger.kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Neal Liu 提交于
stable inclusion from stable-5.10.79 commit 94e5305a381658064949e2f427f4a7591f4194aa bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=94e5305a381658064949e2f427f4a7591f4194aa -------------------------------- commit 7f2d7378 upstream. For Aspeed, HCHalted status depends on not only Run/Stop but also ASS/PSS status. Handshake CMD_RUN on startup instead. Tested-by: NTao Ren <rentao.bupt@gmail.com> Reviewed-by: NTao Ren <rentao.bupt@gmail.com> Acked-by: NAlan Stern <stern@rowland.harvard.edu> Signed-off-by: NNeal Liu <neal_liu@aspeedtech.com> Link: https://lore.kernel.org/r/20210910073619.26095-1-neal_liu@aspeedtech.com Cc: Joel Stanley <joel@jms.id.au> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Juergen Gross 提交于
stable inclusion from stable-5.10.79 commit a8db6fd04d58b92d0698ee56e42134f26b236a90 bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a8db6fd04d58b92d0698ee56e42134f26b236a90 -------------------------------- commit 1e254d0d upstream. This reverts commit 76b4f357. The commit has the wrong reasoning, as KVM_MAX_VCPU_ID is not defining the maximum allowed vcpu-id as its name suggests, but the number of vcpu-ids. So revert this patch again. Suggested-by: NEduardo Habkost <ehabkost@redhat.com> Signed-off-by: NJuergen Gross <jgross@suse.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Message-Id: <20210913135745.13944-2-jgross@suse.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Paolo Bonzini 提交于
stable inclusion from stable-5.10.79 commit ecf58653f1e4ab88b4eb62db8fe799826d99d5ec bugzilla: 185793 https://gitee.com/openeuler/kernel/issues/I4K65C Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ecf58653f1e4ab88b4eb62db8fe799826d99d5ec -------------------------------- commit 3d5e7a28 upstream. This is a new warning in clang top-of-tree (will be clang 14): In file included from arch/x86/kvm/mmu/mmu.c:27: arch/x86/kvm/mmu/spte.h:318:9: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical] return __is_bad_mt_xwr(rsvd_check, spte) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ || arch/x86/kvm/mmu/spte.h:318:9: note: cast one or both operands to int to silence this warning The code is fine, but change it anyway to shut up this clever clogs of a compiler. Reported-by: torvic9@mailbox.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> [nathan: Backport to 5.10, which does not have 961f8445] Signed-off-by: NNathan Chancellor <nathan@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NBin Li <huawei.libin@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kevin Locke 提交于
mainline inclusion from mainline-v5.11-rc1 commit 0a8d0b64 category: bugfix bugzilla: 185806 https://gitee.com/openeuler/kernel/issues/I4DDEL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0a8d0b64dd6acfbc9e9b79022654bbe1ade4a29a ------------------------------------------------- When the lower file of a metacopy is inaccessible, -EIO is returned. For users not familiar with overlayfs internals, such as myself, the meaning of this error may not be apparent or easy to determine, since the (metacopy) file is present and open/stat succeed when accessed outside of the overlay. Add a rate-limited warning for orphan metacopy to give users a hint when investigating such errors. Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxi23Zsmfb4rCed1n=On0NNA5KZD74jjjeyz+et32sk-gg@mail.gmail.com/Signed-off-by: NKevin Locke <kevin@kevinlocke.name> Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Signed-off-by: NZheng Liang <zhengliang6@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jan Kara 提交于
mainline inclusion from mainline-v5.15 commit b2bbb92f category: bugfix bugzilla: 179956 https://gitee.com/openeuler/kernel/issues/I4DDEL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b2bbb92f7042e8075fb036bf97043339576330c3 ------------------------------------------------- Commit 81414b4d ("ext4: remove redundant sb checksum recomputation") removed checksum recalculation after updating superblock free space / inode counters in ext4_fill_super() based on the fact that we will recalculate the checksum on superblock writeout. That is correct assumption but until the writeout happens (which can take a long time) the checksum is incorrect in the buffer cache and if programs such as tune2fs or resize2fs is called shortly after a file system is mounted can fail. So return back the checksum recalculation and add a comment explaining why. Fixes: 81414b4d ("ext4: remove redundant sb checksum recomputation") Cc: stable@kernel.org Reported-by: NBoyang Xue <bxue@redhat.com> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210812124737.21981-1-jack@suse.czSigned-off-by: NZheng Liang <zhengliang6@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lakshmi Ramasubramanian 提交于
mainline inclusion from mainline-5.14 commit: c6791349 category: bugfix bugzilla: 182971 https://gitee.com/openeuler/kernel/issues/I4DDEL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c67913492fec317bc53ffdff496b6ba856d2868c --------------------------- The function prototype for ima_add_kexec_buffer() is present in 'linux/ima.h'. But this header file is not included in ima_kexec.c where the function is implemented. This results in the following compiler warning when "-Wmissing-prototypes" flag is turned on: security/integrity/ima/ima_kexec.c:81:6: warning: no previous prototype for function 'ima_add_kexec_buffer' [-Wmissing-prototypes] Include the header file 'linux/ima.h' in ima_kexec.c to fix the compiler warning. Fixes: dce92f6b (arm64: Enable passing IMA log to next kernel on kexec) Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NLakshmi Ramasubramanian <nramas@linux.microsoft.com> Acked-by: NRob Herring <robh@kernel.org> Signed-off-by: NMimi Zohar <zohar@linux.ibm.com> Signed-off-by: NGuo Zihua <guozihua@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-