1. 08 Dec, 2022 18 commits
    • net, neigh: Fix null-ptr-deref in neigh_table_clear() · d31519a4
      Committed by Chen Zhongjin
      stable inclusion
      from stable-v4.19.265
      commit b736592de2aa53aee2d48d6b129bc0c892007bbe
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit f8017317 ]
      
      When the IPv6 module is being initialized and hits an error partway
      through, the kernel panics with:
      
      KASAN: null-ptr-deref in range [0x0000000000000598-0x000000000000059f]
      CPU: 1 PID: 361 Comm: insmod
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
      RIP: 0010:__neigh_ifdown.isra.0+0x24b/0x370
      RSP: 0018:ffff888012677908 EFLAGS: 00000202
      ...
      Call Trace:
       <TASK>
       neigh_table_clear+0x94/0x2d0
       ndisc_cleanup+0x27/0x40 [ipv6]
       inet6_init+0x21c/0x2cb [ipv6]
       do_one_initcall+0xd3/0x4d0
       do_init_module+0x1ae/0x670
      ...
      Kernel panic - not syncing: Fatal exception
      
      When IPv6 initialization fails, it tries to clean up and calls:
      
      neigh_table_clear()
        neigh_ifdown(tbl, NULL)
          pneigh_queue_purge(&tbl->proxy_queue, dev_net(dev == NULL))
          # dev_net(NULL) triggers null-ptr-deref.
      
      Fix it by passing NULL to pneigh_queue_purge() in neigh_ifdown() if dev
      is NULL, to make kernel not panic immediately.
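
The guard described above can be sketched in user-space C. This is only an illustration: `struct net`, `struct net_device`, `dev_net_sketch()`, `purge_scope()` and `count_purged()` are simplified, hypothetical stand-ins and not the kernel's definitions. The sketch assumes a NULL scope means "purge entries from every namespace".

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
struct net { int id; };
struct net_device { struct net *nd_net; };

/* Like dev_net(): dereferences dev, so dev must not be NULL. */
static struct net *dev_net_sketch(const struct net_device *dev)
{
    return dev->nd_net;
}

/* Look up the namespace only when a device was actually given.
 * NULL is returned as "no namespace filter" (assumption of this sketch). */
static struct net *purge_scope(const struct net_device *dev)
{
    return dev ? dev_net_sketch(dev) : NULL;
}

/* Count how many queued entries a purge with the given scope would drop. */
static int count_purged(struct net *const *entries, int n,
                        const struct net *scope)
{
    int purged = 0;

    for (int i = 0; i < n; i++)
        if (scope == NULL || entries[i] == scope)
            purged++;
    return purged;
}
```

With a NULL device, `purge_scope()` never touches the device pointer, which is the property the fix relies on.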
      
      Fixes: 66ba215c ("neigh: fix possible DoS due to net iface start/stop loop")
      Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Denis V. Lunev <den@openvz.org>
      Link: https://lore.kernel.org/r/20221101121552.21890-1-chenzhongjin@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • tcp: fix indefinite deferral of RTO with SACK reneging · e6d23a4b
      Committed by Neal Cardwell
      stable inclusion
      from stable-v4.19.264
      commit 633da7b30b240000b1d9b690e43848406a0d060f
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 3d2af9cc ]
      
      This commit fixes a bug that can cause a TCP data sender to repeatedly
      defer RTOs when encountering SACK reneging.
      
      The bug is that when we're in fast recovery in a scenario with SACK
      reneging, every time we get an ACK we call tcp_check_sack_reneging()
      and it can note the apparent SACK reneging and rearm the RTO timer for
      srtt/2 into the future. In some SACK reneging scenarios that can
      happen repeatedly until the receive window fills up, at which point
      the sender can't send any more, the ACKs stop arriving, and the RTO
      fires at srtt/2 after the last ACK. But that can take far too long
      (O(10 secs)), since the connection is stuck in fast recovery with a
      low cwnd that cannot grow beyond ssthresh, even if more bandwidth is
      available.
      
      This fix changes the logic in tcp_check_sack_reneging() to only rearm
      the RTO timer if data is cumulatively ACKed, indicating forward
      progress. This avoids this kind of nearly infinite loop of RTO timer
      re-arming. In addition, this meets the goals of
      tcp_check_sack_reneging() in handling Windows TCP behavior that looks
      temporarily like SACK reneging but is not really.
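
The decision rule described above (re-arm the RTO timer only when the ACK also shows forward progress) can be sketched as a small predicate. The flag names and values below are hypothetical and chosen for this sketch only; they are not the kernel's constants.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; the kernel's real values differ. */
enum {
    SKETCH_SACK_RENEGING    = 1 << 0, /* ACK looks like the receiver reneged */
    SKETCH_SND_UNA_ADVANCED = 1 << 1  /* ACK cumulatively acked new data */
};

/* Re-arm only when apparent reneging coincides with forward progress. */
static bool should_rearm_rto(int ack_flags)
{
    return (ack_flags & SKETCH_SACK_RENEGING) &&
           (ack_flags & SKETCH_SND_UNA_ADVANCED);
}
```

Under this rule, a stream of ACKs that signal reneging but acknowledge nothing new no longer pushes the timer further into the future.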
      
      Many thanks to Jakub Kicinski and Neil Spring, who reported this issue
      and provided critical packet traces that enabled root-causing this
      issue. Also, many thanks to Jakub Kicinski for testing this fix.
      
      Fixes: 5ae344c9 ("tcp: reduce spurious retransmits due to transient SACK reneging")
      Reported-by: Jakub Kicinski <kuba@kernel.org>
      Reported-by: Neil Spring <ntspring@fb.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Tested-by: Jakub Kicinski <kuba@kernel.org>
      Link: https://lore.kernel.org/r/20221021170821.1093930-1-ncardwell.kernel@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • net: fix UAF issue in nfqnl_nf_hook_drop() when ops_init() failed · c50c8779
      Committed by Zhengchao Shao
      stable inclusion
      from stable-v4.19.264
      commit 5a2ea549be94924364f6911227d99be86e8cf34a
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit d266935a ]
      
      When the ops_init() interface is invoked to initialize the net, but
      ops->init() fails, data is released. However, the ptr pointer in
      net->gen is invalid. In this case, when nfqnl_nf_hook_drop() is invoked
      to release the net, invalid address access occurs.
      
      The process is as follows:
      setup_net()
      	ops_init()
      		data = kzalloc(...)   ---> alloc "data"
      		net_assign_generic()  ---> assign "data" to ptr in net->gen
      		...
      		ops->init()           ---> failed
      		...
      		kfree(data);          ---> ptr in net->gen is invalid
      	...
      	ops_exit_list()
      		...
      		nfqnl_nf_hook_drop()
      			*q = nfnl_queue_pernet(net) ---> q is invalid
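
One way to avoid the stale pointer in the sequence above is to unpublish the per-namespace slot before freeing the data on the error path. The following user-space sketch illustrates that ordering only; `struct net_sketch`, `ops_init_sketch()` and `failing_init()` are hypothetical names, and `calloc()`/`free()` stand in for the kernel allocator.

```c
#include <stdlib.h>

#define SKETCH_SLOTS 4

/* Hypothetical stand-in for the generic pointer array in net->gen. */
struct net_sketch { void *gen[SKETCH_SLOTS]; };

typedef int (*init_fn)(struct net_sketch *net);

static int ops_init_sketch(struct net_sketch *net, int id, size_t size,
                           init_fn init)
{
    void *data = calloc(1, size);
    int err;

    if (!data)
        return -1;
    net->gen[id] = data;      /* publish the per-namespace data */

    err = init ? init(net) : 0;
    if (!err)
        return 0;

    net->gen[id] = NULL;      /* unpublish first ... */
    free(data);               /* ... then free, so no stale pointer is left */
    return err;
}

/* An init callback that always fails, used to exercise the error path. */
static int failing_init(struct net_sketch *net)
{
    (void)net;
    return -5;
}
```

After a failed init, a later lookup of the slot sees NULL instead of a freed address.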
      
      The following is the Call Trace information:
      BUG: KASAN: use-after-free in nfqnl_nf_hook_drop+0x264/0x280
      Read of size 8 at addr ffff88810396b240 by task ip/15855
      Call Trace:
      <TASK>
      dump_stack_lvl+0x8e/0xd1
      print_report+0x155/0x454
      kasan_report+0xba/0x1f0
      nfqnl_nf_hook_drop+0x264/0x280
      nf_queue_nf_hook_drop+0x8b/0x1b0
      __nf_unregister_net_hook+0x1ae/0x5a0
      nf_unregister_net_hooks+0xde/0x130
      ops_exit_list+0xb0/0x170
      setup_net+0x7ac/0xbd0
      copy_net_ns+0x2e6/0x6b0
      create_new_namespaces+0x382/0xa50
      unshare_nsproxy_namespaces+0xa6/0x1c0
      ksys_unshare+0x3a4/0x7e0
      __x64_sys_unshare+0x2d/0x40
      do_syscall_64+0x35/0x80
      entry_SYSCALL_64_after_hwframe+0x46/0xb0
      </TASK>
      
      Allocated by task 15855:
      kasan_save_stack+0x1e/0x40
      kasan_set_track+0x21/0x30
      __kasan_kmalloc+0xa1/0xb0
      __kmalloc+0x49/0xb0
      ops_init+0xe7/0x410
      setup_net+0x5aa/0xbd0
      copy_net_ns+0x2e6/0x6b0
      create_new_namespaces+0x382/0xa50
      unshare_nsproxy_namespaces+0xa6/0x1c0
      ksys_unshare+0x3a4/0x7e0
      __x64_sys_unshare+0x2d/0x40
      do_syscall_64+0x35/0x80
      entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      Freed by task 15855:
      kasan_save_stack+0x1e/0x40
      kasan_set_track+0x21/0x30
      kasan_save_free_info+0x2a/0x40
      ____kasan_slab_free+0x155/0x1b0
      slab_free_freelist_hook+0x11b/0x220
      __kmem_cache_free+0xa4/0x360
      ops_init+0xb9/0x410
      setup_net+0x5aa/0xbd0
      copy_net_ns+0x2e6/0x6b0
      create_new_namespaces+0x382/0xa50
      unshare_nsproxy_namespaces+0xa6/0x1c0
      ksys_unshare+0x3a4/0x7e0
      __x64_sys_unshare+0x2d/0x40
      do_syscall_64+0x35/0x80
      entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      Fixes: f875bae0 ("net: Automatically allocate per namespace data.")
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • serial: 8250: Flush DMA Rx on RLSI · 176d2de6
      Committed by Ilpo Järvinen
      stable inclusion
      from stable-v4.19.267
      commit 40f5fa26c11bca5c947d218cc4fe6e0c64932070
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit 1980860e upstream.
      
      Returning true from handle_rx_dma() without flushing DMA first creates
      a data ordering hazard. If DMA Rx has handled any character at the
      point when RLSI occurs, the non-DMA path handles any pending characters
      jumping them ahead of those characters that are pending under DMA.
      
      Fixes: 75df022b ("serial: 8250_dma: Fix RX handling")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Link: https://lore.kernel.org/r/20221108121952.5497-5-ilpo.jarvinen@linux.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • serial: 8250: Fall back to non-DMA Rx if IIR_RDI occurs · 80703cbe
      Committed by Ilpo Järvinen
      stable inclusion
      from stable-v4.19.267
      commit 62cda857457c7de0922852d54d69b140bd6eeb7e
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit a931237c upstream.
      
      DW UART sometimes triggers IIR_RDI during DMA Rx when IIR_RX_TIMEOUT
      should have been triggered instead. Since IIR_RDI has higher priority
      than IIR_RX_TIMEOUT, this causes the Rx to hang into interrupt loop.
      The problem seems to occur at least with some combinations of
      small-sized transfers (I've reproduced the problem on Elkhart Lake PSE
      UARTs).
      
      If there's already an on-going Rx DMA and IIR_RDI triggers, fall
      graciously back to non-DMA Rx. That is, behave as if IIR_RX_TIMEOUT had
      occurred.
      
      8250_omap already treats IIR_RDI in a similar way, so this is
      nothing unheard of.
      
      Fixes: 75df022b ("serial: 8250_dma: Fix RX handling")
      Cc: <stable@vger.kernel.org>
      Co-developed-by: Srikanth Thokala <srikanth.thokala@intel.com>
      Signed-off-by: Srikanth Thokala <srikanth.thokala@intel.com>
      Co-developed-by: Aman Kumar <aman.kumar@intel.com>
      Signed-off-by: Aman Kumar <aman.kumar@intel.com>
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Link: https://lore.kernel.org/r/20221108121952.5497-2-ilpo.jarvinen@linux.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • capabilities: fix potential memleak on error path from vfs_getxattr_alloc() · 779c11cb
      Committed by Gaosheng Cui
      stable inclusion
      from stable-v4.19.265
      commit 90577bcc01c4188416a47269f8433f70502abe98
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit 8cf0a1bc upstream.
      
      In cap_inode_getsecurity(), vfs_getxattr_alloc() allocates the
      memory for tmpbuf. If that allocation succeeds but the subsequent
      call to handler->get(...) fails, the memory leaks in the logic
      below:
      
        |-- ret = (int)vfs_getxattr_alloc(mnt_userns, ...)
          |           /* ^^^ alloc for tmpbuf */
          |-- value = krealloc(*xattr_value, error + 1, flags)
          |           /* ^^^ alloc memory */
          |-- error = handler->get(handler, ...)
          |           /* error! */
          |-- *xattr_value = value
          |           /* xattr_value is &tmpbuf (memory leak!) */
      
      So we will try to free(tmpbuf) after vfs_getxattr_alloc() fails to fix it.
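
The leak pattern and its remedy can be sketched in user-space C. The helper below is hypothetical: it only mimics a function that may allocate the output buffer and still return an error afterwards. The counting allocator wrappers exist solely so the sketch can show that nothing stays allocated.

```c
#include <stdlib.h>

static int live_allocs;   /* number of buffers currently allocated */

static void *sk_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_allocs++;
    return p;
}

static void sk_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Hypothetical helper: may allocate *out and then still report an error. */
static int getxattr_alloc_sketch(char **out, int fail_after_alloc)
{
    *out = sk_alloc(16);
    if (!*out)
        return -12;
    if (fail_after_alloc)
        return -5;        /* buffer stays allocated; the caller owns it */
    return 16;
}

/* The caller frees tmpbuf on every path, including the error path. */
static int get_security_sketch(int fail_after_alloc)
{
    char *tmpbuf = NULL;
    int ret = getxattr_alloc_sketch(&tmpbuf, fail_after_alloc);

    /* ... on success the buffer would be used here ... */
    sk_free(tmpbuf);
    return ret;
}
```

Routing both the success and the failure case through the same free is what keeps the error path from leaking.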
      
      Cc: stable@vger.kernel.org
      Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities")
      Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      [PM: subject line and backtrace tweaks]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • security: commoncap: fix -Wstringop-overread warning · 7e2d8bfc
      Committed by Arnd Bergmann
      stable inclusion
      from stable-v4.19.191
      commit 2f34dd12fd7a28888286924d74c0313532bc52d8
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit 82e5d8cc upstream.
      
      gcc-11 introduces a harmless warning for cap_inode_getsecurity:
      
      security/commoncap.c: In function ‘cap_inode_getsecurity’:
      security/commoncap.c:440:33: error: ‘memcpy’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread]
        440 |                                 memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32);
            |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      The problem here is that tmpbuf is initialized to NULL, so gcc assumes
      it is not accessible unless it gets set by vfs_getxattr_alloc().  This is
      a legitimate warning as far as I can tell, but the code is correct since
      it correctly handles the error when that function fails.
      
      Add a separate NULL check to tell gcc about it as well.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: James Morris <jamorris@linux.microsoft.com>
      Cc: Andrey Zhizhikin <andrey.z@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • ring_buffer: Do not deactivate non-existant pages · ee1cf85b
      Committed by Daniil Tatianin
      stable inclusion
      from stable-v4.19.267
      commit 455ea324770205525cbc13f155806a5346794339
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit 56f4ca0a upstream.
      
      rb_head_page_deactivate() expects cpu_buffer to contain a valid list of
      ->pages, so verify that the list is actually present before calling it.
      
      Found by Linux Verification Center (linuxtesting.org) with the SVACE
      static analysis tool.
      
      Link: https://lkml.kernel.org/r/20221114143129.3534443-1-d-tatianin@yandex-team.ru
      
      Cc: stable@vger.kernel.org
      Fixes: 77ae365e ("ring-buffer: make lockless")
      Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • ftrace: Fix null pointer dereference in ftrace_add_mod() · 33ffa638
      Committed by Xiu Jianfeng
      stable inclusion
      from stable-v4.19.267
      commit b5bfc61f541d3f092b13dedcfe000d86eb8e133c
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit 19ba6c8a upstream.
      
      The @ftrace_mod is allocated by kzalloc(), so both the members {prev,next}
      of @ftrace_mod->list are NULL, it's not a valid state to call list_del().
      If kstrdup() for @ftrace_mod->{func|module} fails, it goes to @out_free
      tag and calls free_ftrace_mod() to destroy @ftrace_mod, then list_del()
      will write prev->next and next->prev, where null pointer dereference
      happens.
      
      BUG: kernel NULL pointer dereference, address: 0000000000000008
      Oops: 0002 [#1] PREEMPT SMP NOPTI
      Call Trace:
       <TASK>
       ftrace_mod_callback+0x20d/0x220
       ? do_filp_open+0xd9/0x140
       ftrace_process_regex.isra.51+0xbf/0x130
       ftrace_regex_write.isra.52.part.53+0x6e/0x90
       vfs_write+0xee/0x3a0
       ? __audit_filter_op+0xb1/0x100
       ? auditd_test_task+0x38/0x50
       ksys_write+0xa5/0xe0
       do_syscall_64+0x3a/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      Kernel panic - not syncing: Fatal exception
      
      So call INIT_LIST_HEAD() to initialize the list member to fix this issue.
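
Why a zeroed list member is unsafe to delete, and why initializing it helps, can be shown with a minimal user-space list. The names below (`list_head_sketch`, `init_list_head_sketch()`, `list_del_sketch()`, `is_safe_to_del()`) are hypothetical re-implementations for illustration and omit details such as the kernel's pointer poisoning.

```c
#include <stddef.h>

struct list_head_sketch { struct list_head_sketch *next, *prev; };

/* An initialized, empty list entry points at itself in both directions. */
static void init_list_head_sketch(struct list_head_sketch *h)
{
    h->next = h;
    h->prev = h;
}

/* Unlinking writes through both neighbours, so neither may be NULL. */
static void list_del_sketch(struct list_head_sketch *e)
{
    e->next->prev = e->prev;
    e->prev->next = e->next;
}

/* A zero-filled entry (as after kzalloc()) fails this check. */
static int is_safe_to_del(const struct list_head_sketch *e)
{
    return e->next != NULL && e->prev != NULL;
}
```

On a self-pointing entry, the two writes in `list_del_sketch()` land on the entry itself, so an error path can delete it even though it was never added to a list.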
      
      Link: https://lkml.kernel.org/r/20221116015207.30858-1-xiujianfeng@huawei.com
      
      Cc: stable@vger.kernel.org
      Fixes: 673feb9d ("ftrace: Add :mod: caching infrastructure to trace_array")
      Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • ftrace: Optimize the allocation for mcount entries · 3d193fb1
      Committed by Wang Wensheng
      stable inclusion
      from stable-v4.19.267
      commit d110bb57a7e9831465aa3abb6c0d1cc658b05fbe
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit bcea02b0 upstream.
      
      If we can't allocate this size, try something smaller with half of the
      size. Its order should be decreased by one instead of divided by two.
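
The arithmetic behind this can be checked with a tiny sketch, assuming the usual convention that an allocation of order n covers 2^n pages. The function names are hypothetical.

```c
/* Number of pages covered by an allocation of the given order. */
static unsigned long pages_for_order(unsigned int order)
{
    return 1UL << order;
}

/* Halving the size means lowering the order by one (order must be > 0). */
static unsigned int order_after_decrement(unsigned int order)
{
    return order - 1;
}

/* Halving the order itself shrinks the size far more than intended. */
static unsigned int order_after_shift(unsigned int order)
{
    return order >> 1;
}
```

Starting from order 5 (32 pages), decrementing gives order 4 (16 pages, exactly half), while shifting gives order 2 (4 pages, one eighth).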
      
      Link: https://lkml.kernel.org/r/20221109094434.84046-3-wangwensheng4@huawei.com
      
      Cc: <mhiramat@kernel.org>
      Cc: <mark.rutland@arm.com>
      Cc: stable@vger.kernel.org
      Fixes: a7900875 ("ftrace: Allocate the mcount record pages as groups")
      Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • kprobe: reverse kp->flags when arm_kprobe failed · 04150923
      Committed by Li Qiang
      stable inclusion
      from stable-v4.19.265
      commit d608ed66abfaccc233404be2583ab89c37e560fc
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit 4a6f316d upstream.
      
      In the aggregate kprobe case, when arm_kprobe() fails,
      we need to set KPROBE_FLAG_DISABLED in kp->flags again.
      Otherwise the 'kp' kprobe will be considered enabled
      although it is not actually enabled.
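
The flag handling can be sketched as follows. Everything here is a simplified, hypothetical model: `struct kprobe_sketch`, `SKETCH_FLAG_DISABLED` and the arm callbacks are not the kernel's definitions, and locking and other state are left out.

```c
#define SKETCH_FLAG_DISABLED 0x2u

struct kprobe_sketch { unsigned int flags; };

typedef int (*arm_fn)(struct kprobe_sketch *p);

/* Enable kp, which may be a child of the aggregate probe aggr.
 * If arming fails, both probes are marked disabled again. */
static int enable_sketch(struct kprobe_sketch *kp, struct kprobe_sketch *aggr,
                         arm_fn arm)
{
    int ret;

    kp->flags &= ~SKETCH_FLAG_DISABLED;
    aggr->flags &= ~SKETCH_FLAG_DISABLED;

    ret = arm(aggr);
    if (ret) {
        aggr->flags |= SKETCH_FLAG_DISABLED;
        if (aggr != kp)
            kp->flags |= SKETCH_FLAG_DISABLED;   /* roll back the child too */
    }
    return ret;
}

static int arm_fails(struct kprobe_sketch *p) { (void)p; return -22; }
static int arm_ok(struct kprobe_sketch *p) { (void)p; return 0; }
```

Without the rollback of the child flag, the child would report itself as enabled while the aggregate probe was never armed.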
      
      Link: https://lore.kernel.org/all/20220902155820.34755-1-liq3ea@163.com/
      
      Fixes: 12310e34 ("kprobes: Propagate error from arm_kprobe_ftrace()")
      Cc: stable@vger.kernel.org
      Signed-off-by: Li Qiang <liq3ea@163.com>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • mm: fs: initialize fsdata passed to write_begin/write_end interface · c01f46a9
      Committed by Alexander Potapenko
      stable inclusion
      from stable-v4.19.267
      commit 8a5be2948f350d34b1f6acb9ca3be4c89359a057
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit 1468c6f4 upstream.
      
      Functions implementing the a_ops->write_end() interface accept the `void
      *fsdata` parameter that is supposed to be initialized by the corresponding
      a_ops->write_begin() (which accepts `void **fsdata`).
      
      However not all a_ops->write_begin() implementations initialize `fsdata`
      unconditionally, so it may get passed uninitialized to a_ops->write_end(),
      resulting in undefined behavior.
      
      Fix this by initializing fsdata with NULL before the call to
      write_begin(), rather than doing so in all possible a_ops implementations.
      
      This patch covers only the following cases found by running x86 KMSAN
      under syzkaller:
      
       - generic_perform_write()
       - cont_expand_zero() and generic_cont_expand_simple()
       - page_symlink()
      
      Other cases of passing uninitialized fsdata may persist in the codebase.
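
The caller-side initialization can be illustrated with a small sketch. The callback type and function names are hypothetical; the point is only that the value handed on to the end step is defined even when the begin step never writes it.

```c
#include <stddef.h>

typedef int (*write_begin_fn)(void **fsdata);

/* A begin callback that never touches fsdata. */
static int begin_without_fsdata(void **fsdata)
{
    (void)fsdata;
    return 0;
}

static int token;   /* arbitrary object to point fsdata at */

/* A begin callback that does set fsdata. */
static int begin_with_fsdata(void **fsdata)
{
    *fsdata = &token;
    return 0;
}

/* Returns the value that the matching end callback would receive. */
static void *perform_write_sketch(write_begin_fn begin)
{
    void *fsdata = NULL;   /* defined even if begin() leaves it alone */

    if (begin(&fsdata) != 0)
        return NULL;
    return fsdata;
}
```

Initializing once in the caller covers every callback implementation, which matches the approach the commit message describes.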
      
      Link: https://lkml.kernel.org/r/20220915150417.722975-43-glider@google.com
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrey Konovalov <andreyknvl@gmail.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Eric Biggers <ebiggers@google.com>
      Cc: Eric Biggers <ebiggers@kernel.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Ilya Leoshkevich <iii@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Marco Elver <elver@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • nfs4: Fix kmemleak when allocate slot failed · 920f74ac
      Committed by Zhang Xiaoxu
      stable inclusion
      from stable-v4.19.265
      commit 86ce0e93cf6fb4d0c447323ac66577c642628b9d
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 7e843672 ]
      
      If allocating one of the slots fails, all previously allocated
      slots must be cleaned up; otherwise, they will leak:
      
        unreferenced object 0xffff8881115aa100 (size 64):
          comm "mount.nfs", pid 679, jiffies 4294744957 (age 115.037s)
          hex dump (first 32 bytes):
            00 cc 19 73 81 88 ff ff 00 a0 5a 11 81 88 ff ff  ...s......Z.....
            00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          backtrace:
            [<000000007a4c434a>] nfs4_find_or_create_slot+0x8e/0x130
            [<000000005472a39c>] nfs4_realloc_slot_table+0x23f/0x270
            [<00000000cd8ca0eb>] nfs40_init_client+0x4a/0x90
            [<00000000128486db>] nfs4_init_client+0xce/0x270
            [<000000008d2cacad>] nfs4_set_client+0x1a2/0x2b0
            [<000000000e593b52>] nfs4_create_server+0x300/0x5f0
            [<00000000e4425dd2>] nfs4_try_get_tree+0x65/0x110
            [<00000000d3a6176f>] vfs_get_tree+0x41/0xf0
            [<0000000016b5ad4c>] path_mount+0x9b3/0xdd0
            [<00000000494cae71>] __x64_sys_mount+0x190/0x1d0
            [<000000005d56bdec>] do_syscall_64+0x35/0x80
            [<00000000687c9ae4>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
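
The cleanup-on-partial-failure pattern can be sketched in user-space C. This is a generic illustration, not the NFS code: `struct slot_sketch`, `alloc_slots()` and `free_slots()` are hypothetical, and the `fail_at` parameter simulates an allocation failure at a chosen index.

```c
#include <stdlib.h>

struct slot_sketch { struct slot_sketch *next; };

static int live_slots;   /* slots currently allocated */

static void free_slots(struct slot_sketch *head)
{
    while (head) {
        struct slot_sketch *next = head->next;

        free(head);
        live_slots--;
        head = next;
    }
}

/* Allocate `want` slots; pretend the allocation at index fail_at fails. */
static struct slot_sketch *alloc_slots(int want, int fail_at)
{
    struct slot_sketch *head = NULL;

    for (int i = 0; i < want; i++) {
        struct slot_sketch *s =
            (i == fail_at) ? NULL : calloc(1, sizeof(*s));

        if (!s) {
            free_slots(head);   /* release everything allocated so far */
            return NULL;
        }
        live_slots++;
        s->next = head;
        head = s;
    }
    return head;
}
```

Passing a negative `fail_at` exercises the success path; any index inside the range exercises the cleanup.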
      
      Fixes: abf79bb3 ("NFS: Add a slot table to struct nfs_client for NFSv4.0 transport blocking")
      Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • kernfs: fix use-after-free in __kernfs_remove · a1a691b4
      Committed by Christian A. Ehrhardt
      stable inclusion
      from stable-v4.19.264
      commit 028cf780743eea79abffa7206b9dcfc080ad3546
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit 4abc9965 upstream.
      
      Syzkaller managed to trigger concurrent calls to
      kernfs_remove_by_name_ns() for the same file resulting in
      a KASAN detected use-after-free. The race occurs when the root
      node is freed during kernfs_drain().
      
      To prevent this acquire an additional reference for the root
      of the tree that is removed before calling __kernfs_remove().
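
The effect of holding an extra reference across the removal can be modelled with a toy reference count. The types and functions below are hypothetical and single-threaded; they only show that the node stays valid until the caller drops its own reference.

```c
struct node_sketch {
    int refcount;
    int freed;      /* set when the last reference is dropped */
};

static void node_get(struct node_sketch *n)
{
    n->refcount++;
}

static void node_put(struct node_sketch *n)
{
    if (--n->refcount == 0)
        n->freed = 1;   /* stands in for actually freeing the node */
}

/* The removal drops the tree's own reference to the node. */
static void remove_sketch(struct node_sketch *n)
{
    node_put(n);
}

/* Returns 1 if the node was still valid right after the removal. */
static int remove_by_name_sketch(struct node_sketch *n)
{
    int alive_after_remove;

    node_get(n);                    /* extra reference for the caller */
    remove_sketch(n);
    alive_after_remove = !n->freed; /* safe: our reference keeps it alive */
    node_put(n);                    /* now the node may go away */
    return alive_after_remove;
}
```

Without the `node_get()`/`node_put()` pair, the removal itself could drop the last reference while the caller is still walking the node.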
      
      Found by syzkaller with the following reproducer (slab_nomerge is
      required):
      
      syz_mount_image$ext4(0x0, &(0x7f0000000100)='./file0\x00', 0x100000, 0x0, 0x0, 0x0, 0x0)
      r0 = openat(0xffffffffffffff9c, &(0x7f0000000080)='/proc/self/exe\x00', 0x0, 0x0)
      close(r0)
      pipe2(&(0x7f0000000140)={0xffffffffffffffff, <r1=>0xffffffffffffffff}, 0x800)
      mount$9p_fd(0x0, &(0x7f0000000040)='./file0\x00', &(0x7f00000000c0), 0x408, &(0x7f0000000280)={'trans=fd,', {'rfdno', 0x3d, r0}, 0x2c, {'wfdno', 0x3d, r1}, 0x2c, {[{@cache_loose}, {@mmap}, {@loose}, {@loose}, {@mmap}], [{@mask={'mask', 0x3d, '^MAY_EXEC'}}, {@fsmagic={'fsmagic', 0x3d, 0x10001}}, {@dont_hash}]}})
      
      Sample report:
      
      ==================================================================
      BUG: KASAN: use-after-free in kernfs_type include/linux/kernfs.h:335 [inline]
      BUG: KASAN: use-after-free in kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
      BUG: KASAN: use-after-free in __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
      Read of size 2 at addr ffff8880088807f0 by task syz-executor.2/857
      
      CPU: 0 PID: 857 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3 #5
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106
       print_address_description mm/kasan/report.c:317 [inline]
       print_report.cold+0x5e/0x5e5 mm/kasan/report.c:433
       kasan_report+0xa3/0x130 mm/kasan/report.c:495
       kernfs_type include/linux/kernfs.h:335 [inline]
       kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
       __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
       __kernfs_remove fs/kernfs/dir.c:1356 [inline]
       kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
       sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
       __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
       create_cache mm/slab_common.c:229 [inline]
       kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
       p9_client_create+0xd4d/0x1190 net/9p/client.c:993
       v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
       v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
       legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
       vfs_get_tree+0x85/0x2e0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x675/0x1d00 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f725f983aed
      Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f725f0f7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
      RAX: ffffffffffffffda RBX: 00007f725faa3f80 RCX: 00007f725f983aed
      RDX: 00000000200000c0 RSI: 0000000020000040 RDI: 0000000000000000
      RBP: 00007f725f9f419c R08: 0000000020000280 R09: 0000000000000000
      R10: 0000000000000408 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000006 R14: 00007f725faa3f80 R15: 00007f725f0d7000
       </TASK>
      
      Allocated by task 855:
       kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
       kasan_set_track mm/kasan/common.c:45 [inline]
       set_alloc_info mm/kasan/common.c:437 [inline]
       __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:470
       kasan_slab_alloc include/linux/kasan.h:224 [inline]
       slab_post_alloc_hook mm/slab.h:727 [inline]
       slab_alloc_node mm/slub.c:3243 [inline]
       slab_alloc mm/slub.c:3251 [inline]
       __kmem_cache_alloc_lru mm/slub.c:3258 [inline]
       kmem_cache_alloc+0xbf/0x200 mm/slub.c:3268
       kmem_cache_zalloc include/linux/slab.h:723 [inline]
       __kernfs_new_node+0xd4/0x680 fs/kernfs/dir.c:593
       kernfs_new_node fs/kernfs/dir.c:655 [inline]
       kernfs_create_dir_ns+0x9c/0x220 fs/kernfs/dir.c:1010
       sysfs_create_dir_ns+0x127/0x290 fs/sysfs/dir.c:59
       create_dir lib/kobject.c:63 [inline]
       kobject_add_internal+0x24a/0x8d0 lib/kobject.c:223
       kobject_add_varg lib/kobject.c:358 [inline]
       kobject_init_and_add+0x101/0x160 lib/kobject.c:441
       sysfs_slab_add+0x156/0x1e0 mm/slub.c:5954
       __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
       create_cache mm/slab_common.c:229 [inline]
       kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
       p9_client_create+0xd4d/0x1190 net/9p/client.c:993
       v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
       v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
       legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
       vfs_get_tree+0x85/0x2e0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x675/0x1d00 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Freed by task 857:
       kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
       kasan_set_track+0x21/0x30 mm/kasan/common.c:45
       kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:370
       ____kasan_slab_free mm/kasan/common.c:367 [inline]
       ____kasan_slab_free mm/kasan/common.c:329 [inline]
       __kasan_slab_free+0x108/0x190 mm/kasan/common.c:375
       kasan_slab_free include/linux/kasan.h:200 [inline]
       slab_free_hook mm/slub.c:1754 [inline]
       slab_free_freelist_hook mm/slub.c:1780 [inline]
       slab_free mm/slub.c:3534 [inline]
       kmem_cache_free+0x9c/0x340 mm/slub.c:3551
       kernfs_put.part.0+0x2b2/0x520 fs/kernfs/dir.c:547
       kernfs_put+0x42/0x50 fs/kernfs/dir.c:521
       __kernfs_remove.part.0+0x72d/0x960 fs/kernfs/dir.c:1407
       __kernfs_remove fs/kernfs/dir.c:1356 [inline]
       kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
       sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
       __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
       create_cache mm/slab_common.c:229 [inline]
       kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
       p9_client_create+0xd4d/0x1190 net/9p/client.c:993
       v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
       v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
       legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
       vfs_get_tree+0x85/0x2e0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x675/0x1d00 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      The buggy address belongs to the object at ffff888008880780
       which belongs to the cache kernfs_node_cache of size 128
      The buggy address is located 112 bytes inside of
       128-byte region [ffff888008880780, ffff888008880800)
      
      The buggy address belongs to the physical page:
      page:00000000732833f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8880
      flags: 0x100000000000200(slab|node=0|zone=1)
      raw: 0100000000000200 0000000000000000 dead000000000122 ffff888001147280
      raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
       ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      >ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                   ^
       ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
       ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      ==================================================================
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: stable <stable@kernel.org> # -rc3
      Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
      Link: https://lore.kernel.org/r/20220913121723.691454-1-lk@c--e.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      a1a691b4
    • R
      mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages · 7a5b0955
      Committed by Rik van Riel
      stable inclusion
      from stable-v4.19.264
      commit 2b35432d324898ec41beb27031d2a1a864a4d40e
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit 12df140f upstream.
      
      The h->*_huge_pages counters are protected by the hugetlb_lock, but
      alloc_huge_page has a corner case where it can decrement the counter
      outside of the lock.
      
      This could lead to a corrupted value of h->resv_huge_pages, which we have
      observed on our systems.
      
      Take the hugetlb_lock before decrementing h->resv_huge_pages to avoid a
      potential race.
      
      Link: https://lkml.kernel.org/r/20221017202505.0e6a4fcd@imladris.surriel.com
      Fixes: a88c7695 ("mm: hugetlb: fix hugepage memory leak caused by wrong reserve count")
      Signed-off-by: Rik van Riel <riel@surriel.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Glen McCready <gkmccready@meta.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      7a5b0955
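The locking rule described in the commit above can be illustrated with a minimal user-space sketch. It assumes a pthread mutex standing in for hugetlb_lock and a hypothetical `struct hstate_sketch`; it is not the kernel's alloc_huge_page() code, and the underflow guard is an addition of the sketch rather than something the commit message describes.

```c
#include <pthread.h>

/* Hypothetical stand-in for the kernel's struct hstate. */
struct hstate_sketch {
    pthread_mutex_t lock;          /* plays the role of hugetlb_lock */
    unsigned long resv_huge_pages; /* counter that the lock protects */
};

/*
 * Consume one reservation. The counter is read and decremented only
 * while the lock is held, so concurrent callers cannot corrupt it.
 * Returns 1 if a reservation was consumed, 0 if none was left.
 */
static int consume_reservation(struct hstate_sketch *h)
{
    int consumed = 0;

    pthread_mutex_lock(&h->lock);
    if (h->resv_huge_pages > 0) { /* guard added by this sketch */
        h->resv_huge_pages--;
        consumed = 1;
    }
    pthread_mutex_unlock(&h->lock);

    return consumed;
}
```

The point is only that the decrement sits between the lock and unlock calls; any path that touches the counter outside that window reintroduces the race.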
    • S
      mm: /proc/pid/smaps_rollup: fix no vma's null-deref · 33213b46
      Committed by Seth Jenkins
      stable inclusion
      from stable-v4.19.264
      commit dbe863bce7679c7f5ec0e993d834fe16c5e687b5
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      Commit 258f669e ("mm: /proc/pid/smaps_rollup: convert to single value
      seq_file") introduced a null-deref if there are no vma's in the task in
      show_smaps_rollup.
      
      Fixes: 258f669e ("mm: /proc/pid/smaps_rollup: convert to single value seq_file")
      Signed-off-by: Seth Jenkins <sethjenkins@google.com>
      Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
      Tested-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      33213b46
    • X
      signal handling: don't use BUG_ON() for debugging · a2f88993
      Committed by Xia Fukun
      stable inclusion
      from stable-v4.19.267
      commit 93d9cef55f8fe463e3b9f6c73c7a32619222c657
      category: bugfix
      bugzilla: 187828, https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit a382f8fe ]
      
      These are indeed "should not happen" situations, but it turns out recent
      changes made the 'task_is_stopped_or_trace()' case trigger (fix for that
      exists, is pending more testing), and the BUG_ON() makes it
      unnecessarily hard to actually debug for no good reason.
      
      It's been that way for a long time, but let's make it clear: BUG_ON() is
      not good for debugging, and should never be used in situations where you
      could just say "this shouldn't happen, but we can continue".
      
      Use WARN_ON_ONCE() instead to make sure it gets logged, and then just
      continue running, instead of making the system basically unusable
      because you crashed the machine while potentially holding some very core
      locks (e.g. this function is commonly called while holding 'tasklist_lock'
      for writing).
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Xia Fukun <xiafukun@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      a2f88993
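The "log once and keep running" pattern argued for above can be sketched in user space. The helper `warn_on_once()` and the caller `handle_state()` below are hypothetical names for illustration; the kernel's real WARN_ON_ONCE() is a macro with its own per-call-site bookkeeping and also dumps a stack trace.

```c
#include <stdio.h>

/*
 * Hypothetical user-space analogue of WARN_ON_ONCE(): prints a warning
 * the first time cond is true for the given flag, never aborts, and
 * returns cond so the caller can branch on it.
 */
static int warn_on_once(int cond, int *warned, const char *what)
{
    if (cond && !*warned) {
        *warned = 1;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Returns 0 on success, -1 if the "should not happen" state was seen. */
static int handle_state(int state_is_valid)
{
    static int warned; /* one flag per call site */

    if (warn_on_once(!state_is_valid, &warned, "unexpected state"))
        return -1; /* report the problem and continue instead of crashing */

    return 0;
}
```

A BUG_ON()-style check would terminate at the first bad state; here the caller gets an error code, the log gets exactly one line, and any locks held further up the call chain can still be released normally.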
    • X
      ida: don't use BUG_ON() for debugging · eaab6483
      Committed by Xia Fukun
      stable inclusion
      from stable-v4.19.267
      commit 33d2f83e3f2c1fdabb365d25bed3aa630041cbc0
      category: bugfix
      bugzilla: 188002, https://gitee.com/openeuler/kernel/issues/I63UEU
      CVE: NA
      
      --------------------------------
      
      commit fc82bbf4 upstream.
      
      This is another old BUG_ON() that just shouldn't exist (see also commit
      a382f8fe: "signal handling: don't use BUG_ON() for debugging").
      
      In fact, as Matthew Wilcox points out, this condition shouldn't really
      even result in a warning, since a negative id allocation result is just
      a normal allocation failure:
      
        "I wonder if we should even warn here -- sure, the caller is trying to
         free something that wasn't allocated, but we don't warn for
         kfree(NULL)"
      
      and goes on to point out how that current error check is only causing
      people to unnecessarily do their own index range checking before freeing
      it.
      
      This was noted by Itay Iellin, because the bluetooth HCI socket cookie
      code does *not* do that range checking, and ends up just freeing the
      error case too, triggering the BUG_ON().
      
      The HCI code requires CAP_NET_RAW, and seems to just result in an ugly
      splat, but there really is no reason to BUG_ON() here, and we have
      generally striven for allocation models where it's always ok to just do
      
          free(alloc());
      
      even if the allocation were to fail for some random reason (usually
      obviously that "random" reason being some resource limit).
      
      Fixes: 88eca020 ("ida: simplified functions for id allocation")
      Reported-by: Itay Iellin <ieitayie@gmail.com>
      Suggested-by: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Xia Fukun <xiafukun@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      eaab6483
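The `free(alloc())` model mentioned above can be sketched with a toy id pool. Everything here (`id_alloc()`, `id_free()`, `MAX_IDS`) is hypothetical and unrelated to the kernel's real IDA implementation; the sketch only shows a free routine that quietly ignores the error value its allocator can return, in the same spirit as kfree(NULL).

```c
#include <stdbool.h>

#define MAX_IDS 8 /* arbitrary pool size chosen for this sketch */

static bool id_in_use[MAX_IDS];

/* Returns the lowest free id, or -1 when the pool is exhausted. */
static int id_alloc(void)
{
    for (int i = 0; i < MAX_IDS; i++) {
        if (!id_in_use[i]) {
            id_in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/*
 * Release an id. A negative id (a failed allocation) is ignored, so
 * id_free(id_alloc()) is always safe. Ignoring ids beyond the pool is
 * an extra choice made by this sketch.
 */
static void id_free(int id)
{
    if (id < 0 || id >= MAX_IDS)
        return;
    id_in_use[id] = false;
}
```

With this shape, callers do not need their own range check before freeing, which is the burden the commit message says the old BUG_ON() imposed.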
  2. 06 Dec, 2022 1 commit
    • O
      !272 [openEuler-1.0-LTS] Add MWAIT Cx support for Zhaoxin CPUs. · 75ea48ac
      Committed by openeuler-ci-bot
      Merge Pull Request from: @leoliu-oc 
       
      When the processor is idle, low-power idle states (C-states) can be used to save power. For Zhaoxin processors, there are two methods to enter idle states. One is the HLT instruction and the legacy method of I/O reads from the ACPI-defined register (known as P_LVLx); the other is the MWAIT instruction with idle-state hints.
      
      By default, legacy operating systems use HLT and P_LVLx I/O reads to enter idle states on Zhaoxin processors, but we have verified on some Zhaoxin platforms that the MWAIT instruction is more efficient than P_LVLx I/O reads and HLT, so we add MWAIT Cx support for Zhaoxin processors.
      
      ### Issue
      https://gitee.com/openeuler/kernel/issues/I62TOM
      
      ### Test
      N/A
      
      ### Known Issue
      N/A
      
      ### Default config change
      N/A
      
       
       
      Link: https://gitee.com/openeuler/kernel/pulls/272
      Reviewed-by: Laibin Qiu <qiulaibin@huawei.com> 
      Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com> 
      75ea48ac
  3. 05 Dec, 2022 3 commits
  4. 29 Nov, 2022 5 commits
  5. 27 Nov, 2022 1 commit
    • F
      x86/tsc: use topology_max_packages() in tsc watchdog check · 4f283abb
      Committed by Feng Tang
      hulk inclusion
      category: bugfix
      bugzilla: 187942, https://gitee.com/openeuler/kernel/issues/I5U037
      CVE: NA
      
      -------------------------------
      
      Commit b50db709 ("x86/tsc: Disable clocksource watchdog for TSC
      on qualified platorms") was introduced to solve the problem that
      the TSC clocksource is sometimes wrongly judged as unstable by a
      watchdog such as 'jiffies', HPET, etc.
      
      In it, the hardware socket number is a key factor for judging
      whether to disable the watchdog for TSC, and 'nr_online_nodes' was
      chosen as an estimate because the value is needed in the early boot
      phase, before the 'tsc-early' clocksource is registered, when none
      of the non-boot CPUs have been brought up yet.
      
      In a recent patch review, Dave Hansen pointed out that there are many
      cases where 'nr_online_nodes' could be wrong, like:
      * numa emulation (numa=fake=4 etc.)
      * numa=off
      * platforms with CPU+DRAM nodes, CPU-less HBM nodes, CPU-less
        persistent memory nodes.
      
      Peter Zijlstra suggested using logical package ids, but they are
      only usable after smp_init(), once all CPUs are initialized.
      
      One solution is to skip the watchdog for the 'tsc-early' clocksource,
      and move the check after smp_init() but before the 'tsc'
      clocksource is registered, where topology_max_packages() can
      be used as a much more accurate socket number.
      
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      
      Conflict:
      	arch/x86/kernel/tsc.c
      Signed-off-by: Yu Liao <liaoyu15@huawei.com>
      Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      4f283abb
  6. 26 Nov, 2022 2 commits
    • X
      scsi: hisi_sas: Set iptt aborted flag when receiving an abnormal CQ · 4cccc16a
      Committed by Xingui Yang
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I62ZXO
      CVE: NA
      
      ------------------------------------------------
      
      During write I/O, when the SAS PHY switch is tested, the hardware
      may report two CQs for one I/O. The first CQ indicates an invalid port
      during DPH scheduling; the second CQ indicates that the response frame
      has been written to memory but the I/O ended abnormally due to I/O data
      underload. So set the iptt aborted flag when receiving an abnormal CQ;
      the host will then discard the IPTT frame received from the SAS hard disk.
      
      Signed-off-by: Xingui Yang <yangxingui@huawei.com>
      Reviewed-by: kang fenglong <kangfenglong@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      4cccc16a
    • L
      ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0 · bc9ebdce
      Committed by Luís Henriques
      mainline inclusion
      from mainline-v6.0-rc7
      commit 29a5b8a1
      category: bugfix
      bugzilla: 187444, https://gitee.com/openeuler/kernel/issues/I6261Z
      CVE: NA
      
      --------------------------------
      
      When walking through an inode's extents, the ext4_ext_binsearch_idx() function
      assumes that the extent header has been previously validated.  However, there
      are no checks that verify that the number of entries (eh->eh_entries) is
      non-zero when depth is > 0.  And this will lead to problems because the
      EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this:
      
      [  135.245946] ------------[ cut here ]------------
      [  135.247579] kernel BUG at fs/ext4/extents.c:2258!
      [  135.249045] invalid opcode: 0000 [#1] PREEMPT SMP
      [  135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4
      [  135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
      [  135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0
      [  135.256475] Code:
      [  135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246
      [  135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023
      [  135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c
      [  135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c
      [  135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024
      [  135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000
      [  135.272394] FS:  00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
      [  135.274510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0
      [  135.277952] Call Trace:
      [  135.278635]  <TASK>
      [  135.279247]  ? preempt_count_add+0x6d/0xa0
      [  135.280358]  ? percpu_counter_add_batch+0x55/0xb0
      [  135.281612]  ? _raw_read_unlock+0x18/0x30
      [  135.282704]  ext4_map_blocks+0x294/0x5a0
      [  135.283745]  ? xa_load+0x6f/0xa0
      [  135.284562]  ext4_mpage_readpages+0x3d6/0x770
      [  135.285646]  read_pages+0x67/0x1d0
      [  135.286492]  ? folio_add_lru+0x51/0x80
      [  135.287441]  page_cache_ra_unbounded+0x124/0x170
      [  135.288510]  filemap_get_pages+0x23d/0x5a0
      [  135.289457]  ? path_openat+0xa72/0xdd0
      [  135.290332]  filemap_read+0xbf/0x300
      [  135.291158]  ? _raw_spin_lock_irqsave+0x17/0x40
      [  135.292192]  new_sync_read+0x103/0x170
      [  135.293014]  vfs_read+0x15d/0x180
      [  135.293745]  ksys_read+0xa1/0xe0
      [  135.294461]  do_syscall_64+0x3c/0x80
      [  135.295284]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      This patch simply adds an extra check in __ext4_ext_check(), verifying that
      eh_entries is not 0 when eh_depth is > 0.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283
      Cc: Baokun Li <libaokun1@huawei.com>
      Cc: stable@kernel.org
      Signed-off-by: Luís Henriques <lhenriques@suse.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.de
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      bc9ebdce
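The extra validation described in the ext4 commit above can be sketched as a standalone predicate. The struct and function names are hypothetical and the header is reduced to the two fields the commit message mentions; the real __ext4_ext_check() works on the on-disk little-endian header and performs several other checks.

```c
#include <stdint.h>

/* Hypothetical, simplified extent header holding host-order values. */
struct extent_header_sketch {
    uint16_t eh_entries; /* number of valid entries in this node */
    uint16_t eh_depth;   /* 0 for a leaf, > 0 for an index node */
};

/*
 * Returns 1 if the header passes the check described in the commit,
 * 0 otherwise. An index node (depth > 0) with zero entries is rejected,
 * because there would be no first or last index for a binary search to
 * work with.
 */
static int extent_header_is_sane(const struct extent_header_sketch *eh)
{
    if (eh->eh_entries == 0 && eh->eh_depth > 0)
        return 0;

    return 1;
}
```

An empty leaf (depth 0, zero entries) still passes, which matches the commit's wording that the entry count must be non-zero only when the depth is greater than 0.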
  7. 24 Nov, 2022 1 commit
    • L
      Add MWAIT Cx support for Zhaoxin CPUs. · e1b6487f
      Committed by leoliu
      zhaoxin inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I62TOM
      CVE: NA
      
      ----------------------------------------------------------------
      
      When the processor is idle, low-power idle states (C-states) can be used
      to save power. For Zhaoxin processors, there are two methods to enter idle
      states. One is the HLT instruction and the legacy method of I/O reads from
      the ACPI-defined register (known as P_LVLx); the other is the MWAIT
      instruction with idle-state hints.
      
      By default, legacy operating systems use HLT and P_LVLx I/O reads to enter
      idle states on Zhaoxin processors, but we have verified on some Zhaoxin
      platforms that the MWAIT instruction is more efficient than P_LVLx I/O
      reads and HLT, so we add MWAIT Cx support for Zhaoxin processors.
      
      Signed-off-by: leoliu <leoliu@zhaoxin.com>
      e1b6487f
  8. 21 Nov, 2022 1 commit
  9. 19 Nov, 2022 4 commits
  10. 15 Nov, 2022 1 commit
  11. 14 Nov, 2022 2 commits
  12. 08 Nov, 2022 1 commit