1. 05 6月, 2021 1 次提交
  2. 23 5月, 2021 1 次提交
  3. 20 5月, 2021 1 次提交
  4. 19 5月, 2021 6 次提交
  5. 15 5月, 2021 3 次提交
  6. 14 5月, 2021 3 次提交
  7. 13 5月, 2021 1 次提交
  8. 12 5月, 2021 1 次提交
  9. 11 5月, 2021 3 次提交
    • O
      kyber: fix out of bounds access when preempted · efed9a33
      Omar Sandoval 提交于
      __blk_mq_sched_bio_merge() gets the ctx and hctx for the current CPU and
      passes the hctx to ->bio_merge(). kyber_bio_merge() then gets the ctx
      for the current CPU again and uses that to get the corresponding Kyber
      context in the passed hctx. However, the thread may be preempted between
      the two calls to blk_mq_get_ctx(), and the ctx returned the second time
      may no longer correspond to the passed hctx. This "works" accidentally
      most of the time, but it can cause us to read garbage if the second ctx
      came from an hctx with more ctx's than the first one (i.e., if
      ctx->index_hw[hctx->type] > hctx->nr_ctx).
      
      This manifested as this UBSAN array index out of bounds error reported
      by Jakub:
      
      UBSAN: array-index-out-of-bounds in ../kernel/locking/qspinlock.c:130:9
      index 13106 is out of range for type 'long unsigned int [128]'
      Call Trace:
       dump_stack+0xa4/0xe5
       ubsan_epilogue+0x5/0x40
       __ubsan_handle_out_of_bounds.cold.13+0x2a/0x34
       queued_spin_lock_slowpath+0x476/0x480
       do_raw_spin_lock+0x1c2/0x1d0
       kyber_bio_merge+0x112/0x180
       blk_mq_submit_bio+0x1f5/0x1100
       submit_bio_noacct+0x7b0/0x870
       submit_bio+0xc2/0x3a0
       btrfs_map_bio+0x4f0/0x9d0
       btrfs_submit_data_bio+0x24e/0x310
       submit_one_bio+0x7f/0xb0
       submit_extent_page+0xc4/0x440
       __extent_writepage_io+0x2b8/0x5e0
       __extent_writepage+0x28d/0x6e0
       extent_write_cache_pages+0x4d7/0x7a0
       extent_writepages+0xa2/0x110
       do_writepages+0x8f/0x180
       __writeback_single_inode+0x99/0x7f0
       writeback_sb_inodes+0x34e/0x790
       __writeback_inodes_wb+0x9e/0x120
       wb_writeback+0x4d2/0x660
       wb_workfn+0x64d/0xa10
       process_one_work+0x53a/0xa80
       worker_thread+0x69/0x5b0
       kthread+0x20b/0x240
       ret_from_fork+0x1f/0x30
      
      Only Kyber uses the hctx, so fix it by passing the request_queue to
      ->bio_merge() instead. BFQ and mq-deadline just use that, and Kyber can
      map the queues itself to avoid the mismatch.
      
      Fixes: a6088845 ("block: kyber: make kyber more friendly with merging")
      Reported-by: NJakub Kicinski <kuba@kernel.org>
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Link: https://lore.kernel.org/r/c7598605401a48d5cfeadebb678abd10af22b83f.1620691329.git.osandov@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
      efed9a33
    • N
      stack: Replace "o" output with "r" input constraint · 2515dd6c
      Nick Desaulniers 提交于
      "o" isn't a common asm() constraint to use; it triggers an assertion in
      assert-enabled builds of LLVM that it's not recognized when targeting
      aarch64 (though it appears to fall back to "m"). It's fixed in LLVM 13 now,
      but there isn't really a good reason to use "o" in particular here. To
      avoid causing build issues for those using assert-enabled builds of earlier
      LLVM versions, the constraint needs changing.
      
      Instead, if the point is to retain the __builtin_alloca(), make ptr appear
      to "escape" via being an input to an empty inline asm block. This is
      preferable anyways, since otherwise this looks like a dead store.
      
      While the use of "r" was considered in
      
        https://lore.kernel.org/lkml/202104011447.2E7F543@keescook/
      
      it was only tested as an output (which looks like a dead store, and wasn't
      sufficient).
      
      Use "r" as an input constraint instead, which behaves correctly across
      compilers and architectures.
      
      Fixes: 39218ff4 ("stack: Optionally randomize kernel stack offset each syscall")
      Signed-off-by: NNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NKees Cook <keescook@chromium.org>
      Tested-by: NNathan Chancellor <nathan@kernel.org>
      Reviewed-by: NNathan Chancellor <nathan@kernel.org>
      Link: https://reviews.llvm.org/D100412
      Link: https://bugs.llvm.org/show_bug.cgi?id=49956
      Link: https://lore.kernel.org/r/20210419231741.4084415-1-keescook@chromium.org
      2515dd6c
    • T
      PM: runtime: Fix unpaired parent child_count for force_resume · c745253e
      Tony Lindgren 提交于
      As pm_runtime_need_not_resume() relies also on usage_count, it can return
      a different value in pm_runtime_force_suspend() compared to when called in
      pm_runtime_force_resume(). Different return values can happen if anything
      calls PM runtime functions in between, and causes the parent child_count
      to increase on every resume.
      
      So far I've seen the issue only for omapdrm that does complicated things
      with PM runtime calls during system suspend for legacy reasons:
      
      omap_atomic_commit_tail() for omapdrm.0
       dispc_runtime_get()
        wakes up 58000000.dss as it's the dispc parent
         dispc_runtime_resume()
          rpm_resume() increases parent child_count
       dispc_runtime_put() won't idle, PM runtime suspend blocked
      pm_runtime_force_suspend() for 58000000.dss, !pm_runtime_need_not_resume()
       __update_runtime_status()
      system suspended
      pm_runtime_force_resume() for 58000000.dss, pm_runtime_need_not_resume()
       pm_runtime_enable() only called because of pm_runtime_need_not_resume()
      omap_atomic_commit_tail() for omapdrm.0
       dispc_runtime_get()
        wakes up 58000000.dss as it's the dispc parent
         dispc_runtime_resume()
          rpm_resume() increases parent child_count
       dispc_runtime_put() won't idle, PM runtime suspend blocked
      ...
      rpm_suspend for 58000000.dss but parent child_count is now unbalanced
      
      Let's fix the issue by adding a flag for needs_force_resume and use it in
      pm_runtime_force_resume() instead of pm_runtime_need_not_resume().
      
      Additionally omapdrm system suspend could be simplified later on to avoid
      lots of unnecessary PM runtime calls and the complexity it adds. The
      driver can just use internal functions that are shared between the PM
      runtime and system suspend related functions.
      
      Fixes: 4918e1f8 ("PM / runtime: Rework pm_runtime_force_suspend/resume()")
      Signed-off-by: NTony Lindgren <tony@atomide.com>
      Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org>
      Tested-by: NTomi Valkeinen <tomi.valkeinen@ideasonboard.com>
      Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      c745253e
  10. 10 5月, 2021 2 次提交
  11. 09 5月, 2021 1 次提交
  12. 08 5月, 2021 2 次提交
  13. 07 5月, 2021 15 次提交
    • I
      mm: fix typos in comments · f0953a1b
      Ingo Molnar 提交于
      Fix ~94 single-word typos in locking code comments, plus a few
      very obvious grammar mistakes.
      
      Link: https://lkml.kernel.org/r/20210322212624.GA1963421@gmail.com
      Link: https://lore.kernel.org/r/20210322205203.GB1959563@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: NRandy Dunlap <rdunlap@infradead.org>
      Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f0953a1b
    • M
      treewide: remove editor modelines and cruft · fa60ce2c
      Masahiro Yamada 提交于
      The section "19) Editor modelines and other cruft" in
      Documentation/process/coding-style.rst clearly says, "Do not include any
      of these in source files."
      
      I recently receive a patch to explicitly add a new one.
      
      Let's do treewide cleanups, otherwise some people follow the existing code
      and attempt to upstream their favoriate editor setups.
      
      It is even nicer if scripts/checkpatch.pl can check it.
      
      If we like to impose coding style in an editor-independent manner, I think
      editorconfig (patch [1]) is a saner solution.
      
      [1] https://lore.kernel.org/lkml/20200703073143.423557-1-danny@kdrag0n.dev/
      
      Link: https://lkml.kernel.org/r/20210324054457.1477489-1-masahiroy@kernel.orgSigned-off-by: NMasahiro Yamada <masahiroy@kernel.org>
      Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: Miguel Ojeda <ojeda@kernel.org>	[auxdisplay]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fa60ce2c
    • B
    • D
      mm/vmalloc: remove vwrite() · f7c8ce44
      David Hildenbrand 提交于
      The last user (/dev/kmem) is gone. Let's drop it.
      
      Link: https://lkml.kernel.org/r/20210324102351.6932-4-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: huang ying <huang.ying.caritas@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f7c8ce44
    • D
      mm: remove xlate_dev_kmem_ptr() · f2e762ba
      David Hildenbrand 提交于
      Since /dev/kmem has been removed, let's remove the xlate_dev_kmem_ptr()
      leftovers.
      
      Link: https://lkml.kernel.org/r/20210324102351.6932-3-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com>
      Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Krzysztof Kozlowski <krzk@kernel.org>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Cc: Niklas Schnelle <schnelle@linux.ibm.com>
      Cc: Pierre Morel <pmorel@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f2e762ba
    • D
      drivers/char: remove /dev/kmem for good · bbcd53c9
      David Hildenbrand 提交于
      Patch series "drivers/char: remove /dev/kmem for good".
      
      Exploring /dev/kmem and /dev/mem in the context of memory hot(un)plug and
      memory ballooning, I started questioning the existence of /dev/kmem.
      
      Comparing it with the /proc/kcore implementation, it does not seem to be
      able to deal with things like
      
      a) Pages unmapped from the direct mapping (e.g., to be used by secretmem)
        -> kern_addr_valid(). virt_addr_valid() is not sufficient.
      
      b) Special cases like gart aperture memory that is not to be touched
        -> mem_pfn_is_ram()
      
      Unless I am missing something, it's at least broken in some cases and might
      fault/crash the machine.
      
      Looks like its existence has been questioned before in 2005 and 2010 [1],
      after ~11 additional years, it might make sense to revive the discussion.
      
      CONFIG_DEVKMEM is only enabled in a single defconfig (on purpose or by
      mistake?).  All distributions disable it: in Ubuntu it has been disabled
      for more than 10 years, in Debian since 2.6.31, in Fedora at least
      starting with FC3, in RHEL starting with RHEL4, in SUSE starting from
      15sp2, and OpenSUSE has it disabled as well.
      
      1) /dev/kmem was popular for rootkits [2] before it got disabled
         basically everywhere. Ubuntu documents [3] "There is no modern user of
         /dev/kmem any more beyond attackers using it to load kernel rootkits.".
         RHEL documents in a BZ [5] "it served no practical purpose other than to
         serve as a potential security problem or to enable binary module drivers
         to access structures/functions they shouldn't be touching"
      
      2) /proc/kcore is a decent interface to have a controlled way to read
         kernel memory for debugging puposes. (will need some extensions to
         deal with memory offlining/unplug, memory ballooning, and poisoned
         pages, though)
      
      3) It might be useful for corner case debugging [1]. KDB/KGDB might be a
         better fit, especially, to write random memory; harder to shoot
         yourself into the foot.
      
      4) "Kernel Memory Editor" [4] hasn't seen any updates since 2000 and seems
         to be incompatible with 64bit [1]. For educational purposes,
         /proc/kcore might be used to monitor value updates -- or older
         kernels can be used.
      
      5) It's broken on arm64, and therefore, completely disabled there.
      
      Looks like it's essentially unused and has been replaced by better
      suited interfaces for individual tasks (/proc/kcore, KDB/KGDB). Let's
      just remove it.
      
      [1] https://lwn.net/Articles/147901/
      [2] https://www.linuxjournal.com/article/10505
      [3] https://wiki.ubuntu.com/Security/Features#A.2Fdev.2Fkmem_disabled
      [4] https://sourceforge.net/projects/kme/
      [5] https://bugzilla.redhat.com/show_bug.cgi?id=154796
      
      Link: https://lkml.kernel.org/r/20210324102351.6932-1-david@redhat.com
      Link: https://lkml.kernel.org/r/20210324102351.6932-2-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Alexander A. Klimov" <grandmaster@al2klimov.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Corentin Labbe <clabbe@baylibre.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Gregory Clement <gregory.clement@bootlin.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: huang ying <huang.ying.caritas@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: James Troup <james.troup@canonical.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kairui Song <kasong@redhat.com>
      Cc: Krzysztof Kozlowski <krzk@kernel.org>
      Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Cc: Liviu Dudau <liviu.dudau@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Niklas Schnelle <schnelle@linux.ibm.com>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
      Cc: openrisc@lists.librecores.org
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Pavel Machek (CIP)" <pavel@denx.de>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Pierre Morel <pmorel@linux.ibm.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Cc: sparclinux@vger.kernel.org
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Theodore Dubois <tblodt@icloud.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Xiaoming Ni <nixiaoming@huawei.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bbcd53c9
    • S
      mm: fix some typos and code style problems · cb152a1a
      Shijie Luo 提交于
      fix some typos and code style problems in mm.
      
      gfp.h: s/MAXNODES/MAX_NUMNODES
      mmzone.h: s/then/than
      rmap.c: s/__vma_split()/__vma_adjust()
      swap.c: s/__mod_zone_page_stat/__mod_zone_page_state, s/is is/is
      swap_state.c: s/whoes/whose
      z3fold.c: code style problem fix in z3fold_unregister_migration
      zsmalloc.c: s/of/or, s/give/given
      
      Link: https://lkml.kernel.org/r/20210419083057.64820-1-luoshijie1@huawei.comSigned-off-by: NShijie Luo <luoshijie1@huawei.com>
      Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cb152a1a
    • R
      init/initramfs.c: do unpacking asynchronously · e7cb072e
      Rasmus Villemoes 提交于
      Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3.
      
      These two patches are independent, but better-together.
      
      The second is a rather trivial patch that simply allows the developer to
      change "/sbin/modprobe" to something else - e.g.  the empty string, so
      that all request_module() during early boot return -ENOENT early, without
      even spawning a usermode helper, needlessly synchronizing with the
      initramfs unpacking.
      
      The first patch delegates decompressing the initramfs to a worker thread,
      allowing do_initcalls() in main.c to proceed to the device_ and late_
      initcalls without waiting for that decompression (and populating of
      rootfs) to finish.  Obviously, some of those later calls may rely on the
      initramfs being available, so I've added synchronization points in the
      firmware loader and usermodehelper paths - there might be other places
      that would need this, but so far no one has been able to think of any
      places I have missed.
      
      There's not much to win if most of the functionality needed during boot is
      only available as modules.  But systems with a custom-made .config and
      initramfs can boot faster, partly due to utilizing more than one cpu
      earlier, partly by avoiding known-futile modprobe calls (which would still
      trigger synchronization with the initramfs unpacking, thus eliminating
      most of the first benefit).
      
      This patch (of 2):
      
      Most of the boot process doesn't actually need anything from the
      initramfs, until of course PID1 is to be executed.  So instead of doing
      the decompressing and populating of the initramfs synchronously in
      populate_rootfs() itself, push that off to a worker thread.
      
      This is primarily motivated by an embedded ppc target, where unpacking
      even the rather modest sized initramfs takes 0.6 seconds, which is long
      enough that the external watchdog becomes unhappy that it doesn't get
      attention soon enough.  By doing the initramfs decompression in a worker
      thread, we get to do the device_initcalls and hence start petting the
      watchdog much sooner.
      
      Normal desktops might benefit as well.  On my mostly stock Ubuntu kernel,
      my initramfs is a 26M xz-compressed blob, decompressing to around 126M.
      That takes almost two seconds:
      
      [    0.201454] Trying to unpack rootfs image as initramfs...
      [    1.976633] Freeing initrd memory: 29416K
      
      Before this patch, these lines occur consecutively in dmesg.  With this
      patch, the timestamps on these two lines is roughly the same as above, but
      with 172 lines inbetween - so more than one cpu has been kept busy doing
      work that would otherwise only happen after the populate_rootfs()
      finished.
      
      Should one of the initcalls done after rootfs_initcall time (i.e., device_
      and late_ initcalls) need something from the initramfs (say, a kernel
      module or a firmware blob), it will simply wait for the initramfs
      unpacking to be done before proceeding, which should in theory make this
      completely safe.
      
      But if some driver pokes around in the filesystem directly and not via one
      of the official kernel interfaces (i.e.  request_firmware*(),
      call_usermodehelper*) that theory may not hold - also, I certainly might
      have missed a spot when sprinkling wait_for_initramfs().  So there is an
      escape hatch in the form of an initramfs_async= command line parameter.
      
      Link: https://lkml.kernel.org/r/20210313212528.2956377-1-linux@rasmusvillemoes.dk
      Link: https://lkml.kernel.org/r/20210313212528.2956377-2-linux@rasmusvillemoes.dkSigned-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Reviewed-by: NLuis Chamberlain <mcgrof@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e7cb072e
    • R
    • Y
      delayacct: clear right task's flag after blkio completes · 3d1c7fd9
      Yafang Shao 提交于
      When I was implementing a latency analyzer tool by using task->delays
      and other things, I found an issue in delayacct.  The issue is it should
      clear the target's flag instead of current's in delayacct_blkio_end().
      
      When I git blame delayacct, I found there're some similar issues we have
      fixed in delayacct_blkio_end().
      
       - Commit c96f5471 ("delayacct: Account blkio completion on the
         correct task") fixed the issue that it should account blkio
         completion on the target task instead of current.
      
       - Commit b512719f ("delayacct: fix crash in delayacct_blkio_end()
         after delayacct init failure") fixed the issue that it should check
         target task's delays instead of current task'.
      
      It seems that delayacct_blkio_{begin, end} are error prone.
      
      So I introduce a new paratmeter - the target task 'p' - to these
      helpers.  After that change, the callsite will specifilly set the right
      task, which should make it less error prone.
      
      Link: https://lkml.kernel.org/r/20210414083720.24083-1-laoar.shao@gmail.comSigned-off-by: NYafang Shao <laoar.shao@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Josh Snyder <joshs@netflix.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3d1c7fd9
    • H
      smp: kernel/panic.c - silence warnings · 6f1f942c
      He Ying 提交于
      We found these warnings in kernel/panic.c by using sparse tool:
      warning: symbol 'panic_smp_self_stop' was not declared.
      warning: symbol 'nmi_panic_self_stop' was not declared.
      warning: symbol 'crash_smp_send_stop' was not declared.
      
      To avoid them, add declarations for these three functions in
      include/linux/smp.h.
      
      Link: https://lkml.kernel.org/r/20210316084150.75201-1-heying24@huawei.comSigned-off-by: NHe Ying <heying24@huawei.com>
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6f1f942c
    • M
      include/linux/compat.h: remove unneeded declaration from COMPAT_SYSCALL_DEFINEx() · e13d04ec
      Masahiro Yamada 提交于
      compat_sys##name is declared twice, just one line below.
      
      With this removal SYSCALL_DEFINEx() (defined in <linux/syscalls.h>)
      and COMPAT_SYSCALL_DEFINEx() look symmetrical.
      
      Link: https://lkml.kernel.org/r/20210223114924.854794-1-masahiroy@kernel.orgSigned-off-by: NMasahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e13d04ec
    • R
      lib: crc8: pointer to data block should be const · e18baa7c
      Richard Fitzgerald 提交于
      crc8() does not change the data passed to it, so the pointer argument
      should be declared const.  This avoids callers that receive const data
      having to cast it to a non-const pointer to call crc8().
      
      Link: https://lkml.kernel.org/r/20210329122409.3291-1-rf@opensource.cirrus.comSigned-off-by: NRichard Fitzgerald <rf@opensource.cirrus.com>
      Acked-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e18baa7c
    • Y
      lib: add fast path for find_first_*_bit() and find_last_bit() · 2cc7b6a4
      Yury Norov 提交于
      Similarly to bitmap functions, users would benefit if we'll handle a case
      of small-size bitmaps that fit into a single word.
      
      While here, move the find_last_bit() declaration to bitops/find.h where
      other find_*_bit() functions sit.
      
      Link: https://lkml.kernel.org/r/20210401003153.97325-11-yury.norov@gmail.comSigned-off-by: NYury Norov <yury.norov@gmail.com>
      Acked-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Alexey Klimov <aklimov@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Sterba <dsterba@suse.com>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jianpeng Ma <jianpeng.ma@intel.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Stefano Brivio <sbrivio@redhat.com>
      Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
      Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Cc: Yoshinori Sato <ysato@users.osdn.me>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2cc7b6a4
    • Y
      lib: add fast path for find_next_*_bit() · 277a20a4
      Yury Norov 提交于
      Similarly to bitmap functions, find_next_*_bit() users will benefit if
      we'll handle a case of bitmaps that fit into a single word inline.  In the
      very best case, the compiler may replace a function call with a few
      instructions.
      
      This is the quite typical find_next_bit() user:
      
      	unsigned int cpumask_next(int n, const struct cpumask *srcp)
      	{
      		/* -1 is a legal arg here. */
      		if (n != -1)
      			cpumask_check(n);
      		return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
      	}
      	EXPORT_SYMBOL(cpumask_next);
      
      Currently, on ARM64 the generated code looks like this:
      	0000000000000000 <cpumask_next>:
      	   0:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
      	   4:   11000402        add     w2, w0, #0x1
      	   8:   aa0103e0        mov     x0, x1
      	   c:   d2800401        mov     x1, #0x40                       // #64
      	  10:   910003fd        mov     x29, sp
      	  14:   93407c42        sxtw    x2, w2
      	  18:   94000000        bl      0 <find_next_bit>
      	  1c:   a8c17bfd        ldp     x29, x30, [sp], #16
      	  20:   d65f03c0        ret
      	  24:   d503201f        nop
      
      After applying this patch:
      	0000000000000140 <cpumask_next>:
      	 140:   11000400        add     w0, w0, #0x1
      	 144:   93407c00        sxtw    x0, w0
      	 148:   f100fc1f        cmp     x0, #0x3f
      	 14c:   54000168        b.hi    178 <cpumask_next+0x38>  // b.pmore
      	 150:   f9400023        ldr     x3, [x1]
      	 154:   92800001        mov     x1, #0xffffffffffffffff         // #-1
      	 158:   9ac02020        lsl     x0, x1, x0
      	 15c:   52800802        mov     w2, #0x40                       // #64
      	 160:   8a030001        and     x1, x0, x3
      	 164:   dac00020        rbit    x0, x1
      	 168:   f100003f        cmp     x1, #0x0
      	 16c:   dac01000        clz     x0, x0
      	 170:   1a800040        csel    w0, w2, w0, eq  // eq = none
      	 174:   d65f03c0        ret
      	 178:   52800800        mov     w0, #0x40                       // #64
      	 17c:   d65f03c0        ret
      
      find_next_bit() call is replaced with 6 instructions.  find_next_bit()
      itself is 41 instructions plus function call overhead.
      
      Despite inlining, the scripts/bloat-o-meter report smaller .text size
      after applying the series:
      	add/remove: 11/9 grow/shrink: 233/176 up/down: 5780/-6768 (-988)
      
      Link: https://lkml.kernel.org/r/20210401003153.97325-10-yury.norov@gmail.comSigned-off-by: NYury Norov <yury.norov@gmail.com>
      Acked-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Alexey Klimov <aklimov@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Sterba <dsterba@suse.com>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jianpeng Ma <jianpeng.ma@intel.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Stefano Brivio <sbrivio@redhat.com>
      Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
      Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Cc: Yoshinori Sato <ysato@users.osdn.me>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      277a20a4