1. 30 Jun, 2021 2 commits
  2. 24 Jun, 2021 1 commit
  3. 04 Jun, 2021 1 commit
  4. 07 May, 2021 1 commit
    • D
      drivers/char: remove /dev/kmem for good · bbcd53c9
      Committed by David Hildenbrand
      Patch series "drivers/char: remove /dev/kmem for good".
      
      Exploring /dev/kmem and /dev/mem in the context of memory hot(un)plug and
      memory ballooning, I started questioning the existence of /dev/kmem.
      
      Comparing it with the /proc/kcore implementation, it does not seem to be
      able to deal with things like
      
      a) Pages unmapped from the direct mapping (e.g., to be used by secretmem)
        -> kern_addr_valid(). virt_addr_valid() is not sufficient.
      
      b) Special cases like gart aperture memory that is not to be touched
        -> mem_pfn_is_ram()
      
      Unless I am missing something, it's at least broken in some cases and might
      fault/crash the machine.
      
      Looks like its existence has been questioned before in 2005 and 2010 [1];
      after ~11 additional years, it might make sense to revive the discussion.
      
      CONFIG_DEVKMEM is only enabled in a single defconfig (on purpose or by
      mistake?).  All distributions disable it: in Ubuntu it has been disabled
      for more than 10 years, in Debian since 2.6.31, in Fedora at least
      starting with FC3, in RHEL starting with RHEL4, in SUSE starting from
      15sp2, and OpenSUSE has it disabled as well.
      
      1) /dev/kmem was popular for rootkits [2] before it got disabled
         basically everywhere. Ubuntu documents [3] "There is no modern user of
         /dev/kmem any more beyond attackers using it to load kernel rootkits.".
         RHEL documents in a BZ [5] "it served no practical purpose other than to
         serve as a potential security problem or to enable binary module drivers
         to access structures/functions they shouldn't be touching"
      
      2) /proc/kcore is a decent interface to have a controlled way to read
         kernel memory for debugging purposes. (will need some extensions to
         deal with memory offlining/unplug, memory ballooning, and poisoned
         pages, though)
      
      3) It might be useful for corner case debugging [1]. KDB/KGDB might be a
         better fit, especially for writing random memory; it is harder to shoot
         yourself in the foot.
      
      4) "Kernel Memory Editor" [4] hasn't seen any updates since 2000 and seems
         to be incompatible with 64bit [1]. For educational purposes,
         /proc/kcore might be used to monitor value updates -- or older
         kernels can be used.
      
      5) It's broken on arm64, and therefore, completely disabled there.
      
      Looks like it's essentially unused and has been replaced by better
      suited interfaces for individual tasks (/proc/kcore, KDB/KGDB). Let's
      just remove it.
      
      [1] https://lwn.net/Articles/147901/
      [2] https://www.linuxjournal.com/article/10505
      [3] https://wiki.ubuntu.com/Security/Features#A.2Fdev.2Fkmem_disabled
      [4] https://sourceforge.net/projects/kme/
      [5] https://bugzilla.redhat.com/show_bug.cgi?id=154796
      
      Link: https://lkml.kernel.org/r/20210324102351.6932-1-david@redhat.com
      Link: https://lkml.kernel.org/r/20210324102351.6932-2-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Alexander A. Klimov" <grandmaster@al2klimov.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Corentin Labbe <clabbe@baylibre.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Gregory Clement <gregory.clement@bootlin.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: huang ying <huang.ying.caritas@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: James Troup <james.troup@canonical.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kairui Song <kasong@redhat.com>
      Cc: Krzysztof Kozlowski <krzk@kernel.org>
      Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Cc: Liviu Dudau <liviu.dudau@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Niklas Schnelle <schnelle@linux.ibm.com>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
      Cc: openrisc@lists.librecores.org
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Pavel Machek (CIP)" <pavel@denx.de>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Pierre Morel <pmorel@linux.ibm.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Cc: sparclinux@vger.kernel.org
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Theodore Dubois <tblodt@icloud.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Xiaoming Ni <nixiaoming@huawei.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bbcd53c9
  5. 06 May, 2021 1 commit
  6. 01 May, 2021 1 commit
    • J
      mm: provide filemap_range_needs_writeback() helper · 63135aa3
      Committed by Jens Axboe
      Patch series "Improve IOCB_NOWAIT O_DIRECT reads", v3.
      
      An internal workload complained because it was using too much CPU, and
      when I took a look, we had a lot of io_uring workers going to town.
      
      For an async buffered-read-like workload, I am normally expecting _zero_
      offloads to a worker thread, but this one had tons of them.  I'd drop
      caches and things would look good again, but then a minute later we'd
      regress back to using workers.  Turns out that every minute something
      was reading parts of the device, which would add page cache for that
      inode.  I put patches like these in for our kernel, and the problem was
      solved.
      
      Don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
      entries for the given range.  This causes unnecessary work from the
      caller's side, when the IO could have been issued totally fine without
      blocking on writeback when there is none.
      
      This patch (of 3):
      
      For O_DIRECT reads/writes, we check if we need to issue a call to
      filemap_write_and_wait_range() to issue and/or wait for writeback for any
      page in the given range.  The existing mechanism just checks for a page in
      the range, which is suboptimal for IOCB_NOWAIT as we'll fall back to the
      slow path (and need a retry) if there's just a clean page cache page in
      the range.
      
      Provide filemap_range_needs_writeback() which tries a little harder to
      check if we actually need to issue and/or wait for writeback in the range.
      
      Link: https://lkml.kernel.org/r/20210224164455.1096727-1-axboe@kernel.dk
      Link: https://lkml.kernel.org/r/20210224164455.1096727-2-axboe@kernel.dk
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      63135aa3
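
The difference between the old check and the new helper described in 63135aa3 can be pictured with a toy user-space model in Python. This is only a sketch, not the kernel implementation: the `Page` class, its `dirty` and `writeback` fields, and the function names `range_has_page` and `range_needs_writeback` are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a cached page; not a kernel structure."""
    index: int
    dirty: bool = False
    writeback: bool = False


def range_has_page(cache, start, end):
    """Old-style check (modelled): any cached page in [start, end] counts."""
    return any(start <= p.index <= end for p in cache)


def range_needs_writeback(cache, start, end):
    """Stricter check (modelled): only dirty or in-writeback pages count."""
    return any(
        start <= p.index <= end and (p.dirty or p.writeback)
        for p in cache
    )


# A cache holding one clean page and one dirty page.
cache = [Page(index=3), Page(index=10, dirty=True)]

# Range with only a clean page: the old check would force the slow path,
# while the stricter check lets the read proceed.
clean_old = range_has_page(cache, 0, 5)
clean_new = range_needs_writeback(cache, 0, 5)

# Range containing the dirty page: writeback really is required.
dirty_new = range_needs_writeback(cache, 8, 12)
```

In this model, an IOCB_NOWAIT read covering pages 0 to 5 would no longer be bounced with -EAGAIN merely because a clean page happens to be cached there.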
  7. 23 Apr, 2021 1 commit
  8. 12 Apr, 2021 2 commits
    • M
      vfs: remove unused ioctl helpers · 51db776a
      Committed by Miklos Szeredi
      Remove vfs_ioc_setflags_prepare(), vfs_ioc_fssetxattr_check() and
      simple_fill_fsxattr(), which are no longer used.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      51db776a
    • M
      vfs: add fileattr ops · 4c5b4799
      Committed by Miklos Szeredi
      There's a substantial amount of boilerplate in filesystems handling
      FS_IOC_[GS]ETFLAGS/FS_IOC_FS[GS]ETXATTR ioctls.
      
      Also, due to userspace buffers being involved in the ioctl API, this is
      difficult to stack, as shown by overlayfs issues related to these ioctls.
      
      Introduce a new internal API named "fileattr" (fsxattr can be confused with
      xattr, xflags is inappropriate, since this is more than just flags).
      
      There's significant overlap between flags and xflags and this API handles
      the conversions automatically, so filesystems may choose which one to use.
      
      In ->fileattr_get() a hint is provided to the filesystem whether flags or
      xattr are being requested by userspace, but in this series this hint is
      ignored by all filesystems, since generating all the attributes is cheap.
      
      If a filesystem doesn't implement the fileattr API, just fall back to
      f_op->ioctl().  When all filesystems are converted, the fallback can be
      removed.
      
      32bit compat ioctls are now handled by the generic code as well.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      4c5b4799
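
The automatic conversion between flags and xflags mentioned in 4c5b4799 can be sketched with a small Python model. Everything here is an assumption made for illustration: the bit values are placeholders not taken from kernel headers, and `FileAttr`, `from_flags` and `from_xflags` are hypothetical names rather than the kernel API.

```python
from dataclasses import dataclass

# Placeholder bit values for this sketch only; not taken from kernel headers.
FLAG_IMMUTABLE = 0x1
FLAG_APPEND = 0x2
FLAG_NOATIME = 0x4

XFLAG_IMMUTABLE = 0x10
XFLAG_APPEND = 0x20
XFLAG_NOATIME = 0x40

# Pairs of (flag bit, xflag bit) that describe the same attribute.
COMMON = [
    (FLAG_IMMUTABLE, XFLAG_IMMUTABLE),
    (FLAG_APPEND, XFLAG_APPEND),
    (FLAG_NOATIME, XFLAG_NOATIME),
]


@dataclass
class FileAttr:
    """Hypothetical container holding both representations at once."""
    flags: int = 0
    xflags: int = 0


def from_flags(flags):
    """Build a FileAttr from flags, deriving the overlapping xflags."""
    xflags = 0
    for flag_bit, xflag_bit in COMMON:
        if flags & flag_bit:
            xflags |= xflag_bit
    return FileAttr(flags=flags, xflags=xflags)


def from_xflags(xflags):
    """Build a FileAttr from xflags, deriving the overlapping flags."""
    flags = 0
    for flag_bit, xflag_bit in COMMON:
        if xflags & xflag_bit:
            flags |= flag_bit
    return FileAttr(flags=flags, xflags=xflags)


# A filesystem that only tracks flags can still answer an xflags request.
attr = from_flags(FLAG_IMMUTABLE | FLAG_NOATIME)
```

Because both fields are always filled in this model, a filesystem can work with whichever representation it prefers and leave the translation to shared code.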
  9. 08 Apr, 2021 1 commit
  10. 23 Mar, 2021 6 commits
  11. 08 Mar, 2021 1 commit
  12. 25 Feb, 2021 1 commit
  13. 28 Jan, 2021 1 commit
  14. 24 Jan, 2021 12 commits
  15. 14 Jan, 2021 2 commits
  16. 16 Dec, 2020 1 commit
  17. 11 Dec, 2020 1 commit
  18. 03 Dec, 2020 1 commit
    • D
      libfs: Add generic function for setting dentry_ops · 608af703
      Committed by Daniel Rosenberg
      This adds a function to set dentry operations at lookup time that will
      work for both encrypted filenames and casefolded filenames.
      
      A filesystem that supports both features simultaneously can use this
      function during lookup preparations to set up its dentry operations once
      fscrypt no longer does that itself.
      
      Currently the casefolding dentry operations are always set if the
      filesystem defines an encoding because the feature is toggleable on
      empty directories. Unlike in the encryption case, the dentry operations
      used come from the parent. Since we don't know what set of functions
      we'll eventually need, and cannot change them later, we enable the
      casefolding operations if the filesystem supports them at all.
      
      By splitting out the various cases, we support as few dentry operations
      as we can get away with, maximizing compatibility with overlayfs, which
      will not function if a filesystem supports certain dentry_operations.
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      Reviewed-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      608af703
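
The idea in 608af703 of splitting out the cases to install as few dentry operations as possible can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the libfs code: `select_dentry_ops` and its two boolean parameters are hypothetical, and the mapping of each case to particular operations is an assumption made for illustration.

```python
def select_dentry_ops(fs_has_encoding, name_is_encrypted):
    """Return the smallest set of operation names needed for a lookup.

    Hypothetical model: casefolding is assumed to need hashing and
    comparison hooks, encryption is assumed to need revalidation.
    """
    ops = set()
    if fs_has_encoding:
        # Casefolding may be toggled on later, so it is keyed on the
        # filesystem defining an encoding rather than on the directory.
        ops.update({"d_hash", "d_compare"})
    if name_is_encrypted:
        ops.add("d_revalidate")
    return frozenset(ops)


# Plain filesystem, plain name: nothing is installed at all.
plain = select_dentry_ops(False, False)

# Filesystem that supports both features, looking up an encrypted name.
both = select_dentry_ops(True, True)
```

Returning an empty set for the plain case reflects the stated goal of staying compatible with overlayfs, which does not work on top of certain dentry operations.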
  19. 02 Dec, 2020 2 commits
  20. 11 Nov, 2020 1 commit