1. 25 Aug, 2021 · 1 commit
  2. 24 Aug, 2021 · 1 commit
  3. 23 Aug, 2021 · 1 commit
    • fs: remove mandatory file locking support · f7e33bdb
      Committed by Jeff Layton
      We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it
      off in Fedora and RHEL8. Several other distros have followed suit.
      
      I've heard of one problem in all that time: Someone migrated from an
      older distro that supported "-o mand" to one that didn't, and the host
      had a fstab entry with "mand" in it which broke on reboot. They didn't
      actually _use_ mandatory locking so they just removed the mount option
      and moved on.
      
      This patch rips out mandatory locking support wholesale from the kernel,
      along with the Kconfig option and the Documentation file. It also
      changes the mount code to ignore the "mand" mount option instead of
      erroring out, and to throw a big, ugly warning.
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      f7e33bdb
  4. 11 Aug, 2021 · 1 commit
  5. 13 Jul, 2021 · 2 commits
    • mm: Add functions to lock invalidate_lock for two mappings · 7506ae6a
      Committed by Jan Kara
      Some operations such as reflinking blocks among files will need to lock
      invalidate_lock for two mappings. Add helper functions to do that.
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      7506ae6a
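The idea of locking two mappings safely can be illustrated in user space. The following is a minimal sketch, not the kernel helper itself: `struct mapping`, `lock_two()` and `unlock_two()` are hypothetical names, and a pthread mutex stands in for the exclusively held invalidate_lock. It assumes the usual deadlock-avoidance rule of always taking the lock at the lower address first.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for struct address_space; only the lock matters. */
struct mapping {
    pthread_mutex_t invalidate_lock;
};

/*
 * Take both locks in ascending address order, so two callers that pass
 * the same pair in opposite order cannot deadlock against each other.
 * Either pointer may be NULL, and both may refer to the same mapping.
 */
static void lock_two(struct mapping *a, struct mapping *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        struct mapping *tmp = a;
        a = b;
        b = tmp;
    }
    if (a)
        pthread_mutex_lock(&a->invalidate_lock);
    if (b && b != a)
        pthread_mutex_lock(&b->invalidate_lock);
}

/* Release order does not matter for correctness. */
static void unlock_two(struct mapping *a, struct mapping *b)
{
    if (a)
        pthread_mutex_unlock(&a->invalidate_lock);
    if (b && b != a)
        pthread_mutex_unlock(&b->invalidate_lock);
}
```

A caller would bracket an operation that touches both files, such as a reflink, with `lock_two(&src, &dst)` and `unlock_two(&src, &dst)`.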
    • mm: Protect operations adding pages to page cache with invalidate_lock · 730633f0
      Committed by Jan Kara
      Currently, serializing operations such as page fault, read, or readahead
      against hole punching is rather difficult. The basic race scheme is
      like:
      
      fallocate(FALLOC_FL_PUNCH_HOLE)			read / fault / ..
        truncate_inode_pages_range()
      						  <create pages in page
      						   cache here>
        <update fs block mapping and free blocks>
      
      The problem is that this way read / page fault / readahead can
      instantiate pages in page cache with potentially stale data (if blocks
      get quickly reused). Avoiding this race is not simple - page locks do
      not work because we want to make sure there are *no* pages in the given
      range. inode->i_rwsem does not work because page fault happens under
      mmap_sem which ranks below inode->i_rwsem. Also using it for reads makes
      the performance for mixed read-write workloads suffer.
      
      So create a new rw_semaphore in the address_space - invalidate_lock -
      that protects adding of pages to page cache for page faults / reads /
      readahead.
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      730633f0
  6. 30 Jun, 2021 · 2 commits
  7. 24 Jun, 2021 · 1 commit
  8. 04 Jun, 2021 · 1 commit
  9. 07 May, 2021 · 1 commit
    • drivers/char: remove /dev/kmem for good · bbcd53c9
      Committed by David Hildenbrand
      Patch series "drivers/char: remove /dev/kmem for good".
      
      Exploring /dev/kmem and /dev/mem in the context of memory hot(un)plug and
      memory ballooning, I started questioning the existence of /dev/kmem.
      
      Comparing it with the /proc/kcore implementation, it does not seem to be
      able to deal with things like
      
      a) Pages unmapped from the direct mapping (e.g., to be used by secretmem)
        -> kern_addr_valid(). virt_addr_valid() is not sufficient.
      
      b) Special cases like gart aperture memory that is not to be touched
        -> mem_pfn_is_ram()
      
      Unless I am missing something, it's at least broken in some cases and might
      fault/crash the machine.
      
      Looks like its existence has been questioned before, in 2005 and 2010 [1].
      After ~11 additional years, it might make sense to revive the discussion.
      
      CONFIG_DEVKMEM is only enabled in a single defconfig (on purpose or by
      mistake?).  All distributions disable it: in Ubuntu it has been disabled
      for more than 10 years, in Debian since 2.6.31, in Fedora at least
      starting with FC3, in RHEL starting with RHEL4, in SUSE starting from
      15sp2, and OpenSUSE has it disabled as well.
      
      1) /dev/kmem was popular for rootkits [2] before it got disabled
         basically everywhere. Ubuntu documents [3] "There is no modern user of
         /dev/kmem any more beyond attackers using it to load kernel rootkits.".
         RHEL documents in a BZ [5] "it served no practical purpose other than to
         serve as a potential security problem or to enable binary module drivers
         to access structures/functions they shouldn't be touching"
      
      2) /proc/kcore is a decent interface to have a controlled way to read
         kernel memory for debugging purposes (it will need some extensions to
         deal with memory offlining/unplug, memory ballooning, and poisoned
         pages, though).
      
      3) It might be useful for corner case debugging [1]. KDB/KGDB might be a
         better fit, especially for writing random memory; it is harder to
         shoot yourself in the foot.
      
      4) "Kernel Memory Editor" [4] hasn't seen any updates since 2000 and seems
         to be incompatible with 64bit [1]. For educational purposes,
         /proc/kcore might be used to monitor value updates -- or older
         kernels can be used.
      
      5) It's broken on arm64, and therefore, completely disabled there.
      
      Looks like it's essentially unused and has been replaced by better
      suited interfaces for individual tasks (/proc/kcore, KDB/KGDB). Let's
      just remove it.
      
      [1] https://lwn.net/Articles/147901/
      [2] https://www.linuxjournal.com/article/10505
      [3] https://wiki.ubuntu.com/Security/Features#A.2Fdev.2Fkmem_disabled
      [4] https://sourceforge.net/projects/kme/
      [5] https://bugzilla.redhat.com/show_bug.cgi?id=154796
      
      Link: https://lkml.kernel.org/r/20210324102351.6932-1-david@redhat.com
      Link: https://lkml.kernel.org/r/20210324102351.6932-2-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Alexander A. Klimov" <grandmaster@al2klimov.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Corentin Labbe <clabbe@baylibre.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Gregory Clement <gregory.clement@bootlin.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: huang ying <huang.ying.caritas@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: James Troup <james.troup@canonical.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kairui Song <kasong@redhat.com>
      Cc: Krzysztof Kozlowski <krzk@kernel.org>
      Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Cc: Liviu Dudau <liviu.dudau@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Niklas Schnelle <schnelle@linux.ibm.com>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
      Cc: openrisc@lists.librecores.org
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Pavel Machek (CIP)" <pavel@denx.de>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Pierre Morel <pmorel@linux.ibm.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Cc: sparclinux@vger.kernel.org
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Theodore Dubois <tblodt@icloud.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Xiaoming Ni <nixiaoming@huawei.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bbcd53c9
  10. 06 May, 2021 · 1 commit
  11. 01 May, 2021 · 1 commit
    • mm: provide filemap_range_needs_writeback() helper · 63135aa3
      Committed by Jens Axboe
      Patch series "Improve IOCB_NOWAIT O_DIRECT reads", v3.
      
      An internal workload complained because it was using too much CPU, and
      when I took a look, we had a lot of io_uring workers going to town.
      
      For an async buffered-read-like workload, I normally expect _zero_
      offloads to a worker thread, but this one had tons of them.  I'd drop
      caches and things would look good again, but then a minute later we'd
      regress back to using workers.  Turns out that every minute something
      was reading parts of the device, which would add page cache for that
      inode.  I put patches like these in for our kernel, and the problem was
      solved.
      
      Don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
      entries for the given range.  This causes unnecessary work on the
      caller's side, when the IO could have been issued totally fine without
      blocking on writeback when there is none.
      
      This patch (of 3):
      
      For O_DIRECT reads/writes, we check if we need to issue a call to
      filemap_write_and_wait_range() to issue and/or wait for writeback for any
      page in the given range.  The existing mechanism just checks for a page in
      the range, which is suboptimal for IOCB_NOWAIT as we'll fall back to the
      slow path (and need a retry) if there's just a clean page cache page in
      the range.
      
      Provide filemap_range_needs_writeback() which tries a little harder to
      check if we actually need to issue and/or wait for writeback in the range.
      
      Link: https://lkml.kernel.org/r/20210224164455.1096727-1-axboe@kernel.dk
      Link: https://lkml.kernel.org/r/20210224164455.1096727-2-axboe@kernel.dk
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      63135aa3
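The difference between the two checks can be shown with a toy model. This sketch is an illustration only and does not reproduce the kernel implementation: the `TOY_*` state bits and both functions are hypothetical, and the page cache is reduced to an array of per-page flags. It assumes that "needs writeback" means a cached page that is dirty or currently under writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state bits for a toy page cache (not kernel flags). */
enum {
    TOY_PRESENT   = 1 << 0, /* page is in the cache */
    TOY_DIRTY     = 1 << 1, /* page has unwritten changes */
    TOY_WRITEBACK = 1 << 2, /* page is being written out */
};

/* Coarse check: is any page cached in [first, last]? */
static bool toy_range_has_page(const unsigned *pages, size_t npages,
                               size_t first, size_t last)
{
    for (size_t i = first; i <= last && i < npages; i++)
        if (pages[i] & TOY_PRESENT)
            return true;
    return false;
}

/* Finer check: is any cached page in [first, last] dirty or under writeback? */
static bool toy_range_needs_writeback(const unsigned *pages, size_t npages,
                                      size_t first, size_t last)
{
    for (size_t i = first; i <= last && i < npages; i++)
        if ((pages[i] & TOY_PRESENT) &&
            (pages[i] & (TOY_DIRTY | TOY_WRITEBACK)))
            return true;
    return false;
}
```

With only a clean page in the range, the coarse check reports true and would force the slow path, while the finer check reports false and lets a non-blocking direct read proceed.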
  12. 23 Apr, 2021 · 1 commit
  13. 12 Apr, 2021 · 2 commits
    • vfs: remove unused ioctl helpers · 51db776a
      Committed by Miklos Szeredi
      Remove vfs_ioc_setflags_prepare(), vfs_ioc_fssetxattr_check() and
      simple_fill_fsxattr(), which are no longer used.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      51db776a
    • vfs: add fileattr ops · 4c5b4799
      Committed by Miklos Szeredi
      There's a substantial amount of boilerplate in filesystems handling
      FS_IOC_[GS]ETFLAGS / FS_IOC_FS[GS]ETXATTR ioctls.
      
      Also due to userspace buffers being involved in the ioctl API this is
      difficult to stack, as shown by overlayfs issues related to these ioctls.
      
      Introduce a new internal API named "fileattr" (fsxattr can be confused with
      xattr, xflags is inappropriate, since this is more than just flags).
      
      There's significant overlap between flags and xflags and this API handles
      the conversions automatically, so filesystems may choose which one to use.
      
      In ->fileattr_get() a hint is provided to the filesystem whether flags or
      xattr are being requested by userspace, but in this series this hint is
      ignored by all filesystems, since generating all the attributes is cheap.
      
      If a filesystem doesn't implement the fileattr API, just fall back to
      f_op->ioctl().  When all filesystems are converted, the fallback can be
      removed.
      
      32bit compat ioctls are now handled by the generic code as well.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      4c5b4799
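The overlap between flags and xflags mentioned above can be sketched with the constants exported to user space in `<linux/fs.h>`. This is a minimal illustration, not the kernel's conversion code: the table and the functions `flags_to_xflags()` and `xflags_to_flags()` are hypothetical, and the table covers only a subset of attributes that exist in both namespaces. Bits without a counterpart are dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/fs.h>

/* One attribute expressed in both namespaces. */
struct flag_pair {
    uint32_t flag;  /* FS_*_FL, as used by FS_IOC_[GS]ETFLAGS */
    uint32_t xflag; /* FS_XFLAG_*, as used by FS_IOC_FS[GS]ETXATTR */
};

/* Illustrative subset of the attributes shared by both interfaces. */
static const struct flag_pair flag_pairs[] = {
    { FS_IMMUTABLE_FL,   FS_XFLAG_IMMUTABLE },
    { FS_APPEND_FL,      FS_XFLAG_APPEND },
    { FS_SYNC_FL,        FS_XFLAG_SYNC },
    { FS_NOATIME_FL,     FS_XFLAG_NOATIME },
    { FS_NODUMP_FL,      FS_XFLAG_NODUMP },
    { FS_PROJINHERIT_FL, FS_XFLAG_PROJINHERIT },
};

#define N_FLAG_PAIRS (sizeof(flag_pairs) / sizeof(flag_pairs[0]))

/* Translate FS_*_FL bits into the matching FS_XFLAG_* bits. */
static uint32_t flags_to_xflags(uint32_t flags)
{
    uint32_t xflags = 0;

    for (size_t i = 0; i < N_FLAG_PAIRS; i++)
        if (flags & flag_pairs[i].flag)
            xflags |= flag_pairs[i].xflag;
    return xflags;
}

/* Translate FS_XFLAG_* bits into the matching FS_*_FL bits. */
static uint32_t xflags_to_flags(uint32_t xflags)
{
    uint32_t flags = 0;

    for (size_t i = 0; i < N_FLAG_PAIRS; i++)
        if (xflags & flag_pairs[i].xflag)
            flags |= flag_pairs[i].flag;
    return flags;
}
```

Keeping the pairs in one table means a filesystem-side caller could fill in whichever representation it stores natively and derive the other, which is the convenience the commit message describes.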
  14. 08 Apr, 2021 · 1 commit
  15. 23 Mar, 2021 · 6 commits
  16. 08 Mar, 2021 · 1 commit
  17. 25 Feb, 2021 · 1 commit
  18. 28 Jan, 2021 · 1 commit
  19. 24 Jan, 2021 · 12 commits
  20. 14 Jan, 2021 · 2 commits