1. 24 10月, 2021 4 次提交
    • A
      iomap: Add done_before argument to iomap_dio_rw · 4fdccaa0
      Andreas Gruenbacher 提交于
      Add a done_before argument to iomap_dio_rw that indicates how much of
      the request has already been transferred.  When the request succeeds, we
      report that done_before additional bytes were tranferred.  This is
      useful for finishing a request asynchronously when part of the request
      has already been completed synchronously.
      
      We'll use that to allow iomap_dio_rw to be used with page faults
      disabled: when a page fault occurs while submitting a request, we
      synchronously complete the part of the request that has already been
      submitted.  The caller can then take care of the page fault and call
      iomap_dio_rw again for the rest of the request, passing in the number of
      bytes already tranferred.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
      4fdccaa0
    • A
      iomap: Support partial direct I/O on user copy failures · 97308f8b
      Andreas Gruenbacher 提交于
      In iomap_dio_rw, when iomap_apply returns an -EFAULT error and the
      IOMAP_DIO_PARTIAL flag is set, complete the request synchronously and
      return a partial result.  This allows the caller to deal with the page
      fault and retry the remainder of the request.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
      97308f8b
    • A
      iomap: Fix iomap_dio_rw return value for user copies · 42c498c1
      Andreas Gruenbacher 提交于
      When a user copy fails in one of the helpers of iomap_dio_rw, fail with
      -EFAULT instead of returning 0.  This matches what iomap_dio_bio_actor
      returns when it gets an -EFAULT from bio_iov_iter_get_pages.  With these
      changes, iomap_dio_actor now consistently fails with -EFAULT when a user
      page cannot be faulted in.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      42c498c1
    • A
      gfs2: Fix mmap + page fault deadlocks for buffered I/O · 00bfe02f
      Andreas Gruenbacher 提交于
      In the .read_iter and .write_iter file operations, we're accessing
      user-space memory while holding the inode glock.  There is a possibility
      that the memory is mapped to the same file, in which case we'd recurse
      on the same glock.
      
      We could detect and work around this simple case of recursive locking,
      but more complex scenarios exist that involve multiple glocks,
      processes, and cluster nodes, and working around all of those cases
      isn't practical or even possible.
      
      Avoid these kinds of problems by disabling page faults while holding the
      inode glock.  If a page fault would occur, we either end up with a
      partial read or write or with -EFAULT if nothing could be read or
      written.  In either case, we know that we're not done with the
      operation, so we indicate that we're willing to give up the inode glock
      and then we fault in the missing pages.  If that made us lose the inode
      glock, we return a partial read or write.  Otherwise, we resume the
      operation.
      
      This locking problem was originally reported by Jan Kara.  Linus came up
      with the idea of disabling page faults.  Many thanks to Al Viro and
      Matthew Wilcox for their feedback.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      00bfe02f
  2. 21 10月, 2021 5 次提交
    • A
      gfs2: Eliminate ip->i_gh · 1b223f70
      Andreas Gruenbacher 提交于
      Now that gfs2_file_buffered_write is the only remaining user of
      ip->i_gh, we can move the glock holder to the stack (or rather, use the
      one we already have on the stack); there is no need for keeping the
      holder in the inode anymore.
      
      This is slightly complicated by the fact that we're using ip->i_gh for
      the statfs inode in gfs2_file_buffered_write as well.  Writing to the
      statfs inode isn't very common, so allocate the statfs holder
      dynamically when needed.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      1b223f70
    • A
      gfs2: Move the inode glock locking to gfs2_file_buffered_write · b924bdab
      Andreas Gruenbacher 提交于
      So far, for buffered writes, we were taking the inode glock in
      gfs2_iomap_begin and dropping it in gfs2_iomap_end with the intention of
      not holding the inode glock while iomap_write_actor faults in user
      pages.  It turns out that iomap_write_actor is called inside iomap_begin
      ... iomap_end, so the user pages were still faulted in while holding the
      inode glock and the locking code in iomap_begin / iomap_end was
      completely pointless.
      
      Move the locking into gfs2_file_buffered_write instead.  We'll take care
      of the potential deadlocks due to faulting in user pages while holding a
      glock in a subsequent patch.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      b924bdab
    • B
      gfs2: Introduce flag for glock holder auto-demotion · dc732906
      Bob Peterson 提交于
      This patch introduces a new HIF_MAY_DEMOTE flag and infrastructure that
      will allow glocks to be demoted automatically on locking conflicts.
      When a locking request comes in that isn't compatible with the locking
      state of an active holder and that holder has the HIF_MAY_DEMOTE flag
      set, the holder will be demoted before the incoming locking request is
      granted.
      
      Note that this mechanism demotes active holders (with the HIF_HOLDER
      flag set), while before we were only demoting glocks without any active
      holders.  This allows processes to keep hold of locks that may form a
      cyclic locking dependency; the core glock logic will then break those
      dependencies in case a conflicting locking request occurs.  We'll use
      this to avoid giving up the inode glock proactively before faulting in
      pages.
      
      Processes that allow a glock holder to be taken away indicate this by
      calling gfs2_holder_allow_demote(), which sets the HIF_MAY_DEMOTE flag.
      Later, they call gfs2_holder_disallow_demote() to clear the flag again,
      and then they check if their holder is still queued: if it is, they are
      still holding the glock; if it isn't, they can re-acquire the glock (or
      abort).
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      dc732906
    • A
      gfs2: Clean up function may_grant · 61444649
      Andreas Gruenbacher 提交于
      Pass the first current glock holder into function may_grant and
      deobfuscate the logic there.
      
      While at it, switch from BUG_ON to GLOCK_BUG_ON in may_grant.  To make
      that build cleanly, de-constify the may_grant arguments.
      
      We're now using function find_first_holder in do_promote, so move the
      function's definition above do_promote.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      61444649
    • A
      gfs2: Add wrapper for iomap_file_buffered_write · 2eb7509a
      Andreas Gruenbacher 提交于
      Add a wrapper around iomap_file_buffered_write.  We'll add code for when
      the operation needs to be retried here later.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      2eb7509a
  3. 18 10月, 2021 2 次提交
    • A
      iov_iter: Turn iov_iter_fault_in_readable into fault_in_iov_iter_readable · a6294593
      Andreas Gruenbacher 提交于
      Turn iov_iter_fault_in_readable into a function that returns the number
      of bytes not faulted in, similar to copy_to_user, instead of returning a
      non-zero value when any of the requested pages couldn't be faulted in.
      This supports the existing users that require all pages to be faulted in
      as well as new users that are happy if any pages can be faulted in.
      
      Rename iov_iter_fault_in_readable to fault_in_iov_iter_readable to make
      sure this change doesn't silently break things.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      a6294593
    • A
      gup: Turn fault_in_pages_{readable,writeable} into fault_in_{readable,writeable} · bb523b40
      Andreas Gruenbacher 提交于
      Turn fault_in_pages_{readable,writeable} into versions that return the
      number of bytes not faulted in, similar to copy_to_user, instead of
      returning a non-zero value when any of the requested pages couldn't be
      faulted in.  This supports the existing users that require all pages to
      be faulted in as well as new users that are happy if any pages can be
      faulted in.
      
      Rename the functions to fault_in_{readable,writeable} to make sure
      this change doesn't silently break things.
      
      Neither of these functions is entirely trivial and it doesn't seem
      useful to inline them, so move them to mm/gup.c.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      bb523b40
  4. 07 10月, 2021 6 次提交
  5. 06 10月, 2021 1 次提交
  6. 05 10月, 2021 7 次提交
  7. 04 10月, 2021 1 次提交
    • C
      elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings · 9b2f72cc
      Chen Jingwen 提交于
      In commit b212921b ("elf: don't use MAP_FIXED_NOREPLACE for elf
      executable mappings") we still leave MAP_FIXED_NOREPLACE in place for
      load_elf_interp.
      
      Unfortunately, this will cause kernel to fail to start with:
      
          1 (init): Uhuuh, elf segment at 00003ffff7ffd000 requested but the memory is mapped already
          Failed to execute /init (error -17)
      
      The reason is that the elf interpreter (ld.so) has overlapping segments.
      
        readelf -l ld-2.31.so
        Program Headers:
          Type           Offset             VirtAddr           PhysAddr
                         FileSiz            MemSiz              Flags  Align
          LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                         0x000000000002c94c 0x000000000002c94c  R E    0x10000
          LOAD           0x000000000002dae0 0x000000000003dae0 0x000000000003dae0
                         0x00000000000021e8 0x0000000000002320  RW     0x10000
          LOAD           0x000000000002fe00 0x000000000003fe00 0x000000000003fe00
                         0x00000000000011ac 0x0000000000001328  RW     0x10000
      
      The reason for this problem is the same as described in commit
      ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments").
      
      Not only executable binaries, elf interpreters (e.g. ld.so) can have
      overlapping elf segments, so we better drop MAP_FIXED_NOREPLACE and go
      back to MAP_FIXED in load_elf_interp.
      
      Fixes: 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map")
      Cc: <stable@vger.kernel.org> # v4.19
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: NChen Jingwen <chenjingwen6@huawei.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9b2f72cc
  8. 02 10月, 2021 1 次提交
  9. 01 10月, 2021 9 次提交
  10. 30 9月, 2021 4 次提交