1. 24 10月, 2018 19 次提交
    • D
      afs: Implement YFS support in the fs client · 30062bd1
      David Howells 提交于
      Implement support for talking to YFS-variant fileservers in the cache
      manager and the filesystem client.  These implement upgraded services on
      the same port as their AFS services.
      
      YFS fileservers provide expanded capabilities over AFS.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      30062bd1
    • D
      afs: Expand data structure fields to support YFS · d4936803
      David Howells 提交于
      Expand fields in various data structures to support the expanded
      information that YFS is capable of returning.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      d4936803
    • D
      afs: Get the target vnode in afs_rmdir() and get a callback on it · f58db83f
      David Howells 提交于
      Get the target vnode in afs_rmdir() and validate it before we attempt the
      deletion, The vnode pointer will be passed through to the delivery function
      in a later patch so that the delivery function can mark it deleted.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      f58db83f
    • D
      afs: Calc callback expiry in op reply delivery · 12d8e95a
      David Howells 提交于
      Calculate the callback expiration time at the point of operation reply
      delivery, using the reply time queried from AF_RXRPC on that call as a
      base.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      12d8e95a
    • D
      afs: Fix FS.FetchStatus delivery from updating wrong vnode · 36bb5f49
      David Howells 提交于
      The FS.FetchStatus reply delivery function was updating inode of the
      directory in which a lookup had been done with the status of the looked up
      file.  This corrupts some of the directory state.
      
      Fixes: 5cf9dd55 ("afs: Prospectively look up extra files when doing a single lookup")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      36bb5f49
    • D
      afs: Implement the YFS cache manager service · 35dbfba3
      David Howells 提交于
      Implement the YFS cache manager service which gives extra capabilities on
      top of AFS.  This is done by listening for an additional service on the
      same port and indicating that anyone requesting an upgrade should be
      upgraded to the YFS port.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      35dbfba3
    • D
      afs: Remove callback details from afs_callback_break struct · 06aeb297
      David Howells 提交于
      Remove unnecessary details of a broken callback, such as version, expiry
      and type, from the afs_callback_break struct as they're not actually used
      and make the list take more memory.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      06aeb297
    • D
      afs: Commit the status on a new file/dir/symlink · 00671912
      David Howells 提交于
      Call the function to commit the status on a new file, dir or symlink so
      that the access rights for the caller's key are cached for that object.
      
      Without this, the next access to the file will cause a FetchStatus
      operation to be emitted to retrieve the access rights.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      00671912
    • D
      afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS · 3b6492df
      David Howells 提交于
      Increase the sizes of the volume ID to 64 bits and the vnode ID (inode
      number equivalent) to 96 bits to allow the support of YFS.
      
      This requires the iget comparator to check the vnode->fid rather than i_ino
      and i_generation as i_ino is not sufficiently capacious.  It also requires
      this data to be placed into the vnode cache key for fscache.
      
      For the moment, just discard the top 32 bits of the vnode ID when returning
      it though stat.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3b6492df
    • D
      afs: Don't invoke the server to read data beyond EOF · 2a0b4f64
      David Howells 提交于
      When writing a new page, clear space in the page rather than attempting to
      load it from the server if the space is beyond the EOF.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      2a0b4f64
    • D
      afs: Add a couple of tracepoints to log I/O errors · f51375cd
      David Howells 提交于
      Add a couple of tracepoints to log the production of I/O errors within the AFS
      filesystem.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      f51375cd
    • D
      afs: Handle EIO from delivery function · 4ac15ea5
      David Howells 提交于
      Fix afs_deliver_to_call() to handle -EIO being returned by the operation
      delivery function, indicating that the call found itself in the wrong
      state, by printing an error and aborting the call.
      
      Currently, an assertion failure will occur.  This can happen, say, if the
      delivery function falls off the end without calling afs_extract_data() with
      the want_more parameter set to false to collect the end of the Rx phase of
      a call.
      
      The assertion failure looks like:
      
      	AFS: Assertion failed
      	4 == 7 is false
      	0x4 == 0x7 is false
      	------------[ cut here ]------------
      	kernel BUG at fs/afs/rxrpc.c:462!
      
      and is matched in the trace buffer by a line like:
      
      kworker/7:3-3226 [007] ...1 85158.030203: afs_io_error: c=0003be0c r=-5 CM_REPLY
      
      Fixes: 98bf40cd ("afs: Protect call->state changes against signals")
      Reported-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      4ac15ea5
    • D
      afs: Fix TTL on VL server and address lists · ded2f4c5
      David Howells 提交于
      Currently the TTL on VL server and address lists isn't set in all
      circumstances and may be set to poor choices in others, since the TTL is
      derived from the SRV/AFSDB DNS record if and when available.
      
      Fix the TTL by limiting the range to a minimum and maximum from the current
      time.  At some point these can be made into sysctl knobs.  Further, use the
      TTL we obtained from the upcall to set the expiry on negative results too;
      in future a mechanism can be added to force reloading of such data.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      ded2f4c5
    • D
      afs: Implement VL server rotation · 0a5143f2
      David Howells 提交于
      Track VL servers as independent entities rather than lumping all their
      addresses together into one set and implement server-level rotation by:
      
       (1) Add the concept of a VL server list, where each server has its own
           separate address list.  This code is similar to the FS server list.
      
       (2) Use the DNS resolver to retrieve a set of servers and their associated
           addresses, ports, preference and weight ratings.
      
       (3) In the case of a legacy DNS resolver or an address list given directly
           through /proc/net/afs/cells, create a list containing just a dummy
           server record and attach all the addresses to that.
      
       (4) Implement a simple rotation policy, for the moment ignoring the
           priorities and weights assigned to the servers.
      
       (5) Show the address list through /proc/net/afs/<cell>/vlservers.  This
           also displays the source and status of the data as indicated by the
           upcall.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      0a5143f2
    • D
      afs: Improve FS server rotation error handling · e7f680f4
      David Howells 提交于
      Improve the error handling in FS server rotation by:
      
       (1) Cache the latest useful error value for the fs operation as a whole in
           struct afs_fs_cursor separately from the error cached in the
           afs_addr_cursor struct.  The one in the address cursor gets clobbered
           occasionally.  Copy over the error to the fs operation only when it's
           something we'd be interested in passing to userspace.
      
       (2) Make it so that EDESTADDRREQ is the default that is seen only if no
           addresses are available to be accessed.
      
       (3) When calling utility functions, such as checking a volume status or
           probing a fileserver, don't let a successful result clobber the cached
           error in the cursor; instead, stash the result in a temporary variable
           until it has been assessed.
      
       (4) Don't return ETIMEDOUT or ETIME if a better error, such as
           ENETUNREACH, is already cached.
      
       (5) On leaving the rotation loop, turn any remote abort code into a more
           useful error than ECONNABORTED.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      e7f680f4
    • D
      afs: Set up the iov_iter before calling afs_extract_data() · 12bdcf33
      David Howells 提交于
      afs_extract_data sets up a temporary iov_iter and passes it to AF_RXRPC
      each time it is called to describe the remaining buffer to be filled.
      
      Instead:
      
       (1) Put an iterator in the afs_call struct.
      
       (2) Set the iterator for each marshalling stage to load data into the
           appropriate places.  A number of convenience functions are provided to
           this end (eg. afs_extract_to_buf()).
      
           This iterator is then passed to afs_extract_data().
      
       (3) Use the new ITER_DISCARD iterator to discard any excess data provided
           by FetchData.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      12bdcf33
    • D
      afs: Better tracing of protocol errors · 160cb957
      David Howells 提交于
      Include the site of detection of AFS protocol errors in trace lines to
      better be able to determine what went wrong.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      160cb957
    • D
      iov_iter: Separate type from direction and use accessor functions · aa563d7b
      David Howells 提交于
      In the iov_iter struct, separate the iterator type from the iterator
      direction and use accessor functions to access them in most places.
      
      Convert a bunch of places to use switch-statements to access them rather
      then chains of bitwise-AND statements.  This makes it easier to add further
      iterator types.  Also, this can be more efficient as to implement a switch
      of small contiguous integers, the compiler can use ~50% fewer compare
      instructions than it has to use bitwise-and instructions.
      
      Further, cease passing the iterator type into the iterator setup function.
      The iterator function can set that itself.  Only the direction is required.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      aa563d7b
    • D
      iov_iter: Use accessor function · 00e23707
      David Howells 提交于
      Use accessor functions to access an iterator's type and direction.  This
      allows for the possibility of using some other method of determining the
      type of iterator than if-chains with bitwise-AND conditions.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      00e23707
  2. 18 10月, 2018 3 次提交
    • E
      fscache: Fix out of bound read in long cookie keys · fa520c47
      Eric Sandeen 提交于
      fscache_set_key() can incur an out-of-bounds read, reported by KASAN:
      
       BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x5b3/0x680 [fscache]
       Read of size 4 at addr ffff88084ff056d4 by task mount.nfs/32615
      
      and also reported by syzbot at https://lkml.org/lkml/2018/7/8/236
      
        BUG: KASAN: slab-out-of-bounds in fscache_set_key fs/fscache/cookie.c:120 [inline]
        BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x7a9/0x880 fs/fscache/cookie.c:171
        Read of size 4 at addr ffff8801d3cc8bb4 by task syz-executor907/4466
      
      This happens for any index_key_len which is not divisible by 4 and is
      larger than the size of the inline key, because the code allocates exactly
      index_key_len for the key buffer, but the hashing loop is stepping through
      it 4 bytes (u32) at a time in the buf[] array.
      
      Fix this by calculating how many u32 buffers we'll need by using
      DIV_ROUND_UP, and then using kcalloc() to allocate a precleared allocation
      buffer to hold the index_key, then using that same count as the hashing
      index limit.
      
      Fixes: ec0328e4 ("fscache: Maintain a catalogue of allocated cookies")
      Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa520c47
    • D
      fscache: Fix incomplete initialisation of inline key space · 1ff22883
      David Howells 提交于
      The inline key in struct rxrpc_cookie is insufficiently initialized,
      zeroing only 3 of the 4 slots, therefore an index_key_len between 13 and 15
      bytes will end up hashing uninitialized memory because the memcpy only
      partially fills the last buf[] element.
      
      Fix this by clearing fscache_cookie objects on allocation rather than using
      the slab constructor to initialise them.  We're going to pretty much fill
      in the entire struct anyway, so bringing it into our dcache writably
      shouldn't incur much overhead.
      
      This removes the need to do clearance in fscache_set_key() (where we aren't
      doing it correctly anyway).
      
      Also, we don't need to set cookie->key_len in fscache_set_key() as we
      already did it in the only caller, so remove that.
      
      Fixes: ec0328e4 ("fscache: Maintain a catalogue of allocated cookies")
      Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
      Reported-by: NEric Sandeen <sandeen@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ff22883
    • A
      cachefiles: fix the race between cachefiles_bury_object() and rmdir(2) · 169b8033
      Al Viro 提交于
      the victim might've been rmdir'ed just before the lock_rename();
      unlike the normal callers, we do not look the source up after the
      parents are locked - we know it beforehand and just recheck that it's
      still the child of what used to be its parent.  Unfortunately,
      the check is too weak - we don't spot a dead directory since its
      ->d_parent is unchanged, dentry is positive, etc.  So we sail all
      the way to ->rename(), with hosting filesystems _not_ expecting
      to be asked renaming an rmdir'ed subdirectory.
      
      The fix is easy, fortunately - the lock on parent is sufficient for
      making IS_DEADDIR() on child safe.
      
      Cc: stable@vger.kernel.org
      Fixes: 9ae326a6 (CacheFiles: A cache that backs onto a mounted filesystem)
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      169b8033
  3. 15 10月, 2018 1 次提交
    • D
      afs: Fix clearance of reply · f0a7d188
      David Howells 提交于
      The recent patch to fix the afs_server struct leak didn't actually fix the
      bug, but rather fixed some of the symptoms.  The problem is that an
      asynchronous call that holds a resource pointed to by call->reply[0] will
      find the pointer cleared in the call destructor, thereby preventing the
      resource from being cleaned up.
      
      In the case of the server record leak, the afs_fs_get_capabilities()
      function in devel code sets up a call with reply[0] pointing at the server
      record that should be altered when the result is obtained, but this was
      being cleared before the destructor was called, so the put in the
      destructor does nothing and the record is leaked.
      
      Commit f014ffb0 removed the additional ref obtained by
      afs_install_server(), but the removal of this ref is actually used by the
      garbage collector to mark a server record as being defunct after the record
      has expired through lack of use.
      
      The offending clearance of call->reply[0] upon completion in
      afs_process_async_call() has been there from the origin of the code, but
      none of the asynchronous calls actually use that pointer currently, so it
      should be safe to remove (note that synchronous calls don't involve this
      function).
      
      Fix this by the following means:
      
       (1) Revert commit f014ffb0.
      
       (2) Remove the clearance of reply[0] from afs_process_async_call().
      
      Without this, afs_manage_servers() will suffer an assertion failure if it
      sees a server record that didn't get used because the usage count is not 1.
      
      Fixes: f014ffb0 ("afs: Fix afs_server struct leak")
      Fixes: 08e0e7c8 ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f0a7d188
  4. 13 10月, 2018 3 次提交
  5. 12 10月, 2018 3 次提交
    • D
      afs: Fix afs_server struct leak · f014ffb0
      David Howells 提交于
      Fix a leak of afs_server structs.  The routine that installs them in the
      various lookup lists and trees gets a ref on leaving the function, whether
      it added the server or a server already exists.  It shouldn't increment
      the refcount if it added the server.
      
      The effect of this that "rmmod kafs" will hang waiting for the leaked
      server to become unused.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f014ffb0
    • A
      gfs2: Fix iomap buffered write support for journaled files (2) · fee5150c
      Andreas Gruenbacher 提交于
      It turns out that the fix in commit 6636c3cc56 is bad; the assertion
      that the iomap code no longer creates buffer heads is incorrect for
      filesystems that set the IOMAP_F_BUFFER_HEAD flag.
      
      Instead, what's happening is that gfs2_iomap_begin_write treats all
      files that have the jdata flag set as journaled files, which is
      incorrect as long as those files are inline ("stuffed").  We're handling
      stuffed files directly via the page cache, which is why we ended up with
      pages without buffer heads in gfs2_page_add_databufs.
      
      Fix this by handling stuffed journaled files correctly in
      gfs2_iomap_begin_write.
      
      This reverts commit 6636c3cc5690c11631e6366cf9a28fb99c8b25bb.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      fee5150c
    • D
      afs: Fix cell proc list · 6b3944e4
      David Howells 提交于
      Access to the list of cells by /proc/net/afs/cells has a couple of
      problems:
      
       (1) It should be checking against SEQ_START_TOKEN for the keying the
           header line.
      
       (2) It's only holding the RCU read lock, so it can't just walk over the
           list without following the proper RCU methods.
      
      Fix these by using an hlist instead of an ordinary list and using the
      appropriate accessor functions to follow it with RCU.
      
      Since the code that adds a cell to the list must also necessarily change,
      sort the list on insertion whilst we're at it.
      
      Fixes: 989782dc ("afs: Overhaul cell database management")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b3944e4
  6. 10 10月, 2018 1 次提交
  7. 09 10月, 2018 1 次提交
    • D
      filesystem-dax: Fix dax_layout_busy_page() livelock · d7782145
      Dan Williams 提交于
      In the presence of multi-order entries the typical
      pagevec_lookup_entries() pattern may loop forever:
      
      	while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
      				min(end - index, (pgoff_t)PAGEVEC_SIZE),
      				indices)) {
      		...
      		for (i = 0; i < pagevec_count(&pvec); i++) {
      			index = indices[i];
      			...
      		}
      		index++; /* BUG */
      	}
      
      The loop updates 'index' for each index found and then increments to the
      next possible page to continue the lookup. However, if the last entry in
      the pagevec is multi-order then the next possible page index is more
      than 1 page away. Fix this locally for the filesystem-dax case by
      checking for dax-multi-order entries. Going forward new users of
      multi-order entries need to be similarly careful, or we need a generic
      way to report the page increment in the radix iterator.
      
      Fixes: 5fac7408 ("mm, fs, dax: handle layout changes to pinned dax...")
      Cc: <stable@vger.kernel.org>
      Cc: Ross Zwisler <zwisler@kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      d7782145
  8. 06 10月, 2018 5 次提交
    • D
      xfs: fix data corruption w/ unaligned reflink ranges · b3998900
      Dave Chinner 提交于
      When reflinking sub-file ranges, a data corruption can occur when
      the source file range includes a partial EOF block. This shares the
      unknown data beyond EOF into the second file at a position inside
      EOF, exposing stale data in the second file.
      
      XFS only supports whole block sharing, but we still need to
      support whole file reflink correctly.  Hence if the reflink
      request includes the last block of the souce file, only proceed with
      the reflink operation if it lands at or past the destination file's
      current EOF. If it lands within the destination file EOF, reject the
      entire request with -EINVAL and make the caller go the hard way.
      
      This avoids the data corruption vector, but also avoids disruption
      of returning EINVAL to userspace for the common case of whole file
      cloning.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      b3998900
    • D
      xfs: fix data corruption w/ unaligned dedupe ranges · dceeb47b
      Dave Chinner 提交于
      A deduplication data corruption is Exposed by fstests generic/505 on
      XFS. It is caused by extending the block match range to include the
      partial EOF block, but then allowing unknown data beyond EOF to be
      considered a "match" to data in the destination file because the
      comparison is only made to the end of the source file. This corrupts
      the destination file when the source extent is shared with it.
      
      XFS only supports whole block dedupe, but we still need to appear to
      support whole file dedupe correctly.  Hence if the dedupe request
      includes the last block of the souce file, don't include it in the
      actual XFS dedupe operation. If the rest of the range dedupes
      successfully, then report the partial last block as deduped, too, so
      that userspace sees it as a successful dedupe rather than return
      EINVAL because we can't dedupe unaligned blocks.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      dceeb47b
    • A
      ocfs2: fix locking for res->tracking and dlm->tracking_list · cbe355f5
      Ashish Samant 提交于
      In dlm_init_lockres() we access and modify res->tracking and
      dlm->tracking_list without holding dlm->track_lock.  This can cause list
      corruptions and can end up in kernel panic.
      
      Fix this by locking res->tracking and dlm->tracking_list with
      dlm->track_lock instead of dlm->spinlock.
      
      Link: http://lkml.kernel.org/r/1529951192-4686-1-git-send-email-ashish.samant@oracle.comSigned-off-by: NAshish Samant <ashish.samant@oracle.com>
      Reviewed-by: NChangwei Ge <ge.changwei@h3c.com>
      Acked-by: NJoseph Qi <jiangqi903@gmail.com>
      Acked-by: NJun Piao <piaojun@huawei.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cbe355f5
    • J
      proc: restrict kernel stack dumps to root · f8a00cef
      Jann Horn 提交于
      Currently, you can use /proc/self/task/*/stack to cause a stack walk on
      a task you control while it is running on another CPU.  That means that
      the stack can change under the stack walker.  The stack walker does
      have guards against going completely off the rails and into random
      kernel memory, but it can interpret random data from your kernel stack
      as instruction pointers and stack pointers.  This can cause exposure of
      kernel stack contents to userspace.
      
      Restrict the ability to inspect kernel stacks of arbitrary tasks to root
      in order to prevent a local attacker from exploiting racy stack unwinding
      to leak kernel task stack contents.  See the added comment for a longer
      rationale.
      
      There don't seem to be any users of this userspace API that can't
      gracefully bail out if reading from the file fails.  Therefore, I believe
      that this change is unlikely to break things.  In the case that this patch
      does end up needing a revert, the next-best solution might be to fake a
      single-entry stack based on wchan.
      
      Link: http://lkml.kernel.org/r/20180927153316.200286-1-jannh@google.com
      Fixes: 2ec220e2 ("proc: add /proc/*/stack")
      Signed-off-by: NJann Horn <jannh@google.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Ken Chen <kenchen@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f8a00cef
    • L
      ocfs2: fix crash in ocfs2_duplicate_clusters_by_page() · 69eb7765
      Larry Chen 提交于
      ocfs2_duplicate_clusters_by_page() may crash if one of the extent's pages
      is dirty.  When a page has not been written back, it is still in dirty
      state.  If ocfs2_duplicate_clusters_by_page() is called against the dirty
      page, the crash happens.
      
      To fix this bug, we can just unlock the page and wait until the page until
      its not dirty.
      
      The following is the backtrace:
      
      kernel BUG at /root/code/ocfs2/refcounttree.c:2961!
      [exception RIP: ocfs2_duplicate_clusters_by_page+822]
      __ocfs2_move_extent+0x80/0x450 [ocfs2]
      ? __ocfs2_claim_clusters+0x130/0x250 [ocfs2]
      ocfs2_defrag_extent+0x5b8/0x5e0 [ocfs2]
      __ocfs2_move_extents_range+0x2a4/0x470 [ocfs2]
      ocfs2_move_extents+0x180/0x3b0 [ocfs2]
      ? ocfs2_wait_for_recovery+0x13/0x70 [ocfs2]
      ocfs2_ioctl_move_extents+0x133/0x2d0 [ocfs2]
      ocfs2_ioctl+0x253/0x640 [ocfs2]
      do_vfs_ioctl+0x90/0x5f0
      SyS_ioctl+0x74/0x80
      do_syscall_64+0x74/0x140
      entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Once we find the page is dirty, we do not wait until it's clean, rather we
      use write_one_page() to write it back
      
      Link: http://lkml.kernel.org/r/20180829074740.9438-1-lchen@suse.com
      [lchen@suse.com: update comments]
        Link: http://lkml.kernel.org/r/20180830075041.14879-1-lchen@suse.com
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NLarry Chen <lchen@suse.com>
      Acked-by: NChangwei Ge <ge.changwei@h3c.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      69eb7765
  9. 05 10月, 2018 3 次提交
  10. 04 10月, 2018 1 次提交