1. 12 7月, 2018 12 次提交
    • C
      xfs: remove xfs_reflink_find_cow_mapping · 060d4eaa
      Christoph Hellwig 提交于
      We only have one caller left, and open coding the simple extent list
      lookup in it allows us to make the code both more understandable and
      reuse calculations and variables already present.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      060d4eaa
    • C
    • D
      xfs: make xfs_writepage_map extent map centric · e2f6ad46
      Dave Chinner 提交于
      xfs_writepage_map() iterates over the bufferheads on a page to decide
      what sort of IO to do and what actions to take.  However, when it comes
      to reflink and deciding when it needs to execute a COW operation, we no
      longer look at the bufferhead state but instead we ignore than and look
      up internal state held in the COW fork extent list.
      
      This means xfs_writepage_map() is somewhat confused. It does stuff, then
      ignores it, then tries to handle the impedence mismatch by shovelling the
      results inside the existing mapping code.  It works, but it's a bit of a
      mess and it makes it hard to fix the cached map bug that the writepage
      code currently has.
      
      To unify the two different mechanisms, we first have to choose a direction.
      That's already been set - we're de-emphasising bufferheads so they are no
      longer a control structure as we need to do taht to allow for eventual
      removal.  Hence we need to move away from looking at bufferhead state to
      determine what operations we need to perform.
      
      We can't completely get rid of bufferheads yet - they do contain some
      state that is absolutely necessary, such as whether that part of the page
      contains valid data or not (buffer_uptodate()).  Other state in the
      bufferhead is redundant:
      
      	BH_dirty - the page is dirty, so we can ignore this and just
      		write it
      	BH_delay - we have delalloc extent info in the DATA fork extent
      		tree
      	BH_unwritten - same as BH_delay
      	BH_mapped - indicates we've already used it once for IO and it is
      		mapped to a disk address. Needs to be ignored for COW
      		blocks.
      
      The BH_mapped flag is an interesting case - it's supposed to indicate that
      it's already mapped to disk and so we can just use it "as is".  In theory,
      we don't even have to do an extent lookup to find where to write it too,
      but we have to do that anyway to determine we are actually writing over a
      valid extent.  Hence it's not even serving the purpose of avoiding a an
      extent lookup during writeback, and so we can pretty much ignore it.
      Especially as we have to ignore it for COW operations...
      
      Therefore, use the extent map as the source of information to tell us
      what actions we need to take and what sort of IO we should perform.  The
      first step is to have xfs_map_blocks() set the io type according to what
      it looks up.  This means it can easily handle both normal overwrite and
      COW cases.  The only thing we also need to add is the ability to return
      hole mappings.
      
      We need to return and cache hole mappings now for the case of multiple
      blocks per page.  We no longer use the BH_mapped to indicate a block over
      a hole, so we have to get that info from xfs_map_blocks().  We cache it so
      that holes that span two pages don't need separate lookups.  This allows us
      to avoid ever doing write IO over a hole, too.
      
      Now that we have xfs_map_blocks() returning both a cached map and the type
      of IO we need to perform, we can rewrite xfs_writepage_map() to drop all
      the bufferhead control. It's also much simplified because it doesn't need
      to explicitly handle COW operations.  Instead of iterating bufferheads, it
      iterates blocks within the page and then looks up what per-block state is
      required from the appropriate bufferhead.  It then validates the cached
      map, and if it's not valid, we get a new map.  If we don't get a valid map
      or it's over a hole, we skip the block.
      
      At this point, we have to remap the bufferhead via xfs_map_at_offset().
      As previously noted, we had to do this even if the buffer was already
      mapped as the mapping would be stale for XFS_IO_DELALLOC, XFS_IO_UNWRITTEN
      and XFS_IO_COW IO types.  With xfs_map_blocks() now controlling the type,
      even XFS_IO_OVERWRITE types need remapping, as converted-but-not-yet-
      written delalloc extents beyond EOF can be reported at XFS_IO_OVERWRITE.
      Bufferheads that span such regions still need their BH_Delay flags cleared
      and their block numbers calculated, so we now unconditionally map each
      bufferhead before submission.
      
      But wait! There's more - remember the old "treat unwritten extents as
      holes on read" hack?  Yeah, that means we can have a dirty page with
      unmapped, unwritten bufferheads that contain data!  What makes these so
      special is that the unwritten "hole" bufferheads do not have a valid block
      device pointer, so if we attempt to write them xfs_add_to_ioend() blows
      up. So we make xfs_map_at_offset() do the "realtime or data device"
      lookup from the inode and ignore what was or wasn't put into the
      bufferhead when the buffer was instantiated.
      
      The astute reader will have realised by now that this code treats
      unwritten extents in multiple-blocks-per-page situations differently.
      If we get any combination of unwritten blocks on a dirty page that contain
      valid data in the page, we're going to convert them to real extents.  This
      can actually be a win, because it means that pages with interleaving
      unwritten and written blocks will get converted to a single written extent
      with zeros replacing the interspersed unwritten blocks.  This is actually
      good for reducing extent list and conversion overhead, and it means we
      issue a contiguous IO instead of lots of little ones.  The downside is
      that we use up a little extra IO bandwidth.  Neither of these seem like a
      bad thing given that spinning disks are seek sensitive, and SSDs/pmem have
      bandwidth to burn and the lower Io latency/CPU overhead of fewer, larger
      IOs will result in better performance on them...
      
      As a result of all this, the only state we actually care about from the
      bufferhead is a single flag - BH_Uptodate. We still use the bufferhead to
      pass some information to the bio via xfs_add_to_ioend(), but that is
      trivial to separate and pass explicitly.  This means we really only need
      1 bit of state per block per page from the buffered write path in the
      writeback path.  Everything else we do with the bufferhead is purely to
      make the buffered IO front end continue to work correctly. i.e we've
      pretty much marginalised bufferheads in the writeback path completely.
      Signed-off-By: NDave Chinner <dchinner@redhat.com>
      [hch: forward port, refactor and split off bits into other commits]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      e2f6ad46
    • C
      xfs: rename the offset variable in xfs_writepage_map · 6a4c9501
      Christoph Hellwig 提交于
      Calling it file_offset makes the usage more clear, especially with
      a new poffset variable that will be added soon for the offset inside
      the page.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      6a4c9501
    • C
      xfs: remove xfs_map_cow · 5c665e5b
      Christoph Hellwig 提交于
      We can handle the existing cow mapping case as a special case directly
      in xfs_writepage_map, and share code for allocating delalloc blocks
      with regular I/O in xfs_map_blocks.  This means we need to always
      call xfs_map_blocks for reflink inodes, but we can still skip most of
      the work if it turns out that there is no COW mapping overlapping the
      current block.
      
      As a subtle detail we need to start caching holes in the wpc to deal
      with the case of COW reservations between EOF.  But we'll need that
      infrastructure later anyway, so this is no big deal.
      
      Based on a patch from Dave Chinner.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      5c665e5b
    • C
      xfs: remove xfs_reflink_trim_irec_to_next_cow · fca8c805
      Christoph Hellwig 提交于
      We already have to check for overlapping COW extents everytime we
      come back to a page in xfs_writepage_map / xfs_map_cow, so this
      additional trim is not required.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      fca8c805
    • C
      xfs: don't use XFS_BMAPI_IGSTATE in xfs_map_blocks · a7b28f72
      Christoph Hellwig 提交于
      We want to be able to use the extent state as a reliably indicator for
      the type of I/O, and stop using the buffer head state.  For this we
      need to stop using the XFS_BMAPI_IGSTATE so that we don't see merged
      extents of different types.
      
      Based on a patch from Dave Chinner.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      a7b28f72
    • C
      xfs: don't clear imap_valid for a non-uptodate buffers · c57371a1
      Christoph Hellwig 提交于
      Finding a buffer that isn't uptodate doesn't invalidate the mapping for
      any given block.  The last_sector check will already take care of starting
      another ioend as soon as we find any non-update buffer, and if the current
      mapping doesn't include the next uptodate buffer the xfs_imap_valid check
      will take care of it.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      c57371a1
    • C
      xfs: do not set the page uptodate in xfs_writepage_map · 91cdfd17
      Christoph Hellwig 提交于
      We already track the page uptodate status based on the buffer uptodate
      status, which is updated whenever reading or zeroing blocks.
      
      This code has been there since commit a ptool commit in 2002, which
      claims to:
      
          "merge" the 2.4 fsx fix for block size < page size to 2.5.  This needed
          major changes to actually fit.
      
      and isn't present in other writepage implementations.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      91cdfd17
    • C
      xfs: move locking into xfs_bmap_punch_delalloc_range · d4380177
      Christoph Hellwig 提交于
      Both callers want the same looking, so do it only once.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      d4380177
    • C
      xfs: simplify xfs_aops_discard_page · 03625721
      Christoph Hellwig 提交于
      Instead of looking at the buffer heads to see if a block is delalloc just
      call xfs_bmap_punch_delalloc_range on the whole page - this will leave
      any non-delalloc block intact and handle the iteration for us.  As a side
      effect one more place stops caring about buffer heads and we can remove the
      xfs_check_page_type function entirely.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      03625721
    • C
      xfs: use iomap for blocksize == PAGE_SIZE readpage and readpages · 8b2e77c1
      Christoph Hellwig 提交于
      For file systems with a block size that equals the page size we never do
      partial reads, so we can use the buffer_head-less iomap versions of
      readpage and readpages without conflicting with the buffer_head structures
      create later in write_begin.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      8b2e77c1
  2. 25 6月, 2018 8 次提交
    • D
      xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation · d8cb5e42
      Darrick J. Wong 提交于
      In __xfs_ag_resv_init we incorrectly calculate the amount by which to
      decrease fdblocks when reserving blocks for the rmapbt.  Because rmapbt
      allocations do not decrease fdblocks, we must decrease fdblocks by the
      entire size of the requested reservation in order to achieve our goal of
      always having enough free blocks to satisfy an rmapbt expansion.
      
      This is in contrast to the refcountbt/finobt, which /do/ subtract from
      fdblocks whenever they allocate a block.  For this allocation type we
      preserve the existing behavior where we decrease fdblocks only by the
      requested reservation minus the size of the existing tree.
      
      This fixes the problem where the available block counts reported by
      statfs change across a remount if there had been an rmapbt size change
      since mount time.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      d8cb5e42
    • D
      xfs: ensure post-EOF zeroing happens after zeroing part of a file · e53c4b59
      Darrick J. Wong 提交于
      If a user asks us to zero_range part of a file, the end of the range is
      EOF, and not aligned to a page boundary, invoke writeback of the EOF
      page to ensure that the post-EOF part of the page is zeroed.  This
      ensures that we don't expose stale memory contents via mmap, if in a
      clumsy manner.
      
      Found by running generic/127 when it runs zero_range and mapread at EOF
      one after the other.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      e53c4b59
    • D
      xfs: fix off-by-one error in xfs_rtalloc_query_range · a3a374bf
      Darrick J. Wong 提交于
      In commit 8ad560d2 ("xfs: strengthen rtalloc query range checks")
      we strengthened the input parameter checks in the rtbitmap range query
      function, but introduced an off-by-one error in the process.  The call
      to xfs_rtfind_forw deals with the high key being rextents, but we clamp
      the high key to rextents - 1.  This causes the returned results to stop
      one block short of the end of the rtdev, which is incorrect.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      a3a374bf
    • D
      xfs: fix uninitialized field in rtbitmap fsmap backend · 232d0a24
      Darrick J. Wong 提交于
      Initialize the extent count field of the high key so that when we use
      the high key to synthesize an 'unknown owner' record (i.e. used space
      record) at the end of the queried range we have a field with which to
      compute rm_blockcount.  This is not strictly necessary because the
      synthesizer never uses the rm_blockcount field, but we can shut up the
      static code analysis anyway.
      
      Coverity-id: 1437358
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      232d0a24
    • D
      xfs: recheck reflink state after grabbing ILOCK_SHARED for a write · 5bd88d15
      Darrick J. Wong 提交于
      The reflink iflag could have changed since the earlier unlocked check,
      so if we got ILOCK_SHARED for a write and but we're now a reflink inode
      we have to switch to ILOCK_EXCL and relock.
      
      This helps us avoid blowing lock assertions in things like generic/166:
      
      XFS: Assertion failed: xfs_isilocked(ip, XFS_ILOCK_EXCL), file: fs/xfs/xfs_reflink.c, line: 383
      WARNING: CPU: 1 PID: 24707 at fs/xfs/xfs_message.c:104 assfail+0x25/0x30 [xfs]
      Modules linked in: deadline_iosched dm_snapshot dm_bufio ext4 mbcache jbd2 dm_flakey xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: scsi_debug]
      CPU: 1 PID: 24707 Comm: xfs_io Not tainted 4.18.0-rc1-djw #1
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1 04/01/2014
      RIP: 0010:assfail+0x25/0x30 [xfs]
      Code: ff 0f 0b c3 90 66 66 66 66 90 48 89 f1 41 89 d0 48 c7 c6 e8 ef 1b a0 48 89 fa 31 ff e8 54 f9 ff ff 80 3d fd ba 0f 00 00 75 03 <0f> 0b c3 0f 0b 66 0f 1f 44 00 00 66 66 66 66 90 48 63 f6 49 89 f9
      RSP: 0018:ffffc90006423ad8 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff880030b65e80 RCX: 0000000000000000
      RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa01b0447
      RBP: ffffc90006423c10 R08: 0000000000000000 R09: 0000000000000000
      R10: ffff88003d43fc30 R11: f000000000000000 R12: ffff880077cda000
      R13: 0000000000000000 R14: ffffc90006423c30 R15: ffffc90006423bf9
      FS:  00007feba8986800(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000000138ab58 CR3: 000000003d40a000 CR4: 00000000000006a0
      Call Trace:
       xfs_reflink_allocate_cow+0x24c/0x3d0 [xfs]
       xfs_file_iomap_begin+0x6d2/0xeb0 [xfs]
       ? iomap_to_fiemap+0x80/0x80
       iomap_apply+0x5e/0x130
       iomap_dio_rw+0x2e0/0x400
       ? iomap_to_fiemap+0x80/0x80
       ? xfs_file_dio_aio_write+0x133/0x4a0 [xfs]
       xfs_file_dio_aio_write+0x133/0x4a0 [xfs]
       xfs_file_write_iter+0x7b/0xb0 [xfs]
       __vfs_write+0x16f/0x1f0
       vfs_write+0xc8/0x1c0
       ksys_pwrite64+0x74/0x90
       do_syscall_64+0x56/0x180
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      5bd88d15
    • D
      xfs: don't allow insert-range to shift extents past the maximum offset · f62cb48e
      Darrick J. Wong 提交于
      Zorro Lang reports that generic/485 blows an assert on a filesystem with
      512 byte blocks.  The test tries to fallocate a post-eof extent at the
      maximum file size and calls insert range to shift the extents right by
      two blocks.  On a 512b block filesystem this causes startoff to overflow
      the 54-bit startoff field, leading to the assert.
      
      Therefore, always check the rightmost extent to see if it would overflow
      prior to invoking the insert range machinery.
      
      Reported-by: zlang@redhat.com
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200137Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      f62cb48e
    • D
      xfs: don't trip over negative free space in xfs_reserve_blocks · aafe12ce
      Darrick J. Wong 提交于
      If we somehow end up with a filesystem that has fewer free blocks than
      the blocks set aside to avoid ENOSPC deadlocks, it's possible that the
      free space calculation in xfs_reserve_blocks will spit out a negative
      number (because percpu_counter_sum returns s64).  We fail to notice
      this negative number and set fdblks_delta to it.  Now we increment
      fdblocks(!) and the unsigned type of m_resblks means that we end up
      setting a ridiculously huge m_resblks reservation.
      
      Avoid this comedy of errors by detecting the negative free space and
      returning -ENOSPC.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      aafe12ce
    • D
      xfs: allow empty transactions while frozen · 10ee2526
      Darrick J. Wong 提交于
      In commit e89c0413 ("xfs: implement the GETFSMAP ioctl") we
      created the ability to obtain empty transactions.  These transactions
      have no log or block reservations and therefore can't modify anything.
      Since they're also NO_WRITECOUNT they can run while the fs is frozen,
      so we don't need to WARN_ON about that usage.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      10ee2526
  3. 22 6月, 2018 3 次提交
    • D
      xfs: xfs_iflush_abort() can be called twice on cluster writeback failure · e53946db
      Dave Chinner 提交于
      When a corrupt inode is detected during xfs_iflush_cluster, we can
      get a shutdown ASSERT failure like this:
      
      XFS (pmem1): Metadata corruption detected at xfs_symlink_shortform_verify+0x5c/0xa0, inode 0x86627 data fork
      XFS (pmem1): Unmount and run xfs_repair
      XFS (pmem1): xfs_do_force_shutdown(0x8) called from line 3372 of file fs/xfs/xfs_inode.c.  Return address = ffffffff814f4116
      XFS (pmem1): Corruption of in-memory data detected.  Shutting down filesystem
      XFS (pmem1): xfs_do_force_shutdown(0x1) called from line 222 of file fs/xfs/libxfs/xfs_defer.c.  Return address = ffffffff814a8a88
      XFS (pmem1): xfs_do_force_shutdown(0x1) called from line 222 of file fs/xfs/libxfs/xfs_defer.c.  Return address = ffffffff814a8ef9
      XFS (pmem1): Please umount the filesystem and rectify the problem(s)
      XFS: Assertion failed: xfs_isiflocked(ip), file: fs/xfs/xfs_inode.h, line: 258
      .....
      Call Trace:
       xfs_iflush_abort+0x10a/0x110
       xfs_iflush+0xf3/0x390
       xfs_inode_item_push+0x126/0x1e0
       xfsaild+0x2c5/0x890
       kthread+0x11c/0x140
       ret_from_fork+0x24/0x30
      
      Essentially, xfs_iflush_abort() has been called twice on the
      original inode that that was flushed. This happens because the
      inode has been flushed to teh buffer successfully via
      xfs_iflush_int(), and so when another inode is detected as corrupt
      in xfs_iflush_cluster, the buffer is marked stale and EIO, and
      iodone callbacks are run on it.
      
      Running the iodone callbacks walks across the original inode and
      calls xfs_iflush_abort() on it. When xfs_iflush_cluster() returns
      to xfs_iflush(), it runs the error path for that function, and that
      calls xfs_iflush_abort() on the inode a second time, leading to the
      above assert failure as the inode is not flush locked anymore.
      
      This bug has been there a long time.
      
      The simple fix would be to just avoid calling xfs_iflush_abort() in
      xfs_iflush() if we've got a failure from xfs_iflush_cluster().
      However, xfs_iflush_cluster() has magic delwri buffer handling that
      means it may or may not have run IO completion on the buffer, and
      hence sometimes we have to call xfs_iflush_abort() from
      xfs_iflush(), and sometimes we shouldn't.
      
      After reading through all the error paths and the delwri buffer
      code, it's clear that the error handling in xfs_iflush_cluster() is
      unnecessary. If the buffer is delwri, it leaves it on the delwri
      list so that when the delwri list is submitted it sees a shutdown
      fliesystem in xfs_buf_submit() and that marks the buffer stale, EIO
      and runs IO completion. i.e. exactly what xfs+iflush_cluster() does
      when it's not a delwri buffer. Further, marking a buffer stale
      clears the _XBF_DELWRI_Q flag on the buffer, which means when
      submission of the buffer occurs, it just skips over it and releases
      it.
      
      IOWs, the error handling in xfs_iflush_cluster doesn't need to care
      if the buffer is already on a the delwri queue or not - it just
      needs to mark the buffer stale, EIO and run completions. That means
      we can just use the easy fix for xfs_iflush() to avoid the double
      abort.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      e53946db
    • D
      xfs: More robust inode extent count validation · 23fcb334
      Dave Chinner 提交于
      When the inode is in extent format, it can't have more extents that
      fit in the inode fork. We don't currenty check this, and so this
      corruption goes unnoticed by the inode verifiers. This can lead to
      crashes operating on invalid in-memory structures.
      
      Attempts to access such a inode will now error out in the verifier
      rather than allowing modification operations to proceed.
      Reported-by: NWen Xu <wen.xu@gatech.edu>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      [darrick: fix a typedef, add some braces and breaks to shut up compiler warnings]
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      23fcb334
    • C
      xfs: simplify xfs_bmap_punch_delalloc_range · e2ac8363
      Christoph Hellwig 提交于
      Instead of using xfs_bmapi_read to find delalloc extents and then punch
      them out using xfs_bunmapi, opencode the loop to iterate over the extents
      and call xfs_bmap_del_extent_delay directly.  This both simplifies the
      code and reduces the number of extent tree lookups required.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      e2ac8363
  4. 21 6月, 2018 1 次提交
  5. 12 6月, 2018 1 次提交
  6. 09 6月, 2018 6 次提交
  7. 07 6月, 2018 2 次提交
    • A
      xfs: fix string handling in label get/set functions · 4bb8b65a
      Arnd Bergmann 提交于
      [sandeen: fix subject, avoid copy-out of uninit data in getlabel]
      
      gcc-8 reports two warnings for the newly added getlabel/setlabel code:
      
      fs/xfs/xfs_ioctl.c: In function 'xfs_ioc_getlabel':
      fs/xfs/xfs_ioctl.c:1822:38: error: argument to 'sizeof' in 'strncpy' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
        strncpy(label, sbp->sb_fname, sizeof(sbp->sb_fname));
                                            ^
      In function 'strncpy',
          inlined from 'xfs_ioc_setlabel' at /git/arm-soc/fs/xfs/xfs_ioctl.c:1863:2,
          inlined from 'xfs_file_ioctl' at /git/arm-soc/fs/xfs/xfs_ioctl.c:1918:10:
      include/linux/string.h:254:9: error: '__builtin_strncpy' output may be truncated copying 12 bytes from a string of length 12 [-Werror=stringop-truncation]
        return __builtin_strncpy(p, q, size);
      
      In both cases, part of the problem is that one of the strncpy()
      arguments is a fixed-length character array with zero-padding rather
      than a zero-terminated string. In the first one case, we also get an
      odd warning about sizeof-pointer-memaccess, which doesn't seem right
      (the sizeof is for an array that happens to be the same as the second
      strncpy argument).
      
      To work around the bogus warning, I use a plain 'XFSLABEL_MAX' for
      the strncpy() length when copying the label in getlabel. For setlabel(),
      using memcpy() with the correct length that is already known avoids
      the second warning and is slightly simpler.
      
      In a related issue, it appears that we accidentally skip the trailing
      \0 when copying a 12-character label back to user space in getlabel().
      Using the correct sizeof() argument here copies the extra character.
      
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85602
      Fixes: f7664b31 ("xfs: implement online get/set fs label")
      Cc: Eric Sandeen <sandeen@redhat.com>
      Cc: Martin Sebor <msebor@gmail.com>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      4bb8b65a
    • D
      xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner 提交于
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0b61f8a4
  8. 06 6月, 2018 7 次提交
    • D
      xfs: validate btree records on retrieval · 9e6c08d4
      Dave Chinner 提交于
      So we don't check the validity of records as we walk the btree. When
      there are corrupt records in the free space btree (e.g. zero
      startblock/length or beyond EOAG) we just blindly use it and things
      go bad from there. That leads to assert failures on debug kernels
      like this:
      
      XFS: Assertion failed: fs_is_ok, file: fs/xfs/libxfs/xfs_alloc.c, line: 450
      ....
      Call Trace:
       xfs_alloc_fixup_trees+0x368/0x5c0
       xfs_alloc_ag_vextent_near+0x79a/0xe20
       xfs_alloc_ag_vextent+0x1d3/0x330
       xfs_alloc_vextent+0x5e9/0x870
      
      Or crashes like this:
      
      XFS (loop0): xfs_buf_find: daddr 0x7fb28 out of range, EOFS 0x8000
      .....
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
      ....
      Call Trace:
       xfs_bmap_add_extent_hole_real+0x67d/0x930
       xfs_bmapi_write+0x934/0xc90
       xfs_da_grow_inode_int+0x27e/0x2f0
       xfs_dir2_grow_inode+0x55/0x130
       xfs_dir2_sf_to_block+0x94/0x5d0
       xfs_dir2_sf_addname+0xd0/0x590
       xfs_dir_createname+0x168/0x1a0
       xfs_rename+0x658/0x9b0
      
      By checking that free space records pulled from the trees are
      within the valid range, we catch many of these corruptions before
      they can do damage.
      
      This is a generic btree record checking deficiency. We need to
      validate the records we fetch from all the different btrees before
      we use them to catch corruptions like this.
      
      This patch results in a corrupt record emitting an error message and
      returning -EFSCORRUPTED, and the higher layers catch that and abort:
      
       XFS (loop0): Size Freespace BTree record corruption in AG 0 detected!
       XFS (loop0): start block 0x0 block count 0x0
       XFS (loop0): Internal error xfs_trans_cancel at line 1012 of file fs/xfs/xfs_trans.c.  Caller xfs_create+0x42a/0x670
       .....
       Call Trace:
        dump_stack+0x85/0xcb
        xfs_trans_cancel+0x19f/0x1c0
        xfs_create+0x42a/0x670
        xfs_generic_create+0x1f6/0x2c0
        vfs_create+0xf9/0x180
        do_mknodat+0x1f9/0x210
        do_syscall_64+0x5a/0x180
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      .....
       XFS (loop0): xfs_do_force_shutdown(0x8) called from line 1013 of file fs/xfs/xfs_trans.c.  Return address = ffffffff81500868
       XFS (loop0): Corruption of in-memory data detected.  Shutting down filesystem
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      9e6c08d4
    • D
      xfs: push corruption -> ESTALE conversion to xfs_nfs_get_inode() · 29cad0b3
      Dave Chinner 提交于
      In xfs_imap_to_bp(), we convert a -EFSCORRUPTED error to -EINVAL if
      we are doing an untrusted lookup. This is done because we need
      failed filehandle lookups to report -ESTALE to the caller, and it
      does this by converting -EINVAL and -ENOENT errors to -ESTALE.
      
      The squashing of EFSCORRUPTED in imap_to_bp makes it impossible for
      for xfs_iget(UNTRUSTED) callers to determine the difference between
      "inode does not exist" and "corruption detected during lookup". We
      realy need that distinction in places calling xfS_iget(UNTRUSTED),
      so move the filehandle error case handling all the way out to
      xfs_nfs_get_inode() where it is needed.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      29cad0b3
    • D
      xfs: verify root inode more thoroughly · 541b5acc
      Dave Chinner 提交于
      When looking up the root inode at mount time, we don't actually do
      any verification to check that the inode is allocated and accounted
      for correctly in the INOBT. Make the checks on the root inode more
      robust by making it an untrusted lookup. This forces the inode
      lookup to use the inode btree to verify the inode is allocated
      and mapped correctly to disk. This will also have the effect of
      catching a significant number of AGI/INOBT related corruptions in
      AG 0 at mount time.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      541b5acc
    • D
      xfs: verify COW extent size hint is valid in inode verifier · 02a0fda8
      Dave Chinner 提交于
      There are rules for vald extent size hints. We enforce them when
      applications set them, but fuzzers violate those rules and that
      screws us over. Validate COW extent size hint rules in the inode
      verifier to catch this.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      02a0fda8
    • D
      xfs: verify extent size hint is valid in inode verifier · 7d71a671
      Dave Chinner 提交于
      There are rules for vald extent size hints. We enforce them when
      applications set them, but fuzzers violate those rules and that
      screws us over.
      
      This results in alignment assertion failures when setting up
      allocations such as this in direct IO:
      
      XFS: Assertion failed: ap->length, file: fs/xfs/libxfs/xfs_bmap.c, line: 3432
      ....
      Call Trace:
       xfs_bmap_btalloc+0x415/0x910
       xfs_bmapi_write+0x71c/0x12e0
       xfs_iomap_write_direct+0x2a9/0x420
       xfs_file_iomap_begin+0x4dc/0xa70
       iomap_apply+0x43/0x100
       iomap_file_buffered_write+0x62/0x90
       xfs_file_buffered_aio_write+0xba/0x300
       __vfs_write+0xd5/0x150
       vfs_write+0xb6/0x180
       ksys_write+0x45/0xa0
       do_syscall_64+0x5a/0x180
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      And from xfs_db:
      
      core.extsize = 10380288
      
      Which is not an integer multiple of the block size, and so violates
      Rule #7 for setting extent size hints. Validate extent size hint
      rules in the inode verifier to catch this.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      7d71a671
    • D
      xfs: catch bad stripe alignment configurations · fa4ca9c5
      Dave Chinner 提交于
      When stripe alignments are invalid, data alignment algorithms in the
      allocator may not work correctly. Ensure we catch superblocks with
      invalid stripe alignment setups at mount time. These data alignment
      mismatches are now detected at mount time like this:
      
      XFS (loop0): SB stripe unit sanity check failed
      XFS (loop0): Metadata corruption detected at xfs_sb_read_verify+0xab/0x110, xfs_sb block 0xffffffffffffffff
      XFS (loop0): Unmount and run xfs_repair
      XFS (loop0): First 128 bytes of corrupted metadata buffer:
      0000000091c2de02: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 10 00  XFSB............
      0000000023bff869: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      00000000cdd8c893: 17 32 37 15 ff ca 46 3d 9a 17 d3 33 04 b5 f1 a2  .27...F=...3....
      000000009fd2844f: 00 00 00 00 00 00 00 04 00 00 00 00 00 00 06 d0  ................
      0000000088e9b0bb: 00 00 00 00 00 00 06 d1 00 00 00 00 00 00 06 d2  ................
      00000000ff233a20: 00 00 00 01 00 00 10 00 00 00 00 01 00 00 00 00  ................
      000000009db0ac8b: 00 00 03 60 e1 34 02 00 08 00 00 02 00 00 00 00  ...`.4..........
      00000000f7022460: 00 00 00 00 00 00 00 00 0c 09 0b 01 0c 00 00 19  ................
      XFS (loop0): SB validate failed with error -117.
      
      And the mount fails.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      fa4ca9c5
    • D
      vfs: change inode times to use struct timespec64 · 95582b00
      Deepa Dinamani 提交于
      struct timespec is not y2038 safe. Transition vfs to use
      y2038 safe struct timespec64 instead.
      
      The change was made with the help of the following cocinelle
      script. This catches about 80% of the changes.
      All the header file and logic changes are included in the
      first 5 rules. The rest are trivial substitutions.
      I avoid changing any of the function signatures or any other
      filesystem specific data structures to keep the patch simple
      for review.
      
      The script can be a little shorter by combining different cases.
      But, this version was sufficient for my usecase.
      
      virtual patch
      
      @ depends on patch @
      identifier now;
      @@
      - struct timespec
      + struct timespec64
        current_time ( ... )
        {
      - struct timespec now = current_kernel_time();
      + struct timespec64 now = current_kernel_time64();
        ...
      - return timespec_trunc(
      + return timespec64_trunc(
        ... );
        }
      
      @ depends on patch @
      identifier xtime;
      @@
       struct \( iattr \| inode \| kstat \) {
       ...
      -       struct timespec xtime;
      +       struct timespec64 xtime;
       ...
       }
      
      @ depends on patch @
      identifier t;
      @@
       struct inode_operations {
       ...
      int (*update_time) (...,
      -       struct timespec t,
      +       struct timespec64 t,
      ...);
       ...
       }
      
      @ depends on patch @
      identifier t;
      identifier fn_update_time =~ "update_time$";
      @@
       fn_update_time (...,
      - struct timespec *t,
      + struct timespec64 *t,
       ...) { ... }
      
      @ depends on patch @
      identifier t;
      @@
      lease_get_mtime( ... ,
      - struct timespec *t
      + struct timespec64 *t
        ) { ... }
      
      @te depends on patch forall@
      identifier ts;
      local idexpression struct inode *inode_node;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn_update_time =~ "update_time$";
      identifier fn;
      expression e, E3;
      local idexpression struct inode *node1;
      local idexpression struct inode *node2;
      local idexpression struct iattr *attr1;
      local idexpression struct iattr *attr2;
      local idexpression struct iattr attr;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      @@
      (
      (
      - struct timespec ts;
      + struct timespec64 ts;
      |
      - struct timespec ts = current_time(inode_node);
      + struct timespec64 ts = current_time(inode_node);
      )
      
      <+... when != ts
      (
      - timespec_equal(&inode_node->i_xtime, &ts)
      + timespec64_equal(&inode_node->i_xtime, &ts)
      |
      - timespec_equal(&ts, &inode_node->i_xtime)
      + timespec64_equal(&ts, &inode_node->i_xtime)
      |
      - timespec_compare(&inode_node->i_xtime, &ts)
      + timespec64_compare(&inode_node->i_xtime, &ts)
      |
      - timespec_compare(&ts, &inode_node->i_xtime)
      + timespec64_compare(&ts, &inode_node->i_xtime)
      |
      ts = current_time(e)
      |
      fn_update_time(..., &ts,...)
      |
      inode_node->i_xtime = ts
      |
      node1->i_xtime = ts
      |
      ts = inode_node->i_xtime
      |
      <+... attr1->ia_xtime ...+> = ts
      |
      ts = attr1->ia_xtime
      |
      ts.tv_sec
      |
      ts.tv_nsec
      |
      btrfs_set_stack_timespec_sec(..., ts.tv_sec)
      |
      btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
      |
      - ts = timespec64_to_timespec(
      + ts =
      ...
      -)
      |
      - ts = ktime_to_timespec(
      + ts = ktime_to_timespec64(
      ...)
      |
      - ts = E3
      + ts = timespec_to_timespec64(E3)
      |
      - ktime_get_real_ts(&ts)
      + ktime_get_real_ts64(&ts)
      |
      fn(...,
      - ts
      + timespec64_to_timespec(ts)
      ,...)
      )
      ...+>
      (
      <... when != ts
      - return ts;
      + return timespec64_to_timespec(ts);
      ...>
      )
      |
      - timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2)
      |
      - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
      + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2)
      |
      - timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
      |
      node1->i_xtime1 =
      - timespec_trunc(attr1->ia_xtime1,
      + timespec64_trunc(attr1->ia_xtime1,
      ...)
      |
      - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
      + attr1->ia_xtime1 =  timespec64_trunc(attr2->ia_xtime2,
      ...)
      |
      - ktime_get_real_ts(&attr1->ia_xtime1)
      + ktime_get_real_ts64(&attr1->ia_xtime1)
      |
      - ktime_get_real_ts(&attr.ia_xtime1)
      + ktime_get_real_ts64(&attr.ia_xtime1)
      )
      
      @ depends on patch @
      struct inode *node;
      struct iattr *attr;
      identifier fn;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      expression e;
      @@
      (
      - fn(node->i_xtime);
      + fn(timespec64_to_timespec(node->i_xtime));
      |
       fn(...,
      - node->i_xtime);
      + timespec64_to_timespec(node->i_xtime));
      |
      - e = fn(attr->ia_xtime);
      + e = fn(timespec64_to_timespec(attr->ia_xtime));
      )
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      )
      ...+>
      }
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      struct kstat *stat;
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier i_xtime =~ "^i_[acm]time$";
      identifier xtime =~ "^[acm]time$";
      identifier fn, ret;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(stat->xtime);
      ret = fn (...,
      - &stat->xtime);
      + &ts);
      )
      ...+>
      }
      
      @ depends on patch @
      struct inode *node;
      struct inode *node2;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier i_xtime3 =~ "^i_[acm]time$";
      struct iattr *attrp;
      struct iattr *attrp2;
      struct iattr attr ;
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      struct kstat *stat;
      struct kstat stat1;
      struct timespec64 ts;
      identifier xtime =~ "^[acmb]time$";
      expression e;
      @@
      (
      ( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1  ;
      |
       node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       stat->xtime = node2->i_xtime1;
      |
       stat1.xtime = node2->i_xtime1;
      |
      ( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1  ;
      |
      ( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2;
      |
      - e = node->i_xtime1;
      + e = timespec64_to_timespec( node->i_xtime1 );
      |
      - e = attrp->ia_xtime1;
      + e = timespec64_to_timespec( attrp->ia_xtime1 );
      |
      node->i_xtime1 = current_time(...);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
       node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
      - node->i_xtime1 = e;
      + node->i_xtime1 = timespec_to_timespec64(e);
      )
      Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com>
      Cc: <anton@tuxera.com>
      Cc: <balbi@kernel.org>
      Cc: <bfields@fieldses.org>
      Cc: <darrick.wong@oracle.com>
      Cc: <dhowells@redhat.com>
      Cc: <dsterba@suse.com>
      Cc: <dwmw2@infradead.org>
      Cc: <hch@lst.de>
      Cc: <hirofumi@mail.parknet.co.jp>
      Cc: <hubcap@omnibond.com>
      Cc: <jack@suse.com>
      Cc: <jaegeuk@kernel.org>
      Cc: <jaharkes@cs.cmu.edu>
      Cc: <jslaby@suse.com>
      Cc: <keescook@chromium.org>
      Cc: <mark@fasheh.com>
      Cc: <miklos@szeredi.hu>
      Cc: <nico@linaro.org>
      Cc: <reiserfs-devel@vger.kernel.org>
      Cc: <richard@nod.at>
      Cc: <sage@redhat.com>
      Cc: <sfrench@samba.org>
      Cc: <swhiteho@redhat.com>
      Cc: <tj@kernel.org>
      Cc: <trond.myklebust@primarydata.com>
      Cc: <tytso@mit.edu>
      Cc: <viro@zeniv.linux.org.uk>
      95582b00