1. 26 7月, 2018 1 次提交
    • A
      gfs2: Special-case rindex for gfs2_grow · 77612578
      Andreas Gruenbacher 提交于
      To speed up the common case of appending to a file,
      gfs2_write_alloc_required presumes that writing beyond the end of a file
      will always require additional blocks to be allocated.  This assumption
      is incorrect for preallocates files, but there are no negative
      consequences as long as *some* space is still left on the filesystem.
      
      One special file that always has some space preallocated beyond the end
      of the file is the rindex: when growing a filesystem, gfs2_grow adds one
      or more new resource groups and appends records describing those
      resource groups to the rindex; the preallocated space ensures that this
      is always possible.
      
      However, when a filesystem is completely full, gfs2_write_alloc_required
      will indicate that an additional allocation is required, and appending
      the next record to the rindex will fail even though space for that
      record has already been preallocated.  To fix that, skip the incorrect
      optimization in gfs2_write_alloc_required, but for the rindex only.
      Other writes to preallocated space beyond the end of the file are still
      allowed to fail on completely full filesystems.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: NBob Peterson <rpeterso@redhat.com>
      77612578
  2. 25 7月, 2018 9 次提交
    • B
      GFS2: rgrp free blocks used incorrectly · f6753df3
      Bob Peterson 提交于
      Before this patch, several functions in rgrp.c checked the value of
      rgd->rd_free_clone. That does not take into account blocks that were
      reserved by a multi-block reservation. This causes a problem when
      space gets tight in the file system. For example, when function
      gfs2_inplace_reserve checks to see if a rgrp has enough blocks to
      satisfy the request, it can accept a rgrp that it should reject
      because, although there are enough blocks to satisfy the request
      _now_, those blocks may be reserved for another running process.
      
      A second problem with this occurs when we've reserved the remaining
      blocks in an rgrp: function rg_mblk_search() can reject an rgrp
      improperly because it calculates:
      
         u32 free_blocks = rgd->rd_free_clone - rgd->rd_reserved;
      
      But rd_reserved includes blocks that the current process just
      reserved in its own call to inplace_reserve. For example, it can
      reserve the last 128 blocks of an rgrp, then reject that same rgrp
      because the above calculates out to free_blocks = 0;
      
      Consequences include, but are not limited to, (1) leaving holes,
      and thus increasing file system fragmentation, and (2) reporting
      file system is full long before it actually is.
      
      This patch introduces a new function, rgd_free, which returns the
      number of clone-free blocks (blocks that are truly free as opposed
      to blocks that are still being used because an unlinked file is
      still open) minus the number of blocks reserved by processes, but
      not counting the blocks we ourselves reserved (because obviously
      we need to allocate them).
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      f6753df3
    • C
      gfs2: remove redundant variable 'moved' · d1b0cb93
      Colin Ian King 提交于
      Variable 'moved' s being assigned but is never used hence it is
      redundant and can be removed.  This has been the case ever since commit
      c752666c.
      
      Cleans up clang warning:
      warning: variable 'moved' set but not used [-Wunused-but-set-variable]
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      d1b0cb93
    • A
      gfs2: use iomap_readpage for blocksize == PAGE_SIZE · f95cbb44
      Andreas Gruenbacher 提交于
      We only use iomap_readpage for pages that don't have buffer heads
      attached yet: iomap_readpage would otherwise read pages from disk that
      are marked buffer_uptodate() but not PageUptodate().  Those pages may
      actually contain data more recent than what's on disk.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: NBob Peterson <rpeterso@redhat.com>
      f95cbb44
    • A
      gfs2: Use iomap for stuffed direct I/O reads · 1d45bb7f
      Andreas Gruenbacher 提交于
      Remove the fallback code from direct to buffered I/O for stuffed reads.
      
      For stuffed writes, we must keep the fallback code: the deferred glock
      we are holding under direct I/O doesn't allow to write to the inode or
      change the file size.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: NBob Peterson <rpeterso@redhat.com>
      1d45bb7f
    • A
      gfs2: fallocate_chunk: Always initialize struct iomap · c2589282
      Andreas Gruenbacher 提交于
      In fallocate_chunk, always initialize the iomap before calling
      gfs2_iomap_get_alloc: future changes could otherwise cause things like
      iomap.flags to leak across calls.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: NBob Peterson <rpeterso@redhat.com>
      c2589282
    • B
      GFS2: Fix recovery issues for spectators · 4a772772
      Bob Peterson 提交于
      This patch fixes a couple problems dealing with spectators who
      remain with gfs2 mounts after the last non-spectator node fails.
      
      Before this patch, spectator mounts would try to acquire the dlm's
      mounted lock EX as part of its normal recovery sequence.
      The mounted lock is only used to determine whether the node is
      the first mounter, the first node to mount the file system, for
      the purposes of file system recovery and journal replay.
      
      It's not necessary for spectators: they should never do journal
      recovery. If they acquire the lock it will prevent another "real"
      first-mounter from acquiring the lock in EX mode, which means it
      also cannot do journal recovery because it doesn't think it's the
      first node to mount the file system.
      
      This patch checks if the mounter is a spectator, and if so, avoids
      grabbing the mounted lock. This allows a secondary mounter who is
      really the first non-spectator mounter, to do journal recovery:
      since the spectator doesn't acquire the lock, it can grab it in
      EX mode, and therefore consider itself to be the first mounter
      both as a "real" first mount, and as a first-real-after-spectator.
      
      Note that the control lock still needs to be taken in PR mode
      in order to fetch the lvb value so it has the current status of
      all journal's recovery. This is used as it is today by a first
      mounter to replay the journals. For spectators, it's merely
      used to fetch the status bits. All recovery is bypassed and the
      node waits until recovery is completed by a non-spectator node.
      
      I also improved the cryptic message given by control_mount when
      a spectator is waiting for a non-spectator to perform recovery.
      
      It also fixes a problem in gfs2_recover_set whereby spectators
      were never queueing recovery work for their own journal.
      They cannot do recovery themselves, but they still need to queue
      the work so they can check the recovery bits and clear the
      DFL_BLOCK_LOCKS bit once the recovery happens on another node.
      
      When the work queue runs on a spectator, it bypasses most of the
      work so it won't print a bunch of annoying messages. All it will
      print is a bunch of messages that look like this until recovery
      completes on the non-spectator node:
      
      GFS2: fsid=mycluster:scratch.s: recover generation 3 jid 0
      GFS2: fsid=mycluster:scratch.s: recover jid 0 result busy
      
      These continue every 1.5 seconds until the recovery is done by
      the non-spectator, at which time it says:
      
      GFS2: fsid=mycluster:scratch.s: recover generation 4 done
      
      Then it proceeds with its mount.
      
      If the file system is mounted in spectator node and the last
      remaining non-spectator is fenced, any IO to the file system is
      blocked by dlm and the spectator waits until recovery is
      performed by a non-spectator.
      
      If a spectator tries to mount the file system before any
      non-spectators, it blocks and repeatedly gives this kernel
      message:
      
      GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount.
      GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      4a772772
    • S
      fs: gfs2: Adding new return type vm_fault_t · 109dbb1e
      Souptick Joarder 提交于
      Use new return type vm_fault_t for gfs2_page_mkwrite
      handler.
      
      see commit 1c8f4220 ("mm: change return type to
      vm_fault_t") for reference.
      Signed-off-by: NSouptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: NMatthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      109dbb1e
    • C
      gfs2: using posix_acl_xattr_size instead of posix_acl_to_xattr · 910f3d58
      Chengguang Xu 提交于
      It seems better to get size by calling posix_acl_xattr_size() instead of
      calling posix_acl_to_xattr() with NULL buffer argument.
      
      posix_acl_xattr_size() never returns 0, so remove the unnecessary check.
      Signed-off-by: NChengguang Xu <cgxu519@gmx.com>
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      910f3d58
    • B
      gfs2: Don't reject a supposedly full bitmap if we have blocks reserved · e79e0e14
      Bob Peterson 提交于
      Before this patch, you could get into situations like this:
      
      1. Process 1 searches for X free blocks, finds them, makes a reservation
      2. Process 2 searches for free blocks in the same rgrp, but now the
         bitmap is full because process 1's reservation is skipped over.
         So it marks the bitmap as GBF_FULL.
      3. Process 1 tries to allocate blocks from its own reservation, but
         since the GBF_FULL bit is set, it skips over the rgrp and searches
         elsewhere, thus not using its own reservation.
      
      This patch adds an additional check to allow processes to use their
      own reservations.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      e79e0e14
  3. 05 7月, 2018 2 次提交
  4. 04 7月, 2018 3 次提交
  5. 02 7月, 2018 5 次提交
  6. 21 6月, 2018 5 次提交
  7. 20 6月, 2018 6 次提交
  8. 16 6月, 2018 2 次提交
  9. 15 6月, 2018 7 次提交
    • D
      afs: Optimise callback breaking by not repeating volume lookup · 47ea0f2e
      David Howells 提交于
      At the moment, afs_break_callbacks calls afs_break_one_callback() for each
      separate FID it was given, and the latter looks up the volume individually
      for each one.
      
      However, this is inefficient if two or more FIDs have the same vid as we
      could reuse the volume.  This is complicated by cell aliasing whereby we
      may have multiple cells sharing a volume and can therefore have multiple
      callback interests for any particular volume ID.
      
      At the moment afs_break_one_callback() scans the entire list of volumes
      we're getting from a server and breaks the appropriate callback in every
      matching volume, regardless of cell.  This scan is done for every FID.
      
      Optimise callback breaking by the following means:
      
       (1) Sort the FID list by vid so that all FIDs belonging to the same volume
           are clumped together.
      
           This is done through the use of an indirection table as we cannot do
           an insertion sort on the afs_callback_break array as we decode FIDs
           into it as we subsequently also have to decode callback info into it
           that corresponds by array index only.
      
           We also don't really want to bubblesort afterwards if we can avoid it.
      
       (2) Sort the server->cb_interests array by vid so that all the matching
           volumes are grouped together.  This permits the scan to stop after
           finding a record that has a higher vid.
      
       (3) When breaking FIDs, we try to keep server->cb_break_lock as long as
           possible, caching the start point in the array for that volume group
           as long as possible.
      
           It might make sense to add another layer in that list and have a
           refcounted volume ID anchor that has the matching interests attached
           to it rather than being in the list.  This would allow the lock to be
           dropped without losing the cursor.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      47ea0f2e
    • D
      afs: Display manually added cells in dynamic root mount · 0da0b7fd
      David Howells 提交于
      Alter the dynroot mount so that cells created by manipulation of
      /proc/fs/afs/cells and /proc/fs/afs/rootcell and by specification of a root
      cell as a module parameter will cause directories for those cells to be
      created in the dynamic root superblock for the network namespace[*].
      
      To this end:
      
       (1) Only one dynamic root superblock is now created per network namespace
           and this is shared between all attempts to mount it.  This makes it
           easier to find the superblock to modify.
      
       (2) When a dynamic root superblock is created, the list of cells is walked
           and directories created for each cell already defined.
      
       (3) When a new cell is added, if a dynamic root superblock exists, a
           directory is created for it.
      
       (4) When a cell is destroyed, the directory is removed.
      
       (5) These directories are created by calling lookup_one_len() on the root
           dir which automatically creates them if they don't exist.
      
      [*] Inasmuch as network namespaces are currently supported here.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      0da0b7fd
    • D
      afs: Enable IPv6 DNS lookups · c88d5a7f
      David Howells 提交于
      Remove the restriction on DNS lookup upcalls that prevents ipv6 addresses
      from being looked up.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      c88d5a7f
    • D
      afs: Show all of a server's addresses in /proc/fs/afs/servers · 0aac4bce
      David Howells 提交于
      Show all of a server's addresses in /proc/fs/afs/servers, placing the
      second plus addresses on padded lines of their own.  The current address is
      marked with a star.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      0aac4bce
    • D
      afs: Handle CONFIG_PROC_FS=n · b6cfbeca
      David Howells 提交于
      The AFS filesystem depends at the moment on /proc for configuration and
      also presents information that way - however, this causes a compilation
      failure if procfs is disabled.
      
      Fix it so that the procfs bits aren't compiled in if procfs is disabled.
      
      This means that you can't configure the AFS filesystem directly, but it is
      still usable provided that an up-to-date keyutils is installed to look up
      cells by SRV or AFSDB DNS records.
      Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      b6cfbeca
    • D
      proc: Make inline name size calculation automatic · 24074a35
      David Howells 提交于
      Make calculation of the size of the inline name in struct proc_dir_entry
      automatic, rather than having to manually encode the numbers and failing to
      allow for lockdep.
      
      Require a minimum inline name size of 33+1 to allow for names that look
      like two hex numbers with a dash between.
      Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      24074a35
    • A
      orangefs: simplify compat ioctl handling · 430ff791
      Al Viro 提交于
      no need to mess with copy_in_user(), etc...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      430ff791
新手
引导
客服 返回
顶部