1. 29 Oct, 2010 27 commits
  2. 28 Oct, 2010 13 commits
    • V
      9p: Add datasync to client side TFSYNC/RFSYNC for dotl · b165d601
      Venkateswararao Jujjuri (JV) committed
      SYNOPSIS
          size[4] Tfsync tag[2] fid[4] datasync[4]
      
          size[4] Rfsync tag[2]
      
      DESCRIPTION
      
          The Tfsync transaction transfers ("flushes") all modified in-core data of
          the file identified by fid to the disk device (or other permanent storage
          device) where that file resides.
      
          If the datasync flag is specified, data will be flushed, but modified
          metadata will not be flushed unless that metadata is needed in order to
          allow a subsequent data retrieval to be correctly handled.
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
      Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
      b165d601
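A minimal sketch of how a client might encode the Tfsync request from the synopsis above. It assumes the usual 9P wire conventions (little-endian integers, a one-byte message type following size[4], and size[4] counting the whole message including itself). The type code 50 and the helper name `pack_tfsync` are assumptions for illustration and are not taken from the commit.

```python
import struct

# Assumed 9P2000.L type code for Tfsync; verify against your 9P headers.
P9_TFSYNC = 50


def pack_tfsync(tag, fid, datasync):
    """Encode size[4] Tfsync tag[2] fid[4] datasync[4] (little-endian)."""
    # Message type byte, then tag, fid and the datasync flag.
    body = struct.pack("<BHII", P9_TFSYNC, tag, fid, datasync)
    # size[4] covers the whole message, including the size field itself.
    return struct.pack("<I", 4 + len(body)) + body


# datasync=1 asks for a data-only flush; datasync=0 flushes metadata too.
msg = pack_tfsync(tag=1, fid=7, datasync=1)
```

The reply carries no payload beyond the tag, so a client only needs to match the tag of the Rfsync against the request.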
    • M
      9p: Implement TREADLINK operation for 9p2000.L · 329176cc
      M. Mohan Kumar committed
      Synopsis
      
      	size[4] TReadlink tag[2] fid[4]
      	size[4] RReadlink tag[2] target[s]
      
      Description
      	Readlink returns the contents of the symbolic link referred to
      	by fid. The contents of the symbolic link are returned in the
      	response.
      
      	target[s] - Contents of the symbolic link referred to by fid.
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
      Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
      329176cc
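The TReadlink exchange above could be sketched as follows. This assumes standard 9P encoding: little-endian integers, a one-byte type after size[4], and a string field `[s]` stored as a two-byte length followed by UTF-8 bytes. The type codes 22 and 23 and both helper names are assumptions made for this example.

```python
import struct

# Assumed 9P2000.L type codes; verify against your 9P headers.
P9_TREADLINK = 22
P9_RREADLINK = 23


def pack_treadlink(tag, fid):
    """Encode size[4] TReadlink tag[2] fid[4]."""
    body = struct.pack("<BHI", P9_TREADLINK, tag, fid)
    return struct.pack("<I", 4 + len(body)) + body


def parse_rreadlink(msg):
    """Decode size[4] RReadlink tag[2] target[s] into (tag, target)."""
    size, mtype, tag = struct.unpack_from("<IBH", msg, 0)
    if size != len(msg) or mtype != P9_RREADLINK:
        raise ValueError("not a well-formed RReadlink message")
    # target[s]: two-byte length prefix, then that many bytes of UTF-8.
    (n,) = struct.unpack_from("<H", msg, 7)
    if 9 + n > len(msg):
        raise ValueError("target string runs past end of message")
    return tag, msg[9:9 + n].decode("utf-8")
```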
    • M
      9p: Use V9FS_MAGIC in statfs · 368c09d2
      M. Mohan Kumar committed
      Use V9FS_MAGIC as the file system type while filling the kernel statfs
      structure instead of using the host file system magic number. Also move
      the definition of V9FS_MAGIC from v9fs.h to the standard magic.h file.
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
      Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
      368c09d2
    • M
      9p: Implement TGETLOCK · 1d769cd1
      M. Mohan Kumar committed
      Synopsis
      
          size[4] TGetlock tag[2] fid[4] getlock[n]
          size[4] RGetlock tag[2] getlock[n]
      
      Description
      
      TGetlock tests for the existence of byte-range POSIX locks on the file
      identified by the given fid. The reply contains a getlock structure. If the
      lock could be placed, it returns F_UNLCK in the type field of the getlock
      structure. Otherwise it returns the details of the conflicting lock in the
      getlock structure.
      
          getlock structure:
            type[1] - Type of lock: F_RDLCK, F_WRLCK
            start[8] - Starting offset for lock
            length[8] - Number of bytes to check for the lock
                   If length is 0, check for a lock on all bytes from 'start'
                   through to the end of the file
            pid[4] - PID of the process that wants to take the lock; in a
                     reply, the PID of the process that owns the conflicting lock
            client[4] - Client id of the system that owns the process which
                        has the conflicting lock
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
      Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
      1d769cd1
    • M
      9p: Implement TLOCK · a099027c
      M. Mohan Kumar committed
      Synopsis
      
          size[4] TLock tag[2] fid[4] flock[n]
          size[4] RLock tag[2] status[1]
      
      Description
      
      Tlock is used to acquire or release byte-range POSIX locks on the file
      identified by the given fid. The reply contains the status of the lock request.
      
          flock structure:
              type[1] - Type of lock: F_RDLCK, F_WRLCK, F_UNLCK
              flags[4] - Flags can be either of
                P9_LOCK_FLAGS_BLOCK - Blocking lock request: if a conflicting
                  lock exists, wait for that lock to be released.
                P9_LOCK_FLAGS_RECLAIM - Reclaim lock request, used when the client
                  is trying to reclaim a lock after a server restart (due to crash)
              start[8] - Starting offset for lock
              length[8] - Number of bytes to lock
                If length is 0, lock all bytes starting at the location 'start'
                through to the end of file
              pid[4] - PID of the process that wants to take lock
              client_id[4] - Unique client id
      
              status[1] - Status of the lock request, can be
                P9_LOCK_SUCCESS(0), P9_LOCK_BLOCKED(1), P9_LOCK_ERROR(2) or
                P9_LOCK_GRACE(3)
                P9_LOCK_SUCCESS - Request was successful
                P9_LOCK_BLOCKED - A conflicting lock is held by another process
                P9_LOCK_ERROR - Error while processing the lock request
                P9_LOCK_GRACE - Server is in grace period, it can't accept new lock
                  requests in this period (except locks with
                  P9_LOCK_FLAGS_RECLAIM flag set)
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
      Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
      a099027c
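A rough sketch of the TLock request and RLock reply described above. The flock field widths and the four status values come from the commit message. The type codes 52 and 53, the numeric flag values, and the helper names are assumptions for this example.

```python
import struct
from enum import IntEnum

# Assumed 9P2000.L type codes; verify against your 9P headers.
P9_TLOCK, P9_RLOCK = 52, 53
# Assumed flag values; the commit message names the flags but not their bits.
P9_LOCK_FLAGS_BLOCK = 1
P9_LOCK_FLAGS_RECLAIM = 2


class LockStatus(IntEnum):
    """status[1] values as listed in the commit message."""
    SUCCESS = 0
    BLOCKED = 1
    ERROR = 2
    GRACE = 3


def pack_tlock(tag, fid, ltype, flags, start, length, pid, client_id):
    """Encode size[4] TLock tag[2] fid[4] flock[n]."""
    body = struct.pack("<BHI", P9_TLOCK, tag, fid)
    # flock: type[1] flags[4] start[8] length[8] pid[4] client_id[4]
    body += struct.pack("<BIQQII", ltype, flags, start, length, pid, client_id)
    return struct.pack("<I", 4 + len(body)) + body


def parse_rlock(msg):
    """Decode size[4] RLock tag[2] status[1] into (tag, LockStatus)."""
    size, mtype, tag, status = struct.unpack("<IBHB", msg)
    if size != len(msg) or mtype != P9_RLOCK:
        raise ValueError("not a well-formed RLock message")
    return tag, LockStatus(status)
```

A client that receives `LockStatus.GRACE` would typically retry later, or resend with the reclaim flag if it held the lock before the server restarted.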
    • V
      [9p] Introduce client side TFSYNC/RFSYNC for dotl. · 920e65dc
      Venkateswararao Jujjuri (JV) committed
      SYNOPSIS
          size[4] Tfsync tag[2] fid[4]
      
          size[4] Rfsync tag[2]
      
      DESCRIPTION
      
      The Tfsync transaction transfers ("flushes") all modified in-core data of
      the file identified by fid to the disk device (or other permanent storage
      device) where that file resides.
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
      Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
      920e65dc
    • T
      ext4,jbd2: convert tracepoints to use major/minor numbers · a269029d
      Theodore Ts'o committed
      Unfortunately perf can't deal with anything other than direct structure
      accesses in the TP_printk() section.  It will drop dead when it sees
      jbd2_dev_to_name() in the "print fmt" section of the tracepoint.
      
      Addresses-Google-Bug: 3138508
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      a269029d
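To illustrate the idea behind this change: instead of calling a helper to turn a device into a name at print time, a tracepoint can record the raw device number and print its major and minor parts using only simple arithmetic on the stored field. The sketch below assumes the in-kernel dev_t layout with a 20-bit minor number; the function names and the output format are illustrative only.

```python
# Assumed in-kernel dev_t layout: the low 20 bits hold the minor number.
MINORBITS = 20
MINORMASK = (1 << MINORBITS) - 1


def kernel_major(dev):
    """Major number: everything above the minor bits."""
    return dev >> MINORBITS


def kernel_minor(dev):
    """Minor number: the low MINORBITS bits."""
    return dev & MINORMASK


def format_dev(dev):
    """Render a recorded device number the way a trace line might show it."""
    return "dev %d,%d" % (kernel_major(dev), kernel_minor(dev))
```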
    • T
      ext4: fix kernel oops if the journal superblock has a non-zero j_errno · 7f93cff9
      Theodore Ts'o committed
      Commit 84061e07 fixed an accounting bug only to introduce the
      possibility of a kernel OOPS if the journal has a non-zero j_errno
      field indicating that the file system had detected a fs inconsistency.
      After the journal replay, if the journal superblock indicates that the
      file system has an error, this indication is transferred to the file
      system and then ext4_commit_super() is called to write this to the
      disk.
      
      But since the percpu counters are now initialized after the journal
      replay, the call to ext4_commit_super() will cause a kernel oops since
      it needs to use the percpu counters in the ext4 superblock structure.
      
      The fix is to skip setting the ext4 free block and free inode fields
      if the percpu counter has not been set.
      
      Thanks to Ken Sumrall for reporting and analyzing the root causes of
      this bug.
      
      Addresses-Google-Bug: #3054080
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      7f93cff9
    • E
      ext4: implement writeback livelock avoidance using page tagging · 5b41d924
      Eric Sandeen committed
      This is analogous to Jan Kara's commit,
      f446daae
      mm: implement writeback livelock avoidance using page tagging
      
      but since we forked write_cache_pages, we need to reimplement
      it there (and in ext4_da_writepages, since range_cyclic handling
      was moved to there)
      
      If you start a large buffered IO to a file, and then call
      fsync on it, you'll find that fsync does not complete
      until the other IO stops.
      
      If you continue re-dirtying the file (say, putting dd
      with conv=notrunc in a loop), when fsync finally completes
      (after all IO is done), it reports via tracing that
      it has written many more pages than the file contains;
      in other words it has synced and re-synced pages in
      the file multiple times.
      
      This then leads to problems with our writeback_index
      update, since it advances it by pages written, and
      essentially sets writeback_index off the end of the
      file...
      
      With the following patch, we only sync as much as was
      dirty at the time of the sync.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      5b41d924
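The tagging approach described above can be illustrated with a small user-space simulation. This is not the kernel code: pages are modeled as a dict mapping a page index to a set of tags, and every name here is hypothetical. The point is that pages are tagged once when the sync starts, and only tagged pages are written, so pages dirtied during the sync cannot extend it.

```python
DIRTY, TOWRITE = "dirty", "towrite"


def tag_pages_for_writeback(tags, start, end):
    """Snapshot step: mark pages that are dirty right now as TOWRITE."""
    for idx in range(start, end + 1):
        if DIRTY in tags.get(idx, set()):
            tags[idx].add(TOWRITE)


def write_tagged_pages(tags, start, end, on_write=None):
    """Write only pages tagged TOWRITE; return how many were written."""
    written = 0
    for idx in range(start, end + 1):
        page_tags = tags.get(idx, set())
        if TOWRITE in page_tags:
            page_tags.discard(TOWRITE)
            page_tags.discard(DIRTY)
            written += 1
            if on_write is not None:
                on_write(idx)  # simulate a writer racing with the sync
    return written


# Four dirty pages at the time the sync starts.
tags = {i: {DIRTY} for i in range(4)}


def redirty(idx):
    """Concurrent writer: re-dirty the page and dirty a new one further on."""
    tags[idx].add(DIRTY)
    tags.setdefault(idx + 4, set()).add(DIRTY)


tag_pages_for_writeback(tags, 0, 7)
written = write_tagged_pages(tags, 0, 7, on_write=redirty)
```

In this run the sync writes exactly the four pages that were dirty when it started, even though eight pages are dirty by the time it finishes.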
    • L
      fs: Add FITRIM ioctl · 367a51a3
      Lukas Czerner committed
      Adds a filesystem-independent ioctl to allow implementation of file
      system batched discard support. It takes an fstrim_range structure as
      an argument. fstrim_range is defined in include/linux/fs.h and its
      definition is as follows.
      
      struct fstrim_range {
      	__u64 start;
      	__u64 len;
      	__u64 minlen;
      };
      
      start	- first byte to trim
      len	- number of bytes to trim from start
      minlen	- minimum extent length to trim; free extents shorter than this
      	  number of bytes will be ignored. This will be rounded up to the fs
      	  block size.
      
      It is also possible to specify NULL as an argument. In this case the
      fields are set as follows:
      
      start = 0;
      len = ULLONG_MAX;
      minlen = 0;
      
      So it will trim the whole file system in one run.
      
      After the FITRIM is done, the number of bytes actually discarded is stored
      in fstrim_range.len to give the user better insight into how much storage
      space has really been released for wear-leveling.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      367a51a3
    • E
      ext4: don't use ext4_allocation_contexts for tracing · 3e1e5f50
      Eric Sandeen committed
      Many tracepoints were populating an ext4_allocation_context
      to pass in, but this requires a slab allocation even when
      tracepoints are off.  In fact, 4 of 5 of these allocations
      were only for tracing.  In addition, we were only using a
      small fraction of the 144 bytes of this structure for this
      purpose.
      
      We can do away with all these alloc/frees of the ac and
      simply pass in the bits we care about, instead.
      
      I tested this by turning on tracing and running through
      xfstests on x86_64.  I did not actually do anything with
      the trace output, however.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      3e1e5f50
    • E
      ext4: fix oops in trace_ext4_mb_release_group_pa · 4d547616
      Eric Sandeen committed
      Our QA reported an oops in the ext4_mb_release_group_pa tracing,
      and Josef Bacik pointed out that it was because we may have a
      non-null but uninitialized ac_inode in the allocation context.
      
      I can reproduce it when running xfstests with ext4 tracepoints on, 
      on a CONFIG_SLAB_DEBUG kernel.
      
      We call trace_ext4_mb_release_group_pa from 2 places:
      ext4_mb_discard_group_preallocations and
      ext4_mb_discard_lg_preallocations.
      
      In both cases we allocate an ac as a container just for tracing (!)
      and never fill in the ac_inode.  There's no reason to be assigning,
      testing, or printing it as far as I can see, so just remove it from
      the tracepoint.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      4d547616