1. 15 2月, 2008 16 次提交
  2. 14 2月, 2008 5 次提交
  3. 11 2月, 2008 5 次提交
    • J
      NLM: don't requeue block if it was invalidated while GRANT_MSG was in flight · c64e80d5
      Jeff Layton 提交于
      It's possible for lockd to catch a SIGKILL while a GRANT_MSG callback
      is in flight. If this happens we don't want lockd to insert the block
      back into the nlm_blocked list.
      
      This helps that situation, but there's still a possible race. Fixing
      that will mean adding real locking for nlm_blocked.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      c64e80d5
    • J
      NLM: don't reattempt GRANT_MSG when there is already an RPC in flight · 9706501e
      Jeff Layton 提交于
      With the current scheme in nlmsvc_grant_blocked, we can end up with more
      than one GRANT_MSG callback for a block in flight. Right now, we requeue
      the block unconditionally so that a GRANT_MSG callback is done again in
      30s. If the client is unresponsive, it can take more than 30s for the
      call already in flight to time out.
      
      There's no benefit to having more than one GRANT_MSG RPC queued up at a
      time, so put it on the list with a timeout of NLM_NEVER before doing the
      RPC call. If the RPC call submission fails, we requeue it with a short
      timeout. If it works, then nlmsvc_grant_callback will end up requeueing
      it with a shorter timeout after it completes.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      9706501e
    • J
      NLM: have server-side RPC clients default to soft RPC tasks · 90bd17c8
      Jeff Layton 提交于
      Now that it no longer does an RPC ping, lockd always ends up queueing
      an RPC task for the GRANT_MSG callback. But, it also requeues the block
      for later attempts. Since these are hard RPC tasks, if the client we're
      calling back goes unresponsive the GRANT_MSG callbacks can stack up in
      the RPC queue.
      
      Fix this by making server-side RPC clients default to soft RPC tasks.
      lockd requeues the block anyway, so this should be OK.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      90bd17c8
    • J
      NLM: set RPC_CLNT_CREATE_NOPING for NLM RPC clients · 031fd3aa
      Jeff Layton 提交于
      It's currently possible for an unresponsive NLM client to completely
      lock up a server's lockd. The scenario is something like this:
      
      1) client1 (or a process on the server) takes a lock on a file
      2) client2 tries to take a blocking lock on the same file and
         awaits the callback
      3) client2 goes unresponsive (plug pulled, network partition, etc)
      4) client1 releases the lock
      
      ...at that point the server's lockd will try to queue up a GRANT_MSG
      callback for client2, but first it requeues the block with a timeout of
      30s. nlm_async_call will attempt to bind the RPC client to client2 and
      will call rpc_ping. rpc_ping entails a sync RPC call and if client2 is
      unresponsive it will take around 60s for that to time out. Once it times
      out, it's already time to retry the block and the whole process repeats.
      
      Once in this situation, nlmsvc_retry_blocked will never return until
      the host starts responding again. lockd won't service new calls.
      
      Fix this by skipping the RPC ping on NLM RPC clients. This makes
      nlm_async_call return quickly when called.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      031fd3aa
    • B
      splice: fix user pointer access in get_iovec_page_array() · 712a30e6
      Bastian Blank 提交于
      Commit 8811930d ("splice: missing user
      pointer access verification") added the proper access_ok() calls to
      copy_from_user_mmap_sem() which ensures we can copy the struct iovecs
      from userspace to the kernel.
      
      But we also must check whether we can access the actual memory region
      pointed to by the struct iovec to fix the access checks properly.
      Signed-off-by: NBastian Blank <waldi@debian.org>
      Acked-by: NOliver Pinter <oliver.pntr@gmail.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      712a30e6
  4. 10 2月, 2008 8 次提交
    • T
      ext4: Add new "development flag" to the ext4 filesystem · 469108ff
      Theodore Tso 提交于
      This flag is simply a generic "this is a crash/burn test filesystem"
      marker.  If it is set, then filesystem code which is "in development"
      will be allowed to mount the filesystem.  Filesystem code which is not
      considered ready for prime-time will check for this flag, and if it is
      not set, it will refuse to touch the filesystem.
      
      As we start rolling ext4 out to distro's like Fedora, et. al, this makes
      it less likely that a user might accidentally start using ext4 on a
      production filesystem; a bad thing, since that will essentially make it
      be unfsckable until e2fsprogs catches up.
      Signed-off-by: NTheodore Tso <tytso@MIT.EDU>
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      469108ff
    • A
      ext4: Don't panic in case of corrupt bitmap · 26346ff6
      Aneesh Kumar K.V 提交于
      Multiblock allocator calls BUG_ON in many case if the free and used
      blocks count obtained looking at the bitmap is different from what
      the allocator internally accounted for. Use ext4_error in such case
      and don't panic the system.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      26346ff6
    • E
      ext4: allocate struct ext4_allocation_context from a kmem cache · 256bdb49
      Eric Sandeen 提交于
      struct ext4_allocation_context is rather large, and this bloats
      the stack of many functions which use it.  Allocating it from
      a named slab cache will alleviate this.
      
      For example, with this change (on top of the noinline patch sent earlier):
      
      -ext4_mb_new_blocks		200
      +ext4_mb_new_blocks		 40
      
      -ext4_mb_free_blocks		344
      +ext4_mb_free_blocks		168
      
      -ext4_mb_release_inode_pa	216
      +ext4_mb_release_inode_pa	 40
      
      -ext4_mb_release_group_pa	192
      +ext4_mb_release_group_pa	 24
      
      Most of these stack-allocated structs are actually used only for
      mballoc history; and in those cases often a smaller struct would do.
      So changing that may be another way around it, at least for those
      functions, if preferred.  For now, in those cases where the ac
      is only for history, an allocation failure simply skips the history
      recording, and does not cause any other failures.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      
      256bdb49
    • D
      JBD2: Clear buffer_ordered flag for barried IO request on success · c4e35e07
      Dave Kleikamp 提交于
      In JBD2 jbd2_journal_write_commit_record(), clear the buffer_ordered
      flag for the bh after barried IO has succeed. This prevents later, if
      the same buffer head were submitted to the underlying device, which has
      been reconfigured to not support barrier request, the JBD2 commit code
      could treat it as a normal IO (without barrier).
      
      This is a port from JBD/ext3 fix from Neil Brown.
      
      More details from Neil:
      
      Some devices - notably dm and md - can change their behaviour in
      response to BIO_RW_BARRIER requests.  They might start out accepting
      such requests but on reconfiguration, they find out that they cannot
      any more. JBD2 deal with this by always testing if BIO_RW_BARRIER
      requests fail with EOPNOTSUPP, and retrying the write
      requests without the barrier (probably after waiting for any pending
      writes to complete).
      
      However there is a bug in the handling this in JBD2 for ext4 .
      
      When ext4/JBD2 to submit a BIO_RW_BARRIER request,
      it sets the buffer_ordered flag on the buffer head.
      If the request completes successfully, the flag STAYS SET.
      
      Other code might then write the same buffer_head after the device has
      been reconfigured to not accept barriers.  This write will then fail,
      but the "other code" is not ready to handle EOPNOTSUPP errors and the
      error will be treated as fatal.
      
      Cc:  Neil Brown <neilb@suse.de>
      Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      c4e35e07
    • J
      ext4: Fix Direct I/O locking · 7fb5409d
      Jan Kara 提交于
      We cannot start transaction in ext4_direct_IO() and just let it last
      during the whole write because dio_get_page() acquires mmap_sem which
      ranks above transaction start (e.g. because we have dependency chain
      mmap_sem->PageLock->journal_start, or because we update atime while
      holding mmap_sem) and thus deadlocks could happen. We solve the problem
      by starting a transaction separately for each ext4_get_block() call.
      
      We *could* have a problem that we allocate a block and before its data
      are written out the machine crashes and thus we expose stale data. But
      that does not happen because for hole-filling generic code falls back to
      buffered writes and for file extension, we add inode to orphan list and
      thus in case of crash, journal replay will truncate inode back to the
      original size.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      7fb5409d
    • A
      ext4: Fix circular locking dependency with migrate and rm. · 8009f9fb
      Aneesh Kumar K.V 提交于
      In order to prevent a circular locking dependency when an unlink
      operation is racing with an ext4 migration, we delay taking i_data_sem
      until just before switch the inode format, and use i_mutex to prevent
      writes and truncates during the first part of the migration operation.
      Acked-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      8009f9fb
    • C
      fix up kerneldoc in fs/ioctl.c a little bit · f6a4c8bd
      Christoph Hellwig 提交于
       - remove non-standard in/out markers
       - use tabs for formatting
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
      Cc: Erez Zadok <ezk@cs.sunysb.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f6a4c8bd
    • J
      UML: fix hostfs build · 6966a977
      Jiri Kosina 提交于
      /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/hostfs/hostfs_kern.c: In function 'hostfs_show_options':
      /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/hostfs/hostfs_kern.c:328: error: dereferencing pointer to incomplete type
      
      We need to include mount.h to get vfsmount.
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      Reported-by: NAdrian Bunk <bunk@stusta.de>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6966a977
  5. 09 2月, 2008 6 次提交