1. 07 12月, 2007 2 次提交
  2. 06 12月, 2007 7 次提交
    • A
      remove nonsense force-casts from ocfs2 · 97bd7919
      Al Viro 提交于
      endianness annotations in networking code had been in place for quite a
      while; in particular, sin_port and s_addr are annotated as big-endian.
      
      Code in ocfs2 had __force casts added apparently to shut the sparse
      warnings up; of course, these days they only serve to *produce* warnings
      for no reason whatsoever...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      97bd7919
    • A
      regression: bfs endianness bug · 7e46aa5c
      Al Viro 提交于
      BFS_FILEBLOCKS() expects struct bfs_inode * (on-disk data, with little-
      endian fields), not struct bfs_inode_info * (in-core stuff, with host-
      endian ones).
      
      It's a macro and fields with the right names are present in
      bfs_inode_info, so it compiles, but on big-endian host it gives bogus
      results.
      
      Introduced in commit f433dc56 ("Fixes to
      the BFS filesystem driver").
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7e46aa5c
    • A
      regression: cifs endianness bug · 9b5e6857
      Al Viro 提交于
      access_flags_to_mode() gets on-the-wire data (little-endian) and treats
      it as host-endian.
      
      Introduced in commit e01b6400 ("[CIFS]
      enable get mode from ACL when cifsacl mount option specified")
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9b5e6857
    • A
      proc: fix proc_dir_entry refcounting · 5a622f2d
      Alexey Dobriyan 提交于
      Creating PDEs with refcount 0 and "deleted" flag has problems (see below).
      Switch to usual scheme:
      * PDE is created with refcount 1
      * every de_get does +1
      * every de_put() and remove_proc_entry() do -1
      * once refcount reaches 0, PDE is freed.
      
      This elegantly fixes at least two following races (both observed) without
      introducing new locks, without abusing old locks, without spreading
      lock_kernel():
      
      1) PDE leak
      
      remove_proc_entry			de_put
      -----------------			------
      			[refcnt = 1]
      if (atomic_read(&de->count) == 0)
      					if (atomic_dec_and_test(&de->count))
      						if (de->deleted)
      							/* also not taken! */
      							free_proc_entry(de);
      else
      	de->deleted = 1;
      		[refcount=0, deleted=1]
      
      2) use after free
      
      remove_proc_entry			de_put
      -----------------			------
      			[refcnt = 1]
      
      					if (atomic_dec_and_test(&de->count))
      if (atomic_read(&de->count) == 0)
      	free_proc_entry(de);
      						/* boom! */
      						if (de->deleted)
      							free_proc_entry(de);
      
      BUG: unable to handle kernel paging request at virtual address 6b6b6b6b
      printing eip: c10acdda *pdpt = 00000000338f8001 *pde = 0000000000000000
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in: af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom
      Pid: 23161, comm: cat Not tainted (2.6.24-rc2-8c086340 #4)
      EIP: 0060:[<c10acdda>] EFLAGS: 00210097 CPU: 1
      EIP is at strnlen+0x6/0x18
      EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: fffffffe
      ESI: c128fa3b EDI: f380bf34 EBP: ffffffff ESP: f380be44
       DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      Process cat (pid: 23161, ti=f380b000 task=f38f2570 task.ti=f380b000)
      Stack: c10ac4f0 00000278 c12ce000 f43cd2a8 00000163 00000000 7da86067 00000400
             c128fa20 00896b18 f38325a8 c128fe20 ffffffff 00000000 c11f291e 00000400
             f75be300 c128fa20 f769c9a0 c10ac779 f380bf34 f7bfee70 c1018e6b f380bf34
      Call Trace:
       [<c10ac4f0>] vsnprintf+0x2ad/0x49b
       [<c10ac779>] vscnprintf+0x14/0x1f
       [<c1018e6b>] vprintk+0xc5/0x2f9
       [<c10379f1>] handle_fasteoi_irq+0x0/0xab
       [<c1004f44>] do_IRQ+0x9f/0xb7
       [<c117db3b>] preempt_schedule_irq+0x3f/0x5b
       [<c100264e>] need_resched+0x1f/0x21
       [<c10190ba>] printk+0x1b/0x1f
       [<c107c8ad>] de_put+0x3d/0x50
       [<c107c8f8>] proc_delete_inode+0x38/0x41
       [<c107c8c0>] proc_delete_inode+0x0/0x41
       [<c1066298>] generic_delete_inode+0x5e/0xc6
       [<c1065aa9>] iput+0x60/0x62
       [<c1063c8e>] d_kill+0x2d/0x46
       [<c1063fa9>] dput+0xdc/0xe4
       [<c10571a1>] __fput+0xb0/0xcd
       [<c1054e49>] filp_close+0x48/0x4f
       [<c1055ee9>] sys_close+0x67/0xa5
       [<c10026b6>] sysenter_past_esp+0x5f/0x85
      =======================
      Code: c9 74 0c f2 ae 74 05 bf 01 00 00 00 4f 89 fa 5f 89 d0 c3 85 c9 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f c3 89 c1 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 c3 90 90 90 57 83 c9
      EIP: [<c10acdda>] strnlen+0x6/0x18 SS:ESP 0068:f380be44
      
      Also, remove broken usage of ->deleted from reiserfs: if sget() succeeds,
      module is already pinned and remove_proc_entry() can't happen => nobody
      can mark PDE deleted.
      
      Dummy proc root in netns code is not marked with refcount 1. AFAICS, we
      never get it, it's just for proper /proc/net removal. I double checked
      CLONE_NETNS continues to work.
      
      Patch survives many hours of modprobe/rmmod/cat loops without new bugs
      which can be attributed to refcounting.
      Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5a622f2d
    • J
      jbd: Fix assertion failure in fs/jbd/checkpoint.c · d4beaf4a
      Jan Kara 提交于
      Before we start committing a transaction, we call
      __journal_clean_checkpoint_list() to cleanup transaction's written-back
      buffers.
      
      If this call happens to remove all of them (and there were already some
      buffers), __journal_remove_checkpoint() will decide to free the transaction
      because it isn't (yet) a committing transaction and soon we fail some
      assertion - the transaction really isn't ready to be freed :).
      
      We change the check in __journal_remove_checkpoint() to free only a
      transaction in T_FINISHED state.  The locking there is subtle though (as
      everywhere in JBD ;().  We use j_list_lock to protect the check and a
      subsequent call to __journal_drop_transaction() and do the same in the end
      of journal_commit_transaction() which is the only place where a transaction
      can get to T_FINISHED state.
      
      Probably I'm too paranoid here and such locking is not really necessary -
      checkpoint lists are processed only from log_do_checkpoint() where a
      transaction must be already committed to be processed or from
      __journal_clean_checkpoint_list() where kjournald itself calls it and thus
      transaction cannot change state either.  Better be safe if something
      changes in future...
      Signed-off-by: NJan Kara <jack@suse.cz>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d4beaf4a
    • E
      ufs: fix nexstep dir block size · 0c664f97
      Evgeniy Dushistov 提交于
      This patch fixes regression, introduced since 2.6.16.  NextStep variant of
      UFS as OpenStep uses directory block size equals to 1024.  Without this
      change, ufs_check_page fails in many cases.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NEvgeniy Dushistov <dushistov@mail.ru>
      Cc: Dave Bailey <dsbailey@pacbell.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0c664f97
    • J
      aio: only account I/O wait time in read_events if there are active requests · e00ba3da
      Jeff Moyer 提交于
      On 2.6.24, top started showing 100% iowait on one CPU when a UML instance was
      running (but completely idle).  The UML code sits in io_getevents waiting for
      an event to be submitted and completed.
      
      Fix this by checking ctx->reqs_active before scheduling to determine whether
      or not we are waiting for I/O.
      Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e00ba3da
  3. 01 12月, 2007 1 次提交
    • E
      [NETNS]: Fix /proc/net breakage · 2b1e300a
      Eric W. Biederman 提交于
      Well I clearly goofed when I added the initial network namespace support
      for /proc/net.  Currently things work but there are odd details visible to
      user space, even when we have a single network namespace.
      
      Since we do not cache proc_dir_entry dentries at the moment we can just
      modify ->lookup to return a different directory inode depending on the
      network namespace of the process looking at /proc/net, replacing the
      current technique of using a magic and fragile follow_link method.
      
      To accomplish that this patch:
      - introduces a shadow_proc method to allow different dentries to
        be returned from proc_lookup.
      - Removes the old /proc/net follow_link magic
      - Fixes a weakness in our not caching of proc generic dentries.
      
      As shadow_proc uses a task struct to decided which dentry to return we can
      go back later and fix the proc generic caching without modifying any code
      that uses the shadow_proc method.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      2b1e300a
  4. 30 11月, 2007 10 次提交
  5. 29 11月, 2007 2 次提交
  6. 28 11月, 2007 8 次提交
  7. 27 11月, 2007 8 次提交
  8. 25 11月, 2007 1 次提交
  9. 21 11月, 2007 1 次提交
    • J
      [CIFS] Fix potential data corruption when writing out cached dirty pages · cea21805
      Jeff Layton 提交于
      Fix RedHat bug 329431
      
      The idea here is separate "conscious" from "unconscious" flushes.
      Conscious flushes are those due to a fsync() or close(). Unconscious
      ones are flushes that occur as a side effect of some other operation or
      due to memory pressure.
      
      Currently, when an error occurs during an unconscious flush (ENOSPC or
      EIO), we toss out the page and don't preserve that error to report to
      the user when a conscious flush occurs. If after the unconscious flush,
      there are no more dirty pages for the inode, the conscious flush will
      simply return success even though there were previous errors when writing
      out pages. This can lead to data corruption.
      
      The easiest way to reproduce this is to mount up a CIFS share that's
      very close to being full or where the user is very close to quota. mv
      a file to the share that's slightly larger than the quota allows. The
      writes will all succeed (since they go to pagecache). The mv will do a
      setattr to set the new file's attributes. This calls
      filemap_write_and_wait,
      which will return an error since all of the pages can't be written out.
      Then later, when the flush and release ops occur, there are no more
      dirty pages in pagecache for the file and those operations return 0. mv
      then assumes that the file was written out correctly and deletes the
      original.
      
      CIFS already has a write_behind_rc variable where it stores the results
      from earlier flushes, but that value is only reported in cifs_close.
      Since the VFS ignores the return value from the release operation, this
      isn't helpful. We should be reporting this error during the flush
      operation.
      
      This patch does the following:
      
      1) changes cifs_fsync to use filemap_write_and_wait and cifs_flush and also
      sync to check its return code. If it returns successful, they then check
      the value of write_behind_rc to see if an earlier flush had reported any
      errors. If so, they return that error and clear write_behind_rc.
      
      2) sets write_behind_rc in a few other places where pages are written
      out as a side effect of other operations and the code waits on them.
      
      3) changes cifs_setattr to only call filemap_write_and_wait for
      ATTR_SIZE changes.
      
      4) makes cifs_writepages accurately distinguish between EIO and ENOSPC
      errors when writing out pages.
      
      Some simple testing indicates that the patch works as expected and that
      it fixes the reproduceable known problem.
      Acked-by: NDave Kleikamp <shaggy@austin.rr.com>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      cea21805