1. 17 Oct, 2008 40 commits
    • block: add partition attribute for partition number · 0fc71e3d
      Tejun Heo committed
      With extended devt, finding out the partition number becomes a bit
      more challenging as subtracting the minor number from that of the
      parent device doesn't work anymore.  The only thing left is parsing
      the partition name which is brittle and not exactly universal (some
      have '-' between the device name and partition number while others
      don't).  This patch introduces the partition attribute, which contains the
      partition number of the device.  This should make finding partitions
      and their indexes easier.
      
      This problem and solution were suggested by H. Peter Anvin.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
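      As a rough illustration of how userspace might consume the new attribute,
      the sketch below reads a `partition` file under a sysfs block directory.
      The function name `partition_number` and the `sysfs_root` parameter are
      hypothetical; the default path assumes the attribute is exposed as
      /sys/class/block/<dev>/partition.

```python
from pathlib import Path
from typing import Optional


def partition_number(dev_name: str,
                     sysfs_root: str = "/sys/class/block") -> Optional[int]:
    """Return the partition index of a block device, or None if it has none.

    Hypothetical helper: assumes the attribute lives at
    <sysfs_root>/<dev_name>/partition and holds a decimal integer.
    """
    attr = Path(sysfs_root) / dev_name / "partition"
    try:
        return int(attr.read_text().strip())
    except FileNotFoundError:
        # Whole-disk devices (and kernels without the attribute) have no file.
        return None
```

      This avoids both the minor-number subtraction and the name parsing that
      the commit message describes as unreliable.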
    • Configure out AIO support · ebf3f09c
      Thomas Petazzoni committed
      This patch adds the CONFIG_AIO option, which allows removing support
      for asynchronous I/O operations, which are not necessarily used by
      applications, particularly on embedded devices. As this is a
      size-reduction option, it depends on CONFIG_EMBEDDED. It saves
      ~7 kilobytes of kernel code/data:
      
         text	   data	    bss	    dec	    hex	filename
      1115067	 119180	 217088	1451335	 162547	vmlinux
      1108025	 119048	 217088	1444161	 160941	vmlinux.new
        -7042    -132       0   -7174   -1C06 +/-
      
      This patch was originally written by Matt Mackall
      <mpm@selenic.com>, and is part of the Linux Tiny project.
      
      [randy.dunlap@oracle.com: build fix]
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Zach Brown <zach.brown@oracle.com>
      Signed-off-by: Matt Mackall <mpm@selenic.com>
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
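      The savings row in the size table of the AIO commit can be re-derived from
      the two vmlinux rows. The short check below only redoes that arithmetic;
      the dictionary names `old` and `new` are made up for the example.

```python
# Section sizes copied from the size table in the commit message.
old = {"text": 1115067, "data": 119180, "bss": 217088}  # vmlinux
new = {"text": 1108025, "data": 119048, "bss": 217088}  # vmlinux.new

# Per-section change and overall change in bytes (negative means smaller).
delta = {name: new[name] - old[name] for name in old}
total = sum(delta.values())

print(delta)                      # per-section differences
print(total, format(-total, "X")) # total difference, magnitude in hex
```

      The totals of each row (the "dec" column) are the sums of text, data and
      bss, so the overall saving is the sum of the per-section differences.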
    • afs: convert to new aops · 15b4650e
      Nick Piggin committed
      Cannot assume writes will fully complete, so this conversion goes the easy
      way and always brings the page uptodate before the write.
      
      [dhowells@redhat.com: style tweaks]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • pid_ns: de_thread: kill the now unneeded ->child_reaper change · 07edbde5
      Oleg Nesterov committed
      de_thread() checks if the old leader was the ->child_reaper, this is not
      possible any longer.  With the previous patch ->group_leader itself will
      change ->child_reaper on exit.
      
      Henceforth find_new_reaper() is the only function (apart from
      initialization) which plays with ->child_reaper.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Acked-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: move sysrq-trigger out of fs/proc/ · f40cbaa5
      Alexey Dobriyan committed
      Move it into sysrq.c, along with the rest of the sysrq implementation.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • block: sanitize invalid partition table entries · ac0d86f5
      Kay Sievers committed
      We currently follow blindly what the partition table lies about the
      disk, and let the kernel create block devices which can not be accessed.
      Trying to identify the device leads to kernel logs full of:
        sdb: rw=0, want=73392, limit=28800
        attempt to access beyond end of device
      
      Here is an example of a broken partition table, where sdb2 starts
      behind the end of the disk, and sdb3 is larger than the entire disk:
        Disk /dev/sdb: 14 MB, 14745600 bytes
        1 heads, 29 sectors/track, 993 cylinders, total 28800 sectors
           Device Boot      Start         End      Blocks   Id  System
        /dev/sdb1              29        7800        3886   83  Linux
        /dev/sdb2           37801       45601        3900+  83  Linux
        /dev/sdb3           15602       73402       28900+  83  Linux
        /dev/sdb4           23403       28796        2697   83  Linux
      
      The kernel creates these completely invalid devices, which can not be
      accessed, or may lead to other unpredictable failures:
        grep . /sys/class/block/sdb*/{start,size}
        /sys/class/block/sdb/size:28800
        /sys/class/block/sdb1/start:29
        /sys/class/block/sdb1/size:7772
        /sys/class/block/sdb2/start:37801
        /sys/class/block/sdb2/size:7801
        /sys/class/block/sdb3/start:15602
        /sys/class/block/sdb3/size:57801
        /sys/class/block/sdb4/start:23403
        /sys/class/block/sdb4/size:5394
      
      With this patch, we ignore partitions which start behind the end of the disk,
      and limit partitions to the end of the disk if they pretend to be larger:
        grep . /sys/class/block/sdb*/{start,size}
        /sys/class/block/sdb/size:28800
        /sys/class/block/sdb1/start:29
        /sys/class/block/sdb1/size:7772
        /sys/class/block/sdb3/start:15602
        /sys/class/block/sdb3/size:13198
        /sys/class/block/sdb4/start:23403
        /sys/class/block/sdb4/size:5394
      
      These warnings are printed to the kernel log:
        sdb: p2 ignored, start 37801 is behind the end of the disk
        sdb: p3 size 57801 limited to end of disk
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
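      The sanitizing rule described in the partition-table commit can be
      sketched as follows. This is an illustrative model, not the kernel code:
      the function name `sanitize_partitions` and the exact boundary comparisons
      (`>=` for the start, `>` for the end) are assumptions, and the input is
      the sdb example from the commit message.

```python
from typing import Dict, List, Tuple

Partition = Tuple[int, int]  # (start sector, size in sectors)


def sanitize_partitions(disk: str, disk_sectors: int,
                        table: Dict[int, Partition]
                        ) -> Tuple[Dict[int, Partition], List[str]]:
    """Drop partitions starting past the disk end and clamp oversized ones."""
    kept: Dict[int, Partition] = {}
    warnings: List[str] = []
    for num, (start, size) in sorted(table.items()):
        if start >= disk_sectors:
            # Nothing of this partition lies on the disk: ignore it entirely.
            warnings.append(
                f"{disk}: p{num} ignored, start {start} is behind the end of the disk")
            continue
        if start + size > disk_sectors:
            # The partition runs off the end: shrink it to fit.
            warnings.append(f"{disk}: p{num} size {size} limited to end of disk")
            size = disk_sectors - start
        kept[num] = (start, size)
    return kept, warnings


table = {1: (29, 7772), 2: (37801, 7801), 3: (15602, 57801), 4: (23403, 5394)}
kept, warnings = sanitize_partitions("sdb", 28800, table)
```

      With the example table, sdb2 is dropped and sdb3 is clamped to
      28800 - 15602 = 13198 sectors, matching the sysfs values shown for the
      patched kernel.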
    • fs/partitions/acorn.c: remove dead code · 6722e45c
      Adrian Bunk committed
      I missed this when I did the arm26 removal.
      Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • COMPAT_BINFMT_ELF definition tweak · 4cea5ceb
      Alexey Dobriyan committed
      Don't repeat BINFMT_ELF definition, simply multiply COMPAT and BINFMT_ELF.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • binfmt_elf_fdpic: wire up AT_EXECFD, AT_EXECFN, AT_SECURE · 5edc2a51
      Paul Mundt committed
      These auxvec entries are the only ones left unhandled out of the current
      base implementation. This syncs up binfmt_elf_fdpic with linux/auxvec.h
      and current binfmt_elf.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • binfmt_elf_fdpic: convert initial stack alignment to arch_align_stack() · c7637941
      Paul Mundt committed
      binfmt_elf_fdpic seems to have grabbed a hard-coded hack from an ancient
      version of binfmt_elf in order to try to fix up initial stack alignment
      on multi-threaded x86.  In addition to being unused, the hack was also
      pushed down beyond the first set of operations on the stack pointer,
      negating the entire purpose.
      
      These days, we have an architecture independent arch_align_stack(), so we
      switch to using that instead. Move the initial alignment up before the
      initial stores while we're at it.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • binfmt_elf_fdpic: support auxvec base platform string · ec23847d
      Paul Mundt committed
      Commit 483fad1c ("ELF loader support for
      auxvec base platform string") introduced AT_BASE_PLATFORM, but only
      implemented it for binfmt_elf.
      
      Given that AT_VECTOR_SIZE_BASE is unconditionally enlarged for us, and
      it's only optionally added in for the platforms that set
      ELF_BASE_PLATFORM, wire it up for binfmt_elf_fdpic, too.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • quota: remove CVS keywords · b73c29f6
      Adrian Bunk committed
      Remove CVS keywords that weren't updated for a long time from comments.
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Acked-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/reiserfs: use an IS_ERR test rather than a NULL test · 67b172c0
      Julien Brunel committed
      In case of error, the function open_xa_dir returns an ERR pointer, but
      never returns a NULL pointer.  So a NULL test that comes after an IS_ERR
      test should be deleted.
      
      The semantic match that finds this problem is as follows:
      (http://www.emn.fr/x-info/coccinelle/)
      
      // <smpl>
      @match_bad_null_test@
      expression x, E;
      statement S1,S2;
      @@
      x = open_xa_dir(...)
      ... when != x = E
      (
      *  if (x == NULL && ...) S1 else S2
      |
      *  if (x == NULL || ...) S1 else S2
      )
      // </smpl>
      Signed-off-by: Julien Brunel <brunel@diku.dk>
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Jan Kara <jack@ucw.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • reiserfs/procfs.c: remove CVS keywords · 6b23ea76
      Adrian Bunk committed
      Remove CVS keywords that weren't updated for a long time from comments.
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hfs: fix namelength memory corruption · d38b7aa7
      Eric Sesterhenn committed
      Fix a stack corruption caused by a corrupted hfs filesystem.  If the
      catalog name length is corrupted the memcpy overwrites the catalog btree
      structure.  Since the field is limited to HFS_NAMELEN bytes in the
      structure and the file format, we throw an error if it is too long.
      
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
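      The idea behind the hfs fix, validating an on-disk length field before it
      drives a copy, can be modelled in a few lines. This is only a sketch: the
      record layout (one length byte followed by the name) is a simplification
      rather than the real catalog key structure, `read_catalog_name` is a
      hypothetical helper, and the value 31 for HFS_NAMELEN is an assumption.

```python
HFS_NAMELEN = 31  # assumption: maximum catalog name length in bytes


def read_catalog_name(record: bytes) -> bytes:
    """Extract a length-prefixed name, rejecting corrupted length fields."""
    if not record:
        raise ValueError("empty record")
    length = record[0]
    if length > HFS_NAMELEN:
        # A corrupted image can claim any length up to 255; refuse it
        # instead of copying past the end of the fixed-size name field.
        raise ValueError(f"name length {length} exceeds {HFS_NAMELEN}")
    if len(record) < 1 + length:
        raise ValueError("record shorter than its declared name length")
    return record[1:1 + length]
```

      Raising an error here corresponds to the commit's choice to throw an
      error when the length is too long, rather than silently truncating.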
    • hfsplus: check read_mapping_page() return value · 649f1ee6
      Eric Sesterhenn committed
      While testing more corrupted images with hfsplus, I came across
      one which triggered the following bug:
      
      [15840.675016] BUG: unable to handle kernel paging request at fffffffb
      [15840.675016] IP: [<c0116a4f>] kmap+0x15/0x56
      [15840.675016] *pde = 00008067 *pte = 00000000
      [15840.675016] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
      [15840.675016] Modules linked in:
      [15840.675016]
      [15840.675016] Pid: 11575, comm: ln Not tainted (2.6.27-rc4-00123-gd3ee1b40-dirty #29)
      [15840.675016] EIP: 0060:[<c0116a4f>] EFLAGS: 00010202 CPU: 0
      [15840.675016] EIP is at kmap+0x15/0x56
      [15840.675016] EAX: 00000246 EBX: fffffffb ECX: 00000000 EDX: cab919c0
      [15840.675016] ESI: 000007dd EDI: cab0bcf4 EBP: cab0bc98 ESP: cab0bc94
      [15840.675016]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
      [15840.675016] Process ln (pid: 11575, ti=cab0b000 task=cab919c0 task.ti=cab0b000)
      [15840.675016] Stack: 00000000 cab0bcdc c0231cfb 00000000 cab0bce0 00000800 ca9290c0 fffffffb
      [15840.675016]        cab145d0 cab919c0 cab15998 22222222 22222222 22222222 00000001 cab15960
      [15840.675016]        000007dd cab0bcf4 cab0bd04 c022cb3a cab0bcf4 cab15a6c ca9290c0 00000000
      [15840.675016] Call Trace:
      [15840.675016]  [<c0231cfb>] ? hfsplus_block_allocate+0x6f/0x2d3
      [15840.675016]  [<c022cb3a>] ? hfsplus_file_extend+0xc4/0x1db
      [15840.675016]  [<c022ce41>] ? hfsplus_get_block+0x8c/0x19d
      [15840.675016]  [<c06adde4>] ? sub_preempt_count+0x9d/0xab
      [15840.675016]  [<c019ece6>] ? __block_prepare_write+0x147/0x311
      [15840.675016]  [<c0161934>] ? __grab_cache_page+0x52/0x73
      [15840.675016]  [<c019ef4f>] ? block_write_begin+0x79/0xd5
      [15840.675016]  [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d
      [15840.675016]  [<c019f22a>] ? cont_write_begin+0x27f/0x2af
      [15840.675016]  [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d
      [15840.675016]  [<c0139ebe>] ? tick_program_event+0x28/0x4c
      [15840.675016]  [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
      [15840.675016]  [<c022b723>] ? hfsplus_write_begin+0x2d/0x32
      [15840.675016]  [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d
      [15840.675016]  [<c0161988>] ? pagecache_write_begin+0x33/0x107
      [15840.675016]  [<c01879e5>] ? __page_symlink+0x3c/0xae
      [15840.675016]  [<c019ad34>] ? __mark_inode_dirty+0x12f/0x137
      [15840.675016]  [<c0187a70>] ? page_symlink+0x19/0x1e
      [15840.675016]  [<c022e6eb>] ? hfsplus_symlink+0x41/0xa6
      [15840.675016]  [<c01886a9>] ? vfs_symlink+0x99/0x101
      [15840.675016]  [<c018a2f6>] ? sys_symlinkat+0x6b/0xad
      [15840.675016]  [<c018a348>] ? sys_symlink+0x10/0x12
      [15840.675016]  [<c01038bd>] ? sysenter_do_call+0x12/0x31
      [15840.675016]  =======================
      [15840.675016] Code: 00 00 75 10 83 3d 88 2f ec c0 02 75 07 89 d0 e8 12 56 05 00 5d c3 55 ba 06 00 00 00 89 e5 53 89 c3 b8 3d eb 7e c0 e8 16 74 00 00 <8b> 03 c1 e8 1e 69 c0 d8 02 00 00 05 b8 69 8e c0 2b 80 c4 02 00
      [15840.675016] EIP: [<c0116a4f>] kmap+0x15/0x56 SS:ESP 0068:cab0bc94
      [15840.675016] ---[ end trace 4fea40dad6b70e5f ]---
      
      This happens because the return value of read_mapping_page() is passed on
      to kmap unchecked.  The bug is triggered after the first
      read_mapping_page() in hfsplus_block_allocate().  This patch fixes all
      three usages in this function but leaves the ones further down in the
      file unchanged.
      Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hfsplus: fix Buffer overflow with a corrupted image · efc7ffcb
      Eric Sesterhenn committed
      When an hfsplus image gets corrupted it might happen that the catalog
      namelength field gets b0rked.  If we mount such an image the memcpy() in
      hfsplus_cat_build_key_uni() writes more than the 255 that fit in the name
      field.  Depending on the size of the overwritten data, we either only get
      memory corruption or also trigger an oops like this:
      
      [  221.628020] BUG: unable to handle kernel paging request at c82b0000
      [  221.629066] IP: [<c022d4b1>] hfsplus_find_cat+0x10d/0x151
      [  221.629066] *pde = 0ea29163 *pte = 082b0160
      [  221.629066] Oops: 0002 [#1] PREEMPT DEBUG_PAGEALLOC
      [  221.629066] Modules linked in:
      [  221.629066]
      [  221.629066] Pid: 4845, comm: mount Not tainted (2.6.27-rc4-00123-gd3ee1b40-dirty #28)
      [  221.629066] EIP: 0060:[<c022d4b1>] EFLAGS: 00010206 CPU: 0
      [  221.629066] EIP is at hfsplus_find_cat+0x10d/0x151
      [  221.629066] EAX: 00000029 EBX: 00016210 ECX: 000042c2 EDX: 00000002
      [  221.629066] ESI: c82d70ca EDI: c82b0000 EBP: c82d1bcc ESP: c82d199c
      [  221.629066]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
      [  221.629066] Process mount (pid: 4845, ti=c82d1000 task=c8224060 task.ti=c82d1000)
      [  221.629066] Stack: c080b3c4 c82aa8f8 c82d19c2 00016210 c080b3be c82d1bd4 c82aa8f0 00000300
      [  221.629066]        01000000 750008b1 74006e00 74006900 65006c00 c82d6400 c013bd35 c8224060
      [  221.629066]        00000036 00000046 c82d19f0 00000082 c8224548 c8224060 00000036 c0d653cc
      [  221.629066] Call Trace:
      [  221.629066]  [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
      [  221.629066]  [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
      [  221.629066]  [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
      [  221.629066]  [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
      [  221.629066]  [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
      [  221.629066]  [<c0107aa3>] ? native_sched_clock+0x82/0x96
      [  221.629066]  [<c01302d2>] ? __kernel_text_address+0x1b/0x27
      [  221.629066]  [<c010487a>] ? dump_trace+0xca/0xd6
      [  221.629066]  [<c0109e32>] ? save_stack_address+0x0/0x2c
      [  221.629066]  [<c0109eaf>] ? save_stack_trace+0x1c/0x3a
      [  221.629066]  [<c013b571>] ? save_trace+0x37/0x8d
      [  221.629066]  [<c013b62e>] ? add_lock_to_list+0x67/0x8d
      [  221.629066]  [<c013ea1c>] ? validate_chain+0x8a4/0x9f4
      [  221.629066]  [<c013553d>] ? down+0xc/0x2f
      [  221.629066]  [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0
      [  221.629066]  [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
      [  221.629066]  [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
      [  221.629066]  [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
      [  221.629066]  [<c0107aa3>] ? native_sched_clock+0x82/0x96
      [  221.629066]  [<c013da5d>] ? mark_held_locks+0x43/0x5a
      [  221.629066]  [<c013dc3a>] ? trace_hardirqs_on+0xb/0xd
      [  221.629066]  [<c013dbf4>] ? trace_hardirqs_on_caller+0xf4/0x12f
      [  221.629066]  [<c06abec8>] ? _spin_unlock_irqrestore+0x42/0x58
      [  221.629066]  [<c013555c>] ? down+0x2b/0x2f
      [  221.629066]  [<c022aa68>] ? hfsplus_iget+0xa0/0x154
      [  221.629066]  [<c022b0b9>] ? hfsplus_fill_super+0x280/0x447
      [  221.629066]  [<c0107aa3>] ? native_sched_clock+0x82/0x96
      [  221.629066]  [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
      [  221.629066]  [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
      [  221.629066]  [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0
      [  221.629066]  [<c041c9e4>] ? string+0x2b/0x74
      [  221.629066]  [<c041cd16>] ? vsnprintf+0x2e9/0x512
      [  221.629066]  [<c010487a>] ? dump_trace+0xca/0xd6
      [  221.629066]  [<c0109eaf>] ? save_stack_trace+0x1c/0x3a
      [  221.629066]  [<c0109eaf>] ? save_stack_trace+0x1c/0x3a
      [  221.629066]  [<c013b571>] ? save_trace+0x37/0x8d
      [  221.629066]  [<c013b62e>] ? add_lock_to_list+0x67/0x8d
      [  221.629066]  [<c013ea1c>] ? validate_chain+0x8a4/0x9f4
      [  221.629066]  [<c01354d3>] ? up+0xc/0x2f
      [  221.629066]  [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0
      [  221.629066]  [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
      [  221.629066]  [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
      [  221.629066]  [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
      [  221.629066]  [<c0107aa3>] ? native_sched_clock+0x82/0x96
      [  221.629066]  [<c041cfb7>] ? snprintf+0x1b/0x1d
      [  221.629066]  [<c01ba466>] ? disk_name+0x25/0x67
      [  221.629066]  [<c0183960>] ? get_sb_bdev+0xcd/0x10b
      [  221.629066]  [<c016ad92>] ? kstrdup+0x2a/0x4c
      [  221.629066]  [<c022a7b3>] ? hfsplus_get_sb+0x13/0x15
      [  221.629066]  [<c022ae39>] ? hfsplus_fill_super+0x0/0x447
      [  221.629066]  [<c0183583>] ? vfs_kern_mount+0x3b/0x76
      [  221.629066]  [<c0183602>] ? do_kern_mount+0x32/0xba
      [  221.629066]  [<c01960d4>] ? do_new_mount+0x46/0x74
      [  221.629066]  [<c0196277>] ? do_mount+0x175/0x193
      [  221.629066]  [<c013dbf4>] ? trace_hardirqs_on_caller+0xf4/0x12f
      [  221.629066]  [<c01663b2>] ? __get_free_pages+0x1e/0x24
      [  221.629066]  [<c06ac07b>] ? lock_kernel+0x19/0x8c
      [  221.629066]  [<c01962e6>] ? sys_mount+0x51/0x9b
      [  221.629066]  [<c01962f9>] ? sys_mount+0x64/0x9b
      [  221.629066]  [<c01038bd>] ? sysenter_do_call+0x12/0x31
      [  221.629066]  =======================
      [  221.629066] Code: 89 c2 c1 e2 08 c1 e8 08 09 c2 8b 85 e8 fd ff ff 66 89 50 06 89 c7 53 83 c7 08 56 57 68 c4 b3 80 c0 e8 8c 5c ef ff 89 d9 c1 e9 02 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 83 c3 06 8b 95 e8 fd ff ff 0f
      [  221.629066] EIP: [<c022d4b1>] hfsplus_find_cat+0x10d/0x151 SS:ESP 0068:c82d199c
      [  221.629066] ---[ end trace e417a1d67f0d0066 ]---
      
      Since hfsplus_cat_build_key_uni() returns void and only has one callsite,
      the check is performed at the callsite.
      Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hfsplus: quieten down mounting hfsplus journaled fs read only · 81a73719
      Mike Crowe committed
      Check whether the file system was to be mounted read only anyway before
      warning about changing the mount to read only.
      Signed-off-by: Mike Crowe <mac@mcrowe.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • befs: annotate fs32 on tests for superblock endianness · 152b95a1
      Harvey Harrison committed
      Does compile-time byteswapping rather than runtime.
      
      Noticed by sparse:
      fs/befs/super.c:29:6: warning: cast to restricted __le32
      fs/befs/super.c:29:6: warning: cast from restricted fs32
      fs/befs/super.c:31:11: warning: cast to restricted __be32
      fs/befs/super.c:31:11: warning: cast from restricted fs32
      fs/befs/super.c:31:11: warning: cast to restricted __be32
      fs/befs/super.c:31:11: warning: cast from restricted fs32
      fs/befs/super.c:31:11: warning: cast to restricted __be32
      fs/befs/super.c:31:11: warning: cast from restricted fs32
      fs/befs/super.c:31:11: warning: cast to restricted __be32
      fs/befs/super.c:31:11: warning: cast from restricted fs32
      fs/befs/super.c:31:11: warning: cast to restricted __be32
      fs/befs/super.c:31:11: warning: cast from restricted fs32
      fs/befs/super.c:31:11: warning: cast to restricted __be32
      fs/befs/super.c:31:11: warning: cast from restricted fs32
      fs/befs/linuxvfs.c:811:7: warning: cast to restricted __le32
      fs/befs/linuxvfs.c:811:7: warning: cast from restricted fs32
      fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
      fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
      fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
      fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
      fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
      fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
      fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
      fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
      fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
      fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
      fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
      fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
      Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
      Cc: "Sergey S. Kostyliov" <rathamahata@php4.ru>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ext2: avoid printk floods in the face of directory corruption · bd39597c
      Eric Sandeen committed
      A very large directory with many read failures (either due to storage
      problems, or due to invalid size & blocks from corruption) will generate a
      printk storm as the filesystem continues to try to read all the blocks.
      This flood of messages can tie up the box until it is complete - which may
      be a very long time, especially for very large corrupted values.
      
      This is fixed by only reporting the corruption once each time we try to
      read the directory.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Eugene Teo <eugeneteo@kernel.sg>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
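      The "report once per directory read" behaviour from the ext2 commit can be
      illustrated with a generic flag-based loop. This is a sketch of the idea
      only and is not claimed to mirror the ext2 code: `read_directory`,
      `read_block` and `log` are hypothetical names, and whether the real fix
      keeps scanning after the first failure is not stated in the commit message.

```python
from typing import Callable, Iterable, List


def read_directory(blocks: Iterable[int],
                   read_block: Callable[[int], List[str]],
                   log: Callable[[str], None]) -> List[str]:
    """Read every block of a directory, logging at most one error per call."""
    entries: List[str] = []
    reported = False  # reset on every call, i.e. once per directory read
    for block_no in blocks:
        try:
            entries.extend(read_block(block_no))
        except OSError as err:
            if not reported:
                log(f"error reading directory block {block_no}: {err}")
                reported = True
            # Later failures in the same pass are skipped silently.
    return entries
```

      Because the flag is local to the call, a fresh attempt to read the same
      directory reports the problem again, but a single pass over a very large
      corrupted directory produces one message instead of one per block.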
    • ext2: fix ext2 block reservation early ENOSPC issue · d707d31c
      Mingming Cao committed
      We could run into an ENOSPC error on ext2 even when there are free blocks on
      the filesystem.
      
      The problem is triggered when the goal block group has 0 free
      blocks and the rest of the block groups are skipped due to the check of
      "free_blocks < windowsz/2".  The current code could fall back to
      non-reservation allocation to prevent early ENOSPC after examining all the
      block groups with reservation on, but this code was bypassed if the
      reservation window is turned off already, which is true in this case.
      
      This patch fixes two issues:
      1) We don't need to turn off block reservation if the goal block group has
      0 free blocks left; we should continue searching the rest of the block groups.
      
      The intention of the current code is to turn off the block reservation if
      the goal allocation group has a few (some) free blocks left (not enough to
      make the desired reservation window), and to try to allocate in the goal
      block group, to get better locality.  But if the goal block group has 0 free
      blocks, it should leave the block reservation on and continue searching
      the next block groups, rather than turning off block reservation completely.
      
      2) We don't need to check the window size if the block reservation is off.
      
      The problem was originally found and fixed in ext4.
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: add miscellaneous device for ioctls · 8d7b48e0
      Ian Kent committed
      Add a miscellaneous device to the autofs4 module for routing ioctls.  This
      provides the ability to obtain an ioctl file handle for an autofs mount
      point that is possibly covered by another mount.
      
      The actual problem with autofs is that it can't reconnect to existing
      mounts.  One immediately thinks that just adding the ability to remount
      autofs file systems would solve it, but alas, that can't work.  This is
      because autofs direct mounts and the implementation of "on demand mount
      and expire" of nested mount trees have the file system mounted on top of
      the mount trigger dentry.
      
      To resolve this a miscellaneous device node for routing ioctl commands to
      these mount points has been implemented in the autofs4 kernel module and a
      library added to autofs.  This provides the ability to open a file
      descriptor for these over mounted autofs mount points.
      
      Please refer to Documentation/filesystems/autofs4-mount-control.txt for a
      discussion of the problem, implementation alternatives considered and a
      description of the interface.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: track uid and gid of last mount requester · c0f54d3e
      Ian Kent committed
      Track the uid and gid of the last process to request a mount on an
      autofs dentry.
      
      [akpm@linux-foundation.org: fix tpyo in comment]
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: cleanup autofs mount type usage · bb979d7f
      Ian Kent committed
      Usage of the AUTOFS_TYPE_* defines is a little confusing and appears
      inconsistent.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • eCryptfs: remove netlink transport · 624ae528
      Tyler Hicks committed
      The netlink transport code has not worked for a while and the miscdev
      transport is a simpler solution.  This patch removes the netlink code and
      makes the miscdev transport the only eCryptfs kernel to userspace
      transport.
      Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
      Cc: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: Dustin Kirkland <kirkland@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ecryptfs: convert to use new aops · 807b7ebe
      Badari Pulavarty committed
      Convert ecryptfs to use write_begin/write_end
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
      Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • eCryptfs: remove retry loop in ecryptfs_readdir() · 7d6c7045
      Michael Halcrow committed
      The retry block in ecryptfs_readdir() has been in the eCryptfs code base
      for a while, apparently for no good reason.  This loop could potentially
      run without terminating.  This patch removes the loop, instead erroring
      out if vfs_readdir() on the lower file fails.
      Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
      Reported-by: Al Viro <viro@ZinIV.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Allow recursion in binfmt_script and binfmt_misc · bf2a9a39
      Kirill A. Shutemov committed
      binfmt_script and binfmt_misc disallow recursion to avoid stack overflow
      using sh_bang and misc_bang.  This causes problems in some cases:
      
      $ echo '#!/bin/ls' > /tmp/t0
      $ echo '#!/tmp/t0' > /tmp/t1
      $ echo '#!/tmp/t1' > /tmp/t2
      $ chmod +x /tmp/t*
      $ /tmp/t2
      zsh: exec format error: /tmp/t2
      
      binfmt_misc has a similar problem.
      
      This patch introduces the field 'recursion_depth' into struct linux_binprm
      to track the recursion level in binfmt_misc and binfmt_script.  If the
      recursion level exceeds BINPRM_MAX_RECURSION, -ENOEXEC is returned.
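
      The depth-tracking idea can be sketched in userspace as follows.  This is
      only an illustration, not the kernel code: struct exe_sketch, struct
      binprm_sketch, resolve_exe() and the limit of 4 are all assumptions made
      for the example, and a lookup table stands in for reading "#!" lines.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Illustrative limit; the real BINPRM_MAX_RECURSION value is not given above. */
#define SKETCH_MAX_RECURSION 4

/* Hypothetical stand-in for an executable: interp == NULL means a native binary. */
struct exe_sketch {
	const char *path;
	const char *interp;
};

/* Hypothetical stand-in for the relevant part of struct linux_binprm. */
struct binprm_sketch {
	unsigned int recursion_depth;
};

static const struct exe_sketch *find_exe(const struct exe_sketch *tab,
					 size_t n, const char *path)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tab[i].path, path) == 0)
			return &tab[i];
	return NULL;
}

/* Follow interpreters until a native binary is found or the limit is hit. */
static int resolve_exe(const struct exe_sketch *tab, size_t n,
		       const char *path, struct binprm_sketch *bprm)
{
	const struct exe_sketch *exe;

	assert(tab != NULL && path != NULL && bprm != NULL);

	exe = find_exe(tab, n, path);
	if (exe == NULL)
		return -ENOENT;
	if (exe->interp == NULL)
		return 0;		/* native binary: nothing left to resolve */
	if (bprm->recursion_depth > SKETCH_MAX_RECURSION)
		return -ENOEXEC;	/* chain too deep, refuse to go further */
	bprm->recursion_depth++;
	return resolve_exe(tab, n, exe->interp, bprm);
}
```

      With a counter instead of a boolean flag, a chain such as t2 -> t1 -> t0
      -> /bin/ls resolves, while a script naming itself as interpreter still
      ends in -ENOEXEC.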
      
      [akpm@linux-foundation.org: make linux_binprm.recursion_depth a uint]
      Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bf2a9a39
    • K
      alpha: introduce field 'taso' into struct linux_binprm · 53112488
      Kirill A. Shutemov committed
      This change is Alpha-specific.  It adds field 'taso' into struct
      linux_binprm to remember if the application is TASO.  Previously, field
      sh_bang was used for this purpose.
      Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      53112488
    • A
      binfmt_som.c: add MODULE_LICENSE · cde162c2
      Adrian Bunk committed
      Add the missing MODULE_LICENSE("GPL").
      Reported-by: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cde162c2
    • C
      compat: move cp_compat_stat to common code · f7a5000f
      Christoph Hellwig committed
      struct stat / compat_stat is the same on all architectures, so
      cp_compat_stat should be, too.
      
      It turns out that it is, except that some architectures differ slightly,
      using high2lowuid/high2lowgid or a direct assignment instead of
      SET_UID/SET_GID, which expands to the correct one anyway.
      
      This patch replaces the arch-specific cp_compat_stat implementations with
      a common one based on the x86-64 one.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
      Acked-by: Kyle McMartin <kyle@mcmartin.ca> [ parisc bits ]
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f7a5000f
    • F
      Remove Andrew Morton's old email accounts · e1f8e874
      Francois Cami committed
      People can use the real name as an index into MAINTAINERS to find the
      current email address.
      Signed-off-by: Francois Cami <francois.cami@free.fr>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e1f8e874
    • D
      epoll: drop unnecessary test · f337b9c5
      Davide Libenzi committed
      Thomas found that there is an unnecessary (always true) test in
      ep_send_events().  The callback never inserts into ->rdllink while the
      send loop is performed, and also does the ~EP_PRIVATE_BITS test.  Given
      we're holding the mutex during this time, the conditions tested inside the
      loop are always true.  This patch drops the test done inside the
      re-insertion loop.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f337b9c5
    • J
      exec.c, compat.c: fix count(), compat_count() bounds checking · 362e6663
      Jason Baron committed
      With MAX_ARG_STRINGS set to 0x7FFFFFFF, and being passed to 'count()' and
      compat_count(), it would appear that the current max bounds check of
      fs/exec.c:394:
      
      	if(++i > max)
      		return -E2BIG;
      
      would never trigger: since 'i' is of type int, its value would wrap and
      the function would continue looping.
      
      A simple fix seems to be changing ++i to i++ and checking for '>='.
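
      A userspace sketch of the corrected check is shown below.  It is not the
      code from fs/exec.c: count_strings() is a hypothetical name, plain
      pointers stand in for the user-space accessors, and the increment is
      written as a separate statement so that the sketch never increments past
      INT_MAX, which would be undefined behaviour in standard C.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Count the entries of a NULL-terminated string vector, refusing to go
 * past 'max' entries.  The comparison happens before the increment, so
 * 'i' never exceeds 'max' and cannot wrap even when max == INT_MAX.
 */
static int count_strings(char *const *argv, int max)
{
	int i = 0;

	assert(max >= 0);

	if (argv != NULL) {
		for (;;) {
			if (*argv == NULL)
				break;
			argv++;
			if (i >= max)
				return -E2BIG;
			i++;
		}
	}
	return i;
}
```

      With this form a vector of exactly 'max' strings is accepted and one
      more string yields -E2BIG.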
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: "Ollie Wild" <aaw@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      362e6663
    • V
      uclinux: fix gzip header parsing in binfmt_flat.c · f4cfb18d
      Volodymyr G. Lukiianyk committed
      There are off-by-one errors in decompress_exec() when calculating the length of
      the optional "original file name" and "comment" fields: the "ret" index is not
      incremented when the terminating '\0' character is reached. The buffer overflow
      check (after an "extra-field" length is taken into account) is also fixed.
      
      I encountered this off-by-one error when I tried to reuse the
      gzip-header-parsing part of the decompress_exec() function.  There was an
      "original file name" field in the payload (with a miscalculated length) and
      zlib_inflate() returned Z_DATA_ERROR.  After a fix similar to this one,
      everything worked fine.
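
      The header walk described above can be sketched as follows, based on the
      gzip member layout (10 fixed bytes, then optional extra, name, comment and
      header CRC fields).  This is an illustration rather than the
      binfmt_flat.c code: gzip_payload_offset(), skip_cstring() and the GZ_*
      names are assumptions, and the encrypted/reserved flag checks are left
      out.

```c
#include <assert.h>
#include <stddef.h>

#define GZ_FHCRC	0x02	/* 2-byte header CRC present */
#define GZ_FEXTRA	0x04	/* extra field present */
#define GZ_FNAME	0x08	/* original file name present */
#define GZ_FCOMMENT	0x10	/* comment present */

/* Advance *pos past a NUL-terminated field, consuming the NUL itself. */
static int skip_cstring(const unsigned char *buf, size_t len, size_t *pos)
{
	while (*pos < len) {
		if (buf[(*pos)++] == '\0')
			return 0;
	}
	return -1;	/* ran off the buffer without finding the terminator */
}

/* Return the offset of the deflate payload, or -1 on a bad or short header. */
static long gzip_payload_offset(const unsigned char *buf, size_t len)
{
	size_t pos = 10;	/* fixed part: magic, method, flags, mtime, xfl, os */
	size_t xlen;
	unsigned int flags;

	assert(buf != NULL);

	if (len < 10 || buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8)
		return -1;
	flags = buf[3];

	if (flags & GZ_FEXTRA) {
		if (len - pos < 2)
			return -1;
		xlen = (size_t)buf[pos] | ((size_t)buf[pos + 1] << 8);
		pos += 2;
		if (len - pos < xlen)	/* bounds check after adding the length */
			return -1;
		pos += xlen;
	}
	if ((flags & GZ_FNAME) && skip_cstring(buf, len, &pos) < 0)
		return -1;
	if ((flags & GZ_FCOMMENT) && skip_cstring(buf, len, &pos) < 0)
		return -1;
	if (flags & GZ_FHCRC) {
		if (len - pos < 2)
			return -1;
		pos += 2;
	}
	return (long)pos;
}
```

      The point of skip_cstring() is that the index moves past the terminating
      '\0' as well; stopping on the terminator leaves the offset one byte short
      for each string field.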
      Signed-off-by: Volodymyr G Lukiianyk <volodymyrgl@gmail.com>
      Acked-by: Greg Ungerer <gerg@snapgear.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f4cfb18d
    • E
      kobject: Cleanup kobject_rename and !CONFIG_SYSFS · 0b4a4fea
      Eric W. Biederman committed
      It finally dawned on me what the clean fix to sysfs_rename_dir
      calling kobject_set_name is.  Move the work into kobject_rename
      where it belongs.  The callers serialize us anyway so this is
      safe.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      0b4a4fea
    • T
      sysfs: Make dir and name args to sysfs_notify() const · 8c0e3998
      Trent Piepho committed
      Because they can be, and because code like this produces a warning if
      they're not:
      
      struct device_attribute dev_attr;
      
      sysfs_notify(&kobj, NULL, dev_attr.attr.name);
      Signed-off-by: Trent Piepho <tpiepho@freescale.com>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      8c0e3998
    • T
      sysfs: use ilookup5() instead of ilookup5_nowait() · 45c076c5
      Tejun Heo committed
      As inode creation is protected by sysfs_mutex, ilookup5_nowait()
      always either fails to find an inode at all or finds one which is fully
      initialized, so using ilookup5_nowait() or ilookup5() doesn't make any
      difference.  Switch to ilookup5() as ilookup5_nowait() is planned to be
      removed.  This change also makes lookup return value handling a bit simpler.
      
      This change was suggested by Al Viro.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Al Viro <viro@hera.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      45c076c5
    • N
      sysfs: fix deadlock · b31ca3f5
      Nick Piggin committed
      On Thu, Sep 11, 2008 at 10:27:10AM +0200, Ingo Molnar wrote:
      
      > and it's working fine on most boxes. One testbox found this new locking
      > scenario:
      >
      > PM: Adding info for No Bus:vcsa7
      > EDAC DEBUG: MC0: i82860_check()
      >
      > =======================================================
      > [ INFO: possible circular locking dependency detected ]
      > 2.6.27-rc6-tip #1
      > -------------------------------------------------------
      > X/4873 is trying to acquire lock:
      >  (&bb->mutex){--..}, at: [<c020ba20>] mmap+0x40/0xa0
      >
      > but task is already holding lock:
      >  (&mm->mmap_sem){----}, at: [<c0125a1e>] sys_mmap2+0x8e/0xc0
      >
      > which lock already depends on the new lock.
      >
      >
      > the existing dependency chain (in reverse order) is:
      >
      > -> #1 (&mm->mmap_sem){----}:
      >        [<c017dc96>] validate_chain+0xa96/0xf50
      >        [<c017ef2b>] __lock_acquire+0x2cb/0x5b0
      >        [<c017f299>] lock_acquire+0x89/0xc0
      >        [<c01aa8fb>] might_fault+0x6b/0x90
      >        [<c040b618>] copy_to_user+0x38/0x60
      >        [<c020bcfb>] read+0xfb/0x170
      >        [<c01c09a5>] vfs_read+0x95/0x110
      >        [<c01c1443>] sys_pread64+0x63/0x80
      >        [<c012146f>] sysenter_do_call+0x12/0x43
      >        [<ffffffff>] 0xffffffff
      >
      > -> #0 (&bb->mutex){--..}:
      >        [<c017d8b7>] validate_chain+0x6b7/0xf50
      >        [<c017ef2b>] __lock_acquire+0x2cb/0x5b0
      >        [<c017f299>] lock_acquire+0x89/0xc0
      >        [<c0d6f2ab>] __mutex_lock_common+0xab/0x3c0
      >        [<c0d6f698>] mutex_lock_nested+0x38/0x50
      >        [<c020ba20>] mmap+0x40/0xa0
      >        [<c01b111e>] mmap_region+0x14e/0x450
      >        [<c01b170f>] do_mmap_pgoff+0x2ef/0x310
      >        [<c0125a3d>] sys_mmap2+0xad/0xc0
      >        [<c012146f>] sysenter_do_call+0x12/0x43
      >        [<ffffffff>] 0xffffffff
      >
      > other info that might help us debug this:
      >
      > 1 lock held by X/4873:
      >  #0:  (&mm->mmap_sem){----}, at: [<c0125a1e>] sys_mmap2+0x8e/0xc0
      >
      > stack backtrace:
      > Pid: 4873, comm: X Not tainted 2.6.27-rc6-tip #1
      >  [<c017cd09>] print_circular_bug_tail+0x79/0xc0
      >  [<c017d8b7>] validate_chain+0x6b7/0xf50
      >  [<c017a5b5>] ? trace_hardirqs_off_caller+0x15/0xb0
      >  [<c017ef2b>] __lock_acquire+0x2cb/0x5b0
      >  [<c017f299>] lock_acquire+0x89/0xc0
      >  [<c020ba20>] ? mmap+0x40/0xa0
      >  [<c0d6f2ab>] __mutex_lock_common+0xab/0x3c0
      >  [<c020ba20>] ? mmap+0x40/0xa0
      >  [<c0d6f698>] mutex_lock_nested+0x38/0x50
      >  [<c020ba20>] ? mmap+0x40/0xa0
      >  [<c020ba20>] mmap+0x40/0xa0
      >  [<c01b111e>] mmap_region+0x14e/0x450
      >  [<c01afb88>] ? arch_get_unmapped_area_topdown+0xf8/0x160
      >  [<c01b170f>] do_mmap_pgoff+0x2ef/0x310
      >  [<c0125a3d>] sys_mmap2+0xad/0xc0
      >  [<c012146f>] sysenter_do_call+0x12/0x43
      >  [<c0120000>] ? __switch_to+0x130/0x220
      >  =======================
      > evbug.c: Event. Dev: input3, Type: 20, Code: 0, Value: 500
      > warning: `sudo' uses deprecated v2 capabilities in a way that may be insecure.
      >
      > i've attached the config.
      >
      > at first sight it looks like a genuine bug in fs/sysfs/bin.c?
      
      Yes, by the looks of it this is a real bug. bin.c takes bb->mutex under
      mmap_sem when it is mmapped, and then does its copy_*_user under bb->mutex too.
      
      Here is a basic fix for the sysfs lock order reversal.
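
      One common way to break this kind of inversion is to stage the data in a
      private buffer so that the copy which may fault happens with no mutex
      held.  The userspace sketch below only illustrates that pattern and makes
      no claim about the exact shape of the patch: struct bin_buffer_sketch,
      copy_out() and read_sketch() are hypothetical, a pthread mutex stands in
      for bb->mutex, and memcpy() stands in for copy_to_user().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-file buffer guarded by bb->mutex. */
struct bin_buffer_sketch {
	pthread_mutex_t mutex;
	char data[64];
	size_t len;
};

/* Stand-in for copy_to_user(): in the kernel this may fault and take mmap_sem. */
static int copy_out(char *dst, const char *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

static long read_sketch(struct bin_buffer_sketch *bb, char *userbuf, size_t count)
{
	char *temp;

	assert(bb != NULL && userbuf != NULL);

	if (count > sizeof(bb->data))
		count = sizeof(bb->data);
	temp = malloc(count ? count : 1);	/* allocate before taking the lock */
	if (temp == NULL)
		return -1;

	pthread_mutex_lock(&bb->mutex);
	if (count > bb->len)
		count = bb->len;
	memcpy(temp, bb->data, count);		/* only touches private memory */
	pthread_mutex_unlock(&bb->mutex);

	copy_out(userbuf, temp, count);		/* may "fault": no mutex held here */
	free(temp);
	return (long)count;
}
```

      Because the mutex is dropped before the copy to the caller's buffer, the
      mutex is never held while the lock taken on a fault is acquired, so the
      cycle reported above cannot form on this path.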
      
      
      From: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      b31ca3f5
    • N
      sysfs: Support sysfs_notify from atomic context with new sysfs_notify_dirent · f1282c84
      Neil Brown committed
      sysfs_notify currently takes sysfs_mutex, which means that it cannot be
      called in atomic context.  sysfs_mutex is sometimes held over a malloc
      (sysfs_rename_dir), so it can block on low memory.
      
      In md I want to be able to notify on a sysfs attribute from
      atomic context, and I don't want to block on low memory because I
      could be in the writeout path for freeing memory.
      
      So:
       - export the "sysfs_dirent" structure along with sysfs_get, sysfs_put
         and sysfs_get_dirent so I can get the sysfs_dirent that I want to
         notify on and hold it in an md structure.
       - split sysfs_notify_dirent out of sysfs_notify so the sysfs_dirent
         can be notified on with no blocking (just a spinlock).
      Signed-off-by: Neil Brown <neilb@suse.de>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      f1282c84