1. 28 Jan, 2012 6 commits
    • P
    • P
      logfs: Propagate page parameter to __logfs_write_inode · 0bd90387
      Prasad Joshi committed
      During GC LogFS has to rewrite each valid block to a separate segment.
      Rewrite operation reads data from an old segment and writes it to a
      newly allocated segment. Since every write operation changes data
      block pointers maintained in inode, inode should also be rewritten.
      
      In GC path to avoid AB-BA deadlock LogFS marks a page with
      PG_pre_locked in addition to locking the page (PG_locked). The page
      lock is ignored iff the page is pre-locked.
      
      LogFS uses a special file called the segment file. The segment file
      maintains an 8-byte entry for every segment. It keeps track of erase
      count, level etc. for every segment.
      
      Bad things happen when a segment belonging to the segment file is GCed:
      
       ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/readwrite.c:297!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: logfs joydev usbhid hid psmouse e1000 i2c_piix4
      		serio_raw [last unloaded: logfs]
      Pid: 20161, comm: mount Not tainted 3.1.0-rc3+ #3 innotek GmbH
      		VirtualBox
      EIP: 0060:[<f809132a>] EFLAGS: 00010292 CPU: 0
      EIP is at logfs_lock_write_page+0x6a/0x70 [logfs]
      EAX: 00000027 EBX: f73f5b20 ECX: c16007c8 EDX: 00000094
      ESI: 00000000 EDI: e59be6e4 EBP: c7337b28 ESP: c7337b18
      DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process mount (pid: 20161, ti=c7336000 task=eb323f70 task.ti=c7336000)
      Stack:
      f8099a3d c7337b24 f73f5b20 00001002 c7337b50 f8091f6d f8099a4d f80994e4
      00000003 00000000 c7337b68 00000000 c67e4400 00001000 c7337b80 f80935e5
      00000000 00000000 00000000 00000000 e1fcf000 0000000f e59be618 c70bf900
      Call Trace:
      [<f8091f6d>] logfs_get_write_page.clone.16+0xdd/0x100 [logfs]
      [<f80935e5>] logfs_mod_segment_entry+0x55/0x110 [logfs]
      [<f809460d>] logfs_get_segment_entry+0x1d/0x20 [logfs]
      [<f8091060>] ? logfs_cleanup_journal+0x50/0x50 [logfs]
      [<f809521b>] ostore_get_erase_count+0x1b/0x40 [logfs]
      [<f80965b8>] logfs_open_area+0xc8/0x150 [logfs]
      [<c141a7ec>] ? kmemleak_alloc+0x2c/0x60
      [<f809668e>] __logfs_segment_write.clone.16+0x4e/0x1b0 [logfs]
      [<c10dd563>] ? mempool_kmalloc+0x13/0x20
      [<c10dd563>] ? mempool_kmalloc+0x13/0x20
      [<f809696f>] logfs_segment_write+0x17f/0x1d0 [logfs]
      [<f8092e8c>] logfs_write_i0+0x11c/0x180 [logfs]
      [<f8092f35>] logfs_write_direct+0x45/0x90 [logfs]
      [<f80934cd>] __logfs_write_buf+0xbd/0xf0 [logfs]
      [<c102900e>] ? kmap_atomic_prot+0x4e/0xe0
      [<f809424b>] logfs_write_buf+0x3b/0x60 [logfs]
      [<f80947a9>] __logfs_write_inode+0xa9/0x110 [logfs]
      [<f8094cb0>] logfs_rewrite_block+0xc0/0x110 [logfs]
      [<f8095300>] ? get_mapping_page+0x10/0x60 [logfs]
      [<f8095aa0>] ? logfs_load_object_aliases+0x2e0/0x2f0 [logfs]
      [<f808e57d>] logfs_gc_segment+0x2ad/0x310 [logfs]
      [<f808e62a>] __logfs_gc_once+0x4a/0x80 [logfs]
      [<f808ed43>] logfs_gc_pass+0x683/0x6a0 [logfs]
      [<f8097a89>] logfs_mount+0x5a9/0x680 [logfs]
      [<c1126b21>] mount_fs+0x21/0xd0
      [<c10f6f6f>] ? __alloc_percpu+0xf/0x20
      [<c113da41>] ? alloc_vfsmnt+0xb1/0x130
      [<c113db4b>] vfs_kern_mount+0x4b/0xa0
      [<c113e06e>] do_kern_mount+0x3e/0xe0
      [<c113f60d>] do_mount+0x34d/0x670
      [<c10f2749>] ? strndup_user+0x49/0x70
      [<c113fcab>] sys_mount+0x6b/0xa0
      [<c142d87c>] syscall_call+0x7/0xb
      Code: f8 e8 8b 93 39 c9 8b 45 f8 3e 0f ba 28 00 19 d2 85 d2 74 ca eb d0 0f 0b 8d 45 fc 89 44 24 04 c7 04 24 3d 9a 09 f8 e8 09 92 39 c9 <0f> 0b 8d 74 26 00 55 89 e5 3e 8d 74 26 00 8b 10 80 e6 01 74 09
      EIP: [<f809132a>] logfs_lock_write_page+0x6a/0x70 [logfs] SS:ESP 0068:c7337b18
      ---[ end trace 96e67d5b3aa3d6ca ]---
      
      The patch passes the locked page to __logfs_write_inode, which calls
      logfs_get_wblocks() to pre-lock the page. This ensures any further
      attempts to lock the page are ignored (esp. from get_erase_count).
      Acked-by: Joern Engel <joern@logfs.org>
      Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
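The pre-lock rule from the commit message above can be modelled in user space. The following is a minimal single-threaded sketch, not the LogFS code: `sketch_lpage`, `sketch_trylock`, `sketch_lock_write_page` and `SKETCH_MAX_TRIES` are hypothetical names, and the bounded retry loop only stands in for waiting on a real page lock. It shows why a page that is locked but not pre-locked can never be taken again by the same path, while a pre-locked page is allowed through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_TRIES 4 /* assumed retry bound, for illustration only */

/* Hypothetical stand-in for a page with the two flags the text mentions. */
struct sketch_lpage {
	bool locked;     /* models PG_locked */
	bool pre_locked; /* models PG_pre_locked */
};

/* Take the lock if it is free; never blocks. */
static bool sketch_trylock(struct sketch_lpage *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

/*
 * Returns 0 if the lock was acquired, 1 if the lock was skipped because
 * the page is pre-locked by the caller's own path, and -1 if the page
 * stayed locked by someone else (the situation that ends in a BUG).
 */
static int sketch_lock_write_page(struct sketch_lpage *p)
{
	assert(p != NULL);
	for (int tries = 0; tries < SKETCH_MAX_TRIES; tries++) {
		if (sketch_trylock(p))
			return 0;
		if (p->pre_locked)
			return 1; /* lock is ignored iff the page is pre-locked */
	}
	return -1;
}
```

In this model, a GC path that locks a page without marking it pre-locked and then re-enters the locking helper gets -1, which corresponds to the failure described in the trace; marking the page pre-locked first turns that into the harmless "skip" result.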
    • P
      logfs: set superblock shutdown flag after generic sb shutdown · ecfd8909
      Prasad Joshi committed
      While unmounting the file system LogFS calls generic_shutdown_super.
      The function does file system independent superblock shutdown.
      However, it might result in a call to file system specific inode eviction.
      
      LogFS marks the FS as shutting down by setting bit LOGFS_SB_FLAG_SHUTDOWN
      in super->s_flags. Since inode eviction might call truncate on the inode,
      the following BUG is observed when the file system is unmounted:
      
      ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/segment.c:362!
      invalid opcode: 0000 [#1] PREEMPT SMP
      CPU 3
      Modules linked in: logfs binfmt_misc ppdev virtio_blk parport_pc lp
      	parport psmouse floppy virtio_pci serio_raw virtio_ring virtio
      
      Pid: 1933, comm: umount Not tainted 3.0.0+ #4 Bochs Bochs
      RIP: 0010:[<ffffffffa008c841>]  [<ffffffffa008c841>]
      		logfs_segment_write+0x211/0x230 [logfs]
      RSP: 0018:ffff880062d7b9e8  EFLAGS: 00010202
      RAX: 000000000000000e RBX: ffff88006eca9000 RCX: 0000000000000000
      RDX: ffff88006fd87c40 RSI: ffffea00014ff468 RDI: ffff88007b68e000
      RBP: ffff880062d7ba48 R08: 8000000020451430 R09: 0000000000000000
      R10: dead000000100100 R11: 0000000000000000 R12: ffff88006fd87c40
      R13: ffffea00014ff468 R14: ffff88005ad0a460 R15: 0000000000000000
      FS:  00007f25d50ea760(0000) GS:ffff88007fd80000(0000)
      	knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000d05e48 CR3: 0000000062c72000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process umount (pid: 1933, threadinfo ffff880062d7a000,
      	task ffff880070b44500)
      Stack:
      ffff880062d7ba38 ffff88005ad0a508 0000000000001000 0000000000000000
      8000000020451430 ffffea00014ff468 ffff880062d7ba48 ffff88005ad0a460
      ffff880062d7bad8 ffffea00014ff468 ffff88006fd87c40 0000000000000000
      Call Trace:
      [<ffffffffa0088fee>] logfs_write_i0+0x12e/0x190 [logfs]
      [<ffffffffa0089360>] __logfs_write_rec+0x140/0x220 [logfs]
      [<ffffffffa0089312>] __logfs_write_rec+0xf2/0x220 [logfs]
      [<ffffffffa00894a4>] logfs_write_rec+0x64/0xd0 [logfs]
      [<ffffffffa0089616>] __logfs_write_buf+0x106/0x110 [logfs]
      [<ffffffffa008a19e>] logfs_write_buf+0x4e/0x80 [logfs]
      [<ffffffffa008a6b8>] __logfs_write_inode+0x98/0x110 [logfs]
      [<ffffffffa008a7c4>] logfs_truncate+0x54/0x290 [logfs]
      [<ffffffffa008abfc>] logfs_evict_inode+0xdc/0x190 [logfs]
      [<ffffffff8115eef5>] evict+0x85/0x170
      [<ffffffff8115f126>] iput+0xe6/0x1b0
      [<ffffffff8115b4a8>] shrink_dcache_for_umount_subtree+0x218/0x280
      [<ffffffff8115ce91>] shrink_dcache_for_umount+0x51/0x90
      [<ffffffff8114796c>] generic_shutdown_super+0x2c/0x100
      [<ffffffffa008cc47>] logfs_kill_sb+0x57/0xf0 [logfs]
      [<ffffffff81147de5>] deactivate_locked_super+0x45/0x70
      [<ffffffff811487ea>] deactivate_super+0x4a/0x70
      [<ffffffff81163934>] mntput_no_expire+0xa4/0xf0
      [<ffffffff8116469f>] sys_umount+0x6f/0x380
      [<ffffffff814dd46b>] system_call_fastpath+0x16/0x1b
      Code: 55 c8 49 8d b6 a8 00 00 00 45 89 f9 45 89 e8 4c 89 e1 4c 89 55
      b8 c7 04 24 00 00 00 00 e8 68 fc ff ff 4c 8b 55 b8 e9 3c ff ff ff <0f>
      0b 0f 0b c7 45 c0 00 00 00 00 e9 44 fe ff ff 66 66 66 66 66
      RIP  [<ffffffffa008c841>] logfs_segment_write+0x211/0x230 [logfs]
      RSP <ffff880062d7b9e8>
      ---[ end trace fe6b040cea952290 ]---
      
      Therefore, move the super->s_flags setting after the fs-independent work
      has been finished.
      Reviewed-by: Joern Engel <joern@logfs.org>
      Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
    • P
      logfs: take write mutex lock during fsync and sync · 13ced29c
      Prasad Joshi committed
      LogFS uses super->s_write_mutex while writing data to disk. Taking the
      same mutex lock in sync and fsync code path solves the following BUG:
      
      ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/dev_bdev.c:134!
      
      Pid: 2387, comm: flush-253:16 Not tainted 3.0.0+ #4 Bochs Bochs
      RIP: 0010:[<ffffffffa007deed>]  [<ffffffffa007deed>]
                      bdev_writeseg+0x25d/0x270 [logfs]
      Call Trace:
      [<ffffffffa007c381>] logfs_open_area+0x91/0x150 [logfs]
      [<ffffffff8128dcb2>] ? find_level.clone.9+0x62/0x100
      [<ffffffffa007c49c>] __logfs_segment_write.clone.20+0x5c/0x190 [logfs]
      [<ffffffff810ef005>] ? mempool_kmalloc+0x15/0x20
      [<ffffffff810ef383>] ? mempool_alloc+0x53/0x130
      [<ffffffffa007c7a4>] logfs_segment_write+0x1d4/0x230 [logfs]
      [<ffffffffa0078f8e>] logfs_write_i0+0x12e/0x190 [logfs]
      [<ffffffffa0079300>] __logfs_write_rec+0x140/0x220 [logfs]
      [<ffffffffa0079444>] logfs_write_rec+0x64/0xd0 [logfs]
      [<ffffffffa00795b6>] __logfs_write_buf+0x106/0x110 [logfs]
      [<ffffffffa007a13e>] logfs_write_buf+0x4e/0x80 [logfs]
      [<ffffffffa0073e33>] __logfs_writepage+0x23/0x80 [logfs]
      [<ffffffffa007410c>] logfs_writepage+0xdc/0x110 [logfs]
      [<ffffffff810f5ba7>] __writepage+0x17/0x40
      [<ffffffff810f6208>] write_cache_pages+0x208/0x4f0
      [<ffffffff810f5b90>] ? set_page_dirty+0x70/0x70
      [<ffffffff810f653a>] generic_writepages+0x4a/0x70
      [<ffffffff810f75d1>] do_writepages+0x21/0x40
      [<ffffffff8116b9d1>] writeback_single_inode+0x101/0x250
      [<ffffffff8116bdbd>] writeback_sb_inodes+0xed/0x1c0
      [<ffffffff8116c5fb>] writeback_inodes_wb+0x7b/0x1e0
      [<ffffffff8116cc23>] wb_writeback+0x4c3/0x530
      [<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
      [<ffffffff8116cd6b>] wb_do_writeback+0xdb/0x290
      [<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
      [<ffffffff814d6208>] ? _raw_spin_unlock_irqrestore+0x18/0x40
      [<ffffffff8105aa5a>] ? del_timer+0x8a/0x120
      [<ffffffff8116cfac>] bdi_writeback_thread+0x8c/0x2e0
      [<ffffffff8116cf20>] ? wb_do_writeback+0x290/0x290
      [<ffffffff8106d2e6>] kthread+0x96/0xa0
      [<ffffffff814de514>] kernel_thread_helper+0x4/0x10
      [<ffffffff8106d250>] ? kthread_worker_fn+0x190/0x190
      [<ffffffff814de510>] ? gs_change+0xb/0xb
      RIP  [<ffffffffa007deed>] bdev_writeseg+0x25d/0x270 [logfs]
      ---[ end trace 0211ad60a57657c4 ]---
      Reviewed-by: Joern Engel <joern@logfs.org>
      Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
    • J
      logfs: Prevent memory corruption · 934eed39
      Joern Engel committed
      This is a bad one.  I wonder whether we were so far protected by
      no_free_segments(sb) usually being smaller than LOGFS_NO_AREAS.
      
      Found by Dan Carpenter <dan.carpenter@oracle.com> using smatch.
      Signed-off-by: Joern Engel <joern@logfs.org>
      Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
    • P
      logfs: update page reference count for pinned pages · 96150606
      Prasad Joshi committed
      LogFS sets PG_private flag to indicate a pinned page. We assumed that
      marking a page as private is enough to ensure its existence. But
      instead it is necessary to hold a reference count to the page.
      
      The change resolves the following BUG
      
      BUG: Bad page state in process flush-253:16  pfn:6a6d0
      page flags: 0x100000000000808(uptodate|private)
      Suggested-and-Acked-by: Joern Engel <joern@logfs.org>
      Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
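The point of the commit above, that a flag alone does not keep a page alive while a reference count does, can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `sketch_rpage`, `sketch_pin`, `sketch_unpin` and `sketch_can_free` are hypothetical names and do not correspond to kernel or LogFS functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: a reference count plus a "pinned" flag. */
struct sketch_rpage {
	int refcount; /* number of holders; the page may be freed at zero */
	bool pinned;  /* models the PG_private marker described above */
};

/* Pin the page: set the marker AND take a reference. */
static void sketch_pin(struct sketch_rpage *p)
{
	assert(p != NULL);
	if (!p->pinned) {
		p->pinned = true;
		p->refcount++;
	}
}

/* Unpin the page: clear the marker and drop the reference taken by pin. */
static void sketch_unpin(struct sketch_rpage *p)
{
	assert(p != NULL);
	if (p->pinned) {
		assert(p->refcount > 0);
		p->pinned = false;
		p->refcount--;
	}
}

/* The allocator in this model looks only at the count, never at the flag. */
static bool sketch_can_free(const struct sketch_rpage *p)
{
	return p->refcount == 0;
}
```

If `sketch_pin` only set the flag, another holder dropping the last reference would make `sketch_can_free` return true while the page is still marked pinned, which is the kind of inconsistency the "Bad page state" report points at.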
  2. 04 Jan, 2012 4 commits
    • L
      Revert "rtc: Expire alarms after the time is set." · f423fc62
      Linus Torvalds committed
      This reverts commit 93b2ec01.
      
      The call to "schedule_work()" in rtc_initialize_alarm() happens too
      early, and can cause oopses at bootup
      
      Neil Brown explains why we do it:
      
        "If you set an alarm in the future, then shutdown and boot again after
         that time, then you will end up with a timer_queue node which is in
         the past.
      
         When this happens the queue gets stuck.  That entry-in-the-past won't
         get removed until an interrupt happens and an interrupt won't happen
         because the RTC only triggers an interrupt when the alarm is "now".
      
         So you'll find that e.g.  "hwclock" will always tell you that
         'select' timed out.
      
         So we force the interrupt work to happen at the start just in case."
      
      and has a patch that converts it to do things in-process rather than with
      the worker thread, but right now it's too late to play around with this,
      so we just revert the patch that caused problems for now.
      Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
      Requested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Requested-by: John Stultz <john.stultz@linaro.org>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • L
      Revert "rtc: Disable the alarm in the hardware" · 157e8bf8
      Linus Torvalds committed
      This reverts commit c0afabd3.
      
      It causes failures on Toshiba laptops - instead of disabling the alarm,
      it actually seems to enable it on the affected laptops, resulting in
      (for example) the laptop powering on automatically five minutes after
      shutdown.
      
      There's a patch for it that appears to work for at least some people,
      but it's too late to play around with this, so revert for now and try
      again in the next merge window.
      
      See for example
      
      	http://bugs.debian.org/652869
      
      Reported-and-bisected-by: Andreas Friedrich <afrie@gmx.net> (Toshiba Tecra)
      Reported-by: Antonio-M. Corbi Bellot <antonio.corbi@ua.es> (Toshiba Portege R500)
      Reported-by: Marco Santos <marco.santos@waynext.com> (Toshiba Portege Z830)
      Reported-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>  (Toshiba Portege R830)
      Cc: Jonathan Nieder <jrnieder@gmail.com>
      Requested-by: John Stultz <john.stultz@linaro.org>
      Cc: stable@kernel.org  # for the versions that applied this
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      hung_task: fix false positive during vfork · f9fab10b
      Mandeep Singh Baines committed
      vfork parent uninterruptibly and unkillably waits for its child to
      exec/exit. This wait is of unbounded length. Ignore such waits
      in the hung_task detector.
      Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
      Reported-by: Sasha Levin <levinsasha928@gmail.com>
      LKML-Reference: <1325344394.28904.43.camel@lappy>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • J
      security: Fix security_old_inode_init_security() when CONFIG_SECURITY is not set · 30e05324
      Jan Kara committed
      Commit 1e39f384 ("evm: fix build problems") makes the stub version
      of security_old_inode_init_security() return 0 when CONFIG_SECURITY is
      not set.
      
      But that makes callers such as reiserfs_security_init() assume that
      security_old_inode_init_security() has set name, value, and len
      arguments properly - but security_old_inode_init_security() left them
      uninitialized which then results in interesting failures.
      
      Revert security_old_inode_init_security() to the old behavior of
      returning EOPNOTSUPP since both callers (reiserfs and ocfs2) handle this
      just fine.
      
      [ Also fixed the S_PRIVATE(inode) case of the actual non-stub
        security_old_inode_init_security() function to return EOPNOTSUPP
        for the same reason, as pointed out by Mimi Zohar.
      
        It got incorrectly changed to match the new function in commit
        fb88c2b6: "evm: fix security/security_old_init_security return
        code".   - Linus ]
      Reported-by: Jorge Bastos <mysql.jorge@decimal.pt>
      Acked-by: James Morris <jmorris@namei.org>
      Acked-by: Mimi Zohar <zohar@us.ibm.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
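The failure mode described in the commit above (a stub that reports success without filling its out-parameters) can be sketched in plain C. The names `sketch_init_security_stub` and `sketch_caller` are hypothetical and only model the calling convention described in the message; they are not the kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stub for the "CONFIG_SECURITY not set" case. It never
 * writes name, value or len, so it must not report success: returning
 * -EOPNOTSUPP tells the caller to leave those variables alone.
 */
static int sketch_init_security_stub(const char **name, void **value, size_t *len)
{
	assert(name != NULL && value != NULL && len != NULL);
	return -EOPNOTSUPP;
}

/*
 * Hypothetical caller. Returns 1 if a security attribute should be
 * stored, 0 if there is none, or a negative errno on a real error.
 */
static int sketch_caller(void)
{
	const char *name; /* deliberately left uninitialized, as in the report */
	void *value;
	size_t len;
	int err = sketch_init_security_stub(&name, &value, &len);

	if (err == -EOPNOTSUPP)
		return 0; /* nothing to store; out-parameters are never read */
	if (err)
		return err;
	/* Only reached when the callee claims it filled the out-parameters. */
	return (name != NULL && value != NULL && len > 0) ? 1 : 0;
}
```

Had the stub returned 0 instead, the last line of `sketch_caller` would read three uninitialized variables, which matches the "interesting failures" the message mentions.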
  3. 03 Jan, 2012 2 commits
  4. 02 Jan, 2012 1 commit
  5. 01 Jan, 2012 3 commits
  6. 31 Dec, 2011 11 commits
  7. 30 Dec, 2011 11 commits
    • L
      Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 89307bab
      Linus Torvalds committed
      * 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu: Initialize domain->handler in iommu_domain_alloc()
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 50b2abed
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        packet: fix possible dev refcnt leak when bind fail
        netem: dont call vfree() under spinlock and BH disabled
        netfilter: ctnetlink: fix scheduling while atomic if helper is autoloaded
        netfilter: ctnetlink: fix return value of ctnetlink_get_expect()
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7578ed02
      Linus Torvalds committed
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86: Fix raw_spin_unlock_irqrestore() usage
        oprofile, arm/sh: Fix oprofile_arch_exit() linkage issue
    • L
      Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · d2bac6ab
      Linus Torvalds committed
      * 'for-linus' of git://oss.sgi.com/xfs/xfs:
        xfs: log all dirty inodes in xfs_fs_sync_fs
        xfs: log the inode in ->write_inode calls for kupdate
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 1cac8e88
      Linus Torvalds committed
      * 'for-linus' of git://git.kernel.dk/linux-block:
        block: fix blk_queue_end_tag()
        block: re-use existing 'reading' variable instead of checking direction again
        block, cfq: fix empty queue crash caused by request merge
    • H
      mm: hugetlb: fix non-atomic enqueue of huge page · b0365c8d
      Hillf Danton committed
      If a huge page is enqueued under the protection of hugetlb_lock, then the
      operation is atomic and safe.
      Signed-off-by: Hillf Danton <dhillf@gmail.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>		[2.6.37+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      procfs: do not confuse jiffies with cputime64_t · 34845636
      Andreas Schwab committed
      Commit 2a95ea6c ("procfs: do not overflow get_{idle,iowait}_time
      for nohz") did not take into account that on some architectures jiffies
      and cputime use different units.
      
      This causes get_idle_time() to return numbers in the wrong units, making
      the idle time fields in /proc/stat wrong.
      
      Instead of converting the usec value returned by
      get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
      usecs_to_cputime64 to convert it to the correct unit of cputime64_t.
      Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "Artem S. Tashkinov" <t.artem@mailcity.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
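The unit mismatch in the commit above is easy to see with numbers. The sketch below assumes a tick rate of 100 jiffies per second and a cputime unit of one nanosecond; both constants, and the helper names `sketch_usecs_to_jiffies` and `sketch_usecs_to_cputime`, are hypothetical and chosen only to make the two scales visibly different. They are not the kernel's conversion helpers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HZ              100ULL        /* assumed jiffies per second */
#define SKETCH_CPUTIME_PER_SEC 1000000000ULL /* assumed cputime units per second */
#define SKETCH_USEC_PER_SEC    1000000ULL

/* Convert microseconds to jiffies under the assumed tick rate. */
static uint64_t sketch_usecs_to_jiffies(uint64_t usecs)
{
	assert(usecs <= UINT64_MAX / SKETCH_HZ); /* guard the multiplication */
	return usecs * SKETCH_HZ / SKETCH_USEC_PER_SEC;
}

/* Convert microseconds to cputime units under the assumed scale. */
static uint64_t sketch_usecs_to_cputime(uint64_t usecs)
{
	assert(usecs <= UINT64_MAX / SKETCH_CPUTIME_PER_SEC);
	return usecs * SKETCH_CPUTIME_PER_SEC / SKETCH_USEC_PER_SEC;
}
```

With these assumptions, two seconds of idle time is 200 in jiffies but 2000000000 in cputime units, so storing the jiffies value in a field that readers interpret as cputime understates the idle time by several orders of magnitude. On configurations where the two units coincide the mistake is invisible, which is consistent with the problem showing up only on some architectures.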
    • K
      mm/mempolicy.c: refix mbind_range() vma issue · e26a5114
      KOSAKI Motohiro committed
      Commit 8aacc9f5 ("mm/mempolicy.c: fix pgoff in mbind vma merge") is a
      slightly incorrect fix.
      
      Why? Consider the following case.
      
      1. map 4 pages of a file at offset 0
      
         [0123]
      
      2. map 2 pages just after the first mapping of the same file but with
         page offset 2
      
         [0123][23]
      
      3. mbind() 2 pages from the first mapping at offset 2.
         mbind_range() should treat the new vma as
      
         [0123][23]
           |23|
           mbind vma
      
         but it does
      
         [0123][23]
           |01|
           mbind vma
      
         Oops. It then merges and splits vmas incorrectly ([01][0123] or similar).
      
      This patch fixes it.
      
      [testcase]
        test result - before the patch
      
      	case4: 126: test failed. expect '2,4', actual '2,2,2'
      	case5: passed
      	case6: passed
      	case7: passed
      	case8: passed
      	case_n: 246: test failed. expect '4,2', actual '1,4'
      
      	------------[ cut here ]------------
      	kernel BUG at mm/filemap.c:135!
      	invalid opcode: 0000 [#4] SMP DEBUG_PAGEALLOC
      
      	(snip long bug on messages)
      
        test result - after the patch
      
      	case4: passed
      	case5: passed
      	case6: passed
      	case7: passed
      	case8: passed
      	case_n: passed
      
        source:  mbind_vma_test.c
      ============================================================
       #include <numaif.h>
       #include <numa.h>
       #include <sys/mman.h>
       #include <stdio.h>
       #include <unistd.h>
       #include <stdlib.h>
       #include <string.h>
      
      static unsigned long pagesize;
      void* mmap_addr;
      struct bitmask *nmask;
      char buf[1024];
      FILE *file;
      char retbuf[10240] = "";
      int mapped_fd;
      
      char *rubysrc = "ruby -e '\
        pid = %d; \
        vstart = 0x%llx; \
        vend = 0x%llx; \
        s = `pmap -q #{pid}`; \
        rary = []; \
        s.each_line {|line|; \
          ary=line.split(\" \"); \
          addr = ary[0].to_i(16); \
          if(vstart <= addr && addr < vend) then \
            rary.push(ary[1].to_i()/4); \
          end; \
        }; \
        print rary.join(\",\"); \
      '";
      
      void init(void)
      {
      	void* addr;
      	char buf[128];
      
      	nmask = numa_allocate_nodemask();
      	numa_bitmask_setbit(nmask, 0);
      
      	pagesize = getpagesize();
      
      	sprintf(buf, "%s", "mbind_vma_XXXXXX");
      	mapped_fd = mkstemp(buf);
      	if (mapped_fd == -1)
      		perror("mkstemp "), exit(1);
      	unlink(buf);
      
      	if (lseek(mapped_fd, pagesize*8, SEEK_SET) < 0)
      		perror("lseek "), exit(1);
      	if (write(mapped_fd, "\0", 1) < 0)
      		perror("write "), exit(1);
      
      	addr = mmap(NULL, pagesize*8, PROT_NONE,
      		    MAP_SHARED, mapped_fd, 0);
      	if (addr == MAP_FAILED)
      		perror("mmap "), exit(1);
      
      	if (mprotect(addr+pagesize, pagesize*6, PROT_READ|PROT_WRITE) < 0)
      		perror("mprotect "), exit(1);
      
      	mmap_addr = addr + pagesize;
      
      	/* make page populate */
      	memset(mmap_addr, 0, pagesize*6);
      }
      
      void fin(void)
      {
      	void* addr = mmap_addr - pagesize;
      	munmap(addr, pagesize*8);
      
      	memset(buf, 0, sizeof(buf));
      	memset(retbuf, 0, sizeof(retbuf));
      }
      
      void mem_bind(int index, int len)
      {
      	int err;
      
      	err = mbind(mmap_addr+pagesize*index, pagesize*len,
      		    MPOL_BIND, nmask->maskp, nmask->size, 0);
      	if (err)
      		perror("mbind "), exit(err);
      }
      
      void mem_interleave(int index, int len)
      {
      	int err;
      
      	err = mbind(mmap_addr+pagesize*index, pagesize*len,
      		    MPOL_INTERLEAVE, nmask->maskp, nmask->size, 0);
      	if (err)
      		perror("mbind "), exit(err);
      }
      
      void mem_unbind(int index, int len)
      {
      	int err;
      
      	err = mbind(mmap_addr+pagesize*index, pagesize*len,
      		    MPOL_DEFAULT, NULL, 0, 0);
      	if (err)
      		perror("mbind "), exit(err);
      }
      
      void Assert(char *expected, char *value, char *name, int line)
      {
      	if (strcmp(expected, value) == 0) {
      		fprintf(stderr, "%s: passed\n", name);
      		return;
      	}
      	else {
      		fprintf(stderr, "%s: %d: test failed. expect '%s', actual '%s'\n",
      			name, line,
      			expected, value);
      //		exit(1);
      	}
      }
      
      /*
            AAAA
          PPPPPPNNNNNN
          might become
          PPNNNNNNNNNN
          case 4 below
      */
      void case4(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 4);
      	mem_unbind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("2,4", retbuf, "case4", __LINE__);
      
      	fin();
      }
      
      /*
             AAAA
       PPPPPPNNNNNN
       might become
       PPPPPPPPPPNN
       case 5 below
      */
      void case5(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 2);
      	mem_bind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("4,2", retbuf, "case5", __LINE__);
      
      	fin();
      }
      
      /*
      	    AAAA
      	PPPPNNNNXXXX
      	might become
      	PPPPPPPPPPPP 6
      */
      void case6(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 2);
      	mem_bind(4, 2);
      	mem_bind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("6", retbuf, "case6", __LINE__);
      
      	fin();
      }
      
      /*
          AAAA
      PPPPNNNNXXXX
      might become
      PPPPPPPPXXXX 7
      */
      void case7(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 2);
      	mem_interleave(4, 2);
      	mem_bind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("4,2", retbuf, "case7", __LINE__);
      
      	fin();
      }
      
      /*
          AAAA
      PPPPNNNNXXXX
      might become
      PPPPNNNNNNNN 8
      */
      void case8(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 2);
      	mem_interleave(4, 2);
      	mem_interleave(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("2,4", retbuf, "case8", __LINE__);
      
      	fin();
      }
      
      void case_n(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	/* make redundant mappings [0][1234][34][7] */
      	mmap(mmap_addr + pagesize*4, pagesize*2, PROT_READ|PROT_WRITE,
      	     MAP_FIXED|MAP_SHARED, mapped_fd, pagesize*3);
      
      	/* Expect to do nothing. */
      	mem_unbind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("4,2", retbuf, "case_n", __LINE__);
      
      	fin();
      }
      
      int main(int argc, char** argv)
      {
      	case4();
      	case5();
      	case6();
      	case7();
      	case8();
      	case_n();
      
      	return 0;
      }
      =============================================================
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Caspar Zhang <caspar@casparzhang.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: <stable@vger.kernel.org>		[3.1.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
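The three-step scenario in the commit message above comes down to one piece of arithmetic: the file page offset of an address inside a mapping is the mapping's own page offset plus the distance from the mapping's start. The following user-space sketch works through that arithmetic; `sketch_vma`, `sketch_pgoff_at` and the 4 KiB page size are assumptions for illustration, and the code is not the kernel's mbind_range().

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/* Hypothetical, stripped-down VMA: only the fields the arithmetic needs. */
struct sketch_vma {
	unsigned long vm_start; /* first byte address of the mapping */
	unsigned long vm_end;   /* one past the last byte of the mapping */
	unsigned long vm_pgoff; /* file offset of vm_start, in pages */
};

/* File page offset that corresponds to address addr inside vma. */
static unsigned long sketch_pgoff_at(const struct sketch_vma *vma,
				     unsigned long addr)
{
	assert(addr >= vma->vm_start && addr < vma->vm_end);
	return vma->vm_pgoff + ((addr - vma->vm_start) >> SKETCH_PAGE_SHIFT);
}
```

For a first mapping of four pages at file offset 0 starting at address 0x10000, the sub-range that begins two pages in (0x12000) has page offset 2, which is the `|23|` picture from the message; using the mapping's own offset 0 instead gives the incorrect `|01|` picture. The second mapping, placed at 0x14000 with page offset 2, also reports 2 at its start, which is why the two pieces look mergeable only when the offset is computed from the right address.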
    • H
      gspca: Fix bulk mode cameras no longer working (regression fix) · 757e55c2
      Hans de Goede committed
      The new iso bandwidth calculation code accidentally broke support
      for bulk mode cameras. This has broken the following drivers:
      finepix, jeilinj, ovfx2, ov534, ov534_9, se401, sq905, sq905c, sq930x,
      stv0680, vicam.
      
      This patch fixes this. Fix tested with: se401, sq905, sq905c, stv0680 & vicam
      cams.
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • T
      Input: sentelic - fix retrieving number of buttons · 6ccbcf2c
      Tai-hwa Liang committed
      Fix the wrong register offset used to retrieve the number of buttons
      attached to the hardware.
      Signed-off-by: Tai-hwa Liang <avatar@sentelic.com>
      Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
    • S
      ceph: disable use of dcache for readdir etc. · a4d46363
      Sage Weil committed
      Ceph attempts to use the dcache to satisfy negative lookups and readdir
      when the entire directory contents are in cache.  Disable this behavior
      until lingering bugs in this code are shaken out; we'll re-enable these
      hooks once things are fully stable.
      Signed-off-by: Sage Weil <sage@newdream.net>
  8. 29 Dec, 2011 1 commit
  9. 28 Dec, 2011 1 commit