1. 26 Feb, 2011 15 commits
    • M
      mm: vmscan: stop reclaim/compaction earlier due to insufficient progress if !__GFP_REPEAT · 2876592f
      Mel Gorman committed
      should_continue_reclaim() for reclaim/compaction allows scanning to
      continue even if pages are not being reclaimed until the full list is
      scanned.  In terms of allocation success, this makes sense but potentially
      it introduces unwanted latency for high-order allocations such as
      transparent hugepages and network jumbo frames that would prefer to fail
      the allocation attempt and fallback to order-0 pages.  Worse, there is a
      potential that the full LRU scan will clear all the young bits, distort
      page aging information and potentially push pages into swap that would
      have otherwise remained resident.
      
      This patch will stop reclaim/compaction if no pages were reclaimed in the
      last SWAP_CLUSTER_MAX pages that were considered.  For allocations such as
      hugetlbfs that use __GFP_REPEAT and have fewer fallback options, the full
      LRU list may still be scanned.
      
      Order-0 allocation should not be affected because RECLAIM_MODE_COMPACTION
      is not set so the following avoids the gfp_mask being examined:
      
              if (!(sc->reclaim_mode & RECLAIM_MODE_COMPACTION))
                      return false;
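
The early-exit rule described above can be modelled in userspace as follows. This is a minimal sketch, not the kernel function: `struct scan_state` and `keep_reclaiming()` are hypothetical names, the `__GFP_REPEAT` branch (stop only when nothing was reclaimed and nothing was scanned) is an assumption about how "the full LRU list may still be scanned" is expressed, and the later checks that the real `should_continue_reclaim()` performs are omitted.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant parts of the kernel's scan_control. */
struct scan_state {
    bool compaction_mode; /* models RECLAIM_MODE_COMPACTION being set */
    bool gfp_repeat;      /* models __GFP_REPEAT being set in the gfp_mask */
};

/*
 * Decide whether reclaim/compaction should keep scanning after one batch.
 * nr_reclaimed and nr_scanned are the counts from the most recent batch.
 */
static bool keep_reclaiming(const struct scan_state *sc,
                            unsigned long nr_reclaimed,
                            unsigned long nr_scanned)
{
    /* Order-0 reclaim: compaction mode is off, so the gfp_mask is never examined. */
    if (!sc->compaction_mode)
        return false;

    if (sc->gfp_repeat) {
        /* Few fallback options: give up only when the list is exhausted. */
        if (!nr_reclaimed && !nr_scanned)
            return false;
    } else {
        /* Callers that can fall back: stop as soon as a batch reclaims nothing. */
        if (!nr_reclaimed)
            return false;
    }
    return true;
}
```

In this model a THP-style caller stops after the first unproductive batch, while a hugetlbfs-style caller keeps going until a batch scans no pages at all.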
      
      A tool was developed based on ftrace that tracked the latency of
      high-order allocations while transparent hugepage support was enabled and
      three benchmarks were run.  The "fix-infinite" figures are 2.6.38-rc4 with
      Johannes's patch "vmscan: fix zone shrinking exit when scan work is done"
      applied.
      
        STREAM Highorder Allocation Latency Statistics
                       fix-infinite     break-early
        1 :: Count            10298           10229
        1 :: Min             0.4560          0.4640
        1 :: Mean            1.0589          1.0183
        1 :: Max            14.5990         11.7510
        1 :: Stddev          0.5208          0.4719
        2 :: Count                2               1
        2 :: Min             1.8610          3.7240
        2 :: Mean            3.4325          3.7240
        2 :: Max             5.0040          3.7240
        2 :: Stddev          1.5715          0.0000
        9 :: Count           111696          111694
        9 :: Min             0.5230          0.4110
        9 :: Mean           10.5831         10.5718
        9 :: Max            38.4480         43.2900
        9 :: Stddev          1.1147          1.1325
      
      Mean time for order-1 allocations is reduced.  order-2 looks increased but
      with so few allocations, it's not particularly significant.  THP mean
      allocation latency is also reduced.  That said, allocation time varies so
      significantly that the reductions are within noise.
      
      Max allocation time is reduced by a significant amount for low-order
      allocations but increased for THP allocations, which presumably are now
      breaking before reclaim has done enough work.
      
        SysBench Highorder Allocation Latency Statistics
                       fix-infinite     break-early
        1 :: Count            15745           15677
        1 :: Min             0.4250          0.4550
        1 :: Mean            1.1023          1.0810
        1 :: Max            14.4590         10.8220
        1 :: Stddev          0.5117          0.5100
        2 :: Count                1               1
        2 :: Min             3.0040          2.1530
        2 :: Mean            3.0040          2.1530
        2 :: Max             3.0040          2.1530
        2 :: Stddev          0.0000          0.0000
        9 :: Count             2017            1931
        9 :: Min             0.4980          0.7480
        9 :: Mean           10.4717         10.3840
        9 :: Max            24.9460         26.2500
        9 :: Stddev          1.1726          1.1966
      
      Again, mean time for order-1 allocations is reduced while order-2
      allocations are too few to draw conclusions from.  The mean time for THP
      allocations is also slightly reduced albeit the reductions are within
      the variance.
      
      Once again, our maximum allocation time is significantly reduced for
      low-order allocations and slightly increased for THP allocations.
      
        Anon stream mmap reference Highorder Allocation Latency Statistics
                       fix-infinite     break-early
        1 :: Count             1376            1790
        1 :: Min             0.4940          0.5010
        1 :: Mean            1.0289          0.9732
        1 :: Max             6.2670          4.2540
        1 :: Stddev          0.4142          0.2785
        2 :: Count                1               -
        2 :: Min             1.9060               -
        2 :: Mean            1.9060               -
        2 :: Max             1.9060               -
        2 :: Stddev          0.0000               -
        9 :: Count            11266           11257
        9 :: Min             0.4990          0.4940
        9 :: Mean        27250.4669      24256.1919
        9 :: Max      11439211.0000    6008885.0000
        9 :: Stddev     226427.4624     186298.1430
      
      This benchmark creates one thread per CPU which references an amount of
      anonymous memory 1.5 times the size of physical RAM.  This pounds swap
      quite heavily and is intended to exercise THP a bit.
      
      Mean allocation time for order-1 is reduced as before.  It's also reduced
      for THP allocations but the variations here are pretty massive due to
      swap.  As before, maximum allocation times are significantly reduced.
      
      Overall, the patch reduces the mean and maximum allocation latencies for
      the smaller high-order allocations.  This was with Slab configured so it
      would be expected to be more significant with Slub which uses these size
      allocations more aggressively.
      
      The mean allocation times for THP allocations are also slightly reduced.
      The maximum latency was slightly increased as predicted by the comments
      due to reclaim/compaction breaking early.  However, workloads care more
      about the latency of lower-order allocations than THP so it's an
      acceptable trade-off.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Acked-by: Andrea Arcangeli <aarcange@redhat.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
      Acked-by: Andrea Arcangeli <aarcange@redhat.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      drivers/nfc/pn544.c: add missing regulator · ac3c8304
      Matti J. Aaltonen committed
      The regulator framework is used for power management.  The regulators are
      only named in the driver code, the actual control stuff is in the board
      file for each architecture or use case.
      
      The PN544 chip has three regulators that can be controlled or not -
      depending on the architecture where the chip is being used.  So some of
      the regulators may not be controllable.  In our current case the third
      regulator, which was missing from the code, went unnoticed because we
      didn't need to control it.  To be as general as possible - in this respect
      - the driver needs to list all regulators.  Then the board file can be
      used to actually set the usage.
      Signed-off-by: Matti J. Aaltonen <matti.j.aaltonen@nokia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      drivers/nfc/Kconfig: use full form of the NFC acronym · d73fa4b9
      Matti J. Aaltonen committed
      Spell out the NFC acronym when it's shown for the first time.
      Signed-off-by: Matti J. Aaltonen <matti.j.aaltonen@nokia.com>
      Acked-by: Wolfram Sang <w.sang@pengutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • F
      swiotlb: fix wrong panic · fba99fa3
      FUJITA Tomonori committed
      swiotlb's map_page wrongly calls panic() when it can't find a buffer fit
      for device's dma mask.  It should return an error instead.
      
      Devices with an odd dma mask (i.e.  under 4G) like b44 network card hit
      this bug (the system crashes):
      
         http://marc.info/?l=linux-kernel&m=129648943830106&w=2
      
      If swiotlb returns an error, b44 driver can use the own bouncing
      mechanism.
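
The change in behaviour (report a mapping failure to the caller instead of calling panic()) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `map_for_device()` and `MAP_ERROR_ADDR` are hypothetical names, not the swiotlb API, and addresses are modelled as plain 64-bit integers.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "no usable bounce buffer"; not a kernel constant. */
#define MAP_ERROR_ADDR UINT64_MAX

/*
 * Return buf_addr if the whole buffer [buf_addr, buf_addr + size) is
 * addressable under dma_mask, otherwise return MAP_ERROR_ADDR so the
 * caller can fall back to its own bouncing scheme.
 */
static uint64_t map_for_device(uint64_t buf_addr, uint64_t size,
                               uint64_t dma_mask)
{
    /* Written to avoid overflow in buf_addr + size - 1. */
    if (size == 0 || size - 1 > dma_mask || buf_addr > dma_mask - (size - 1))
        return MAP_ERROR_ADDR;
    return buf_addr;
}
```

A driver in the position the message describes for b44 would compare the result against the sentinel and switch to its own bounce buffers on failure.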
      Reported-by: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • H
      MAINTAINERS: add Chinese documentation maintainer · f8407f26
      Harry Wei committed
      I have translated some kernel documentation so I wish to maintain the
      Chinese documentation in our kernel directories.
      Signed-off-by: Harry Wei <harryxiyou@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • G
      mm: grab rcu read lock in move_pages() · a879bf58
      Greg Thelen committed
      The move_pages() usage of find_task_by_vpid() requires rcu_read_lock() to
      prevent free_pid() from reclaiming the pid.
      
      Without this patch, RCU warnings are printed in v2.6.38-rc4 move_pages()
      with:
      
        CONFIG_LOCKUP_DETECTOR=y
        CONFIG_PREEMPT=y
        CONFIG_LOCKDEP=y
        CONFIG_PROVE_LOCKING=y
        CONFIG_PROVE_RCU=y
      
      Previously, migrate_pages() went through a similar transformation
      replacing usage of tasklist_lock with rcu read lock:
      
        commit 55cfaa3c
        Author: Zeng Zhaoming <zengzm.kernel@gmail.com>
        Date:   Thu Dec 2 14:31:13 2010 -0800
      
            mm/mempolicy.c: add rcu read lock to protect pid structure
      
        commit 1e50df39
        Author: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
        Date:   Thu Jan 13 15:46:14 2011 -0800
      
            mempolicy: remove tasklist_lock from migrate_pages
      Signed-off-by: Greg Thelen <gthelen@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Zeng Zhaoming <zengzm.kernel@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • D
      epoll: prevent creating circular epoll structures · 22bacca4
      Davide Libenzi committed
      In several places, an epoll fd can call another file's ->f_op->poll()
      method with ep->mtx held.  This is in general unsafe, because that other
      file could itself be an epoll fd that contains the original epoll fd.
      
      The code defends against this possibility in its own ->poll() method using
      ep_call_nested, but there are several other unsafe calls to ->poll
      elsewhere that can be made to deadlock.  For example, the following simple
      program causes the call in ep_insert to recursively call the original fd's
      ->poll, leading to deadlock:
      
       #include <unistd.h>
       #include <sys/epoll.h>
      
       int main(void) {
           int e1, e2, p[2];
           struct epoll_event evt = {
               .events = EPOLLIN
           };
      
           e1 = epoll_create(1);
           e2 = epoll_create(2);
           pipe(p);
      
           epoll_ctl(e2, EPOLL_CTL_ADD, e1, &evt);
           epoll_ctl(e1, EPOLL_CTL_ADD, p[0], &evt);
           write(p[1], p, sizeof p);
           epoll_ctl(e1, EPOLL_CTL_ADD, e2, &evt);
      
           return 0;
       }
      
      On insertion, check whether the inserted file is itself a struct epoll,
      and if so, do a recursive walk to detect whether inserting this file would
      create a loop of epoll structures, which could lead to deadlock.
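
The insertion-time check can be sketched in userspace as a reachability walk over "watches" edges. This is a simplified model, not the kernel code: `struct ep_model`, `ep_reaches()` and `ep_model_add()` are hypothetical names, only epoll-to-epoll links are modelled, and the serialization of concurrent inserts mentioned below is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_WATCHED 8

/* Hypothetical model of an epoll instance that may watch other epoll instances. */
struct ep_model {
    struct ep_model *watched[MAX_WATCHED];
    size_t nr_watched;
};

/*
 * Recursive walk: is target reachable from 'from' by following watch edges?
 * The walk terminates because ep_model_add() never lets a cycle form.
 */
static bool ep_reaches(const struct ep_model *from, const struct ep_model *target)
{
    if (from == target)
        return true;
    for (size_t i = 0; i < from->nr_watched; i++)
        if (ep_reaches(from->watched[i], target))
            return true;
    return false;
}

/* Make parent watch child; return -1 if that would create a loop or the table is full. */
static int ep_model_add(struct ep_model *parent, struct ep_model *child)
{
    if (ep_reaches(child, parent))
        return -1;
    if (parent->nr_watched == MAX_WATCHED)
        return -1;
    parent->watched[parent->nr_watched++] = child;
    return 0;
}
```

Replaying the reproducer's epoll-to-epoll steps in this model, adding e1 to e2 succeeds and the later attempt to add e2 to e1 is rejected.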
      
      [nelhage@ksplice.com: Use epmutex to serialize concurrent inserts]
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
      Reported-by: Nelson Elhage <nelhage@ksplice.com>
      Tested-by: Nelson Elhage <nelhage@ksplice.com>
      Cc: <stable@kernel.org>		[2.6.34+, possibly earlier]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 · 6366213e
      Linus Torvalds committed
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
        regulator, mc13xxx: Remove pointless test for unsigned less than zero
        regulator: Fix warning with CONFIG_BUG disabled
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable · 4660ba63
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
        Btrfs: fix fiemap bugs with delalloc
        Btrfs: set FMODE_EXCL in btrfs_device->mode
        Btrfs: make btrfs_rm_device() fail gracefully
        Btrfs: Avoid accessing unmapped kernel address
        Btrfs: Fix BTRFS_IOC_SUBVOL_SETFLAGS ioctl
        Btrfs: allow balance to explicitly allocate chunks as it relocates
        Btrfs: put ENOSPC debugging under a mount option
    • L
      Merge branch 'x86-fixes-for-linus' of... · 958ede7f
      Linus Torvalds committed
      Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        x86 quirk: Fix polarity for IRQ0 pin2 override on SB800 systems
        x86/mrst: Fix apb timer rating when lapic timer is used
        x86: Fix reboot problem on VersaLogic Menlow boards
    • J
      RTC: fix typo in drivers/rtc/rtc-at91sam9.c · d4035850
      Jelle Martijn Kok committed
      The member of the rtc_class_ops struct is called alarm_irq_enable and
      not alarm_irq_enabled
      
      CC: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Jelle Martijn Kok <jmkok@youcom.nl>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • L
      Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 · c1bc3beb
      Linus Torvalds committed
      * 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
        usb: musb: core: set has_tt flag
        USB: xhci: mark local functions as static
        USB: xhci: fix couple sparse annotations
        USB: xhci: rework xhci_print_ir_set() to get ir set from xhci itself
        USB: Reset USB 3.0 devices on (re)discovery
        xhci: Fix an error in count_sg_trbs_needed()
        xhci: Fix errors in the running total calculations in the TRB math
        xhci: Clarify some expressions in the TRB math
        xhci: Avoid BUG() in interrupt context
    • L
      Merge branch 'for-linus' of git://neil.brown.name/md · 638691a7
      Linus Torvalds committed
      * 'for-linus' of git://neil.brown.name/md:
        md: Fix - again - partition detection when array becomes active
        Fix over-zealous flush_disk when changing device size.
        md: avoid spinlock problem in blk_throtl_exit
        md: correctly handle probe of an 'mdp' device.
        md: don't set_capacity before array is active.
        md: Fix raid1->raid0 takeover
    • A
      RxRPC: Allocate tokens with kzalloc to avoid oops in rxrpc_destroy · 0a93ea2e
      Anton Blanchard committed
      With slab poisoning enabled, I see the following oops:
      
        Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6b73
        ...
        NIP [c0000000006bc61c] .rxrpc_destroy+0x44/0x104
        LR [c0000000006bc618] .rxrpc_destroy+0x40/0x104
        Call Trace:
        [c0000000feb2bc00] [c0000000006bc618] .rxrpc_destroy+0x40/0x104 (unreliable)
        [c0000000feb2bc90] [c000000000349b2c] .key_cleanup+0x1a8/0x20c
        [c0000000feb2bd40] [c0000000000a2920] .process_one_work+0x2f4/0x4d0
        [c0000000feb2be00] [c0000000000a2d50] .worker_thread+0x254/0x468
        [c0000000feb2bec0] [c0000000000a868c] .kthread+0xbc/0xc8
        [c0000000feb2bf90] [c000000000020e00] .kernel_thread+0x54/0x70
      
      We aren't initialising token->next, but the code in destroy_context relies
      on the list being NULL terminated. Use kzalloc to zero out all the fields.
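
A userspace analogue of the fix, using calloc() where the kernel uses kzalloc(): zeroing the allocation guarantees that `next` starts out NULL, so a teardown loop that walks until NULL terminates correctly. The `struct token` layout and the helper names below are illustrative assumptions, not the RxRPC definitions.

```c
#include <stdlib.h>

/* Illustrative token; the real rxrpc token has different fields. */
struct token {
    struct token *next;
    int kind;
};

/* Allocate a zeroed token so that ->next is NULL without explicit setup. */
static struct token *token_alloc(int kind)
{
    struct token *t = calloc(1, sizeof(*t));

    if (t)
        t->kind = kind;
    return t;
}

/* Free a NULL-terminated chain of tokens and return how many were freed. */
static size_t tokens_destroy(struct token *head)
{
    size_t freed = 0;

    while (head) {
        struct token *next = head->next;

        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

With malloc() in place of calloc(), the loop in `tokens_destroy()` would follow whatever garbage happened to be in `next`, which is the userspace counterpart of the poisoned 0x6b6b... address in the oops.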
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      afs: Fix oops in afs_unlink_writeback · f129ccc9
      Anton Blanchard committed
      I'm seeing the following oops when testing afs:
      
        Unable to handle kernel paging request for data at address 0x00000008
        ...
        NIP [c0000000003393b0] .afs_unlink_writeback+0x38/0xc0
        LR [c00000000033987c] .afs_put_writeback+0x98/0xec
        Call Trace:
        [c00000000345f600] [c00000000033987c] .afs_put_writeback+0x98/0xec
        [c00000000345f690] [c00000000033ae80] .afs_write_begin+0x6a4/0x75c
        [c00000000345f790] [c00000000012b77c] .generic_file_buffered_write+0x148/0x320
        [c00000000345f8d0] [c00000000012e1b8] .__generic_file_aio_write+0x37c/0x3e4
        [c00000000345f9d0] [c00000000012e2a8] .generic_file_aio_write+0x88/0xfc
        [c00000000345fa90] [c0000000003390a8] .afs_file_write+0x10c/0x178
        [c00000000345fb40] [c000000000188788] .do_sync_write+0xc4/0x128
        [c00000000345fcc0] [c000000000189658] .vfs_write+0xe8/0x1d8
        [c00000000345fd70] [c000000000189884] .SyS_write+0x68/0xb0
        [c00000000345fe30] [c000000000008564] syscall_exit+0x0/0x40
      
      afs_write_begin hits an error and calls afs_unlink_writeback. In there
      we do list_del_init on an uninitialised list.
      
      The patch below initialises ->link when creating the afs_writeback struct.
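
Why the initialisation matters can be seen in a minimal userspace re-implementation of the circular list primitives. The helpers below mimic the behaviour of the kernel's list API, but `struct writeback` and `writeback_init()` are illustrative assumptions rather than the afs code.

```c
/* Minimal circular doubly linked list in the style of the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An initialised, unlinked entry points at itself. */
static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry and reinitialise it; dereferences entry->prev and entry->next. */
static void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    init_list_head(entry);
}

/* Illustrative object carrying a list link. */
struct writeback {
    struct list_head link;
    int usage;
};

/* Initialise the link at creation so an early error path can unlink safely. */
static void writeback_init(struct writeback *wb)
{
    init_list_head(&wb->link);
    wb->usage = 1;
}
```

Because `list_del_init()` writes through `entry->prev` and `entry->next`, calling it on a link that was never initialised dereferences stale pointers; on a self-pointing link it is a harmless no-op.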
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 25 Feb, 2011 13 commits
  3. 24 Feb, 2011 12 commits
    • J
      x86/mrst: Fix apb timer rating when lapic timer is used · 7b62dbec
      Jacob Pan committed
      Need to adjust the clockevent device rating for the structure
      that will be registered with clockevent system instead of the
      temporary structure.
      
      Without this fix, APB timer rating will be higher than LAPIC
      timer such that it can not be released later to be used as the
      broadcast timer.
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      LKML-Reference: <1298506046-439-1-git-send-email-jacob.jun.pan@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • J
      Unlock vfsmount_lock in do_umount · bf9faa2a
      J. R. Okajima committed
      By the commit
      	b3e19d92 2011-01-07 fs: scale mntget/mntput
      vfsmount_lock was introduced around testing mnt_count.
      Fix the mis-typed 'unlock'
      Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • N
      md: Fix - again - partition detection when array becomes active · f0b4f7e2
      NeilBrown committed
      Revert
          b821eaa5
      and
          f3b99be1
      
      When I wrote the first of these I had a wrong idea about the
      lifetime of 'struct block_device'.  It can disappear at any time that
      the block device is not open if it falls out of the inode cache.
      
      So relying on the 'size' recorded with it to detect when the
      device size has changed and so we need to revalidate, is wrong.
      
      Rather, we really do need the 'changed' attribute stored directly in
      the mddev and set/tested as appropriate.
      
      Without this patch, a sequence of:
         mknod / open / close / unlink
      
      (which can cause a block_device to be created and then destroyed)
      will result in a rescan of the partition table and consequent removal
      and addition of partitions.
      Several of these in a row can get udev racing to create and unlink and
      other code can get confused.
      
      With the patch, the rescan is only performed when needed and so there
      are no races.
      
      This is suitable for any stable kernel from 2.6.35.
      Reported-by: "Wojcik, Krzysztof" <krzysztof.wojcik@intel.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: stable@kernel.org
    • N
      Fix over-zealous flush_disk when changing device size. · 93b270f7
      NeilBrown committed
      There are two cases when we call flush_disk.
      In one, the device has disappeared (check_disk_change) so any
      data we hold becomes irrelevant.
      In the other, the device has changed size (check_disk_size_change)
      so data we hold may be irrelevant.
      
      In both cases it makes sense to discard any 'clean' buffers,
      so they will be read back from the device if needed.
      
      In the former case it makes sense to discard 'dirty' buffers
      as there will never be anywhere safe to write the data.  In the
      second case it *does*not* make sense to discard dirty buffers
      as that will lead to file system corruption when you simply enlarge
      the containing devices.
      
      flush_disk calls __invalidate_device.
      __invalidate_device calls both invalidate_inodes and invalidate_bdev.
      
      invalidate_inodes *does* discard I_DIRTY inodes and this does lead
      to fs corruption.
      
      invalidate_bdev *does*not* discard dirty pages, but I don't really care
      about that at present.
      
      So this patch adds a flag to __invalidate_device (calling it
      __invalidate_device2) to indicate whether dirty buffers should be
      killed, and this is passed to invalidate_inodes which can choose to
      skip dirty inodes.
      
      flush_disk then passes true from check_disk_change and false from
      check_disk_size_change.
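
The effect of the new flag can be sketched with a small userspace model. The structure and function names below are hypothetical stand-ins for the inode-invalidation path, not the kernel functions named above; the point is only that the two callers pass different values for the flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cached inode. */
struct inode_model {
    bool dirty;  /* holds data not yet written back */
    bool cached; /* still present in the cache */
};

/*
 * Drop cached inodes and return how many were discarded.
 * Dirty inodes are skipped unless kill_dirty is set.
 */
static size_t invalidate_inodes_model(struct inode_model *inodes, size_t n,
                                      bool kill_dirty)
{
    size_t dropped = 0;

    for (size_t i = 0; i < n; i++) {
        if (!inodes[i].cached)
            continue;
        if (inodes[i].dirty && !kill_dirty)
            continue;
        inodes[i].cached = false;
        dropped++;
    }
    return dropped;
}

/* Device disappeared: dirty data can never be written, so discard everything. */
static size_t flush_on_media_change(struct inode_model *inodes, size_t n)
{
    return invalidate_inodes_model(inodes, n, true);
}

/* Device was resized: keep dirty data so it can still be written back. */
static size_t flush_on_resize(struct inode_model *inodes, size_t n)
{
    return invalidate_inodes_model(inodes, n, false);
}
```

In this model a resize discards only clean entries, which matches the stated goal of not losing dirty inodes when a device is merely enlarged.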
      
      dm avoids tripping over this problem by calling i_size_write directly
      rather than using check_disk_size_change.
      
      md does use check_disk_size_change and so is affected.
      
      This regression was introduced by commit 608aeef1 which causes
      check_disk_size_change to call flush_disk, so it is suitable for any
      kernel since 2.6.27.
      
      Cc: stable@kernel.org
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Andrew Patterson <andrew.patterson@hp.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NeilBrown <neilb@suse.de>
    • H
      mm: fix possible cause of a page_mapped BUG · a3e8cc64
      Hugh Dickins committed
      Robert Swiecki reported a BUG_ON(page_mapped) from a fuzzer, punching
      a hole with madvise(,, MADV_REMOVE).  That path is under mutex, and
      cannot be explained by lack of serialization in unmap_mapping_range().
      
      Reviewing the code, I found one place where vm_truncate_count handling
      should have been updated, when I switched at the last minute from one
      way of managing the restart_addr to another: mremap move changes the
      virtual addresses, so it ought to adjust the restart_addr.
      
      But rather than exporting the notion of restart_addr from memory.c, or
      converting to restart_pgoff throughout, simply reset vm_truncate_count
      to 0 to force a rescan if mremap move races with preempted truncation.
      
      We have no confirmation that this fixes Robert's BUG,
      but it is a fix that's worth making anyway.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      mm: prevent concurrent unmap_mapping_range() on the same inode · 2aa15890
      Miklos Szeredi committed
      Michael Leun reported that running parallel opens on a fuse filesystem
      can trigger a "kernel BUG at mm/truncate.c:475"
      
      Gurudas Pai reported the same bug on NFS.
      
      The reason is, unmap_mapping_range() is not prepared for more than
      one concurrent invocation per inode.  For example:
      
        thread1: going through a big range, stops in the middle of a vma and
           stores the restart address in vm_truncate_count.
      
        thread2: comes in with a small (e.g. single page) unmap request on
           the same vma, somewhere before restart_address, finds that the
           vma was already unmapped up to the restart address and happily
           returns without doing anything.
      
      Another scenario would be two big unmap requests, both having to
      restart the unmapping and each one setting vm_truncate_count to its
      own value.  This could go on forever without any of them being able to
      finish.
      
      Truncate and hole punching already serialize with i_mutex.  Other
      callers of unmap_mapping_range() do not, and it's difficult to get
      i_mutex protection for all callers.  In particular ->d_revalidate(),
      which calls invalidate_inode_pages2_range() in fuse, may be called
      with or without i_mutex.
      
      This patch adds a new mutex to 'struct address_space' to prevent
      running multiple concurrent unmap_mapping_range() on the same mapping.
      
      [ We'll hopefully get rid of all this with the upcoming mm
        preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex
        lockbreak" patch in particular.  But that is for 2.6.39 ]
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Reported-by: Michael Leun <lkml20101129@newton.leun.net>
      Reported-by: Gurudas Pai <gurudas.pai@oracle.com>
      Tested-by: Gurudas Pai <gurudas.pai@oracle.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • L
      Revert "Bluetooth: Enable USB autosuspend by default on btusb" · 78794b2c
      Linus Torvalds committed
      This reverts commit 556ea928.
      
      Jeff Chua reports that it can cause some bluetooth devices (he mentions
      a Bluetooth Intermec scanner) to just stop responding after a while
      with messages like
      
        [ 4533.361959] btusb 8-1:1.0: no reset_resume for driver btusb?
        [ 4533.361964] btusb 8-1:1.1: no reset_resume for driver btusb?
      
      from the kernel. See also
      
        https://bugzilla.kernel.org/show_bug.cgi?id=26182
      
      for other reports.
      Reported-by: Jeff Chua <jeff.chua.linux@gmail.com>
      Reported-by: Andrew Meakovski <meako@bigmir.net>
      Reported-by: Jim Faulkner <jfaulkne@ccs.neu.edu>
      Acked-by: Greg KH <gregkh@suse.de>
      Acked-by: Matthew Garrett <mjg@redhat.com>
      Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
      Cc: stable@kernel.org (for 2.6.37)
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • D
      Merge branch 'drm-intel-fixes' of... · fbf92bea
      Dave Airlie committed
      Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel into drm-fixes
      
      * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel:
        drm/i915: fix corruptions on i8xx due to relaxed fencing
        drm/i915: skip FDI & PCH enabling for DP_A
        agp/intel: Experiment with a 855GM GWB bit
        drm/i915: don't enable FDI & transcoder interrupts after all
        drm/i915: Ignore a hung GPU when flushing the framebuffer prior to a switch
    • D
      drm/i915: fix corruptions on i8xx due to relaxed fencing · c2e0eb16
      Daniel Vetter committed
      It looks like gen2 has a peculiar interleaved 2-row inter-tile
      layout. Probably inherited from i81x which had 2kb tiles (which
      naturally fit an even-number-of-tile-rows scheme to fit onto 4kb
      pages). There is no other mention of this in any docs (also not
      in the Intel internal documentation according to Chris Wilson).
      
      Problem manifests itself in corruptions in the second half of the
      last tile row (if the bo has an odd number of tiles). Which can
      only happen with relaxed tiling (introduced in a00b10c3).
      
      So reject set_tiling calls that don't satisfy this constraint to
      prevent broken userspace from causing havoc. While at it, also
      check the size for newer chipsets.
      
      LKML: https://lkml.org/lkml/2011/2/19/5
      Reported-by: Indan Zupancic <indan@nul.nu>
      Tested-by: Indan Zupancic <indan@nul.nu>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · ef324285
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
        Added support for usb ethernet (0x0fe6, 0x9700)
        r8169: fix RTL8168DP power off issue.
        r8169: correct settings of rtl8102e.
        r8169: fix incorrect args to oob notify.
        DM9000B: Fix PHY power for network down/up
        DM9000B: Fix reg_save after spin_lock in dm9000_timeout
        net_sched: long word align struct qdisc_skb_cb data
        sfc: lower stack usage in efx_ethtool_self_test
        bridge: Use IPv6 link-local address for multicast listener queries
        bridge: Fix MLD queries' ethernet source address
        bridge: Allow mcast snooping for transient link local addresses too
        ipv6: Add IPv6 multicast address flag defines
        bridge: Add missing ntohs()s for MLDv2 report parsing
        bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report
        bridge: Fix IPv6 multicast snooping by storing correct protocol type
        p54pci: update receive dma buffers before and after processing
        fix cfg80211_wext_siwfreq lock ordering...
        rt2x00: Fix WPA TKIP Michael MIC failures.
        ath5k: Fix fast channel switching
        tcp: undo_retrans counter fixes
        ...
    • L
      Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · b5f7376e
      Linus Torvalds committed
      * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
        amd64-agp: fix crash at second module load
        drm/radeon: fix regression with AA resolve checking
        drm: drop commented out code and preceding comment
        drm/vblank: Enable precise vblank timestamps for interlaced and doublescan modes.
        drm/vblank: Use memory barriers optimized for atomic_t instead of generics.
        drm/vblank: Use abs64(diff_ns) for s64 diff_ns instead of abs(diff_ns)
        drm/radeon/kms: align height of fb allocation.
        Revert "drm/radeon/kms: switch back to min->max pll post divider iteration"
    • D