1. 23 Aug, 2012 29 commits
  2. 22 Aug, 2012 11 commits
    • L
      Merge branch 'akpm' (Andrew's patch-bomb) · 23dcfa61
      Linus Torvalds committed
      Merge fixes from Andrew Morton.
      
      Random drivers and some VM fixes.
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (17 commits)
        mm: compaction: Abort async compaction if locks are contended or taking too long
        mm: have order > 0 compaction start near a pageblock with free pages
        rapidio/tsi721: fix unused variable compiler warning
        rapidio/tsi721: fix inbound doorbell interrupt handling
        drivers/rtc/rtc-rs5c348.c: fix hour decoding in 12-hour mode
        mm: correct page->pfmemalloc to fix deactivate_slab regression
        drivers/rtc/rtc-pcf2123.c: initialize dynamic sysfs attributes
        mm/compaction.c: fix deferring compaction mistake
        drivers/misc/sgi-xp/xpc_uv.c: SGI XPC fails to load when cpu 0 is out of IRQ resources
        string: do not export memweight() to userspace
        hugetlb: update hugetlbpage.txt
        checkpatch: add control statement test to SINGLE_STATEMENT_DO_WHILE_MACRO
        mm: hugetlbfs: correctly populate shared pmd
        cciss: fix incorrect scsi status reporting
        Documentation: update mount option in filesystem/vfat.txt
        mm: change nr_ptes BUG_ON to WARN_ON
        cs5535-clockevt: typo, it's MFGPT, not MFPGT
      23dcfa61
    • L
      Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · a484147a
      Linus Torvalds committed
      Pull media fixes from Mauro Carvalho Chehab:
       "For bug fixes, at soc_camera, si470x, uvcvideo, iguanaworks IR driver,
        radio_shark Kbuild fixes, and at the V4L2 core (radio fixes)."
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] media: soc_camera: don't clear pix->sizeimage in JPEG mode
        [media] media: mx2_camera: Fix clock handling for i.MX27
        [media] video: mx2_camera: Use clk_prepare_enable/clk_disable_unprepare
        [media] video: mx1_camera: Use clk_prepare_enable/clk_disable_unprepare
        [media] media: mx3_camera: buf_init() add buffer state check
        [media] radio-shark2: Only compile led support when CONFIG_LED_CLASS is set
        [media] radio-shark: Only compile led support when CONFIG_LED_CLASS is set
        [media] radio-shark*: Call cancel_work_sync from disconnect rather then release
        [media] radio-shark*: Remove work-around for dangling pointer in usb intfdata
        [media] Add USB dependency for IguanaWorks USB IR Transceiver
        [media] Add missing logging for rangelow/high of hwseek
        [media] VIDIOC_ENUM_FREQ_BANDS fix
        [media] mem2mem_testdev: fix querycap regression
        [media] si470x: v4l2-compliance fixes
        [media] DocBook: Remove a spurious character
        [media] uvcvideo: Reset the bytesused field when recycling an erroneous buffer
      a484147a
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 8f8ba75e
      Linus Torvalds committed
      Pull networking update from David Miller:
       "A couple weeks of bug fixing in there.  The largest chunk is all the
        broken crap Amerigo Wang found in the netpoll layer."
      
       1) netpoll and its users have several serious bugs:
          a) uses GFP_KERNEL with locks held
          b) interfaces requiring interrupts disabled are called with them
             enabled
          c) and vice versa
          d) VLAN tag demuxing, as per all other RX packet input paths, is not
             applied
      
          All from Amerigo Wang.
      
       2) Hopefully cure the ipv4 mapped ipv6 address TCP early demux bugs for
          good, from Neal Cardwell.
      
       3) Unlike AF_UNIX, AF_PACKET sockets don't set a default credentials
          when the user doesn't specify one explicitly during sendmsg().
          Instead we attach an empty (zero) SCM credential block which is
          definitely not what we want.  Fix from Eric Dumazet.
      
       4) IPv6 illegally invokes netdevice notifiers with RCU lock held, fix
          from Ben Hutchings.
      
       5) inet_csk_route_child_sock() checks wrong inet options pointer, fix
          from Christoph Paasch.
      
       6) When AF_PACKET is used for transmit, packet loopback doesn't behave
          properly when a socket fanout is enabled, from Eric Leblond.
      
       7) On bluetooth l2cap channel create failure, we leak the socket, from
          Jaganath Kanakkassery.
      
       8) Fix all the netprio file handling bugs found by Al Viro, from John
          Fastabend.
      
       9) Several error return and NULL deref bug fixes in networking drivers
          from Julia Lawall.
      
      10) A large smattering of struct padding et al. kernel memory leaks to
          userspace, found by Mathias Krause.
      
      11) Conntrack expectations in netfilter can access an uninitialized timer,
          fix from Pablo Neira Ayuso.
      
      12) Several netfilter SIP tracker bug fixes from Patrick McHardy.
      
      13) IPSEC ipv6 routes are not initialized correctly all the time,
          resulting in an OOPS in inet_putpeer().  Also from Patrick McHardy.
      
      14) Bridging does rcu_dereference() outside of RCU protected area, from
          Stephen Hemminger.
      
      15) Fix routing cache removal performance regression when looking up
          output routes that have a local destination.  From Zheng Yan.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
        af_netlink: force credentials passing [CVE-2012-3520]
        ipv4: fix ip header ident selection in __ip_make_skb()
        ipv4: Use newinet->inet_opt in inet_csk_route_child_sock()
        tcp: fix possible socket refcount problem
        net: tcp: move sk_rx_dst_set call after tcp_create_openreq_child()
        net/core/dev.c: fix kernel-doc warning
        netconsole: remove a redundant netconsole_target_put()
        net: ipv6: fix oops in inet_putpeer()
        net/stmmac: fix issue of clk_get for Loongson1B.
        caif: Do not dereference NULL in chnl_recv_cb()
        af_packet: don't emit packet on orig fanout group
        drivers/net/irda: fix error return code
        drivers/net/wan/dscc4.c: fix error return code
        drivers/net/wimax/i2400m/fw.c: fix error return code
        smsc75xx: add missing entry to MAINTAINERS
        net: qmi_wwan: new devices: UML290 and K5006-Z
        net: sh_eth: Add eth support for R8A7779 device
        netdev/phy: skip disabled mdio-mux nodes
        dt: introduce for_each_available_child_of_node, of_get_next_available_child
        net: netprio: fix cgrp create and write priomap race
        ...
      8f8ba75e
    • M
      mm: compaction: Abort async compaction if locks are contended or taking too long · c67fe375
      Mel Gorman committed
      Jim Schutt reported a problem that pointed at compaction contending
      heavily on locks.  The workload is straight-forward and in his own words;
      
      	The systems in question have 24 SAS drives spread across 3 HBAs,
      	running 24 Ceph OSD instances, one per drive.  FWIW these servers
      	are dual-socket Intel 5675 Xeons w/48 GB memory.  I've got ~160
      	Ceph Linux clients doing dd simultaneously to a Ceph file system
      	backed by 12 of these servers.
      
      Early in the test everything looks fine
      
        procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
         r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
        31 15          0     287216        576   38606628    0    0     2  1158    2   14   1  3  95  0  0
        27 15          0     225288        576   38583384    0    0    18 2222016 203357 134876  11 56  17 15  0
        28 17          0     219256        576   38544736    0    0    11 2305932 203141 146296  11 49  23 17  0
         6 18          0     215596        576   38552872    0    0     7 2363207 215264 166502  12 45  22 20  0
        22 18          0     226984        576   38596404    0    0     3 2445741 223114 179527  12 43  23 22  0
      
      and then it goes to pot
      
        procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
         r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
        163  8          0     464308        576   36791368    0    0    11 22210  866  536   3 13  79  4  0
        207 14          0     917752        576   36181928    0    0   712 1345376 134598 47367   7 90   1  2  0
        123 12          0     685516        576   36296148    0    0   429 1386615 158494 60077   8 84   5  3  0
        123 12          0     598572        576   36333728    0    0  1107 1233281 147542 62351   7 84   5  4  0
        622  7          0     660768        576   36118264    0    0   557 1345548 151394 59353   7 85   4  3  0
        223 11          0     283960        576   36463868    0    0    46 1107160 121846 33006   6 93   1  1  0
      
      Note that system CPU usage is very high and the number of blocks being
      written out has dropped by 42%. He analysed this with perf and found
      
        perf record -g -a sleep 10
        perf report --sort symbol --call-graph fractal,5
          34.63%  [k] _raw_spin_lock_irqsave
                  |
                  |--97.30%-- isolate_freepages
                  |          compaction_alloc
                  |          unmap_and_move
                  |          migrate_pages
                  |          compact_zone
                  |          compact_zone_order
                  |          try_to_compact_pages
                  |          __alloc_pages_direct_compact
                  |          __alloc_pages_slowpath
                  |          __alloc_pages_nodemask
                  |          alloc_pages_vma
                  |          do_huge_pmd_anonymous_page
                  |          handle_mm_fault
                  |          do_page_fault
                  |          page_fault
                  |          |
                  |          |--87.39%-- skb_copy_datagram_iovec
                  |          |          tcp_recvmsg
                  |          |          inet_recvmsg
                  |          |          sock_recvmsg
                  |          |          sys_recvfrom
                  |          |          system_call
                  |          |          __recv
                  |          |          |
                  |          |           --100.00%-- (nil)
                  |          |
                  |           --12.61%-- memcpy
                   --2.70%-- [...]
      
      There was other data but primarily it is all showing that compaction is
      contended heavily on the zone->lock and zone->lru_lock.
      
      commit [b2eef8c0: mm: compaction: minimise the time IRQs are disabled
      while isolating pages for migration] noted that it was possible for
      migration to hold the lru_lock for an excessive amount of time. Very
      broadly speaking this patch expands the concept.
      
      This patch introduces compact_checklock_irqsave() to check if a lock
      is contended or the process needs to be scheduled. If either condition
      is true then async compaction is aborted and the caller is informed.
      The page allocator will fail a THP allocation if compaction failed due
      to contention. This patch also introduces compact_trylock_irqsave()
      which will acquire the lock only if it is not contended and the process
      does not need to schedule.
      Reported-by: Jim Schutt <jaschut@sandia.gov>
      Tested-by: Jim Schutt <jaschut@sandia.gov>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c67fe375
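The back-off behaviour described in c67fe375 can be illustrated with a small userspace sketch. This is not the kernel code: `sketch_lock`, `sketch_cc`, `sketch_trylock()` and `sketch_scan()` are hypothetical names, a C11 `atomic_flag` stands in for the zone locks, and the `need_resched` field only models "the process needs to be scheduled". The idea shown is that an asynchronous pass gives up and records why as soon as the lock is busy or a reschedule is pending, while a synchronous pass is allowed to wait.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a contended kernel spinlock. */
struct sketch_lock {
    atomic_flag held;
};

/* Hypothetical stand-in for the per-call compaction state. */
struct sketch_cc {
    bool sync;          /* synchronous callers may wait for the lock */
    bool need_resched;  /* models "the process needs to be scheduled" */
    bool contended;     /* set when an async pass gives up */
};

/* Single non-blocking acquisition attempt. */
static bool sketch_lock_try(struct sketch_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void sketch_unlock(struct sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Take the lock for one unit of work. An async caller backs off and
 * records the reason if a reschedule is pending or the lock is busy;
 * a sync caller spins until the lock is free.
 */
static bool sketch_trylock(struct sketch_lock *l, struct sketch_cc *cc)
{
    assert(l && cc);
    if (!cc->sync && cc->need_resched) {
        cc->contended = true;
        return false;
    }
    if (sketch_lock_try(l))
        return true;
    if (!cc->sync) {
        cc->contended = true;
        return false;
    }
    while (!sketch_lock_try(l))
        ;   /* sync path: wait for the holder to release */
    return true;
}

/* Process up to nr_blocks units; return how many completed before an abort. */
static int sketch_scan(struct sketch_lock *l, struct sketch_cc *cc, int nr_blocks)
{
    int done = 0;

    for (int i = 0; i < nr_blocks; i++) {
        if (!sketch_trylock(l, cc))
            break;
        done++;
        sketch_unlock(l);
    }
    return done;
}
```

A caller would inspect `cc.contended` after `sketch_scan()` returns and, in the spirit of the commit message, fail the expensive allocation rather than keep competing for the lock.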
    • M
      mm: have order > 0 compaction start near a pageblock with free pages · de74f1cc
      Mel Gorman committed
      Commit 7db8889a ("mm: have order > 0 compaction start off where it
      left") introduced a caching mechanism to reduce the amount of work the
      free page scanner does in compaction.  However, it has a problem.
      Consider two processes simultaneously scanning free pages
      
      					    			C
      	Process A		M     S     			F
      			|---------------------------------------|
      	Process B		M 	FS
      
      	C is zone->compact_cached_free_pfn
      	S is cc->start_free_pfn
      	M is cc->migrate_pfn
      	F is cc->free_pfn
      
      In this diagram, Process A has just reached its migrate scanner, wrapped
      around and updated compact_cached_free_pfn accordingly.
      
      Simultaneously, Process B finishes isolating in a block and updates
      compact_cached_free_pfn again to the location of its free scanner.
      
      Process A moves to "end_of_zone - one_pageblock" and runs this check
      
                      if (cc->order > 0 && (!cc->wrapped ||
                                            zone->compact_cached_free_pfn >
                                            cc->start_free_pfn))
                              pfn = min(pfn, zone->compact_cached_free_pfn);
      
      compact_cached_free_pfn is above where it started so the free scanner
      skips almost the entire space it should have scanned.  When there are
      multiple processes compacting it can end in a situation where the entire
      zone is not being scanned at all.  Further, it is possible for two
      processes to ping-pong updates to compact_cached_free_pfn, which makes
      its value effectively random.
      
      Overall, the end result wrecks allocation success rates.
      
      There is not an obvious way around this problem without introducing new
      locking and state so this patch takes a different approach.
      
      First, it gets rid of the skip logic. It is not clear that it matters
      if two free scanners happen to be in the same block, but with racing
      updates it is too easy for a scanner to skip over blocks it should not.
      
      Second, it updates compact_cached_free_pfn in a more limited set of
      circumstances.
      
      If a scanner has wrapped, it updates compact_cached_free_pfn to the end
      	of the zone. When a wrapped scanner isolates a page, it updates
      	compact_cached_free_pfn to point to the highest pageblock it
      	can isolate pages from.
      
      If a scanner has not wrapped when it has finished isolating pages it
      	checks if compact_cached_free_pfn is pointing to the end of the
      	zone. If so, the value is updated to point to the highest
      	pageblock that pages were isolated from. This value will not
      	be updated again until a free page scanner wraps and resets
      	compact_cached_free_pfn.
      
      This is not optimal and it can still race but the compact_cached_free_pfn
      will be pointing to or very near a pageblock with free pages.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      de74f1cc
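The two update rules that de74f1cc describes for compact_cached_free_pfn can be modelled in a few lines of standalone C. This is a sketch of the rules as worded in the commit message, not the kernel implementation: the `sketch_*` types and functions are hypothetical, and `end_pfn` is an assumed field that marks "the end of the zone".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the zone state the commit message refers to. */
struct sketch_zone {
    unsigned long end_pfn;                  /* assumed marker for "end of the zone" */
    unsigned long compact_cached_free_pfn;  /* shared hint for free scanners */
};

/* Hypothetical per-scanner state. */
struct sketch_scanner {
    bool wrapped;            /* the free scanner has wrapped around */
    unsigned long high_pfn;  /* highest pageblock pages were isolated from */
};

/* Rule 1a: a scanner that wraps resets the hint to the end of the zone. */
static void sketch_on_wrap(struct sketch_zone *z, struct sketch_scanner *s)
{
    s->wrapped = true;
    s->high_pfn = 0;
    z->compact_cached_free_pfn = z->end_pfn;
}

/* Rule 1b: a wrapped scanner moves the hint to the highest block it isolated from. */
static void sketch_on_isolate(struct sketch_zone *z, struct sketch_scanner *s,
                              unsigned long block_pfn)
{
    assert(block_pfn < z->end_pfn);
    if (block_pfn > s->high_pfn)
        s->high_pfn = block_pfn;
    if (s->wrapped)
        z->compact_cached_free_pfn = s->high_pfn;
}

/*
 * Rule 2: a scanner that has not wrapped only sets the hint when it is
 * still at its reset value; otherwise the hint is left alone until the
 * next wrap resets it.
 */
static void sketch_on_finish(struct sketch_zone *z, const struct sketch_scanner *s)
{
    if (!s->wrapped && z->compact_cached_free_pfn == z->end_pfn)
        z->compact_cached_free_pfn = s->high_pfn;
}
```

Because an unwrapped scanner can only overwrite the reset value, two scanners finishing in different blocks can no longer ping-pong the hint between them, which is the failure mode the commit message describes.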
    • A
      rapidio/tsi721: fix unused variable compiler warning · 9a9a9a7a
      Alexandre Bounine committed
      Fix unused variable compiler warning when built with CONFIG_RAPIDIO_DEBUG
      option off.
      
      This patch is applicable to kernel versions starting from v3.2
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9a9a9a7a
    • A
      rapidio/tsi721: fix inbound doorbell interrupt handling · 3670e7e1
      Alexandre Bounine committed
      Make sure that there are no doorbell messages left behind due to disabled
      interrupts during inbound doorbell processing.
      
      The most common case for this bug is loss of rionet JOIN messages in
      systems with three or more rionet participants and MSI or MSI-X enabled.
      As a result, requests for packet transfers may finish with "destination
      unreachable" error message.
      
      This patch is applicable to kernel versions starting from v3.2.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3670e7e1
    • A
      drivers/rtc/rtc-rs5c348.c: fix hour decoding in 12-hour mode · 7dbfb315
      Atsushi Nemoto committed
      Correct the offset by subtracting 20 from tm_hour before taking the
      modulo 12.
      
      [ "Why 20?" I hear you ask. Or at least I did.
      
        Here's the reason why: RS5C348_BIT_PM is 32, and is - stupidly -
        included in the RS5C348_HOURS_MASK define.  So it's really subtracting
        out that bit to get "hour+12".  But then because it does things modulo
        12, it needs to add the 12 in again afterwards anyway.
      
        This code is confused.  It would be much clearer if RS5C348_HOURS_MASK
        just didn't include the RS5C348_BIT_PM bit at all, then it wouldn't
        need to do the silly subtract either.
      
        Whatever. It's all just math, the end result is the same.   - Linus ]
      Reported-by: James Nute <newten82@gmail.com>
      Tested-by: James Nute <newten82@gmail.com>
      Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7dbfb315
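The "subtract 20" arithmetic in 7dbfb315 is easier to follow with concrete register values. The sketch below is a standalone illustration, not the driver source: the `sketch_*` names are hypothetical, the PM bit value of 32 comes from the note above, and the mask value 0x3f is an assumption consistent with the note that the mask includes the PM bit. Because the hour is stored as BCD, a set PM bit (0x20) lands in the tens digit and adds 20 to the decoded value.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BIT_PM     0x20  /* 32, as stated in the commit note */
#define SKETCH_HOURS_MASK 0x3f  /* assumed value; includes the PM bit */

/* Decode one packed BCD byte (two decimal digits). */
static int sketch_bcd2bin(uint8_t v)
{
    return (v & 0x0f) + (v >> 4) * 10;
}

/* Convert a 12-hour-mode hours register value to a 24-hour tm_hour. */
static int sketch_decode_hour_12h(uint8_t reg)
{
    int hour = sketch_bcd2bin(reg & SKETCH_HOURS_MASK);

    if (reg & SKETCH_BIT_PM) {
        hour -= 20;  /* remove the 20 contributed by the PM bit */
        hour %= 12;  /* 12 PM becomes 0 */
        hour += 12;  /* afternoon hours map to 12..23 */
    } else {
        hour %= 12;  /* 12 AM becomes 0 */
    }
    return hour;
}
```

For example, 12 PM is stored as 0x32 under these assumptions: BCD decoding gives 32, subtracting 20 gives 12, the modulo gives 0, and adding 12 yields noon as hour 12.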
    • A
      mm: correct page->pfmemalloc to fix deactivate_slab regression · b121186a
      Alex Shi committed
      Commit cfd19c5a ("mm: only set page->pfmemalloc when
      ALLOC_NO_WATERMARKS was used") tried to narrow down page->pfmemalloc
      setting, but it missed some places where pfmemalloc should be set.
      
      So, in __slab_alloc, the mismatch between pfmemalloc and
      ALLOC_NO_WATERMARKS causes an incorrect deactivate_slab() on our core2 server:
      
          64.73%           fio  [kernel.kallsyms]     [k] _raw_spin_lock
                           |
                           --- _raw_spin_lock
                              |
                              |---0.34%-- deactivate_slab
                              |          __slab_alloc
                              |          kmem_cache_alloc
                              |          |
      
      That causes our fio sync write performance to have a 40% regression.
      
      Move the check into get_page_from_freelist(), which resolves this issue.
      Signed-off-by: Alex Shi <alex.shi@intel.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: David Miller <davem@davemloft.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Tested-by: Sage Weil <sage@inktank.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b121186a
    • I
      drivers/rtc/rtc-pcf2123.c: initialize dynamic sysfs attributes · 5ed12f12
      Ilya Shchepetkov committed
      Dynamically allocated sysfs attributes must be initialized using
      sysfs_attr_init(), otherwise lockdep complains: BUG: key <address> not in
      .data!
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: Ilya Shchepetkov <shchepetkov@ispras.ru>
      Cc: Chris Verges <chrisv@cyberswitching.com>
      Cc: Christian Pellegrin <chripell@fsfe.org>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5ed12f12
    • M
      mm/compaction.c: fix deferring compaction mistake · c81758fb
      Minchan Kim committed
      Commit aff62249 ("vmscan: only defer compaction for failed order and
      higher") fixed a bad deferral policy but made a mistake when checking
      compact_order_failed in __compact_pgdat(), so it cannot update
      compact_order_failed with the new order.  This prevents the deferral
      policy from working correctly.  This patch fixes it.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Acked-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c81758fb