1. 25 May, 2011 2 commits
    • K
      mm: per-node vmstat: show proper vmstats · fa25c503
      KOSAKI Motohiro committed
      commit 2ac39037 ("writeback: add
      /sys/devices/system/node/<node>/vmstat") added vmstat entry.  But
      strangely it only shows nr_written and nr_dirtied.
      
              # cat /sys/devices/system/node/node20/vmstat
              nr_written 0
              nr_dirtied 0
      
      Of course, that is not adequate.  With this patch, the per-node vmstat
      file shows all VM statistics, as /proc/vmstat does.
      
              # cat /sys/devices/system/node/node0/vmstat
              nr_free_pages 899224
              nr_inactive_anon 201
              nr_active_anon 17380
              nr_inactive_file 31572
              nr_active_file 28277
              nr_unevictable 0
              nr_mlock 0
              nr_anon_pages 17321
              nr_mapped 8640
              nr_file_pages 60107
              nr_dirty 33
              nr_writeback 0
              nr_slab_reclaimable 6850
              nr_slab_unreclaimable 7604
              nr_page_table_pages 3105
              nr_kernel_stack 175
              nr_unstable 0
              nr_bounce 0
              nr_vmscan_write 0
              nr_writeback_temp 0
              nr_isolated_anon 0
              nr_isolated_file 0
              nr_shmem 260
              nr_dirtied 1050
              nr_written 938
              numa_hit 962872
              numa_miss 0
              numa_foreign 0
              numa_interleave 8617
              numa_local 962872
              numa_other 0
              nr_anon_transparent_hugepages 0
      
      [akpm@linux-foundation.org: no externs in .c files]
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Michael Rubin <mrubin@google.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fa25c503
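The per-node file shown in the commit message above is a plain list of "name value" lines, so a single counter can be picked out with a few lines of C. The following is a minimal userspace sketch, not code from the patch; `find_vmstat_counter()` and `read_node_vmstat()` are hypothetical helper names introduced only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan "name value" lines (the format shown in the commit message) and
 * return the value of one counter. Returns 0 on success, -1 if the
 * counter is not found. Hypothetical helper, not a kernel or libc API.
 */
int find_vmstat_counter(FILE *fp, const char *name, unsigned long *value)
{
	char key[64];
	unsigned long v;

	assert(fp && name && value);
	while (fscanf(fp, "%63s %lu", key, &v) == 2) {
		if (strcmp(key, name) == 0) {
			*value = v;
			return 0;
		}
	}
	return -1;
}

/* Open a per-node vmstat file by path and read one counter from it. */
int read_node_vmstat(const char *path, const char *name, unsigned long *value)
{
	FILE *fp = fopen(path, "r");
	int ret;

	if (!fp)
		return -1;
	ret = find_vmstat_counter(fp, name, value);
	fclose(fp);
	return ret;
}
```

On a kernel that carries this patch, calling `read_node_vmstat("/sys/devices/system/node/node0/vmstat", "nr_free_pages", &v)` would be expected to fill `v` with the node-local free page count; on other systems the call simply returns -1.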
    • D
      arch, mm: filter disallowed nodes from arch specific show_mem functions · 7bf02ea2
      David Rientjes committed
      Architectures that implement their own show_mem() function did not pass
      the filter argument to show_free_areas() to appropriately avoid emitting
      the state of nodes that are disallowed in the current context.  This patch
      now passes the filter argument to show_free_areas() so those nodes are now
      avoided.
      
      This patch also removes the show_free_areas() wrapper around
      __show_free_areas() and converts existing callers to pass an empty filter.
      
      ia64 emits additional information for each node, so skip_free_areas_zone()
      must be made global to filter disallowed nodes and it is converted to use
      a nid argument rather than a zone for this use case.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Bottomley <jejb@parisc-linux.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7bf02ea2
  2. 24 May, 2011 2 commits
  3. 23 May, 2011 16 commits
  4. 22 May, 2011 6 commits
    • G
      KVM: make guest mode entry to be rcu quiescent state · 8fa22068
      Gleb Natapov committed
      KVM does not hold any references to rcu protected data when it switches
      CPU into a guest mode. In fact switching to a guest mode is very similar
      to exiting to userspace from rcu point of view. In addition CPU may stay
      in a guest mode for quite a long time (up to one time slice). Let's treat
      guest mode as quiescent state, just like we do with user-mode execution.
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      8fa22068
    • S
      KVM: PPC: booke: add sregs support · 5ce941ee
      Scott Wood committed
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      5ce941ee
    • A
      KVM: Use pci_store/load_saved_state() around VM device usage · f8fcfd77
      Alex Williamson committed
      Store the device saved state so that we can reload the device back
      to the original state when it's unassigned.  This has the benefit
      that the state survives across pci_reset_function() calls via
      the PCI sysfs reset interface while the VM is using the device.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      f8fcfd77
    • A
      PCI: Add interfaces to store and load the device saved state · ffbdd3f7
      Alex Williamson committed
      For KVM device assignment, we'd like to save off the state of a device
      prior to passing it to the guest and restore it later.  We also want
      to allow pci_reset_function() to be called while the device is owned
      by the guest.  This however overwrites and invalidates the struct pci_dev
      buffers, so we can't just manually call save and restore.  Add generic
      interfaces for the saved state to be stored and reloaded back into
      struct pci_dev at a later time.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      ffbdd3f7
    • A
      PCI: Track the size of each saved capability data area · 24a4742f
      Alex Williamson committed
      This will allow us to store and load it later.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      24a4742f
    • Y
      PCI/e1000e: Add and use pci_disable_link_state_locked() · 9f728f53
      Yinghai Lu committed
      Need to use it in _e1000e_disable_aspm.  This routine is used for error
      recovery, where the pci_bus_sem is already held, and we don't want
      pci_disable_link_state to try to take it again.  So add a locked variant
      for use in cases like this.
      
      Found lock up:
      
      [ 2374.654557] kworker/32:1    D ffff881027f6b0f0     0  6075      2 0x00000000
      [ 2374.654816]  ffff88503f099a68 0000000000000046 ffff88503f098000 0000000000004000
      [ 2374.654837]  00000000001d1ec0 ffff88503f099fd8 00000000001d1ec0 ffff88503f099fd8
      [ 2374.654860]  0000000000004000 00000000001d1ec0 ffff88503dcc8000 ffff88503f090000
      [ 2374.654880] Call Trace:
      [ 2374.654898]  [<ffffffff810b1302>] ? __lock_acquired+0x3a/0x224
      [ 2374.654914]  [<ffffffff81c2b59c>] ? _raw_spin_unlock_irq+0x30/0x36
      [ 2374.654925]  [<ffffffff810b069d>] ? trace_hardirqs_on_caller+0x1f/0x178
      [ 2374.654936]  [<ffffffff81c2ab24>] rwsem_down_failed_common+0xd3/0x103
      [ 2374.654945]  [<ffffffff810b158f>] ? __lock_contended+0x3a/0x2a2
      [ 2374.654955]  [<ffffffff81c2ab7b>] rwsem_down_read_failed+0x12/0x14
      [ 2374.654967]  [<ffffffff813371e4>] call_rwsem_down_read_failed+0x14/0x30
      [ 2374.654981]  [<ffffffff8135df20>] ? pci_disable_link_state+0x5f/0xf5
      [ 2374.654990]  [<ffffffff81c2a0e6>] ? down_read+0x7e/0x91
      [ 2374.654999]  [<ffffffff8135df20>] ? pci_disable_link_state+0x5f/0xf5
      [ 2374.655008]  [<ffffffff8135df20>] pci_disable_link_state+0x5f/0xf5
      [ 2374.655024]  [<ffffffff81661796>] e1000e_disable_aspm+0x55/0x5a
      [ 2374.655037]  [<ffffffff816677eb>] e1000_io_slot_reset+0x59/0xea
      [ 2374.655048]  [<ffffffff8135fe0d>] ? report_mmio_enabled+0x5d/0x5d
      [ 2374.655057]  [<ffffffff8135fe3b>] report_slot_reset+0x2e/0x5d
      [ 2374.655072]  [<ffffffff8135369e>] pci_walk_bus+0x8a/0xb7
      [ 2374.655081]  [<ffffffff8135fe0d>] ? report_mmio_enabled+0x5d/0x5d
      [ 2374.655091]  [<ffffffff813603be>] broadcast_error_message+0xa4/0xb2
      [ 2374.655101]  [<ffffffff81352c71>] ? pci_bus_read_config_dword+0x72/0x80
      [ 2374.655110]  [<ffffffff813606df>] do_recovery+0x9e/0xf9
      [ 2374.655120]  [<ffffffff81360786>] handle_error_source+0x4c/0x51
      [ 2374.655129]  [<ffffffff81360974>] aer_isr_one_error+0x1e9/0x21a
      [ 2374.655138]  [<ffffffff81360a6c>] aer_isr+0xc7/0xcc
      [ 2374.655147]  [<ffffffff813609a5>] ? aer_isr_one_error+0x21a/0x21a
      [ 2374.655159]  [<ffffffff81096d9f>] process_one_work+0x237/0x3ec
      [ 2374.655168]  [<ffffffff81096d10>] ? process_one_work+0x1a8/0x3ec
      [ 2374.655178]  [<ffffffff8109728d>] worker_thread+0x17c/0x240
      [ 2374.655186]  [<ffffffff810b0803>] ? trace_hardirqs_on+0xd/0xf
      [ 2374.655196]  [<ffffffff81097111>] ? manage_workers+0xab/0xab
      [ 2374.655209]  [<ffffffff8109c8ed>] kthread+0xa0/0xa8
      [ 2374.655223]  [<ffffffff81c332d4>] kernel_thread_helper+0x4/0x10
      [ 2374.655232]  [<ffffffff81c2b880>] ? retint_restore_args+0xe/0xe
      [ 2374.655243]  [<ffffffff8109c84d>] ? __init_kthread_worker+0x5b/0x5b
      [ 2374.655252]  [<ffffffff81c332d0>] ? gs_change+0xb/0xb
      
      When AER happens, pci_walk_bus() already holds pci_bus_sem via
      down_read(&pci_bus_sem), and then calls report_slot_reset
              ==> e1000_io_slot_reset
                      ==> e1000e_disable_aspm
                              ==> pci_disable_link_state...

      We cannot use pci_disable_link_state() here, because it will try to take
      pci_bus_sem again.

      Add a __pci_disable_link_state() that does not need to take pci_bus_sem.

      -v2: change name to pci_disable_link_state_locked() as suggested by Jesse.
      
      [jbarnes: make sure new function is exported for modules]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      9f728f53
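The "locked variant" idea in the commit above can be illustrated outside the kernel. The sketch below is a userspace analogy only: `struct toy_lock` is a made-up non-recursive lock that reports a second acquisition instead of hanging, and all `toy_*` names are hypothetical. It shows why a caller that already holds the lock must use the `_locked` function rather than the one that takes the lock itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: a second acquire is reported, not blocked on. */
struct toy_lock {
	bool held;
};

int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -1;	/* a real rwsem or mutex could block forever here */
	l->held = true;
	return 0;
}

void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

struct toy_link {
	struct toy_lock *bus_lock;	/* stands in for pci_bus_sem */
	unsigned int aspm_enabled;	/* bitmask of enabled link states */
};

/* Locked variant: the caller must already hold link->bus_lock. */
void toy_disable_link_state_locked(struct toy_link *link, unsigned int state)
{
	assert(link->bus_lock->held);
	link->aspm_enabled &= ~state;
}

/* Unlocked variant: takes the lock itself, so it fails if already held. */
int toy_disable_link_state(struct toy_link *link, unsigned int state)
{
	if (toy_lock_acquire(link->bus_lock) != 0)
		return -1;	/* would self-deadlock in the real code */
	toy_disable_link_state_locked(link, state);
	toy_lock_release(link->bus_lock);
	return 0;
}
```

In this analogy, an error-recovery path that runs under the bus lock corresponds to code calling `toy_disable_link_state_locked()` directly, while ordinary callers keep using `toy_disable_link_state()`.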
  5. 21 May, 2011 5 commits
  6. 20 May, 2011 9 commits
    • H
      [media] v4l: Add M420 format definition · 0e59fd05
      Hans de Goede committed
      M420 is a hybrid YUV 4:2:0 packed/planar format. Two Y lines are
      followed by an interleaved U/V line.
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      [laurent.pinchart@ideasonboard.com: split into v4l/uvcvideo patches]
      [laurent.pinchart@ideasonboard.com: add documentation]
      Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      0e59fd05
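The layout described above (two Y lines followed by one interleaved U/V line) translates into simple row-offset arithmetic. The sketch below assumes 8-bit samples, no padding at the end of each line, and a chroma line that occupies the same number of bytes as a luma line; these are assumptions for illustration, and the `m420_*` function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Byte offset of luma row `row`. Every pair of luma rows is followed by
 * one chroma row of the same byte length, so each completed pair adds
 * one extra row to skip.
 */
size_t m420_luma_row_offset(size_t width, size_t row)
{
	return (row + row / 2) * width;
}

/* Byte offset of the U/V row shared by luma rows 2*pair and 2*pair + 1. */
size_t m420_chroma_row_offset(size_t width, size_t pair)
{
	return (3 * pair + 2) * width;
}

/* Total frame size: `height` luma rows plus `height / 2` chroma rows. */
size_t m420_frame_size(size_t width, size_t height)
{
	assert(width % 2 == 0 && height % 2 == 0);
	return width * height * 3 / 2;
}
```

For a 640-pixel-wide frame this places the rows at Y0 = 0, Y1 = 640, UV0 = 1280, Y2 = 1920, Y3 = 2560, UV1 = 3200, and so on.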
    • S
      [media] v4l: Add V4L2_MBUS_FMT_JPEG_1X8 media bus format · 16846524
      Sylwester Nawrocki committed
      Add V4L2_MBUS_FMT_JPEG_1X8 format and the corresponding Docbook
      documentation.
      Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      16846524
    • L
      [media] uvcvideo: Make the API public · 5f708812
      Laurent Pinchart committed
      Move the public API definitions to include/linux/uvcvideo.h and bump the
      version number to 1.1.0. Compatibility with the old API is kept:
      applications can still be compiled against the private header and will
      not break.
      Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      5f708812
    • A
      [media] Add Y10B, a 10 bpp bit-packed greyscale format · 8bb36c21
      Antonio Ospite committed
      Add a 10 bits per pixel greyscale format in a bit-packed array representation,
      naming it Y10B. This pixel format is supplied, for instance, by the Kinect
      sensor device.
      Signed-off-by: Antonio Ospite <ospite@studenti.unina.it>
      Signed-off-by: Jean-François Moine <moinejf@free.fr>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      8bb36c21
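A bit-packed 10 bpp array means four pixels share five bytes. The sketch below unpacks such a buffer into 16-bit samples. It assumes there is no padding between pixels and that the most significant bits of each pixel come first in the byte stream; both points are assumptions here, and `y10b_unpack()` is a hypothetical helper, not a V4L2 function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Unpack `count` 10-bit samples from `src` into `dst`.
 * Assumed layout: samples packed back to back, most significant bit first.
 * `src` must hold at least (count * 10 + 7) / 8 bytes.
 */
void y10b_unpack(const uint8_t *src, uint16_t *dst, size_t count)
{
	uint32_t acc = 0;	/* bit accumulator, newest bits at the bottom */
	unsigned int bits = 0;	/* number of not-yet-consumed bits in acc */
	size_t i;

	assert(src && dst);
	for (i = 0; i < count; i++) {
		/* Pull in bytes until a whole 10-bit sample is available. */
		while (bits < 10) {
			acc = (acc << 8) | *src++;
			bits += 8;
		}
		bits -= 10;
		dst[i] = (acc >> bits) & 0x3ff;
	}
}
```

With this layout the five bytes `FF C0 05 56 AA` decode to the four samples `0x3FF`, `0x000`, `0x155`, `0x2AA`.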
    • N
      sched: Increase SCHED_LOAD_SCALE resolution · c8b28116
      Nikhil Rao committed
      Introduce SCHED_LOAD_RESOLUTION, which is added to
      SCHED_LOAD_SHIFT and increases the resolution of
      SCHED_LOAD_SCALE. This patch sets the value of
      SCHED_LOAD_RESOLUTION to 10, scaling up the weights for all
      sched entities by a factor of 1024. With this extra resolution,
      we can handle deeper cgroup hierarchies and the scheduler can do
      better shares distribution and load balancing on larger
      systems (especially for low weight task groups).
      
      This does not change the existing user interface, the scaled
      weights are only used internally. We do not modify
      prio_to_weight values or inverses, but use the original weights
      when calculating the inverse which is used to scale execution
      time delta in calc_delta_mine(). This ensures we do not lose
      accuracy when accounting time to the sched entities. Thanks to
      Nikunj Dadhania for fixing a bug in c_d_m() that broke fairness.
      
      Below is some analysis of the performance costs/improvements of
      this patch.
      
      1. Micro-arch performance costs:
      
      Experiment was to run Ingo's pipe_test_100k 200 times with the
      task pinned to one cpu. I measured instructions, cycles and
      stalled-cycles for the runs. See:
      
         http://thread.gmane.org/gmane.linux.kernel/1129232/focus=1129389
      
      for more info.
      
      -tip (baseline):
      
       Performance counter stats for '/root/load-scale/pipe-test-100k' (200 runs):
      
             964,991,769 instructions             #    0.82  insns per cycle
                                                  #    0.33  stalled cycles per insn
                                                  #    ( +-  0.05% )
           1,171,186,635 cycles                   #    0.000 GHz                      ( +-  0.08% )
             306,373,664 stalled-cycles-backend   #   26.16% backend  cycles idle     ( +-  0.28% )
             314,933,621 stalled-cycles-frontend  #   26.89% frontend cycles idle     ( +-  0.34% )
      
              1.122405684  seconds time elapsed  ( +-  0.05% )
      
      -tip+patches:
      
       Performance counter stats for './load-scale/pipe-test-100k' (200 runs):
      
             963,624,821 instructions             #    0.82  insns per cycle
                                                  #    0.33  stalled cycles per insn
                                                  #    ( +-  0.04% )
           1,175,215,649 cycles                   #    0.000 GHz                      ( +-  0.08% )
             315,321,126 stalled-cycles-backend   #   26.83% backend  cycles idle     ( +-  0.28% )
             316,835,873 stalled-cycles-frontend  #   26.96% frontend cycles idle     ( +-  0.29% )
      
              1.122238659  seconds time elapsed  ( +-  0.06% )
      
      With this patch, instructions decrease by ~0.10% and cycles
      increase by 0.27%. This doesn't look statistically significant.
      The number of stalled cycles in the backend increased from
      26.16% to 26.83%. This can be attributed to the shifts we do in
      c_d_m() and other places. The fraction of stalled cycles in the
      frontend remains about the same, at 26.96% compared to 26.89% in -tip.
      
      2. Balancing low-weight task groups
      
      Test setup: run 50 tasks with random sleep/busy times (biased
      around 100ms) in a low weight container (with cpu.shares = 2).
      Measure %idle as reported by mpstat over a 10s window.
      
      -tip (baseline):
      
      06:47:48 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle    intr/s
      06:47:49 PM  all   94.32    0.00    0.06    0.00    0.00    0.00    0.00    0.00    5.62  15888.00
      06:47:50 PM  all   94.57    0.00    0.62    0.00    0.00    0.00    0.00    0.00    4.81  16180.00
      06:47:51 PM  all   94.69    0.00    0.06    0.00    0.00    0.00    0.00    0.00    5.25  15966.00
      06:47:52 PM  all   95.81    0.00    0.00    0.00    0.00    0.00    0.00    0.00    4.19  16053.00
      06:47:53 PM  all   94.88    0.06    0.00    0.00    0.00    0.00    0.00    0.00    5.06  15984.00
      06:47:54 PM  all   93.31    0.00    0.00    0.00    0.00    0.00    0.00    0.00    6.69  15806.00
      06:47:55 PM  all   94.19    0.00    0.06    0.00    0.00    0.00    0.00    0.00    5.75  15896.00
      06:47:56 PM  all   92.87    0.00    0.00    0.00    0.00    0.00    0.00    0.00    7.13  15716.00
      06:47:57 PM  all   94.88    0.00    0.00    0.00    0.00    0.00    0.00    0.00    5.12  15982.00
      06:47:58 PM  all   95.44    0.00    0.00    0.00    0.00    0.00    0.00    0.00    4.56  16075.00
      Average:     all   94.49    0.01    0.08    0.00    0.00    0.00    0.00    0.00    5.42  15954.60
      
      -tip+patches:
      
      06:47:03 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle    intr/s
      06:47:04 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  16630.00
      06:47:05 PM  all   99.69    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.31  16580.20
      06:47:06 PM  all   99.69    0.00    0.06    0.00    0.00    0.00    0.00    0.00    0.25  16596.00
      06:47:07 PM  all   99.20    0.00    0.74    0.00    0.00    0.06    0.00    0.00    0.00  17838.61
      06:47:08 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  16540.00
      06:47:09 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  16575.00
      06:47:10 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  16614.00
      06:47:11 PM  all   99.94    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.06  16588.00
      06:47:12 PM  all   99.94    0.00    0.06    0.00    0.00    0.00    0.00    0.00    0.00  16593.00
      06:47:13 PM  all   99.94    0.00    0.06    0.00    0.00    0.00    0.00    0.00    0.00  16551.00
      Average:     all   99.84    0.00    0.09    0.00    0.00    0.01    0.00    0.00    0.06  16711.58
      
      We see an improvement in idle% on the system (drops from 5.42%
      on -tip to 0.06% with the patches).
      Signed-off-by: Nikhil Rao <ncrao@google.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
      Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Cc: Stephan Barwolf <stephan.baerwolf@tu-ilmenau.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1305754668-18792-1-git-send-email-ncrao@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c8b28116
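The benefit of the extra 10 bits for low-weight groups can be seen with plain integer arithmetic. The sketch below is not the scheduler's actual shares calculation; it only shows how a small weight such as cpu.shares = 2 truncates to zero when divided across CPUs, and how scaling it up by 1024 first preserves a usable value. The helper names `scale_load()`, `scale_load_down()` and `per_cpu_share()` are assumptions made for this illustration.

```c
#include <assert.h>

/* Value taken from the commit message: 10 extra bits of resolution. */
#define SCHED_LOAD_RESOLUTION	10

/* Scale a user-visible weight up to the internal, higher resolution. */
unsigned long scale_load(unsigned long w)
{
	return w << SCHED_LOAD_RESOLUTION;
}

/* Scale an internal weight back down to the user-visible range. */
unsigned long scale_load_down(unsigned long w)
{
	return w >> SCHED_LOAD_RESOLUTION;
}

/* Illustrative only: split a group weight evenly with integer division. */
unsigned long per_cpu_share(unsigned long group_weight, unsigned int nr_cpus)
{
	assert(nr_cpus > 0);
	return group_weight / nr_cpus;
}
```

With 8 CPUs, `per_cpu_share(2, 8)` truncates to 0, whereas `per_cpu_share(scale_load(2), 8)` gives 256, so the per-CPU portions stay distinguishable from zero while the user-visible weight (`scale_load_down(scale_load(2))`, which is 2) is unchanged.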
    • N
      sched: Introduce SCHED_POWER_SCALE to scale cpu_power calculations · 1399fa78
      Nikhil Rao committed
      SCHED_LOAD_SCALE is used to increase nice resolution and to
      scale cpu_power calculations in the scheduler. This patch
      introduces SCHED_POWER_SCALE and converts all uses of
      SCHED_LOAD_SCALE for scaling cpu_power to use SCHED_POWER_SCALE
      instead.
      
      This is a preparatory patch for increasing the resolution of
      SCHED_LOAD_SCALE, and there is no need to increase resolution
      for cpu_power calculations.
      Signed-off-by: Nikhil Rao <ncrao@google.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
      Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Cc: Stephan Barwolf <stephan.baerwolf@tu-ilmenau.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Link: http://lkml.kernel.org/r/1305738580-9924-3-git-send-email-ncrao@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1399fa78
    • E
      macvlan: remove one synchronize_rcu() call · 449f4544
      Eric Dumazet committed
      When one macvlan device is dismantled, we can avoid one
      synchronize_rcu() call done after deletion from hash list, since caller
      will perform a synchronize_net() call after its ndo_stop() call.
      
      Add a new netdev->dismantle field to signal this dismantle intent.
      
      Reduces RTNL hold time.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Ben Greear <greearb@candelatech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      449f4544
    • S
      signal.h need a definition of struct task_struct · 1477fcc2
      Stephen Rothwell committed
      This fixes these build errors on powerpc:
      
        In file included from arch/powerpc/mm/fault.c:18:
        include/linux/signal.h:239: error: 'struct task_struct' declared inside parameter list
        include/linux/signal.h:239: error: its scope is only this definition or declaration, which is probably not what you want
        include/linux/signal.h:240: error: 'struct task_struct' declared inside parameter list
        ..
      
      Exposed by commit e66eed65 ("list: remove prefetching from regular
      list iterators"), which removed the include of <linux/prefetch.h> from
      <linux/list.h>.
      
      Without that, linux/signal.h no longer accidentally got the declaration
      of 'struct task_struct'.
      
      Fix by properly declaring the struct, rather than introducing any new
      header file dependency.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1477fcc2
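The fix described above relies on a standard C technique: a file-scope forward declaration is enough for prototypes that only pass pointers to the struct, so no extra header is needed. The sketch below is a self-contained illustration, not the actual signal.h change; the function name and the `pending` field are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration at file scope: the type exists, layout unknown. */
struct task_struct;

/*
 * This prototype compiles with only the forward declaration in view,
 * because it uses a pointer to the incomplete type. Without the line
 * above, the struct would be declared inside the parameter list and be
 * scoped to this prototype only, which is the error quoted in the
 * commit message.
 */
int task_has_pending(const struct task_struct *t);

/* Full definition, which would normally live in a different header. */
struct task_struct {
	unsigned long pending;	/* hypothetical field for this sketch */
};

int task_has_pending(const struct task_struct *t)
{
	assert(t != NULL);
	return t->pending != 0;
}
```

The design choice mirrors the commit's reasoning: declaring the struct costs one line and avoids pulling a heavier header into every file that includes the prototypes.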
    • K
      libata: Power off empty ports · 8a745f1f
      Kristen Carlson Accardi committed
      Give users the option of completely powering off unoccupied
      SATA ports using the existing min_power link_power_management_policy
      option.  When the user selects this option on an empty port, we
      will power the port off by setting DET to off.  For occupied ports,
      behavior is unchanged.
      Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
      8a745f1f