1. 29 Nov 2012, 2 commits
    • cputime: Consolidate cputime adjustment code · d37f761d
      Committed by Frederic Weisbecker
      task_cputime_adjusted() and thread_group_cputime_adjusted()
      essentially share the same code. They just don't use the same
      source:
      
      * The first function uses the cputime in the task struct and the
      previous adjusted snapshot that ensures monotonicity.
      
      * The second adds the cputime of all tasks in the group and the
      previous adjusted snapshot of the whole group from the signal
      structure.
      
      Consolidate the common code that does the adjustment; these
      functions then only need to fetch the values from the appropriate
      source.
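      
      Roughly, the consolidation looks like this (a sketch following the
      changelog; field and helper names may differ slightly from the actual
      patch):
      
      void task_cputime_adjusted(struct task_struct *p,
                                 cputime_t *ut, cputime_t *st)
      {
              struct task_cputime cputime = {
                      .utime = p->utime,
                      .stime = p->stime,
                      .sum_exec_runtime = p->se.sum_exec_runtime,
              };
      
              /* Common adjustment: scale against sum_exec_runtime and keep
               * the result monotonic against the previous snapshot. */
              cputime_adjust(&cputime, &p->prev_cputime, ut, st);
      }
      
      void thread_group_cputime_adjusted(struct task_struct *p,
                                         cputime_t *ut, cputime_t *st)
      {
              struct task_cputime cputime;
      
              thread_group_cputime(p, &cputime);
              cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
      }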
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
    • cputime: Rename thread_group_times to thread_group_cputime_adjusted · e80d0a1a
      Committed by Frederic Weisbecker
      We have thread_group_cputime() and thread_group_times(). The naming
      doesn't provide enough information about the difference between
      these two APIs.
      
      To lower the confusion, rename thread_group_times() to
      thread_group_cputime_adjusted(). This name better suggests that
      it's a version of thread_group_cputime() that does some stabilization
      on the raw cputime values, i.e., here: scaling on top of CFS runtime
      stats and bounding the lower value for monotonicity.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
  2. 10 Nov 2012, 1 commit
    • of/address: sparc: Declare of_address_to_resource() as an extern function for sparc again · 0bce04be
      Committed by Andreas Larsson
      This bug-fix makes sure that of_address_to_resource is defined extern for sparc
      so that the sparc-specific implementation of of_address_to_resource() is once
      again used when including include/linux/of_address.h in a sparc context. A
      number of drivers in mainline rely on this function working for sparc.
      
      The bug was introduced in a850a755, "of/address:
      add empty static inlines for !CONFIG_OF". Contrary to that commit title, the
      static inlines are added for !CONFIG_OF_ADDRESS, and CONFIG_OF_ADDRESS is never
      defined for sparc. This is good behavior for the other functions in
      include/linux/of_address.h, as the extern functions defined in
      drivers/of/address.c only get linked when OF_ADDRESS is configured. However,
      for of_address_to_resource there exists a sparc-specific implementation in
      arch/sparc/kernel/of_device_common.c.
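      
      For illustration, a minimal sketch of the header arrangement the fix
      needs (the exact guards in include/linux/of_address.h may differ; this
      only shows the idea):
      
      #if defined(CONFIG_OF_ADDRESS) || defined(CONFIG_SPARC)
      /* Implemented in drivers/of/address.c, or for sparc in
       * arch/sparc/kernel/of_device_common.c. */
      extern int of_address_to_resource(struct device_node *dev, int index,
                                        struct resource *r);
      #else
      static inline int of_address_to_resource(struct device_node *dev,
                                               int index, struct resource *r)
      {
              return -EINVAL;
      }
      #endif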
      
      Solution suggested by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Andreas Larsson <andreas@gaisler.com>
      Acked-by: Rob Herring <rob.herring@calxeda.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 09 Nov 2012, 1 commit
    • revert "epoll: support for disabling items, and a self-test app" · a80a6b85
      Committed by Andrew Morton
      Revert commit 03a7beb5 ("epoll: support for disabling items, and a
      self-test app") pending resolution of the issues identified by Michael
      Kerrisk, copied below.
      
      We'll revisit this for 3.8.
      
      : I've taken a look at this patch as it currently stands in 3.7-rc1, and
      : done a bit of testing. (By the way, the test program
      : tools/testing/selftests/epoll/test_epoll.c does not compile...)
      :
      : There are one or two places where the behavior seems a little strange,
      : so I have a question or two at the end of this mail. But other than
      : that, I want to check my understanding so that the interface can be
      : correctly documented.
      :
      : Just to go though my understanding, the problem is the following
      : scenario in a multithreaded application:
      :
      : 1. Multiple threads are performing epoll_wait() operations,
      :    and maintaining a user-space cache that contains information
      :    corresponding to each file descriptor being monitored by
      :    epoll_wait().
      :
      : 2. At some point, a thread wants to delete (EPOLL_CTL_DEL)
      :    a file descriptor from the epoll interest list, and
      :    delete the corresponding record from the user-space cache.
      :
      : 3. The problem with (2) is that some other thread may have
      :    previously done an epoll_wait() that retrieved information
      :    about the fd in question, and may be in the middle of using
      :    information in the cache that relates to that fd. Thus,
      :    there is a potential race.
      :
      : 4. The race can't be solved purely in user space, because doing
      :    so would require applying a mutex across the epoll_wait()
      :    call, which would of course blow thread concurrency.
      :
      : Right?
      :
      : Your solution is the EPOLL_CTL_DISABLE operation. I want to
      : confirm my understanding about how to use this flag, since
      : the description that has accompanied the patches so far
      : has been a bit sparse.
      :
      : 0. In the scenario you're concerned about, deleting a file
      :    descriptor means (safely) doing the following:
      :    (a) Deleting the file descriptor from the epoll interest list
      :        using EPOLL_CTL_DEL
      :    (b) Deleting the corresponding record in the user-space cache
      :
      : 1. It's only meaningful to use this EPOLL_CTL_DISABLE in
      :    conjunction with EPOLLONESHOT.
      :
      : 2. Using EPOLL_CTL_DISABLE without using EPOLLONESHOT in
      :    conjunction is a logical error.
      :
      : 3. The correct way to code multithreaded applications using
      :    EPOLL_CTL_DISABLE and EPOLLONESHOT is as follows:
      :
      :    a. All EPOLL_CTL_ADD and EPOLL_CTL_MOD operations should
      :       specify EPOLLONESHOT.
      :
      :    b. When a thread wants to delete a file descriptor, it
      :       should do the following:
      :
      :       [1] Call epoll_ctl(EPOLL_CTL_DISABLE)
      :       [2] If the return status from epoll_ctl(EPOLL_CTL_DISABLE)
      :           was zero, then the file descriptor can be safely
      :           deleted by the thread that made this call.
      :       [3] If the epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY,
      :           then the descriptor is in use. In this case, the calling
      :           thread should set a flag in the user-space cache to
      :           indicate that the thread that is using the descriptor
      :           should perform the deletion operation.
      :
      : Is all of the above correct?
      :
      : The implementation depends on checking whether
      : (events & ~EP_PRIVATE_BITS) == 0
      : This relies on the fact that EPOLL_CTL_ADD and EPOLL_CTL_MOD always
      : set EPOLLHUP and EPOLLERR in the 'events' mask, and EPOLLONESHOT
      : causes those flags (as well as all others in ~EP_PRIVATE_BITS) to be
      : cleared.
      :
      : A corollary to the previous paragraph is that using EPOLL_CTL_DISABLE
      : is only useful in conjunction with EPOLLONESHOT. However, as things
      : stand, one can use EPOLL_CTL_DISABLE on a file descriptor that does
      : not have EPOLLONESHOT set in 'events'. This results in the following
      : (slightly surprising) behavior:
      :
      : (a) The first call to epoll_ctl(EPOLL_CTL_DISABLE) returns 0
      :     (the indicator that the file descriptor can be safely deleted).
      : (b) The next call to epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY.
      :
      : This doesn't seem particularly useful, and in fact is probably an
      : indication that the user made a logic error: they should only be using
      : epoll_ctl(EPOLL_CTL_DISABLE) on a file descriptor for which
      : EPOLLONESHOT was set in 'events'. If that is correct, then would it
      : not make sense to return an error to user space for this case?
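      
      For reference, a minimal user-space sketch of the deletion protocol
      described in the quoted mail (EPOLL_CTL_DISABLE is the since-reverted
      operation under discussion; the cache structure, cache_delete() and the
      pending_delete flag are hypothetical):
      
      /* Caller wants to remove fd from the epoll set and the cache. */
      static void remove_fd(int epfd, int fd, struct cache_entry *ce)
      {
              if (epoll_ctl(epfd, EPOLL_CTL_DISABLE, fd, NULL) == 0) {
                      /* Return value 0: no other thread is using the fd. */
                      epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                      cache_delete(ce);
              } else if (errno == EBUSY) {
                      /* Another thread is using the fd: let that thread
                       * perform the deletion when it is done. */
                      ce->pending_delete = 1;
              }
      }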
      
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: "Paton J. Lewis" <palewis@adobe.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 08 Nov 2012, 3 commits
  5. 07 Nov 2012, 1 commit
  6. 04 Nov 2012, 1 commit
  7. 03 Nov 2012, 1 commit
  8. 01 Nov 2012, 2 commits
    • KVM: x86: fix vcpu->mmio_fragments overflow · 87da7e66
      Committed by Xiao Guangrong
      After commit b3356bf0 (KVM: emulator: optimize "rep ins" handling),
      the pieces of io data can be collected and written to the guest memory
      or MMIO together.
      
      Unfortunately, kvm splits the mmio access into 8-byte pieces and stores
      them in vcpu->mmio_fragments. If the guest uses "rep ins" to move large
      data, it will cause vcpu->mmio_fragments to overflow.
      
      The bug can be exposed by isapc (-M isapc):
      
      [23154.818733] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ ......]
      [23154.858083] Call Trace:
      [23154.859874]  [<ffffffffa04f0e17>] kvm_get_cr8+0x1d/0x28 [kvm]
      [23154.861677]  [<ffffffffa04fa6d4>] kvm_arch_vcpu_ioctl_run+0xcda/0xe45 [kvm]
      [23154.863604]  [<ffffffffa04f5a1a>] ? kvm_arch_vcpu_load+0x17b/0x180 [kvm]
      
      Actually, we can use one mmio_fragment to store a large mmio access,
      then split it when we pass the mmio exit info to userspace. After that,
      we only need two entries to store mmio info for a cross-page mmio access.
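      
      Conceptually, the exit path then hands the single stored fragment to
      userspace in 8-byte pieces, along the lines of this sketch (helper and
      field names are illustrative, not the exact patch):
      
      /* Emit at most 8 bytes of the current fragment per MMIO exit. */
      static void fill_mmio_exit(struct kvm_run *run,
                                 struct kvm_mmio_fragment *frag, bool is_write)
      {
              unsigned int len = min_t(unsigned int, frag->len, 8);
      
              run->exit_reason    = KVM_EXIT_MMIO;
              run->mmio.phys_addr = frag->gpa;
              run->mmio.len       = len;
              run->mmio.is_write  = is_write;
              if (is_write)
                      memcpy(run->mmio.data, frag->data, len);
      
              /* Advance; the remainder goes out on the next exit. */
              frag->data += len;
              frag->gpa  += len;
              frag->len  -= len;
      }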
      Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • xen/mmu: Use Xen specific TLB flush instead of the generic one. · 95a7d768
      Committed by Konrad Rzeszutek Wilk
      As Mukesh explained it, the MMUEXT_TLB_FLUSH_ALL allows the
      hypervisor to do a TLB flush on all active vCPUs. If instead
      we use the generic one (which ends up being xen_flush_tlb),
      we end up making the MMUEXT_TLB_FLUSH_LOCAL hypercall. But
      before we make that hypercall the kernel will IPI all of the
      vCPUs (even those that were asleep from the hypervisor
      perspective). The end result is that we needlessly wake them
      up and do a TLB flush when we can just let the hypervisor
      do it correctly.
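      
      A sketch of the Xen-specific flush this enables (close to, but not
      necessarily identical to, the helper added by the patch):
      
      static void xen_flush_tlb_all(void)
      {
              struct mmuext_op op = { .cmd = MMUEXT_TLB_FLUSH_ALL };
      
              /* One hypercall: Xen flushes the TLB on every active vCPU,
               * so we never IPI vCPUs the hypervisor has put to sleep. */
              if (HYPERVISOR_mmuext_op(&op, 1, NULL, DOMID_SELF))
                      BUG();
      }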
      
      This patch gives around 50% speed improvement when migrating
      idle guests from one host to another.
      
      Oracle-bug: 14630170
      
      CC: stable@vger.kernel.org
      Tested-by: Jingjie Jiang <jingjie.jiang@oracle.com>
      Suggested-by: Mukesh Rathor <mukesh.rathor@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  9. 30 Oct 2012, 6 commits
    • ALSA: Add a reference counter to card instance · a0830dbd
      Committed by Takashi Iwai
      For stricter protection against wild disconnections, a refcount is
      introduced to the card instance, and it is taken/released when an
      object is referred to via snd_lookup_*() in the open ops.
      
      The free-after-last-close check is also changed to test this refcount
      instead of the empty list.
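      
      The idea, as a generic sketch (the structure, field and helper names
      here are hypothetical, not the actual ALSA API):
      
      struct card_like {
              atomic_t refcount;
              bool     free_pending;    /* last file already closed */
      };
      
      static bool card_get(struct card_like *c)
      {
              /* Fails once the card is disconnected and dropped to zero. */
              return atomic_inc_not_zero(&c->refcount);
      }
      
      static void card_put(struct card_like *c)
      {
              /* Free-after-last-close is now keyed on the refcount. */
              if (atomic_dec_and_test(&c->refcount) && c->free_pending)
                      do_free(c);       /* do_free() is hypothetical too */
      }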
      Reported-by: Matthieu CASTET <matthieu.castet@parrot.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • cputime: Separate irqtime accounting from generic vtime · 3e1df4f5
      Committed by Frederic Weisbecker
      vtime_account() doesn't have the same role in
      CONFIG_VIRT_CPU_ACCOUNTING and CONFIG_IRQ_TIME_ACCOUNTING.
      
      In the first case it handles time accounting in any context. In
      the second case it only handles irq time accounting.
      
      So when vtime_account() is called from outside vtime_account_irq_*(),
      this call is pointless under CONFIG_IRQ_TIME_ACCOUNTING.
      
      To fix the confusion, change vtime_account() to irqtime_account_irq()
      in CONFIG_IRQ_TIME_ACCOUNTING. This way we ensure future account_vtime()
      calls won't waste useless cycles in the irqtime APIs.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
    • cputime: Specialize irq vtime hooks · fa5058f3
      Committed by Frederic Weisbecker
      With CONFIG_VIRT_CPU_ACCOUNTING, when vtime_account()
      is called in irq entry/exit, we perform a check on the
      context: if we are interrupting the idle task we
      account the pending cputime to idle, otherwise account
      to system time or its sub-areas: tsk->stime, hardirq time,
      softirq time, ...
      
      However this check for idle only concerns the hardirq entry
      and softirq entry:
      
      * Hardirq may directly interrupt the idle task, in which case
      we need to flush the pending CPU time to idle.
      
      * The idle task may be directly interrupted by a softirq if
      it calls local_bh_enable(). There is probably no such call
      in any idle task but we need to cover every case. Ksoftirqd
      is not concerned because the idle time is flushed on context
      switch, and softirqs at the end of a hardirq have the idle time
      already flushed by the hardirq entry.
      
      In the other cases we always account to system/irq time:
      
      * On hardirq exit we account the time to hardirq time.
      * On softirq exit we account the time to softirq time.
      
      To optimize this and avoid the indirect call to vtime_account()
      and the checks it performs, specialize the vtime irq APIs and
      only perform the check on irq entry. Irq exit can directly call
      vtime_account_system().
      
      CONFIG_IRQ_TIME_ACCOUNTING behaviour doesn't change and directly
      maps to its own vtime_account() implementation. One may want
      to benefit from the new APIs to optimize irq time accounting
      as well in the future.
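      
      The resulting split, roughly (a sketch for the CONFIG_VIRT_CPU_ACCOUNTING
      case; hook names follow the changelog, exact signatures are an
      assumption):
      
      static inline void vtime_account_irq_enter(struct task_struct *tsk)
      {
              /* Entry may interrupt the idle task: keep the idle check. */
              vtime_account(tsk);
      }
      
      static inline void vtime_account_irq_exit(struct task_struct *tsk)
      {
              /* By exit time the interrupted context is known not to be idle. */
              vtime_account_system(tsk);
      }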
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
    • kvm: Directly account vtime to system on guest switch · b080935c
      Committed by Frederic Weisbecker
      Switching to or from guest context is done on ioctl context.
      So by the time we call kvm_guest_enter() or kvm_guest_exit()
      we know we are not running the idle task.
      
      As a result, we can directly account the cputime using
      vtime_account_system().
      
      There are a few good reasons to do this:
      
      * We avoid some useless checks on guest switch. It optimizes
      a bit this fast path.
      
      * In the case of CONFIG_IRQ_TIME_ACCOUNTING, calling vtime_account()
      checks for irq time to account. This is pointless since we know
      we are not in an irq on guest switch. This is wasting cpu cycles
      for no good reason. vtime_account_system() OTOH is a no-op in
      this config option.
      
      * We can remove the irq disable/enable around kvm guest switch in s390.
      
      A further optimization may consist in introducing a vtime_account_guest()
      that directly calls account_guest_time().
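      
      In guest-switch terms this amounts to something like the following
      (simplified sketch; the real kvm_guest_enter()/kvm_guest_exit() do
      more):
      
      static inline void kvm_guest_enter(void)
      {
              /* ioctl context, never the idle task: account straight to
               * system time, no context check needed. */
              vtime_account_system(current);
              current->flags |= PF_VCPU;
      }
      
      static inline void kvm_guest_exit(void)
      {
              vtime_account_system(current);
              current->flags &= ~PF_VCPU;
      }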
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Xiantao Zhang <xiantao.zhang@intel.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
    • vtime: Make vtime_account_system() irqsafe · 11113334
      Committed by Frederic Weisbecker
      vtime_account_system() currently has only one caller, vtime_account(),
      which is irq safe.
      
      Now we are going to call it from other places like kvm where
      irqs are not always disabled by the time we account the cputime.
      
      So let's make it irqsafe. The arch implementation part is now
      prefixed with "__".
      
      vtime_account_idle() arch implementation is prefixed accordingly
      to stay consistent.
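      
      The wrapper pattern this introduces, as a sketch (the "__"-prefixed
      arch hooks are per the changelog; exact placement is an assumption):
      
      void vtime_account_system(struct task_struct *tsk)
      {
              unsigned long flags;
      
              local_irq_save(flags);
              __vtime_account_system(tsk);    /* arch implementation */
              local_irq_restore(flags);
      }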
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
    • vtime: Gather vtime declarations to their own header file · dcbf832e
      Committed by Frederic Weisbecker
      These APIs are scattered around and are going to expand a bit.
      Let's create a dedicated header file for sanity.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
  10. 29 Oct 2012, 2 commits
  11. 27 Oct 2012, 1 commit
    • mac80211: verify that skb data is present · 9b395bc3
      Committed by Johannes Berg
      A number of places in the mesh code don't check that
      the frame data is present and in the skb header when
      trying to access it. Add those checks and the necessary
      pskb_may_pull() calls. This prevents accessing data
      that doesn't actually exist.
      
      To do this, export ieee80211_get_mesh_hdrlen() to be
      able to use it in mac80211.
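      
      The added checks follow the usual pskb_may_pull() pattern, roughly
      like this sketch (offsets and the drop action are illustrative, not the
      exact mac80211 hunks):
      
      /* Make sure the mesh header really is in the linear skb data
       * before dereferencing it. */
      if (!pskb_may_pull(skb, hdrlen + sizeof(struct ieee80211s_hdr)))
              return RX_DROP_MONITOR;
      
      mesh_hdr = (struct ieee80211s_hdr *)(skb->data + hdrlen);
      mesh_hdrlen = ieee80211_get_mesh_hdrlen(mesh_hdr);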
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  12. 26 Oct 2012, 1 commit
  13. 25 Oct 2012, 2 commits
  14. 24 Oct 2012, 7 commits
  15. 23 Oct 2012, 3 commits
  16. 22 Oct 2012, 1 commit
  17. 20 Oct 2012, 4 commits
  18. 19 Oct 2012, 1 commit