1. 10 Nov, 2012 2 commits
    • cgroup: use rculist ops for cgroup->children · eb6fd504
      Tejun Heo committed
      Use RCU safe list operations for cgroup->children.  This will be used
      to implement cgroup children / descendant walking which can be used by
      controllers.
      
      Note that cgroup_create() now puts a new cgroup at the end of the
      ->children list instead of head.  This isn't strictly necessary but is
      done so that the iteration order is more conventional.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
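The ordering note above (appending at the tail so children are visited in creation order) can be illustrated with a small userspace list. This is only a sketch: `list_node`, `list_push_head`, `list_push_tail`, `child` and `collect_ids` are hypothetical names, and nothing here models RCU or the kernel's actual list helpers.

```c
#include <stddef.h>

/* Userspace stand-in for a circular doubly linked list head (no RCU). */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
    head->prev = head;
    head->next = head;
}

/* Link new_node between two nodes that are currently adjacent. */
static void list_link(struct list_node *new_node,
                      struct list_node *prev, struct list_node *next)
{
    new_node->prev = prev;
    new_node->next = next;
    prev->next = new_node;
    next->prev = new_node;
}

/* Insert right after the head: the newest entry is visited first. */
static void list_push_head(struct list_node *new_node, struct list_node *head)
{
    list_link(new_node, head, head->next);
}

/* Insert right before the head: entries are visited in creation order. */
static void list_push_tail(struct list_node *new_node, struct list_node *head)
{
    list_link(new_node, head->prev, head);
}

/* Hypothetical child record; the embedded node must stay the first member. */
struct child {
    struct list_node sibling;
    int id;
};

/* Copy ids in iteration order into out[]; returns how many were written. */
static size_t collect_ids(struct list_node *head, int *out, size_t cap)
{
    size_t n = 0;
    struct list_node *p;

    for (p = head->next; p != head && n < cap; p = p->next)
        out[n++] = ((struct child *)p)->id;
    return n;
}
```

Pushing ids 1, 2, 3 with `list_push_tail()` yields them in that order on a forward walk, while `list_push_head()` yields 3, 2, 1.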
    • cgroup: add cgroup_subsys->post_create() · a8638030
      Tejun Heo committed
      Currently, there's no way for a controller to find out whether a new
      cgroup finished all ->create() allocations successfully and is
      considered "live" by cgroup.
      
      This becomes a problem later when we add generic descendant walking
      to cgroup, which can be used by controllers, as controllers don't have a
      synchronization point where they can synchronize against new cgroups
      appearing in such walks.
      
      This patch adds ->post_create().  It's called after all ->create()
      succeeded and the cgroup is linked into the generic cgroup hierarchy.
      This plays the counterpart of ->pre_destroy().
      
      When used in combination with the to-be-added generic descendant
      iterators, ->post_create() can be used to implement reliable state
      inheritance.  It will be explained with the descendant iterators.
      
      v2: Added a paragraph about its future use w/ descendant iterators per
          Michal.
      
      v3: Forgot to add ->post_create() invocation to cgroup_load_subsys().
          Fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Glauber Costa <glommer@parallels.com>
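The call ordering described above (every ->create() succeeds, the group is linked, then ->post_create() runs) might look roughly like the following userspace sketch. All types and names here (`group`, `controller_ops`, `group_create`, the `demo_*` callbacks) are hypothetical stand-ins, not the kernel's cgroup structures, and error rollback is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup; not the kernel's struct cgroup. */
struct group {
    int  created;     /* number of create callbacks that succeeded */
    bool linked;      /* set once the group is visible in the hierarchy */
    bool saw_linked;  /* what a post_create callback observed */
};

struct controller_ops {
    int  (*create)(struct group *g);       /* may fail; 0 on success */
    void (*post_create)(struct group *g);  /* optional; cannot fail */
};

/* Run all create callbacks, link the group, then notify controllers. */
static int group_create(struct group *g,
                        const struct controller_ops *ops, size_t nr_ops)
{
    size_t i;

    for (i = 0; i < nr_ops; i++)
        if (ops[i].create(g) != 0)
            return -1;  /* a real implementation would undo earlier creates */

    g->linked = true;   /* the group is now considered "live" */

    for (i = 0; i < nr_ops; i++)
        if (ops[i].post_create)
            ops[i].post_create(g);
    return 0;
}

/* Demo callbacks used to observe the ordering. */
static int demo_create(struct group *g) { g->created++; return 0; }
static int demo_create_fail(struct group *g) { (void)g; return -1; }
static void demo_post_create(struct group *g) { g->saw_linked = g->linked; }
```

In this sketch a controller's `post_create` callback never runs for a group whose creation failed, and it always sees `linked == true` when it does run.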
  2. 06 Nov, 2012 4 commits
    • cgroup: make ->pre_destroy() return void · bcf6de1b
      Tejun Heo committed
      All ->pre_destroy() implementations return 0 now, which is the only
      allowed return value.  Make it return void.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
    • cgroup: remove CGRP_WAIT_ON_RMDIR, cgroup_exclude_rmdir() and cgroup_release_and_wakeup_rmdir() · b25ed609
      Tejun Heo committed
      CGRP_WAIT_ON_RMDIR is another kludge which was added to make cgroup
      destruction rollback somewhat working.  cgroup_rmdir() used to drain
      CSS references and CGRP_WAIT_ON_RMDIR and the associated waitqueue and
      helpers were used to allow the task performing rmdir to wait for the
      next relevant event.
      
      Unfortunately, the wait is visible to controllers too and the
      mechanism got exposed to memcg by 88703267 ("cgroup avoid permanent
      sleep at rmdir").
      
      Now that the draining and retries are gone, CGRP_WAIT_ON_RMDIR is
      unnecessary.  Remove it and all the mechanisms supporting it.  Note
      that the memcontrol.c changes are essentially a revert of 88703267
      ("cgroup avoid permanent sleep at rmdir").
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
    • cgroup: kill CSS_REMOVED · e9316080
      Tejun Heo committed
      CSS_REMOVED is one of the several contortions which were necessary to
      support css reference draining on cgroup removal.  All css->refcnts
      which need draining should be deactivated and verified to equal zero
      atomically w.r.t. css_tryget().  If any one isn't zero, all refcnts
      needed to be re-activated and css_tryget() shouldn't fail in the
      process.
      
      This was achieved by letting css_tryget() busy-loop until either the
      refcnt is reactivated (failed removal attempt) or CSS_REMOVED is set
      (committing to removal).
      
      Now that css refcnt draining is no longer used, there's no need for
      atomic rollback mechanism.  css_tryget() simply can look at the
      reference count and fail if it's deactivated - it's never getting
      re-activated.
      
      This patch removes CSS_REMOVED and updates __css_tryget() to fail if
      the refcnt is deactivated.  As deactivation and removal are a single
      step now, they no longer need to be protected against css_tryget()
      happening from irq context.  Remove local_irq_disable/enable() from
      cgroup_rmdir().
      
      Note that this removes css_is_removed() whose only user is VM_BUG_ON()
      in memcontrol.c.  We can replace it with a check on the refcnt but
      given that the only use case is a debug assert, I think it's better to
      simply unexport it.
      
      v2: Comment updated and explanation on local_irq_disable/enable()
          added per Michal Hocko.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
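The simplified rule described above (a tryget fails once the count is deactivated, and deactivation is never undone) can be sketched in userspace with C11 atomics. This is an assumption-laden illustration: `struct ref`, `REF_DEACT_BIAS` and the `ref_*` helpers are hypothetical names, and encoding "deactivated" as a negative count is a choice made for this sketch, not a statement about the kernel's implementation.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bias: adding it makes any sane positive count negative. */
#define REF_DEACT_BIAS INT_MIN

struct ref {
    atomic_int cnt;  /* negative means "deactivated" in this sketch */
};

/* Start with one base reference. */
static void ref_init(struct ref *r)
{
    atomic_init(&r->cnt, 1);
}

/* One-way switch: after this, ref_tryget() can only fail. */
static void ref_deactivate(struct ref *r)
{
    atomic_fetch_add(&r->cnt, REF_DEACT_BIAS);
}

/*
 * Take a reference unless the count is deactivated. There is no waiting for
 * a possible re-activation: a negative count is final, so the loop only
 * retries when another thread changed the count concurrently.
 */
static bool ref_tryget(struct ref *r)
{
    int v = atomic_load(&r->cnt);

    while (v >= 0) {
        /* On failure, v is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&r->cnt, &v, v + 1))
            return true;
    }
    return false;
}
```

Because deactivation is a single irreversible step in this sketch, `ref_tryget()` never has to spin waiting for a rollback decision.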
    • cgroup: kill cgroup_subsys->__DEPRECATED_clear_css_refs · ed957793
      Tejun Heo committed
      2ef37d3f ("memcg: Simplify mem_cgroup_force_empty_list error
      handling") removed the last user of __DEPRECATED_clear_css_refs.  This
      patch removes __DEPRECATED_clear_css_refs and mechanisms to support
      it.
      
      * Conditionals dependent on __DEPRECATED_clear_css_refs removed.
      
      * cgroup_clear_css_refs() can no longer fail.  All that needs to be
        done is deactivating refcnts, setting CSS_REMOVED and putting the
        base reference on each css.  Remove cgroup_clear_css_refs() and the
        failure path, and open-code the loops into cgroup_rmdir().
      
      This patch keeps the two for_each_subsys() loops separate while open
      coding them.  They can be merged now but there are scheduled changes
      which need them to be separate, so keep them separate to reduce the
      amount of churn.
      
      local_irq_save/restore() from cgroup_clear_css_refs() are replaced
      with local_irq_disable/enable() for simplicity.  This is safe as
      cgroup_rmdir() is always called with IRQ enabled.  Note that this IRQ
      switching is necessary to ensure that css_tryget() isn't called from
      IRQ context on the same CPU while lower context is between CSS
      deactivation and setting CSS_REMOVED as css_tryget() would hang
      forever in such cases waiting for CSS to be re-activated or
      CSS_REMOVED set.  This will go away soon.
      
      v2: cgroup_call_pre_destroy() removal dropped per Michal.  Commit
          message updated to explain local_irq_disable/enable() conversion.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
  3. 03 Nov, 2012 1 commit
  4. 01 Nov, 2012 2 commits
    • KVM: x86: fix vcpu->mmio_fragments overflow · 87da7e66
      Xiao Guangrong committed
      After commit b3356bf0 (KVM: emulator: optimize "rep ins" handling),
      the pieces of io data can be collected and written to the guest memory
      or MMIO together.
      
      Unfortunately, kvm splits the mmio access into 8-byte pieces and stores
      them in vcpu->mmio_fragments. If the guest uses "rep ins" to move large
      data, it will cause vcpu->mmio_fragments to overflow.
      
      The bug can be exposed by isapc (-M isapc):
      
      [23154.818733] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ ......]
      [23154.858083] Call Trace:
      [23154.859874]  [<ffffffffa04f0e17>] kvm_get_cr8+0x1d/0x28 [kvm]
      [23154.861677]  [<ffffffffa04fa6d4>] kvm_arch_vcpu_ioctl_run+0xcda/0xe45 [kvm]
      [23154.863604]  [<ffffffffa04f5a1a>] ? kvm_arch_vcpu_load+0x17b/0x180 [kvm]
      
      Actually, we can use one mmio_fragment to store a large mmio access and
      then split it when we pass the mmio-exit-info to userspace. After that,
      we only need two entries to store mmio info for an access that crosses
      mmio pages.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
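The idea in the last paragraph of the message above (keep one large fragment and carve it into pieces only when handing data to userspace) can be sketched as follows. The 8-byte piece size comes from the commit text; `struct fragment`, `EXIT_CHUNK` and `fragment_next_chunk` are hypothetical names for this illustration and are not KVM's data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes handed to userspace per exit; the commit text mentions 8 bytes. */
#define EXIT_CHUNK 8u

/* Hypothetical record of one pending access: start address and length. */
struct fragment {
    uint64_t gpa;  /* guest physical address of the next unhandled byte */
    size_t   len;  /* bytes still to be handled */
};

/*
 * Carve the next piece off the front of the fragment. Stores the piece's
 * address in *gpa_out and returns its size; returns 0 once the fragment
 * has been fully consumed.
 */
static size_t fragment_next_chunk(struct fragment *f, uint64_t *gpa_out)
{
    size_t n = f->len < EXIT_CHUNK ? f->len : EXIT_CHUNK;

    *gpa_out = f->gpa;
    f->gpa += n;
    f->len -= n;
    return n;
}
```

With this shape, the number of stored records depends on how many regions an access touches rather than on its length, which is the property the fix relies on.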
    • xen/mmu: Use Xen specific TLB flush instead of the generic one. · 95a7d768
      Konrad Rzeszutek Wilk committed
      As Mukesh explained it, the MMUEXT_TLB_FLUSH_ALL allows the
      hypervisor to do a TLB flush on all active vCPUs. If instead
      we were using the generic one (which ends up being xen_flush_tlb)
      we end up making the MMUEXT_TLB_FLUSH_LOCAL hypercall. But
      before we make that hypercall the kernel will IPI all of the
      vCPUs (even those that were asleep from the hypervisor
      perspective). The end result is that we needlessly wake them
      up and do a TLB flush when we can just let the hypervisor
      do it correctly.
      
      This patch gives around a 50% speed improvement when migrating
      idle guests from one host to another.
      
      Oracle-bug: 14630170
      
      CC: stable@vger.kernel.org
      Tested-by: Jingjie Jiang <jingjie.jiang@oracle.com>
      Suggested-by: Mukesh Rathor <mukesh.rathor@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  5. 30 Oct, 2012 1 commit
  6. 29 Oct, 2012 2 commits
  7. 27 Oct, 2012 2 commits
    • freezer: change ptrace_stop/do_signal_stop to use freezable_schedule() · 5d8f72b5
      Oleg Nesterov committed
      try_to_freeze_tasks() and cgroup_freezer rely on scheduler locks
      to ensure that a task doing STOPPED/TRACED -> RUNNING transition
      can't escape freezing. This mostly works, but ptrace_stop() does
      not necessarily call schedule(), it can change task->state back to
      RUNNING and check freezing() without any lock/barrier in between.
      
      We could add the necessary barrier, but this patch changes
      ptrace_stop() and do_signal_stop() to use freezable_schedule().
      This fixes the race: freezer_count() and freezer_should_skip()
      carefully avoid it.
      
      And this simplifies the code, try_to_freeze_tasks/update_if_frozen
      no longer need to use task_is_stopped_or_traced() checks with the
      non trivial assumptions. We can rely on the mechanism which was
      specially designed to mark the sleeping task as "frozen enough".
      
      v2: As Tejun pointed out, we can also change get_signal_to_deliver()
      and move try_to_freeze() up before 'relock' label.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • mac80211: verify that skb data is present · 9b395bc3
      Johannes Berg committed
      A number of places in the mesh code don't check that
      the frame data is present and in the skb header when
      trying to access it. Add those checks and the necessary
      pskb_may_pull() calls. This prevents accessing data
      that doesn't actually exist.
      
      To do this, export ieee80211_get_mesh_hdrlen() to be
      able to use it in mac80211.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
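The pattern the message describes (confirm the bytes exist before reading a variable-length header) can be sketched in plain C. Everything here is hypothetical: `struct frame`, `frame_has`, `hdr_len_from_flags` and `parse_hdr_len` are made-up names, and the header layout (a flags byte whose low two bits select optional 6-byte extensions) is an assumption for the example rather than the actual mesh header format or the skb API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of received data: a pointer plus the bytes available. */
struct frame {
    const uint8_t *data;
    size_t         len;
};

/* Nonzero if at least `need` bytes are present at the start of the frame. */
static int frame_has(const struct frame *f, size_t need)
{
    return f->len >= need;
}

/*
 * Assumed layout for this sketch: a 6-byte fixed part whose first byte holds
 * flags; the low two bits give the number of optional 6-byte extensions.
 */
static size_t hdr_len_from_flags(uint8_t flags)
{
    return 6 + 6 * (size_t)(flags & 0x3);
}

/*
 * Return the header length, or 0 if the frame is too short. Two checks are
 * needed: one before reading the flags byte, and one before trusting the
 * length that the flags imply.
 */
static size_t parse_hdr_len(const struct frame *f)
{
    size_t len;

    if (!frame_has(f, 1))
        return 0;
    len = hdr_len_from_flags(f->data[0]);
    if (!frame_has(f, len))
        return 0;
    return len;
}
```

The second check is the easy one to forget, since the claimed header length is itself read from data that may be truncated.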
  8. 26 Oct, 2012 1 commit
  9. 25 Oct, 2012 2 commits
  10. 24 Oct, 2012 2 commits
  11. 23 Oct, 2012 3 commits
  12. 22 Oct, 2012 1 commit
  13. 20 Oct, 2012 4 commits
  14. 19 Oct, 2012 1 commit
  15. 18 Oct, 2012 4 commits
    • of/irq: sparse fixes · d2e41518
      Kim Phillips committed
      drivers/of/irq.c:195:57: warning: restricted __be32 degrades to integer
      drivers/of/irq.c:196:51: warning: restricted __be32 degrades to integer
      drivers/of/irq.c:199:57: warning: restricted __be32 degrades to integer
      drivers/of/irq.c:201:58: warning: restricted __be32 degrades to integer
      drivers/of/irq.c:470:37: warning: incorrect type in assignment (different modifiers)
      drivers/of/irq.c:470:37:    expected int ( *[usertype] irq_init_cb )( ... )
      drivers/of/irq.c:470:37:    got void const *const data
      drivers/of/irq.c:96:5: error: symbol 'of_irq_map_raw' redeclared with different type (originally declared at include/linux/of_irq.h:61) - incompatible argument 2 (different base types)
      
      drivers/of/of_pci_irq.c:91:40: warning: incorrect type in argument 2 (different base types)
      drivers/of/of_pci_irq.c:91:40:    expected unsigned int const [usertype] *intspec
      drivers/of/of_pci_irq.c:91:40:    got restricted __be32 *<noident>
      drivers/of/of_pci_irq.c:91:53: warning: incorrect type in argument 4 (different base types)
      drivers/of/of_pci_irq.c:91:53:    expected unsigned int const [usertype] *addr
      drivers/of/of_pci_irq.c:91:53:    got restricted __be32 *<noident>
      Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
      Signed-off-by: Rob Herring <rob.herring@calxeda.com>
    • of/address: sparse fixes · 47b1e689
      Kim Phillips committed
      drivers/of/address.c:66:29: warning: incorrect type in argument 1 (different base types)
      drivers/of/address.c:66:29:    expected restricted __be32 const [usertype] *cell
      drivers/of/address.c:66:29:    got unsigned int [usertype] *addr
      drivers/of/address.c:87:32: warning: incorrect type in argument 1 (different base types)
      drivers/of/address.c:87:32:    expected restricted __be32 const [usertype] *cell
      drivers/of/address.c:87:32:    got unsigned int [usertype] *addr
      drivers/of/address.c:91:30: warning: incorrect type in assignment (different base types)
      drivers/of/address.c:91:30:    expected unsigned int [unsigned] [usertype] <noident>
      drivers/of/address.c:91:30:    got restricted __be32 [usertype] <noident>
      drivers/of/address.c:92:22: warning: incorrect type in assignment (different base types)
      drivers/of/address.c:92:22:    expected unsigned int [unsigned] [usertype] <noident>
      drivers/of/address.c:92:22:    got restricted __be32 [usertype] <noident>
      drivers/of/address.c:147:35: warning: incorrect type in argument 1 (different base types)
      drivers/of/address.c:147:35:    expected restricted __be32 const [usertype] *addr
      drivers/of/address.c:147:35:    got unsigned int [usertype] *addr
      drivers/of/address.c:157:34: warning: incorrect type in argument 1 (different base types)
      drivers/of/address.c:157:34:    expected restricted __be32 const [usertype] *cell
      drivers/of/address.c:157:34:    got unsigned int [usertype] *
      drivers/of/address.c:256:29: warning: restricted __be32 degrades to integer
      drivers/of/address.c:256:36: warning: restricted __be32 degrades to integer
      drivers/of/address.c:262:34: warning: incorrect type in argument 1 (different base types)
      drivers/of/address.c:262:34:    expected restricted __be32 const [usertype] *cell
      drivers/of/address.c:262:34:    got unsigned int [usertype] *
      drivers/of/address.c:372:41: warning: incorrect type in argument 1 (different base types)
      drivers/of/address.c:372:41:    expected restricted __be32 const [usertype] *cell
      drivers/of/address.c:372:41:    got unsigned int [usertype] *addr
      drivers/of/address.c:395:53: warning: incorrect type in argument 2 (different base types)
      drivers/of/address.c:395:53:    expected restricted __be32 const [usertype] *addr
      drivers/of/address.c:395:53:    got unsigned int [usertype] *addr
      drivers/of/address.c:443:50: warning: incorrect type in argument 2 (different base types)
      drivers/of/address.c:443:50:    expected restricted __be32 const [usertype] *addr
      drivers/of/address.c:443:50:    got unsigned int *<noident>
      drivers/of/address.c:455:49: warning: incorrect type in argument 1 (different base types)
      drivers/of/address.c:455:49:    expected restricted __be32 const [usertype] *cell
      drivers/of/address.c:455:49:    got unsigned int *<noident>
      drivers/of/address.c:480:60: warning: incorrect type in argument 2 (different base types)
      drivers/of/address.c:480:60:    expected restricted __be32 const [usertype] *addr
      drivers/of/address.c:480:60:    got unsigned int *<noident>
      drivers/of/address.c:412:5: warning: symbol '__of_translate_address' was not declared. Should it be static?
      drivers/of/address.c:520:14: error: symbol 'of_get_address' redeclared with different type (originally declared at include/linux/of_address.h:22) - different base types
      Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
      Signed-off-by: Rob Herring <rob.herring@calxeda.com>
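The warnings above are about big-endian 32-bit cells being treated as plain integers. A userspace way to get a similar guarantee from the C compiler itself is to wrap the raw bytes in a struct so that a cell cannot be used arithmetically without an explicit conversion. The names `be32_cell`, `be32_cell_to_host` and `read_number` are hypothetical and only illustrate the idea; they are not the kernel's `__be32` annotations or its device-tree helpers.

```c
#include <stdint.h>

/* Opaque wrapper: a big-endian cell cannot be used as an integer by mistake. */
typedef struct {
    uint8_t b[4];  /* most significant byte first */
} be32_cell;

/* Explicit conversion from big-endian storage to a host integer. */
static uint32_t be32_cell_to_host(be32_cell c)
{
    return ((uint32_t)c.b[0] << 24) | ((uint32_t)c.b[1] << 16) |
           ((uint32_t)c.b[2] << 8)  |  (uint32_t)c.b[3];
}

/* Combine `size` consecutive cells (most significant first) into one value. */
static uint64_t read_number(const be32_cell *cell, int size)
{
    uint64_t r = 0;

    while (size-- > 0)
        r = (r << 32) | be32_cell_to_host(*cell++);
    return r;
}
```

Writing `cell[0] + 1` with this type is a compile error, which is roughly the class of mistake the sparse annotations are meant to flag.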
    • of: add stub of_get_child_by_name for non-OF builds · 25c040c9
      Olof Johansson committed
      Fixes build error on s3c6400_defconfig, introduced by commit
      06455bbc, "dt/s3c64xx/spi: Use
      of_get_child_by_name to get a named child".
      
      drivers/spi/spi-s3c64xx.c: In function 's3c64xx_get_slave_ctrldata':
      drivers/spi/spi-s3c64xx.c:838:2: error: implicit declaration of function
          'of_get_child_by_name' [-Werror=implicit-function-declaration]
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Rob Herring <rob.herring@calxeda.com>
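The usual shape of such a fix is a header that declares the real function when the feature is built in and provides an inline stub otherwise, so callers compile either way. The sketch below shows that pattern with hypothetical names: `HAVE_OF_SUPPORT` stands in for the real configuration symbol and `get_child_by_name` for the real function, and the stub's return value is an assumption of this example.

```c
#include <stddef.h>

struct device_node;  /* opaque to callers in this sketch */

#ifdef HAVE_OF_SUPPORT  /* hypothetical stand-in for the real config option */
/* Real implementation lives elsewhere when the feature is enabled. */
struct device_node *get_child_by_name(const struct device_node *node,
                                      const char *name);
#else
/* Stub for builds without the feature: there are never any children. */
static inline struct device_node *
get_child_by_name(const struct device_node *node, const char *name)
{
    (void)node;
    (void)name;
    return NULL;
}
#endif
```

Callers then need no `#ifdef` of their own; without the feature they simply see "no such child".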
    • USB: usb.h: remove dbg() macro · 2c78040c
      Greg Kroah-Hartman committed
      There are no users of this macro anymore in the kernel tree, so finally
      delete it.
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  16. 17 Oct, 2012 8 commits