1. 10 Nov, 2012 4 commits
    • cgroup_freezer: trivial cleanups · bcd66c89
      Committed by Tejun Heo
      * Clean-up indentation and line-breaks.  Drop the invalid comment
        about freezer->lock.
      
      * Make all internal functions take @freezer instead of both @cgroup
        and @freezer.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
    • cgroup: implement generic child / descendant walk macros · 574bd9f7
      Committed by Tejun Heo
      Currently, cgroup doesn't provide any generic helper for walking a
      given cgroup's children or descendants.  This patch adds the following
      three macros.
      
      * cgroup_for_each_child() - walk immediate children of a cgroup.
      
      * cgroup_for_each_descendant_pre() - visit all descendants of a cgroup
        in pre-order tree traversal.
      
      * cgroup_for_each_descendant_post() - visit all descendants of a
        cgroup in post-order tree traversal.
      
      All three only require the user to hold RCU read lock during
      traversal.  Verifying that each iterated cgroup is online is the
      responsibility of the user.  When used with proper synchronization,
      cgroup_for_each_descendant_pre() can be used to propagate state
      updates to descendants in a reliable way.  See comments for details.
      
      v2: s/config/state/ in commit message and comments per Michal.  More
          documentation on synchronization rules.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujisu.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Li Zefan <lizefan@huawei.com>
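
The pre-order walk described in this commit can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the macros themselves: `struct node`, `node_add_child()`, `next_descendant_pre()` and `collect_pre()` are hypothetical names invented for illustration, and the sketch does no RCU or online checking, both of which the commit message leaves to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a cgroup; not the kernel struct. */
struct node {
	int id;
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/* Append @child at the tail of @parent's child list. */
void node_add_child(struct node *parent, struct node *child)
{
	struct node **pos = &parent->first_child;

	while (*pos)
		pos = &(*pos)->next_sibling;
	child->parent = parent;
	child->next_sibling = NULL;
	*pos = child;
}

/*
 * Return the descendant of @root that follows @pos in pre-order, or NULL
 * when the walk is finished.  Pass @pos == NULL to start the walk.
 */
struct node *next_descendant_pre(struct node *pos, struct node *root)
{
	if (!pos)
		pos = root;

	/* A node is visited before its children, so descend first. */
	if (pos->first_child)
		return pos->first_child;

	/* No child: climb towards @root until some ancestor has a sibling. */
	while (pos != root) {
		if (pos->next_sibling)
			return pos->next_sibling;
		pos = pos->parent;
	}
	return NULL;
}

/* Store the ids of all descendants of @root in pre-order; return the count. */
size_t collect_pre(struct node *root, int *out, size_t cap)
{
	size_t n = 0;
	struct node *pos;

	for (pos = next_descendant_pre(NULL, root); pos;
	     pos = next_descendant_pre(pos, root)) {
		assert(n < cap);
		out[n++] = pos->id;
	}
	return n;
}
```

Because a parent is always visited before its children, a state change made on the parent during the walk is already in place when each child is reached, which is the property that makes pre-order suitable for propagating state downwards.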
    • cgroup: use rculist ops for cgroup->children · eb6fd504
      Committed by Tejun Heo
      Use RCU safe list operations for cgroup->children.  This will be used
      to implement cgroup children / descendant walking which can be used by
      controllers.
      
      Note that cgroup_create() now puts a new cgroup at the end of the
      ->children list instead of head.  This isn't strictly necessary but is
      done so that the iteration order is more conventional.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: add cgroup_subsys->post_create() · a8638030
      Committed by Tejun Heo
      Currently, there's no way for a controller to find out whether a new
      cgroup finished all ->create() allocations successfully and is
      considered "live" by cgroup.
      
      This becomes a problem later when we add generic descendant walking
      to cgroup for use by controllers, as controllers don't have a
      synchronization point where they can synchronize against new cgroups
      appearing in such walks.
      
      This patch adds ->post_create().  It's called after all ->create()
      succeeded and the cgroup is linked into the generic cgroup hierarchy.
      This plays the counterpart of ->pre_destroy().
      
      When used in combination with the to-be-added generic descendant
      iterators, ->post_create() can be used to implement reliable state
      inheritance.  It will be explained with the descendant iterators.
      
      v2: Added a paragraph about its future use w/ descendant iterators per
          Michal.
      
      v3: Forgot to add ->post_create() invocation to cgroup_load_subsys().
          Fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Glauber Costa <glommer@parallels.com>
  2. 08 Nov, 2012 1 commit
  3. 07 Nov, 2012 3 commits
    • device_cgroup: add lockdep asserts · 4b1c7840
      Committed by Tejun Heo
      device_cgroup uses RCU safe ->exceptions list which is write-protected
      by devcgroup_mutex and has had some issues using locking correctly.
      Add lockdep asserts to utility functions so that future errors can be
      easily detected.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
    • Merge branch 'cgroup/for-3.7-fixes' into cgroup/for-3.8 · 5b805f2a
      Committed by Tejun Heo
      This is to receive device_cgroup fixes so that further device_cgroup
      changes can be made in cgroup/for-3.8.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • device_cgroup: fix RCU usage · 201e72ac
      Committed by Tejun Heo
      dev_cgroup->exceptions is protected with devcgroup_mutex for writes
      and RCU for reads; however, RCU usage isn't correct.
      
      * dev_exception_clean() doesn't use RCU variant of list_del() and
        kfree().  The function can race with may_access() and may_access()
        may end up dereferencing already freed memory.  Use list_del_rcu()
        and kfree_rcu() instead.
      
      * may_access() may be called only with RCU read locked but doesn't use
        RCU safe traversal over ->exceptions.  Use list_for_each_entry_rcu().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Cc: stable@vger.kernel.org
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
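
The hazard fixed here is that a reader may still be standing on an entry while a writer removes and frees it. The sketch below is a single-threaded user-space model of the remedy, under loud assumptions: it is not RCU, it has no memory barriers or grace-period detection, and `struct item`, `struct list`, `item_new()`, `list_unlink_deferred()` and `list_reclaim()` are hypothetical names. It only shows the two ingredients that `list_del_rcu()` and `kfree_rcu()` provide: the removed entry keeps a valid forward pointer, and its memory is released later rather than immediately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list entry; retired_next chains entries awaiting reclaim. */
struct item {
	int val;
	struct item *next;
	struct item *retired_next;
};

struct list {
	struct item *head;
	struct item *retired;
};

/* Allocate one entry linked in front of @next; returns NULL on failure. */
struct item *item_new(int val, struct item *next)
{
	struct item *it = malloc(sizeof(*it));

	if (!it)
		return NULL;
	it->val = val;
	it->next = next;
	it->retired_next = NULL;
	return it;
}

/*
 * Unlink @victim but leave victim->next untouched, so a reader that is
 * currently standing on @victim can still advance.  The entry is queued
 * for later reclaim instead of being freed here.
 */
void list_unlink_deferred(struct list *l, struct item *victim)
{
	struct item **pos = &l->head;

	assert(victim != NULL);
	while (*pos && *pos != victim)
		pos = &(*pos)->next;
	if (!*pos)
		return;		/* not on the list, nothing to do */

	*pos = victim->next;
	victim->retired_next = l->retired;
	l->retired = victim;
}

/*
 * Free every retired entry and return how many were freed.  In this model
 * the caller decides when no reader can still hold a retired entry.
 */
size_t list_reclaim(struct list *l)
{
	size_t n = 0;

	while (l->retired) {
		struct item *victim = l->retired;

		l->retired = victim->retired_next;
		free(victim);
		n++;
	}
	return n;
}
```

In the kernel the "when is it safe" decision is made by the RCU grace period rather than by the caller, which is why the fix pairs the RCU-safe removal with a deferred free.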
  4. 06 Nov, 2012 10 commits
    • device_cgroup: fix unchecked cgroup parent usage · 64e10477
      Committed by Aristeu Rozanski
      In 4cef7299 ("device_cgroup: add proper checking when changing
      default behavior") the cgroup parent usage is unchecked.  root will not
      have a parent and trying to use device.{allow,deny} will cause problems.
      For some reason my stressing scripts didn't test the root directory so I
      didn't catch it on my regular tests.
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • Merge branch 'cgroup-rmdir-updates' into cgroup/for-3.8 · 1db1e31b
      Committed by Tejun Heo
      Pull rmdir updates into for-3.8 so that further callback updates can
      be put on top.  This pull created a trivial conflict between the
      following two commits.
      
        8c7f6edb ("cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them")
        ed957793 ("cgroup: kill cgroup_subsys->__DEPRECATED_clear_css_refs")
      
      The former added a field to cgroup_subsys and the latter removed one
      from it.  They happen to be colocated causing the conflict.  Keeping
      what's added and removing what's removed resolves the conflict.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup: make ->pre_destroy() return void · bcf6de1b
      Committed by Tejun Heo
      All ->pre_destroy() implementations return 0 now, which is the only
      allowed return value.  Make it return void.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
    • hugetlb: do not fail in hugetlb_cgroup_pre_destroy · 9d093cb1
      Committed by Michal Hocko
      Now that pre_destroy callbacks are called from the context where neither
      any task can attach the group nor any children group can be added there
      is no other way to fail from hugetlb_pre_destroy.
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Glauber Costa <glommer@parallels.com>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • memcg: make mem_cgroup_reparent_charges non failing · ab5196c2
      Committed by Michal Hocko
      Now that pre_destroy callbacks are called from the context where neither
      any task can attach the group nor any children group can be added there
      is no other way to fail from mem_cgroup_pre_destroy.
      mem_cgroup_pre_destroy doesn't have to take a reference to memcg's css
      because all css' are marked dead already.
      
      tj: Remove now unused local variable @cgrp from
          mem_cgroup_reparent_charges().
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: Glauber Costa <glommer@parallels.com>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup: remove CGRP_WAIT_ON_RMDIR, cgroup_exclude_rmdir() and cgroup_release_and_wakeup_rmdir() · b25ed609
      Committed by Tejun Heo
      CGRP_WAIT_ON_RMDIR is another kludge which was added to make cgroup
      destruction rollback somewhat working.  cgroup_rmdir() used to drain
      CSS references and CGRP_WAIT_ON_RMDIR and the associated waitqueue and
      helpers were used to allow the task performing rmdir to wait for the
      next relevant event.
      
      Unfortunately, the wait is visible to controllers too and the
      mechanism got exposed to memcg by 88703267 ("cgroup avoid permanent
      sleep at rmdir").
      
      Now that the draining and retries are gone, CGRP_WAIT_ON_RMDIR is
      unnecessary.  Remove it and all the mechanisms supporting it.  Note
      that memcontrol.c changes are essentially a revert of 88703267
      ("cgroup avoid permanent sleep at rmdir").
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
    • cgroup: deactivate CSS's and mark cgroup dead before invoking ->pre_destroy() · 1a90dd50
      Committed by Tejun Heo
      Because ->pre_destroy() could fail and can't be called under
      cgroup_mutex, cgroup destruction did something very ugly.
      
        1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.
      
        2. Release cgroup_mutex and call ->pre_destroy().
      
        3. Re-grab cgroup_mutex and verify it can still be destroyed; fail
           otherwise.
      
        4. Continue destroying.
      
      In addition to being ugly, it has been always broken in various ways.
      For example, memcg ->pre_destroy() expects the cgroup to be inactive
      after it's done but tasks can be attached and detached between #2 and
      #3 and the conditions that memcg verified in ->pre_destroy() might no
      longer hold by the time control reaches #3.
      
      Now that ->pre_destroy() is no longer allowed to fail, we can switch
      to the following.
      
        1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.
      
        2. Deactivate CSS's and mark the cgroup removed thus preventing any
           further operations which can invalidate the verification from #1.
      
        3. Release cgroup_mutex and call ->pre_destroy().
      
        4. Re-grab cgroup_mutex and continue destroying.
      
      After this change, controllers can safely assume that ->pre_destroy()
      will be called only once for a given cgroup and, once
      ->pre_destroy() is called, the cgroup will stay dormant till it's
      destroyed.
      
      This removes the only reason ->pre_destroy() can fail - new task being
      attached or child cgroup being created in between.  Error out path is
      removed and ->pre_destroy() invocation is open coded in
      cgroup_rmdir().
      
      v2: cgroup_call_pre_destroy() removal moved to this patch per Michal.
          Commit message updated per Glauber.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Glauber Costa <glommer@parallels.com>
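
The new four-step destruction order from this commit can be modelled in a few lines. This is a hedged user-space sketch, not cgroup_rmdir(): `struct grp` and every `grp_*()` function are hypothetical names, the "mutex" is a plain flag used only to check the locking order, and a return value of -1 stands in for whatever error the kernel returns when the group is busy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a group; none of these names are kernel API. */
struct grp {
	bool mutex_held;	/* stand-in for cgroup_mutex */
	bool removed;		/* set once destruction is committed */
	int nr_tasks;
	int nr_children;
	int pre_destroy_calls;
};

void grp_lock(struct grp *g)
{
	assert(!g->mutex_held);
	g->mutex_held = true;
}

void grp_unlock(struct grp *g)
{
	assert(g->mutex_held);
	g->mutex_held = false;
}

/* Attaching a task is refused once the group is marked removed. */
bool grp_attach_task(struct grp *g)
{
	bool ok;

	grp_lock(g);
	ok = !g->removed;
	if (ok)
		g->nr_tasks++;
	grp_unlock(g);
	return ok;
}

/* Controller callback: runs without the mutex and cannot fail. */
void grp_pre_destroy(struct grp *g)
{
	assert(!g->mutex_held);
	assert(g->removed);
	g->pre_destroy_calls++;
}

/* Returns 0 on success, -1 if the group cannot be destroyed. */
int grp_rmdir(struct grp *g)
{
	grp_lock(g);				/* 1. lock and verify */
	if (g->removed || g->nr_tasks || g->nr_children) {
		grp_unlock(g);
		return -1;
	}
	g->removed = true;			/* 2. commit: mark removed */
	grp_unlock(g);				/* 3. unlock, call back */
	grp_pre_destroy(g);
	grp_lock(g);				/* 4. re-lock, finish up */
	/* ... the remaining teardown would happen here ... */
	grp_unlock(g);
	return 0;
}
```

Since the removed mark is set before the mutex is dropped, nothing that would invalidate the check in step 1 can succeed during the callback, so the callback runs exactly once and there is no rollback path to model.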
    • cgroup: use cgroup_lock_live_group(parent) in cgroup_create() · 976c06bc
      Committed by Tejun Heo
      This patch makes cgroup_create() fail if @parent is marked removed.
      This is to prepare for further updates to cgroup_rmdir() path.
      
      Note that this change isn't strictly necessary.  cgroup can only be
      created via mkdir and the removed marking and dentry removal happen
      without releasing cgroup_mutex, so cgroup_create() can never race with
      cgroup_rmdir().  Even after the scheduled updates to cgroup_rmdir(),
      cgroup_mkdir() and cgroup_rmdir() are synchronized by i_mutex
      rendering the added liveliness check unnecessary.
      
      Do it anyway such that locking is contained inside cgroup proper and
      we don't get nasty surprises if we ever grow another caller of
      cgroup_create().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: kill CSS_REMOVED · e9316080
      Committed by Tejun Heo
      CSS_REMOVED is one of the several contortions which were necessary to
      support css reference draining on cgroup removal.  All css->refcnts
      which need draining should be deactivated and verified to equal zero
      atomically w.r.t. css_tryget().  If any one isn't zero, all refcnts
      needed to be re-activated and css_tryget() shouldn't fail in the
      process.
      
      This was achieved by letting css_tryget() busy-loop until either the
      refcnt is reactivated (failed removal attempt) or CSS_REMOVED is set
      (committing to removal).
      
      Now that css refcnt draining is no longer used, there's no need for
      atomic rollback mechanism.  css_tryget() simply can look at the
      reference count and fail if it's deactivated - it's never getting
      re-activated.
      
      This patch removes CSS_REMOVED and updates __css_tryget() to fail if
      the refcnt is deactivated.  As deactivation and removal are a single
      step now, they no longer need to be protected against css_tryget()
      happening from irq context.  Remove local_irq_disable/enable() from
      cgroup_rmdir().
      
      Note that this removes css_is_removed() whose only user is VM_BUG_ON()
      in memcontrol.c.  We can replace it with a check on the refcnt but
      given that the only use case is a debug assert, I think it's better to
      simply unexport it.
      
      v2: Comment updated and explanation on local_irq_disable/enable()
          added per Michal Hocko.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
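
The simplified try-get rule in this commit (fail as soon as the count is deactivated, because it never gets re-activated) can be sketched with C11 atomics. This is an illustration under assumptions, not the kernel's __css_tryget(): `struct ref`, `REF_DEACT_BIAS` and the `ref_*()` functions are hypothetical names, and encoding "deactivated" as a negative count is an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: adding this bias makes any non-negative count negative. */
#define REF_DEACT_BIAS INT_MIN

struct ref {
	atomic_int cnt;
};

/* Start with one base reference. */
void ref_init(struct ref *r)
{
	atomic_init(&r->cnt, 1);
}

/* Take a reference unless the count has been deactivated. */
bool ref_tryget(struct ref *r)
{
	int v = atomic_load(&r->cnt);

	while (v >= 0) {
		/* On failure @v is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&r->cnt, &v, v + 1))
			return true;
	}
	/* Deactivation is one-way, so there is nothing to wait for. */
	return false;
}

void ref_put(struct ref *r)
{
	atomic_fetch_sub(&r->cnt, 1);
}

/* One-way deactivation; must be called at most once. */
void ref_deactivate(struct ref *r)
{
	int old = atomic_fetch_add(&r->cnt, REF_DEACT_BIAS);

	assert(old >= 0);
	(void)old;
}
```

The loop only retries when another thread changed a still-active count; once a negative value is observed it returns immediately, which is the contrast with the old behaviour of busy-looping until the count was re-activated or CSS_REMOVED was set.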
    • cgroup: kill cgroup_subsys->__DEPRECATED_clear_css_refs · ed957793
      Committed by Tejun Heo
      2ef37d3f ("memcg: Simplify mem_cgroup_force_empty_list error
      handling") removed the last user of __DEPRECATED_clear_css_refs.  This
      patch removes __DEPRECATED_clear_css_refs and mechanisms to support
      it.
      
      * Conditionals dependent on __DEPRECATED_clear_css_refs removed.
      
      * cgroup_clear_css_refs() can no longer fail.  All that needs to be
        done are deactivating refcnts, setting CSS_REMOVED and putting the
        base reference on each css.  Remove cgroup_clear_css_refs() and the
        failure path, and open-code the loops into cgroup_rmdir().
      
      This patch keeps the two for_each_subsys() loops separate while open
      coding them.  They can be merged now but there are scheduled changes
      which need them to be separate, so keep them separate to reduce the
      amount of churn.
      
      local_irq_save/restore() from cgroup_clear_css_refs() are replaced
      with local_irq_disable/enable() for simplicity.  This is safe as
      cgroup_rmdir() is always called with IRQ enabled.  Note that this IRQ
      switching is necessary to ensure that css_tryget() isn't called from
      IRQ context on the same CPU while lower context is between CSS
      deactivation and setting CSS_REMOVED as css_tryget() would hang
      forever in such cases waiting for CSS to be re-activated or
      CSS_REMOVED set.  This will go away soon.
      
      v2: cgroup_call_pre_destroy() removal dropped per Michal.  Commit
          message updated to explain local_irq_disable/enable() conversion.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
  5. 05 Nov, 2012 1 commit
  6. 04 Nov, 2012 4 commits
    • Merge tag 'nfs-for-3.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · d4164973
      Committed by Linus Torvalds
      Pull NFS client bugfixes from Trond Myklebust:
      
       - Fix a bunch of deadlock situations:
         * State recovery can deadlock if we fail to release sequence ids
           before scheduling the recovery thread.
         * Calling deactivate_super() from an RPC workqueue thread can
           deadlock because of the call to rpc_shutdown_client.
      
       - Display the device name correctly in /proc/*/mounts
      
       - Fix a number of incorrect error return values:
         * When NFSv3 mounts fail due to a timeout.
         * On NFSv4.1 backchannel setup failure
         * On NFSv4 open access checks
      
       - pnfs_find_alloc_layout() must check the layout pointer for NULL
      
       - Fix a regression in the legacy DNS resolver
      
      * tag 'nfs-for-3.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFS4: nfs4_opendata_access should return errno
        NFSv4: Initialise the NFSv4.1 slot table highest_used_slotid correctly
        SUNRPC: return proper errno from backchannel_rqst
        NFS: add nfs_sb_deactive_async to avoid deadlock
        nfs: Show original device name verbatim in /proc/*/mount{s,info}
        nfsv3: Make v3 mounts fail with ETIMEDOUTs instead EIO on mountd timeouts
        nfs: Check whether a layout pointer is NULL before free it
        NFS: fix bug in legacy DNS resolver.
        NFSv4: nfs4_locku_done must release the sequence id
        NFSv4.1: We must release the sequence id when we fail to get a session slot
        NFS: Wait for session recovery to finish before returning
    • Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · 225ff868
      Committed by Linus Torvalds
      Pull thermal management & ACPI update from Zhang Rui,
      
      Ho humm.  Normally these things go through Len.  But it's just three
      small fixes, I guess I can pull directly too.
      
      * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
        exynos4_tmu_driver_ids should be exynos_tmu_driver_ids.
        ACPI video: Ignore errors after _DOD evaluation.
        thermal: solve compilation errors in rcar_thermal
    • Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux · 209c510e
      Committed by Linus Torvalds
      Pull i2c embedded fixes from Wolfram Sang:
       "Two patches are usual stuff.
      
        The bigger patch is needed to correct a wrong decision made in this
        merge window.  We hoped to get the PIOQUEUE mode in the mxs driver
        working with DMA, but it turned out to be too broken (leading to data
        loss), so we now think it is best to remove it entirely and work only
        with DMA now.  The patch should be in 3.7.  IMO, so users never get
        the chance to use both modes in parallel."
      
      * 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
        i2c: tegra: set irq name as device name
        i2c-nomadik: Fixup clock handling
        i2c: mxs: remove broken PIOQUEUE support
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 53f9313f
      Committed by Linus Torvalds
      Pull drm fixes from Dave Airlie:
       "Scattered selection of fixes:
      
         - radeon: load detect fixes from SuSE/AMD
         - intel: misc i830, sdvo regression, vesafb kickoff ums fix
         - exynos: maintainers entry update + fixes
         - udl: fix stride scanout issue
      
        it's slightly bigger than I'd probably like, but nothing looked
        dangerous enough to hold off on."
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/udl: fix stride issues scanning out stride != width*bpp
        drm/radeon: add load detection support for ext DAC on R200 (v2)
        DRM/radeon: For single CRTC GPUs move handling of CRTC_CRT_ON to crtc_dpms().
        DRM/Radeon: Fix TV DAC Load Detection for single CRTC chips.
        DRM/Radeon: Clean up code in TV DAC load detection.
        drm/radeon: fix ATPX function documentation
        drivers/gpu/drm/radeon/evergreen_cs.c: Remove unnecessary semicolon
        DRM/Radeon: On DVI-I use Load Detection when EDID is bogus.
        DRM/Radeon: Fix primary DAC Load Detection for RV100 chips.
        DRM/Radeon: Fix Load Detection on legacy primary DAC.
        drm: exynos: removed warning due to missing typecast for mixer driver data
        drm/exynos: add support for ARCH_MULTIPLATFORM
        MAINTAINERS: Add git repository for Exynos DRM
        drm/exynos: fix display on issue
        drm/i915: Only kick out vesafb if we takeover the fbcon with KMS
        drm/i915: be less verbose about inability to provide vendor backlight
        drm/i915: clear the entire sdvo infoframe buffer
        drm/i915: VGA needs to be on pipe A on i830M
        drm/i915: fix overlay on i830M
  7. 03 Nov, 2012 17 commits