1. 26 10月, 2012 2 次提交
    • A
      device_cgroup: rename deny_all to behavior · 5b7aa7d5
      Aristeu Rozanski 提交于
      This was done in a v2 patch but v1 ended up being committed.  The
      variable name is less confusing and stores the default behavior when no
      matching exception exists.
      Signed-off-by: NAristeu Rozanski <aris@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5b7aa7d5
    • J
      cgroup: fix invalid rcu dereference · 8c9506d1
      Jiri Slaby 提交于
      Commit ad676077 ("device_cgroup: convert device_cgroup internally to
      policy + exceptions") removed rcu locks which are needed in
      task_devcgroup called in this chain:
      
        devcgroup_inode_mknod OR __devcgroup_inode_permission ->
          __devcgroup_inode_permission ->
            task_devcgroup ->
              task_subsys_state ->
                task_subsys_state_check.
      
      Change the code so that task_devcgroup is safely called with rcu read
      lock held.
      
        ===============================
        [ INFO: suspicious RCU usage. ]
        3.6.0-rc5-next-20120913+ #42 Not tainted
        -------------------------------
        include/linux/cgroup.h:553 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
        rcu_scheduler_active = 1, debug_locks = 0
        2 locks held by kdevtmpfs/23:
         #0:  (sb_writers){.+.+.+}, at: [<ffffffff8116873f>]
        mnt_want_write+0x1f/0x50
         #1:  (&sb->s_type->i_mutex_key#3/1){+.+.+.}, at: [<ffffffff811558af>]
        kern_path_create+0x7f/0x170
      
        stack backtrace:
        Pid: 23, comm: kdevtmpfs Not tainted 3.6.0-rc5-next-20120913+ #42
        Call Trace:
          lockdep_rcu_suspicious+0xfd/0x130
          devcgroup_inode_mknod+0x19d/0x240
          vfs_mknod+0x71/0xf0
          handle_create.isra.2+0x72/0x200
          devtmpfsd+0x114/0x140
          ? handle_create.isra.2+0x200/0x200
          kthread+0xd6/0xe0
          kernel_thread_helper+0x4/0x10
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8c9506d1
  2. 06 10月, 2012 4 次提交
  3. 15 9月, 2012 1 次提交
    • T
      cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them · 8c7f6edb
      Tejun Heo 提交于
      Currently, cgroup hierarchy support is a mess.  cpu related subsystems
      behave correctly - configuration, accounting and control on a parent
      properly cover its children.  blkio and freezer completely ignore
      hierarchy and treat all cgroups as if they're directly under the root
      cgroup.  Others show yet different behaviors.
      
      These differing interpretations of cgroup hierarchy make using cgroup
      confusing and it impossible to co-mount controllers into the same
      hierarchy and obtain sane behavior.
      
      Eventually, we want full hierarchy support from all subsystems and
      probably a unified hierarchy.  Users using separate hierarchies
      expecting completely different behaviors depending on the mounted
      subsystem is deterimental to making any progress on this front.
      
      This patch adds cgroup_subsys.broken_hierarchy and sets it to %true
      for controllers which are lacking in hierarchy support.  The goal of
      this patch is two-fold.
      
      * Move users away from using hierarchy on currently non-hierarchical
        subsystems, so that implementing proper hierarchy support on those
        doesn't surprise them.
      
      * Keep track of which controllers are broken how and nudge the
        subsystems to implement proper hierarchy support.
      
      For now, start with a single warning message.  We can whine louder
      later on.
      
      v2: Fixed a typo spotted by Michal. Warning message updated.
      
      v3: Updated memcg part so that it doesn't generate warning in the
          cases where .use_hierarchy=false doesn't make the behavior
          different from root.use_hierarchy=true.  Fixed a typo spotted by
          Glauber.
      
      v4: Check ->broken_hierarchy after cgroup creation is complete so that
          ->create() can affect the result per Michal.  Dropped unnecessary
          memcg root handling per Michal.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Acked-by: NSerge E. Hallyn <serue@us.ibm.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      8c7f6edb
  4. 02 4月, 2012 1 次提交
    • T
      cgroup: convert all non-memcg controllers to the new cftype interface · 4baf6e33
      Tejun Heo 提交于
      Convert debug, freezer, cpuset, cpu_cgroup, cpuacct, net_prio, blkio,
      net_cls and device controllers to use the new cftype based interface.
      Termination entry is added to cftype arrays and populate callbacks are
      replaced with cgroup_subsys->base_cftypes initializations.
      
      This is functionally identical transformation.  There shouldn't be any
      visible behavior change.
      
      memcg is rather special and will be converted separately.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Paul Menage <paul@paulmenage.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      4baf6e33
  5. 03 2月, 2012 1 次提交
    • L
      cgroup: remove cgroup_subsys argument from callbacks · 761b3ef5
      Li Zefan 提交于
      The argument is not used at all, and it's not necessary, because
      a specific callback handler of course knows which subsys it
      belongs to.
      
      Now only ->pupulate() takes this argument, because the handlers of
      this callback always call cgroup_add_file()/cgroup_add_files().
      
      So we reduce a few lines of code, though the shrinking of object size
      is minimal.
      
       16 files changed, 113 insertions(+), 162 deletions(-)
      
         text    data     bss     dec     hex filename
      5486240  656987 7039960 13183187         c928d3 vmlinux.o.orig
      5486170  656987 7039960 13183117         c9288d vmlinux.o
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      761b3ef5
  6. 13 12月, 2011 1 次提交
  7. 21 7月, 2011 1 次提交
  8. 20 6月, 2011 1 次提交
  9. 27 5月, 2011 1 次提交
    • B
      cgroups: add per-thread subsystem callbacks · f780bdb7
      Ben Blum 提交于
      Add cgroup subsystem callbacks for per-thread attachment in atomic contexts
      
      Add can_attach_task(), pre_attach(), and attach_task() as new callbacks
      for cgroups's subsystem interface.  Unlike can_attach and attach, these
      are for per-thread operations, to be called potentially many times when
      attaching an entire threadgroup.
      
      Also, the old "bool threadgroup" interface is removed, as replaced by
      this.  All subsystems are modified for the new interface - of note is
      cpuset, which requires from/to nodemasks for attach to be globally scoped
      (though per-cpuset would work too) to persist from its pre_attach to
      attach_task and attach.
      
      This is a pre-patch for cgroup-procs-writable.patch.
      Signed-off-by: NBen Blum <bblum@andrew.cmu.edu>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Reviewed-by: NPaul Menage <menage@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f780bdb7
  10. 23 4月, 2010 1 次提交
  11. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  12. 24 9月, 2009 1 次提交
  13. 19 6月, 2009 1 次提交
  14. 03 4月, 2009 1 次提交
  15. 09 1月, 2009 2 次提交
  16. 20 10月, 2008 3 次提交
  17. 03 9月, 2008 1 次提交
  18. 26 7月, 2008 3 次提交
  19. 14 7月, 2008 2 次提交
  20. 05 7月, 2008 1 次提交
  21. 07 6月, 2008 3 次提交
  22. 29 4月, 2008 2 次提交
    • S
      cgroups: introduce cft->read_seq() · 29486df3
      Serge E. Hallyn 提交于
      Introduce a read_seq() helper in cftype, which uses seq_file to print out
      lists.  Use it in the devices cgroup.  Also split devices.allow into two
      files, so now devices.deny and devices.allow are the ones to use to manipulate
      the whitelist, while devices.list outputs the cgroup's current whitelist.
      Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com>
      Acked-by: NPaul Menage <menage@google.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      29486df3
    • S
      cgroups: implement device whitelist · 08ce5f16
      Serge E. Hallyn 提交于
      Implement a cgroup to track and enforce open and mknod restrictions on device
      files.  A device cgroup associates a device access whitelist with each cgroup.
       A whitelist entry has 4 fields.  'type' is a (all), c (char), or b (block).
      'all' means it applies to all types and all major and minor numbers.  Major
      and minor are either an integer or * for all.  Access is a composition of r
      (read), w (write), and m (mknod).
      
      The root device cgroup starts with rwm to 'all'.  A child devcg gets a copy of
      the parent.  Admins can then remove devices from the whitelist or add new
      entries.  A child cgroup can never receive a device access which is denied its
      parent.  However when a device access is removed from a parent it will not
      also be removed from the child(ren).
      
      An entry is added using devices.allow, and removed using
      devices.deny.  For instance
      
      	echo 'c 1:3 mr' > /cgroups/1/devices.allow
      
      allows cgroup 1 to read and mknod the device usually known as
      /dev/null.  Doing
      
      	echo a > /cgroups/1/devices.deny
      
      will remove the default 'a *:* mrw' entry.
      
      CAP_SYS_ADMIN is needed to change permissions or move another task to a new
      cgroup.  A cgroup may not be granted more permissions than the cgroup's parent
      has.  Any task can move itself between cgroups.  This won't be sufficient, but
      we can decide the best way to adequately restrict movement later.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix may-be-used-uninitialized warning]
      Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com>
      Acked-by: NJames Morris <jmorris@namei.org>
      Looks-good-to: Pavel Emelyanov <xemul@openvz.org>
      Cc: Daniel Hokka Zakrisson <daniel@hozac.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      08ce5f16