1. 24 Jul, 2014: 35 commits
  2. 07 Jul, 2014: 1 commit
    • workqueue: zero cpumask of wq_numa_possible_cpumask on init · 5a6024f1
      Committed by Yasuaki Ishimatsu
      When hot-adding and onlining a CPU, a kernel panic occurs, showing the
      following call trace.
      
        BUG: unable to handle kernel paging request at 0000000000001d08
        IP: [<ffffffff8114acfd>] __alloc_pages_nodemask+0x9d/0xb10
        PGD 0
        Oops: 0000 [#1] SMP
        ...
        Call Trace:
         [<ffffffff812b8745>] ? cpumask_next_and+0x35/0x50
         [<ffffffff810a3283>] ? find_busiest_group+0x113/0x8f0
         [<ffffffff81193bc9>] ? deactivate_slab+0x349/0x3c0
         [<ffffffff811926f1>] new_slab+0x91/0x300
         [<ffffffff815de95a>] __slab_alloc+0x2bb/0x482
         [<ffffffff8105bc1c>] ? copy_process.part.25+0xfc/0x14c0
         [<ffffffff810a3c78>] ? load_balance+0x218/0x890
         [<ffffffff8101a679>] ? sched_clock+0x9/0x10
         [<ffffffff81105ba9>] ? trace_clock_local+0x9/0x10
         [<ffffffff81193d1c>] kmem_cache_alloc_node+0x8c/0x200
         [<ffffffff8105bc1c>] copy_process.part.25+0xfc/0x14c0
         [<ffffffff81114d0d>] ? trace_buffer_unlock_commit+0x4d/0x60
         [<ffffffff81085a80>] ? kthread_create_on_node+0x140/0x140
         [<ffffffff8105d0ec>] do_fork+0xbc/0x360
         [<ffffffff8105d3b6>] kernel_thread+0x26/0x30
         [<ffffffff81086652>] kthreadd+0x2c2/0x300
         [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60
         [<ffffffff815f20ec>] ret_from_fork+0x7c/0xb0
         [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60
      
      My investigation found that the root cause is wq_numa_possible_cpumask.
      All entries of wq_numa_possible_cpumask are allocated by
      alloc_cpumask_var_node(), but they are used without being initialized,
      so they contain garbage values.
      
      When a CPU is hot-added and onlined, wq_update_unbound_numa() is called,
      which calls alloc_unbound_pwq(), which in turn calls get_unbound_pool().
      In get_unbound_pool(), worker_pool->node is set as follows:
      
        /* if cpumask is contained inside a NUMA node, we belong to that node */
        if (wq_numa_enabled) {
                for_each_node(node) {
                        if (cpumask_subset(pool->attrs->cpumask,
                                           wq_numa_possible_cpumask[node])) {
                                pool->node = node;
                                break;
                        }
                }
        }
      
      But wq_numa_possible_cpumask[node] does not hold a correct cpumask, so
      the wrong node is selected and the kernel panics.
      
      This patch allocates all entries of wq_numa_possible_cpumask with
      zalloc_cpumask_var_node() so that they start out zeroed, and the panic
      no longer occurs.
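      
      A minimal sketch of the fix described above, roughly as it would look
      where the per-node masks are allocated (the tbl variable and the exact
      call site are paraphrased for illustration, not quoted from the patch):
      
        /* Allocate the per-node masks zeroed instead of leaving them
         * uninitialized; zalloc_cpumask_var_node() clears the new mask. */
        for_each_node(node)
                BUG_ON(!zalloc_cpumask_var_node(&tbl[node], GFP_KERNEL,
                                node_online(node) ? node : NUMA_NO_NODE));
      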
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      Fixes: bce90380 ("workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]")
  3. 06 Jul, 2014: 1 commit
  4. 04 Jul, 2014: 1 commit
  5. 02 Jul, 2014: 2 commits
    • cpuset: break kernfs active protection in cpuset_write_resmask() · 76bb5ab8
      Committed by Tejun Heo
      Writing to either "cpuset.cpus" or "cpuset.mems" file flushes
      cpuset_hotplug_work so that cpu or memory hotunplug doesn't end up
      migrating tasks off a cpuset after new resources are added to it.
      
      As cpuset_hotplug_work calls into cgroup core via
      cgroup_transfer_tasks(), this flushing adds a dependency on cgroup
      core locking to cpuset_write_resmask().  This used to be okay because
      cgroup interface files were protected by a different mutex; however,
      8353da1f ("cgroup: remove cgroup_tree_mutex") simplified the
      cgroup core locking and this dependency became a deadlock hazard -
      cgroup file removal performed under cgroup core lock tries to drain
      on-going file operation which is trying to flush cpuset_hotplug_work
      blocked on the same cgroup core lock.
      
      The locking simplification was done because kernfs added a much
      easier way to deal with circular dependencies involving kernfs active
      protection.  Let's use the same strategy in cpuset and break active
      protection in cpuset_write_resmask().  While it isn't the prettiest,
      this is a very rare, likely unique, situation which also goes away on
      the unified hierarchy.
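      
      A rough sketch of that strategy, as it could look inside
      cpuset_write_resmask() (the surrounding error handling and the actual
      resmask update are elided and paraphrased, not quoted from the patch):
      
        /* Pin the css and drop kernfs active protection so that flushing
         * cpuset_hotplug_work (which takes cgroup_mutex) cannot deadlock
         * against cgroup file removal waiting for this write to finish. */
        css_get(&cs->css);
        kernfs_break_active_protection(of->kn);
      
        flush_work(&cpuset_hotplug_work);
        mutex_lock(&cpuset_mutex);
        /* ... update cpus_allowed / mems_allowed here ... */
        mutex_unlock(&cpuset_mutex);
      
        kernfs_unbreak_active_protection(of->kn);
        css_put(&cs->css);
      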
      
      The commands to trigger the deadlock warning without the patch and the
      lockdep output follow.
      
       localhost:/ # mount -t cgroup -o cpuset xxx /cpuset
       localhost:/ # mkdir /cpuset/tmp
       localhost:/ # echo 1 > /cpuset/tmp/cpuset.cpus
       localhost:/ # echo 0 > /cpuset/tmp/cpuset.mems
       localhost:/ # echo $$ > /cpuset/tmp/tasks
       localhost:/ # echo 0 > /sys/devices/system/cpu/cpu1/online
      
        ======================================================
        [ INFO: possible circular locking dependency detected ]
        3.16.0-rc1-0.1-default+ #7 Not tainted
        -------------------------------------------------------
        kworker/1:0/32649 is trying to acquire lock:
         (cgroup_mutex){+.+.+.}, at: [<ffffffff8110e3d7>] cgroup_transfer_tasks+0x37/0x150
      
        but task is already holding lock:
         (cpuset_hotplug_work){+.+...}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #2 (cpuset_hotplug_work){+.+...}:
        ...
        -> #1 (s_active#175){++++.+}:
        ...
        -> #0 (cgroup_mutex){+.+.+.}:
        ...
      
        other info that might help us debug this:
      
        Chain exists of:
          cgroup_mutex --> s_active#175 --> cpuset_hotplug_work
      
         Possible unsafe locking scenario:
      
      	 CPU0                    CPU1
      	 ----                    ----
          lock(cpuset_hotplug_work);
      				 lock(s_active#175);
      				 lock(cpuset_hotplug_work);
          lock(cgroup_mutex);
      
         *** DEADLOCK ***
      
        2 locks held by kworker/1:0/32649:
         #0:  ("events"){.+.+.+}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
         #1:  (cpuset_hotplug_work){+.+...}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
      
        stack backtrace:
        CPU: 1 PID: 32649 Comm: kworker/1:0 Not tainted 3.16.0-rc1-0.1-default+ #7
       ...
        Call Trace:
         [<ffffffff815a5f78>] dump_stack+0x72/0x8a
         [<ffffffff810c263f>] print_circular_bug+0x10f/0x120
         [<ffffffff810c481e>] check_prev_add+0x43e/0x4b0
         [<ffffffff810c4ee6>] validate_chain+0x656/0x7c0
         [<ffffffff810c53d2>] __lock_acquire+0x382/0x660
         [<ffffffff810c57a9>] lock_acquire+0xf9/0x170
         [<ffffffff815aa13f>] mutex_lock_nested+0x6f/0x380
         [<ffffffff8110e3d7>] cgroup_transfer_tasks+0x37/0x150
         [<ffffffff811129c0>] hotplug_update_tasks_insane+0x110/0x1d0
         [<ffffffff81112bbd>] cpuset_hotplug_update_tasks+0x13d/0x180
         [<ffffffff811148ec>] cpuset_hotplug_workfn+0x18c/0x630
         [<ffffffff810854d4>] process_one_work+0x254/0x520
         [<ffffffff810875dd>] worker_thread+0x13d/0x3d0
         [<ffffffff8108e0c8>] kthread+0xf8/0x100
         [<ffffffff815acaec>] ret_from_fork+0x7c/0xb0
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Li Zefan <lizefan@huawei.com>
      Tested-by: Li Zefan <lizefan@huawei.com>
    • tracing: Remove ftrace_stop/start() from reading the trace file · 099ed151
      Committed by Steven Rostedt (Red Hat)
      Disabling reading and writing to the trace file should not be able to
      disable all function tracing callbacks. There are other users of function
      tracing today (like kprobes and perf), and reading a trace file should
      not stop them from working.
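      
      For context, a sketch of the kind of pattern being removed: bracketing
      the ring-buffer reset in the trace file's open path with
      ftrace_stop()/ftrace_start(), which pauses every function-tracing user,
      not just this reader (paraphrased for illustration, not the exact
      removed hunk):
      
        /* Sketch only: erasing the buffer when the file is opened with
         * O_TRUNC does not need to pause all of ftrace; dropping the
         * stop/start pair leaves kprobes, perf and other callbacks running. */
        if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
                ftrace_stop();                  /* removed by this change */
                tracing_reset_online_cpus(&tr->trace_buffer);
                ftrace_start();                 /* removed by this change */
        }
      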
      
      Cc: stable@vger.kernel.org # 3.0+
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>