1. 27 Feb, 2014 1 commit
    • L
      cpuset: fix a locking issue in cpuset_migrate_mm() · 47295830
      Li Zefan committed
      I can trigger a lockdep warning:
      
        # mount -t cgroup -o cpuset xxx /cgroup
        # mkdir /cgroup/cpuset
        # mkdir /cgroup/tmp
        # echo 0 > /cgroup/tmp/cpuset.cpus
        # echo 0 > /cgroup/tmp/cpuset.mems
        # echo 1 > /cgroup/tmp/cpuset.memory_migrate
        # echo $$ > /cgroup/tmp/tasks
        # echo 1 > /cgroup/tmp/cpuset.mems
      
        ===============================
        [ INFO: suspicious RCU usage. ]
        3.14.0-rc1-0.1-default+ #32 Not tainted
        -------------------------------
        include/linux/cgroup.h:682 suspicious rcu_dereference_check() usage!
        ...
          [<ffffffff81582174>] dump_stack+0x72/0x86
          [<ffffffff810b8f01>] lockdep_rcu_suspicious+0x101/0x140
          [<ffffffff81105ba1>] cpuset_migrate_mm+0xb1/0xe0
        ...
      
      We used to hold cgroup_mutex when calling cpuset_migrate_mm(), but now
      we hold cpuset_mutex, which causes task_css() to complain.
      
      This is not a false-positive but a real issue.
      
      Holding cpuset_mutex won't prevent a task from migrating to another
      cpuset, and it won't prevent the original task->cgroup from being
      destroyed during this change.
      
      Fixes: 5d21cc2d (cpuset: replace cgroup_mutex locking with cpuset internal locking)
      Cc: <stable@vger.kernel.org> # 3.9+
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      47295830
  2. 19 Feb, 2014 1 commit
    • T
      cgroup: update cgroup_enable_task_cg_lists() to grab siglock · 532de3fc
      Tejun Heo committed
      Currently, there's nothing preventing cgroup_enable_task_cg_lists()
      from missing a set PF_EXITING and racing against cgroup_exit().  Depending
      on the timing, cgroup_exit() may finish with the task still linked on
      css_set leading to list corruption.  Fix it by grabbing siglock in
      cgroup_enable_task_cg_lists() so that PF_EXITING is guaranteed to be
      visible.
      
      This whole on-demand cg_list optimization is extremely fragile and has
      ample possibility to lead to bugs which can cause things like
      once-a-year oops during boot.  I'm wondering whether the better
      approach would be just adding "cgroup_disable=all" handling which
      disables the whole cgroup rather than tempting fate with this
      on-demand craziness.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: stable@vger.kernel.org
      532de3fc
  3. 13 Feb, 2014 1 commit
    • T
      Revert "cgroup: use an ordered workqueue for cgroup destruction" · 1a11533f
      Tejun Heo committed
      This reverts commit ab3f5faa.
      Explanation from Hugh:
      
        It's because more thorough testing, by others here, found that it
        wasn't always solving the problem: so I asked Tejun privately to
        hold off from sending it in, until we'd worked out why not.
      
        Most of our testing being on a v3.11-based kernel, it was perfectly
        possible that the problem was merely our own e.g. missing Tejun's
        8a2b7538 ("workqueue: fix ordered workqueues in NUMA setups").
      
        But that turned out not to be enough to fix it either. Then Filipe
        pointed out how percpu_ref_kill_and_confirm() uses call_rcu_sched()
        before we ever get to put the offline on to the workqueue: by the
        time we get to the workqueue, the ordering has already been lost.
      
        So, thanks for the Acks, but I'm afraid that this ordered workqueue
        solution is just not good enough: we should simply forget that patch
        and provide a different answer.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      1a11533f
  4. 11 Feb, 2014 1 commit
    • L
      cgroup: protect modifications to cgroup_idr with cgroup_mutex · 0ab02ca8
      Li Zefan committed
      Setup cgroupfs like this:
        # mount -t cgroup -o cpuacct xxx /cgroup
        # mkdir /cgroup/sub1
        # mkdir /cgroup/sub2
      
      Then run these two commands:
        # for ((; ;)) { mkdir /cgroup/sub1/tmp && rmdir /cgroup/sub1/tmp; } &
        # for ((; ;)) { mkdir /cgroup/sub2/tmp && rmdir /cgroup/sub2/tmp; } &
      
      After a few seconds you may see this warning:
      
      ------------[ cut here ]------------
      WARNING: CPU: 1 PID: 25243 at lib/idr.c:527 sub_remove+0x87/0x1b0()
      idr_remove called for id=6 which is not allocated.
      ...
      Call Trace:
       [<ffffffff8156063c>] dump_stack+0x7a/0x96
       [<ffffffff810591ac>] warn_slowpath_common+0x8c/0xc0
       [<ffffffff81059296>] warn_slowpath_fmt+0x46/0x50
       [<ffffffff81300aa7>] sub_remove+0x87/0x1b0
       [<ffffffff810f3f02>] ? css_killed_work_fn+0x32/0x1b0
       [<ffffffff81300bf5>] idr_remove+0x25/0xd0
       [<ffffffff810f2bab>] cgroup_destroy_css_killed+0x5b/0xc0
       [<ffffffff810f4000>] css_killed_work_fn+0x130/0x1b0
       [<ffffffff8107cdbc>] process_one_work+0x26c/0x550
       [<ffffffff8107eefe>] worker_thread+0x12e/0x3b0
       [<ffffffff81085f96>] kthread+0xe6/0xf0
       [<ffffffff81570bac>] ret_from_fork+0x7c/0xb0
      ---[ end trace 2d1577ec10cf80d0 ]---
      
      It's because allocating/removing cgroup ID is not properly synchronized.
      
      The bug was introduced when we converted cgroup_ida to cgroup_idr.
      While synchronization is already done inside ida_simple_{get,remove}(),
      users are responsible for concurrent calls to idr_{alloc,remove}().
      
      tj: Refreshed on top of b58c8998 ("cgroup: fix error return from
      cgroup_create()").
      
      Fixes: 4e96ee8e ("cgroup: convert cgroup_ida to cgroup_idr")
      Cc: <stable@vger.kernel.org> #3.12+
      Reported-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      0ab02ca8
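The synchronization gap described in commit 0ab02ca8 can be illustrated with a minimal userspace sketch. It is not the kernel's idr code: `struct id_table`, `id_alloc()` and `id_remove()` are hypothetical names, and a pthread mutex stands in for cgroup_mutex. The point is only that allocation and removal touch shared state, so the caller-visible operations take one common lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_IDS 64

/* Hypothetical stand-in for an ID table; not the kernel's struct idr. */
struct id_table {
    pthread_mutex_t lock;   /* plays the role of cgroup_mutex */
    bool used[MAX_IDS];
};

static void id_table_init(struct id_table *t)
{
    pthread_mutex_init(&t->lock, NULL);
    for (int i = 0; i < MAX_IDS; i++)
        t->used[i] = false;
}

/* Returns the lowest free ID, or -1 if the table is full. */
static int id_alloc(struct id_table *t)
{
    int id = -1;

    pthread_mutex_lock(&t->lock);
    for (int i = 0; i < MAX_IDS; i++) {
        if (!t->used[i]) {
            t->used[i] = true;
            id = i;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return id;
}

/* Returns 0 on success, -1 if the ID was not allocated. */
static int id_remove(struct id_table *t, int id)
{
    int ret = -1;

    assert(id >= 0 && id < MAX_IDS);
    pthread_mutex_lock(&t->lock);
    if (t->used[id]) {
        t->used[id] = false;
        ret = 0;
    }
    pthread_mutex_unlock(&t->lock);
    return ret;
}
```

Without the lock, two threads could both observe the same slot as free, or a removal could interleave with an allocation, which is the kind of inconsistency the "not allocated" warning above reports.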
  5. 08 Feb, 2014 3 commits
    • T
      cgroup: fix locking in cgroup_cfts_commit() · 48573a89
      Tejun Heo committed
      cgroup_cfts_commit() walks the cgroup hierarchy that the target
      subsystem is attached to and tries to apply the file changes.  Due to
      the convolution with inode locking, it can't keep cgroup_mutex locked
      while iterating.  It currently holds only RCU read lock around the
      actual iteration and then pins the found cgroup using dget().
      
      Unfortunately, this is incorrect.  Although the iteration does check
      cgroup_is_dead() before invoking dget(), there's nothing which
      prevents the dentry from going away in between.  Note that this is
      different from the usual css iterations where css_tryget() is used to
      pin the css - css_tryget() tests whether the css can be pinned and
      fails if not.
      
      The problem can be solved by simply holding cgroup_mutex instead of
      RCU read lock around the iteration, which actually reduces LOC.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: stable@vger.kernel.org
      48573a89
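The distinction commit 48573a89 draws between an unconditional get and a "tryget" can be sketched in plain C11. This is a simplified illustration, not the kernel's css_tryget() or dget(): `struct obj`, `obj_get()` and `obj_tryget()` are hypothetical, and a reference count of 0 is assumed to mean "already being destroyed". Note that the commit itself fixes the bug by holding cgroup_mutex during the iteration, not by adding a tryget.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; refcnt == 0 means it is being destroyed. */
struct obj {
    atomic_int refcnt;
};

/* Unconditional get: bumps the count even if the object is already dying. */
static void obj_get(struct obj *o)
{
    atomic_fetch_add(&o->refcnt, 1);
}

/* Conditional get: succeeds only while the count is still positive. */
static bool obj_tryget(struct obj *o)
{
    int old = atomic_load(&o->refcnt);

    while (old > 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
            return true;
    }
    return false;
}
```

A separate "is it dead?" check followed by `obj_get()` leaves a window between the check and the increment; `obj_tryget()` closes that window by making the test and the increment a single atomic step.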
    • T
      cgroup: fix error return from cgroup_create() · b58c8998
      Tejun Heo committed
      cgroup_create() was returning 0 after allocation failures.  Fix it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: stable@vger.kernel.org
      b58c8998
    • T
      cgroup: fix error return value in cgroup_mount() · eb46bf89
      Tejun Heo committed
      When cgroup_mount() fails to allocate an id for the root, it didn't
      set ret before jumping to unlock_drop ending up returning 0 after a
      failure.  Fix it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: stable@vger.kernel.org
      eb46bf89
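The bug pattern shared by commits b58c8998 and eb46bf89 (jumping to a cleanup label without setting the return value) can be shown with a small sketch. The function names, the `fail_alloc` switch and the choice of -ENOMEM are all illustrative assumptions, not taken from the kernel sources.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical switch so the failure path can be exercised. */
static int fail_alloc;

static void *alloc_root(void)
{
    return fail_alloc ? NULL : malloc(16);
}

/* Buggy shape: ret is still 0 when the allocation fails. */
static int mount_buggy(void)
{
    int ret = 0;
    void *root = alloc_root();

    if (!root)
        goto out;           /* ret was never set, so failure returns 0 */
    free(root);
out:
    return ret;
}

/* Fixed shape: set the error code before jumping to the exit label. */
static int mount_fixed(void)
{
    int ret = 0;
    void *root = alloc_root();

    if (!root) {
        ret = -ENOMEM;      /* illustrative error code */
        goto out;
    }
    free(root);
out:
    return ret;
}
```

With `fail_alloc` set, `mount_buggy()` reports success to its caller even though nothing was set up, while `mount_fixed()` propagates a negative errno value.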
  6. 07 Feb, 2014 1 commit
    • H
      cgroup: use an ordered workqueue for cgroup destruction · ab3f5faa
      Hugh Dickins committed
      Sometimes the cleanup after memcg hierarchy testing gets stuck in
      mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.
      
      There may turn out to be several causes, but a major cause is this: the
      workitem to offline parent can get run before workitem to offline child;
      parent's mem_cgroup_reparent_charges() circles around waiting for the
      child's pages to be reparented to its lrus, but it's holding cgroup_mutex
      which prevents the child from reaching its mem_cgroup_reparent_charges().
      
      Just use an ordered workqueue for cgroup_destroy_wq.
      
      tj: Committing as the temporary fix until the reverse dependency can
          be removed from memcg.  Comment updated accordingly.
      
      Fixes: e5fca243 ("cgroup: use a dedicated workqueue for cgroup destruction")
      Suggested-by: Filipe Brandenburger <filbranden@google.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: Tejun Heo <tj@kernel.org>
      ab3f5faa
  7. 04 Feb, 2014 3 commits
    • T
      nfs: include xattr.h from fs/nfs/nfs3proc.c · 0a6be655
      Tejun Heo committed
      fs/nfs/nfs3proc.c is making use of xattr but was getting linux/xattr.h
      indirectly through linux/cgroup.h, which will soon drop the inclusion
      of xattr.h.  Explicitly include linux/xattr.h from nfs3proc.c so that
      compilation doesn't fail when linux/cgroup.h drops linux/xattr.h.
      
      As the following cgroup changes will depend on these changes, it
      probably would be easier to route this through cgroup branch.  Would
      that be okay?
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: linux-nfs@vger.kernel.org
      0a6be655
    • L
      cpuset: update MAINTAINERS entry · 230579d7
      Li Zefan committed
      Add mailing list and tree tag to the entry.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      230579d7
    • T
      arm, pm, vmpressure: add missing slab.h includes · 1ff6bbfd
      Tejun Heo committed
      arch/arm/mach-tegra/pm.c, kernel/power/console.c and mm/vmpressure.c
      were somehow getting slab.h indirectly through cgroup.h which in turn
      was getting it indirectly through xattr.h.  A scheduled cgroup change
      drops xattr.h inclusion from cgroup.h and breaks compilation of these
      three files.  Add explicit slab.h includes to the three files.
      
      A pending cgroup patch depends on this change and it'd be great if
      this can be routed through cgroup/for-3.14-fixes branch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Stephen Warren <swarren@wwwdotorg.org>
      Cc: Thierry Reding <thierry.reding@gmail.com>
      Cc: linux-tegra@vger.kernel.org
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: cgroups@vger.kernel.org
      1ff6bbfd
  8. 03 Feb, 2014 16 commits
    • L
      Linux 3.14-rc1 · 38dbfb59
      Linus Torvalds committed
      38dbfb59
    • L
      Merge branch 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 69048e01
      Linus Torvalds committed
      Pull parisc updates from Helge Deller:
       "The three major changes in this patchset are an implementation for
        flexible userspace memory maps, cache-flushing fixes (again), and a
        long-discussed ABI change to make EWOULDBLOCK the same value as
        EAGAIN.
      
        parisc has been the only platform where we had EWOULDBLOCK != EAGAIN
        to keep HP-UX compatibility.  Since we will probably never implement
        full HP-UX support, we prefer to drop this compatibility to make it
        easier for us with Linux userspace programs which mostly never checked
        for both values.  We don't expect major fall-outs because of this
        change, and if we face some, we will simply rebuild the necessary
        applications in the debian archives"
      
      * 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: add flexible mmap memory layout support
        parisc: Make EWOULDBLOCK be equal to EAGAIN on parisc
        parisc: convert uapi/asm/stat.h to use native types only
        parisc: wire up sched_setattr and sched_getattr
        parisc: fix cache-flushing
        parisc/sti_console: prefer Linux fonts over built-in ROM fonts
      69048e01
    • M
      hpfs: optimize quad buffer loading · 1c0b8a7a
      Mikulas Patocka committed
      HPFS needs to load 4 consecutive 512-byte sectors when accessing the
      directory nodes or bitmaps.  We can't switch to 2048-byte block size
      because files are allocated in the units of 512-byte sectors.
      
      Previously, the driver would allocate a 2048-byte area using kmalloc,
      copy the data from four buffers to this area and eventually copy them
      back if they were modified.
      
      In the current implementation of the buffer cache, buffers are allocated
      in the pagecache.  That means that 4 consecutive 512-byte buffers are
      stored in consecutive areas in the kernel address space.  So, we don't
      need to allocate extra memory and copy the content of the buffers there.
      
      This patch optimizes the code to avoid copying the buffers.  It checks
      if the four buffers are stored in contiguous memory - if they are not,
      it falls back to allocating a 2048-byte area and copying data there.
      Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1c0b8a7a
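The contiguity check and fallback described in commit 1c0b8a7a can be sketched in userspace C. This is not the hpfs driver code: `quad_is_contiguous()` and `quad_map()` are hypothetical helpers, and the caller-provided `bounce` area stands in for the 2048-byte kmalloc fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512
#define QUAD_SECTORS 4

/* True if the four sector buffers form one contiguous 2048-byte region. */
static bool quad_is_contiguous(char *const buf[QUAD_SECTORS])
{
    for (int i = 1; i < QUAD_SECTORS; i++) {
        /* Compare as integers to avoid out-of-bounds pointer arithmetic. */
        if ((uintptr_t)buf[i] !=
            (uintptr_t)buf[0] + (uintptr_t)i * SECTOR_SIZE)
            return false;
    }
    return true;
}

/*
 * Returns a pointer to 2048 bytes of data: the buffers themselves when
 * they are contiguous, otherwise a copy gathered into the bounce area.
 */
static char *quad_map(char *const buf[QUAD_SECTORS], char *bounce)
{
    if (quad_is_contiguous(buf))
        return buf[0];

    for (int i = 0; i < QUAD_SECTORS; i++)
        memcpy(bounce + (size_t)i * SECTOR_SIZE, buf[i], SECTOR_SIZE);
    return bounce;
}
```

In the common contiguous case no copy is made at all; the copy only happens on the fallback path, and a real implementation would also have to copy modified data back from the bounce area, as the commit message notes for the old behaviour.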
    • M
      hpfs: remember free space · 2cbe5c76
      Mikulas Patocka committed
      Previously, hpfs scanned all bitmaps each time the user asked for free
      space using statfs.  This patch changes it so that hpfs scans the
      bitmaps only once, remembers the free space and on next invocation of
      statfs it returns the value instantly.
      
      New versions of wine are hammering on the statfs syscall very heavily,
      making some games unplayable when they're stored on hpfs, with load
      times in minutes.
      
      This should be backported to the stable kernels because it fixes
      user-visible problem (excessive level load times in wine).
      Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2cbe5c76
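The "scan once, then remember" idea from commit 2cbe5c76 can be sketched as a cached counter. Everything here is an assumption made for illustration: `struct fs_state`, the one-bit-per-sector bitmap with 1 meaning free, and the `fs_note_alloc()` hook that keeps the cached value in step with later allocations.

```c
#include <assert.h>

#define FREE_UNKNOWN (-1L)

/* Hypothetical filesystem state; not the hpfs superblock layout. */
struct fs_state {
    const unsigned char *bitmap;  /* one bit per sector, 1 = free (assumed) */
    long n_bits;
    long cached_free;             /* FREE_UNKNOWN until the first scan */
    long scans;                   /* number of full scans, for illustration */
};

/* Full scan: the expensive path that used to run on every statfs call. */
static long count_free(struct fs_state *fs)
{
    long n = 0;

    for (long i = 0; i < fs->n_bits; i++)
        if (fs->bitmap[i / 8] & (1u << (i % 8)))
            n++;
    fs->scans++;
    return n;
}

/* Scans only when no cached value exists; later calls return instantly. */
static long fs_free_space(struct fs_state *fs)
{
    if (fs->cached_free == FREE_UNKNOWN)
        fs->cached_free = count_free(fs);
    return fs->cached_free;
}

/* Assumed bookkeeping hook: adjust the cache when n sectors are allocated. */
static void fs_note_alloc(struct fs_state *fs, long n)
{
    if (fs->cached_free != FREE_UNKNOWN)
        fs->cached_free -= n;
}
```

The trade-off is that every path that changes the bitmap must also adjust or invalidate the cached count, otherwise statfs would report stale numbers.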
    • H
      parisc: add flexible mmap memory layout support · 9dabf60d
      Helge Deller committed
      Add support for the flexible mmap memory layout (as described in
      http://lwn.net/Articles/91829). This is especially very interesting on
      parisc since we currently only support 32bit userspace (even with a
      64bit Linux kernel).
      Signed-off-by: Helge Deller <deller@gmx.de>
      9dabf60d
    • G
      parisc: Make EWOULDBLOCK be equal to EAGAIN on parisc · f5a408d5
      Guy Martin committed
      On Linux, only parisc uses a different value for EWOULDBLOCK which
      causes a lot of trouble for applications not checking for both values.
      Since the hpux compat is long dead, make EWOULDBLOCK behave the same as
      all other architectures.
      Signed-off-by: Guy Martin <gmsoft@tuxicoman.be>
      Signed-off-by: Helge Deller <deller@gmx.de>
      f5a408d5
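The portability problem behind commit f5a408d5 is that code testing only one of the two error values breaks on a platform where they differ. A minimal sketch of a check that is correct either way follows; `would_block()` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * True if err means "operation would block". Handles both the common
 * case where EWOULDBLOCK == EAGAIN and platforms where they differ.
 */
static bool would_block(int err)
{
#if EAGAIN == EWOULDBLOCK
    return err == EAGAIN;
#else
    return err == EAGAIN || err == EWOULDBLOCK;
#endif
}
```

A typical caller would use it after a failed non-blocking `read()` or `write()`, as in `if (n < 0 && would_block(errno))`, instead of comparing `errno` against a single constant.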
    • H
      parisc: convert uapi/asm/stat.h to use native types only · 9391bc77
      Helge Deller committed
      The stat.h header file is exported to userspace. Some userspace
      applications failed to compile due to missing/unknown types, so we
      better convert it to use native types only (like it's done on other
      architectures too).
      Signed-off-by: Helge Deller <deller@gmx.de>
      9391bc77
    • H
      parisc: wire up sched_setattr and sched_getattr · 998bbb2f
      Helge Deller committed
      Signed-off-by: Helge Deller <deller@gmx.de>
      998bbb2f
    • H
      parisc: fix cache-flushing · 57737c49
      Helge Deller committed
      This commit:
      f8dae006: parisc: Ensure full cache coherency for kmap/kunmap
      caused negative caching side-effects, e.g. hanging processes with expect and
      too many inequivalent alias messages from flush_dcache_page() on Debian 5 systems.
      
      This patch now partly reverts it and has been in production use on our debian buildd
      makeservers for a week without any major problems.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org # v3.9+
      Signed-off-by: Helge Deller <deller@gmx.de>
      57737c49
    • H
      parisc/sti_console: prefer Linux fonts over built-in ROM fonts · 8a10bc9d
      Helge Deller committed
      The built-in ROM fonts lack many necessary ASCII characters, which is
      why it makes sense to prefer the Linux fonts instead if they are
      available.  This makes consoles on STI graphics cards which are not
      supported by the stifb driver (e.g. Visualize FXe) look much nicer.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # v3.13
      8a10bc9d
    • L
      Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · 602456bf
      Linus Torvalds committed
      Pull hwmon kconfig fixes from Jean Delvare.
      
      * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
        hwmon: Fix SENSORS_TMP102 dependencies to eliminate build errors
        hwmon: Fix SENSORS_LM75 dependencies to eliminate build errors
      602456bf
    • L
      Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux · 7b383bef
      Linus Torvalds committed
      Pull SLAB changes from Pekka Enberg:
       "Random bug fixes that have accumulated in my inbox over the past few
        months"
      
      * 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
        mm: Fix warning on make htmldocs caused by slab.c
        mm: slub: work around unneeded lockdep warning
        mm: sl[uo]b: fix misleading comments
        slub: Fix possible format string bug.
        slub: use lockdep_assert_held
        slub: Fix calculation of cpu slabs
        slab.h: remove duplicate kmalloc declaration and fix kernel-doc warnings
      7b383bef
    • L
      Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux · 87af5e5c
      Linus Torvalds committed
      Pull turbostat updates from Len Brown.
      
      * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
        tools/power turbostat: introduce -s to dump counters
        tools/power turbostat: remove unused command line option
        turbostat: Add option to report joules consumed per sample
        turbostat: run on HSX
        turbostat: Add a .gitignore to ignore the compiled turbostat binary
        turbostat: Clean up error handling; disambiguate error messages; use err and errx
        turbostat: Factor out common function to open file and exit on failure
        turbostat: Add a helper to parse a single int out of a file
        turbostat: Check return value of fscanf
        turbostat: Use GCC's CPUID functions to support PIC
        turbostat: Don't attempt to printf an off_t with %zx
        turbostat: Don't put unprocessed uapi headers in the include path
      87af5e5c
    • L
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · e4c0da21
      Linus Torvalds committed
      Pull ARM SoC fixes from Olof Johansson:
       "Here's a set of patches for (hopefully) -rc1.  Some of them are fixes,
        but a good number of them also do things such as enable new drivers in
        the defconfigs for platforms that have such devices, increases
        coverage of the multiplatform defconfig and some DTS changes that
        plumbs up some of the devices that now have bindings and driver
        support.
      
        The commit dates are recent; we've mostly collected these fixes in the
        last few days but I also had to rebuild the branch yesterday to sort
        out some internal conflicts which reset the timestamps.  The changes
        should have been tested by each platform maintainer already (and few
        of them have cross-platform impact) so I'm personally not too
        concerned by it at this time"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits)
        ARM: multi_v7_defconfig: remove redundant entries and re-enable TI_EDMA
        ARM: multi_v7_defconfig: add mvebu drivers
        clocksource: kona: Add basic use of external clock
        drivers: bus: fix CCI driver kcalloc call parameters swap
        ARM: dts: bcm28155-ap: Fix Card Detection GPIO
        ARM: multi_v7_defconfig: Select CONFIG_AT803X_PHY
        ARM: keystone: config: fix build warning when CONFIG_DMADEVICES is not set
        MAINTAINERS: ARM: SiRF: use regex patterns to involve all SiRF drivers
        ARM: dts: zynq: Add SDHCI nodes
        ARM: hisi: don't select SMP
        ARM: tegra: rebuild tegra_defconfig to add DEBUG_FS
        ARM: multi_v7: copy most options from tegra_defconfig
        ARM: iop32x: fix power off handling for the EM7210 board
        ARM: integrator: restore static map on the CP
        ARM: msm_defconfig: Enable MSM clock drivers
        ARM: dts: msm: Add clock controller nodes and hook into uart
        ARM: OMAP4+: move errata initialization to omap4_pm_init_early
        ARM: OMAP4460: cpuidle: Extend PM_OMAP4_ROM_SMP_BOOT_ERRATUM_GICD on cpuidle
        ARM: mvebu: fix compilation warning on Armada 370 (i.e. non-SMP)
        ARM: shmobile: r8a7790.dtsi: ficx i2c[0-3] clock reference
        ...
      e4c0da21
    • J
      hwmon: Fix SENSORS_TMP102 dependencies to eliminate build errors · 632007e2
      Jean Delvare committed
      Similar to what was done for the lm75 driver.
      
      Add depends on THERMAL since that is what provides the
      register/unregister functions above, but only if THERMAL_OF was
      selected as this is an optional feature of the driver.
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: Eduardo Valentin <eduardo.valentin@ti.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      632007e2
    • J
      hwmon: Fix SENSORS_LM75 dependencies to eliminate build errors · 920130a9
      Jean Delvare committed
      Based on an earlier attempt by Randy Dunlap.
      
      Fix SENSORS_LM75 dependencies to eliminate build errors:
      
      drivers/built-in.o: In function `lm75_remove':
      lm75.c:(.text+0x12bd8c): undefined reference to `thermal_zone_of_sensor_unregister'
      drivers/built-in.o: In function `lm75_probe':
      lm75.c:(.text+0x12c123): undefined reference to `thermal_zone_of_sensor_register'
      
      Add depends on THERMAL since that is what provides the
      register/unregister functions above, but only if THERMAL_OF was
      selected as this is an optional feature of the driver.
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: Eduardo Valentin <eduardo.valentin@ti.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      920130a9
  9. 02 Feb, 2014 9 commits
  10. 01 Feb, 2014 4 commits