1. 10 May, 2016 1 commit
    • cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces · 4f41fc59
      Serge E. Hallyn committed
      Patch summary:
      
      When showing a cgroupfs entry in mountinfo, show the path of the mount
      root dentry relative to the reader's cgroup namespace root.
      
      Short explanation (courtesy of mkerrisk):
      
      If we create a new cgroup namespace, then we want both /proc/self/cgroup
      and /proc/self/mountinfo to show cgroup paths that are correctly
      virtualized with respect to the cgroup mount point.  Previous to this
      patch, /proc/self/cgroup shows the right info, but /proc/self/mountinfo
      does not.
      
      Long version:
      
      When a uid 0 task that is in freezer cgroup /a/b unshares a new cgroup
      namespace and then mounts a new instance of the freezer cgroup, the new
      mount will be rooted at /a/b.  The root dentry field of the mountinfo
      entry will show '/a/b'.
      
       cat > /tmp/do1 << EOF
       mount -t cgroup -o freezer freezer /mnt
       grep freezer /proc/self/mountinfo
       EOF
      
       unshare -Gm  bash /tmp/do1
       > 330 160 0:34 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,freezer
       > 355 133 0:34 /a/b /mnt rw,relatime - cgroup freezer rw,freezer
      
      The task's freezer cgroup entry in /proc/self/cgroup will simply show
      '/':
      
       grep freezer /proc/self/cgroup
       9:freezer:/
      
      If instead the same task simply bind mounts the /a/b cgroup directory,
      the resulting mountinfo entry will again show /a/b for the dentry root.
      However in this case the task will find its own cgroup at /mnt/a/b,
      not at /mnt:
      
       mount --bind /sys/fs/cgroup/freezer/a/b /mnt
       130 25 0:34 /a/b /mnt rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,freezer
      
      In other words, there is no way for the task to know, based on what is
      in mountinfo, which cgroup directory is its own.
      
      Example (by mkerrisk):
      
      First, a little script to save some typing and verbiage:
      
      echo -e "\t/proc/self/cgroup:\t$(grep freezer /proc/self/cgroup)"
      grep freezer /proc/self/mountinfo |
              awk '{print "\tmountinfo:\t\t" $4 "\t" $5}'
      
      Create cgroup, place this shell into the cgroup, and look at the state
      of the /proc files:
      
      2653
      2653                         # Our shell
      14254                        # cat(1)
              /proc/self/cgroup:      10:freezer:/a/b
              mountinfo:              /       /sys/fs/cgroup/freezer
      
      Create a shell in new cgroup and mount namespaces. The act of creating
      a new cgroup namespace causes the process's current cgroups directories
      to become its cgroup root directories. (Here, I'm using my own version
      of the "unshare" utility, which takes the same options as the util-linux
      version):
      
      Look at the state of the /proc files:
      
              /proc/self/cgroup:      10:freezer:/
              mountinfo:              /       /sys/fs/cgroup/freezer
      
      The third entry in /proc/self/cgroup (the pathname of the cgroup inside
      the hierarchy) is correctly virtualized w.r.t. the cgroup namespace, which
      is rooted at /a/b in the outer namespace.
      
      However, the info in /proc/self/mountinfo is not for this cgroup
      namespace, since we are seeing a duplicate of the mount from the
      old mount namespace, and the info there does not correspond to the
      new cgroup namespace. However, trying to create a new mount still
      doesn't show us the right information in mountinfo:
      
                                            # propagating to other mountns
              /proc/self/cgroup:      7:freezer:/
              mountinfo:              /a/b    /mnt/freezer
      
      The act of creating a new cgroup namespace caused the process's
      current freezer directory, "/a/b", to become its cgroup freezer root
      directory. In other words, the pathname of the directory
      within the newly mounted cgroup filesystem should be "/",
      but mountinfo wrongly shows us "/a/b". The consequence of this is
      that the process in the cgroup namespace cannot correctly construct
      the pathname of its cgroup root directory from the information in
      /proc/PID/mountinfo.
      
      With this patch, the dentry root field in mountinfo is shown relative
      to the reader's cgroup namespace.  So the same steps as above:
      
              /proc/self/cgroup:      10:freezer:/a/b
              mountinfo:              /       /sys/fs/cgroup/freezer
              /proc/self/cgroup:      10:freezer:/
              mountinfo:              /../..  /sys/fs/cgroup/freezer
              /proc/self/cgroup:      10:freezer:/
              mountinfo:              /       /mnt/freezer
      
      cgroup.clone_children  freezer.parent_freezing  freezer.state      tasks
      cgroup.procs           freezer.self_freezing    notify_on_release
      3164
      2653                   # First shell that was placed in this cgroup
      3164                   # Shell started by 'unshare'
      14197                  # cat(1)
      Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
      Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
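Editorial sketch (not part of the commit): the consequence described in the commit message above can be modeled in a few lines of Python. The helper names `parse_mountinfo_line` and `cgroup_dir` are hypothetical, and the sketch assumes the usual mountinfo layout in which field 4 is the mount root and field 5 is the mount point. It only shows why a root of "/a/b" cannot be combined with a cgroup path of "/", while a namespace-relative root of "/" can.

```python
import posixpath


def parse_mountinfo_line(line):
    """Split one mountinfo line into (root, mount_point, fstype, super_options)."""
    # Optional fields (such as "shared:21") end at the " - " separator.
    left, _, right = line.partition(" - ")
    fields = left.split()
    fstype, _source, super_opts = right.split()[:3]
    return fields[3], fields[4], fstype, super_opts.split(",")


def cgroup_dir(mount_root, mount_point, cgroup_path):
    """Map a path from /proc/self/cgroup onto the filesystem, or return None."""
    if mount_root.startswith("/.."):
        # The mount is rooted above the reader's cgroup namespace root, so the
        # intermediate directory names cannot be derived from these strings.
        return None
    rel = posixpath.relpath(cgroup_path, mount_root)
    if rel.startswith(".."):
        # The cgroup path is not below the mount root as reported.
        return None
    return posixpath.normpath(posixpath.join(mount_point, rel))


# Line quoted in the commit message (mount made inside the new namespace).
root, mnt, fstype, opts = parse_mountinfo_line(
    "355 133 0:34 /a/b /mnt rw,relatime - cgroup freezer rw,freezer")

print(cgroup_dir(root, mnt, "/"))           # None: "/a/b" does not match "/"
print(cgroup_dir("/", "/mnt/freezer", "/")) # /mnt/freezer
```

With the namespace-relative root, the path in /proc/self/cgroup and the root in mountinfo share the same origin, so a plain relative-path computation is enough.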
  2. 03 May, 2016 1 commit
    • kernfs_path_from_node_locked: don't overwrite nlen · e99ed4de
      Serge Hallyn committed
      We've calculated @len to be the bytes we need for '/..' entries from
      @kn_from to the common ancestor, and calculated @nlen to be the extra
      bytes we need to get from the common ancestor to @kn_to.  We use them
      as such at the end.  But in the loop copying the actual entries, we
      overwrite @nlen.  Use a temporary variable for that instead.
      
      Without this, the return length, when the buffer is large enough, is
      wrong.  (When the buffer is NULL or too small, the returned value is
      correct. The buffer contents are also correct.)
      
      Interestingly, no callers of this function are affected by this as of
      yet.  However the upcoming cgroup_show_path() will be.
      Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
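Editorial sketch (not part of the commit): the length bookkeeping described above can be modeled in Python. This is a simplified model written from the commit message, not a translation of the kernel source. The function name `path_from_node`, the list-of-components representation, and the strlcpy-style truncation are assumptions made for illustration.

```python
def path_from_node(to_parts, from_parts, buflen):
    """Build the path from one tree node to another.

    Nodes are given as lists of path components from the hierarchy root.
    Returns (text that fits in a buffer of buflen bytes, full length needed).
    """
    # Find how many leading components the two nodes share.
    common = 0
    for a, b in zip(to_parts, from_parts):
        if a != b:
            break
        common += 1

    ups = len(from_parts) - common
    down_parts = to_parts[common:]
    if ups == 0 and not down_parts:
        return "/"[:max(buflen - 1, 0)], 1  # both nodes are the same

    length = len("/..") * ups                    # plays the role of @len
    nlen = sum(len(p) + 1 for p in down_parts)   # plays the role of @nlen

    out = "/.." * ups
    for part in down_parts:
        # The per-entry size lives in its own temporary; reusing nlen here
        # is the kind of overwrite the commit removes.
        piece_len = len(part)
        out += "/" + part[:piece_len]

    # Truncate like strlcpy() would, but report the full length.
    return out[:max(buflen - 1, 0)], length + nlen


print(path_from_node([], ["a", "b"], 64))          # ('/../..', 6)
print(path_from_node(["a", "x", "y"], ["a", "b"], 64))  # ('/../x/y', 7)
```

The returned length stays correct whether or not the buffer is large enough, because `length` and `nlen` are only read after the copy loop.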
  3. 26 Apr, 2016 3 commits
    • memcg: relocate charge moving from ->attach to ->post_attach · 264a0ae1
      Tejun Heo committed
      Hello,
      
      So, this ended up a lot simpler than I originally expected.  I tested
      it lightly and it seems to work fine.  Petr, can you please test these
      two patches w/o the lru drain drop patch and see whether the problem
      is gone?
      
      Thanks.
      ------ 8< ------
      If charge moving is used, memcg performs relabeling of the affected
      pages from its ->attach callback which is called under
      cgroup_threadgroup_rwsem and thus can't create new kthreads.  This is
      fragile as various operations may depend on workqueues making forward
      progress which relies on the ability to create new kthreads.
      
      There's no reason to perform charge moving from ->attach which is deep
      in the task migration path.  Move it to ->post_attach which is called
      after the actual migration is finished and cgroup_threadgroup_rwsem is
      dropped.
      
      * move_charge_struct->mm is added and ->can_attach is now responsible
        for pinning and recording the target mm.  mem_cgroup_clear_mc() is
        updated accordingly.  This also simplifies mem_cgroup_move_task().
      
      * mem_cgroup_move_task() is now called from ->post_attach instead of
        ->attach.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@kernel.org>
      Debugged-and-tested-by: Petr Mladek <pmladek@suse.com>
      Reported-by: Cyril Hrubis <chrubis@suse.cz>
      Reported-by: Johannes Weiner <hannes@cmpxchg.org>
      Fixes: 1ed13287 ("sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem")
      Cc: <stable@vger.kernel.org> # 4.4+
    • cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback · 5cf1cacb
      Tejun Heo committed
      Since e93ad19d ("cpuset: make mm migration asynchronous"), cpuset
      kicks off asynchronous NUMA node migration if necessary during task
      migration and flushes it from cpuset_post_attach_flush() which is
      called at the end of __cgroup_procs_write().  This is to avoid
      performing migration with cgroup_threadgroup_rwsem write-locked which
      can lead to deadlock through dependency on kworker creation.
      
      memcg has a similar issue with charge moving, so let's convert it to
      an official callback rather than the current one-off cpuset specific
      function.  This patch adds cgroup_subsys->post_attach callback and
      makes cpuset register cpuset_post_attach_flush() as its ->post_attach.
      
      The conversion is mostly one-to-one except that the new callback is
      called under cgroup_mutex.  This is to guarantee that no other
      migration operations are started before ->post_attach callbacks are
      finished.  cgroup_mutex is one of the outermost mutexes in the system
      and has never been and shouldn't be a problem.  We can add specialized
      synchronization around __cgroup_procs_write() but I don't think
      there's any noticeable benefit.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: <stable@vger.kernel.org> # 4.4+ prerequisite for the next patch
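Editorial sketch (not part of either commit): the callback ordering described in the two commits above can be pictured with a toy Python model. The class names, the boolean "lock" flags, and the argument-free hooks are all simplifications invented for illustration. Only the ordering is taken from the commit messages: ->attach runs with cgroup_threadgroup_rwsem held, ->post_attach runs after it is dropped but still under cgroup_mutex.

```python
class MigrationModel:
    """Toy model of the locking order around the attach callbacks."""

    def __init__(self, subsystems):
        self.subsystems = subsystems
        self.cgroup_mutex = False        # True while "held"
        self.threadgroup_rwsem = False   # True while "write-locked"
        self.log = []

    def _call(self, hook):
        # Invoke the hook on every subsystem that defines it and record
        # which locks were held at that moment.
        for ss in self.subsystems:
            fn = getattr(ss, hook, None)
            if fn is not None:
                fn()
                self.log.append(
                    (ss.name, hook, self.cgroup_mutex, self.threadgroup_rwsem))

    def migrate(self):
        self.cgroup_mutex = True
        self.threadgroup_rwsem = True
        self._call("attach")             # must not wait on kthread creation
        self.threadgroup_rwsem = False
        self._call("post_attach")        # heavy work is safe here
        self.cgroup_mutex = False
        return self.log


class Cpuset:
    name = "cpuset"

    def attach(self):
        pass  # would queue asynchronous NUMA node migration

    def post_attach(self):
        pass  # would flush the queued migration work


class Memcg:
    name = "memory"

    def post_attach(self):
        pass  # would move charges for the mm pinned earlier in can_attach


for entry in MigrationModel([Cpuset(), Memcg()]).migrate():
    print(entry)
```

Every post_attach entry in the log shows the rwsem flag as False, which is the property both commits rely on.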
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · bcc981e9
      Linus Torvalds committed
      Pull crypto fixes from Herbert Xu:
       "This fixes a couple of regressions in the talitos driver that were
        introduced back in 4.3.
      
        The first bug causes a crash when the driver's AEAD functionality is
        used while the second bug prevents its AEAD feature from working once
        you get past the first bug"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: talitos - fix AEAD tcrypt tests
        crypto: talitos - fix crash in talitos_cra_init()
  4. 25 Apr, 2016 1 commit
  5. 24 Apr, 2016 11 commits
  6. 23 Apr, 2016 6 commits
    • Merge tag 'pinctrl-v4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 09502d9f
      Linus Torvalds committed
      Pull pin control fixes from Linus Walleij:
       "Some pin control driver fixes came in.  One headed for stable and the
        other two are just ordinary merge window fixes.
      
         - Make the i.MX driver select REGMAP as a dependency
         - Fix up the Mediatek debounce time unit
         - Fix a real hairy ffs vs __ffs issue in the Single pinctrl driver"
      
      * tag 'pinctrl-v4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: single: Fix pcs_parse_bits_in_pinctrl_entry to use __ffs than ffs
        pinctrl: mediatek: correct debounce time unit in mtk_gpio_set_debounce
        pinctrl: imx: Kconfig: PINCTRL_IMX select REGMAP
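Editorial sketch (not part of the merge): the "ffs vs __ffs" distinction mentioned above comes down to indexing. ffs() reports the lowest set bit as a 1-based position and returns 0 for an input of 0, while the kernel's __ffs() reports a 0-based position and is undefined for 0. The Python functions below are stand-ins written for illustration, and the mask value is made up.

```python
def ffs(x):
    """1-based position of the lowest set bit, 0 if no bit is set."""
    return (x & -x).bit_length()


def kernel_ffs(x):
    """0-based position of the lowest set bit, modeled on __ffs().

    The kernel leaves the result undefined for 0; this model raises instead.
    """
    if x == 0:
        raise ValueError("__ffs(0) is undefined")
    return (x & -x).bit_length() - 1


mask = 0x30  # hypothetical two-bit field at bits 4 and 5
print(mask >> ffs(mask))         # 1: shifted one bit too far
print(mask >> kernel_ffs(mask))  # 3: field aligned to bit 0
```

Using the 1-based result as a shift count is off by one, which is the class of bug the pinctrl-single fix addresses.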
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · ddce1921
      Linus Torvalds committed
      Pull arm64 fixes from Catalin Marinas:
      
       - Cache invalidation fix for early CPU boot status update (incorrect
         cacheline)
      
       - of_node_put() missing in the spin_table code
      
       - EL1/EL2 early init inconsistency when Virtualisation Host Extensions
         are present
      
       - RCU warning fix in the arm_pmu.c driver
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: Fix EL1/EL2 early init inconsistencies with VHE
        drivers/perf: arm-pmu: fix RCU usage on pmu resume from low-power
        arm64: spin-table: add missing of_node_put()
        arm64: fix invalidation of wrong __early_cpu_boot_status cacheline
    • Merge tag 'powerpc-4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · ff061624
      Linus Torvalds committed
      Pull powerpc fixes from Michael Ellerman:
       "Three powerpc cpu feature fixes from Anton Blanchard:
      
         - scan_features() updated incorrect bits for REAL_LE
      
         - update cpu_user_features2 in scan_features()
      
         - update TM user feature bits in scan_features()"
      
      * tag 'powerpc-4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc: Update TM user feature bits in scan_features()
        powerpc: Update cpu_user_features2 in scan_features()
        powerpc: scan_features() updates incorrect bits for REAL_LE
    • Merge tag 'iommu-fixes-v4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 7c5047a1
      Linus Torvalds committed
      Pull IOMMU fixes from Joerg Roedel:
       "The fixes include:
      
         - Two patches to revert the use of default domains in the ARM SMMU
           driver.  Enabling this caused regressions which need more thorough
           fixing.  So the regressions are fixed for now by disabling the use
           of default domains.
      
         - A fix for a v4.4 regression in the AMD IOMMU driver which broke
           devices behind invisible PCIe-to-PCI bridges with IOMMU enabled"
      
      * tag 'iommu-fixes-v4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/arm-smmu: Don't allocate resources for bypass domains
        iommu/arm-smmu: Fix stream-match conflict with IOMMU_DOMAIN_DMA
        iommu/amd: Fix checking of pci dma aliases
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · d61fb48b
      Linus Torvalds committed
      Pull drm fixes from Dave Airlie:
       "i915, nouveau and amdgpu/radeon fixes in this:
      
        nouveau:
           Two fixes, one for a regression with dithering and one for a bug
           hit by the userspace drivers.
      
        i915:
           A few fixes, mostly things heading for stable, two important
           skylake GT3/4 hangs.
      
        radeon/amdgpu:
           Some audio, suspend/resume and some runtime PM fixes, along with
           two patches to harden the userptr ABI a bit"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (24 commits)
        drm: Loongson-3 doesn't fully support wc memory
        drm/nouveau/gr/gf100: select a stream master to fixup tfb offset queries
        amdgpu/uvd: add uvd fw version for amdgpu
        drm/amdgpu: forbid mapping of userptr bo through radeon device file
        drm/radeon: forbid mapping of userptr bo through radeon device file
        drm/amdgpu: bump the afmt limit for CZ, ST, Polaris
        drm/amdgpu: use defines for CRTCs and AMFT blocks
        drm/dp/mst: Validate port in drm_dp_payload_send_msg()
        drm/nouveau/kms: fix setting of default values for dithering properties
        drm/radeon: print a message if ATPX dGPU power control is missing
        Revert "drm/radeon: disable runtime pm on PX laptops without dGPU power control"
        drm/amdgpu/acp: fix resume on CZ systems with AZ audio
        drm/radeon: add a quirk for a XFX R9 270X
        drm/radeon: print pci revision as well as pci ids on driver load
        drm/i915: Use fw_domains_put_with_fifo() on HSW
        drm/i915: Force ringbuffers to not be at offset 0
        drm/i915: Adjust size of PIPE_CONTROL used for gen8 render seqno write
        drm/i915/skl: Fix spurious gpu hang with gt3/gt4 revs
        drm/i915/skl: Fix rc6 based gpu/system hang
        drm/i915/userptr: Hold mmref whilst calling get-user-pages
        ...
    • Merge tag 'sound-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · d4b05288
      Linus Torvalds committed
      Pull sound fixes from Takashi Iwai:
       "Again a relatively calm week without surprise: most of the fixes are about
        HD-audio, including fixes for Cirrus codec regression and a race over
        regmap access.  Although both changes are slightly unintuitive, the
        risk of further breakage is quite low, I hope.
      
        Other than that, all the rest are trivial"
      
      * tag 'sound-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Fix possible race on regmap bypass flip
        ALSA: pcxhr: Fix missing mutex unlock
        ALSA: hda - add PCI ID for Intel Broxton-T
        ALSA: hda - Keep powering up ADCs on Cirrus codecs
        ALSA: hda/realtek - Add ALC3234 headset mode for Optiplex 9020m
        ALSA - hda: hdmi check NULL pointer in hdmi_set_chmap
        ALSA: hda - Don't trust the reported actual power state
  7. 22 Apr, 2016 17 commits