1. 03 1月, 2013 5 次提交
    • M
      mm: mempolicy: Convert shared_policy mutex to spinlock · 42288fe3
      Mel Gorman 提交于
      Sasha was fuzzing with trinity and reported the following problem:
      
        BUG: sleeping function called from invalid context at kernel/mutex.c:269
        in_atomic(): 1, irqs_disabled(): 0, pid: 6361, name: trinity-main
        2 locks held by trinity-main/6361:
         #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff810aa314>] __do_page_fault+0x1e4/0x4f0
         #1:  (&(&mm->page_table_lock)->rlock){+.+...}, at: [<ffffffff8122f017>] handle_pte_fault+0x3f7/0x6a0
        Pid: 6361, comm: trinity-main Tainted: G        W
        3.7.0-rc2-next-20121024-sasha-00001-gd95ef01-dirty #74
        Call Trace:
          __might_sleep+0x1c3/0x1e0
          mutex_lock_nested+0x29/0x50
          mpol_shared_policy_lookup+0x2e/0x90
          shmem_get_policy+0x2e/0x30
          get_vma_policy+0x5a/0xa0
          mpol_misplaced+0x41/0x1d0
          handle_pte_fault+0x465/0x6a0
      
      This was triggered by a different version of automatic NUMA balancing
      but in theory the current version is vunerable to the same problem.
      
      do_numa_page
        -> numa_migrate_prep
          -> mpol_misplaced
            -> get_vma_policy
              -> shmem_get_policy
      
      It's very unlikely this will happen as shared pages are not marked
      pte_numa -- see the page_mapcount() check in change_pte_range() -- but
      it is possible.
      
      To address this, this patch restores sp->lock as originally implemented
      by Kosaki Motohiro.  In the path where get_vma_policy() is called, it
      should not be calling sp_alloc() so it is not necessary to treat the PTL
      specially.
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Tested-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      42288fe3
    • L
      Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 5439ca6b
      Linus Torvalds 提交于
      Pull ext4 bug fixes from Ted Ts'o:
       "Various bug fixes for ext4.  Perhaps the most serious bug fixed is one
        which could cause file system corruptions when performing file punch
        operations."
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: avoid hang when mounting non-journal filesystems with orphan list
        ext4: lock i_mutex when truncating orphan inodes
        ext4: do not try to write superblock on ro remount w/o journal
        ext4: include journal blocks in df overhead calcs
        ext4: remove unaligned AIO warning printk
        ext4: fix an incorrect comment about i_mutex
        ext4: fix deadlock in journal_unmap_buffer()
        ext4: split off ext4_journalled_invalidatepage()
        jbd2: fix assertion failure in jbd2_journal_flush()
        ext4: check dioread_nolock on remount
        ext4: fix extent tree corruption caused by hole punch
      5439ca6b
    • H
      mempolicy: remove arg from mpol_parse_str, mpol_to_str · a7a88b23
      Hugh Dickins 提交于
      Remove the unused argument (formerly no_context) from mpol_parse_str()
      and from mpol_to_str().
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7a88b23
    • H
      tmpfs mempolicy: fix /proc/mounts corrupting memory · f2a07f40
      Hugh Dickins 提交于
      Recently I suggested using "mount -o remount,mpol=local /tmp" in NUMA
      mempolicy testing.  Very nasty.  Reading /proc/mounts, /proc/pid/mounts
      or /proc/pid/mountinfo may then corrupt one bit of kernel memory, often
      in a page table (causing "Bad swap" or "Bad page map" warning or "Bad
      pagetable" oops), sometimes in a vm_area_struct or rbnode or somewhere
      worse.  "mpol=prefer" and "mpol=prefer:Node" are equally toxic.
      
      Recent NUMA enhancements are not to blame: this dates back to 2.6.35,
      when commit e17f74af "mempolicy: don't call mpol_set_nodemask() when
      no_context" skipped mpol_parse_str()'s call to mpol_set_nodemask(),
      which used to initialize v.preferred_node, or set MPOL_F_LOCAL in flags.
      With slab poisoning, you can then rely on mpol_to_str() to set the bit
      for node 0x6b6b, probably in the next page above the caller's stack.
      
      mpol_parse_str() is only called from shmem_parse_options(): no_context
      is always true, so call it unused for now, and remove !no_context code.
      Set v.nodes or v.preferred_node or MPOL_F_LOCAL as mpol_to_str() might
      expect.  Then mpol_to_str() can ignore its no_context argument also,
      the mpol being appropriately initialized whether contextualized or not.
      Rename its no_context unused too, and let subsequent patch remove them
      (that's not needed for stable backporting, which would involve rejects).
      
      I don't understand why MPOL_LOCAL is described as a pseudo-policy:
      it's a reasonable policy which suffers from a confusing implementation
      in terms of MPOL_PREFERRED with MPOL_F_LOCAL.  I believe this would be
      much more robust if MPOL_LOCAL were recognized in switch statements
      throughout, MPOL_F_LOCAL deleted, and MPOL_PREFERRED use the (possibly
      empty) nodes mask like everyone else, instead of its preferred_node
      variant (I presume an optimization from the days before MPOL_LOCAL).
      But that would take me too long to get right and fully tested.
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f2a07f40
    • E
      epoll: prevent missed events on EPOLL_CTL_MOD · 128dd175
      Eric Wong 提交于
      EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to
      ensure events are not missed.  Since the modifications to the interest
      mask are not protected by the same lock as ep_poll_callback, we need to
      ensure the change is visible to other CPUs calling ep_poll_callback.
      
      We also need to ensure f_op->poll() has an up-to-date view of past
      events which occured before we modified the interest mask.  So this
      barrier also pairs with the barrier in wq_has_sleeper().
      
      This should guarantee either ep_poll_callback or f_op->poll() (or both)
      will notice the readiness of a recently-ready/modified item.
      
      This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in:
      http://thread.gmane.org/gmane.linux.kernel/1408782/Signed-off-by: NEric Wong <normalperson@yhbt.net>
      Cc: Hans Verkuil <hans.verkuil@cisco.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andreas Voellmy <andreas.voellmy@yale.edu>
      Tested-by: N"Junchang(Jason) Wang" <junchang.wang@yale.edu>
      Cc: netdev@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      128dd175
  2. 31 12月, 2012 3 次提交
    • L
      Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux · 4a490b78
      Linus Torvalds 提交于
      Pull DRM update from Dave Airlie:
       "This is a bit larger due to me not bothering to do anything since
        before Xmas, and other people working too hard after I had clearly
        given up.
      
        It's got the 3 main x86 driver fixes pulls, and a bunch of tegra
        fixes, doesn't fix the Ironlake bug yet, but that does seem to be
        getting closer.
      
         - radeon: gpu reset fixes and userspace packet support
         - i915: watermark fixes, workarounds, i830/845 fix,
         - nouveau: nvd9/kepler microcode fixes, accel is now enabled and
           working, gk106 support
         - tegra: misc fixes."
      
      * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (34 commits)
        Revert "drm: tegra: protect DC register access with mutex"
        drm: tegra: program only one window during modeset
        drm: tegra: clean out old gem prototypes
        drm: tegra: remove redundant tegra2_tmds_config entry
        drm: tegra: protect DC register access with mutex
        drm: tegra: don't leave clients host1x member uninitialized
        drm: tegra: fix front_porch <-> back_porch mixup
        drm/nve0/graph: fix fuc, and enable acceleration on all known chipsets
        drm/nvc0/graph: fix fuc, and enable acceleration on GF119
        drm/nouveau/bios: cache ramcfg strap on later chipsets
        drm/nouveau/mxm: silence output if no bios data
        drm/nouveau/bios: parse/display extra version component
        drm/nouveau/bios: implement opcode 0xa9
        drm/nouveau/bios: update gpio parsing apis to match current design
        drm/nouveau: initial support for GK106
        drm/radeon: add WAIT_UNTIL to evergreen VM safe reg list
        drm/i915: disable shrinker lock stealing for create_mmap_offset
        drm/i915: optionally disable shrinker lock stealing
        drm/i915: fix flags in dma buf exporting
        drm/radeon: add support for MEM_WRITE packet
        ...
      4a490b78
    • L
      Merge tag 'omap-late-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 8d91a42e
      Linus Torvalds 提交于
      Pull late ARM cleanups for omap from Olof Johansson:
       "From Tony Lindgren:
      
        Here are few more patches to finish the omap changes for multiplatform
        conversion that are not strictly fixes, but were too complex to do
        with the dependencies during the merge window.  Those are to move of
        serial-omap.h to platform_data, and the removal of remaining
        cpu_is_omap macro usage outside mach-omap2.
      
        Then there are several trivial fixes for typos and few minimal
        omap2plus_defconfig updates."
      
      * tag 'omap-late-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        arch/arm/mach-omap2/dpll3xxx.c: drop if around WARN_ON
        OMAP2: Fix a typo - replace regist with register.
        ARM/omap: use module_platform_driver macro
        ARM: OMAP2+: PMU: Remove unused header
        ARM: OMAP4: remove duplicated include from omap_hwmod_44xx_data.c
        ARM: OMAP2+: omap2plus_defconfig: enable twl4030 SoC audio
        ARM: OMAP2+: omap2plus_defconfig: Add tps65217 support
        ARM: OMAP2+: enable devtmpfs and devtmpfs automount
        ARM: OMAP2+: omap_twl: Change TWL4030_MODULE_PM_RECEIVER to TWL_MODULE_PM_RECEIVER
        ARM: OMAP2+: Drop plat/cpu.h for omap2plus
        ARM: OMAP: Split fb.c to remove last remaining cpu_is_omap usage
        MAINTAINERS: Add an entry for omap related .dts files
      8d91a42e
    • L
      Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 4fe2dfab
      Linus Torvalds 提交于
      Pull ARM SoC fixes from Olof Johansson:
       "It's been quiet over the holidays, but we have had a couple of trivial
        fixes coming in for the newly introduced sunxi platform; one to add it
        to the multiplatform defconfig for build coverage, and one fixup for
        device tree strings."
      
      * tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        sunxi: Change the machine compatible string.
        ARM: multi_v7_defconfig: Add ARCH_SUNXI
      4fe2dfab
  3. 30 12月, 2012 10 次提交
  4. 29 12月, 2012 1 次提交
  5. 28 12月, 2012 4 次提交
    • O
      Merge tag 'sunxi-fixes-for-3.8-rc2' of git://github.com/mripard/linux into fixes · 2e376799
      Olof Johansson 提交于
      From Maxime Ripard:
      Fixes for the sunxi core to be merged in 3.8-rc2
      
      * tag 'sunxi-fixes-for-3.8-rc2' of git://github.com/mripard/linux:
        sunxi: Change the machine compatible string.
        ARM: multi_v7_defconfig: Add ARCH_SUNXI
      2e376799
    • L
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 101e5c74
      Linus Torvalds 提交于
      Pull hwmon fixes from Guenter Roeck:
      
       - Report i2c errors to userspace in lm73 driver
      
       - Fix problem with DIV_ROUND_CLOSEST and unsigned divisors in emc6w201
         driver
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (emc6w201) Fix DIV_ROUND_CLOSEST problem with unsigned divisors
        hwmon: (lm73} Detect and report i2c bus errors
      101e5c74
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · ddf75ae3
      Linus Torvalds 提交于
      Pull namespace fixes from Eric Biederman:
       "This tree includes two bug fixes for problems Oleg spotted on his
        review of the recent pid namespace work.  A small fix to not enable
        bottom halves with irqs disabled, and a trivial build fix for f2fs
        with user namespaces enabled."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        f2fs: Don't assign e_id in f2fs_acl_from_disk
        proc: Allow proc_free_inum to be called from any context
        pidns: Stop pid allocation when init dies
        pidns: Outlaw thread creation after unshare(CLONE_NEWPID)
      ddf75ae3
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7fd83b47
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
      1) GRE tunnel drivers don't set the transport header properly, they also
         blindly deref the inner protocol ipv4 and needs some checks.  Fixes
         from Isaku Yamahata.
      
      2) Fix sleeps while atomic in netdevice rename code, from Eric Dumazet.
      
      3) Fix double-spinlock in solos-pci driver, from Dan Carpenter.
      
      4) More ARP bug fixes.  Fix lockdep splat in arp_solicit() and then the
         bug accidentally added by that fix.  From Eric Dumazet and Cong Wang.
      
      5) Remove some __dev* annotations that slipped back in, as well as all
         HOTPLUG references.  From Greg KH
      
      6) RDS protocol uses wrong interfaces to access scatter-gather elements,
         causing a regression.  From Mike Marciniszyn.
      
      7) Fix build error in cpts driver, from Richard Cochran.
      
      8) Fix arithmetic in packet scheduler, from Stefan Hasko.
      
      9) Similarly, fix association during calculation of random backoff in
         batman-adv.  From Akinobu Mita.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
        ipv6/ip6_gre: set transport header correctly
        ipv4/ip_gre: set transport header correctly to gre header
        IB/rds: suppress incompatible protocol when version is known
        IB/rds: Correct ib_api use with gs_dma_address/sg_dma_len
        net/vxlan: Use the underlying device index when joining/leaving multicast groups
        tcp: should drop incoming frames without ACK flag set
        netprio_cgroup: define sk_cgrp_prioidx only if NETPRIO_CGROUP is enabled
        cpts: fix a run time warn_on.
        cpts: fix build error by removing useless code.
        batman-adv: fix random jitter calculation
        arp: fix a regression in arp_solicit()
        net: sched: integer overflow fix
        CONFIG_HOTPLUG removal from networking core
        Drivers: network: more __dev* removal
        bridge: call br_netpoll_disable in br_add_if
        ipv4: arp: fix a lockdep splat in arp_solicit()
        tuntap: dont use a private kmem_cache
        net: devnet_rename_seq should be a seqcount
        ip_gre: fix possible use after free
        ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally
        ...
      7fd83b47
  6. 27 12月, 2012 13 次提交
  7. 26 12月, 2012 4 次提交
    • E
      f2fs: Don't assign e_id in f2fs_acl_from_disk · 48c6d121
      Eric W. Biederman 提交于
      With user namespaces enabled building f2fs fails with:
      
       CC      fs/f2fs/acl.o
      fs/f2fs/acl.c: In function ‘f2fs_acl_from_disk’:
      fs/f2fs/acl.c:85:21: error: ‘struct posix_acl_entry’ has no member named ‘e_id’
      make[2]: *** [fs/f2fs/acl.o] Error 1
      make[2]: Target `__build' not remade because of errors.
      
      e_id is a backwards compatibility field only used for file systems
      that haven't been converted to use kuids and kgids.  When the posix
      acl tag field is neither ACL_USER nor ACL_GROUP assigning e_id is
      unnecessary.  Remove the assignment so f2fs will build with user
      namespaces enabled.
      
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Amit Sahrawat <a.sahrawat@samsung.com>
      Acked-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      48c6d121
    • E
      proc: Allow proc_free_inum to be called from any context · dfb2ea45
      Eric W. Biederman 提交于
      While testing the pid namespace code I hit this nasty warning.
      
      [  176.262617] ------------[ cut here ]------------
      [  176.263388] WARNING: at /home/eric/projects/linux/linux-userns-devel/kernel/softirq.c:160 local_bh_enable_ip+0x7a/0xa0()
      [  176.265145] Hardware name: Bochs
      [  176.265677] Modules linked in:
      [  176.266341] Pid: 742, comm: bash Not tainted 3.7.0userns+ #18
      [  176.266564] Call Trace:
      [  176.266564]  [<ffffffff810a539f>] warn_slowpath_common+0x7f/0xc0
      [  176.266564]  [<ffffffff810a53fa>] warn_slowpath_null+0x1a/0x20
      [  176.266564]  [<ffffffff810ad9ea>] local_bh_enable_ip+0x7a/0xa0
      [  176.266564]  [<ffffffff819308c9>] _raw_spin_unlock_bh+0x19/0x20
      [  176.266564]  [<ffffffff8123dbda>] proc_free_inum+0x3a/0x50
      [  176.266564]  [<ffffffff8111d0dc>] free_pid_ns+0x1c/0x80
      [  176.266564]  [<ffffffff8111d195>] put_pid_ns+0x35/0x50
      [  176.266564]  [<ffffffff810c608a>] put_pid+0x4a/0x60
      [  176.266564]  [<ffffffff8146b177>] tty_ioctl+0x717/0xc10
      [  176.266564]  [<ffffffff810aa4d5>] ? wait_consider_task+0x855/0xb90
      [  176.266564]  [<ffffffff81086bf9>] ? default_spin_lock_flags+0x9/0x10
      [  176.266564]  [<ffffffff810cab0a>] ? remove_wait_queue+0x5a/0x70
      [  176.266564]  [<ffffffff811e37e8>] do_vfs_ioctl+0x98/0x550
      [  176.266564]  [<ffffffff810b8a0f>] ? recalc_sigpending+0x1f/0x60
      [  176.266564]  [<ffffffff810b9127>] ? __set_task_blocked+0x37/0x80
      [  176.266564]  [<ffffffff810ab95b>] ? sys_wait4+0xab/0xf0
      [  176.266564]  [<ffffffff811e3d31>] sys_ioctl+0x91/0xb0
      [  176.266564]  [<ffffffff810a95f0>] ? task_stopped_code+0x50/0x50
      [  176.266564]  [<ffffffff81939199>] system_call_fastpath+0x16/0x1b
      [  176.266564] ---[ end trace 387af88219ad6143 ]---
      
      It turns out that spin_unlock_bh(proc_inum_lock) is not safe when
      put_pid is called with another spinlock held and irqs disabled.
      
      For now take the easy path and use spin_lock_irqsave(proc_inum_lock)
      in proc_free_inum and spin_loc_irq in proc_alloc_inum(proc_inum_lock).
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      dfb2ea45
    • E
      pidns: Stop pid allocation when init dies · c876ad76
      Eric W. Biederman 提交于
      Oleg pointed out that in a pid namespace the sequence.
      - pid 1 becomes a zombie
      - setns(thepidns), fork,...
      - reaping pid 1.
      - The injected processes exiting.
      
      Can lead to processes attempting access their child reaper and
      instead following a stale pointer.
      
      That waitpid for init can return before all of the processes in
      the pid namespace have exited is also unfortunate.
      
      Avoid these problems by disabling the allocation of new pids in a pid
      namespace when init dies, instead of when the last process in a pid
      namespace is reaped.
      Pointed-out-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      c876ad76
    • M
      ext4: do not try to write superblock on ro remount w/o journal · d096ad0f
      Michael Tokarev 提交于
      When a journal-less ext4 filesystem is mounted on a read-only block
      device (blockdev --setro will do), each remount (for other, unrelated,
      flags, like suid=>nosuid etc) results in a series of scary messages
      from kernel telling about I/O errors on the device.
      
      This is becauese of the following code ext4_remount():
      
             if (sbi->s_journal == NULL)
                      ext4_commit_super(sb, 1);
      
      at the end of remount procedure, which forces writing (flushing) of
      a superblock regardless whenever it is dirty or not, if the filesystem
      is readonly or not, and whenever the device itself is readonly or not.
      
      We only need call ext4_commit_super when the file system had been
      previously mounted read/write.
      
      Thanks to Eric Sandeen for help in diagnosing this issue.
      Signed-off-By: NMichael Tokarev <mjt@tls.msk.ru>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      d096ad0f