1. 27 May, 2014 3 commits
    • C
      drm/i915: Prevent negative relocation deltas from wrapping · d23db88c
      Chris Wilson committed
      This is pure evil. Userspace, I'm looking at you SNA, repacks batch
      buffers on the fly after generation as they are being passed to the
      kernel for execution. These batches also contain self-referenced
      relocations as a single buffer encompasses the state commands, kernels,
      vertices and sampler. During generation the buffers are placed at known
      offsets within the full batch, and then the relocation deltas (as passed
      to the kernel) are tweaked as the batch is repacked into a smaller buffer.
      This means that userspace is passing negative relocation deltas, which
      subsequently wrap to large values if the batch is at a low address. The
      GPU hangs when it then tries to use the large value as a base for its
      address offsets, rather than wrapping back to the real value (as one
      would hope). As the GPU uses positive offsets from the base, we can
      treat the relocation address as the minimum address read by the GPU.
      For the upper bound, we trust that userspace will not read beyond the
      end of the buffer.
      
      So, how do we fix negative relocations from wrapping? We can either
      check that every relocation looks valid when we write it, and then
      position each object such that we prevent the offset wraparound, or we
      just special-case the self-referential behaviour of SNA and force all
      batches to be above 256k. Daniel prefers the latter approach.
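
      A minimal sketch of the wraparound described above, assuming 32-bit GTT
      addresses and a delta applied as a signed 32-bit value. The helper names
      are hypothetical; BATCH_OFFSET_BIAS borrows the name used later in this
      message and the 256k value from the text. This is not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: the text says batches are forced above 256k. */
#define BATCH_OFFSET_BIAS (256u * 1024u)

/* Address the GPU would see: base plus signed delta, modulo 2^32. */
static uint32_t reloc_target(uint32_t batch_offset, int32_t delta)
{
	return batch_offset + (uint32_t)delta;
}

/* True if a negative delta is larger than the batch offset, so it wraps. */
static bool reloc_wraps(uint32_t batch_offset, int32_t delta)
{
	return delta < 0 && (uint32_t)(-(int64_t)delta) > batch_offset;
}
```

      With a batch at 0x1000, a delta of -0x2000 wraps to 0xFFFFF000; with the
      batch placed at the bias, the same delta resolves to 0x3E000.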
      
      This fixes a GPU hang when it tries to use an address (relocation +
      offset) greater than the GTT size. The issue would occur quite easily
      with full-ppgtt as each fd gets its own VM space, so low offsets would
      often be handed out. However, with the rearrangement of the low GTT due
      to capturing the BIOS framebuffer, it is already affecting kernels 3.15
      onwards. I think only IVB+ is susceptible to this bug, but the workaround
      should only kick in rarely, so it seems sensible to always apply it.
      
      v3: Use a bias for batch buffers to prevent small negative delta relocations
      from wrapping.
      
      v4 from Daniel:
      - s/BIAS/BATCH_OFFSET_BIAS/
      - Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
        were growing rather cumbersome.
      - Add a comment to eb_get_batch explaining why we do this.
      - Apply the batch offset bias everywhere but mention that we've only
        observed it on gen7 gpus.
      - Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.
      
      v5: Add static to eb_get_batch, spotted by 0-day tester.
      
      Testcase: igt/gem_bad_reloc
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      d23db88c
    • C
      drm/i915: Only copy back the modified fields to userspace from execbuffer · 9aab8bff
      Chris Wilson committed
      We only want to modify a single field in the userspace view of the
      execbuffer command buffer, so explicitly change that rather than copy
      everything back again.
      
      This serves two purposes:
      
      1. The single fields are much cheaper to copy (constant size so the
      copy uses special case code) and much smaller than the whole array.
      
      2. We modify the array for internal use that need to be masked from
      the user.
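
      The idea can be sketched in plain C, with memcpy standing in for the
      copy to userspace. The struct layout and function name are hypothetical,
      chosen only to illustrate copying one field per entry instead of the
      whole array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a userspace-visible execbuffer entry. */
struct exec_entry {
	uint32_t handle;
	uint32_t flags;   /* the kernel copy may carry internal bits here */
	uint64_t offset;  /* the one field userspace needs back */
};

/* Copy back only `offset`; other fields of the user copy stay untouched. */
static void copy_back_offsets(struct exec_entry *user,
			      const struct exec_entry *kernel, size_t count)
{
	for (size_t i = 0; i < count; i++)
		memcpy(&user[i].offset, &kernel[i].offset,
		       sizeof(user[i].offset));
}
```

      Because the copy size is a compile-time constant, each per-field copy
      can use a fixed-size fast path, and internal flag bits never reach the
      user's view.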
      
      Note: We need this backported since without it the next bugfix will
      blow up when userspace recycles batchbuffers and relocations.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      9aab8bff
    • C
      drm/i915: Fix dynamic allocation of physical handles · 00731155
      Chris Wilson committed
      A single object may be referenced by multiple registers fundamentally
      breaking the static allotment of ids in the current design. When the
      object is used the second time, the physical address of the first
      assignment is relinquished and a second one granted. However, the
      hardware is still reading (and possibly writing) to the old physical
      address now returned to the system. Eventually hilarity will ensue, but
      in the short term, it just means that cursors are broken when using more
      than one pipe.
      
      v2: Fix up leak of pci handle when handling an error during attachment,
      and avoid a double kmap/kunmap. (Ville)
      Rebase against -fixes.
      
      v3: And fix the error handling added in v2 (Ville)
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77351
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      00731155
  2. 26 May, 2014 10 commits
  3. 25 May, 2014 3 commits
    • J
      hwmon: (ntc_thermistor) Fix OF device ID mapping · ead82d67
      Jean Delvare committed
      The mapping from OF device IDs to platform device IDs is wrong.
      TYPE_NCPXXWB473 is 0, TYPE_NCPXXWL333 is 1, so
      ntc_thermistor_id[TYPE_NCPXXWB473] is { "ncp15wb473", TYPE_NCPXXWB473 }
      while
      ntc_thermistor_id[TYPE_NCPXXWL333] is { "ncp18wb473", TYPE_NCPXXWB473 }.
      
      So the name is wrong for all but the "ntc,ncp15wb473" entry, and the
      type is wrong for the "ntc,ncp15wl333" entry.
      
      So map the entries by index, it is neither elegant nor robust but at
      least it is correct.
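
      A small sketch of the mismatch, using an abbreviated, illustrative ID
      table (the real table has more entries). The lookup helpers are
      hypothetical names, not functions from the driver.

```c
#include <stddef.h>
#include <string.h>

enum { TYPE_NCPXXWB473, TYPE_NCPXXWL333 };

struct id_entry {
	const char *name;
	int type;
};

/* Abbreviated table: several names share one type value. */
static const struct id_entry ntc_ids[] = {
	{ "ncp15wb473", TYPE_NCPXXWB473 },
	{ "ncp18wb473", TYPE_NCPXXWB473 },
	{ "ncp15wl333", TYPE_NCPXXWL333 },
};

/* Buggy: uses the type value as a table index. */
static const struct id_entry *lookup_by_type(int type)
{
	return &ntc_ids[type];
}

/* Fixed: uses the entry's position in the table. */
static const struct id_entry *lookup_by_index(size_t idx)
{
	return &ntc_ids[idx];
}

/* Convenience check used when comparing lookups. */
static int name_is(const struct id_entry *e, const char *name)
{
	return strcmp(e->name, name) == 0;
}
```

      Indexing by TYPE_NCPXXWL333 lands on "ncp18wb473" with the wrong type,
      while indexing by position 2 returns the intended "ncp15wl333" entry.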
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
      Fixes: 9e8269de hwmon: (ntc_thermistor) Add DT with IIO support to NTC thermistor driver
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Naveen Krishna Chatradhi <ch.naveen@samsung.com>
      Cc: Doug Anderson <dianders@chromium.org>
      ead82d67
    • J
      hwmon: (ntc_thermistor) Fix dependencies · 59cf4243
      Jean Delvare committed
      In commit 9e8269de, support was added for ntc_thermistor devices being
      declared in the device tree and implemented on top of IIO. With that
      change, a dependency was added to the ntc_thermistor driver:
      
      	depends on (!OF && !IIO) || (OF && IIO)
      
      This construct has the drawback that the driver can no longer be
      selected when OF is set and IIO isn't, nor when IIO is set and OF is
      not. This is a regression for the original users of the driver.
      
      As the new code depends on IIO and is useless without OF, include it
      only if both are enabled, and set the dependencies accordingly. This
      is clearer, simpler and more correct.
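
      The old expression can be checked as a truth table. The sketch below
      models the two Kconfig symbols as booleans; the function names are
      hypothetical, and the second function only restates the rule described
      above rather than the exact new Kconfig text.

```c
#include <stdbool.h>

/* Old dependency: (!OF && !IIO) || (OF && IIO). */
static bool old_dep_satisfied(bool of, bool iio)
{
	return (!of && !iio) || (of && iio);
}

/* Rule after the fix: the DT/IIO code is built only when both are set. */
static bool dt_code_included(bool of, bool iio)
{
	return of && iio;
}
```

      The old expression rejects the two mixed configurations (OF without IIO,
      and IIO without OF), which is the regression described above.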
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
      Fixes: 9e8269de hwmon: (ntc_thermistor) Add DT with IIO support to NTC thermistor driver
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Naveen Krishna Chatradhi <ch.naveen@samsung.com>
      Cc: Doug Anderson <dianders@chromium.org>
      59cf4243
    • J
      hwmon: Document temp[1-*]_min_hyst sysfs attribute · 01325145
      Jean Delvare committed
      The temp[1-*]_min_hyst sysfs attribute is already implemented by 3
      hwmon drivers (adt7x10, lm77 and lm92) but was missing from the
      standard interface.
      
      Also add temp[1-*]_lcrit_hyst for consistency, even though no driver
      implements that one for the time being.
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      01325145
  4. 24 May, 2014 17 commits
    • L
      Merge tag 'dmaengine-fixes-3.15-rc5' of... · 03743007
      Linus Torvalds committed
      Merge tag 'dmaengine-fixes-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine
      
      Pull dmaengine fixes from Dan Williams:
       "Two fixes for -stable:
      
         - async_mult() sometimes maps fewer buffers than initially requested.
            We end up freeing dmaengine_unmap_data on an invalid pool.
      
         - mv_xor: register write ordering fix"
      
      * tag 'dmaengine-fixes-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
        dmaengine: fix dmaengine_unmap failure
        dma: mv_xor: Flush descriptors before activating a channel
      03743007
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 1ee1ceaf
      Linus Torvalds committed
      Pull sparc fixes from David Miller:
       "A small bunch of bug fixes, in particular:
      
         1) On older cpus we need a different chunk of virtual address space
            to map the huge page TSB.
      
         2) Missing memory barrier in Niagara2 memcpy.
      
         3) trinity showed some places where fault validation was
            unnecessarily loud on sparc64
      
         4) Some sysfs printf's need a type adjustment, from Toralf Förster"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: fix format string mismatch in arch/sparc/kernel/sysfs.c
        sparc64: Add membar to Niagara2 memcpy code.
        sparc64: Fix huge TSB mapping on pre-UltraSPARC-III cpus.
        sparc64: Don't bark so loudly about 32-bit tasks generating 64-bit fault addresses.
      1ee1ceaf
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 5fa6a683
      Linus Torvalds committed
      Pull networking fixes from David Miller:
       "It looks like a sizable collection but this is nearly 3 weeks of bug
        fixing while you were away.
      
         1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute
            the packet through a non-IPSEC protected path and the code has to
            be able to handle SKBs attached to routes lacking an attached xfrm
            state.  From Steffen Klassert.
      
         2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported
            sub-protocols, also from Steffen Klassert.
      
         3) Set local_df on fragmented netfilter skbs otherwise we won't be
            able to forward successfully, from Florian Westphal.
      
         4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without
            holding RCU lock, from Bjorn Mork.
      
         5) local_df test in ip_may_fragment is inverted, from Florian
            Westphal.
      
         6) jme driver doesn't check for DMA mapping failures, from Neil
            Horman.
      
         7) qlogic driver doesn't calculate number of TX queues properly, from
            Shahed Shaikh.
      
         8) fib_info_cnt can drift irreversibly positive if we fail to
            allocate the fi->fib_metrics array, from Sergey Popovich.
      
         9) Fix use after free in ip6_route_me_harder(), also from Sergey
            Popovich.
      
        10) When SYSCTL is disabled, we don't handle local_port_range and
            ping_group_range defaults properly at all, from Cong Wang.
      
        11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim
            driver, fix from Bjorn Mork.
      
        12) cassini driver needs nested lock annotations for TX locking, from
            Emil Goode.
      
        13) On init error ipv6 VTI driver can unregister pernet ops twice,
            oops.  Fix from Mathias Krause.
      
        14) If macvlan device is down, don't propagate IFF_ALLMULTI changes,
            from Peter Christensen.
      
        15) Missing NULL pointer check while parsing netlink config options in
            ip6_tnl_validate().  From Susant Sahani.
      
        16) Fix handling of neighbour entries during ipv6 router reachability
            probing, from Duan Jiong.
      
        17) x86 and s390 JIT address randomization has some address
            calculation bugs leading to crashes, from Alexei Starovoitov and
            Heiko Carstens.
      
        18) Clear up those uglies with nop patching and net_get_random_once(),
            from Hannes Frederic Sowa.
      
        19) Option length miscalculated in ip6_append_data(), fix also from
            Hannes Frederic Sowa.
      
        20) A while ago we fixed a race during device unregistry when a
            namespace went down, turns out there is a second place that needs
            similar protection.  From Cong Wang.
      
        21) In the new Altera TSE driver multicast filtering isn't working,
            disable it and just use promisc mode until the cause is found.
            From Vince Bridgers.
      
        22) When we disable router enabling in ipv6 we have to flush the
            cached routes explicitly, from Duan Jiong.
      
        23) NBMA tunnels should not cache routes on the tunnel object because
            the key is variable, from Timo Teräs.
      
        24) With stacked devices GRO information in skb->cb[] can be not setup
            properly, make sure it is in all code paths.  From Eric Dumazet.
      
        25) Really fix stacked vlan locking, multiple levels of nesting with
            intervening non-vlan devices are possible.  From Vlad Yasevich.
      
        26) Fallback ipip tunnel device's mtu is not setup properly, from
            Steffen Klassert.
      
        27) The packet scheduler's tcindex filter can crash because we
            structure copy objects with list_head's inside, oops.  From Cong
            Wang.
      
        28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric
            Dumazet.
      
        29) In some configurations 'itag' in __mkroute_input() can end up
            being used uninitialized because of how fib_validate_source()
            works.  Fix it by explicitly initializing itag to zero like all the
            other fib_validate_source() callers do, from Li RongQing"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
        batman: fix a bogus warning from batadv_is_on_batman_iface()
        ipv4: initialise the itag variable in __mkroute_input
        bonding: Send ALB learning packets using the right source
        bonding: Don't assume 802.1Q when sending alb learning packets.
        net: doc: Update references to skb->rxhash
        stmmac: Remove unbalanced clk_disable call
        ipv6: gro: fix CHECKSUM_COMPLETE support
        net_sched: fix an oops in tcindex filter
        can: peak_pci: prevent use after free at netdev removal
        ip_tunnel: Initialize the fallback device properly
        vlan: Fix build error wth vlan_get_encap_level()
        can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option
        MAINTAINERS: Pravin Shelar is Open vSwitch maintainer.
        bnx2x: Convert return 0 to return rc
        bonding: Fix alb mode to only use first level vlans.
        bonding: Fix stacked device detection in arp monitoring
        macvlan: Fix lockdep warnings with stacked macvlan devices
        vlan: Fix lockdep warning with stacked vlan devices.
        net: Allow for more then a single subclass for netif_addr_lock
        net: Find the nesting level of a given device by type.
        ...
      5fa6a683
    • L
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f02f79db
      Linus Torvalds committed
      Pull scheduler fixes from Ingo Molnar:
       "The biggest commit is an irqtime accounting loop latency fix, the rest
        are misc fixes all over the place: deadline scheduling, docs, numa,
        balancer and a bad to-idle latency fix"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/numa: Initialize newidle balance stats in sd_numa_init()
        sched: Fix updating rq->max_idle_balance_cost and rq->next_balance in idle_balance()
        sched: Skip double execution of pick_next_task_fair()
        sched: Use CPUPRI_NR_PRIORITIES instead of MAX_RT_PRIO in cpupri check
        sched/deadline: Fix memory leak
        sched/deadline: Fix sched_yield() behavior
        sched: Sanitize irq accounting madness
        sched/docbook: Fix 'make htmldocs' warnings caused by missing description
      f02f79db
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e6a32c3a
      Linus Torvalds committed
      Pull perf fixes from Ingo Molnar:
       "The biggest changes are fixes for races that kept triggering Trinity
        crashes, plus liblockdep build fixes and smaller misc fixes.
      
        The liblockdep bits in perf/urgent are a pull mistake - they should
        have been in locking/urgent - but by the time I noticed other commits
        were added and testing was done :-/ Sorry about that"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()
        perf: Prevent false warning in perf_swevent_add
        perf: Limit perf_event_attr::sample_period to 63 bits
        tools/liblockdep: Remove all build files when doing make clean
        tools/liblockdep: Build liblockdep from tools/Makefile
        perf/x86/intel: Fix Silvermont's event constraints
        perf: Fix perf_event_init_context()
        perf: Fix race in removing an event
      e6a32c3a
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 2b2d323a
      Linus Torvalds committed
      Pull drm radeon and nouveau fixes from Dave Airlie:
       "Fixes for the other big two.
      
        The radeon VCE one is large but it fixes some userspace triggerable
        issues, otherwise it's blackscreens and oopses.
      
        Nouveau fixes a bleeding laptop panel issue when displayport is used
        sometimes"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/radeon/pm: don't allow debugfs/sysfs access when PX card is off (v2)
        drm/radeon: avoid segfault on device open when accel is not working.
        drm/radeon: fix typo in finding PLL params
        drm/radeon: fix register typo on si
        drm/radeon: fix buffer placement under memory pressure v2
        drm/radeon: fix page directory update size estimation
        drm/radeon: handle non-VGA class pci devices with ATRM
        drm/radeon: fix DCE83 check for mullins
        drm/radeon: check VCE relocation buffer range v3
        drm/radeon: also try GART for CPU accessed buffers
        drm/gf119-/disp: fix nasty bug which can clobber SOR0's clock setup
        drm/nvd9/therm: handle another kind of PWM fan
      2b2d323a
    • L
      Merge branch 'akpm' (incoming from Andrew) · fc3ac5c7
      Linus Torvalds committed
      Merge misc fixes from Andrew Morton:
       "9 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        MAINTAINERS: add closing angle bracket to Vince Bridgers' email address
        Documentation: fix DOCBOOKS=... building
        ocfs2: fix double kmem_cache_destroy in dlm_init
        mm/memory-failure.c: fix memory leak by race between poison and unpoison
        wait: swap EXIT_ZOMBIE(Z) and EXIT_DEAD(X) chars in TASK_STATE_TO_CHAR_STR
        memcg: fix swapcache charge from kernel thread context
        mm: madvise: fix MADV_WILLNEED on shmem swapouts
        mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT
        hwpoison, hugetlb: lock_page/unlock_page does not match for handling a free hugepage
      fc3ac5c7
    • T
      MAINTAINERS: add closing angle bracket to Vince Bridgers' email address · 0d9327ab
      Tobias Klauser committed
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Cc: Vince Bridgers <vbridgers2013@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0d9327ab
    • J
      Documentation: fix DOCBOOKS=... building · e60cbeed
      Johannes Berg committed
      Prior to commit 42661299 ("[media] DocBook: Move all media docbook
      stuff into its own directory") it was possible to build only a single
      (or more) book(s) by calling, for example
      
          make htmldocs DOCBOOKS=80211.xml
      
      This now fails:
      
          cp: target `.../Documentation/DocBook//media_api' is not a directory
      
      Ignore errors from that copy to make this possible again.
      
      Fixes: 42661299 ("[media] DocBook: Move all media docbook stuff into its own directory")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Acked-by: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e60cbeed
    • J
      ocfs2: fix double kmem_cache_destroy in dlm_init · 66db6cfd
      Joseph Qi committed
      In dlm_init, if creating dlm_lockname_cache fails in
      dlm_init_master_caches, the previously created dlm_lockres_cache is
      destroyed twice.  This crashes the system when loading modules.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      66db6cfd
    • N
      mm/memory-failure.c: fix memory leak by race between poison and unpoison · 3e030ecc
      Naoya Horiguchi committed
      When a memory error happens on an in-use page or (free and in-use)
      hugepage, the victim page is isolated with its refcount set to one.
      
      When you try to unpoison it later, unpoison_memory() calls put_page()
      for it twice in order to bring the page back to free page pool (buddy or
      free hugepage list).  However, if another memory error occurs on the
      page which we are unpoisoning, memory_failure() returns without
      releasing the refcount which was incremented in the same call at first,
      which results in a memory leak and inconsistent num_poisoned_pages
      statistics.  This patch fixes it.
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: <stable@vger.kernel.org>    [2.6.32+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3e030ecc
    • M
      wait: swap EXIT_ZOMBIE(Z) and EXIT_DEAD(X) chars in TASK_STATE_TO_CHAR_STR · ad0f614e
      Masatake YAMATO committed
      In commit ad86622b ("wait: swap EXIT_ZOMBIE and EXIT_DEAD to hide
      EXIT_TRACE from user-space") the order of task state definitions was
      changed: EXIT_DEAD and EXIT_ZOMBIE were swapped.  However, the characters
      for the states in the TASK_STATE_TO_CHAR_STR string were not updated.  This
      patch synchronizes the string to the order of definitions.
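
      A sketch of why the string must follow the definition order, assuming
      each state is a single bit and the character is found by bit position.
      The bit values and the shortened state strings here are illustrative,
      and state_char is a hypothetical helper.

```c
/* Illustrative bit values: EXIT_DEAD now precedes EXIT_ZOMBIE. */
#define EXIT_DEAD   16
#define EXIT_ZOMBIE 32

/*
 * Map a single-bit state to its character: index 0 is the running state,
 * index n is the character for bit n-1. The caller must pass a state whose
 * bit position is covered by state_str.
 */
static char state_char(const char *state_str, unsigned int state)
{
	unsigned int idx = 0;

	while (state) {
		idx++;
		state >>= 1;
	}
	return state_str[idx];
}
```

      With the stale ordering "RSDTtZX", EXIT_DEAD is reported as 'Z'; with
      the synchronized ordering "RSDTtXZ", EXIT_DEAD maps to 'X' and
      EXIT_ZOMBIE to 'Z'.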
      Signed-off-by: Masatake YAMATO <yamato@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ad0f614e
    • M
      memcg: fix swapcache charge from kernel thread context · 6f6acb00
      Michal Hocko committed
      Commit 284f39af ("mm: memcg: push !mm handling out to page cache
      charge function") explicitly checks for page cache charges without any
      mm context (from kernel thread context[1]).
      
      This seemed to be the only possible case where memory could be charged
      without mm context so commit 03583f1a ("memcg: remove unnecessary
      !mm check from try_get_mem_cgroup_from_mm()") removed the mm check from
      get_mem_cgroup_from_mm().  This however caused another NULL ptr
      dereference during early boot when loopback kernel thread splices to
      tmpfs as reported by Stephan Kulow:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000360
        IP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
        Oops: 0000 [#1] SMP
        Modules linked in: btrfs dm_multipath dm_mod scsi_dh multipath raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod parport_pc parport nls_utf8 isofs usb_storage iscsi_ibft iscsi_boot_sysfs arc4 ecb fan thermal nfs lockd fscache nls_iso8859_1 nls_cp437 sg st hid_generic usbhid af_packet sunrpc sr_mod cdrom ata_generic uhci_hcd virtio_net virtio_blk ehci_hcd usbcore ata_piix floppy processor button usb_common virtio_pci virtio_ring virtio edd squashfs loop ppa]
        CPU: 0 PID: 97 Comm: loop1 Not tainted 3.15.0-rc5-5-default #1
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        Call Trace:
          __mem_cgroup_try_charge_swapin+0x40/0xe0
          mem_cgroup_charge_file+0x8b/0xd0
          shmem_getpage_gfp+0x66b/0x7b0
          shmem_file_splice_read+0x18f/0x430
          splice_direct_to_actor+0xa2/0x1c0
          do_lo_receive+0x5a/0x60 [loop]
          loop_thread+0x298/0x720 [loop]
          kthread+0xc6/0xe0
          ret_from_fork+0x7c/0xb0
      
      Also Branimir Maksimovic reported the following oops which is triggered
      for the swapcache charge path from the accounting code for kernel threads:
      
        CPU: 1 PID: 160 Comm: kworker/u8:5 Tainted: P           OE 3.15.0-rc5-core2-custom #159
        Hardware name: System manufacturer System Product Name/MAXIMUSV GENE, BIOS 1903 08/19/2013
        task: ffff880404e349b0 ti: ffff88040486a000 task.ti: ffff88040486a000
        RIP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
        Call Trace:
          __mem_cgroup_try_charge_swapin+0x45/0xf0
          mem_cgroup_charge_file+0x9c/0xe0
          shmem_getpage_gfp+0x62c/0x770
          shmem_write_begin+0x38/0x40
          generic_perform_write+0xc5/0x1c0
          __generic_file_aio_write+0x1d1/0x3f0
          generic_file_aio_write+0x4f/0xc0
          do_sync_write+0x5a/0x90
          do_acct_process+0x4b1/0x550
          acct_process+0x6d/0xa0
          do_exit+0x827/0xa70
          kthread+0xc3/0xf0
      
      This patch fixes the issue by reintroducing mm check into
      get_mem_cgroup_from_mm.  We could do the same trick in
      __mem_cgroup_try_charge_swapin as we do for the regular page cache path
      but it is not worth the trouble.  The check is not that expensive and it is
      better to have get_mem_cgroup_from_mm more robust.
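
      A minimal userspace sketch of such a check. The struct definitions are
      simplified stand-ins, the function name is hypothetical, and falling
      back to a root group when mm is NULL is an assumption made for
      illustration; the text above only says the mm check is reintroduced.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct mem_cgroup {
	const char *name;
};

struct mm_struct {
	struct mem_cgroup *memcg;
};

static struct mem_cgroup root_mem_cgroup = { "root" };

/* Tolerate a NULL mm (kernel thread context) instead of dereferencing it. */
static struct mem_cgroup *get_memcg_from_mm(const struct mm_struct *mm)
{
	if (mm == NULL)
		return &root_mem_cgroup; /* assumed fallback */
	return mm->memcg ? mm->memcg : &root_mem_cgroup;
}
```

      Callers such as the swapcache charge path can then pass whatever mm they
      have without checking it first.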
      
      [1] - http://marc.info/?l=linux-mm&m=139463617808941&w=2
      
      Fixes: 03583f1a ("memcg: remove unnecessary !mm check from try_get_mem_cgroup_from_mm()")
      Reported-and-tested-by: Stephan Kulow <coolo@suse.com>
      Reported-by: Branimir Maksimovic <branimir.maksimovic@gmail.com>
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6f6acb00
    • J
      mm: madvise: fix MADV_WILLNEED on shmem swapouts · 55231e5c
      Johannes Weiner committed
      MADV_WILLNEED currently does not read swapped out shmem pages back in.
      
      Commit 0cd6144a ("mm + fs: prepare for non-page entries in page
      cache radix trees") made find_get_page() filter exceptional radix tree
      entries but failed to convert all find_get_page() callers that WANT
      exceptional entries over to find_get_entry().  One of them is shmem swap
      readahead in madvise, which now skips over any swap-out records.
      
      Convert it to find_get_entry().
      
      Fixes: 0cd6144a ("mm + fs: prepare for non-page entries in page cache radix trees")
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reported-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      55231e5c
    • J
      mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT · 7fcbbaf1
      Jens Axboe committed
      In some testing I ran today (some fio jobs that spread over two nodes),
      we end up spending 40% of the time in filemap_check_errors().  That
      smells fishy.  Looking further, this is basically what happens:
      
      blkdev_aio_read()
          generic_file_aio_read()
              filemap_write_and_wait_range()
                  if (!mapping->nr_pages)
                      filemap_check_errors()
      
      and filemap_check_errors() always attempts two test_and_clear_bit() on
      the mapping flags, thus dirtying it for every single invocation.  The
      patch below tests each of these bits before clearing them, avoiding this
      issue.  In my test case (4-socket box), performance went from 1.7M IOPS
      to 4.0M IOPS.
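
      The test-before-clear pattern can be sketched in userspace with C11
      atomics. This is an illustration of the technique under that
      assumption, not the kernel patch; the function name is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Clear `bit` and report whether it was set, but skip the atomic
 * read-modify-write entirely when a plain load shows the bit is clear.
 */
static bool test_and_clear_bit_if_set(atomic_ulong *flags, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	/* Read-only check: does not take the cache line exclusive. */
	if (!(atomic_load_explicit(flags, memory_order_relaxed) & mask))
		return false;

	/* The bit looked set, so pay for the atomic update now. */
	return (atomic_fetch_and(flags, ~mask) & mask) != 0;
}
```

      In the common case where neither error bit is set, every CPU only reads
      the flags word, so the cache line is not bounced between nodes.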
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7fcbbaf1
    • C
      hwpoison, hugetlb: lock_page/unlock_page does not match for handling a free hugepage · b985194c
      Chen Yucong committed
      For handling a free hugepage in memory failure, the race will happen if
      another thread hwpoisoned this hugepage concurrently.  So we need to
      check PageHWPoison instead of !PageHWPoison.
      
      If hwpoison_filter(p) returns true or a race happens, then we need to
      unlock_page(hpage).
      Signed-off-by: Chen Yucong <slaoub@gmail.com>
      Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Tested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: <stable@vger.kernel.org>	[2.6.36+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b985194c
    • L
      parisc: 'renameat2()' doesn't need (or have) a separate compat system call · 9abd09ac
      Linus Torvalds committed
      The 'renameat2()' system call was incorrectly added as a ENTRY_COMP() in
      the parisc system call table by commit 18e480aa ("parisc: add
      renameat2 syscall").  That causes a link-time error due to there not
      being any compat version of that system call:
      
        arch/parisc/kernel/built-in.o: In function `sys_call_table':
        (.rodata+0xad0): undefined reference to `compat_sys_renameat2'
        make: *** [vmlinux] Error 1
      
      Easily fixed by marking the system call as being the same for compat as
      for native by using ENTRY_SAME() instead of ENTRY_COMP().
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Acked-by: Miklos Szeredi <miklos@szeredi.hu>
      Acked-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9abd09ac
  5. 23 May, 2014 7 commits
    • D
      AFS: Pass an afs_call* to call->async_workfn() instead of a work_struct* · 656f88dd
      David Howells committed
      call->async_workfn() can take an afs_call* arg rather than a work_struct* as
      the functions assigned there are now called from afs_async_workfn() which has
      to call container_of() anyway.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
      Reviewed-by: Tejun Heo <tj@kernel.org>
      656f88dd
    • N
      AFS: Fix kafs module unloading · 150a6b47
      Nathaniel Wesley Filardo committed
      At present, it is not possible to successfully unload the kafs module if there
      are outstanding async outgoing calls (those made with afs_make_call()).  This
      appears to be due to the changes introduced by:
      
      	commit 05949945
      	Author: Tejun Heo <tj@kernel.org>
      	Date:   Fri Mar 7 10:24:50 2014 -0500
      	Subject: afs: don't use PREPARE_WORK
      
      which didn't go far enough.  The problem is due to:
      
       (1) The aforementioned commit introduced a separate handler function pointer
           in the call, call->async_workfn, in addition to the original workqueue
           item, call->async_work, for asynchronous operations because workqueues
           subsystem cannot handle the workqueue item pointer being changed whilst
           the item is queued or being processed.
      
       (2) afs_async_workfn() was introduced in that commit to be the callback for
           call->async_work.  Its sole purpose is to run whatever call->async_workfn
           points to.
      
       (3) call->async_workfn is only used from afs_async_workfn(), which is only
           set on async_work by afs_collect_incoming_call() - ie. for incoming
           calls.
      
       (4) call->async_workfn is *not* set by afs_make_call() when outgoing calls are
           made, and call->async_work is set to afs_process_async_call() - and not
           afs_async_workfn().
      
       (5) afs_process_async_call() now changes call->async_workfn rather than
           call->async_work to point to afs_delete_async_call() to clean up, but this
           is only effective for incoming calls because call->async_work does not
           point to afs_async_workfn() for outgoing calls.
      
       (6) Because, for outgoing calls, call->async_work remains pointing to
           afs_process_async_call(), this results in an infinite loop.
      
      Instead, make the workqueue uniformly vector through call->async_workfn, via
      afs_async_workfn() and simply initialise call->async_workfn to point to
      afs_process_async_call() in afs_make_call().
      Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Tejun Heo <tj@kernel.org>
      150a6b47
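The loop described in points (4) to (6) can be reproduced in a small simulation. This is a toy model under stated assumptions, not the kafs code: the "workqueue" is a plain loop, requeueing is a flag, and every `toy_` name is hypothetical. It contrasts a work item that points straight at the processing function with one that always goes through the trampoline.

```c
#include <assert.h>
#include <stddef.h>

struct toy_work {
    void (*func)(struct toy_work *work);
};

struct toy_call {
    struct toy_work async_work;
    void (*async_workfn)(struct toy_call *call);
    int queued;
    int deleted;
};

#define toy_call_of(w) \
    ((struct toy_call *)((char *)(w) - offsetof(struct toy_call, async_work)))

static void toy_delete_async_call(struct toy_call *call)
{
    call->deleted = 1;
}

/* Processing finishes by swapping the handler and requeueing the item. */
static void toy_process_async_call(struct toy_call *call)
{
    call->async_workfn = toy_delete_async_call;
    call->queued = 1;
}

/* Trampoline: runs whatever call->async_workfn currently points to. */
static void toy_async_workfn(struct toy_work *work)
{
    struct toy_call *call = toy_call_of(work);

    assert(call->async_workfn != NULL);
    call->async_workfn(call);
}

/* Old outgoing-call setup: the work item bypasses call->async_workfn. */
static void toy_process_direct(struct toy_work *work)
{
    toy_process_async_call(toy_call_of(work));
}

/* Returns the number of runs until the call is deleted, or -1 if the
 * item is still being requeued after max_runs iterations. */
int toy_drain(int uniform, int max_runs)
{
    struct toy_call call = { { NULL }, NULL, 0, 0 };
    int runs = 0;

    call.async_work.func = uniform ? toy_async_workfn : toy_process_direct;
    call.async_workfn = toy_process_async_call;
    call.queued = 1;

    while (call.queued && runs < max_runs) {
        call.queued = 0;
        call.async_work.func(&call.async_work);
        runs++;
    }
    return call.deleted ? runs : -1;
}
```

In this model, vectoring uniformly through the trampoline lets the swapped handler take effect on the second run, while the direct setup keeps re-running the processing function until the cap is reached.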
    • N
      AFS: Part of afs_end_call() is identical to code elsewhere, so split it · 6cf12869
      Nathaniel Wesley Filardo committed
      Split afs_end_call() into two pieces, one of which is identical to code in
      afs_process_async_call().  Replace the latter with a call to the first part of
      afs_end_call().
      Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
      Signed-off-by: David Howells <dhowells@redhat.com>
      6cf12869
    • J
      [SCSI] scsi_transport_sas: move bsg destructor into sas_rphy_remove · 6aa6caff
      Joe Lawrence committed
      The recent change in sysfs, bcdde7e2
      "sysfs: make __sysfs_remove_dir() recursive" revealed an asymmetric
      rphy device creation/deletion sequence in scsi_transport_sas:
      
        modprobe mpt2sas
          sas_rphy_add
            device_add A               rphy->dev
            device_add B               sas_device transport class
            device_add C               sas_end_device transport class
            device_add D               bsg class
      
        rmmod mpt2sas
          sas_rphy_delete
            sas_rphy_remove
              device_del B
              device_del C
              device_del A
                sysfs_remove_group     recursive sysfs dir removal
            sas_rphy_free
              device_del D             warning
      
        where device A is the parent of B, C, and D.
      
      When sas_rphy_free tries to unregister the bsg request queue (device D
      above), the ensuing sysfs cleanup discovers that its sysfs group has
      already been removed and emits a warning, "sysfs group... not found for
      kobject 'end_device-X:0'".
      
      Since bsg creation is a side effect of sas_rphy_add, move its
      complementary removal call into sas_rphy_remove. This imposes the
      following tear-down order for the devices above: D, B, C, A.
      
      Note the sas_device and sas_end_device transport class devices (B and C
      above) are created and destroyed both via the list match traversal in
      attribute_container_device_trigger, so the order in which they are
      handled is fixed. This is fine as long as they are deleted before their
      parent device.
      Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      6aa6caff
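The ordering problem can be illustrated with a toy model of the four devices. The sketch assumes one simplification: deleting a device also removes the sysfs entries of its direct children, and deleting a device whose entry is already gone counts as one warning. The `toy_` names and the `DEV_*` enum are hypothetical.

```c
#include <assert.h>

enum { DEV_A, DEV_B, DEV_C, DEV_D, NDEV };

/* A is the parent of B, C and D, as in the commit message. */
static const int toy_parent[NDEV] = { -1, DEV_A, DEV_A, DEV_A };

/* Delete devices in the given order and count "group not found" warnings. */
static int toy_teardown_warnings(const int *order, int n)
{
    int present[NDEV] = { 1, 1, 1, 1 };
    int warnings = 0;

    for (int i = 0; i < n; i++) {
        int dev = order[i];

        assert(dev >= 0 && dev < NDEV);
        if (!present[dev]) {
            warnings++;          /* entry was already removed via the parent */
            continue;
        }
        present[dev] = 0;
        /* Recursive removal: children of dev lose their entries too. */
        for (int child = 0; child < NDEV; child++)
            if (toy_parent[child] == dev)
                present[child] = 0;
    }
    return warnings;
}

/* Order before the patch: B, C, A in sas_rphy_remove, then D in sas_rphy_free. */
int toy_old_order_warnings(void)
{
    const int order[] = { DEV_B, DEV_C, DEV_A, DEV_D };

    return toy_teardown_warnings(order, 4);
}

/* Order after the patch: D, B, C, A. */
int toy_new_order_warnings(void)
{
    const int order[] = { DEV_D, DEV_B, DEV_C, DEV_A };

    return toy_teardown_warnings(order, 4);
}
```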
    • C
      batman: fix a bogus warning from batadv_is_on_batman_iface() · b6ed5498
      Cong Wang committed
      batman looks up dev->iflink to check whether it is a batman interface,
      but ->iflink can be 0, which is not a valid ifindex. It should simply
      skip the lookup in the iflink == 0 case.
      Reported-by: Jet Chen <jet.chen@intel.com>
      Tested-by: Jet Chen <jet.chen@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Antonio Quartulli <antonio@open-mesh.com>
      Cc: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Cong Wang <cwang@twopensource.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b6ed5498
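A sketch of the kind of guard this describes, using a toy interface table rather than the kernel's net_device lookup. The struct, the table contents and the warning counter are all assumptions made for illustration. The point is only that an iflink of 0 is treated as "no lower device" before any lookup is attempted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_netdev {
    int ifindex;
    int iflink;      /* index of the lower device; 0 means none */
    bool is_batman;
};

const struct toy_netdev toy_devs[] = {
    { 1, 1, true  }, /* a batman soft interface */
    { 2, 1, false }, /* stacked on top of ifindex 1 */
    { 3, 0, false }, /* iflink == 0: nothing to look up */
};

int toy_warnings;    /* counts failed lookups, standing in for the warning */

static const struct toy_netdev *toy_dev_get_by_index(int ifindex)
{
    for (size_t i = 0; i < sizeof(toy_devs) / sizeof(toy_devs[0]); i++)
        if (toy_devs[i].ifindex == ifindex)
            return &toy_devs[i];
    return NULL;
}

bool toy_is_on_batman_iface(const struct toy_netdev *dev)
{
    const struct toy_netdev *parent;

    assert(dev != NULL);
    if (dev->is_batman)
        return true;

    /* No more parents: stop before looking up an invalid index. */
    if (dev->iflink == 0 || dev->iflink == dev->ifindex)
        return false;

    parent = toy_dev_get_by_index(dev->iflink);
    if (parent == NULL) {
        toy_warnings++;
        return false;
    }
    return toy_is_on_batman_iface(parent);
}
```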
    • L
      ipv4: initialise the itag variable in __mkroute_input · fbdc0ad0
      Li RongQing committed
      The value of itag is an uninitialised value from the stack, and may not be
      initialised by fib_validate_source(), which calls fib_combine_itag(), if
      CONFIG_IP_ROUTE_CLASSID is not set.
      
      This makes the cached dst uncertain.
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fbdc0ad0
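The hazard can be sketched with a helper that writes its output parameter only when a build-time option is enabled. This is a toy model: `TOY_ROUTE_CLASSID` stands in for CONFIG_IP_ROUTE_CLASSID, and the `toy_` functions are hypothetical simplifications of the call chain named in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: models a build with the class ID option switched off. */
#define TOY_ROUTE_CLASSID 0

/* Writes *itag only when the option is enabled. */
static void toy_combine_itag(uint32_t *itag, uint32_t tclassid)
{
    assert(itag != NULL);
#if TOY_ROUTE_CLASSID
    *itag = tclassid;
#else
    (void)tclassid;   /* *itag is left untouched in this configuration */
#endif
}

static int toy_validate_source(uint32_t *itag)
{
    toy_combine_itag(itag, 0x1234);
    return 0;
}

/* Returns the tag that would be stored in the cached route. */
uint32_t toy_mkroute_input_tag(void)
{
    uint32_t itag = 0;   /* the fix: start from a defined value */

    if (toy_validate_source(&itag) < 0)
        return 0;
    return itag;
}
```

Without the initialiser, reading `itag` after the call would be reading an indeterminate value whenever the option is off.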
    • V
      bonding: Send ALB learning packets using the right source · d0c21d43
      Vlad Yasevich committed
      ALB learning packets are currently always sent using the slave mac
      address for all vlans configured on top of the bond.  This is not always
      correct, as vlans may change their mac address.
      This patch introduces a concept of strict matching, where the
      source of learning packets can either strictly match the address
      passed in, or it can determine a more correct address to use.
      
      There are 3 cases to consider:
        1) Switchover.  In this case, we have a new active slave and we need
           to tell the switch about all addresses available on the slave.
        2) Monitor.  We'll periodically refresh learning info for all slaves.
           In this case, we refresh all addresses for the current active slave,
           and just the slave address for other slaves.
        3) Teaching of a disabled address.  This happens as part of the
           failover and in this case, we always use just the address
           provided.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d0c21d43
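One way to read the three cases is as a mapping from situation to a strict-match flag, which then decides the source address for a vlan's learning packet. The sketch below is an interpretation of the commit message only, not the bonding driver's code. The enum, both functions and the rule "non-strict means use the vlan's own address" are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum toy_lp_case {
    TOY_SWITCHOVER,      /* case 1: announce all addresses */
    TOY_MONITOR_ACTIVE,  /* case 2: current active slave, all addresses */
    TOY_MONITOR_OTHER,   /* case 2: other slaves, slave address only */
    TOY_TEACH_DISABLED,  /* case 3: only the address provided */
};

/* Strict matching is wanted whenever only the passed-in address may be used. */
bool toy_strict_match(enum toy_lp_case c)
{
    return c == TOY_MONITOR_OTHER || c == TOY_TEACH_DISABLED;
}

/* Pick the source mac for a learning packet sent on one vlan. */
const uint8_t *toy_lp_source(const uint8_t *passed_mac,
                             const uint8_t *vlan_mac,
                             bool strict_match)
{
    assert(passed_mac != NULL && vlan_mac != NULL);
    /* Assumption: the "more correct" address is the vlan's own mac. */
    return strict_match ? passed_mac : vlan_mac;
}
```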