1. 19 Dec, 2015 1 commit
    • A
      x86/mce: Ensure offline CPUs don't participate in rendezvous process · d90167a9
      Ashok Raj committed
      Intel's MCA implementation broadcasts MCEs to all CPUs on the
      node. This poses a problem for offlined CPUs which cannot
      participate in the rendezvous process:
      
        Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
        Kernel Offset: disabled
        Rebooting in 100 seconds..
      
      More specifically, Linux does a soft offline of a CPU when
      writing a 0 to /sys/devices/system/cpu/cpuX/online, which
      doesn't prevent the #MC exception from being broadcast to that
      CPU.
      
      Ensure that offline CPUs don't participate in the MCE rendezvous
      and clear the RIP valid status bit so that a second MCE won't
      cause a shutdown.
      
      Without the patch, mce_start() will increment mce_callin and
      wait for all CPUs. Offlined CPUs should avoid participating in
      the rendezvous process altogether.
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      [ Massage commit message. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1449742346-21470-2-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      d90167a9
  2. 10 Dec, 2015 8 commits
    • L
      Merge tag 'vfio-v4.4-rc5' of git://github.com/awilliam/linux-vfio · 6764e5eb
      Linus Torvalds committed
      Pull VFIO fixes from Alex Williamson:
      
       - Various fixes for removing redundancy, const'ifying structs, avoiding
         stack usage, fixing WARN usage (Krzysztof Kozlowski, Julia Lawall,
         Kees Cook, Dan Carpenter)
      
       - Revert No-IOMMU mode as the intended user has not emerged (Alex
         Williamson)
      
      * tag 'vfio-v4.4-rc5' of git://github.com/awilliam/linux-vfio:
        Revert: "vfio: Include No-IOMMU mode"
        vfio: fix a warning message
        vfio: platform: remove needless stack usage
        vfio-pci: constify pci_error_handlers structures
        vfio: Drop owner assignment from platform_driver
      6764e5eb
    • L
      Merge tag 'devicetree-fixes-for-4.4-rc4' of... · eef121f4
      Linus Torvalds committed
      Merge tag 'devicetree-fixes-for-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
      
      Pull DT fixes from Rob Herring:
       "I think this should be all for 4.4:
      
         - Fix incorrect warning about overlapping memory regions
      
         - Export of_irq_find_parent again which was made static in 4.4, but
           has users pending for 4.5.
      
         - Fix of_msi_map_rid declaration location
      
         - Fix re-entrancy for of_fdt_unflatten_tree
      
         - Clean-up of phys_addr_t printks"
      
      * tag 'devicetree-fixes-for-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        of/irq: move of_msi_map_rid declaration to the correct ifdef section
        of/irq: Export of_irq_find_parent again
        of/fdt: Add mutex protection for calls to __unflatten_device_tree()
        of/address: fix typo in comment block of of_translate_one()
        of: do not use 0x in front of %pa
        of: Fix comparison of reserved memory regions
      eef121f4
    • L
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · abb7e2b3
      Linus Torvalds committed
      Pull clk fixes from Stephen Boyd:
       "One small build fix, a couple do_div() fixes, and a fix for the gpio
        basic clock type are the major changes here.  There's also a couple
        fixes for the TI, sunxi, and scpi clock drivers"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: sunxi: pll2: Fix clock running too fast
        clk: scpi: add missing of_node_put
        clk: qoriq: fix memory leak
        imx/clk-pllv2: fix wrong do_div() usage
        imx/clk-pllv1: fix wrong do_div() usage
        clk: mmp: add linux/clk.h includes
        clk: ti: drop locking code from mux/divider drivers
        clk: ti816x: Add missing dmtimer clkdev entries
        clk: ti: fapll: fix wrong do_div() usage
        clk: ti: clkt_dpll: fix wrong do_div() usage
        clk: gpio: Get parent clk names in of_gpio_clk_setup()
      abb7e2b3
    • L
      Merge tag 'for-linus-4.4-1' of git://git.code.sf.net/p/openipmi/linux-ipmi · 9a0f76fd
      Linus Torvalds committed
      Pull IPMI fix from Corey Minyard:
       "Fix an Oops if an interrupt occurs at startup.  This can happen on
        some hardware"
      
      * tag 'for-linus-4.4-1' of git://git.code.sf.net/p/openipmi/linux-ipmi:
        ipmi: move timer init to before irq is setup
      9a0f76fd
    • J
      ipmi: move timer init to before irq is setup · 27f972d3
      Jan Stancek committed
      We encountered a panic on boot in ipmi_si on a dell per320 due to an
      uninitialized timer as follows.
      
      static int smi_start_processing(void       *send_info,
                                      ipmi_smi_t intf)
      {
              /* Try to claim any interrupts. */
              if (new_smi->irq_setup)
                      new_smi->irq_setup(new_smi);
      
       --> IRQ arrives here and irq handler tries to modify uninitialized timer
      
          which triggers BUG_ON(!timer->function) in __mod_timer().
      
       Call Trace:
         <IRQ>
         [<ffffffffa0532617>] start_new_msg+0x47/0x80 [ipmi_si]
         [<ffffffffa053269e>] start_check_enables+0x4e/0x60 [ipmi_si]
         [<ffffffffa0532bd8>] smi_event_handler+0x1e8/0x640 [ipmi_si]
         [<ffffffff810f5584>] ? __rcu_process_callbacks+0x54/0x350
         [<ffffffffa053327c>] si_irq_handler+0x3c/0x60 [ipmi_si]
         [<ffffffff810efaf0>] handle_IRQ_event+0x60/0x170
         [<ffffffff810f245e>] handle_edge_irq+0xde/0x180
         [<ffffffff8100fc59>] handle_irq+0x49/0xa0
         [<ffffffff8154643c>] do_IRQ+0x6c/0xf0
         [<ffffffff8100ba53>] ret_from_intr+0x0/0x11
      
              /* Set up the timer that drives the interface. */
              setup_timer(&new_smi->si_timer, smi_timeout, (long)new_smi);
      
      The following patch fixes the problem.
      
      To: Openipmi-developer@lists.sourceforge.net
      To: Corey Minyard <minyard@acm.org>
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Tony Camuso <tcamuso@redhat.com>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Cc: stable@vger.kernel.org # Applies cleanly to 3.10-, needs small rework before
      27f972d3
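The ordering problem described in the ipmi commit above can be sketched in user space. This is a minimal sketch, not the driver code: every `sk_` name is a hypothetical stand-in, the interrupt is simulated by a direct call, and the `assert` plays the role of the `BUG_ON(!timer->function)` check mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel timer and the ipmi_si state. */
struct sk_timer {
    void (*function)(void *data);
    void *data;
    int armed;
};

struct sk_smi {
    struct sk_timer si_timer;
    int irq_enabled;
};

/* Placeholder timer callback; a real one would drive the interface. */
static void sk_timeout(void *data)
{
    (void)data;
}

/* Arming a timer whose callback was never set is treated as a bug. */
static void sk_mod_timer(struct sk_timer *t)
{
    assert(t->function != NULL);
    t->armed = 1;
}

/* The interrupt handler may touch the timer as soon as the IRQ is live. */
static void sk_irq_handler(struct sk_smi *smi)
{
    sk_mod_timer(&smi->si_timer);
}

static void sk_start_processing(struct sk_smi *smi)
{
    /* 1. Set up the timer first, so it is valid before any IRQ can arrive. */
    smi->si_timer.function = sk_timeout;
    smi->si_timer.data = smi;
    smi->si_timer.armed = 0;

    /* 2. Only then claim the interrupt. */
    smi->irq_enabled = 1;

    /* Simulate an interrupt firing immediately after it is enabled. */
    sk_irq_handler(smi);
}
```

Swapping steps 1 and 2 in `sk_start_processing()` would let the simulated handler reach `sk_mod_timer()` with a NULL callback, which is the failure the call trace shows.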
    • S
      bitops.h: correctly handle rol32 with 0 byte shift · d7e35dfa
      Sasha Levin committed
      ROL on a 32 bit integer with a shift of 32 or more is undefined and the
      result is arch-dependent. Avoid this by handling the trivial case of
      rotating by 0 correctly.
      
      The trivial solution of checking if shift is 0 breaks gcc's detection
      of this code as a ROL instruction, which is unacceptable.
      
      This bug was reported and fixed in GCC
      (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157):
      
      	The standard rotate idiom,
      
      	  (x << n) | (x >> (32 - n))
      
      	is recognized by gcc (for concreteness, I discuss only the case that x
      	is an uint32_t here).
      
      	However, this is portable C only for n in the range 0 < n < 32. For n
      	== 0, we get x >> 32 which gives undefined behaviour according to the
      	C standard (6.5.7, Bitwise shift operators). To portably support n ==
      	0, one has to write the rotate as something like
      
      	  (x << n) | (x >> ((-n) & 31))
      
      	And this is apparently not recognized by gcc.
      
      Note that this is broken on older GCCs and will result in slower ROL.
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d7e35dfa
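The portable rotate idiom quoted from the GCC bug report in the commit above can be written out as a small function. This is a sketch under stated assumptions: the name `rol32_sketch` is hypothetical, and the function only illustrates the idiom, it is not claimed to be the exact kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rotate a 32-bit word left by `shift` bits.
 * Masking the right-shift count with 31 keeps it in the range 0..31,
 * so shift == 0 never evaluates the undefined expression word >> 32.
 */
static inline uint32_t rol32_sketch(uint32_t word, unsigned int shift)
{
    assert(shift < 32); /* the left shift is only defined for counts below 32 */
    return (word << shift) | (word >> ((-shift) & 31));
}
```

Because `shift` is unsigned, `-shift` wraps modulo 2^N and `(-shift) & 31` equals `32 - shift` for shifts 1..31 and `0` for a shift of 0, which is what makes the zero case well defined.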
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 626d114f
      Linus Torvalds committed
      Pull vfs fixes from Al Viro:
       "A couple of fixes, both -stable fodder (9p one all way back to 2.6.32,
        dio - to all branches where "Fix negative return from dio read beyond
        eof" will end up in; it's a fixup to commit marked for -stable)"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fix the regression from "direct-io: Fix negative return from dio read beyond eof"
        9p: ->evict_inode() should kick out ->i_data, not ->i_mapping
      626d114f
    • L
      Merge tag 'pci-v4.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 978d6a90
      Linus Torvalds committed
      Pull PCI fixes from Bjorn Helgaas:
       "These are more fixes I'd like to have in v4.4.  Several for the Altera
        driver added for v4.4, and one for an MSI domain problem that affects
        several arm64 platforms:
      
        MSI:
         - Only use the generic MSI layer when domain is hierarchical (Marc
           Zyngier)
      
        Altera host bridge driver:
         - Fix loop in tlp_read_packet() (Dan Carpenter)
         - Fix Requester ID for config accesses (Ley Foon Tan)
         - Check TLP completion status (Ley Foon Tan)
         - Fix error when INTx is 4 (Ley Foon Tan)"
      
      * tag 'pci-v4.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: altera: Fix error when INTx is 4
        PCI: altera: Check TLP completion status
        PCI: altera: Fix Requester ID for config accesses
        PCI: altera: Fix loop in tlp_read_packet()
        PCI/MSI: Only use the generic MSI layer when domain is hierarchical
      978d6a90
  3. 09 Dec, 2015 12 commits
    • R
      of/irq: move of_msi_map_rid declaration to the correct ifdef section · eaddb572
      Rob Herring committed
      In checking fixes for of_irq_find_parent declaration location, I found
      that of_msi_map_rid is also wrong. of_msi_map_rid is not implemented for
      Sparc, so it should not be in the Sparc specific section of the header.
      Move it to just depend on OF_IRQ.
      
      Cc: Frank Rowand <frowand.list@gmail.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      eaddb572
    • C
      of/irq: Export of_irq_find_parent again · 4c3141e0
      Carlo Caione committed
      of_irq_find_parent was made static since it had no users outside of
      of_irq.c. Export it again since we are going to use it again.
      Signed-off-by: Carlo Caione <carlo@endlessm.com>
      [robh: move of_irq_find_parent to correct ifdef section]
      Signed-off-by: Rob Herring <robh@kernel.org>
      4c3141e0
    • L
      Merge branch 'for-linus-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · aa536855
      Linus Torvalds committed
      Pull uml fixes from Richard Weinberger:
       "This contains various bug fixes, most of them are fall out from the
        merge window"
      
      * 'for-linus-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
        um: fix returns without va_end
        um: Fix fpstate handling
        arch: um: fix error when linking vmlinux.
        um: Fix get_signal() usage
      aa536855
    • L
      Merge branch 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 5406812e
      Linus Torvalds committed
      Pull cgroup fixes from Tejun Heo:
       "More change than I'd have liked at this stage.  The pids controller
        and the changes made to cgroup core to support it introduced and
        revealed several important issues.
      
         - Assigning membership to a newly created task and migrating it can
           race leading to incorrect accounting.  Oleg fixed it by widening
           threadgroup synchronization.  It looks like we'll be able to merge
           it with a different percpu rwsem which is used in fork path making
           things simpler and cheaper.
      
         - The recent change to extend cgroup membership to zombies (so that
           pid accounting can extend till the pid is actually released) missed
           pinning the underlying data structures leading to use-after-free.
           Fixed.
      
         - v2 hierarchy was calling subsystem callbacks with the wrong target
           cgroup_subsys_state based on the incorrect assumption that they
           share the same target.  pids is the first controller affected by
           this.  Subsys callbacks updated so that they can deal with
           multi-target migrations"
      
      * 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup_pids: don't account for the root cgroup
        cgroup: fix handling of multi-destination migration from subtree_control enabling
        cgroup_freezer: simplify propagation of CGROUP_FROZEN clearing in freezer_attach()
        cgroup: pids: kill pids_fork(), simplify pids_can_fork() and pids_cancel_fork()
        cgroup: pids: fix race between cgroup_post_fork() and cgroup_migrate()
        cgroup: make css_set pin its css's to avoid use-afer-free
        cgroup: fix cftype->file_offset handling
      5406812e
    • L
      Merge branch 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 633bb738
      Linus Torvalds committed
      Pull libata fixes from Tejun Heo:
       "Nothing too interesting.  All are device specific additions and
        workarounds"
      
      * 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads
        libata-eh.c: Introduce new ata port flag for controller which lockup on read log page
        sata_sil: disable trim
        AHCI: Fix softreset failed issue of Port Multiplier
        sata/mvebu: use #ifdef around suspend/resume code
        ahci: Order SATA device IDs for codename Lewisburg
        ahci: Add Device ID for Intel Sunrise Point PCH
      633bb738
    • G
      um: fix returns without va_end · 887a9853
      Geyslan G. Bem committed
      When using va_list ensure that va_start will be followed by va_end.
      Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      887a9853
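The rule stated in the va_end commit above, that every `va_start` must be paired with a `va_end` on all return paths, can be illustrated with a single-exit helper. This is a minimal sketch: `format_checked` is a hypothetical function written for this example and is not taken from the UML sources.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: format into buf, or return -1 for an unusable buffer.
 * Every path after va_start() leaves through the single `out` label,
 * so va_end() always runs.
 */
static int format_checked(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int ret;

    assert(fmt != NULL);

    va_start(args, fmt);
    if (buf == NULL || size == 0) {
        ret = -1;
        goto out; /* do not return here: va_end must still run */
    }
    ret = vsnprintf(buf, size, fmt, args);
out:
    va_end(args);
    return ret;
}
```

An early `return -1;` inside the `if` would compile and usually appear to work, which is why this class of bug tends to be found by static checking rather than by testing.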
    • R
      um: Fix fpstate handling · 8090bfd2
      Richard Weinberger committed
      The x86 FPU cleanup changed fpstate to a plain integer.
      UML on x86 has to deal with that too.
      Signed-off-by: Richard Weinberger <richard@nod.at>
      8090bfd2
    • L
      arch: um: fix error when linking vmlinux. · fb1770aa
      Lorenzo Colitti committed
      On gcc Ubuntu 4.8.4-2ubuntu1~14.04, linking vmlinux fails with:
      
      arch/um/os-Linux/built-in.o: In function `os_timer_create':
      /android/kernel/android/arch/um/os-Linux/time.c:51: undefined reference to `timer_create'
      arch/um/os-Linux/built-in.o: In function `os_timer_set_interval':
      /android/kernel/android/arch/um/os-Linux/time.c:84: undefined reference to `timer_settime'
      arch/um/os-Linux/built-in.o: In function `os_timer_remain':
      /android/kernel/android/arch/um/os-Linux/time.c:109: undefined reference to `timer_gettime'
      arch/um/os-Linux/built-in.o: In function `os_timer_one_shot':
      /android/kernel/android/arch/um/os-Linux/time.c:132: undefined reference to `timer_settime'
      arch/um/os-Linux/built-in.o: In function `os_timer_disable':
      /android/kernel/android/arch/um/os-Linux/time.c:145: undefined reference to `timer_settime'
      
      This is because -lrt appears in the generated link commandline
      after arch/um/os-Linux/built-in.o. Fix this by removing -lrt from
      arch/um/Makefile and adding it to the UM-specific section of
      scripts/link-vmlinux.sh.
      Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      fb1770aa
    • R
      um: Fix get_signal() usage · db2f24dc
      Richard Weinberger committed
      If get_signal() returns us a signal to post
      we must not call it again, otherwise the already
      posted signal will be overridden.
      Before commit a610d6e6 this was the case as we stopped
      the while after a successful handle_signal().
      
      Cc: <stable@vger.kernel.org> # 3.10-
      Fixes: a610d6e6 ("pull clearing RESTORE_SIGMASK into block_sigmask()")
      Signed-off-by: Richard Weinberger <richard@nod.at>
      db2f24dc
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 51825c8a
      Linus Torvalds committed
      Pull perf fixes from Ingo Molnar:
       "This tree includes four core perf fixes for misc bugs, three fixes to
        x86 PMU drivers, and two updates to old email addresses"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf: Do not send exit event twice
        perf/x86/intel: Fix INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA macro
        perf/x86/intel: Make L1D_PEND_MISS.FB_FULL not constrained on Haswell
        perf: Fix PERF_EVENT_IOC_PERIOD deadlock
        treewide: Remove old email address
        perf/x86: Fix LBR call stack save/restore
        perf: Update email address in MAINTAINERS
        perf/core: Robustify the perf_cgroup_from_task() RCU checks
        perf/core: Fix RCU problem with cgroup context switching code
      51825c8a
    • A
      fix the regression from "direct-io: Fix negative return from dio read beyond eof" · 2d4594ac
      Al Viro committed
      Sure, it's better to bail out of past-the-eof read and return 0 than return
      a bogus negative value on such.  Only we'd better make sure we are bailing out
      with 0 and not -ENOMEM...
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      2d4594ac
    • A
      9p: ->evict_inode() should kick out ->i_data, not ->i_mapping · 4ad78628
      Al Viro committed
      For block devices the pagecache is associated with the inode
      on bdevfs, not with the aliasing ones on the mountable filesystems.
      The latter have their own ->i_data empty and ->i_mapping pointing
      to the (unique per major/minor) bdevfs inode.  That guarantees
      cache coherence between all block device inodes with the same
      device number.
      
      Eviction of an alias inode has no business trying to evict the
      pages belonging to bdevfs one; moreover, ->i_mapping is only
      safe to access when the thing is opened.  At the time of
      ->evict_inode() the victim is definitely *not* opened.  We are
      about to kill the address space embedded into struct inode
      (inode->i_data) and that's what we need to empty of any pages.
      
      9p instance tries to empty inode->i_mapping instead, which is
      both unsafe and bogus - if we have several device nodes with
      the same device number in different places, closing one of them
      should not try to empty the (shared) page cache.
      
      Fortunately, other instances in the tree are OK; they are
      evicting from &inode->i_data instead, as 9p one should.
      
      Cc: stable@vger.kernel.org # v2.6.32+, ones prior to 2.6.36 need only half of that
      Reported-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
      Tested-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      4ad78628
  4. 08 Dec, 2015 3 commits
    • G
      of/fdt: Add mutex protection for calls to __unflatten_device_tree() · f8062386
      Guenter Roeck committed
      __unflatten_device_tree() calls unflatten_dt_node(), which declares
      a static variable. It is therefore not reentrant.
      
      One of the callers of __unflatten_device_tree(), unflatten_device_tree(),
      is only called once during early initialization and does not need to be
      protected. The other caller, of_fdt_unflatten_tree(), can be called at
      any time, possibly multiple times in parallel. This can happen, for
      example, if multiple devicetree overlays have to be loaded and installed.
      
      Without this protection, errors such as the following may be seen.
      
      kernel: End of tree marker overwritten: e6a3a458
      kernel: find_target_node:
      	Failed to find target-indirect node at /fragment@0
      kernel: __of_overlay_create: of_build_overlay_info() failed for tree@/
      
      Add a mutex to of_fdt_unflatten_tree() to make the call reentrant.
      
      Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Cc: stable@vger.kernel.org # v4.1+
      Signed-off-by: Rob Herring <robh@kernel.org>
      f8062386
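The approach in the of/fdt commit above, serializing callers of a function that keeps state in a static variable, can be sketched in user space with a pthread mutex. This is an illustration only: `count_nodes_unlocked`, `count_nodes`, and the brace-counting "parser" are hypothetical stand-ins, and the kernel uses its own mutex API rather than pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t unflatten_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for a non-reentrant worker: `depth` is static, so two
 * concurrent callers would corrupt each other's nesting state.
 * Returns the number of '{' nodes, or -1 if the braces do not balance.
 */
static int count_nodes_unlocked(const char *blob)
{
    static int depth; /* shared across all calls */
    int nodes = 0;

    depth = 0;
    for (; *blob != '\0'; blob++) {
        if (*blob == '{') {
            depth++;
            nodes++;
        } else if (*blob == '}') {
            depth--;
        }
    }
    return depth == 0 ? nodes : -1;
}

/* Public entry point: the mutex admits only one caller at a time. */
static int count_nodes(const char *blob)
{
    int ret;

    assert(blob != NULL);
    pthread_mutex_lock(&unflatten_lock);
    ret = count_nodes_unlocked(blob);
    pthread_mutex_unlock(&unflatten_lock);
    return ret;
}
```

Strictly speaking the lock makes the wrapper safe to call concurrently rather than reentrant; the worker still must not be entered recursively while the lock is held.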
    • L
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 62ea1ec5
      Linus Torvalds committed
      Pull virtio fixes from Michael Tsirkin:
       "This includes some fixes and cleanups in virtio and vhost code.
      
        Most notably, shadowing the index fixes the excessive cacheline
        bouncing observed on AMD platforms"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_ring: shadow available ring flags & index
        virtio: Do not drop __GFP_HIGH in alloc_indirect
        vhost: replace % with & on data path
        tools/virtio: fix byteswap logic
        tools/virtio: move list macro stubs
        virtio: fix memory leak of virtio ida cache layers
        vhost: relax log address alignment
        virtio-net: Stop doing DMA from the stack
      62ea1ec5
    • L
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · f41683a2
      Linus Torvalds committed
      Pull ext4 fixes from Ted Ts'o:
       "Ext4 bug fixes for v4.4, including fixes for post-2038 time encodings,
        some endian conversion problems with ext4 encryption, potential memory
        leaks after truncate in data=journal mode, and an ocfs2 regression
        caused by a jbd2 performance improvement"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        jbd2: fix null committed data return in undo_access
        ext4: add "static" to ext4_seq_##name##_fops struct
        ext4: fix an endianness bug in ext4_encrypted_follow_link()
        ext4: fix an endianness bug in ext4_encrypted_zeroout()
        jbd2: Fix unreclaimed pages after truncate in data=journal mode
        ext4: Fix handling of extended tv_sec
      f41683a2
  5. 07 Dec, 2015 16 commits
    • V
      virtio_ring: shadow available ring flags & index · f277ec42
      Venkatesh Srinivas committed
      Improves cacheline transfer flow of available ring header.
      
      Virtqueues are implemented as a pair of rings, one producer->consumer
      avail ring and one consumer->producer used ring; preceding the
      avail ring in memory are two contiguous u16 fields -- avail->flags
      and avail->idx. A producer posts work by writing to avail->idx and
      a consumer reads avail->idx.
      
      The flags and idx fields only need to be written by a producer CPU
      and only read by a consumer CPU; when the producer and consumer are
      running on different CPUs and the virtio_ring code is structured to
      only have source writes/sink reads, we can continuously transfer the
      avail header cacheline between 'M' states between cores. This flow
      optimizes core -> core bandwidth on certain CPUs.
      
      (see: "Software Optimization Guide for AMD Family 15h Processors",
      Section 11.6; similar language appears in the 10h guide and should
      apply to CPUs w/ exclusive caches, using LLC as a transfer cache)
      
      Unfortunately the existing virtio_ring code issued reads to the
      avail->idx and read-modify-writes to avail->flags on the producer.
      
      This change shadows the flags and index fields in producer memory;
      the vring code now reads from the shadows and only ever writes to
      avail->flags and avail->idx, allowing the cacheline to transfer
      core -> core optimally.
      
      In a concurrent version of vring_bench, the time required for
      10,000,000 buffer checkout/returns was reduced by ~2% (average
      across many runs) on an AMD Piledriver (15h) CPU:
      
      (w/o shadowing):
       Performance counter stats for './vring_bench':
           5,451,082,016      L1-dcache-loads
           ...
             2.221477739 seconds time elapsed
      
      (w/ shadowing):
       Performance counter stats for './vring_bench':
           5,405,701,361      L1-dcache-loads
           ...
             2.168405376 seconds time elapsed
      
      The further away (in a NUMA sense) virtio producers and consumers are
      from each other, the more we expect to benefit. Physical implementations
      of virtio devices and implementations of virtio where the consumer polls
      vring avail indexes (vhost) should also benefit.
      Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      f277ec42
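The shadowing idea in the virtio_ring commit above can be sketched as follows: the producer keeps private copies of `flags` and `idx`, reads only those copies, and only ever stores to the shared header. This is a simplified sketch; the struct and function names are hypothetical, and the memory barriers and endianness conversions a real vring needs are deliberately omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header shared between producer and consumer (one cacheline in practice). */
struct avail_hdr {
    uint16_t flags;
    uint16_t idx;
};

/* Producer-private state, including shadows of the shared fields. */
struct producer {
    struct avail_hdr *shared;
    uint16_t shadow_flags;
    uint16_t shadow_idx;
};

/* Publish one buffer: update the shadow, then store it; no shared read. */
static void producer_publish(struct producer *p)
{
    assert(p != NULL && p->shared != NULL);
    p->shadow_idx++;
    p->shared->idx = p->shadow_idx;
}

/* Set a flag without a read-modify-write on the shared cacheline. */
static void producer_set_flag(struct producer *p, uint16_t flag)
{
    assert(p != NULL && p->shared != NULL);
    if (!(p->shadow_flags & flag)) {
        p->shadow_flags |= flag;
        p->shared->flags = p->shadow_flags;
    }
}
```

The design choice is that the producer side never loads from `shared`, so the cacheline holding the header only needs to move from the writing core to the reading core.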
    • M
      virtio: Do not drop __GFP_HIGH in alloc_indirect · 82107539
      Michal Hocko committed
      b92b1b89 ("virtio: force vring descriptors to be allocated from
      lowmem") tried to exclude highmem pages for descriptors so it cleared
      __GFP_HIGHMEM from a given gfp mask. The patch also cleared __GFP_HIGH
      which doesn't make much sense for this fix because __GFP_HIGH only
      controls access to memory reserves and it doesn't have any influence
      on the zone selection. Some of the call paths use GFP_ATOMIC and
      dropping __GFP_HIGH will reduce their chances of success because of the
      lack of access to memory reserves.
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
      82107539
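The distinction drawn in the alloc_indirect commit above, clearing the zone-selection bit while leaving the priority bit alone, comes down to which bits are masked out. The sketch below uses hypothetical bit values and `SK_`-prefixed names chosen only for illustration; they are not the kernel's actual `__GFP_*` definitions.

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SK_GFP_HIGHMEM 0x02u /* zone selection: may use highmem pages */
#define SK_GFP_HIGH    0x20u /* priority: may dip into memory reserves */

/* Restrict an allocation mask to lowmem without touching its priority. */
static unsigned int sk_lowmem_gfp(unsigned int gfp)
{
    unsigned int out = gfp & ~SK_GFP_HIGHMEM;

    assert((out & SK_GFP_HIGHMEM) == 0);                /* highmem excluded */
    assert((out & SK_GFP_HIGH) == (gfp & SK_GFP_HIGH)); /* priority preserved */
    return out;
}
```

Masking with `~(SK_GFP_HIGHMEM | SK_GFP_HIGH)` instead would also strip the reserve-access bit, which is the unintended side effect the commit describes.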
    • M
      vhost: replace % with & on data path · 5fba13b5
      Michael S. Tsirkin committed
      We know vring num is a power of 2, so use &
      to mask the high bits.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      5fba13b5
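The substitution in the vhost commit above relies on the identity `idx % num == idx & (num - 1)` whenever `num` is a power of two. A minimal sketch, with the hypothetical helper name `ring_slot`:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a free-running index onto a ring of `num` entries.
 * For a power-of-two num, masking with num - 1 keeps exactly the low
 * bits that the modulo would keep, without a division.
 */
static inline uint16_t ring_slot(uint16_t idx, uint16_t num)
{
    assert(num != 0 && (num & (num - 1)) == 0); /* num must be a power of 2 */
    return (uint16_t)(idx & (num - 1));
}
```

For example, with `num == 8` the mask is `0x7`, so index 13 maps to slot 5, the same result as `13 % 8`.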
    • M
      tools/virtio: fix byteswap logic · 55564a02
      Michael S. Tsirkin committed
      commit cf561f0d ("virtio: introduce
      virtio_is_little_endian() helper") changed byteswap logic to
      skip feature bit checks for LE platforms, but didn't
      update tools/virtio, so vring_bench started failing.
      
      Update the copy under tools/virtio/ (TODO: find a way to avoid this code
      duplication).
      
      Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      55564a02
    • M
      tools/virtio: move list macro stubs · 40c172e5
      Michael S. Tsirkin committed
      Makes them more generally available.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      40c172e5
    • S
      virtio: fix memory leak of virtio ida cache layers · c13f99b7
      Suman Anna committed
      The virtio core uses a static ida named virtio_index_ida for
      assigning index numbers to virtio devices during registration.
      The ida core may allocate some internal idr cache layers and
      an ida bitmap upon any ida allocation, and all these layers are
      truly freed only upon the ida destruction. The virtio_index_ida
      is not destroyed at present, leading to a memory leak when using
      the virtio core as a module and at least one virtio device is
      registered and unregistered.
      
      Fix this by invoking ida_destroy() in the virtio core module
      exit.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Suman Anna <s-anna@ti.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      c13f99b7
    • M
      vhost: relax log address alignment · d5424838
      Michael S. Tsirkin committed
      commit 5d9a07b0 ("vhost: relax used
      address alignment") fixed the alignment for the used virtual address,
      but not for the physical address used for logging.
      
      That's a mistake: alignment should clearly be the same for virtual and
      physical addresses.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      d5424838
    • A
      ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads · 4f2568f5
      Andreas Werner committed
      Every attempt to issue a read log page command locks up the controller.
      The command is currently sent to read out the timing data if the SATA
      device includes the devslp feature.
      This attempt to read the data locks up the controller, and the device
      is not recognized correctly (failed to set xfermode) and cannot be accessed.
      
      This was found on Freescale P1013/P1022 and T4240 CPUs
      using an ATP IG mSATA 4GB with the devslp feature.
      
      fsl-sata ff718000.sata: Sata FSL Platform/CSB Driver init
      [    1.254195] scsi0 : sata_fsl
      [    1.256004] ata1: SATA max UDMA/133 irq 74
      [    1.370666] fsl-gianfar ethernet.3: enabled errata workarounds, flags: 0x4
      [    1.470671] fsl-gianfar ethernet.4: enabled errata workarounds, flags: 0x4
      [    1.775584] ata1: Signature Update detected @ 504 msecs
      [    1.947594] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
      [    1.948366] ata1.00: ATA-8: ATP IG mSATA, 20150311, max UDMA/133
      [    1.948371] ata1.00: 7732368 sectors, multi 0: LBA
      [    1.948843] ata1.00: failed to get Identify Device Data, Emask 0x1
      [    1.948857] ata1.00: failed to set xfermode (err_mask=0x40)
      [    7.467557] ata1: Signature Update detected @ 504 msecs
      [    7.639560] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
      [    7.651320] ata1.00: failed to get Identify Device Data, Emask 0x1
      [    7.651360] ata1.00: failed to set xfermode (err_mask=0x40)
      [    7.655628] ata1: limiting SATA link speed to 1.5 Gbps
      [    7.659458] ata1.00: limiting speed to UDMA/133:PIO3
      [   13.163554] ata1: Signature Update detected @ 504 msecs
      [   13.335558] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
      [   13.347298] ata1.00: failed to get Identify Device Data, Emask 0x1
      [   13.347334] ata1.00: failed to set xfermode (err_mask=0x40)
      [   13.351601] ata1.00: disabled
      [   13.353278] ata1: exception Emask 0x50 SAct 0x0 SErr 0x800 action 0x6 frozen t4
      [   13.359281] ata1: SError: { HostInt }
      [   13.361644] ata1: hard resetting link
      Signed-off-by: Andreas Werner <andreas.werner@men.de>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      4f2568f5
    • A
      libata-eh.c: Introduce new ata port flag for controller which lockup on read log page · ea013a9b
      Andreas Werner committed
      Some controllers lock up on an ata_read_log_page.
      Add a new ata port flag, ATA_FLAG_NO_LOG_PAGE, which can be used
      to blacklist a controller.
      
      If this flag is set, any attempt to read a log page returns an error
      without actually issuing the command.
      Signed-off-by: Andreas Werner <andreas.werner@men.de>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      ea013a9b
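The behaviour described in the libata-eh commit above, returning an error before any command is issued when the port is flagged, can be sketched as an early-exit check. Everything here is hypothetical: the `sk_` names, the flag's bit position, and the choice of `-EIO` as the error code are assumptions for illustration, not the libata definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for ATA_FLAG_NO_LOG_PAGE. */
#define SK_FLAG_NO_LOG_PAGE (1u << 0)

struct sk_port {
    unsigned int flags;
    unsigned int commands_issued; /* counts commands sent to the controller */
};

/* Read a log page unless the controller is blacklisted for it. */
static int sk_read_log_page(struct sk_port *port)
{
    assert(port != NULL);

    /* Blacklisted controller: fail early, never touch the hardware. */
    if (port->flags & SK_FLAG_NO_LOG_PAGE)
        return -EIO; /* assumed error code for this sketch */

    port->commands_issued++; /* stands in for actually issuing the command */
    return 0;
}
```

Keeping the check at the top of the single function that issues the command means a driver only has to set the flag on its port; no caller needs to know about the quirk.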
    • T
      Merge branch 'master' into for-4.4-fixes · 0b98f0c0
      Tejun Heo committed
      The following commit which went into mainline through networking tree
      
        3b13758f ("cgroups: Allow dynamically changing net_classid")
      
      conflicts in net/core/netclassid_cgroup.c with the following pending
      fix in cgroup/for-4.4-fixes.
      
        1f7dd3e5 ("cgroup: fix handling of multi-destination migration from subtree_control enabling")
      
      The former separates out update_classid() from cgrp_attach() and
      updates it to walk all fds of all tasks in the target css so that it
      can be used from both migration and config change paths.  The latter
      drops @css from cgrp_attach().
      
      Resolve the conflict by making cgrp_attach() call update_classid()
      with the css from the first task.  We can revive @tset walking in
      cgrp_attach() but given that net_cls is v1 only where there always is
      only one target css during migration, this is fine.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Nina Schiff <ninasc@fb.com>
      0b98f0c0
    • M
      virtio-net: Stop doing DMA from the stack · 2ac46030
      Michael S. Tsirkin committed
      Once virtio starts using the DMA API, we won't be able to safely DMA
      from the stack.  virtio-net does a couple of config DMA requests
      from small stack buffers -- switch to using dynamically-allocated
      memory.
      
      This should have no effect on any performance-critical code paths.
      Reported-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Tested-by: Andy Lutomirski <luto@kernel.org>
      
      2ac46030
    • L
      Linux 4.4-rc4 · 527e9316
      Linus Torvalds committed
      527e9316
    • J
      staging/lustre: remove IOC_LIBCFS_PING_TEST ioctl · d035e336
      James Simmons committed
      The ioctl IOC_LIBCFS_PING_TEST has not been used in ages.  The recent
      nidstring changes moved all the nidstring operations from libcfs
      to the LNet layer, but this ioctl code was still using a nidstring
      operation, which caused a circular dependency loop between libcfs and
      LNet.
      Signed-off-by: James Simmons <jsimmons@infradead.org>
      Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d035e336
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · d8cd93ea
      Linus Torvalds committed
      Pull vfs fixes from Al Viro:
       "A couple of fixes (-stable fodder) + dead code removal after the
        overlayfs fix.
      
        I agree that it's better to separate from the fix part to make
        backporting easier, but IMO it's not worth delaying said dead code
        removal until the next window"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        Don't reset ->total_link_count on nested calls of vfs_path_lookup()
        ovl: get rid of the dead code left from broken (and disabled) optimizations
        ovl: fix permission checking for setattr
      d8cd93ea
    • A
      Don't reset ->total_link_count on nested calls of vfs_path_lookup() · 2788cc47
      Al Viro committed
      we already zero it on outermost set_nameidata(), so initialization in
      path_init() is pointless and wrong.  The same DoS exists on pre-4.2
      kernels, but there a slightly different fix will be needed.
      
      Cc: stable@vger.kernel.org # v4.2
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      2788cc47
    • A