Commits · d90167a941f62860f35eb960e1012aa2d30e7e94 · openeuler / Kernel

19 Dec, 2015 1 commit

x86/mce: Ensure offline CPUs don't participate in rendezvous process · d90167a9

Committed by Ashok Raj on Dec 10, 2015

Intel's MCA implementation broadcasts MCEs to all CPUs on the
node. This poses a problem for offlined CPUs which cannot
participate in the rendezvous process:

  Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
  Kernel Offset: disabled
  Rebooting in 100 seconds..

More specifically, Linux does a soft offline of a CPU when
writing a 0 to /sys/devices/system/cpu/cpuX/online, which
doesn't prevent the #MC exception from being broadcasted to that
CPU.

Ensure that offline CPUs don't participate in the MCE rendezvous
and clear the RIP valid status bit so that a second MCE won't
cause a shutdown.

Without the patch, mce_start() will increment mce_callin and
wait for all CPUs. Offlined CPUs should avoid participating in
the rendezvous process altogether.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1449742346-21470-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

d90167a9

10 Dec, 2015 8 commits

Merge tag 'vfio-v4.4-rc5' of git://github.com/awilliam/linux-vfio · 6764e5eb

Committed by Linus Torvalds on Dec 09, 2015

Pull VFIO fixes from Alex Williamson:

 - Various fixes for removing redundancy, const'ifying structs, avoiding
   stack usage, fixing WARN usage (Krzysztof Kozlowski, Julia Lawall,
   Kees Cook, Dan Carpenter)

 - Revert No-IOMMU mode as the intended user has not emerged (Alex
   Williamson)

* tag 'vfio-v4.4-rc5' of git://github.com/awilliam/linux-vfio:
  Revert: "vfio: Include No-IOMMU mode"
  vfio: fix a warning message
  vfio: platform: remove needless stack usage
  vfio-pci: constify pci_error_handlers structures
  vfio: Drop owner assignment from platform_driver

6764e5eb

Merge tag 'devicetree-fixes-for-4.4-rc4' of... · eef121f4

Committed by Linus Torvalds on Dec 09, 2015

Merge tag 'devicetree-fixes-for-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DT fixes from Rob Herring:
 "I think this should be all for 4.4:

   - Fix incorrect warning about overlapping memory regions

   - Export of_irq_find_parent again which was made static in 4.4, but
     has users pending for 4.5.

   - Fix of_msi_map_rid declaration location

   - Fix re-entrancy for of_fdt_unflatten_tree

   - Clean-up of phys_addr_t printks"

* tag 'devicetree-fixes-for-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of/irq: move of_msi_map_rid declaration to the correct ifdef section
  of/irq: Export of_irq_find_parent again
  of/fdt: Add mutex protection for calls to __unflatten_device_tree()
  of/address: fix typo in comment block of of_translate_one()
  of: do not use 0x in front of %pa
  of: Fix comparison of reserved memory regions

eef121f4

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · abb7e2b3

Committed by Linus Torvalds on Dec 09, 2015

Pull clk fixes from Stephen Boyd:
 "One small build fix, a couple do_div() fixes, and a fix for the gpio
  basic clock type are the major changes here.  There's also a couple
  fixes for the TI, sunxi, and scpi clock drivers"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: sunxi: pll2: Fix clock running too fast
  clk: scpi: add missing of_node_put
  clk: qoriq: fix memory leak
  imx/clk-pllv2: fix wrong do_div() usage
  imx/clk-pllv1: fix wrong do_div() usage
  clk: mmp: add linux/clk.h includes
  clk: ti: drop locking code from mux/divider drivers
  clk: ti816x: Add missing dmtimer clkdev entries
  clk: ti: fapll: fix wrong do_div() usage
  clk: ti: clkt_dpll: fix wrong do_div() usage
  clk: gpio: Get parent clk names in of_gpio_clk_setup()

abb7e2b3

Merge tag 'for-linus-4.4-1' of git://git.code.sf.net/p/openipmi/linux-ipmi · 9a0f76fd

Committed by Linus Torvalds on Dec 09, 2015

Pull IPMI fix from Corey Minyard:
 "Fix an Oops if an interrupt occurs at startup.  This can happen on
  some hardware"

* tag 'for-linus-4.4-1' of git://git.code.sf.net/p/openipmi/linux-ipmi:
  ipmi: move timer init to before irq is setup

9a0f76fd

ipmi: move timer init to before irq is setup · 27f972d3

Committed by Jan Stancek on Dec 08, 2015

We encountered a panic on boot in ipmi_si on a dell per320 due to an
uninitialized timer as follows.

static int smi_start_processing(void       *send_info,
                                ipmi_smi_t intf)
{
        /* Try to claim any interrupts. */
        if (new_smi->irq_setup)
                new_smi->irq_setup(new_smi);

 --> IRQ arrives here and irq handler tries to modify uninitialized timer

    which triggers BUG_ON(!timer->function) in __mod_timer().

 Call Trace:
   <IRQ>
   [<ffffffffa0532617>] start_new_msg+0x47/0x80 [ipmi_si]
   [<ffffffffa053269e>] start_check_enables+0x4e/0x60 [ipmi_si]
   [<ffffffffa0532bd8>] smi_event_handler+0x1e8/0x640 [ipmi_si]
   [<ffffffff810f5584>] ? __rcu_process_callbacks+0x54/0x350
   [<ffffffffa053327c>] si_irq_handler+0x3c/0x60 [ipmi_si]
   [<ffffffff810efaf0>] handle_IRQ_event+0x60/0x170
   [<ffffffff810f245e>] handle_edge_irq+0xde/0x180
   [<ffffffff8100fc59>] handle_irq+0x49/0xa0
   [<ffffffff8154643c>] do_IRQ+0x6c/0xf0
   [<ffffffff8100ba53>] ret_from_intr+0x0/0x11

        /* Set up the timer that drives the interface. */
        setup_timer(&new_smi->si_timer, smi_timeout, (long)new_smi);

The following patch fixes the problem.

To: Openipmi-developer@lists.sourceforge.net
To: Corey Minyard <minyard@acm.org>
CC: linux-kernel@vger.kernel.org
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Tony Camuso <tcamuso@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org # Applies cleanly to 3.10-, needs small rework before

27f972d3

bitops.h: correctly handle rol32 with 0 byte shift · d7e35dfa

Committed by Sasha Levin on Dec 03, 2015

ROL on a 32 bit integer with a shift of 32 or more is undefined and the
result is arch-dependent. Avoid this by handling the trivial case of
rotating by 0 correctly.

The trivial solution of checking if shift is 0 breaks gcc's detection
of this code as a ROL instruction, which is unacceptable.

This bug was reported and fixed in GCC
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157):

	The standard rotate idiom,

	  (x << n) | (x >> (32 - n))

	is recognized by gcc (for concreteness, I discuss only the case that x
	is an uint32_t here).

	However, this is portable C only for n in the range 0 < n < 32. For n
	== 0, we get x >> 32 which gives undefined behaviour according to the
	C standard (6.5.7, Bitwise shift operators). To portably support n ==
	0, one has to write the rotate as something like

	  (x << n) | (x >> ((-n) & 31))

	And this is apparently not recognized by gcc.

Note that this is broken on older GCCs and will result in slower ROL.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
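
The two rotate idioms quoted from the GCC report can be compared in a small user-space sketch. The function names `rol32_naive` and `rol32_portable` are illustrative assumptions, not the kernel's identifiers, and both functions assume a shift in the range 0 to 31.

```c
#include <stdint.h>

/*
 * Classic idiom. Only defined for 0 < shift < 32: with shift == 0 the
 * right-hand term becomes word >> 32, which is undefined for a 32-bit type.
 */
static inline uint32_t rol32_naive(uint32_t word, unsigned int shift)
{
	return (word << shift) | (word >> (32 - shift));
}

/*
 * Portable idiom from the quoted report. (-shift) & 31 maps shift == 0 to a
 * right shift of 0, so the result is simply word. Negating an unsigned value
 * is well defined in C (it wraps modulo 2^N).
 */
static inline uint32_t rol32_portable(uint32_t word, unsigned int shift)
{
	return (word << shift) | (word >> ((-shift) & 31));
}
```

For shifts between 1 and 31 both forms compute the same value; only the second is safe to call with a shift of 0.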

d7e35dfa

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 626d114f

Committed by Linus Torvalds on Dec 09, 2015

Pull vfs fixes from Al Viro:
 "A couple of fixes, both -stable fodder (9p one all way back to 2.6.32,
  dio - to all branches where "Fix negative return from dio read beyond
  eof" will end up in; it's a fixup to a commit marked for -stable)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix the regression from "direct-io: Fix negative return from dio read beyond eof"
  9p: ->evict_inode() should kick out ->i_data, not ->i_mapping

626d114f

Merge tag 'pci-v4.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 978d6a90

Committed by Linus Torvalds on Dec 09, 2015

Pull PCI fixes from Bjorn Helgaas:
 "These are more fixes I'd like to have in v4.4.  Several for the Altera
  driver added for v4.4, and one for an MSI domain problem that affects
  several arm64 platforms:

  MSI:
   - Only use the generic MSI layer when domain is hierarchical (Marc
     Zyngier)

  Altera host bridge driver:
   - Fix loop in tlp_read_packet() (Dan Carpenter)
   - Fix Requester ID for config accesses (Ley Foon Tan)
   - Check TLP completion status (Ley Foon Tan)
   - Fix error when INTx is 4 (Ley Foon Tan)"

* tag 'pci-v4.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: altera: Fix error when INTx is 4
  PCI: altera: Check TLP completion status
  PCI: altera: Fix Requester ID for config accesses
  PCI: altera: Fix loop in tlp_read_packet()
  PCI/MSI: Only use the generic MSI layer when domain is hierarchical

978d6a90

09 Dec, 2015 12 commits

of/irq: move of_msi_map_rid declaration to the correct ifdef section · eaddb572

Committed by Rob Herring on Dec 09, 2015

In checking fixes for of_irq_find_parent declaration location, I found
that of_msi_map_rid is also wrong. of_msi_map_rid is not implemented for
Sparc, so it should not be in the Sparc specific section of the header.
Move it to just depend on OF_IRQ.

Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>

eaddb572

of/irq: Export of_irq_find_parent again · 4c3141e0

Committed by Carlo Caione on Dec 01, 2015

of_irq_find_parent was made static since it had no users outside of
of_irq.c. Export it again since we are going to use it again.

Signed-off-by: Carlo Caione <carlo@endlessm.com>
[robh: move of_irq_find_parent to correct ifdef section]
Signed-off-by: Rob Herring <robh@kernel.org>

4c3141e0

Merge branch 'for-linus-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · aa536855

Committed by Linus Torvalds on Dec 08, 2015

Pull uml fixes from Richard Weinberger:
 "This contains various bug fixes, most of them are fall out from the
  merge window"

* 'for-linus-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: fix returns without va_end
  um: Fix fpstate handling
  arch: um: fix error when linking vmlinux.
  um: Fix get_signal() usage

aa536855

Merge branch 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 5406812e

Committed by Linus Torvalds on Dec 08, 2015

Pull cgroup fixes from Tejun Heo:
 "More change than I'd have liked at this stage.  The pids controller
  and the changes made to cgroup core to support it introduced and
  revealed several important issues.

   - Assigning membership to a newly created task and migrating it can
     race leading to incorrect accounting.  Oleg fixed it by widening
     threadgroup synchronization.  It looks like we'll be able to merge
     it with a different percpu rwsem which is used in fork path making
     things simpler and cheaper.

   - The recent change to extend cgroup membership to zombies (so that
     pid accounting can extend till the pid is actually released) missed
     pinning the underlying data structures leading to use-after-free.
     Fixed.

   - v2 hierarchy was calling subsystem callbacks with the wrong target
     cgroup_subsys_state based on the incorrect assumption that they
     share the same target.  pids is the first controller affected by
     this.  Subsys callbacks updated so that they can deal with
     multi-target migrations"

* 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup_pids: don't account for the root cgroup
  cgroup: fix handling of multi-destination migration from subtree_control enabling
  cgroup_freezer: simplify propagation of CGROUP_FROZEN clearing in freezer_attach()
  cgroup: pids: kill pids_fork(), simplify pids_can_fork() and pids_cancel_fork()
  cgroup: pids: fix race between cgroup_post_fork() and cgroup_migrate()
  cgroup: make css_set pin its css's to avoid use-afer-free
  cgroup: fix cftype->file_offset handling

5406812e

Merge branch 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 633bb738

Committed by Linus Torvalds on Dec 08, 2015

Pull libata fixes from Tejun Heo:
 "Nothing too interesting.  All are device specific additions and
  workarounds"

* 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads
  libata-eh.c: Introduce new ata port flag for controller which lockup on read log page
  sata_sil: disable trim
  AHCI: Fix softreset failed issue of Port Multiplier
  sata/mvebu: use #ifdef around suspend/resume code
  ahci: Order SATA device IDs for codename Lewisburg
  ahci: Add Device ID for Intel Sunrise Point PCH

633bb738

um: fix returns without va_end · 887a9853

Committed by Geyslan G. Bem on Dec 01, 2015

When using va_list ensure that va_start will be followed by va_end.

Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
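
The rule stated above can be illustrated with a small user-space sketch. The function `format_checked` is a hypothetical example, not code from the UML tree; it routes the early error return through a single exit label so that every path that ran `va_start()` also runs `va_end()`.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical formatter: returns -1 on bad arguments, otherwise the
 * vsnprintf() result. The early exit jumps to the common label instead of
 * returning directly, so va_end() is never skipped.
 */
static int format_checked(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	if (buf == NULL || len == 0) {
		ret = -1;
		goto out;	/* still reaches va_end() below */
	}
	ret = vsnprintf(buf, len, fmt, ap);
out:
	va_end(ap);
	return ret;
}
```

A direct `return -1;` inside the `if` block would compile, but it would leave the `va_list` without its matching `va_end()`, which is the pattern the commit removes.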

887a9853

um: Fix fpstate handling · 8090bfd2

Committed by Richard Weinberger on Nov 29, 2015

The x86 FPU cleanup changed fpstate to a plain integer.
UML on x86 has to deal with that too.

Signed-off-by: Richard Weinberger <richard@nod.at>

8090bfd2

arch: um: fix error when linking vmlinux. · fb1770aa

Committed by Lorenzo Colitti on Nov 18, 2015

On gcc Ubuntu 4.8.4-2ubuntu1~14.04, linking vmlinux fails with:

arch/um/os-Linux/built-in.o: In function `os_timer_create':
/android/kernel/android/arch/um/os-Linux/time.c:51: undefined reference to `timer_create'
arch/um/os-Linux/built-in.o: In function `os_timer_set_interval':
/android/kernel/android/arch/um/os-Linux/time.c:84: undefined reference to `timer_settime'
arch/um/os-Linux/built-in.o: In function `os_timer_remain':
/android/kernel/android/arch/um/os-Linux/time.c:109: undefined reference to `timer_gettime'
arch/um/os-Linux/built-in.o: In function `os_timer_one_shot':
/android/kernel/android/arch/um/os-Linux/time.c:132: undefined reference to `timer_settime'
arch/um/os-Linux/built-in.o: In function `os_timer_disable':
/android/kernel/android/arch/um/os-Linux/time.c:145: undefined reference to `timer_settime'

This is because -lrt appears in the generated link commandline
after arch/um/os-Linux/built-in.o. Fix this by removing -lrt from
arch/um/Makefile and adding it to the UM-specific section of
scripts/link-vmlinux.sh.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

fb1770aa

um: Fix get_signal() usage · db2f24dc

Committed by Richard Weinberger on Nov 18, 2015

If get_signal() returns us a signal to post
we must not call it again, otherwise the already
posted signal will be overridden.
Before commit a610d6e6 this was the case as we stopped
the while after a successful handle_signal().

Cc: <stable@vger.kernel.org> # 3.10-
Fixes: a610d6e6 ("pull clearing RESTORE_SIGMASK into block_sigmask()")
Signed-off-by: Richard Weinberger <richard@nod.at>

db2f24dc

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 51825c8a

Committed by Linus Torvalds on Dec 08, 2015

Pull perf fixes from Ingo Molnar:
 "This tree includes four core perf fixes for misc bugs, three fixes to
  x86 PMU drivers, and two updates to old email addresses"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Do not send exit event twice
  perf/x86/intel: Fix INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA macro
  perf/x86/intel: Make L1D_PEND_MISS.FB_FULL not constrained on Haswell
  perf: Fix PERF_EVENT_IOC_PERIOD deadlock
  treewide: Remove old email address
  perf/x86: Fix LBR call stack save/restore
  perf: Update email address in MAINTAINERS
  perf/core: Robustify the perf_cgroup_from_task() RCU checks
  perf/core: Fix RCU problem with cgroup context switching code

51825c8a

fix the regression from "direct-io: Fix negative return from dio read beyond eof" · 2d4594ac

Committed by Al Viro on Dec 08, 2015

Sure, it's better to bail out of past-the-eof read and return 0 than return
a bogus negative value on such.  Only we'd better make sure we are bailing out
with 0 and not -ENOMEM...

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2d4594ac

9p: ->evict_inode() should kick out ->i_data, not ->i_mapping · 4ad78628

Committed by Al Viro on Dec 08, 2015

For block devices the pagecache is associated with the inode
on bdevfs, not with the aliasing ones on the mountable filesystems.
The latter have their own ->i_data empty and ->i_mapping pointing
to the (unique per major/minor) bdevfs inode.  That guarantees
cache coherence between all block device inodes with the same
device number.

Eviction of an alias inode has no business trying to evict the
pages belonging to bdevfs one; moreover, ->i_mapping is only
safe to access when the thing is opened.  At the time of
->evict_inode() the victim is definitely *not* opened.  We are
about to kill the address space embedded into struct inode
(inode->i_data) and that's what we need to empty of any pages.

9p instance tries to empty inode->i_mapping instead, which is
both unsafe and bogus - if we have several device nodes with
the same device number in different places, closing one of them
should not try to empty the (shared) page cache.

Fortunately, other instances in the tree are OK; they are
evicting from &inode->i_data instead, as 9p one should.

Cc: stable@vger.kernel.org # v2.6.32+, ones prior to 2.6.36 need only half of that
Reported-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
Tested-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4ad78628

08 Dec, 2015 3 commits

of/fdt: Add mutex protection for calls to __unflatten_device_tree() · f8062386

Committed by Guenter Roeck on Dec 05, 2015

__unflatten_device_tree() calls unflatten_dt_node(), which declares
a static variable. It is therefore not reentrant.

One of the callers of __unflatten_device_tree(), unflatten_device_tree(),
is only called once during early initialization and does not need to be
protected. The other caller, of_fdt_unflatten_tree(), can be called at
any time, possibly multiple times in parallel. This can happen, for
example, if multiple devicetree overlays have to be loaded and installed.

Without this protection, errors such as the following may be seen.

kernel: End of tree marker overwritten: e6a3a458
kernel: find_target_node:
	Failed to find target-indirect node at /fragment@0
kernel: __of_overlay_create: of_build_overlay_info() failed for tree@/

Add a mutex to of_fdt_unflatten_tree() to make the call reentrant.

Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Rob Herring <robh@kernel.org>
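
The locking pattern described in this commit can be sketched in user space with a POSIX mutex. This is an illustration under stated assumptions, not the kernel change itself: `unflatten_worker`, `unflatten_locked`, and `unflatten_lock` are hypothetical names, and the kernel uses its own mutex API rather than pthreads.

```c
#include <pthread.h>

/* One lock serializes every caller that may run in parallel. */
static pthread_mutex_t unflatten_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for a non-reentrant worker: it keeps intermediate state in a
 * static variable, so two concurrent calls would corrupt each other.
 */
static int unflatten_worker(int nodes)
{
	static int depth;	/* shared across calls, hence not reentrant */
	int i;

	depth = 0;
	for (i = 0; i < nodes; i++)
		depth++;
	return depth;
}

/* Entry point for callers that can race: take the lock around the worker. */
static int unflatten_locked(int nodes)
{
	int ret;

	pthread_mutex_lock(&unflatten_lock);
	ret = unflatten_worker(nodes);
	pthread_mutex_unlock(&unflatten_lock);
	return ret;
}
```

A caller that is known to run only once during early initialization could call the worker directly, which mirrors the distinction the commit message draws between the two callers.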

f8062386

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 62ea1ec5

Committed by Linus Torvalds on Dec 07, 2015

Pull virtio fixes from Michael Tsirkin:
 "This includes some fixes and cleanups in virtio and vhost code.

  Most notably, shadowing the index fixes the excessive cacheline
  bouncing observed on AMD platforms"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_ring: shadow available ring flags & index
  virtio: Do not drop __GFP_HIGH in alloc_indirect
  vhost: replace % with & on data path
  tools/virtio: fix byteswap logic
  tools/virtio: move list macro stubs
  virtio: fix memory leak of virtio ida cache layers
  vhost: relax log address alignment
  virtio-net: Stop doing DMA from the stack

62ea1ec5

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · f41683a2

Committed by Linus Torvalds on Dec 07, 2015

Pull ext4 fixes from Ted Ts'o:
 "Ext4 bug fixes for v4.4, including fixes for post-2038 time encodings,
  some endian conversion problems with ext4 encryption, potential memory
  leaks after truncate in data=journal mode, and an ocfs2 regression
  caused by a jbd2 performance improvement"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  jbd2: fix null committed data return in undo_access
  ext4: add "static" to ext4_seq_##name##_fops struct
  ext4: fix an endianness bug in ext4_encrypted_follow_link()
  ext4: fix an endianness bug in ext4_encrypted_zeroout()
  jbd2: Fix unreclaimed pages after truncate in data=journal mode
  ext4: Fix handling of extended tv_sec

f41683a2

07 Dec, 2015 16 commits

virtio_ring: shadow available ring flags & index · f277ec42

Committed by Venkatesh Srinivas on Nov 10, 2015

Improves cacheline transfer flow of available ring header.

Virtqueues are implemented as a pair of rings, one producer->consumer
avail ring and one consumer->producer used ring; preceding the
avail ring in memory are two contiguous u16 fields -- avail->flags
and avail->idx. A producer posts work by writing to avail->idx and
a consumer reads avail->idx.

The flags and idx fields only need to be written by a producer CPU
and only read by a consumer CPU; when the producer and consumer are
running on different CPUs and the virtio_ring code is structured to
only have source writes/sink reads, we can continuously transfer the
avail header cacheline between 'M' states between cores. This flow
optimizes core -> core bandwidth on certain CPUs.

(see: "Software Optimization Guide for AMD Family 15h Processors",
Section 11.6; similar language appears in the 10h guide and should
apply to CPUs w/ exclusive caches, using LLC as a transfer cache)

Unfortunately the existing virtio_ring code issued reads to the
avail->idx and read-modify-writes to avail->flags on the producer.

This change shadows the flags and index fields in producer memory;
the vring code now reads from the shadows and only ever writes to
avail->flags and avail->idx, allowing the cacheline to transfer
core -> core optimally.

In a concurrent version of vring_bench, the time required for
10,000,000 buffer checkout/returns was reduced by ~2% (average
across many runs) on an AMD Piledriver (15h) CPU:

(w/o shadowing):
 Performance counter stats for './vring_bench':
     5,451,082,016      L1-dcache-loads
     ...
       2.221477739 seconds time elapsed

(w/ shadowing):
 Performance counter stats for './vring_bench':
     5,405,701,361      L1-dcache-loads
     ...
       2.168405376 seconds time elapsed

The further away (in a NUMA sense) virtio producers and consumers are
from each other, the more we expect to benefit. Physical implementations
of virtio devices and implementations of virtio where the consumer polls
vring avail indexes (vhost) should also benefit.

Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
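
The shadowing idea can be sketched as follows. The struct and function names (`avail_hdr`, `producer`, `producer_publish`, `producer_set_flag`) are hypothetical, and the sketch deliberately omits the memory barriers and endianness conversions that real virtio_ring code needs; it only shows that the producer reads its private copies and performs plain stores to the shared header.

```c
#include <stdint.h>

/* Header shared with the consumer (the two contiguous u16 fields). */
struct avail_hdr {
	uint16_t flags;
	uint16_t idx;
};

/* Producer-private state, including shadows of the shared fields. */
struct producer {
	struct avail_hdr *avail;	/* shared header */
	uint16_t flags_shadow;		/* last value written to avail->flags */
	uint16_t idx_shadow;		/* last value written to avail->idx */
};

/* Post one buffer: advance the shadow, then store it. No shared read. */
static void producer_publish(struct producer *p)
{
	p->idx_shadow++;
	p->avail->idx = p->idx_shadow;
}

/* Set a flag bit: test the shadow, store to the shared field only on change. */
static void producer_set_flag(struct producer *p, uint16_t flag)
{
	if (!(p->flags_shadow & flag)) {
		p->flags_shadow |= flag;
		p->avail->flags = p->flags_shadow;
	}
}
```

Without the shadows, the same operations would be a load of `avail->idx` and a read-modify-write of `avail->flags`, which are the producer-side reads the commit message identifies as the problem.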

f277ec42

virtio: Do not drop __GFP_HIGH in alloc_indirect · 82107539

Committed by Michal Hocko on Dec 01, 2015

b92b1b89 ("virtio: force vring descriptors to be allocated from
lowmem") tried to exclude highmem pages for descriptors so it cleared
__GFP_HIGHMEM from a given gfp mask. The patch also cleared __GFP_HIGH
which doesn't make much sense for this fix because __GFP_HIGH only
controls access to memory reserves and it doesn't have any influence
on the zone selection. Some of the call paths use GFP_ATOMIC and
dropping __GFP_HIGH will reduce their chances of success because of the
lack of access to memory reserves.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
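
The difference between the two masks can be shown with a tiny sketch. The bit values below are illustrative assumptions only, chosen so the example is self-contained; they are not the kernel's actual `__GFP_*` definitions, and the function names are hypothetical.

```c
/* Illustrative bit values only; the real kernel definitions differ. */
#define DEMO_GFP_HIGHMEM 0x02u	/* zone selector: allow highmem pages */
#define DEMO_GFP_HIGH    0x20u	/* priority: allow use of memory reserves */

/* Old behaviour: clears the zone bit and, needlessly, the priority bit. */
static unsigned int lowmem_mask_old(unsigned int gfp)
{
	return gfp & ~(DEMO_GFP_HIGHMEM | DEMO_GFP_HIGH);
}

/* Fixed behaviour: clears only the zone bit, keeping access to reserves. */
static unsigned int lowmem_mask_new(unsigned int gfp)
{
	return gfp & ~DEMO_GFP_HIGHMEM;
}
```

With a mask that has both bits set, the old helper returns a mask with neither, while the new one keeps the priority bit, which is the behaviour the commit message argues for.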

82107539

vhost: replace % with & on data path · 5fba13b5

Committed by Michael S. Tsirkin on Nov 29, 2015

We know vring num is a power of 2, so use &
to mask the high bits.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
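
The equivalence this commit relies on can be checked with a short sketch. The helper names `is_power_of_2_u32` and `ring_slot` are hypothetical and are not taken from the vhost code.

```c
#include <stdint.h>

/* Returns nonzero if num is a power of two (exactly one bit set). */
static inline int is_power_of_2_u32(uint32_t num)
{
	return num != 0 && (num & (num - 1)) == 0;
}

/*
 * Ring slot for a free-running index. For a power-of-two num, num - 1 is a
 * mask of the low bits, so idx & (num - 1) equals idx % num without a
 * division on the data path.
 */
static inline uint32_t ring_slot(uint32_t idx, uint32_t num)
{
	return idx & (num - 1);
}
```

The mask form gives wrong results for ring sizes that are not powers of two, which is why the commit message states that precondition first.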

5fba13b5

tools/virtio: fix byteswap logic · 55564a02

Committed by Michael S. Tsirkin on Nov 29, 2015

commit cf561f0d ("virtio: introduce
virtio_is_little_endian() helper") changed byteswap logic to
skip feature bit checks for LE platforms, but didn't
update tools/virtio, so vring_bench started failing.

Update the copy under tools/virtio/ (TODO: find a way to avoid this code
duplication).

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

55564a02

tools/virtio: move list macro stubs · 40c172e5

Committed by Michael S. Tsirkin on Nov 29, 2015

Makes them more generally available.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

40c172e5

virtio: fix memory leak of virtio ida cache layers · c13f99b7

Committed by Suman Anna on Sep 16, 2015

The virtio core uses a static ida named virtio_index_ida for
assigning index numbers to virtio devices during registration.
The ida core may allocate some internal idr cache layers and
an ida bitmap upon any ida allocation, and all these layers are
truely freed only upon the ida destruction. The virtio_index_ida
is not destroyed at present, leading to a memory leak when using
the virtio core as a module and atleast one virtio device is
registered and unregistered.

Fix this by invoking ida_destroy() in the virtio core module
exit.

Cc: stable@vger.kernel.org
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

c13f99b7

vhost: relax log address alignment · d5424838

Committed by Michael S. Tsirkin on Nov 16, 2015

commit 5d9a07b0 ("vhost: relax used
address alignment") fixed the alignment for the used virtual address,
but not for the physical address used for logging.

That's a mistake: alignment should clearly be the same for virtual and
physical addresses.

Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

d5424838

ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads · 4f2568f5

Committed by Andreas Werner on Dec 04, 2015

Every attempt to issue a read log page command locks up the controller.
The command is currently sent if the SATA device includes the devslp feature
to read out the timing data.
This attempt to read the data locks up the controller, and the device
is not recognized correctly (failed to set xfermode) and cannot be accessed.

This was found on Freescale P1013/P1022 and T4240 CPUs
using an ATP IG mSATA 4GB with the devslp feature.

fsl-sata ff718000.sata: Sata FSL Platform/CSB Driver init
[    1.254195] scsi0 : sata_fsl
[    1.256004] ata1: SATA max UDMA/133 irq 74
[    1.370666] fsl-gianfar ethernet.3: enabled errata workarounds, flags: 0x4
[    1.470671] fsl-gianfar ethernet.4: enabled errata workarounds, flags: 0x4
[    1.775584] ata1: Signature Update detected @ 504 msecs
[    1.947594] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    1.948366] ata1.00: ATA-8: ATP IG mSATA, 20150311, max UDMA/133
[    1.948371] ata1.00: 7732368 sectors, multi 0: LBA
[    1.948843] ata1.00: failed to get Identify Device Data, Emask 0x1
[    1.948857] ata1.00: failed to set xfermode (err_mask=0x40)
[    7.467557] ata1: Signature Update detected @ 504 msecs
[    7.639560] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    7.651320] ata1.00: failed to get Identify Device Data, Emask 0x1
[    7.651360] ata1.00: failed to set xfermode (err_mask=0x40)
[    7.655628] ata1: limiting SATA link speed to 1.5 Gbps
[    7.659458] ata1.00: limiting speed to UDMA/133:PIO3
[   13.163554] ata1: Signature Update detected @ 504 msecs
[   13.335558] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   13.347298] ata1.00: failed to get Identify Device Data, Emask 0x1
[   13.347334] ata1.00: failed to set xfermode (err_mask=0x40)
[   13.351601] ata1.00: disabled
[   13.353278] ata1: exception Emask 0x50 SAct 0x0 SErr 0x800 action 0x6 frozen t4
[   13.359281] ata1: SError: { HostInt }
[   13.361644] ata1: hard resetting link

Signed-off-by: Andreas Werner <andreas.werner@men.de>
Signed-off-by: Tejun Heo <tj@kernel.org>

4f2568f5

libata-eh.c: Introduce new ata port flag for controller which lockup on read log page · ea013a9b

Committed by Andreas Werner on Dec 04, 2015

Some controllers lock up on an ata_read_log_page() call.
Add a new ata port flag, ATA_FLAG_NO_LOG_PAGE, which can be used
to blacklist a controller.

If this flag is set, any attempt to read a log page returns an error
without actually issuing the command.

Signed-off-by: Andreas Werner <andreas.werner@men.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
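
The behaviour described above, returning an error before any command is issued, can be sketched like this. Everything prefixed `demo_`/`DEMO_` is a hypothetical stand-in: the flag value, the error code, and the port structure are assumptions for illustration and do not reproduce libata's definitions.

```c
/* Hypothetical stand-ins; libata's real flag and error values differ. */
#define DEMO_FLAG_NO_LOG_PAGE (1u << 0)
#define DEMO_ERR_DEV          1u

struct demo_port {
	unsigned int flags;		/* per-port capability/quirk flags */
	unsigned int commands_issued;	/* counts commands sent to hardware */
};

/*
 * Read a log page. If the port is blacklisted, fail immediately so the
 * controller never sees the command that would lock it up.
 */
static unsigned int demo_read_log_page(struct demo_port *ap)
{
	if (ap->flags & DEMO_FLAG_NO_LOG_PAGE)
		return DEMO_ERR_DEV;

	ap->commands_issued++;		/* stands in for issuing the command */
	return 0;
}
```

A driver for an affected controller would set the flag on its ports once, and all later log page reads would then take the early-return path.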

ea013a9b

Merge branch 'master' into for-4.4-fixes · 0b98f0c0

Committed by Tejun Heo on Dec 07, 2015

The following commit which went into mainline through networking tree

  3b13758f ("cgroups: Allow dynamically changing net_classid")

conflicts in net/core/netclassid_cgroup.c with the following pending
fix in cgroup/for-4.4-fixes.

  1f7dd3e5 ("cgroup: fix handling of multi-destination migration from subtree_control enabling")

The former separates out update_classid() from cgrp_attach() and
updates it to walk all fds of all tasks in the target css so that it
can be used from both migration and config change paths.  The latter
drops @css from cgrp_attach().

Resolve the conflict by making cgrp_attach() call update_classid()
with the css from the first task.  We can revive @tset walking in
cgrp_attach() but given that net_cls is v1 only where there always is
only one target css during migration, this is fine.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Nina Schiff <ninasc@fb.com>

0b98f0c0

virtio-net: Stop doing DMA from the stack · 2ac46030

Committed by Michael S. Tsirkin on Nov 15, 2015

Once virtio starts using the DMA API, we won't be able to safely DMA
from the stack.  virtio-net does a couple of config DMA requests
from small stack buffers -- switch to using dynamically-allocated
memory.

This should have no effect on any performance-critical code paths.

Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andy Lutomirski <luto@kernel.org>

2ac46030

Linux 4.4-rc4 · 527e9316

Committed by Linus Torvalds on Dec 06, 2015

527e9316

staging/lustre: remove IOC_LIBCFS_PING_TEST ioctl · d035e336

Committed by James Simmons on Dec 04, 2015

The ioctl IOC_LIBCFS_PING_TEST has not been used in ages.  The recent
nidstring changes moved all the nidstring operations from libcfs
to the LNet layer, but this ioctl code was still using a nidstring
operation, which caused a circular dependency loop between libcfs and
LNet.

Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d035e336

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · d8cd93ea

Committed by Linus Torvalds on Dec 06, 2015

Pull vfs fixes from Al Viro:
 "A couple of fixes (-stable fodder) + dead code removal after the
  overlayfs fix.

  I agree that it's better to separate from the fix part to make
  backporting easier, but IMO it's not worth delaying said dead code
  removal until the next window"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Don't reset ->total_link_count on nested calls of vfs_path_lookup()
  ovl: get rid of the dead code left from broken (and disabled) optimizations
  ovl: fix permission checking for setattr

d8cd93ea

Don't reset ->total_link_count on nested calls of vfs_path_lookup() · 2788cc47

Committed by Al Viro on Dec 06, 2015

we already zero it on outermost set_nameidata(), so initialization in
path_init() is pointless and wrong.  The same DoS exists on pre-4.2
kernels, but there a slightly different fix will be needed.

Cc: stable@vger.kernel.org # v4.2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2788cc47

ovl: get rid of the dead code left from broken (and disabled) optimizations · 0f7ff2da

Committed by Al Viro on Dec 06, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

0f7ff2da

openeuler / Kernel synced successfully more than 1 year ago