Commits · 5c1bec61fdfcd056df909a712e2a86bbaeb0f942 · openanolis / cloud-kernel

10 Feb, 2017 23 commits

vmbus: use kernel bitops for traversing interrupt mask · 5c1bec61

Committed by Stephen Hemminger on Feb 05, 2017

Use standard kernel operations for find first set bit to traverse
the channel bit array. This has the added benefit of speeding up
lookup on 64-bit systems, because it uses the find-first-set instruction.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
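
The traversal this commit describes can be sketched in user space. The following is a minimal illustration, not the vmbus code: `collect_set_bits_demo` is a hypothetical name, and the GCC/Clang builtin `__builtin_ctzl` stands in here for the kernel's find-first-set helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD_DEMO (sizeof(unsigned long) * CHAR_BIT)

/*
 * Collect the indices of all set bits in a bitmap of `nwords` words.
 * Returns the number of indices written to `out` (at most `max`).
 * Hypothetical helper for illustration only.
 */
size_t collect_set_bits_demo(const unsigned long *map, size_t nwords,
                             unsigned int *out, size_t max)
{
    size_t n = 0;

    assert(map != NULL && out != NULL);

    for (size_t w = 0; w < nwords; w++) {
        unsigned long word = map[w];

        /* Jump straight to each set bit instead of testing every bit. */
        while (word != 0 && n < max) {
            unsigned int bit = (unsigned int)__builtin_ctzl(word);

            out[n++] = (unsigned int)(w * BITS_PER_WORD_DEMO) + bit;
            word &= word - 1; /* clear the lowest set bit */
        }
    }
    return n;
}
```

Each inner iteration handles one set bit, so a sparse mask costs only as many steps as it has pending bits.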

5c1bec61

Drivers: hv: util: Fix a typo · bb6a4db9

Committed by K. Y. Srinivasan on Feb 04, 2017

Fix a typo.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bb6a4db9

hv_utils: implement Hyper-V PTP source · 3716a49a

Committed by Vitaly Kuznetsov on Feb 04, 2017

With TimeSync version 4 protocol support we started updating system time
continuously through the whole lifetime of Hyper-V guests. Every 5 seconds
there is a time sample from the host which triggers do_settimeofday[64]().
While the time from the host is very accurate such adjustments may cause
issues:
- Time is jumping forward and backward, some applications may misbehave.
- In case an NTP server runs in parallel and uses something else for time
  sync (network, PTP,...) system time will never converge.
- Systemd starts annoying you by printing "Time has been changed" every 5
  seconds to the system log.

Instead of doing in-kernel time adjustments offload the work to an
NTP client by exposing TimeSync messages as a PTP device. Users may now
decide what they want to use as a source.

I tested the solution with chrony, the config was:

 refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0

The result I'm seeing is accurate enough, the time delta between the guest
and the host is almost always within [-10us, +10us], the in-kernel solution
was giving us comparable results.

I also tried implementing a PPS device instead of PTP by using the
currently unused Hyper-V synthetic timers (we use only one of four for
clockevent), but with only a PPS source chrony wasn't able to give me
the required accuracy; the delta was often more than 100us.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3716a49a

hv: export current Hyper-V clocksource · dee863b5

Committed by Vitaly Kuznetsov on Feb 04, 2017

As a preparation to implementing Hyper-V PTP device supporting
.getcrosststamp we need to export a reference to the current Hyper-V
clocksource in use (MSR or TSC page).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dee863b5

Drivers: hv: Fix the bug in generating the guest ID · 9b06e101

Committed by K. Y. Srinivasan on Feb 04, 2017

Fix the bug in the generation of the guest ID. Without this fix
the host side telemetry code is broken.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Fixes: 352c9624 ("Drivers: hv: vmbus: Move the definition of generate_guest_id()")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

9b06e101

misc: panel: Abstract temporary backlight handling · fda4ae18

Committed by Geert Uytterhoeven on Feb 06, 2017

Currently the periodic scan timer is used for three purposes,
entangling keypad and display handling, which are both optional:
  1. Scanning the keypad,
  2. Flashing the backlight when a key is pressed,
  3. Disabling temporary backlighting after a fixed period of time.

Abstract the second purpose using a new lcd_poke() function.
Make the non-periodic temporary backlight handling independent from
keypad handling by converting it to a delayed workqueue.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fda4ae18

misc: panel: Add lcd_home() helper · 204a4f6d

Committed by Geert Uytterhoeven on Feb 06, 2017

Add a helper function to move the cursor to the home position, so
callers no longer need access to internal state.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

204a4f6d

misc: panel: Remove always-true check from panel_detach() · 3f77b439

Committed by Geert Uytterhoeven on Feb 06, 2017

panel_detach() already verified that pptr is a valid pointer.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3f77b439

misc: panel: Move all suboptions into a big if section · 9db3cf1c

Committed by Geert Uytterhoeven on Feb 06, 2017

All 18 suboptions related to the panel driver have individual
dependencies on PANEL.

Replace them by a single "if PANEL / endif # PANEL" section for easier
dependency management.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

9db3cf1c

misc: panel: Remove reference to misc device support · 731fcec4

Committed by Geert Uytterhoeven on Feb 06, 2017

As of commit 7c5763b8 ("drivers: misc: Remove MISC_DEVICES
config option"), misc device support no longer needs to be enabled
manually.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

731fcec4

misc: panel: Remove unused LCD_FLAG_S and LCD_FLAG_ID · e28fa714

Committed by Geert Uytterhoeven on Feb 06, 2017

These definitions were never used in any publicly available version
since (at least) 2004.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e28fa714

misc: panel: Remove PANEL_VERSION · 30f468b2

Committed by Geert Uytterhoeven on Feb 06, 2017

Hardcoded driver versions are so pre-git.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

30f468b2

misc: panel: Fix LCD_FLAG_F/LCD_FLAG_N exchange · 84a1ed04

Committed by Geert Uytterhoeven on Feb 06, 2017

LCD_FLAG_F is the font flag, LCD_FLAG_N is the two-lines flag.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

84a1ed04

w1: ds2405: use module_w1_family to simplify the code · 90beaf64

Committed by Wei Yongjun on Feb 09, 2017

module_w1_family() makes the code simpler by eliminating
boilerplate code.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

90beaf64

w1: ds2490: use kmemdup rather than duplicating its implementation · 45003a1e

Committed by Wei Yongjun on Feb 09, 2017

Use kmemdup rather than duplicating its implementation.

Generated by: scripts/coccinelle/api/memdup.cocci
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
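
The pattern this cleanup targets is an allocation immediately followed by a copy of the same length. A user-space sketch of the combined helper, assuming `malloc()` in place of the kernel allocator, might look like this. `memdup_demo` is a hypothetical name, not the kernel's `kmemdup()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * User-space stand-in for the allocate-then-copy idiom: allocate `len`
 * bytes and copy `src` into them. Returns NULL if the allocation fails.
 * Hypothetical helper for illustration only.
 */
void *memdup_demo(const void *src, size_t len)
{
    void *dst;

    assert(src != NULL);

    dst = malloc(len);
    if (dst != NULL)
        memcpy(dst, src, len);
    return dst;
}
```

Callers then need a single NULL check instead of an open-coded allocation, check, and copy.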

45003a1e

fpga zynq: Use the scatterlist interface · 425902f5

Committed by Jason Gunthorpe on Feb 01, 2017

This allows the driver to avoid a high order coherent DMA allocation
and memory copy. With this patch it can DMA directly from the kernel
pages that the bitfile is stored in.

Since this is now a gather DMA operation the driver uses the ISR
to feed the chip's DMA queue with each entry from the SGL.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Moritz Fischer <moritz.fischer@ettus.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

425902f5

fpga: Add scatterlist based programming · baa6d396

Committed by Jason Gunthorpe on Feb 01, 2017

Requiring contiguous kernel memory is not a good idea: it is a limited
resource and allocation can fail under normal workloads.

This introduces a .write_sg op that supporting drivers can provide
to DMA directly from discontiguous memory and a new entry point
fpga_mgr_buf_load_sg that users can call to directly provide page
lists.

The full matrix of compatibility is provided: either the linear or sg
interface can be used by the user with a driver supporting either
interface.

A notable change for drivers is that the .write op can now be called
multiple times.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Alan Tull <atull@opensource.altera.com>
Acked-by: Moritz Fischer <moritz.fischer@ettus.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

baa6d396

fpga zynq: Check the bitstream for validity · b496df86

Committed by Jason Gunthorpe on Feb 01, 2017

There is no sense in sending a bitstream we know will not work, and
with the variety of options for bitstream generation in Xilinx tools
it is not terribly clear what the correct input should be.

This is particularly important for Zynq since auto-correction was
removed from the driver and the Zynq hardware only accepts a bitstream
format that is different from what the Xilinx tools typically produce.

Worse, the hardware provides no indication why the bitstream fails,
it simply times out if the input is wrong.

The best option here is to have the kernel print a message informing
the user that they are using a malformed bitstream and that the
programming failure isn't due to any of the myriad other reasons.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Moritz Fischer <moritz.fischer@ettus.com>
Acked-by: Alan Tull <atull@opensource.altera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
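
One way such a validity check could work is to scan the buffer for a known marker before starting the transfer. The sketch below rests on assumptions that the commit message does not state: that the marker is the Xilinx sync word 0xAA995566, and that the hardware expects it in byte-swapped order at a 4-byte-aligned offset. `has_sync_word_demo` is a hypothetical name, not the driver's function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if `buf` contains the assumed sync word 0xAA995566 in
 * byte-swapped order (66 55 99 AA) at a 4-byte-aligned offset.
 * Both the marker value and its byte order are assumptions made for
 * this illustration.
 */
bool has_sync_word_demo(const uint8_t *buf, size_t count)
{
    assert(buf != NULL);

    for (; count >= 4; buf += 4, count -= 4) {
        if (buf[0] == 0x66 && buf[1] == 0x55 &&
            buf[2] == 0x99 && buf[3] == 0xAA)
            return true;
    }
    return false;
}
```

Rejecting the buffer up front lets the caller report a format problem directly rather than waiting for a hardware timeout.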

b496df86

fpga zynq: Check for errors after completing DMA · 6b45e0f2

Committed by Jason Gunthorpe on Feb 01, 2017

The completion did not check the interrupt status to see if any error
bits were asserted. Check the error bits and dump some registers if
things went wrong.

A few fixes are needed to make this work: the IXR_ERROR_FLAGS_MASK was
wrong because it included the done bits, which exposed a bug in
mask/unmask_irqs, which were using the wrong bits. Simplify all of this.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Moritz Fischer <moritz.fischer@ettus.com>
Acked-by: Alan Tull <atull@opensource.altera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6b45e0f2

mei: remove support for broken parallel read · cb97fbbc

Committed by Alexander Usyskin on Feb 08, 2017

Parallel reads from multiple threads on a file descriptor
are not well defined and racy. It is safer to return to the original
behavior and simply fail the additional read.
The solution is to remove the request for the next read credit.

Cc: <stable@vger.kernel.org> #4.9
Fixes: ff1586a7 ("mei: enqueue consecutive reads")
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cb97fbbc

drivers/fsi: add driver to device matches · dd37eed7

Committed by Jeremy Kerr on Feb 01, 2017

Drivers bind to devices based on the engine types & (optional) versions.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Chris Bostic <cbostic@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dd37eed7

drivers/fsi: Add device & driver definitions · fda07a6c

Committed by Jeremy Kerr on Feb 01, 2017

Add structs for fsi devices & drivers, and struct device conversion
functions.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Chris Bostic <cbostic@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fda07a6c

drivers/fsi: Add empty fsi bus definitions · 0508ad1f

Committed by Jeremy Kerr on Feb 01, 2017

This change adds the initial (empty) fsi bus definition, and introduces
drivers/fsi/.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Chris Bostic <cbostic@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0508ad1f

06 Feb, 2017 2 commits

Merge 4.10-rc7 into char-misc-next · 17fa87fe

Committed by Greg Kroah-Hartman on Feb 06, 2017

We want the hv and other fixes in here as well to handle merge and
testing issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

17fa87fe

Linux 4.10-rc7 · d5adbfcd

Committed by Linus Torvalds on Feb 05, 2017

d5adbfcd

05 Feb, 2017 5 commits

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a572a1b9

Committed by Linus Torvalds on Feb 04, 2017

Pull irq fixes from Thomas Gleixner:

 - Prevent double activation of interrupt lines, which causes problems
   on certain interrupt controllers

 - Handle the fallout of the above because x86 (ab)uses the activation
   function to reconfigure interrupts under the hood.

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Make irq activate operations symmetric
  irqdomain: Avoid activating interrupts more than once

a572a1b9

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 24bc5fe7

Committed by Linus Torvalds on Feb 04, 2017

Pull KVM fix from Radim Krčmář:
 "Fix a regression that prevented migration between hosts with different
  XSAVE features even if the missing features were not used by the guest
  (for stable)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: do not save guest-unsupported XSAVE state

24bc5fe7

Merge tag 'char-misc-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 412e6d3f

Committed by Linus Torvalds on Feb 04, 2017

Pull char/misc driver fixes from Greg KH:
 "Here are two bugfixes that resolve some reported issues. One in the
  firmware loader, that should fix the much-reported problem of crashes
  with it. The other is a hyperv fix for a reported regression.

  Both have been in linux-next for a week or so with no reported issues"

* tag 'char-misc-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Drivers: hv: vmbus: finally fix hv_need_to_signal_on_read()
  firmware: fix NULL pointer dereference in __fw_load_abort()

412e6d3f

Merge tag 'staging-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 252bf9f4

Committed by Linus Torvalds on Feb 04, 2017

Pull staging/IIO fixes from Greg KH:
 "Here are a few small IIO and one staging driver fix for 4.10-rc7. They
  fix some reported issues with the drivers.

  All of them have been in linux-next for a week or so with no reported
  issues"

* tag 'staging-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: greybus: timesync: validate platform state callback
  iio: dht11: Use usleep_range instead of msleep for start signal
  iio: adc: palmas_gpadc: retrieve a valid iio_dev in suspend/resume
  iio: health: max30100: fixed parenthesis around FIFO count check
  iio: health: afe4404: retrieve a valid iio_dev in suspend/resume
  iio: health: afe4403: retrieve a valid iio_dev in suspend/resume

252bf9f4

Merge tag 'usb-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 8fcdcc42

Committed by Linus Torvalds on Feb 04, 2017

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for some reported issues, and the usual
  number of new device ids for 4.10-rc7.

  All of these, except the last new device id, have been in linux-next
  for a while with no reported issues"

* tag 'usb-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: pl2303: add ATEN device ID
  usb: gadget: f_fs: Assorted buffer overflow checks.
  USB: Add quirk for WORLDE easykey.25 MIDI keyboard
  usb: musb: Fix external abort on non-linefetch for musb_irq_work()
  usb: musb: Fix host mode error -71 regression
  USB: serial: option: add device ID for HP lt2523 (Novatel E371)
  USB: serial: qcserial: add Dell DW5570 QDL

8fcdcc42

04 Feb, 2017 10 commits

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · a0a28644

Committed by Linus Torvalds on Feb 03, 2017

Pull SCSI fix from James Bottomley:
 "A single fix this time: a fix for a virtqueue removal bug which only
  appears to affect S390, but which results in the queue hanging forever
  thus causing the machine to fail shutdown"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: virtio_scsi: Reject commands when virtqueue is broken

a0a28644

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · a49e6f58

Committed by Linus Torvalds on Feb 03, 2017

Pull virtio/vhost fixes from Michael S. Tsirkin:
 "Last minute fixes:

   - ARM DMA fix revert

   - vhost endian-ness fix

   - MAINTAINERS: email address change for Amit"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  MAINTAINERS: update email address for Amit Shah
  vhost: fix initialization for vq->is_le
  Revert "vring: Force use of DMA API for ARM-based systems with legacy devices"

a49e6f58

Merge tag 'vfio-v4.10-rc7' of git://github.com/awilliam/linux-vfio · e9f7f17d

Committed by Linus Torvalds on Feb 03, 2017

Pull VFIO fix from Alex Williamson:
 "Fix an error path in SPAPR IOMMU backend (Alexey Kardashevskiy)"

* tag 'vfio-v4.10-rc7' of git://github.com/awilliam/linux-vfio:
  vfio/spapr: Fix missing mutex unlock when creating a window

e9f7f17d

Merge branch 'akpm' (patches from Andrew) · 7a92cc6b

Committed by Linus Torvalds on Feb 03, 2017

Merge fixes from Andrew Morton:
 "8 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, fs: check for fatal signals in do_generic_file_read()
  fs: break out of iomap_file_buffered_write on fatal signals
  base/memory, hotplug: fix a kernel oops in show_valid_zones()
  mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone()
  jump label: pass kbuild_cflags when checking for asm goto support
  shmem: fix sleeping from atomic context
  kasan: respect /proc/sys/kernel/traceoff_on_warning
  zswap: disable changing params if init fails

7a92cc6b

mm, fs: check for fatal signals in do_generic_file_read() · 5abf186a

Committed by Michal Hocko on Feb 03, 2017

do_generic_file_read() can be told to perform a large request from
userspace.  If the system is under OOM and the reading task is the OOM
victim then it has access to memory reserves and finishing the full
request can lead to full memory depletion, which is dangerous.  Make
sure we rather go with a short read and allow the killed task to
terminate.

Link: http://lkml.kernel.org/r/20170201092706.9966-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5abf186a

fs: break out of iomap_file_buffered_write on fatal signals · d1908f52

Committed by Michal Hocko on Feb 03, 2017

Tetsuo has noticed that an OOM stress test which performs large write
requests can cause full depletion of the memory reserves.  He has
tracked this down to the following path:

	__alloc_pages_nodemask+0x436/0x4d0
	alloc_pages_current+0x97/0x1b0
	__page_cache_alloc+0x15d/0x1a0          mm/filemap.c:728
	pagecache_get_page+0x5a/0x2b0           mm/filemap.c:1331
	grab_cache_page_write_begin+0x23/0x40   mm/filemap.c:2773
	iomap_write_begin+0x50/0xd0             fs/iomap.c:118
	iomap_write_actor+0xb5/0x1a0            fs/iomap.c:190
	? iomap_write_end+0x80/0x80             fs/iomap.c:150
	iomap_apply+0xb3/0x130                  fs/iomap.c:79
	iomap_file_buffered_write+0x68/0xa0     fs/iomap.c:243
	? iomap_write_end+0x80/0x80
	xfs_file_buffered_aio_write+0x132/0x390 [xfs]
	? remove_wait_queue+0x59/0x60
	xfs_file_write_iter+0x90/0x130 [xfs]
	__vfs_write+0xe5/0x140
	vfs_write+0xc7/0x1f0
	? syscall_trace_enter+0x1d0/0x380
	SyS_write+0x58/0xc0
	do_syscall_64+0x6c/0x200
	entry_SYSCALL64_slow_path+0x25/0x25

The OOM victim has access to all memory reserves so that it can make
forward progress and exit more easily.  But iomap_file_buffered_write
and other callers of iomap_apply loop to complete the full request.  We
need to check for fatal signals and back off with a short write instead.

As iomap_apply delegates all the work down to the actor we have to
hook into those.  All callers that work with the page cache are calling
iomap_write_begin so we will check for signals there.  dax_iomap_actor
has to handle the situation explicitly because it copies data to the
userspace directly.  Other callers, like iomap_page_mkwrite, work on a
single page, and others, like iomap_fiemap_actor, do not allocate memory
based on the given len.

Fixes: 68a9f5e7 ("xfs: implement iomap based buffered write path")
Link: http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d1908f52

base/memory, hotplug: fix a kernel oops in show_valid_zones() · a96dfddb

Committed by Toshi Kani on Feb 03, 2017

Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().

 BUG: unable to handle kernel paging request at ffffea017a000000
 IP: show_valid_zones+0x6f/0x160

This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB.  [1] An example of such
systems is described below.  0x3240000000 is only aligned by 1GiB and
this memory block starts from 0x3200000000, which is not backed by
struct page.

 BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable

Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range.  show_valid_zones() then proceeds with the valid range.

[1] 'Commit bdee237c ("x86: mm: Use 2GB memory block size on
    large-memory x86-64 systems")'

Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>	[4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
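
The address arithmetic in this message can be checked with a few lines of user-space C. This is only a worked example of the alignment claim. `is_aligned_demo` and `block_start_demo` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GIB_DEMO (UINT64_C(1) << 30)

/* True if `addr` is a multiple of `size`; `size` must be a power of two. */
bool is_aligned_demo(uint64_t addr, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* Start of the `block_size`-sized block containing `addr`. */
uint64_t block_start_demo(uint64_t addr, uint64_t block_size)
{
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
    return addr & ~(block_size - 1);
}
```

With the e820 range from the message, `is_aligned_demo(0x3240000000, GIB_DEMO)` holds while the 2GiB check does not, and `block_start_demo(0x3240000000, 2 * GIB_DEMO)` yields 0x3200000000, the block start that lies 1GiB below the first usable address.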

a96dfddb

mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone() · deb88a2a

Committed by Toshi Kani on Feb 03, 2017

Patch series "fix a kernel oops when reading sysfs valid_zones", v2.

A sysfs memory file is created for each 2GiB memory block on x86-64 when
the system has 64GiB or more memory.  [1] When the start address of a
memory block is not backed by struct page, i.e.  a memory range is not
aligned by 2GiB, reading its 'valid_zones' attribute file leads to a
kernel oops.  This issue was observed on multiple x86-64 systems with
more than 64GiB of memory.  This patch-set fixes this issue.

Patch 1 first fixes an issue in test_pages_in_a_zone(), which does not
test the start section.

Patch 2 then fixes the kernel oops by extending test_pages_in_a_zone()
to return valid [start, end).

Note for stable kernels: The memory block size change was made by commit
bdee237c ("x86: mm: Use 2GB memory block size on large-memory x86-64
systems"), which was accepted to 3.9.  However, this patch-set depends
on (and fixes) the change to test_pages_in_a_zone() made by commit
5f0f2887 ("mm/memory_hotplug.c: check for missing sections in
test_pages_in_a_zone()"), which was accepted to 4.4.

So, I recommend that we backport it up to 4.4.

[1] 'Commit bdee237c ("x86: mm: Use 2GB memory block size on
    large-memory x86-64 systems")'

This patch (of 2):

test_pages_in_a_zone() does not check 'start_pfn' when it is aligned by
section since 'sec_end_pfn' is set equal to 'pfn'.  Since this function
is called for testing the range of a sysfs memory file, 'start_pfn' is
always aligned by section.

Fix it by properly setting 'sec_end_pfn' to the next section pfn.

Also make sure that this function returns 1 only when the range belongs
to a zone.

Link: http://lkml.kernel.org/r/20170127222149.30893-2-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: <stable@vger.kernel.org>	[4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
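
The off-by-one described here is easy to reproduce with plain integer arithmetic. The sketch below assumes 32768 pages per section purely for illustration, and it models the fix as aligning up from `start_pfn + 1`, which is one way to reach "the next section pfn" but is not quoted from the patch. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed section size in pages, chosen for illustration only. */
#define PAGES_PER_SECTION_DEMO UINT64_C(32768)

/* Round `pfn` up to the next section boundary (identity if already aligned). */
uint64_t section_align_up_demo(uint64_t pfn)
{
    return (pfn + PAGES_PER_SECTION_DEMO - 1) & ~(PAGES_PER_SECTION_DEMO - 1);
}

/*
 * Number of pfns covered by the first [start_pfn, sec_end_pfn) window.
 * fixed == 0 models the old computation, fixed != 0 the corrected one.
 */
uint64_t first_section_span_demo(uint64_t start_pfn, int fixed)
{
    uint64_t sec_end_pfn = fixed ? section_align_up_demo(start_pfn + 1)
                                 : section_align_up_demo(start_pfn);

    assert(sec_end_pfn >= start_pfn);
    return sec_end_pfn - start_pfn;
}
```

For a section-aligned start such as pfn 65536, the old computation gives an empty first window, so the start section is never tested. The corrected one covers a full section of 32768 pfns.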

deb88a2a

jump label: pass kbuild_cflags when checking for asm goto support · 35f860f9

Committed by David Lin on Feb 03, 2017

Some versions of the ARM GCC compiler, such as the Android toolchain,
throw in a '-fpic' flag by default.  This causes the gcc-goto check
script to fail even though some configs have the '-fno-pic' flag in
KBUILD_CFLAGS.

This patch passes KBUILD_CFLAGS to the check script so that the
script does not rely on the default config from different compilers.

Link: http://lkml.kernel.org/r/20170120234329.78868-1-dtwlin@google.com
Signed-off-by: David Lin <dtwlin@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Michal Marek <mmarek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

35f860f9

shmem: fix sleeping from atomic context · 253fd0f0

Committed by Kirill A. Shutemov on Feb 03, 2017

Syzkaller fuzzer managed to trigger this:

    BUG: sleeping function called from invalid context at mm/shmem.c:852
    in_atomic(): 1, irqs_disabled(): 0, pid: 529, name: khugepaged
    3 locks held by khugepaged/529:
     #0:  (shrinker_rwsem){++++..}, at: [<ffffffff818d7ef1>] shrink_slab.part.59+0x121/0xd30 mm/vmscan.c:451
     #1:  (&type->s_umount_key#29){++++..}, at: [<ffffffff81a63630>] trylock_super+0x20/0x100 fs/super.c:392
     #2:  (&(&sbinfo->shrinklist_lock)->rlock){+.+.-.}, at: [<ffffffff818fd83e>] spin_lock include/linux/spinlock.h:302 [inline]
     #2:  (&(&sbinfo->shrinklist_lock)->rlock){+.+.-.}, at: [<ffffffff818fd83e>] shmem_unused_huge_shrink+0x28e/0x1490 mm/shmem.c:427
    CPU: 2 PID: 529 Comm: khugepaged Not tainted 4.10.0-rc5+ #201
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
       shmem_undo_range+0xb20/0x2710 mm/shmem.c:852
       shmem_truncate_range+0x27/0xa0 mm/shmem.c:939
       shmem_evict_inode+0x35f/0xca0 mm/shmem.c:1030
       evict+0x46e/0x980 fs/inode.c:553
       iput_final fs/inode.c:1515 [inline]
       iput+0x589/0xb20 fs/inode.c:1542
       shmem_unused_huge_shrink+0xbad/0x1490 mm/shmem.c:446
       shmem_unused_huge_scan+0x10c/0x170 mm/shmem.c:512
       super_cache_scan+0x376/0x450 fs/super.c:106
       do_shrink_slab mm/vmscan.c:378 [inline]
       shrink_slab.part.59+0x543/0xd30 mm/vmscan.c:481
       shrink_slab mm/vmscan.c:2592 [inline]
       shrink_node+0x2c7/0x870 mm/vmscan.c:2592
       shrink_zones mm/vmscan.c:2734 [inline]
       do_try_to_free_pages+0x369/0xc80 mm/vmscan.c:2776
       try_to_free_pages+0x3c6/0x900 mm/vmscan.c:2982
       __perform_reclaim mm/page_alloc.c:3301 [inline]
       __alloc_pages_direct_reclaim mm/page_alloc.c:3322 [inline]
       __alloc_pages_slowpath+0xa24/0x1c30 mm/page_alloc.c:3683
       __alloc_pages_nodemask+0x544/0xae0 mm/page_alloc.c:3848
       __alloc_pages include/linux/gfp.h:426 [inline]
       __alloc_pages_node include/linux/gfp.h:439 [inline]
       khugepaged_alloc_page+0xc2/0x1b0 mm/khugepaged.c:750
       collapse_huge_page+0x182/0x1fe0 mm/khugepaged.c:955
       khugepaged_scan_pmd+0xfdf/0x12a0 mm/khugepaged.c:1208
       khugepaged_scan_mm_slot mm/khugepaged.c:1727 [inline]
       khugepaged_do_scan mm/khugepaged.c:1808 [inline]
       khugepaged+0xe9b/0x1590 mm/khugepaged.c:1853
       kthread+0x326/0x3f0 kernel/kthread.c:227
       ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430

The iput() from atomic context was a bad idea: if after igrab() somebody
else calls iput() and we are left with the last inode reference, our
iput() would lead to inode eviction and therefore sleeping.

This patch should fix the situation.

Link: http://lkml.kernel.org/r/20170131093141.GA15899@node.shutemov.name
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

253fd0f0

openanolis / cloud-kernel: last synced successfully more than 1 year ago
