提交 · 7650cb80e4e90b0fae7854b6008a46d24360515f · openanolis / cloud-kernel

29 2月, 2016 5 次提交

X.509: Handle midnight alternative notation in GeneralizedTime · 7650cb80

由 David Howells 提交于 2月 24, 2016

The ASN.1 GeneralizedTime object carries an ISO 8601 format date and time.
The time is permitted to show midnight as 00:00 or 24:00 (the latter being
equivalent of 00:00 of the following day).

The permitted value is checked in x509_decode_time() but the actual
handling is left to mktime64().

Without this patch, certain X.509 certificates will be rejected and could
lead to an unbootable kernel.

Note that with this patch we also permit any 24:mm:ss time and extend this
to UTCTime, which whilst not strictly correct don't permit much leeway in
fiddling date strings.
Reported-by: NRudolf Polzer <rpolzer@google.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>
cc: David Woodhouse <David.Woodhouse@intel.com>
cc: John Stultz <john.stultz@linaro.org>

7650cb80

X.509: Support leap seconds · da02559c

由 David Howells 提交于 2月 24, 2016

The format of ASN.1 GeneralizedTime seems to be specified by ISO 8601
[X.680 46.3] and this apparently supports leap seconds (ie. the seconds
field is 60).  It's not entirely clear that ASN.1 expects it, but we can
relax the seconds check slightly for GeneralizedTime.

This results in us passing a time with sec as 60 to mktime64(), which
handles it as being a duplicate of the 0th second of the next minute.

We can't really do otherwise without giving the kernel much greater
knowledge of where all the leap seconds are.  Unfortunately, this would
require change the mapping of the kernel's current-time-in-seconds.

UTCTime, however, only supports a seconds value in the range 00-59, but for
the sake of simplicity allow this with UTCTime also.

Without this patch, certain X.509 certificates will be rejected,
potentially making a kernel unbootable.
Reported-by: NRudolf Polzer <rpolzer@google.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>
cc: David Woodhouse <David.Woodhouse@intel.com>
cc: John Stultz <john.stultz@linaro.org>

da02559c

Handle ISO 8601 leap seconds and encodings of midnight in mktime64() · ede5147d

由 David Howells 提交于 2月 24, 2016

Handle the following ISO 8601 features in mktime64():

 (1) Leap seconds.

     Leap seconds are indicated by the seconds parameter being the value
     60.  Handle this by treating it the same as 00 of the following
     minute.

     It has been pointed out that a minute may contain two leap seconds.
     However, pending discussion of what that looks like and how to handle
     it, I'm not going to concern myself with it.

 (2) Alternate encodings of midnight.

     Two different encodings of midnight are permitted - 00:00:00 and
     24:00:00 - the first is midnight today and the second is midnight
     tomorrow and is exactly equivalent to the first with tomorrow's date.

As it happens, we don't actually need to change mktime64() to handle either
of these - just comment them as valid parameters.

These facility will be used by the X.509 parser.  Doing it in mktime64()
makes the policy common to the whole kernel and easier to find.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>
cc: John Stultz <john.stultz@linaro.org>
cc: Rudolf Polzer <rpolzer@google.com>
cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>

ede5147d

X.509: Fix leap year handling again · ac4cbedf

由 David Howells 提交于 2月 24, 2016

There are still a couple of minor issues in the X.509 leap year handling:

 (1) To avoid doing a modulus-by-400 in addition to a modulus-by-100 when
     determining whether the year is a leap year or not, I divided the year
     by 100 after doing the modulus-by-100, thereby letting the compiler do
     one instruction for both, and then did a modulus-by-4.

     Unfortunately, I then passed the now-modified year value to mktime64()
     to construct a time value.

     Since this isn't a fast path and since mktime64() does a bunch of
     divisions, just condense down to "% 400".  It's also easier to read.

 (2) The default month length for any February where the year doesn't
     divide by four exactly is obtained from the month_length[] array where
     the value is 29, not 28.

     This is fixed by altering the table.
Reported-by: NRudolf Polzer <rpolzer@google.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NDavid Woodhouse <David.Woodhouse@intel.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>
cc: stable@vger.kernel.org

ac4cbedf

PKCS#7: fix unitialized boolean 'want' · 06aae592

由 Colin Ian King 提交于 2月 27, 2016

The boolean want is not initialized and hence garbage. The default should
be false (later it is only set to true on tne sinfo->authattrs check).

Found with static analysis using CoverityScan
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

06aae592

26 2月, 2016 3 次提交

KEYS: Use the symbol value for list size, updated by scripts/insert-sys-cert · 8e167898

由 Mehmet Kayaalp 提交于 11月 24, 2015

When a certificate is inserted to the image using scripts/writekey, the
value of __cert_list_end does not change. The updated size can be found
out by reading the value pointed by the system_certificate_list_size
symbol.
Signed-off-by: NMehmet Kayaalp <mkayaalp@linux.vnet.ibm.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

8e167898

KEYS: Reserve an extra certificate symbol for inserting without recompiling · c4c36105

由 Mehmet Kayaalp 提交于 11月 24, 2015

Place a system_extra_cert buffer of configurable size, right after the
system_certificate_list, so that inserted keys can be readily processed by
the existing mechanism. Added script takes a key file and a kernel image
and inserts its contents to the reserved area. The
system_certificate_list_size is also adjusted accordingly.

Call the script as:

    scripts/insert-sys-cert -b <vmlinux> -c <certfile>

If vmlinux has no symbol table, supply System.map file with -s flag.
Subsequent runs replace the previously inserted key, instead of appending
the new one.
Signed-off-by: NMehmet Kayaalp <mkayaalp@linux.vnet.ibm.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NMimi Zohar <zohar@linux.vnet.ibm.com>

c4c36105

modsign: hide openssl output in silent builds · 5d06ee20

由 Arnd Bergmann 提交于 2月 25, 2016

When a user calls 'make -s', we can assume they don't want to
see any output except for warnings and errors, but instead
they see this for a warning free build:

 ###
 ### Now generating an X.509 key pair to be used for signing modules.
 ###
 ### If this takes a long time, you might wish to run rngd in the
 ### background to keep the supply of entropy topped up.  It
 ### needs to be run as root, and uses a hardware random
 ### number generator if one is available.
 ###
 Generating a 4096 bit RSA private key
 .................................................................................................................................................................................................................................++
 ..............................................................................................................................++
 writing new private key to 'certs/signing_key.pem'
 -----
 ###
 ### Key pair generated.
 ###

The output can confuse simple build testing scripts that just check
for an empty build log.

This patch silences all the output:
 - "echo" is changed to "@$(kecho)", which is dropped when "-s" gets
   passed
 - the openssl command itself is only printed with V=1, using the
   $(Q) macro
 - The output of openssl gets redirected to /dev/null on "-s" builds.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

5d06ee20

19 2月, 2016 1 次提交

scripts/sign-file.c: Add support for signing with a raw signature · e5a2e3c8

由 Juerg Haefliger 提交于 2月 04, 2016

This patch adds support for signing a kernel module with a raw
detached PKCS#7 signature/message.

The signature is not converted and is simply appended to the module so
it needs to be in the right format. Using openssl, a valid signature can
be generated like this:
  $ openssl smime -sign -nocerts -noattr -binary -in <module> -inkey \
    <key> -signer <x509> -outform der -out <raw sig>

The resulting raw signature from the above command is (more or less)
identical to the raw signature that sign-file itself can produce like
this:
  $ scripts/sign-file -d <hash algo> <key> <x509> <module>
Signed-off-by: NJuerg Haefliger <juerg.haefliger@hpe.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

e5a2e3c8

18 2月, 2016 3 次提交

security/keys: make big_key.c explicitly non-modular · a1f2bdf3

由 Paul Gortmaker 提交于 12月 09, 2015

The Kconfig currently controlling compilation of this code is:

config BIG_KEYS
        bool "Large payload keys"

...meaning that it currently is not being built as a module by anyone.

Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We also delete the MODULE_LICENSE tag since all that information
is already contained at the top of the file in the comments.

Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: keyrings@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

a1f2bdf3

crypto: public_key: remove MPIs from public_key_signature struct · d846e78e

由 Tadeusz Struk 提交于 2月 02, 2016

After digsig_asymmetric.c is converted the MPIs can be now
safely removed from the public_key_signature structure.
Signed-off-by: NTadeusz Struk <tadeusz.struk@intel.com>
Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

d846e78e

integrity: convert digsig to akcipher api · eb5798f2

由 Tadeusz Struk 提交于 2月 02, 2016

Convert asymmetric_verify to akcipher api.
Signed-off-by: NTadeusz Struk <tadeusz.struk@intel.com>
Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

eb5798f2

10 2月, 2016 4 次提交

crypto: KEYS: convert public key and digsig asym to the akcipher api · db6c43bd

由 Tadeusz Struk 提交于 2月 02, 2016

This patch converts the module verification code to the new akcipher API.
Signed-off-by: NTadeusz Struk <tadeusz.struk@intel.com>
Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

db6c43bd

KEYS: CONFIG_KEYS_DEBUG_PROC_KEYS is no longer an option · 50d35015

由 David Howells 提交于 2月 03, 2016

CONFIG_KEYS_DEBUG_PROC_KEYS is no longer an option as /proc/keys is now
mandatory if the keyrings facility is enabled (it's used by libkeyutils in
userspace).

The defconfig references were removed with:

	perl -p -i -e 's/CONFIG_KEYS_DEBUG_PROC_KEYS=y\n//' \
	    `git grep -l CONFIG_KEYS_DEBUG_PROC_KEYS=y`

and the integrity Kconfig fixed by hand.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
cc: Andreas Ziegler <andreas.ziegler@fau.de>
cc: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>

50d35015

KEYS: Add an alloc flag to convey the builtinness of a key · 5d2787cf

由 David Howells 提交于 2月 09, 2016

Add KEY_ALLOC_BUILT_IN to convey that a key should have KEY_FLAG_BUILTIN
set rather than setting it after the fact.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NMimi Zohar <zohar@linux.vnet.ibm.com>

5d2787cf

v2 linux-next scripts/sign-file.c Fix LibreSSL support · 411a6f58

由 Codarren Velvindron 提交于 2月 09, 2016

In file included from scripts/sign-file.c:47:0:
/usr/include/openssl/cms.h:62:2: error: #error CMS is disabled.
 #error CMS is disabled.
  ^
scripts/Makefile.host:91: recipe for target 'scripts/sign-file' failed
make[1]: *** [scripts/sign-file] Error 1
Makefile:567: recipe for target 'scripts' failed
make: *** [scripts] Error 2


Fix SSL headers so that the kernel can build with LibreSSL
Signed-off-by: NCodarren Velvindron <codarren@hackers.mu>
Acked-by: NDavid Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

411a6f58

08 2月, 2016 3 次提交

L

Linux 4.5-rc3 · 388f7b1d
由 Linus Torvalds 提交于 2月 07, 2016

388f7b1d

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · c17dfb01

由 Linus Torvalds 提交于 2月 07, 2016

Pull ARM SoC fixes from Olof Johansson:
 "The first real batch of fixes for this release cycle, so there are a
  few more than usual.

  Most of these are fixes and tweaks to board support (DT bugfixes,
  etc).  I've also picked up a couple of small cleanups that seemed
  innocent enough that there was little reason to wait (const/
  __initconst and Kconfig deps).

  Quite a bit of the changes on OMAP were due to fixes to no longer
  write to rodata from assembly when ARM_KERNMEM_PERMS was enabled, but
  there were also other fixes.

  Kirkwood had a bunch of gpio fixes for some boards.  OMAP had RTC
  fixes on OMAP5, and Nomadik had changes to MMC parameters in DT.

  All in all, mostly the usual mix of various fixes"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits)
  ARM: multi_v7_defconfig: enable DW_WATCHDOG
  ARM: nomadik: fix up SD/MMC DT settings
  ARM64: tegra: Add chosen node for tegra132 norrin
  ARM: realview: use "depends on" instead of "if" after prompt
  ARM: tango: use "depends on" instead of "if" after prompt
  ARM: tango: use const and __initconst for smp_operations
  ARM: realview: use const and __initconst for smp_operations
  bus: uniphier-system-bus: revive tristate prompt
  arm64: dts: Add missing DMA Abort interrupt to Juno
  bus: vexpress-config: Add missing of_node_put
  ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings
  ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux
  ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux
  ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency
  ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2
  ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address
  ARM: dts: LogicPD Torpedo: Revert Duplicative Entries
  ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types
  ARM: dts: am4372: fix irq type for arm twd and global timer
  ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type
  ...

c17dfb01

Merge branch 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration · 63fee123

由 Linus Torvalds 提交于 2月 07, 2016

Pull mailbox fixes from Jassi Brar:

 - fix getting element from the pcc-channels array by simply indexing
   into it

 - prevent building mailbox-test driver for archs that don't have IOMEM

* 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: Fix dependencies for !HAS_IOMEM archs
  mailbox: pcc: fix channel calculation in get_pcc_channel()

63fee123

07 2月, 2016 2 次提交

Merge tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 46df55ce

由 Linus Torvalds 提交于 2月 06, 2016

Pull USB fixes from Greg KH:
 "Here are some USB fixes for 4.5-rc3.

  The usual, xhci fixes for reported issues, combined with some small
  gadget driver fixes, and a MAINTAINERS file update.  All have been in
  linux-next with no reported issues"

* tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  xhci: harden xhci_find_next_ext_cap against device removal
  xhci: Fix list corruption in urb dequeue at host removal
  usb: host: xhci-plat: fix NULL pointer in probe for device tree case
  usb: xhci-mtk: fix AHB bus hang up caused by roothubs polling
  usb: xhci-mtk: fix bpkts value of LS/HS periodic eps not behind TT
  usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms
  usb: xhci: set SSIC port unused only if xhci_suspend succeeds
  usb: xhci: add a quirk bit for ssic port unused
  usb: xhci: handle both SSIC ports in PME stuck quirk
  usb: dwc3: gadget: set the OTG flag in dwc3 gadget driver.
  Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"
  MAINTAINERS: fix my email address
  usb: dwc2: Fix probe problem on bcm2835
  Revert "usb: dwc2: Move reset into dwc2_get_hwparams()"
  usb: musb: ux500: Fix NULL pointer dereference at system PM
  usb: phy: mxs: declare variable with initialized value
  usb: phy: msm: fix error handling in probe.

46df55ce

Merge tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · dacd53c8

由 Linus Torvalds 提交于 2月 06, 2016

Pull staging and IIO driver fixes from Greg KH:
 "Here are some IIO and staging driver fixes for 4.5-rc3.

  All of them, except one, are for IIO drivers, and one is for a speakup
  driver fix caused by some earlier patches, to resolve a reported build
  failure"

* tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  Staging: speakup: Fix allyesconfig build on mn10300
  iio: dht11: Use boottime
  iio: ade7753: avoid uninitialized data
  iio: pressure: mpl115: fix temperature offset sign
  iio: imu: Fix dependencies for !HAS_IOMEM archs
  staging: iio: Fix dependencies for !HAS_IOMEM archs
  iio: adc: Fix dependencies for !HAS_IOMEM archs
  iio: inkern: fix a NULL dereference on error
  iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer.
  iio: light: acpi-als: Report data as processed
  iio: dac: mcp4725: set iio name property in sysfs
  iio: add HAS_IOMEM dependency to VF610_ADC
  iio: add IIO_TRIGGER dependency to STK8BA50
  iio: proximity: lidar: correct return value
  iio-light: Use a signed return type for ltr501_match_samp_freq()

dacd53c8

06 2月, 2016 19 次提交

Merge branch 'akpm' (patches from Andrew) · 5af9c2e1

由 Linus Torvalds 提交于 2月 05, 2016

Merge fixes from Andrew Morton:
 "22 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
  epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT
  radix-tree: fix oops after radix_tree_iter_retry
  MAINTAINERS: trim the file triggers for ABI/API
  dax: dirty inode only if required
  thp: make deferred_split_scan() work again
  mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
  ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup
  um: asm/page.h: remove the pte_high member from struct pte_t
  mm, hugetlb: don't require CMA for runtime gigantic pages
  mm/hugetlb: fix gigantic page initialization/allocation
  mm: downgrade VM_BUG in isolate_lru_page() to warning
  mempolicy: do not try to queue pages from !vma_migratable()
  mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress
  vmstat: make vmstat_update deferrable
  mm, vmstat: make quiet_vmstat lighter
  mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT
  memblock: don't mark memblock_phys_mem_size() as __init
  dump_stack: avoid potential deadlocks
  mm: validate_mm browse_rb SMP race condition
  m32r: fix build failure due to SMP and MMU
  ...

5af9c2e1

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 5d6a6a75

由 Linus Torvalds 提交于 2月 05, 2016

Pull Ceph fixes from Sage Weil:
 "We have a few wire protocol compatibility fixes, ports of a few recent
  CRUSH mapping changes, and a couple error path fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: MOSDOpReply v7 encoding
  libceph: advertise support for TUNABLES5
  crush: decode and initialize chooseleaf_stable
  crush: add chooseleaf_stable tunable
  crush: ensure take bucket value is valid
  crush: ensure bucket id is valid before indexing buckets array
  ceph: fix snap context leak in error path
  ceph: checking for IS_ERR instead of NULL

5d6a6a75

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 9b108828

由 Linus Torvalds 提交于 2月 05, 2016

Pull drm fixes from Dave Airlie:
 "Fixes all over the place:

   - amdkfd: two static checker fixes
   - mst: a bunch of static checker and spec/hw interaction fixes
   - amdgpu: fix Iceland hw properly, and some fiji bugs, along with
     some write-combining fixes.
   - exynos: some regression fixes
   - adv7511: fix some EDID reading issues"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (38 commits)
  drm/dp/mst: deallocate payload on port destruction
  drm/dp/mst: Reverse order of MST enable and clearing VC payload table.
  drm/dp/mst: move GUID storage from mgr, port to only mst branch
  drm/dp/mst: change MST detection scheme
  drm/dp/mst: Calculate MST PBN with 31.32 fixed point
  drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
  drm/mst: Add range check for max_payloads during init
  drm/mst: Don't ignore the MST PBN self-test result
  drm: fix missing reference counting decrease
  drm/amdgpu: disable uvd and vce clockgating on Fiji
  drm/amdgpu: remove exp hardware support from iceland
  drm/amdgpu: load MEC ucode manually on iceland
  drm/amdgpu: don't load MEC2 on topaz
  drm/amdgpu: drop topaz support from gmc8 module
  drm/amdgpu: pull topaz gmc bits into gmc_v7
  drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above
  drm/amdgpu: iceland use CI based MC IP
  drm/amdgpu: move gmc7 support out of CIK dependency
  drm/amdgpu/gfx7: enable cp inst/reg error interrupts
  drm/amdgpu/gfx8: enable cp inst/reg error interrupts
  ...

9b108828

Merge tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 22f60701

由 Linus Torvalds 提交于 2月 05, 2016

Pull power management and ACPI fixes from Rafael Wysocki:
 "These are: a fix for a recently introduced false-positive warnings
  about PM domain pointers being changed inappropriately (harmless but
  annoying), an MCH size workaround quirk for one more platform, a
  compiler warning fix (generic power domains framework), an ACPI LPSS
  (Intel SoCs) driver fixup and a cleanup of the ACPI CPPC core code.

  Specifics:

   - PM core fix to avoid false-positive warnings generated when the
     pm_domain field is cleared for a device that appears to be bound to
     a driver (Rafael Wysocki).

   - New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer).

   - Fix for an "unused function" compiler warning in the generic power
     domains framework (Ulf Hansson).

   - Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set the PM
     domain pointer of a device properly in one place that was
     overlooked by a recent PM core update (Andy Shevchenko).

   - Removal of a redundant function declaration in the ACPI CPPC core
     code (Timur Tabi)"

* tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: Avoid false-positive warnings in dev_pm_domain_set()
  PM / Domains: Silence compiler warning for an unused function
  ACPI / CPPC: remove redundant mbox_send_message() declaration
  ACPI / LPSS: set PM domain via helper setter
  PNP: Add Haswell-ULT to Intel MCH size workaround

22f60701

epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT · b6a515c8

由 Jason Baron 提交于 2月 05, 2016

In the current implementation of the EPOLLEXCLUSIVE flag (added for
4.5-rc1), if epoll waiters create different POLL* sets and register them
as exclusive against the same target fd, the current implementation will
stop waking any further waiters once it finds the first idle waiter.
This means that waiters could miss wakeups in certain cases.

For example, when we wake up a pipe for reading we do:
wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM); So if
one epoll set or epfd is added to pipe p with POLLIN and a second set
epfd2 is added to pipe p with POLLRDNORM, only epfd may receive the
wakeup since the current implementation will stop after it finds any
intersection of events with a waiter that is blocked in epoll_wait().

We could potentially address this by requiring all epoll waiters that
are added to p be required to pass the same set of POLL* events.  IE the
first EPOLL_CTL_ADD that passes EPOLLEXCLUSIVE establishes the set POLL*
flags to be used by any other epfds that are added as EPOLLEXCLUSIVE.
However, I think it might be somewhat confusing interface as we would
have to reference count the number of users for that set, and so
userspace would have to keep track of that count, or we would need a
more involved interface.  It also adds some shared state that we'd have
store somewhere.  I don't think anybody will want to bloat
__wait_queue_head for this.

I think what we could do instead, is to simply restrict EPOLLEXCLUSIVE
such that it can only be specified with EPOLLIN and/or EPOLLOUT.  So
that way if the wakeup includes 'POLLIN' and not 'POLLOUT', we can stop
once we hit the first idle waiter that specifies the EPOLLIN bit, since
any remaining waiters that only have 'POLLOUT' set wouldn't need to be
woken.  Likewise, we can do the same thing if 'POLLOUT' is in the wakeup
bit set and not 'POLLIN'.  If both 'POLLOUT' and 'POLLIN' are set in the
wake bit set (there is at least one example of this I saw in fs/pipe.c),
then we just wake the entire exclusive list.  Having both 'POLLOUT' and
'POLLIN' both set should not be on any performance critical path, so I
think that's ok (in fs/pipe.c its in pipe_release()).  We also continue
to include EPOLLERR and EPOLLHUP by default in any exclusive set.  Thus,
the user can specify EPOLLERR and/or EPOLLHUP but is not required to do
so.

Since epoll waiters may be interested in other events as well besides
EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP, these can still be added by
doing a 'dup' call on the target fd and adding that as one normally
would with EPOLL_CTL_ADD.  Since I think that the POLLIN and POLLOUT
events are what we are interest in balancing, I think that the 'dup'
thing could perhaps be added to only one of the waiter threads.
However, I think that EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP should be
sufficient for the majority of use-cases.

Since EPOLLEXCLUSIVE is intended to be used with a target fd shared
among multiple epfds, where between 1 and n of the epfds may receive an
event, it does not satisfy the semantics of EPOLLONESHOT where only 1
epfd would get an event.  Thus, it is not allowed to be specified in
conjunction with EPOLLEXCLUSIVE.

EPOLL_CTL_MOD is also not allowed if the fd was previously added as
EPOLLEXCLUSIVE.  It seems with the limited number of flags to not be as
interesting, but this could be relaxed at some further point.
Signed-off-by: NJason Baron <jbaron@akamai.com>
Tested-by: NMadars Vitolins <m@silodev.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b6a515c8

radix-tree: fix oops after radix_tree_iter_retry · 73204282

由 Konstantin Khlebnikov 提交于 2月 05, 2016

Helper radix_tree_iter_retry() resets next_index to the current index.
In following radix_tree_next_slot current chunk size becomes zero.  This
isn't checked and it tries to dereference null pointer in slot.

Tagged iterator is fine because retry happens only at slot 0 where tag
bitmask in iter->tags is filled with single bit.

Fixes: 46437f9a ("radix-tree: fix race in gang lookup")
Signed-off-by: NKonstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

73204282

MAINTAINERS: trim the file triggers for ABI/API · b14fd334

由 Michael Kerrisk (man-pages) 提交于 2月 05, 2016

Commit ea8f8fc8 ("MAINTAINERS: add linux-api for review of API/ABI
changes") added file triggers for various paths that likely indicated
API/ABI changes.  However, catching all changes in Documentation/ABI/
and include/uapi/ produces a large volume of mail to linux-api, rather
than only API/ABI changes.  Drop those two entries, but leave
include/linux/syscalls.h and kernel/sys_ni.c to catch syscall-related
changes.

[josh@joshtriplett.org: redid changelog]
Signed-off-by: NMichael Kerrisk <mtk.man-pages@gmail.com>
Acked-by: NShuah khan <shuahkh@osg.samsung.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b14fd334

dax: dirty inode only if required · d2b2a28e

由 Dmitry Monakhov 提交于 2月 05, 2016

Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: NJan Kara <jack@suse.cz>
Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d2b2a28e

thp: make deferred_split_scan() work again · ae026204

由 Kirill A. Shutemov 提交于 2月 05, 2016

We need to iterate over split_queue, not local empty list to get
anything split from the shrinker.

Fixes: e3ae1953 ("thp: limit number of object to scan on deferred_split_scan()")
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ae026204

mm: replace vma_lock_anon_vma with anon_vma_lock_read/write · 12352d3c

由 Konstantin Khlebnikov 提交于 2月 05, 2016

Sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
anon_vma appeared between lock and unlock. We have to check anon_vma
first or call anon_vma_prepare() to be sure that it's here. There are
only few users of these legacy helpers. Let's get rid of them.

This patch fixes anon_vma lock imbalance in validate_mm(). Write lock
isn't required here, read lock is enough.

And reorders expand_downwards/expand_upwards: security_mmap_addr() and
wrapping-around check don't have to be under anon vma lock.

Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.comSigned-off-by: NKonstantin Khlebnikov <koct9i@gmail.com>
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

12352d3c

ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup · c95a5180

由 xuejiufei 提交于 2月 05, 2016

When recovery master down, dlm_do_local_recovery_cleanup() only remove
the $RECOVERY lock owned by dead node, but do not clear the refmap bit.
Which will make umount thread falling in dead loop migrating $RECOVERY
to the dead node.
Signed-off-by: Nxuejiufei <xuejiufei@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c95a5180

um: asm/page.h: remove the pte_high member from struct pte_t · 012a4163

由 Nicolai Stange 提交于 2月 05, 2016

Commit 16da3068 ("um: kill pfn_t") introduced a compile warning for
defconfig (SUBARCH=i386):

  arch/um/kernel/skas/mmu.c:38:206:
      warning: right shift count >= width of type [-Wshift-count-overflow]

Aforementioned patch changes the definition of the phys_to_pfn() macro
from

  ((pfn_t) ((p) >> PAGE_SHIFT))

to

  ((p) >> PAGE_SHIFT)

This effectively changes the phys_to_pfn() expansion's type from
unsigned long long to unsigned long.

Through the callchain init_stub_pte() => mk_pte(), the expansion of
phys_to_pfn() is (indirectly) fed into the 'phys' argument of the
pte_set_val(pte, phys, prot) macro, eventually leading to

  (pte).pte_high = (phys) >> 32;

This results in the warning from above.

Since UML only deals with 32 bit addresses, the upper 32 bits from
'phys' used to be always zero anyway.  Also, all page protection flags
defined by UML don't use any bits beyond bit 9.  Since the contents of a
PTE are defined within architecture scope only, the ->pte_high member
can be safely removed.

Remove the ->pte_high member from struct pte_t.
Rename ->pte_low to ->pte.
Adapt the pte helper macros in arch/um/include/asm/page.h.

Noteworthy is the pte_copy() macro where a smp_wmb() gets dropped.  This
write barrier doesn't seem to be paired with any read barrier though and
thus, was useless anyway.

Fixes: 16da3068 ("um: kill pfn_t")
Signed-off-by: NNicolai Stange <nicstange@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

012a4163

mm, hugetlb: don't require CMA for runtime gigantic pages · 080fe206

由 Vlastimil Babka 提交于 2月 05, 2016

Commit 944d9fec ("hugetlb: add support for gigantic page allocation
at runtime") has added the runtime gigantic page allocation via
alloc_contig_range(), making this support available only when CONFIG_CMA
is enabled.  Because it doesn't depend on MIGRATE_CMA pageblocks and the
associated infrastructure, it is possible with few simple adjustments to
require only CONFIG_MEMORY_ISOLATION instead of full CONFIG_CMA.

After this patch, alloc_contig_range() and related functions are
available and used for gigantic pages with just CONFIG_MEMORY_ISOLATION
enabled.  Note CONFIG_CMA selects CONFIG_MEMORY_ISOLATION.  This allows
supporting runtime gigantic pages without the CMA-specific checks in
page allocator fastpaths.
Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

080fe206

mm/hugetlb: fix gigantic page initialization/allocation · b4330afb

由 Mike Kravetz 提交于 2月 05, 2016

Attempting to preallocate 1G gigantic huge pages at boot time with
"hugepagesz=1G hugepages=1" on the kernel command line will prevent
booting with the following:

  kernel BUG at mm/hugetlb.c:1218!

When mapcount accounting was reworked, the setting of
compound_mapcount_ptr in prep_compound_gigantic_page was overlooked.  As
a result, the validation of mapcount in free_huge_page fails.

The "BUG_ON" checks in free_huge_page were also changed to
"VM_BUG_ON_PAGE" to assist with debugging.

Fixes: 53f9263b ("mm: rework mapcount accounting to enable 4k mapping of THPs")
Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: NDavid Rientjes <rientjes@google.com>
Tested-by: NVlastimil Babka <vbabka@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b4330afb

mm: downgrade VM_BUG in isolate_lru_page() to warning · cf2a82ee

由 Kirill A. Shutemov 提交于 2月 05, 2016

Calling isolate_lru_page() is wrong and shouldn't happen, but it not
nessesary fatal: the page just will not be isolated if it's not on LRU.

Let's downgrade the VM_BUG_ON_PAGE() to WARN_RATELIMIT().
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cf2a82ee

mempolicy: do not try to queue pages from !vma_migratable() · 77bf45e7

由 Kirill A. Shutemov 提交于 2月 05, 2016

Maybe I miss some point, but I don't see a reason why we try to queue
pages from non migratable VMAs.

This testcase steps on VM_BUG_ON_PAGE() in isolate_lru_page():

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <numaif.h>

    #define SIZE 0x2000

    int foo;

    int main()
    {
        int fd;
        char *p;
        unsigned long mask = 2;

        fd = open("/dev/sg0", O_RDWR);
        p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        /* Faultin pages */
        foo = p[0] + p[0x1000];
        mbind(p, SIZE, MPOL_BIND, &mask, 4, MPOL_MF_MOVE | MPOL_MF_STRICT);
        return 0;
    }

The only case when we can queue pages from such VMA is MPOL_MF_STRICT
plus MPOL_MF_MOVE or MPOL_MF_MOVE_ALL for VMA which has pages on LRU,
but gfp mask is not sutable for migaration (see mapping_gfp_mask() check
in vma_migratable()).  That's looks like a bug to me.

Let's filter out non-migratable vma at start of queue_pages_test_walk()
and go to queue_pages_pte_range() only if MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL flag is set.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

77bf45e7

mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress · 564e81a5

由 Tetsuo Handa 提交于 2月 05, 2016

Jan Stancek has reported that system occasionally hanging after "oom01"
testcase from LTP triggers OOM.  Guessing from a result that there is a
kworker thread doing memory allocation and the values between "Node 0
Normal free:" and "Node 0 Normal:" differs when hanging, vmstat is not
up-to-date for some reason.

According to commit 373ccbe5 ("mm, vmstat: allow WQ concurrency to
discover memory reclaim doesn't make any progress"), it meant to force
the kworker thread to take a short sleep, but it by error used
schedule_timeout(1).  We missed that schedule_timeout() in state
TASK_RUNNING doesn't do anything.

Fix it by using schedule_timeout_uninterruptible(1) which forces the
kworker thread to take a short sleep in order to make sure that vmstat
is up-to-date.

Fixes: 373ccbe5 ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress")
Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: NJan Stancek <jstancek@redhat.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Cristopher Lameter <clameter@sgi.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Arkadiusz Miskiewicz <arekm@maven.pl>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

564e81a5

vmstat: make vmstat_update deferrable · ccde8bd4

由 Michal Hocko 提交于 2月 05, 2016

Commit 0eb77e98 ("vmstat: make vmstat_updater deferrable again and
shut down on idle") made vmstat_shepherd deferrable.  vmstat_update
itself is still useing standard timer which might interrupt idle task.
This is possible because "mm, vmstat: make quiet_vmstat lighter" removed
cancel_delayed_work from the quiet_vmstat.

Change vmstat_work to use DEFERRABLE_WORK to prevent from pointless
wakeups from the idle context.
Acked-by: NChristoph Lameter <cl@linux.com>
Signed-off-by: NMichal Hocko <mhocko@suse.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ccde8bd4

mm, vmstat: make quiet_vmstat lighter · f01f17d3

由 Michal Hocko 提交于 2月 05, 2016

Mike has reported a considerable overhead of refresh_cpu_vm_stats from
the idle entry during pipe test:

    12.89%  [kernel]       [k] refresh_cpu_vm_stats.isra.12
     4.75%  [kernel]       [k] __schedule
     4.70%  [kernel]       [k] mutex_unlock
     3.14%  [kernel]       [k] __switch_to

This is caused by commit 0eb77e98 ("vmstat: make vmstat_updater
deferrable again and shut down on idle") which has placed quiet_vmstat
into cpu_idle_loop.  The main reason here seems to be that the idle
entry has to get over all zones and perform atomic operations for each
vmstat entry even though there might be no per cpu diffs.  This is a
pointless overhead for _each_ idle entry.

Make sure that quiet_vmstat is as light as possible.

First of all it doesn't make any sense to do any local sync if the
current cpu is already set in oncpu_stat_off because vmstat_update puts
itself there only if there is nothing to do.

Then we can check need_update which should be a cheap way to check for
potential per-cpu diffs and only then do refresh_cpu_vm_stats.

The original patch also did cancel_delayed_work which we are not doing
here.  There are two reasons for that.  Firstly cancel_delayed_work from
idle context will blow up on RT kernels (reported by Mike):

  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rt3 #7
  Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
  Call Trace:
    dump_stack+0x49/0x67
    ___might_sleep+0xf5/0x180
    rt_spin_lock+0x20/0x50
    try_to_grab_pending+0x69/0x240
    cancel_delayed_work+0x26/0xe0
    quiet_vmstat+0x75/0xa0
    cpu_idle_loop+0x38/0x3e0
    cpu_startup_entry+0x13/0x20
    start_secondary+0x114/0x140

And secondly, even on !RT kernels it might add some non trivial overhead
which is not necessary.  Even if the vmstat worker wakes up and preempts
idle then it will be most likely a single shot noop because the stats
were already synced and so it would end up on the oncpu_stat_off anyway.
We just need to teach both vmstat_shepherd and vmstat_update to stop
scheduling the worker if there is nothing to do.

[mgalbraith@suse.de: cancel pending work of the cpu_stat_off CPU]
Signed-off-by: NMichal Hocko <mhocko@suse.com>
Reported-by: NMike Galbraith <umgwanakikbuti@gmail.com>
Acked-by: NChristoph Lameter <cl@linux.com>
Signed-off-by: NMike Galbraith <mgalbraith@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f01f17d3

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功