提交 · c1aa905a304e4b5e6a3fe112ec62d9c1c7b0c155 · openeuler / Kernel

09 3月, 2017 2 次提交

Revert "scsi, block: fix duplicate bdi name registration crashes" · c01228db

由 Jan Kara 提交于 3月 08, 2017

This reverts commit 0dba1314. It causes
leaking of device numbers for SCSI when SCSI registers multiple gendisks
for one request_queue in succession. It can be easily reproduced using
Omar's script [1] on kernel with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
Furthermore the protection provided by this commit is not needed anymore
as the problem it was fixing got also fixed by commit 165a5e22
"block: Move bdi_unregister() to del_gendisk()".

[1]: http://marc.info/?l=linux-block&m=148554717109098&w=2Signed-off-by: NJan Kara <jack@suse.cz>
Acked-by: NDan Williams <dan.j.williams@intel.com>
Tested-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

c01228db

zram: set physical queue limits to avoid array out of bounds accesses · 0bc31538

由 Johannes Thumshirn 提交于 3月 06, 2017

zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using
the NVMe over Fabrics loopback target which potentially sends a huge bulk of
pages attached to the bio's bvec this results in a kernel panic because of
array out of bounds accesses in zram_decompress_page().
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Reviewed-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

0bc31538

08 3月, 2017 4 次提交

PCI/ASPM: Always set link->downstream to avoid NULL dereference on remove · 3bd7db63

由 Yinghai Lu 提交于 3月 01, 2017

We call pcie_aspm_exit_link_state() when we remove a device.  If the device
is the last PCIe function to be removed below a bridge and the bridge has
an ASPM link_state struct, we disable ASPM on the link.  Disabling ASPM
requires link->downstream (used in pcie_config_aspm_link()).

We previously set link->downstream in pcie_aspm_cap_init(), but only if the
device was not blacklisted.  Removing the blacklisted device caused a NULL
pointer dereference in the pcie_aspm_exit_link_state() ->
pcie_config_aspm_link() path:

  # echo 1 > /sys/bus/pci/devices/0000\:0b\:00.0/remove
  ...
   BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
   IP: pcie_config_aspm_link+0x5d/0x2b0
   Call Trace:
    pcie_aspm_exit_link_state+0x75/0x130
    pci_stop_bus_device+0xa4/0xb0
    pci_stop_and_remove_bus_device_locked+0x1a/0x30
    remove_store+0x50/0x70
    dev_attr_store+0x18/0x30
    sysfs_kf_write+0x44/0x60
    kernfs_fop_write+0x10e/0x190
    __vfs_write+0x28/0x110
    ? rcu_read_lock_sched_held+0x5d/0x80
    ? rcu_sync_lockdep_assert+0x2c/0x60
    ? __sb_start_write+0x173/0x1a0
    ? vfs_write+0xb3/0x180
    vfs_write+0xc4/0x180
    SyS_write+0x49/0xa0
    do_syscall_64+0xa6/0x1c0
    entry_SYSCALL64_slow_path+0x25/0x25
   ---[ end trace bd187ee0267df5d9 ]---

To avoid this, set link->downstream in alloc_pcie_link_state(), so every
pcie_link_state structure has a valid link->downstream pointer.

[bhelgaas: changelog]
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Acked-by: NRajat Jain <rajatja@google.com>
CC: stable@vger.kernel.org

3bd7db63

PCI: Prevent VPD access for QLogic ISP2722 · 0d5370d1

由 Ethan Zhao 提交于 2月 27, 2017

QLogic ISP2722-based 16/32Gb Fibre Channel to PCIe Adapter has the VPD
access issue too, while read the common pci-sysfs access interface shown as

 /sys/devices/pci0000:00/0000:00:03.2/0000:0b:00.0/vpd

with simple 'cat' could cause system hang and panic:

  Kernel panic - not syncing: An NMI occurred. Depending on your system the reason for the NMI is logged in any one of the following resources:
  1. Integrated Management Log (IML)
  2. OA Syslog
  3. OA Forward Progress Log
  4. iLO Event Log
  CPU: 0 PID: 15070 Comm: udevadm Not tainted 4.1.12
  Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015
   0000000000000086 000000007f0cdf51 ffff880c4fa05d58 ffffffff817193de
   ffffffffa00b42d8 0000000000000075 ffff880c4fa05dd8 ffffffff81714072
   0000000000000008 ffff880c4fa05de8 ffff880c4fa05d88 000000007f0cdf51
  Call Trace:
   <NMI>  [<ffffffff817193de>] dump_stack+0x63/0x81
   [<ffffffff81714072>] panic+0xd0/0x20e
   [<ffffffffa00b390d>] hpwdt_pretimeout+0xdd/0xe0 [hpwdt]
   [<ffffffff81021fc9>] ? sched_clock+0x9/0x10
   [<ffffffff8101c101>] nmi_handle+0x91/0x170
   [<ffffffff8101c10c>] ? nmi_handle+0x9c/0x170
   [<ffffffff8101c5fe>] io_check_error+0x1e/0xa0
   [<ffffffff8101c719>] default_do_nmi+0x99/0x140
   [<ffffffff8101c8b4>] do_nmi+0xf4/0x170
   [<ffffffff817232c5>] end_repeat_nmi+0x1a/0x1e
   [<ffffffff815d724b>] ? pci_conf1_read+0xeb/0x120
   [<ffffffff815d724b>] ? pci_conf1_read+0xeb/0x120
   [<ffffffff815d724b>] ? pci_conf1_read+0xeb/0x120
   <<EOE>>  [<ffffffff815db4b3>] raw_pci_read+0x23/0x40
   [<ffffffff815db4fc>] pci_read+0x2c/0x30
   [<ffffffff8136f612>] pci_user_read_config_word+0x72/0x110
   [<ffffffff8136f746>] pci_vpd_pci22_wait+0x96/0x130
   [<ffffffff8136ff9b>] pci_vpd_pci22_read+0xdb/0x1a0
   [<ffffffff8136ea30>] pci_read_vpd+0x20/0x30
   [<ffffffff8137d590>] read_vpd_attr+0x30/0x40
   [<ffffffff8128e037>] sysfs_kf_bin_read+0x47/0x70
   [<ffffffff8128d24e>] kernfs_fop_read+0xae/0x180
   [<ffffffff8120dd97>] __vfs_read+0x37/0x100
   [<ffffffff812ba7e4>] ? security_file_permission+0x84/0xa0
   [<ffffffff8120e366>] ? rw_verify_area+0x56/0xe0
   [<ffffffff8120e476>] vfs_read+0x86/0x140
   [<ffffffff8120f3f5>] SyS_read+0x55/0xd0
   [<ffffffff81720f2e>] system_call_fastpath+0x12/0x71
  Shutting down cpus with NMI
  Kernel Offset: disabled
  drm_kms_helper: panic occurred, switching back to text console

So blacklist the access to its VPD.
Signed-off-by: NEthan Zhao <ethan.zhao@oracle.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# v4.6+

0d5370d1

PCI: exynos: Initialize elbi_base even when using PHY framework · 544714d8

由 Jaehoon Chung 提交于 3月 07, 2017

Even when using the PHY framework, we need the elbi_base.  Before this
patch, we didn't initialize elbi_base, which caused NULL pointer
dereferences later.

Fixes: e7cd7ef5 ("PCI: exynos: Support the PHY generic framework")
Signed-off-by: NJaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

544714d8

[media] v4l: vsp1: Adapt vsp1_du_setup_lif() interface to use a structure · 8c71fff4

由 Kieran Bingham 提交于 3月 03, 2017

The interface to configure the LIF in the VSP1 requires adapting the
function prototype for any changes. This makes extending the interface
difficult.

Change the function prototype to pass a structure which can be easily
extended.

This changes the means of disabling the pipeline, by now passing a NULL
configuration rather than passing either a 0 width or height.

[Fixed kerneldoc, made vsp1_du_setup_lif() cfg argument const]
Signed-off-by: NKieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: NLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: NDave Airlie <airlied@redhat.com>
Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>

8c71fff4

07 3月, 2017 1 次提交

drivers/char/nwbutton: Fix build breakage caused by include file reshuffling · bb35e451

由 Guenter Roeck 提交于 3月 05, 2017

Fix:

  drivers/char/nwbutton.c: In function 'button_sequence_finished':
  drivers/char/nwbutton.c:134:3: error: implicit declaration of function 'kill_cad_pid'

The declaration has been moved from one include file to another.
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: c3edc401 ("sched/headers: Move task_struct::signal and ...")
Link: http://lkml.kernel.org/r/1488762811-9022-1-git-send-email-linux@roeck-us.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

bb35e451

06 3月, 2017 5 次提交

[media] dw2102: don't do DMA on stack · 606142af

由 Jonathan McDowell 提交于 2月 15, 2017

On Kernel 4.9, WARNINGs about doing DMA on stack are hit at
the dw2102 driver: one in su3000_power_ctrl() and the other in tt_s2_4600_frontend_attach().

Both were due to the use of buffers on the stack as parameters to
dvb_usb_generic_rw() and the resulting attempt to do DMA with them.

The device was non-functional as a result.

So, switch this driver over to use a buffer within the device state
structure, as has been done with other DVB-USB drivers.

Tested with TechnoTrend TT-connect S2-4600.

[mchehab@osg.samsung.com: fixed a warning at su3000_i2c_transfer() that
 state var were dereferenced before check 'd']
Signed-off-by: NJonathan McDowell <noodles@earth.li>
Cc: <stable@vger.kernel.org>
Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>

606142af

cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy · a240c4aa

由 Rafael J. Wysocki 提交于 3月 02, 2017

If the current P-state selection algorithm is set to "performance"
in intel_pstate_set_policy(), the limits may be initialized from
scratch, but only if no_turbo is not set and the maximum frequency
allowed for the given CPU (i.e. the policy object representing it)
is at least equal to the max frequency supported by the CPU.  In all
of the other cases, the limits will not be updated.

For example, the following can happen:

 # cat intel_pstate/status
 active
 # echo performance > cpufreq/policy0/scaling_governor
 # cat intel_pstate/min_perf_pct
 100
 # echo 94 > intel_pstate/min_perf_pct
 # cat intel_pstate/min_perf_pct
 100
 # cat cpufreq/policy0/scaling_max_freq
 3100000
 echo 3000000 > cpufreq/policy0/scaling_max_freq
 # cat intel_pstate/min_perf_pct
 94
 # echo 95 > intel_pstate/min_perf_pct
 # cat intel_pstate/min_perf_pct
 95

That is confusing for two reasons.  First, the initial attempt to
change min_perf_pct to 94 seems to have no effect, even though
setting the global limits should always work.  Second, after
changing scaling_max_freq for policy0 the global min_perf_pct
attribute shows 94, even though it should have not been affected
by that operation in principle.

Moreover, the final attempt to change min_perf_pct to 95 worked
as expected, because scaling_max_freq for the only policy with
scaling_governor equal to "performance" was different from the
maximum at that time.

To make all that confusion go away, modify intel_pstate_set_policy()
so that it doesn't reinitialize the limits at all.

At the same time, change intel_pstate_set_performance_limits() to
set min_sysfs_pct to 100 in the "performance" limits set so that
switching the P-state selection algorithm to "performance" causes
intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value
min_sysfs_pct in the "performance" limits is set to later).

That requires per-CPU limits to be initialized explicitly rather
than by copying the global limits to avoid setting min_sysfs_pct
in the per-CPU limits to 100.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

a240c4aa

cpufreq: intel_pstate: Fix intel_pstate_verify_policy() · d74b1992

由 Rafael J. Wysocki 提交于 3月 01, 2017

The code added to intel_pstate_verify_policy() by commit 1443ebba
(cpufreq: intel_pstate: Fix sysfs limits enforcement for performance
policy) should use perf_limits instead of limits, because otherwise
setting global limits via sysfs may affect policies inconsistently.

For example, in the sequence of shell commands below, the
scaling_min_freq attribute for policy1 and policy2 should be
affected in the same way, because scaling_governor is set in
the same way for both of them:

 # cat cpufreq/policy1/scaling_governor
 powersave
 # cat cpufreq/policy2/scaling_governor
 powersave
 # echo performance > cpufreq/policy0/scaling_governor
 # echo 94 > intel_pstate/min_perf_pct
 # cat cpufreq/policy0/scaling_min_freq
 2914000
 # cat cpufreq/policy1/scaling_min_freq
 2914000
 # cat cpufreq/policy2/scaling_min_freq
 800000

The are affected differently, because intel_pstate_verify_policy()
is invoked with limits set to &performance_limits (left behind by
policy0) for policy1 and with limits set to &powersave_limits (left
behind by policy1) for policy2.  Since perf_limits is set to the
set of limits matching the policy being updated, using it instead
of limits fixes the inconsistency.

Fixes: 1443ebba (cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy)
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

d74b1992

cpufreq: intel_pstate: Fix global settings in active mode · cd59b4be

由 Rafael J. Wysocki 提交于 3月 01, 2017

Commit 111b8b3f (cpufreq: intel_pstate: Always keep all
limits settings in sync) changed intel_pstate to invoke
cpufreq_update_policy() for every registered CPU on global sysfs
attributes updates, but that led to undesirable effects in the
active mode if the "performance" P-state selection algorithm is
configufred for one CPU and the "powersave" one is chosen for
all of the other CPUs.

Namely, in that case, the following is possible:

 # cd /sys/devices/system/cpu/
 # cat intel_pstate/max_perf_pct
 100
 # cat intel_pstate/min_perf_pct
 26
 # echo performance > cpufreq/policy0/scaling_governor
 # cat intel_pstate/max_perf_pct
 100
 # cat intel_pstate/min_perf_pct
 100
 # echo 94 > intel_pstate/min_perf_pct
 # cat intel_pstate/min_perf_pct
 26

The reason why this happens is because intel_pstate attempts to
maintain two sets of global limits in the active mode, one for
the "performance" P-state selection algorithm and one for the
"powersave"  P-state selection algorithm, but the P-state selection
algorithms are set per policy, so the global limits cannot reflect
all of them at the same time if they are different for different
policies.

In the particular situation above, the attempt to change
min_perf_pct to 94 caused cpufreq_update_policy() to be run
for a CPU with the "powersave"  P-state selection algorithm
and intel_pstate_set_policy() called by it silently switched the
global limits to the "powersave" set which finally was reflected
by the sysfs interface.

To prevent that from happening, modify intel_pstate_update_policies()
to always switch back to the set of limits that was used right before
it has been invoked.

Fixes: 111b8b3f (cpufreq: intel_pstate: Always keep all limits settings in sync)
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

cd59b4be

cpufreq: Add the "cpufreq.off=1" cmdline option · d82f2692

由 Len Brown 提交于 2月 28, 2017

Add the "cpufreq.off=1" cmdline option.

At boot-time, this allows a user to request CONFIG_CPU_FREQ=n
behavior from a kernel built with CONFIG_CPU_FREQ=y.

This is analogous to the existing "cpuidle.off=1" option
and CONFIG_CPU_IDLE=y

This capability is valuable when we need to debug end-user
issues in the BIOS or in Linux.  It is also convenient
for enabling comparisons, which may otherwise require a new kernel,
or help from BIOS SETUP, which may be buggy or unavailable.
Signed-off-by: NLen Brown <len.brown@intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

d82f2692

04 3月, 2017 11 次提交

cpufreq: intel_pstate: Avoid triggering cpu_frequency tracepoint unnecessarily · 64078299

由 Rafael J. Wysocki 提交于 3月 03, 2017

In the passive mode the cpu_frequency trace event is already
triggered by the cpufreq core or by scaling governors, so
intel_pstate should not trigger it once again for the same
P-state updates.

In addition to that, the frequency returned by
intel_cpufreq_fast_switch() and passed via freqs.new from
intel_cpufreq_target() to cpufreq_freq_transition_end() should
reflect the P-state actually set, so make that happen.

Fixes: 001c76f0 (cpufreq: intel_pstate: Generic governors support)
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

64078299

cpufreq: intel_pstate: Fix intel_cpufreq_verify_policy() · 7f17326f

由 Rafael J. Wysocki 提交于 3月 02, 2017

The intel_pstate_update_perf_limits() called from
intel_cpufreq_verify_policy() may cause global P-state limits
to change which is generally confusing and unnecessary.

In the passive mode the global limits are only applied to the
frequency selected by the scaling governor (they are not taken
into account by governors when making decisions anyway), so making
them follow the per-policy limits serves no purpose and may go
against user expectations (as it generally causes the global
attributes in sysfs to change even though they have not been
written to in some cases).

Fix that by dropping the intel_pstate_update_perf_limits()
invocation from intel_cpufreq_verify_policy() (which also
reduces the code size by a few lines).

This change does not affect the per-CPU limits case, because those
limits allow any P-state to be set by default in the passive mode
and it removes the only piece of code updating them in that mode,
so the per-policy settings will be the only ones taken into account
in that case as expected.

Fixes: 001c76f0 (cpufreq: intel_pstate: Generic governors support)
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

7f17326f

cpufreq: intel_pstate: Do not use performance_limits in passive mode · 2bc756e7

由 Rafael J. Wysocki 提交于 3月 02, 2017

Using performance_limits in the passive mode doesn't make
sense, because in that mode the global limits are applied to the
frequency selected by the scaling governor.

The maximum and minimum P-state limits in performance_limits are both
set to 100 percent which will put all CPUs into the turbo range
regardless of what governor is used and what frequencies are
selected by it (that is particularly undesirable on CPUs with the
generic powersave governor attached).

For this reason, make intel_pstate_register_driver() always point
limits to powersave_limits in the passive mode.

Fixes: 001c76f0 (cpufreq: intel_pstate: Generic governors support)
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

2bc756e7

sfc: fix IPID endianness in TSOv2 · 6d43131c

由 Edward Cree 提交于 3月 03, 2017

The value we read from the header is in network byte order, whereas
 EFX_POPULATE_QWORD_* takes values in host byte order (which it then
 converts to little-endian, as MCDI is little-endian).

Fixes: e9117e50 ("sfc: Firmware-Assisted TSO version 2")
Signed-off-by: NEdward Cree <ecree@solarflare.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6d43131c

sfc: avoid max() in array size · d0346b03

由 Edward Cree 提交于 3月 03, 2017

It confuses sparse, which thinks the size isn't constant.  Let's achieve
 the same thing with a BUILD_BUG_ON, since we know which one should be
 bigger and don't expect them ever to change.
Signed-off-by: NEdward Cree <ecree@solarflare.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d0346b03

nfp: correct DMA direction in XDP DMA sync · d58cebb7

由 Jakub Kicinski 提交于 3月 02, 2017

dma_sync_single_for_*() takes the direction in which the buffer
was mapped, not the direction of the sync.  We should sync XDP
buffers bidirectionally.

Fixes: ecd63a02 ("nfp: add XDP support in the driver")
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d58cebb7

nfp: don't tell FW about the reserved buffer space · 9383b337

由 Jakub Kicinski 提交于 3月 02, 2017

Since commit c0f031bc ("nfp_net: use alloc_frag() and build_skb()")
we are allocating buffers which have to hold both the data and skb to
be created in place by build_skb().

FW should only be told about the buffer space it can DMA to, that
is without the build_skb() headroom and tailroom.  Note: firmware
applications should validate the buffers against both MTU and
free list buffer size so oversized packets would not pass through
the NIC anyway.

Fixes: c0f031bc ("nfp: use alloc_frag() and build_skb()")
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9383b337

net: ethernet: bgmac: mac address change bug · fa42245d

由 Hari Vyas 提交于 3月 02, 2017

ndo_set_mac_address() passes struct sockaddr * as 2nd parameter to
bgmac_set_mac_address() but code assumed u8 *.  This caused two bytes
chopping and the wrong mac address was configured.
Signed-off-by: NHari Vyas <hariv@broadcom.com>
Signed-off-by: NJon Mason <jon.mason@broadcom.com>
Fixes: 4e209001 ("bgmac: write mac address to hardware in ndo_set_mac_address")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fa42245d

net: ethernet: bgmac: init sequence bug · 16206524

由 Jon Mason 提交于 3月 02, 2017

Fix a bug in the 'bgmac' driver init sequence that blind writes for init
sequence where it should preserve most bits other than the ones it is
deliberately manipulating.

The code now checks to see if the adapter needs to be brought out of
reset (where as before it was doing an IDM write to bring it out of
reset regardless of whether it was in reset or not).  Also, removed
unnecessary usleeps (as there is already a read present to flush the
IDM writes).
Signed-off-by: NZac Schroff <zschroff@broadcom.com>
Signed-off-by: NJon Mason <jon.mason@broadcom.com>
Fixes: f6a95a24 ("net: ethernet: bgmac: Add platform device support")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

16206524

xen-netback: don't vfree() queues under spinlock · a254d8f9

由 Paul Durrant 提交于 3月 02, 2017

This leads to a BUG of the following form:

[  174.512861] switch: port 2(vif3.0) entered disabled state
[  174.522735] BUG: sleeping function called from invalid context at
/home/build/linux-linus/mm/vmalloc.c:1441
[  174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch
[  174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: G        W
4.10.0upstream-11073-g4977ab6e-dirty #1
[  174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0
03/14/2011
[  174.525517] Call Trace:
[  174.526217]  show_stack+0x23/0x60
[  174.526899]  dump_stack+0x5b/0x88
[  174.527562]  ___might_sleep+0xde/0x130
[  174.528208]  __might_sleep+0x35/0xa0
[  174.528840]  ? _raw_spin_unlock_irqrestore+0x13/0x20
[  174.529463]  ? __wake_up+0x40/0x50
[  174.530089]  remove_vm_area+0x20/0x90
[  174.530724]  __vunmap+0x1d/0xc0
[  174.531346]  ? delete_object_full+0x13/0x20
[  174.531973]  vfree+0x40/0x80
[  174.532594]  set_backend_state+0x18a/0xa90
[  174.533221]  ? dwc_scan_descriptors+0x24d/0x430
[  174.533850]  ? kfree+0x5b/0xc0
[  174.534476]  ? xenbus_read+0x3d/0x50
[  174.535101]  ? xenbus_read+0x3d/0x50
[  174.535718]  ? xenbus_gather+0x31/0x90
[  174.536332]  ? ___might_sleep+0xf6/0x130
[  174.536945]  frontend_changed+0x6b/0xd0
[  174.537565]  xenbus_otherend_changed+0x7d/0x80
[  174.538185]  frontend_changed+0x12/0x20
[  174.538803]  xenwatch_thread+0x74/0x110
[  174.539417]  ? woken_wake_function+0x20/0x20
[  174.540049]  kthread+0xe5/0x120
[  174.540663]  ? xenbus_printf+0x50/0x50
[  174.541278]  ? __kthread_init_worker+0x40/0x40
[  174.541898]  ret_from_fork+0x21/0x2c
[  174.548635] switch: port 2(vif3.0) entered disabled state

This patch defers the vfree() until after the spinlock is released.
Reported-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NPaul Durrant <paul.durrant@citrix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a254d8f9

xen-netback: keep a local pointer for vif in backend_disconnect() · d67ce7da

由 Paul Durrant 提交于 3月 02, 2017

This patch replaces use of 'be->vif' with 'vif' and hence generally
makes the function look tidier. No semantic change.
Signed-off-by: NPaul Durrant <paul.durrant@citrix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d67ce7da

03 3月, 2017 17 次提交

can: flexcan: fix typo in comment · 66ddb821

由 Marc Kleine-Budde 提交于 3月 02, 2017

This patch fixes the typo "Disble" -> "Disable".
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

66ddb821

can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer · 7c426313

由 Marc Kleine-Budde 提交于 3月 02, 2017

The priv->cmd_msg_buffer is allocated in the probe function, but never
kfree()ed. This patch converts the kzalloc() to resource-managed
kzalloc.

Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

7c426313

can: gs_usb: fix coding style · 540a27ae

由 Ethan Zonca 提交于 2月 24, 2017

This patch fixes five minor style issues, spaces are between bitwise OR
operators.
Signed-off-by: NEthan Zonca <e@ethanzonca.com>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

540a27ae

can: gs_usb: Don't use stack memory for USB transfers · c919a306

由 Ethan Zonca 提交于 2月 24, 2017

Fixes: 05ca5270 can: gs_usb: add ethtool set_phys_id callback to locate physical device

The gs_usb driver is performing USB transfers using buffers allocated on
the stack. This causes the driver to not function with vmapped stacks.
Instead, allocate memory for the transfer buffers.
Signed-off-by: NEthan Zonca <e@ethanzonca.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v4.8
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

c919a306

[media] rc: protocol is not set on register for raw IR devices · 5df62771

由 Sean Young 提交于 2月 23, 2017

ir_raw_event_register() sets up change_protocol(), and without that set,
rc_setup_rx_device() does not set the protocol for the device on register.

The standard udev rules run ir-keytable, which writes to the protocols
file again, which hides this problem.

Fixes: 7ff2c2bc ("[media] rc-main: split setup and unregister functions")
Signed-off-by: NSean Young <sean@mess.org>
Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>

5df62771

[media] rc: raw decoder for keymap protocol is not loaded on register · 41380868

由 Sean Young 提交于 2月 22, 2017

When the protocol is set via the sysfs protocols attribute, the
decoder is loaded. However, when it is not when a device is first
plugged in or registered.

Fixes: acc1c3c6 ("[media] media: rc: load decoder modules on-demand")
Signed-off-by: NSean Young <sean@mess.org>
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>

41380868

[media] rc: nuvoton: fix deadlock in nvt_write_wakeup_codes · c1305a40

由 Heiner Kallweit 提交于 2月 12, 2017

nvt_write_wakeup_codes acquires the same lock as the ISR but doesn't
disable interrupts on the local CPU. This caused the following
deadlock. Fix this by using spin_lock_irqsave.

[  432.362008] ================================
[  432.362074] WARNING: inconsistent lock state
[  432.362144] 4.10.0-rc7-next-20170210 #1 Not tainted
[  432.362219] --------------------------------
[  432.362286] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[  432.362379] swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[  432.362457]  (&(&nvt->lock)->rlock){?.+...}, at: [<ffffffffa016b17d>] nvt_cir_isr+0x2d/0x520 [nuvoton_cir]
[  432.362611] {HARDIRQ-ON-W} state was registered at:
[  432.362686]
[  432.362698] [<ffffffff810adb7c>] __lock_acquire+0x5dc/0x1260
[  432.362812]
[  432.362817] [<ffffffff810aec29>] lock_acquire+0xe9/0x1d0
[  432.362927]
[  432.362934] [<ffffffff81609f63>] _raw_spin_lock+0x33/0x50
[  432.363045]
[  432.363051] [<ffffffffa016b822>] nvt_write_wakeup_codes.isra.12+0x22/0xe0 [nuvoton_cir]
[  432.363193]
[  432.363199] [<ffffffffa016b9bf>] wakeup_data_store+0xdf/0xf0 [nuvoton_cir]
[  432.363327]
[  432.363333] [<ffffffff81484223>] dev_attr_store+0x13/0x20
[  432.363441]
[  432.363449] [<ffffffff81232450>] sysfs_kf_write+0x40/0x50
[  432.363558]
[  432.363564] [<ffffffff81231640>] kernfs_fop_write+0x150/0x1e0
[  432.363676]
[  432.363685] [<ffffffff811b36a3>] __vfs_write+0x23/0x120
[  432.363791]
[  432.363798] [<ffffffff811b4d53>] vfs_write+0xc3/0x1e0
[  432.363902]
[  432.363909] [<ffffffff811b6124>] SyS_write+0x44/0xa0
[  432.364012]
[  432.364021] [<ffffffff81002c47>] do_syscall_64+0x57/0x140
[  432.364129]
[  432.364135] [<ffffffff8160a9e4>] return_from_SYSCALL_64+0x0/0x7a
[  432.364252] irq event stamp: 415118
[  432.364313] hardirqs last  enabled at (415115): [<ffffffff814fd2eb>] cpuidle_enter_state+0x11b/0x370
[  432.364445] hardirqs last disabled at (415116): [<ffffffff8160b2cb>] common_interrupt+0x8b/0x90
[  432.364573] softirqs last  enabled at (415118): [<ffffffff8106157c>] _local_bh_enable+0x1c/0x50
[  432.364699] softirqs last disabled at (415117): [<ffffffff810629a3>] irq_enter+0x43/0x60
[  432.364814]
               other info that might help us debug this:
[  432.364909]  Possible unsafe locking scenario:

[  432.367821]        CPU0
[  432.370645]        ----
[  432.373432]   lock(&(&nvt->lock)->rlock);
[  432.376228]   <Interrupt>
[  432.378982]     lock(&(&nvt->lock)->rlock);
[  432.381757]
                *** DEADLOCK ***

[  432.389888] no locks held by swapper/0/0.
[  432.392574]
               stack backtrace:
[  432.397774] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc7-next-20170210 #1
[  432.400375] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[  432.403023] Call Trace:
[  432.405636]  <IRQ>
[  432.408208]  dump_stack+0x68/0x93
[  432.410775]  print_usage_bug+0x1dd/0x1f0
[  432.413334]  mark_lock+0x559/0x5c0
[  432.415871]  ? print_shortest_lock_dependencies+0x1a0/0x1a0
[  432.418431]  __lock_acquire+0x6b1/0x1260
[  432.420941]  lock_acquire+0xe9/0x1d0
[  432.423396]  ? nvt_cir_isr+0x2d/0x520 [nuvoton_cir]
[  432.425844]  _raw_spin_lock+0x33/0x50
[  432.428252]  ? nvt_cir_isr+0x2d/0x520 [nuvoton_cir]
[  432.430670]  nvt_cir_isr+0x2d/0x520 [nuvoton_cir]
[  432.433085]  __handle_irq_event_percpu+0x43/0x330
[  432.435493]  handle_irq_event_percpu+0x1e/0x50
[  432.437884]  handle_irq_event+0x34/0x60
[  432.440236]  handle_edge_irq+0x6a/0x150
[  432.442561]  handle_irq+0x15/0x20
[  432.444854]  do_IRQ+0x57/0x110
[  432.447115]  common_interrupt+0x90/0x90
[  432.449380] RIP: 0010:cpuidle_enter_state+0x120/0x370
[  432.451653] RSP: 0018:ffffffff81c03dd8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffcc
[  432.453994] RAX: ffffffff81c14500 RBX: 0000000000000008 RCX: 00000064aac6f2d2
[  432.456349] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81c14500
[  432.458704] RBP: ffffffff81c03e18 R08: cccccccccccccccd R09: 0000000000000018
[  432.461072] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880100a21260
[  432.463450] R13: ffffffff81c7e6f8 R14: 0000000000000008 R15: ffffffff81c7e6e0
[  432.465819]  </IRQ>
[  432.468104]  ? cpuidle_enter_state+0x11b/0x370
[  432.470413]  cpuidle_enter+0x12/0x20
[  432.472698]  call_cpuidle+0x1e/0x40
[  432.474967]  do_idle+0xe3/0x1c0
[  432.477172]  cpu_startup_entry+0x18/0x20
[  432.479376]  rest_init+0x130/0x140
[  432.481565]  start_kernel+0x3cc/0x3d9
[  432.483750]  x86_64_start_reservations+0x2a/0x2c
[  432.485980]  x86_64_start_kernel+0x178/0x18b
[  432.488222]  start_cpu+0x14/0x14
[  432.490453]  ? start_cpu+0x14/0x14

Fixes: 97c12974 "[media] rc: nuvoton-cir: Add support wakeup via sysfs filter callback"
Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: NSean Young <sean@mess.org>
Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>

c1305a40

[media] lirc: fix dead lock between open and wakeup_filter · db5b15b7

由 Sean Young 提交于 2月 13, 2017

The locking in lirc needs improvement, but for now just fix this potential
deadlock.

======================================================
[ INFO: possible circular locking dependency detected ]
4.10.0-rc1+ #1 Not tainted
-------------------------------------------------------
bash/2502 is trying to acquire lock:
 (ir_raw_handler_lock){+.+.+.}, at: [<ffffffffc06f6a5e>] ir_raw_encode_scancode+0x3e/0xb0 [rc_core]

               but task is already holding lock:
 (&dev->lock){+.+.+.}, at: [<ffffffffc06f511f>] store_filter+0x9f/0x240 [rc_core]

               which lock already depends on the new lock.

               the existing dependency chain (in reverse order) is:

               -> #2 (&dev->lock){+.+.+.}:

[<ffffffffa110adad>] lock_acquire+0xfd/0x200
[<ffffffffa1921327>] mutex_lock_nested+0x77/0x6d0
[<ffffffffc06f436a>] rc_open+0x2a/0x80 [rc_core]
[<ffffffffc07114ca>] lirc_dev_fop_open+0xda/0x1e0 [lirc_dev]
[<ffffffffa12975e0>] chrdev_open+0xb0/0x210
[<ffffffffa128eb5a>] do_dentry_open+0x20a/0x2f0
[<ffffffffa128ffcc>] vfs_open+0x4c/0x80
[<ffffffffa12a35ec>] path_openat+0x5bc/0xc00
[<ffffffffa12a5271>] do_filp_open+0x91/0x100
[<ffffffffa12903f0>] do_sys_open+0x130/0x220
[<ffffffffa12904fe>] SyS_open+0x1e/0x20
[<ffffffffa19278c1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
               -> #1 (lirc_dev_lock){+.+.+.}:
[<ffffffffa110adad>] lock_acquire+0xfd/0x200
[<ffffffffa1921327>] mutex_lock_nested+0x77/0x6d0
[<ffffffffc0711f47>] lirc_register_driver+0x67/0x59b [lirc_dev]
[<ffffffffc06db7f4>] ir_lirc_register+0x1f4/0x260 [ir_lirc_codec]
[<ffffffffc06f6cac>] ir_raw_handler_register+0x7c/0xb0 [rc_core]
[<ffffffffc0398010>] 0xffffffffc0398010
[<ffffffffa1002192>] do_one_initcall+0x52/0x1b0
[<ffffffffa11ef5c8>] do_init_module+0x5f/0x1fa
[<ffffffffa11566b5>] load_module+0x2675/0x2b00
[<ffffffffa1156dcf>] SYSC_finit_module+0xdf/0x110
[<ffffffffa1156e1e>] SyS_finit_module+0xe/0x10
[<ffffffffa1003f5c>] do_syscall_64+0x6c/0x1f0
[<ffffffffa1927989>] return_from_SYSCALL_64+0x0/0x7a
               -> #0 (ir_raw_handler_lock){+.+.+.}:
[<ffffffffa110a7b7>] __lock_acquire+0x10f7/0x1290
[<ffffffffa110adad>] lock_acquire+0xfd/0x200
[<ffffffffa1921327>] mutex_lock_nested+0x77/0x6d0
[<ffffffffc06f6a5e>] ir_raw_encode_scancode+0x3e/0xb0 [rc_core]
[<ffffffffc0b0f492>] loop_set_wakeup_filter+0x62/0xbd [rc_loopback]
[<ffffffffc06f522a>] store_filter+0x1aa/0x240 [rc_core]
[<ffffffffa15e46f8>] dev_attr_store+0x18/0x30
[<ffffffffa13318e5>] sysfs_kf_write+0x45/0x60
[<ffffffffa1330b55>] kernfs_fop_write+0x155/0x1e0
[<ffffffffa1290797>] __vfs_write+0x37/0x160
[<ffffffffa12921f8>] vfs_write+0xc8/0x1e0
[<ffffffffa12936e8>] SyS_write+0x58/0xc0
[<ffffffffa19278c1>] entry_SYSCALL_64_fastpath+0x1f/0xc2

               other info that might help us debug this:

Chain exists of:
                 ir_raw_handler_lock --> lirc_dev_lock --> &dev->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev->lock);
                               lock(lirc_dev_lock);
                               lock(&dev->lock);
  lock(ir_raw_handler_lock);

                *** DEADLOCK ***

4 locks held by bash/2502:
 #0:  (sb_writers#4){.+.+.+}, at: [<ffffffffa12922c5>] vfs_write+0x195/0x1e0
 #1:  (&of->mutex){+.+.+.}, at: [<ffffffffa1330b1f>] kernfs_fop_write+0x11f/0x1e0
 #2:  (s_active#215){.+.+.+}, at: [<ffffffffa1330b28>] kernfs_fop_write+0x128/0x1e0
 #3:  (&dev->lock){+.+.+.}, at: [<ffffffffc06f511f>] store_filter+0x9f/0x240 [rc_core]

               stack backtrace:
CPU: 3 PID: 2502 Comm: bash Not tainted 4.10.0-rc1+ #1
Hardware name:                  /DG45ID, BIOS IDG4510H.86A.0135.2011.0225.1100 02/25/2011
Call Trace:
 dump_stack+0x86/0xc3
 print_circular_bug+0x1be/0x210
 __lock_acquire+0x10f7/0x1290
 lock_acquire+0xfd/0x200
 ? ir_raw_encode_scancode+0x3e/0xb0 [rc_core]
 ? ir_raw_encode_scancode+0x3e/0xb0 [rc_core]
 mutex_lock_nested+0x77/0x6d0
 ? ir_raw_encode_scancode+0x3e/0xb0 [rc_core]
 ? loop_set_wakeup_filter+0x44/0xbd [rc_loopback]
 ir_raw_encode_scancode+0x3e/0xb0 [rc_core]
 loop_set_wakeup_filter+0x62/0xbd [rc_loopback]
 ? loop_set_tx_duty_cycle+0x70/0x70 [rc_loopback]
 store_filter+0x1aa/0x240 [rc_core]
 dev_attr_store+0x18/0x30
 sysfs_kf_write+0x45/0x60
 kernfs_fop_write+0x155/0x1e0
 __vfs_write+0x37/0x160
 ? rcu_read_lock_sched_held+0x4a/0x80
 ? rcu_sync_lockdep_assert+0x2f/0x60
 ? __sb_start_write+0x10c/0x220
 ? vfs_write+0x195/0x1e0
 ? security_file_permission+0x3b/0xc0
 vfs_write+0xc8/0x1e0
 SyS_write+0x58/0xc0
 entry_SYSCALL_64_fastpath+0x1f/0xc2
Signed-off-by: NSean Young <sean@mess.org>
Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>

db5b15b7

[media] serial_ir: ensure we're ready to receive interrupts · 0265634e

由 Sean Young 提交于 2月 25, 2017

When the interrupt requested with devm_request_irq(), serial_ir.rcdev
is still null so will cause null deference if the irq handler is called
early on.

Also ensure that timeout_timer is setup.

Link: http://lkml.kernel.org/r/CA+55aFxsh2uF8gi5sN_guY3Z+tiLv7LpJYKBw+y8vqLzp+TsnQ@mail.gmail.com

[mchehab@s-opensource.com: moved serial_ir_probe() back to its original place]

Cc: <stable@vger.kernel.org> # 4.10
Signed-off-by: NSean Young <sean@mess.org>
Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>

0265634e

ixgbe: Limit use of 2K buffers on architectures with 256B or larger cache lines · c74042f3

由 Alexander Duyck 提交于 2月 03, 2017

On architectures that have a cache line size larger than 64 Bytes we start
running into issues where the amount of headroom for the frame starts
shrinking.

The size of skb_shared_info on a system with a 64B L1 cache line size is
320. This increases to 384 with a 128B cache line, and 512 with a 256B
cache line.

In addition the NET_SKB_PAD value increases as well consistent with the
cache line size. As a result when we get to a 256B cache line as seen on
the s390 we end up 768 bytes used by padding and shared info leaving us
with only 1280 bytes to use for data storage. On architectures such as
this we should default to using 3K Rx buffers out of a 8K page instead of
trying to do 1.5K buffers out of a 4K page.

To take all of this into account I have added one small check so that we
compare the max_frame to the amount of actual data we can store. This was
already occurring for igb, but I had overlooked it for ixgbe as it doesn't
have strict limits for 82599 once we enable jumbo frames. By adding this
check we will automatically enable 3K Rx buffers as soon as the maximum
frame size we can handle drops below the standard Ethernet MTU.

I also went through and fixed one small typo that I found where I had left
an IGB in a variable name due to a copy/paste error.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

c74042f3

statx: Add a system call to make enhanced file info available · a528d35e

由 David Howells 提交于 1月 31, 2017

Add a system call to make extended file information available, including
file creation and some attribute flags where available through the
underlying filesystem.

The getattr inode operation is altered to take two additional arguments: a
u32 request_mask and an unsigned int flags that indicate the
synchronisation mode.  This change is propagated to the vfs_getattr*()
function.

Functions like vfs_stat() are now inline wrappers around new functions
vfs_statx() and vfs_statx_fd() to reduce stack usage.

========
OVERVIEW
========

The idea was initially proposed as a set of xattrs that could be retrieved
with getxattr(), but the general preference proved to be for a new syscall
with an extended stat structure.

A number of requests were gathered for features to be included.  The
following have been included:

 (1) Make the fields a consistent size on all arches and make them large.

 (2) Spare space, request flags and information flags are provided for
     future expansion.

 (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an
     __s64).

 (4) Creation time: The SMB protocol carries the creation time, which could
     be exported by Samba, which will in turn help CIFS make use of
     FS-Cache as that can be used for coherency data (stx_btime).

     This is also specified in NFSv4 as a recommended attribute and could
     be exported by NFSD [Steve French].

 (5) Lightweight stat: Ask for just those details of interest, and allow a
     netfs (such as NFS) to approximate anything not of interest, possibly
     without going to the server [Trond Myklebust, Ulrich Drepper, Andreas
     Dilger] (AT_STATX_DONT_SYNC).

 (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks
     its cached attributes are up to date [Trond Myklebust]
     (AT_STATX_FORCE_SYNC).

And the following have been left out for future extension:

 (7) Data version number: Could be used by userspace NFS servers [Aneesh
     Kumar].

     Can also be used to modify fill_post_wcc() in NFSD which retrieves
     i_version directly, but has just called vfs_getattr().  It could get
     it from the kstat struct if it used vfs_xgetattr() instead.

     (There's disagreement on the exact semantics of a single field, since
     not all filesystems do this the same way).

 (8) BSD stat compatibility: Including more fields from the BSD stat such
     as creation time (st_btime) and inode generation number (st_gen)
     [Jeremy Allison, Bernd Schubert].

 (9) Inode generation number: Useful for FUSE and userspace NFS servers
     [Bernd Schubert].

     (This was asked for but later deemed unnecessary with the
     open-by-handle capability available and caused disagreement as to
     whether it's a security hole or not).

(10) Extra coherency data may be useful in making backups [Andreas Dilger].

     (No particular data were offered, but things like last backup
     timestamp, the data version number and the DOS archive bit would come
     into this category).

(11) Allow the filesystem to indicate what it can/cannot provide: A
     filesystem can now say it doesn't support a standard stat feature if
     that isn't available, so if, for instance, inode numbers or UIDs don't
     exist or are fabricated locally...

     (This requires a separate system call - I have an fsinfo() call idea
     for this).

(12) Store a 16-byte volume ID in the superblock that can be returned in
     struct xstat [Steve French].

     (Deferred to fsinfo).

(13) Include granularity fields in the time data to indicate the
     granularity of each of the times (NFSv4 time_delta) [Steve French].

     (Deferred to fsinfo).

(14) FS_IOC_GETFLAGS value.  These could be translated to BSD's st_flags.
     Note that the Linux IOC flags are a mess and filesystems such as Ext4
     define flags that aren't in linux/fs.h, so translation in the kernel
     may be a necessity (or, possibly, we provide the filesystem type too).

     (Some attributes are made available in stx_attributes, but the general
     feeling was that the IOC flags were to ext[234]-specific and shouldn't
     be exposed through statx this way).

(15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer,
     Michael Kerrisk].

     (Deferred, probably to fsinfo.  Finding out if there's an ACL or
     seclabal might require extra filesystem operations).

(16) Femtosecond-resolution timestamps [Dave Chinner].

     (A __reserved field has been left in the statx_timestamp struct for
     this - if there proves to be a need).

(17) A set multiple attributes syscall to go with this.

===============
NEW SYSTEM CALL
===============

The new system call is:

	int ret = statx(int dfd,
			const char *filename,
			unsigned int flags,
			unsigned int mask,
			struct statx *buffer);

The dfd, filename and flags parameters indicate the file to query, in a
similar way to fstatat().  There is no equivalent of lstat() as that can be
emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags.  There is
also no equivalent of fstat() as that can be emulated by passing a NULL
filename to statx() with the fd of interest in dfd.

Whether or not statx() synchronises the attributes with the backing store
can be controlled by OR'ing a value into the flags argument (this typically
only affects network filesystems):

 (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this
     respect.

 (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise
     its attributes with the server - which might require data writeback to
     occur to get the timestamps correct.

 (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a
     network filesystem.  The resulting values should be considered
     approximate.

mask is a bitmask indicating the fields in struct statx that are of
interest to the caller.  The user should set this to STATX_BASIC_STATS to
get the basic set returned by stat().  It should be noted that asking for
more information may entail extra I/O operations.

buffer points to the destination for the data.  This must be 256 bytes in
size.

======================
MAIN ATTRIBUTES RECORD
======================

The following structures are defined in which to return the main attribute
set:

	struct statx_timestamp {
		__s64	tv_sec;
		__s32	tv_nsec;
		__s32	__reserved;
	};

	struct statx {
		__u32	stx_mask;
		__u32	stx_blksize;
		__u64	stx_attributes;
		__u32	stx_nlink;
		__u32	stx_uid;
		__u32	stx_gid;
		__u16	stx_mode;
		__u16	__spare0[1];
		__u64	stx_ino;
		__u64	stx_size;
		__u64	stx_blocks;
		__u64	__spare1[1];
		struct statx_timestamp	stx_atime;
		struct statx_timestamp	stx_btime;
		struct statx_timestamp	stx_ctime;
		struct statx_timestamp	stx_mtime;
		__u32	stx_rdev_major;
		__u32	stx_rdev_minor;
		__u32	stx_dev_major;
		__u32	stx_dev_minor;
		__u64	__spare2[14];
	};

The defined bits in request_mask and stx_mask are:

	STATX_TYPE		Want/got stx_mode & S_IFMT
	STATX_MODE		Want/got stx_mode & ~S_IFMT
	STATX_NLINK		Want/got stx_nlink
	STATX_UID		Want/got stx_uid
	STATX_GID		Want/got stx_gid
	STATX_ATIME		Want/got stx_atime{,_ns}
	STATX_MTIME		Want/got stx_mtime{,_ns}
	STATX_CTIME		Want/got stx_ctime{,_ns}
	STATX_INO		Want/got stx_ino
	STATX_SIZE		Want/got stx_size
	STATX_BLOCKS		Want/got stx_blocks
	STATX_BASIC_STATS	[The stuff in the normal stat struct]
	STATX_BTIME		Want/got stx_btime{,_ns}
	STATX_ALL		[All currently available stuff]

stx_btime is the file creation time, stx_mask is a bitmask indicating the
data provided and __spares*[] are where as-yet undefined fields can be
placed.

Time fields are structures with separate seconds and nanoseconds fields
plus a reserved field in case we want to add even finer resolution.  Note
that times will be negative if before 1970; in such a case, the nanosecond
fields will also be negative if not zero.

The bits defined in the stx_attributes field convey information about a
file, how it is accessed, where it is and what it does.  The following
attributes map to FS_*_FL flags and are the same numerical value:

	STATX_ATTR_COMPRESSED		File is compressed by the fs
	STATX_ATTR_IMMUTABLE		File is marked immutable
	STATX_ATTR_APPEND		File is append-only
	STATX_ATTR_NODUMP		File is not to be dumped
	STATX_ATTR_ENCRYPTED		File requires key to decrypt in fs

Within the kernel, the supported flags are listed by:

	KSTAT_ATTR_FS_IOC_FLAGS

[Are any other IOC flags of sufficient general interest to be exposed
through this interface?]

New flags include:

	STATX_ATTR_AUTOMOUNT		Object is an automount trigger

These are for the use of GUI tools that might want to mark files specially,
depending on what they are.

Fields in struct statx come in a number of classes:

 (0) stx_dev_*, stx_blksize.

     These are local system information and are always available.

 (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino,
     stx_size, stx_blocks.

     These will be returned whether the caller asks for them or not.  The
     corresponding bits in stx_mask will be set to indicate whether they
     actually have valid values.

     If the caller didn't ask for them, then they may be approximated.  For
     example, NFS won't waste any time updating them from the server,
     unless as a byproduct of updating something requested.

     If the values don't actually exist for the underlying object (such as
     UID or GID on a DOS file), then the bit won't be set in the stx_mask,
     even if the caller asked for the value.  In such a case, the returned
     value will be a fabrication.

     Note that there are instances where the type might not be valid, for
     instance Windows reparse points.

 (2) stx_rdev_*.

     This will be set only if stx_mode indicates we're looking at a
     blockdev or a chardev, otherwise will be 0.

 (3) stx_btime.

     Similar to (1), except this will be set to 0 if it doesn't exist.

=======
TESTING
=======

The following test program can be used to test the statx system call:

	samples/statx/test-statx.c

Just compile and run, passing it paths to the files you want to examine.
The file is built automatically if CONFIG_SAMPLES is enabled.

Here's some example output.  Firstly, an NFS directory that crosses to
another FSID.  Note that the AUTOMOUNT attribute is set because transiting
this directory will cause d_automount to be invoked by the VFS.

	[root@andromeda ~]# /tmp/test-statx -A /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:26           Inode: 1703937     Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000
	Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------)

Secondly, the result of automounting on that directory.

	[root@andromeda ~]# /tmp/test-statx /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:27           Inode: 2           Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a528d35e

ixgbe: update the rss key on h/w, when ethtool ask for it · d3aa9c9f

由 Paolo Abeni 提交于 12月 15, 2016

Currently ixgbe_set_rxfh() updates the rss_key copy in the driver
memory, but does not push the new value into the h/w. This commit
add a new helper for the latter operation and call it in
ixgbe_set_rxfh(), so that the h/w rss key value can be really
updated via ethtool.
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

d3aa9c9f

sched/headers: Remove the <linux/mm_types.h> dependency from <linux/sched.h> · 77ba809e

由 Ingo Molnar 提交于 2月 04, 2017

Use the freshly introduced, reduced size <linux/mm_types_task.h> header instead.
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

77ba809e

sched/headers: Move task_struct::signal and task_struct::sighand types and... · c3edc401

由 Ingo Molnar 提交于 2月 02, 2017

sched/headers: Move task_struct::signal and task_struct::sighand types and accessors into <linux/sched/signal.h>

task_struct::signal and task_struct::sighand are pointers, which would normally make it
straightforward to not define those types in sched.h.

That is not so, because the types are accompanied by a myriad of APIs (macros and inline
functions) that dereference them.

Split the types and the APIs out of sched.h and move them into a new header, <linux/sched/signal.h>.

With this change sched.h does not know about 'struct signal' and 'struct sighand' anymore,
trying to put accessors into sched.h as a test fails the following way:

  ./include/linux/sched.h: In function ‘test_signal_types’:
  ./include/linux/sched.h:2461:18: error: dereferencing pointer to incomplete type ‘struct signal_struct’
                    ^

This reduces the size and complexity of sched.h significantly.

Update all headers and .c code that relied on getting the signal handling
functionality from <linux/sched.h> to include <linux/sched/signal.h>.

The list of affected files in the preparatory patch was partly generated by
grepping for the APIs, and partly by doing coverage build testing, both
all[yes|mod|def|no]config builds on 64-bit and 32-bit x86, and an array of
cross-architecture builds.

Nevertheless some (trivial) build breakage is still expected related to rare
Kconfig combinations and in-flight patches to various kernel code, but most
of it should be handled by this patch.
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

c3edc401

xen-netback: Use GFP_ATOMIC to allocate hash · 9f674e48

由 Anoob Soman 提交于 3月 02, 2017

Allocation of new_hash, inside xenvif_new_hash(), always happen
in softirq context, so use GFP_ATOMIC instead of GFP_KERNEL for new
hash allocation.
Signed-off-by: NAnoob Soman <anoob.soman@citrix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9f674e48

bonding: use ETH_MAX_MTU as max mtu · 31c05415

由 WANG Cong 提交于 3月 02, 2017

This restores the ability of setting bond device's mtu to 9000.

Fixes: 91572088 ("net: use core MTU range checking in core net infra")
Reported-by: daznis@gmail.com
Reported-by: NBrad Campbell <lists2009@fnarfbargle.com>
Cc: Jarod Wilson <jarod@redhat.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

31c05415

netvsc: fix use-after-free in netvsc_change_mtu() · 152669bd

由 Dexuan Cui 提交于 3月 02, 2017

'nvdev' is freed in rndis_filter_device_remove -> netvsc_device_remove ->
free_netvsc_device, so we mustn't access it, before it's re-created in
rndis_filter_device_add -> netvsc_device_add.
Signed-off-by: NDexuan Cui <decui@microsoft.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: NStephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

152669bd

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功