Commits · 9578f41aeaee5010384f4f8484da1566e2ce4901 · openeuler / Kernel

29 Oct, 2014 1 commit

HID: fix merge from wacom into the HID tree · c241c5ee

Committed by Benjamin Tissoires on Sep 30, 2014

While merging wacom from the input to the hid tree, some
comments have been duplicated. We can also integrate the
test for Synaptics devices in the switch case below, so
it is clear that there will be only one place for such
quirks.

No functional changes are expected in this commit.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

c241c5ee

01 Oct, 2014 2 commits

HID: wacom: implement generic HID handling for pen generic devices · 7704ac93

Committed by Benjamin Tissoires on Sep 23, 2014

ISDv4 and v5 are plain HID devices. We can directly implement a generic
HID parsing/handling and remove the need to manually add those PIDs to
the list of supported devices.

This patch implements the pen support only. The finger part will come in
a later patch.

To be properly notified of an .event() and a .report(), we need to force
hid-core to go through the HID parsing. By default, wacom.ko binds only
hidraw, so the hid parsing is not done by hid-core. When a true HID device
is there, we add the flag HID_CLAIMED_DRIVER to hid->claimed which will
force hid-core to parse the incoming reports.
(Note that this can be easily backported by directly setting the .claimed
flag to HID_CLAIMED_DRIVER even if hid-core does not support
HID_CONNECT_DRIVER)
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Jason Gerecke <killertofu@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

7704ac93

ipv6: remove rt6i_genid · 705f1c86

Committed by Hannes Frederic Sowa on Sep 28, 2014

Eric Dumazet noticed that all no-nonexthop or no-gateway routes which
are already marked DST_HOST (e.g. input routes) will always be
invalidated during sk_dst_check. Thus per-socket dst caching absolutely
had no effect and early demuxing had no effect.

Thus this patch removes rt6i_genid: fn_sernum already gets modified during
add operations, so we only must ensure we mutate fn_sernum during ipv6
address remove operations. This is a fairly costly operation,
but address removal should not happen that often. Also our mtu update
functions do the same and we heard no complaints so far. xfrm policy
changes also cause a call into fib6_flush_trees. Also plug a hole in
rt6_info (no cacheline changes).

I verified via tracing that this change has effect.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

705f1c86

28 Sep, 2014 2 commits

net: make tcp_cleanup_rbuf private · 3f334078

Committed by Dan Williams on Dec 30, 2013

net_dma was the only external user so this can become local to tcp.c
again.

Cc: James Morris <jmorris@namei.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

3f334078

net_dma: simple removal · 7bced397

Committed by Dan Williams on Dec 30, 2013

Per commit "77873803 net_dma: mark broken" net_dma is no longer used
and there is no plan to fix it.

This is the mechanical removal of bits in CONFIG_NET_DMA ifdef guards.
Reverting the remainder of the net_dma induced changes is deferred to
subsequent patches.

Marked for stable due to Roman's report of a memory leak in
dma_pin_iovec_pages():

    https://lkml.org/lkml/2014/9/3/177

Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: David Whipple <whipple@securedatainnovations.ch>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: <stable@vger.kernel.org>
Reported-by: Roman Gushchin <klamm@yandex-team.ru>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

7bced397

27 Sep, 2014 1 commit

fuse: honour max_read and max_write in direct_io mode · 2c80929c

Committed by Miklos Szeredi on Sep 24, 2014

The third argument of fuse_get_user_pages(), "nbytesp", refers to the number of
bytes a caller asked to pack into the fuse request. This value may be less
than the capacity of the fuse request or iov_iter.  So fuse_get_user_pages() must
ensure that *nbytesp won't grow.

Now, when helper iov_iter_get_pages() performs all hard work of extracting
pages from iov_iter, it can be done by passing properly calculated
"maxsize" to the helper.

The other caller of iov_iter_get_pages() (dio_refill_pages()) doesn't need
this capability, so pass LONG_MAX as the maxsize argument here.

Fixes: c9c37e2e ("fuse: switch to iov_iter_get_pages()")
Reported-by: Werner Baumann <werner.baumann@onlinehome.de>
Tested-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2c80929c

25 Sep, 2014 4 commits

i2c: move acpi code back into the core · 17f4a5c4

Committed by Wolfram Sang on Sep 22, 2014

Commit 5d98e61d ("I2C/ACPI: Add i2c ACPI operation region support")
renamed the i2c-core module. This may cause regressions for
distributions, so put the ACPI code back into the core.
Reported-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Tested-by: Lan Tianyu <tianyu.lan@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>

17f4a5c4

cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags · 2ad654bc

Committed by Zefan Li on Sep 25, 2014

When we change cpuset.memory_spread_{page,slab}, cpuset will flip
PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset.
This should be done using atomic bitops, but currently we don't,
which is broken.

Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened
when one thread tried to clear PF_USED_MATH while at the same time another
thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on
the same task.

Here's the full report:
https://lkml.org/lkml/2014/9/19/230

To fix this, we make PF_SPREAD_PAGE and PF_SPREAD_SLAB atomic flags.

v4:
- updated mm/slab.c. (Fengguang Wu)
- updated Documentation.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Kees Cook <keescook@chromium.org>
Fixes: 950592f7 ("cpusets: update tasks' page/slab spread flags in time")
Cc: <stable@vger.kernel.org> # 2.6.31+
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

2ad654bc

sched: add macros to define bitops for task atomic flags · e0e5070b

Committed by Zefan Li on Sep 25, 2014

This will simplify code when we add new flags.

v3:
- Kees pointed out that no_new_privs should never be cleared, so we
shouldn't define task_clear_no_new_privs(). We define 3 macros instead
of a single one.

v2:
- updated scripts/tags.sh, suggested by Peter

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

e0e5070b
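The macro approach described in e0e5070b can be sketched in userspace C. This is a minimal model under stated assumptions, not the kernel code: `struct task`, the macro names, and the bit numbers are chosen for illustration, and C11 `<stdatomic.h>` operations stand in for the kernel's atomic bitops. Keeping test, set, and clear as three separate generators is what lets no_new_privs get a setter without any clear function.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the task structure; only the flags word matters. */
struct task {
    atomic_ulong atomic_flags;
};

/* Bit numbers (not masks); the values are assumptions for this sketch. */
#define PFA_NO_NEW_PRIVS 0
#define PFA_SPREAD_PAGE  1
#define PFA_SPREAD_SLAB  2

/* Generate task_<func>(): atomically load the word and test one bit. */
#define TASK_PFA_TEST(name, func)                                   \
    static inline bool task_##func(struct task *p)                  \
    { return (atomic_load(&p->atomic_flags) >> PFA_##name) & 1UL; }

/* Generate task_set_<func>(): atomic read-modify-write OR. */
#define TASK_PFA_SET(name, func)                                    \
    static inline void task_set_##func(struct task *p)              \
    { atomic_fetch_or(&p->atomic_flags, 1UL << PFA_##name); }

/* Generate task_clear_<func>(): atomic read-modify-write AND. */
#define TASK_PFA_CLEAR(name, func)                                  \
    static inline void task_clear_##func(struct task *p)            \
    { atomic_fetch_and(&p->atomic_flags, ~(1UL << PFA_##name)); }

/* no_new_privs gets test and set only: it must never be cleared. */
TASK_PFA_TEST(NO_NEW_PRIVS, no_new_privs)
TASK_PFA_SET(NO_NEW_PRIVS, no_new_privs)

TASK_PFA_TEST(SPREAD_PAGE, spread_page)
TASK_PFA_SET(SPREAD_PAGE, spread_page)
TASK_PFA_CLEAR(SPREAD_PAGE, spread_page)

TASK_PFA_TEST(SPREAD_SLAB, spread_slab)
TASK_PFA_SET(SPREAD_SLAB, spread_slab)
TASK_PFA_CLEAR(SPREAD_SLAB, spread_slab)
```

Because every update is a single atomic read-modify-write on the shared word, two threads flipping different flags of the same task cannot overwrite each other's change, which is the failure mode the cpuset fix above addresses.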

sched: fix confusing PFA_NO_NEW_PRIVS constant · a2b86f77

Committed by Zefan Li on Sep 25, 2014

Commit 1d4457f9 ("sched: move no_new_privs into new atomic flags")
defined PFA_NO_NEW_PRIVS as hexadecimal value, but it is confusing
because it is used as bit number. Redefine it as decimal bit number.

Note this changes the bit position of PFA_NO_NEW_PRIVS from 1 to 0.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Kees Cook <keescook@chromium.org>
[ lizf: slightly modified subject and changelog ]
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

a2b86f77
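The confusion described in a2b86f77 comes down to simple arithmetic: a constant that looks like a mask but is consumed as a bit number. The sketch below assumes the old definition was spelled `0x00000001` (the message only says it was hexadecimal and ended up at bit position 1); `pfa_mask()` is a hypothetical helper that models how a bitop turns a bit number into a mask.

```c
/* Old definition: written like a mask, but consumed as a bit number. */
#define OLD_PFA_NO_NEW_PRIVS 0x00000001
/* New definition: a plain decimal bit number. */
#define NEW_PFA_NO_NEW_PRIVS 0

/* Mask that a bitop touches for bit number nr (hypothetical helper). */
static unsigned long pfa_mask(unsigned int nr)
{
    return 1UL << nr;
}
```

With the old constant the flag lived in mask 0x2 (bit 1) rather than the 0x1 a reader would expect from the hexadecimal spelling; with the new constant it lives in mask 0x1 (bit 0).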

24 Sep, 2014 3 commits

blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe · 0a30288d

Committed by Tejun Heo on Sep 23, 2014

blk-mq uses percpu_ref for its usage counter which tracks the number
of in-flight commands and is used to synchronously drain the queue on
freeze.  percpu_ref shutdown takes measurable wallclock time as it
involves a sched RCU grace period.  This means that draining a blk-mq
takes measurable wallclock time.  One would think that this shouldn't
matter as queue shutdown should be a rare event which takes place
asynchronously w.r.t. userland.

Unfortunately, SCSI probing involves synchronously setting up and then
tearing down a lot of request_queues back-to-back for non-existent
LUNs.  This means that SCSI probing may take more than ten seconds
when scsi-mq is used.

This will be properly fixed by implementing a mechanism to keep
q->mq_usage_counter in atomic mode till genhd registration; however,
that involves rather big updates to percpu_ref which is difficult to
apply late in the devel cycle (v3.17-rc6 at the moment).  As a
stop-gap measure till the proper fix can be implemented in the next
cycle, this patch introduces __percpu_ref_kill_expedited() and makes
blk_mq_freeze_queue() use it.  This is heavy-handed but should work
for testing the experimental SCSI blk-mq implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Christoph Hellwig <hch@infradead.org>
Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
Fixes: add703fd ("blk-mq: use percpu_ref for mq usage count")
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

0a30288d

spi: pl022: Add support for chip select extension · db4fa45e

Committed by Anders Berg on Sep 17, 2014

Add support for an extended PL022 which has an extra register for controlling up
to five chip select signals. This controller is found on the AXM5516 SoC.
Unfortunately the PrimeCell identification registers are identical to a
standard ARM PL022. To work around this, the peripheral ID must be overridden
in the device tree using the "arm,primecell-periphid" property with the value
0x000b6022.
Signed-off-by: Anders Berg <anders.berg@avagotech.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>

db4fa45e

crypto: ccp - Check for CCP before registering crypto algs · c9f21cb6

Committed by Tom Lendacky on Sep 05, 2014

If the ccp is built as a built-in module, then ccp-crypto (whether
built as a module or a built-in module) will be able to load and
it will register its crypto algorithms.  If the system does not have
a CCP this will result in -ENODEV being returned whenever a command
is attempted to be queued by the registered crypto algorithms.

Add an API, ccp_present(), that checks for the presence of a CCP
on the system.  The ccp-crypto module can use this to determine if it
should register its crypto algorithms.

Cc: stable@vger.kernel.org
Reported-by: Scot Doyle <lkml14@scotdoyle.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Scot Doyle <lkml14@scotdoyle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c9f21cb6
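The guard that c9f21cb6 describes can be modeled as a presence check at the top of the registration path. This is only a userspace sketch: `ccp_device_found`, `register_algs()`, and `ccp_crypto_init()` are hypothetical stand-ins, and the convention that `ccp_present()` returns 0 when a device exists and `-ENODEV` otherwise is an assumption, since the message only says the function checks for presence.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for hardware state and registration bookkeeping. */
static bool ccp_device_found;
static int algs_registered;

/* Presence check: 0 if a CCP exists, -ENODEV otherwise (assumed convention). */
static int ccp_present(void)
{
    return ccp_device_found ? 0 : -ENODEV;
}

/* Stand-in for registering the crypto algorithms. */
static int register_algs(void)
{
    algs_registered++;
    return 0;
}

/* Init path: bail out before registering anything if no CCP is present. */
static int ccp_crypto_init(void)
{
    int ret = ccp_present();

    if (ret)
        return ret;
    return register_algs();
}
```

Failing at init time means no algorithm is ever advertised on a machine without the hardware, instead of every queued command failing later.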

23 Sep, 2014 1 commit

net: sched: shrink struct qdisc_skb_cb to 28 bytes · 25711786

Committed by Eric Dumazet on Sep 18, 2014

We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
or increasing skb->cb[] size.

Commit e0f31d84 ("flow_keys: Record IP layer protocol in
skb_flow_dissect()") broke IPoIB.

The only current offender is sch_choke, and this one does not need an
absolutely precise flow key.

If we store 17 bytes of flow key, it's more than enough. (It's the actual
size of flow_keys if it was a packed structure, but we might add new
fields at the end of it later)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: e0f31d84 ("flow_keys: Record IP layer protocol in skb_flow_dissect()")
Signed-off-by: David S. Miller <davem@davemloft.net>

25711786
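The size budget in 25711786 can be checked with a small layout model. Only the 28-byte total and the 17-byte flow key come from the commit message; the field names, the 20-byte private area, and the choke-style private structure below are assumptions made for this sketch.

```c
#include <stdint.h>

#define QDISC_CB_PRIV_LEN 20  /* assumed size of the per-qdisc private area */

/* Model of the control block: 4 + 2 + 2 + 20 = 28 bytes. */
struct qdisc_skb_cb_model {
    uint32_t pkt_len;
    uint16_t slave_dev_queue_mapping;
    uint16_t pad;
    unsigned char data[QDISC_CB_PRIV_LEN];
};

/* Model of a choke-style private area: class id plus a truncated flow key. */
struct choke_cb_model {
    uint16_t classid;
    uint8_t keys[17];  /* 17 bytes of flow key, as in the commit message */
};

/* Compile-time checks: the block stays at 28 bytes and the private data fits. */
_Static_assert(sizeof(struct qdisc_skb_cb_model) == 28,
               "control block must stay at 28 bytes");
_Static_assert(sizeof(struct choke_cb_model) <= QDISC_CB_PRIV_LEN,
               "private data must fit in the private area");
```

Expressing the limits as compile-time assertions means a later field addition that overflows the budget fails the build rather than silently corrupting a neighbouring user of `skb->cb[]`.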

22 Sep, 2014 2 commits

[media] vb2: fix VBI/poll regression · 58d75f4b

Committed by Hans Verkuil on Sep 20, 2014

The recent conversion of saa7134 to vb2 uncovered a poll() bug that
broke the teletext applications alevt and mtt. These applications
expect that calling poll() without having called VIDIOC_STREAMON will
cause poll() to return POLLERR. That did not happen in vb2.

This patch fixes that behavior. It also fixes what should happen when
poll() is called when STREAMON is called but no buffers have been
queued. In that case poll() will also return POLLERR, but only for
capture queues since output queues will always return POLLOUT
anyway in that situation.

This brings the vb2 behavior in line with the old videobuf behavior.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

58d75f4b
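The poll() rules stated in 58d75f4b can be written down as a small decision function. This is a simplified model of the behaviour as the message describes it, not the vb2 implementation: `struct queue_model` and `poll_model()` are hypothetical, and the real code returns richer event masks than the single flags used here.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical snapshot of the queue state relevant to poll(). */
struct queue_model {
    bool streaming;       /* VIDIOC_STREAMON has been called */
    bool is_output;       /* output queue rather than capture queue */
    unsigned int queued;  /* buffers queued by the application */
    bool buffer_done;     /* a finished buffer is ready to dequeue */
};

static short poll_model(const struct queue_model *q)
{
    /* poll() before STREAMON reports an error. */
    if (!q->streaming)
        return POLLERR;
    /* Streaming with nothing queued: error for capture, writable for output. */
    if (q->queued == 0)
        return q->is_output ? POLLOUT : POLLERR;
    /* Otherwise report readiness only when a buffer has completed. */
    if (q->buffer_done)
        return q->is_output ? POLLOUT : POLLIN;
    return 0;
}
```

A capture application that forgot to start streaming or to queue buffers gets POLLERR immediately instead of blocking forever, which matches the old videobuf behaviour the teletext tools relied on.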

[media] videobuf2-core.h: fix comment · 44e8e69d

Committed by Hans Verkuil on Aug 04, 2014

The comment for start_streaming that tells the developer with which vb2 state
buffers should be returned to vb2 gave the wrong state. Very confusing.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

44e8e69d

21 Sep, 2014 2 commits

ACPI / hotplug: Generate online uevents for ACPI containers · 8ab17fc9

Committed by Rafael J. Wysocki on Sep 21, 2014

Commit 46394fd0 (ACPI / hotplug: Move container-specific code out of
the core) removed the generation of "online" uevents for containers,
because "add" uevents are now generated for them automatically when
container system devices are registered. However, there are user
space tools that need to be notified when the container and all of
its children have been enumerated, which doesn't happen any more.

For this reason, add a mechanism allowing "online" uevents to be
generated for ACPI containers after enumerating the container along
with all of its children.

Fixes: 46394fd0 (ACPI / hotplug: Move container-specific code out of the core)
Reported-and-tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

8ab17fc9

sched: Fix end_of_stack() and location of stack canary for architectures using CONFIG_STACK_GROWSUP · 6a40281a

Committed by Chuck Ebbert on Sep 20, 2014

Aaron Tomlin recently posted patches [1] to enable checking the
stack canary on every task switch. Looking at the canary code, I
realized that every arch (except ia64, which adds some space for
register spill above the stack) shares a definition of
end_of_stack() that makes it the first long after the
threadinfo.

For stacks that grow down, this low address is correct because
the stack starts at the end of the thread area and grows toward
lower addresses. However, for stacks that grow up, toward higher
addresses, this is wrong. (The stack actually grows away from
the canary.) On these archs end_of_stack() should return the
address of the last long, at the highest possible address for the stack.

[1] http://lkml.org/lkml/2014/9/12/293

Signed-off-by: Chuck Ebbert <cebbert.lkml@gmail.com>
Link: http://lkml.kernel.org/r/20140920101751.6c5166b6@as
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: James Hogan <james.hogan@imgtec.com> [metag]
Acked-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Aaron Tomlin <atomlin@redhat.com>

6a40281a
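The two definitions of end_of_stack() discussed in 6a40281a can be compared on a userspace model of the per-thread area. The sketch assumes a layout in which the thread info sits at the lowest address of a fixed-size area; `THREAD_SIZE`, the structure names, and the two helper names are illustrative and are not the kernel's definitions.

```c
#define THREAD_SIZE 8192  /* assumed size of the combined thread info + stack area */

/* Hypothetical thread info occupying the start of the area. */
struct thread_info_model {
    unsigned long flags;
    unsigned long other;
};

/* The stack shares the same memory area as the thread info. */
union thread_union_model {
    struct thread_info_model info;
    unsigned long stack[THREAD_SIZE / sizeof(unsigned long)];
};

/* Stack grows down: the end is the first long after the thread info. */
static unsigned long *end_of_stack_growsdown(union thread_union_model *t)
{
    return (unsigned long *)(&t->info + 1);
}

/* Stack grows up: the end is the last long of the whole area. */
static unsigned long *end_of_stack_growsup(union thread_union_model *t)
{
    return (unsigned long *)((char *)t + THREAD_SIZE) - 1;
}
```

A canary is only useful at the address an overflowing stack reaches first, so on a grows-up architecture it has to sit at the highest long rather than right after the thread info.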

20 Sep, 2014 2 commits

genetlink: add function genl_has_listeners() · 0d566379

Committed by Nicolas Dichtel on Sep 18, 2014

This function is the counterpart of the function netlink_has_listeners().
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0d566379

IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get · 87773dd5

Committed by Shawn Bohrer on Sep 03, 2014

In debugging an application that receives -ENOMEM from ib_reg_mr(), I
found that ib_umem_get() can fail because the pinned_vm count has
wrapped causing it to always be larger than the lock limit even with
RLIMIT_MEMLOCK set to RLIM_INFINITY.

The wrapping of pinned_vm occurs because the process that calls
ib_reg_mr() will have its mm->pinned_vm count incremented.  Later a
different process with a different mm_struct than the one that
allocated the ib_umem struct ends up releasing it which results in
decrementing the new process's mm->pinned_vm count past zero and
wrapping.

I'm not entirely sure what circumstances cause a different process to
release the ib_umem than the one that allocated it but the kernel
stack trace of the freeing process from my situation looks like the
following:

    Call Trace:
     [<ffffffff814d64b1>] dump_stack+0x19/0x1b
     [<ffffffffa0b522a5>] ib_umem_release+0x1f5/0x200 [ib_core]
     [<ffffffffa0b90681>] mlx4_ib_destroy_qp+0x241/0x440 [mlx4_ib]
     [<ffffffffa0b4d93c>] ib_destroy_qp+0x12c/0x170 [ib_core]
     [<ffffffffa0cc7129>] ib_uverbs_close+0x259/0x4e0 [ib_uverbs]
     [<ffffffff81141cba>] __fput+0xba/0x240
     [<ffffffff81141e4e>] ____fput+0xe/0x10
     [<ffffffff81060894>] task_work_run+0xc4/0xe0
     [<ffffffff810029e5>] do_notify_resume+0x95/0xa0
     [<ffffffff814e3dd0>] int_signal+0x12/0x17

The following patch fixes the issue by storing the pid struct of the
process that calls ib_umem_get() so that ib_umem_release and/or
ib_umem_account() can properly decrement the pinned_vm count of the
correct mm_struct.
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Reviewed-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

87773dd5
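The wrap-around described in 87773dd5 is plain unsigned arithmetic and can be reproduced outside the kernel. In this sketch `struct mm_model` and both helpers are hypothetical, and the limit check is a simplified rendering of the comparison the message alludes to.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-process accounting. */
struct mm_model {
    unsigned long pinned_vm;  /* pages currently pinned */
};

/* Release path: subtracting from the wrong mm wraps an unsigned counter past zero. */
static void release_pages(struct mm_model *mm, unsigned long npages)
{
    mm->pinned_vm -= npages;
}

/* Pin path: refuse when the new total would exceed the lock limit. */
static bool would_exceed_limit(const struct mm_model *mm, unsigned long npages,
                               unsigned long lock_limit)
{
    return mm->pinned_vm + npages > lock_limit;
}
```

If a process that never pinned anything releases 16 pages, its counter becomes `ULONG_MAX - 15`, and from then on even a one-page request exceeds any limit expressed in pages, including the one derived from an unlimited RLIMIT_MEMLOCK.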

19 Sep, 2014 2 commits

[SCSI] fix regression that accidentally disabled block-based tcq · e8be1cf5

Committed by Christoph Hellwig on Sep 12, 2014

The scsi blk-mq support accidentally flipped a conditional, which led to
never enabling block based tcq when using the legacy request path.

Fixes: d285203c scsi: add support for a blk-mq based I/O path.
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

e8be1cf5

vgaswitcheroo: add vga_switcheroo_fini_domain_pm_ops · 766a53d0

Committed by Alex Deucher on Sep 12, 2014

Drivers should call this on unload to unregister pmops.

Bug:
https://bugzilla.kernel.org/show_bug.cgi?id=84431

Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Cc: Ben Skeggs <bskeggs@redhat.com>

766a53d0

17 Sep, 2014 1 commit

vgaarb: Drop obsolete #ifndef · ce6eacb0

Committed by Bruno Prémont on Aug 24, 2014

Commit 20cde694 ("x86, ia64: Move EFI_FB vga_default_device()
initialization to pci_vga_fixup()") moved boot video device detection from
efifb to x86 and ia64 pci/fixup.c.

Remove the left-over #ifndef check that will always match since the
corresponding arch-specific define is gone with above patch.
Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Matthew Garrett <matthew.garrett@nebula.com>

ce6eacb0

16 Sep, 2014 3 commits

xfrm: Generate queueing routes only from route lookup functions · b8c203b2

Committed by Steffen Klassert on Sep 16, 2014

Currently we generate a queueing route if we have matching policies
but can not resolve the states and the sysctl xfrm_larval_drop is
disabled. Here we assume that dst_output() is called to kill the
queued packets. Unfortunately this assumption is not true in all
cases, so it is possible that these packets leave the system unwanted.

We fix this by generating queueing routes only from the
route lookup functions, here we can guarantee a call to
dst_output() afterwards.

Fixes: a0073fe1 ("xfrm: Add a state resolution packet queue")
Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

b8c203b2

xfrm: Generate blackhole routes only from route lookup functions · f92ee619

Committed by Steffen Klassert on Sep 16, 2014

Currently we generate a blackhole route whenever we have
matching policies but can not resolve the states. Here we assume
that dst_output() is called to kill the blackholed packets.
Unfortunately this assumption is not true in all cases, so
it is possible that these packets leave the system unwanted.

We fix this by generating blackhole routes only from the
route lookup functions, here we can guarantee a call to
dst_output() afterwards.

Fixes: 2774c131 ("xfrm: Handle blackhole route creation via afinfo.")
Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

f92ee619

ACPIPHP / radeon / nouveau: Remove acpi_bus_no_hotplug() · f91ce35e

Committed by Bjorn Helgaas on Sep 10, 2014

Revert parts of f244d8b6 ("ACPIPHP / radeon / nouveau: Fix VGA
switcheroo problem related to hotplug").

A previous commit 5493b31f0b55 ("PCI: Add pci_ignore_hotplug() to ignore
hotplug events for a device") added equivalent functionality implemented in
a different way for both acpiphp and pciehp.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Rajat Jain <rajatxjain@gmail.com>

f91ce35e

15 Sep, 2014 1 commit

vfs: avoid non-forwarding large load after small store in path lookup · 9226b5b4

Committed by Linus Torvalds on Sep 14, 2014

The performance regression that Josef Bacik reported in the pathname
lookup (see commit 99d263d4 "vfs: fix bad hashing of dentries") made
me look at performance stability of the dcache code, just to verify that
the problem was actually fixed.  That turned up a few other problems in
this area.

There are a few cases where we exit RCU lookup mode and go to the slow
serializing case when we shouldn't, Al has fixed those and they'll come
in with the next VFS pull.

But my performance verification also shows that link_path_walk() turns
out to have a very unfortunate 32-bit store of the length and hash of
the name we look up, followed by a 64-bit read of the combined hash_len
field.  That screws up the processor store to load forwarding, causing
an unnecessary hiccup in this critical routine.

It's caused by the ugly calling convention for the "hash_name()"
function, and easily fixed by just making hash_name() fill in the whole
'struct qstr' rather than passing it a pointer to just the hash value.

With that, the profile for this function looks much smoother.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9226b5b4
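The store pattern that 9226b5b4 removes can be illustrated with a packed hash-and-length word. The sketch is a hedged model only: `struct qstr_model`, the helper names, and the choice to put the length in the upper 32 bits are assumptions, and the performance effect itself cannot be shown in portable C. What the code does show is the shape of the fix: build the combined value once and write it with a single 64-bit store, so a later 64-bit load reads back exactly what one store wrote.

```c
#include <stdint.h>

/* Model of the name descriptor: hash and length share one 64-bit word. */
struct qstr_model {
    uint64_t hash_len;
    const char *name;
};

/* Pack both halves into one value (length in the upper 32 bits, by assumption). */
static uint64_t hashlen_create(uint32_t hash, uint32_t len)
{
    return ((uint64_t)len << 32) | hash;
}

static uint32_t hashlen_hash(uint64_t hash_len)
{
    return (uint32_t)hash_len;
}

static uint32_t hashlen_len(uint64_t hash_len)
{
    return (uint32_t)(hash_len >> 32);
}

/* Fill the whole descriptor; hash_len is written with one 64-bit store. */
static void fill_qstr(struct qstr_model *q, const char *name,
                      uint32_t hash, uint32_t len)
{
    q->name = name;
    q->hash_len = hashlen_create(hash, len);
}
```

Writing the two 32-bit halves separately and then loading all 64 bits is the pattern that defeats store-to-load forwarding on the processors the message refers to.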

14 Sep, 2014 1 commit

Make hash_64() use a 64-bit multiply when appropriate · 23d0db76

Committed by Linus Torvalds on Sep 13, 2014

The hash_64() function historically does the multiply by the
GOLDEN_RATIO_PRIME_64 number with explicit shifts and adds, because
unlike the 32-bit case, gcc seems unable to turn the constant multiply
into the more appropriate shift and adds when required.

However, that means that we generate those shifts and adds even when the
architecture has a fast multiplier, and could just do it better in
hardware.

Use the now-cleaned-up CONFIG_ARCH_HAS_FAST_MULTIPLIER (together with
"is it a 64-bit architecture") to decide whether to use an integer
multiply or the explicit sequence of shift/add instructions.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

23d0db76
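The equivalence that 23d0db76 relies on can be checked directly: the explicit shift/add sequence is just the multiplication by GOLDEN_RATIO_PRIME_64 written out term by term. The constant value and the exact shift sequence below are not given in the commit message; they are reproduced from memory of the kernel's hash header of that period and should be treated as assumptions, as should the two function names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: 2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1. */
#define GOLDEN_RATIO_PRIME_64 0x9e37fffffffc0001ULL

/* Variant for architectures with a fast 64-bit multiplier. */
static uint64_t hash_64_multiply(uint64_t val, unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    return (val * GOLDEN_RATIO_PRIME_64) >> (64 - bits);
}

/* Variant that spells the same multiplication out as shifts and adds. */
static uint64_t hash_64_shift_add(uint64_t val, unsigned int bits)
{
    uint64_t hash = val;  /* the +1 term */
    uint64_t n = val;

    assert(bits >= 1 && bits <= 64);
    n <<= 18; hash -= n;  /* - val * 2^18 */
    n <<= 33; hash -= n;  /* - val * 2^51 */
    n <<= 3;  hash += n;  /* + val * 2^54 */
    n <<= 3;  hash -= n;  /* - val * 2^57 */
    n <<= 4;  hash += n;  /* + val * 2^61 */
    n <<= 2;  hash += n;  /* + val * 2^63 */
    return hash >> (64 - bits);
}
```

Both functions compute the same value modulo 2^64, so choosing between them is purely a code generation decision, which is why a configuration symbol can select one or the other.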

13 Sep, 2014 3 commits

ipv6: clean up anycast when an interface is destroyed · 381f4dca

Committed by Sabrina Dubroca on Sep 10, 2014

If we try to rmmod the driver for an interface while sockets with
setsockopt(JOIN_ANYCAST) are alive, some refcounts aren't cleaned up
and we get stuck on:

  unregister_netdevice: waiting for ens3 to become free. Usage count = 1

If we LEAVE_ANYCAST/close everything before rmmod'ing, there is no
problem.

We need to perform a cleanup similar to the one for multicast in
addrconf_ifdown(how == 1).
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

381f4dca

jiffies: Fix timeval conversion to jiffies · d78c9300

Committed by Andrew Hunter on Sep 04, 2014

timeval_to_jiffies tried to round a timeval up to an integral number
of jiffies, but the logic for doing so was incorrect: intervals
corresponding to exactly N jiffies would become N+1. This manifested
itself particularly when repeatedly stopping/starting an itimer:

setitimer(ITIMER_PROF, &val, NULL);
setitimer(ITIMER_PROF, NULL, &val);

would add a full tick to val, _even if it was exactly representable in
terms of jiffies_ (say, the result of a previous rounding.)  Doing
this repeatedly would cause unbounded growth in val.  So fix the math.

Here's what was wrong with the conversion: we essentially computed
(eliding seconds)

jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:

jiffies = (usec * x) >> USEC_JIFFIE_SC

and rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value; that is, instead of dividing by (1 usec/ 1 jiffie), we
effectively divided by (1 usec/1 jiffie)-epsilon (rounding
down). This would normally be fine, but we want to round timeouts up,
and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
would be fine if our division was exact, but dividing this by the
slightly smaller factor was equivalent to adding just _over_ 1 to the
final result (instead of just _under_ 1, as desired.)

In particular, with HZ=1000, we consistently computed that 10000 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC.

We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec->nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies.  This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.

Tested: the following program:

#include <stddef.h>
#include <stdio.h>
#include <sys/time.h>

int main(void) {
  struct itimerval zero = {{0, 0}, {0, 0}};
  /* Initially set to 10 ms. */
  struct itimerval initial = zero;
  initial.it_interval.tv_usec = 10000;
  setitimer(ITIMER_PROF, &initial, NULL);
  /* Save and restore several times. */
  for (size_t i = 0; i < 10; ++i) {
    struct itimerval prev;
    setitimer(ITIMER_PROF, &zero, &prev);
    /* on old kernels, this goes up by TICK_USEC every iteration */
    printf("previous value: %ld %ld %ld %ld\n",
           (long)prev.it_interval.tv_sec, (long)prev.it_interval.tv_usec,
           (long)prev.it_value.tv_sec, (long)prev.it_value.tv_usec);
    setitimer(ITIMER_PROF, &prev, NULL);
  }
  return 0;
}

Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Paul Turner <pjt@google.com>
Reported-by: Aaron Jacobs <jacobsa@google.com>
Signed-off-by: Andrew Hunter <ahh@google.com>
[jstultz: Tweaked to apply to 3.17-rc]
Signed-off-by: John Stultz <john.stultz@linaro.org>

d78c9300
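The rounding error explained in d78c9300 can be reproduced with a few lines of integer arithmetic. This is a simplified model under explicit assumptions: the tick is taken to be exactly 1 ms (HZ=1000 with no clock skew), the scaling shift is fixed at 32, and the corrected path uses exact division in nanoseconds rather than the kernel's own timespec conversion. The names `TICK_NSEC_MODEL`, `SCALE_SHIFT`, `USEC_FACTOR`, and both functions are hypothetical.

```c
#include <stdint.h>

#define TICK_NSEC_MODEL 1000000ULL  /* assumed: HZ=1000, exactly 1 ms per tick */
#define SCALE_SHIFT     32          /* assumed shift for the scaled factor */

/* Approximation of (1000 / TICK_NSEC) * 2^32, rounded up: a slight overestimate. */
#define USEC_FACTOR \
    (((1000ULL << SCALE_SHIFT) + TICK_NSEC_MODEL - 1) / TICK_NSEC_MODEL)

/* Old scheme: round up in the scaled domain by adding 2^32 - 1 before the shift. */
static uint64_t old_usec_to_jiffies(uint64_t usec)
{
    return (usec * USEC_FACTOR + ((1ULL << SCALE_SHIFT) - 1)) >> SCALE_SHIFT;
}

/* Fixed scheme (modeled): convert to nanoseconds first, then round up there. */
static uint64_t new_usec_to_jiffies(uint64_t usec)
{
    uint64_t nsec = usec * 1000ULL;

    return (nsec + TICK_NSEC_MODEL - 1) / TICK_NSEC_MODEL;
}
```

In this model the factor is 4294968 instead of the exact 4294967.296, so `old_usec_to_jiffies(10000)` lands just above 10 in the scaled domain and the added 2^32 - 1 pushes it to 11, while `new_usec_to_jiffies(10000)` yields 10 and only inputs that are not a whole number of ticks are rounded up.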

workqueue: apply __WQ_ORDERED to create_singlethread_workqueue() · e09c2c29

Committed by Tejun Heo on Sep 13, 2014

create_singlethread_workqueue() is a compat interface for single
threaded workqueue which maps to ordered workqueue w/ rescuer in the
current implementation.  create_singlethread_workqueue() is currently
implemented by invoking alloc_workqueue() w/ appropriate parameters.

8719dcea ("workqueue: reject adjusting max_active or applying
attrs to ordered workqueues") introduced __WQ_ORDERED to protect
ordered workqueues against dynamic attribute changes which can break
ordering guarantees but forgot to apply it to
create_singlethread_workqueue().  This in itself is okay as nobody
currently uses dynamic attribute change on workqueues created with
create_singlethread_workqueue().

However, 4c16bd32 ("workqueue: implement NUMA affinity for unbound
workqueues") broke singlethreaded guarantee for ordered workqueues
through allocating a separate pool_workqueue on each NUMA node by
default.  A later change 8a2b7538 ("workqueue: fix ordered
workqueues in NUMA setups") fixed it by allocating only one global
pool_workqueue if __WQ_ORDERED is set.

Combined, the __WQ_ORDERED omission in create_singlethread_workqueue()
became critical breaking its single threadedness and ordering
guarantee.

Let's make create_singlethread_workqueue() wrap
alloc_ordered_workqueue() instead so that it inherits __WQ_ORDERED and
can implicitly track future ordered_workqueue changes.

v2: I missed that __WQ_ORDERED now protects against pwq splitting
    across NUMA nodes and incorrectly described the patch as a
    nice-to-have fix to protect against future dynamic attribute
    usages.  Oleg pointed out that this is actually a critical
    breakage due to 8a2b7538 ("workqueue: fix ordered workqueues
    in NUMA setups").
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Anderson <mike.anderson@us.ibm.com>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Gustavo Luiz Duarte <gduarte@redhat.com>
Cc: Tomas Henzl <thenzl@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 4c16bd32 ("workqueue: implement NUMA affinity for unbound workqueues")

e09c2c29

12 Sep, 2014 1 commit

xen/arm: introduce XENFEAT_grant_map_identity · 5ebc77de

Committed by Stefano Stabellini on Sep 10, 2014

The flag tells us that the hypervisor maps a grant page to guest
physical address == machine address of the page in addition to the
normal grant mapping address. It is needed to properly issue cache
maintenance operation at the completion of a DMA operation involving a
foreign grant.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Denis Schneider <v1ne2go@gmail.com>

5ebc77de

11 Sep, 2014 4 commits

moduleparam: Resolve missing-field-initializer warning · 184c3fc3

Committed by Mark Rustad on Sep 11, 2014

Resolve a missing-field-initializer warning, that is produced
by every reference to module_param_call, by using designated
initialization for the first field. That is enough to silence
the complaint.

The message is only seen when doing a W=2 build. I happened to be using gcc
4.8.3, but I think most versions would produce the warning when it is
enabled. It can either be silenced by using even a single designated
initializer as I did here, or providing values for all of the fields. Because
of the number of references to the macro, this change silences many warnings
in W=2 builds.

One instance of the full warning message looks like this:

/home/share/git/nn-mdr/include/linux/moduleparam.h:198:16: warning: missing
initializer for field ‘free’ of ‘struct kernel_param_ops’
[-Wmissing-field-initializers]
  static struct kernel_param_ops __param_ops_##name =  \
		  ^
/home/share/git/nn-mdr/fs/fuse/inode.c:35:1: note: in expansion of macro
‘module_param_call’
 module_param_call(max_user_bgreq, set_global_limit, param_get_uint,
 ^
/home/share/git/nn-mdr/include/linux/moduleparam.h:56:9: note: ‘free’
declared here
  void (*free)(void *arg);
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

184c3fc3
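The technique used in 184c3fc3 can be shown on a small stand-in structure. The three-field `struct param_ops_model` and the two callbacks are hypothetical; only the idea, that designating the first field is enough for GCC to stop reporting the omitted trailing field, comes from the commit message.

```c
#include <stddef.h>

/* Hypothetical three-field ops structure; the last field is usually left unset. */
struct param_ops_model {
    int (*set)(const char *val);
    int (*get)(char *buffer);
    void (*free)(void *arg);
};

static int set_limit(const char *val) { (void)val; return 0; }
static int get_limit(char *buffer) { (void)buffer; return 0; }

/*
 * A purely positional initializer such as { set_limit, get_limit } makes
 * -Wmissing-field-initializers report that 'free' has no initializer.
 * Designating the first field suppresses the warning; the second value still
 * binds positionally to 'get', and 'free' is zero-initialized either way.
 */
static const struct param_ops_model ops_designated = {
    .set = set_limit,
    get_limit,
};
```

The generated object is identical in both forms, so the change affects diagnostics only.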

shm: add memfd.h to UAPI export list · b01d0720

Committed by David Drysdale on Sep 09, 2014

The new header file memfd.h from commit 9183df25 ("shm: add
memfd_create() syscall") should be exported.
Signed-off-by: David Drysdale <drysdale@google.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b01d0720

net/mlx4: Set vlan stripping policy by the right command · 09e05c3f

Committed by Matan Barak on Sep 10, 2014

Changing the vlan stripping policy of the QP isn't supported by older
firmware versions for the INIT2RTR command. Nevertheless, we've used it.

Fix that by changing the policy with INIT2RTR only if the firmware
supports it; otherwise, call the UPDATE_QP command to do the task.

Fixes: 7677fc96 ('net/mlx4: Strengthen VLAN tags/priorities enforcement in VST mode')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09e05c3f
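
The selection logic in the mlx4 commit above amounts to a capability check with a fallback. The following is a rough sketch of that decision only: the enum, the struct, and the field name `init2rtr_vlan_strip` are hypothetical and do not reflect the mlx4 driver's actual data structures.

```c
#include <stdbool.h>

/* Hypothetical command identifiers; the names only mirror the commit text. */
enum model_qp_cmd {
	MODEL_CMD_INIT2RTR,
	MODEL_CMD_UPDATE_QP
};

/* Hypothetical record of what the firmware supports. */
struct model_fw_caps {
	bool init2rtr_vlan_strip; /* firmware accepts the policy in INIT2RTR */
};

/* Pick the command that should carry the vlan stripping policy change. */
static enum model_qp_cmd model_vlan_strip_cmd(const struct model_fw_caps *caps)
{
	if (caps->init2rtr_vlan_strip)
		return MODEL_CMD_INIT2RTR;
	return MODEL_CMD_UPDATE_QP; /* fallback for older firmware */
}
```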

PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device · b440bde7

Committed by Bjorn Helgaas on Sep 10, 2014

Powering off a hot-pluggable device, e.g., with pci_set_power_state(D3cold),
normally generates a hot-remove event that unbinds the driver.

Some drivers expect to remain bound to a device even while they power it
off and back on again. This can be dangerous, because if the device is
removed or replaced while it is powered off, the driver doesn't know that
anything changed. But some drivers accept that risk.

Add pci_ignore_hotplug() for use by drivers that know their device cannot
be removed. Using pci_ignore_hotplug() tells the PCI core that hot-plug
events for the device should be ignored.

The radeon and nouveau drivers use this to switch between a low-power,
integrated GPU and a higher-power, higher-performance discrete GPU. They
power off the unused GPU, but they want to remain bound to it.

This is a reimplementation of f244d8b6 ("ACPIPHP / radeon / nouveau:
Fix VGA switcheroo problem related to hotplug") but extends it to work with
both acpiphp and pciehp.

This fixes problems on dual-GPU systems using the radeon driver, which
become unusable and freeze every few seconds (see the Bugzilla links
below). The failure can occur while suspending the device, as in bug 79701:

[drm] radeon: finishing device.
radeon 0000:01:00.0: Userspace still has active objects !
radeon 0000:01:00.0: ffff8800cb4ec288 ffff8800cb4ec000 16384 4294967297 force free
...
WARNING: CPU: 0 PID: 67 at /home/apw/COD/linux/drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xd2/0xe0 [radeon]()
trying to unbind memory from uninitialized GART !

or while resuming it, as in bug 77261:

radeon 0000:01:00.0: ring 0 stalled for more than 10158msec
radeon 0000:01:00.0: GPU lockup ...
radeon 0000:01:00.0: GPU pci config reset
pciehp 0000:00:01.0:pcie04: Card not present on Slot(1-1)
radeon 0000:01:00.0: GPU reset succeeded, trying to resume
*ERROR* radeon: dpm resume failed
radeon 0000:01:00.0: Wait for MC idle timedout !

Link: https://bugzilla.kernel.org/show_bug.cgi?id=77261
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79701
Reported-by: Shawn Starr <shawn.starr@rogers.com>
Reported-by: Jose P. <lbdkmjdf@sharklasers.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Rajat Jain <rajatxjain@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
CC: stable@vger.kernel.org # v3.15+

b440bde7
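
The behaviour described in the pci_ignore_hotplug() commit above can be modelled in plain C. This is a simplified user-space sketch, not kernel code: `struct model_pci_dev` and the `model_*` functions are hypothetical stand-ins for the PCI core and a hotplug controller driver.

```c
#include <stdbool.h>

/* Hypothetical, simplified device model; not the kernel's struct pci_dev. */
struct model_pci_dev {
	bool ignore_hotplug; /* set by a driver that wants to stay bound */
	bool driver_bound;
};

/* Model of the helper: the driver asks the core to ignore hotplug events. */
static void model_ignore_hotplug(struct model_pci_dev *dev)
{
	dev->ignore_hotplug = true;
}

/* Model of a hotplug controller reacting to a hot-remove event. */
static void model_handle_hot_remove(struct model_pci_dev *dev)
{
	if (dev->ignore_hotplug)
		return; /* driver stays bound across the power cycle */

	dev->driver_bound = false; /* normal path: unbind the driver */
}
```

In this model, a driver that powers its device off and on again would call `model_ignore_hotplug()` first. It thereby accepts the risk the commit describes: it will not notice if the device is removed or replaced while powered off.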

10 Sep, 2014 · 1 commit

regulator: of: Provide simplified DT parsing method · a0c7b164

Committed by Mark Brown on Sep 09, 2014

Currently regulator drivers which support DT all repeat very similar code
to supply a list of known regulator identifiers to be matched with DT,
convert that to platform data which is then matched up with the regulators
as they are registered. This is fiddly to get right and, for devices
which can use the standard helpers to provide their operations, it is the
main source of code in the driver.

Since this code is essentially identical for most drivers we can factor it
out into the core, moving the identifiers in the match table into the
regulator descriptors and also allowing drivers to pass in the name of the
subnode to search. When a driver provides an of_match string for the
regulator the core will attempt to use that to obtain init_data, allowing
the driver to remove all explicit code for DT parsing and simply provide
data instead.

The current code leaks the phandles for the child nodes; this will be
addressed incrementally and makes no practical difference for FDT anyway,
as the DT data structures are never freed.
Signed-off-by: Mark Brown <broonie@linaro.org>

a0c7b164
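
The matching step described in the regulator commit above, where the core uses a per-descriptor of_match string to find init data, can be sketched as a lookup by node name. The structs and the function below are hypothetical simplifications. The real regulator descriptor and device tree node types carry far more state, and the voltage fields are illustrative only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified regulator descriptor. */
struct model_regulator_desc {
	const char *name;
	const char *of_match; /* name of the DT child node, or NULL */
};

/* Hypothetical stand-in for a DT child node and its parsed constraints. */
struct model_dt_node {
	const char *name;
	int min_microvolt;
	int max_microvolt;
};

/* Return the node matching desc->of_match, or NULL if there is none. */
static const struct model_dt_node *
model_find_init_data(const struct model_regulator_desc *desc,
		     const struct model_dt_node *nodes, size_t n_nodes)
{
	size_t i;

	if (!desc->of_match)
		return NULL; /* driver did not opt in to core DT parsing */

	for (i = 0; i < n_nodes; i++) {
		if (strcmp(nodes[i].name, desc->of_match) == 0)
			return &nodes[i];
	}
	return NULL;
}
```

With a lookup like this in the core, a driver only has to supply a table of descriptors, which is the "simply provide data" outcome the commit message aims for.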

09 Sep, 2014 · 2 commits

Documentation: Docbook: Fix generated DocBook/kernel-api.xml · da3dae54

Committed by Masanari Iida on Sep 09, 2014

This patch fixes spelling typos found in DocBook/kernel-api.xml.
Because the file is generated from the source comments,
the comments in the source code have to be fixed.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

da3dae54

Input: add INPUT_PROP_POINTING_STICK property · 7611392f

Committed by Hans de Goede on Sep 08, 2014

It is useful for userspace to know that it is not dealing with a regular
mouse but rather with a pointing stick (e.g. a trackpoint) so that
userspace can e.g. automatically enable middle button scrollwheel
emulation.

It is impossible to tell the difference from the evdev info without
resorting to putting a list of device / driver names in userspace, which
is undesirable.

Add a property which allows userspace to see if a device is a pointing
stick, and set it on all the pointing stick drivers.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

7611392f
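
On the userspace side, checking for the property added by the INPUT_PROP_POINTING_STICK commit above comes down to testing one bit in the device's property bitmask. The sketch below assumes the bitmask has already been read into an array of unsigned long (for example with the EVIOCGPROP ioctl); `model_has_input_prop` is a hypothetical helper name. The constant is repeated locally, with the value used in linux/input.h, only so that the example builds without kernel headers.

```c
#include <stddef.h>

/* Repeated here so the sketch builds without kernel headers. */
#ifndef INPUT_PROP_POINTING_STICK
#define INPUT_PROP_POINTING_STICK 0x05
#endif

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * 8)

/* Return 1 if property bit 'prop' is set in 'bits', 0 otherwise. */
static int model_has_input_prop(const unsigned long *bits, size_t nlongs,
				unsigned int prop)
{
	size_t word = prop / MODEL_BITS_PER_LONG;

	if (word >= nlongs)
		return 0; /* bit lies outside the supplied bitmask */

	return (int)((bits[word] >> (prop % MODEL_BITS_PER_LONG)) & 1UL);
}
```

A caller could use the result to decide whether to enable middle button scroll emulation, without keeping a list of device or driver names.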

08 Sep, 2014 · 1 commit

HID: usbhid: add always-poll quirk · 0b750b3b

Committed by Johan Hovold on Sep 05, 2014

Add quirk to make sure that a device is always polled for input events
even if it hasn't been opened.

This is needed for devices that disconnect from the bus unless the
interrupt endpoint has been polled at least once, or when not responding
to an input event (e.g. after having shut down X).
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

0b750b3b
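
The effect of the always-poll quirk described above can be modelled as a small state machine. This is a user-space sketch under stated assumptions: `MODEL_QUIRK_ALWAYS_POLL`, `struct model_hid_dev`, and the `model_*` functions are hypothetical, and the real usbhid driver manages interrupt URBs rather than a boolean.

```c
#include <stdbool.h>

/* Hypothetical quirk bit; the real flag is defined in the kernel's HID headers. */
#define MODEL_QUIRK_ALWAYS_POLL 0x1u

struct model_hid_dev {
	unsigned int quirks;
	int open_count;
	bool polling;
};

/* Device bound: start polling right away only if the quirk is set. */
static void model_start(struct model_hid_dev *dev)
{
	dev->polling = (dev->quirks & MODEL_QUIRK_ALWAYS_POLL) != 0;
}

/* First and later opens: polling must be running while the device is open. */
static void model_open(struct model_hid_dev *dev)
{
	dev->open_count++;
	dev->polling = true;
}

/* Last close: stop polling unless the quirk demands it stays on. */
static void model_close(struct model_hid_dev *dev)
{
	if (dev->open_count > 0)
		dev->open_count--;

	if (dev->open_count == 0 && !(dev->quirks & MODEL_QUIRK_ALWAYS_POLL))
		dev->polling = false;
}
```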
