提交 · 3e4d890a26d5411d0b64e5e8ecfdcdb435c1d3f8 · openeuler / Kernel

23 9月, 2019 1 次提交

modules: make MODULE_IMPORT_NS() work even when modular builds are disabled · 3e4d890a

由 Linus Torvalds 提交于 9月 22, 2019

It's an unusual configuration, and was apparently never tested, and not
caught in linux-next because of a combination of travels and it making
it into the tree too late.

The fix is to simply move the #define to outside the CONFIG_MODULE
section, since MODULE_INFO() will do the right thing.

Cc: Martijn Coenen <maco@android.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthias Maennich <maennich@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3e4d890a

18 9月, 2019 3 次提交

clk: add include guard to clk-conf.h · d7aef6ef

由 Masahiro Yamada 提交于 8月 20, 2019

Add a header include guard just in case.
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
Link: https://lkml.kernel.org/r/20190820030536.1181-1-yamada.masahiro@socionext.comSigned-off-by: NStephen Boyd <sboyd@kernel.org>

d7aef6ef

clk: mediatek: add pericfg clocks for MT8183 · f9e55ac2

由 Chunfeng Yun 提交于 8月 28, 2019

Add pericfg clocks for MT8183, it's used when support USB
remote wakeup

Cc: Weiyi Lu <weiyi.lu@mediatek.com>
Signed-off-by: NChunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lkml.kernel.org/r/1566980533-28282-2-git-send-email-chunfeng.yun@mediatek.comAcked-by: NRob Herring <robh@kernel.org>
Signed-off-by: NStephen Boyd <sboyd@kernel.org>

f9e55ac2

dt-bindings: bcm2835-cprman: Add bcm2711 support · 80766f87

由 Stefan Wahren 提交于 8月 18, 2019

The new BCM2711 supports an additional clock for the emmc2 block.
So we need an additional compatible.
Signed-off-by: NStefan Wahren <wahrenst@gmx.net>
Acked-by: NEric Anholt <eric@anholt.net>
Reviewed-by: NEric Anholt <eric@anholt.net>

80766f87

17 9月, 2019 6 次提交

mailbox: mediatek: cmdq: clear the event in cmdq initial flow · 6058f118

由 Bibby Hsieh 提交于 8月 29, 2019

GCE hardware stored event information in own internal sysram,
if the initial value in those sysram is not zero value
it will cause a situation that gce can wait the event immediately
after client ask gce to wait event but not really trigger the
corresponding hardware.

In order to make sure that the wait event function is
exactly correct, we need to clear the sysram value in
cmdq initial flow.

Fixes: 623a6143 ("mailbox: mediatek: Add Mediatek CMDQ driver")
Signed-off-by: NBibby Hsieh <bibby.hsieh@mediatek.com>
Reviewed-by: NCK Hu <ck.hu@mediatek.com>
Reviewed-by: NMatthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: NJassi Brar <jaswinder.singh@linaro.org>

6058f118

dt-binding: gce: add gce header file for mt8183 · 8fedf805

由 Bibby Hsieh 提交于 8月 29, 2019

Add documentation for the mt8183 gce.

Add gce header file defined the gce hardware event,
subsys number and constant for mt8183.
Signed-off-by: NBibby Hsieh <bibby.hsieh@mediatek.com>
Reviewed-by: NRob Herring <robh@kernel.org>
Signed-off-by: NJassi Brar <jaswinder.singh@linaro.org>

8fedf805

ethtool: implement Energy Detect Powerdown support via phy-tunable · 9f2f13f4

由 Alexandru Ardelean 提交于 9月 16, 2019

The `phy_tunable_id` has been named `ETHTOOL_PHY_EDPD` since it looks like
this feature is common across other PHYs (like EEE), and defining
`ETHTOOL_PHY_ENERGY_DETECT_POWER_DOWN` seems too long.

The way EDPD works, is that the RX block is put to a lower power mode,
except for link-pulse detection circuits. The TX block is also put to low
power mode, but the PHY wakes-up periodically to send link pulses, to avoid
lock-ups in case the other side is also in EDPD mode.

Currently, there are 2 PHY drivers that look like they could use this new
PHY tunable feature: the `adin` && `micrel` PHYs.

The ADIN's datasheet mentions that TX pulses are at intervals of 1 second
default each, and they can be disabled. For the Micrel KSZ9031 PHY, the
datasheet does not mention whether they can be disabled, but mentions that
they can modified.

The way this change is structured, is similar to the PHY tunable downshift
control:
* a `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` value is exposed to cover a default
  TX interval; some PHYs could specify a certain value that makes sense
* `ETHTOOL_PHY_EDPD_NO_TX` would disable TX when EDPD is enabled
* `ETHTOOL_PHY_EDPD_DISABLE` will disable EDPD

As noted by the `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` the interval unit is 1
millisecond, which should cover a reasonable range of intervals:
 - from 1 millisecond, which does not sound like much of a power-saver
 - to ~65 seconds which is quite a lot to wait for a link to come up when
   plugging a cable
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NAlexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9f2f13f4

net: dsa: Pass ndo_setup_tc slave callback to drivers · 47d23af2

由 Vladimir Oltean 提交于 9月 15, 2019

DSA currently handles shared block filters (for the classifier-action
qdisc) in the core due to what I believe are simply pragmatic reasons -
hiding the complexity from drivers and offerring a simple API for port
mirroring.

Extend the dsa_slave_setup_tc function by passing all other qdisc
offloads to the driver layer, where the driver may choose what it
implements and how. DSA is simply a pass-through in this case.
Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
Acked-by: NKurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Acked-by: NIlias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

47d23af2

taprio: Add support for hardware offloading · 9c66d156

由 Vinicius Costa Gomes 提交于 9月 15, 2019

This allows taprio to offload the schedule enforcement to capable
network cards, resulting in more precise windows and less CPU usage.

The gate mask acts on traffic classes (groups of queues of same
priority), as specified in IEEE 802.1Q-2018, and following the existing
taprio and mqprio semantics.
It is up to the driver to perform conversion between tc and individual
netdev queues if for some reason it needs to make that distinction.

Full offload is requested from the network interface by specifying
"flags 2" in the tc qdisc creation command, which in turn corresponds to
the TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD bit.

The important detail here is the clockid which is implicitly /dev/ptpN
for full offload, and hence not configurable.

A reference counting API is added to support the use case where Ethernet
drivers need to keep the taprio offload structure locally (i.e. they are
a multi-port switch driver, and configuring a port depends on the
settings of other ports as well). The refcount_t variable is kept in a
private structure (__tc_taprio_qopt_offload) and not exposed to drivers.

In the future, the private structure might also be expanded with a
backpointer to taprio_sched *q, to implement the notification system
described in the patch (of when admin became oper, or an error occurred,
etc, so the offload can be monitored with 'tc qdisc show').
Signed-off-by: NVinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: NVoon Weifeng <weifeng.voon@intel.com>
Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9c66d156

cifs: Add support for root file systems · 8eecd1c2

由 Paulo Alcantara (SUSE) 提交于 7月 16, 2019

Introduce a new CONFIG_CIFS_ROOT option to handle root file systems
over a SMB share.

In order to mount the root file system during the init process, make
cifs.ko perform non-blocking socket operations while mounting and
accessing it.

Cc: Steve French <smfrench@gmail.com>
Reviewed-by: NAurelien Aptel <aaptel@suse.com>
Signed-off-by: NPaulo Alcantara (SUSE) <paulo@paulo.ac>
Signed-off-by: NSteve French <stfrench@microsoft.com>

8eecd1c2

16 9月, 2019 12 次提交

tcp: Add snd_wnd to TCP_INFO · 8f7baad7

由 Thomas Higdon 提交于 9月 13, 2019

Neal Cardwell mentioned that snd_wnd would be useful for diagnosing TCP
performance problems --
> (1) Usually when we're diagnosing TCP performance problems, we do so
> from the sender, since the sender makes most of the
> performance-critical decisions (cwnd, pacing, TSO size, TSQ, etc).
> From the sender-side the thing that would be most useful is to see
> tp->snd_wnd, the receive window that the receiver has advertised to
> the sender.

This serves the purpose of adding an additional __u32 to avoid the
would-be hole caused by the addition of the tcpi_rcvi_ooopack field.
Signed-off-by: NThomas Higdon <tph@fb.com>
Acked-by: NYuchung Cheng <ycheng@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8f7baad7

tcp: Add TCP_INFO counter for packets received out-of-order · f9af2dbb

由 Thomas Higdon 提交于 9月 13, 2019

For receive-heavy cases on the server-side, we want to track the
connection quality for individual client IPs. This counter, similar to
the existing system-wide TCPOFOQueue counter in /proc/net/netstat,
tracks out-of-order packet reception. By providing this counter in
TCP_INFO, it will allow understanding to what degree receive-heavy
sockets are experiencing out-of-order delivery and packet drops
indicating congestion.

Please note that this is similar to the counter in NetBSD TCP_INFO, and
has the same name.

Also note that we avoid increasing the size of the tcp_sock struct by
taking advantage of a hole.
Signed-off-by: NThomas Higdon <tph@fb.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f9af2dbb

dm: introduce DM_GET_TARGET_VERSION · afa179eb

由 Mikulas Patocka 提交于 9月 16, 2019

This commit introduces a new ioctl DM_GET_TARGET_VERSION. It will load a
target that is specified in the "name" entry in the parameter structure
and return its version.

This functionality is intended to be used by cryptsetup, so that it can
query kernel capabilities before activating the device.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

afa179eb

bpf: fix accessing bpf_sysctl.file_pos on s390 · d895a0f1

由 Ilya Leoshkevich 提交于 8月 16, 2019

"ctx:file_pos sysctl:read write ok" fails on s390 with "Read value  !=
nux". This is because verifier rewrites a complete 32-bit
bpf_sysctl.file_pos update to a partial update of the first 32 bits of
64-bit *bpf_sysctl_kern.ppos, which is not correct on big-endian
systems.

Fix by using an offset on big-endian systems.

Ditto for bpf_sysctl.file_pos reads. Currently the test does not detect
a problem there, since it expects to see 0, which it gets with high
probability in error cases, so change it to seek to offset 3 and expect
3 in bpf_sysctl.file_pos.

Fixes: e1550bfe ("bpf: Add file_pos field to bpf_sysctl ctx")
Signed-off-by: NIlya Leoshkevich <iii@linux.ibm.com>
Acked-by: NYonghong Song <yhs@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20190816105300.49035-1-iii@linux.ibm.com/

d895a0f1

net: sched: use get_dev() action API in flow_action infra · 470d5060

由 Vlad Buslov 提交于 9月 13, 2019

When filling in hardware intermediate representation tc_setup_flow_action()
directly obtains, checks and takes reference to dev used by mirred action,
instead of using act->ops->get_dev() API created specifically for this
purpose. In order to remove code duplication, refactor flow_action infra to
use action API when obtaining mirred action target dev. Extend get_dev()
with additional argument that is used to provide dev destructor to the
user.

Fixes: 5a6ff4b1 ("net: sched: take reference to action dev before calling offloads")
Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

470d5060

net: sched: take reference to psample group in flow_action infra · 4a5da47d

由 Vlad Buslov 提交于 9月 13, 2019

With recent patch set that removed rtnl lock dependency from cls hardware
offload API rtnl lock is only taken when reading action data and can be
released after action-specific data is parsed into intermediate
representation. However, sample action psample group is passed by pointer
without obtaining reference to it first, which makes it possible to
concurrently overwrite the action and deallocate object pointed by
psample_group pointer after rtnl lock is released but before driver
finished using the pointer.

To prevent such race condition, obtain reference to psample group while it
is used by flow_action infra. Extend psample API with function
psample_group_take() that increments psample group reference counter.
Extend struct tc_action_ops with new get_psample_group() API. Implement the
API for action sample using psample_group_take() and already existing
psample_group_put() as a destructor. Use it in tc_setup_flow_action() to
take reference to psample group pointed to by entry->sample.psample_group
and release it in tc_cleanup_flow_action().

Disable bh when taking psample_groups_lock. The lock is now taken while
holding action tcf_lock that is used by data path and requires bh to be
disabled, so doing the same for psample_groups_lock is necessary to
preserve SOFTIRQ-irq-safety.

Fixes: 918190f5 ("net: sched: flower: don't take rtnl lock for cls hw offloads API")
Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4a5da47d

net: sched: extend flow_action_entry with destructor · 1158958a

由 Vlad Buslov 提交于 9月 13, 2019

Generalize flow_action_entry cleanup by extending the structure with
pointer to destructor function. Set the destructor in
tc_setup_flow_action(). Refactor tc_cleanup_flow_action() to call
entry->destructor() instead of using switch that dispatches by entry->id
and manually executes cleanup.

This refactoring is necessary for following patches in this series that
require destructor to use tc_action->ops callbacks that can't be easily
obtained in tc_cleanup_flow_action().
Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1158958a

udp: correct reuseport selection with connected sockets · acdcecc6

由 Willem de Bruijn 提交于 9月 12, 2019

UDP reuseport groups can hold a mix unconnected and connected sockets.
Ensure that connections only receive all traffic to their 4-tuple.

Fast reuseport returns on the first reuseport match on the assumption
that all matches are equal. Only if connections are present, return to
the previous behavior of scoring all sockets.

Record if connections are present and if so (1) treat such connected
sockets as an independent match from the group, (2) only return
2-tuple matches from reuseport and (3) do not return on the first
2-tuple reuseport match to allow for a higher scoring match later.

New field has_conns is set without locks. No other fields in the
bitmap are modified at runtime and the field is only ever set
unconditionally, so an RMW cannot miss a change.

Fixes: e32ea7e7 ("soreuseport: fast reuseport UDP socket selection")
Link: http://lkml.kernel.org/r/CA+FuTSfRP09aJNYRt04SS6qj22ViiOEWaWmLAwX0psk8-PGNxw@mail.gmail.comSigned-off-by: NWillem de Bruijn <willemb@google.com>
Acked-by: NPaolo Abeni <pabeni@redhat.com>
Acked-by: NCraig Gallek <kraig@google.com>
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

acdcecc6

block: make rq sector size accessible for block stats · 3d244306

由 Hou Tao 提交于 5月 21, 2019

Currently rq->data_len will be decreased by partial completion or
zeroed by completion, so when blk_stat_add() is invoked, data_len
will be zero and there will never be samples in poll_cb because
blk_mq_poll_stats_bkt() will return -1 if data_len is zero.

We could move blk_stat_add() back to __blk_mq_complete_request(),
but that would make the effort of trying to call ktime_get_ns()
once in vain. Instead we can reuse throtl_size field, and use
it for both block stats and block throttle, and adjust the
logic in blk_mq_poll_stats_bkt() accordingly.

Fixes: 4bc6339a ("block: move blk_stat_add() to __blk_mq_end_request()")
Tested-by: NPavel Begunkov <asml.silence@gmail.com>
Signed-off-by: NHou Tao <houtao1@huawei.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3d244306

net/sched: fix race between deactivation and dequeue for NOLOCK qdisc · d518d2ed

由 Paolo Abeni 提交于 9月 12, 2019

The test implemented by some_qdisc_is_busy() is somewhat loosy for
NOLOCK qdisc, as we may hit the following scenario:

CPU1						CPU2
// in net_tx_action()
clear_bit(__QDISC_STATE_SCHED...);
						// in some_qdisc_is_busy()
						val = (qdisc_is_running(q) ||
						       test_bit(__QDISC_STATE_SCHED,
								&q->state));
						// here val is 0 but...
qdisc_run(q)
// ... CPU1 is going to run the qdisc next

As a conseguence qdisc_run() in net_tx_action() can race with qdisc_reset()
in dev_qdisc_reset(). Such race is not possible for !NOLOCK qdisc as
both the above bit operations are under the root qdisc lock().

After commit 021a17ed ("pfifo_fast: drop unneeded additional lock on dequeue")
the race can cause use after free and/or null ptr dereference, but the root
cause is likely older.

This patch addresses the issue explicitly checking for deactivation under
the seqlock for NOLOCK qdisc, so that the qdisc_run() in the critical
scenario becomes a no-op.

Note that the enqueue() op can still execute concurrently with dev_qdisc_reset(),
but that is safe due to the skb_array() locking, and we can't avoid that
for NOLOCK qdiscs.

Fixes: 021a17ed ("pfifo_fast: drop unneeded additional lock on dequeue")
Reported-by: NLi Shuang <shuali@redhat.com>
Reported-and-tested-by: NDavide Caratti <dcaratti@redhat.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d518d2ed

compiler-types.h: add asm_inline definition · eb111869

由 Rasmus Villemoes 提交于 9月 13, 2019

This adds an asm_inline macro which expands to "asm inline" [1] when
the compiler supports it. This is currently gcc 9.1+, gcc 8.3
and (once released) gcc 7.5 [2]. It expands to just "asm" for other
compilers.

Using asm inline("foo") instead of asm("foo") overrules gcc's
heuristic estimate of the size of the code represented by the asm()
statement, and makes gcc use the minimum possible size instead. That
can in turn affect gcc's inlining decisions.

I wasn't sure whether to make this a function-like macro or not - this
way, it can be combined with volatile as

  asm_inline volatile()

but perhaps we'd prefer to spell that

  asm_inline_volatile()

anyway.

The Kconfig logic is taken from an RFC patch by Masahiro Yamada [3].

[1] Technically, asm __inline, since both inline and __inline__
are macros that attach various attributes, making gcc barf if one
literally does "asm inline()". However, the third spelling __inline is
available for referring to the bare keyword.

[2] https://lore.kernel.org/lkml/20190907001411.GG9749@gate.crashing.org/

[3] https://lore.kernel.org/lkml/1544695154-15250-1-git-send-email-yamada.masahiro@socionext.com/Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>

eb111869

compiler_types.h: don't #define __inline · c30724e9

由 Rasmus Villemoes 提交于 9月 13, 2019

The spellings __inline and __inline__ should be reserved for uses
where one really wants to refer to the inline keyword, regardless of
whether or not the spelling "inline" has been #defined to something
else. Due to use of __inline__ in uapi headers, we can't easily get
rid of the definition of __inline__. However, almost all users of
__inline have been converted to inline, so we can get rid of that
#define.

The exception is include/acpi/platform/acintel.h. However, that header
is only included when using the intel compiler (does anybody actually
build the kernel with that?), and the ACPI_INLINE macro is only used
in the definition of utterly trivial stub functions, where I doubt a
small change of semantics (lack of __gnu_inline) changes anything.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
[Fix trivial typo in message]
Signed-off-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>

c30724e9

14 9月, 2019 6 次提交

export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols · 69a94abb

由 Masahiro Yamada 提交于 9月 09, 2019

Arnd Bergmann reported false-positive modpost warnings detected by his
randconfig testing of linux-next.

Actually, this happens under the combination of CONFIG_MODVERSIONS
and CONFIG_TRIM_UNUSED_KSYMS since commit 15bfc234 ("modpost:
check for static EXPORT_SYMBOL* functions").

For example, arch/arm/config/multi_v7_defconfig + CONFIG_MODVERSIONS
+ CONFIG_TRIM_UNUSED_KSYMS produces the following false-positives:

WARNING: "__lshrdi3" [vmlinux] is a static (unknown)
WARNING: "__ashrdi3" [vmlinux] is a static (unknown)
WARNING: "__aeabi_lasr" [vmlinux] is a static (unknown)
WARNING: "__aeabi_llsr" [vmlinux] is a static (unknown)
WARNING: "ftrace_set_clr_event" [vmlinux] is a static (unknown)
WARNING: "__muldi3" [vmlinux] is a static (unknown)
WARNING: "__aeabi_ulcmp" [vmlinux] is a static (unknown)
WARNING: "__ucmpdi2" [vmlinux] is a static (unknown)
WARNING: "__aeabi_lmul" [vmlinux] is a static (unknown)
WARNING: "__bswapsi2" [vmlinux] is a static (unknown)
WARNING: "__bswapdi2" [vmlinux] is a static (unknown)
WARNING: "__ashldi3" [vmlinux] is a static (unknown)
WARNING: "__aeabi_llsl" [vmlinux] is a static (unknown)

The root cause of the problem is not in the modpost, but in the
implementation of CONFIG_TRIM_UNUSED_KSYMS.

If there is at least one untrimmed symbol in the file, genksyms is
invoked to calculate CRC of *all* the exported symbols in that file
even if some of them have been trimmed due to no caller existing.

As a result, .tmp_*.ver files contain CRC of trimmed symbols, thus
unneeded, orphan __crc* symbols are added to objects. It had been
harmless until recently.

With commit 15bfc234 ("modpost: check for static EXPORT_SYMBOL*
functions"), it is now harmful because the bogus __crc* symbols make
modpost call sym_update_crc() to add the symbols to the hash table,
but there is no one that clears the ->is_static member.

I gave Fixes to the first commit that uncovered the issue, but the
potential problem has long existed since commit f2355416
("export.h: allow for per-symbol configurable EXPORT_SYMBOL()").

Fixes: 15bfc234 ("modpost: check for static EXPORT_SYMBOL* functions")
Reported-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: NArnd Bergmann <arnd@arndb.de>

69a94abb

net: devlink: move reload fail indication to devlink core and expose to user · 2670ac26

由 Jiri Pirko 提交于 9月 12, 2019

Currently the fact that devlink reload failed is stored in drivers.
Move this flag into devlink core. Also, expose it to the user.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2670ac26

net: devlink: split reload op into two · 97691069

由 Jiri Pirko 提交于 9月 12, 2019

In order to properly implement failure indication during reload,
split the reload op into two ops, one for down phase and one for
up phase.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

97691069

md: add feature flag MD_FEATURE_RAID0_LAYOUT · 33f2c35a

由 NeilBrown 提交于 9月 09, 2019

Due to a bug introduced in Linux 3.14 we cannot determine the
correctly layout for a multi-zone RAID0 array - there are two
possibilities.

It is possible to tell the kernel which to chose using a module
parameter, but this can be clumsy to use.  It would be best if
the choice were recorded in the metadata.
So add a feature flag for this purpose.
If it is set, then the 'layout' field of the superblock is used
to determine which layout to use.

If this flag is not set, then mddev->layout gets set to -1,
which causes the module parameter to be required.
Acked-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: NNeilBrown <neilb@suse.de>
Signed-off-by: NSong Liu <songliubraving@fb.com>

33f2c35a

IB/{rdmavt, hfi1, qib}: Add a counter for credit waits · 71994354

由 Kaike Wan 提交于 9月 11, 2019

This patch adds a counter for credit waits to assist field debugging.

Link: https://lore.kernel.org/r/20190911113047.126040.10857.stgit@awfm-01.aw.intel.comReviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NKaike Wan <kaike.wan@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

71994354

ip: support SO_MARK cmsg · c6af0c22

由 Willem de Bruijn 提交于 9月 11, 2019

Enable setting skb->mark for UDP and RAW sockets using cmsg.

This is analogous to existing support for TOS, TTL, txtime, etc.

Packet sockets already support this as of commit c7d39e32
("packet: support per-packet fwmark for af_packet sendmsg").

Similar to other fields, implement by
1. initialize the sockcm_cookie.mark from socket option sk_mark
2. optionally overwrite this in ip_cmsg_send/ip6_datagram_send_ctl
3. initialize inet_cork.mark from sockcm_cookie.mark
4. initialize each (usually just one) skb->mark from inet_cork.mark

Step 1 is handled in one location for most protocols by ipcm_init_sk
as of commit 35178206 ("ipv4: ipcm_cookie initializers").
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c6af0c22

13 9月, 2019 12 次提交

PTP: add support for one-shot output · 823eb2a3

由 Felipe Balbi 提交于 9月 11, 2019

Some controllers allow for a one-shot output pulse, in contrast to
periodic output. Now that we have extensible versions of our IOCTLs, we
can finally make use of the 'flags' field to pass a bit telling driver
that if we want one-shot pulse output.
Signed-off-by: NFelipe Balbi <felipe.balbi@linux.intel.com>
Reviewed-by: NRichard Cochran <richardcochran@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

823eb2a3

PTP: introduce new versions of IOCTLs · 41560658

由 Felipe Balbi 提交于 9月 11, 2019

The current version of the IOCTL have a small problem which prevents us
from extending the API by making use of reserved fields. In these new
IOCTLs, we are now making sure that flags and rsv fields are zero which
will allow us to extend the API in the future.
Reviewed-by: NRichard Cochran <richardcochran@gmail.com>
Signed-off-by: NFelipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

41560658

padata: remove cpu_index from the parallel_queue · c51636a3

由 Daniel Jordan 提交于 9月 05, 2019

With the removal of the ENODATA case from padata_get_next, the cpu_index
field is no longer useful, so it can go away.
Signed-off-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: NSteffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

c51636a3

padata: unbind parallel jobs from specific CPUs · bfde23ce

由 Daniel Jordan 提交于 9月 05, 2019

Padata binds the parallel part of a job to a single CPU and round-robins
over all CPUs in the system for each successive job.  Though the serial
parts rely on per-CPU queues for correct ordering, they're not necessary
for parallel work, and it improves performance to run the job locally on
NUMA machines and let the scheduler pick the CPU within a node on a busy
system.

So, make the parallel workqueue unbound.

Update the parallel workqueue's cpumask when the instance's parallel
cpumask changes.

Now that parallel jobs no longer run on max_active=1 workqueues, two or
more parallel works that hash to the same CPU may run simultaneously,
finish out of order, and so be serialized out of order.  Prevent this by
keeping the works sorted on the reorder list by sequence number and
checking that in the reordering logic.

padata_get_next becomes padata_find_next so it can be reused for the end
of padata_reorder, where it's used to avoid uselessly queueing work when
the next job by sequence number isn't finished yet but a later job that
hashed to the same CPU has.

The ENODATA case in padata_find_next no longer makes sense because
parallel jobs aren't bound to specific CPUs.  The EINPROGRESS case takes
care of the scenario where a parallel job is potentially running on the
same CPU as padata_find_next, and with only one error code left, just
use NULL instead.
Signed-off-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

bfde23ce

padata: use separate workqueues for parallel and serial work · 45d153c0

由 Daniel Jordan 提交于 9月 05, 2019

padata currently uses one per-CPU workqueue per instance for all work.

Prepare for running parallel jobs on an unbound workqueue by introducing
dedicated workqueues for parallel and serial work.
Signed-off-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: NSteffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

45d153c0

padata: make padata_do_parallel find alternate callback CPU · e6ce0e08

由 Daniel Jordan 提交于 9月 05, 2019

padata_do_parallel currently returns -EINVAL if the callback CPU isn't
in the callback cpumask.

pcrypt tries to prevent this situation by keeping its own callback
cpumask in sync with padata's and checks that the callback CPU it passes
to padata is valid.  Make padata handle this instead.

padata_do_parallel now takes a pointer to the callback CPU and updates
it for the caller if an alternate CPU is used.  Overall behavior in
terms of which callback CPUs are chosen stays the same.

Prepares for removal of the padata cpumask notifier in pcrypt, which
will fix a lockdep complaint about nested acquisition of the CPU hotplug
lock later in the series.
Signed-off-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: NSteffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

e6ce0e08

workqueue: unconfine alloc/apply/free_workqueue_attrs() · 513c98d0

由 Daniel Jordan 提交于 9月 05, 2019

padata will use these these interfaces in a later patch, so unconfine them.
Signed-off-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: NTejun Heo <tj@kernel.org>
Acked-by: NSteffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

513c98d0

padata: allocate workqueue internally · b128a304

由 Daniel Jordan 提交于 9月 05, 2019

Move workqueue allocation inside of padata to prepare for further
changes to how padata uses workqueues.

Guarantees the workqueue is created with max_active=1, which padata
relies on to work correctly.  No functional change.
Signed-off-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: NSteffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

b128a304

netfilter: conntrack: remove two unused functions from nf_conntrack_timestamp.h. · 0d32e704

由 Jeremy Sowden 提交于 9月 13, 2019

Two inline functions defined in nf_conntrack_timestamp.h,
`nf_ct_tstamp_enabled` and `nf_ct_set_tstamp`, are not called anywhere.
Remove them.
Signed-off-by: NJeremy Sowden <jeremy@azazel.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

0d32e704

netfilter: conntrack: remove CONFIG_NF_CONNTRACK checks from nf_conntrack_zones.h. · 1f1475f3

由 Jeremy Sowden 提交于 9月 13, 2019

nf_conntrack_zones.h was wrapped in a CONFIG_NF_CONNTRACK check in order
to fix compilation failures:

  37ee3d5b ("netfilter: nf_defrag_ipv4: fix compilation error with NF_CONNTRACK=n")

Subsequent changes mean that these failures will no longer occur and the
check is unnecessary.  Remove it.
Signed-off-by: NJeremy Sowden <jeremy@azazel.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

1f1475f3

netfilter: remove CONFIG_NETFILTER checks from headers. · f19438bd

由 Jeremy Sowden 提交于 9月 13, 2019

`struct nf_hook_ops`, `struct nf_hook_state` and the `nf_hookfn`
function typedef appear in function and struct declarations and
definitions in a number of netfilter headers.  The structs and typedef
themselves are defined by linux/netfilter.h but only when
CONFIG_NETFILTER is enabled.  Define them unconditionally and add
forward declarations in order to remove CONFIG_NETFILTER conditionals
from the other headers.
Signed-off-by: NJeremy Sowden <jeremy@azazel.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

f19438bd

netfilter: conntrack: remove CONFIG_NF_CONNTRACK check from nf_conntrack_acct.h. · 51a21be4

由 Jeremy Sowden 提交于 9月 13, 2019

There is a superfluous `#if IS_ENABLED(CONFIG_NF_CONNTRACK)` check
wrapping some function declarations.  Remove it.
Signed-off-by: NJeremy Sowden <jeremy@azazel.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

51a21be4

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功