Commits · c9ebf126f127780a5a125449e4efe6df37b8daa4 · openeuler / Kernel

08 Sep, 2020 · 33 commits

net: dsa: change PHY error message again · c9ebf126

Committed by Vladimir Oltean on Sep 08, 2020

slave_dev->name is only populated at this stage if it was specified
through a label in the device tree. However that is not mandatory.
When it isn't, the error message looks like this:

[    5.037057] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d
[    5.044672] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d
[    5.052275] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d
[    5.059877] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d

which is especially confusing since the error gets printed on behalf of
the DSA master (fsl_enetc in this case).

Printing an error message that contains a valid reference to the DSA
port's name is difficult at this point in the initialization stage, so
at least we should print some info that is more reliable, even if less
user-friendly. That may be the driver name and the hardware port index.

After this change, the error is printed as:

[    6.051587] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 0
[    6.061192] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 1
[    6.070765] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 2
[    6.080324] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 3
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
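
The new message text can be reproduced with a small user-space sketch. This is not the DSA code: `struct port_id` and `format_phy_error()` are hypothetical names, and the sketch only shows that the text is built from indices that are known even when no port name has been assigned yet.

```c
#include <stdio.h>

/* Hypothetical stand-in for the identifiers a DSA port carries. */
struct port_id {
	unsigned int tree;	/* DSA tree index */
	unsigned int sw;	/* switch index within the tree */
	unsigned int port;	/* hardware port index on the switch */
};

/* Build the error text from hardware indices only (no netdev name needed). */
static int format_phy_error(char *buf, size_t len, int err,
			    const struct port_id *id)
{
	return snprintf(buf, len,
			"error %d setting up PHY for tree %u, switch %u, port %u",
			err, id->tree, id->sw, id->port);
}
```

In the kernel the driver name and PCI address prefix come from the device the message is logged against, so they are not part of the formatted string here.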

c9ebf126

net: tighten the definition of interface statistics · 0db0c34c

Committed by Jakub Kicinski on Sep 03, 2020

This patch is born out of an investigation into which IEEE statistics
correspond to which struct rtnl_link_stats64 members. Turns out that
there seems to be reasonable consensus on the matter, among many drivers.
To save others the time (and it took more time than I'm comfortable
admitting) I'm adding comments referring to IEEE attributes to
struct rtnl_link_stats64.

Up until now we had two forms of documentation for stats - in
Documentation/ABI/testing/sysfs-class-net-statistics and the comments
on struct rtnl_link_stats64 itself. While the former is very cautious
in defining the expected behavior, the latter feel quite dated and
may not be easy to understand for a modern-day driver author
(e.g. rx_over_errors). At the same time modern systems are far more
complex and once-obvious definitions have lost their clarity. For example
- does rx_packet count at the MAC layer (aFramesReceivedOK)?
packets processed correctly by hardware? received by the driver?
or maybe received by the stack?

I tried to clarify the expectations, further clarifications from
others are very welcome.

The part hardest to untangle is rx_over_errors vs rx_fifo_errors
vs rx_missed_errors. After much deliberation I concluded that for
modern HW only two of the counters will make sense. The distinction
between internal FIFO overflow and packets dropped due to back-pressure
from the host is likely too implementation (driver and device) specific
to expose in the standard stats.

Now - which two of those counters we select to use is anyone's pick:

sysfs documentation suggests rx_over_errors counts packets which
did not fit into buffers due to MTU being too small, which I reused.
There don't seem to be many modern drivers using it (well, CAN drivers
seem to love this statistic).

Of the remaining two I picked rx_missed_errors to report device drops.
bnxt reports it and it's folded into "drop"s in procfs (while
rx_fifo_errors is an error, and modern devices usually receive the frame
OK, they just can't admit it into the pipeline).

Of the drivers I looked at only AMD Lance-like and NS8390-like use all
three of these counters. rx_missed_errors counts missed frames,
rx_over_errors counts overflow events, and rx_fifo_errors counts frames
which were truncated because they didn't fit into buffers. This suggests
that rx_fifo_errors may be the correct stat for truncated packets, but
I'd think a FIFO stat counting truncated packets would be very confusing
to a modern reader.

v2:
 - add driver developer notes about ethtool stat count and reset
 - replace Ethernet with IEEE 802.3 to better indicate source of attrs
 - mention byte counters don't count FCS
 - clarify RX counter is from device to host
 - drop "sightly" from sysfs paragraph
 - add examples of ethtool stats
 - s/incoming/received/ s/incoming/transmitted/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
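
As a rough illustration of the counter selection argued for above, the following user-space sketch maps a set of device counters onto the three RX error fields. The `hw_counters` names are invented for the example, `struct link_stats` only mirrors a few field names of struct rtnl_link_stats64, and leaving rx_fifo_errors at zero is one reading of the message rather than a rule it states.

```c
#include <stdint.h>

/* Hypothetical device counters; the names are illustrative only. */
struct hw_counters {
	uint64_t rx_frames_ok;		/* frames delivered to the host */
	uint64_t rx_dropped_no_room;	/* received OK but not admitted into the pipeline */
	uint64_t rx_too_long_for_buf;	/* frames that did not fit into receive buffers */
};

/* Subset of fields, named after members of struct rtnl_link_stats64. */
struct link_stats {
	uint64_t rx_packets;
	uint64_t rx_missed_errors;
	uint64_t rx_over_errors;
	uint64_t rx_fifo_errors;
};

static void fill_link_stats(const struct hw_counters *hw, struct link_stats *st)
{
	st->rx_packets = hw->rx_frames_ok;
	st->rx_missed_errors = hw->rx_dropped_no_room;	/* device drops */
	st->rx_over_errors = hw->rx_too_long_for_buf;	/* buffers too small for the frame */
	st->rx_fifo_errors = 0;				/* not used in this sketch */
}
```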

0db0c34c

rxrpc: Remove unused macro rxrpc_min_rtt_wlen · 81365af1

Committed by Wang Hai on Sep 04, 2020

rxrpc_min_rtt_wlen is never used after it was introduced.
So better to remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

81365af1

Merge branch 'sfc-ethtool-for-EF100-and-related-improvements' · 14e9e262

Committed by Jakub Kicinski on Sep 07, 2020

Edward Cree says:

====================
sfc: ethtool for EF100 and related improvements

This series adds the ethtool support to the EF100 driver that was held
 back from the original submission as the lack of phy_ops caused issues.
Patch #2, removing the phy_op indirection, deals with this.  There are a
 lot of checkpatch warnings / xmastree violations but they're all in
 pure code movement so I've left the code as it is.
While patch #1 is technically a fix and possibly could go to 'net', I've
 put it in this series since it only becomes triggerable with the added
 'ethtool --reset' support.
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

14e9e262

sfc: simplify DMA mask setting · 08bdbcae

Committed by Edward Cree on Sep 07, 2020

Christoph says[1] that dma_set_mask_and_coherent() is smart enough to
 truncate the mask itself if it's too long.  So we can get rid of our
 "lop off one bit and retry" loop in efx_init_io().

[1]: https://www.spinics.net/lists/netdev/msg677266.html
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
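
The difference between the removed retry loop and a single call can be simulated in user space. Everything below is a stand-in: `strict_set_mask()` and `tolerant_set_mask()` are hypothetical setters, `PLATFORM_DMA_BITS` is an arbitrary limit, and the 31-bit lower bound of the loop is an assumption. None of it is the kernel DMA API.

```c
#include <stdint.h>

/* Hypothetical platform limit, used only by this simulation. */
#define PLATFORM_DMA_BITS 40

/* Mask with the low n bits set; valid for n < 64. */
static uint64_t mask_bits(unsigned int n)
{
	return (UINT64_C(1) << n) - 1;
}

/* Simulated strict setter: rejects any mask wider than the platform supports. */
static int strict_set_mask(uint64_t mask, uint64_t *applied)
{
	if (mask > mask_bits(PLATFORM_DMA_BITS))
		return -1;
	*applied = mask;
	return 0;
}

/* Simulated tolerant setter: truncates an over-wide mask by itself. */
static int tolerant_set_mask(uint64_t mask, uint64_t *applied)
{
	*applied = mask & mask_bits(PLATFORM_DMA_BITS);
	return 0;
}

/* Old pattern: lop off one bit and retry until the setter accepts the mask. */
static int set_mask_with_retry(uint64_t mask, uint64_t *applied)
{
	while (mask > UINT64_C(0x7fffffff)) {
		if (strict_set_mask(mask, applied) == 0)
			return 0;
		mask >>= 1;
	}
	return -1;
}
```

With a tolerant setter, both paths end up with the same effective mask, which is why the loop adds nothing.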

08bdbcae

sfc: remove EFX_DRIVER_VERSION · 60bd2a2d

Committed by Edward Cree on Sep 07, 2020

Per-module versions for in-tree drivers are deprecated.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

60bd2a2d

sfc: handle limited FEC support · 400d64cf

Committed by Edward Cree on Sep 07, 2020

If the reported PHY capabilities do not include a given FEC mode, don't
 attempt to select that FEC mode anyway.  If the user tries to set a mode
 through ethtool that is not supported, return an error.
The _REQUESTED bits don't appear in the supported caps, but are implied
 by the corresponding FEC bits.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
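
The capability check described above comes down to a bitmask test. A minimal sketch, assuming hypothetical capability bits and `-EINVAL` as the error (the message only says that an error is returned, not which one):

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical capability bits; the real driver uses its own definitions. */
#define CAP_FEC_BASER	(UINT32_C(1) << 0)
#define CAP_FEC_RS	(UINT32_C(1) << 1)

/* Return 0 if every requested FEC mode is supported, -EINVAL otherwise. */
static int check_fec_request(uint32_t supported, uint32_t requested)
{
	if (requested & ~supported)
		return -EINVAL;
	return 0;
}
```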

400d64cf

sfc: add ethtool ops and miscellaneous ndos to EF100 · 4404c089

Committed by Edward Cree on Sep 07, 2020

Mostly just calls to existing common functions.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

4404c089

sfc: remove phy_op indirection · c77289b4

Committed by Edward Cree on Sep 07, 2020

Originally there were several implementations of PHY operations for the
 several different PHYs used on Falcon boards.  But Falcon is now in a
 separate driver, and all sfc NICs since then have had MCDI-managed PHYs.
Thus, there is no need to indirect through function pointers in
 efx->phy_op; we can simply call the efx_mcdi_phy_* functions directly.

This also hooks up these functions for EF100, which was previously using
 the dummy_phy_ops.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

c77289b4

sfc: don't double-down() filters in ef100_reset() · 7dcc9d8a

Committed by Edward Cree on Sep 07, 2020

dev_close(), by way of ef100_net_stop(), already brings down the filter
 table, so there's no need to do it again (which just causes lots of
 WARN_ONs).
Similarly, don't bring it up ourselves, as dev_open() -> ef100_net_open()
 will do it, and will fail if it's already been brought up.

Fixes: a9dc3d56 ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

7dcc9d8a

net: ethernet: dnet: Remove set but unused variable 'len' · 30ebaf8e

Committed by Wang Hai on Sep 07, 2020

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/dnet.c: In function dnet_start_xmit
drivers/net/ethernet/dnet.c:511:15: warning: variable ‘len’ set but not used [-Wunused-but-set-variable]

commit 47964174 ("dnet: Dave DNET ethernet controller driver (updated)")
introduced this unused variable, remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

30ebaf8e

net: ethernet: dwmac: remove redundant null check before clk_disable_unprepare() · f3b11449

Committed by Zhang Changzhong on Sep 07, 2020

clk_prepare_enable() and clk_disable_unprepare() already check for a
NULL clock parameter, so the additional checks are unnecessary; just
remove them.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
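
The cleanup pattern can be shown with a user-space mock. `struct mock_clk` and `mock_clk_disable_unprepare()` are hypothetical stand-ins that only model the NULL tolerance described in the message; they are not the kernel clock API.

```c
#include <stddef.h>

/* Minimal mock of a clock handle; not the kernel's struct clk. */
struct mock_clk {
	int enable_count;
};

/* Mock helper that tolerates a NULL clock, like the helpers named above. */
static void mock_clk_disable_unprepare(struct mock_clk *clk)
{
	if (!clk)
		return;
	clk->enable_count--;
}

/* Caller after the cleanup: no NULL check is needed at the call site. */
static void driver_teardown(struct mock_clk *optional_clk)
{
	mock_clk_disable_unprepare(optional_clk);
}
```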

f3b11449

net: ethernet: fec: remove redundant null check before clk_disable_unprepare() · 05891200

Committed by Zhang Changzhong on Sep 07, 2020

clk_prepare_enable() and clk_disable_unprepare() already check for a
NULL clock parameter, so the additional checks are unnecessary; just
remove them.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

05891200

net: stmmac: remove redundant null check before clk_disable_unprepare() · 1c35cc9c

Committed by Zhang Changzhong on Sep 07, 2020

clk_prepare_enable() and clk_disable_unprepare() already check for a
NULL clock parameter, so the additional checks are unnecessary; just
remove them.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

1c35cc9c

net: xilinx: remove redundant null check before clk_disable_unprepare() · e50fd9b5

Committed by Zhang Changzhong on Sep 07, 2020

clk_prepare_enable() and clk_disable_unprepare() already check for a
NULL clock parameter, so the additional checks are unnecessary; just
remove them.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

e50fd9b5

Merge branch 'net-bridge-mcast-initial-IGMPv3-MLDv2-support-part-1' · 6af52ae2

Committed by Jakub Kicinski on Sep 07, 2020

Nikolay Aleksandrov says:

====================
net: bridge: mcast: initial IGMPv3/MLDv2 support (part 1)

This patch-set implements the control plane for initial IGMPv3/MLDv2
support which takes care of include/exclude sets and state transitions
based on the different report types.
Patch 01 arranges the structure better by moving the frequently used
fields together, patch 02 factors out the port group deletion code which is
used in a few places. Patches 03 and 04 add support for source lists and
group modes per port group which are dumped. Patch 05 adds support for
group-and-source specific queries required for IGMPv3/MLDv2. Then patch 06
adds support for group and group-and-source query retransmissions via a new
rexmit timer. Patches 07 and 08 make use of the already present mdb fill
functions when sending notifications so we can have the full mdb entries'
state filled in (with sources, mode etc). Patch 09 takes care of port group
expiration, it switches the group mode to include and deletes it if there
are no sources with active timers. Patches 10-13 are the core changes which
add support for IGMPv3/MLDv2 reports and handle the source list set
operations as per RFCs 3376 and 3810, all IGMPv3/MLDv2 report types with
their transitions should be supported after these patches. I've used RFCs
3376, 3810 and FRR as a reference implementation. The source lists are
capped at 32 entries, we can remove that limitation at a later point which
would require a better data structure to hold them. IGMPv3 processing is
hidden behind the bridge's multicast_igmp_version option which must be set
to 3 in order to enable it. MLDv2 processing is hidden behind the bridge's
multicast_mld_version which must be set to 2 in order to enable it.
Patch 14 improves other querier processing a bit (more about this below).
And finally patch 15 transforms the src gc so it can be used with all mcast
objects since now we have multiple timers that can be running and we
need to make sure they have all finished before freeing the objects.
This is part 1, it only adds control plane support and doesn't change
the fast path. A following patch-set will take care of that.

Here're the sets that will come next (in order):
 - Fast path patch-set which adds support for (S, G) mdb entries needed
   for IGMPv3/MLDv2 forwarding, entry add source (kernel, user-space etc)
   needed for IGMPv3/MLDv2 entry management, entry block mode needed for
   IGMPv3/MLDv2 exclude mode. This set will also add iproute2 support for
   manipulating and showing all the new state.
 - Selftests patches which will verify all state transitions and forwarding
 - Explicit host tracking patch-set, needed for proper fast leave and
   with it fast leave will be enabled for IGMPv3/MLDv2

Not implemented yet:
 - Host IGMPv3/MLDv2 filter support (currently we handle only join/leave
   as before)
 - Proper other querier source timer and value updates
 - IGMPv3/v2 MLDv2/v1 compat (I have a few rough patches for this one)

v4: move old patch 05 to 02 (group del patch), before src lists
    patch 02: set pg's fast leave flag when deleting due to fast leave
    patch 03: now can use the new port del function
              add igmpv2/mldv1 bool which are set when the entry is
              added in those modes (later will be passed as update_timer)
    patch 10: rename update_timer to igmpv2_mldv1 and use the passed
              value from br_multicast_add_group's callers
v3: add IPv6/MLDv2 support, most patches are changed
v2:
 patches 03-04: make src lists RCU friendly so they can be traversed
                when dumping, reduce limit to a more conservative 32
                src group entries for a start
 patches 11-13: remove helper and directly do bitops
 patch      15: force mcast gc on bridge port del to make sure port
                group timers have finished before freeing the port
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

6af52ae2

net: bridge: mcast: destroy all entries via gc · e12cec65

Committed by Nikolay Aleksandrov on Sep 07, 2020

Since each entry type has timers that can be running simultaneously we need
to make sure that entries are not freed before their timers have finished.
In order to do that generalize the src gc work to mcast gc work and use a
callback to free the entries (mdb, port group or src).

v3: add IPv6 support
v2: force mcast gc on port del to make sure all port group timers have
    finished before freeing the bridge port
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
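
The idea of one gc list shared by several object types, each freed through its own callback, can be sketched in user space as follows. All names (`gc_entry`, `gc_queue()`, `gc_run()`, `src_entry`) are hypothetical, and the sketch leaves out the part that matters most in the kernel: waiting for an entry's timers to finish before its callback runs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Generic gc hook embedded in every object type that needs deferred freeing. */
struct gc_entry {
	struct gc_entry *next;
	void (*destroy)(struct gc_entry *gc);
};

struct gc_list {
	struct gc_entry *head;
};

/* Example object type; a real destroy callback would first stop its timers. */
struct src_entry {
	int id;
	struct gc_entry gc;
};

static int destroyed_count;

/* Recover the enclosing object from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void src_destroy(struct gc_entry *gc)
{
	struct src_entry *src = container_of(gc, struct src_entry, gc);

	destroyed_count++;
	free(src);
}

/* Queue an entry for deferred destruction. */
static void gc_queue(struct gc_list *list, struct gc_entry *gc)
{
	gc->next = list->head;
	list->head = gc;
}

/* Detach the whole list, then free each entry through its own callback. */
static void gc_run(struct gc_list *list)
{
	struct gc_entry *gc = list->head;

	list->head = NULL;
	while (gc) {
		struct gc_entry *next = gc->next;

		gc->destroy(gc);
		gc = next;
	}
}
```

Because the list only knows about `struct gc_entry`, adding another object type means embedding the hook and supplying a different `destroy` callback.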

e12cec65

net: bridge: mcast: improve IGMPv3/MLDv2 query processing · 23550b83

Committed by Nikolay Aleksandrov on Sep 07, 2020

When an IGMPv3/MLDv2 query is received and we're operating in such mode
then we need to avoid updating group timers if the suppress flag is set.
Also we should update only timers for groups in exclude mode.

v3: add IPv6/MLDv2 support
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

23550b83

net: bridge: mcast: support for IGMPV3/MLDv2 BLOCK_OLD_SOURCES report · 109865fe

Committed by Nikolay Aleksandrov on Sep 07, 2020

We already have all necessary helpers, so process IGMPv3/MLDv2
BLOCK_OLD_SOURCES as per the RFCs.

v3: add IPv6/MLDv2 support
v2: directly do flag bit operations
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

109865fe

net: bridge: mcast: support for IGMPV3/MLDv2 CHANGE_TO_INCLUDE/EXCLUDE report · 5bf1e00b

Committed by Nikolay Aleksandrov on Sep 07, 2020

In order to process IGMPv3/MLDv2 CHANGE_TO_INCLUDE/EXCLUDE report types we
need new helpers which allow us to mark entries based on their timer
state and to query only marked entries.

v3: add IPv6/MLDv2 support, fix other_query checks
v2: directly do flag bit operations
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5bf1e00b

net: bridge: mcast: support for IGMPV3/MLDv2 MODE_IS_INCLUDE/EXCLUDE report · e6231bca

Committed by Nikolay Aleksandrov on Sep 07, 2020

In order to process IGMPv3/MLDv2 MODE_IS_INCLUDE/EXCLUDE report types we
need some new helpers which allow us to set/clear flags for all current
entries and later delete marked entries after the report sources have been
processed.

v3: add IPv6/MLDv2 support
v2: drop flag helpers and directly do flag bit operations
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

e6231bca

net: bridge: mcast: support for IGMPv3/MLDv2 ALLOW_NEW_SOURCES report · 0436862e

Committed by Nikolay Aleksandrov on Sep 07, 2020

This patch adds handling for the ALLOW_NEW_SOURCES IGMPv3/MLDv2 report
types and limits them only when multicast_igmp_version == 3 or
multicast_mld_version == 2 respectively. Now that IGMPv3/MLDv2 handling
functions will be managing timers we need to delay their activation, thus
a new argument is added which controls if the timer should be updated.
We also disable host IGMPv3/MLDv2 handling as it's not yet implemented and
could cause inconsistent group state, the host can only join a group as
EXCLUDE {} or leave it.

v4: rename update_timer to igmpv2_mldv1 and use the passed value from
    br_multicast_add_group's callers
v3: Add IPv6/MLDv2 support
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

0436862e

net: bridge: mcast: delete expired port groups without srcs · d6c33d67

Committed by Nikolay Aleksandrov on Sep 07, 2020

If an expired port group is in EXCLUDE mode, then we have to turn it
into INCLUDE mode, remove all srcs with zero timer and finally remove
the group itself if there are no more srcs with an active timer.
For IGMPv2 use there would be no sources, so this will reduce to just
removing the group as before.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
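
The expiry rule above can be written down as a small user-space model. The types and the function `port_group_expired()` are hypothetical; timers are reduced to a boolean, and the array bound of 32 simply reuses the source-list cap mentioned elsewhere in this series.

```c
#include <stdbool.h>
#include <stddef.h>

enum filter_mode { MODE_INCLUDE, MODE_EXCLUDE };

#define MAX_SRCS 32

struct src {
	unsigned int addr;
	bool timer_active;	/* stands in for a pending source timer */
};

struct port_group {
	enum filter_mode mode;
	struct src srcs[MAX_SRCS];
	size_t nsrcs;
};

/*
 * Handle group timer expiry: switch to INCLUDE mode, drop sources whose
 * timer is no longer active, and report whether the group should be deleted.
 */
static bool port_group_expired(struct port_group *pg)
{
	size_t i, kept = 0;

	pg->mode = MODE_INCLUDE;
	for (i = 0; i < pg->nsrcs; i++) {
		if (pg->srcs[i].timer_active)
			pg->srcs[kept++] = pg->srcs[i];
	}
	pg->nsrcs = kept;

	return kept == 0;
}
```

A group with no sources at all, as in the IGMPv2 case, takes the same path and is reported for deletion.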

d6c33d67

net: bridge: mdb: use mdb and port entries in notifications · 81f19838

Committed by Nikolay Aleksandrov on Sep 07, 2020

We have to use mdb and port entries when sending mdb notifications in
order to fill in all group attributes properly. Before this change we
would've used a fake br_mdb_entry struct to fill in only partial
information about the mdb. Now we can also reuse the mdb dump fill
function and thus have only a single central place which fills the mdb
attributes.

v3: add IPv6 support
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

81f19838

net: bridge: mdb: push notifications in __br_mdb_add/del · 79abc875

Committed by Nikolay Aleksandrov on Sep 07, 2020

This change is in preparation for using the mdb port group entries when
sending a notification, so their full state and additional attributes can
be filled in.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

79abc875

net: bridge: mcast: add support for group query retransmit · 42c11ccf

Committed by Nikolay Aleksandrov on Sep 07, 2020

We need to be able to retransmit group-specific and group-and-source
specific queries. The new timer takes care of those.

v3: add IPv6 support
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

42c11ccf

net: bridge: mcast: add support for group-and-source specific queries · 438ef2d0

Committed by Nikolay Aleksandrov on Sep 07, 2020

Allows br_multicast_alloc_query to build queries with the port group's
source lists and sends a query for sources over and under lmqt when
necessary as per RFCs 3376 and 3810 with the suppress flag set
appropriately.

v3: add IPv6 support
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

438ef2d0

net: bridge: mcast: add support for src list and filter mode dumping · 5205e919

Committed by Nikolay Aleksandrov on Sep 07, 2020

Support per port group src list (address and timer) and filter mode
dumping. Protected by either multicast_lock or rcu.

v3: add IPv6 support
v2: require RCU or multicast_lock to traverse src groups
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5205e919

net: bridge: mcast: add support for group source list · 8b671779

Committed by Nikolay Aleksandrov on Sep 07, 2020

Initial functions for group source lists which are needed for IGMPv3
and MLDv2 include/exclude lists. Both IPv4 and IPv6 sources are supported.
User-added mdb entries are created with exclude filter mode, we can
extend that later to allow user-supplied mode. When group src entries
are deleted, they're freed from a workqueue to make sure their timers
are not still running. Source entries are protected by the multicast_lock
and rcu. The number of src groups per port group is limited to 32.

v4: use the new port group del function directly
    add igmpv2/mldv1 bool to denote if the entry was added in those
    modes, it will later replace the old update_timer bool
v3: add IPv6 support
v2: allow src groups to be traversed under rcu
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

8b671779

net: bridge: mcast: factor out port group del · 681590bd

Committed by Nikolay Aleksandrov on Sep 07, 2020

In order to avoid future errors and reduce code duplication we should
factor out the port group del sequence. This allows us to have one
function which takes care of all details when removing a port group.

v4: set pg's fast leave flag when deleting due to fast leave
    move the patch before adding source lists
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

681590bd

net: bridge: mdb: arrange internal structs so fast-path fields are close · 6ec0d0ee

Committed by Nikolay Aleksandrov on Sep 07, 2020

Before this patch we'd need 2 cache lines for fast-path, now all used
fields are in the first cache line.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
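
Field placement of this kind can be checked at compile time. The sketch below assumes a 64-byte cache line and uses an invented layout, `struct mdb_like_entry`, which is not the bridge's real structure; it only shows how grouping the hot fields at the start keeps them inside the first line.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed cache line size */

/* Hypothetical entry layout: fields read on every packet come first. */
struct mdb_like_entry {
	/* fast path */
	void *ports;
	uint8_t addr[24];
	uint8_t host_joined;
	/* slow path: management state that forwarding never touches */
	uint8_t timer_state[80];
	void *notify_ctx;
};

/* Offset of the first byte past the last fast-path field. */
static size_t fast_path_end(void)
{
	return offsetof(struct mdb_like_entry, host_joined) + sizeof(uint8_t);
}

/* Fail the build if a later change pushes a hot field out of the first line. */
_Static_assert(offsetof(struct mdb_like_entry, host_joined) + sizeof(uint8_t)
	       <= CACHE_LINE,
	       "fast-path fields must fit in the first cache line");
```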

6ec0d0ee

net: dsa: rtl8366rb: Switch to phylink · bb1416ad

Committed by Linus Walleij on Sep 06, 2020

This switches the RTL8366RB over to using phylink callbacks
instead of .adjust_link(). This is a pretty template
switchover. All we adjust is the CPU port so that is why
the code only inspects this port.

We enhance by adding proper error messages, also disabling
the CPU port on the way down and moving dev_info() to
dev_dbg().
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bb1416ad

tipc: fix a deadlock when flushing scheduled work · d966ddcc

Committed by Hoang Huu Le on Sep 07, 2020

In commit fdeba99b
("tipc: fix use-after-free in tipc_bcast_get_mode"), we tried
to make sure the tipc_net_finalize_work work item had finished if it
was enqueued. But calling flush_scheduled_work() affects not just the
above work item but also every other scheduled work item. This turned out
to be overkill and caused a deadlock, as syzbot reported:

======================================================
WARNING: possible circular locking dependency detected
5.9.0-rc2-next-20200828-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u4:6/349 is trying to acquire lock:
ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: flush_workqueue+0xe1/0x13e0 kernel/workqueue.c:2777

but task is already holding lock:
ffffffff8a879430 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x9b/0xb10 net/core/net_namespace.c:565

[...]
 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(pernet_ops_rwsem);
                               lock(&sb->s_type->i_mutex_key#13);
                               lock(pernet_ops_rwsem);
  lock((wq_completion)events);

 *** DEADLOCK ***
[...]

v1:
To fix the original issue, we replace the above call by introducing
a bit flag. When a namespace is cleaned up, the bit flag is set to zero and:
- the tipc_net_finalize function just returns immediately.
- tipc_net_finalize_work is not enqueued into the scheduled work queue.

v2:
Use the cancel_work_sync() helper to make sure ONLY the
tipc_net_finalize_work() work item is stopped before releasing the
bcbase object.

Reported-by: syzbot+d5aa7e0385f6a5d0f4fd@syzkaller.appspotmail.com
Fixes: fdeba99b ("tipc: fix use-after-free in tipc_bcast_get_mode")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

d966ddcc

07 Sep, 2020 · 5 commits

enic: switch from 'pci_' to 'dma_' API · 02a20d4f

Committed by Christophe JAILLET on Sep 06, 2020

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'vnic_dev_classifier()', 'vnic_dev_fw_info()',
'vnic_dev_notify_set()' and 'vnic_dev_stats_dump()' (vnic_dev.c) GFP_ATOMIC
must be used because its callers take a spinlock before calling these
functions.

When memory is allocated in '__enic_set_rsskey()' and 'enic_set_rsscpu()'
GFP_ATOMIC must be used because they can be called with a spinlock.
The call chain is:
  enic_reset                         <-- takes 'enic->enic_api_lock'
    --> enic_set_rss_nic_cfg
      --> enic_set_rsskey
        --> __enic_set_rsskey        <-- uses dma_alloc_coherent
      --> enic_set_rsscpu            <-- uses dma_alloc_coherent

When memory is allocated in 'vnic_dev_init_prov2()' GFP_ATOMIC must be used
because a spinlock is hidden in the ENIC_DEVCMD_PROXY_BY_INDEX macro, when
this function is called in 'enic_set_port_profile()'.

When memory is allocated in 'vnic_dev_alloc_desc_ring()' GFP_KERNEL can be
used because it is only called from 5 functions ('vnic_dev_init_devcmd2()',
'vnic_cq_alloc()', 'vnic_rq_alloc()', 'vnic_wq_alloc()' and
'enic_wq_devcmd2_alloc()').

  'vnic_dev_init_devcmd2()': already uses GFP_KERNEL and no lock is taken
     in between.
  'enic_wq_devcmd2_alloc()': is called from 'vnic_dev_init_devcmd2()'
     which already uses GFP_KERNEL and no lock is taken in between.
  'vnic_cq_alloc()', 'vnic_rq_alloc()', 'vnic_wq_alloc()': are called
     from 'enic_alloc_vnic_resources()'
'enic_alloc_vnic_resources()' has only 2 call chains:

  1) enic_probe
      --> enic_dev_init
        --> enic_alloc_vnic_resources
'enic_probe()' is a probe function and no lock is taken in between

  2) enic_set_ringparam
      --> enic_alloc_vnic_resources
'enic_set_ringparam()' is a .set_ringparam function (see struct
ethtool_ops). It seems to only take a mutex and no spinlock.

So all paths are safe to use GFP_KERNEL.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

02a20d4f

net: gemini: Clean up phy registration · 3e813d61

Committed by Linus Walleij on Sep 06, 2020

It's nice if the phy is online before we register the netdev
so try to do that first.

Stop making a second attempt to register the phy, it
works perfectly fine the first time.

Stop removing the phy in uninit. Remove it when the
driver is remove():d, symmetric to where it is added, in
probe().
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

3e813d61

net: Add a missing word · ee1a4c84

Committed by Jonathan Neuschäfer on Sep 05, 2020

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ee1a4c84

net: dsa: rtl8366rb: Support setting MTU · 5f4a8ef3

Committed by Linus Walleij on Sep 05, 2020

This implements the missing MTU setting for the RTL8366RB
switch.

Apart from supporting jumbo frames, this rids us of annoying
boot messages like this:
realtek-smi switch: nonfatal error -95 setting MTU on port 0
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5f4a8ef3

net/packet: Remove unused macro BLOCK_PRIV · 383e3f3e

Committed by Wang Hai on Sep 05, 2020

BLOCK_PRIV is never used after it was introduced.
So better to remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

383e3f3e

06 Sep, 2020 · 2 commits

NFC: digital: Remove two unused macroes · be239c4d

Committed by Wang Hai on Sep 04, 2020

DIGITAL_NFC_DEP_REQ_RES_TAILROOM is never used after it was introduced.
DIGITAL_NFC_DEP_REQ_RES_HEADROOM is no longer used after
commit e8e7f421 ("NFC: digital: Remove useless call to skb_reserve()").
Remove them.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

be239c4d

caif: Remove duplicate macro SRVL_CTRL_PKT_SIZE · 877c3474

Committed by Wang Hai on Sep 04, 2020

Remove SRVL_CTRL_PKT_SIZE which is defined more than once.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

877c3474

openeuler / Kernel · synced successfully nearly 2 years ago