- 16 December 2017, 6 commits
-
-
Committed by Xin Long
generate_ftsn is added as a member of sctp_stream_interleave, used to create a fwdtsn or ifwdtsn chunk from the abandoned chunks; it is called in sctp_retransmit and sctp_outq_sack. sctp_generate_iftsn works for ifwdtsn, and sctp_generate_fwdtsn is still used for making fwdtsn. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
sctp_ifwdtsn_skip, sctp_ifwdtsn_hdr and sctp_ifwdtsn_chunk are used to define and parse the I-FORWARD-TSN chunk format, and sctp_make_ifwdtsn is a function to build the chunk. The I-FORWARD-TSN chunk format is defined in section 2.3.1 of RFC 8260. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
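As a rough illustration of the layout RFC 8260 section 2.3.1 describes, a sketch is given below: the chunk carries a new cumulative TSN followed by per-stream skip entries. The struct and field names here are assumptions for illustration, not copied from the patch.

    /* Hedged sketch of the I-FORWARD-TSN layout per RFC 8260 sec. 2.3.1. */
    struct ifwdtsn_skip_sketch {
        __be16 stream;      /* stream identifier */
        __u8   reserved;
        __u8   flags;       /* lowest bit (U) marks an unordered message */
        __be32 mid;         /* message identifier to skip up to */
    };

    struct ifwdtsn_hdr_sketch {
        __be32 new_cum_tsn;                 /* new cumulative TSN */
        struct ifwdtsn_skip_sketch skip[];  /* one entry per skipped stream */
    };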
-
Committed by Russell King
Add support for SFF modules, which are soldered-down SFP modules. These have a different phys_id value, and also omit the present and rate-select signals compared with their socketed counterparts. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by William Tu
Similar to the ipv4 erspan support, this patch adds erspan version 2 to the ip6erspan tunnel. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by William Tu
The patch adds support for erspan version 2. Not all features are supported in this patch: the SGT (security group tag), GRA (timestamp granularity) and FT (frame type) are set to fixed values, and only the hardware ID and direction are configurable. The optional subheader is also not supported. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by William Tu
The patch refactors the existing erspan implementation in order to support erspan version 2, which has additional metadata. So, instead of having one 'struct erspanhdr' holding erspan version 1, it is broken into 'struct erspan_base_hdr' and 'struct erspan_metadata'. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
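A simplified sketch of the split described above follows; field packing is deliberately simplified (the real headers use bitfields for version/VLAN/session bits) and the names are assumptions rather than the patch's definitions.

    struct erspan_base_hdr_sketch {        /* fields shared by v1 and v2 */
        __u8   ver;                        /* ERSPAN version (1 or 2) */
        __be16 session_id;
    };

    struct erspan_metadata_sketch {        /* version-specific part */
        int version;
        union {
            __be32 index;                  /* version 1: port index */
            struct {
                __u8 hwid;                 /* hardware ID (configurable) */
                __u8 dir;                  /* direction (configurable) */
            } md2;                         /* version 2 metadata */
        } u;
    };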
-
- 14 December 2017, 10 commits
-
-
Committed by Florian Fainelli
phylink_get_fixed_state() currently consults an optional "link_gpio" GPIO descriptor; expand this mechanism to allow specifying a custom callback. This is necessary to support out-of-band link notification (e.g. from an interrupt within an MMIO register). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
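A hedged sketch of what such a callback can look like from a MAC driver's side; the registration helper name and its signature below are assumptions based on the commit description, and the register/priv names are invented for illustration.

    /* Hypothetical callback reading link state out of band, e.g. from an
     * MMIO register latched by an interrupt. */
    static void my_mac_fixed_state(struct net_device *ndev,
                                   struct phylink_link_state *state)
    {
        struct my_mac_priv *priv = netdev_priv(ndev);   /* assumed priv */

        state->link = !!(readl(priv->base + MY_LINK_STATUS) & MY_LINK_UP);
    }

    /* Registration; the exact helper name is an assumption. */
    phylink_fixed_state_cb(priv->phylink, my_mac_fixed_state);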
-
Committed by Florian Fainelli
In order to let subsystems like DSA fully utilize PHYLINK, we need to be able to communicate phy_device::flags from of_phy_{connect,attach} even when using PHYLINK APIs. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yuchung Cheng
Prior to this patch, active Fast Open is paused on a specific destination IP address if previous connections to that IP address have experienced recurring timeouts. But recent experiments by Microsoft (https://goo.gl/cykmn7) and Mozilla browsers indicate the issue is often caused by broken middle-boxes sitting close to the client. Therefore it is a much better user experience if Fast Open is disabled outright globally, to avoid experiencing further timeouts on connections toward other destinations. This patch changes the destination-IP disablement to global disablement when a connection experiences recurring timeouts or aborts due to timeout. Repeated incidents still exponentially increase the pause time, starting from an hour. This is extremely conservative but an unfortunate compromise to minimize bad experience due to broken middle-boxes. Reported-by: Dragana Damjanovic <ddamjanovic@mozilla.com> Reported-by: Patrick McManus <mcmanus@ducksong.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Wei Wang <weiwan@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
In commit 3a9b76fd ("tcp: allow drivers to tweak TSQ logic") I gave a code sample to set sk->sk_pacing_shift that was not complete. Better to add a helper that can be used by drivers without worries, and maybe amended in the future. A wifi driver might use it from its ndo_start_xmit(). The following call would set up TCP to allow up to ~8 ms of queued data per flow: sk_pacing_shift_update(skb->sk, 7); Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
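A minimal sketch of what such a helper can look like, assuming it sits alongside the other socket helpers; the actual definition may differ in detail.

    /* Sketch only: update the per-socket TSQ pacing shift, skipping
     * non-full sockets (timewait/request socks) and no-op updates. */
    static inline void sk_pacing_shift_update(struct sock *sk, int val)
    {
        if (!sk || !sk_fullsock(sk) || sk->sk_pacing_shift == val)
            return;
        sk->sk_pacing_shift = val;
    }

With val = 7 the TSQ limit becomes roughly the pacing rate divided by 2^7, i.e. about 8 ms worth of data queued per flow, compared with about 1 ms at the default shift of 10.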
-
Committed by Nikolay Aleksandrov
Before this patch the bridge used a fixed 256-element hash table, which was fine for small use cases (in my tests it starts to degrade above 1000 entries) but wasn't enough for medium or large scale deployments. Modern setups have thousands of participants in a single bridge; even just enabling vlans and adding a few thousand vlan entries will cause a few thousand fdbs to be automatically inserted per participating port. So we need to scale the fdb table considerably to cope with modern workloads, and this patch converts it to use a rhashtable for its operations, thus improving the bridge scalability. Tests show the following results (10 runs each): at up to 1000 entries rhashtable is ~3% slower, at 2000 rhashtable is 30% faster, at 3000 it is 2 times faster and at 30000 it is 50 times faster. Obviously this happens because of the properties of the two constructs and is expected; rhashtable keeps pretty much constant time even with 10000000 entries (tested), while the fixed hash table struggles considerably even above 10000. As a side effect this also reduces the net_bridge struct size from 3248 bytes to 1344 bytes. Also note that the key struct is 8 bytes. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
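A hedged sketch of the kind of rhashtable setup this implies, keyed on an 8-byte key (6-byte MAC address plus 2-byte VLAN id); every name below is illustrative, not taken from the patch.

    struct fdb_key_sketch {
        unsigned char addr[ETH_ALEN];   /* 6 bytes */
        __u16 vlan_id;                  /* 2 bytes -> 8-byte key */
    };

    struct fdb_entry_sketch {
        struct rhash_head rhnode;
        struct fdb_key_sketch key;
        /* ... forwarding metadata ... */
    };

    static const struct rhashtable_params fdb_rht_params_sketch = {
        .head_offset         = offsetof(struct fdb_entry_sketch, rhnode),
        .key_offset          = offsetof(struct fdb_entry_sketch, key),
        .key_len             = sizeof(struct fdb_key_sketch),
        .automatic_shrinking = true,
    };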
-
Committed by Heiner Kallweit
Add pcim_set_mwi(), a device-managed version of pci_set_mwi(). The first user is the Realtek r8169 driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
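A hedged usage sketch; the point of the managed variant is that no explicit pci_clear_mwi() is needed on the error or remove paths, since devres undoes it. The probe function and its error handling are illustrative.

    static int example_probe(struct pci_dev *pdev,
                             const struct pci_device_id *id)
    {
        int rc;

        rc = pcim_enable_device(pdev);
        if (rc)
            return rc;

        /* enable Memory-Write-Invalidate; undone automatically by devres */
        rc = pcim_set_mwi(pdev);
        if (rc)
            dev_info(&pdev->dev, "MWI not available\n");

        return 0;
    }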
-
Committed by Eric Dumazet
First, rename __inet_twsk_hashdance() to inet_twsk_hashdance(). Then, remove one inet_twsk_put() by setting tw_refcnt to 3 instead of 4, while adding a fat warning that we no longer have the right to access tw after inet_twsk_hashdance(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Cong Wang
Since we now hold the RTNL lock in tc_action_net_exit(), it is good to batch them to speed up tc action dismantle. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Richard Leitner
Some PHYs need the refclk to be a continuous clock and therefore don't allow turning it off and on again during operation. Nonetheless, such clock switching is performed by some ETH drivers (namely FEC [1]) for power-saving reasons. An example of an affected PHY is the SMSC/Microchip LAN8720 in "REF_CLK In Mode". In order to provide a uniform method to overcome this problem, this patch adds a new phy_driver flag (PHY_RST_AFTER_CLK_EN) and a corresponding function phy_reset_after_clk_enable() to the phylib. These should be used to trigger a reset of the PHY after the refclk is switched on again. [1] commit e8fcfcd5 ("net: fec: optimize the clock management to save power") Signed-off-by: Richard Leitner <richard.leitner@skidata.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
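A hedged sketch of how a MAC driver that gates the PHY refclk can use the new helper after turning the clock back on; the priv/clk field names and error handling are assumptions.

    ret = clk_prepare_enable(priv->clk_ref);    /* refclk back on */
    if (ret)
        return ret;

    /* Reset PHYs whose driver sets PHY_RST_AFTER_CLK_EN so they re-latch
     * the now-stable reference clock; a no-op for all other PHYs. */
    phy_reset_after_clk_enable(ndev->phydev);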
-
Committed by Richard Leitner
Some PHYs need a minimum time after the reset GPIO is asserted and/or deasserted. To ensure we meet these timing requirements, add two new optional devicetree parameters for the PHY: reset-delay-us and reset-post-delay-us. Signed-off-by: Richard Leitner <richard.leitner@skidata.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 12 December 2017, 13 commits
-
-
Committed by Eric Dumazet
When using large tcp_rmem[2] values (I did tests with 500 MB), I noticed overflows while computing rcvwin. Let's fix this before the following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Wei Wang <weiwan@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
Unordered idata processing is more complicated than unordered data: - It has to add mid into sctp_stream_out to save the next mid value, which is separate from ordered idata's. - To support pd for unordered idata, another mid and pd_mode need to be added to save the message id and pd state in sctp_stream_in. - To make unordered idata reasm easier, it adds a new event queue to save frags for idata. The patch mostly adds similar reasm functions for unordered idata as ordered idata's, and also adjusts some other code in assign_mid, abort_pd and ulpevent_data for idata. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
abort_pd is added as a member of sctp_stream_interleave, used to abort partial delivery for data or idata, called in sctp_cmd_assoc_failed. Since stream interleaving allows partial delivery on each stream at the same time, sctp_intl_abort_pd for idata is very different from the old function sctp_ulpq_abort_pd for data. Note that sctp_ulpevent_make_pdapi gains per-stream support in this patch by adding pdapi_stream and pdapi_seq to sctp_pdapi_event, as described in section 6.1.7 of RFC 6458. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
start_pd is added as a member of sctp_stream_interleave, used to do partial delivery for data or idata when datalen >= asoc->rwnd in sctp_eat_data. The code was already added in the last patches, but it needs to be extracted into start_pd so that it can also be used for the SCTP_CMD_PART_DELIVER cmd. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
renege_events is added as a member of sctp_stream_interleave, used to renege old data or idata from the reasm or lobby queue to free memory for new data when there is memory stress. It defines sctp_renege_events for idata, and leaves sctp_ulpq_renege as it is for data. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
enqueue_event is added as a member of sctp_stream_interleave, used to enqueue either data, idata or notification events into the user socket rx queue. It replaces sctp_ulpq_tail_event with enqueue_event in the other places where it is used. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
ulpevent_data is added as a member of sctp_stream_interleave, used to do most of the processing in the ulpq: converting data or idata chunks to events, reassembling them in the reasm queue, putting them into the lobby queue in the right order, and delivering them up to the user sk rx queue. This procedure is described in section 2.2.3 of RFC 8260. It adds most of the functions for idata here to do a similar job to the old functions for data; but since the details are very different between them, the old functions can not be reused for idata. event->ssn and event->ppid settings are moved to ulpevent_data from sctp_ulpevent_make_rcvmsg, so that sctp_ulpevent_make_rcvmsg can work for both data and idata. Note that mid is added to sctp_ulpevent for idata; __packed has to be used when defining sctp_ulpevent, or it would exceed the skb cb that holds a sctp_ulpevent variable for ulp-layer processing. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
validate_data is added as a member of sctp_stream_interleave, used to validate ssn/chunk type for data or mid (message id)/chunk type for idata, called in sctp_eat_data. If this check fails, an abort packet will be sent, as described in section 2.2.3 of RFC 8260. It also adds the processing for idata in the rx path. As Marcelo pointed out, there's no need to add an event table for idata; it just shares chunk_event_table with data's. Data chunks are dropped for idata associations and idata chunks for data associations by calling validate_data in sctp_eat_data. As the last patch did, it also replaces sizeof(struct sctp_data_chunk) with sctp_datachk_len in the rx path. After this patch, idata can be accepted and delivered to the ulp layer. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
assign_number is added as a member of sctp_stream_interleave, used to assign ssn for data or mid (message id) for idata, called in sctp_packet_append_data. sctp_chunk_assign_ssn is left as it is, and sctp_chunk_assign_mid is added for sctp_stream_interleave_1. This procedure is described in section 2.2.2 of RFC 8260. All uses of sizeof(struct sctp_data_chunk) in the tx path are replaced with sctp_datachk_len, to make them right for idata as well. sctp_chunk_is_data is also adjusted for SCTP_CID_I_DATA. After this patch, idata can be built and sent in the tx path. Note that if sp strm_interleave is set, it has to wait_connect in sctp_sendmsg, as asoc intl_enable needs to be known after the 4-way handshake to decide whether to use data or idata later. Data and idata can't be mixed when sending in one asoc. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
To avoid hundreds of checks for the different processing of I-DATA chunks, struct sctp_stream_interleave is defined as a group of functions used to replace the code in places that need to do a different job according to whether asoc intl_enabled is set. With these ops, it only needs to initialize asoc->stream.si with sctp_stream_interleave_0 for normal data if asoc intl_enable is 0, or sctp_stream_interleave_1 for idata if asoc intl_enable is set, in sctp_stream_init. After that, the members in asoc->stream.si can be used directly in the relevant places without checking asoc intl_enable. make_datafrag is the first member of sctp_stream_interleave; it's used to make data or idata frags, called in sctp_datamsg_from_user. The old function sctp_make_datafrag_empty needs some adjustment to fit into this ops structure. Note that as idata and data chunks have different lengths, data_chunk_len is also defined in sctp_stream_interleave to describe the chunk size. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
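A hedged sketch of the ops table this series builds up; the other patches in the series (listed above) add the remaining members. Member signatures here are assumptions based on the commit texts, not the exact kernel declarations.

    struct sctp_stream_interleave {
        __u16   data_chunk_len;         /* sizeof data vs. idata chunk */
        /* tx side */
        struct sctp_chunk *(*make_datafrag)(const struct sctp_association *asoc,
                                            const struct sctp_sndrcvinfo *sinfo,
                                            int len, __u8 flags, gfp_t gfp);
        /* members added by the follow-up patches: assign_number,
         * validate_data, ulpevent_data, enqueue_event, renege_events,
         * start_pd, abort_pd, generate_ftsn, ... */
    };

    /* asoc->stream.si points at sctp_stream_interleave_0 (data) or
     * sctp_stream_interleave_1 (idata), chosen in sctp_stream_init(). */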
-
Committed by Xin Long
sctp_idatahdr and sctp_idata_chunk are used to define and parse the I-DATA chunk format, and sctp_make_idata is a function to build the chunk. The I-DATA chunk format is defined in section 2.1 of RFC 8260. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
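A hedged sketch consistent with the I-DATA chunk format in RFC 8260 section 2.1 (TSN, stream id, MID, then either PPID or FSN); treat it as an illustration rather than the exact kernel definition.

    struct sctp_idatahdr_sketch {
        __be32 tsn;                 /* transmission sequence number */
        __be16 stream;              /* stream identifier */
        __be16 reserved;
        __be32 mid;                 /* message identifier */
        union {
            __be32 ppid;            /* first fragment: payload protocol id */
            __be32 fsn;             /* later fragments: fragment seq number */
        };
        __u8 payload[];
    };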
-
Committed by Xin Long
asoc intl_enable will be set when local sp strm_interleave is set and the I-DATA chunk is listed in the init and init_ack extensions, as described in section 2.2.1 of RFC 8260. asoc intl_enable indicates that all data will be sent as I-DATA chunks. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xin Long
This patch adds intl_enable to the asoc and netns, and strm_interleave to sctp_sock, to indicate whether stream interleaving is enabled and supported. netns intl_enable will be settable via procfs, but that is not added until all the stream interleave code is completely implemented; asoc intl_enable is set when doing the 4-way handshake. sp strm_interleave can be set by the sockopt SCTP_INTERLEAVING_SUPPORTED, which is also added in this patch. This socket option is defined in section 4.3.1 of RFC 8260. Note that strm_interleave can only be set by sockopt when both netns intl_enable and sp frag_interleave are set. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
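A hedged userspace sketch of turning the option on, assuming it takes a struct sctp_assoc_value as RFC 8260 section 4.3.1 describes; per the note above, fragment interleaving (sp frag_interleave) must already be enabled for this to succeed.

    /* Userspace sketch; needs <netinet/sctp.h> headers that already
     * define SCTP_INTERLEAVING_SUPPORTED. */
    struct sctp_assoc_value av = {
        .assoc_id    = 0,   /* apply to future associations on this socket */
        .assoc_value = 1,   /* request user message interleaving */
    };

    if (setsockopt(fd, IPPROTO_SCTP, SCTP_INTERLEAVING_SUPPORTED,
                   &av, sizeof(av)) < 0)
        perror("SCTP_INTERLEAVING_SUPPORTED");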
-
- 11 December 2017, 4 commits
-
-
Committed by Tom Herbert
Add two new library functions: alloc_bucket_spinlocks and free_bucket_spinlocks. These are used to allocate and free an array of spinlocks that are useful as locks for hash buckets. The interface specifies the maximum number of spinlocks in the array as well as a CPU multiplier to derive the number of spinlocks to allocate. The number allocated is rounded up to a power of two to make the array amenable to hash lookup. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
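A hedged usage sketch, assuming the signature follows the description above (locks array out-parameter, a returned mask, a maximum size, a CPU multiplier and gfp flags); the exact parameter order is an assumption.

    spinlock_t *locks;
    unsigned int lock_mask;
    int err;

    /* at most 1024 locks, scaled by 4 * num_possible_cpus(), rounded up
     * to a power of two so the mask below works */
    err = alloc_bucket_spinlocks(&locks, &lock_mask, 1024, 4, GFP_KERNEL);
    if (err)
        return err;

    spin_lock(&locks[hash & lock_mask]);    /* lock the bucket for 'hash' */
    /* ... hash bucket work ... */
    spin_unlock(&locks[hash & lock_mask]);

    free_bucket_spinlocks(locks);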
-
Committed by Tom Herbert
Split out the part of rht_key_hashfn that calculates the hash into its own function. This way the hash function can be called separately to get the hash value. Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
This function is like rhashtable_walk_next except that it only returns the current element in the iter and does not advance the iter. This patch also creates __rhashtable_walk_find_next. It finds the next element in the table when the entry cached in iter is NULL or at the end of a slot. __rhashtable_walk_find_next is called from rhashtable_walk_next and rhashtable_walk_peek. end_of_table is a field added to the iter structure; it indicates that the end of the table was reached (walker.tbl being NULL is not a sufficient condition for end of table). Signed-off-by: Tom Herbert <tom@quantonium.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
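A hedged sketch of the intended usage pattern: peek to decide whether the element can be emitted, and only advance once it has been consumed, so an element that does not fit is seen again on the next walk. The rhashtable calls are real; the object type and the dump_has_room/emit_obj helpers are made up for illustration.

    struct my_obj *obj;

    rhashtable_walk_enter(&ht, &iter);
    rhashtable_walk_start(&iter);

    obj = rhashtable_walk_peek(&iter);      /* look, don't consume */
    while (obj && !IS_ERR(obj)) {
        if (!dump_has_room(skb))            /* hypothetical space check */
            break;                          /* element stays for next dump */
        emit_obj(skb, obj);                 /* hypothetical emit */
        rhashtable_walk_next(&iter);        /* now actually advance */
        obj = rhashtable_walk_peek(&iter);
    }

    rhashtable_walk_stop(&iter);
    rhashtable_walk_exit(&iter);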
-
Committed by Tom Herbert
Most callers of rhashtable_walk_start don't care about a resize event, which is indicated by a return value of -EAGAIN. So calls to rhashtable_walk_start are wrapped with code to ignore -EAGAIN. Something like this is common: ret = rhashtable_walk_start(rhiter); if (ret && ret != -EAGAIN) goto out; Since zero and -EAGAIN are the only possible return values from the function, this check is pointless: the condition never evaluates to true. This patch changes rhashtable_walk_start to return void. This simplifies code for the callers that ignore -EAGAIN. For the few cases where the caller cares about the resize event, particularly where the table can be walked in multiple parts for netlink or seq file dump, the function rhashtable_walk_start_check has been added that returns -EAGAIN on a resize event. Signed-off-by: Tom Herbert <tom@quantonium.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
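A hedged before/after sketch of the caller-side change described above; the "before" fragment is the pattern the commit quotes.

    /* before this change: the -EAGAIN resize signal had to be filtered */
    ret = rhashtable_walk_start(&iter);
    if (ret && ret != -EAGAIN)
        goto out;

    /* after: plain walkers simply call it */
    rhashtable_walk_start(&iter);

    /* walkers that dump a table in several chunks and care about resizes
     * use the checking variant instead */
    ret = rhashtable_walk_start_check(&iter);
    if (ret == -EAGAIN) {
        /* table was resized; entries may be seen twice, handle as needed */
    }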
-
- 09 December 2017, 7 commits
-
-
Committed by Michal Hocko
Commit 4675ff05 ("kmemcheck: rip it out") removed the code, but for some reason the SPDX header stayed in place. This looks like a rebase mistake in the mmotm tree or a merge mistake. Let's drop those leftovers as well. Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Girish Moodalbail
Move 'macaddr_count' from after 'netpoll' to after 'nest_level' to pack the structure and close a memory hole. Fixes: 88ca59d1 ("macvlan: remove unused fields in struct macvlan_dev") Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Egil Hjelmeland
ALR table operations are a sequence of related register operations which should be protected from concurrent access. The alr_cache should also be protected. Add an alr_mutex doing that. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by John Fastabend
This adds a peek routine to skb_array.h for use with qdisc. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by John Fastabend
The sch_mq qdisc creates a sub-qdisc per tx queue, which are then called independently for enqueue and dequeue operations. However, statistics are aggregated and pushed up to the "master" qdisc. This patch adds support for any of the sub-qdiscs to be per-cpu statistics qdiscs. To handle this case, add a check when calculating stats and aggregate the per-cpu stats if needed. Also export __gnet_stats_copy_queue() for use as a helper function. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by John Fastabend
Add qdisc qlen helper routines for lockless qdiscs to use. The qdisc qlen is no longer used in the hotpath, but it is reported via a stats query on the qdisc, so it still needs to be tracked. This adds the per-cpu operations needed, along with a helper to return the summation of the per-cpu stats. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by John Fastabend
Similar to how gso is handled, use an skb list for skb_bad_tx. This is required with lockless qdiscs because we may have multiple cores attempting to push skbs into skb_bad_tx concurrently. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-