Commits · a4174f0560f849317239478b1b22afbf03a6eda2 · openeuler / Kernel

11 Nov, 2017 · 22 commits

bpf: Fix tcp_bufs_kern.c sample program · a4174f05

Committed by Lawrence Brakmo on Nov 10, 2017

The program was returning -1 in some cases, which is not allowed
by the verifier any longer.

Fixes: 390ee7e2 ("bpf: enforce return code for cgroup-bpf programs")
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a4174f05

bpf: Fix tcp_rwnd_kern.c sample program · 016e661b

Committed by Lawrence Brakmo on Nov 10, 2017

The program was returning -1 in some cases, which is not allowed
by the verifier any longer.

Fixes: 390ee7e2 ("bpf: enforce return code for cgroup-bpf programs")
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

016e661b

bpf: Fix tcp_synrto_kern.c sample program · 7863f46b

Committed by Lawrence Brakmo on Nov 10, 2017

The program was returning -1 in some cases, which is not allowed
by the verifier any longer.

Fixes: 390ee7e2 ("bpf: enforce return code for cgroup-bpf programs")
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7863f46b
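The three sample fixes above share one idea: a cgroup-bpf program may only return 0 or 1, so "nothing to do" has to be reported some other way. The following is a minimal userspace sketch of that pattern, not the actual sample code. The struct, the op codes, and the use of a `reply` field to carry -1 are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the sock_ops context; only the fields
 * this sketch needs. */
struct sock_ops_ctx {
    uint32_t op;     /* which socket event is being handled */
    int32_t  reply;  /* value handed back through the context */
};

enum { OP_HANDLED = 1, OP_OTHER = 2 };  /* illustrative op codes */

/* Report "not handled" through the reply field, never through the
 * program's return value, which must stay within {0, 1}. */
static int sockops_handler(struct sock_ops_ctx *ctx)
{
    if (ctx->op != OP_HANDLED) {
        ctx->reply = -1;  /* in-band "nothing to do" marker */
        return 1;
    }
    ctx->reply = 0;
    return 1;
}

/* The range check the commit messages attribute to the verifier. */
static int return_code_allowed(int rv)
{
    return rv == 0 || rv == 1;
}
```

With this shape, every path through the handler yields a return value that passes `return_code_allowed()`, while a bare `return -1` would not.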

tipc: improve link resiliency when rps is activated · 8d6e79d3

Committed by Jon Maloy on Nov 08, 2017

Currently, the TIPC RPS dissector is based only on the incoming packets'
source node address, hence steering all traffic from a node to the same
core. We have seen that this makes the links vulnerable to starvation
and unnecessary resets when we turn down the link tolerance to very low
values.

To reduce the risk of this happening, we exempt probe and probe reply
packets from the convergence to one core per source node. Instead, we do
the opposite: we try to diverge those packets across as many cores as
possible, by randomizing the flow selector key.

To make such packets identifiable to the dissector, we add a new
'is_keepalive' bit to word 0 of the LINK_PROTOCOL header. This bit is
set both for PROBE and PROBE_REPLY messages, and only for those.

It should be noted that these packets are not part of any flow anyway,
and only constitute a minuscule fraction of all packets sent across a
link. Hence, there is no risk that this will affect overall performance.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8d6e79d3
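The key selection described in the commit above can be sketched roughly as follows. This is a simplified userspace model, not the TIPC dissector itself: `struct pkt_info`, `flow_key()`, and the use of `rand()` as the randomness source are all assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified view of a received TIPC packet. */
struct pkt_info {
    uint32_t src_node;     /* source node address */
    bool     is_keepalive; /* set for PROBE and PROBE_REPLY only */
};

/* Choose the key that steers a packet to a core. */
static uint32_t flow_key(const struct pkt_info *p)
{
    if (p->is_keepalive)
        return (uint32_t)rand();  /* spread keepalives across cores */
    return p->src_node;           /* ordinary traffic: one core per source node */
}
```

Ordinary packets from one node keep mapping to the same key, so ordering within a link is unaffected; only the keepalive packets, which belong to no flow, are scattered.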

Merge branch 'macb-next' · 141f575f

Committed by David S. Miller on Nov 11, 2017

Michael Grzeschik says:

====================
net: macb: add error handling on probe and

This series adds more error handling to the macb driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

141f575f

net: macb: add of_node_put to error paths · 66ee6a06

Committed by Michael Grzeschik on Nov 08, 2017

We add the call of_node_put(bp->phy_node) to all associated error
paths for memory cleanup.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

66ee6a06
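The cleanup pattern behind the commit above, dropping a reference on every error path, can be illustrated with a small userspace model. The `struct node`, `node_get()`, `node_put()`, and `probe()` names are hypothetical stand-ins; the real driver works with `struct device_node` and `of_node_put()`.

```c
/* Hypothetical stand-in for a reference-counted device tree node. */
struct node { int refcount; };

static struct node *node_get(struct node *n) { n->refcount++; return n; }
static void node_put(struct node *n) { if (n) n->refcount--; }

/* fail_step selects which step fails: 0 = none, 1 or 2 = that step. */
static int probe(struct node *phy, int fail_step)
{
    struct node *ref = node_get(phy);
    int err;

    if (fail_step == 1) { err = -1; goto err_put; }
    if (fail_step == 2) { err = -2; goto err_put; }

    return 0;  /* success: the reference is kept for later use */

err_put:
    node_put(ref);  /* every error path releases the reference */
    return err;
}
```

Routing all failures through a single label keeps the release in one place, so a newly added failing step cannot forget it.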

net: macb: add of_phy_deregister_fixed_link to error paths · 9ce98140

Committed by Michael Grzeschik on Nov 08, 2017

We add the call of_phy_deregister_fixed_link to all associated
error paths for memory cleanup.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9ce98140

net: mvpp2: fix GOP statistics loop start and stop conditions · e5c500eb

Committed by Miquel Raynal on Nov 08, 2017

GOP statistics from all ports of one instance of the driver are gathered
by a single work item that requeues itself on a workqueue. The loop is
started when a port is up, and stopped when a port is down. This last
condition is obviously wrong: one port going down stops the gathering
for every port.

Fix this by having one work item per port. This way, starting and stopping
it when the port goes up or down is correct, while minimizing unnecessary
CPU usage.

Fixes: 118d6298 ("net: mvpp2: add ethtool GOP statistics")
Reported-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e5c500eb

Merge branch 'hns3-bug-fixes' · fc981359

Committed by David S. Miller on Nov 11, 2017

Lipeng says:

====================
net: hns3: Bug fixes & Code improvements in HNS3 driver

This patch-set introduces some bug fixes and code improvements.
[patch 1/2] depends on the patch {5392902d net: hns3: Consistently using
GENMASK in hns3 driver}, which exists in net-next but not in net, so
this series is pushed to net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

fc981359

net: hns3: cleanup mac auto-negotiation state query in hclge_update_speed_duplex · c040366b

Committed by Fuyun Liang on Nov 08, 2017

When checking whether auto-negotiation is on, the driver only needs to
check the value of mac.autoneg (SW) directly and does not need to
query it from hardware, because this value is always synchronized
with the auto-negotiation state of the hardware.

This patch removes the mac auto-negotiation state query in
hclge_update_speed_duplex().

Fixes: 46a3df9f (net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support)
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c040366b

net: hns3: fix a bug when getting phy address from NCL_config file · 39e2151f

Committed by Fuyun Liang on Nov 08, 2017

The driver gets the phy address from the NCL_config file and uses it
to initialize phydev. The phy address field has 5 bits, and a C22 phy
address also has 5 bits, so 0-31 are all valid phy addresses. If there
is no phy, the driver crashes, because it always gets a valid phy address.

This patch widens the phy address to 8 bits and uses 0xff to indicate
an invalid phy address.

Fixes: 46a3df9f (net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support)
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

39e2151f
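The reasoning in the commit above is that a 5-bit field has no spare value to mean "no phy", while an 8-bit field does. A minimal sketch of such a check, with hypothetical macro and function names rather than the driver's own, might look like this:

```c
#include <stdbool.h>
#include <stdint.h>

#define PHY_ADDR_MAX     31    /* highest Clause 22 address (5 bits) */
#define PHY_ADDR_INVALID 0xff  /* sentinel from the commit: no phy present */

/* An 8-bit field can hold both every real address and the sentinel. */
static bool phy_present(uint8_t addr)
{
    return addr <= PHY_ADDR_MAX;
}
```

A caller would skip phydev initialization whenever `phy_present()` is false, which is not expressible when every possible field value is a legal address.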

net: netlink: Update attr validation to require exact length for some types · 28033ae4

Committed by David Ahern on Nov 07, 2017

Attributes using NLA_U* and NLA_S* (where * is 8, 16, 32 and 64) are
expected to be an exact length. Split these data types from
nla_attr_minlen into nla_attr_len and update validate_nla to require
the attribute to have exact length for them.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28033ae4
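The split between an exact-length table and a minimum-length table can be sketched as below. This is a self-contained model under assumed names (`enum attr_type`, `attr_exact_len`, `attr_min_len`, `validate_attr()`); it does not reproduce the kernel's `validate_nla` or its error codes.

```c
#include <stddef.h>
#include <stdint.h>

enum attr_type { ATTR_U8, ATTR_U16, ATTR_U32, ATTR_U64, ATTR_STRING, ATTR_TYPE_MAX };

/* Exact payload sizes; 0 means "no exact-length rule for this type". */
static const size_t attr_exact_len[ATTR_TYPE_MAX] = {
    [ATTR_U8]  = sizeof(uint8_t),
    [ATTR_U16] = sizeof(uint16_t),
    [ATTR_U32] = sizeof(uint32_t),
    [ATTR_U64] = sizeof(uint64_t),
};

/* Minimum payload sizes for the remaining types (illustrative value). */
static const size_t attr_min_len[ATTR_TYPE_MAX] = {
    [ATTR_STRING] = 1,
};

/* Returns 0 when the payload length is acceptable, -1 otherwise. */
static int validate_attr(enum attr_type type, size_t payload_len)
{
    if ((unsigned int)type >= ATTR_TYPE_MAX)
        return -1;
    if (attr_exact_len[type])
        return payload_len == attr_exact_len[type] ? 0 : -1;
    return payload_len >= attr_min_len[type] ? 0 : -1;
}
```

Under a minimum-length rule alone, an 8-byte payload tagged as a 32-bit integer would pass; with the exact-length table it is rejected.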

net: ipv6: sysctl to specify IPv6 ND traffic class · 2210d6b2

Committed by Maciej Żenczykowski on Nov 07, 2017

Add a per-device sysctl to specify the default traffic class to use for
kernel originated IPv6 Neighbour Discovery packets.

Currently this includes:

  - Router Solicitation (ICMPv6 type 133)
    ndisc_send_rs() -> ndisc_send_skb() -> ip6_nd_hdr()

  - Neighbour Solicitation (ICMPv6 type 135)
    ndisc_send_ns() -> ndisc_send_skb() -> ip6_nd_hdr()

  - Neighbour Advertisement (ICMPv6 type 136)
    ndisc_send_na() -> ndisc_send_skb() -> ip6_nd_hdr()

  - Redirect (ICMPv6 type 137)
    ndisc_send_redirect() -> ndisc_send_skb() -> ip6_nd_hdr()

and if the kernel ever gets around to generating RA's,
it would presumably also include:

  - Router Advertisement (ICMPv6 type 134)
    (radvd daemon could pick up on the kernel setting and use it)

Interface drivers may examine the Traffic Class value and translate
the DiffServ Code Point into a link-layer appropriate traffic
prioritization scheme.  An example of mapping IETF DSCP values to
IEEE 802.11 User Priority values can be found here:

    https://tools.ietf.org/html/draft-ietf-tsvwg-ieee-802-11

The expected primary use case is to properly prioritize ND over wifi.

Testing:
  jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  0
  jzem22:~# echo -1 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  -bash: echo: write error: Invalid argument
  jzem22:~# echo 256 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  -bash: echo: write error: Invalid argument
  jzem22:~# echo 0 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# echo 255 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  255
  jzem22:~# echo 34 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  34

  jzem22:~# echo $[0xDC] > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# tcpdump -v -i eth0 icmp6 and src host jzem22.pgc and dst host fe80::1
  tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
  IP6 (class 0xdc, hlim 255, next-header ICMPv6 (58) payload length: 24)
  jzem22.pgc > fe80::1: [icmp6 sum ok] ICMP6, neighbor advertisement,
  length 24, tgt is jzem22.pgc, Flags [solicited]

(based on original change written by Erik Kline, with minor changes)

v2: fix 'suspicious rcu_dereference_check() usage'
    by explicitly grabbing the rcu_read_lock.

Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Erik Kline <ek@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2210d6b2
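The values used in the test session above can be related to DiffServ with a little arithmetic. The IPv6 Traffic Class is 8 bits wide, with the upper 6 bits carrying the DSCP and the lower 2 bits carrying ECN. The helper names below are hypothetical and only illustrate the range check (0 to 255) and that split.

```c
#include <stdint.h>

/* Accept only values that fit the 8-bit Traffic Class field. */
static int tclass_valid(int v)
{
    return v >= 0 && v <= 255;
}

/* Upper six bits: DiffServ Code Point. */
static uint8_t tclass_dscp(uint8_t tclass)
{
    return tclass >> 2;
}

/* Lower two bits: ECN field. */
static uint8_t tclass_ecn(uint8_t tclass)
{
    return tclass & 0x3;
}
```

For the 0xDC written in the session above, this gives DSCP 55 with ECN 0, and the value 34 gives DSCP 8 with ECN 2.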

net/ncsi: Don't return error on normal response · 04bad8bd

Committed by Samuel Mendoza-Jonas on Nov 08, 2017

Several response handlers return EBUSY if the data corresponding to the
command/response pair is already set. There is no reason to return an
error here; the channel is advertising something as enabled because we
told it to enable it, and it's possible that the feature has been
enabled previously.

Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04bad8bd

net/ncsi: Improve general state logging · 9ef8690b

Committed by Samuel Mendoza-Jonas on Nov 08, 2017

The NCSI driver is mostly silent, which becomes a headache when trying to
determine what has occurred on the NCSI connection. This adds additional
logging in a few key areas such as state transitions and calling out
certain errors more visibly.

Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9ef8690b

Merge branch 'bpftool-show-filenames-of-pinned-objects' · a8a6f1e4

Committed by David S. Miller on Nov 11, 2017

Prashant Bhole says:

====================
tools: bpftool: show filenames of pinned objects

This patchset adds support to show pinned objects in object details.

Patch1 adds functionality to open a path in bpf-fs regardless of its object
type.

Patch2 adds the actual functionality by scanning the bpf-fs once and adding
object information to a hash table, with the object id as key. One object may be
associated with multiple paths because an object can be pinned multiple times.

Patch3 adds a command line option to enable this functionality. It is optional
because scanning bpf-fs can be costly.
====================
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

a8a6f1e4
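The lookup structure described in Patch2, a hash table keyed by object id where one id may map to several paths, can be modelled in a few lines. This is a standalone sketch with hypothetical names (`struct pinned_table`, `pinned_add()`, `pinned_count()`) and a fixed bucket count; it is not bpftool's implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 16

struct pinned_entry {
    uint32_t id;                /* object id, used as the hash key */
    char *path;                 /* one bpf-fs path where the object is pinned */
    struct pinned_entry *next;  /* chaining within a bucket */
};

struct pinned_table {
    struct pinned_entry *buckets[NBUCKETS];
};

/* Record one (id, path) pair; the same id may be added repeatedly. */
static int pinned_add(struct pinned_table *t, uint32_t id, const char *path)
{
    struct pinned_entry *e = malloc(sizeof(*e));

    if (!e)
        return -1;
    e->path = malloc(strlen(path) + 1);
    if (!e->path) {
        free(e);
        return -1;
    }
    strcpy(e->path, path);
    e->id = id;
    e->next = t->buckets[id % NBUCKETS];
    t->buckets[id % NBUCKETS] = e;
    return 0;
}

/* Count the paths recorded for one object id. */
static size_t pinned_count(const struct pinned_table *t, uint32_t id)
{
    size_t n = 0;

    for (const struct pinned_entry *e = t->buckets[id % NBUCKETS]; e; e = e->next)
        if (e->id == id)
            n++;
    return n;
}
```

Filling the table in a single scan means each object shown afterwards costs one bucket walk instead of another pass over the filesystem.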

tools: bpftool: optionally show filenames of pinned objects · c541b734

Committed by Prashant Bhole on Nov 08, 2017

Make it optional to show file names of pinned objects, because
doing so scans the complete bpf-fs filesystem, which is costly.
Added option -f|--bpffs. Documentation updated.

Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

c541b734

tools: bpftool: show filenames of pinned objects · 4990f1f4

Committed by Prashant Bhole on Nov 08, 2017

Added support to show filenames of pinned objects.

For example:

root@test# ./bpftool prog
3: tracepoint  name tracepoint__irq  tag f677a7dd722299a3
    loaded_at Oct 26/11:39  uid 0
    xlated 160B  not jited  memlock 4096B  map_ids 4
    pinned /sys/fs/bpf/softirq_prog

4: tracepoint  name tracepoint__irq  tag ea5dc530d00b92b6
    loaded_at Oct 26/11:39  uid 0
    xlated 392B  not jited  memlock 4096B  map_ids 4,6

root@test# ./bpftool --json --pretty prog
[{
        "id": 3,
        "type": "tracepoint",
        "name": "tracepoint__irq",
        "tag": "f677a7dd722299a3",
        "loaded_at": "Oct 26/11:39",
        "uid": 0,
        "bytes_xlated": 160,
        "jited": false,
        "bytes_memlock": 4096,
        "map_ids": [4
        ],
        "pinned": ["/sys/fs/bpf/softirq_prog"
        ]
    },{
        "id": 4,
        "type": "tracepoint",
        "name": "tracepoint__irq",
        "tag": "ea5dc530d00b92b6",
        "loaded_at": "Oct 26/11:39",
        "uid": 0,
        "bytes_xlated": 392,
        "jited": false,
        "bytes_memlock": 4096,
        "map_ids": [4,6
        ],
        "pinned": []
    }
]

root@test# ./bpftool map
4: hash  name start  flags 0x0
    key 4B  value 16B  max_entries 10240  memlock 1003520B
    pinned /sys/fs/bpf/softirq_map1
5: hash  name iptr  flags 0x0
    key 4B  value 8B  max_entries 10240  memlock 921600B

root@test# ./bpftool --json --pretty map
[{
        "id": 4,
        "type": "hash",
        "name": "start",
        "flags": 0,
        "bytes_key": 4,
        "bytes_value": 16,
        "max_entries": 10240,
        "bytes_memlock": 1003520,
        "pinned": ["/sys/fs/bpf/softirq_map1"
        ]
    },{
        "id": 5,
        "type": "hash",
        "name": "iptr",
        "flags": 0,
        "bytes_key": 4,
        "bytes_value": 8,
        "max_entries": 10240,
        "bytes_memlock": 921600,
        "pinned": []
    }
]

Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

4990f1f4

tools: bpftool: open pinned object without type check · 18527196

Committed by Prashant Bhole on Nov 08, 2017

This was needed for opening any file in bpf-fs without knowing
its object type.

Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

18527196

Merge branch 'BPF-directed-error-injection' · 329fca60

Committed by David S. Miller on Nov 11, 2017

Josef Bacik says:

====================
Add the ability to do BPF directed error injection

I'm sending this through Dave since it'll conflict with other BPF changes in his
tree, but since it touches tracing as well Dave would like a review from
somebody on the tracing side.

v4->v5:
- disallow kprobe_override programs from being put in the prog map array so we
  don't tail call into something we didn't check.  This allows us to make the
  normal path still fast without a bunch of percpu operations.

v3->v4:
- fix a build error found by kbuild test bot (I didn't wait long enough
  apparently.)
- Added a warning message as per Daniel's suggestion.

v2->v3:
- added a ->kprobe_override flag to bpf_prog.
- added some sanity checks to disallow attaching bpf progs that have
  ->kprobe_override set that aren't for ftrace kprobes.
- added the trace_kprobe_ftrace helper to check if the trace_event_call is a
  ftrace kprobe.
- renamed bpf_kprobe_state to bpf_kprobe_override, fixed it so we only read this
  value in the kprobe path, and thus only write to it if we're overriding or
  clearing the override.

v1->v2:
- moved things around to make sure that bpf_override_return could really only be
  used for an ftrace kprobe.
- killed the special return values from trace_call_bpf.
- renamed pc_modified to bpf_kprobe_state so bpf_override_return could tell if
  it was being called from an ftrace kprobe context.
- reworked the logic in kprobe_perf_func to take advantage of bpf_kprobe_state.
- updated the test as per Alexei's review.

- Original message -

A lot of our error paths are not well tested because we have no good way of
injecting errors generically.  Some subsystems (block, memory) have ways to
inject errors, but they are random so it's hard to get reproducible results.

With BPF we can add determinism to our error injection.  We can use kprobes and
other things to verify we are injecting errors at the exact case we are trying
to test.  This patch gives us the tool to actually do the error injection part.
It is very simple: we just set the return value of the pt_regs we're given to
whatever we provide, and then override the PC with a dummy function that simply
returns.

Right now this only works on x86, but it would be simple enough to expand to
other architectures.  Thanks,
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

329fca60

samples/bpf: add a test for bpf_override_return · eafb3401

Committed by Josef Bacik on Nov 07, 2017

This adds a basic test for bpf_override_return to verify it works.  We
override the main function for mounting a btrfs fs so it'll return
-ENOMEM and then make sure that trying to mount a btrfs fs will fail.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

eafb3401

bpf: add a bpf_override_function helper · dd0bb688

Committed by Josef Bacik on Nov 07, 2017

Error injection is sloppy and very ad-hoc.  BPF could fill this niche
perfectly with its kprobe functionality.  We could make sure errors are
only triggered in specific call chains that we care about with very
specific situations.  Accomplish this with the bpf_override_function
helper.  This will modify the probe'd caller's return value to the
specified value and set the PC to an override function that simply
returns, bypassing the originally probed function.  This gives us a nice
clean way to implement systematic error injection for all of our code
paths.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd0bb688
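The mechanism described above, setting the return value in the saved registers and redirecting the PC to a function that simply returns, can be pictured with a toy userspace model. Nothing here is kernel code: `struct fake_regs`, `override_return()`, and `call_with_probe()` are hypothetical names, and a function pointer stands in for the program counter.

```c
#include <errno.h>

struct fake_regs;
typedef void (*func_t)(struct fake_regs *);

/* Hypothetical, minimal stand-in for the saved register state. */
struct fake_regs {
    long   ret;  /* models the return-value register */
    func_t ip;   /* models the PC: what runs when the probe finishes */
};

/* The probed function: in this model it always succeeds. */
static void probed_function(struct fake_regs *r) { r->ret = 0; }

/* The override target: does nothing, so the preset value survives. */
static void just_return(struct fake_regs *r) { (void)r; }

/* Models the helper: set the return value and redirect execution. */
static void override_return(struct fake_regs *r, long rc)
{
    r->ret = rc;
    r->ip = just_return;
}

/* Models a call to the probed function with a probe attached. */
static long call_with_probe(int inject)
{
    struct fake_regs regs = { .ret = 0, .ip = probed_function };

    if (inject)
        override_return(&regs, -ENOMEM);
    regs.ip(&regs);  /* runs the real body or the stub */
    return regs.ret;
}
```

In the model, `call_with_probe(1)` never runs `probed_function()` at all, which mirrors the commit's statement that the originally probed function is bypassed.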

10 Nov, 2017 · 18 commits

net: fix incorrect comment with regard to VLAN packet handling · 54985120

Committed by Girish Moodalbail on Nov 07, 2017

The commit bcc6d479 ("net: vlan: make non-hw-accel rx path similar
to hw-accel") unified accel and non-accel path for VLAN RX. With that
fix we do not register any packet_type handler for VLANs anymore, so fix
the incorrect comment.

Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

54985120

Merge branch 'act_vlan-rcu' · b79c069a

Committed by David S. Miller on Nov 10, 2017

Manish Kurup says:

====================
net_sched actions: act_vlan now uses RCU

This commit consists of 3 patches:

patch1 (1/3):
The VLAN action maintains one set of stats across all cores, and uses a
spinlock to synchronize updates to it from those cores. Changed this to use
a per-CPU stats context instead.
This change will result in better performance.

patch2 (2/3):
Modified netronome nfp flower action to use VLAN helper functions instead
of accessing/referencing TC act_vlan private structures directly.

patch3 (3/3):
Using a spinlock in the VLAN action causes performance issues when the VLAN
action is used on multiple cores. Rewrote the VLAN action to use RCU
locking for reads and updates instead.
All functions now use an RCU dereferenced pointer to access the VLAN action
context. Modified helper functions used by other modules to use RCU rather
than directly accessing the structure.

As part of this review, there were some changes suggested by reviewers.
I have incorporated all the changes that were requested.

Here are the changes:
v2: Fixed all helper functions to use RCU (rtnl_dereference) - Eric, Jamal
v2: Fixed indentation, extra line nits - Jamal, Jiri
v2: Moved rcu_head to the end of the struct - Jiri
v2: Re-formatted locals to reverse-christmas-tree - Jiri
v2: Removed mismatched spin_lock() - Cong
v2: Removed spin_lock_bh() in tcf_vlan_init, rtnl_dereference() should
    suffice - Cong, Jiri
v4: Modified the nfp flower action code to use the VLAN helper functions
    instead of referencing the structure directly. Isolated this into a
    separate patch - Pieter Jansen
v5: Got rid of the unlikely() for the allocation case - Simon Horman
v6: Had forgotten cleanup functions for RCU alloc, added them - Dave Miller
v7: Re-formatted more locals to reverse-christmas-tree - Pieter V
v8: Reverted reverse-christmas-tree(v7), not required when dependencies
    make it difficult to implement - Alexander D
v9: Cover letter subject change - Jamal
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b79c069a

act_vlan: VLAN action rewrite to use RCU lock/unlock and update · 4c5b9d96

Committed by Manish Kurup on Nov 07, 2017

Using a spinlock in the VLAN action causes performance issues when the VLAN
action is used on multiple cores. Rewrote the VLAN action to use RCU
locking for reads and updates instead.
All functions now use an RCU dereferenced pointer to access the VLAN action
context. Modified helper functions used by other modules to use RCU rather
than directly accessing the structure.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Manish Kurup <manish.kurup@verizon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c5b9d96

nfp flower action: Modified to use VLAN helper functions · bf068bdd

Committed by Manish Kurup on Nov 07, 2017

Modified netronome nfp flower action to use VLAN helper functions instead
of accessing/referencing TC act_vlan private structures directly.

Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Manish Kurup <manish.kurup@verizon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf068bdd

act_vlan: Change stats update to use per-core stats · e0496cbb

Committed by Manish Kurup on Nov 07, 2017

The VLAN action maintains one set of stats across all cores, and uses a
spinlock to synchronize updates to it from those cores. Changed this to use
a per-CPU stats context instead.
This change will result in better performance.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Manish Kurup <manish.kurup@verizon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0496cbb
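The idea behind per-CPU stats is that each core updates only its own counters, and a reader sums them on demand. The sketch below is a single-threaded userspace model with hypothetical names and a fixed, assumed CPU count; the kernel uses its own per-CPU allocation and accessor APIs, which are not shown.

```c
#include <stdint.h>

#define NR_CPUS_MODEL 4  /* assumed CPU count for the model */

struct vlan_stats {
    uint64_t packets;
    uint64_t bytes;
};

/* One counter set per CPU, so updates never touch shared state. */
static struct vlan_stats percpu_stats[NR_CPUS_MODEL];

/* Called on the packet path by the CPU that handled the packet. */
static void stats_update(unsigned int cpu, uint64_t len)
{
    percpu_stats[cpu].packets++;
    percpu_stats[cpu].bytes += len;
}

/* Called rarely (for example on a stats dump): fold all CPUs together. */
static struct vlan_stats stats_read(void)
{
    struct vlan_stats total = { 0, 0 };

    for (unsigned int i = 0; i < NR_CPUS_MODEL; i++) {
        total.packets += percpu_stats[i].packets;
        total.bytes += percpu_stats[i].bytes;
    }
    return total;
}
```

The trade-off is that reads become a loop over all CPUs, which is acceptable because updates happen per packet while reads happen only when statistics are requested.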

sfc: don't warn on successful change of MAC · cbad52e9

Committed by Robert Stonehouse on Nov 07, 2017

Fixes: 535a6177 ("sfc: suppress handled MCDI failures when changing the MAC address")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cbad52e9

net: vxge: remove redundant assignments and pointers · e4effc09

Committed by Colin Ian King on Nov 07, 2017

There are several pointers that are being assigned but never read
so remove these as they are redundant.  Also remove an assignment
to function_mode that is never read. Cleans up several clang
warnings:

vxge-main.c:1139:2: warning: Value stored to 'hldev' is never read
vxge-main.c:1294:2: warning: Value stored to 'hldev' is never read
vxge-main.c:2188:2: warning: Value stored to 'dev' is never read
vxge-main.c:2188:2: warning: Value stored to 'dev' is never read
vxge-main.c:2723:2: warning: Value stored to 'function_mode' is
never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e4effc09

Merge branch 'ip_gre-flags-update' · be61a484

Committed by David S. Miller on Nov 10, 2017

Xin Long says:

====================
ip_gre: add support for i/o_flags update

ip_gre is using as many ip_tunnel apis as possible; newlink works
fine as gre does its own part in .ndo_init. But when changing a
link, ip_tunnel_changelink doesn't even update i/o_flags, and
updating these flags also requires some other gre properties
to be updated or recalculated.

These two patches add the i/o_flags update and then adjust
some gre properties according to the new i/o_flags.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

be61a484

ip_gre: add the support for i/o_flags update via ioctl · a0efab67

Committed by Xin Long on Nov 07, 2017

As patch 'ip_gre: add the support for i/o_flags update via netlink'
did for netlink, we also need to do the same job for these updates
via ioctl.

This patch is to update i/o_flags and call ipgre_link_update to
recalculate these gre properties after ip_tunnel_ioctl does the
common update.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a0efab67

ip_gre: add the support for i/o_flags update via netlink · dd9d598c

Committed by Xin Long on Nov 07, 2017

Now ip_gre is using ip_tunnel_changelink to update its properties, but
ip_tunnel_changelink in ip_tunnel doesn't update i/o_flags as a common
function.

An o_flags update requires tunnel->tun_hlen / hlen and dev->mtu /
needed_headroom to be recalculated, and dev->(hw_)features to be
updated as well.

Therefore, we can't just add the update into ip_tunnel_update called
in ip_tunnel_changelink, and it's also better not to touch ip_tunnel
code.

This patch updates i/o_flags and calls ipgre_link_update to recalculate
these gre properties after ip_tunnel_changelink does the common update.

Note that since ipgre_link_update doesn't know the lower dev, it will
update gre->hlen, dev->mtu and dev->needed_headroom with the value of
'new tun_hlen - old tun_hlen'. In this way, we can avoid a lot of redundant
code, unlike ip6_gre.

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd9d598c
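The delta-based update mentioned in the note above ('new tun_hlen - old tun_hlen') can be worked through with a small model. A GRE header is 4 bytes, plus 4 bytes each for the optional checksum, key, and sequence number fields. The flag bits, `struct gre_model`, and `link_update()` below are hypothetical, and the model omits details such as clamping the MTU to a minimum.

```c
#include <stdint.h>

/* Illustrative flag bits; not the kernel's tunnel flag values. */
#define F_CSUM 0x1u
#define F_KEY  0x2u
#define F_SEQ  0x4u

/* Hypothetical bundle of the properties named in the commit message. */
struct gre_model {
    int tun_hlen;         /* GRE header length */
    int hlen;             /* total encapsulation header length */
    int mtu;
    int needed_headroom;
};

/* GRE header size implied by the output flags. */
static int gre_hdr_len(uint32_t o_flags)
{
    int len = 4;  /* base GRE header */

    if (o_flags & F_CSUM)
        len += 4;
    if (o_flags & F_KEY)
        len += 4;
    if (o_flags & F_SEQ)
        len += 4;
    return len;
}

/* Apply only the difference, so the lower device need not be known. */
static void link_update(struct gre_model *g, uint32_t new_o_flags)
{
    int delta = gre_hdr_len(new_o_flags) - g->tun_hlen;

    g->tun_hlen += delta;
    g->hlen += delta;
    g->needed_headroom += delta;
    g->mtu -= delta;  /* a larger header leaves less room for payload */
}
```

For example, enabling the key on a tunnel that started with a bare 4-byte header grows each header-related length by 4 and shrinks the MTU by 4, without recomputing anything from the underlying device.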

Merge branch 'tcp-ns-rmem-wmem' · c7947e43

Committed by David S. Miller on Nov 10, 2017

Eric Dumazet says:

====================
net: Namespace-ify sysctl_tcp_rmem and sysctl_tcp_wmem

We need to get per netns sysctl for sysctl_[proto]_rmem and sysctl_[proto]_wmem.

This patch series adds the basic infrastructure allowing per proto
conversion, and takes care of TCP.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c7947e43

tcp: Namespace-ify sysctl_tcp_rmem and sysctl_tcp_wmem · 356d1833

Committed by Eric Dumazet on Nov 07, 2017

Note that when a new netns is created, it inherits its
sysctl_tcp_rmem and sysctl_tcp_wmem from initial netns.

This change is needed so that we can refine TCP rcvbuf autotuning,
to take RTT into consideration.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

356d1833

net: allow per netns sysctl_rmem and sysctl_wmem for protos · a3dcaf17

Committed by Eric Dumazet on Nov 07, 2017

As we want to gradually implement per netns sysctl_rmem and sysctl_wmem
on per protocol basis, add two new fields in struct proto,
and two new helpers: sk_get_wmem0() and sk_get_rmem0().

First user will be TCP. Then UDP and SCTP can be easily converted,
while DECNET probably won't get this support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a3dcaf17

net: dsa: Don't add vlans when vlan filtering is disabled · 2ea7a679

Committed by Andrew Lunn on Nov 07, 2017

The software bridge can be built with vlan filtering support
included. However, by default it is turned off. In its turned off
state, it still passes VLANs via switchdev, even though they are not to
be used. Don't pass these VLANs to the hardware. Only do so when vlan
filtering is enabled.

This fixes at least one corner case. There are still issues in other
corners, such as when vlan_filtering is later enabled.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

2ea7a679

Merge tag 'mlx5-updates-2017-11-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 4fdc3023

Committed by David S. Miller on Nov 10, 2017

Saeed Mahameed says:

====================
mlx5-updates-2017-11-09

This series introduces vlan offloads related improvements for mlx5
ethernet netdev driver, from Gal Pressman.

 - Add support for 802.1ad vlan filter
 - Add support for 802.1ad vlan insertion
 - Add vlan offloads statistics to ethtool (inserted/stripped vlans)
 - CHECKSUM_COMPLETE support for vlan traffic when vlan stripping is off! (Finally)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4fdc3023

Merge branch 'IGMP-snooping-for-local-traffic' · 5d37636a

Committed by David S. Miller on Nov 10, 2017

Andrew Lunn says:

====================
IGMP snooping for local traffic

The linux bridge supports IGMP snooping. It will listen to IGMP
reports on bridge ports and keep track of which groups have been
joined on an interface. It will then forward multicast based on this
group membership.

When the bridge adds or removes groups from an interface, it uses
switchdev to request the hardware add an mdb to a port, so the
hardware can perform the selective forwarding between ports.

What is not covered by the current bridge code, is IGMP joins/leaves
from the host on the brX interface. These are not reported via
switchdev so that hardware knows the local host is interested in the
multicast frames.

Luckily, the bridge does track joins/leaves on the brX interface. The
code is obfuscated, which is why I missed it with my first attempt.
So the first patch tries to remove this obfuscation. Currently,
there is no notifications sent when the bridge interface joins a
group. The second patch adds them. bridge monitor then shows
joins/leaves in the same way as for other ports of the bridge.

Then starts the work passing down to the hardware that the host has
joined/left a group. The existing switchdev mdb object cannot be used,
since the semantics are different. The existing
SWITCHDEV_OBJ_ID_PORT_MDB is used to indicate a specific multicast
group should be forwarded out that port of the switch. However here we
require the exact opposite. We want multicast frames for the group
received on the port to be forwarded to the host. Hence add a new
object SWITCHDEV_OBJ_ID_HOST_MDB, a multicast database entry to
forward to the host. This new object is then propagated through the
DSA layers. No DSA driver changes should be needed, this should just
work...

This version fixes up the nitpick from Nikolay, removes an unrelated
white space change, and adds in a patch adding a few const attributes
to a couple of functions taking a port parameter, in order to stop the
following patch producing warnings.
====================
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d37636a

net: dsa: switch: Don't add CPU port to an mdb by default · ae45102c

Committed by Andrew Lunn on Nov 09, 2017

Now that the host indicates when a multicast group should be forwarded
from the switch to the host, don't do it by default.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

ae45102c

net: dsa: add more const attributes · bb9f6031

Committed by Andrew Lunn on Nov 09, 2017

The notify mechanism does not need to modify the port it is notifying.
So make the parameter const.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bb9f6031
