Commits · 8f3504372963fb65d2386f8a2210a557d7cc01d9 · openeuler / Kernel

27 Sep, 2015 · 1 commit

vxlan: support both IPv4 and IPv6 sockets in a single vxlan device · b1be00a6

Committed by Jiri Benc 9 years ago

For a metadata-based vxlan interface, open both IPv4 and IPv6 sockets. This is
much more user friendly: it's not necessary to create two vxlan interfaces
and pay attention to using the right one in routing rules.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Sep, 2015 · 14 commits

inet: constify inet_rtx_syn_ack() sock argument · 1b70e977

Committed by Eric Dumazet 9 years ago

SYNACK packets are sent on behalf of unlocked listeners
or fastopen sockets. Mark the socket as const to catch future changes
that might break the assumption.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp/dccp: constify rtx_synack() and friends · ea3bea3a

Committed by Eric Dumazet 9 years ago

This is done to make sure we do not change the listener socket
while sending SYNACK packets when the socket lock is not held.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: constify tcp_v{4|6}_send_synack() socket argument · 0f935dbe

Committed by Eric Dumazet 9 years ago

This documents the fact that the listener lock might not be held
at the time SYNACKs are sent.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: constify ip6_xmit() sock argument · 1c1e9d2b

Committed by Eric Dumazet 9 years ago

This is to document that the socket lock might not be held at this point.

skb_set_owner_w() and ipv6_local_error() use proper atomic ops
or spinlocks, so we promote the socket to non-const when calling them.

netfilter hooks should never assume the socket lock is held, so
we also promote the socket to non-const there.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: constify tcp_make_synack() socket argument · 5d062de7

Committed by Eric Dumazet 9 years ago

The listener socket is not locked when tcp_make_synack() is called.

We had better make sure no field is written.

There is one exception: since SYNACK packets are attached to the listener
at this moment (or to the SYN_RECV child in case of Fast Open),
sock_wmalloc() needs to update sk->sk_wmem_alloc, but this is done using
atomic operations, so it is safe.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
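The idea behind this constification series can be illustrated with a minimal userspace sketch. It is not kernel code: struct toy_sock, toy_charge_wmem() and toy_make_synack() are hypothetical stand-ins, and C11 atomics replace the kernel's atomic helpers. A const pointer lets the compiler reject plain writes to the unlocked socket, while the one atomic counter is updated through a deliberate cast.

```c
#include <stdatomic.h>

/* Hypothetical, simplified stand-in for struct sock. */
struct toy_sock {
    int rcv_wnd;           /* plain field: must not be written without the lock */
    atomic_int wmem_alloc; /* atomic counter: may be updated without the lock */
};

/* Charge 'size' bytes to the socket. The caller only has a const pointer,
 * so const is cast away on purpose for this single atomic update. */
static void toy_charge_wmem(const struct toy_sock *sk, int size)
{
    atomic_fetch_add(&((struct toy_sock *)sk)->wmem_alloc, size);
}

/* Builds a "SYNACK" on behalf of an unlocked listener and returns the
 * window it would advertise. */
static int toy_make_synack(const struct toy_sock *sk, int size)
{
    /* sk->rcv_wnd = 0;  would be rejected by the compiler: sk is const */
    toy_charge_wmem(sk, size);
    return sk->rcv_wnd;
}
```

Any accidental write to a plain field inside toy_make_synack() becomes a compile error, which is the kind of mistake the const qualifier is meant to catch.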

ip: constify ip_build_and_send_pkt() socket argument · cfe673b0

Committed by Eric Dumazet 9 years ago

This function is used to build and send SYNACK packets,
possibly on behalf of an unlocked listener socket.

Make sure we did not miss a write by making this socket const.

We can no longer use ip_select_ident() and have to either
set iph->id to 0 or directly call __ip_select_ident().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: md5: constify tcp_md5_do_lookup() socket argument · b83e3deb

Committed by Eric Dumazet 9 years ago

When the new TCP listener is done, these functions will be called
without the socket lock being held. Make sure they don't change
anything.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: constify ip_dont_fragment() arguments · 4e3f5d72

Committed by Eric Dumazet 9 years ago

ip_dont_fragment() can accept a const socket and dst.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: constify inet6_csk_route_req() socket argument · 30d50c61

Committed by Eric Dumazet 9 years ago

The socket is not modified; make it const so that callers can
do the same if they need to.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: constify ip6_dst_lookup_{flow|tail}() sock arguments · 3aef934f

Committed by Eric Dumazet 9 years ago

ip6_dst_lookup_flow() and ip6_dst_lookup_tail() do not touch the
socket, so let's add a const qualifier.

This will permit the same change in inet6_csk_route_req().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: constify inet_csk_route_req() socket argument · e5895bc6

Committed by Eric Dumazet 9 years ago

This is used by the TCP listener core, and the listener socket shall
not be modified by inet_csk_route_req().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: constify ip_route_output_flow() socket argument · 6f9c9615

Committed by Eric Dumazet 9 years ago

Very soon, the TCP stack might call inet_csk_route_req(), which
calls ip_route_output_flow() with an unlocked listener socket,
so we need to make sure ip_route_output_flow() is not trying to
change any field of its socket argument.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: constify tcp_openreq_init_rwin() · b1964b5f

Committed by Eric Dumazet 9 years ago

Soon, the listener socket won't be locked when tcp_openreq_init_rwin()
is called. We need to read socket fields once, as their value
could change under us.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
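"Read socket fields once" can be sketched as follows. This is an illustrative userspace approximation, not the kernel implementation: struct toy_listener, its fields, and toy_read_once_int() are hypothetical, and the volatile access only mimics the intent of the kernel's READ_ONCE() for a plain int.

```c
/* Hypothetical, simplified listener state that another CPU may change. */
struct toy_listener {
    int rcvbuf;
    int window_clamp;
};

/* Force exactly one load of *p, so the compiler cannot re-read the field. */
static int toy_read_once_int(const int *p)
{
    return *(const volatile int *)p;
}

/* Compute an initial receive window from one snapshot of each field. */
static int toy_init_rwin(const struct toy_listener *sk)
{
    int clamp = toy_read_once_int(&sk->window_clamp);
    int space = toy_read_once_int(&sk->rcvbuf);

    /* Only the local copies are used from here on, so the check and the
     * assignment always see the same value. */
    if (clamp > 0 && space > clamp)
        space = clamp;
    return space;
}
```

Without the snapshot, the comparison and the assignment could observe two different values of window_clamp if it changed in between.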

tcp: constify listener socket in tcp_v[46]_init_req() · b40cf18e

Committed by Eric Dumazet 9 years ago

Soon, the listener socket spinlock will no longer be held, so
add const arguments to tcp_v[46]_init_req() to make clear these
functions cannot mess with socket fields.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Sep, 2015 · 15 commits

phy: add phy_device_remove() · 38737e49

Committed by Russell King 9 years ago

Add a phy_device_remove() function to complement phy_device_register(),
which undoes the effects of phy_device_register() by removing the phy
device from visibility, but not freeing it.

This allows these details to be moved out of the mdio bus code into
the phy code where this action belongs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

phy: fix mdiobus module safety · 3e3aaf64

Committed by Russell King 9 years ago

Re-implement the mdiobus module refcounting to ensure that the mdiobus
module code does not go away while we might call into it.

The old scheme using bus->dev.driver was buggy, because bus->dev is a
class device which never has a struct device_driver associated with it,
and hence the associated code trying to obtain a refcount did nothing
useful.

Instead, take the approach that other subsystems do: pass the module
when calling mdiobus_register(), and record that in the mii_bus struct.
When we need to increment the module use count in the phy code, use
this stored pointer. When the phy is detached, drop the module
refcount, remembering that the phy device might go away at that point.

This doesn't stop the mii_bus going away while there are in-use phys;
it merely stops the underlying code vanishing.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
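The ownership scheme described here can be modelled in a few lines. This is a sketch under stated assumptions: struct toy_module, struct toy_mii_bus and all toy_* functions are hypothetical userspace stand-ins, and the plain integer refcount ignores the locking and atomicity a real module refcount needs.

```c
#include <stddef.h>

/* Hypothetical stand-in for a loadable module with a use count. */
struct toy_module {
    int refcnt;
    int going_away; /* set once unloading has started */
};

/* Hypothetical stand-in for the bus: it remembers who owns its code. */
struct toy_mii_bus {
    struct toy_module *owner;
};

static int toy_try_module_get(struct toy_module *m)
{
    if (m == NULL)
        return 1; /* built-in code can never go away */
    if (m->going_away)
        return 0;
    m->refcnt++;
    return 1;
}

static void toy_module_put(struct toy_module *m)
{
    if (m != NULL)
        m->refcnt--;
}

/* Registration records the owning module in the bus structure. */
static void toy_mdiobus_register(struct toy_mii_bus *bus, struct toy_module *owner)
{
    bus->owner = owner;
}

/* Attaching a phy pins the bus driver's code; returns 0 on success. */
static int toy_phy_attach(struct toy_mii_bus *bus)
{
    return toy_try_module_get(bus->owner) ? 0 : -1;
}

/* Detaching drops the reference taken at attach time. */
static void toy_phy_detach(struct toy_mii_bus *bus)
{
    toy_module_put(bus->owner);
}
```

As in the commit message, the reference only keeps the code loaded; nothing in this sketch prevents the bus object itself from being torn down.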

switchdev: reduce transaction phase enum down to a boolean · f623ab7f

Committed by Jiri Pirko 9 years ago

Now that we have only 2 values for the transaction phase, just use a bool.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

switchdev: remove "ABORT" transaction phase · 9f6467cf

Committed by Jiri Pirko 9 years ago

No longer used by drivers, as the transaction queue with item destructors
takes care of the abort phase internally in switchdev code. So kill it.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

switchdev: remove "NONE" transaction phase · 2b8a61a6

Committed by Jiri Pirko 9 years ago

Shouldn't have been there in the first place. Now it is unused, kill it.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

switchdev: add switchdev_trans_ph_prepare/commit helpers · 8bdb4272

Committed by Jiri Pirko 9 years ago

Add helpers which should be used in attr_set/obj_add switchdev ops to
check the phase of the transaction.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
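A rough illustration of how a driver op might use such phase helpers follows. Everything here is hypothetical (struct toy_trans, struct toy_port, toy_attr_set()), and the phase is modelled as a bool, which is an assumption borrowed from the later "reduce transaction phase enum down to a boolean" commit in this list rather than from this commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transaction: true during prepare, false during commit. */
struct toy_trans {
    bool ph_prepare;
};

static bool toy_trans_ph_prepare(const struct toy_trans *trans)
{
    return trans != NULL && trans->ph_prepare;
}

static bool toy_trans_ph_commit(const struct toy_trans *trans)
{
    return trans != NULL && !trans->ph_prepare;
}

/* Hypothetical port state modified by an attr_set style operation. */
struct toy_port {
    int ageing_time;
};

/* Two-phase set: validate in both phases, touch state only on commit. */
static int toy_attr_set(struct toy_port *port, int ageing_time,
                        const struct toy_trans *trans)
{
    if (ageing_time < 0)
        return -1; /* rejected in the prepare phase, before any change */
    if (toy_trans_ph_prepare(trans))
        return 0;  /* prepare: check only, leave state alone */
    port->ageing_time = ageing_time;
    return 0;
}
```

The prepare call lets the caller learn that an operation would fail before any port has been modified.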

switchdev: move transaction phase enum under transaction structure · f8db8348

Committed by Jiri Pirko 9 years ago

Before it disappears completely, move the transaction phase enum under
the transaction structure and make the attr/obj structures a bit cleaner.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

switchdev: introduce transaction item queue for attr_set and obj_add · 7ea6eb3f

Committed by Jiri Pirko 9 years ago

Currently, the memory allocation in the prepare/commit state is done separately
in each driver (rocker). Introduce a similar mechanism in generic
switchdev code, in the form of a queue. It can be used not only for memory
allocations, but also for other items. Item destruction on abort
is handled as well.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
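The item queue idea can be sketched in plain C. This is a simplified model, not the switchdev API: struct toy_item, struct toy_trans and the toy_* functions are hypothetical names. Items are queued during prepare, dequeued in order during commit, and anything still queued on abort is released through its own destructor.

```c
#include <stdlib.h>

/* One queued item plus the function that knows how to release it. */
struct toy_item {
    struct toy_item *next;
    void *data;
    void (*destructor)(void *data);
};

/* FIFO of items owned by one transaction. */
struct toy_trans {
    struct toy_item *head;
    struct toy_item *tail;
};

static void toy_trans_init(struct toy_trans *t)
{
    t->head = NULL;
    t->tail = NULL;
}

/* Prepare phase: hand an allocated resource to the transaction. */
static int toy_item_enqueue(struct toy_trans *t, void *data,
                            void (*destructor)(void *data))
{
    struct toy_item *it = malloc(sizeof(*it));

    if (it == NULL)
        return -1;
    it->next = NULL;
    it->data = data;
    it->destructor = destructor;
    if (t->tail != NULL)
        t->tail->next = it;
    else
        t->head = it;
    t->tail = it;
    return 0;
}

/* Commit phase: take items back in the order they were queued. */
static void *toy_item_dequeue(struct toy_trans *t)
{
    struct toy_item *it = t->head;
    void *data;

    if (it == NULL)
        return NULL;
    t->head = it->next;
    if (t->head == NULL)
        t->tail = NULL;
    data = it->data;
    free(it);
    return data;
}

/* Abort: destroy whatever is still queued; returns the number destroyed. */
static int toy_trans_abort(struct toy_trans *t)
{
    struct toy_item *it;
    int destroyed = 0;

    while ((it = t->head) != NULL) {
        t->head = it->next;
        it->destructor(it->data);
        free(it);
        destroyed++;
    }
    t->tail = NULL;
    return destroyed;
}
```

Because each item carries its destructor, the generic code can clean up after a failed prepare without the driver implementing a separate abort phase.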

switchdev: rename "trans" to "trans_ph". · 69f5df49

Committed by Jiri Pirko 9 years ago

This is temporary: the name "trans" will be used for something else and
"trans_ph" will eventually disappear.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

lwtunnel: remove source and destination UDP port config option · b194f30c

Committed by Jiri Benc 9 years ago

The UDP tunnel config is asymmetric with respect to the ports used. The source and
destination ports from one direction of the tunnel are not related to the
ports of the other direction. We need to be able to respond to ARP requests
using the correct ports without involving routing.

As a consequence, UDP ports need to be a fixed property of the tunnel
interface and cannot be set per route. Remove the ability to set ports per
route. This is still okay to do, as no kernel has been released with these
attributes yet.

Note that the ability to specify source and destination ports is preserved
for other users of the lwtunnel API which don't use routes for tunnel key
specification (like openvswitch).

If in the future we rework ARP handling to allow port specification, the
attributes can be added back.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: send arp replies to the correct tunnel · 63d008a4

Committed by Jiri Benc 9 years ago

When using ip lwtunnels, the additional data for xmit (basically, the actual
tunnel to use) are carried in ip_tunnel_info, either in dst->lwtstate or in a
metadata dst. When replying to ARP requests, we need to send the reply to
the same tunnel the request came from. This means we need to construct a
proper metadata dst for ARP replies.

We could perform another route lookup to get a dst entry with the correct
lwtstate. However, this won't always ensure that the outgoing tunnel is the
same as the incoming one, and it won't work anyway for IPv4 duplicate
address detection.

The only thing to do is to "reverse" the ip_tunnel_info.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
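"Reversing" tunnel metadata can be pictured with a small sketch. The structure and function below are hypothetical and far simpler than the kernel's ip_tunnel_info; they only show the core step of keeping the tunnel id and swapping the outer endpoints so that a reply travels back through the tunnel the request arrived on.

```c
#include <stdint.h>

/* Hypothetical, reduced tunnel key: tunnel id plus outer IPv4 endpoints. */
struct toy_tunnel_key {
    uint64_t tun_id;
    uint32_t src; /* outer source address */
    uint32_t dst; /* outer destination address */
};

/* Build the transmit-side key for a reply from the receive-side key. */
static struct toy_tunnel_key toy_tunnel_key_reverse(const struct toy_tunnel_key *rx)
{
    struct toy_tunnel_key tx = *rx; /* same tunnel id */

    tx.src = rx->dst; /* we answer from the address the request was sent to */
    tx.dst = rx->src; /* and send to whoever originated it */
    return tx;
}
```

UDP ports are deliberately absent from this sketch, in line with the companion lwtunnel commit that makes them a fixed property of the tunnel interface.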

skbuff: Fix skb checksum flag on skb pull · 6ae459bd

Committed by Pravin B Shelar 9 years ago

A VXLAN device can receive an skb with checksum partial, but the checksum
offset could be in the outer header, which is pulled on receive. This results
in a negative checksum offset for the skb. Such an skb can cause an assert
failure in skb_checksum_help(). The following patch fixes the bug by setting
checksum-none while pulling the outer header.

The following is the kernel panic message from an old kernel hitting the bug.

------------[ cut here ]------------
kernel BUG at net/core/dev.c:1906!
RIP: 0010:[<ffffffff81518034>] skb_checksum_help+0x144/0x150
Call Trace:
<IRQ>
[<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch]
[<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch]
[<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
[<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
[<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch]
[<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch]
[<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
[<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0
[<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0
[<ffffffff8157ba7a>] udp_rcv+0x1a/0x20
[<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280
[<ffffffff81550128>] ip_local_deliver+0x88/0x90
[<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370
[<ffffffff81550365>] ip_rcv+0x235/0x300
[<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620
[<ffffffff8151c360>] netif_receive_skb+0x80/0x90
[<ffffffff81459935>] virtnet_poll+0x555/0x6f0
[<ffffffff8151cd04>] net_rx_action+0x134/0x290
[<ffffffff810683d8>] __do_softirq+0xa8/0x210
[<ffffffff8162fe6c>] call_softirq+0x1c/0x30
[<ffffffff810161a5>] do_softirq+0x65/0xa0
[<ffffffff810687be>] irq_exit+0x8e/0xb0
[<ffffffff81630733>] do_IRQ+0x63/0xe0
[<ffffffff81625f2e>] common_interrupt+0x6e/0x6e
Reported-by: Anupam Chanda <achanda@vmware.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
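The failure mode and the fix can be modelled with integer offsets. This is a toy model under stated assumptions: struct toy_skb, its fields and toy_pull() are hypothetical, and offsets are counted from the start of the buffer rather than using real skb pointers.

```c
/* Hypothetical checksum states, mirroring the none/partial distinction. */
enum toy_csum {
    TOY_CHECKSUM_NONE,
    TOY_CHECKSUM_PARTIAL
};

/* Toy packet: offsets are measured from the start of the buffer. */
struct toy_skb {
    int data_off;   /* where the current header starts */
    int csum_start; /* where checksumming should begin */
    enum toy_csum ip_summed;
};

/* Checksum start relative to the current data pointer. */
static int toy_csum_start_offset(const struct toy_skb *skb)
{
    return skb->csum_start - skb->data_off;
}

/* Pull 'len' bytes of outer header off the front of the packet. */
static void toy_pull(struct toy_skb *skb, int len)
{
    skb->data_off += len;

    /* If the checksum start was inside the header just removed, the offset
     * is now negative and the partial checksum can no longer be completed:
     * downgrade to "no checksum" instead of tripping an assertion later. */
    if (skb->ip_summed == TOY_CHECKSUM_PARTIAL &&
        toy_csum_start_offset(skb) < 0)
        skb->ip_summed = TOY_CHECKSUM_NONE;
}
```

A packet whose checksum start lies beyond the pulled header keeps its partial state, with the offset simply shrinking by the pulled length.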

cgroup, writeback: don't enable cgroup writeback on traditional hierarchies · 9badce00

Committed by Tejun Heo 9 years ago

inode_cgwb_enabled() gates cgroup writeback support. If it returns
true, each inode is attached to the corresponding memory domain, which
gets mapped to an io domain. It currently only tests whether the
filesystem and bdi support cgroup writeback; however, cgroup writeback
support doesn't work on traditional hierarchies, and thus it should
also test whether memcg and iocg are on the default hierarchy.

This caused traditional hierarchy setups to hit the cgroup writeback
path inadvertently and end up creating separate writeback domains
for each memcg and mapping them all to the root iocg, uncovering a
couple of issues in the cgroup writeback path.

cgroup writeback was never meant to be enabled on traditional
hierarchies. Make inode_cgwb_enabled() test whether both memcg and
iocg are on the default hierarchy.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Artem Bityutskiy <dedekind1@gmail.com>
Reported-by: Dexuan Cui <decui@microsoft.com>
Link: http://lkml.kernel.org/g/1443012552.19983.209.camel@gmail.com
Link: http://lkml.kernel.org/g/f30d4a6aa8a546ff88f73021d026a453@SIXPR30MB031.064d.mgd.msft.net
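The tightened gate amounts to a conjunction of four conditions. The sketch below is illustrative only: struct toy_inode and its flags are hypothetical, and the two hierarchy flags are passed in as parameters instead of being queried from the cgroup core.

```c
#include <stdbool.h>

/* Hypothetical per-inode capability flags. */
struct toy_inode {
    bool fs_supports_cgwb;  /* filesystem opted in to cgroup writeback */
    bool bdi_supports_cgwb; /* backing device opted in as well */
};

/* Gate for cgroup writeback. The first two conditions are the new ones:
 * both controllers must be on the default (unified) hierarchy. */
static bool toy_inode_cgwb_enabled(const struct toy_inode *inode,
                                   bool memcg_on_default,
                                   bool io_on_default)
{
    return memcg_on_default &&
           io_on_default &&
           inode->bdi_supports_cgwb &&
           inode->fs_supports_cgwb;
}
```

With the old two-condition test, a traditional hierarchy setup would have passed the gate whenever the filesystem and bdi were capable.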

ipv6: remove unused neigh parameter from ndisc functions · 38cf595b

Committed by Jiri Benc 9 years ago

Since commit 12fd84f4 ("ipv6: Remove unused neigh argument for
icmp6_dst_alloc() and its callers."), the neigh parameter of ndisc_send_na
and ndisc_send_ns is unused.

CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

genetlink: simplify genl_notify · 92c14d9b

Committed by Jiri Benc 9 years ago

The genl_notify function has too many arguments for no real reason: all
callers use genl_info to get them anyway. Just pass the genl_info down to
genl_notify.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Sep, 2015 · 3 commits

net/ethoc: support big-endian register layout · 06e60e59

Committed by Max Filippov 9 years ago

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: define some min/max/default ageing time constants · a79e88d9

Committed by Scott Feldman 9 years ago

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

netpoll: Close race condition between poll_one_napi and napi_disable · 2d8bff12

Committed by Neil Horman 9 years ago

Drivers might call napi_disable while not holding the napi instance poll_lock.
In those instances, it's possible for a race condition to exist between
poll_one_napi and napi_disable. That is to say, poll_one_napi only tests the
NAPI_STATE_SCHED bit to see if there is work to do during a poll, and as such
the following may happen:

CPU0				CPU1
ndo_tx_timeout			napi_poll_dev
 napi_disable			 poll_one_napi
  test_and_set_bit (ret 0)
				  test_bit (ret 1)
   reset adapter		   napi_poll_routine

If the adapter gets a tx timeout without a napi instance scheduled, it's possible
for the adapter to think it has exclusive access to the hardware (as the napi
instance is now scheduled via the napi_disable call), while the netpoll code
thinks there is simply work to do. The result is parallel hardware access
leading to corrupt data structures in the driver, and a crash.

Additionally, there is another, more critical race between netpoll and
napi_disable. The disabled napi state is actually identical to the scheduled
state for a given napi instance. The implication is that, if a napi instance
is disabled, a netconsole instance would see the napi state of the device as
having been scheduled, and poll it, likely while the driver was doing something
requiring exclusive access. In the case above, it's fairly clear that not having
the rings in a state ready to be polled will cause any number of crashes.

The fix should be pretty easy. netpoll uses its own bit to indicate that
the napi instance is in a state of being serviced by netpoll (NAPI_STATE_NPSVC).
We can just gate disabling on that bit as well as the sched bit. That should
prevent netpoll from conducting a napi poll if we convert its set bit to a
test_and_set_bit operation to provide mutual exclusion.

Change notes:
V2)
	Remove a trailing whitespace
	Resubmit with proper subject prefix

V3)
	Clean up spacing nits
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: jmaxwell@redhat.com
Tested-by: jmaxwell@redhat.com
Signed-off-by: David S. Miller <davem@davemloft.net>
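The gating described in this message can be approximated in userspace with C11 atomics. This is a sketch, not the kernel code: struct toy_napi, the TOY_STATE_* bits and all toy_* functions are hypothetical, atomic_fetch_or() stands in for test_and_set_bit(), and the disable path is written as a non-blocking "try" so it can be exercised from a single thread, whereas a real disable would wait and retry.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits, modelled on the SCHED and NPSVC bits. */
enum {
    TOY_STATE_SCHED = 1u << 0, /* scheduled for polling, or disabled */
    TOY_STATE_NPSVC = 1u << 1  /* claimed by netpoll or by disable */
};

struct toy_napi {
    atomic_uint state;
};

/* Atomically set a bit and report whether it was already set. */
static bool toy_test_and_set(struct toy_napi *n, unsigned int bit)
{
    return (atomic_fetch_or(&n->state, bit) & bit) != 0;
}

static void toy_clear(struct toy_napi *n, unsigned int bit)
{
    atomic_fetch_and(&n->state, ~bit);
}

/* netpoll side: returns true only if it now exclusively owns the poll. */
static bool toy_netpoll_begin(struct toy_napi *n)
{
    if (!(atomic_load(&n->state) & TOY_STATE_SCHED))
        return false; /* nothing scheduled, nothing to poll */
    if (toy_test_and_set(n, TOY_STATE_NPSVC))
        return false; /* already claimed: stay away */
    return true;
}

static void toy_netpoll_end(struct toy_napi *n)
{
    toy_clear(n, TOY_STATE_NPSVC);
}

/* Disable side: must win both bits before touching the hardware. */
static bool toy_napi_try_disable(struct toy_napi *n)
{
    if (toy_test_and_set(n, TOY_STATE_SCHED))
        return false; /* a poll is scheduled or running */
    if (toy_test_and_set(n, TOY_STATE_NPSVC)) {
        toy_clear(n, TOY_STATE_SCHED); /* undo our own claim */
        return false; /* netpoll is servicing this instance */
    }
    return true;
}
```

Because a disabled instance holds the NPSVC bit as well, netpoll no longer mistakes "disabled" for "scheduled with work to do".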

23 Sep, 2015 · 7 commits

arcnet: Move files out of include/linux · 26c6d281

Committed by Joe Perches 9 years ago

These #include files don't need to be in the include/linux directory,
as they can be local to drivers/net/arcnet/.

Move them and update the #include statements.

Update the MAINTAINERS file pattern by deleting arcdevice from the
NETWORKING block, as arcnet is currently unmaintained.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>

arcnet: Wrap some long lines · d6d7d3ed

Committed by Joe Perches 9 years ago

Just neatening.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>

arcnet: Convert arcnet_dump_skb macro to static inline · 83df99b5

Committed by Joe Perches 9 years ago

Make sure the arguments are tested appropriately when not using
this function.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>

arcnet: Convert BUGMSG and BUGMSG2 to arc_printk and arc_cont · a34c0932

Committed by Joe Perches 9 years ago

These macros don't actually represent BUG uses but are more commonly
used as logging macros, so use a more kernel-style macro.

Convert the BUGMSG from a netdev_-like use to actually use netdev_<level>.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>

arcnet: Expand odd BUGLVL macro with if and uses · 72aeea48

Committed by Joe Perches 9 years ago

Don't hide what should be obvious.

Make the macro a simple test instead of using if and test.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>

arcnet: Neaten BUGMSG macro defines · d77510f3

Committed by Joe Perches 9 years ago

These macros are actually printk and pr_cont uses with a flag.

Add a new BUGLVL_TEST macro which is just the "should use" test
and not an odd "if (<foo>)" macro to simplify uses in a new patch.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
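The difference between a plain test macro and an "if" hidden inside a macro can be shown with a small sketch. All names and values here (TOY_D_*, TOY_DEBUG_MAX, toy_debug, TOY_BUGLVL_TEST, TOY_BUGLVL) are hypothetical and only assume a compile-time ceiling combined with a runtime mask; they are not copied from the arcnet headers.

```c
/* Hypothetical debug level bits. */
#define TOY_D_NORMAL 1
#define TOY_D_EXTRA  2
#define TOY_D_DURING 16

/* Compile-time ceiling: levels above this can never be enabled. */
#define TOY_DEBUG_MAX (TOY_D_NORMAL | TOY_D_EXTRA)

/* Runtime mask of currently enabled levels. */
static int toy_debug = TOY_D_NORMAL;

/* Plain expression: usable anywhere a condition is expected. */
#define TOY_BUGLVL_TEST(x) ((x) <= TOY_DEBUG_MAX && ((x) & toy_debug))

/* The "odd" form: an if statement hidden inside the macro. */
#define TOY_BUGLVL(x) if (TOY_BUGLVL_TEST(x))

/* Count how many of three sample levels are currently enabled. */
static int toy_count_enabled(void)
{
    int n = 0;

    if (TOY_BUGLVL_TEST(TOY_D_NORMAL)) /* control flow is visible */
        n++;
    TOY_BUGLVL(TOY_D_EXTRA)            /* control flow is hidden */
        n++;
    if (TOY_BUGLVL_TEST(TOY_D_DURING)) /* above the ceiling: never true */
        n++;
    return n;
}
```

The second use reads like a function call followed by a stray statement, which is the readability problem the later "Expand odd BUGLVL macro" commit in this list removes.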

arcnet: Add and remove blank lines · 01a1d5ac

Committed by Joe Perches 9 years ago

Use a more current kernel line style.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>

openeuler / Kernel: synced successfully more than 1 year ago
