Commits · 0fc1e0495fd6e261e75acdbe66b53e769e5ffb81 · openeuler / raspberrypi-kernel

24 Oct, 2014 3 commits

mac80211: expose API allowing station iteration · 0fc1e049

Authored by Arik Nemtsov on Oct 22, 2014

Allow drivers to iterate all stations currently uploaded to them.
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

0fc1e049

mac80211: expose TDLS-initiator value to low level driver · 8b94148c

Authored by Arik Nemtsov on Oct 22, 2014

Some drivers need to know which station is the TDLS link initiator.
Expose this value via the mac80211 ieee80211_sta structure.
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

8b94148c

mac80211: export IE splitting function · a7f3a768

Authored by Andrei Otcheretianski on Oct 22, 2014

Export the ieee80211_ie_split function, so it can be reused by
drivers which need to insert additional elements.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

a7f3a768

22 Oct, 2014 2 commits

mac80211: add WMM admission control support · 02219b3a

Authored by Johannes Berg on Oct 07, 2014

Use the currently existing APIs between mac80211 and the low
level driver to implement WMM admission control.

The low level driver needs to report the media time used by
each transmitted packet in ieee80211_tx_status. Based on that
information, mac80211 will modify the QoS parameters of the
admission controlled Access Category when the limit is
reached. Once the original QoS parameters can be restored,
mac80211 will do so.

One issue with this approach is that management frames will
also erroneously be downgraded, but the upside is that the
implementation is simple. In the future, it can be extended
to driver- or device-based implementations that are better.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

02219b3a
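
The accounting described in the WMM admission control commit above can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct ac_budget`, `ac_budget_account_tx()` and `ac_budget_new_period()` are hypothetical names, not mac80211 symbols, and the periodic reset is one guess at how "once the original QoS parameters can be restored" might be realized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-access-category budget; not a mac80211 structure. */
struct ac_budget {
	uint32_t admitted_time_us; /* medium time granted per period */
	uint32_t used_time_us;     /* medium time consumed in this period */
	bool downgraded;           /* true while QoS parameters are lowered */
};

/*
 * Account the medium time the driver reported for one transmitted frame.
 * Returns true if this frame is the one that exhausted the budget.
 */
static bool ac_budget_account_tx(struct ac_budget *b, uint32_t tx_time_us)
{
	assert(b);
	b->used_time_us += tx_time_us;
	if (!b->downgraded && b->used_time_us >= b->admitted_time_us) {
		b->downgraded = true; /* the caller would lower the AC's QoS parameters */
		return true;
	}
	return false;
}

/*
 * Assumed periodic reset: clear the usage counter and report whether the
 * original QoS parameters should be restored.
 */
static bool ac_budget_new_period(struct ac_budget *b)
{
	bool restore = b->downgraded;

	assert(b);
	b->used_time_us = 0;
	b->downgraded = false;
	return restore;
}
```

Because the model downgrades the whole access category, every frame queued on it is affected, which matches the commit's remark that management frames are downgraded as well.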

cfg80211: make WMM TSPEC support flag an nl80211 feature flag · 723e73ac

Authored by Johannes Berg on Oct 22, 2014

During the review of the corresponding wpa_supplicant patches we
noticed that the only way for it to detect that this functionality
is supported currently is to check for the command support. This
can be misleading though, as the command was also designed to, in
the future, support pure 802.11 TSPECs.

Expose the WMM-TSPEC feature flag to nl80211 so later we can also
expose an 802.11-TSPEC feature flag (if needed) to differentiate
the two cases.

Note: this change isn't needed in 3.18 as there's no driver there
yet that supports the functionality at all.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

723e73ac

20 Oct, 2014 2 commits

cfg80211: Specify frame and reason code for NL80211_CMD_DEL_STATION · 98856866

Authored by Jouni Malinen on Oct 20, 2014

The optional NL80211_ATTR_MGMT_SUBTYPE and NL80211_ATTR_REASON_CODE
attributes can now be included in NL80211_CMD_DEL_STATION to indicate to
the driver which frame (Deauthentication/Disassociation) and reason code
in that frame should be used to indicate removal to the specific
station. This is used by drivers that implement AP SME and generate
those frames internally.
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

98856866

cfg80211: Convert del_station() callback to use a param struct · 89c771e5

Authored by Jouni Malinen on Oct 10, 2014

This makes it easier to add new parameters for the del_station calls
without having to modify all drivers that use this.
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

89c771e5
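
The two del_station commits above combine a parameter struct with optional frame-subtype and reason-code inputs. The sketch below illustrates that pattern in plain userspace C. All names are hypothetical (this is not the cfg80211 structure), and the defaults (Deauthentication, reason code 2) and the rejection of reason code 0 are assumptions made for the example. The subtype values 10 and 12 are the IEEE 802.11 management subtypes for Disassociation and Deauthentication.

```c
#include <stddef.h>
#include <stdint.h>

/* IEEE 802.11 management frame subtypes. */
enum { SUBTYPE_DISASSOC = 10, SUBTYPE_DEAUTH = 12 };
/* IEEE 802.11 reason code 2: previous authentication no longer valid. */
enum { REASON_PREV_AUTH_NOT_VALID = 2 };

/* Hypothetical stand-in for a station deletion parameter struct. */
struct del_station_params {
	const uint8_t *mac;   /* station address; NULL could mean "all stations" */
	uint8_t subtype;      /* which management frame to send */
	uint16_t reason_code; /* reason code carried in that frame */
};

/*
 * Fill the struct from optional inputs. A negative subtype or reason_code
 * means "attribute not supplied", in which case the assumed default is kept.
 * Returns 0 on success, -1 on an invalid value.
 */
static int del_station_params_init(struct del_station_params *p,
				   const uint8_t *mac,
				   int subtype, int reason_code)
{
	p->mac = mac;
	p->subtype = SUBTYPE_DEAUTH;
	p->reason_code = REASON_PREV_AUTH_NOT_VALID;

	if (subtype >= 0) {
		if (subtype != SUBTYPE_DEAUTH && subtype != SUBTYPE_DISASSOC)
			return -1;
		p->subtype = (uint8_t)subtype;
	}
	if (reason_code >= 0) {
		/* Assumption: treat the reserved value 0 as invalid. */
		if (reason_code == 0 || reason_code > UINT16_MAX)
			return -1;
		p->reason_code = (uint16_t)reason_code;
	}
	return 0;
}
```

A new optional field can later be added to the struct and given a default here, and callers that only read the existing fields need no change, which is the benefit the callback conversion describes.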

09 Oct, 2014 5 commits

mac80211: allow channel switch with multiple channel contexts · 0f791eb4

Authored by Luciano Coelho on Oct 08, 2014

Channel switch with multiple channel contexts should now work fine.
Remove the check that disallows switches when multiple contexts are in
use.
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

0f791eb4

mac80211: add post_channel_switch driver operation · f1d65583

Authored by Luciano Coelho on Oct 08, 2014

As a counterpart to the pre_channel_switch operation, add a
post_channel_switch operation.  This allows the drivers to go back to
a normal configuration after the channel switch is completed.
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

f1d65583

mac80211: add pre_channel_switch driver operation · 6d027bcc

Authored by Luciano Coelho on Oct 08, 2014

Some drivers may need to prepare for a channel switch also when it is
initiated from the remote side (e.g. station, P2P client).  To make
this possible, add a generic callback that can be called for all
interface types.
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

6d027bcc

mac80211: add device_timestamp to the ieee80211_channel_switch struct · 2ba45384

Authored by Luciano Coelho on Oct 08, 2014

Some devices may need the device timestamp in order to synchronize the
channel switch.  To pass this value back to the driver, add it to the
channel switch structure and copy the device_timestamp value received
in the rx info structure into it.
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

2ba45384

cfg80211: add ops to query mesh proxy path table · 66be7d2b

Authored by Henning Rogge on Sep 12, 2014

Add two new cfg80211 operations for querying a table with proxied mesh
paths.
Signed-off-by: Henning Rogge <henning.rogge@fkie.fraunhofer.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

66be7d2b

07 Oct, 2014 5 commits

openvswitch: fix a compilation error when CONFIG_INET is not set · 7c5df8fa

Authored by Andy Zhou on Oct 06, 2014

Fix an openvswitch compilation error when CONFIG_INET is not set:

=====================================================
   In file included from include/net/geneve.h:4:0,
                    from net/openvswitch/flow_netlink.c:45:
   include/net/udp_tunnel.h: In function 'udp_tunnel_handle_offloads':
>> include/net/udp_tunnel.h:100:2: error: implicit declaration of function 'iptunnel_handle_offloads' [-Werror=implicit-function-declaration]
     return iptunnel_handle_offloads(skb, udp_csum, type);
     ^
>> include/net/udp_tunnel.h:100:2: warning: return makes pointer from integer without a cast
   cc1: some warnings being treated as errors

=====================================================
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c5df8fa

ipv6: make fib6 serial number per namespace · 812918c4

Authored by Hannes Frederic Sowa on Oct 06, 2014

Try to reduce the number of possible fn_sernum mutations by constraining
them to their namespace.

Also remove rt_genid which I forgot to remove in 705f1c86 ("ipv6:
remove rt6i_genid").

Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

812918c4

ipv6: make rt_sernum atomic and serial number fields ordinary ints · 42b18706

Authored by Hannes Frederic Sowa on Oct 06, 2014

Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

42b18706

ipv6: minor fib6 cleanups like type safety, bool conversion, inline removal · 94b2cfe0

Authored by Hannes Frederic Sowa on Oct 06, 2014

Also rename struct fib6_walker_t to fib6_walker and enum fib_walk_state_t
to fib6_walk_state, as recommended by Cong Wang.

Cc: Cong Wang <cwang@twopensource.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

94b2cfe0

net: sched: remove tcf_proto from ematch calls · 82a470f1

Authored by John Fastabend on Oct 05, 2014

This removes the tcf_proto argument from the ematch code paths that
only need it to reference the net namespace. This allows simplifying
qdisc code paths, especially when we need to tear down the ematch
from an RCU callback. In this case we cannot guarantee that the
tcf_proto structure is still valid.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

82a470f1

06 Oct, 2014 4 commits

net: sched: avoid costly atomic operation in fq_dequeue() · f2600cf0

Authored by Eric Dumazet on Oct 04, 2014

The standard qdisc API to set up a timer implies an atomic operation on
every packet dequeue: qdisc_unthrottled()

It turns out this is not really needed for FQ, as FQ has no concept of
global qdisc throttling: it is a qdisc handling many different flows,
some of which can be throttled while others are not.

The fix is straightforward: add a 'bool throttle' to
qdisc_watchdog_schedule_ns(), and remove calls to qdisc_unthrottled()
in sch_fq.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f2600cf0

openvswitch: Add support for Geneve tunneling. · f5796684

Authored by Jesse Gross on Oct 03, 2014

The Openvswitch implementation is completely agnostic to the options
that are in use and can handle newly defined options without
further work. It does this by simply matching on a byte array
of options and allowing userspace to set up flows on this array.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f5796684

net: Add Geneve tunneling protocol driver · 0b5e8b8e

Authored by Andy Zhou on Oct 03, 2014

This adds device-level support for Geneve -- Generic Network
Virtualization Encapsulation. The protocol is documented at
http://tools.ietf.org/html/draft-gross-geneve-01

Only protocol-layer Geneve support is provided by this driver.
Openvswitch can be used to configure, set up and tear down
functional Geneve tunnels.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0b5e8b8e

sctp: handle association restarts when the socket is closed. · bdf6fa52

Authored by Vlad Yasevich on Oct 03, 2014

Currently association restarts do not take into consideration the
state of the socket.  When a restart happens, the current association
simply transitions into established state.  This creates a condition
where a remote system, through the restart procedure, may create a
local association that is in no way reachable by the user.  The conditions
to trigger this are as follows:
  1) Remote does not acknowledge some data causing data to remain
     outstanding.
  2) Local application calls close() on the socket.  Since data
     is still outstanding, the association is placed in SHUTDOWN_PENDING
     state.  However, the socket is closed.
  3) The remote tries to create a new association, triggering a restart
     on the local system.  The association moves from SHUTDOWN_PENDING
     to ESTABLISHED.  At this point, it is no longer reachable by
     any socket on the local system.

This patch addresses the above situation by moving the newly ESTABLISHED
association into SHUTDOWN-SENT state and bundling a SHUTDOWN after
the COOKIE-ACK chunk.  This way, the restarted association immediately
enters the shutdown procedure and forces the termination of the
unreachable association.
Reported-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bdf6fa52
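
The restart handling described in the SCTP commit above can be summarized as a tiny state transition. The following is a simplified userspace model, not the SCTP state machine code: `struct assoc_model`, `handle_restart()` and the reply enum are hypothetical names, and the `sock_closed` flag stands in for whatever check the kernel uses to detect a closed socket.

```c
#include <stdbool.h>

enum assoc_state { ESTABLISHED, SHUTDOWN_PENDING, SHUTDOWN_SENT };

/* What the local side sends in response to the restart. */
enum restart_reply {
	REPLY_COOKIE_ACK,              /* normal restart */
	REPLY_COOKIE_ACK_THEN_SHUTDOWN /* SHUTDOWN bundled after COOKIE-ACK */
};

/* Hypothetical, heavily reduced view of an association. */
struct assoc_model {
	enum assoc_state state;
	bool sock_closed; /* the application already called close() */
};

/*
 * Handle a restart triggered by the remote side. If the association was
 * waiting to shut down and no socket can reach it any more, go straight
 * to SHUTDOWN_SENT instead of ESTABLISHED.
 */
static enum restart_reply handle_restart(struct assoc_model *a)
{
	if (a->state == SHUTDOWN_PENDING && a->sock_closed) {
		a->state = SHUTDOWN_SENT;
		return REPLY_COOKIE_ACK_THEN_SHUTDOWN;
	}
	a->state = ESTABLISHED;
	return REPLY_COOKIE_ACK;
}
```

In the model, only the combination of SHUTDOWN_PENDING and a closed socket takes the new path, so restarts on associations that still have an owner behave as before.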

05 Oct, 2014 1 commit

Removed unused inet6 address state · dd3619f2

Authored by Sébastien Barré on Oct 02, 2014

The inet6 state INET6_IFADDR_STATE_UP only appeared in its definition.

Cc: Christoph Paasch <christoph.paasch@uclouvain.be>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sébastien Barré <sebastien.barre@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd3619f2

04 Oct, 2014 3 commits

gue: Receive side for Generic UDP Encapsulation · 37dd0247

Authored by Tom Herbert on Oct 03, 2014

This patch adds support for receiving GUE packets in the fou module. The
fou module now supports direct foo-over-udp (no encapsulation header)
and GUE. To support this, a type parameter is added to the fou netlink
parameters.

For a GUE socket we define gue_udp_recv, gue_gro_receive, and
gue_gro_complete to handle the specifics of the GUE protocol. Most
of the code to manage and configure sockets is common with fou.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

37dd0247

qdisc: validate skb without holding lock · 55a93b3e

Authored by Eric Dumazet on Oct 03, 2014

Validation of skb can be pretty expensive :

GSO segmentation and/or checksum computations.

We can do this without holding qdisc lock, so that other cpus
can queue additional packets.

Trick is that requeued packets were already validated, so we carry
a boolean so that sch_direct_xmit() can validate a fresh skb list,
or directly use an old one.

Tested on 40Gb NIC (8 TX queues) and 200 concurrent flows, 48 threads
host.

Turning TSO on or off had no effect on throughput, only few more cpu
cycles. Lock contention on qdisc lock disappeared.

Same if disabling TX checksum offload.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

55a93b3e

qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE · 5772e9a3

Authored by Jesper Dangaard Brouer on Oct 01, 2014

Based on DaveM's recent API work on dev_hard_start_xmit(), that allows
sending/processing an entire skb list.

This patch implements qdisc bulk dequeue, by allowing multiple packets
to be dequeued in dequeue_skb().

The optimization principle for this is twofold: (1) to amortize
locking cost and (2) to avoid expensive tailptr updates for notifying HW.
 (1) Several packets are dequeued while holding the qdisc root_lock,
amortizing locking cost over several packets.  The dequeued SKB list is
processed under the TXQ lock in dev_hard_start_xmit(), thus also
amortizing the cost of the TXQ lock.
 (2) Furthermore, dev_hard_start_xmit() will utilize the skb->xmit_more
API to delay HW tailptr update, which also reduces the cost per
packet.

One restriction of the new API is that every SKB must belong to the
same TXQ.  This patch takes the easy way out, by restricting bulk
dequeue to qdiscs with the TCQ_F_ONETXQUEUE flag, which specifies that
the qdisc has only a single TXQ attached.

Some detail about the flow; dev_hard_start_xmit() will process the skb
list, and transmit packets individually towards the driver (see
xmit_one()).  In case the driver stops midway in the list, the
remaining skb list is returned by dev_hard_start_xmit().  In
sch_direct_xmit() this returned list is requeued by dev_requeue_skb().

To avoid overshooting the HW limits, which results in requeuing, the
patch limits the amount of bytes dequeued, based on the driver's BQL
limits.  In effect, bulking will only happen for BQL-enabled drivers.

A small amount of extra HoL blocking (2x MTU/0.24ms) was
measured at 100Mbit/s, with bulking 8 packets, but the
oscillating nature of the measurement indicates that something like
sched latency might be causing this effect. More comparisons
show that this oscillation goes away occasionally. Thus, we
disregard this artifact completely and remove any "magic" bulking
limit.

For now, as a conservative approach, stop bulking when seeing TSO and
segmented GSO packets.  They already benefit from bulking on their own.
A followup patch adds this, to allow easier bisectability for finding
regressions.

Joint work with Hannes, Daniel and Florian.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

5772e9a3
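
The byte-limited bulk dequeue described in the commit above can be sketched in userspace as follows. This is a model under stated assumptions, not the qdisc code: `struct pkt`, `struct pkt_queue` and `bulk_dequeue()` are hypothetical names, and `byte_limit` stands in for the budget that the commit derives from the driver's BQL limits.

```c
#include <stddef.h>

/* Hypothetical packet and FIFO types standing in for skbs and a qdisc. */
struct pkt {
	size_t len;
	struct pkt *next;
};

struct pkt_queue {
	struct pkt *head;
};

/* Remove and return the first packet, or NULL if the queue is empty. */
static struct pkt *queue_pop(struct pkt_queue *q)
{
	struct pkt *p = q->head;

	if (p) {
		q->head = p->next;
		p->next = NULL;
	}
	return p;
}

/*
 * Dequeue one packet unconditionally, then keep chaining further packets
 * while the byte budget is still positive. Returns the head of the list
 * and stores the number of chained packets in *count.
 */
static struct pkt *bulk_dequeue(struct pkt_queue *q, long byte_limit, int *count)
{
	struct pkt *head = queue_pop(q);
	struct pkt *tail = head;
	long budget;

	*count = 0;
	if (!head)
		return NULL;
	*count = 1;

	budget = byte_limit - (long)head->len;
	while (budget > 0) {
		struct pkt *next = queue_pop(q);

		if (!next)
			break;
		budget -= (long)next->len;
		tail->next = next; /* build the list handed to the driver */
		tail = next;
		(*count)++;
	}
	return head;
}
```

With a zero or negative limit the loop never runs and exactly one packet is returned, which mirrors the commit's point that bulking only takes effect when a byte limit is available.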

03 Oct, 2014 4 commits

ipvs: Clean up comment style in ip_vs.h · 07dcc686

Authored by Simon Horman on Sep 30, 2014

* Consistently use the multi-line comment style for networking code:

  /* This
   * That
   * The other thing
   */

* Use single-line comment style for comments with only one line of text.

* In general follow the leading '*' of each line of a comment with a
  single space and then text.

* Add missing line break between functions, remove double line break,
  align comments to previous lines whenever possible.
Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

07dcc686

netfilter: explicit module dependency between br_netfilter and physdev · 4b7fd5d9

Authored by Pablo Neira Ayuso on Oct 02, 2014

You can use physdev to match the physical interface enslaved to the
bridge device. This information is stored in skb->nf_bridge and it is
set up by br_netfilter. So, this is only available when iptables is
used from the bridge netfilter path.

Since 34666d46 ("netfilter: bridge: move br_netfilter out of the core"),
the br_netfilter code is modular. To reduce the impact of this change,
we can autoload br_netfilter if the physdev match is used, since
we assume that the users need br_netfilter in place.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

4b7fd5d9

netfilter: move nf_send_resetX() code to nf_reject_ipvX modules · c8d7b98b

Authored by Pablo Neira Ayuso on Sep 26, 2014

Move nf_send_reset() and nf_send_reset6() to nf_reject_ipv4 and
nf_reject_ipv6 respectively. This code is shared by x_tables and
nf_tables.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

c8d7b98b

netfilter: nft_reject: introduce icmp code abstraction for inet and bridge · 51b0a5d8

Authored by Pablo Neira Ayuso on Sep 26, 2014

This patch introduces the NFT_REJECT_ICMPX_UNREACH type, which provides
an abstraction over the ICMP and ICMPv6 codes that you can use from the
inet and bridge tables. They are:

* NFT_REJECT_ICMPX_NO_ROUTE: no route to host - network unreachable
* NFT_REJECT_ICMPX_PORT_UNREACH: port unreachable
* NFT_REJECT_ICMPX_HOST_UNREACH: host unreachable
* NFT_REJECT_ICMPX_ADMIN_PROHIBITED: administratively prohibited

You can still use the specific codes when restricting the rule to match
the corresponding layer 3 protocol.

I decided to not overload the existing NFT_REJECT_ICMP_UNREACH to have
different semantics depending on the table family and to allow the user
to specify ICMP family specific codes if they restrict it to the
corresponding family.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

51b0a5d8
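
One way to picture the ICMPX abstraction from the commit above is a pair of lookup tables, one per layer 3 family. The sketch below is illustrative only. The enum and function names are hypothetical. The numeric codes are the destination-unreachable codes defined for ICMP (RFC 792, RFC 1812) and ICMPv6 (RFC 4443). Which concrete code each abstract name selects is an assumption based on the descriptions in the list, not something taken from the patch.

```c
#include <stdint.h>

/* Hypothetical family-independent reject codes. */
enum icmpx_code {
	ICMPX_NO_ROUTE,
	ICMPX_PORT_UNREACH,
	ICMPX_HOST_UNREACH,
	ICMPX_ADMIN_PROHIBITED,
	ICMPX_CODE_COUNT
};

enum l3_family { L3_IPV4, L3_IPV6 };

/* ICMP destination unreachable (type 3) codes. */
static const uint8_t icmpx_to_icmpv4[ICMPX_CODE_COUNT] = {
	[ICMPX_NO_ROUTE]         = 0,  /* network unreachable */
	[ICMPX_PORT_UNREACH]     = 3,  /* port unreachable */
	[ICMPX_HOST_UNREACH]     = 1,  /* host unreachable */
	[ICMPX_ADMIN_PROHIBITED] = 13, /* administratively prohibited */
};

/* ICMPv6 destination unreachable (type 1) codes. */
static const uint8_t icmpx_to_icmpv6[ICMPX_CODE_COUNT] = {
	[ICMPX_NO_ROUTE]         = 0, /* no route to destination */
	[ICMPX_PORT_UNREACH]     = 4, /* port unreachable */
	[ICMPX_HOST_UNREACH]     = 3, /* address unreachable */
	[ICMPX_ADMIN_PROHIBITED] = 1, /* administratively prohibited */
};

/*
 * Resolve an abstract code to the family-specific code of the packet
 * being rejected. Returns -1 for an unknown abstract code.
 */
static int icmpx_resolve(enum l3_family family, int code)
{
	if (code < 0 || code >= ICMPX_CODE_COUNT)
		return -1;
	return family == L3_IPV6 ? icmpx_to_icmpv6[code]
				 : icmpx_to_icmpv4[code];
}
```

A rule written once against the abstract code can then reject both IPv4 and IPv6 traffic, with the family of each packet deciding which table is consulted.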

02 Oct, 2014 2 commits

net_sched: avoid calling tcf_unbind_filter() in call_rcu callback · a0efb80c

Authored by WANG Cong on Sep 30, 2014

This fixes the following crash:

[   63.976822] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[   63.980094] CPU: 1 PID: 15 Comm: ksoftirqd/1 Not tainted 3.17.0-rc6+ #648
[   63.980094] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   63.980094] task: ffff880117dea690 ti: ffff880117dfc000 task.ti: ffff880117dfc000
[   63.980094] RIP: 0010:[<ffffffff817e6d07>]  [<ffffffff817e6d07>] u32_destroy_key+0x27/0x6d
[   63.980094] RSP: 0018:ffff880117dffcc0  EFLAGS: 00010202
[   63.980094] RAX: ffff880117dea690 RBX: ffff8800d02e0820 RCX: 0000000000000000
[   63.980094] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 6b6b6b6b6b6b6b6b
[   63.980094] RBP: ffff880117dffcd0 R08: 0000000000000000 R09: 0000000000000000
[   63.980094] R10: 00006c0900006ba8 R11: 00006ba100006b9d R12: 0000000000000001
[   63.980094] R13: ffff8800d02e0898 R14: ffffffff817e6d4d R15: ffff880117387a30
[   63.980094] FS:  0000000000000000(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000
[   63.980094] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   63.980094] CR2: 00007f07e6732fed CR3: 000000011665b000 CR4: 00000000000006e0
[   63.980094] Stack:
[   63.980094]  ffff88011a9cd300 ffffffff82051ac0 ffff880117dffce0 ffffffff817e6d68
[   63.980094]  ffff880117dffd70 ffffffff810cb4c7 ffffffff810cb3cd ffff880117dfffd8
[   63.980094]  ffff880117dea690 ffff880117dea690 ffff880117dfffd8 000000000000000a
[   63.980094] Call Trace:
[   63.980094]  [<ffffffff817e6d68>] u32_delete_key_freepf_rcu+0x1b/0x1d
[   63.980094]  [<ffffffff810cb4c7>] rcu_process_callbacks+0x3bb/0x691
[   63.980094]  [<ffffffff810cb3cd>] ? rcu_process_callbacks+0x2c1/0x691
[   63.980094]  [<ffffffff817e6d4d>] ? u32_destroy_key+0x6d/0x6d
[   63.980094]  [<ffffffff810780a4>] __do_softirq+0x142/0x323
[   63.980094]  [<ffffffff810782a8>] run_ksoftirqd+0x23/0x53
[   63.980094]  [<ffffffff81092126>] smpboot_thread_fn+0x203/0x221
[   63.980094]  [<ffffffff81091f23>] ? smpboot_unpark_thread+0x33/0x33
[   63.980094]  [<ffffffff8108e44d>] kthread+0xc9/0xd1
[   63.980094]  [<ffffffff819e00ea>] ? do_wait_for_common+0xf8/0x125
[   63.980094]  [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61
[   63.980094]  [<ffffffff819e43ec>] ret_from_fork+0x7c/0xb0
[   63.980094]  [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61

tp could be freed in call_rcu callback too, the order is not guaranteed.

John Fastabend says:

====================
It's worth noting why this is safe. Any running schedulers will either
read the valid class field or it will be zeroed.

All schedulers today when the class is 0 do a lookup using the
same call used by the tcf_exts_bind(). So even if we have a running
classifier hit the null class pointer it will do a lookup and get
to the same result. This is particularly fragile at the moment because
the only way to verify this is to audit the schedulers' call sites.
====================

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a0efb80c

udp: Generalize skb_udp_segment · 8bce6d7d

Authored by Tom Herbert on Sep 29, 2014

skb_udp_segment is the function called from udp4_ufo_fragment to
segment a UDP tunnel packet. This function currently assumes
segmentation is transparent Ethernet bridging (i.e. VXLAN
encapsulation). This patch generalizes the function to
operate on either Ethertype or IP protocol.

The inner_protocol field must be set to the protocol of the inner
header. This can now be either an Ethertype or an IP protocol
(in a union). A new flag in the skbuff indicates which type is
effective. skb_set_inner_protocol and skb_set_inner_ipproto
helper functions were added to set the inner_protocol. These
functions are called from the point where the tunnel encapsulation
is occurring.

When skb_udp_tunnel_segment is called, the function to segment the
inner packet is selected based on the inner IP or Ethertype. In the
case of an IP protocol encapsulation, the function is derived from
inet[6]_offloads. In the case of Ethertype, skb->protocol is
set to the inner_protocol and skb_mac_gso_segment is called. (GRE
currently does this, but it might be possible to look up the protocol
in offload_base and call the appropriate segmentation function
directly).
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8bce6d7d
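
The inner-protocol bookkeeping described in the commit above (a union holding either an Ethertype or an IP protocol, plus a flag saying which one is valid) can be modelled in a few lines. This is a userspace sketch, not the sk_buff layout: the struct, the setters and `select_segmenter()` are hypothetical names that only mirror the description, and byte order is ignored for simplicity.

```c
#include <stdint.h>

enum inner_type { INNER_TYPE_ETHER, INNER_TYPE_IPPROTO };

/* Hypothetical stand-in for the relevant sk_buff fields. */
struct encap_meta {
	union {
		uint16_t inner_protocol; /* Ethertype of the inner frame */
		uint8_t inner_ipproto;   /* IP protocol of the inner header */
	} u;
	enum inner_type type;            /* which union member is valid */
};

/* Called at encapsulation time when the payload is an Ethernet frame. */
static void set_inner_protocol(struct encap_meta *m, uint16_t ethertype)
{
	m->u.inner_protocol = ethertype;
	m->type = INNER_TYPE_ETHER;
}

/* Called at encapsulation time when the payload is an IP-layer protocol. */
static void set_inner_ipproto(struct encap_meta *m, uint8_t ipproto)
{
	m->u.inner_ipproto = ipproto;
	m->type = INNER_TYPE_IPPROTO;
}

enum seg_path {
	SEG_VIA_MAC_GSO,      /* Ethertype: segment via the MAC GSO path */
	SEG_VIA_INET_OFFLOADS /* IP protocol: look up a per-protocol offload */
};

/* Pick the segmentation path from the recorded inner type. */
static enum seg_path select_segmenter(const struct encap_meta *m)
{
	return m->type == INNER_TYPE_IPPROTO ? SEG_VIA_INET_OFFLOADS
					     : SEG_VIA_MAC_GSO;
}
```

For example, a tunnel carrying Ethernet frames would record Ethertype 0x6558 (transparent Ethernet bridging), while an IP-in-IP style encapsulation would record IP protocol 4.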

01 Oct, 2014 2 commits

tcp: Change tcp_slow_start function to return void · a12a601e

Authored by Li RongQing on Sep 30, 2014

No caller uses the return value, so make this function return void.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a12a601e

ipv6: remove rt6i_genid · 705f1c86

Authored by Hannes Frederic Sowa on Sep 28, 2014

Eric Dumazet noticed that all no-nonexthop or no-gateway routes which
are already marked DST_HOST (e.g. input routes) will always be
invalidated during sk_dst_check. Thus per-socket dst caching
had no effect and early demuxing had no effect.

Thus this patch removes rt6i_genid: fn_sernum already gets modified during
add operations, so we only must ensure we mutate fn_sernum during ipv6
address remove operations. This is a fairly costly operation,
but address removal should not happen that often. Also our mtu update
functions do the same and we have heard no complaints so far. xfrm policy
changes also cause a call into fib6_flush_trees. Also plug a hole in
rt6_info (no cacheline changes).

I verified via tracing that this change has effect.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

705f1c86

30 Sep, 2014 5 commits

net: sched: enable per cpu qstats · b0ab6f92

Authored by John Fastabend on Sep 28, 2014

After the previous patches to simplify qstats, the qstats can be
made per cpu with a packed union in the Qdisc struct.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b0ab6f92

net: sched: restrict use of qstats qlen · 64015853

Authored by John Fastabend on Sep 28, 2014

This removes the use of qstats->qlen variable from the classifiers
and makes it an explicit argument to gnet_stats_copy_queue().

The qlen represents the qdisc queue length and is packed into
the qstats at the last moment before passing to user space. By
handling it explicitly we avoid, in the percpu stats case, having
to figure out which per_cpu variable to put it in.

It would probably be best to remove it from qstats completely
but qstats is a user space ABI and can't be broken. A future
patch could make an internal only qstats structure that would
avoid having to allocate an additional u32 variable on the
Qdisc struct. This would make the qstats struct 128bits instead
of 128+32.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

64015853

net: sched: implement qstat helper routines · 25331d6c

Authored by John Fastabend on Sep 28, 2014

This adds helpers to manipulate qstats logic and replaces locations
that touch the counters directly. This simplifies future patches
to push qstats onto per cpu counters.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25331d6c

net: sched: make bstats per cpu and estimator RCU safe · 22e0f8b9

Authored by John Fastabend on Sep 28, 2014

In order to run qdiscs without locking, statistics and estimators
need to be handled correctly.

To resolve bstats, make the statistics per cpu. Because this is
only needed for qdiscs that run without locks, which will not be
the case for most qdiscs in the near future, only create percpu
stats when qdiscs set the TCQ_F_CPUSTATS flag.

Next, because estimators use the bstats to calculate packets per
second and bytes per second, the estimator code paths are updated
to use the per cpu statistics.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22e0f8b9
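
The opt-in per-cpu statistics described in the commit above can be illustrated with a small single-threaded model. Everything here is a sketch with hypothetical names (`struct demo_qdisc`, `demo_bstats_update()`, `demo_bstats_read()`), the fixed CPU count is an assumption, and the `cpu_stats` flag plays the role that TCQ_F_CPUSTATS plays in the commit message. Real per-cpu data relies on each CPU touching only its own slot, which a userspace array does not enforce.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4 /* assumed fixed CPU count for the model */

struct demo_bstats {
	uint64_t bytes;
	uint32_t packets;
};

/* Hypothetical qdisc with either one shared counter or one per CPU. */
struct demo_qdisc {
	bool cpu_stats;                              /* opt-in flag */
	struct demo_bstats bstats;                   /* used when !cpu_stats */
	struct demo_bstats cpu_bstats[DEMO_NR_CPUS]; /* used when cpu_stats */
};

/* Account one packet on the given CPU. */
static void demo_bstats_update(struct demo_qdisc *q, unsigned int cpu,
			       uint32_t len)
{
	struct demo_bstats *b = q->cpu_stats ?
		&q->cpu_bstats[cpu % DEMO_NR_CPUS] : &q->bstats;

	b->bytes += len;
	b->packets++;
}

/* Readers such as a rate estimator fold all slots into one total. */
static struct demo_bstats demo_bstats_read(const struct demo_qdisc *q)
{
	struct demo_bstats sum = q->bstats;

	if (q->cpu_stats) {
		for (unsigned int i = 0; i < DEMO_NR_CPUS; i++) {
			sum.bytes += q->cpu_bstats[i].bytes;
			sum.packets += q->cpu_bstats[i].packets;
		}
	}
	return sum;
}
```

The update path stays a plain increment in both modes. Only the read side pays for the summation, which suits counters that are written per packet and read rarely.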

tcp: move TCP_ECN_create_request out of header · d82bd122

Authored by Florian Westphal on Sep 29, 2014

After Octavian Purdila's tcp ipv4/ipv6 unification work this helper only
has a single callsite.

While at it, convert the name to lowercase, as suggested by Stephen.
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

d82bd122

29 Sep, 2014 2 commits

netfilter: nf_tables: store and dump set policy · 9363dc4b

Authored by Arturo Borrero on Sep 23, 2014

We want to know in which cases the user explicitly sets the policy
options. In that case, we also want to dump back the info.
Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

9363dc4b

net: tcp: more detailed ACK events and events for CE marked packets · 9890092e

Authored by Florian Westphal on Sep 26, 2014

DataCenter TCP (DCTCP) determines cwnd growth based on ECN information
and ACK properties, e.g. ACK that updates window is treated differently
than DUPACK.

Also DCTCP needs information whether ACK was delayed ACK. Furthermore,
DCTCP also implements a CE state machine that keeps track of CE markings
of incoming packets.

Therefore, extend the congestion control framework to provide these
event types, so that DCTCP can be properly implemented as a normal
congestion algorithm module outside of the core stack.

Joint work with Daniel Borkmann and Glenn Judd.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9890092e