Commits · 45f50bed1d808794e514e9eed0e579a8756ce2ba · openanolis / cloud-kernel

11 Jun, 2016 15 commits

net_sched: remove generic throttled management · 45f50bed

Authored by Eric Dumazet on Jun 10, 2016

__QDISC_STATE_THROTTLED bit manipulation is rather expensive
for HTB and a few others.

I already removed it for sch_fq in commit f2600cf0
("net: sched: avoid costly atomic operation in fq_dequeue()")
and so far nobody complained.

When one or more packets are stuck in one or more throttled
HTB classes, an HTB dequeue() performs two atomic operations
to clear/set the __QDISC_STATE_THROTTLED bit while the root qdisc
lock is held.

Removing this pair of atomic operations brings me an 8% performance
increase on 200 TCP_RR tests, in the presence of throttled classes.

This patch has no side effect, since nothing actually uses
qdisc_is_throttled() anymore.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

45f50bed

net_sched: netem: remove qdisc_is_throttled() use · 42117927

Authored by Eric Dumazet on Jun 10, 2016

Looks like it is only there as some optimization attempt.

Since __QDISC_STATE_THROTTLED set/unset is way too expensive,
and netem is the last user, just remove this check.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

42117927

net_sched: cbq: remove a flaky use of qdisc_is_throttled() · cca605dd

Authored by Eric Dumazet on Jun 10, 2016

So far no qdisc ever unset the throttled bit at enqueue() time,
so CBQ usage of qdisc_is_throttled() was flaky.

Since __QDISC_STATE_THROTTLED set/unset is way too expensive
considering that only CBQ was eventually caring for this status,
it would make sense to implement a Qdisc ops ->is_throttled()
if we find that this is needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cca605dd

net_sched: sch_plug: use a private throttled status · 8fe6a79f

Authored by Eric Dumazet on Jun 10, 2016

We want to get rid of generic qdisc throttled management,
so this qdisc has to use a private flag.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8fe6a79f
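
A minimal sketch of what a "private throttled status" could look like, assuming a hypothetical `plug_state_sketch` struct and `plug_dequeue_sketch()` helper (neither name comes from the patch). The throttled state is a plain boolean in the qdisc's own private data, so it does not need an atomic bit operation on generic qdisc state.

```c
#include <stdbool.h>

/* Hypothetical private state; the real sch_plug private data has more fields. */
struct plug_state_sketch {
	bool throttled;     /* private flag, assumed to be protected by the qdisc lock */
	unsigned int qlen;  /* number of packets currently queued */
};

/* Returns 1 if a packet was released, 0 if the queue is plugged or empty. */
static int plug_dequeue_sketch(struct plug_state_sketch *q)
{
	if (q->throttled || q->qlen == 0)
		return 0;
	q->qlen--;
	return 1;
}
```

While `throttled` is set, the dequeue path releases nothing and leaves the queue length untouched.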

sctp: sctp should change socket state when shutdown is received · d46e416c

Authored by Xin Long on Jun 09, 2016

Now sctp doesn't change socket state upon shutdown reception. It changes
just the assoc state, even though it's a TCP-style socket.

In some cases we really need to check sk->sk_state, so this issue has to
be fixed; at the very least, dumps from ss or netstat will then show
more exact information.

As an improvement, we will change sk->sk_state when we change asoc->state
to SHUTDOWN_RECEIVED, and also do it in sctp_shutdown to stay consistent
with sctp_close.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d46e416c

tcp: add NV congestion control · 699fafaf

Authored by Lawrence Brakmo on Jun 08, 2016

TCP-NV (New Vegas) is a major update to TCP-Vegas.
An earlier version of NV was presented at 2010's LPC.
It is a delay-based congestion avoidance for the
data center. This version has been tested within a
10G rack where the HW RTTs are 20-50us and with
1 to 400 flows.

A description of TCP-NV, including implementation
details as well as experimental results, can be found at:
http://www.brakmo.org/networking/tcp-nv/TCPNV.html
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

699fafaf

tcp: add in_flight to tcp_skb_cb · 6f094b9e

Authored by Lawrence Brakmo on Jun 08, 2016

Add in_flight (bytes in flight when packet was sent) field
to tx component of tcp_skb_cb and make it available to
congestion modules' pkts_acked() function through the
ack_sample function argument.
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f094b9e

packet: use common code for virtio_net_hdr and skb GSO conversion · 1276f24e

Authored by Mike Rapoport on Jun 08, 2016

Replace the open-coded conversion between virtio_net_hdr and skb GSO info
with virtio_net_hdr_from_skb().
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1276f24e

RDS: IB: Remove deprecated create_workqueue · 231edca9

Authored by Bhaktipriya Shridhar on Jun 08, 2016

alloc_workqueue replaces deprecated create_workqueue().

Since the driver is InfiniBand, which can be used as a block device, and
the workqueue seems involved in regular operation of the device, a
dedicated workqueue has been used with WQ_MEM_RECLAIM set to guarantee
forward progress under memory pressure.
Since there are only a fixed number of work items, an explicit concurrency
limit is unnecessary here.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

231edca9

rxrpc: Limit the listening backlog · 0e119b41

Authored by David Howells on Jun 10, 2016

Limit the socket incoming call backlog queue size so that a remote client
can't pump in sufficient new calls that the server runs out of memory.  Note
that this is partially theoretical at the moment since whilst the number of
calls is limited, the number of packets trying to set up new calls is not.
This will be addressed in a later patch.

If the caller of listen() specifies a backlog of INT_MAX, then they get the
current maximum; anything else greater than max_backlog or anything
negative incurs EINVAL.

The limit on the maximum queue size can be set by:

	echo N >/proc/sys/net/rxrpc/max_backlog

where 4<=N<=32.

Further, set the default backlog to 0, requiring listen() to be called
before we start actually queueing new calls.  Whilst this kind of is a
change in the UAPI, the caller can't actually *accept* new calls anyway
unless they've first called listen() to put the socket into the LISTENING
state - thus the aforementioned new calls would otherwise just sit there,
eating up kernel memory.  (Note that sockets that don't have a non-zero
service ID bound don't get incoming calls anyway.)

Given that the default backlog is now 0, make the AFS filesystem call
kernel_listen() to set the maximum backlog for itself.

Possible improvements include:

 (1) Trimming a too-large backlog to max_backlog when listen is called.

 (2) Trimming the backlog value whenever the value is used so that changes
     to max_backlog are applied to an open socket automatically.  Note that
     the AFS filesystem opens one socket and keeps it open for extended
     periods, so would miss out on changes to max_backlog.

 (3) Having a separate setting for the AFS filesystem.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0e119b41
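
The listen() backlog rules described in the rxrpc commit above can be sketched roughly as follows. This is an illustration only: `resolve_backlog()` is a hypothetical helper, and the `max_backlog` variable with its value of 32 merely stands in for the sysctl (the message only states that 4<=N<=32).

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for /proc/sys/net/rxrpc/max_backlog (4 <= N <= 32). */
static unsigned int max_backlog = 32;

/*
 * Returns the backlog to apply, or -EINVAL.
 * INT_MAX is the special value meaning "use the current maximum".
 */
static int resolve_backlog(int requested)
{
	if (requested == INT_MAX)
		return (int)max_backlog;
	if (requested < 0 || (unsigned int)requested > max_backlog)
		return -EINVAL;
	return requested;
}
```

Note that, as the "possible improvements" list in the message points out, a too-large value is rejected rather than trimmed to `max_backlog`.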

rxrpc: Trim line-terminal whitespace · bc6e1ea3

Authored by David Howells on Jun 10, 2016

Trim line-terminal whitespace in net/rxrpc/
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bc6e1ea3

net, cls: allow for deleting all filters for given parent · ea7f8277

Authored by Daniel Borkmann on Jun 10, 2016

Add a possibility where the user can just specify the parent and
all filters under that parent are then being purged. Currently,
for example for scripting, one needs to specify pref/prio to have
a well-defined number for 'tc filter del' command for addressing
the previously created instance or additionally filter handle in
case of priorities being the same. Improve usage by allowing the
option for tc to specify the parent and removing the whole chain
for that given parent.

Example usage after patch, no tc changes required:

  # tc qdisc replace dev foo clsact
  # tc filter add dev foo egress bpf da obj ./bpf.o
  # tc filter add dev foo egress bpf da obj ./bpf.o
  # tc filter show dev foo egress
  filter protocol all pref 49151 bpf
  filter protocol all pref 49151 bpf handle 0x1 bpf.o:[classifier] direct-action
  filter protocol all pref 49152 bpf
  filter protocol all pref 49152 bpf handle 0x1 bpf.o:[classifier] direct-action
  # tc filter del dev foo egress
  # tc filter show dev foo egress
  #

Previously, RTM_DELTFILTER requests with invalid prio of 0 were
rejected, so only netlink requests with RTM_NEWTFILTER and NLM_F_CREATE
flag were allowed where the kernel would auto-generate a pref/prio.
We can piggyback on that and use prio of 0 as a wildcard for
requests of RTM_DELTFILTER.

For notifying tc netlink monitoring users (e.g. libnl uses this
for caching), there are two options, that is, sending individual
tfilter_notify() notifications for each tcf_proto, or sending a
single one indicating wildcard removal. I tried both and there
are pros and cons for each, eventually I decided for sending
individual tfilter_notify(), so that user space can support this
seamlessly and there won't be a mess of changing each and every
application to make sure expectations from the kernel won't break
when they don't understand single notification. Since linear chains
don't really scale, I expect only a handful of classifiers to be
attached at max for a given parent anyway.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ea7f8277
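
The prio-0 handling described in the cls commit above can be summarized with a small decision function. This is only a sketch of the rules as stated in the message; the enum names and `classify_request()` are hypothetical, and any additional conditions the kernel may check are not modeled.

```c
#include <stdbool.h>

/* Hypothetical labels for the netlink request types named in the message. */
enum req_type { REQ_NEWTFILTER, REQ_DELTFILTER, REQ_GETTFILTER };

/* Hypothetical outcomes of looking at the requested prio. */
enum prio_action {
	ACT_USE_PRIO,      /* a concrete prio was given, address that instance */
	ACT_AUTOGEN_PRIO,  /* prio 0 on create: kernel picks a pref/prio */
	ACT_DELETE_ALL,    /* prio 0 on delete: wildcard, purge the chain */
	ACT_REJECT         /* prio 0 is invalid for this request */
};

static enum prio_action classify_request(enum req_type type, bool nlm_f_create,
					 unsigned int prio)
{
	if (prio != 0)
		return ACT_USE_PRIO;

	switch (type) {
	case REQ_NEWTFILTER:
		return nlm_f_create ? ACT_AUTOGEN_PRIO : ACT_REJECT;
	case REQ_DELTFILTER:
		return ACT_DELETE_ALL;  /* the behaviour added by this patch */
	default:
		return ACT_REJECT;
	}
}
```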

bpf: reject wrong sized filters earlier · f7bd9e36

Authored by Daniel Borkmann on Jun 10, 2016

Add a bpf_check_basics_ok() and reject filters that are of invalid
size much earlier, so we don't do any useless work such as invoking
bpf_prog_alloc(). Currently, rejection happens in bpf_check_classic()
only, but it's really unnecessarily late and they should be rejected
at the earliest point. While at it, also clean up one bpf_prog_size() to
make it consistent with the remaining invocations.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

f7bd9e36
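
A user-space sketch of the kind of early size check the bpf commit above describes. The struct and function names carry a `_sketch` suffix because they are local stand-ins, not the kernel definitions; the limit of 4096 instructions is the classic BPF `BPF_MAXINSNS` value.

```c
#include <stdbool.h>
#include <stddef.h>

#define BPF_MAXINSNS 4096 /* classic BPF instruction limit */

/* Local stand-in for the classic BPF instruction layout. */
struct sock_filter_sketch {
	unsigned short code;
	unsigned char jt;
	unsigned char jf;
	unsigned int k;
};

/* Cheap sanity check meant to run before any allocation is attempted. */
static bool check_basics_ok_sketch(const struct sock_filter_sketch *filter,
				   unsigned int flen)
{
	if (filter == NULL)
		return false;
	if (flen == 0 || flen > BPF_MAXINSNS)
		return false;
	return true;
}
```

Running such a check first means an empty or oversized filter is refused before memory for the program is allocated.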

bpf: enforce recursion limit on redirects · a70b506e

Authored by Daniel Borkmann on Jun 10, 2016

Respect the stack's xmit_recursion limit for calls into dev_queue_xmit().
Currently, they are not handled by the limiter when attached to clsact's
egress parent, for example, and a buggy program redirecting it to the
same device again could run into stack overflow eventually. It would be
good if we could notify an admin to give him a chance to react. We reuse
xmit_recursion instead of having one private to eBPF, so that the stack's
current recursion depth will be taken into account as well. Follow-up to
commit 3896d655 ("bpf: introduce bpf_clone_redirect() helper") and
27b29f63 ("bpf: add bpf_redirect() helper").
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a70b506e
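
The recursion guard idea from the bpf redirect commit above can be illustrated in user space as below. Everything here is an assumption made for the sketch: the limit of 10, the `-ENETDOWN` error code, and the `redirect_xmit_sketch()` function, which simulates a program that keeps redirecting a packet to the same device.

```c
#include <errno.h>

#define RECURSION_LIMIT_SKETCH 10 /* assumed limit, for illustration only */

/* Shared depth counter; per-CPU in the kernel, a plain global here. */
static int xmit_depth;

/*
 * Simulates a transmit path that is re-entered `hops` more times by a
 * redirecting program. Bails out once the shared depth counter exceeds
 * the limit instead of recursing until the stack overflows.
 */
static int redirect_xmit_sketch(int hops)
{
	int ret = 0;

	if (xmit_depth > RECURSION_LIMIT_SKETCH)
		return -ENETDOWN;

	xmit_depth++;
	if (hops > 0)
		ret = redirect_xmit_sketch(hops - 1);
	xmit_depth--;

	return ret;
}
```

Because the counter is shared rather than private to the redirect path, recursion already consumed elsewhere in the stack counts against the same budget.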

openvswitch: Add packet truncation support. · f2a4d086

Authored by William Tu on Jun 10, 2016

The patch adds a new OVS action, OVS_ACTION_ATTR_TRUNC, in order to
truncate packets. A 'max_len' is added for setting up the maximum
packet size, and a 'cutlen' field records the number of bytes
to trim from the packet when it is output to a port or
sent to userspace.
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

f2a4d086
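
The relationship between 'max_len' and 'cutlen' in the openvswitch commit above can be sketched with two small helpers. The function names are hypothetical; only the two quantities come from the message.

```c
/* Number of bytes to trim so that the packet is at most max_len long. */
static unsigned int trunc_cutlen_sketch(unsigned int pkt_len, unsigned int max_len)
{
	return pkt_len > max_len ? pkt_len - max_len : 0;
}

/* Length actually emitted once the recorded cutlen is applied at output time. */
static unsigned int output_len_sketch(unsigned int pkt_len, unsigned int cutlen)
{
	return pkt_len - cutlen;
}
```

For a 1500-byte packet and a max_len of 100, cutlen is 1400 and 100 bytes are output; a packet already shorter than max_len is left untouched.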

10 Jun, 2016 7 commits

packet: compat support for sock_fprog · 719c44d3

Authored by Willem de Bruijn on Jun 07, 2016

Socket option PACKET_FANOUT_DATA takes a struct sock_fprog as argument
if PACKET_FANOUT has mode PACKET_FANOUT_CBPF. This structure contains
a pointer into user memory. If userland is 32-bit and kernel is 64-bit
the two disagree about the layout of struct sock_fprog.

Add compat setsockopt support to convert a 32-bit compat_sock_fprog to
a 64-bit sock_fprog. This is analogous to compat_sock_fprog support for
SO_REUSEPORT added in commit 19575988 ("soreuseport: add compat
case for setsockopt SO_ATTACH_REUSEPORT_CBPF").
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

719c44d3
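
To make the layout disagreement in the sock_fprog commit above concrete, here is a user-space sketch of the conversion. The `_sketch` structs are local stand-ins written with fixed-width types, not the kernel definitions, and `fprog_from_compat_sketch()` is a hypothetical helper.

```c
#include <stdint.h>

/* Layout as seen by a 32-bit process: the filter pointer is 32 bits wide. */
struct compat_sock_fprog_sketch {
	uint16_t len;     /* number of filter instructions */
	uint32_t filter;  /* 32-bit user-space address */
};

/* Native layout: the pointer has the kernel's full pointer width. */
struct sock_fprog_sketch {
	uint16_t len;
	void *filter;
};

/* Widen the 32-bit user address into a native pointer, field by field. */
static struct sock_fprog_sketch
fprog_from_compat_sketch(const struct compat_sock_fprog_sketch *c)
{
	struct sock_fprog_sketch n;

	n.len = c->len;
	n.filter = (void *)(uintptr_t)c->filter;
	return n;
}
```

Copying the compat structure byte for byte into the native one would misplace the pointer, which is why an explicit per-field conversion is needed.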

net: vrf: Fix crash when IPv6 is disabled at boot time · e4348637

Authored by David Ahern on Jun 09, 2016

Frank Kellermann reported a kernel crash with 4.5.0 when IPv6 is
disabled at boot using the kernel option ipv6.disable=1. Using
current net-next with the boot option:

$ ip link add red type vrf table 1001

Generates:
[12210.919584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000748
[12210.921341] IP: [<ffffffff814b30e3>] fib6_get_table+0x2c/0x5a
[12210.922537] PGD b79e3067 PUD bb32b067 PMD 0
[12210.923479] Oops: 0000 [#1] SMP
[12210.924001] Modules linked in: ipvlan 8021q garp mrp stp llc
[12210.925130] CPU: 3 PID: 1177 Comm: ip Not tainted 4.7.0-rc1+ #235
[12210.926168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[12210.928065] task: ffff8800b9ac4640 ti: ffff8800bacac000 task.ti: ffff8800bacac000
[12210.929328] RIP: 0010:[<ffffffff814b30e3>]  [<ffffffff814b30e3>] fib6_get_table+0x2c/0x5a
[12210.930697] RSP: 0018:ffff8800bacaf888  EFLAGS: 00010202
[12210.931563] RAX: 0000000000000748 RBX: ffffffff81a9e280 RCX: ffff8800b9ac4e28
[12210.932688] RDX: 00000000000000e9 RSI: 0000000000000002 RDI: 0000000000000286
[12210.933820] RBP: ffff8800bacaf898 R08: ffff8800b9ac4df0 R09: 000000000052001b
[12210.934941] R10: 00000000657c0000 R11: 000000000000c649 R12: 00000000000003e9
[12210.936032] R13: 00000000000003e9 R14: ffff8800bace7800 R15: ffff8800bb3ec000
[12210.937103] FS:  00007faa1766c700(0000) GS:ffff88013ac00000(0000) knlGS:0000000000000000
[12210.938321] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12210.939166] CR2: 0000000000000748 CR3: 00000000b79d6000 CR4: 00000000000406e0
[12210.940278] Stack:
[12210.940603]  ffff8800bb3ec000 ffffffff81a9e280 ffff8800bacaf8c8 ffffffff814b3135
[12210.941818]  ffff8800bb3ec000 ffffffff81a9e280 ffffffff81a9e280 ffff8800bace7800
[12210.943040]  ffff8800bacaf8f0 ffffffff81397c88 ffff8800bb3ec000 ffffffff81a9e280
[12210.944288] Call Trace:
[12210.944688]  [<ffffffff814b3135>] fib6_new_table+0x24/0x8a
[12210.945516]  [<ffffffff81397c88>] vrf_dev_init+0xd4/0x162
[12210.946328]  [<ffffffff814091e1>] register_netdevice+0x100/0x396
[12210.947209]  [<ffffffff8139823d>] vrf_newlink+0x40/0xb3
[12210.948001]  [<ffffffff814187f0>] rtnl_newlink+0x5d3/0x6d5
...

The problem above is due to the fact that the fib hash table is not
allocated when IPv6 is disabled at boot.

As for the VRF driver it should not do any IPv6 initializations if IPv6
is disabled, so it needs to know if IPv6 is disabled at boot. The disable
parameter is private to the IPv6 module, so provide an accessor for
modules to determine if IPv6 was disabled at boot time.

Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e4348637
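
The accessor pattern described in the vrf commit above, sketched in user space. All names are hypothetical: `ipv6_disabled_at_boot` stands in for the module-private disable parameter, and `vrf_init_sketch()` only models the decision to skip IPv6 setup.

```c
#include <stdbool.h>

/* Stand-in for the module-private parameter; 1 models ipv6.disable=1. */
static int ipv6_disabled_at_boot = 1;

/* Accessor that lets other modules ask without touching the private flag. */
static bool ipv6_enabled_sketch(void)
{
	return ipv6_disabled_at_boot == 0;
}

/* Device init that only performs IPv6 setup when IPv6 is available. */
static int vrf_init_sketch(bool *did_ipv6_init)
{
	*did_ipv6_init = false;
	if (ipv6_enabled_sketch())
		*did_ipv6_init = true; /* the IPv6 table setup would happen here */
	return 0;
}
```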

rxrpc: Simplify connect() implementation and simplify sendmsg() op · 2341e077

Authored by David Howells on Jun 09, 2016

Simplify the RxRPC connect() implementation. It will just note the
destination address it is given, and if a sendmsg() comes along with no
address, this will be assigned as the address. No transport struct will be
held internally, which will allow us to remove this later.

Simplify sendmsg() also. Whilst a call is active, userspace refers to it
by a private unique user ID specified in a control message. When sendmsg()
sees a user ID that doesn't map to an extant call, it creates a new call
for that user ID and attempts to add it. If, when we try to add it, the
user ID is now registered, we now reject the message with -EEXIST. We
should never see this situation unless two threads are racing, trying to
create a call with the same ID - which would be an error.

It also isn't required to provide sendmsg() with an address - provided the
control message data holds a user ID that maps to a currently active call.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2341e077

net/netlink/af_netlink.h: Remove unused structure. · 21aff3b9

Authored by Fabien Siron on Jun 07, 2016

Signed-off-by: Fabien Siron <fabien.siron@epita.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

21aff3b9

net: add netdev_lockdep_set_classes() helper · d3fff6c4

Authored by Eric Dumazet on Jun 09, 2016

It is time to add netdev_lockdep_set_classes() helper
so that lockdep annotations per device type are easier to manage.

This removes a lot of copies and missing annotations.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d3fff6c4

net: sched: fix qdisc->running lockdep annotations · 52fbb290

Authored by Eric Dumazet on Jun 09, 2016

1) qdisc_run_begin() is really using the equivalent of a trylock.
  Instead of using write_seqcount_begin(), use a combination of
  raw_write_seqcount_begin() and correct lockdep annotation.

2) sch_direct_xmit() should use regular spin_lock(root_lock)

Fixes: f9eb8aea ("net_sched: transform qdisc running bit into a seqcount")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

52fbb290

sit: remove unnecessary protocol check in ipip6_tunnel_xmit() · adba931f

Authored by Simon Horman on Jun 09, 2016

ipip6_tunnel_xmit() is called immediately after checking that
skb->protocol is htons(ETH_P_IPV6) so there is no need
to check it a second time.

Found by inspection.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

adba931f

09 Jun, 2016 18 commits

mac80211: implement codel on fair queuing flows · 5caa328e

Authored by Michal Kazior on May 19, 2016

There is no limit other than a global
packet count limit when using software queuing.
This means a single flow queue can grow insanely
long. This is particularly bad for TCP congestion
algorithms, which require a slightly more
sophisticated frame dropping scheme than a mere
head drop on limit overflow.

Hence apply (a slightly modified, to fit the knobs)
CoDel5 on flow queues. This improves TCP
convergence and stability when combined with a
wireless driver which keeps its own tx queue/fifo
at a minimum fill level for given link conditions.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

5caa328e

mac80211: add debug knobs for fair queuing · 9399b86c

Authored by Michal Kazior on May 19, 2016

This adds a debugfs entry to read and modify some fq parameters.

This makes it easy to debug, test and experiment.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
[remove module parameter for now]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

9399b86c

mac80211: implement fair queueing per txq · fa962b92

Authored by Michal Kazior on May 19, 2016

mac80211's software queues were designed to work
very closely with device tx queues. They are
required to make use of 802.11 packet aggregation
easily and efficiently.

Due to the way 802.11 aggregation is designed it
only makes sense to keep fair queuing as close to
hardware as possible to reduce induced latency and
inertia and provide the best flow responsiveness.

This change doesn't translate directly to
immediate and significant gains. End result
depends on driver's induced latency. Best results
can be achieved if driver keeps its own tx
queue/fifo fill level to a minimum.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

fa962b92

mac80211: skip netdev queue control with software queuing · 80a83cfc

Authored by Michal Kazior on May 19, 2016

Qdiscs are designed with no regard to 802.11
aggregation requirements and hand out
packet-by-packet with no guarantee they are
destined to the same tid. This does more bad than
good no matter how fairly a given qdisc may behave
on an ethernet interface.

Software queuing used per-AC netdev subqueue
congestion control whenever a global AC limit was
hit. This meant in practice a single station or
tid queue could starve others rather easily. This
could resonate with qdiscs in a bad way or could
just end up with poor aggregation performance.
Increasing the AC limit would increase induced
latency which is also bad.

Disabling qdiscs by default and performing
taildrop instead of netdev subqueue congestion
control on the other hand makes it possible for
tid queues to fill up "in the meantime" while
preventing stations starving each other.

This increases aggregation opportunities and
should allow software queuing based drivers
achieve better performance by utilizing airtime
more efficiently with big aggregates.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

80a83cfc

nl80211: clarify nl80211_set_reg() success path · 06627990

Authored by Johannes Berg on Jun 09, 2016

Setting rd to NULL to avoid freeing it, just to be able to return
from the function in a single place, doesn't make much sense.

Return the set_regdom() return value directly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

06627990

nl80211: Fix checkpatch warnings about blank lines · 7a087e74

Authored by Kirtika Ruchandani on May 29, 2016

This patch fixes the following checkpatch.pl issues -
- Please don't use multiple blank lines
- Blank lines aren't necessary before a close brace
- Missing a blank line after declarations
Reviewed-by: Julian Calaby <julian.calaby@gmail.com>
Signed-off-by: Kirtika Ruchandani <kirtika.ruchandani@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

7a087e74

nl80211: Fix spelling · 56ab364f

Authored by Kirtika Ruchandani on May 29, 2016

Fix 'implementation' spelling, reported by checkpatch.pl
Signed-off-by: Kirtika Ruchandani <kirtika.ruchandani@gmail.com>
Reviewed-by: Julian Calaby <julian.calaby@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

56ab364f

wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel · 3d5fdff4

Authored by Prasun Maiti on Jun 06, 2016

iwpriv app uses iw_point structure to send data to Kernel. The iw_point
structure holds a pointer. For compatibility Kernel converts the pointer
as required for WEXT IOCTLs (SIOCIWFIRST to SIOCIWLAST). Some drivers
may use iw_handler_def.private_args to populate iwpriv commands instead
of iw_handler_def.private. In those cases, the IOCTLs from
SIOCIWFIRSTPRIV to SIOCIWLASTPRIV will follow the path ndo_do_ioctl().
Accordingly when the filled up iw_point structure comes from 32 bit
iwpriv to 64 bit Kernel, Kernel will not convert the pointer and sends
it to driver. So, the driver may get the invalid data.

The pointer conversion for the IOCTLs (SIOCIWFIRSTPRIV to
SIOCIWLASTPRIV), which follow the path ndo_do_ioctl(), is mandatory.
This patch adds pointer conversion from 32 bit to 64 bit and vice versa,
if the ioctl comes from 32 bit iwpriv to 64 bit Kernel.

Cc: stable@vger.kernel.org
Signed-off-by: Prasun Maiti <prasunmaiti87@gmail.com>
Signed-off-by: Ujjal Roy <royujjal@gmail.com>
Tested-by: Dibyajyoti Ghosh <dibyajyotig@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

3d5fdff4

cfg80211: remove get/set antenna and tx power warnings · 6cbf6236

Authored by Johannes Berg on Jun 09, 2016

Since set_tx_power and set_antenna are frequently implemented
without the matching get_tx_power/get_antenna, we shouldn't
have added warnings for those. Remove them.

The remaining ones are correct and need to be implemented
symmetrically for correct operation.

Cc: stable@vger.kernel.org
Fixes: de3bb771 ("cfg80211: add more warnings for inconsistent ops")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

6cbf6236

sched: remove qdisc->drop · a09ceb0e

Authored by Florian Westphal on Jun 09, 2016

After removal of TCA_CBQ_OVL_STRATEGY from the cbq scheduler, there are no
more callers of ->drop() outside of other ->drop functions, i.e.
nothing calls them.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

a09ceb0e

sched: remove qdisc_rehape_fail · c3a173d7

Authored by Florian Westphal on Jun 09, 2016

After the removal of TCA_CBQ_POLICE in cbq scheduler qdisc->reshape_fail
is always NULL, i.e. qdisc_rehape_fail is now the same as qdisc_drop.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3a173d7

cbq: remove TCA_CBQ_POLICE support · dd47c1fa

Authored by Florian Westphal on Jun 09, 2016

iproute2 doesn't implement any cbq option that results in this attribute
being sent to kernel.

To make use of it, user would have to

- patch iproute2
- add a class
- attach a qdisc to the class (default pfifo doesn't work as
  q->handle is 0 and cbq_set_police() is a no-op in this case)
- re-'add' the same class (tc class change ...) again
- user must also specify a defmap (e.g. 'split 1:0 defmap 3f'), since
  this 'police' feature relies on its presence
- the added qdisc must be one of bfifo, pfifo or netem

If all of these conditions are met and _some_ leaf qdiscs, namely
p/bfifo, netem, plug or tbf would drop a packet, kernel calls back into
cbq, which will attempt to re-queue the skb into a different class
as indicated by the parents' defmap entry for TC_PRIO_BESTEFFORT.

[ i.e. we behave as if tc_classify returned TC_ACT_RECLASSIFY ].

This feature, which isn't documented or implemented in iproute2,
and isn't implemented consistently (most qdiscs like sfq, codel, etc
drop right away instead of attempting this reclassification) is the
sole reason for the reshape_fail and __parent member in Qdisc struct.

So remove TCA_CBQ_POLICE support from the kernel, reject it via EOPNOTSUPP
so userspace knows we don't support it, and then remove no-longer needed
infrastructure in followup commit.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd47c1fa

cbq: remove TCA_CBQ_OVL_STRATEGY support · c3498d34

Authored by Florian Westphal on Jun 09, 2016

Since the initial revision of cbq in 2004, iproute2 has never implemented
support for TCA_CBQ_OVL_STRATEGY, which is what needs to be set to
activate the class->drop() call (the TC_CBQ_OVL_DROP strategy value must be
set by userspace).

David Miller says:
   It seems really safe to kill this thing off, flag an error if someone
   tries to set the attribute, and therefore kill off all of the
   non-default cbq_ovl_*() functions.

A followup commit can then remove all .drop qdisc methods since this
removed the only caller.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3498d34

qfq: don't leak skb if kzalloc fails · 9b15350f

Authored by Florian Westphal on Jun 08, 2016

When we need to create a new aggregate to enqueue the skb we call kzalloc.
If that fails we returned ENOBUFS without freeing the skb.

Spotted during code review.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

9b15350f
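
The leak fixed by the qfq commit above follows a common pattern: an enqueue path that takes ownership of a packet must release it on every error return. A user-space sketch, with hypothetical `pkt`/`agg` types, a `pkts_freed` counter added purely for illustration, and an `alloc_fails` flag simulating kzalloc() returning NULL:

```c
#include <errno.h>
#include <stdlib.h>

struct pkt { int len; };
struct agg { struct pkt *head; };

static int pkts_freed; /* instrumentation for this sketch only */

static void pkt_free(struct pkt *p)
{
	free(p);
	pkts_freed++;
}

/* Takes ownership of p: it is either queued on a new aggregate or freed. */
static int enqueue_sketch(struct pkt *p, int alloc_fails, struct agg **out)
{
	struct agg *a = alloc_fails ? NULL : calloc(1, sizeof(*a));

	if (a == NULL) {
		pkt_free(p); /* without this, the packet would leak */
		return -ENOBUFS;
	}
	a->head = p;
	*out = a;
	return 0;
}
```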

ip6gre: Allow live link address change · 76e48f9f

Authored by Shweta Choudaha on Jun 08, 2016

The ip6 GRE tap device should not be forced to down state to change
the MAC address and should allow live address change for the tap device,
similar to ipv4 gre.
Signed-off-by: Shweta Choudaha <schoudah@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

76e48f9f

ip6gre: Allow live link address change · 0a46baaf

Authored by Shweta Choudaha on Jun 08, 2016

The ip6 GRE tap device should not be forced to down state to change
the MAC address and should allow live address change for the tap device,
similar to ipv4 gre.
Signed-off-by: Shweta Choudaha <schoudah@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0a46baaf

net: cls_u32: be more strict about skip-sw flag for knodes · 201c44bd

Authored by Jakub Kicinski on Jun 08, 2016

Return an error if user requested skip-sw and the underlying
hardware cannot handle tc offloads (or offloads are disabled).
This patch fixes the knode handling.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

201c44bd

net: cls_u32: catch all hardware offload errors · 6eef3801

Authored by Jakub Kicinski on Jun 08, 2016

Errors reported by u32_replace_hw_hnode() were not propagated.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6eef3801

openanolis / cloud-kernel synced successfully more than 1 year ago
