提交 · 02c81ab95d8718d75886d16227a10cc7774493ea · openanolis / cloud-kernel

27 12月, 2014 1 次提交

netlink: rename netlink_unbind() to netlink_undo_bind() · 02c81ab9

由 Johannes Berg 提交于 12月 22, 2014

The new name is more expressive - this isn't a generic unbind
function but rather only a little undo helper for use only in
netlink_bind().
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

02c81ab9

19 12月, 2014 2 次提交

netlink: Don't reorder loads/stores before marking mmap netlink frame as available · a18e6a18

由 Thomas Graf 提交于 12月 18, 2014

Each mmap Netlink frame contains a status field which indicates
whether the frame is unused, reserved, contains data or needs to
be skipped. Both loads and stores may not be reordeded and must
complete before the status field is changed and another CPU might
pick up the frame for use. Use an smp_mb() to cover needs of both
types of callers to netlink_set_status(), callers which have been
reading data frame from the frame, and callers which have been
filling or releasing and thus writing to the frame.

- Example code path requiring a smp_rmb():
  memcpy(skb->data, (void *)hdr + NL_MMAP_HDRLEN, hdr->nm_len);
  netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED);

- Example code path requiring a smp_wmb():
  hdr->nm_uid	= from_kuid(sk_user_ns(sk), NETLINK_CB(skb).creds.uid);
  hdr->nm_gid	= from_kgid(sk_user_ns(sk), NETLINK_CB(skb).creds.gid);
  netlink_frame_flush_dcache(hdr);
  netlink_set_status(hdr, NL_MMAP_STATUS_VALID);

Fixes: f9c228 ("netlink: implement memory mapped recvmsg()")
Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a18e6a18

netlink: Always copy on mmap TX. · 4682a035

由 David Miller 提交于 12月 16, 2014

Checking the file f_count and the nlk->mapped count is not completely
sufficient to prevent the mmap'd area contents from changing from
under us during netlink mmap sendmsg() operations.

Be careful to sample the header's length field only once, because this
could change from under us as well.

Fixes: 5fd96123 ("netlink: implement memory mapped sendmsg()")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: NDaniel Borkmann <dborkman@redhat.com>
Acked-by: NThomas Graf <tgraf@suug.ch>

4682a035

11 12月, 2014 1 次提交

netlink: use jhash as hashfn for rhashtable · 7f19fc5e

由 Daniel Borkmann 提交于 12月 10, 2014

For netlink, we shouldn't be using arch_fast_hash() as a hashing
discipline, but rather jhash() instead.

Since netlink sockets can be opened by any user, a local attacker
would be able to easily create collisions with the DPDK-derived
arch_fast_hash(), which trades off performance for security by
using crc32 CPU instructions on x86_64.

While it might have a legimite use case in other places, it should
be avoided in netlink context, though. As rhashtable's API is very
flexible, we could later on still decide on other hashing disciplines,
if legitimate.

Reference: http://thread.gmane.org/gmane.linux.kernel/1844123
Fixes: e341694e ("netlink: Convert netlink_lookup() to use RCU protected hash table")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Acked-by: NThomas Graf <tgraf@suug.ch>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7f19fc5e

10 12月, 2014 1 次提交

put iov_iter into msghdr · c0371da6

由 Al Viro 提交于 11月 24, 2014

Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

c0371da6

24 11月, 2014 1 次提交
- A
  new helper: memcpy_from_msg() · 6ce8e9ce
  由 Al Viro 提交于 4月 06, 2014
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  6ce8e9ce
20 11月, 2014 1 次提交

netlink: Deletion of an unnecessary check before the function call "__module_get" · fcd4d35e

由 Markus Elfring 提交于 11月 18, 2014

The __module_get() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.
Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fcd4d35e

14 11月, 2014 3 次提交

rhashtable: Drop gfp_flags arg in insert/remove functions · 6eba8224

由 Thomas Graf 提交于 11月 13, 2014

Reallocation is only required for shrinking and expanding and both rely
on a mutex for synchronization and callers of rhashtable_init() are in
non atomic context. Therefore, no reason to continue passing allocation
hints through the API.

Instead, use GFP_KERNEL and add __GFP_NOWARN | __GFP_NORETRY to allow
for silent fall back to vzalloc() without the OOM killer jumping in as
pointed out by Eric Dumazet and Eric W. Biederman.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6eba8224

rhashtable: Add parent argument to mutex_is_held · 7b4ce235

由 Herbert Xu 提交于 11月 13, 2014

Currently mutex_is_held can only test locks in the that are global
since it takes no arguments.  This prevents rhashtable from being
used in places where locks are lock, e.g., per-namespace locks.

This patch adds a parent field to mutex_is_held and rhashtable_params
so that local locks can be used (and tested).
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7b4ce235

netlink: Move mutex_is_held under PROVE_LOCKING · 97127566

由 Herbert Xu 提交于 11月 13, 2014

The rhashtable function mutex_is_held is only used when PROVE_LOCKING
is enabled. This patch modifies netlink so that we can rhashtable.h
itself can later make mutex_is_held optional depending on PROVE_LOCKING.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

97127566

13 11月, 2014 1 次提交

netlink: Properly unbind in error conditions. · 6251edd9

由 Hiroaki SHIMODA 提交于 11月 13, 2014

Even if netlink_kernel_cfg::unbind is implemented the unbind() method is
not called, because cfg->unbind is omitted in __netlink_kernel_create().
And fix wrong argument of test_bit() and off by one problem.

At this point, no unbind() method is implemented, so there is no real
issue.

Fixes: 4f520900 ("netlink: have netlink per-protocol bind function return an error code.")
Signed-off-by: NHiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Cc: Richard Guy Briggs <rgb@redhat.com>
Acked-by: NRichard Guy Briggs <rgb@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6251edd9

06 11月, 2014 1 次提交

net: Add and use skb_copy_datagram_msg() helper. · 51f3d02b

由 David S. Miller 提交于 11月 05, 2014

This encapsulates all of the skb_copy_datagram_iovec() callers
with call argument signature "skb, offset, msghdr->msg_iov, length".

When we move to iov_iters in the networking, the iov_iter object will
sit in the msghdr.

Having a helper like this means there will be less places to touch
during that transformation.

Based upon descriptions and patch from Al Viro.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

51f3d02b

22 10月, 2014 1 次提交

netlink: Re-add locking to netlink_lookup() and seq walker · 78fd1d0a

由 Thomas Graf 提交于 10月 21, 2014

The synchronize_rcu() in netlink_release() introduces unacceptable
latency. Reintroduce minimal lookup so we can drop the
synchronize_rcu() until socket destruction has been RCUfied.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: NSteinar H. Gunderson <sgunderson@bigfoot.com>
Reported-and-tested-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

78fd1d0a

09 10月, 2014 1 次提交

fix misuses of f_count() in ppp and netlink · 24dff96a

由 Al Viro 提交于 10月 08, 2014

we used to check for "nobody else could start doing anything with
that opened file" by checking that refcount was 2 or less - one
for descriptor table and one we'd acquired in fget() on the way to
wherever we are.  That was race-prone (somebody else might have
had a reference to descriptor table and do fget() just as we'd
been checking) and it had become flat-out incorrect back when
we switched to fget_light() on those codepaths - unlike fget(),
it doesn't grab an extra reference unless the descriptor table
is shared.  The same change allowed a race-free check, though -
we are safe exactly when refcount is less than 2.

It was a long time ago; pre-2.6.12 for ioctl() (the codepath leading
to ppp one) and 2.6.17 for sendmsg() (netlink one).  OTOH,
netlink hadn't grown that check until 3.9 and ppp used to live
in drivers/net, not drivers/net/ppp until 3.1.  The bug existed
well before that, though, and the same fix used to apply in old
location of file.

Cc: stable@vger.kernel.org
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

24dff96a

15 8月, 2014 1 次提交

netlink: Annotate RCU locking for seq_file walker · 9ce12eb1

由 Thomas Graf 提交于 8月 13, 2014

Silences the following sparse warnings:
net/netlink/af_netlink.c:2926:21: warning: context imbalance in 'netlink_seq_start' - wrong count at exit
net/netlink/af_netlink.c:2972:13: warning: context imbalance in 'netlink_seq_stop' - unexpected unlock
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9ce12eb1

08 8月, 2014 1 次提交

netlink: reset network header before passing to taps · 4e48ed88

由 Daniel Borkmann 提交于 8月 07, 2014

netlink doesn't set any network header offset thus when the skb is
being passed to tap devices via dev_queue_xmit_nit(), it emits klog
false positives due to it being unset like:

  ...
  [  124.990397] protocol 0000 is buggy, dev nlmon0
  [  124.990411] protocol 0000 is buggy, dev nlmon0
  ...

So just reset the network header before passing to the device; for
packet sockets that just means nothing will change - mac and net
offset hold the same value just as before.
Reported-by: NMarcel Holtmann <marcel@holtmann.org>
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4e48ed88

07 8月, 2014 1 次提交

netlink: hold nl_sock_hash_lock during diag dump · 6c8f7e70

由 Thomas Graf 提交于 8月 07, 2014

Although RCU protection would be possible during diag dump, doing
so allows for concurrent table mutations which can render the
in-table offset between individual Netlink messages invalid and
thus cause legitimate sockets to be skipped in the dump.

Since the diag dump is relatively low volume and consistency is
more important than performance, the table mutex is held during
dump.
Reported-by: NAndrey Wagin <avagin@gmail.com>
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Fixes: e341694e ("netlink: Convert netlink_lookup() to use RCU protected hash table")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6c8f7e70

05 8月, 2014 1 次提交

netlink: fix lockdep splats · 67a24ac1

由 Eric Dumazet 提交于 8月 05, 2014

With netlink_lookup() conversion to RCU, we need to use appropriate
rcu dereference in netlink_seq_socket_idx() & netlink_seq_next()
Reported-by: NSasha Levin <sasha.levin@oracle.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Fixes: e341694e ("netlink: Convert netlink_lookup() to use RCU protected hash table")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

67a24ac1

03 8月, 2014 1 次提交

netlink: Convert netlink_lookup() to use RCU protected hash table · e341694e

由 Thomas Graf 提交于 8月 02, 2014

Heavy Netlink users such as Open vSwitch spend a considerable amount of
time in netlink_lookup() due to the read-lock on nl_table_lock. Use of
RCU relieves the lock contention.

Makes use of the new resizable hash table to avoid locking on the
lookup.

The hash table will grow if entries exceeds 75% of table size up to a
total table size of 64K. It will automatically shrink if usage falls
below 30%.

Also splits nl_table_lock into a separate mutex to protect hash table
mutations and allow synchronize_rcu() to sleep while waiting for readers
during expansion and shrinking.

Before:
   9.16%  kpktgend_0  [openvswitch]      [k] masked_flow_lookup
   6.42%  kpktgend_0  [pktgen]           [k] mod_cur_headers
   6.26%  kpktgend_0  [pktgen]           [k] pktgen_thread_worker
   6.23%  kpktgend_0  [kernel.kallsyms]  [k] memset
   4.79%  kpktgend_0  [kernel.kallsyms]  [k] netlink_lookup
   4.37%  kpktgend_0  [kernel.kallsyms]  [k] memcpy
   3.60%  kpktgend_0  [openvswitch]      [k] ovs_flow_extract
   2.69%  kpktgend_0  [kernel.kallsyms]  [k] jhash2

After:
  15.26%  kpktgend_0  [openvswitch]      [k] masked_flow_lookup
   8.12%  kpktgend_0  [pktgen]           [k] pktgen_thread_worker
   7.92%  kpktgend_0  [pktgen]           [k] mod_cur_headers
   5.11%  kpktgend_0  [kernel.kallsyms]  [k] memset
   4.11%  kpktgend_0  [openvswitch]      [k] ovs_flow_extract
   4.06%  kpktgend_0  [kernel.kallsyms]  [k] _raw_spin_lock
   3.90%  kpktgend_0  [kernel.kallsyms]  [k] jhash2
   [...]
   0.67%  kpktgend_0  [kernel.kallsyms]  [k] netlink_lookup
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Reviewed-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e341694e

01 8月, 2014 1 次提交

netlink: Use PAGE_ALIGNED macro · 74e83b23

由 Tobias Klauser 提交于 7月 31, 2014

Use PAGE_ALIGNED(...) instead of IS_ALIGNED(..., PAGE_SIZE).
Signed-off-by: NTobias Klauser <tklauser@distanz.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

74e83b23

17 7月, 2014 1 次提交

netlink: remove bool varible · 498044bb

由 Varka Bhadram 提交于 7月 16, 2014

This patch removes the bool variable 'pass'.
If the swith case exist return true or return false.
Signed-off-by: NVarka Bhadram <varkab@cdac.in>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

498044bb

10 7月, 2014 1 次提交

netlink: Fix handling of error from netlink_dump(). · ac30ef83

由 Ben Pfaff 提交于 7月 09, 2014

netlink_dump() returns a negative errno value on error.  Until now,
netlink_recvmsg() directly recorded that negative value in sk->sk_err, but
that's wrong since sk_err takes positive errno values.  (This manifests as
userspace receiving a positive return value from the recv() system call,
falsely indicating success.) This bug was introduced in the commit that
started checking the netlink_dump() return value, commit b44d211e (netlink:
handle errors from netlink_dump()).

Multithreaded Netlink dumps are one way to trigger this behavior in
practice, as described in the commit message for the userspace workaround
posted here:
    http://openvswitch.org/pipermail/dev/2014-June/042339.html

This commit also fixes the same bug in netlink_poll(), introduced in commit
cd1df525 (netlink: add flow control for memory mapped I/O).
Signed-off-by: NBen Pfaff <blp@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ac30ef83

08 7月, 2014 1 次提交

netlink: Fix do_one_broadcast() prototype. · 46c9521f

由 Rami Rosen 提交于 7月 01, 2014

This patch changes the prototype of the do_one_broadcast() method so that it will return void.
Signed-off-by: NRami Rosen <ramirose@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

46c9521f

03 6月, 2014 1 次提交

netlink: Only check file credentials for implicit destinations · 2d7a85f4

由 Eric W. Biederman 提交于 5月 30, 2014

It was possible to get a setuid root or setcap executable to write to
it's stdout or stderr (which has been set made a netlink socket) and
inadvertently reconfigure the networking stack.

To prevent this we check that both the creator of the socket and
the currentl applications has permission to reconfigure the network
stack.

Unfortunately this breaks Zebra which always uses sendto/sendmsg
and creates it's socket without any privileges.

To keep Zebra working don't bother checking if the creator of the
socket has privilege when a destination address is specified.  Instead
rely exclusively on the privileges of the sender of the socket.

Note from Andy: This is exactly Eric's code except for some comment
clarifications and formatting fixes.  Neither I nor, I think, anyone
else is thrilled with this approach, but I'm hesitant to wait on a
better fix since 3.15 is almost here.

Note to stable maintainers: This is a mess.  An earlier series of
patches in 3.15 fix a rather serious security issue (CVE-2014-0181),
but they did so in a way that breaks Zebra.  The offending series
includes:

    commit aa4cf945
    Author: Eric W. Biederman <ebiederm@xmission.com>
    Date:   Wed Apr 23 14:28:03 2014 -0700

        net: Add variants of capable for use on netlink messages

If a given kernel version is missing that series of fixes, it's
probably worth backporting it and this patch.  if that series is
present, then this fix is critical if you care about Zebra.

Cc: stable@vger.kernel.org
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d7a85f4

25 4月, 2014 2 次提交

net: Add variants of capable for use on netlink messages · aa4cf945

由 Eric W. Biederman 提交于 4月 23, 2014

netlink_net_capable - The common case use, for operations that are safe on a network namespace
netlink_capable - For operations that are only known to be safe for the global root
netlink_ns_capable - The general case of capable used to handle special cases

__netlink_ns_capable - Same as netlink_ns_capable except taking a netlink_skb_parms instead of
		       the skbuff of a netlink message.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aa4cf945

netlink: Rename netlink_capable netlink_allowed · 5187cd05

由 Eric W. Biederman 提交于 4月 23, 2014

netlink_capable is a static internal function in af_netlink.c and we
have better uses for the name netlink_capable.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5187cd05

23 4月, 2014 2 次提交

netlink: implement unbind to netlink_setsockopt NETLINK_DROP_MEMBERSHIP · 7774d5e0

由 Richard Guy Briggs 提交于 4月 22, 2014

Call the per-protocol unbind function rather than bind function on
NETLINK_DROP_MEMBERSHIP in netlink_setsockopt().
Signed-off-by: NRichard Guy Briggs <rgb@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7774d5e0

netlink: have netlink per-protocol bind function return an error code. · 4f520900

由 Richard Guy Briggs 提交于 4月 22, 2014

Have the netlink per-protocol optional bind function return an int error code
rather than void to signal a failure.

This will enable netlink protocols to perform extra checks including
capabilities and permissions verifications when updating memberships in
multicast groups.

In netlink_bind() and netlink_setsockopt() the call to the per-protocol bind
function was moved above the multicast group update to prevent any access to
the multicast socket groups before checking with the per-protocol bind
function.  This will enable the per-protocol bind function to be used to check
permissions which could be denied before making them available, and to avoid
the messy job of undoing the addition should the per-protocol bind function
fail.

The netfilter subsystem seems to be the only one currently using the
per-protocol bind function.
Signed-off-by: NRichard Guy Briggs <rgb@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f520900

12 4月, 2014 1 次提交

net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369

由 David S. Miller 提交于 4月 11, 2014

Several spots in the kernel perform a sequence like:

	skb_queue_tail(&sk->s_receive_queue, skb);
	sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up.  So this skb->len access is potentially
to freed up memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument.  And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

676d2369

11 3月, 2014 1 次提交

netlink: autosize skb lengthes · 9063e21f

由 Eric Dumazet 提交于 3月 07, 2014

One known problem with netlink is the fact that NLMSG_GOODSIZE is
really small on PAGE_SIZE==4096 architectures, and it is difficult
to know in advance what buffer size is used by the application.

This patch adds an automatic learning of the size.

First netlink message will still be limited to ~4K, but if user used
bigger buffers, then following messages will be able to use up to 16KB.

This speedups dump() operations by a large factor and should be safe
for legacy applications.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Thomas Graf <tgraf@suug.ch>
Acked-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9063e21f

26 2月, 2014 1 次提交

net: Fix permission check in netlink_connect() · 46833a86

由 Mike Pecovnik 提交于 2月 24, 2014

netlink_sendmsg() was changed to prevent non-root processes from sending
messages with dst_pid != 0.
netlink_connect() however still only checks if nladdr->nl_groups is set.
This patch modifies netlink_connect() to check for the same condition.
Signed-off-by: NMike Pecovnik <mike.pecovnik@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

46833a86

18 2月, 2014 1 次提交

netlink: fix checkpatch errors space and "foo *bar" · 23b45672

由 Wang Yufen 提交于 2月 17, 2014

ERROR: spaces required and "(foo*)" should be "(foo *)"
Signed-off-by: NWang Yufen <wangyufen@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

23b45672

19 1月, 2014 1 次提交

net: add build-time checks for msg->msg_name size · 342dfc30

由 Steffen Hurrle 提交于 1月 17, 2014

This is a follow-up patch to f3d33426 ("net: rework recvmsg
handler msg_name and msg_namelen logic").

DECLARE_SOCKADDR validates that the structure we use for writing the
name information to is not larger than the buffer which is reserved
for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
consistently in sendmsg code paths.
Signed-off-by: NSteffen Hurrle <steffen@hurrle.net>
Suggested-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

342dfc30

07 1月, 2014 1 次提交

netlink: Avoid netlink mmap alloc if msg size exceeds frame size · aae9f0e2

由 Thomas Graf 提交于 11月 30, 2013

An insufficent ring frame size configuration can lead to an
unnecessary skb allocation for every Netlink message. Check frame
size before taking the queue lock and allocating the skb and
re-check with lock to be safe.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Reviewed-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NJesse Gross <jesse@nicira.com>

aae9f0e2

02 1月, 2014 1 次提交

netlink: cleanup tap related functions · 2173f8d9

由 stephen hemminger 提交于 12月 30, 2013

Cleanups in netlink_tap code
 * remove unused function netlink_clear_multicast_users
 * make local function static
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Reviewed-by: NJohannes Berg <johannes@sipsolutions.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2173f8d9

01 1月, 2014 2 次提交

netlink: specify netlink packet direction for nlmon · 604d13c9

由 Daniel Borkmann 提交于 12月 23, 2013

In order to facilitate development for netlink protocol dissector,
fill the unused field skb->pkt_type of the cloned skb with a hint
of the address space of the new owner (receiver) socket in the
notion of "to kernel" resp. "to user".

At the time we invoke __netlink_deliver_tap_skb(), we already have
set the new skb owner via netlink_skb_set_owner_r(), so we can use
that for netlink_is_kernel() probing.

In normal PF_PACKET network traffic, this field denotes if the
packet is destined for us (PACKET_HOST), if it's broadcast
(PACKET_BROADCAST), etc.

As we only have 3 bit reserved, we can use the value (= 6) of
PACKET_FASTROUTE as it's _not used_ anywhere in the whole kernel
and not supported anywhere, and packets of such type were never
exposed to user space, so there are no overlapping users of such
kind. Thus, as wished, that seems the only way to make both
PACKET_* values non-overlapping and therefore device agnostic.

By using those two flags for netlink skbs on nlmon devices, they
can be made available and picked up via sll_pkttype (previously
unused in netlink context) in struct sockaddr_ll. We now have
these two directions:

 - PACKET_USER (= 6)    ->  to user space
 - PACKET_KERNEL (= 7)  ->  to kernel space

Partial `ip a` example strace for sa_family=AF_NETLINK with
detected nl msg direction:

syscall:                     direction:
sendto(3,  ...) = 40         /* to kernel */
recvmsg(3, ...) = 3404       /* to user */
recvmsg(3, ...) = 1120       /* to user */
recvmsg(3, ...) = 20         /* to user */
sendto(3,  ...) = 40         /* to kernel */
recvmsg(3, ...) = 168        /* to user */
recvmsg(3, ...) = 144        /* to user */
recvmsg(3, ...) = 20         /* to user */
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NJakub Zawadzki <darkjames-ws@darkjames.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

604d13c9

netlink: only do not deliver to tap when both sides are kernel sks · 73bfd370

由 Daniel Borkmann 提交于 12月 23, 2013

We should also deliver packets to nlmon devices when we are in
netlink_unicast_kernel(), and only one of the {src,dst} sockets
is user sk and the other one kernel sk. That's e.g. the case in
netlink diag, netlink route, etc. Still, forbid to deliver messages
from kernel to kernel sks.
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NJakub Zawadzki <darkjames-ws@darkjames.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73bfd370

21 11月, 2013 1 次提交

net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426

由 Hannes Frederic Sowa 提交于 11月 21, 2013

This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.

This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.

Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.

Also document these changes in include/linux/net.h as suggested by David
Miller.

Changes since RFC:

Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.

With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
	msg->msg_name = NULL
".

This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.

Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.

Cc: David Miller <davem@davemloft.net>
Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f3d33426

20 11月, 2013 1 次提交

netlink: fix documentation typo in netlink_set_err() · 840e93f2

由 Johannes Berg 提交于 11月 19, 2013

The parameter is just 'group', not 'groups', fix the documentation typo.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

840e93f2

07 9月, 2013 1 次提交

net: netlink: filter particular protocols from analyzers · 5ffd5cdd

由 Daniel Borkmann 提交于 9月 05, 2013

Fix finer-grained control and let only a whitelist of allowed netlink
protocols pass, in our case related to networking. If later on, other
subsystems decide they want to add their protocol as well to the list
of allowed protocols they shall simply add it. While at it, we also
need to tell what protocol is in use otherwise BPF_S_ANC_PROTOCOL can
not pick it up (as it's not filled out).
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5ffd5cdd

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功