- 28 6月, 2016 10 次提交
-
-
由 Huw Davies 提交于
CALIPSO is a hop-by-hop IPv6 option. A lot of this patch is based on the equivalent CISPO code. The main difference is due to manipulating the options in the hop-by-hop header. Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Huw Davies 提交于
This is to allow the CALIPSO labelling engine to use these. Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Huw Davies 提交于
The functionality is equivalent to ipv6_renew_options() except that the newopt pointer is in kernel, not user, memory The kernel memory implementation will be used by the CALIPSO network labelling engine, which needs to be able to set IPv6 hop-by-hop options. Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Huw Davies 提交于
Remove a specified DOI through the NLBL_CALIPSO_C_REMOVE command. It requires the attribute: NLBL_CALIPSO_A_DOI. Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Huw Davies 提交于
This extends the NLBL_MGMT_C_ADD and NLBL_MGMT_C_ADDDEF commands to accept CALIPSO protocol DOIs. Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Huw Davies 提交于
Enumerate the DOI list through the NLBL_CALIPSO_C_LISTALL command. It takes no attributes. Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Huw Davies 提交于
Query a specified DOI through the NLBL_CALIPSO_C_LIST command. It requires the attribute: NLBL_CALIPSO_A_DOI. The reply will contain: NLBL_CALIPSO_A_MTYPE Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Huw Davies 提交于
CALIPSO is a packet labelling protocol for IPv6 which is very similar to CIPSO. It is specified in RFC 5570. Much of the code is based on the current CIPSO code. This adds support for adding passthrough-type CALIPSO DOIs through the NLBL_CALIPSO_C_ADD command. It requires attributes: NLBL_CALIPSO_A_TYPE which must be CALIPSO_MAP_PASS. NLBL_CALIPSO_A_DOI. In passthrough mode the CALIPSO engine will map MLS secattr levels and categories directly to the packet label. At this stage, the major difference between this and the CIPSO code is that IPv6 may be compiled as a module. To allow for this the CALIPSO functions are registered at module init time. Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Huw Davies 提交于
The reason is to allow different labelling protocols for different address families with the same domain. This requires the addition of an address family attribute in the netlink communication protocol. It is used in several messages: NLBL_MGMT_C_ADD and NLBL_MGMT_C_ADDDEF take it as an optional attribute for the unlabelled protocol. It may be one of AF_INET, AF_INET6 or AF_UNSPEC (to specify both address families). If it is missing, it defaults to AF_UNSPEC. NLBL_MGMT_C_LISTALL and NLBL_MGMT_C_LISTDEF return it as part of the enumeration of each item. Addtionally, it may be sent to LISTDEF to specify which address family to return. Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Huw Davies 提交于
This fixes sparse errors of the form: incompatible types in comparison expression (different address spaces) Signed-off-by: NHuw Davies <huw@codeweavers.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
- 09 6月, 2016 1 次提交
-
-
由 Paul Moore 提交于
In cases where the category bitmap is sparse enough that gaps exist between netlbl_lsm_catmap structs, callers to netlbl_catmap_getlong() could find themselves prematurely ending their search through the category bitmap. Further, the methods used to calculate the 'idx' and 'off' values were incorrect for bitmaps this large. This patch changes the netlbl_catmap_getlong() behavior so that it always skips over gaps and calculates the index and offset values correctly. Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
- 07 6月, 2016 2 次提交
-
-
由 Paul Moore 提交于
Much like we had to do for AF_BLUETOOTH and AF_ALG, make sure we properly clone the parent socket's LSM attributes to newly created child sockets. Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Paul Moore 提交于
It seems risky to always rely on the caller to ensure the socket's address family is correct before passing it to the NetLabel kAPI, especially since we see at least one LSM which didn't. Add address family checks to the *_delattr() functions to help prevent future problems. Cc: <stable@vger.kernel.org> Reported-by: NManinder Singh <maninder1.s@samsung.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
- 12 4月, 2016 1 次提交
-
-
由 David Howells 提交于
Add a facility whereby proposed new links to be added to a keyring can be vetted, permitting them to be rejected if necessary. This can be used to block public keys from which the signature cannot be verified or for which the signature verification fails. It could also be used to provide blacklisting. This affects operations like add_key(), KEYCTL_LINK and KEYCTL_INSTANTIATE. To this end: (1) A function pointer is added to the key struct that, if set, points to the vetting function. This is called as: int (*restrict_link)(struct key *keyring, const struct key_type *key_type, unsigned long key_flags, const union key_payload *key_payload), where 'keyring' will be the keyring being added to, key_type and key_payload will describe the key being added and key_flags[*] can be AND'ed with KEY_FLAG_TRUSTED. [*] This parameter will be removed in a later patch when KEY_FLAG_TRUSTED is removed. The function should return 0 to allow the link to take place or an error (typically -ENOKEY, -ENOPKG or -EKEYREJECTED) to reject the link. The pointer should not be set directly, but rather should be set through keyring_alloc(). Note that if called during add_key(), preparse is called before this method, but a key isn't actually allocated until after this function is called. (2) KEY_ALLOC_BYPASS_RESTRICTION is added. This can be passed to key_create_or_update() or key_instantiate_and_link() to bypass the restriction check. (3) KEY_FLAG_TRUSTED_ONLY is removed. The entire contents of a keyring with this restriction emplaced can be considered 'trustworthy' by virtue of being in the keyring when that keyring is consulted. (4) key_alloc() and keyring_alloc() take an extra argument that will be used to set restrict_link in the new key. This ensures that the pointer is set before the key is published, thus preventing a window of unrestrictedness. Normally this argument will be NULL. (5) As a temporary affair, keyring_restrict_trusted_only() is added. It should be passed to keyring_alloc() as the extra argument instead of setting KEY_FLAG_TRUSTED_ONLY on a keyring. This will be replaced in a later patch with functions that look in the appropriate places for authoritative keys. Signed-off-by: NDavid Howells <dhowells@redhat.com> Reviewed-by: NMimi Zohar <zohar@linux.vnet.ibm.com>
-
- 06 4月, 2016 1 次提交
-
-
由 Janak Desai 提交于
We try to be clever and set large chunks of the bitmap at once, when possible; unfortunately we weren't very clever when we wrote the code and messed up the if-conditional. Fix this bug and restore proper operation. Signed-off-by: NJanak Desai <Janak.Desai@gtri.gatech.edu> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
- 02 4月, 2016 1 次提交
-
-
由 Daniel Borkmann 提交于
Sasha Levin reported a suspicious rcu_dereference_protected() warning found while fuzzing with trinity that is similar to this one: [ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage! [ 52.765688] other info that might help us debug this: [ 52.765695] rcu_scheduler_active = 1, debug_locks = 1 [ 52.765701] 1 lock held by a.out/1525: [ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20 [ 52.765721] stack backtrace: [ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264 [...] [ 52.765768] Call Trace: [ 52.765775] [<ffffffff813e488d>] dump_stack+0x85/0xc8 [ 52.765784] [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110 [ 52.765792] [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90 [ 52.765801] [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun] [ 52.765810] [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun] [ 52.765818] [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210 [ 52.765827] [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun] [ 52.765834] [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690 [ 52.765843] [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60 [ 52.765850] [<ffffffff81261519>] SyS_ioctl+0x79/0x90 [ 52.765858] [<ffffffff81003ba2>] do_syscall_64+0x62/0x140 [ 52.765866] [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25 Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH, DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices. Since the fix in f91ff5b9 ("net: sk_{detach|attach}_filter() rcu fixes") sk_attach_filter()/sk_detach_filter() now dereferences the filter with rcu_dereference_protected(), checking whether socket lock is held in control path. Since its introduction in 99405162 ("tun: socket filter support"), tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the sock_owned_by_user(sk) doesn't apply in this specific case and therefore triggers the false positive. Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair that is used by tap filters and pass in lockdep_rtnl_is_held() for the rcu_dereference_protected() checks instead. Reported-by: NSasha Levin <sasha.levin@oracle.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 4月, 2016 1 次提交
-
-
由 Nicolas Dichtel 提交于
Size of the attribute IFLA_PHYS_PORT_NAME was missing. Fixes: db24a904 ("net: add support for phys_port_name") CC: David Ahern <dsahern@gmail.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 3月, 2016 5 次提交
-
-
由 Daniel Borkmann 提交于
Make the 2 byte padding in struct bpf_tunnel_key between tunnel_ttl and tunnel_label members explicit. No issue has been observed, and gcc/llvm does padding for the old struct already, where tunnel_label was not yet present, so the current code works, but since it's part of uapi, make sure we don't introduce holes in structs. Therefore, add tunnel_ext that we can use generically in future (f.e. to flag OAM messages for backends, etc). Also add the offset to the compat tests to be sure should some compilers not padd the tail of the old version of bpf_tunnel_key. Fixes: 4018ab18 ("bpf: support flow label for bpf_skb_{set, get}_tunnel_key") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
IPv6 counters updates use a different macro than IPv4. Fixes: 36cbb245 ("udp: Increment UDP_MIB_IGNOREDMULTI for arriving unmatched multicasts") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Rick Jones <rick.jones2@hp.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
This patch should fix the issues seen with a recent fix to prevent tunnel-in-tunnel frames from being generated with GRO. The fix itself is correct for now as long as we do not add any devices that support NETIF_F_GSO_GRE_CSUM. When such a device is added it could have the potential to mess things up due to the fact that the outer transport header points to the outer UDP header and not the GRE header as would be expected. Fixes: fac8e0f5 ("tunnels: Don't apply GRO to multiple layers of encapsulation.") Signed-off-by: NAlexander Duyck <aduyck@mirantis.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Marcelo Ricardo Leitner 提交于
Somehow my patch for commit cea8768f ("sctp: allow sctp_transmit_packet and others to use gfp") missed two important chunks, which are now added. Fixes: cea8768f ("sctp: allow sctp_transmit_packet and others to use gfp") Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-By: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Haishuang Yan 提交于
When NET_SWITCHDEV=n, switchdev_port_attr_set will return -EOPNOTSUPP, we should ignore this error code and continue to set the ageing time. Fixes: c62987bb ("bridge: push bridge setting ageing_time down to switchdev") Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 3月, 2016 11 次提交
-
-
由 Liping Zhang 提交于
Commit fa50d974 ("ipv4: Namespaceify ip_default_ttl sysctl knob") use sock_net(skb->sk) to get the net namespace, but we can't assume that sk_buff->sk is always exist, so when it is NULL, oops will happen. Signed-off-by: NLiping Zhang <liping.zhang@spreadtrum.com> Reviewed-by: NNikolay Borisov <kernel@kyup.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
Make sure the table names via getsockopt GET_ENTRIES is nul-terminated in ebtables and all the x_tables variants and their respective compat code. Uncovered by KASAN. Reported-by: NBaozeng Ding <sploving1@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
When netlink unicast fails to deliver the message to userspace, we should also check if the NFQA_CFG_F_FAIL_OPEN flag is set so we reinject the packet back to the stack. I think the user expects no packet drops when this flag is set due to queueing to userspace errors, no matter if related to the internal queue or when sending the netlink message to userspace. The userspace application will still get the ENOBUFS error via recvmsg() so the user still knows that, with the current configuration that is in place, the userspace application is not consuming the messages at the pace that the kernel needs. Reported-by: N"Yigal Reiss (yreiss)" <yreiss@cisco.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Tested-by: N"Yigal Reiss (yreiss)" <yreiss@cisco.com>
-
由 Florian Westphal 提交于
Ben Hawkes says: In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it is possible for a user-supplied ipt_entry structure to have a large next_offset field. This field is not bounds checked prior to writing a counter value at the supplied offset. Problem is that mark_source_chains should not have been called -- the rule doesn't have a next entry, so its supposed to return an absolute verdict of either ACCEPT or DROP. However, the function conditional() doesn't work as the name implies. It only checks that the rule is using wildcard address matching. However, an unconditional rule must also not be using any matches (no -m args). The underflow validator only checked the addresses, therefore passing the 'unconditional absolute verdict' test, while mark_source_chains also tested for presence of matches, and thus proceeeded to the next (not-existent) rule. Unify this so that all the callers have same idea of 'unconditional rule'. Reported-by: NBen Hawkes <hawkes@google.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Otherwise this function may read data beyond the ruleset blob. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
We should check that e->target_offset is sane before mark_source_chains gets called since it will fetch the target entry for loop detection. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arnd Bergmann 提交于
The openvswitch code has gained support for calling into the nf-nat-ipv4/ipv6 modules, however those can be loadable modules in a configuration in which openvswitch is built-in, leading to link errors: net/built-in.o: In function `__ovs_ct_lookup': :(.text+0x2cc2c8): undefined reference to `nf_nat_icmp_reply_translation' :(.text+0x2cc66c): undefined reference to `nf_nat_icmpv6_reply_translation' The dependency on (!NF_NAT || NF_NAT) prevents similar issues, but NF_NAT is set to 'y' if any of the symbols selecting it are built-in, but the link error happens when any of them are modular. A second issue is that even if CONFIG_NF_NAT_IPV6 is built-in, CONFIG_NF_NAT_IPV4 might be completely disabled. This is unlikely to be useful in practice, but the driver currently only handles IPv6 being optional. This patch improves the Kconfig dependency so that openvswitch cannot be built-in if either of the two other symbols are set to 'm', and it replaces the incorrect #ifdef in ovs_ct_nat_execute() with two "if (IS_ENABLED())" checks that should catch all corner cases also make the code more readable. The same #ifdef exists ovs_ct_nat_to_attr(), where it does not cause a link error, but for consistency I'm changing it the same way. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Fixes: 05752523 ("openvswitch: Interface with NAT.") Acked-by: NJoe Stringer <joe@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Jarno Rajahalme 提交于
OVS should call into CT NAT for packets of new expected connections only when the conntrack state is persisted with the 'commit' option to the OVS CT action. The test for this condition is doubly wrong, as the CT status field is ANDed with the bit number (IPS_EXPECTED_BIT) rather than the mask (IPS_EXPECTED), and due to the wrong assumption that the expected bit would apply only for the first (i.e., 'new') packet of a connection, while in fact the expected bit remains on for the lifetime of an expected connection. The 'ctinfo' value IP_CT_RELATED derived from the ct status can be used instead, as it is only ever applicable to the 'new' packets of the expected connection. Fixes: 05752523 ('openvswitch: Interface with NAT.') Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Vishwanath Pai 提交于
This fix adds a new reference counter (ref_netlink) for the struct ip_set. The other reference counter (ref) can be swapped out by ip_set_swap and we need a separate counter to keep track of references for netlink events like dump. Using the same ref counter for dump causes a race condition which can be demonstrated by the following script: ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \ counters ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \ counters ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \ counters ipset save & ipset swap hash_ip3 hash_ip2 ipset destroy hash_ip3 /* will crash the machine */ Swap will exchange the values of ref so destroy will see ref = 0 instead of ref = 1. With this fix in place swap will not succeed because ipset save still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink). Both delete and swap will error out if ref_netlink != 0 on the set. Note: The changes to *_head functions is because previously we would increment ref whenever we called these functions, we don't do that anymore. Reviewed-by: NJoshua Hunt <johunt@akamai.com> Signed-off-by: NVishwanath Pai <vpai@akamai.com> Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Haishuang Yan 提交于
For the input parameter count, it's better to use the size of destination buffer size, as nla_memcpy would take into account the length of the source netlink attribute when a data is copied from an attribute. Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Quentin Armitage 提交于
For a route with IPv6 encapsulation, the traffic class and hop limit values are interchanged when returned to userspace by the kernel. For example, see below. ># ip route add 192.168.0.1 dev eth0.2 encap ip6 dst 0x50 tc 0x50 hoplimit 100 table 1000 ># ip route show table 1000 192.168.0.1 encap ip6 id 0 src :: dst fe83::1 hoplimit 80 tc 100 dev eth0.2 scope link Signed-off-by: NQuentin Armitage <quentin@armitage.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 3月, 2016 7 次提交
-
-
由 Geliang Tang 提交于
Use KMEM_CACHE() instead of kmem_cache_create() to simplify the code. Signed-off-by: NGeliang Tang <geliangtang@163.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Don't open-code sizeof_footer() in read_partial_message() and ceph_msg_revoke(). Also, after switching to sizeof_footer(), it's now possible to use con_out_kvec_add() in prepare_write_message_footer(). Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Yan, Zheng 提交于
This helper duplicates last extent operation in OSD request, then adjusts the new extent operation's offset and length. The helper is for scatterd page writeback, which adds nonconsecutive dirty pages to single OSD request. Signed-off-by: NYan, Zheng <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Turn r_ops into a flexible array member to enable large, consisting of up to 16 ops, OSD requests. The use case is scattered writeback in cephfs and, as far as the kernel client is concerned, 16 is just a made up number. r_ops had size 3 for copyup+hint+write, but copyup is really a special case - it can only happen once. ceph_osd_request_cache is therefore stuffed with num_ops=2 requests, anything bigger than that is allocated with kmalloc(). req_mempool is backed by ceph_osd_request_cache, which means either num_ops=1 or num_ops=2 for use_mempool=true - all existing users (ceph_writepages_start(), ceph_osdc_writepages()) are fine with that. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
ceph_osd_request_cache was introduced a long time ago. Also, osd_req is about to get a flexible array member, which ceph_osd_request_cache is going to be aware of. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Although msg_size is calculated correctly, the terms are grouped in a misleading way - snaps appears to not have room for a u32 length. Move calculation closer to its use and regroup terms. No functional change. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
This avoids defining large array of r_reply_op_{len,result} in in struct ceph_osd_request. Signed-off-by: NYan, Zheng <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-