- 21 10月, 2015 8 次提交
-
-
由 Alexander Aring 提交于
This patch removes the IPHC related defines for doing bit manipulation from global 6lowpan header to the iphc file which should the only one implementation which use these defines. Also move next header compression defines to their nhc implementation. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Alexander Aring 提交于
This patch removes the lowpan_fetch_skb_u8 function for getting the iphc bytes. Instead we using the generic which has a len parameter to tell the amount of bytes to fetch. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Acked-by: NJukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Alexander Aring 提交于
This patch changes the lowpan_header_decompress function by removing inklayer related information from parameters. This is currently for supporting short and extended address for iphc handling in 802154. We don't support short address handling anyway right now, but there exists already code for handling short addresses in lowpan_header_decompress. The address parameters are also changed to a void pointer, so 6LoWPAN linklayer specific code can put complex structures as these parameters and cast it again inside the generic code by evaluating linklayer type before. The order is also changed by destination address at first and then source address, which is the same like all others functions where destination is always the first, memcpy, dev_hard_header, lowpan_header_compress, etc. This patch also moves the fetching of iphc values from 6LoWPAN linklayer specific code into the generic branch. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Acked-by: NJukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Alexander Aring 提交于
This patch changes the lowpan_header_compress function by removing unused parameters like "len" and drop static value parameters of protocol type. Instead we really check the protocol type inside inside the skb structure. Also we drop the use of IEEE802154_ADDR_LEN which is link-layer specific. Instead we using EUI64_ADDR_LEN which should always the default case for now. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Acked-by: NJukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Alexander Aring 提交于
This patch introduces the LOWPAN_IPHC_MAX_HC_BUF_LEN define which represent the worst-case supported IPHC buffer length. It's used to allocate the stack buffer space for creating the IPHC header. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Acked-by: NJukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Marcel Holtmann 提交于
Before the vendor specific setup stage is triggered call back into the core to trigger an internal notification event. That event is used to send an index update to the monitor interface. With that specific event it is possible to update userspace with manufacturer information before any HCI command has been executed. This is useful for early stage debugging of vendor specific initialization sequences. Signed-off-by: NMarcel Holtmann <marcel@holtmann.org> Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
-
由 Marcel Holtmann 提交于
If the diagnostic settings are not persistent over HCI Reset, then this quirk can be used to tell the Bluetoth core about it. This will ensure that the settings are programmed correctly when the controller is powered up. Signed-off-by: NMarcel Holtmann <marcel@holtmann.org> Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
-
由 Johan Hedberg 提交于
There are LE devices on the market that start off by announcing their public address and then once paired switch to using private address. To be interoperable with such devices we should simply trust the fact that we're receiving an IRK from them to indicate that they may use private addresses in the future. Instead, simply tie the persistency to the bonding/no-bonding information the same way as for LTKs and CSRKs. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
- 19 10月, 2015 2 次提交
-
-
由 stephen hemminger 提交于
Add missing rule to export mpls iptunnel header needed by iproute2 Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
At the time of commit fff32699 ("tcp: reflect SYN queue_mapping into SYNACK packets") we had little ways to cope with SYN floods. We no longer need to reflect incoming skb queue mappings, and instead can pick a TX queue based on cpu cooking the SYNACK, with normal XPS affinities. Note that all SYNACK retransmits were picking TX queue 0, this no longer is a win given that SYNACK rtx are now distributed on all cpus. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 10月, 2015 4 次提交
-
-
由 Eric Dumazet 提交于
Greg reported crashes hitting the following check in __sk_backlog_rcv() BUG_ON(!sock_flag(sk, SOCK_MEMALLOC)); The pfmemalloc bit is currently checked in sk_filter(). This works correctly for TCP, because sk_filter() is ran in tcp_v[46]_rcv() before hitting the prequeue or backlog checks. For UDP or other protocols, this does not work, because the sk_filter() is ran from sock_queue_rcv_skb(), which might be called _after_ backlog queuing if socket is owned by user by the time packet is processed by softirq handler. Fixes: b4b9e355 ("netvm: set PF_MEMALLOC as appropriate during SKB processing") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NGreg Thelen <gthelen@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
A recent change to the dst_output handling caused a new warning when the call to NF_HOOK() is the only used of a local variable passed as 'dev', and CONFIG_NETFILTER is disabled: net/ipv6/ip6_output.c: In function 'ip6_output': net/ipv6/ip6_output.c:135:21: warning: unused variable 'dev' [-Wunused-variable] The reason for this is that the NF_HOOK macro in this case does not reference the variable at all, and the call to dev_net(dev) got removed from the ip6_output function. To avoid that warning now and in the future, this changes the macro into an equivalent inline function, which tells the compiler that the variable is passed correctly but still unused. The dn_forward function apparently had the same problem in the past and added a local workaround that no longer works with the inline function. In order to avoid a regression, we have to also remove the #ifdef from decnet in the same patch. Fixes: ede2059d ("dst: Pass net into dst->output") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
We don't care if module is being unloaded anymore since hook unregister handling will destroy queue entries using that hook. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
since commit 8405a8ff ("netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook") all pending queued entries are discarded. So we can simply remove all of the owner handling -- when module is removed it also needs to unregister all its hooks. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 16 10月, 2015 3 次提交
-
-
由 Jiri Pirko 提交于
This newly introduced netdevice notifier is called before actual change upper happens. That provides a possibility for notifier handlers to know upper change will happen and react to it, including possibility to forbid the change. That is valuable for drivers which can check if the upper device linkage is supported and forbid that in case it is not. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Under stress, a close() on a listener can trigger the WARN_ON(sk->sk_ack_backlog) in inet_csk_listen_stop() We need to test if listener is still active before queueing a child in inet_csk_reqsk_queue_add() Create a common inet_child_forget() helper, and use it from inet_csk_reqsk_queue_add() and inet_csk_listen_stop() Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Let's reduce the confusion about inet_csk_reqsk_queue_drop() : In many cases we also need to release reference on request socket, so add a helper to do this, reducing code size and complexity. Fixes: 4bdc3d66 ("tcp/dccp: fix behavior of stale SYN_RECV request sockets") Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 10月, 2015 11 次提交
-
-
由 Jiri Pirko 提交于
Similar to the attr usecase, the caller knows if he is holding RTNL and is in atomic section. So let the called to decide the correct call variant. This allows drivers to sleep inside their ops and wait for hw to get the operation status. Then the status is propagated into switchdev core. This avoids silent errors in drivers. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
When object is used in deferred work, we cannot use pointers in switchdev object structures because the memory they point at may be already used by someone else. So rather do local copy of the value. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NScott Feldman <sfeldma@gmail.com> Reviewed-by: NJohn Fastabend <john.r.fastabend@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Caller should know if he can call attr_set directly (when holding RTNL) or if he has to defer the att_set processing for later. This also allows drivers to sleep inside attr_set and report operation status back to switchdev core. Switchdev core then warns if status is not ok, instead of silent errors happening in drivers. Benefit from newly introduced switchdev deferred ops infrastructure. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Introduce infrastructure which will be used internally to defer ops. Note that the deferred ops are queued up and either are processed by scheduled work or explicitly by user calling deferred_process function. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jack Morgenstein 提交于
By design, when no default MAC addresses are set in the Hypervisor for VFs, the VFs are passed zero-macs. When such a MAC is received by the VF, it generates a random MAC address and registers that MAC address with the Hypervisor. This random mac generation is currently done in the mlx4_en module. There is a problem, though, if the mlx4_ib module is loaded by a VF before the mlx4_en module. In this case, for RoCE, mlx4_ib will see the un-replaced zero-mac and register that zero-mac as part of QP1 initialization. Having a zero-mac in the port's MAC table creates problems for a Baseboard Management Console. The BMC occasionally sends packets with a zero-mac destination MAC. If there is a zero-mac present in the port's MAC table, the FW will send such BMC packets to the host driver rather than to the wire, and BMC will stop working. To address this problem, we move the replacement of zero-mac addresses with random-mac addresses to procedure mlx4_slave_cap(), which is part of the driver startup for VFs, and is before activation of mlx4_ib and mlx4_en. As a result, zero-mac addresses will never be registered in the port MAC table by the driver. In addition, when mlx4_en does initialize the net device, it needs to set the NET_ADDR_RANDOM flag in the netdev structure if the address was randomly generated. This is done so that udev on the VM does not create a new device name after each VF probe (VM boot and such). To accomplish this, we add a per-port flag in mlx4_dev which gets set whenever mlx4_core replaces a zero-mac with a randomly-generated mac. This flag is examined when mlx4_en initializes the net-device. Fix was suggested by Matan Barak <matanb@mellanox.com> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eli Cohen 提交于
On device initialization, wait till firmware indicates that that it is done with initialization before proceeding to initialize the device. Also update initialization segment layout to match driver/firmware interface definitions. Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Majd Dibbiny 提交于
This patch implement the pci_error_handlers for mlx5_core which allow the driver to recover from PCI error. Once an error is detected in the PCI, the mlx5_pci_err_detected is called and it: 1) Marks the device to be in 'Internal Error' state. 2) Dispatches an event to the mlx5_ib to flush all the outstanding cqes with error. 3) Returns all the on going commands with error. 4) Unloads the driver. Afterwards, the FW is reset and mlx5_pci_slot_reset is called and it enables the device and restore it's pci state. If the later succeeds, mlx5_pci_resume is called, and it loads the SW stack. Signed-off-by: NMajd Dibbiny <majd@mellanox.com> Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eli Cohen 提交于
The detection of a fatal condition has been updated to take into account the state reported by the device or by detecting an all ones read of the firmware version which indicates that the device is not accessible. Signed-off-by: NEli Cohen <eli@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
At listen() time, there is a small window where listener is visible with a zero backlog, triggering a spurious "Possible SYN flooding on port" message. Nothing prevents us from setting the correct backlog. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dave Airlie 提交于
This zeroes the msg so no random stack data ends up getting sent, it also limits the function to not accepting > 4 i2c msgs. Cc: stable@vger.kernel.org Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 14 10月, 2015 1 次提交
-
-
由 Paolo Abeni 提交于
Revert the commit e2ca690b ("ipv4/icmp: redirect messages can use the ingress daddr as source"), which tried to introduce a more suitable behaviour for ICMP redirect messages generated by VRRP routers. However RFC 5798 section 8.1.1 states: The IPv4 source address of an ICMP redirect should be the address that the end-host used when making its next-hop routing decision. while said commit used the generating packet destination address, which do not match the above and in most cases leads to no redirect packets to be generated. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 10月, 2015 11 次提交
-
-
由 Alexandre Belloni 提交于
struct at91_can_data was used to pass a callback to the driver, allowing it to switch the transceiver on and off. As all at91 boards are now using DT, this is not used anymore, remove that structure. Signed-off-by: NAlexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
-
由 Arnd Bergmann 提交于
The can subsystem communicates with user space using a bcm_msg_head header, which contains two timestamps. This is problematic for multiple reasons: a) The structure layout is currently incompatible between 64-bit user space and 32-bit user space, and cannot work in compat mode (other than x32). b) The timeval structure layout will change in 32-bit user space when we fix the y2038 overflow problem by redefining time_t to 64-bit, making new 32-bit user space incompatible with the current kernel interface. Cars last a long time and often use old kernels, so the actual users of this code are the most likely ones to migrate to y2038 safe user space. This tries to work around part of the problem by changing the publicly visible user interface in the header, but not the binary interface. Fortunately, the values passed around in the structure are relative times and do not actually suffer from the y2038 overflow, so 32-bit is enough here. We replace the use of 'struct timeval' with a newly defined 'struct bcm_timeval' that uses the exact same binary layout as before and that still suffers from problem a) but not problem b). The downside of this approach is that any user space program that currently assigns a timeval structure to these members rather than writing the tv_sec/tv_usec portions individually will suffer a compile-time error when built with an updated kernel header. Fixing this error makes it work fine with old and new headers though. We could address problem a) by using '__u32' or 'int' members rather than 'long', but that would have a more significant downside in also breaking support for all existing 64-bit user binaries that might be using this interface, which is likely not acceptable. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NOliver Hartkopp <socketcan@hartkopp.net> Cc: linux-can@vger.kernel.org Cc: linux-api@vger.kernel.org Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
-
由 David Ahern 提交于
Add operations to retrieve cached IPv6 dst entry from l3mdev device and lookup IPv6 source address. Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
The function nf_ct_frag6_gather is called on both the input and the output paths of the networking stack. In particular ipv6_defrag which calls nf_ct_frag6_gather is called from both the the PRE_ROUTING chain on input and the LOCAL_OUT chain on output. The addition of a net parameter makes it explicit which network namespace the packets are being reassembled in, and removes the need for nf_ct_frag6_gather to guess. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Acked-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
The function ip_defrag is called on both the input and the output paths of the networking stack. In particular conntrack when it is tracking outbound packets from the local machine calls ip_defrag. So add a struct net parameter and stop making ip_defrag guess which network namespace it needs to defragment packets in. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Acked-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arad, Ronen 提交于
RTA_ALIGNTO is currently define as 4. It has to be 4U to prevent warning for RTA_ALIGN and RTA_DATA expansions when -Wconversion gcc option is enabled. This follows NLMSG_ALIGNTO definition in <include/uapi/linux/netlink.h>. Signed-off-by: NRonen Arad <ronen.arad@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
This patch allows configuring how the source address of ICMP redirect messages is selected; by default the old behaviour is retained, while setting icmp_redirects_use_orig_daddr force the usage of the destination address of the packet that caused the redirect. The new behaviour fits closely the RFC 5798 section 8.1.1, and fix the following scenario: Two machines are set up with VRRP to act as routers out of a subnet, they have IPs x.x.x.1/24 and x.x.x.2/24, with VRRP holding on to x.x.x.254/24. If a host in said subnet needs to get an ICMP redirect from the VRRP router, i.e. to reach a destination behind a different gateway, the source IP in the ICMP redirect is chosen as the primary IP on the interface that the packet arrived at, i.e. x.x.x.1 or x.x.x.2. The host will then ignore said redirect, due to RFC 1122 section 3.2.2.2, and will continue to use the wrong next-op. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Reducing tcp_timewait_sock from 280 bytes to 272 bytes allows SLAB to pack 15 objects per page instead of 14 (on x86) Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
One 32bit hole is following skc_refcnt, use it. skc_incoming_cpu can also be an union for request_sock rcv_wnd. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
sk->sk_refcnt is dirtied for every TCP/UDP incoming packet. This is a performance issue if multiple cpus hit a common socket, or multiple sockets are chained due to SO_REUSEPORT. By moving sk_refcnt 8 bytes further, first 128 bytes of sockets are mostly read. As they contain the lookup keys, this has a considerable performance impact, as cpus can cache them. These 8 bytes are not wasted, we use them as a place holder for various fields, depending on the socket type. Tested: SYN flood hitting a 16 RX queues NIC. TCP listener using 16 sockets and SO_REUSEPORT and SO_INCOMING_CPU for proper siloing. Could process 6.0 Mpps SYN instead of 4.2 Mpps Kernel profile looked like : 11.68% [kernel] [k] sha_transform 6.51% [kernel] [k] __inet_lookup_listener 5.07% [kernel] [k] __inet_lookup_established 4.15% [kernel] [k] memcpy_erms 3.46% [kernel] [k] ipt_do_table 2.74% [kernel] [k] fib_table_lookup 2.54% [kernel] [k] tcp_make_synack 2.34% [kernel] [k] tcp_conn_request 2.05% [kernel] [k] __netif_receive_skb_core 2.03% [kernel] [k] kmem_cache_alloc Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
SO_INCOMING_CPU as added in commit 2c8c56e1 was a getsockopt() command to fetch incoming cpu handling a particular TCP flow after accept() This commits adds setsockopt() support and extends SO_REUSEPORT selection logic : If a TCP listener or UDP socket has this option set, a packet is delivered to this socket only if CPU handling the packet matches the specified one. This allows to build very efficient TCP servers, using one listener per RX queue, as the associated TCP listener should only accept flows handled in softirq by the same cpu. This provides optimal NUMA behavior and keep cpu caches hot. Note that __inet_lookup_listener() still has to iterate over the list of all listeners. Following patch puts sk_refcnt in a different cache line to let this iteration hit only shared and read mostly cache lines. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-