- 01 5月, 2012 18 次提交
-
-
由 Eric Dumazet 提交于
Add ECN (Explicit Congestion Notification) marking capability to netem tc qdisc add dev eth0 root netem drop 0.5 ecn Instead of dropping packets, try to ECN mark them. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Stephen Hemminger <shemminger@vyatta.com> Acked-by: NHagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCP or UDP stacks have big enough latencies that prefetching next pointer is worth it. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 James Chapman 提交于
The netlink API lets users create unmanaged L2TPv3 tunnels using iproute2. Until now, a request to create an unmanaged L2TPv3 IP encapsulation tunnel over IPv6 would be rejected with EPROTONOSUPPORT. Now that l2tp_ip6 implements sockets for L2TP IP encapsulation over IPv6, we can add support for that tunnel type. Signed-off-by: NJames Chapman <jchapman@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Elston 提交于
L2TPv3 defines an IP encapsulation packet format where data is carried directly over IP (no UDP). The kernel already has support for L2TP IP encapsulation over IPv4 (l2tp_ip). This patch introduces support for L2TP IP encapsulation over IPv6. The implementation is derived from ipv6/raw and ipv4/l2tp_ip. Signed-off-by: NChris Elston <celston@katalix.com> Signed-off-by: NJames Chapman <jchapman@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Elston 提交于
For implementing other protocols on top of IPv6, such as L2TPv3's IP encapsulation over ipv6, we'd like to call some IPv6 functions which are not currently exported. This patch exports them. Signed-off-by: NChris Elston <celston@katalix.com> Signed-off-by: NJames Chapman <jchapman@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Elston 提交于
This patch adds support for unmanaged L2TPv3 tunnels over IPv6 using the netlink API. We already support unmanaged L2TPv3 tunnels over IPv4. A patch to iproute2 to make use of this feature will be submitted separately. Signed-off-by: NChris Elston <celston@katalix.com> Signed-off-by: NJames Chapman <jchapman@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Elston 提交于
If an L2TP tunnel uses IPv6, make sure the l2tp debugfs file shows the IPv6 address correctly. Signed-off-by: NChris Elston <celston@katalix.com> Signed-off-by: NJames Chapman <jchapman@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 James Chapman 提交于
Userspace uses connect() to associate a pppol2tp socket with a tunnel socket. This needs to allow the caller to supply the new IPv6 sockaddr_pppol2tp structures if IPv6 is used. Signed-off-by: NJames Chapman <jchapman@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 James Chapman 提交于
The l2tp_ip socket currently maintains packet/byte stats in its private socket structure. But these counters aren't exposed to userspace and so serve no purpose. The counters were also smp-unsafe. So this patch just gets rid of the stats. While here, change a couple of internal __u32 variables to u32. Signed-off-by: NJames Chapman <jchapman@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 James Chapman 提交于
Cleanup the l2tp_ip code to make use of an existing ipv4 support function. Signed-off-by: NJames Chapman <jchapman@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 James Chapman 提交于
L2TP uses 64-bit counters but since these are not updated atomically, we need to make them safe for smp. This patch addresses that. Signed-off-by: NJames Chapman <jchapman@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
__skb_splice_bits() can check if skb to be spliced has its skb->head mapped to a page fragment, instead of a kmalloc() area. If so we can avoid a copy of the skb head and get a reference on underlying page. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCP coalesce can check if skb to be merged has its skb->head mapped to a page fragment, instead of a kmalloc() area. We had to disable coalescing in this case, for performance reasons. We 'upgrade' skb->head as a fragment in itself. This reduces number of cache misses when user makes its copies, since a less sk_buff are fetched. This makes receive and ofo queues shorter and thus reduce cache line misses in TCP stack. This is a followup of patch "net: allow skb->head to be a page fragment" Tested with tg3 nic, with GRO on or off. We can see "TCPRcvCoalesce" counter being incremented. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
GRO can check if skb to be merged has its skb->head mapped to a page fragment, instead of a kmalloc() area. We 'upgrade' skb->head as a fragment in itself This avoids the frag_list fallback, and permits to build true GRO skb (one sk_buff and up to 16 fragments), using less memory. This reduces number of cache misses when user makes its copy, since a single sk_buff is fetched. This is a followup of patch "net: allow skb->head to be a page fragment" Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
skb->head is currently allocated from kmalloc(). This is convenient but has the drawback the data cannot be converted to a page fragment if needed. We have three spots were it hurts : 1) GRO aggregation When a linear skb must be appended to another skb, GRO uses the frag_list fallback, very inefficient since we keep all struct sk_buff around. So drivers enabling GRO but delivering linear skbs to network stack aren't enabling full GRO power. 2) splice(socket -> pipe). We must copy the linear part to a page fragment. This kind of defeats splice() purpose (zero copy claim) 3) TCP coalescing. Recently introduced, this permits to group several contiguous segments into a single skb. This shortens queue lengths and save kernel memory, and greatly reduce probabilities of TCP collapses. This coalescing doesnt work on linear skbs (or we would need to copy data, this would be too slow) Given all these issues, the following patch introduces the possibility of having skb->head be a fragment in itself. We use a new skb flag, skb->head_frag to carry this information. build_skb() is changed to accept a frag_size argument. Drivers willing to provide a page fragment instead of kmalloc() data will set a non zero value, set to the fragment size. Then, on situations we need to convert the skb head to a frag in itself, we can check if skb->head_frag is set and avoid the copies or various fallbacks we have. This means drivers currently using frags could be updated to avoid the current skb->head allocation and reduce their memory footprint (aka skb truesize). (thats 512 or 1024 bytes saved per skb). This also makes bpf/netfilter faster since the 'first frag' will be part of skb linear part, no need to copy data. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paul Gortmaker 提交于
Some of the comment blocks are floating in limbo between two functions, or between blocks of code. Delete the extra line feeds between any comment and its associated following block of code, to be consistent with the majority of the rest of the kernel. Also delete trailing newlines at EOF and fix a couple trivial typos in existing comments. This is a 100% cosmetic change with no runtime impact. We get rid of over 500 lines of non-code, and being blank line deletes, they won't even show up as noise in git blame. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Herbert Xu 提交于
Unfortunately it seems that I didn't properly test the case of an expired external querier in the recent multicast bridge series. The setup of the timer in that case is completely broken and leads to a NULL-pointer dereference. This patch fixes it. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 4月, 2012 5 次提交
-
-
由 Benjamin LaHaise 提交于
Now that encap_rcv() works on IPv6 UDP sockets, wire L2TP up to IPv6. Support has been tested with and without hardware offloading. This version fixes the L2TP over localhost issue with incorrect checksums being reported. Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Benjamin LaHaise 提交于
Now that the sematics of udpv6_queue_rcv_skb() match IPv4's udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6. Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Benjamin LaHaise 提交于
In order to make sure that when the encap_rcv() hook is introduced it is not called with the socket lock held, move socket locking from callers into udpv6_queue_rcv_skb(), matching what happens in IPv4. Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Benjamin LaHaise 提交于
This is the first step in reworking the IPv6 UDP code to be structured more like the IPv4 UDP code. This patch creates __udpv6_queue_rcv_skb() with the equivalent sematics to __udp_queue_rcv_skb(), and wires it up to the backlog_rcv method. Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jeffrin Jose 提交于
Fixed a coding style issue relating to spaces in net/core/sock.c Signed-off-by: NJeffrin Jose <ahiliation@yahoo.co.in> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 4月, 2012 12 次提交
-
-
由 Allan Stephens 提交于
Adds check to ensure TIPC sockets reject incoming payload messages that have an unrecognized message type. Remove the old open question about whether TIPC_ERR_NO_PORT is the proper return value. It is appropriate here since there are valid instances where another node can make use of the reply, and at this point in time the host is already broadcasting TIPC data, so there are no real security concerns. Signed-off-by: NAllan Stephens <allan.stephens@windriver.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Eric Dumazet 提交于
Use min_t()/max_t() macros, reformat two comments, use !!test_bit() to match !!sock_flag() Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 hartleys 提交于
Include the header to pickup the definitions of the global symbols. Quiets the following sparse warnings: warning: symbol 'crush_find_rule' was not declared. Should it be static? warning: symbol 'crush_do_rule' was not declared. Should it be static? Signed-off-by: NH Hartley Sweeten <hsweeten@visionengravers.com> Cc: Sage Weil <sage@newdream.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Quoting Tore Anderson from : https://bugzilla.kernel.org/show_bug.cgi?id=42572 When RTAX_FEATURE_ALLFRAG is set on a route, the effective TCP segment size does not take into account the size of the IPv6 Fragmentation header that needs to be included in outbound packets, causing every transmitted TCP segment to be fragmented across two IPv6 packets, the latter of which will only contain 8 bytes of actual payload. RTAX_FEATURE_ALLFRAG is typically set on a route in response to receving a ICMPv6 Packet Too Big message indicating a Path MTU of less than 1280 bytes. 1280 bytes is the minimum IPv6 MTU, however ICMPv6 PTBs with MTU < 1280 are still valid, in particular when an IPv6 packet is sent to an IPv4 destination through a stateless translator. Any ICMPv4 Need To Fragment packets originated from the IPv4 part of the path will be translated to ICMPv6 PTB which may then indicate an MTU of less than 1280. The Linux kernel refuses to reduce the effective MTU to anything below 1280 bytes, instead it sets it to exactly 1280 bytes, and RTAX_FEATURE_ALLFRAG is also set. However, the TCP segment size appears to be set to 1240 bytes (1280 Path MTU - 40 bytes of IPv6 header), instead of 1232 (additionally taking into account the 8 bytes required by the IPv6 Fragmentation extension header). This in turn results in rather inefficient transmission, as every transmitted TCP segment now is split in two fragments containing 1232+8 bytes of payload. After this patch, all the outgoing packets that includes a Fragmentation header all are "atomic" or "non-fragmented" fragments, i.e., they both have Offset=0 and More Fragments=0. With help from David S. Miller Reported-by: NTore Anderson <tore@fud.no> Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Tom Herbert <therbert@google.com> Tested-by: NTore Anderson <tore@fud.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Allan Stephens 提交于
Consolidates validation of scope and name sequence range values into a single routine where it applies both to local name publications and to name publications issued by other nodes in the network. This change means that the scope value for non-local publications is now validated and the name sequence range for local publications is now validated only once. Additionally, a publication attempt that fails validation now creates an entry in the system log file only if debugging capabilities have been enabled; this prevents the system log from being cluttered up with messages caused by a defective application or network node. Signed-off-by: NAllan Stephens <allan.stephens@windriver.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Allan Stephens 提交于
Replaces two identical chunks of code that delete an unused name sequence structure from TIPC's name table with calls to a new routine that performs this operation. This change is cosmetic and doesn't impact the operation of TIPC. Signed-off-by: NAllan Stephens <allan.stephens@windriver.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Allan Stephens 提交于
Eliminate code to zero-out the main topology service structure, which is already zeroed-out. Get rid of a comment documenting a field of the main topology service structure that no longer exists. Both are cosmetic changes with no impact on runtime behaviour. Signed-off-by: NAllan Stephens <allan.stephens@windriver.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Allan Stephens 提交于
Initialization now occurs in the calling thread of control, rather than being deferred to the TIPC tasklet. With the current codebase, the deferral is no longer necessary. Signed-off-by: NAllan Stephens <allan.stephens@windriver.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Allan Stephens 提交于
Streamlines the job of re-initializing TIPC's network topology service when a node's network address is first assigned. Rather than destroying the topology server port and breaking its connections to existing subscribers, TIPC now simply lets the service continue running (since the change to the port identifier of each port used by the topology service no longer impacts the flow of messages between the service and its subscribers). This enhancement means that applications that utilize the topology service prior to the assignment of TIPC's network address no longer need to re-establish their subscriptions when the address is finally assigned. However, it is worth noting that any subsequent events for existing subscriptions report the new port identifier of the publishing port, rather than the original port identifier. (For example, a name that was previously reported as being published by <0.0.0:ref> may be subsequently withdrawn by <Z.C.N:ref>.) This doesn't impact any of the existing known userspace in tipc-utils, since (a) TIPC continues to treat references to the original port ID correctly and (b) normal use cases assign an address before active use. However if there does happen to be some rare/custom application out there that was relying on this, they can simply bypass the enhancement by issuing a subscription to {0,0} and break its connection to the topology service, if an associated withdrawal event occurs. Signed-off-by: NAllan Stephens <allan.stephens@windriver.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Allan Stephens 提交于
Termination no longer tests to see if the configuration service port was successfully created or not. In the unlikely event that the port was not created, attempting to delete the non-existent port is detected gracefully and causes no harm. Signed-off-by: NAllan Stephens <allan.stephens@windriver.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Allan Stephens 提交于
Initialization now occurs in the calling thread of control, rather than being deferred to the TIPC tasklet. With the current codebase, the deferral is no longer necessary. Signed-off-by: NAllan Stephens <allan.stephens@windriver.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
由 Allan Stephens 提交于
Streamlines the job of re-initializing TIPC's configuration service when a node's network address is first assigned. Rather than destroying the configuration server port and then recreating it, TIPC now simply withdraws the existing {0,<0.0.0>} name publication and creates a new {0,<Z.C.N>} name publication that identifies the node's network address to interested subscribers. Signed-off-by: NAllan Stephens <allan.stephens@windriver.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 26 4月, 2012 5 次提交
-
-
由 Pavel Emelyanov 提交于
Don't pick __u8/__u16 values directly from raw pointers, but instead use an array of structures of code:value pairs. This is OK, since the buffer we take options from is not an skb memory, but a user-to-kernel one. For those options which don't require any value now, require this to be zero (for potential future extension of this API). v2: Changed tcp_repair_opt to use two __u32-s as spotted by David Laight. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
The same macros is defined in 'include/net/af_ieee802154.h' and is called IEEE802154_ADDR_LEN. No need another one, so remove it. Signed-off-by: NAlexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
Separate frame allocation routine from data processing function. This makes code more human readable and easier for understanding. Signed-off-by: NAlexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shan Wei 提交于
read only, so change it to const. Signed-off-by: NShan Wei <davidshan@tencent.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Some kfree_skb() calls should be replaced by consume_skb() to avoid drop_monitor/dropwatch false positives. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-