- 22 4月, 2012 29 次提交
-
-
由 Neal Cardwell 提交于
This commit moves the (substantial) common code shared between tcp_v4_init_sock() and tcp_v6_init_sock() to a new address-family independent function, tcp_init_sock(). Centralizing this functionality should help avoid drift issues, e.g. where the IPv4 side is updated without a corresponding update to IPv6. There was already some drift: IPv4 initialized snd_cwnd to TCP_INIT_CWND, while the IPv6 side was still initializing snd_cwnd to 2 (in this case it should not matter, since snd_cwnd is also initialized in tcp_init_metrics(), but the general risks and maintenance overhead remain). When diffing the old and new code, note that new tcp_init_sock() function uses the order of steps from the tcp_v4_init_sock() implementation (the order is slightly different in tcp_v6_init_sock()). Signed-off-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
splice() from socket to pipe needs linear_to_page() helper to transfert skb header to part of page. We can reset the offset in the current sk->sk_sndmsg_page if we are the last user of the page. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
This patch changes content of hashlist (used to get port struct by computed index (0...en_port_count-1)). Now the hash list contains only enabled ports so userspace will be able to say what ports can be used for tx/rx. This becomes handy when userspace will need to disable ports which does not belong to active aggregator. By default, newly added port is enabled. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Better to leave this for userspace Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
iov of more than 8 entries are allocated in sendmsg()/recvmsg() through sock_kmalloc() As these allocations are temporary only and small enough, it makes sense to use plain kmalloc() and avoid sk_omem_alloc atomic overhead. Slightly changed fast path to be even faster. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Mike Waychison <mikew@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
There are options, which are set up on a socket while performing TCP handshake. Need to resurrect them on a socket while repairing. A new sockoption accepts a buffer and parses it. The buffer should be CODE:VALUE sequence of bytes, where CODE is standard option code and VALUE is the respective value. Only 4 options should be handled on repaired socket. To read 3 out of 4 of these options the TCP_INFO sockoption can be used. An ability to get the last one (the mss_clamp) was added by the previous patch. Now the restore. Three of these options -- timestamp_ok, mss_clamp and snd_wscale -- are just restored on a coket. The sack_ok flags has 2 issues. First, whether or not to do sacks at all. This flag is just read and set back. No other sack info is saved or restored, since according to the standart and the code dropping all sack-ed segments is OK, the sender will resubmit them again, so after the repair we will probably experience a pause in connection. Next, the fack bit. It's just set back on a socket if the respective sysctl is set. No collected stats about packets flow is preserved. As far as I see (plz, correct me if I'm wrong) the fack-based congestion algorithm survives dropping all of the stats and repairs itself eventually, probably losing the performance for that period. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
The mss_clamp is the only connection-time negotiated option which cannot be obtained from the user space. Make the TCP_MAXSEG sockopt report one in the repair mode. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Reading queues under repair mode is done with recvmsg call. The queue-under-repair set by TCP_REPAIR_QUEUE option is used to determine which queue should be read. Thus both send and receive queue can be read with this. Caller must pass the MSG_PEEK flag. Writing to queues is done with sendmsg call and yet again -- the repair-queue option can be used to push data into the receive queue. When putting an skb into receive queue a zero tcp header is appented to its head to address the tcp_hdr(skb)->syn and the ->fin checks by the (after repair) tcp_recvmsg. These flags flags are both set to zero and that's why. The fin cannot be met in the queue while reading the source socket, since the repair only works for closed/established sockets and queueing fin packet always changes its state. The syn in the queue denotes that the respective skb's seq is "off-by-one" as compared to the actual payload lenght. Thus, at the rcv queue refill we can just drop this flag and set the skb's sequences to precice values. When the repair mode is turned off, the write queue seqs are updated so that the whole queue is considered to be 'already sent, waiting for ACKs' (write_seq = snd_nxt <= snd_una). From the protocol POV the send queue looks like it was sent, but the data between the write_seq and snd_nxt is lost in the network. This helps to avoid another sockoption for setting the snd_nxt sequence. Leaving the whole queue in a 'not yet sent' state (as it will be after sendmsg-s) will not allow to receive any acks from the peer since the ack_seq will be after the snd_nxt. Thus even the ack for the window probe will be dropped and the connection will be 'locked' with the zero peer window. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
This includes (according the the previous description): * TCP_REPAIR sockoption This one just puts the socket in/out of the repair mode. Allowed for CAP_NET_ADMIN and for closed/establised sockets only. When repair mode is turned off and the socket happens to be in the established state the window probe is sent to the peer to 'unlock' the connection. * TCP_REPAIR_QUEUE sockoption This one sets the queue which we're about to repair. The 'no-queue' is set by default. * TCP_QUEUE_SEQ socoption Sets the write_seq/rcv_nxt of a selected repaired queue. Allowed for TCP_CLOSE-d sockets only. When the socket changes its state the other seq-s are changed by the kernel according to the protocol rules (most of the existing code is actually reused). * Ability to forcibly bind a socket to a port The sk->sk_reuse is set to SK_FORCE_REUSE. * Immediate connect modification The connect syscall initializes the connection, then directly jumps to the code which finalizes it. * Silent close modification The close just aborts the connection (similar to SO_LINGER with 0 time) but without sending any FIN/RST-s to peer. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
This is just the preparation patch, which makes the needed for TCP repair code ready for use. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Name them in a "backward compatible" manner, i.e. reuse or not are still 1 and 0 respectively. The reuse value of 2 means that the socket with it will forcibly reuse everyone else's port. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
The two options are separate, and some platforms (e.g. arm pxa) have ISA slots but no ISA dma controller, so they cannot build drivers using the DMA API functions. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
Some architectures like ARM cannot handle large numbers as arguments to udelay, so the drivers should use mdelay when delaying for multiple miliseconds. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
No driver should spin the CPU for 10ms, so better use an msleep, which is allowed in the ->suspend function. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
The ax88796 driver uses the CRC32 functions, so make sure that they are actually enabled. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
The iwmc3200 driver selects other code in Kconfig that depends on EXPERIMENTAL. Kconfig warns about this when CONFIG_EXPERIMENTAL is not already set, so logically, these options should also be marked experimental or promoted to stable. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
The caif_shmcore requires io.h in order to use ioremap, so include that explicitly to compile in all configurations. Also add a note about the use of ioremap(), which is not a proper way to map a DMA buffer into kernel space. It's not completely clear what the intention is for using ioremap, but it is clear that the result of ioremap must not simply be accessed using kernel pointers but should use readl/writel or memcopy_{to,from}io. Assigning the result of ioremap to a regular pointer that can also be set to something else is not ok. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
Drivers that refer to a __devexit function in an operations structure need to annotate that pointer with __devexit_p so replace it with a NULL pointer when the section gets discarded. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
The davinci_emac driver can be a module, so the symbols it needs from the cpdma driver must be exported. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Cochran 提交于
The time stamping code in this driver appears to have been copied from the ixp4xx_eth.c driver, including this timing comment. I had actually measured the time stamp delay on an IXP425, but I really doubt that this value also applies here. Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Cochran 提交于
This patch fixes code which needlessly ran the BPF twice per packet. Instead, we just run the classifier once and test whether the packet is any kind of PTP event message. Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Takahiro Shimizu 提交于
This patch fixes the driver so that multicast PTP event messages can be recognized by the hardware time stamping unit. The station address register must be set according to the desired transport type. [ RC - Rebased Takahiro's changes and wrote a commit message explaining the changes. ] Signed-off-by: NTakahiro Shimizu <tshimizu818@gmail.com> Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Takahiro Shimizu 提交于
We will let the pch_gbe code do that according to the receive time stamp filter. [ RC - Rebased Takahiro's changes and wrote a commit message explaining the changes. ] Signed-off-by: NTakahiro Shimizu <tshimizu818@gmail.com> Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Takahiro Shimizu 提交于
This patch clears up a few coding style issues: - Makes two function definitions a bit nicer looking. - Remove unneeded parentheses. - Simplify macros for register bits. [ RC - Rebased Takahiro's changes and wrote a commit message explaining the changes. ] Signed-off-by: NTakahiro Shimizu <tshimizu818@gmail.com> Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Takahiro Shimizu 提交于
The code in phc_gbe_main will need to call this method in order to set the station address register according to the receive time stamping filter. [ RC - Rebased Takahiro's changes and wrote a commit message explaining the changes. ] Signed-off-by: NTakahiro Shimizu <tshimizu818@gmail.com> Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Takahiro Shimizu 提交于
The reset logic after a Rx FIFO overrun will clear the programmed multicast addresses. This patch fixes the issue by reprogramming the registers after the reset. [ RC - Rebased Takahiro's changes and wrote a commit message explaining the changes. ] Signed-off-by: NTakahiro Shimizu <tshimizu818@gmail.com> Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Takahiro Shimizu 提交于
This patch makes logic surrounding the test of the transmit time stamping flag more readable. [ RC - Rebased Takahiro's changes and wrote a commit message explaining the changes. ] Signed-off-by: NTakahiro Shimizu <tshimizu818@gmail.com> Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Takahiro Shimizu 提交于
This patch fixes the helper functions that give the transmit and receive time stamps to return nanoseconds, instead of arbitrary clock ticks. [ RC - Rebased Takahiro's changes and wrote a commit message explaining the changes. ] Signed-off-by: NTakahiro Shimizu <tshimizu818@gmail.com> Signed-off-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 4月, 2012 11 次提交
-
-
由 Eric W. Biederman 提交于
All of the users have been converted to use registera_net_sysctl so we no longer need register_net_sysctl. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
We don't use struct ctl_path anymore so delete the exported constants. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
This results in code with less boiler plate that is a bit easier to read. Additionally stops us from using compatibility code in the sysctl core, hastening the day when the compatibility code can be removed. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
There isn't much advantage here except that strings paths are a bit easier to read, and converting everything to them allows me to kill off ctl_path. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
Using an ascii path to register_net_sysctl as opposed to the slightly awkward ctl_path allows for much simpler code. We no longer need to malloc dev_name to keep it alive the length of our sysctl register instead we can use a small temporary buffer on the stack. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
Using an ascii path to register_net_sysctl as opposed to the slightly awkward ctl_path allows for much simpler code. We no longer need to malloc dev_name to keep it alive the length of our sysctl register instead we can use a small temporary buffer on the stack. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
Using an ascii path to register_net_sysctl as opposed to the slightly awkward ctl_path allows for much simpler code. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
Using an ascii path to register_net_sysctl as opposed to the slightly awkward ctl_path allows for much simpler code. We no longer need to malloc dev_name to keep it alive the length of our sysctl register instead we can use a small temporary buffer on the stack. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
The sysctl core no longer natively understands sysctl tables with .child entries. Split the ipv6_table to remove the .child entries. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
The sysctl core no longer natively understands sysctl tables with .child entries. Kill the intermediate tables and use register_net_sysctl directly to remove the need for compatibility code. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
Don't register/unregister every ax25 table in a batch. Instead register and unregister per device ax25 sysctls as ax25 devices come and go. This moves ax25 to be a completely modern sysctl user. Registering the sysctls in just the initial network namespace, removing the use of .child entries that are no longer natively supported by the sysctl core and taking advantage of the fact that there are no longer any ordering constraints between registering and unregistering different sysctl tables. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-