1. 09 Jan, 2008 2 commits
  2. 04 Jan, 2008 1 commit
    • [INET]: Fix netdev renaming and inet address labels · 44344b2a
      Committed by Mark McLoughlin
      When re-naming an interface, the previous secondary address
      labels get lost e.g.
      
        $> brctl addbr foo
        $> ip addr add 192.168.0.1 dev foo
        $> ip addr add 192.168.0.2 dev foo label foo:00
        $> ip addr show dev foo | grep inet
          inet 192.168.0.1/32 scope global foo
          inet 192.168.0.2/32 scope global foo:00
        $> ip link set foo name bar
        $> ip addr show dev bar | grep inet
          inet 192.168.0.1/32 scope global bar
          inet 192.168.0.2/32 scope global bar:2
      
      Turns out to be a simple thinko in inetdev_changename() - clearly we
      want to look at the address label, rather than the device name, for
      a suffix to retain.
      Signed-off-by: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
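The relabeling rule described in this commit can be sketched in a few lines of user-space C. This is a minimal illustration, not the kernel code: `relabel` and `LABEL_MAX` are hypothetical names, and the only point shown is that the ":suffix" is taken from the old address label rather than from the old device name.

```c
#include <stdio.h>
#include <string.h>

#define LABEL_MAX 16 /* hypothetical buffer size for a label */

/*
 * Hypothetical helper: build the label an address should carry after
 * its device is renamed to new_name. Any ":suffix" is looked up in the
 * OLD LABEL, so "foo:00" on a device renamed to "bar" becomes "bar:00".
 */
static void relabel(const char *old_label, const char *new_name,
                    char *out, size_t outlen)
{
    const char *colon = strchr(old_label, ':');

    /* Without a suffix the label is simply the new device name. */
    snprintf(out, outlen, "%s%s", new_name, colon ? colon : "");
}
```

Looking for the colon in the device name instead would never find one for a plain name such as "foo", which is consistent with the lost suffix shown in the transcript above.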
  3. 30 Dec, 2007 1 commit
    • [TCP]: use non-delayed ACK for congestion control RTT · 2072c228
      Committed by Gavin McCullagh
      When a delayed ACK representing two packets arrives, there are two RTT
      samples available, one for each packet.  The first (in order of seq
      number) will be artificially long due to the delay waiting for the
      second packet, the second will trigger the ACK and so will not itself
      be delayed.
      
      According to RFC 1323, the SRTT used for RTO calculation should use the
      first RTT, so receivers echo the timestamp from the first packet in
      the delayed ACK.  For congestion control, however, measuring the
      delayed-ACK delay seems undesirable, as it varies independently of
      congestion.
      
      The patch below causes seq_rtt and last_ackt to be updated with any
      available later packet rtts which should have less (and hopefully
      zero) delack delay.  The rtt value then gets passed to
      ca_ops->pkts_acked().
      
      Where TCP_CONG_RTT_STAMP was set, effort was made to suppress RTTs from
      within a TSO chunk (!fully_acked), using only the final ACK (which
      includes any TSO delay) to generate RTTs.  This patch removes these
      checks so RTTs are passed for each ACK to ca_ops->pkts_acked().
      
      For non-delay based congestion control (cubic, h-tcp), rtt is
      sometimes used for rtt-scaling.  Shortening the RTT may make
      them a little less aggressive.  Delay-based schemes (e.g. vegas, veno,
      illinois) should get a cleaner, more accurate congestion signal,
      particularly for small cwnds.  The congestion control module can
      potentially also filter out bad RTTs caused by the delayed-ACK alarm by
      looking at the associated cnt, which (where delayed acking is in use)
      should probably be 1 if the alarm went off, or greater if the ACK was
      triggered by a packet.
      Signed-off-by: Gavin McCullagh <gavin.mccullagh@nuim.ie>
      Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
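The two RTT samples discussed in this commit can be illustrated with a small user-space sketch. It is not the kernel implementation: `struct rtt_samples` and `rtt_from_ack` are hypothetical names, and timestamps are plain microsecond counters chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* The two samples available when one ACK covers several packets. */
struct rtt_samples {
    int64_t first_us; /* oldest packet: includes the delayed-ACK wait */
    int64_t last_us;  /* newest packet: it triggered the ACK itself */
};

/*
 * sent_us[] holds the send times of the packets covered by one ACK,
 * in sequence order; ack_us is the ACK arrival time. n must be >= 1.
 */
static struct rtt_samples rtt_from_ack(const int64_t *sent_us, size_t n,
                                       int64_t ack_us)
{
    struct rtt_samples s;

    s.first_us = ack_us - sent_us[0];     /* sample suited to RTO/SRTT */
    s.last_us = ack_us - sent_us[n - 1];  /* sample for congestion control */
    return s;
}
```

With two packets sent 40 ms apart and acknowledged together, the first sample is 40 ms longer than the second, and that extra time reflects the receiver's delayed-ACK wait rather than congestion.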
  4. 29 Dec, 2007 1 commit
    • [IPV4] Fix ip=dhcp regression · 9cecd07c
      Committed by Simon Horman
      David Brownell pointed out a regression in my recent "Fix ip command
      line processing" patch. It turns out to be a fairly blatant oversight on
      my part whereby ic_enable is never set, and thus autoconfiguration is
      never enabled. Clearly my testing was broken :-(
      
      The solution that I have is to set ic_enable to 1 if we hit
      ip_auto_config_setup(), which basically means that autoconfiguration is
      activated unless told otherwise. I then flip ic_enable to 0 if ip=off,
      ip=none, ip=::::::off or ip=::::::none, using ic_proto_name().
      
      The incremental patch is below; let me know if a non-incremental version
      is preferred, as I did ask for the original patch to be reverted pending a
      fix.
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
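The enable-by-default rule described in this commit can be sketched as follows. This is a simplified user-space illustration, assuming the protocol is the seventh colon-separated field of the ip= argument (as the ip=::::::off form suggests); `is_off` and `autoconf_enabled` are hypothetical names, not the kernel's functions.

```c
#include <string.h>

/* True for the two values that switch autoconfiguration off. */
static int is_off(const char *s)
{
    return strcmp(s, "off") == 0 || strcmp(s, "none") == 0;
}

/*
 * Return 1 if autoconfiguration stays enabled for "ip=<arg>", 0 if the
 * argument turns it off. Enabled is the default, mirroring the
 * "set ic_enable to 1, then flip to 0" approach described above.
 */
static int autoconf_enabled(const char *arg)
{
    const char *field = arg;
    const char *p;
    int colons = 0;

    /* Short form: ip=off or ip=none. */
    if (is_off(arg))
        return 0;

    /* Long form: find the start of the field after the sixth colon. */
    for (p = arg; *p; p++) {
        if (*p == ':' && colons < 6) {
            colons++;
            field = p + 1;
        }
    }
    if (colons == 6 && is_off(field))
        return 0;

    return 1;
}
```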
  5. 27 Dec, 2007 2 commits
    • [IPV4]: Fix ip command line processing. · a6c05c3d
      Committed by Simon Horman
      Recently the documentation in Documentation/nfsroot.txt was
      updated to note that ip=off and ip=::::::off are in fact not
      equivalent, as the latter is ignored and the default (on) is used.
      
      This was certainly a step in the direction of reducing confusion.
      But it seems to me that the code ought to be fixed up so that
      ip=::::::off actually turns off ip autoconfiguration.
      
      This patch also notes more specifically that ip=on (aka ip=::::::on)
      is the default.
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETFILTER]: nf_conntrack_ipv4: fix module parameter compatibility · fae718dd
      Committed by Patrick McHardy
      Some users run "modprobe ip_conntrack hashsize=...". Because of the
      module aliases, this loads nf_conntrack_ipv4 and nf_conntrack, but the
      hashsize parameter is unknown to nf_conntrack_ipv4, so the load
      fails.
      
      Allow specifying hashsize= for both nf_conntrack and nf_conntrack_ipv4.
      
      Note: the nf_conntrack message in the ringbuffer will display an
      incorrect hashsize since nf_conntrack is first pulled in as a
      dependency and calculates the size itself, then it gets changed
      through a call to nf_conntrack_set_hashsize().
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 21 Dec, 2007 2 commits
  7. 20 Dec, 2007 2 commits
  8. 17 Dec, 2007 1 commit
  9. 15 Dec, 2007 2 commits
  10. 11 Dec, 2007 2 commits
  11. 07 Dec, 2007 2 commits
  12. 05 Dec, 2007 6 commits
  13. 03 Dec, 2007 1 commit
  14. 29 Nov, 2007 2 commits
    • [TCP] illinois: Incorrect beta usage · a357dde9
      Committed by Stephen Hemminger
      Lachlan Andrew observed that my TCP-Illinois implementation uses the
      beta value incorrectly:
        The parameter  beta  in the paper specifies the amount to decrease
        *by*:  that is, on loss,
           W <-  W -  beta*W
        but in   tcp_illinois_ssthresh() uses  beta  as the amount
        to decrease  *to*: W <- beta*W
      
      This bug makes the Linux TCP-Illinois implementation less aggressive on an
      uncongested network, hurting performance. Note: since the base beta value
      is 0.5, it has no impact on a congested network.
      Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
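The difference between decreasing the window *by* beta and decreasing it *to* beta can be checked with a little fixed-point arithmetic. The sketch below is illustrative only: the 6-bit scale (`BETA_SHIFT`) and the small beta value of 0.125 used in the note afterwards are assumptions made for the example, and `decrease_to` and `decrease_by` are hypothetical names.

```c
#include <stdint.h>

#define BETA_SHIFT 6 /* assumed fixed-point scale: beta = 32 means 0.5 */

/* Buggy reading: W <- beta * W */
static uint32_t decrease_to(uint32_t cwnd, uint32_t beta)
{
    return (cwnd * beta) >> BETA_SHIFT;
}

/* Reading from the paper: W <- W - beta * W */
static uint32_t decrease_by(uint32_t cwnd, uint32_t beta)
{
    return cwnd - ((cwnd * beta) >> BETA_SHIFT);
}
```

For a window of 64 and beta = 0.5 both forms give 32, which is why the bug has no effect on a congested network. With an assumed small beta of 0.125, the "to" form collapses the window to 8 while the "by" form keeps it at 56.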
    • [INET]: Fix inet_diag register vs rcv race · 07693198
      Committed by Pavel Emelyanov
      The following race is possible when one CPU unregisters the handler
      while another is trying to receive a message and call that handler:
      
      CPU1:                                                 CPU2:
      inet_diag_rcv()                                       inet_diag_unregister()
        mutex_lock(&inet_diag_mutex);
        netlink_rcv_skb(skb, &inet_diag_rcv_msg);
          if (inet_diag_table[nlh->nlmsg_type] == 
                                     NULL) /* false handler is still registered */
          ...
          netlink_dump_start(idiagnl, skb, nlh,
                                 inet_diag_dump, NULL);
                 cb = kzalloc(sizeof(*cb), GFP_KERNEL);
                         /* sleep here freeing memory 
                          * or preempt
                          * or sleep later on nlk->cb_mutex
                          */
                                                               spin_lock(&inet_diag_register_lock);
                                                               inet_diag_table[type] = NULL;
          ...                                                  spin_unlock(&inet_diag_register_lock);
                                                               synchronize_rcu();
                                                               /* CPU1 is sleeping - RCU quiescent
                                                                * state is passed
                                                                */
                                                               return;
          /* inet_diag_dump is finally called: */
          inet_diag_dump()
            handler = inet_diag_table[cb->nlh->nlmsg_type];
            BUG_ON(handler == NULL); 
            /* OOPS! While we slept the unregister has set
             * handler to NULL :(
             */
      
      Grep showed that the register/unregister functions are called
      from the init/fini module callbacks for tcp_/dccp_diag, so it's OK
      to use the inet_diag_mutex to synchronize both manipulation of the
      inet_diag_table and access to it.
      
      Besides, as Herbert pointed out, asynchronous dumps should hold
      this mutex as well, so we provide the mutex as the cb_mutex.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
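The locking idea in this commit, one mutex covering registration, unregistration, and the lookup-then-call path, can be sketched in user space. This is an analogy rather than the kernel code: a pthread mutex stands in for inet_diag_mutex, and `handler_register`, `handler_unregister`, `handler_call`, and `echo_dump` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_TYPES 4 /* hypothetical table size; type must be < MAX_TYPES */

typedef int (*dump_fn)(int arg);

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static dump_fn table[MAX_TYPES];

static void handler_register(int type, dump_fn fn)
{
    pthread_mutex_lock(&table_lock);
    table[type] = fn;
    pthread_mutex_unlock(&table_lock);
}

static void handler_unregister(int type)
{
    pthread_mutex_lock(&table_lock);
    table[type] = NULL;
    pthread_mutex_unlock(&table_lock);
}

/*
 * Look up and call the handler while holding the same lock, so an
 * unregister cannot clear the slot between the check and the call.
 * Returns -1 when no handler is registered.
 */
static int handler_call(int type, int arg)
{
    int ret = -1;

    pthread_mutex_lock(&table_lock);
    if (table[type] != NULL)
        ret = table[type](arg);
    pthread_mutex_unlock(&table_lock);
    return ret;
}

/* Example handler used only to exercise the table. */
static int echo_dump(int arg)
{
    return arg;
}
```

Because the handler runs with the lock held, it may sleep without the slot being cleared underneath it, which is the situation the diagram above shows going wrong.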
  15. 26 Nov, 2007 1 commit
  16. 23 Nov, 2007 2 commits
  17. 21 Nov, 2007 4 commits
  18. 20 Nov, 2007 6 commits
    • [NETFILTER]: Fix kernel panic with REDIRECT target. · 1f305323
      Committed by Evgeniy Polyakov
      When a connection tracking entry (nf_conn) is about to copy itself, it can
      have some of its extension users (like nat) already freed, which
      therefore do not need to be copied.
      
      Actually, looking at this function, I suspect it was copied from
      nf_nat_setup_info() and that is how the bug was introduced.
      
      Report and testing from David <david@unsolicited.net>.
      
      [ Patrick McHardy states:
      
      	I now understand whats happening:
      
      	- new connection is allocated without helper
      	- connection is REDIRECTed to localhost
      	- nf_nat_setup_info adds NAT extension, but doesn't initialize it yet
      	- nf_conntrack_alter_reply performs a helper lookup based on the
      	   new tuple, finds the SIP helper and allocates a helper extension,
      	   causing reallocation because of too little space
      	- nf_nat_move_storage is called with the uninitialized nat extension
      
      	So your fix is entirely correct, thanks a lot :)  ]
      Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
      Acked-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [IPV4]: Add missing "space" · 464c4f18
      Committed by Joe Perches
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Problem bug with sysctl_tcp_congestion_control function · 5487796f
      Committed by Sam Jansen
      From: "Sam Jansen" <sjansen@google.com>
      
      sysctl_tcp_congestion_control seems to have a bug that prevents it
      from actually calling the tcp_set_default_congestion_control
      function. This is not so apparent because it does not return an error
      and generally the /proc interface is used to configure the default TCP
      congestion control algorithm.  This is present in 2.6.18 onwards and
      probably earlier, though I have not inspected 2.6.15--2.6.17.
      
      sysctl_tcp_congestion_control calls sysctl_string and expects a successful
      return code of 0. In such a case it actually sets the congestion control
      algorithm with tcp_set_default_congestion_control. Otherwise, it returns the
      value returned by sysctl_string. This was correct in 2.6.14, as sysctl_string
      returned 0 on success. However, sysctl_string was updated to return 1 on
      success around about 2.6.15 and sysctl_tcp_congestion_control was not updated.
      Even though sysctl_tcp_congestion_control returns 1, do_sysctl_strategy
      converts this return code to '0', so the caller never notices the error.
      Signed-off-by: David S. Miller <davem@davemloft.net>
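The return-code mismatch described in this commit can be reproduced in miniature. The sketch below is a user-space stand-in, not the kernel functions: `copy_string` plays the role of a helper that reports success as 1, and `set_algo_buggy` and `set_algo_fixed` are hypothetical callers that differ only in which value they treat as success.

```c
#include <string.h>

static char current_algo[16] = "reno";

/* Stand-in helper: returns 1 on success and -1 if src does not fit. */
static int copy_string(char *dst, size_t dstlen, const char *src)
{
    if (strlen(src) >= dstlen)
        return -1;
    strcpy(dst, src);
    return 1;
}

/* Still expects the old convention where 0 meant success. */
static int set_algo_buggy(const char *name)
{
    char val[16];
    int ret = copy_string(val, sizeof(val), name);

    if (ret == 0) /* never true now, so the setting is silently skipped */
        strcpy(current_algo, val);
    return ret;
}

/* Treats 1 as success, matching the helper's current convention. */
static int set_algo_fixed(const char *name)
{
    char val[16];
    int ret = copy_string(val, sizeof(val), name);

    if (ret == 1)
        strcpy(current_algo, val);
    return ret;
}
```

The buggy caller returns the same positive value as the fixed one, so a caller that maps any positive result to success has no way to notice that nothing was applied.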
    • [TCP] MTUprobe: fix potential sk_send_head corruption · 6e421410
      Committed by Ilpo Järvinen
      When the abstraction functions were added, the conversion here was
      made incorrectly. As a result, the skb may end up pointing
      to an skb that was included in the probe skb and then freed.
      For this to trigger, however, skb_transmit must also fail to
      send.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [IPVS]: Move remaining sysctl handlers over to CTL_UNNUMBERED · 9055fa1f
      Committed by Simon Horman
      Switch the remaining IPVS sysctl entries over to use CTL_UNNUMBERED.
      I strongly doubt that anyone is using the sys_sysctl interface to
      these variables.
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [IPVS]: Fix sysctl warnings about missing strategy in schedulers · 9e103fa6
      Committed by Simon Horman
      sysctl table check failed: /net/ipv4/vs/lblc_expiration .3.5.21.19 Missing strategy
      [...]
      sysctl table check failed: /net/ipv4/vs/lblcr_expiration .3.5.21.20 Missing strategy
      
      Switch these entries over to use CTL_UNNUMBERED, as clearly
      the sys_sysctl portion wasn't working.
      
      This is along the same lines as Christian Borntraeger's patch that fixes
      up entries with no strategy in net/ipv4/ipvs/ip_vs_ctl.c
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>