1. 02 Dec, 2009 4 commits
    • net: Automatically allocate per namespace data. · f875bae0
      Eric W. Biederman committed
      To get the full benefit of batched network namespace cleanup, network
      device deletion needs to be performed by the generic code.  When
      using register_pernet_gen_device and freeing the data in exit_net,
      it is impossible to delay freeing until after exit_net has been called,
      so the device uninit methods are no longer safe.
      
      To correct this, and to simplify working with per network namespace data
      I have moved allocation and deletion of per network namespace data into
      the network namespace core.  The core now frees the data only after
      all of the network namespace exit routines have run.
      
      Now it is only required to set the new fields .id and .size
      in the pernet_operations structure if you want network namespace
      data to be managed for you automatically.
      
      This makes the current register_pernet_gen_device and
      register_pernet_gen_subsys routines unnecessary.  For the moment
      I have left them as compatibility wrappers in net_namespace.h.
      They will be removed once all of the users have been updated.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
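The ownership change described in this commit can be sketched as a small userspace model. This is not kernel code: `MAX_OPS`, `setup_net`, `cleanup_net`, `free_net_data` and the `demo_*` subsystem are hypothetical stand-ins, and only the `.id` / `.size` fields and the "free after all exit routines have run" ordering come from the commit text.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model only: names echo the commit text, this is not kernel code. */
#define MAX_OPS 8

struct net {
    void *gen[MAX_OPS];               /* per-namespace data, one slot per ops id */
};

struct pernet_operations {
    int  (*init)(struct net *net);
    void (*exit)(struct net *net);
    int   *id;                        /* where the core stores the assigned slot */
    size_t size;                      /* bytes the core allocates per namespace */
};

static struct pernet_operations *ops_list[MAX_OPS];
static int ops_count;

static void register_pernet_ops(struct pernet_operations *ops)
{
    assert(ops_count < MAX_OPS);
    if (ops->id)
        *ops->id = ops_count;
    ops_list[ops_count++] = ops;
}

static void *net_generic(const struct net *net, int id)
{
    return net->gen[id];
}

static void free_net_data(struct net *net)
{
    for (int i = 0; i < MAX_OPS; i++) {
        free(net->gen[i]);
        net->gen[i] = NULL;
    }
}

/* The core allocates each subsystem's data before calling its init hook. */
static int setup_net(struct net *net)
{
    for (int i = 0; i < MAX_OPS; i++)
        net->gen[i] = NULL;
    for (int i = 0; i < ops_count; i++) {
        struct pernet_operations *ops = ops_list[i];

        if (ops->id && ops->size) {
            net->gen[i] = calloc(1, ops->size);
            if (!net->gen[i])
                goto fail;
        }
        if (ops->init && ops->init(net) < 0)
            goto fail;
    }
    return 0;
fail:
    free_net_data(net);
    return -1;
}

/* All exit hooks run first; only afterwards is any data freed. */
static void cleanup_net(struct net *net)
{
    for (int i = ops_count - 1; i >= 0; i--)
        if (ops_list[i]->exit)
            ops_list[i]->exit(net);
    free_net_data(net);
}

/* Hypothetical subsystem that only fills in .id and .size. */
struct demo_net { int counter; };
static int demo_net_id;
static int demo_seen_at_exit;

static int demo_init(struct net *net)
{
    struct demo_net *dn = net_generic(net, demo_net_id);
    dn->counter = 1;
    return 0;
}

static void demo_exit(struct net *net)
{
    struct demo_net *dn = net_generic(net, demo_net_id);
    demo_seen_at_exit = dn->counter;  /* data is still valid here */
}

static struct pernet_operations demo_ops = {
    .init = demo_init,
    .exit = demo_exit,
    .id   = &demo_net_id,
    .size = sizeof(struct demo_net),
};
```

In this model the subsystem never calls an allocator itself; it registers `demo_ops` and reads its data through `net_generic()`, including from its exit hook.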
    • net: Batch network namespace destruction. · 2b035b39
      Eric W. Biederman committed
      It is fairly common to kill several network namespaces at once, either
      because they are nested one inside the other or because they are cooperating
      in multiple machine networking experiments.  As the network stack control logic
      does not parallelize easily, batch up multiple network namespaces exiting
      together.
      
      To get the full benefit of batching the virtual network devices to be
      removed must be all removed in one batch.  For that purpose I have added
      a loop after the last network device operations have run that batches
      up all remaining network devices and deletes them.
      
      An extra benefit is that the reorganization slightly shrinks the size
      of the per network namespace data structures, replacing a work_struct
      with a list_head.
      
      In a trivial test with 4K namespaces this change reduced the cost of
      destroying 4K namespaces from 7+ minutes (at 12% cpu) to 44 seconds
      (at 60% cpu).  The bulk of that 44s was spent in inet_twsk_purge.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Implement for_each_netdev_reverse. · dcbccbd4
      Eric W. Biederman committed
      I will need this shortly to implement network namespace shutdown
      batching.  For sanity's sake network devices should be removed in
      the reverse order they were created in.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
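Reverse iteration over a doubly linked device list can be sketched in userspace as follows. `struct netdev_demo`, `netdev_entry`, `for_each_netdev_reverse_demo` and `collect_reverse` are hypothetical names made up for this illustration; the only thing taken from the commit is the idea of visiting devices in the reverse of their creation order.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

struct netdev_demo {                 /* hypothetical stand-in for a network device */
    int ifindex;
    struct list_head dev_list;
};

/* Recover the containing device from its embedded list node. */
#define netdev_entry(ptr) \
    ((struct netdev_demo *)((char *)(ptr) - offsetof(struct netdev_demo, dev_list)))

/* Walk the list backwards via ->prev, so the newest device comes first. */
#define for_each_netdev_reverse_demo(head, d)                    \
    for ((d) = netdev_entry((head)->prev);                       \
         &(d)->dev_list != (head);                               \
         (d) = netdev_entry((d)->dev_list.prev))

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Copies ifindex values in reverse creation order; returns how many were written. */
static size_t collect_reverse(struct list_head *head, int *out, size_t max)
{
    struct netdev_demo *d;
    size_t n = 0;

    assert(out != NULL);
    for_each_netdev_reverse_demo(head, d) {
        if (n == max)
            break;
        out[n++] = d->ifindex;
    }
    return n;
}
```

Devices appended with `list_add_tail()` in creation order are then visited last-created first, which is the order the commit message asks for when removing them.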
    • net: NETDEV_UNREGISTER_PERNET -> NETDEV_UNREGISTER_BATCH · a5ee1551
      Eric W. Biederman committed
      The motivation for an additional notifier in batched netdevice
      notification is that rt_do_flush only needs to be called once per
      batch, not once per namespace.
      
      For further batching improvements I need a guarantee that the
      netdevices are unregistered in order, allowing me to unregister all
      of the network devices in a network namespace at the same time with
      the guarantee that the loopback device is really and truly
      unregistered last.
      
      Additionally it appears that we moved the route cache flush after
      the final synchronize_net, which seems wrong and there was no
      explanation.  So I have restored the original location of the final
      synchronize_net.
      
      Cc: Octavian Purdila <opurdila@ixiacom.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 29 Nov, 2009 3 commits
  3. 27 Nov, 2009 3 commits
  4. 26 Nov, 2009 2 commits
  5. 24 Nov, 2009 8 commits
    • sctp: Update max.burst implementation · 46d5a808
      Vlad Yasevich committed
      Current implementation of max.burst ends up limiting new
      data during cwnd decay period.  The decay is happening because
      the connection is idle and we are allowed to fill the congestion
      window.  The point of max.burst is to limit micro-bursts in response
      to large acks.  This still happens, as max.burst is still applied
      to each transmit opportunity.  It will also apply if a very large
      send is made (greater than allowed by burst).
      Tested-by: Florian Niederbacher <florian.niederbacher@student.uibk.ac.at>
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
    • sctp: Turn the enum socket options into defines · a5b03ad2
      Vlad Yasevich committed
      A recent attempt to remove deprecated socket options demonstrated
      that removing options from the enum space will have severe
      binary compatibility issues.  The reason is that it changes
      the subsequent enum space and causes option values to be redefined.
      To solve this, and to get rid of the ugly double statements for
      every option, we simply convert to the #define scheme.
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
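The renumbering hazard described in this commit can be reproduced with a few lines of C. The option names and the base value 100 below are hypothetical and chosen only for illustration; they are not the real SCTP option numbers.

```c
#include <assert.h>

/* Hypothetical option names and values, for illustration only. */

/* Original ABI: values are assigned implicitly by position in the enum. */
enum opts_v1 { OPT_A_V1 = 100, OPT_DEPRECATED_V1, OPT_B_V1 };

/* Deleting the deprecated entry silently renumbers everything after it. */
enum opts_v2 { OPT_A_V2 = 100, OPT_B_V2 };

/* With explicit defines, a removed option just leaves a hole at 101. */
#define OPT_A 100
#define OPT_B 102

static_assert(OPT_B_V1 == 102, "value that existing binaries were compiled with");
static_assert(OPT_B_V2 == 101, "same option, new number after the removal");
static_assert(OPT_B == OPT_B_V1, "the define keeps the old number");
```

A binary built against the first enum would pass 102 for the second option, while a kernel built from the second enum would expect 101; the explicit defines avoid that mismatch.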
    • sctp: Remove useless last_time_used variable · 245cba7e
      Vlad Yasevich committed
      The transport last_time_used variable is rather useless.
      It was only used when determining if CWND needs to be updated
      due to idle transport.  However, idle transport detection was
      based on a Heartbeat timer and last_time_used was not incremented
      when sending Heartbeats.  As a result the check for cwnd reduction
      was always true.  We can get rid of the variable and just base
      our cwnd manipulation on the HB timer (like the code comment says).
      We also have to call into the cwnd manipulation function regardless
      of whether HBs are enabled or not.  That way we will detect idle
      transports if the user has disabled Heartbeats.
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
    • sctp: remove deprecated SCTP_GET_*_OLD stuffs · a242b41d
      Amerigo Wang committed
      The SCTP_GET_*_OLD stuff is scheduled to be removed.
      
      Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: WANG Cong <amwang@redhat.com>
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
    • sctp: Update SWS avaoidance receiver side algorithm · 90f2f531
      Vlad Yasevich committed
      We currently send window update SACKs every time we free up 1 PMTU
      worth of data.  That is a lot more SACKs than necessary.  Instead, we'll
      now send back the actual window every time we send a sack, and do
      window-update SACKs when a fraction of the receive buffer has been
      opened.  The fraction is controlled with a sysctl.
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
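A minimal sketch of the "fraction of the receive buffer" test follows. It assumes the fraction is expressed as a right shift of the buffer size; the function name, its parameters and the shift encoding are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a window-update SACK is worthwhile.
 *   rwnd   - receive window currently available
 *   a_rwnd - window last advertised to the peer
 *   rcvbuf - size of the receive buffer
 *   shift  - tunable; the update threshold is rcvbuf / 2^shift (assumed encoding)
 */
static bool needs_window_update(uint32_t rwnd, uint32_t a_rwnd,
                                uint32_t rcvbuf, unsigned int shift)
{
    assert(shift < 32);

    uint32_t threshold = rcvbuf >> shift;

    /* Update only once the window has opened by at least the threshold. */
    return rwnd > a_rwnd && (rwnd - a_rwnd) >= threshold;
}
```

With a 16000-byte buffer and a shift of 2, the threshold is 4000 bytes, so freeing a single path MTU of data no longer triggers a window-update SACK by itself.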
    • sctp: Fix malformed "Invalid Stream Identifier" error · 6383cfb3
      Vlad Yasevich committed
      The "Invalid Stream Identifier" error has a 16 bit reserved
      field at the end, thus making the parameter length be 8 bytes.
      We've never supplied that reserved field making wireshark
      tag the packet as malformed.
      Reported-by: Chris Dischino <cdischino@sonusnet.com>
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
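The 8-byte layout mentioned above can be sketched as a small encoder. The function name is hypothetical; the field order (cause code, cause length, stream identifier, 16-bit reserved field) and the cause code value 1 follow the error cause definition in RFC 4960, and all fields are written in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CAUSE_INVALID_STREAM     1u  /* cause code from RFC 4960 */
#define CAUSE_INVALID_STREAM_LEN 8u  /* 4-byte header + stream id + reserved */

/* Hypothetical encoder: writes the full 8-byte cause, reserved field included. */
static size_t build_invalid_stream_cause(uint8_t out[8], uint16_t stream_id)
{
    assert(out != NULL);
    memset(out, 0, CAUSE_INVALID_STREAM_LEN);

    out[0] = (uint8_t)(CAUSE_INVALID_STREAM >> 8);      /* cause code, big endian */
    out[1] = (uint8_t)(CAUSE_INVALID_STREAM & 0xff);
    out[2] = (uint8_t)(CAUSE_INVALID_STREAM_LEN >> 8);  /* cause length */
    out[3] = (uint8_t)(CAUSE_INVALID_STREAM_LEN & 0xff);
    out[4] = (uint8_t)(stream_id >> 8);                 /* offending stream id */
    out[5] = (uint8_t)(stream_id & 0xff);
    /* out[6] and out[7] stay zero: the reserved field the fix adds. */

    return CAUSE_INVALID_STREAM_LEN;
}
```

Omitting the last two bytes would leave a 6-byte cause, which is the kind of packet the commit says wireshark tagged as malformed.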
    • sctp: implement the sender side for SACK-IMMEDIATELY extension · b93d6471
      Wei Yongjun committed
      This patch implements the sender side for SACK-IMMEDIATELY
      extension.
      
        Section 4.1.  Sender Side Considerations
      
        Whenever the sender of a DATA chunk can benefit from the
        corresponding SACK chunk being sent back without delay, the sender
        MAY set the I-bit in the DATA chunk header.
      
        Reasons for setting the I-bit include
      
        o  The sender is in the SHUTDOWN-PENDING state.
      
        o  The application requests to set the I-bit of the last DATA chunk
           of a user message when providing the user message to the SCTP
           implementation.
      Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
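The two sender-side reasons quoted above can be expressed as a small decision helper. The enum, the function and its parameters are hypothetical simplifications; only the two conditions and the I-bit position (0x08 in the DATA chunk flags) come from the extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DATA_FLAG_I 0x08u   /* assumed macro name for the I-bit */

/* Simplified association states; a real implementation has many more. */
enum assoc_state_demo { DEMO_ESTABLISHED, DEMO_SHUTDOWN_PENDING };

/*
 * Returns the flags for an outgoing DATA chunk, with the I-bit set when
 * either reason from Section 4.1 applies.
 */
static uint8_t data_chunk_flags(uint8_t flags, enum assoc_state_demo state,
                                bool app_requested, bool last_chunk_of_msg)
{
    assert((flags & 0xF0u) == 0);   /* reserved bits must stay clear */

    if (state == DEMO_SHUTDOWN_PENDING ||
        (app_requested && last_chunk_of_msg))
        flags |= DATA_FLAG_I;

    return flags;
}
```

Note that an application request only affects the last DATA chunk of the user message, matching the second bullet of the quoted section.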
    • sctp: implement definition for SACK-IMMEDIATELY extension · 475cba4e
      Wei Yongjun committed
      This patch implements the definition for SACK-IMMEDIATELY
      extension.
      
      Section 3.  The I-bit in the DATA Chunk Header
      
         The following Figure 1 shows the extended DATA chunk.
      
          0                   1                   2                   3
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |   Type = 0    |  Res  |I|U|B|E|           Length              |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |                              TSN                              |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |        Stream Identifier      |     Stream Sequence Number    |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |                  Payload Protocol Identifier                  |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         \                                                               \
         /                           User Data                           /
         \                                                               \
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      
                                       Figure 1
      
         The only difference between the DATA chunk in Figure 1 and the DATA
         chunk defined in [RFC4960] is the addition of the I-bit in the flags
         field of the chunk header.
      Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
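Reading Figure 1 with the most significant bit on the left, the flag byte holds four reserved bits followed by I, U, B and E, so E is the lowest bit. A minimal sketch of matching definitions follows; the macro, struct and function names are illustrative and not the ones used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DATA chunk flag bits, lowest bit first (names are illustrative). */
#define DATA_FLAG_E        0x01u  /* ending fragment */
#define DATA_FLAG_B        0x02u  /* beginning fragment */
#define DATA_FLAG_U        0x04u  /* unordered delivery */
#define DATA_FLAG_I        0x08u  /* SACK immediately */
#define DATA_FLAGS_RESERVED 0xF0u

static_assert((DATA_FLAG_I & DATA_FLAGS_RESERVED) == 0,
              "I-bit lies outside the reserved bits");

/* First four bytes of the chunk from Figure 1; length handling is omitted. */
struct data_chunk_hdr_demo {
    uint8_t  type;    /* 0 for DATA */
    uint8_t  flags;
    uint16_t length;
};

/* Receiver-side check: did the sender ask for an immediate SACK? */
static bool sack_immediately_requested(const struct data_chunk_hdr_demo *hdr)
{
    return hdr->type == 0 && (hdr->flags & DATA_FLAG_I) != 0;
}
```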
  6. 21 Nov, 2009 1 commit
  7. 20 Nov, 2009 5 commits
    • mac80211: avoid spurious deauth frames/messages · a58ce43f
      Johannes Berg committed
      With WEXT, it happens frequently that the SME
      requests an authentication but then deauthenticates
      right away because some new parameters came along.
      Every time this happens we print a deauth message
      and send a deauth frame, both of which are rather
      confusing. Avoid it by aborting the authentication
      process silently, and telling cfg80211 about that.
      
      The patch looks larger than it really is:
      __cfg80211_auth_remove() is split out from
      cfg80211_send_auth_timeout(), there's no new code
      except __cfg80211_auth_canceled() (a one-liner) and
      the mac80211 bits (7 new lines of code).
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • mac80211: request TX status where needed · 7351c6bd
      Johannes Berg committed
      Right now all frames mac80211 hands to the driver
      have the IEEE80211_TX_CTL_REQ_TX_STATUS flag set to
      request TX status. This isn't really necessary; only
      the injected frames need TX status (the latter for
      hostapd), so move setting this flag.
      
      The rate control algorithms also need TX status, but
      they don't require it.
      
      Also, rt2x00 uses that bit for its own purposes and
      seems to require it being set for all frames, but
      that can be fixed in rt2x00.
      
      This doesn't really change anything for any drivers
      but in the future drivers using hw-rate control may
      opt to not report TX status for frames that don't
      have the IEEE80211_TX_CTL_REQ_TX_STATUS flag set.
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Acked-by: Ivo van Doorn <IvDoorn@gmail.com> [rt2x00 bits]
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • cfg80211: disallow bridging managed/adhoc interfaces · ad4bb6f8
      Johannes Berg committed
      A number of people have tried to add a wireless interface
      (in managed mode) to a bridge and then complained that it
      doesn't work. It cannot work, however, because in 802.11
      networks all packets need to be acknowledged and as such
      need to be sent to the right address. Promiscuous doesn't
      help here. The wireless address format used for these
      links has only space for three addresses, the
       * transmitter, which must be equal to the sender (origin)
       * receiver (on the wireless medium), which is the AP in
         the case of managed mode
       * the recipient (destination), which is on the AP's local
         network segment
      
      In an IBSS, it is similar, but the receiver and recipient
      must match and the third address is used as the BSSID.
      
      To avoid such mistakes in the future, disallow adding a
      wireless interface to a bridge.
      
      Felix has recently added a four-address mode to the AP
      and client side that can be used (after negotiating that
      it is possible, which must happen out-of-band by setting
      up both sides) for bridging, so allow that case.
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Acked-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • cfg80211: introduce capability for 4addr mode · 9bc383de
      Johannes Berg committed
      It's very likely that not many devices will support
      four-address mode in station or AP mode so introduce
      capability bits for both modes, set them in mac80211
      and check them when userspace tries to use the mode.
      Also, keep track of 4addr in cfg80211 (wireless_dev)
      and not in mac80211 any more. mac80211 can also be
      improved for the VLAN case by not looking at the
      4addr flag but maintaining the station pointer for
      it correctly. However, keep track of use_4addr for
      station mode in mac80211 to avoid all the derefs.
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • cfg80211: convert bools into flags · 5be83de5
      Johannes Berg committed
      We've accumulated a number of options for wiphys
      which make more sense as flags as we keep adding
      more. Convert the existing ones.
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
  8. 19 Nov, 2009 9 commits
  9. 18 Nov, 2009 5 commits
    • linkwatch: linkwatch_forget_dev() to speedup device dismantle · e014debe
      Eric Dumazet committed
      Herbert Xu wrote:
      > On Tue, Nov 17, 2009 at 04:26:04AM -0800, David Miller wrote:
      >> Really, the link watch stuff is just due for a redesign.  I don't
      >> think a simple hack is going to cut it this time, sorry Eric :-)
      >
      > I have no objections against any redesigns, but since the only
      > caller of linkwatch_forget_dev runs in process context with the
      > RTNL, it could also legally emit those events.
      
      Thanks guys, here is an updated version then, before linkwatch surgery?
      
      In this version, I force the event to be sent synchronously.
      
      [PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle
      
      time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105
      
      real	0m0.266s
      user	0m0.000s
      sys	0m0.001s
      
      real	0m0.770s
      user	0m0.000s
      sys	0m0.000s
      
      real	0m1.022s
      user	0m0.000s
      sys	0m0.000s
      
      One problem with the current scheme in the vlan dismantle phase is the
      device reference held by the following chain:
      
      vlan_dev_stop() ->
      	netif_carrier_off(dev) ->
      		linkwatch_fire_event(dev) ->
      			dev_hold() ...
      
      And __linkwatch_run_queue() runs up to one second later...
      
      A generic fix to this problem is to add a linkwatch_forget_dev() method
      to unlink the device from the list of watched devices.
      
      dev->link_watch_next becomes dev->link_watch_list (and use a bit more memory),
      to be able to unlink device in O(1).
      
      After patch :
      time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105
      
      real    0m0.024s
      user    0m0.000s
      sys     0m0.000s
      
      real    0m0.032s
      user    0m0.000s
      sys     0m0.001s
      
      real    0m0.033s
      user    0m0.000s
      sys     0m0.000s
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
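Why a doubly linked node allows O(1) unlinking can be shown with a userspace model. `struct netdev_demo` and the `*_demo` functions are hypothetical; the reference counter merely models the dev_hold() taken in the call chain above, and the synchronous event emission from the patch is left out.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* O(1): only the two neighbours are touched, no list walk is needed. */
static void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    list_init(entry);
}

struct netdev_demo {
    int refcnt;                        /* models dev_hold() / dev_put() */
    struct list_head link_watch_list;  /* node in the pending-event list */
};

static struct list_head lweventlist = { &lweventlist, &lweventlist };

static void netdev_demo_init(struct netdev_demo *dev)
{
    dev->refcnt = 0;
    list_init(&dev->link_watch_list);
}

/* Queue a link event; the device stays referenced until it is processed. */
static void linkwatch_fire_event_demo(struct netdev_demo *dev)
{
    if (list_empty(&dev->link_watch_list)) {
        dev->refcnt++;
        list_add_tail(&dev->link_watch_list, &lweventlist);
    }
}

/* Drop a pending event immediately instead of waiting for the queue to run. */
static void linkwatch_forget_dev_demo(struct netdev_demo *dev)
{
    assert(dev != NULL);
    if (!list_empty(&dev->link_watch_list)) {
        list_del_init(&dev->link_watch_list);
        dev->refcnt--;
    }
}
```

With a singly linked `next` pointer, removing one device would require walking the list to find its predecessor; the extra `prev` pointer is the "bit more memory" the commit mentions.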
    • net: introduce NETDEV_UNREGISTER_PERNET · 395264d5
      Octavian Purdila committed
      This new event is called once for each unique net namespace in batched
      unregister operations (with the argument set to a random device from
      that namespace) and once per device in non-batched unregister
      operations.
      
      It allows us to factorize some device unregister work such as clearing the
      routing cache.
      Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add dev_txq_stats_fold() helper · d83345ad
      Eric Dumazet committed
      Some drivers' ndo_get_stats() methods need to perform txqueue stats folding.
      
      Move folding from dev_get_stats() to a new dev_txq_stats_fold() function.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
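"Folding" here means summing per-queue transmit counters into device-wide totals. A minimal userspace sketch, with hypothetical struct and function names and a simplified set of counters, might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-queue and per-device counters. */
struct txq_stats_demo { unsigned long tx_packets, tx_bytes, tx_dropped; };
struct dev_stats_demo { unsigned long tx_packets, tx_bytes, tx_dropped; };

/* Sum the counters of all TX queues into the device-level statistics. */
static void txq_stats_fold_demo(const struct txq_stats_demo *queues, size_t n,
                                struct dev_stats_demo *stats)
{
    unsigned long packets = 0, bytes = 0, dropped = 0;

    assert(stats != NULL);
    for (size_t i = 0; i < n; i++) {
        packets += queues[i].tx_packets;
        bytes   += queues[i].tx_bytes;
        dropped += queues[i].tx_dropped;
    }
    stats->tx_packets = packets;
    stats->tx_bytes   = bytes;
    stats->tx_dropped = dropped;
}
```

Keeping this in a separate helper lets a driver's own statistics routine call it directly, which is the reason the commit gives for moving the code out of dev_get_stats().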
    • fcntl: rename F_OWNER_GID to F_OWNER_PGRP · 978b4053
      Peter Zijlstra committed
      This is for consistency with various ioctl() operations that include the
      suffix "PGRP" in their names, and also for consistency with PRIO_PGRP,
      used with setpriority() and getpriority().  Also, using PGRP instead of
      GID avoids confusion with the common abbreviation of "group ID".
      
      I'm fine with anything that makes it more consistent, and if PGRP is what
      is the predominant abbreviation then I see no need to further confuse
      matters by adding a third one.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: allow memory hotplug and hibernation in the same kernel · 6ad696d2
      Andi Kleen committed
      Allow memory hotplug and hibernation in the same kernel
      
      Memory hotplug and hibernation were exclusive in Kconfig.  This is
      obviously a problem for distribution kernels that want to support both in
      the same image.
      
      After some discussions with Rafael and others the only problem is with
      parallel memory hotadd or removal while a hibernation operation is in
      progress.  It was also working for s390 before.
      
      This patch removes the Kconfig level exclusion, and simply makes the
      memory add / remove functions grab the pm_mutex to exclude against
      hibernation.
      
      Fixes a regression - old kernels didn't exclude memory hotadd and
      hibernation.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>