1. 29 Jan, 2011 3 commits
  2. 28 Jan, 2011 8 commits
  3. 27 Jan, 2011 4 commits
    • net: Implement read-only protection and COW'ing of metrics. · 62fa8a84
      David S. Miller committed
      Routing metrics are now copy-on-write.
      
      Initially a route entry points its metrics at a read-only location.
      If a routing table entry exists, it will point there.  Else it will
      point at the all zero metric place-holder called 'dst_default_metrics'.
      
      The writeability state of the metrics is stored in the low bits of the
      metrics pointer; we have two bits left to spare if we want to store
      more states.
      
      For the initial implementation, COW is implemented simply via kmalloc.
      However future enhancements will change this to place the writable
      metrics somewhere else, in order to increase sharing.  Very likely
      this "somewhere else" will be the inetpeer cache.
      
      Note also that this means that metrics updates may transiently fail
      if we cannot COW the metrics successfully.
      
      But even by itself, this patch should decrease memory usage and
      increase cache locality especially for routing workloads.  In those
      cases the read-only metric copies stay in place and never get written
      to.
      
      TCP workloads where metrics get updated, and those rare cases where
      PMTU triggers occur, will take a very slight performance hit.  But
      that hit will be alleviated when the long-term writable metrics
      move to a more sharable location.
      
      Since the metrics storage went from a u32 array of RTAX_MAX entries to
      what is essentially a pointer, some retooling of the dst_entry layout
      was necessary.
      
      Most importantly, we need to preserve the alignment of the reference
      count so that it doesn't share cache lines with the read-mostly state,
      as per Eric Dumazet's alignment assertion checks.
      
      The only non-trivial bit here is the move of the 'flags' member into
      the writeable cacheline.  This is OK since we always access the
      flags at around the same moment that we modify the reference
      count.
      Signed-off-by: David S. Miller <davem@davemloft.net>
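The copy-on-write scheme described in the commit above can be sketched in user space as follows. This is only an illustration under stated assumptions, not the kernel code: `struct route`, `route_init()`, `route_metric_set()`, the `METRICS_*` constants and `default_metrics` are hypothetical names, `malloc` stands in for `kmalloc`, and the 4-byte alignment of a `uint32_t` array is what leaves the two low pointer bits free for state.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define METRICS_MAX       4      /* stand-in for RTAX_MAX; the value is illustrative */
#define METRICS_READ_ONLY 0x1UL  /* hypothetical flag kept in bit 0 of the pointer */
#define METRICS_FLAG_MASK 0x3UL  /* low bits that 4-byte alignment leaves unused */

/* All-zero placeholder, similar in spirit to 'dst_default_metrics'. */
static const uint32_t default_metrics[METRICS_MAX];

struct route {
    uintptr_t metrics; /* address of a uint32_t array, state in the low bits */
};

/* Point the route at shared read-only metrics, or at the zero placeholder. */
static void route_init(struct route *rt, const uint32_t *shared)
{
    if (!shared)
        shared = default_metrics;
    assert(((uintptr_t)shared & METRICS_FLAG_MASK) == 0);
    rt->metrics = (uintptr_t)shared | METRICS_READ_ONLY;
}

static int route_metrics_read_only(const struct route *rt)
{
    return (rt->metrics & METRICS_READ_ONLY) != 0;
}

static uint32_t route_metric_get(const struct route *rt, int idx)
{
    const uint32_t *p = (const uint32_t *)(rt->metrics & ~METRICS_FLAG_MASK);
    return p[idx];
}

/* Returns 0 on success, -1 if the private copy could not be allocated. */
static int route_metric_set(struct route *rt, int idx, uint32_t val)
{
    if (route_metrics_read_only(rt)) {
        const uint32_t *src = (const uint32_t *)(rt->metrics & ~METRICS_FLAG_MASK);
        uint32_t *copy = malloc(sizeof(uint32_t) * METRICS_MAX);

        if (!copy)
            return -1; /* the update fails transiently, as the commit notes */
        memcpy(copy, src, sizeof(uint32_t) * METRICS_MAX);
        rt->metrics = (uintptr_t)copy; /* low bits clear means writable */
    }
    ((uint32_t *)(rt->metrics & ~METRICS_FLAG_MASK))[idx] = val;
    return 0;
}

/* Free the private copy, if one was ever made. */
static void route_destroy(struct route *rt)
{
    if (!route_metrics_read_only(rt))
        free((void *)(rt->metrics & ~METRICS_FLAG_MASK));
    rt->metrics = 0;
}
```

In this sketch a route that is only ever read never allocates anything, which is the memory and cache-locality benefit the commit message points to for routing workloads.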
    • xfrm6: Don't forget to propagate peer into ipsec route. · 7cc2edb8
      David S. Miller committed
      Like ipv4, we have to propagate the ipv6 route peer into
      the ipsec top-level route during instantiation.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mac80211: use DECLARE_EVENT_CLASS · ba99d93b
      Johannes Berg committed
      For events that include only the local struct as
      their parameter, we can use DECLARE_EVENT_CLASS
      and save quite some binary size across segments
      as well as lines of code.
      
           text    data     bss     dec     hex  filename
         375745   19296     916  395957   60ab5  mac80211.ko.before
         367473   17888     916  386277   5e4e5  mac80211.ko.after
          -8272   -1408       0   -9680   -25d0  delta
      
      Some more tracepoints with identical arguments
      could be combined like this but for now this is
      the one that benefits most.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • net_sched: sch_mqprio: dont leak kernel memory · 144ce879
      Eric Dumazet committed
      mqprio_dump() should make sure all fields of struct tc_mqprio_qopt are
      initialized.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
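One common way to make sure every field of a dumped structure is initialized is to zero the whole structure before filling it in. The sketch below shows that idea only; it is not taken from the patch. `struct qopt_sketch`, `dump_qopt_sketch()` and `SKETCH_MAX_TC` are hypothetical stand-ins, and the layout is not that of `struct tc_mqprio_qopt`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_TC 4 /* hypothetical limit, not the kernel's value */

/* Hypothetical stand-in for struct tc_mqprio_qopt; the layout is illustrative. */
struct qopt_sketch {
    uint8_t  num_tc;
    uint8_t  hw;
    uint16_t count[SKETCH_MAX_TC];
    uint16_t offset[SKETCH_MAX_TC];
};

/* Fill 'out' for copying to an untrusted reader (user space in the real case). */
static void dump_qopt_sketch(struct qopt_sketch *out, uint8_t num_tc,
                             const uint16_t *count, const uint16_t *offset)
{
    unsigned int i;

    assert(out != NULL && num_tc <= SKETCH_MAX_TC);

    /* Zero everything first so unused slots carry no stale bytes. */
    memset(out, 0, sizeof(*out));

    out->num_tc = num_tc;
    for (i = 0; i < num_tc; i++) {
        out->count[i] = count[i];
        out->offset[i] = offset[i];
    }
}
```

Without the `memset`, the slots beyond `num_tc` (and `hw`, which this sketch never assigns) would keep whatever the memory held before, which is the kind of leak the commit title refers to.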
  4. 26 Jan, 2011 4 commits
    • TCP: fix a bug that triggers large number of TCP RST by mistake · 44f5324b
      Jerry Chu committed
      This patch fixes a bug that causes TCP RST packets to be generated
      for otherwise correctly behaved applications (e.g., ones that leave
      no unread data on close). To trigger the bug, at least two conditions
      must be met:
      
      1. The FIN flag is set on the last data packet, i.e., it's not on a
      separate, FIN only packet.
      2. The size of the last data chunk on the receive side matches
      exactly with the size of buffer posted by the receiver, and the
      receiver closes the socket without any further read attempt.
      
      This bug was first noticed on our netperf based testbed for our IW10
      proposal to IETF where a large number of RST packets were observed.
      netperf's read side code meets condition 2 above 100% of the time.
      
      Before the fix, tcp_data_queue() will queue the last skb that meets
      condition 1 to sk_receive_queue even though it has fully copied out
      (skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
      tcp_recvmsg() often returns all the copied out data successfully
      without actually consuming the skb, due to a check
      "if ((chunk = len - tp->ucopy.len) != 0) {"
      and
      "len -= chunk;"
      after tcp_prequeue_process() that causes "len" to become 0 and an
      early exit from the big while loop.
      
      I don't see any reason not to free the skb whose data have been fully
      consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
      get there if MSG_PEEK is on. Am I missing some arcane cases related
      to urgent data?
      Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
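The interaction between the two conditions can be followed with a toy model. It is a heavily simplified sketch, not kernel code: `struct rx_model` and the `model_*` functions are hypothetical, only the `chunk`/`len` arithmetic is taken from the commit message, and the model assumes that a skb still sitting in the receive queue at close time is what leads to the RST.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive-side state: user buffer space left, leftover skb flag. */
struct rx_model {
    size_t ucopy_len;  /* plays the role of tp->ucopy.len */
    int    skb_queued; /* 1 if a skb remains in the receive queue */
};

/* Models tcp_data_queue() copying one segment straight to the user buffer. */
static void model_data_queue(struct rx_model *m, size_t payload, int fin, int fixed)
{
    size_t chunk = payload < m->ucopy_len ? payload : m->ucopy_len;

    m->ucopy_len -= chunk;
    if (chunk < payload)
        m->skb_queued = 1; /* unread data really is left over */
    else if (fin && !fixed)
        m->skb_queued = 1; /* pre-fix: fully copied skb with FIN is still queued */
}

/* Models the part of tcp_recvmsg() quoted in the commit message. */
static size_t model_recvmsg(struct rx_model *m, size_t len, size_t payload,
                            int fin, int fixed)
{
    size_t copied = 0, chunk;

    assert(m != NULL);
    m->ucopy_len = len;
    model_data_queue(m, payload, fin, fixed);

    if ((chunk = len - m->ucopy_len) != 0) {
        len -= chunk;
        copied += chunk;
    }
    /* With len == 0 the loop exits early and never revisits the queue. */
    if (len > 0 && m->skb_queued)
        m->skb_queued = 0; /* a further pass would consume the FIN skb */
    return copied;
}

/* Assumption of this model: a leftover skb at close time triggers a RST. */
static int model_close_sends_rst(const struct rx_model *m)
{
    return m->skb_queued;
}
```

With a 100-byte buffer and a 100-byte final segment carrying FIN, the pre-fix model leaves the skb queued and reports a RST on close. With the fix, or with a buffer larger than the segment, it does not.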
    • mac80211: fix a crash in ieee80211_beacon_get_tim on change_interface · eb3e554b
      Felix Fietkau committed
      Some drivers (e.g. ath9k) do not always disable beacons when they're
      supposed to. When an interface is changed using the change_interface op,
      the mode specific sdata part is in an undefined state and trying to
      get a beacon at this point can produce weird crashes.
      
      To fix this, add a check for ieee80211_sdata_running before using
      anything from the sdata.
      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
      Cc: stable@kernel.org
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • pktgen: speedup fragmented skbs · 26ad7879
      Eric Dumazet committed
      We spend a lot of time clearing pages in pktgen.
      (Or not clearing them on ipv6 and leaking kernel memory)
      
      Since we don't modify them, we can use one zeroed page, and get
      references on it. This page can use NUMA affinity as well.
      
      Define pktgen_finalize_skb() helper, used both in ipv4 and ipv6.
      
      Results using skbs with one frag :
      
      Before patch :
      
      Result: OK: 608980458(c608978520+d1938) nsec, 1000000000
      (100byte,1frags)
        1642088pps 1313Mb/sec (1313670400bps) errors: 0
      
      After patch :
      
      Result: OK: 345285014(c345283891+d1123) nsec, 1000000000
      (100byte,1frags)
        2896158pps 2316Mb/sec (2316926400bps) errors: 0
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
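The idea of handing out references to a single zeroed page, instead of clearing a fresh page per fragment, might look roughly like this in user space. It is a sketch under assumptions: `struct shared_page`, `zero_page_get()`, `zero_page_put()` and `SKETCH_PAGE_SIZE` are hypothetical, `calloc` stands in for the kernel page allocator, and NUMA placement is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096 /* illustrative page size */

/* Hypothetical reference-counted page; the payload is never written after setup. */
struct shared_page {
    unsigned int  refs;
    unsigned char data[SKETCH_PAGE_SIZE];
};

static struct shared_page *zero_page; /* the single shared zeroed page */

/* Return the shared page with one more reference, allocating it on first use. */
static struct shared_page *zero_page_get(void)
{
    if (!zero_page) {
        zero_page = calloc(1, sizeof(*zero_page)); /* zeroed exactly once */
        if (!zero_page)
            return NULL;
    }
    zero_page->refs++;
    return zero_page;
}

/* Drop one reference; the page is freed when the last user lets go. */
static void zero_page_put(struct shared_page *p)
{
    assert(p != NULL && p->refs > 0);
    if (--p->refs == 0) {
        zero_page = NULL;
        free(p);
    }
}
```

Each fragment then costs a reference-count increment rather than a page clear, which is consistent with the pps improvement reported in the commit message.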
    • ipv6: Revert 'administrative down' address handling changes. · 73a8bd74
      David S. Miller committed
      This reverts the following set of commits:
      
      d1ed113f ("ipv6: remove duplicate neigh_ifdown")
      29ba5fed ("ipv6: don't flush routes when setting loopback down")
      9d82ca98 ("ipv6: fix missing in6_ifa_put in addrconf")
      2de79570 ("ipv6: addrconf: don't remove address state on ifdown if the address is being kept")
      8595805a ("IPv6: only notify protocols if address is compeletely gone")
      27bdb2ab ("IPv6: keep tentative addresses in hash table")
      93fa159a ("IPv6: keep route for tentative address")
      8f37ada5 ("IPv6: fix race between cleanup and add/delete address")
      84e8b803 ("IPv6: addrconf notify when address is unavailable")
      dc2b99f7 ("IPv6: keep permanent addresses on admin down")
      
      because the core semantic change to ipv6 address handling on ifdown
      has broken some things, in particular "disable_ipv6" sysctl handling.
      
      Stephen has made several attempts to get things back in working order,
      but nothing has restored disable_ipv6 fully yet.
      Reported-by: Eric W. Biederman <ebiederm@xmission.com>
      Tested-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 25 Jan, 2011 12 commits
  6. 24 Jan, 2011 1 commit
  7. 22 Jan, 2011 4 commits
  8. 21 Jan, 2011 4 commits