1. 17 12月, 2011 2 次提交
  2. 16 12月, 2011 1 次提交
  3. 14 12月, 2011 8 次提交
  4. 13 12月, 2011 3 次提交
    • H
      netem: add cell concept to simulate special MAC behavior · 90b41a1c
      Hagen Paul Pfeifer 提交于
      This extension can be used to simulate special link layer
      characteristics. Simulate because packet data is not modified, only the
      calculation base is changed to delay a packet based on the original
      packet size and artificial cell information.
      
      packet_overhead can be used to simulate a link layer header compression
      scheme (e.g. set packet_overhead to -20) or with a positive
      packet_overhead value an additional MAC header can be simulated. It is
      also possible to "replace" the 14 byte Ethernet header with something
      else.
      
      cell_size and cell_overhead can be used to simulate link layer schemes,
      based on cells, like some TDMA schemes. Another application area are MAC
      schemes using a link layer fragmentation with a (small) header each.
      Cell size is the maximum amount of data bytes within one cell. Cell
      overhead is an additional variable to change the per-cell-overhead
      (e.g.  5 byte header per fragment).
      
      Example (5 kbit/s, 20 byte per packet overhead, cell-size 100 byte, per
      cell overhead 5 byte):
      
        tc qdisc add dev eth0 root netem rate 5kbit 20 100 5
      Signed-off-by: NHagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Acked-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90b41a1c
    • G
      tcp memory pressure controls · d1a4c0b3
      Glauber Costa 提交于
      This patch introduces memory pressure controls for the tcp
      protocol. It uses the generic socket memory pressure code
      introduced in earlier patches, and fills in the
      necessary data in cg_proto struct.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
      CC: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d1a4c0b3
    • G
      socket: initial cgroup code. · e1aab161
      Glauber Costa 提交于
      The goal of this work is to move the memory pressure tcp
      controls to a cgroup, instead of just relying on global
      conditions.
      
      To avoid excessive overhead in the network fast paths,
      the code that accounts allocated memory to a cgroup is
      hidden inside a static_branch(). This branch is patched out
      until the first non-root cgroup is created. So when nobody
      is using cgroups, even if it is mounted, no significant performance
      penalty should be seen.
      
      This patch handles the generic part of the code, and has nothing
      tcp-specific.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtsu.com>
      CC: Kirill A. Shutemov <kirill@shutemov.name>
      CC: David S. Miller <davem@davemloft.net>
      CC: Eric W. Biederman <ebiederm@xmission.com>
      CC: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e1aab161
  5. 12 12月, 2011 1 次提交
  6. 10 12月, 2011 6 次提交
  7. 09 12月, 2011 6 次提交
  8. 07 12月, 2011 6 次提交
  9. 06 12月, 2011 3 次提交
  10. 05 12月, 2011 2 次提交
    • P
      perf: Fix loss of notification with multi-event · 10c6db11
      Peter Zijlstra 提交于
      When you do:
              $ perf record -e cycles,cycles,cycles noploop 10
      
      You expect about 10,000 samples for each event, i.e., 10s at
      1000samples/sec. However, this is not what's happening. You
      get much fewer samples, maybe 3700 samples/event:
      
      $ perf report -D | tail -15
      Aggregated stats:
                 TOTAL events:      10998
                  MMAP events:         66
                  COMM events:          2
                SAMPLE events:      10930
      cycles stats:
                 TOTAL events:       3644
                SAMPLE events:       3644
      cycles stats:
                 TOTAL events:       3642
                SAMPLE events:       3642
      cycles stats:
                 TOTAL events:       3644
                SAMPLE events:       3644
      
      On a Intel Nehalem or even AMD64, there are 4 counters capable
      of measuring cycles, so there is plenty of space to measure those
      events without multiplexing (even with the NMI watchdog active).
      And even with multiplexing, we'd expect roughly the same number
      of samples per event.
      
      The root of the problem was that when the event that caused the buffer
      to become full was not the first event passed on the cmdline, the user
      notification would get lost. The notification was sent to the file
      descriptor of the overflowed event but the perf tool was not polling
      on it.  The perf tool aggregates all samples into a single buffer,
      i.e., the buffer of the first event. Consequently, it assumes
      notifications for any event will come via that descriptor.
      
      The seemingly straight forward solution of moving the waitq into the
      ringbuffer object doesn't work because of life-time issues. One could
      perf_event_set_output() on a fd that you're also blocking on and cause
      the old rb object to be freed while its waitq would still be
      referenced by the blocked thread -> FAIL.
      
      Therefore link all events to the ringbuffer and broadcast the wakeup
      from the ringbuffer object to all possible events that could be waited
      upon. This is rather ugly, and we're open to better solutions but it
      works for now.
      Reported-by: NStephane Eranian <eranian@google.com>
      Finished-by: NStephane Eranian <eranian@google.com>
      Reviewed-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20111126014731.GA7030@quadSigned-off-by: NIngo Molnar <mingo@elte.hu>
      10c6db11
    • E
      tcp: take care of misalignments · 117632e6
      Eric Dumazet 提交于
      We discovered that TCP stack could retransmit misaligned skbs if a
      malicious peer acknowledged sub MSS frame. This currently can happen
      only if output interface is non SG enabled : If SG is enabled, tcp
      builds headless skbs (all payload is included in fragments), so the tcp
      trimming process only removes parts of skb fragments, header stay
      aligned.
      
      Some arches cant handle misalignments, so force a head reallocation and
      shrink headroom to MAX_TCP_HEADER.
      
      Dont care about misaligments on x86 and PPC (or other arches setting
      NET_IP_ALIGN to 0)
      
      This patch introduces __pskb_copy() which can specify the headroom of
      new head, and pskb_copy() becomes a wrapper on top of __pskb_copy()
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      117632e6
  11. 04 12月, 2011 2 次提交
    • J
      net: Add Open vSwitch kernel components. · ccb1352e
      Jesse Gross 提交于
      Open vSwitch is a multilayer Ethernet switch targeted at virtualized
      environments.  In addition to supporting a variety of features
      expected in a traditional hardware switch, it enables fine-grained
      programmatic extension and flow-based control of the network.
      This control is useful in a wide variety of applications but is
      particularly important in multi-server virtualization deployments,
      which are often characterized by highly dynamic endpoints and the need
      to maintain logical abstractions for multiple tenants.
      
      The Open vSwitch datapath provides an in-kernel fast path for packet
      forwarding.  It is complemented by a userspace daemon, ovs-vswitchd,
      which is able to accept configuration from a variety of sources and
      translate it into packet processing rules.
      
      See http://openvswitch.org for more information and userspace
      utilities.
      Signed-off-by: NJesse Gross <jesse@nicira.com>
      ccb1352e
    • P
      vlan: Move vlan_set_encap_proto() to vlan header file · 396cf943
      Pravin B Shelar 提交于
      Open vSwitch needs this function for vlan handling.
      Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NJesse Gross <jesse@nicira.com>
      396cf943