1. 26 Aug 2014, 19 commits
  2. 25 Aug 2014, 16 commits
    • Merge branch 'ndo_xmit_flush' · fe88e6dd
      Committed by David S. Miller
      Basic deferred TX queue flushing infrastructure.
      
      Over time, and most recently at the Networking Workshop during the
      Kernel Summit in Chicago, we have discussed the idea of having some
      way to optimize transmits of multiple TX packets at a time.
      
      There are several areas of overhead that could be amortized with
      such schemes.  One has to do with locking and transactional
      overhead; the other with device-specific costs.
      
      This patch set is aimed more at the device-specific costs.
      
      Typically a device queues up a packet in the TX queue and then has to
      do something to have the device start processing that new entry.
      Sometimes this consists of an MMIO write to a "tail" register, and
      in other cases it can involve something as expensive as a
      hypervisor call.
      
      The basic setup defined here is that when the driver supports deferred
      TX queue flushing, ndo_start_xmit should no longer perform that
      operation.  Instead a new operation, ndo_xmit_flush, should do it.
      
      I have converted IGB and virtio_net as initial example users.  The
      IGB conversion is tested; virtio_net is not, but it does compile :-)
      
      All ndo_start_xmit call sites have been abstracted behind a new helper
      called netdev_start_xmit().
      
      This just adds the infrastructure; it does not actually add any
      instances of doing multiple ndo_start_xmit calls per
      ndo_xmit_flush invocation.
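      
      A minimal sketch of the resulting driver-side split, assuming a
      hypothetical "foo" driver (the ndo hooks are from this series; the
      foo_* names and FOO_TX_TAIL register are made up for illustration):
      
        static netdev_tx_t foo_start_xmit(struct sk_buff *skb,
                                          struct net_device *dev)
        {
                struct foo_priv *priv = netdev_priv(dev);
      
                /* Fill a TX ring entry only; no device doorbell here. */
                foo_queue_tx_descriptor(priv, skb);
                return NETDEV_TX_OK;
        }
      
        static void foo_xmit_flush(struct net_device *dev, u16 queue)
        {
                struct foo_priv *priv = netdev_priv(dev);
      
                /* The expensive kick (MMIO tail write, hypervisor call,
                 * ...) happens once here, potentially covering several
                 * previously queued packets. */
                writel(priv->tx_tail, priv->mmio_base + FOO_TX_TAIL);
        }
      
      Core code invokes the pair through the netdev_start_xmit() helper
      mentioned above rather than calling ndo_start_xmit directly.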
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fe88e6dd
    • c223a078
    • igb: Support netdev_ops->ndo_xmit_flush() · c1ebf46c
      Committed by David S. Miller
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1ebf46c
    • net: Add ops->ndo_xmit_flush() · 4798248e
      Committed by David S. Miller
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4798248e
    • ipv6: White-space cleansing : gaps between function and symbol export · 4c83acbc
      Committed by Ian Morris
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      This patch removes some blank lines between the end of a function
      definition and the EXPORT_SYMBOL_GPL macro in order to prevent
      checkpatch from warning that EXPORT_SYMBOL must immediately follow
      a function.
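      
      For illustration, the checkpatch-clean form looks like this, with
      EXPORT_SYMBOL_GPL() on the line immediately after the closing brace
      (the function is hypothetical):
      
        int ipv6_example_helper(struct sk_buff *skb)
        {
                return 0;
        }
        EXPORT_SYMBOL_GPL(ipv6_example_helper);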
      Signed-off-by: Ian Morris <ipm@chirality.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4c83acbc
    • ipv6: White-space cleansing : Structure layouts · cc24beca
      Committed by Ian Morris
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      This patch addresses structure definitions; specifically, it cleans
      up brace placement and replaces spaces with tabs in a few places.
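      
      An illustrative (hypothetical) example of the cleansed form, with
      tabs for member alignment and kernel-style brace placement:
      
        struct example_opt_hdr {
                __u8    type;
                __u8    len;
                __be16  reserved;
        };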
      Signed-off-by: Ian Morris <ipm@chirality.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cc24beca
    • ipv6: White-space cleansing : Line Layouts · 67ba4152
      Committed by Ian Morris
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      A number of items are addressed in this patch (illustrated in the
      sketch after this list):
      * Multiple spaces converted to tabs.
      * Spaces before tabs removed.
      * Spaces in pointer typing cleansed, e.g. (char *)foo.
      * Space after sizeof removed.
      * Spacing ensured around comparators, such as in if statements.
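      
      A brief before/after illustration of those items (hypothetical
      code):
      
        /* before */
        if(x==NULL)
                len = sizeof (struct foo);
        p = ( char * ) buf;
      
        /* after */
        if (x == NULL)
                len = sizeof(struct foo);
        p = (char *)buf;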
      Signed-off-by: Ian Morris <ipm@chirality.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      67ba4152
    • net: ec_bhf: remove excessive debug messages · a9b0b2fa
      Committed by Darek Marcinkiewicz
      This cuts down the amount of debug information emitted by
      the driver.
      Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a9b0b2fa
    • random32: improvements to prandom_bytes · a98406e2
      Committed by Daniel Borkmann
      This patch addresses a couple of minor items, mostly concerning
      prandom_bytes():
      1) prandom_bytes{,_state}() should use size_t for length arguments.
      2) We can use put_unaligned() when filling the array instead of
         open coding it [ perhaps some archs will further benefit from
         their own arch-specific implementation where GCC cannot make up
         for it ].
      3) Fix a typo.
      4) Use unsigned int as the type for getting the arch seed.
      5) Make use of prandom_u32_max() for timer slack.
      
      Regarding the change to put_unaligned(): callers of prandom_bytes(),
      which internally invokes prandom_bytes_state(), don't care, as they
      expect the array to be filled randomly and have no control over the
      internal state whatsoever (that is also why we have periodic
      reseeding there, etc.). A sketch of the fill pattern follows.
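      
      A hedged sketch of that pattern (simplified, not the exact
      lib/random32.c code):
      
        static void example_fill_bytes(struct rnd_state *state, void *buf,
                                       size_t bytes)
        {
                u8 *ptr = buf;
      
                /* Emit whole 32-bit words with put_unaligned() rather
                 * than open-coded byte shifts. */
                while (bytes >= sizeof(u32)) {
                        put_unaligned(prandom_u32_state(state), (u32 *)ptr);
                        ptr += sizeof(u32);
                        bytes -= sizeof(u32);
                }
      
                /* Trailing bytes (if any) would be copied out of one
                 * final prandom_u32_state() word. */
        }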
      
      Now for the direct callers of prandom_bytes_state(), which
      are solely located in test cases for MTD devices, that is,
      drivers/mtd/tests/{oobtest.c,pagetest.c,subpagetest.c}:
      
      These tests basically fill a test write-vector through
      prandom_bytes_state() with an a priori defined seed each time and
      write that to an MTD device. Later on, they set up a read-vector
      and read those blocks back from the device. So in the verification
      phase, the write-vector is set up again [ same seed,
      prandom_bytes_state() called ] and then memcmp()'ed against the
      read-vector to check that the data is the same.
      
      Akinobu, Lothar and I also tested this patch and it runs through
      the 3 relevant MTD test cases w/o any errors on the nandsim device
      (simulator for MTD devs) for x86_64, ppc64, ARM (i.MX28, i.MX53
      and i.MX6):
      
        # modprobe nandsim first_id_byte=0x20 second_id_byte=0xac \
                           third_id_byte=0x00 fourth_id_byte=0x15
        # modprobe mtd_oobtest dev=0
        # modprobe mtd_pagetest dev=0
        # modprobe mtd_subpagetest dev=0
      
      We also don't have any users depending directly on a particular
      result of the PRNG (except the PRNG self-test itself), and that's
      just fine, as it e.g. allowed us to easily do things like upgrading
      from taus88 to taus113.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Tested-by: Akinobu Mita <akinobu.mita@gmail.com>
      Tested-by: Lothar Waßmann <LW@KARO-electronics.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a98406e2
    • Merge branch 'csums-next' · c1e60bd4
      Committed by David S. Miller
      Tom Herbert says:
      
      ====================
      net: Checksum offload changes - Part V
      
      I am working on overhauling RX checksum offload. Goals of this effort
      are:
      
      - Specify what exactly it means when a driver returns CHECKSUM_UNNECESSARY
      - Preserve CHECKSUM_COMPLETE through encapsulation layers
      - Don't do skb_checksum more than once per packet
      - Unify GRO and non-GRO csum verification as much as possible
      - Unify the checksum functions (checksum_init)
      - Simplify code
      
      What is in this fifth patch set:
      
      - Added GRO checksum validation functions
      - Call the GRO validation functions from TCP and GRE gro_receive
      - Perform checksum verification in the UDP gro_receive path using
        GRO functions and add support for gro_receive in UDP6
      
      Changes in V2:
      
      - Change ip_summed to CHECKSUM_UNNECESSARY instead of moving it
        to CHECKSUM_COMPLETE in GRO checksum validation. This avoids a
        performance penalty from checksumming bytes that precede the
        header GRO is currently at.
      
      Please review carefully and test if possible, mucking with basic
      checksum functions is always a little precarious :-)
      
      ----
      
      Test results with this patch set are below. I did not notice any
      performance regression.
      
      Tests run:
         TCP_STREAM: super_netperf with 200 streams
         TCP_RR: super_netperf with 200 streams and -r 1,1
      
      Device bnx2x (10Gbps):
         No GRE RSS hash (RX interrupts occur on one core)
         UDP RSS port hashing enabled.
      
      * GRE with checksum with IPv4 encapsulated packets
        With fix:
          TCP_STREAM
              9.91% CPU utilization
              5163.78 Mbps
          TCP_RR
              50.64% CPU utilization
              219/347/502 90/95/99% latencies
              834103 tps
        Without fix:
          TCP_STREAM
              10.05% CPU utilization
              5186.22 Mbps
          TCP_RR
              49.70% CPU utilization
              227/338/486 90/95/99% latencies
              813450 tps
      
      * GRE without checksum with IPv4 encapsulated packets
        With fix:
          TCP_STREAM
              10.18% CPU utilization
              5159 Mbps
          TCP_RR
              51.86% CPU utilization
              214/325/471 90/95/99% latencies
              865943 tps
        Without fix:
          TCP_STREAM
              10.26% CPU utilization
              5307.87 Mbps
          TCP_RR
              50.59% CPU utilization
              224/325/476 90/95/99% latencies
              846429 tps
      
      *** Simulate a device that returns CHECKSUM_COMPLETE
      
      * VXLAN with checksum
        With fix:
          TCP_STREAM
              13.03% CPU utilization
              9093.9 Mbps
          TCP_RR
              95.96% CPU utilization
              161/259/474 90/95/99% latencies
              1.14806e+06 tps
        Without fix:
          TCP_STREAM
              13.59% CPU utilization
              9093.97 Mbps
          TCP_RR
              93.95% CPU utilization
              160/259/484 90/95/99% latencies
              1.10262e+06 tps
      
      * VXLAN without checksum
        With fix:
          TCP_STREAM
              13.28% CPU utilization
              9093.87 Mbps
          TCP_RR
              95.04% CPU utilization
              155/246/439 90/95/99% latencies
              1.15e+06 tps
        Without fix:
          TCP_STREAM
              13.37% CPU utilization
              9178.45 Mbps
          TCP_RR
              93.74% CPU utilization
              161/257/469 90/95/99% latencies
              1.1068e+06 tps
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1e60bd4
    • gre: When GRE csum is present count as encap layer wrt csum · 48a5fc77
      Committed by Tom Herbert
      In GRE demux, if the GRE checksum is present, pop the rcv
      encapsulation so that any encapsulated checksums are treated as
      tunnel checksums.
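      
      A hedged sketch of the idea (simplified; skb_checksum_simple_validate()
      and skb_pop_rcv_encapsulation() are existing helpers, but this is
      not the exact diff):
      
        /* In the GRE demux path: a present, validated GRE checksum
         * counts as one checksum level, so pop one level of rcv
         * encapsulation and treat inner checksums as tunnel checksums. */
        if (greh->flags & GRE_CSUM) {
                if (skb_checksum_simple_validate(skb))
                        return -EINVAL;
                skb_pop_rcv_encapsulation(skb);
        }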
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      48a5fc77
    • udp: additional GRO support · 57c67ff4
      Committed by Tom Herbert
      Implement GRO for UDPv6. Add UDP checksum verification in gro_receive
      for both UDP4 and UDP6 calling skb_gro_checksum_validate_zero_check.
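      
      A hedged sketch of the UDP4 side (uh is the local UDP header
      pointer; a zero UDP-over-IPv4 checksum means "no checksum", hence
      the zero_check variant):
      
        /* In udp4_gro_receive: validate in the GRO path, accepting a
         * checksum field of zero as "not present". */
        if (skb_gro_checksum_validate_zero_check(skb, IPPROTO_UDP,
                                                 uh->check,
                                                 inet_gro_compute_pseudo))
                goto flush;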
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      57c67ff4
    • tcp: Call skb_gro_checksum_validate · 149d0774
      Committed by Tom Herbert
      In tcp[64]_gro_receive, call skb_gro_checksum_validate to validate
      the TCP checksum in the GRO context.
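      
      A hedged sketch of the IPv4 flavor (close to the description above;
      the IPv6 variant would use ip6_gro_compute_pseudo):
      
        static struct sk_buff **tcp4_gro_receive(struct sk_buff **head,
                                                 struct sk_buff *skb)
        {
                /* Validate the TCP checksum while still in GRO; on
                 * failure, flush and fall back to the normal path. */
                if (skb_gro_checksum_validate(skb, IPPROTO_TCP,
                                              inet_gro_compute_pseudo)) {
                        NAPI_GRO_CB(skb)->flush = 1;
                        return NULL;
                }
      
                return tcp_gro_receive(head, skb);
        }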
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      149d0774
    • gre: call skb_gro_checksum_simple_validate · 758f75d1
      Committed by Tom Herbert
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      758f75d1
    • net: add gro_compute_pseudo functions · 1933a785
      Committed by Tom Herbert
      Add inet_gro_compute_pseudo and ip6_gro_compute_pseudo. These are
      the logical equivalents of inet_compute_pseudo and ip6_compute_pseudo
      for the GRO path. The IP header is taken from skb_gro_network_header.
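      
      A hedged sketch of the IPv4 variant (the pseudo-header sum is
      computed from the GRO-tracked header and GRO length):
      
        static inline __wsum inet_gro_compute_pseudo(struct sk_buff *skb,
                                                     int proto)
        {
                const struct iphdr *iph = skb_gro_network_header(skb);
      
                return csum_tcpudp_nofold(iph->saddr, iph->daddr,
                                          skb_gro_len(skb), proto, 0);
        }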
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1933a785
    • net: skb_gro_checksum_* functions · 573e8fca
      Committed by Tom Herbert
      Add skb_gro_checksum_validate, skb_gro_checksum_validate_zero_check,
      skb_gro_checksum_simple_validate, and __skb_gro_checksum_complete.
      These are the cognates of the normal checksum functions but are used
      in the gro_receive path and operate on GRO-related fields in
      sk_buffs.
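      
      A hedged usage sketch for the simplest of these, validating a plain
      non-pseudo-header checksum (such as the GRE checksum) in a
      gro_receive handler; csum_present and the out_flush label are
      illustrative:
      
        if (csum_present && skb_gro_checksum_simple_validate(skb))
                goto out_flush;   /* bad checksum: flush this skb */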
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      573e8fca
  3. 24 Aug 2014, 5 commits
    • net: use reciprocal_scale() helper · 8fc54f68
      Committed by Daniel Borkmann
      Replace open-coded instances of (((u64) <x> * <y>) >> 32) with reciprocal_scale().
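      
      An illustrative before/after of the conversion (hash and
      num_buckets are hypothetical names):
      
        /* before */
        index = ((u64)hash * num_buckets) >> 32;
      
        /* after */
        index = reciprocal_scale(hash, num_buckets);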
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8fc54f68
    • net: Allow raw buffers to be passed into the flow dissector. · 690e36e7
      Committed by David S. Miller
      Drivers, and perhaps other entities we have not yet considered,
      sometimes want to know how deep the protocol headers go before
      deciding how large an SKB to allocate and how much of the packet
      to place into the linear SKB area.
      
      For example, consider a driver which has a device which DMAs into
      pools of pages and then tells the driver where the data went in the
      DMA descriptor(s).  The driver can then build an SKB and reference
      most of the data via SKB fragments (which are page/offset/length
      triplets).
      
      However at least some of the front of the packet should be placed into
      the linear SKB area, which comes before the fragments, so that packet
      processing can get at the headers efficiently.  The first thing each
      protocol layer is going to do is a "pskb_may_pull()" so we might as
      well aggregate as much of this as possible while we're building the
      SKB in the driver.
      
      Part of supporting this is that we don't have an SKB yet, so we want
      to be able to let the flow dissector operate on a raw buffer in order
      to compute the offset of the end of the headers.
      
      So now we have a __skb_flow_dissect() which takes an explicit data
      pointer and length.
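      
      A hedged sketch of the resulting interface (argument names
      assumed), with the old call becoming a wrapper:
      
        /* New entry point: dissect headers from an explicit buffer. */
        bool __skb_flow_dissect(const struct sk_buff *skb,
                                struct flow_keys *flow,
                                void *data, __be16 proto,
                                int nhoff, int hlen);
      
        /* Existing interface, now a wrapper over the skb's own data. */
        static inline bool skb_flow_dissect(const struct sk_buff *skb,
                                            struct flow_keys *flow)
        {
                return __skb_flow_dissect(skb, flow, NULL, 0, 0, 0);
        }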
      Signed-off-by: David S. Miller <davem@davemloft.net>
      690e36e7
    • Merge branch 'bcm7xxx_apd_eee' · 1ad676a6
      Committed by David S. Miller
      Florian Fainelli says:
      
      ====================
      net: phy: bcm7xxx: APD and EEE support
      
      This patch series enables auto-power-down (APD) and EEE for the
      BCM7xxx integrated Gigabit PHYs.
      
      I also included a fix for the fixed PHY, which would allow
      clause-45-over-clause-22 reads/writes but return bogus data when
      queried with e.g. ethtool --show-eee.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1ad676a6
    • net: phy: bcm7xxx: enable EEE at the PHY level · b8f9a029
      Committed by Florian Fainelli
      The 28nm Gigabit PHY on BCM7xxx chips comes out of reset with
      absolutely no EEE capabilities, such that we would actually report
      no EEE support when the 3.20 (MDIO_PCS_EEE_ABLE) register is
      accessed.
      
      Poke through the vendor-specific C45 register to enable EEE globally at
      the PHY level, and advertise supported EEE modes.
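      
      For reference, a hedged sketch of reading that standard ability
      register with the generic MMD accessor as it exists in current
      kernels (the vendor-specific enable sequence is Broadcom-internal
      and not reproduced here):
      
        /* EEE ability: MMD device 3 (PCS), register 20. */
        int val = phy_read_mmd(phydev, MDIO_MMD_PCS, MDIO_PCS_EEE_ABLE);
      
        if (val < 0)
                return val;
        if (!(val & MDIO_EEE_1000T))
                return -EOPNOTSUPP;    /* no 1000BASE-T EEE reported */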
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b8f9a029
    • net: phy: allow phy_init_eee() to work with internal PHYs · a9f63095
      Committed by Florian Fainelli
      Internal PHYs do not have any specific phy_interface_t defined
      because they sit within an Ethernet MAC or a larger IC, so they
      fail the early check in phy_init_eee(). Allow these PHYs to
      proceed with EEE initialization and report error/success by
      checking the standard C45 EEE-related registers.
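      
      A hedged, abbreviated sketch of the relaxed gate (the real check
      also admits several explicit interface modes; phy_is_internal() is
      the existing helper):
      
        if (phydev->duplex == DUPLEX_FULL &&
            (phydev->interface == PHY_INTERFACE_MODE_RGMII ||
             phy_is_internal(phydev))) {
                /* proceed: check MDIO_PCS_EEE_ABLE and MDIO_AN_EEE_ADV
                 * to report EEE support or failure */
        }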
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a9f63095