1. 27 Aug, 2014 10 commits
  2. 26 Aug, 2014 20 commits
  3. 25 Aug, 2014 10 commits
    • D
      Merge branch 'ndo_xmit_flush' · fe88e6dd
      David S. Miller committed
      Basic deferred TX queue flushing infrastructure.
      
      Over time, and specifically and more recently at the Networking
      Workshop during Kernel Summit in Chicago, we have discussed the idea
      of having some way to optimize transmits of multiple TX packets at
      a time.
      
      There are several areas of overhead that could be amortized with such
      schemes.  One has to do with locking and transactional overhead, the
      other has to do with device specific costs.
      
      This patch set here is more aimed at device specific costs.
      
      Typically a device queues up a packet in the TX queue and then has to
      do something to have the device start processing that new entry.
      Sometimes this is composed of doing an MMIO write to a "tail"
      register, and in other cases it can involve something as expensive as
      a hypervisor call.
      
      The basic setup defined here is that when the driver supports deferred
      TX queue flushing, ndo_start_xmit should no longer perform that
      operation.  Instead a new operation, ndo_xmit_flush, should do it.
      
      I have converted IGB and virtio_net as example initial users.  The IGB
      conversion is tested, virtio_net is not but it does compile :-)
      
      All ndo_start_xmit call sites have been abstracted behind a new helper
      called netdev_start_xmit().
      
      This just adds the infrastructure; it does not yet add any
      instances of doing multiple ndo_start_xmit calls per
      ndo_xmit_flush invocation.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fe88e6dd
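
The split between queuing and flushing described in the merge message above can be sketched in userspace C. This is only an illustrative model under stated assumptions: `struct toy_txq`, `toy_start_xmit()`, `toy_xmit_flush()` and `toy_xmit_batch()` are hypothetical names, the "tail register" is a plain struct field, and nothing here is taken from the IGB or virtio_net conversions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 256

/* Hypothetical device model; not the IGB or virtio_net driver code. */
struct toy_txq {
	uint32_t ring[TOY_RING_SIZE]; /* descriptors, here just packet ids */
	unsigned int next_to_use;     /* producer index kept by the driver */
	unsigned int hw_tail;         /* last value written to the "tail" register */
	unsigned int doorbells;       /* counts the expensive device notifications */
};

/* Queue one packet. Deliberately does NOT notify the device. */
static void toy_start_xmit(struct toy_txq *q, uint32_t pkt)
{
	q->ring[q->next_to_use] = pkt;
	q->next_to_use = (q->next_to_use + 1) % TOY_RING_SIZE;
}

/* Publish everything queued so far with a single notification. */
static void toy_xmit_flush(struct toy_txq *q)
{
	if (q->hw_tail == q->next_to_use)
		return; /* nothing new to publish */
	q->hw_tail = q->next_to_use; /* stands in for the MMIO write or hypercall */
	q->doorbells++;
}

/*
 * Send a batch: n queue operations followed by one flush.
 * Assumes n < TOY_RING_SIZE; a full ring is not handled in this sketch.
 * Returns the total number of notifications issued so far.
 */
static unsigned int toy_xmit_batch(struct toy_txq *q, const uint32_t *pkts, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_start_xmit(q, pkts[i]);
	toy_xmit_flush(q);
	return q->doorbells;
}
```

In this model a batch of four packets costs one notification instead of four, which is the device-specific cost the series aims to amortize. The merged series itself still flushes once per packet.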
    • D
      c223a078
    • D
      igb: Support netdev_ops->ndo_xmit_flush() · c1ebf46c
      David S. Miller committed
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1ebf46c
    • D
      net: Add ops->ndo_xmit_flush() · 4798248e
      David S. Miller committed
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4798248e
    • I
      ipv6: White-space cleansing : gaps between function and symbol export · 4c83acbc
      Ian Morris committed
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      This patch removes some blank lines between the end of a function
      definition and the EXPORT_SYMBOL_GPL macro in order to prevent
      checkpatch warning that EXPORT_SYMBOL must immediately follow
      a function.
      Signed-off-by: Ian Morris <ipm@chirality.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4c83acbc
    • I
      ipv6: White-space cleansing : Structure layouts · cc24beca
      Ian Morris committed
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      This patch addresses structure definitions, specifically it cleanses the brace
      placement and replaces spaces with tabs in a few places.
      Signed-off-by: Ian Morris <ipm@chirality.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cc24beca
    • I
      ipv6: White-space cleansing : Line Layouts · 67ba4152
      Ian Morris committed
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      A number of items are addressed in this patch:
      * Multiple spaces converted to tabs
      * Spaces before tabs removed.
      * Spaces in pointer typing cleansed (char *)foo etc.
      * Remove space after sizeof
      * Ensure spacing around comparators such as if statements.
      Signed-off-by: Ian Morris <ipm@chirality.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      67ba4152
    • D
      net: ec_bhf: remove excessive debug messages · a9b0b2fa
      Darek Marcinkiewicz committed
      This cuts down the amount of debug information spat out by
      the driver.
      Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a9b0b2fa
    • D
      random32: improvements to prandom_bytes · a98406e2
      Daniel Borkmann committed
      This patch addresses a couple of minor items, mostly addressing
      prandom_bytes(): 1) prandom_bytes{,_state}() should use size_t
      for length arguments, 2) We can use put_unaligned() when filling
      the array instead of open coding it [ perhaps some archs will
      further benefit from their own arch specific implementation when
      GCC cannot make up for it ], 3) Fix a typo, 4) Better use unsigned
      int as type for getting the arch seed, 5) Make use of
      prandom_u32_max() for timer slack.
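
Item 5 picks a pseudo-random value below a bound. A minimal userspace sketch of the multiply-and-shift range reduction that a helper of this kind can use is shown here. `toy_u32_max()` is a hypothetical name, the random input is passed in explicitly, and this is an illustration rather than the kernel source.

```c
#include <stdint.h>

/*
 * Map a uniformly distributed 32-bit value rnd into the range [0, ep_ro).
 * The 64-bit product rnd * ep_ro is at most (2^32 - 1) * ep_ro, so its
 * upper 32 bits are always strictly below ep_ro. No modulo is needed.
 */
static uint32_t toy_u32_max(uint32_t rnd, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)rnd * ep_ro) >> 32);
}
```

With a bound of 10, an input of 0 maps to 0, 0x80000000 maps to 5, and 0xffffffff maps to 9.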
      
      Regarding the change to put_unaligned(), callers of prandom_bytes()
      which internally invoke prandom_bytes_state(), don't bother as
      they expect the array to be filled randomly and don't have any
      control of the internal state what-so-ever (that's also why we
      have periodic reseeding there, etc), so they really don't care.
      
      Now for the direct callers of prandom_bytes_state(), which
      are solely located in test cases for MTD devices, that is,
      drivers/mtd/tests/{oobtest.c,pagetest.c,subpagetest.c}:
      
      These tests basically fill a test write-vector through
      prandom_bytes_state() with an a-priori defined seed each time
      and write that to an MTD device. Later on, they set up a read-vector
      and read back those blocks from the device. So in the verification
      phase, the write-vector is being re-setup [ so same seed and
      prandom_bytes_state() called ], and then memcmp()'ed against the
      read-vector to check if the data is the same.
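
The fill-then-regenerate pattern described above can be sketched in userspace C. Everything here is an assumption made for illustration: the generator is a plain xorshift32 standing in for the kernel PRNG, `memcpy()` stands in for `put_unaligned()`, and `toy_rnd_state`, `toy_bytes_state()` and `toy_verify()` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in generator state (xorshift32); the seed must be non-zero. */
struct toy_rnd_state {
	uint32_t s;
};

static uint32_t toy_u32_state(struct toy_rnd_state *st)
{
	uint32_t x = st->s;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	st->s = x;
	return x;
}

/* Fill buf with len pseudo-random bytes; buf may have any alignment. */
static void toy_bytes_state(struct toy_rnd_state *st, void *buf, size_t len)
{
	uint8_t *p = buf;

	while (len >= sizeof(uint32_t)) {
		uint32_t r = toy_u32_state(st);

		memcpy(p, &r, sizeof(r)); /* userspace analogue of put_unaligned() */
		p += sizeof(r);
		len -= sizeof(r);
	}
	if (len > 0) {
		uint32_t r = toy_u32_state(st);

		/* Consume one more word for the 1..3 trailing bytes. */
		for (; len > 0; len--) {
			*p++ = (uint8_t)r;
			r >>= 8;
		}
	}
}

/*
 * Verification phase: regenerate the write-vector from the same seed and
 * memcmp() it against what was read back. Returns 1 on match, 0 otherwise.
 */
static int toy_verify(const void *readbuf, size_t len, uint32_t seed)
{
	uint8_t expect[512];
	struct toy_rnd_state st = { seed };

	if (len > sizeof(expect))
		return 0;
	toy_bytes_state(&st, expect, len);
	return memcmp(expect, readbuf, len) == 0;
}
```

The check only needs the same seed to give the same byte sequence on both passes, so it does not depend on any particular PRNG output.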
      
      Akinobu, Lothar and I also tested this patch and it runs through
      the 3 relevant MTD test cases w/o any errors on the nandsim device
      (simulator for MTD devs) for x86_64, ppc64, ARM (i.MX28, i.MX53
      and i.MX6):
      
        # modprobe nandsim first_id_byte=0x20 second_id_byte=0xac \
                           third_id_byte=0x00 fourth_id_byte=0x15
        # modprobe mtd_oobtest dev=0
        # modprobe mtd_pagetest dev=0
        # modprobe mtd_subpagetest dev=0
      
      We also don't have any users depending directly on a particular
      result of the PRNG (except the PRNG self-test itself), and that's
      just fine as it e.g. allowed us easily to do things like upgrading
      from taus88 to taus113.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Tested-by: Akinobu Mita <akinobu.mita@gmail.com>
      Tested-by: Lothar Waßmann <LW@KARO-electronics.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a98406e2
    • D
      Merge branch 'csums-next' · c1e60bd4
      David S. Miller committed
      Tom Herbert says:
      
      ====================
      net: Checksum offload changes - Part V
      
      I am working on overhauling RX checksum offload. Goals of this effort
      are:
      
      - Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
      - Preserve CHECKSUM_COMPLETE through encapsulation layers
      - Don't do skb_checksum more than once per packet
      - Unify GRO and non-GRO csum verification as much as possible
      - Unify the checksum functions (checksum_init)
      - Simplify code
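
The goals of preserving a full-packet sum through encapsulation and not summing the payload twice rest on a property of ones' complement arithmetic: the sum over the inner packet can be derived from the sum over the whole packet by subtracting the sum of the outer header. A minimal userspace sketch follows, assuming big-endian 16-bit words and an even-length outer header. The function names are hypothetical and are not the kernel's checksum helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t toy_csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum of len bytes; an odd trailing byte is zero-padded. */
static uint16_t toy_csum_bytes(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
		sum = (sum & 0xffff) + (sum >> 16); /* end-around carry */
	}
	if (i < len)
		sum += (uint32_t)data[i] << 8;
	return toy_csum_fold(sum);
}

/* Ones' complement subtraction: a - b is a + ~b. */
static uint16_t toy_csum_sub(uint16_t a, uint16_t b)
{
	return toy_csum_fold((uint32_t)a + (uint16_t)~b);
}

/*
 * Given the sum over the whole packet, derive the sum over everything
 * after an even-length outer header by reading only the header bytes.
 * Note: 0x0000 and 0xffff both represent zero in ones' complement, so a
 * caller comparing results should treat them as equal.
 */
static uint16_t toy_csum_pull(uint16_t whole, const uint8_t *pkt, size_t hdr_len)
{
	return toy_csum_sub(whole, toy_csum_bytes(pkt, hdr_len));
}
```

Only the header bytes are read again when an encapsulation layer is pulled; the payload is summed once.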
      
      What is in this fifth patch set:
      
      - Added GRO checksum validation functions
      - Call the GRO validation functions from TCP and GRE gro_receive
      - Perform checksum verification in the UDP gro_receive path using
        GRO functions and add support for gro_receive in UDP6
      
      Changes in V2:
      
      - Change ip_summed to CHECKSUM_UNNECESSARY instead of moving it
        to CHECKSUM_COMPLETE from GRO checksum validation. This avoids
        performance penalty in checksumming bytes which are before the header
        GRO is at.
      
      Please review carefully and test if possible, mucking with basic
      checksum functions is always a little precarious :-)
      
      ----
      
      Test results with this patch set are below. I did not notice any
      performance regression.
      
      Tests run:
         TCP_STREAM: super_netperf with 200 streams
         TCP_RR: super_netperf with 200 streams and -r 1,1
      
      Device bnx2x (10Gbps):
         No GRE RSS hash (RX interrupts occur on one core)
         UDP RSS port hashing enabled.
      
      * GRE with checksum with IPv4 encapsulated packets
        With fix:
          TCP_STREAM
              9.91% CPU utilization
              5163.78 Mbps
          TCP_RR
              50.64% CPU utilization
              219/347/502 90/95/99% latencies
              834103 tps
        Without fix:
          TCP_STREAM
              10.05% CPU utilization
              5186.22 Mbps
          TCP_RR
              49.70% CPU utilization
              227/338/486 90/95/99% latencies
              813450 tps
      
      * GRE without checksum with IPv4 encapsulated packets
        With fix:
          TCP_STREAM
              10.18% CPU utilization
              5159 Mbps
          TCP_RR
              51.86% CPU utilization
              214/325/471 90/95/99% latencies
              865943 tps
        Without fix:
          TCP_STREAM
              10.26% CPU utilization
              5307.87 Mbps
          TCP_RR
              50.59% CPU utilization
              224/325/476 90/95/99% latencies
              846429 tps
      
      *** Simulate device returns CHECKSUM_COMPLETE
      
      * VXLAN with checksum
        With fix:
          TCP_STREAM
              13.03% CPU utilization
              9093.9 Mbps
          TCP_RR
              95.96% CPU utilization
              161/259/474 90/95/99% latencies
              1.14806e+06 tps
        Without fix:
          TCP_STREAM
              13.59% CPU utilization
              9093.97 Mbps
          TCP_RR
              93.95% CPU utilization
              160/259/484 90/95/99% latencies
              1.10262e+06 tps
      
      * VXLAN without checksum
        With fix:
          TCP_STREAM
              13.28% CPU utilization
              9093.87 Mbps
          TCP_RR
              95.04% CPU utilization
              155/246/439 90/95/99% latencies
              1.15e+06 tps
        Without fix:
          TCP_STREAM
              13.37% CPU utilization
              9178.45 Mbps
          TCP_RR
              93.74% CPU utilization
              161/257/469 90/95/99% latencies
              1.1068e+06 tps
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1e60bd4