1. 01 Sep, 2015: 26 commits
  2. 31 Aug, 2015: 14 commits
    • D
      ipv4: Fix 32-bit build. · 80ec1927
Committed by David S. Miller
         net/ipv4/af_inet.c: In function 'snmp_get_cpu_field64':
      >> net/ipv4/af_inet.c:1486:26: error: 'offt' undeclared (first use in this function)
            v = *(((u64 *)bhptr) + offt);
                                   ^
         net/ipv4/af_inet.c:1486:26: note: each undeclared identifier is reported only once for each function it appears in
         net/ipv4/af_inet.c: In function 'snmp_fold_field64':
      >> net/ipv4/af_inet.c:1499:39: error: 'offct' undeclared (first use in this function)
            res += snmp_get_cpu_field(mib, cpu, offct, syncp_offset);
                                                ^
      >> net/ipv4/af_inet.c:1499:10: error: too many arguments to function 'snmp_get_cpu_field'
            res += snmp_get_cpu_field(mib, cpu, offct, syncp_offset);
                   ^
         net/ipv4/af_inet.c:1455:5: note: declared here
          u64 snmp_get_cpu_field(void __percpu *mib, int cpu, int offt)
              ^
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • K
      netlink: rx mmap: fix POLLIN condition · 0ef70770
Committed by Ken-ichirou MATSUZAWA
poll() returns immediately after user space sets the kernel's current
frame (ring->head) to SKIP, even though there is no new frame.
Conversely, if all frames are VALID and the user-space program
unintentionally sets (only) the kernel's current frame to UNUSED and
then calls poll(), it will not return immediately even though VALID
frames exist.
      
To avoid situations like the above, I think we need to scan all frames
to find VALID frames at poll() time, just as netlink_alloc_skb() /
netlink_forward_ring() scan for an UNUSED frame at skb allocation.
Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'thunderx-features-fixes' · 793768f5
Committed by David S. Miller
      Aleksey Makarov says:
      
      ====================
      net: thunderx: New features and fixes
      
      v2:
        - The unused affinity_mask field of the structure cmp_queue
        has been deleted. (thanks to David Miller)
        - The unneeded initializers have been dropped. (thanks to Alexey Klimov)
        - The commit message "net: thunderx: Rework interrupt handling"
        has been fixed. (thanks to Alexey Klimov)
      ====================
Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      net: thunderx: Support for internal loopback mode · d77a2384
Committed by Sunil Goutham
Support for setting a VF's corresponding BGX LMAC in internal
loopback mode. This mode can be used for verifying basic HW
functionality such as packet I/O, RX checksum validation,
CQ/RBDR interrupts, stats, etc. It is useful when the DUT has no
external network connectivity.
      
      'loopback' mode can be enabled or disabled via ethtool.
      
Note: This feature is not supported when the number of enabled VFs is
greater than the number of physical interfaces, i.e. active BGX LMACs.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
net: thunderx: Support for up to 96 queues for a VF · 92dc8769
Committed by Sunil Goutham
This patch adds support for handling multiple qsets assigned to a
single VF, thereby increasing the number of queues from the earlier 8
to the maximum number of CPUs in the system, i.e. 48 queues on a
single-node and 96 on a dual-node system. The user has no option to
choose which Qsets/VFs are merged; upon request from a VF, the PF
assigns the next free Qsets as secondary qsets. To maintain current
behavior, the number of queues is kept at 8 by default, which can be
increased via ethtool.

If the user wants to unbind the NICVF driver from a secondary Qset, it
should be done after tearing down the primary VF's interface.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      net: thunderx: Rework interrupt handling · 39ad6eea
Committed by Sunil Goutham
Rework the interrupt handler to avoid checking the IRQ affinity of
CQ interrupts. Separate handlers are now registered for each IRQ,
including RBDR, and handlers are registered only for the IRQs actually
in use. Add nicvf_dump_intr_status() and use it in the irq handlers.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      net: thunderx: Support for HW VLAN stripping · aa2e259b
Committed by Sunil Goutham
This patch configures the HW to strip the 802.1Q header, if present,
from received packets. The stripped VLAN ID and TCI information is
passed on to software via CQE_RX. It also sets the netdev's
'vlan_features' so that other HW offload features can be used for
tagged packets.

This offload feature can be enabled or disabled via ethtool.

The network stack normally ignores RPS for 802.1Q packets, hence the
low throughput. With this offload enabled, throughput for tagged
packets will be almost the same as for untagged packets.
      
      Note: This patch doesn't enable HW VLAN insertion for transmit packets.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      net: thunderx: Receive hashing HW offload support · 38bb5d4f
Committed by Sunil Goutham
Adds support for receive hashing HW offload using the RSS_ALG
and RSS_TAG fields of the CQE_RX descriptor. Also removes the
dependency on a minimum receive-queue count to configure RSS, so that
a hash is always generated.

This hash is used by the RPS logic to distribute flows across multiple
CPUs. The offload can be disabled via ethtool.
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      net: thunderx: mailboxes: remove code duplication · 6051cba7
Committed by Sunil Goutham
      Use the nicvf_send_msg_to_pf() function in the mailbox code.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      net: thunderx: Add receive error stats reporting via ethtool · a2dc5ded
Committed by Sunil Goutham
Added ethtool support to dump receive-packet error statistics reported
in the CQE. Also includes some small fixes.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      net: thunderx: fix MAINTAINERS · 322e5cc5
Committed by Aleksey Makarov
      The liquidio and thunder drivers have different maintainers.
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'snmp-stat-aggregation' · ef34c0f6
Committed by David S. Miller
      Raghavendra K T says:
      
      ====================
      Optimize the snmp stat aggregation for large cpus
      
While creating 1000 containers, perf showed a lot of time spent in
snmp_fold_field on a large-CPU system.

This patch set improves matters by reordering the statistics gathering.
      
      Please note that similar overhead was also reported while creating
      veth pairs  https://lkml.org/lkml/2013/3/19/556
      
      Changes in V4:
       - remove 'item' variable and use IPSTATS_MIB_MAX to avoid sparse
         warning (Eric) also remove 'item' parameter (Joe)
       - add missing memset of padding.
      
      Changes in V3:
       - use memset to initialize temp buffer in leaf function. (David)
- use memcpy to copy the buffer data to stat instead of unaligned_put (Joe)
- Move buffer definition to leaf function __snmp6_fill_stats64() (Eric)
      Changes in V2:
       - Allocate the stat calculation buffer in stack. (Eric)
      
      Setup:
      160 cpu (20 core) baremetal powerpc system with 1TB memory
      
1000 docker containers were created with the command
docker run -itd ubuntu:15.04 /bin/bash in a loop
      
observation:
Docker container creation time linearly increased from around 1.6 sec
to 7.5 sec (at 1000 containers). perf data showed that creating veth
interfaces, via the code path below, was taking more time.
      
      rtnl_fill_ifinfo
        -> inet6_fill_link_af
          -> inet6_fill_ifla6_attrs
            -> snmp_fold_field
      
proposed idea:
Currently __snmp6_fill_stats64 calls snmp_fold_field, which walks
through the per-cpu data of one item at a time (iteratively, for
around 36 items). The patch instead aggregates the statistics by
going through all the items of each cpu sequentially, which reduces
cache misses.
      
Docker creation performance improved by more than 2x after the patch.
      
      before the patch:
      ================
      3f45ba571a42e925c4ec4aaee0e48d7610a9ed82a4c931f83324d41822cf6617
      real	0m6.836s
      user	0m0.095s
      sys	0m0.011s
      
      perf record -a docker run -itd  ubuntu:15.04  /bin/bash
      =======================================================
          50.73%  docker           [kernel.kallsyms]       [k] snmp_fold_field
           9.07%  swapper          [kernel.kallsyms]       [k] snooze_loop
           3.49%  docker           [kernel.kallsyms]       [k] veth_stats_one
           2.85%  swapper          [kernel.kallsyms]       [k] _raw_spin_lock
           1.37%  docker           docker                  [.] backtrace_qsort
           1.31%  docker           docker                  [.] strings.FieldsFunc
      
        cache-misses:  2.7%
      
      after the patch:
      =============
      9178273e9df399c8290b6c196e4aef9273be2876225f63b14a60cf97eacfafb5
      real	0m3.249s
      user	0m0.088s
      sys	0m0.020s
      
      perf record -a docker run -itd  ubuntu:15.04  /bin/bash
      =======================================================
          10.57%  docker           docker                [.] scanblock
           8.37%  swapper          [kernel.kallsyms]     [k] snooze_loop
           6.91%  docker           [kernel.kallsyms]     [k] snmp_get_cpu_field
           6.67%  docker           [kernel.kallsyms]     [k] veth_stats_one
           3.96%  docker           docker                [.] runtime_MSpan_Sweep
           2.47%  docker           docker                [.] strings.FieldsFunc
      
      cache-misses: 1.41 %
      
      Please let me know if you have suggestions/comments.
      Thanks Eric, Joe and David for the comments.
      ====================
Signed-off-by: David S. Miller <davem@davemloft.net>
    • R
      net: Optimize snmp stat aggregation by walking all the percpu data at once · a3a77372
Committed by Raghavendra K T
Docker container creation time linearly increased from around 1.6 sec
to 7.5 sec (at 1000 containers), and perf data showed 50% overhead in
snmp_fold_field.
      
reason: Currently __snmp6_fill_stats64 calls snmp_fold_field, which
walks through the per-cpu data of one item at a time (iteratively, for
around 36 items).

idea: This patch aggregates the statistics by going through all the
items of each cpu sequentially, which reduces cache misses.
      
      Docker creation got faster by more than 2x after the patch.
      
      Result:
                             Before           After
      Docker creation time   6.836s           3.25s
      cache miss             2.7%             1.41%
      
      perf before:
          50.73%  docker           [kernel.kallsyms]       [k] snmp_fold_field
           9.07%  swapper          [kernel.kallsyms]       [k] snooze_loop
           3.49%  docker           [kernel.kallsyms]       [k] veth_stats_one
           2.85%  swapper          [kernel.kallsyms]       [k] _raw_spin_lock
      
      perf after:
          10.57%  docker           docker                [.] scanblock
           8.37%  swapper          [kernel.kallsyms]     [k] snooze_loop
           6.91%  docker           [kernel.kallsyms]     [k] snmp_get_cpu_field
           6.67%  docker           [kernel.kallsyms]     [k] veth_stats_one
      
      changes/ideas suggested:
Using a buffer on the stack (Eric), usage of memset (David), using
memcpy in place of unaligned_put (Joe).
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
    • R