1. 25 Feb, 2014 31 commits
    • D
      Merge branch 'gianfar' · 432c5b3a
      David S. Miller committed
      Claudiu Manoil says:
      
      ====================
      gianfar: Device reset and reconfig fixes
      
      These patches end up fixing some notable device reset & reconfig
      related problems.  One issue is on-the-fly (Rx/Tx on) programming
      of interrupt coalescing (IC) registers on the processing path,
      against HW recommendation.  This is an old issue that became visible
      after BQL introduction, as under certain conditions (low traffic)
      one TX interrupt gets lost and BQL fires Tx timeout as a result.
      Another notable issue is a race on the Tx path (xmit, clean_tx)
      during device reset (i.e. during Tx timeout watchdog firing)
      that leads to NULL access.
      Fixing the problematic on-the-fly register writes (i.e. the IC regs)
      required the implementation of a MAC soft reset procedure.
      The race leading to NULL access was addressed by fixing the
      stop_gfar()/startup_gfar() pair (disable/enable napi a.s.o.)
      and adding the device state DOWN to sync with the TX path.
      
      v2: Refactored if() clauses from gfar_set_features(), PATCH 2.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      432c5b3a
    • C
      gianfar: Fix Tx int miss, dont write IC on-the-fly · f19015ba
      Claudiu Manoil committed
      Programming the interrupt coalescing (IC) registers while
      the controller/DMA is on may incur the loss of one Tx
      confirmation interrupt, under certain conditions.  This is
      a subtle hw race because it does not occur during a burst
      of Tx packets.  It has been observed on p2020 devices that,
      if just one packet is being xmit'ed, the Tx confirmation
      doesn't trigger and BQL eventually blocks the Tx queues,
      followed by Tx timeout and an un-responsive device.
      This issue was not apparent prior to introducing BQL
      support, as a late Tx confirmation was not an issue back then
      and the next burst of Tx frames would have triggered the
      Tx confirmation/ Tx ring cleanup anyway.
      
      Bottom line, the hw specifications state that the IC registers
      should not be programmed while the Rx/Tx blocks (the DMA) are
      enabled. Furthermore, these registers are currently re-written
      with the same values on the processing path, over and over again.
      To fix this, rewriting the IC registers has been removed from
      the processing path (napi poll).  A complete MAC reset procedure
      has been implemented for the ethtool -c option instead, to
      reliably update these registers while the controller is stopped.
      Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f19015ba
    • C
      gianfar: Fix device reset races (oops) for Tx · 0851133b
      Claudiu Manoil committed
      The device reset procedure, stop_gfar()/startup_gfar(), has
      concurrency issues.
      "Kernel access of bad area" oopses show up during Tx timeout
      device reset or other reset cases (like changing MTU) that
      happen while the interface still has traffic. The oopses
      happen in start_xmit and clean_tx_ring when accessing tx_queue->
      tx_skbuff which is NULL. The race comes from de-allocating the
      tx_skbuff while transmission and napi processing are still
      active. Though the Tx queues get temporarily stopped when Tx
      timeout occurs, they get re-enabled as a result of Tx congestion
      handling inside the napi context (see clean_tx_ring()). Not
      disabling the napi during reset is also a bug, because
      clean_tx_ring() will try to access tx_skbuff while it is being
      de-alloc'ed and re-alloc'ed.
      
      To fix this, stop_gfar() needs to disable napi processing
      after stopping the Tx queues. However, in order to prevent
      clean_tx_ring() from re-enabling the Tx queue before the napi
      gets disabled, the device state DOWN has been introduced.
      It prevents the Tx congestion management from re-enabling the
      de-congested Tx queue while the device is brought down.
      An additional locking state, RESETTING, has been introduced
      to prevent simultaneous resets or to prevent configuring the
      device while it is resetting.
      The bogus 'rxlock's (for each Rx queue) have been removed since
      their purpose is not justified, as they don't prevent nor are
      suited to prevent device reset/reconfig races (such as this one).
      Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0851133b
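The DOWN and RESETTING states described in the commit above can be pictured with a small user-space sketch. This is not the driver's code: the struct, the function names, and the bit values are hypothetical, and C11 atomics stand in for the kernel's bit operations. It only illustrates the two rules from the commit message: the Tx cleanup path must not wake a queue while the device is DOWN, and only one caller at a time may hold RESETTING.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits, named after the states in the commit message. */
enum {
    STATE_DOWN      = 1u << 0,
    STATE_RESETTING = 1u << 1
};

/* Hypothetical per-device state word. */
struct dev_state {
    atomic_uint flags;
};

/* Try to start a reset. Returns true only for the caller that set the bit. */
static bool reset_begin(struct dev_state *s)
{
    unsigned int old = atomic_fetch_or(&s->flags, STATE_RESETTING);
    return !(old & STATE_RESETTING);
}

/* Finish a reset that was started with reset_begin(). */
static void reset_end(struct dev_state *s)
{
    unsigned int old = atomic_fetch_and(&s->flags, ~(unsigned int)STATE_RESETTING);
    assert(old & STATE_RESETTING);  /* must pair with a successful reset_begin() */
}

/* Mark the device as going down, before the Tx queues and napi are stopped. */
static void mark_down(struct dev_state *s)
{
    atomic_fetch_or(&s->flags, STATE_DOWN);
}

/* Mark the device as up again once it has been restarted. */
static void mark_up(struct dev_state *s)
{
    atomic_fetch_and(&s->flags, ~(unsigned int)STATE_DOWN);
}

/* Tx cleanup path: a de-congested queue may be woken only if not DOWN. */
static bool may_wake_tx_queue(struct dev_state *s)
{
    return !(atomic_load(&s->flags) & STATE_DOWN);
}
```

In this sketch a reset path would call reset_begin(), then mark_down(), tear down and rebuild its rings, call mark_up(), and finally reset_end(); a second caller that sees reset_begin() return false backs off instead of resetting concurrently.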
    • C
      gianfar: Don't free/request irqs on device reset · 80ec396c
      Claudiu Manoil committed
      Resetting the device (stop_gfar()/startup_gfar()) should
      be fast and to the point, in order to timely recover
      from an error condition (like Tx timeout) or during
      device reconfig.  The irq free/ request routines are just
      redundant here, and they should be part of the device
      close/ open routines instead.
      Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      80ec396c
    • C
      gianfar: Fix on-the-fly vlan and mtu updates · 88302648
      Claudiu Manoil committed
      The RCTRL and TCTRL registers should not be changed
      on-the-fly, while the controller is running, otherwise
      unexpected behaviour occurs.  But that's exactly what
      gfar_vlan_mode() does, updating the VLAN acceleration
      bits inside RCTRL/TCTRL.  The attempt to lock these
      operations doesn't help, but only adds to the confusion.
      There's also a dependency for Rx FCB insertion (activating
      /de-activating the TOE offload block on Rx) which might
      change the required rx buffer size.  This makes matters
      worse as gfar_vlan_mode() ends up calling gfar_change_mtu(),
      though the MTU size remains the same.  Note that there are
      other situations that may affect the required rx buffer size,
      like changing RXCSUM or rx hw timestamping, but erroneously
      the rx buffer size is not recomputed/ updated in the process.
      
      To fix this, do the vlan updates properly inside the MAC
      reset and reconfiguration procedure, which takes care of
      the rx buffer size dependency and the rx TOE block (PRSDEP)
      activation/deactivation as well (in the correct order).
      As a consequence, MTU/ rx buff size updates are done now
      by the same MAC reset and reconfig procedure, so that out
      of context updates to MAXFRM, MRBLR, and MACCFG inside
      change_mtu() are no longer needed.  The rx buffer size
      dependency on Rx FCB is now handled for the other cases too
      (RXCSUM and rx hw timestamping).
      Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88302648
    • C
      gianfar: Implement MAC reset and reconfig procedure · a328ac92
      Claudiu Manoil committed
      The main MAC config registers like: RCTRL/TCTRL, MRBLR,
      MAXFRM, RXIC/TXIC, most fields of MACCFG1/2, should not
      be changed on-the-fly, but at least after stopping the
      DMA and disabling the Rx/Tx blocks and, for increased
      reliability, after a MAC soft reset.
      
      Implement a complete MAC soft reset and reconfig procedure
      following the latest HW advisories - gfar_mac_reset() - to
      replace gfar_mac_init() and (the confusing) init_registers()
      functions.
      
      Factor out separate config functions for RCTRL and TCTRL,
      ensure programming order of the relevant config regs after
      MAC soft reset.
      
      Split gfar_hw_init() into gfar_mac_reset() and the remaining
      global regs that don't need to be reconfigured after MAC soft
      reset (FIFOCFG, ATTRELI, HW counters a.s.o).
      
      As gfar_hw_init() now makes all the register writes @probe()
      time, based on all the device flags and config options, it
      must be moved further down, just before register_netdev(),
      as the last config step when the config values are committed
      to HW.  Also, move netif_carrier_off() after register_netdev(),
      because it has no effect if called before.
      Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a328ac92
    • D
      225a9a25
    • F
      net: bcmgenet: Use devm_ioremap_resource() · 5343a10d
      Fabio Estevam committed
      According to Documentation/driver-model/devres.txt, devm_request_and_ioremap()
      is deprecated, so use devm_ioremap_resource() instead.
      Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5343a10d
    • J
      bridge: netfilter: Use ether_addr_copy · 04091142
      Joe Perches committed
      Convert the uses of memcpy to ether_addr_copy because
      for some architectures it is smaller and faster.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      04091142
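To illustrate why a fixed-size copy of a 6-byte address can be smaller and faster than a generic memcpy(), here is a user-space sketch. It is not the kernel helper: the union type and the function name are hypothetical. What it shares with ether_addr_copy is the precondition that both addresses are at least 2-byte aligned, which is what allows the copy to be done in 16-bit words.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_ALEN 6  /* length of an Ethernet (MAC) address in bytes */

/* Hypothetical container: the union forces 2-byte alignment on the bytes. */
union mac_addr {
    uint8_t  bytes[ETH_ALEN];
    uint16_t words[ETH_ALEN / 2];
};

static_assert(sizeof(union mac_addr) == ETH_ALEN, "no padding expected");

/* Copy one address as three 16-bit words instead of a byte-wise memcpy(). */
static void mac_addr_copy(union mac_addr *dst, const union mac_addr *src)
{
    dst->words[0] = src->words[0];
    dst->words[1] = src->words[1];
    dst->words[2] = src->words[2];
}
```

Because the length is known at compile time and the alignment is guaranteed by the type, a compiler can emit a few load/store pairs here with no length checks or function call.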
    • J
      bridge: Use ether_addr_copy and ETH_ALEN · e5a727f6
      Joe Perches committed
      Convert the more obvious uses of memcpy to ether_addr_copy.
      
      There are still uses of memcpy that could be converted but
      these addresses are __aligned(2).
      
      Convert a couple uses of 6 in br_private.h to ETH_ALEN.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e5a727f6
    • B
      cgxb4: Stop using ethtool SPEED_* constants · e8b39015
      Ben Hutchings committed
      ethtool speed values are just numbers of megabits and there is no need
      to add SPEED_40000.  To be consistent, use integer constants directly
      for all speeds.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e8b39015
    • D
      tools: bpf_dbg: various misc code cleanups · 7debf780
      Daniel Borkmann committed
      Let's clean up bpf_dbg a bit and improve its code slightly
      in various areas: i) Get rid of some macros as there's no
      good reason for keeping them, ii) remove one unused variable
      and reduce scope of various variables found by cppcheck,
      iii) Close non-default file descriptors when exiting the shell.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7debf780
    • D
      loopback: sctp: add NETIF_F_SCTP_CSUM to device features · b17c7069
      Daniel Borkmann committed
      Drivers are allowed to set NETIF_F_SCTP_CSUM if they have
      hardware crc32c checksumming support for the SCTP protocol.
      Currently, NETIF_F_SCTP_CSUM flag is available in igb,
      ixgbe, i40e/i40evf drivers and for vlan devices.
      
      If we don't have NETIF_F_SCTP_CSUM then crc32c is done
      through CPU instructions, invoked from crypto layer, or
      if not available as slow-path fallback in software.
      
      Currently, loopback device propagates checksum offloading
      feature flags in dev->features, but is missing SCTP checksum
      offloading. Therefore, account for NETIF_F_SCTP_CSUM as
      well.
      
      Before patch:
      
      ./netperf_sctp -H 192.168.0.100 -t SCTP_STREAM_MANY
      SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.100 () port 0 AF_INET
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
      4194304 4194304   4096    10.00    4683.50
      
      After patch:
      
      ./netperf_sctp -H 192.168.0.100 -t SCTP_STREAM_MANY
      SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.100 () port 0 AF_INET
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
      4194304 4194304   4096    10.00    15348.26
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b17c7069
    • M
      pktgen: document all supported flags · 72f8e06f
      Mathias Krause committed
      The documentation misses a few of the supported flags. Fix this. Also
      respect the dependency to CONFIG_XFRM for the IPSEC flag.
      
      Cc: Fan Du <fan.du@windriver.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      72f8e06f
    • M
      pktgen: simplify error handling in pgctrl_write() · 09455747
      Mathias Krause committed
      The 'out' label is just a relic from earlier times, when pgctrl_write()
      had multiple error paths. Get rid of it and simply return right away
      on errors.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      09455747
    • M
      pktgen: fix out-of-bounds access in pgctrl_write() · 20b0c718
      Mathias Krause committed
      If a privileged user writes an empty string to /proc/net/pktgen/pgctrl
      the code for stripping the (then non-existent) '\n' actually writes the
      zero byte at index -1 of data[]. The then still uninitialized array will
      very likely fail the command matching tests and the pr_warning() at the
      end will therefore leak stack bytes to the kernel log.
      
      Fix those issues by simply ensuring we're passed a non-empty string as
      the user API apparently expects a trailing '\n' for all commands.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      20b0c718
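The out-of-bounds write described above comes from terminating the buffer at index count - 1 when count is 0. A minimal user-space sketch of the guarded pattern follows. The helper name and signature are hypothetical, and a plain memcpy() stands in for the copy from user space that the real handler performs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a newline-terminated command into dst and
 * replace its last byte (expected to be '\n') with a terminating zero.
 * Returns 0 on success or -EINVAL for an empty write.
 */
static int copy_command(char *dst, size_t dst_size, const char *src, size_t count)
{
    assert(dst != NULL && src != NULL);

    /* Without this check, count == 0 would write to dst[-1] below. */
    if (count == 0 || dst_size == 0)
        return -EINVAL;

    /* Clamp overlong input to the buffer size. */
    if (count > dst_size)
        count = dst_size;

    memcpy(dst, src, count);
    dst[count - 1] = '\0';  /* strip the trailing '\n' (or truncate) */
    return 0;
}
```

Rejecting the empty write up front also means the buffer is never compared or logged while still uninitialized, which is the stack leak the commit message mentions.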
    • D
      Merge branch 'qlcnic-next' · 8bfdfbc1
      David S. Miller committed
      Shahed Shaikh says:
      
      ====================
      qlcnic: Re-factoring and enhancements
      
      This patch series includes following changes -
      * Re-factored firmware minidump template header handling
      * Support to make 8 vNIC mode application to work with 16 vNIC mode
      * Enhance error message logging when adapter is in failed state and
        when adapter lock access fails.
      * Allow vlan0 traffic
      * update MAINTAINERS
      
      Please apply this series to net-next.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8bfdfbc1
    • S
      Update MAINTAINERS for qlcnic driver · e6b0b019
      Shahed Shaikh committed
      Keep myself as only maintainer for qlcnic driver and update
      group email alias to Dept-HSGLinuxNICDev@qlogic.com
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6b0b019
    • S
      qlcnic: Update version to 5.3.56 · 3dd47056
      Shahed Shaikh committed
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3dd47056
    • H
    • R
      qlcnic: Allow vlan0 traffic · cecd59d8
      Rajesh Borundia committed
      o Adapter allows vlan0 traffic in case of SR-IOV after setting
        QLC_SRIOV_ALLOW_VLAN0 bit even though we do not add vlan0 filters.
      Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cecd59d8
    • S
    • J
      qlcnic: Updates to QLogic application/driver interface for virtual NIC configuration · d91abf90
      Jitendra Kalsaria committed
      The QLogic application interface in the driver, which supports
      configurations with more than 8 vNICs, has been updated to handle the following cases:
      
      o Only 8 or fewer vNICs in total were enabled within the vNIC 0-7 range
      o vNICs were enabled in the vNIC 0-15 range such that enabled vNICs were
        not contiguous and only 8 or fewer vNICs in total were enabled
      o Disconnect in the vNIC mapping between application and driver when the
        enabled vNICs were discontiguous
      Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d91abf90
    • S
      qlcnic: Re-factor firmware minidump template header handling · 225837a0
      Shahed Shaikh committed
      Treat firmware minidump template headers for 82xx and 83xx/84xx adapters separately,
      as they may change for the 82xx and 83xx/84xx adapter types independently.
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      225837a0
    • D
      Merge branch 'mlx4' · a1991c74
      David S. Miller committed
      Amir Vadai says:
      
      ====================
      net/mlx4: Mellanox driver update 01-01-2014
      
      This small patchset has a fix to a bogus usage of
      netif_get_num_default_rss_queues() in mlx4_en driver.
      
      Changes from V1:
      - Removed affinity_hint patch, to make it generic instead of mlx-specific
      
      Changes from V0:
      - Instead of reverting the netif_get_num_default_rss_queues() in mlx4_en,
        fixing it to limit the actual number of receive queues instead of limiting
        the number of IRQ's.
      
      Patchset was applied and tested against commit: cb6e926e "ipv6:fix checkpatch
      errors with assignment in if condition"
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a1991c74
    • I
      net/mlx4: Fix limiting number of IRQ's instead of RSS queues · bb2146bc
      Ido Shamay committed
      This fixes a performance bug introduced by commit 90b1ebe7 "mlx4: set
      maximal number of default RSS queues", which limits the number of IRQs
      opened by the core module.
      The limit should be on the number of queues in the indirection table -
      rx_rings, and not on the number of IRQs. Also, limiting in mlx4_core
      initialization instead of in mlx4_en prevented using "ethtool -L" to
      utilize all the CPUs when performance mode is preferred, since limiting
      this number to 8 reduces overall packet rate by 15%-50% in multiple TCP
      streams applications.
      
      For example, after running ethtool -L <ethx> rx 16
      
                Packet rate
      Before the fix  897799
      After the fix   1142070
      
      Results were obtained using netperf:
      
      S=200 ; ( for i in $(seq 1 $S) ; do ( \
        netperf -H 11.7.13.55 -t TCP_RR -l 30 &) ; \
        wait ; done | grep "1        1" | awk '{SUM+=$6} END {print SUM}' )
      
      CC: Yuval Mintz <yuvalmin@broadcom.com>
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bb2146bc
    • I
      net/mlx4: Set number of RX rings in a utility function · 02512482
      Ido Shamay committed
      mlx4_en_add() is too long.
      Move setting the number of RX rings to a utility function to improve
      readability and modularity of the code.
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      02512482
    • D
      bonding: remove no longer needed lock for bond_xxx_info_query() · 7a4ddcd9
      dingtianhong committed
      bond_xxx_info_query() is already called under RTNL, so there is no need
      to take the bond lock to protect the bond slave list. Remove it.
      
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Veaceslav Falico <vfalico@redhat.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7a4ddcd9
    • D
      bonding: use rcu_dereference() to access curr_active_slave · 4335d60e
      dingtianhong committed
      bond_info_show_master already runs in an RCU read-side critical
      section and accesses curr_active_slave without curr_slave_lock, so
      we cannot be sure that curr_active_slave stays unchanged during
      processing. Use RCU to protect the pointer.
      
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Veaceslav Falico <vfalico@redhat.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4335d60e
    • D
      bonding: netpoll: remove unwanted slave_dev_support_netpoll() · 82741808
      dingtianhong committed
      __netpoll_setup() checks the slave's flags and ndo_poll_controller just
      like slave_dev_support_netpoll() does, and slave_dev_support_netpoll() is
      not used anywhere, so remove it.
      
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Veaceslav Falico <vfalico@redhat.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      82741808
    • D
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next · 1f5a7407
      David S. Miller committed
      Steffen Klassert says:
      
      ====================
      1) Introduce skb_to_sgvec_nomark function to add further data to the sg list
         without calling sg_unmark_end first. Needed to add extended sequence
         number information. From Fan Du.
      
      2) Add IPsec extended sequence numbers support to the Authentication Header
         protocol for ipv4 and ipv6. From Fan Du.
      
      3) Make the IPsec flowcache namespace aware, from Fan Du.
      
      4) Avoid creating temporary SA for every packet when no key manager is
         registered. From Horia Geanta.
      
      5) Support filtering of SA dumps to show only the SAs that match a
         given filter. From Nicolas Dichtel.
      
      6) Remove caching of xfrm_policy_sk_bundles. The cached socket policy bundles
         are never used, instead we create a new cache entry whenever xfrm_lookup()
         is called on a socket policy. Most protocols cache the used routes to the
         socket, so this caching is not needed.
      
      7)  Fix a forgotten SADB_X_EXT_FILTER length check in pfkey, from Nicolas
          Dichtel.
      
      8) Cleanup error handling of xfrm_state_clone.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1f5a7407
  2. 22 Feb, 2014 9 commits