1. 23 Dec, 2014 4 commits
  2. 20 Dec, 2014 2 commits
    • G
      enic: fix rx skb checksum · 17e96834
      Govindarajulu Varadarajan committed
      The hardware always provides the complement of the IP pseudo checksum. The
      stack expects the whole-packet checksum without the pseudo checksum if
      CHECKSUM_COMPLETE is set.
      
      This causes checksum errors in nf & ovs.
      
      kernel: qg-19546f09-f2: hw csum failure
      kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF          O--------------   3.10.0-123.8.1.el7.x86_64 #1
      kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014
      kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b
      kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0
      kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00
      kernel: Call Trace:
      kernel: <IRQ>  [<ffffffff815e237b>] dump_stack+0x19/0x1b
      kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40
      kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70
      kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20
      kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100
      kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4]
      kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0
      kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch]
      kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack]
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4]
      kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380
      
      The hardware verifies the IP and TCP/UDP header checksums but does not provide
      a payload checksum, so use CHECKSUM_UNNECESSARY. Set it only if the packet is
      a valid IP TCP/UDP packet.
      
      Cc: Jiri Benc <jbenc@redhat.com>
      Cc: Stefan Assmann <sassmann@redhat.com>
      Reported-by: Sunil Choudhary <schoudha@redhat.com>
      Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
      Reviewed-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
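The decision described in this commit message can be sketched in userspace C. This is only an illustration under stated assumptions: `enum csum_state`, `struct rx_flags`, and `rx_csum_state()` are hypothetical names, not the enic driver's own code, and the enum merely stands in for the kernel's `ip_summed` values.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's ip_summed values. */
enum csum_state { CSUM_NONE, CSUM_UNNECESSARY };

/* Hypothetical summary of what the NIC reports for one received frame. */
struct rx_flags {
    bool is_ip;          /* hardware recognized an IP packet */
    bool is_tcp_or_udp;  /* hardware recognized TCP or UDP */
    bool ip_csum_ok;     /* IP header checksum verified */
    bool l4_csum_ok;     /* TCP/UDP checksum verified */
};

/*
 * Report "checksum already verified" only for valid IP TCP/UDP packets.
 * Everything else is left for the stack to verify in software.
 */
enum csum_state rx_csum_state(const struct rx_flags *f, bool rx_csum_enabled)
{
    if (rx_csum_enabled && f->is_ip && f->is_tcp_or_udp &&
        f->ip_csum_ok && f->l4_csum_ok)
        return CSUM_UNNECESSARY;
    return CSUM_NONE;
}
```

Falling back to `CSUM_NONE` is the conservative choice: the stack recomputes the checksum itself instead of trusting a value it would interpret differently.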
    • L
      sunvnet: fix a memory leak in vnet_handle_offloads · 4f2ff8ef
      Li RongQing committed
      When skb_gso_segment() returns an error, the original skb should be freed.
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Acked-by: David L Stevens <david.stevens@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
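The ownership rule behind this fix can be shown with a small userspace analogy. Everything here is hypothetical: `struct buf` stands in for an skb, `segment()` stands in for a segmenter that does not consume its input on failure, and returning NULL is a simplification of the kernel's error-pointer convention.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a packet buffer. */
struct buf { size_t len; };

static int live_bufs; /* number of buffers currently allocated */

struct buf *buf_alloc(size_t len)
{
    struct buf *b = malloc(sizeof(*b));
    if (b) {
        b->len = len;
        live_bufs++;
    }
    return b;
}

void buf_free(struct buf *b)
{
    if (b) {
        free(b);
        live_bufs--;
    }
}

/* Hypothetical segmenter: fails on empty input and never frees orig. */
struct buf *segment(const struct buf *orig)
{
    if (orig->len == 0)
        return NULL;
    return buf_alloc(orig->len);
}

/* The caller still owns orig on the error path, so it must free it. */
int handle_offload(struct buf *orig)
{
    struct buf *seg = segment(orig);

    if (!seg) {
        buf_free(orig); /* without this, orig would leak */
        return -1;
    }
    /* ... the segment would be transmitted here ... */
    buf_free(seg);
    buf_free(orig);
    return 0;
}
```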
  3. 19 Dec, 2014 3 commits
  4. 17 Dec, 2014 3 commits
    • O
      net: Disallow providing non zero VLAN ID for NIC drivers FDB add flow · 65891fea
      Or Gerlitz committed
      The current implementations all use dev_uc_add_excl() and similar functions,
      whose API doesn't support VLANs, so this can't be done with NIC hardware for now.
      
      Fixes: f6f6424b ('net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@resnulli.us>
      Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
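A minimal sketch of the kind of guard this commit describes follows. The function name `nic_fdb_add_check_vid()` is hypothetical, and returning `-EINVAL` is an assumption about the error code, which the message above does not state.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Reject FDB additions that carry a VLAN ID, since the underlying
 * unicast-address API has no notion of VLANs.
 * Returns 0 if the request may proceed, a negative errno otherwise.
 */
int nic_fdb_add_check_vid(uint16_t vid)
{
    if (vid != 0)
        return -EINVAL; /* assumed error code */
    return 0;
}
```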
    • I
      net/mlx4: Cache line CQE/EQE stride fixes · c3f2511f
      Ido Shamay committed
      This commit contains 2 fixes for the 128B CQE/EQE stride feature.
      Wei found that the mlx4_QUERY_HCA function marked the wrong capability
      in flags (64B CQE/EQE) when the CQE/EQE stride feature was enabled.
      Also added a small fix in the initial CQE ownership bit assignment, when
      the CQE size is not the default 32B.
      
      Fixes: 77507aa2 (net/mlx4: Enable CQE/EQE stride support)
      Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • N
      net: fec: Fix NAPI race · 94191fd6
      Nimrod Andy committed
      A camera capture test was run on an i.MX6q sabresd board, saving the captured
      data to an NFS rootfs. The command is:
      gst-launch-1.0 -e imxv4l2src device=/dev/video1 num-buffers=2592000 ! tee name=t !
      queue ! imxv4l2sink sync=false t. ! queue ! vpuenc ! queue ! mux. pulsesrc num-buffers=3720937
      blocksize=4096 ! 'audio/x-raw, rate=44100, channels=2' ! queue ! imxmp3enc ! mpegaudioparse !
      queue ! mux. qtmux name=mux ! filesink location=video_recording_long.mov
      
      After about 10 hours of running, a net watchdog timeout kernel dump appears:
      ...
      WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x2b4/0x2d8()
      NETDEV WATCHDOG: eth0 (fec): transmit queue 0 timed out
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.24-01051-gdb840b7 #440
      [<80014e6c>] (unwind_backtrace) from [<800118ac>] (show_stack+0x10/0x14)
      [<800118ac>] (show_stack) from [<806ae3f0>] (dump_stack+0x78/0xc0)
      [<806ae3f0>] (dump_stack) from [<8002b504>] (warn_slowpath_common+0x68/0x8c)
      [<8002b504>] (warn_slowpath_common) from [<8002b558>] (warn_slowpath_fmt+0x30/0x40)
      [<8002b558>] (warn_slowpath_fmt) from [<8055e0d4>] (dev_watchdog+0x2b4/0x2d8)
      [<8055e0d4>] (dev_watchdog) from [<800352d8>] (call_timer_fn.isra.33+0x24/0x8c)
      [<800352d8>] (call_timer_fn.isra.33) from [<800354c4>] (run_timer_softirq+0x184/0x220)
      [<800354c4>] (run_timer_softirq) from [<8002f420>] (__do_softirq+0xc0/0x22c)
      [<8002f420>] (__do_softirq) from [<8002f804>] (irq_exit+0xa8/0xf4)
      [<8002f804>] (irq_exit) from [<8000ee5c>] (handle_IRQ+0x54/0xb4)
      [<8000ee5c>] (handle_IRQ) from [<80008598>] (gic_handle_irq+0x28/0x5c)
      [<80008598>] (gic_handle_irq) from [<800123c0>] (__irq_svc+0x40/0x74)
      Exception stack(0x80d27f18 to 0x80d27f60)
      7f00:                                                       80d27f60 0000014c
      7f20: 8858c60e 0000004d 884e4540 0000004d ab7250d0 80d34348 00000000 00000000
      7f40: 00000001 00000000 00000017 80d27f60 800702a4 80476e6c 600f0013 ffffffff
      [<800123c0>] (__irq_svc) from [<80476e6c>] (cpuidle_enter_state+0x50/0xe0)
      [<80476e6c>] (cpuidle_enter_state) from [<80476fa8>] (cpuidle_idle_call+0xac/0x154)
      [<80476fa8>] (cpuidle_idle_call) from [<8000f174>] (arch_cpu_idle+0x8/0x44)
      [<8000f174>] (arch_cpu_idle) from [<80064c54>] (cpu_startup_entry+0x100/0x158)
      [<80064c54>] (cpu_startup_entry) from [<80cd8a9c>] (start_kernel+0x304/0x368)
      ---[ end trace 09ebd32fb032f86d ]---
      ...
      
      There might be a race in napi_schedule() that leaves interrupts disabled forever.
      With this patch, the test case still works after more than 40 hours of running.
      Signed-off-by: Fugang Duan <B38611@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 16 Dec, 2014 11 commits
  6. 13 Dec, 2014 6 commits
    • C
      net/macb: add TX multiqueue support for gem · 02c958dd
      Cyrille Pitchen committed
      gem devices designed with multiqueue CANNOT work without this patch.
      
      When probing a gem device, the driver must first prepare and enable the
      peripheral clock before accessing I/O registers. The second step is to read the
      MID register to find whether the device is a gem or an old macb IP.
      For gem devices, it reads the Design Configuration Register 6 (DCFG6) to
      compute the total number of queues, whereas macb devices always have a single
      queue.
      Only then can it call alloc_etherdev_mq() with the correct number of queues.
      This is the reason why the order of some initializations has been changed in
      macb_probe().
      Eventually, the dedicated IRQ and TX ring buffer descriptors are initialized
      for each queue.
      
      For backward compatibility reasons, queue0 uses the legacy registers ISR, IER,
      IDR, IMR, TBQP and RBQP. On the other hand, the other queues use new registers
      ISR[1..7], IER[1..7], IDR[1..7], IMR[1..7], TBQP[1..7] and RBQP[1..7].
      Except this hardware detail there is no real difference between queue0 and the
      others. The driver hides that thanks to the struct macb_queue.
      This structure allows us to share a common set of functions for all the queues.
      
      Besides, when a TX error occurs, the gem MUST be halted before writing any of
      the TBQP registers to reset the relevant queue. An immediate side effect is
      that the other queues too aren't processed anymore by the gem.
      So macb_tx_error_task() calls netif_tx_stop_all_queues() to notify the Linux
      network engine that all transmissions are stopped.
      
      Also macb_tx_error_task() now calls spin_lock_irqsave() to prevent the
      interrupt handlers of the other queues from running as each of them may wake
      its associated queue up (please refer to macb_tx_interrupt()).
      
      Finally, as all queues have previously been stopped, they should be restarted
      calling netif_tx_start_all_queues() and setting the TSTART bit into the Network
      Control Register. Before this patch, when dealing with a single queue, the
      driver used to defer the reset of the faulting queue and the write of the
      TSTART bit until the next call of macb_start_xmit().
      As explained before, this bit is now set by macb_tx_error_task() too. That's
      why the faulting queue MUST be reset by setting the TX_USED bit in its first
      buffer descriptor before writing the TSTART bit.
      
      Queue 0 always exists and is the lowest priority when other queues are available.
      The higher the index of the queue is, the higher its priority is.
      
      When transmitting frames, the TX queue is selected by the skb->queue_mapping
      value. So queue discipline can be used to define the queue priority policy.
      Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
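The queue-counting step mentioned in this commit message can be sketched as below. This assumes, without confirmation from the text above, that the value read from DCFG6 carries one bit per additional queue. `MAX_QUEUES` and `count_queues()` are hypothetical names, with the limit of 8 taken from the register ranges [1..7] listed in the message.

```c
#include <stdint.h>

#define MAX_QUEUES 8 /* queue 0 plus queues 1..7 */

/*
 * Count TX queues from a per-queue presence bitmask (assumed layout).
 * Queue 0 always exists, so bit 0 is ignored and the count starts at 1.
 */
unsigned int count_queues(uint32_t queue_bits)
{
    unsigned int n = 1;
    unsigned int q;

    for (q = 1; q < MAX_QUEUES; q++)
        if (queue_bits & (1u << q))
            n++;
    return n;
}
```

The result is the kind of value that would be passed to alloc_etherdev_mq(), which is why the register read has to happen before the net device is allocated.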
    • Q
      jme: replace calls to redundant function · 06f66529
      Quentin Lambert committed
      Calls to tasklet_hi_enable are replaced by calls to
      tasklet_enable since the 2 functions are redundant.
      Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
      Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      net: ethernet: davicom: Allow to select DM9000 for nios2 · a169758a
      Tobias Klauser committed
      This chip is present on older revisions of the DE2 development kit.
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      net: ethernet: smsc: Allow to select SMC91X for nios2 · 5499776b
      Tobias Klauser committed
      This chip is present on the Nios2 Development Kit 2C35.
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • H
      cxgb4: Add support for QSA modules · 40e9de4b
      Hariprasad Shenai committed
      Firmware 1.12.25.0 added support for QSA modules; add the driver code for it.
      Also fix some ethtool get settings for other module types.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • K
      cxgb4/cxgb4i: set the max. pdu length in firmware · 64bfead8
      Karen Xie committed
      Program the firmware with the maximum outgoing iSCSI PDU length per connection.
      Signed-off-by: Karen Xie <kxie@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 12 Dec, 2014 11 commits
    • D
      vio: create routines for inc,dec vio dring indexes · fe47c3c2
      Dwight Engen committed
      Both sunvdc and sunvnet implemented distinct functionality for incrementing
      and decrementing dring indexes. Create common functions for use by both
      from the sunvnet versions, which were chosen since they will still work
      correctly in case a non power of two ring size is used.
      Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
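Index helpers that stay correct for ring sizes that are not a power of two can be sketched as follows. The names `dring_next()` and `dring_prev()` are hypothetical and are not claimed to match the routines this commit adds. The point is that an explicit wrap comparison is used instead of masking with `size - 1`.

```c
#include <stdint.h>

/* Advance an index by one slot, wrapping at num_entries. */
uint32_t dring_next(uint32_t idx, uint32_t num_entries)
{
    if (++idx == num_entries)
        idx = 0;
    return idx;
}

/* Step an index back by one slot, wrapping to the last entry. */
uint32_t dring_prev(uint32_t idx, uint32_t num_entries)
{
    if (idx == 0)
        return num_entries - 1;
    return idx - 1;
}
```

A mask-based version such as `(idx + 1) & (num_entries - 1)` is only valid when `num_entries` is a power of two, which is why the comparison form is the safer common helper.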
    • C
      r8169:update rtl8168g pcie ephy parameter · 5fbea337
      Chun-Hao Lin committed
      Add ephy parameters to rtl8168g.
      Also change the common function of rtl8168g from "rtl_hw_start_8168g_1" to
      "rtl_hw_start_8168g". The function "rtl_hw_start_8168g_1" is used for
      setting rtl8168g hardware parameters.
      
      The following explains what the hardware parameter changes are for.
      rtl8168g may erroneously judge the PCIe signal quality and show the error bit
      in PCI configuration space when in PCIe low power mode.
      The following ephy parameters address this issue.
      { 0x00, 0x0000,	0x0008 }
      { 0x0c, 0x37d0,	0x0820 }
      { 0x1e, 0x0000,	0x0001 }
      
      rtl8168g may return to PCIe L0 from PCIe L0s low power mode too slowly.
      The following ephy parameter addresses this issue.
      { 0x19, 0x8000,	0x0000 }
      Signed-off-by: Chunhao Lin <hau@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      fm10k/igb/ixgbe: Use dma_rmb on Rx descriptor reads · 124b74c1
      Alexander Duyck committed
      This change makes it so that dma_rmb is used when reading the Rx
      descriptor.  The advantage of dma_rmb is that it allows for a much
      lower cost barrier on x86, powerpc, arm, and arm64 architectures than a
      traditional memory barrier when dealing with reads that only have to
      synchronize to coherent memory.
      
      In addition, I have updated the code so that it checks whether any bits have
      been set instead of just the DD bit. Since the DD bit will always be set as
      part of a descriptor write-back, we only need to check for a non-zero value
      at that memory location rather than for a specific bit. This makes the code
      cleaner and gives the compiler more room to optimize.
      
      Cc: Matthew Vick <matthew.vick@intel.com>
      Cc: Don Skidmore <donald.c.skidmore@intel.com>
      Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
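The read-side ordering described in this commit message can be approximated in portable C11. This is a userspace analogy, not driver code: `struct rx_desc` and `rx_desc_read()` are hypothetical, and an acquire fence is used as a stand-in for `dma_rmb()`, which is a kernel primitive with no direct userspace equivalent.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receive descriptor; status is written last by the producer. */
struct rx_desc {
    uint64_t addr;
    uint16_t length;
    _Atomic uint32_t status; /* non-zero once the descriptor is written back */
};

/*
 * Return false if the descriptor has not been written back yet.
 * Otherwise order the status check before the remaining field reads
 * and copy out the length.
 */
bool rx_desc_read(struct rx_desc *d, uint16_t *len_out)
{
    /* Any non-zero status means write-back happened; no specific bit is tested. */
    if (atomic_load_explicit(&d->status, memory_order_relaxed) == 0)
        return false;

    /* Stand-in for dma_rmb(): later reads must not be hoisted above the check. */
    atomic_thread_fence(memory_order_acquire);

    *len_out = d->length;
    return true;
}
```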
    • A
      r8169: Use dma_rmb() and dma_wmb() for DescOwn checks · a0750138
      Alexander Duyck committed
      The r8169 uses a pair of wmb() calls when setting up the descriptor rings.
      The first is to synchronize the descriptor data with the descriptor status,
      and the second is to synchronize the descriptor status with the use of the
      MMIO doorbell to notify the device that descriptors are ready.  This can
      come at a heavy price on some systems, and is not really necessary on
      systems such as x86 as a simple barrier() would suffice to order store/store
      accesses.  As such we can replace the first memory barrier with
      dma_wmb() to reduce the cost for these accesses.
      
      In addition the r8169 uses a rmb() to prevent compiler optimization in the
      cleanup paths, however by moving the barrier down a few lines and replacing
      it with a dma_rmb() we should be able to use it to guarantee
      descriptor accesses do not occur until the device has updated the DescOwn
      bit from its end.
      
      One last change I made is to move the update of cur_tx in the xmit path to
      after the wmb.  This way we can guarantee the device and all CPUs should
      see the DescOwn update before they see the cur_tx value update.
      
      Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
      Cc: Francois Romieu <romieu@fr.zoreil.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
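The write-side ordering (descriptor data first, ownership bit second) can be sketched in C11 as a userspace analogy. `struct tx_desc`, `DESC_OWN`, and `tx_desc_publish()` are hypothetical names and values. The release fence stands in for `dma_wmb()`. The second barrier before the MMIO doorbell has no userspace counterpart and is not modeled here.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DESC_OWN 0x80000000u /* hypothetical "owned by device" bit */

/* Hypothetical transmit descriptor. */
struct tx_desc {
    uint64_t addr;
    uint32_t len;
    _Atomic uint32_t opts; /* carries the ownership bit */
};

/* Fill in the descriptor, then hand it to the consumer. */
void tx_desc_publish(struct tx_desc *d, uint64_t addr, uint32_t len)
{
    d->addr = addr;
    d->len = len;

    /* Stand-in for dma_wmb(): data stores must be visible before the status. */
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&d->opts, DESC_OWN, memory_order_relaxed);
}
```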
    • C
    • M
      net/mlx4: Add support for A0 steering · 7d077cd3
      Matan Barak committed
      Add the required firmware commands for A0 steering and a way to enable
      that. The firmware support focuses on INIT_HCA, QUERY_HCA, QUERY_PORT,
      QUERY_DEV_CAP and QUERY_FUNC_CAP commands. Those commands are used
      to configure and query the device.
      
      The different A0 DMFS (steering) modes are:
      
      Static - optimized performance, but flow steering rules are
      limited. This mode must be chosen explicitly by the user
      in order to be used.
      
      Dynamic - this mode must be chosen explicitly by the user.
      In this mode, the FW works in optimized steering mode as long as
      it can and afterwards automatically drops to classic (full) DMFS.
      
      Disable - this mode must be chosen explicitly by the user.
      The user instructs the system not to use optimized steering, even if
      the FW supports Dynamic A0 DMFS (and thus will be able to use optimized
      steering in Default A0 DMFS mode).
      
      Default - this mode is chosen implicitly. In this mode, if the FW
      supports Dynamic A0 DMFS, it'll work in this mode. Otherwise, it'll
      work in Disable A0 DMFS mode.
      
      Under SRIOV configuration, when the A0 steering mode is enabled,
      older guest VF drivers that aren't using the RX QP allocation flag
      (MLX4_RESERVE_A0_QP) will get a QP from the general range and
      fail when attempting to register a steering rule. To avoid that,
      the PF context behaviour is changed once on A0 static mode, to
      require support for the allocation flag in VF drivers too.
      
      In order to enable A0 steering, we use log_num_mgm_entry_size param.
      If the value of the parameter is not positive, we treat the absolute
      value of log_num_mgm_entry_size as a bit field. Setting bit 2 of this
      bit field enables static A0 steering.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
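The parameter encoding described in the last paragraph of this commit message can be sketched as below. `A0_STATIC_BIT` and `a0_static_requested()` are hypothetical names. Only the rule stated in the message is modeled: a non-positive value is read as a bit field through its absolute value, and bit 2 selects static A0 steering.

```c
#include <stdbool.h>

#define A0_STATIC_BIT (1u << 2) /* bit 2 of the bit field */

/*
 * Decide whether static A0 steering was requested.
 * Positive values are ordinary sizes and never enable it.
 */
bool a0_static_requested(int log_num_mgm_entry_size)
{
    unsigned int flags;

    if (log_num_mgm_entry_size > 0)
        return false;

    /* Absolute value computed in unsigned arithmetic to avoid overflow. */
    flags = 0u - (unsigned int)log_num_mgm_entry_size;
    return (flags & A0_STATIC_BIT) != 0;
}
```

For example, a value of -4 has an absolute value of 4 (binary 100), so bit 2 is set, while -3 (binary 011) leaves it clear.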
    • M
      net/mlx4: Refactor QUERY_PORT · 431df8c7
      Matan Barak committed
      Currently QUERY_PORT is done as a part of QUERY_DEV_CAP firmware command.
      
      Since we would like to use it without querying all device capabilities,
      extract this part to be a function of its own.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net/mlx4_core: Add explicit error message when rule doesn't meet configuration · 579d059b
      Matan Barak committed
      When a given flow steering rule is invalid with respect to the current
      steering configuration, print the correct error message to the system log.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net/mlx4: Add A0 hybrid steering · d57febe1
      Matan Barak committed
      A0 hybrid steering is a form of high performance flow steering.
      By using this mode, mlx4 cards use a fast limited table based steering,
      in order to enable fast steering of unicast packets to a QP.
      
      In order to implement A0 hybrid steering we allocate resources
      from different zones:
      (1) General range
      (2) Special MAC-assigned QPs [RSS, Raw-Ethernet] each has its own region.
      
      When we create an RSS QP or a raw ethernet (A0 steerable and BF ready) QP,
      we try hard to allocate the QP from range (2). Otherwise, we try hard not
      to allocate from this range. However, when the system is pushed to its
      limits and one needs every resource, the allocator uses every region it can.
      
      Meaning, when we run out of raw-eth qps, the allocator allocates from the
      general range (and the special-A0 area is no longer active). If we run out
      of RSS qps, the mechanism tries to allocate from the raw-eth QP zone. If that
      is also exhausted, the allocator will allocate from the general range
      (and the A0 region is no longer active).
      
      Note that if a raw-eth qp is allocated from the general range, it attempts
      to allocate the range such that bits 6 and 7 (blueflame bits) in the
      QP number are not set.
      
      When the feature is used in SRIOV, the VF has to notify the PF what
      kind of QP attributes it needs. In order to do that, along with the
      "Eth QP blueflame" bit, we reserve a new "A0 steerable QP". According
      to the combination of these bits, the PF tries to allocate a suitable QP.
      
      In order to maintain backward compatibility (with older PFs), the PF
      notifies which QP attributes it supports via QUERY_FUNC_CAP command.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
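The blueflame constraint mentioned in this commit message (bits 6 and 7 of the QP number must not be set) reduces to a simple mask test. The names `BF_BITS_MASK` and `qpn_avoids_bf_bits()` are hypothetical and only illustrate the check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 6 and 7 of the QP number, per the commit message above. */
#define BF_BITS_MASK ((1u << 6) | (1u << 7))

/* True if neither blueflame bit is set in the QP number. */
bool qpn_avoids_bf_bits(uint32_t qpn)
{
    return (qpn & BF_BITS_MASK) == 0;
}
```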
    • M
      net/mlx4: Add mlx4_bitmap zone allocator · 7a89399f
      Matan Barak committed
      The zone allocator is a mechanism which manages a few mlx4_bitmaps.
      
      When allocating a resource, the user indicates the desired zone from
      which this resource should be allocated. If possible, the resource
      will be allocated from this zone. Otherwise, the resource will be
      allocated from a lower, equal, or higher priority zone, in that
      order, according to the desired zone's properties.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
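The fallback order described in this commit message can be illustrated with a toy allocator. This is a simplified sketch, not the mlx4 implementation: `struct zone`, `zone_alloc_one()`, and `zones_alloc()` are hypothetical, each zone is a single 32-slot bitmap, and every fallback direction is always permitted, whereas the message says the real allocator consults the desired zone's properties.

```c
#include <stdint.h>

/* Hypothetical zone: 32 slots tracked in one word. */
struct zone {
    uint32_t used;  /* bit i set means slot i is taken */
    uint32_t base;  /* resource ID of slot 0 */
    int priority;
};

/* Take the first free slot of a zone; return its ID or -1 if full. */
static int zone_alloc_one(struct zone *z)
{
    unsigned int bit;

    for (bit = 0; bit < 32; bit++) {
        if (!(z->used & (1u << bit))) {
            z->used |= 1u << bit;
            return (int)(z->base + bit);
        }
    }
    return -1;
}

/*
 * Allocate from the wanted zone if possible. Otherwise fall back to
 * lower-priority zones, then equal-priority zones, then higher-priority ones.
 */
int zones_alloc(struct zone *zones, int nzones, int wanted)
{
    int pass, i, id;

    id = zone_alloc_one(&zones[wanted]);
    if (id >= 0)
        return id;

    for (pass = 0; pass < 3; pass++) {
        for (i = 0; i < nzones; i++) {
            int diff = zones[i].priority - zones[wanted].priority;
            int match = (pass == 0 && diff < 0) ||
                        (pass == 1 && diff == 0) ||
                        (pass == 2 && diff > 0);

            if (i == wanted || !match)
                continue;
            id = zone_alloc_one(&zones[i]);
            if (id >= 0)
                return id;
        }
    }
    return -1; /* every zone is exhausted */
}
```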
    • D
      net/mlx4: Add a check if there are too many reserved QPs · ab256e5a
      Dotan Barak committed
      The number of reserved QPs is affected by both the firmware's and
      the driver's requirements. This patch adds a check that
      validates that this number is indeed feasible.
      Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>