1. 02 9月, 2014 2 次提交
  2. 22 8月, 2014 1 次提交
  3. 13 8月, 2014 1 次提交
  4. 08 8月, 2014 1 次提交
    • A
      cxgb4: IEEE fixes for DCBx state machine · 10b00466
      Anish Bhatt 提交于
      * Changes required due to 16eecd9b ("dcbnl : Fix misleading
        dcb_app->priority explanation")
      * Driver was previously not aware of what DCBx version was negotiated by
        firmware, this could lead to DCB app table  in kernel or in firmware being
        populated wrong  since IEEE/CEE used different formats made clear by above
        mentioned commit
      * Driver was missing a couple of state transitions that could be caused
        by other drivers that use chelsio hardware, resulting in incorrect behaviour
        (the change that addresses this also flips the state machine to switch on
         state instead of transition, hope this is okay in current window)
      * Prio queue info & tsa is no longer thrown away
      
      v2: Print DCBx state transition messages only when debug is enabled
      Signed-off-by: NAnish Bhatt <anish@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10b00466
  5. 07 8月, 2014 1 次提交
  6. 06 8月, 2014 1 次提交
  7. 18 7月, 2014 1 次提交
  8. 16 7月, 2014 4 次提交
    • H
      cxgb4/iw_cxgb4: work request logging feature · 7730b4c7
      Hariprasad Shenai 提交于
      This commit enhances the iwarp driver to optionally keep a log of rdma
      work request timining data for kernel mode QPs.  If iw_cxgb4 module option
      c4iw_wr_log is set to non-zero, each work request is tracked and timing
      data maintained in a rolling log that is 4096 entries deep by default.
      Module option c4iw_wr_log_size_order allows specifing a log2 size to use
      instead of the default order of 12 (4096 entries). Both module options
      are read-only and must be passed in at module load time to set them. IE:
      
      modprobe iw_cxgb4 c4iw_wr_log=1 c4iw_wr_log_size_order=10
      
      The timing data is viewable via the iw_cxgb4 debugfs file "wr_log".
      Writing anything to this file will clear all the timing data.
      Data tracked includes:
      
      - The host time when the work request was posted, just before ringing
      the doorbell.  The host time when the completion was polled by the
      application.  This is also the time the log entry is created.  The delta
      of these two times is the amount of time took processing the work request.
      
      - The qid of the EQ used to post the work request.
      
      - The work request opcode.
      
      - The cqe wr_id field.  For sq completions requests this is the swsqe
      index.  For recv completions this is the MSN of the ingress SEND.
      This value can be used to match log entries from this log with firmware
      flowc event entries.
      
      - The sge timestamp value just before ringing the doorbell when
      posting,  the sge timestamp value just after polling the completion,
      and CQE.timestamp field from the completion itself.  With these three
      timestamps we can track the latency from post to poll, and the amount
      of time the completion resided in the CQ before being reaped by the
      application.  With debug firmware, the sge timestamp is also logged by
      firmware in its flowc history so that we can compute the latency from
      posting the work request until the firmware sees it.
      Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7730b4c7
    • H
      cxgb4/iw_cxgb4: display TPTE on errors · 031cf476
      Hariprasad Shenai 提交于
      With ingress WRITE or READ RESPONSE errors, HW provides the offending
      stag from the packet.  This patch adds logic to log the parsed TPTE
      in this case. cxgb4 now exports a function to read a TPTE entry
      from adapter memory.
      Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      031cf476
    • H
      cxgb4/iw_cxgb4: use firmware ord/ird resource limits · 4c2c5763
      Hariprasad Shenai 提交于
      Advertise a larger max read queue depth for qps, and gather the resource limits
      from fw and use them to avoid exhaustinq all the resources.
      
      Design:
      
      cxgb4:
      
      Obtain the max_ordird_qp and max_ird_adapter device params from FW
      at init time and pass them up to the ULDs when they attach.  If these
      parameters are not available, due to older firmware, then hard-code
      the values based on the known values for older firmware.
      iw_cxgb4:
      
      Fix the c4iw_query_device() to report these correct values based on
      adapter parameters.  ibv_query_device() will always return:
      
      max_qp_rd_atom = max_qp_init_rd_atom = min(module_max, max_ordird_qp)
      max_res_rd_atom = max_ird_adapter
      
      Bump up the per qp max module option to 32, allowing it to be increased
      by the user up to the device max of max_ordird_qp.  32 seems to be
      sufficient to maximize throughput for streaming read benchmarks.
      
      Fail connection setup if the negotiated IRD exhausts the available
      adapter ird resources.  So the driver will track the amount of ird
      resource in use and not send an RI_WR/INIT to FW that would reduce the
      available ird resources below zero.
      Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c2c5763
    • H
      iw_cxgb4: Detect Ing. Padding Boundary at run-time · 04e10e21
      Hariprasad Shenai 提交于
      Updates iw_cxgb4 to determine the Ingress Padding Boundary from
      cxgb4_lld_info, and take subsequent actions.
      Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04e10e21
  9. 03 7月, 2014 1 次提交
  10. 02 7月, 2014 4 次提交
  11. 25 6月, 2014 1 次提交
    • L
      cxgb4: Not need to hold the adap_rcu_lock lock when read adap_rcu_list · ee9a33b2
      Li RongQing 提交于
      cxgb4_netdev maybe lead to dead lock, since it uses a spin lock, and be called
      in both thread and softirq context, but not disable BH, the lockdep report is
      below; In fact, cxgb4_netdev only reads adap_rcu_list with RCU protection, so
      not need to hold spin lock again.
      	=================================
      	[ INFO: inconsistent lock state ]
      	3.14.7+ #24 Tainted: G         C O
      	---------------------------------
      	inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      	radvd/3794 [HC0[0]:SC1[1]:HE1:SE0] takes:
      	 (adap_rcu_lock){+.?...}, at: [<ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4]
      	{SOFTIRQ-ON-W} state was registered at:
      	  [<ffffffff810fca81>] __lock_acquire+0x34a/0xe48
      	  [<ffffffff810fd98b>] lock_acquire+0x82/0x9d
      	  [<ffffffff815d6ff8>] _raw_spin_lock+0x34/0x43
      	  [<ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4]
      	  [<ffffffffa0998beb>] cxgb4_inet6addr_handler+0x117/0x12c [cxgb4]
      	  [<ffffffff815da98b>] notifier_call_chain+0x32/0x5c
      	  [<ffffffff815da9f9>] __atomic_notifier_call_chain+0x44/0x6e
      	  [<ffffffff815daa32>] atomic_notifier_call_chain+0xf/0x11
      	  [<ffffffff815b1356>] inet6addr_notifier_call_chain+0x16/0x18
      	  [<ffffffffa01f72e5>] ipv6_add_addr+0x404/0x46e [ipv6]
      	  [<ffffffffa01f8df0>] addrconf_add_linklocal+0x5f/0x95 [ipv6]
      	  [<ffffffffa01fc3e9>] addrconf_notify+0x632/0x841 [ipv6]
      	  [<ffffffff815da98b>] notifier_call_chain+0x32/0x5c
      	  [<ffffffff810e09a1>] __raw_notifier_call_chain+0x9/0xb
      	  [<ffffffff810e09b2>] raw_notifier_call_chain+0xf/0x11
      	  [<ffffffff8151b3b7>] call_netdevice_notifiers_info+0x4e/0x56
      	  [<ffffffff8151b3d0>] call_netdevice_notifiers+0x11/0x13
      	  [<ffffffff8151c0a6>] netdev_state_change+0x1f/0x38
      	  [<ffffffff8152f004>] linkwatch_do_dev+0x3b/0x49
      	  [<ffffffff8152f184>] __linkwatch_run_queue+0x10b/0x144
      	  [<ffffffff8152f1dd>] linkwatch_event+0x20/0x27
      	  [<ffffffff810d7bc0>] process_one_work+0x1cb/0x2ee
      	  [<ffffffff810d7e3b>] worker_thread+0x12e/0x1fc
      	  [<ffffffff810dd391>] kthread+0xc4/0xcc
      	  [<ffffffff815dc48c>] ret_from_fork+0x7c/0xb0
      	irq event stamp: 3388
      	hardirqs last  enabled at (3388): [<ffffffff810c6c85>]
      	__local_bh_enable_ip+0xaa/0xd9
      	hardirqs last disabled at (3387): [<ffffffff810c6c2d>]
      	__local_bh_enable_ip+0x52/0xd9
      	softirqs last  enabled at (3288): [<ffffffffa01f1d5b>]
      	rcu_read_unlock_bh+0x0/0x2f [ipv6]
      	softirqs last disabled at (3289): [<ffffffff815ddafc>]
      	do_softirq_own_stack+0x1c/0x30
      
      	other info that might help us debug this:
      	 Possible unsafe locking scenario:
      
      	       CPU0
      	       ----
      	  lock(adap_rcu_lock);
      	  <Interrupt>
      	    lock(adap_rcu_lock);
      
      	 *** DEADLOCK ***
      
      	5 locks held by radvd/3794:
      	 #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffffa020b85a>]
      	rawv6_sendmsg+0x74b/0xa4d [ipv6]
      	 #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff8151ac6b>]
      	rcu_lock_acquire+0x0/0x29
      	 #2:  (rcu_read_lock){.+.+..}, at: [<ffffffffa01f4cca>]
      	rcu_lock_acquire.constprop.16+0x0/0x30 [ipv6]
      	 #3:  (rcu_read_lock){.+.+..}, at: [<ffffffff810e09b4>]
      	rcu_lock_acquire+0x0/0x29
      	 #4:  (rcu_read_lock){.+.+..}, at: [<ffffffffa0998782>]
      	rcu_lock_acquire.constprop.40+0x0/0x30 [cxgb4]
      
      	stack backtrace:
      	CPU: 7 PID: 3794 Comm: radvd Tainted: G         C O 3.14.7+ #24
      	Hardware name: Supermicro X7DBU/X7DBU, BIOS 6.00 12/03/2007
      	 ffffffff81f15990 ffff88012fdc36a8 ffffffff815d0016 0000000000000006
      	 ffff8800c80dc2a0 ffff88012fdc3708 ffffffff815cc727 0000000000000001
      	 0000000000000001 ffff880100000000 ffffffff81015b02 ffff8800c80dcb58
      	Call Trace:
      	 <IRQ>  [<ffffffff815d0016>] dump_stack+0x4e/0x71
      	 [<ffffffff815cc727>] print_usage_bug+0x1ec/0x1fd
      	 [<ffffffff81015b02>] ? save_stack_trace+0x27/0x44
      	 [<ffffffff810fbfaa>] ? check_usage_backwards+0xa0/0xa0
      	 [<ffffffff810fc640>] mark_lock+0x11b/0x212
      	 [<ffffffff810fca0b>] __lock_acquire+0x2d4/0xe48
      	 [<ffffffff810fbfaa>] ? check_usage_backwards+0xa0/0xa0
      	 [<ffffffff810fbff6>] ? check_usage_forwards+0x4c/0xa6
      	 [<ffffffff810c6c8a>] ? __local_bh_enable_ip+0xaf/0xd9
      	 [<ffffffff810fd98b>] lock_acquire+0x82/0x9d
      	 [<ffffffffa09989ea>] ? clip_add+0x2c/0x116 [cxgb4]
      	 [<ffffffffa0998782>] ? rcu_read_unlock+0x23/0x23 [cxgb4]
      	 [<ffffffff815d6ff8>] _raw_spin_lock+0x34/0x43
      	 [<ffffffffa09989ea>] ? clip_add+0x2c/0x116 [cxgb4]
      	 [<ffffffffa09987b0>] ? rcu_lock_acquire.constprop.40+0x2e/0x30 [cxgb4]
      	 [<ffffffffa0998782>] ? rcu_read_unlock+0x23/0x23 [cxgb4]
      	 [<ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4]
      	 [<ffffffffa0998beb>] cxgb4_inet6addr_handler+0x117/0x12c [cxgb4]
      	 [<ffffffff810fd99d>] ? lock_acquire+0x94/0x9d
      	 [<ffffffff810e09b4>] ? raw_notifier_call_chain+0x11/0x11
      	 [<ffffffff815da98b>] notifier_call_chain+0x32/0x5c
      	 [<ffffffff815da9f9>] __atomic_notifier_call_chain+0x44/0x6e
      	 [<ffffffff815daa32>] atomic_notifier_call_chain+0xf/0x11
      	 [<ffffffff815b1356>] inet6addr_notifier_call_chain+0x16/0x18
      	 [<ffffffffa01f72e5>] ipv6_add_addr+0x404/0x46e [ipv6]
      	 [<ffffffff810fde6a>] ? trace_hardirqs_on+0xd/0xf
      	 [<ffffffffa01fb634>] addrconf_prefix_rcv+0x385/0x6ea [ipv6]
      	 [<ffffffffa0207950>] ndisc_rcv+0x9d3/0xd76 [ipv6]
      	 [<ffffffffa020d536>] icmpv6_rcv+0x592/0x67b [ipv6]
      	 [<ffffffff810c6c85>] ? __local_bh_enable_ip+0xaa/0xd9
      	 [<ffffffff810c6c85>] ? __local_bh_enable_ip+0xaa/0xd9
      	 [<ffffffff810fd8dc>] ? lock_release+0x14e/0x17b
      	 [<ffffffffa020df97>] ? rcu_read_unlock+0x21/0x23 [ipv6]
      	 [<ffffffff8150df52>] ? rcu_read_unlock+0x23/0x23
      	 [<ffffffffa01f4ede>] ip6_input_finish+0x1e4/0x2fc [ipv6]
      	 [<ffffffffa01f540b>] ip6_input+0x33/0x38 [ipv6]
      	 [<ffffffffa01f5557>] ip6_mc_input+0x147/0x160 [ipv6]
      	 [<ffffffffa01f4ba3>] ip6_rcv_finish+0x7c/0x81 [ipv6]
      	 [<ffffffffa01f5397>] ipv6_rcv+0x3a1/0x3e2 [ipv6]
      	 [<ffffffff8151ef96>] __netif_receive_skb_core+0x4ab/0x511
      	 [<ffffffff810fdc94>] ? mark_held_locks+0x71/0x99
      	 [<ffffffff8151f0c0>] ? process_backlog+0x69/0x15e
      	 [<ffffffff8151f045>] __netif_receive_skb+0x49/0x5b
      	 [<ffffffff8151f0cf>] process_backlog+0x78/0x15e
      	 [<ffffffff8151f571>] ? net_rx_action+0x1a2/0x1cc
      	 [<ffffffff8151f47b>] net_rx_action+0xac/0x1cc
      	 [<ffffffff810c69b7>] ? __do_softirq+0xad/0x218
      	 [<ffffffff810c69ff>] __do_softirq+0xf5/0x218
      	 [<ffffffff815ddafc>] do_softirq_own_stack+0x1c/0x30
      	 <EOI>  [<ffffffff810c6bb6>] do_softirq+0x38/0x5d
      	 [<ffffffffa01f1d5b>] ? ip6_copy_metadata+0x156/0x156 [ipv6]
      	 [<ffffffff810c6c78>] __local_bh_enable_ip+0x9d/0xd9
      	 [<ffffffffa01f1d88>] rcu_read_unlock_bh+0x2d/0x2f [ipv6]
      	 [<ffffffffa01f28b4>] ip6_finish_output2+0x381/0x3d8 [ipv6]
      	 [<ffffffffa01f49ef>] ip6_finish_output+0x6e/0x73 [ipv6]
      	 [<ffffffffa01f4a70>] ip6_output+0x7c/0xa8 [ipv6]
      	 [<ffffffff815b1bfa>] dst_output+0x18/0x1c
      	 [<ffffffff815b1c9e>] ip6_local_out+0x1c/0x21
      	 [<ffffffffa01f2489>] ip6_push_pending_frames+0x37d/0x427 [ipv6]
      	 [<ffffffff81558af8>] ? skb_orphan+0x39/0x39
      	 [<ffffffffa020b85a>] ? rawv6_sendmsg+0x74b/0xa4d [ipv6]
      	 [<ffffffffa020ba51>] rawv6_sendmsg+0x942/0xa4d [ipv6]
      	 [<ffffffff81584cd2>] inet_sendmsg+0x3d/0x66
      	 [<ffffffff81508930>] __sock_sendmsg_nosec+0x25/0x27
      	 [<ffffffff8150b0d7>] sock_sendmsg+0x5a/0x7b
      	 [<ffffffff810fd8dc>] ? lock_release+0x14e/0x17b
      	 [<ffffffff8116d756>] ? might_fault+0x9e/0xa5
      	 [<ffffffff8116d70d>] ? might_fault+0x55/0xa5
      	 [<ffffffff81508cb1>] ? copy_from_user+0x2a/0x2c
      	 [<ffffffff8150b70c>] ___sys_sendmsg+0x226/0x2d9
      	 [<ffffffff810fcd25>] ? __lock_acquire+0x5ee/0xe48
      	 [<ffffffff810fde01>] ? trace_hardirqs_on_caller+0x145/0x1a1
      	 [<ffffffff8118efcb>] ? slab_free_hook.isra.71+0x50/0x59
      	 [<ffffffff8115c81f>] ? release_pages+0xbc/0x181
      	 [<ffffffff810fd99d>] ? lock_acquire+0x94/0x9d
      	 [<ffffffff81115e97>] ? read_seqcount_begin.constprop.25+0x73/0x90
      	 [<ffffffff8150c408>] __sys_sendmsg+0x3d/0x5b
      	 [<ffffffff8150c433>] SyS_sendmsg+0xd/0x19
      	 [<ffffffff815dc53d>] system_call_fastpath+0x1a/0x1f
      Reported-by: NBen Greear <greearb@candelatech.com>
      Cc: Casey Leedom <leedom@chelsio.com>
      Cc: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: NLi RongQing <roy.qing.li@gmail.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Acked-by: NCasey Leedom <leedom@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ee9a33b2
  12. 23 6月, 2014 2 次提交
  13. 11 6月, 2014 3 次提交
  14. 03 6月, 2014 1 次提交
  15. 14 5月, 2014 1 次提交
  16. 13 5月, 2014 2 次提交
  17. 01 5月, 2014 1 次提交
  18. 29 3月, 2014 1 次提交
  19. 27 3月, 2014 1 次提交
  20. 15 3月, 2014 1 次提交
    • S
      cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes · 05eb2389
      Steve Wise 提交于
      The current logic suffers from a slow response time to disable user DB
      usage, and also fails to avoid DB FIFO drops under heavy load. This commit
      fixes these deficiencies and makes the avoidance logic more optimal.
      This is done by more efficiently notifying the ULDs of potential DB
      problems, and implements a smoother flow control algorithm in iw_cxgb4,
      which is the ULD that puts the most load on the DB fifo.
      
      Design:
      
      cxgb4:
      
      Direct ULD callback from the DB FULL/DROP interrupt handler.  This allows
      the ULD to stop doing user DB writes as quickly as possible.
      
      While user DB usage is disabled, the LLD will accumulate DB write events
      for its queues.  Then once DB usage is reenabled, a single DB write is
      done for each queue with its accumulated write count.  This reduces the
      load put on the DB fifo when reenabling.
      
      iw_cxgb4:
      
      Instead of marking each qp to indicate DB writes are disabled, we create
      a device-global status page that each user process maps.  This allows
      iw_cxgb4 to only set this single bit to disable all DB writes for all
      user QPs vs traversing the idr of all the active QPs.  If the libcxgb4
      doesn't support this, then we fall back to the old approach of marking
      each QP.  Thus we allow the new driver to work with an older libcxgb4.
      
      When the LLD upcalls iw_cxgb4 indicating DB FULL, we disable all DB writes
      via the status page and transition the DB state to STOPPED.  As user
      processes see that DB writes are disabled, they call into iw_cxgb4
      to submit their DB write events.  Since the DB state is in STOPPED,
      the QP trying to write gets enqueued on a new DB "flow control" list.
      As subsequent DB writes are submitted for this flow controlled QP, the
      amount of writes are accumulated for each QP on the flow control list.
      So all the user QPs that are actively ringing the DB get put on this
      list and the number of writes they request are accumulated.
      
      When the LLD upcalls iw_cxgb4 indicating DB EMPTY, which is in a workq
      context, we change the DB state to FLOW_CONTROL, and begin resuming all
      the QPs that are on the flow control list.  This logic runs on until
      the flow control list is empty or we exit FLOW_CONTROL mode (due to
      a DB DROP upcall, for example).  QPs are removed from this list, and
      their accumulated DB write counts written to the DB FIFO.  Sets of QPs,
      called chunks in the code, are removed at one time. The chunk size is 64.
      So 64 QPs are resumed at a time, and before the next chunk is resumed, the
      logic waits (blocks) for the DB FIFO to drain.  This prevents resuming to
      quickly and overflowing the FIFO.  Once the flow control list is empty,
      the db state transitions back to NORMAL and user QPs are again allowed
      to write directly to the user DB register.
      
      The algorithm is designed such that if the DB write load is high enough,
      then all the DB writes get submitted by the kernel using this flow
      controlled approach to avoid DB drops.  As the load lightens though, we
      resume to normal DB writes directly by user applications.
      Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      05eb2389
  21. 26 2月, 2014 1 次提交
  22. 25 2月, 2014 1 次提交
  23. 19 2月, 2014 7 次提交