1. 22 Jul 2014, 2 commits
  2. 18 Jul 2014, 2 commits
  3. 16 Jul 2014, 4 commits
    • cxgb4/iw_cxgb4: work request logging feature · 7730b4c7
      Committed by Hariprasad Shenai
      This commit enhances the iwarp driver to optionally keep a log of rdma
      work request timing data for kernel mode QPs.  If the iw_cxgb4 module option
      c4iw_wr_log is set to non-zero, each work request is tracked and its timing
      data maintained in a rolling log that is 4096 entries deep by default.
      Module option c4iw_wr_log_size_order allows specifying a log2 size to use
      instead of the default order of 12 (4096 entries).  Both module options
      are read-only and must be passed in at module load time to set them, e.g.:
      
      modprobe iw_cxgb4 c4iw_wr_log=1 c4iw_wr_log_size_order=10
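
      As a rough illustration of how such read-only options are wired up (a
      sketch only; the default values and descriptions are assumptions, not
      copied from the patch), load-time-only parameters are typically declared
      with 0444 permissions:

        #include <linux/module.h>
        #include <linux/moduleparam.h>

        /* Sketch: the two options described above.  0444 makes them readable
         * in sysfs but not writable, so they can only be set on the modprobe
         * command line.
         */
        static int c4iw_wr_log;
        module_param(c4iw_wr_log, int, 0444);
        MODULE_PARM_DESC(c4iw_wr_log, "Enable work request timing log (default 0)");

        static int c4iw_wr_log_size_order = 12;
        module_param(c4iw_wr_log_size_order, int, 0444);
        MODULE_PARM_DESC(c4iw_wr_log_size_order,
                         "log2 of the number of timing log entries (default 12)");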
      
      The timing data is viewable via the iw_cxgb4 debugfs file "wr_log".
      Writing anything to this file will clear all the timing data.
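
      As a hedged sketch of that debugfs interface (handler names and the
      registration details are made up; only the read-to-dump / write-to-clear
      behaviour comes from the text above), the file could be wired up roughly
      like this:

        #include <linux/debugfs.h>
        #include <linux/fs.h>
        #include <linux/module.h>
        #include <linux/seq_file.h>

        static int wr_log_show(struct seq_file *s, void *unused)
        {
                /* walk the rolling log and print each valid entry */
                return 0;
        }

        static int wr_log_open(struct inode *inode, struct file *file)
        {
                return single_open(file, wr_log_show, inode->i_private);
        }

        /* Any write, regardless of content, clears the timing data. */
        static ssize_t wr_log_clear(struct file *file, const char __user *buf,
                                    size_t count, loff_t *pos)
        {
                /* memset the log array and reset the write index here */
                return count;
        }

        static const struct file_operations wr_log_debugfs_fops = {
                .owner   = THIS_MODULE,
                .open    = wr_log_open,
                .release = single_release,
                .read    = seq_read,
                .llseek  = seq_lseek,
                .write   = wr_log_clear,
        };

        /* Registered elsewhere with something like (root dir name assumed):
         * debugfs_create_file("wr_log", 0600, debugfs_root, dev,
         *                     &wr_log_debugfs_fops);
         */
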
      Data tracked includes the following (a sketch of a possible log entry
      layout follows the list):
      
      - The host time when the work request was posted, just before ringing
      the doorbell.  The host time when the completion was polled by the
      application.  This is also the time the log entry is created.  The delta
      of these two times is the amount of time taken to process the work request.
      
      - The qid of the EQ used to post the work request.
      
      - The work request opcode.
      
      - The cqe wr_id field.  For SQ completions this is the swsqe
      index.  For recv completions this is the MSN of the ingress SEND.
      This value can be used to match log entries from this log with firmware
      flowc event entries.
      
      - The sge timestamp value just before ringing the doorbell when
      posting,  the sge timestamp value just after polling the completion,
      and the CQE.timestamp field from the completion itself.  With these three
      timestamps we can track the latency from post to poll, and the amount
      of time the completion resided in the CQ before being reaped by the
      application.  With debug firmware, the sge timestamp is also logged by
      firmware in its flowc history so that we can compute the latency from
      posting the work request until the firmware sees it.
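
      For orientation, one entry in such a log might hold roughly the fields
      below; this is an illustrative sketch, and the struct and field names are
      assumptions rather than the driver's actual definitions:

        #include <linux/ktime.h>
        #include <linux/types.h>

        /* Hypothetical layout of one work-request timing log entry. */
        struct wr_log_entry_sketch {
                ktime_t post_host_time; /* host time just before ringing the doorbell */
                ktime_t poll_host_time; /* host time when the completion was polled   */
                u32     qid;            /* qid of the EQ the WR was posted to         */
                u8      opcode;         /* work request opcode                        */
                u64     wr_id;          /* swsqe index (SQ) or ingress SEND MSN (RQ)  */
                u64     post_sge_ts;    /* SGE timestamp just before the doorbell     */
                u64     poll_sge_ts;    /* SGE timestamp just after polling the CQ    */
                u64     cqe_ts;         /* CQE.timestamp from the completion itself   */
        };
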
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7730b4c7
    • cxgb4/iw_cxgb4: display TPTE on errors · 031cf476
      Committed by Hariprasad Shenai
      With ingress WRITE or READ RESPONSE errors, HW provides the offending
      stag from the packet.  This patch adds logic to log the parsed TPTE
      in this case. cxgb4 now exports a function to read a TPTE entry
      from adapter memory.
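
      As a minimal sketch of the idea (the helper name, its prototype and the
      TPTE size below are assumptions, not taken from the patch), the iwarp
      error path could fetch and dump the entry like this:

        #include <linux/netdevice.h>
        #include <linux/printk.h>

        /* Assumed prototype of the helper exported by cxgb4. */
        int cxgb4_read_tpte(struct net_device *dev, u32 stag, __be32 *tpte);

        /* Sketch: on an ingress WRITE/READ RESPONSE error, read the TPTE for
         * the offending stag from adapter memory and dump it to the log.
         */
        static void dump_err_tpte(struct net_device *ndev, u32 stag)
        {
                __be32 tpte[8];         /* assumed TPTE size: 32 bytes */
                int ret;

                ret = cxgb4_read_tpte(ndev, stag, tpte);
                if (ret) {
                        pr_err("%s: reading TPTE for stag 0x%x failed (%d)\n",
                               __func__, stag, ret);
                        return;
                }
                print_hex_dump(KERN_ERR, "tpte: ", DUMP_PREFIX_OFFSET, 16, 4,
                               tpte, sizeof(tpte), false);
        }
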
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      031cf476
    • cxgb4/iw_cxgb4: use firmware ord/ird resource limits · 4c2c5763
      Committed by Hariprasad Shenai
      Advertise a larger max read queue depth for QPs, and gather the resource limits
      from FW and use them to avoid exhausting all the resources.
      
      Design:
      
      cxgb4:
      
      Obtain the max_ordird_qp and max_ird_adapter device params from FW
      at init time and pass them up to the ULDs when they attach.  If these
      parameters are not available, due to older firmware, then hard-code
      the values based on the known values for older firmware.
      iw_cxgb4:
      
      Fix c4iw_query_device() to report the correct values based on the
      adapter parameters.  ibv_query_device() will always return:
      
      max_qp_rd_atom = max_qp_init_rd_atom = min(module_max, max_ordird_qp)
      max_res_rd_atom = max_ird_adapter
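
      In query_device terms this might be filled in roughly as follows; the
      helper below is a sketch, c4iw_max_read_depth stands for the per-QP
      module option, and the two limits are assumed to arrive from the LLD at
      attach time:

        #include <linux/kernel.h>
        #include <rdma/ib_verbs.h>

        static int c4iw_max_read_depth = 32;    /* assumed per-QP module option */

        /* Sketch: report read-queue depths from firmware-provided limits. */
        static void fill_rd_atom_limits(struct ib_device_attr *props,
                                        u32 max_ordird_qp, u32 max_ird_adapter)
        {
                props->max_qp_rd_atom = min_t(u32, c4iw_max_read_depth,
                                              max_ordird_qp);
                props->max_qp_init_rd_atom = props->max_qp_rd_atom;
                props->max_res_rd_atom = max_ird_adapter;
        }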
      
      Bump up the per qp max module option to 32, allowing it to be increased
      by the user up to the device max of max_ordird_qp.  32 seems to be
      sufficient to maximize throughput for streaming read benchmarks.
      
      Fail connection setup if the negotiated IRD would exhaust the available
      adapter ird resources.  The driver now tracks the amount of ird resource
      in use and will not send an RI_WR/INIT to FW that would reduce the
      available ird resources below zero.
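
      One way to picture that accounting (the struct, field names, locking and
      error code are assumptions; only the do-not-go-below-zero rule comes from
      the text):

        #include <linux/errno.h>
        #include <linux/spinlock.h>
        #include <linux/types.h>

        /* Hypothetical per-device state; real names may differ. */
        struct ird_account {
                spinlock_t lock;
                u32 avail_ird;          /* starts at max_ird_adapter */
        };

        /* Sketch: reserve adapter IRD for a new connection, refusing it if
         * the negotiated IRD would drive the available amount below zero.
         */
        static int reserve_ird(struct ird_account *acct, u32 ird)
        {
                int ret = 0;

                spin_lock_irq(&acct->lock);
                if (ird > acct->avail_ird)
                        ret = -ENOMEM;  /* fail setup rather than over-commit */
                else
                        acct->avail_ird -= ird;
                spin_unlock_irq(&acct->lock);
                return ret;
        }

      The RI_WR/INIT would then only be sent to FW when the reservation
      succeeds, with the IRD returned to the pool when the QP is torn down.
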
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4c2c5763
    • iw_cxgb4: Detect Ing. Padding Boundary at run-time · 04e10e21
      Committed by Hariprasad Shenai
      Updates iw_cxgb4 to determine the Ingress Padding Boundary from
      cxgb4_lld_info, and take subsequent actions.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      04e10e21
  4. 02 Jul 2014, 1 commit
  5. 11 Jun 2014, 4 commits
  6. 06 Jun 2014, 1 commit
    • RDMA/cxgb4: add missing padding at end of struct c4iw_alloc_ucontext_resp · b7dfa889
      Committed by Yann Droneaud
      The i386 ABI disagrees with most other ABIs regarding alignment of
      data types larger than 4 bytes: on most ABIs padding must be added at
      the end of the structure, while it is not required on i386.

      So on most ABIs struct c4iw_alloc_ucontext_resp gets implicitly padded
      to be aligned on an 8-byte multiple, while on i386 such padding is
      not added.
      
      The tool pahole can be used to find such implicit padding:
      
        $ pahole --anon_include \
                 --nested_anon_include \
                 --recursive \
                 --class_name c4iw_alloc_ucontext_resp \
                 drivers/infiniband/hw/cxgb4/iw_cxgb4.o
      
      Then, structure layout can be compared between i386 and x86_64:
      
        +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt   2014-03-28 11:43:05.547432195 +0100
        --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100
        @@ -2,9 +2,8 @@ struct c4iw_alloc_ucontext_resp {
                __u64                      status_page_key;      /*     0     8 */
                __u32                      status_page_size;     /*     8     4 */
      
        -       /* size: 12, cachelines: 1, members: 2 */
        -       /* last cacheline: 12 bytes */
        +       /* size: 16, cachelines: 1, members: 2 */
        +       /* padding: 4 */
        +       /* last cacheline: 16 bytes */
         };
      
      This ABI disagreement will make an x86_64 kernel try to write past the
      buffer provided by an i386 binary.
      
      Once boundary checks are implemented, the x86_64 kernel will refuse
      to write past the buffer provided by i386 userspace and the uverb will
      fail.
      
      If the structure is on a page boundary and the next page is not
      mapped, ib_copy_to_udata() will fail and the uverb will fail.
      
      Additionally, as reported by Dan Carpenter, without the implicit
      padding being properly cleared, an information leak would take place
      on most architectures.
      
      This patch adds explicit padding to struct c4iw_alloc_ucontext_resp
      and, like 92b0ca7c ("IB/mlx5: Fix stack info leak in
      mlx5_ib_alloc_ucontext()"), makes function c4iw_alloc_ucontext()
      not write this padding field to userspace.  This way, the x86_64 kernel
      will be able to write struct c4iw_alloc_ucontext_resp as expected by
      both unpatched and patched i386 libcxgb4.
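
      A minimal sketch of the shape of the fix (the field layout is taken from
      the pahole output above; the name of the reserved field and the exact
      copy call are assumptions):

        #include <linux/types.h>

        struct c4iw_alloc_ucontext_resp {
                __u64 status_page_key;
                __u32 status_page_size;
                __u32 reserved;         /* explicit padding, never copied out */
        };

        /* In c4iw_alloc_ucontext(): copy everything except the padding field,
         * so an i386 userspace buffer sized for the old 12-byte layout is
         * never overrun and no uninitialized kernel memory is leaked.
         */
        ret = ib_copy_to_udata(udata, &uresp,
                               sizeof(uresp) - sizeof(uresp.reserved));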
      
      Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
      Link: http://marc.info/?i=1395848977.3297.15.camel@localhost.localdomain
      Link: http://marc.info/?i=20140328082428.GH25192@mwanda
      Cc: <stable@vger.kernel.org>
      Fixes: 05eb2389 ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes")
      Reported-by: Yann Droneaud <ydroneaud@opteya.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Acked-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      b7dfa889
  7. 30 May 2014, 1 commit
    • RDMA/cxgb4: Add missing padding at end of struct c4iw_create_cq_resp · b6f04d3d
      Committed by Yann Droneaud
      The i386 ABI disagrees with most other ABIs regarding alignment of
      data types larger than 4 bytes: on most ABIs padding must be added at
      the end of the structure, while it is not required on i386.

      So on most ABIs struct c4iw_create_cq_resp gets implicitly padded
      to be aligned on an 8-byte multiple, while on i386 such padding
      is not added.
      
      The tool pahole can be used to find such implicit padding:
      
        $ pahole --anon_include \
                 --nested_anon_include \
                 --recursive \
                 --class_name c4iw_create_cq_resp \
                 drivers/infiniband/hw/cxgb4/iw_cxgb4.o
      
      Then, structure layout can be compared between i386 and x86_64:
      
        +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt   2014-03-28 11:43:05.547432195 +0100
        --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100
        @@ -14,9 +13,8 @@ struct c4iw_create_cq_resp {
                __u32                      size;                 /*    28     4 */
                __u32                      qid_mask;             /*    32     4 */
      
        -       /* size: 36, cachelines: 1, members: 6 */
        -       /* last cacheline: 36 bytes */
        +       /* size: 40, cachelines: 1, members: 6 */
        +       /* padding: 4 */
        +       /* last cacheline: 40 bytes */
         };
      
      This ABI disagreement will make an x86_64 kernel try to write past the
      buffer provided by an i386 binary.
      
      Once boundary checks are implemented, the x86_64 kernel will refuse
      to write past the buffer provided by i386 userspace and the uverb will
      fail.
      
      If the structure is on a page boundary and the next page is not
      mapped, ib_copy_to_udata() will fail and the uverb will fail.
      
      This patch adds explicit padding at the end of structure
      c4iw_create_cq_resp and, like 92b0ca7c ("IB/mlx5: Fix stack info
      leak in mlx5_ib_alloc_ucontext()"), makes function c4iw_create_cq()
      not write this padding field to userspace.  This way, the x86_64 kernel
      will be able to write struct c4iw_create_cq_resp as expected by
      both unpatched and patched i386 libcxgb4.
      
      Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
      Cc: <stable@vger.kernel.org>
      Fixes: cfdda9d7 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC")
      Fixes: e24a72a3 ("RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()")
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Acked-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      b6f04d3d
  8. 20 May 2014, 2 commits
  9. 29 Apr 2014, 4 commits
  10. 12 Apr 2014, 10 commits
  11. 02 Apr 2014, 4 commits
  12. 29 Mar 2014, 1 commit
  13. 25 Mar 2014, 4 commits