1. 07 Oct, 2011 2 commits
  2. 26 Sep, 2011 1 commit
    • [SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference · e48f129c
      Committed by Neil Horman
      This oops was reported recently:
      d:mon> e
      cpu 0xd: Vector: 300 (Data Access) at [c0000000fd4c7120]
          pc: d00000000076f194: .t3_l2t_get+0x44/0x524 [cxgb3]
          lr: d000000000b02108: .init_act_open+0x150/0x3d4 [cxgb3i]
          sp: c0000000fd4c73a0
         msr: 8000000000009032
         dar: 0
       dsisr: 40000000
        current = 0xc0000000fd640d40
        paca    = 0xc00000000054ff80
          pid   = 5085, comm = iscsid
      d:mon> t
      [c0000000fd4c7450] d000000000b02108 .init_act_open+0x150/0x3d4 [cxgb3i]
      [c0000000fd4c7500] d000000000e45378 .cxgbi_ep_connect+0x784/0x8e8 [libcxgbi]
      [c0000000fd4c7650] d000000000db33f0 .iscsi_if_rx+0x71c/0xb18 [scsi_transport_iscsi2]
      [c0000000fd4c7740] c000000000370c9c .netlink_data_ready+0x40/0xa4
      [c0000000fd4c77c0] c00000000036f010 .netlink_sendskb+0x4c/0x9c
      [c0000000fd4c7850] c000000000370c18 .netlink_sendmsg+0x358/0x39c
      [c0000000fd4c7950] c00000000033be24 .sock_sendmsg+0x114/0x1b8
      [c0000000fd4c7b50] c00000000033d208 .sys_sendmsg+0x218/0x2ac
      [c0000000fd4c7d70] c00000000033f55c .sys_socketcall+0x228/0x27c
      [c0000000fd4c7e30] c0000000000086a4 syscall_exit+0x0/0x40
      --- Exception: c01 (System Call) at 00000080da560cfc
      
      The root cause was an EEH error, which sent us down the offload_close path in
      the cxgb3 driver, which in turn sets cdev->l2opt to NULL without regard for
      upper layer drivers (like the cxgbi drivers) which might have execution contexts
      in the middle of its use. The result is the oops above, when t3_l2t_get attempts
      to dereference L2DATA(cdev)->nentries in arp_hash right after the EEH error
      handler sets it to NULL.
      
      The fix is to prevent the setting of the NULL pointer until after there are no
      further users of it.  The t3cdev->l2opt pointer is now converted to be an rcu
      pointer and the L2DATA macro is now called under the protection of
      rcu_read_lock().  When the EEH error path:
      t3_adapter_error->offload_close->cxgb3_offload_deactivate
      is executed, setting the l2opt pointer to NULL is now gated on an rcu
      quiescence point, allowing L2DATA callers to safely check for a NULL
      pointer without concern that the underlying data will be freed before the
      pointer is dereferenced.
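
The reader-side half of this pattern can be illustrated in userspace. The following is only a minimal sketch using C11 atomics, not RCU: it shows a reader taking one snapshot of the shared pointer and checking it for NULL before use, and it does not model the grace period that keeps the underlying data alive in the kernel. The names `l2t_data`, `l2opt_publish` and `l2t_nentries` are hypothetical stand-ins, not the driver's code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's L2 table; only nentries matters here. */
struct l2t_data {
	unsigned int nentries;
};

/* Hypothetical stand-in for the shared l2opt pointer that teardown clears. */
static _Atomic(struct l2t_data *) l2opt;

/* Publish a table (setup path) or NULL (teardown path). */
static void l2opt_publish(struct l2t_data *d)
{
	atomic_store_explicit(&l2opt, d, memory_order_release);
}

/*
 * Reader: load the pointer exactly once, then test that snapshot.
 * Returns 0 and stores nentries on success, -1 if the table is gone.
 */
static int l2t_nentries(unsigned int *out)
{
	struct l2t_data *d = atomic_load_explicit(&l2opt, memory_order_acquire);

	if (d == NULL)
		return -1;
	*out = d->nentries;
	return 0;
}
```

The point of the single load is that the NULL check and the dereference operate on the same value. In the kernel, the read side would sit inside rcu_read_lock()/rcu_read_unlock() and the free would wait for a grace period, which this sketch leaves out.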
      
      This has been tested by the reporter and shown to fix the reported oops.
      
      [nhorman: fix up uninitialised variable reported by Dan Carpenter]
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: Karen Xie <kxie@chelsio.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
  3. 18 Aug, 2011 2 commits
  4. 17 Aug, 2011 1 commit
  5. 27 Jul, 2011 1 commit
  6. 25 Jul, 2011 1 commit
    • iscsi: Resolve iscsi_proto.h naming conflicts with drivers/target/iscsi · 12352183
      Committed by Nicholas Bellinger
      This patch renames the following iscsi_proto.h structures to avoid
      namespace issues with drivers/target/iscsi/iscsi_target_core.h:
      
      *) struct iscsi_cmd -> struct iscsi_scsi_req
      *) struct iscsi_cmd_rsp -> struct iscsi_scsi_rsp
      *) struct iscsi_login -> struct iscsi_login_req
      
      This patch includes useful ISCSI_FLAG_LOGIN_[CURRENT,NEXT]_STAGE*
      and ISCSI_FLAG_SNACK_TYPE_* definitions used by iscsi_target_mod, and
      fixes the incorrect definition of struct iscsi_snack to follow
      RFC 3720 Section 10.16 (SNACK Request).
      
      Also, this patch updates libiscsi, iSER, be2iscsi, and bnx2i to
      use the updated structure definitions in a handful of locations.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
  7. 23 Jul, 2011 1 commit
    • IB/qib: Defer HCA error events to tasklet · e67306a3
      Committed by Mike Marciniszyn
      With ib_qib options:
      
          options ib_qib krcvqs=1 pcie_caps=0x51 rcvhdrcnt=4096 singleport=1 ibmtu=4
      
      a run of ib_write_bw -a yields the following:
      
          ------------------------------------------------------------------
           #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]
           1048576   5000           2910.64            229.80
          ------------------------------------------------------------------
      
      The top cpu use in a profile is:
      
          CPU: Intel Architectural Perfmon, speed 2400.15 MHz (estimated)
          Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask
          of 0x00 (No unit mask) count 1002300
          Counted LLC_MISSES events (Last level cache demand requests from this core that
          missed the LLC) with a unit mask of 0x41 (No unit mask) count 10000
          samples  %        samples  %        app name                 symbol name
          15237    29.2642  964      17.1195  ib_qib.ko                qib_7322intr
          12320    23.6618  1040     18.4692  ib_qib.ko                handle_7322_errors
          4106      7.8860  0              0  vmlinux                  vsnprintf
      
      
      Analysis of the stats, the profile, the code, and the annotated profile indicates:
       - All of the overflow interrupts (one per packet overflow) are
         serviced on CPU0 with no mitigation on the frequency.
       - All of the receive interrupts are being serviced by CPU0.  (That is
         the way truescale.cmds statically allocates the kctx IRQs to CPU)
       - The code is spending all of its time servicing QIB_I_C_ERROR
         RcvEgrFullErr interrupts on CPU0, starving the packet receive
         processing.
       - The err_decode routine is very inefficient, using a printf variant
         to format a "%s" and continuing to loop after the errs mask has been
         cleared.
       - Both qib_7322intr and handle_7322_errors read PCI registers, which
         is very inefficient.
      
      The fix does the following:
       - Adds a tasklet to service QIB_I_C_ERROR
       - Replaces the very inefficient scnprintf() with a memcpy().  A field
         is added to qib_hwerror_msgs to save the sizeof("string") at
         compile time so that a strlen is not needed during err_decode().
       - The most frequent errors (Overflows) are serviced first to exit the
         loop as early as possible.
       - The loop now exits as soon as the errs mask is clear rather than
         fruitlessly looping through the msp array.
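
The decode changes listed above can be sketched as follows. This is a minimal userspace illustration under stated assumptions, not the driver's code: the `hwerror_msg` struct, the `ERR_MSG` macro, the bit assignments, and the names `OtherErrA` and `OtherErrB` are hypothetical. It shows the three ideas together: message length captured with sizeof at compile time, memcpy() instead of a printf-style formatter, and a loop that stops once the errs mask is clear.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry: mask, text, and sizeof(text) fixed at compile time. */
struct hwerror_msg {
	uint64_t mask;
	const char *msg;
	size_t sz;	/* includes the trailing NUL */
};

#define ERR_MSG(mask, text) { (mask), (text), sizeof(text) }

/* Most frequent error first so the common case leaves the loop early. */
static const struct hwerror_msg err_msgs[] = {
	ERR_MSG(UINT64_C(1) << 0, "RcvEgrFullErr"),
	ERR_MSG(UINT64_C(1) << 1, "OtherErrA"),	/* hypothetical */
	ERR_MSG(UINT64_C(1) << 2, "OtherErrB"),	/* hypothetical */
};

/*
 * Write a comma-separated list of the set errors into buf.
 * Returns the number of table entries examined.
 */
static size_t err_decode(char *buf, size_t len, uint64_t errs,
			 const struct hwerror_msg *msp, size_t nmsgs)
{
	size_t i, need, used = 0, examined = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';
	/* Stop as soon as every reported bit has been handled. */
	for (i = 0; i < nmsgs && errs != 0; i++) {
		examined++;
		if (!(errs & msp[i].mask))
			continue;
		errs &= ~msp[i].mask;
		need = msp[i].sz - 1 + (used ? 1 : 0);	/* text plus separator */
		if (used + need + 1 > len)
			break;				/* no room left */
		if (used)
			buf[used++] = ',';
		memcpy(buf + used, msp[i].msg, msp[i].sz - 1);
		used += msp[i].sz - 1;
		buf[used] = '\0';
	}
	return examined;
}
```

With only the first bit set, the loop examines a single entry and returns; that early exit is what keeps the overflow case cheap.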
      
      With this fix the performance changes to:
      
          ------------------------------------------------------------------
           #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]
           1048576   5000           2990.64            2941.35
          ------------------------------------------------------------------
      
      During testing of the error handling overflow patch, it was determined
      that some CPUs were slower when servicing both overflow and receive
      interrupts on CPU0 with different MSI interrupt vectors.
      
      This patch adds an option (krcvq01_no_msi) to not use a dedicated MSI
      interrupt for kctx's < 2 and to service them on the default interrupt.
      For some CPUs, the interrupt enter/exit is more costly than the
      additional PCI read in the default handler.
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  8. 22 Jul, 2011 1 commit
  9. 19 Jul, 2011 17 commits
  10. 18 Jul, 2011 1 commit
  11. 16 Jul, 2011 1 commit
    • IB/mthca: Stop returning separate error and status from FW commands · cdb73db0
      Committed by Goldwyn Rodrigues
      Instead of having firmware command functions return an error and also
      a status, leading to code like:
      
      	err = mthca_FW_COMMAND(..., &status);
      	if (err)
      		goto out;
      	if (status) {
      		err = -E...;
      		goto out;
      	}
      
      all over the place, just handle the FW status inside the FW command
      handling code (the way mlx4 does it), so we can simply write:
      
      	err = mthca_FW_COMMAND(...);
      	if (err)
      		goto out;
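
One way to picture the change is a command wrapper that translates the firmware status into an errno value in a single place. The sketch below is a hedged userspace illustration only: the status codes, their errno mapping, and the functions `fw_post`, `fw_status_to_errno` and `fw_command` are hypothetical and are not taken from mthca or mlx4.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical firmware status codes; the real values are not given in the text. */
enum fw_status {
	FW_STATUS_OK = 0,
	FW_STATUS_BUSY = 1,
	FW_STATUS_BAD_PARAM = 2,
};

/* Map a firmware status to a negative errno in one place. */
static int fw_status_to_errno(uint8_t status)
{
	switch (status) {
	case FW_STATUS_OK:
		return 0;
	case FW_STATUS_BUSY:
		return -EBUSY;
	case FW_STATUS_BAD_PARAM:
		return -EINVAL;
	default:
		return -EIO;
	}
}

/*
 * Hypothetical low-level transport. It returns a transport error and reports
 * the firmware status separately; both are passed in here to simulate outcomes.
 */
static int fw_post(int transport_err, uint8_t fw_status, uint8_t *status)
{
	*status = fw_status;
	return transport_err;
}

/* Command wrapper: callers see a single error code and never touch status. */
static int fw_command(int transport_err, uint8_t fw_status)
{
	uint8_t status = 0;
	int err = fw_post(transport_err, fw_status, &status);

	if (err)
		return err;
	return fw_status_to_errno(status);
}
```

A caller then needs only the two-line `err = ...; if (err) goto out;` form, since the status check lives inside the wrapper.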
      
      In addition to simplifying the source code, this also saves a healthy
      chunk of text:
      
          add/remove: 0/0 grow/shrink: 10/88 up/down: 510/-3357 (-2847)
          function                                     old     new   delta
          static.trans_table                           324     584    +260
          mthca_cmd_poll                               352     477    +125
          mthca_cmd_wait                               511     567     +56
          mthca_table_put                              213     240     +27
          mthca_cleanup_db_tab                         372     387     +15
          __mthca_remove_one                           314     323      +9
          mthca_cleanup_user_db_tab                    275     283      +8
          __mthca_init_one                            1738    1746      +8
          mthca_cleanup                                 20      21      +1
          mthca_MAD_IFC                               1081    1082      +1
          mthca_MGID_HASH                               43      40      -3
          mthca_MAP_ICM_AUX                             23      20      -3
          mthca_MAP_ICM                                 19      16      -3
          mthca_MAP_FA                                  23      20      -3
          mthca_READ_MGM                                43      38      -5
          mthca_QUERY_SRQ                               43      38      -5
          mthca_QUERY_QP                                59      54      -5
          mthca_HW2SW_SRQ                               43      38      -5
          mthca_HW2SW_MPT                               60      55      -5
          mthca_HW2SW_EQ                                43      38      -5
          mthca_HW2SW_CQ                                43      38      -5
          mthca_free_icm_table                         120     114      -6
          mthca_query_srq                              214     206      -8
          mthca_free_qp                                662     654      -8
          mthca_cmd                                     38      28     -10
          mthca_alloc_db                              1321    1311     -10
          mthca_setup_hca                             1067    1055     -12
          mthca_WRITE_MTT                               35      22     -13
          mthca_WRITE_MGM                               40      27     -13
          mthca_UNMAP_ICM_AUX                           36      23     -13
          mthca_UNMAP_FA                                36      23     -13
          mthca_SYS_DIS                                 36      23     -13
          mthca_SYNC_TPT                                36      23     -13
          mthca_SW2HW_SRQ                               35      22     -13
          mthca_SW2HW_MPT                               35      22     -13
          mthca_SW2HW_EQ                                35      22     -13
          mthca_SW2HW_CQ                                35      22     -13
          mthca_RUN_FW                                  36      23     -13
          mthca_DISABLE_LAM                             36      23     -13
          mthca_CLOSE_IB                                36      23     -13
          mthca_CLOSE_HCA                               38      25     -13
          mthca_ARM_SRQ                                 39      26     -13
          mthca_free_icms                              178     164     -14
          mthca_QUERY_DDR                              389     375     -14
          mthca_resize_cq                             1063    1048     -15
          mthca_unmap_eq_icm                           123     107     -16
          mthca_map_eq_icm                             396     380     -16
          mthca_cmd_box                                 90      74     -16
          mthca_SET_IB                                 433     417     -16
          mthca_RESIZE_CQ                              369     353     -16
          mthca_MAP_ICM_page                           240     224     -16
          mthca_MAP_EQ                                 183     167     -16
          mthca_INIT_IB                                473     457     -16
          mthca_INIT_HCA                               745     729     -16
          mthca_map_user_db                            816     798     -18
          mthca_SYS_EN                                 157     139     -18
          mthca_cleanup_qp_table                        78      59     -19
          mthca_cleanup_eq_table                       168     149     -19
          mthca_UNMAP_ICM                              143     121     -22
          mthca_modify_srq                             172     149     -23
          mthca_unmap_fmr                              198     174     -24
          mthca_query_qp                               814     790     -24
          mthca_query_pkey                             343     319     -24
          mthca_SET_ICM_SIZE                            34      10     -24
          mthca_QUERY_DEV_LIM                         1870    1846     -24
          mthca_map_cmd                               1130    1105     -25
          mthca_ENABLE_LAM                             401     375     -26
          mthca_modify_port                            247     220     -27
          mthca_query_device                           884     850     -34
          mthca_NOP                                     75      41     -34
          mthca_table_get                              287     249     -38
          mthca_init_qp_table                          333     293     -40
          mthca_MODIFY_QP                              348     308     -40
          mthca_close_hca                              131      89     -42
          mthca_free_eq                                435     390     -45
          mthca_query_port                             755     705     -50
          mthca_free_cq                                581     528     -53
          mthca_alloc_icm_table                        578     524     -54
          mthca_multicast_attach                      1041     986     -55
          mthca_init_hca                               326     271     -55
          mthca_query_gid                              487     431     -56
          mthca_free_srq                               524     468     -56
          mthca_free_mr                                168     111     -57
          mthca_create_eq                             1560    1501     -59
          mthca_multicast_detach                       790     728     -62
          mthca_write_mtt                              918     854     -64
          mthca_register_device                       1406    1342     -64
          mthca_fmr_alloc                              947     883     -64
          mthca_mr_alloc                               652     582     -70
          mthca_process_mad                           1242    1164     -78
          mthca_dev_lim                                910     830     -80
          find_mgm                                     482     400     -82
          mthca_modify_qp                             3852    3753     -99
          mthca_init_cq                               1281    1181    -100
          mthca_alloc_srq                             1719    1610    -109
          mthca_init_eq_table                         1807    1679    -128
          mthca_init_tavor                             761     491    -270
          mthca_init_arbel                            2617    2098    -519
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
  12. 14 Jul, 2011 1 commit
    • IB/srp: Avoid duplicate devices from LUN scan · fd1b6c4a
      Committed by Bart Van Assche
      SCSI scanning of a channel:id:lun triplet in Linux works as follows
      (function scsi_scan_target() in drivers/scsi/scsi_scan.c):
      
      - If lun == SCAN_WILD_CARD, send a REPORT LUNS command to the target
        and process the result.
      
      - If lun != SCAN_WILD_CARD, send an INQUIRY command to the LUN
        corresponding to the specified channel:id:lun triplet to verify
        whether the LUN exists.
      
      So a SCSI driver must either take the channel and target id values into
      account in its queuecommand() function or it should declare that it only
      supports one channel and one target id.
      
      Currently the ib_srp driver does neither.  As a result scanning the
      SCSI bus via e.g. rescan-scsi-bus.sh causes many duplicate SCSI
      devices to be created. For each 0:0:L device, several duplicates are
      created with the same LUN number and with (C:I) != (0:0). Fix this by
      declaring that the ib_srp driver only supports one channel and one
      target id.
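
The duplication can be illustrated with a small counting model. This is a simplified sketch, not kernel code: the `host_limits` struct and its field names are hypothetical, and the model assumes a driver that answers every probe regardless of channel and target id, as described above.

```c
/* Hypothetical scan limits declared by a driver. */
struct host_limits {
	unsigned int max_channel;	/* highest channel number scanned */
	unsigned int max_id;		/* number of target ids scanned */
};

/*
 * Count how many devices a bus scan creates for one LUN when the driver
 * ignores channel and id, so every (channel, id) probe appears to succeed.
 */
static unsigned int devices_for_lun(const struct host_limits *h)
{
	unsigned int c, i, n = 0;

	for (c = 0; c <= h->max_channel; c++)
		for (i = 0; i < h->max_id; i++)
			n++;	/* probe answered regardless of (c, i) */
	return n;
}
```

In this model, limits of two channels and four ids yield eight devices for a single LUN, while one channel and one id yield exactly one.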
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Cc: <stable@kernel.org>
      Acked-by: David Dillow <dillowda@ornl.gov>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  13. 05 Jul, 2011 1 commit
  14. 18 Jun, 2011 4 commits
  15. 10 Jun, 2011 1 commit
    • rtnetlink: Compute and store minimum ifinfo dump size · c7ac8679
      Committed by Greg Rose
      The message size allocated for rtnl ifinfo dumps was limited to
      a single page.  This is not enough for additional interface info
      available with devices that support SR-IOV and caused a bug in
      which VF info would not be displayed if more than approximately
      40 VFs were created per interface.
      
      Implement a new function pointer for the rtnl_register service that will
      calculate the amount of data required for the ifinfo dump and allocate
      enough data to satisfy the request.
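
The sizing idea can be sketched in a few lines. This is an illustration under assumptions, not the rtnetlink implementation: `DEFAULT_DUMP_SIZE`, `ifinfo_size` and `dump_alloc_size` are hypothetical names, and the byte counts used with them are made-up inputs, since the real attribute sizes are not given in the text.

```c
#include <stddef.h>

/* Hypothetical default allocation: a single 4096-byte page. */
#define DEFAULT_DUMP_SIZE ((size_t)4096)

/*
 * Estimate the bytes needed to describe one interface: a fixed part plus
 * a per-VF part. Both sizes are caller-supplied assumptions.
 */
static size_t ifinfo_size(size_t base, size_t per_vf, unsigned int num_vfs)
{
	return base + per_vf * num_vfs;
}

/* Allocate at least the computed minimum, never less than the default. */
static size_t dump_alloc_size(size_t min_dump_size)
{
	return min_dump_size > DEFAULT_DUMP_SIZE ? min_dump_size
						  : DEFAULT_DUMP_SIZE;
}
```

Small interfaces keep the default allocation, and an interface with many VFs gets a buffer sized to its computed minimum instead of being truncated at one page.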
      Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  16. 07 Jun, 2011 1 commit
  17. 26 May, 2011 3 commits
    • RDMA/cma: Save PID of ID's owner · 83e9502d
      Committed by Nir Muchtar
      Save the PID associated with an RDMA CM ID for reporting via netlink.
      Signed-off-by: Nir Muchtar <nirm@voltaire.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • RDMA/cma: Add support for netlink statistics export · 753f618a
      Committed by Nir Muchtar
      Add callbacks and data types for statistics export of all current
      devices/ids.  The schema for RDMA CM is a series of netlink messages.
      Each one contains an rdma_cm_id_stats struct.  Additionally, two netlink
      attributes are created for the addresses for each message (if
      applicable).
      
      The attribute types used are:
      RDMA_NL_RDMA_CM_ATTR_SRC_ADDR (The source address for this ID)
      RDMA_NL_RDMA_CM_ATTR_DST_ADDR (The destination address for this ID)
      sockaddr_* structs are encapsulated within these attributes.
      
      In other words, every transaction contains a series of messages like:
      
      -------message 1-------
      struct rdma_cm_id_stats {
             __u32 qp_num;
             __u32 bound_dev_if;
             __u32 port_space;
             __s32 pid;
             __u8 cm_state;
             __u8 node_type;
             __u8 port_num;
             __u8 reserved;
      }
      RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute - contains the source address
      RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute - contains the destination address
      -------end 1-------
      -------message 2-------
      struct rdma_cm_id_stats
      RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute
      RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute
      -------end 2-------
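
The fixed part of each message can be mirrored in userspace to check its layout. The sketch below restates the struct from the message above with fixed-width C types in place of `__u32`, `__s32` and `__u8`; the helper names `stats_size` and `stats_pid_offset` are hypothetical, and the layout noted in the comments assumes a common ABI where 32-bit integers are 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the per-ID record, using fixed-width types. */
struct rdma_cm_id_stats {
	uint32_t qp_num;
	uint32_t bound_dev_if;
	uint32_t port_space;
	int32_t  pid;
	uint8_t  cm_state;
	uint8_t  node_type;
	uint8_t  port_num;
	uint8_t  reserved;	/* pads the four byte-sized fields to 32 bits */
};

/* Size of the fixed record that precedes the address attributes. */
static size_t stats_size(void)
{
	return sizeof(struct rdma_cm_id_stats);
}

/* Offset of the pid field within the record. */
static size_t stats_pid_offset(void)
{
	return offsetof(struct rdma_cm_id_stats, pid);
}
```

Because the four one-byte fields fill exactly one 32-bit word, the record carries no compiler-inserted padding, which makes it convenient to copy into a message payload as-is.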
      Signed-off-by: Nir Muchtar <nirm@voltaire.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • RDMA/cma: Pass QP type into rdma_create_id() · b26f9b99
      Committed by Sean Hefty
      The RDMA CM currently infers the QP type from the port space selected
      by the user.  In the future (e.g. with RDMA_PS_IB or XRC), there may not
      be a 1-1 correspondence between port space and QP type.  For netlink
      export of RDMA CM state, we want to export the QP type to userspace,
      so it is cleaner to explicitly associate a QP type to an ID.
      
      Modify rdma_create_id() to allow the user to specify the QP type, and
      use it to make our selections of datagram versus connected mode.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>