1. 09 Dec, 2010 1 commit
    • IB/uverbs: Handle large number of entries in poll CQ · 7182afea
      Committed by Dan Carpenter
      In ib_uverbs_poll_cq() code there is a potential integer overflow if
      userspace passes in a large cmd.ne.  The calls to kmalloc() would
      allocate smaller buffers than intended, leading to memory corruption.
      There is also an information leak if resp wasn't all used.
      Unprivileged userspace may call this function, although only if an
      RDMA device that uses this function is present.
      
      Fix this by copying CQ entries one at a time, which avoids the
      allocation entirely, and also by moving this copying into a function
      that makes sure to initialize all memory copied to userspace.
      
      Special thanks to Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      for his help and advice.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Dan Carpenter <error27@gmail.com>
      
      [ Monkey around with things a bit to avoid bad code generation by gcc
        when designated initializers are used.  - Roland ]
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
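The overflow and the fix described in the IB/uverbs commit above can be illustrated with a small user-space sketch. It is not the kernel code: `struct wc_entry`, `unchecked_size()` and `copy_entries_one_by_one()` are hypothetical names, and plain arrays stand in for the completion queue and the userspace buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a completion entry; not the kernel's struct. */
struct wc_entry {
    uint64_t wr_id;
    uint32_t status;
    uint32_t reserved;
};

/* The risky pattern: a 32-bit multiply that silently wraps for a large ne. */
static uint32_t unchecked_size(uint32_t ne)
{
    return ne * (uint32_t)sizeof(struct wc_entry);
}

/*
 * Copy entries one at a time. No buffer is sized from ne, so a huge ne
 * cannot produce an undersized allocation. Each entry is zeroed before it
 * is filled, so unused fields never carry stale memory to the caller.
 * Returns the number of entries actually copied.
 */
static uint32_t copy_entries_one_by_one(struct wc_entry *dst, uint32_t dst_cap,
                                        const struct wc_entry *src,
                                        uint32_t available, uint32_t ne)
{
    uint32_t copied = 0;

    assert(dst != NULL && src != NULL);
    while (copied < ne && copied < available && copied < dst_cap) {
        struct wc_entry tmp;

        memset(&tmp, 0, sizeof(tmp));      /* initialize every byte */
        tmp.wr_id = src[copied].wr_id;
        tmp.status = src[copied].status;
        memcpy(&dst[copied], &tmp, sizeof(tmp));
        copied++;
    }
    return copied;
}
```

With a 16-byte entry, a request for 0x10000001 entries yields a computed size of only 16 bytes in the unchecked version, while the per-entry loop is bounded by what is actually available.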
  2. 02 Dec, 2010 2 commits
  3. 25 Nov, 2010 1 commit
  4. 12 Nov, 2010 1 commit
    • net: get rid of rtable->idev · 72cdd1d9
      Committed by Eric Dumazet
      It seems idev field in struct rtable has no special purpose, but adding
      extra atomic ops.
      
      We hold refcounts on the device itself (using percpu data, so pretty
      cheap in current kernel).
      
      infiniband case is solved using dst.dev instead of idev->dev
      
      Removal of this field means routing without route cache is now using
      shared data, percpu data, and only potential contention is a pair of
      atomic ops on struct neighbour per forwarded packet.
      
      About 5% speedup on routing test.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Roland Dreier <rolandd@cisco.com>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 26 Oct, 2010 3 commits
  6. 24 Oct, 2010 1 commit
    • RDMA/ucma: Allow tuning the max listen backlog · 97cb7e40
      Committed by Steve Wise
      For iWARP connections, the connect request is carried in a TCP payload
      on an already established TCP connection.  So if the ucma's backlog is
      full, the connection request is transmitted and acked at the TCP level
      by the time the connect request gets dropped in the ucma.  The end
      result is the connection gets rejected by the iWARP provider.
      Further, a 32 node 256NP OpenMPI job will generate > 128 connect
      requests on some ranks.
      
      This patch increases the default max backlog to 1024, and adds a
      sysctl variable so the backlog can be adjusted at run time.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
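As a rough illustration of the RDMA/ucma commit above, the sketch below shows one way a run-time tunable cap could be applied to a requested listen backlog. The variable `max_backlog`, the function `effective_backlog()` and the clamping policy are assumptions made for this example; only the default of 1024 comes from the commit message.

```c
#include <assert.h>

/* Default from the commit message; in the kernel this would be a sysctl. */
static int max_backlog = 1024;

/*
 * Assumed policy: honor a positive request below the cap, and fall back
 * to the cap for non-positive or oversized requests.
 */
static int effective_backlog(int requested)
{
    assert(max_backlog > 0);
    return (requested > 0 && requested < max_backlog) ? requested
                                                      : max_backlog;
}
```

Raising `max_backlog` at run time immediately changes what later callers are allowed to request, which is the point of exposing the limit as a tunable.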
  7. 15 Oct, 2010 1 commit
  8. 14 Oct, 2010 2 commits
  9. 12 Oct, 2010 1 commit
  10. 29 Sep, 2010 1 commit
  11. 28 Sep, 2010 1 commit
    • IB/core: Add link layer property to ports · a3f5adaf
      Committed by Eli Cohen
      This patch allows ports to have different link layers:
      IB_LINK_LAYER_INFINIBAND or IB_LINK_LAYER_ETHERNET.  This is required
      for adding IBoE (InfiniBand-over-Ethernet, aka RoCE) support.  For
      devices that do not provide an implementation for querying the link
      layer property of a port, we return a default value based on the
      transport: RDMA_TRANSPORT_IB nodes will return IB_LINK_LAYER_INFINIBAND
      and RDMA_TRANSPORT_IWARP nodes will return IB_LINK_LAYER_ETHERNET.
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
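The fallback rule described in the IB/core link layer commit above can be modelled in a few lines of user-space C. All type and function names here (`struct model_device`, `port_link_layer()`, `report_ethernet()`) are hypothetical and only mirror the behaviour stated in the commit message: use the driver's hook if it exists, otherwise derive a default from the transport.

```c
#include <assert.h>
#include <stddef.h>

enum model_transport { MODEL_TRANSPORT_IB, MODEL_TRANSPORT_IWARP };

enum model_link_layer {
    MODEL_LINK_LAYER_UNSPECIFIED,
    MODEL_LINK_LAYER_INFINIBAND,
    MODEL_LINK_LAYER_ETHERNET
};

/* Hypothetical device model with an optional per-port query hook. */
struct model_device {
    enum model_transport transport;
    enum model_link_layer (*get_link_layer)(const struct model_device *dev,
                                            int port);
};

/* Example hook for an IB-transport device whose ports run over Ethernet. */
static enum model_link_layer report_ethernet(const struct model_device *dev,
                                             int port)
{
    (void)dev;
    (void)port;
    return MODEL_LINK_LAYER_ETHERNET;
}

/* Prefer the driver's answer; otherwise fall back on the transport type. */
static enum model_link_layer port_link_layer(const struct model_device *dev,
                                             int port)
{
    assert(dev != NULL);
    if (dev->get_link_layer)
        return dev->get_link_layer(dev, port);

    switch (dev->transport) {
    case MODEL_TRANSPORT_IB:
        return MODEL_LINK_LAYER_INFINIBAND;
    case MODEL_TRANSPORT_IWARP:
        return MODEL_LINK_LAYER_ETHERNET;
    default:
        return MODEL_LINK_LAYER_UNSPECIFIED;
    }
}
```

A device that registers `report_ethernet` models the IBoE case: its transport is IB, yet its port reports an Ethernet link layer.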
  12. 05 Aug, 2010 1 commit
  13. 29 Jul, 2010 1 commit
    • IB/cm: Check LAP state before sending an MRA · 50a025c6
      Committed by Sean Hefty
      NULL pointer dereferences in ib_cm_init_qp_attr() were seen by some
      users.  From a crash dump, I determined that we died in
      cm_init_qp_rts_attr() (it's inlined, so it doesn't show up in the
      traceback) on the line labeled below:
      
      static int cm_init_qp_rts_attr(struct cm_id_private *cm_id_priv,
                                     struct ib_qp_attr *qp_attr,
                                     int *qp_attr_mask)
      {
              ........
              if (cm_id_priv->id.lap_state == IB_CM_LAP_UNINIT) {
                      .....
              } else {
                     *qp_attr_mask = IB_QP_ALT_PATH | IB_QP_PATH_MIG_STATE;
                     qp_attr->alt_port_num = cm_id_priv->alt_av.port->port_num; <-die
      
      
      The problem is that the rdma_cm can call ib_send_cm_mra() after a
      connection has been established.  The ib_cm incorrectly assumes that
      the MRA is in response to a LAP (load alternate path) message, even
      though no LAP message has been received.  The ib_cm needs to check the
      lap_state before sending an MRA if the cm_id state is established.
      Reported-by: Arthur Kepner <akepner@sgi.com>
      Reported-by: Josh England <jjengla@gmail.com>
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  14. 15 Jul, 2010 1 commit
  15. 11 Jun, 2010 1 commit
  16. 26 May, 2010 1 commit
    • IB/ucm: Use memdup_user() · e642df6a
      Committed by Julia Lawall
      Use memdup_user when user data is immediately copied into the
      allocated region.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression from,to,size,flag;
      position p;
      identifier l1,l2;
      @@
      
      -  to = \(kmalloc@p\|kzalloc@p\)(size,flag);
      +  to = memdup_user(from,size);
         if (
      -      to==NULL
      +      IS_ERR(to)
                       || ...) {
         <+... when != goto l1;
      -  -ENOMEM
      +  PTR_ERR(to)
         ...+>
         }
      -  if (copy_from_user(to, from, size) != 0) {
      -    <+... when != goto l2;
      -    -EFAULT
      -    ...+>
      -  }
      // </smpl>
      Signed-off-by: Julia Lawall <julia@diku.dk>
  17. 24 May, 2010 1 commit
  18. 22 May, 2010 1 commit
    • IB/core: Allow device-specific per-port sysfs files · 9a6edb60
      Committed by Ralph Campbell
      Add a new parameter to ib_register_device() so that low-level device
      drivers can pass in a pointer to a callback function that will be
      called for each port that is registered in sysfs.  This allows
      low-level device drivers to create files in
      
          /sys/class/infiniband/<hca>/ports/<N>/
      
      without having to poke through the internals of the RDMA sysfs handling.
      
      There is no need for an unregister function since the kobject
      reference will go to zero when ib_unregister_device() is called.
      Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  19. 16 May, 2010 1 commit
  20. 22 Apr, 2010 2 commits
    • RDMA/cma: Randomize local port allocation · 5d7220e8
      Committed by Tetsuo Handa
      Randomize local port allocation in the way sctp_get_port_local() does.
      Update rover at the end of loop since we're likely to pick a valid port
      on the first try.
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
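The rover-based scheme named in the RDMA/cma commit above can be sketched in user-space C as follows. This is a simplified model, not the kernel implementation: the port range, the `port_in_use` table and `alloc_random_port()` are hypothetical, and `rand()` stands in for whatever random source the kernel uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Small hypothetical port range, chosen only to keep the example short. */
#define PORT_LOW  32768
#define PORT_HIGH 32775

static bool port_in_use[PORT_HIGH - PORT_LOW + 1];

/* Returns an allocated port, or -1 if every port in the range is taken. */
static int alloc_random_port(void)
{
    int remaining = PORT_HIGH - PORT_LOW + 1;
    int rover = PORT_LOW + rand() % remaining;   /* random starting point */

    while (remaining-- > 0) {
        assert(rover >= PORT_LOW && rover <= PORT_HIGH);
        if (!port_in_use[rover - PORT_LOW]) {
            port_in_use[rover - PORT_LOW] = true;
            return rover;                        /* usually the first try */
        }
        /* Advance the rover only after a failed try, wrapping at the top. */
        if (++rover > PORT_HIGH)
            rover = PORT_LOW;
    }
    return -1;
}
```

Because the rover is advanced at the end of each iteration, the common case of a free port on the first try costs a single table lookup, and a full range is detected after exactly one pass.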
    • IB: Explicitly rule out llseek to avoid BKL in default_llseek() · bc1db9af
      Committed by Roland Dreier
      Several RDMA user-access drivers have file_operations structures with
      no .llseek method set.  None of the drivers actually do anything with
      f_pos, so this means llseek is essentially a NOP, instead of returning
      an error as leaving other file_operations methods unimplemented would
      do.  This is mostly harmless, except that a NULL .llseek means that
      default_llseek() is used, and this function grabs the BKL, which we
      would like to avoid.
      
      Since llseek does nothing useful on these files, we would like it to
      return an error to userspace instead of silently grabbing the BKL and
      succeeding.  For nearly all of the file types, we take the
      belt-and-suspenders approach of setting the .llseek method to
      no_llseek and also calling nonseekable_open(); the exception is the
      uverbs_event files, which are created with anon_inode_getfile(), which
      already sets f_mode the same way as nonseekable_open() would.
      
      This work is motivated by Arnd Bergmann's bkl-removal tree.
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  21. 08 Apr, 2010 1 commit
  22. 01 Apr, 2010 1 commit
  23. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Committed by Tejun Heo
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  24. 19 Mar, 2010 1 commit
  25. 12 Mar, 2010 1 commit
  26. 08 Mar, 2010 3 commits
  27. 07 Mar, 2010 1 commit
  28. 04 Mar, 2010 1 commit
  29. 25 Feb, 2010 5 commits