1. 11 Jan, 2011 1 commit
    • IPoIB: Remove LRO support · 19e364f6
      Committed by Or Gerlitz
      As a first step in moving from LRO to GRO, revert commit af40da89
      ("IPoIB: add LRO support").  Also eliminate the ethtool set_flags
      callback which isn't needed anymore.  Finally, we need to include
      <linux/sched.h> directly to get the declaration of restart_syscall()
      (which used to be included implicitly through <linux/inet_lro.h>).
      
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vladimir Sokolovsky <vlad@mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  2. 09 Dec, 2010 1 commit
    • IB/uverbs: Handle large number of entries in poll CQ · 7182afea
      Committed by Dan Carpenter
      In ib_uverbs_poll_cq() code there is a potential integer overflow if
      userspace passes in a large cmd.ne.  The calls to kmalloc() would
      allocate smaller buffers than intended, leading to memory corruption.
      There is also an information leak if resp wasn't all used.
      Unprivileged userspace may call this function, although only if an
      RDMA device that uses this function is present.
      
      Fix this by copying CQ entries one at a time, which avoids the
      allocation entirely, and also by moving this copying into a function
      that makes sure to initialize all memory copied to userspace.
      
      Special thanks to Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      for his help and advice.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Dan Carpenter <error27@gmail.com>
      
      [ Monkey around with things a bit to avoid bad code generation by gcc
        when designated initializers are used.  - Roland ]
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
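The overflow and the one-entry-at-a-time fix described in the poll CQ commit can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `struct wc_entry`, `struct poll_resp_hdr`, `bulk_size_32()`, `copy_one_entry()` and `copy_entries()` are hypothetical stand-ins, not the kernel's structures or the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the response header and one CQ entry. */
struct poll_resp_hdr {
    uint32_t count;
    uint32_t reserved;
};

struct wc_entry {
    uint64_t wr_id;
    uint32_t status;
    uint32_t reserved;
};

/*
 * Size a bulk allocation would request for `ne` entries when the
 * arithmetic is done in 32 bits. For a large enough `ne` the product
 * wraps around and the result is far smaller than intended.
 */
static uint32_t bulk_size_32(uint32_t ne)
{
    return (uint32_t)sizeof(struct poll_resp_hdr) +
           ne * (uint32_t)sizeof(struct wc_entry);
}

/* Copy a single entry, zeroing it first so no stale bytes are copied out. */
static void copy_one_entry(unsigned char *dst, uint64_t wr_id, uint32_t status)
{
    struct wc_entry tmp;

    memset(&tmp, 0, sizeof(tmp));
    tmp.wr_id = wr_id;
    tmp.status = status;
    memcpy(dst, &tmp, sizeof(tmp));
}

/*
 * Copy up to `ne` entries one at a time. No buffer is sized from `ne`,
 * so a huge request cannot cause an undersized allocation.
 * Returns the number of entries actually copied.
 */
static uint32_t copy_entries(unsigned char *dst, size_t dst_len,
                             uint32_t ne, uint32_t available)
{
    uint32_t done = 0;

    assert(dst != NULL);
    while (done < ne && done < available &&
           ((size_t)done + 1) * sizeof(struct wc_entry) <= dst_len) {
        copy_one_entry(dst + (size_t)done * sizeof(struct wc_entry), done, 0);
        done++;
    }
    return done;
}
```

With 16-byte entries, a request for 0x10000000 entries makes the 32-bit size wrap to less than the size needed for a single entry, while the per-entry loop is bounded only by what is available and what fits in the destination.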
  3. 02 Dec, 2010 5 commits
  4. 18 Nov, 2010 1 commit
  5. 17 Nov, 2010 1 commit
    • SCSI host lock push-down · f281233d
      Committed by Jeff Garzik
      Move the mid-layer's ->queuecommand() invocation from being locked
      with the host lock to being unlocked to facilitate speeding up the
      critical path for drivers that don't need this lock taken anyway.
      
      The patch below presents a simple SCSI host lock push-down as an
      equivalent transformation.  No locking or other behavior should change
      with this patch.  All existing bugs and locking orders are preserved.
      
      Additionally, add one parameter to queuecommand,
      	struct Scsi_Host *
      and remove one parameter from queuecommand,
      	void (*done)(struct scsi_cmnd *)
      
      Scsi_Host* is a convenient pointer that most host drivers need anyway,
      and 'done' is redundant to struct scsi_cmnd->scsi_done.
      
      Minimal code disturbance was attempted with this change.  Most drivers
      needed only two one-line modifications for their host lock push-down.
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      Acked-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
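The push-down described in the SCSI commit can be sketched in userspace as follows. This is only an illustration under stated assumptions: the structures are mock-ups with just enough fields to show the idea, the "lock" is a counter rather than a spinlock, and the `mydrv_*`, `mock_*` and `mark_done` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock-ups; the real structures live in the kernel. */
struct Scsi_Host {
    int lock_depth;     /* stands in for the host lock */
    int locked_calls;   /* commands queued while the lock was held */
};

struct scsi_cmnd {
    void (*scsi_done)(struct scsi_cmnd *);  /* completion callback */
    int completed;
};

static void mock_lock(struct Scsi_Host *host)
{
    host->lock_depth++;
}

static void mock_unlock(struct Scsi_Host *host)
{
    assert(host->lock_depth > 0);
    host->lock_depth--;
}

/* The driver's original body, which still expects the host lock to be held. */
static int mydrv_queuecommand_lck(struct Scsi_Host *host, struct scsi_cmnd *cmd)
{
    if (host->lock_depth > 0)
        host->locked_calls++;
    cmd->scsi_done(cmd);    /* 'done' comes from the command, not a parameter */
    return 0;
}

/*
 * New-style entry point: the mid-layer calls it unlocked, and the driver
 * takes the host lock itself around the unchanged body.
 */
static int mydrv_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
{
    int rc;

    mock_lock(host);
    rc = mydrv_queuecommand_lck(host, cmd);
    mock_unlock(host);
    return rc;
}

static void mark_done(struct scsi_cmnd *cmd)
{
    cmd->completed = 1;
}
```

The driver body still runs under the lock, so behavior is unchanged, but the locking now sits inside the driver, where a driver that does not need it could later drop it.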
  6. 29 Oct, 2010 1 commit
  7. 27 Oct, 2010 7 commits
  8. 26 Oct, 2010 6 commits
    • fs: do not assign default i_ino in new_inode · 85fe4025
      Committed by Christoph Hellwig
      Instead of always assigning an increasing inode number in new_inode
      move the call to assign it into those callers that actually need it.
      For now the set of callers that need it is estimated conservatively, that is
      the call is added to all filesystems that do not assign an i_ino
      by themselves.  For a few more filesystems we can avoid assigning
      any inode number given that they aren't user visible, and for others
      it could be done lazily when an inode number is actually needed,
      but that's left for later patches.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • IB/core: Add link layer type information to sysfs · 8ad330a0
      Committed by Eli Cohen
      Since an IB transport port may use either IB or Ethernet as its link layer,
      add the file /sys/class/infiniband/<device>/ports/<port_num>/link_layer to
      show the link layer for the port.
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
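A userspace program could read the new link_layer attribute roughly as sketched below. The path template comes from the commit message. The helper names `link_layer_path()` and `read_link_layer()` are hypothetical, as is the device name `mlx4_0` used in the usage note, and the sketch makes no assumption about the exact strings the file contains.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the sysfs path for a device/port pair; -1 if buf is too small. */
static int link_layer_path(char *buf, size_t len,
                           const char *device, unsigned port)
{
    int n;

    assert(buf != NULL && device != NULL);
    n = snprintf(buf, len, "/sys/class/infiniband/%s/ports/%u/link_layer",
                 device, port);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the first line of the attribute into `out`, without the trailing
 * newline. Returns 0 on success, -1 if the file is missing or unreadable
 * (for example on a kernel that predates this attribute).
 */
static int read_link_layer(const char *device, unsigned port,
                           char *out, size_t len)
{
    char path[256];
    FILE *f;

    if (len == 0 || link_layer_path(path, sizeof(path), device, port) != 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(out, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    out[strcspn(out, "\n")] = '\0';
    return 0;
}
```

A caller might invoke `read_link_layer("mlx4_0", 1, buf, sizeof(buf))` and treat a -1 return as "link layer unknown".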
    • IB/mlx4: Add VLAN support for IBoE · 4c3eb3ca
      Committed by Eli Cohen
      This patch allows IBoE traffic to be encapsulated in 802.1Q tagged
      VLAN frames.  The VLAN tag is encoded in the GID and derived from it
      by a simple computation.
      
      The netdev notifier callback is modified to catch VLAN device
      addition/removal and the port's GID table is updated to reflect the
      change, so that for each netdevice there is an entry in the GID table.
      When the port's GID table is exhausted, GID entries will not be added.
      Only children of the main interfaces can add to the GID table; if a
      VLAN interface is added on another VLAN interface (e.g. "vconfig add
      eth2.6 8"), then that interface will not add an entry to the GID
      table.
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • IB/core: Add VLAN support for IBoE · af7bd463
      Committed by Eli Cohen
      Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the
      GID derived from a link local address in the following way:
      
          GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN.
      
      The 3 bits user priority field of the packets are identical to the 3
      bits of the SL.
      
      In case of rdma_cm apps, the TOS field is used to generate the SL
      field by shifting right by 5 bits, effectively taking the 3 MS bits
      of the TOS field.
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • IB/mlx4: Add support for IBoE · fa417f7b
      Committed by Eli Cohen
      Add support for IBoE to mlx4_ib.  The bulk of the code is handling the
      new address vector fields; mlx4 needs the MAC address of a remote node
      to include it in a WQE (for datagrams) or in the QP context (for
      connected QPs).  Address resolution is done by assuming all unicast
      GIDs are link-local IPv6 addresses.
      
      Multicast group attach/detach needs to update the NIC's multicast
      filters; but since attaching a QP to a multicast group can be done
      before the QP is bound to a port, for IBoE we need to keep track of
      all multicast groups that a QP is attached to before it transitions
      from INIT to RTR (since it does not have a port in the INIT state).
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>
      
      [ Many things cleaned up and otherwise monkeyed with; hope I didn't
        introduce too many bugs.  - Roland ]
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  9. 25 Oct, 2010 4 commits
  10. 24 Oct, 2010 4 commits
    • IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144 · d0d68b86
      Committed by Jack Morgenstein
      The Node Description cannot be changed via MADs (it is read-only).
      Until now, it was changed in the driver via sysfs, and the new Node
      Description was simply inserted by the driver into MAD responses
      (replacing the description returned by FW).
      
      System startup scripts use the sysfs interface to change the node
      description at driver startup to show the hostname, etc. However, this
      has a race condition: the SM could discover the original FW node
      description rather than the system-specific description if it queried the
      port before the startup scripts finish running.
      
      For mlx4, we fix this with a new FW command (SET_NODE) that allows
      passing the new node description to FW.  When this command is invoked,
      FW sends a trap 144 to the SM.  When it gets this trap, the SM can
      query the node to obtain the new node description -- thus eliminating
      the effects of the race.
      
      This patch simply calls SET_NODE command when a new node description
      is entered via sysfs (thus causing trap 144 to be issued by the FW).
      We ignore all failures of the SET_NODE command (including those caused
      by using a device FW that predates the SET_NODE command), since in
      that case things work just as before.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • IB: Replace EXTRA_CFLAGS with ccflags-y · 7454159d
      Committed by matt mooney
      Signed-off-by: matt mooney <mfm@muteddisk.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • RDMA/ucma: Allow tuning the max listen backlog · 97cb7e40
      Committed by Steve Wise
      For iWARP connections, the connect request is carried in a TCP payload
      on an already established TCP connection.  So if the ucma's backlog is
      full, the connection request is transmitted and acked at the TCP level
      by the time the connect request gets dropped in the ucma.  The end
      result is the connection gets rejected by the iWARP provider.
      Further, a 32 node 256NP OpenMPI job will generate > 128 connect
      requests on some ranks.
      
      This patch increases the default max backlog to 1024, and adds a
      sysctl variable so the backlog can be adjusted at run time.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
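How a tunable maximum backlog might bound a caller's request can be sketched as below. The default of 1024 comes from the commit message. The clamping rule itself and the `effective_backlog()` name are assumptions for illustration, and the variable here is an ordinary global rather than a real sysctl.

```c
#include <assert.h>

/* Default taken from the commit message; adjustable at run time. */
static int max_backlog = 1024;

/*
 * Assumed clamping rule: honor a positive request below the maximum,
 * otherwise fall back to the maximum itself.
 */
static int effective_backlog(int requested)
{
    assert(max_backlog > 0);
    if (requested > 0 && requested < max_backlog)
        return requested;
    return max_backlog;
}
```

Raising `max_backlog` then lets a large job, such as the 256-process case mentioned in the commit message, ask for more than the previous limit.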
    • IPoIB: Set dev_id field of net_device · c3aa9b18
      Committed by Eli Cohen
      Use the net device's dev_id field to encode the port number of the pci
      device.  This can be used to associate a net device with the pci
      device's port. The encoding is: dev_id = port - 1.
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  11. 23 Oct, 2010 6 commits
  12. 18 Oct, 2010 1 commit
  13. 17 Oct, 2010 1 commit
  14. 15 Oct, 2010 1 commit
    • llseek: automatically add .llseek fop · 6038f373
      Committed by Arnd Bergmann
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // neither read nor write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
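The selection rules listed at the top of the semantic patch can be restated as a small decision function, shown here together with userspace mock-ups of the two seek behaviors the commit message contrasts. This is a simplified sketch, not the patch's logic: `struct fops_facts`, `choose_llseek()`, `struct mock_file` and the `mock_*` functions are hypothetical, and where the semantic patch adds nothing when it cannot prove how f_pos is used, this sketch simply falls through to the noop case.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>   /* SEEK_SET */

enum llseek_choice {
    KEEP_EXISTING, NO_LLSEEK, SEQ_LSEEK, DEFAULT_LLSEEK, NOOP_LLSEEK
};

/* Hypothetical summary of what was detected for one file_operations. */
struct fops_facts {
    int has_llseek;             /* .llseek already set */
    int uses_nonseekable_open;  /* open path calls nonseekable_open */
    int uses_seq_read;          /* .read is seq_read */
    int has_readdir;            /* .readdir is present */
    int read_touches_pos;       /* read accesses the offset */
    int write_touches_pos;      /* write accesses the offset */
};

/* Apply the rules in the same priority order as the semantic patch. */
static enum llseek_choice choose_llseek(const struct fops_facts *f)
{
    assert(f != NULL);
    if (f->has_llseek)
        return KEEP_EXISTING;
    if (f->uses_nonseekable_open)
        return NO_LLSEEK;
    if (f->uses_seq_read)
        return SEQ_LSEEK;
    if (f->has_readdir || f->read_touches_pos || f->write_touches_pos)
        return DEFAULT_LLSEEK;
    return NOOP_LLSEEK;
}

/* Mock of an open file with only the field needed here. */
struct mock_file {
    long long f_pos;
};

/* no_llseek-like behavior: seeking is reported as an error. */
static long long mock_no_llseek(struct mock_file *file, long long off, int whence)
{
    (void)file; (void)off; (void)whence;
    return -ESPIPE;
}

/* noop_llseek-like behavior: succeed without moving the offset. */
static long long mock_noop_llseek(struct mock_file *file, long long off, int whence)
{
    (void)off; (void)whence;
    return file->f_pos;
}
```

The difference between the last two functions is the compatibility point made in the commit message: the noop variant keeps existing callers of seek working, while the no_llseek variant makes them fail.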