1. 25 5月, 2007 1 次提交
  2. 19 5月, 2007 1 次提交
  3. 15 5月, 2007 6 次提交
  4. 10 5月, 2007 1 次提交
    • R
      Add suspend-related notifications for CPU hotplug · 8bb78442
      Rafael J. Wysocki 提交于
      Since nonboot CPUs are now disabled after tasks and devices have been
      frozen and the CPU hotplug infrastructure is used for this purpose, we need
      special CPU hotplug notifications that will help the CPU-hotplug-aware
      subsystems distinguish normal CPU hotplug events from CPU hotplug events
      related to a system-wide suspend or resume operation in progress.  This
      patch introduces such notifications and causes them to be used during
      suspend and resume transitions.  It also changes all of the
      CPU-hotplug-aware subsystems to take these notifications into consideration
      (for now they are handled in the same way as the corresponding "normal"
      ones).
      
      [oleg@tv-sign.ru: cleanups]
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8bb78442
  5. 09 5月, 2007 1 次提交
    • R
      IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules · f7c6a7b5
      Roland Dreier 提交于
      Export ib_umem_get()/ib_umem_release() and put low-level drivers in
      control of when to call ib_umem_get() to pin and DMA map userspace,
      rather than always calling it in ib_uverbs_reg_mr() before calling the
      low-level driver's reg_user_mr method.
      
      Also move these functions to be in the ib_core module instead of
      ib_uverbs, so that driver modules using them do not depend on
      ib_uverbs.
      
      This has a number of advantages:
       - It is better design from the standpoint of making generic code a
         library that can be used or overridden by device-specific code as
         the details of specific devices dictate.
       - Drivers that do not need to pin userspace memory regions do not
         need to take the performance hit of calling ib_mem_get().  For
         example, although I have not tried to implement it in this patch,
         the ipath driver should be able to avoid pinning memory and just
         use copy_{to,from}_user() to access userspace memory regions.
       - Buffers that need special mapping treatment can be identified by
         the low-level driver.  For example, it may be possible to solve
         some Altix-specific memory ordering issues with mthca CQs in
         userspace by mapping CQ buffers with extra flags.
       - Drivers that need to pin and DMA map userspace memory for things
         other than memory regions can use ib_umem_get() directly, instead
         of hacks using extra parameters to their reg_phys_mr method.  For
         example, the mlx4 driver that is pending being merged needs to pin
         and DMA map QP and CQ buffers, but it does not need to create a
         memory key for these buffers.  So the cleanest solution is for mlx4
         to call ib_umem_get() in the create_qp and create_cq methods.
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      f7c6a7b5
  6. 07 5月, 2007 2 次提交
    • R
      IB: Return "maybe missed event" hint from ib_req_notify_cq() · ed23a727
      Roland Dreier 提交于
      The semantics defined by the InfiniBand specification say that
      completion events are only generated when a completions is added to a
      completion queue (CQ) after completion notification is requested.  In
      other words, this means that the following race is possible:
      
      	while (CQ is not empty)
      		ib_poll_cq(CQ);
      	// new completion is added after while loop is exited
      	ib_req_notify_cq(CQ);
      	// no event is generated for the existing completion
      
      To close this race, the IB spec recommends doing another poll of the
      CQ after requesting notification.
      
      However, it is not always possible to arrange code this way (for
      example, we have found that NAPI for IPoIB cannot poll after
      requesting notification).  Also, some hardware (eg Mellanox HCAs)
      actually will generate an event for completions added before the call
      to ib_req_notify_cq() -- which is allowed by the spec, since there's
      no way for any upper-layer consumer to know exactly when a completion
      was really added -- so the extra poll of the CQ is just a waste.
      
      Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for
      ib_req_notify_cq() so that it can return a hint about whether the a
      completion may have been added before the request for notification.
      The return value of ib_req_notify_cq() is extended so:
      
      	 < 0	means an error occurred while requesting notification
      	== 0	means notification was requested successfully, and if
      		IB_CQ_REPORT_MISSED_EVENTS was passed in, then no
      		events were missed and it is safe to wait for another
      		event.
      	 > 0	is only returned if IB_CQ_REPORT_MISSED_EVENTS was
      		passed in.  It means that the consumer must poll the
      		CQ again to make sure it is empty to avoid the race
      		described above.
      
      We add a flag to enable this behavior rather than turning it on
      unconditionally, because checking for missed events may incur
      significant overhead for some low-level drivers, and consumers that
      don't care about the results of this test shouldn't be forced to pay
      for the test.
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      ed23a727
    • M
      IB: Add CQ comp_vector support · f4fd0b22
      Michael S. Tsirkin 提交于
      Add a num_comp_vectors member to struct ib_device and extend
      ib_create_cq() to pass in a comp_vector parameter -- this parallels
      the userspace libibverbs API.  Update all hardware drivers to set
      num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector
      value.  Pass the value of num_comp_vectors to userspace rather than
      hard-coding a value of 1.
      
      We want multiple CQ event vector support (via MSI-X or similar for
      adapters that can generate multiple interrupts), but it's not clear
      how many vectors we want, or how we want to deal with policy issues
      such as how to decide which vector to use or how to set up interrupt
      affinity.  This patch is useful for experimenting, since no core
      changes will be necessary when updating a driver to support multiple
      vectors, and we know that we want to make at least these changes
      anyway.
      Signed-off-by: NMichael S. Tsirkin <mst@dev.mellanox.co.il>
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      f4fd0b22
  7. 02 5月, 2007 1 次提交
  8. 26 4月, 2007 1 次提交
  9. 25 4月, 2007 1 次提交
  10. 13 4月, 2007 2 次提交
  11. 23 3月, 2007 1 次提交
  12. 02 3月, 2007 1 次提交
    • H
      IB/ehca: Fix sync between completion handler and destroy cq · 31726798
      Hoang-Nam Nguyen 提交于
      This patch fixes two issues reported by Roland Dreier and Christoph Hellwig:
      
      - Mismatched sync/locking between completion handler and destroy cq We
        introduced a counter nr_events per cq to track number of irq events
        seen. This counter is incremented when an event queue entry is seen
        and decremented after completion handler has been called regardless
        if scaling code is active or not. Note that nr_callbacks tracks
        number of events assigned to a cpu and both counters can potentially
        diverge.
      
        The sync between running completion handler and destroy cq is done
        by using the global spin lock ehca_cq_idr_lock.
      
      - Replace yield by wait_event on the counter above to become zero.
      Signed-off-by: NHoang-Nam Nguyen <hnguyen@de.ibm.com>
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      31726798
  13. 17 2月, 2007 4 次提交
  14. 15 2月, 2007 1 次提交
  15. 12 2月, 2007 1 次提交
  16. 11 2月, 2007 1 次提交
  17. 05 2月, 2007 3 次提交
    • H
      IB/ehca: Remove obsolete prototypes · b45bfcc1
      Hoang-Nam Nguyen 提交于
      Remove prototypes for functions that don't exist.
      Signed-off-by: NHoang-Nam Nguyen <hnguyen@de.ibm.com>
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      b45bfcc1
    • H
      IB/ehca: Remove use of do_mmap() · 4c34bdf5
      Hoang-Nam Nguyen 提交于
      This patch removes do_mmap() from ehca:
       - Call remap_pfn_range() for hardware register block
       - Use vm_insert_page() to register memory allocated for completion
         queues and queue pairs
       - The actual mmap() call/trigger is now controlled by user space,
         ie. libehca
      Signed-off-by: NHoang-Nam Nguyen <hnguyen@de.ibm.com>
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      4c34bdf5
    • M
      IB: Return qp pointer as part of ib_wc · 062dbb69
      Michael S. Tsirkin 提交于
      struct ib_wc currently only includes the local QP number: this matches
      the IB spec, but seems mostly useless. The following patch replaces
      this with the pointer to qp itself, and updates all low level drivers
      and all users.
      
      This has the following advantages:
      - Ability to get a per-qp context through wc->qp->qp_context
      - Existing drivers already have the qp pointer ready in poll cq, so
        this change actually saves a tiny bit (extra memory read) on data path
        (for ehca it would actually be expensive to find the QP pointer when
        polling a CQ, but ehca does not support SRQ so we can leave wc->qp as
        NULL for ehca)
      - Users that need the QP number can still get it through wc->qp->qp_num
      
      Use case:
      
      In IPoIB connected mode code, I have a common CQ shared by multiple
      QPs.  To track connection usage, I need a way to get at some per-QP
      context upon the completion, and I would like to avoid allocating
      context object per work request just to stick a QP pointer into it.
      With this code, I can just use wc->qp->qp_context.
      Signed-off-by: NMichael S. Tsirkin <mst@mellanox.co.il>
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      062dbb69
  18. 23 1月, 2007 2 次提交
  19. 10 1月, 2007 1 次提交
    • H
      IB/ehca: Use proper GFP_ flags for get_zeroed_page() · f2d91361
      Hoang-Nam Nguyen 提交于
      Here is a patch for ehca to use proper flag, ie. GFP_ATOMIC
      resp. GFP_KERNEL, when calling get_zeroed_page() to prevent "Bug:
      scheduling while atomic...". This error does not cause a kernel panic
      but makes ipoib un-usable afterwards.  It is reproducible on
      2.6.20-rc4 if one does ifconfig down during a flood ping test.  I have
      not observed this error in earlier releases incl. 2.6.20-rc1.
      
      This error occurs when a qp event/irq is received and ehca event
      handler allocates a control block/page to obtain HCA error data block.
      Use of GFP_ATOMIC when in interrupt context prevents this issue.
      
      Signed-off-by Hoang-Nam Nguyen <hnguyen@de.ibm.com>
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      f2d91361
  20. 08 12月, 2006 1 次提交
  21. 30 11月, 2006 1 次提交
  22. 14 11月, 2006 2 次提交
  23. 10 11月, 2006 1 次提交
  24. 31 10月, 2006 1 次提交
  25. 05 10月, 2006 1 次提交
    • D
      IRQ: Maintain regs pointer globally rather than passing to IRQ handlers · 7d12e780
      David Howells 提交于
      Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
      of passing regs around manually through all ~1800 interrupt handlers in the
      Linux kernel.
      
      The regs pointer is used in few places, but it potentially costs both stack
      space and code to pass it around.  On the FRV arch, removing the regs parameter
      from all the genirq function results in a 20% speed up of the IRQ exit path
      (ie: from leaving timer_interrupt() to leaving do_IRQ()).
      
      Where appropriate, an arch may override the generic storage facility and do
      something different with the variable.  On FRV, for instance, the address is
      maintained in GR28 at all times inside the kernel as part of general exception
      handling.
      
      Having looked over the code, it appears that the parameter may be handed down
      through up to twenty or so layers of functions.  Consider a USB character
      device attached to a USB hub, attached to a USB controller that posts its
      interrupts through a cascaded auxiliary interrupt controller.  A character
      device driver may want to pass regs to the sysrq handler through the input
      layer which adds another few layers of parameter passing.
      
      I've build this code with allyesconfig for x86_64 and i386.  I've runtested the
      main part of the code on FRV and i386, though I can't test most of the drivers.
      I've also done partial conversion for powerpc and MIPS - these at least compile
      with minimal configurations.
      
      This will affect all archs.  Mostly the changes should be relatively easy.
      Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
      
      	struct pt_regs *old_regs = set_irq_regs(regs);
      
      And put the old one back at the end:
      
      	set_irq_regs(old_regs);
      
      Don't pass regs through to generic_handle_irq() or __do_IRQ().
      
      In timer_interrupt(), this sort of change will be necessary:
      
      	-	update_process_times(user_mode(regs));
      	-	profile_tick(CPU_PROFILING, regs);
      	+	update_process_times(user_mode(get_irq_regs()));
      	+	profile_tick(CPU_PROFILING);
      
      I'd like to move update_process_times()'s use of get_irq_regs() into itself,
      except that i386, alone of the archs, uses something other than user_mode().
      
      Some notes on the interrupt handling in the drivers:
      
       (*) input_dev() is now gone entirely.  The regs pointer is no longer stored in
           the input_dev struct.
      
       (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking.  It does
           something different depending on whether it's been supplied with a regs
           pointer or not.
      
       (*) Various IRQ handler function pointers have been moved to type
           irq_handler_t.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
      7d12e780
  26. 03 10月, 2006 1 次提交