1. 01 Jun, 2011 8 commits
    •
      intel-iommu: Add domain check in domain_remove_one_dev_info · 8519dc44
      Mike Habeck committed
      The comment in domain_remove_one_dev_info() states "No need to compare
      PCI domain; it has to be the same". But that does not hold for the
      si_domain, which consists of all the identity-mapped PCI devices and
      can therefore span multiple PCI domains.  The code needs to validate
      the PCI domain too.
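
A minimal sketch of the kind of comparison described above. The `dev_info` struct, its field names, and `dev_info_matches()` are hypothetical and are not taken from the kernel source; the point is only that the PCI domain (segment) takes part in the match alongside bus and devfn.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device record; field names are illustrative only. */
struct dev_info {
    uint16_t segment; /* PCI domain (segment) number */
    uint8_t  bus;
    uint8_t  devfn;
};

/*
 * Match on segment as well as bus/devfn: two devices that live in
 * different PCI domains can share the same bus/devfn pair.
 */
static bool dev_info_matches(const struct dev_info *info,
                             uint16_t segment, uint8_t bus, uint8_t devfn)
{
    return info->segment == segment &&
           info->bus == bus &&
           info->devfn == devfn;
}
```

With a bus/devfn-only comparison, a device at 0001:00:01.0 would be indistinguishable from one at 0000:00:01.0 when both sit in the same list.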
      Signed-off-by: Mike Habeck <habeck@sgi.com>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    •
      intel-iommu: Remove Host Bridge devices from identity mapping · 825507d6
      Mike Travis committed
      When using the 1:1 (identity) PCI DMA remapping, PCI Host Bridge devices
      that do not use the IOMMU cause a kernel panic.  Fix that by not
      inserting those devices into the si_domain.
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Mike Habeck <habeck@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    •
      intel-iommu: Use coherent DMA mask when requested · c681d0ba
      Mike Travis committed
      The __intel_map_single function does not honor the passed-in DMA mask.
      As a result, the coherent DMA mask is not used when it is called from
      intel_alloc_coherent().
      Signed-off-by: Mike Travis <travis@sgi.com>
      Acked-by: Chris Wright <chrisw@sous-sol.org>
      Reviewed-by: Mike Habeck <habeck@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    •
      intel-iommu: Dont cache iova above 32bit · 1c9fc3d1
      Chris Wright committed
      Mike Travis and Mike Habeck reported an issue where iova allocation
      would return a range that was larger than a device's dma mask.
      
      https://lkml.org/lkml/2011/3/29/423
      
      The dmar initialization code will reserve all PCI MMIO regions and copy
      those reservations into a domain specific iova tree.  It is possible for
      one of those regions to be above the dma mask of a device.  It is typical
      to allocate iovas with a 32bit mask (despite device's dma mask possibly
      being larger) and cache the result until it exhausts the lower 32bit
      address space.  Freeing the iova range that is >= the last iova in the
      lower 32bit range when there is still an iova above the 32bit range will
      corrupt the cached iova by pointing it to a region that is above 32bit.
      If that region is also larger than the device's dma mask, a subsequent
      allocation will return an unusable iova and cause dma failure.
      
      Simply don't cache an iova that is above the 32bit caching boundary.
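
The rule in the last sentence can be sketched as follows. This is a simplified illustration under stated assumptions, not the kernel's iova code: `struct iova_cache`, its fields, and `cache_update()` are hypothetical names, and a single page frame number stands in for the cached tree node.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified state; not the kernel's iova structures. */
struct iova_cache {
    uint64_t dma_32bit_pfn; /* first page frame above the 32-bit range */
    uint64_t cached_pfn_lo; /* start of the range used as a search hint */
    bool     valid;
};

/*
 * Remember a range as the next search hint only when it lies below the
 * 32-bit caching boundary. Returns whether the cache was updated.
 */
static bool cache_update(struct iova_cache *c, uint64_t pfn_lo)
{
    if (pfn_lo >= c->dma_32bit_pfn)
        return false; /* above the boundary: never cache */

    c->cached_pfn_lo = pfn_lo;
    c->valid = true;
    return true;
}
```

With 4KiB pages the boundary would be page frame 1 << 20, so a reserved MMIO range sitting above 4GiB can no longer become the starting point for 32-bit allocations.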
      Reported-by: Mike Travis <travis@sgi.com>
      Reported-by: Mike Habeck <habeck@sgi.com>
      Cc: stable@kernel.org
      Acked-by: Mike Travis <travis@sgi.com>
      Tested-by: Mike Habeck <habeck@sgi.com>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    •
      intel-iommu: Speed up processing of the identity_mapping function · cb452a40
      Mike Travis committed
      When there are a large number of PCI devices, and the pass through
      option for iommu is set, much time is spent in the identity_mapping
      function hunting through the iommu domains to check if a specific
      device is "identity mapped".
      
      Speed up the function by checking the cached info to see if
      it's mapped to the static identity domain.
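
A rough sketch of the constant-time check this describes, assuming each device carries a cached pointer to its domain. `struct domain`, `struct device_info`, and `is_identity_mapped()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's domain and per-device info. */
struct domain {
    int id;
};

struct device_info {
    struct domain *domain; /* cached when the device was attached */
};

/* The single static identity domain shared by identity-mapped devices. */
static struct domain si_domain;

/*
 * One pointer comparison against the cached info, instead of walking
 * every domain looking for the device.
 */
static bool is_identity_mapped(const struct device_info *info)
{
    return info != NULL && info->domain == &si_domain;
}
```

The cost no longer depends on how many domains or devices exist, which is what matters on systems with very many PCI devices.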
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Mike Habeck <habeck@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    •
      intel-iommu: Check for identity mapping candidate using system dma mask · 8fcc5372
      Chris Wright committed
      The identity mapping code appears to make the assumption that if the
      device's dma_mask is greater than 32 bits the device can use identity
      mapping.  But that is not true: take the case where we have a 40bit
      device in a 44bit architecture. The device can potentially receive a
      physical address that it will truncate and cause incorrect addresses
      to be used.
      
      Instead check to see if the device's dma_mask is large enough
      to address the system's dma_mask.
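
The 40bit-device-on-44bit-system case can be worked through with a small sketch. `dma_bit_mask()` and `can_identity_map()` are hypothetical helpers written for this illustration; how the system's required mask is obtained is outside the sketch and simply passed in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `bits` bits set, for 1 <= bits <= 64. */
static uint64_t dma_bit_mask(unsigned bits)
{
    return bits >= 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * A device is an identity-mapping candidate only if it can address
 * everything the system may hand it, not merely more than 32 bits.
 */
static bool can_identity_map(uint64_t dev_mask, uint64_t required_mask)
{
    return dev_mask >= required_mask;
}
```

A 40bit device passes a "greater than 32 bits" test, yet `can_identity_map(dma_bit_mask(40), dma_bit_mask(44))` is false, so such a device would be kept out of the identity map.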
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Mike Habeck <habeck@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    •
      intel-iommu: Only unlink device domains from iommu · 9b4554b2
      Alex Williamson committed
      Commit a97590e5 added unlinking domains from iommus to reciprocate the
      iommu from domains unlinking that was already done.  We actually want
      to only do this for device domains and never for the static
      identity map domain or VM domains.  The SI domain is special and
      never freed, while VM domain->id values live in their own special
      address space, separate from iommu->domain_ids.
      
      In the current code, a VM can get domain->id zero, then mark that
      domain unused when unbound from pci-stub.  This leads to DMAR
      write faults when the device is re-bound to the host driver.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    •
      intel-iommu: Enable super page (2MiB, 1GiB, etc.) support · 6dd9a7c7
      Youquan Song committed
      There are no externally-visible changes with this. In the loop in the
      internal __domain_mapping() function, we simply detect if we are mapping:
        - size >= 2MiB, and
        - virtual address aligned to 2MiB, and
        - physical address aligned to 2MiB, and
        - on hardware that supports superpages.
      
      (and likewise for larger superpages).
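
The conditions listed above can be sketched as a standalone helper. This is an illustration under stated assumptions, not the code in __domain_mapping(): `pick_page_size()` and the `hw_2m` / `hw_1g` capability flags are hypothetical, and only the 2MiB and 1GiB sizes are covered.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K (UINT64_C(1) << 12)
#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/*
 * Return the largest page size (in bytes) usable for a mapping that
 * starts at virtual address `iova` and physical address `phys` and
 * covers `size` bytes. hw_2m and hw_1g are assumed capability flags.
 */
static uint64_t pick_page_size(uint64_t iova, uint64_t phys, uint64_t size,
                               bool hw_2m, bool hw_1g)
{
    /* A low bit set in either address breaks alignment for both. */
    const uint64_t both = iova | phys;

    if (hw_1g && size >= SZ_1G && (both & (SZ_1G - 1)) == 0)
        return SZ_1G;
    if (hw_2m && size >= SZ_2M && (both & (SZ_2M - 1)) == 0)
        return SZ_2M;
    return SZ_4K;
}
```

A mapping loop could call such a helper once per step and advance by the returned size, so a large aligned region is covered by a few superpages followed by ordinary 4KiB pages for any unaligned tail.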
      
      We automatically use a superpage for such mappings. We never have to
      worry about *breaking* superpages, since we trust that we will always
      *unmap* the same range that was mapped. So all we need to do is ensure
      that dma_pte_clear_range() will also cope with superpages.
      
      Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
      it can return a PTE at the appropriate level rather than always
      extending the page tables all the way down to level 1. Again, this is
      simplified by the fact that we should never encounter existing small
      pages when we're creating a mapping; any old mapping that used the same
      virtual range will have been entirely removed and its obsolete page
      tables freed.
      
      Provide an 'intel_iommu=sp_off' argument on the command line as a
      chicken bit. Not that it should ever be required.
      
      ==
      
      The original commit seen in the iommu-2.6.git was Youquan's
      implementation (and completion) of my own half-baked code which I'd
      typed into an email. Followed by half a dozen subsequent 'fixes'.
      
      I've taken the unusual step of rewriting history and collapsing the
      original commits in order to keep the main history simpler, and make
      life easier for the people who are going to have to backport this to
      older kernels. And also so I can give it a more coherent commit comment
      which (hopefully) gives a better explanation of what's going on.
      
      The original sequence of commits leading to identical code was:
      
      Youquan Song (3):
            intel-iommu: super page support
            intel-iommu: Fix superpage alignment calculation error
            intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()
      
      David Woodhouse (4):
            intel-iommu: Precalculate superpage support for dmar_domain
            intel-iommu: Fix hardware_largepage_caps()
            intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
            intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages
      Signed-off-by: Youquan Song <youquan.song@intel.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
  2. 24 May, 2011 3 commits
  3. 19 May, 2011 6 commits
  4. 18 May, 2011 19 commits
  5. 17 May, 2011 4 commits
    •
      Merge branch 'timers-fixes-for-linus' of... · a085963a
      Linus Torvalds committed
      Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        tick: Clear broadcast active bit when switching to oneshot
        rtc: mc13xxx: Don't call rtc_device_register while holding lock
        rtc: rp5c01: Initialize drvdata before registering device
        rtc: pcap: Initialize drvdata before registering device
        rtc: msm6242: Initialize drvdata before registering device
        rtc: max8998: Initialize drvdata before registering device
        rtc: max8925: Initialize drvdata before registering device
        rtc: m41t80: Initialize clientdata before registering device
        rtc: ds1286: Initialize drvdata before registering device
        rtc: ep93xx: Initialize drvdata before registering device
        rtc: davinci: Initialize drvdata before registering device
        rtc: mxc: Initialize drvdata before registering device
        clocksource: Install completely before selecting
    •
      x86, AMD: Fix ARAT feature setting again · 14fb57dc
      Borislav Petkov committed
      Trying to enable the local APIC timer on early K8 revisions
      uncovers a number of other issues with it, in conjunction with
      the C1E enter path on AMD. Fixing those causes much more churn
      and trouble than the benefit of using that timer brings, so
      don't enable it on K8 at all, falling back to the original
      functionality the kernel had in that respect.
      Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com>
      Cc: Boris Ostrovsky <Boris.Ostrovsky@amd.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
      Cc: Nick Bowler <nbowler@elliptictech.com>
      Cc: Joerg-Volker-Peetz <jvpeetz@web.de>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Link: http://lkml.kernel.org/r/1305636919-31165-3-git-send-email-bp@amd64.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    •
      Revert "x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors" · 328935e6
      Borislav Petkov committed
      This reverts commit e20a2d20, as it crashes
      certain boxes with specific AMD CPU models.
      
      Moving the lower endpoint of the Erratum 400 check to accommodate
      earlier K8 revisions (A-E) opens a can of worms which is simply
      not worth fixing properly by tweaking the errata checking
      framework:
      
      * missing IntPending MSR on revisions < CG causes #GP:
      
      http://marc.info/?l=linux-kernel&m=130541471818831
      
      * makes earlier revisions use the LAPIC timer instead of the C1E
      idle routine which switches to HPET, thus not waking up in
      deeper C-states:
      
      http://lkml.org/lkml/2011/4/24/20
      
      Therefore, leave the original boundary starting with K8-revF.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    •
      scsi: remove performance regression due to async queue run · 9937a5e2
      Jens Axboe committed
      Commit c21e6beb removed our queue request_fn re-enter
      protection, and defaulted to always running the queues from
      kblockd to be safe. This was a known potential slow down,
      but should be safe.
      
      Unfortunately this is causing big performance regressions for
      some, so we need to improve this logic. Looking into the details
      of the re-enter, the real issue is on requeue of requests.
      
      Requeue of requests upon seeing a BUSY condition from the device
      ends up re-running the queue, causing traces like this:
      
      scsi_request_fn()
              scsi_dispatch_cmd()
                      scsi_queue_insert()
                              __scsi_queue_insert()
                                      scsi_run_queue()
                                              scsi_request_fn()
                                                      ...
      
      potentially causing the issue we want to avoid. So special
      case the requeue re-run of the queue, but improve it to offload
      the entire run of local queue and starved queue from a single
      workqueue callback. This is a lot better than potentially
      kicking off a workqueue run for each device seen.
      
      This also fixes the issue of the local device going into recursion,
      since the above mentioned commit never moved that queue run out
      of line.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>