Commits · 70e535d1e5d1e4317e894d6228b762cf9c3fbc6a · OpenHarmony / kernel_linux

01 Jun, 2011 9 commits

intel-iommu: Fix off-by-one in RMRR setup · 70e535d1

Committed by David Woodhouse on May 31, 2011

We were mapping an extra byte (and hence usually an extra page):
iommu_prepare_identity_map() expects to be given an 'end' argument which
is the last byte to be mapped; not the first byte *not* to be mapped.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

70e535d1
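
The inclusive-end convention described in the commit above (70e535d1) can be illustrated with a small sketch. This is not the kernel code: the helper name and `SKETCH_PAGE_SHIFT` are hypothetical, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Hypothetical helper, not kernel code: number of pages touched by a
 * mapping whose 'end' is the LAST byte to be mapped (inclusive).
 */
uint64_t sketch_pages_for_inclusive_range(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return (end >> SKETCH_PAGE_SHIFT) - (start >> SKETCH_PAGE_SHIFT) + 1;
}
```

For a region occupying bytes 0x1000 through 0x1fff, passing 0x1fff as the end covers one page, while passing the exclusive end 0x2000 covers two, which is the extra page the commit message mentions.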

intel-iommu: Add domain check in domain_remove_one_dev_info · 8519dc44

Committed by Mike Habeck on May 28, 2011

The comment in domain_remove_one_dev_info() states "No need to compare
PCI domain; it has to be the same". But for the si_domain that isn't
going to be true, as it consists of all the PCI devices that are
identity mapped thus multiple PCI domains can be in si_domain.  The
code needs to validate the PCI domain too.
Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

8519dc44

intel-iommu: Remove Host Bridge devices from identity mapping · 825507d6

Committed by Mike Travis on May 28, 2011

When using the 1:1 (identity) PCI DMA remapping, PCI Host Bridge devices
that do not use the IOMMU cause a kernel panic.  Fix that by not
inserting those devices into the si_domain.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

825507d6

intel-iommu: Use coherent DMA mask when requested · c681d0ba

Committed by Mike Travis on May 28, 2011

The __intel_map_single function is not honoring the passed in DMA mask.
This results in not using the coherent DMA mask when called from
intel_alloc_coherent().
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

c681d0ba

intel-iommu: Dont cache iova above 32bit · 1c9fc3d1

Committed by Chris Wright on May 28, 2011

Mike Travis and Mike Habeck reported an issue where iova allocation
would return a range that was larger than a device's dma mask.

https://lkml.org/lkml/2011/3/29/423

The dmar initialization code will reserve all PCI MMIO regions and copy
those reservations into a domain specific iova tree.  It is possible for
one of those regions to be above the dma mask of a device.  It is typical
to allocate iovas with a 32bit mask (despite device's dma mask possibly
being larger) and cache the result until it exhausts the lower 32bit
address space.  Freeing the iova range that is >= the last iova in the
lower 32bit range when there is still an iova above the 32bit range will
corrupt the cached iova by pointing it to a region that is above 32bit.
If that region is also larger than the device's dma mask, a subsequent
allocation will return an unusable iova and cause dma failure.

Simply don't cache an iova that is above the 32bit caching boundary.
Reported-by: Mike Travis <travis@sgi.com>
Reported-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Acked-by: Mike Travis <travis@sgi.com>
Tested-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

1c9fc3d1
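
The rule "don't cache an iova that is above the 32bit caching boundary" from the commit above (1c9fc3d1) might look roughly like the sketch below. The structures and the function name are simplified, hypothetical stand-ins rather than the kernel's own definitions; the real code keeps its ranges in a tree instead of taking the neighbour as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel's iova structures. */
struct sketch_iova {
    uint64_t pfn_lo;
    uint64_t pfn_hi;
};

struct sketch_iova_domain {
    uint64_t dma_32bit_pfn;             /* first pfn above the 32bit caching boundary */
    const struct sketch_iova *cached32; /* cached allocation hint, or NULL */
};

/*
 * Called when a range is freed and the cache has to move on to 'next',
 * the neighbouring range (NULL if there is none). Only a range that
 * starts below the 32bit boundary may become the cached hint.
 */
void sketch_update_cache_on_free(struct sketch_iova_domain *d,
                                 const struct sketch_iova *next)
{
    assert(d != NULL);
    if (next != NULL && next->pfn_lo < d->dma_32bit_pfn)
        d->cached32 = next;
    else
        d->cached32 = NULL;
}
```

Dropping the hint entirely when the neighbour lies above the boundary means the next 32bit allocation has to search again, but it can no longer start from a region the device cannot address.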

intel-iommu: Speed up processing of the identity_mapping function · cb452a40

Committed by Mike Travis on May 28, 2011

When there are a large number of PCI devices, and the pass through
option for iommu is set, much time is spent in the identity_mapping
function hunting through the iommu domains to check if a specific
device is "identity mapped".

Speed up the function by checking the cached info to see if
it's mapped to the static identity domain.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

cb452a40

intel-iommu: Check for identity mapping candidate using system dma mask · 8fcc5372

Committed by Chris Wright on May 28, 2011

The identity mapping code appears to make the assumption that if the
device's dma_mask is greater than 32bits the device can use identity
mapping.  But that is not true: take the case where we have a 40bit
device in a 44bit architecture. The device can potentially receive a
physical address that it will truncate and cause incorrect addresses
to be used.

Instead check to see if the device's dma_mask is large enough
to address the system's dma_mask.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

8fcc5372
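
The 40bit-device-on-44bit-system case from the commit above (8fcc5372) can be worked through with a short sketch. All function names are hypothetical, and the system's mask is passed in as a plain parameter here rather than queried from the platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n bits set, for 1 <= n <= 64. */
uint64_t sketch_dma_bit_mask(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return n == 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* Old assumption: any mask wider than 32 bits is good enough. */
bool sketch_old_check(uint64_t device_mask)
{
    return device_mask > sketch_dma_bit_mask(32);
}

/* New rule: the device must reach every address the system may hand it. */
bool sketch_new_check(uint64_t device_mask, uint64_t system_mask)
{
    return device_mask >= system_mask;
}
```

With a 40bit device mask the old check accepts the device, while the new check rejects it on a 44bit system and still accepts a 64bit-capable device.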

intel-iommu: Only unlink device domains from iommu · 9b4554b2

Committed by Alex Williamson on May 24, 2011

Commit a97590e5 added unlinking domains from iommus to reciprocate the
iommu from domains unlinking that was already done.  We actually want
to only do this for device domains and never for the static
identity map domain or VM domains.  The SI domain is special and
never freed, while VM domain->id lives in their own special address
space, separate from iommu->domain_ids.

In the current code, a VM can get domain->id zero, then mark that
domain unused when unbound from pci-stub.  This leads to DMAR
write faults when the device is re-bound to the host driver.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

9b4554b2

intel-iommu: Enable super page (2MiB, 1GiB, etc.) support · 6dd9a7c7

Committed by Youquan Song on May 25, 2011

There are no externally-visible changes with this. In the loop in the
internal __domain_mapping() function, we simply detect if we are mapping:
  - size >= 2MiB, and
  - virtual address aligned to 2MiB, and
  - physical address aligned to 2MiB, and
  - on hardware that supports superpages.

(and likewise for larger superpages).

We automatically use a superpage for such mappings. We never have to
worry about *breaking* superpages, since we trust that we will always
*unmap* the same range that was mapped. So all we need to do is ensure
that dma_pte_clear_range() will also cope with superpages.

Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
it can return a PTE at the appropriate level rather than always
extending the page tables all the way down to level 1. Again, this is
simplified by the fact that we should never encounter existing small
pages when we're creating a mapping; any old mapping that used the same
virtual range will have been entirely removed and its obsolete page
tables freed.

Provide an 'intel_iommu=sp_off' argument on the command line as a
chicken bit. Not that it should ever be required.

==

The original commit seen in the iommu-2.6.git was Youquan's
implementation (and completion) of my own half-baked code which I'd
typed into an email. Followed by half a dozen subsequent 'fixes'.

I've taken the unusual step of rewriting history and collapsing the
original commits in order to keep the main history simpler, and make
life easier for the people who are going to have to backport this to
older kernels. And also so I can give it a more coherent commit comment
which (hopefully) gives a better explanation of what's going on.

The original sequence of commits leading to identical code was:

Youquan Song (3):
      intel-iommu: super page support
      intel-iommu: Fix superpage alignment calculation error
      intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()

David Woodhouse (4):
      intel-iommu: Precalculate superpage support for dmar_domain
      intel-iommu: Fix hardware_largepage_caps()
      intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
      intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

6dd9a7c7
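
The superpage conditions listed in the commit above (6dd9a7c7: enough pages, both addresses aligned, hardware support) can be sketched as a level-selection helper. This is an illustration under stated assumptions, not the kernel's implementation: the function name is hypothetical, addresses are given as 4KiB page frame numbers, and each page-table level is assumed to cover 9 more address bits.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LEVEL_STRIDE 9 /* assumption: 512 entries per page-table level */

/*
 * Hypothetical helper: returns 1 for a 4KiB page, 2 for a 2MiB superpage,
 * 3 for 1GiB, and so on. 'hw_superpage_levels' is how many superpage
 * sizes the hardware supports (0 means none).
 */
int sketch_largepage_level(uint64_t iov_pfn, uint64_t phys_pfn,
                           uint64_t nr_pages, int hw_superpage_levels)
{
    const uint64_t stride_mask = (UINT64_C(1) << SKETCH_LEVEL_STRIDE) - 1;
    uint64_t merged = iov_pfn | phys_pfn; /* aligned only if both are */
    int level = 1;

    assert(hw_superpage_levels >= 0);
    for (int i = 0; i < hw_superpage_levels; i++) {
        if (merged & stride_mask)
            break;                        /* an address is not aligned */
        nr_pages >>= SKETCH_LEVEL_STRIDE;
        if (nr_pages == 0)
            break;                        /* mapping too small for this level */
        merged >>= SKETCH_LEVEL_STRIDE;
        level++;
    }
    return level;
}
```

OR-ing the two frame numbers lets one mask test cover both alignment conditions, and shifting by the stride on each iteration repeats the same test for the next larger superpage size.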

24 May, 2011 3 commits

intel-iommu: Flush unmaps at domain_exit · 7b668357

Committed by Alex Williamson on May 24, 2011

We typically batch unmaps to be lazily flushed out at
regular intervals.  When we destroy a domain, we need
to force a flush of these lazy unmaps to be sure none
reference the domain we're about to free.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=35062
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org

7b668357

intel-iommu: Remove obsolete comment from detect_intel_iommu · b3a530e4

Committed by Jan Kiszka on May 15, 2011

Since cacd4213, this comment no longer applies.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

b3a530e4

intel-iommu: fix VT-d PMR disable for TXT on S3 resume · b779260b

Committed by Joseph Cihula on May 03, 2011

This patch is a follow-on to https://lkml.org/lkml/2011/3/21/239, which
was merged as commit 51a63e67.

This patch adds support for S3, as pointed out by Chris Wright.
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

b779260b

17 May, 2011 1 commit

PCI: Clear bridge resource flags if requested size is 0 · 93d2175d

Committed by Yinghai Lu on May 13, 2011

During pci remove/rescan testing found:

  pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
  pci 0000:c0:03.0:   bridge window [io  0x1000-0x0fff]
  pci 0000:c0:03.0:   bridge window [mem 0xf0000000-0xf00fffff]
  pci 0000:c0:03.0:   bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
  pci 0000:c0:03.0: device not available (can't reserve [io  0x1000-0x0fff])
  pci 0000:c0:03.0: Error enabling bridge (-22), continuing
  pci 0000:c0:03.0: enabling bus mastering
  pci 0000:c0:03.0: setting latency timer to 64
  pcieport 0000:c0:03.0: device not available (can't reserve [io  0x1000-0x0fff])
  pcieport: probe of 0000:c0:03.0 failed with error -22

This bug was caused by commit c8adf9a3 ("PCI: pre-allocate
additional resources to devices only after successful allocation of
essential resources.")

After that commit, pci_hotplug_io_size is used as additional_io_size
rather than as the minimum size, so the resource no longer goes through
the resource_size(res) != 0 path and is not reset.

The root cause is that pci_bridge_check_ranges sets the RESOURCE_IO flag
for the PCI bridge. If the children later turn out not to need IO
resources, those bridge resources are not allocated, but the flag is
still set, which confuses pci_enable_bridges later.

related code:

   static void assign_requested_resources_sorted(struct resource_list *head,
                                    struct resource_list_x *fail_head)
   {
           struct resource *res;
           struct resource_list *list;
           int idx;

           for (list = head->next; list; list = list->next) {
                   res = list->res;
                   idx = res - &list->dev->resource[0];
                   if (resource_size(res) && pci_assign_resource(list->dev, idx)) {
   ...
                           reset_resource(res);
                   }
           }
   }

Finally, we have to clear the flags in pbus_size_mem/io when the requested
size == 0 and !add_head, because in this case the resource will not go
through adjust_resources_sorted().

Just make size1 = size0 when !add_head; that gets the flags cleared.

At the same time, a resource with requested size == 0 and add_size != 0
will still be kept in head and add_list, because we do not clear the
flags for it.

After this, we get the right result:

  pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
  pci 0000:c0:03.0:   bridge window [io  disabled]
  pci 0000:c0:03.0:   bridge window [mem 0xf0000000-0xf00fffff]
  pci 0000:c0:03.0:   bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
  pci 0000:c0:03.0: enabling bus mastering
  pci 0000:c0:03.0: setting latency timer to 64
  pcieport 0000:c0:03.0: setting latency timer to 64
  pcieport 0000:c0:03.0: irq 160 for MSI/MSI-X
  pcieport 0000:c0:03.0: Signaling PME through PCIe PME interrupt
  pci 0000:c4:00.0: Signaling PME through PCIe PME interrupt
  pcie_pme 0000:c0:03.0:pcie01: service driver pcie_pme loaded
  aer 0000:c0:03.0:pcie02: service driver aer loaded
  pciehp 0000:c0:03.0:pcie04: Hotplug Controller:

v3: simpler fix; also fix one typo in pbus_size_mem
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

93d2175d
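
The empty window `[io  0x1000-0x0fff]` in the log of the commit above (93d2175d) has an inclusive end one below its start, so its size is zero. A minimal sketch of why clearing the flags of such a window keeps it from being enabled later follows; the struct and function names are hypothetical stand-ins, and the size formula assumes an inclusive end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct resource. */
struct sketch_resource {
    uint64_t start;
    uint64_t end;        /* inclusive */
    unsigned long flags; /* 0 means the window is disabled */
};

uint64_t sketch_resource_size(const struct sketch_resource *res)
{
    return res->end - res->start + 1;
}

/* Clear the flags of a window nobody asked for, so it is never enabled. */
void sketch_disable_if_empty(struct sketch_resource *res)
{
    assert(res != NULL);
    if (sketch_resource_size(res) == 0)
        res->flags = 0;
}
```

A window with cleared flags corresponds to the `[io  disabled]` line in the corrected log, while the non-empty memory windows keep their flags.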

21 Apr, 2011 1 commit

intel_iommu: disable all VT-d PMRs when TXT launched · 51a63e67

Committed by Joseph Cihula on Mar 21, 2011

Intel VT-d Protected Memory Regions (PMRs) are supposed to be disabled,
on each VT-d engine, after DMA remapping is enabled on the engines.
This is because the behavior of having both enabled is not deterministic
and because, if TXT has been used to launch the kernel, the PMRs may be
programmed to cover memory regions that will be used for DMA.

Under some circumstances (certain quirks detected, lack of multiple
devices, etc.), the current code does not set up DMA remapping on some
VT-d engines. In such cases it also skips disabling the PMRs. This
causes failures when the kernel is launched with TXT (most often this
occurs on the graphics engine and results in colored vertical bars on
the display).

This patch detects when the kernel has been launched with TXT and then
disables the PMRs on all VT-d engines. In some cases where the reason
that remapping is not being enabled is due to possible ACPI DMAR table
errors, the VT-d engine addresses may not be correct and thus not able
to be safely programmed even to disable PMRs. Because part of the TXT
launch process is the verification of these addresses, it will always be
safe to disable PMRs if the TXT launch has succeeded, which is why this
is only done in such cases.
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

51a63e67

12 Apr, 2011 3 commits

PCI: pci-label: Fix build failure when CONFIG_NLS is set to 'm' by allmodconfig · 8a226e00

Committed by Randy Dunlap on Mar 29, 2011

Create a kconfig option symbol for PCI_LABEL and enable it
when DMI || ACPI are enabled.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

8a226e00

PM / Hibernate: Introduce CONFIG_HIBERNATE_CALLBACKS · 1f112cee

Committed by Rafael J. Wysocki on Apr 11, 2011

Xen save/restore is going to use hibernate device callbacks for
quiescing devices and putting them back to normal operations and it
would need to select CONFIG_HIBERNATION for this purpose.  However,
that also would cause the hibernate interfaces for user space to be
enabled, which might confuse user space, because the Xen kernels
don't support hibernation.  Moreover, it would be wasteful, as it
would make the Xen kernels include a substantial amount of code that
they would never use.

To address this issue introduce new power management Kconfig option
CONFIG_HIBERNATE_CALLBACKS, such that it will only select the code
that is necessary for the hibernate device callbacks to work and make
CONFIG_HIBERNATION select it.  Then, Xen save/restore will be able to
select CONFIG_HIBERNATE_CALLBACKS without dragging the entire
hibernate code along with it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>

1f112cee

pci: fix PCI bus allocation alignment handling · b42282e5

Committed by Linus Torvalds on Apr 11, 2011

In commit 13583b16 ("PCI: refactor io size calculation code") Ram
had a thinko in the refactorization of the code: the end result used the
variable 'align' for the bus alignment, but the original code used
'min_align'.

Since then, another use of that 'align' variable got introduced by
commit c8adf9a3 ("PCI: pre-allocate additional resources to devices
only after successful allocation of essential resources.")

Fix both of those uses to use 'min_align' as they should.

Daniel Hellstrom <daniel@gaisler.com>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b42282e5

31 Mar, 2011 1 commit

Fix common misspellings · 25985edc

Committed by Lucas De Marchi on Mar 30, 2011

Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>

25985edc

29 Mar, 2011 1 commit

drivers: Final irq namespace conversion · dced35ae

Committed by Thomas Gleixner on Mar 28, 2011

Scripted with coccinelle.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

dced35ae

24 Mar, 2011 2 commits

userns: security: make capabilities relative to the user namespace · 3486740a

Committed by Serge E. Hallyn on Mar 23, 2011

- Introduce ns_capable to test for a capability in a non-default
  user namespace.
- Teach cap_capable to handle capabilities in a non-default
  user namespace.

The motivation is to get to the unprivileged creation of new
namespaces.  It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.

I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.

Changelog:
	11/05/2010: [serge] add apparmor
	12/14/2010: [serge] fix capabilities to created user namespaces
	Without this, if user serge creates a user_ns, he won't have
	capabilities to the user_ns he created.  This is because we
	were first checking whether his effective caps had the caps
	he needed and returning -EPERM if not, and THEN checking whether
	he was the creator.  Reverse those checks.
	12/16/2010: [serge] security_real_capable needs ns argument in !security case
	01/11/2011: [serge] add task_ns_capable helper
	01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
	02/16/2011: [serge] fix a logic bug: the root user is always creator of
		    init_user_ns, but should not always have capabilities to
		    it!  Fix the check in cap_capable().
	02/21/2011: Add the required user_ns parameter to security_capable,
		    fixing a compile failure.
	02/23/2011: Convert some macros to functions as per akpm comments.  Some
		    couldn't be converted because we can't easily forward-declare
		    them (they are inline if !SECURITY, extern if SECURITY).  Add
		    a current_user_ns function so we can use it in capability.h
		    without #including cred.h.  Move all forward declarations
		    together to the top of the #ifdef __KERNEL__ section, and use
		    kernel-doc format.
	02/23/2011: Per dhowells, clean up comment in cap_capable().
	02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.

(Original written and signed off by Eric;  latest, modified version
acked by him)

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3486740a

PCI / Intel IOMMU: Use syscore_ops instead of sysdev class and sysdev · 134fac3f

Committed by Rafael J. Wysocki on Mar 23, 2011

The Intel IOMMU subsystem uses a sysdev class and a sysdev for
executing iommu_suspend() after interrupts have been turned off
on the boot CPU (during system suspend) and for executing
iommu_resume() before turning on interrupts on the boot CPU
(during system resume).  However, since both of these functions
ignore their arguments, the entire mechanism may be replaced with a
struct syscore_ops object which is simpler.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>

134fac3f

22 Mar, 2011 6 commits

ACPI, APEI, Add PCIe AER error information printing support · c413d768

Committed by Huang Ying on Feb 21, 2011

The AER error information printing support is implemented in
drivers/pci/pcie/aer/aer_print.c.  So some string constants, functions
and macros definitions can be re-used without being exported.

The original PCIe AER error information printing function is not
re-used directly because the overall format is quite different, and
changing the original printing format might break some existing users'
scripts.
Signed-off-by: Huang Ying <ying.huang@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

c413d768

PCIe, AER, use pre-generated prefix in error information printing · b64a4414

Committed by Huang Ying on Feb 21, 2011

When printing PCIe AER error information, each line is prefixed with
PCIe device and driver information.  In the original implementation, the
prefix is generated when each line is printed.  In fact, all lines
share the same prefix.  So this patch pre-generates the prefix and
uses it when each line is printed.

In addition to pre-generating the common prefix, the trailing white
spaces in string constants and NULLs in char * array constants can be
removed too.  These can reduce the object file size further.

The size of the object file before and after the change is as follows:

           text    data     bss     dec
before:    3038       0       0    3038
after:     2118       0       0    2118
Signed-off-by: Huang Ying <ying.huang@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

b64a4414
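
The idea in the commit above (b64a4414), building the shared prefix once and reusing it for every line, can be sketched in user-space C. The function names and the exact prefix layout ("driver device: ") are assumptions made for illustration and are not taken from the AER code.

```c
#include <assert.h>
#include <stdio.h>

/* Build the shared "driver device: " prefix once. Returns snprintf's result. */
int sketch_make_prefix(char *buf, size_t len,
                       const char *driver, const char *device)
{
    assert(buf != NULL && driver != NULL && device != NULL);
    return snprintf(buf, len, "%s %s: ", driver, device);
}

/* Format one report line using the pre-generated prefix. */
int sketch_format_line(char *out, size_t len,
                       const char *prefix, const char *text)
{
    assert(out != NULL && prefix != NULL && text != NULL);
    return snprintf(out, len, "%s%s", prefix, text);
}
```

A caller would invoke `sketch_make_prefix()` once per error report and then pass the same buffer to `sketch_format_line()` for each line, so the device and driver names are formatted only once.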

PCI: Disable ASPM when _OSC control is not granted for PCIe services · eca67315

Committed by Naga Chumbalkar on Mar 21, 2011

v3 -> v2: Added text to describe the problem
v2 -> v1: Split this patch from v1
v1	: Part of: http://marc.info/?l=linux-pci&m=130042212003242&w=2

Disable ASPM when no _OSC control for PCIe services is granted
by the BIOS. This is to protect systems with a buggy BIOS that
did not set the ACPI FADT "ASPM Controls" bit even though the
underlying HW can't do ASPM.

To turn "on" ASPM the minimum the BIOS needs to do:
1. Clear the ACPI FADT "ASPM Controls" bit.
2. Support _OSC appropriately

There is no _OSC Control bit for ASPM. However, we expect the BIOS to
support _OSC for a Root Bridge that originates a PCIe hierarchy. If this
is not the case - we are better off not enabling ASPM on that server.

Commit 852972ac (ACPI: Disable ASPM if the
Platform won't provide _OSC control for PCIe) describes the above scenario.
To quote verbatim from there:
[The PCI SIG documentation for the _OSC OS/firmware handshaking interface
states:

"If the _OSC control method is absent from the scope of a host bridge
device, then the operating system must not enable or attempt to use any
features defined in this section for the hierarchy originated by the host
bridge."

The obvious interpretation of this is that the OS should not attempt to use
PCIe hotplug, PME or AER - however, the specification also notes that an
_OSC method is *required* for PCIe hierarchies, and experimental validation
with An Alternative OS indicates that it doesn't use any PCIe functionality
if the _OSC method is missing. That arguably means we shouldn't be using
MSI or extended config space, but right now our problems seem to be limited
to vendors being surprised when ASPM gets enabled on machines when other
OSs refuse to do so. So, for now, let's just disable ASPM if the _OSC
method doesn't exist or refuses to hand over PCIe capability control.]
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

eca67315

PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs · bbfa306a

Committed by Naga Chumbalkar on Mar 21, 2011

v3 -> v2: Modified the text that describes the problem
v2 -> v1: Returned -EPERM
v1      : http://marc.info/?l=linux-pci&m=130013194803727&w=2

For servers whose hardware cannot handle ASPM the BIOS ought to set the
FADT bit shown below:
In Sec 5.2.9.3 (IA-PC Boot Arch. Flags) of ACPI4.0a Specification, please
see Table 5-11:
PCIe ASPM Controls: If set, indicates to OSPM that it must not enable
OSPM ASPM control on this platform.

However there are shipping servers whose BIOS did not set this bit. (An
example is the HP ProLiant DL385 G6. A Maintenance BIOS will fix that).
For such servers even if a call is made via pci_no_aspm(), based on _OSC
support in the BIOS, it may be too late because the ASPM code may have
already allocated and filled its "link_list".

So if a user sets the ASPM "policy" to "powersave" via /sys then
pcie_aspm_set_policy() will run through the "link_list" and re-configure
ASPM policy on devices that advertise ASPM L0s/L1 capability:
# echo powersave > /sys/module/pcie_aspm/parameters/policy
# cat /sys/module/pcie_aspm/parameters/policy
default performance [powersave]

That can cause NMIs since the hardware doesn't play well with ASPM:
[ 1651.906015] NMI: PCI system error (SERR) for reason b1 on CPU 0.
[ 1651.906015] Dazed and confused, but trying to continue

Ideally, the BIOS should have set that FADT bit in the first place but we
could be more robust - especially given the fact that Windows doesn't
cause NMIs in the above scenario.

There should be a sanity check to not allow a user to modify ASPM policy
when aspm_disabled is set.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

bbfa306a
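
The sanity check described in the commit above (bbfa306a), refusing a policy change with -EPERM while aspm_disabled is set, might be sketched as follows. The struct, enum and function names are hypothetical; only the -EPERM return value and the aspm_disabled condition come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum sketch_policy { SKETCH_DEFAULT, SKETCH_PERFORMANCE, SKETCH_POWERSAVE };

/* Hypothetical state holder for the sketch. */
struct sketch_aspm_state {
    int aspm_disabled;         /* nonzero once ASPM has been ruled out */
    enum sketch_policy policy; /* currently selected policy */
};

/* Refuse policy changes while ASPM is disabled; otherwise apply them. */
int sketch_set_policy(struct sketch_aspm_state *st, enum sketch_policy wanted)
{
    assert(st != NULL);
    if (st->aspm_disabled)
        return -EPERM;
    st->policy = wanted;
    return 0;
}
```

With this guard in place, a write of "powersave" on a platform where ASPM was disabled fails early instead of reconfiguring links that the hardware cannot handle.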

PCI: PCIe links may not get configured for ASPM under POWERSAVE mode · 1a680b7c

Committed by Naga Chumbalkar on Mar 21, 2011

v3 -> v2: Moved ASPM enabling logic to pci_set_power_state()
v2 -> v1: Preserved the logic in pci_raw_set_power_state()
	: Added ASPM enabling logic after scanning Root Bridge
	: http://marc.info/?l=linux-pci&m=130046996216391&w=2
v1	: http://marc.info/?l=linux-pci&m=130013164703283&w=2

The assumption made in commit 41cd766b
(PCI: Don't enable aspm before drivers have had a chance to veto it) that
pci_enable_device() will result in re-configuring ASPM when aspm_policy is
POWERSAVE is no longer valid.  This is due to commit
97c145f7 (PCI: read current power state
at enable time) which resets dev->current_state to D0. Due to this the
call to pcie_aspm_pm_state_change() is never made. Note the equality check
(below) that returns early:
./drivers/pci/pci.c: pci_raw_set_power_state()
        /* Check if we're already there */
        if (dev->current_state == state)
                return 0;

Therefore OSPM never configures the PCIe links for ASPM to turn them "on".

Fix it by configuring ASPM from the pci_enable_device() code path. This
also allows a driver such as the e1000e networking driver a chance to
disable ASPM (L0s, L1), if need be, prior to enabling the device. A
driver may perform this action if the device is known to mis-behave
wrt ASPM.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

1a680b7c

PCI/ACPI: Report ASPM support to BIOS if not disabled from command line · 8b8bae90

Committed by Rafael J. Wysocki on Mar 05, 2011

We need to distinguish the situation in which ASPM support is
disabled from the command line or through .config from the situation
in which it is disabled, because the hardware or BIOS can't handle
it.  In the former case we should not report ASPM support to the BIOS
through ACPI _OSC, but in the latter case we should do that.

Introduce pcie_aspm_support_enabled() that can be used by
acpi_pci_root_add() to determine whether or not it should report ASPM
support to the BIOS through _OSC.

Cc: stable@kernel.org
References: https://bugzilla.kernel.org/show_bug.cgi?id=29722
References: https://bugzilla.kernel.org/show_bug.cgi?id=20232
Reported-and-tested-by: Ortwin Glück <odi@odi.ch>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

8b8bae90

17 Mar, 2011 3 commits

unicore32 machine related files: pci bus handling · 700598ce

Committed by GuanXuetao on Jan 15, 2011

This patch implements arch-specific pci bus driver.
Signed-off-by: Guan Xuetao <gxt@mprc.pku.edu.cn>

700598ce

PCI: label: remove #include of ACPI header to avoid warnings · 65d8defe

Committed by Shyam_Iyer@Dell.com on Mar 11, 2011

I found that including acpi/acpi_drivers.h is not necessary and
introduces these warnings:

In file included from drivers/pci/pci-label.c:32:
include/acpi/acpi_drivers.h:103: warning: ‘struct acpi_device’ declared inside parameter list
include/acpi/acpi_drivers.h:103: warning: its scope is only this definition or declaration, which is probably not what you want
include/acpi/acpi_drivers.h:107: warning: ‘struct acpi_pci_root’ declared inside parameter list
Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

65d8defe

PCI: label: Fix compilation error when CONFIG_ACPI is unset · 07eefe1c

Committed by Narendra_K@Dell.com on Mar 07, 2011

This patch fixes the compilation error described below, introduced by
commit 6058989b:

drivers/pci/pci-label.c: In function ‘pci_create_firmware_label_files’:
drivers/pci/pci-label.c:366:2: error: implicit declaration of function ‘device_has_dsm’
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

07eefe1c

15 Mar, 2011 1 commit

PM: Remove CONFIG_PM_OPS · aa338601

Committed by Rafael J. Wysocki on Feb 11, 2011

After redefining CONFIG_PM to depend on (CONFIG_PM_SLEEP ||
CONFIG_PM_RUNTIME) the CONFIG_PM_OPS option is redundant and can be
replaced with CONFIG_PM.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

aa338601

12 Mar, 2011 2 commits

intel-iommu: Fix get_domain_for_dev() error path · 2fe9723d

Committed by Alex Williamson on Mar 04, 2011

If we run out of domain_ids and fail iommu_attach_domain(), we
fall into domain_exit() without having set up enough of the
domain structure for this to do anything useful.  In fact, it
typically runs off into the weeds walking the bogus domain->devices
list.  Just free the domain.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org

2fe9723d

intel-iommu: Unlink domain from iommu · a97590e5

Committed by Alex Williamson on Mar 04, 2011

When we remove a device, we unlink the iommu from the domain, but
we never do the reverse unlinking of the domain from the iommu.
This means that we never clear iommu->domain_ids, eventually leading
to resource exhaustion if we repeatedly bind and unbind a device
to a driver.  Also free empty domains to avoid a resource leak.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org

a97590e5
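
The exhaustion described in the commit above (a97590e5) follows from a domain id bitmap whose bits are set on attach but never cleared. A minimal sketch, assuming a deliberately tiny id space and hypothetical names throughout:

```c
#include <assert.h>

#define SKETCH_NR_DOMAIN_IDS 4 /* assumption: tiny id space to show exhaustion */

/* Hypothetical stand-in: one bit per domain id in use. */
struct sketch_iommu {
    unsigned int domain_ids;
};

/* Returns a free id and marks it used, or -1 when every id is taken. */
int sketch_attach_domain(struct sketch_iommu *iommu)
{
    for (int id = 0; id < SKETCH_NR_DOMAIN_IDS; id++) {
        if (!(iommu->domain_ids & (1u << id))) {
            iommu->domain_ids |= 1u << id;
            return id;
        }
    }
    return -1;
}

/* The reverse unlink: give the id back when the domain goes away. */
void sketch_detach_domain(struct sketch_iommu *iommu, int id)
{
    assert(id >= 0 && id < SKETCH_NR_DOMAIN_IDS);
    iommu->domain_ids &= ~(1u << id);
}
```

Without the detach step, the fifth bind in this sketch fails even though no domain is still in use; with it, repeated bind and unbind cycles keep reusing the same id.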

05 Mar, 2011 7 commits

PCI: pre-allocate additional resources to devices only after successful... · c8adf9a3

Committed by Ram Pai on Feb 14, 2011

PCI: pre-allocate additional resources to devices only after successful allocation of essential resources.

Linux tries to pre-allocate minimal resources to hotplug bridges. This
works fine as long as there are enough resources to satisfy all other
genuine resource requirements. However if enough resources are not
available to satisfy any of these nice-to-have pre-allocations, the
resource-allocator reports errors and returns failure.

This patch distinguishes must-have resources from nice-to-have
resources. Any failure to allocate nice-to-have resources is ignored.

This behavior can be particularly useful to trigger automatic
reallocation when the OS discovers genuine allocation-conflicts or
genuine unallocated-requests caused by buggy allocation behavior of the
native BIOS/uEFI.

https://bugzilla.kernel.org/show_bug.cgi?id=15960 captures the
motivation behind the patch. This patch is verified to resolve the above
bug.

changelog v2:
 o fixed a bug where pci_assign_resource() was called on a resource of
   zero resource size.

changelog v3: addressed Bjorn's comment
 o "Please don't indent and right-justify the changelog".
 o removed add_size from struct resource. The additional size is now
   tracked using a linked list.

changelog v4:
 o moved freeing up of elements in head list from
   assign_requested_resources_sorted() to __assign_resources_sorted().
 o removed a wrong reference to 'add_size' in pbus_size_mem().
 o some code optimizations in adjust_resources_sorted() and
   assign_requested_resources_sorted()

changelog v5:
 o moved freeing up of elements in head list from
   assign_requested_resources_sorted() to __assign_resources_sorted().
 o removed a wrong reference to 'add_size' in pbus_size_mem().
 o some code optimizations in adjust_resources_sorted() and
   assign_requested_resources_sorted()

changelog v5:
 o factored out common code and made them into separate independent
   patches
 o added comments in kdoc format
 o added a BUG_ON in pci_assign_unassigned_resources() to catch for
   memory leak.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
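
The must-have versus nice-to-have split can be sketched as a two-pass allocator. This is a simplified userspace model, not the kernel's resource allocator: `toy_request`, its `granted` field, `toy_assign()` and the single `avail` pool are hypothetical names introduced for illustration. Only the idea of tracking an additional size separately from the essential size comes from the commit message.

```c
#include <stddef.h>

struct toy_request {
    unsigned long size;     /* must-have part of the request */
    unsigned long add_size; /* nice-to-have extra */
    unsigned long granted;  /* filled in by toy_assign() */
};

/*
 * Returns 0 when every must-have size fits into *avail, -1 otherwise.
 * Extras are handed out only after all essentials succeeded, and an
 * extra that does not fit is skipped without reporting an error.
 */
static int toy_assign(unsigned long *avail, struct toy_request *reqs, size_t n)
{
    size_t i;

    /* Pass 1: essentials. A failure here is a genuine error. */
    for (i = 0; i < n; i++) {
        if (reqs[i].size > *avail)
            return -1;
        *avail -= reqs[i].size;
        reqs[i].granted = reqs[i].size;
    }

    /* Pass 2: extras, best effort. */
    for (i = 0; i < n; i++) {
        if (reqs[i].add_size <= *avail) {
            *avail -= reqs[i].add_size;
            reqs[i].granted += reqs[i].add_size;
        }
    }
    return 0;
}
```

In this model a request whose extra does not fit still ends up with its essential size, so the caller sees success, which is the behavior the patch description asks for.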

PCI: introduce reset_resource() · fc075e1d

Committed by Ram Pai on Feb 14, 2011

Introduce reset_resource() which factors out resource reset logic.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: data structure agnostic free list function · 094732a5

Committed by Ram Pai on Feb 14, 2011

Replace free_failed_list() with a free_list() call. free_list() can
handle 'resource_list_x', 'resource_list' and any linked list linked
through ->next.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
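
One way to make a free routine agnostic of the node type is a macro that takes the struct tag as a parameter and relies only on a `next` member. The following is a userspace sketch of that idea, assuming `malloc()`/`free()` in place of the kernel allocators. `toy_free_list`, `toy_push`, `toy_list` and `toy_list_x` are hypothetical names and are not claimed to match the patch.

```c
#include <stdlib.h>

/* Two unrelated node types; all they share is a ->next member. */
struct toy_list {
    struct toy_list *next;
    int value;
};

struct toy_list_x {
    struct toy_list_x *next;
    int value;
    long extra;
};

/* Free every node hanging off (head)->next; the head itself is not freed. */
#define toy_free_list(type, head) do {          \
    struct type *node_ = (head)->next;          \
    while (node_) {                             \
        struct type *next_ = node_->next;       \
        free(node_);                            \
        node_ = next_;                          \
    }                                           \
    (head)->next = NULL;                        \
} while (0)

/* Helper for the example: push a new node right after the head. */
#define toy_push(type, head, val) do {          \
    struct type *new_ = malloc(sizeof(*new_));  \
    if (new_) {                                 \
        new_->value = (val);                    \
        new_->next = (head)->next;              \
        (head)->next = new_;                    \
    }                                           \
} while (0)
```

The same macro then serves both node types, for example `toy_free_list(toy_list, &a)` and `toy_free_list(toy_list_x, &b)`.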

PCI: refactor io size calculation code · 13583b16

Committed by Ram Pai on Feb 14, 2011

Refactor code that calculates the io size in pbus_size_io() and
pbus_size_mem() into separate functions.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICH · 87e3dc38

Committed by Jiri Slaby on Feb 28, 2011

Some broken BIOSes on ICH4 chipset report an ACPI region which is in
conflict with legacy IDE ports when ACPI is disabled. Even though the
regions overlap, IDE ports are working correctly (we cannot find out
the decoding rules on chipsets).

So the only problem is the reported region itself; if we don't reserve
the region in the quirk, everything works as expected.

This patch avoids reserving any quirk regions below PCIBIOS_MIN_IO,
which is 0x1000. Some regions might be below this border (and, according
to a quick Google search, some are), but the only difference is that
they won't be reserved anymore. They should still work the same as before.

The conflicts look like (1f.0 is bridge, 1f.1 is IDE ctrl):
pci 0000:00:1f.1: address space collision: [io 0x0170-0x0177] conflicts with 0000:00:1f.0 [io 0x0100-0x017f]

At 0x0100, the quirk reports a 128-byte ACPI region for ICH4.
ata_piix then fails to find disks because the IDE legacy ports
are zeroed:
ata_piix 0000:00:1f.1: device not available (can't reserve [io 0x0000-0x0007])

References: https://bugzilla.novell.com/show_bug.cgi?id=558740
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
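
The collision in the log line and the threshold check can be reproduced with a few lines of arithmetic. This is a sketch only: `toy_region`, `toy_regions_conflict()`, `toy_should_reserve_quirk_region()` and `TOY_PCIBIOS_MIN_IO` are hypothetical names, the 0x1000 value is the one stated in the commit message, and the real quirk code is not reproduced here.

```c
#include <stdbool.h>

#define TOY_PCIBIOS_MIN_IO 0x1000UL /* value quoted in the commit message */

struct toy_region {
    unsigned long start;
    unsigned long end; /* inclusive, as in the kernel log output */
};

/* Two inclusive ranges overlap when each one starts before the other ends. */
static bool toy_regions_conflict(struct toy_region a, struct toy_region b)
{
    return a.start <= b.end && b.start <= a.end;
}

/* Assumed shape of the check: skip quirk regions that start below the limit. */
static bool toy_should_reserve_quirk_region(unsigned long start)
{
    return start >= TOY_PCIBIOS_MIN_IO;
}
```

With the values from the log, the IDE range 0x0170-0x0177 lies inside the 128-byte ACPI range 0x0100-0x017f, so the two conflict; because 0x0100 is below 0x1000, the sketch would no longer reserve the ACPI region and the conflict would not arise.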

PCI hotplug: acpiphp: set current_state to D0 in register_slot · 47e9037a

Committed by Stefano Stabellini on Feb 28, 2011

If a device doesn't support power management (pm_cap == 0) but
acpi_pci_power_manageable() returns true for it because a _PS0 method is
declared for it, and _EJ0 is also declared for the slot, then nobody is
going to set current_state = PCI_D0 for this device.  This is what I
think is happening:

pci_enable_device
    |
__pci_enable_device_flags
/* here we do not set current_state because !pm_cap */
    |
do_pci_enable_device
    |
pci_set_power_state
    |
__pci_start_power_transition
    |
pci_platform_power_transition
/* platform_pci_power_manageable() calls acpi_pci_power_manageable that
 * returns true */
    |
platform_pci_set_power_state
/* acpi_pci_set_power_state gets called and does nothing because the
 * acpi device has _EJ0, see the comment "If the ACPI device has _EJ0,
 * ignore the device" */

At this point, according to the commit message that introduced the
comment above (10b3dcae), it is up to the hotplug driver to set the
state to D0.
However, AFAICT the PCI hotplug driver never does; in fact,
drivers/pci/hotplug/acpiphp_glue.c:register_slot sets the slot flags to
(SLOT_ENABLED | SLOT_POWEREDON) but it does not set the PCI device
current state to PCI_D0.

So my proposed fix is to also set current_state = PCI_D0 in
register_slot.
Comments are very welcome.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs · 6058989b

Committed by Narendra_K@Dell.com on Mar 02, 2011

This patch exports ACPI _DSM (Device Specific Method) provided firmware
instance number and string name of PCI devices as defined by 'PCI
Firmware Specification Revision 3.1', section 4.6.7 (DSM for Naming a
PCI or PCI Express Device Under Operating Systems), to sysfs.

New files created are:
  /sys/bus/pci/devices/.../label which contains the firmware name for
the device in question, and
  /sys/bus/pci/devices/.../acpi_index which contains the firmware device type
instance for the given device.

cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/acpi_index
1
cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/label
Embedded Broadcom 5709C NIC 1

cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/acpi_index
2
cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/label
Embedded Broadcom 5709C NIC 2

The ACPI _DSM provided firmware 'instance number' and 'string name' will
be given priority if the firmware also provides 'SMBIOS type 41 device
type instance and string'.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
