1. 14 Jan 2014 (8 commits)
  2. 11 Jan 2014 (4 commits)
  3. 08 Jan 2014 (1 commit)
    • PCI: Enforce bus address limits in resource allocation · f75b99d5
      Authored by Yinghai Lu
      When allocating space for 32-bit BARs, we previously limited RESOURCE
      addresses so they would fit in 32 bits.  However, the BUS address need not
      be the same as the resource address, and it's the bus address that must fit
      in the 32-bit BAR.
      
      This patch adds:
      
        - pci_clip_resource_to_region(), which clips a resource so it contains
          only the range that maps to the specified bus address region, e.g., to
          clip a resource to 32-bit bus addresses, and
      
        - pci_bus_alloc_from_region(), which allocates space for a resource from
          the specified bus address region,
      
      and changes pci_bus_alloc_resource() to allocate space for 64-bit BARs from
      the entire bus address region, and space for 32-bit BARs from only the bus
      address region below 4GB.
      
      If we had this window:
      
        pci_root HWP0002:0a: host bridge window [mem 0xf0180000000-0xf01fedfffff] (bus address [0x80000000-0xfedfffff])
      
      we previously could not put a 32-bit BAR there, because the CPU addresses
      don't fit in 32 bits.  This patch fixes this, so we can use this space for
      32-bit BARs.
      
      It's also possible (though unlikely) to have resources with 32-bit CPU
      addresses but bus addresses above 4GB.  In this case the previous code
      would allocate space that a 32-bit BAR could not map.
      
      Remove PCIBIOS_MAX_MEM_32, which is no longer used.
      
      [bhelgaas: reworked starting from http://lkml.kernel.org/r/1386658484-15774-3-git-send-email-yinghai@kernel.org]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
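      The clipping idea can be sketched in plain C. This is an illustrative
      model only, not the in-tree helper: the struct names are simplified,
      and bus_offset stands for the host bridge's CPU-to-bus address
      translation (bus = cpu - bus_offset).

          #include <stdint.h>

          struct bus_region   { uint64_t start, end; };  /* bus addresses */
          struct cpu_resource { uint64_t start, end; };  /* CPU addresses */

          /* Shrink @res so that every CPU address in it maps to a bus
           * address inside @region; a model of pci_clip_resource_to_region(). */
          static void clip_resource_to_region(struct cpu_resource *res,
                                              const struct bus_region *region,
                                              uint64_t bus_offset)
          {
                  uint64_t lo = region->start + bus_offset;
                  uint64_t hi = region->end + bus_offset;

                  if (res->start < lo)
                          res->start = lo;
                  if (res->end > hi)
                          res->end = hi;
                  /* res->start > res->end afterwards means "nothing usable". */
          }

      For a 32-bit BAR the allocator clips against the bus region
      [0x00000000-0xffffffff]. In the HWP0002 window above, bus_offset is
      0xf0100000000, so the whole CPU range survives the clip and can now
      hold 32-bit BARs.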
  4. 04 Jan 2014 (4 commits)
  5. 22 Dec 2013 (2 commits)
    • PCI: Add pci_bus_address() to get bus address of a BAR · 06cf56e4
      Authored by Bjorn Helgaas
      We store BAR information as a struct resource, which contains the CPU
      address, not the bus address.  Drivers often need the bus address, and
      there's currently no convenient way to get it, so they often read the
      BAR directly, or use the resource address (which doesn't work if there's
      any translation between CPU and bus addresses).
      
      Add pci_bus_address() to make this convenient.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
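      Per the commit's design, the helper amounts to roughly this (sketch;
      it builds on the pci_bus-based pcibios_resource_to_bus() from the
      commit below):

          static inline dma_addr_t pci_bus_address(struct pci_dev *pdev, int bar)
          {
                  struct pci_bus_region region;

                  /* Translate the CPU-side resource into a bus address. */
                  pcibios_resource_to_bus(pdev->bus, &region, &pdev->resource[bar]);
                  return region.start;
          }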
    • PCI: Convert pcibios_resource_to_bus() to take a pci_bus, not a pci_dev · fc279850
      Authored by Yinghai Lu
      These interfaces:
      
        pcibios_resource_to_bus(struct pci_dev *dev, *bus_region, *resource)
        pcibios_bus_to_resource(struct pci_dev *dev, *resource, *bus_region)
      
      took a pci_dev, but they really depend only on the pci_bus.  And we want to
      use them in resource allocation paths where we have the bus but not a
      device, so this patch converts them to take the pci_bus instead of the
      pci_dev:
      
        pcibios_resource_to_bus(struct pci_bus *bus, *bus_region, *resource)
        pcibios_bus_to_resource(struct pci_bus *bus, *resource, *bus_region)
      
      In fact, with standard PCI-PCI bridges, they only depend on the host
      bridge, because that's the only place address translation occurs, but
      we aren't going that far yet.
      
      [bhelgaas: changelog]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
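      The conversion for callers is mechanical; a device-based call site
      simply passes its bus (illustrative before/after):

          /* before: needed a pci_dev */
          pcibios_resource_to_bus(dev, &region, res);

          /* after: only the pci_bus is needed */
          pcibios_resource_to_bus(dev->bus, &region, res);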
  6. 21 Dec 2013 (3 commits)
  7. 20 Dec 2013 (1 commit)
  8. 18 Dec 2013 (4 commits)
  9. 16 Dec 2013 (1 commit)
  10. 14 Dec 2013 (1 commit)
  11. 13 Dec 2013 (1 commit)
  12. 08 Dec 2013 (1 commit)
  13. 26 Nov 2013 (1 commit)
  14. 22 Nov 2013 (5 commits)
    • mm: place page->pmd_huge_pte to right union · 7aa555bf
      Authored by Kirill A. Shutemov
      I don't know what went wrong, a mis-merge or something, but
      ->pmd_huge_pte was placed in the wrong union within struct page.
      
      In the original patch[1] it was placed in the union with ->lru and
      ->slab, but in commit e009bb30 ("mm: implement split page table lock
      for PMD level") it ended up in the union with ->index and ->freelist.
      
      That union also seems unused for pages containing page tables, and safe
      to re-use, but that's not what I've tested.
      
      Let's move it to its original place. It fixes the indentation, at
      least. :)
      
      [1] https://lkml.org/lkml/2013/10/7/288
      
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
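      Condensed sketch of the 3.13-era struct page layout in question (most
      fields elided; see include/linux/mm_types.h for the real definition):

          struct page {
                  /* ... */
                  union {                 /* where it wrongly ended up */
                          pgoff_t index;
                          void *freelist;
                  };
                  /* ... */
                  union {                 /* where it belongs */
                          struct list_head lru;
                          struct slab *slab_page;
          #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && USE_SPLIT_PMD_PTLOCKS
                          pgtable_t pmd_huge_pte; /* protected by page->ptl */
          #endif
                  };
                  /* ... */
          };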
    • mm: hugetlbfs: fix hugetlbfs optimization · 27c73ae7
      Authored by Andrea Arcangeli
      Commit 7cb2ef56 ("mm: fix aio performance regression for database
      caused by THP") can cause a dereference of a dangling pointer if
      split_huge_page runs during PageHuge() while there are updates to the
      tail_page->private field.
      
      It also calls compound_head twice for hugetlbfs, and runs
      compound_head plus compound_trans_head for THP, when a single call
      is needed in both cases.
      
      The new code within the PageSlab() check doesn't need to verify that the
      THP page size is never bigger than the smallest hugetlbfs page size, to
      avoid memory corruption.
      
      A longstanding theoretical race condition was found while fixing the
      above (see the change right after the skip_unlock label, that is
      relevant for the compound_lock path too).
      
      By re-establishing the _mapcount tail refcounting for all compound
      pages, this also fixes the below problem:
      
        echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
      
        BUG: Bad page state in process bash  pfn:59a01
        page:ffffea000139b038 count:0 mapcount:10 mapping:          (null) index:0x0
        page flags: 0x1c00000000008000(tail)
        Modules linked in:
        CPU: 6 PID: 2018 Comm: bash Not tainted 3.12.0+ #25
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        Call Trace:
          dump_stack+0x55/0x76
          bad_page+0xd5/0x130
          free_pages_prepare+0x213/0x280
          __free_pages+0x36/0x80
          update_and_free_page+0xc1/0xd0
          free_pool_huge_page+0xc2/0xe0
          set_max_huge_pages.part.58+0x14c/0x220
          nr_hugepages_store_common.isra.60+0xd0/0xf0
          nr_hugepages_store+0x13/0x20
          kobj_attr_store+0xf/0x20
          sysfs_write_file+0x189/0x1e0
          vfs_write+0xc5/0x1f0
          SyS_write+0x55/0xb0
          system_call_fastpath+0x16/0x1b
      Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Tested-by: Khalid Aziz <khalid.aziz@oracle.com>
      Cc: Pravin Shelar <pshelar@nicira.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Johannes Weiner <jweiner@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: thp: give transparent hugepage code a separate copy_page · 30b0a105
      Authored by Dave Hansen
      Right now, the migration code in migrate_page_copy() uses copy_huge_page()
      for hugetlbfs and thp pages:
      
             if (PageHuge(page) || PageTransHuge(page))
                      copy_huge_page(newpage, page);
      
      So, yay for code reuse.  But:
      
        void copy_huge_page(struct page *dst, struct page *src)
        {
              struct hstate *h = page_hstate(src);
      
      and a non-hugetlbfs page has no page_hstate().  This works 99% of the
      time because page_hstate() determines the hstate from the page order
      alone.  Since the page order of a THP page matches the default hugetlbfs
      page order, it works.
      
      But, if you change the default huge page size on the boot command-line
      (say default_hugepagesz=1G), then we might not even *have* a 2MB hstate
      so page_hstate() returns null and copy_huge_page() oopses pretty fast
      since copy_huge_page() dereferences the hstate:
      
        void copy_huge_page(struct page *dst, struct page *src)
        {
              struct hstate *h = page_hstate(src);
              if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
        ...
      
      Mel noticed that the migration code is really the only user of these
      functions.  This moves all the copy code over to migrate.c and makes
      copy_huge_page() work for THP by checking for it explicitly.
      
      I believe the bug was introduced in commit b32967ff ("mm: numa: Add
      THP migration for the NUMA working set scanning fault case")
      
      [akpm@linux-foundation.org: fix coding-style and comment text, per Naoya Horiguchi]
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Tested-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
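      The reworked copy in mm/migrate.c takes roughly this shape (condensed
      sketch; the gigantic-page fallback is elided):

          static void copy_huge_page(struct page *dst, struct page *src)
          {
                  int i, nr_pages;

                  if (PageHuge(src)) {
                          /* hugetlbfs page: a valid hstate exists here */
                          struct hstate *h = page_hstate(src);

                          nr_pages = pages_per_huge_page(h);
                  } else {
                          /* THP page: no hstate, use the compound order */
                          BUG_ON(!PageTransHuge(src));
                          nr_pages = hpage_nr_pages(src);
                  }

                  for (i = 0; i < nr_pages; i++) {
                          cond_resched();
                          copy_highpage(dst + i, src + i);
                  }
          }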
    • genetlink: fix genl_set_err() group ID · 91398a09
      Authored by Johannes Berg
      Fix another really stupid bug - I introduced genl_set_err()
      precisely to be able to adjust the group and reject invalid
      ones, but then forgot to do so.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
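      The fixed helper validates and translates the group exactly once,
      along these lines (sketch of the inline in include/net/genetlink.h):

          static inline int genl_set_err(struct genl_family *family,
                                         struct net *net, u32 portid,
                                         u32 group, int code)
          {
                  /* reject groups the family never registered */
                  if (WARN_ON_ONCE(group >= family->n_mcgrps))
                          return -EINVAL;
                  group = family->mcgrp_offset + group;
                  return netlink_set_err(net->genl_sock, portid, group, code);
          }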
    • genetlink: fix genlmsg_multicast() bug · 220815a9
      Authored by Johannes Berg
      Unfortunately, I introduced a tremendously stupid bug into
      genlmsg_multicast() when doing all those multicast group
      changes: it adjusts the group number, but then passes it
      to genlmsg_multicast_netns() which does that again.
      
      Somehow, my tests failed to catch this, so add a warning
      into genlmsg_multicast_netns() and remove the offending
      group ID adjustment.
      
      Also add a warning to the similar code in other functions
      so people who misuse them are more loudly warned.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
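      After the fix, the group translation lives only in
      genlmsg_multicast_netns(), and the wrapper passes the raw group
      through (sketch of the resulting shape):

          static inline int genlmsg_multicast_netns(struct genl_family *family,
                                                    struct net *net,
                                                    struct sk_buff *skb, u32 portid,
                                                    unsigned int group, gfp_t flags)
          {
                  if (WARN_ON_ONCE(group >= family->n_mcgrps))
                          return -EINVAL;
                  group = family->mcgrp_offset + group;   /* adjust exactly once */
                  return nlmsg_multicast(net->genl_sock, skb, portid, group, flags);
          }

          static inline int genlmsg_multicast(struct genl_family *family,
                                              struct sk_buff *skb, u32 portid,
                                              unsigned int group, gfp_t flags)
          {
                  /* no adjustment here anymore; _netns() does it */
                  return genlmsg_multicast_netns(family, &init_net, skb,
                                                 portid, group, flags);
          }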
  15. 21 Nov 2013 (3 commits)
    • net/phy: Add the autocross feature for forced links on VSC82x4 · 3fb69bca
      Authored by Madalin Bucur
      Add auto-MDI/MDI-X capability for forced (autonegotiation disabled)
      10/100 Mbps speeds on Vitesse VSC82x4 PHYs. The previously static
      function genphy_setup_forced() is now exported, as it is required by
      the new config_aneg handler in the Vitesse PHY module.
      Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com>
      Signed-off-by: Shruti Kanetkar <Shruti@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
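      A hedged sketch of the handler's shape (the Vitesse register
      programming for automatic crossover is device-specific and elided;
      the handler name is assumed from the description above):

          static int vsc82x4_config_aneg(struct phy_device *phydev)
          {
                  /* For forced 10/100 links, enable auto-MDI/MDI-X first
                   * (device-specific registers, elided), then force
                   * speed/duplex via the newly exported generic helper. */
                  if (phydev->autoneg != AUTONEG_ENABLE &&
                      (phydev->speed == SPEED_10 || phydev->speed == SPEED_100)) {
                          /* ... program automatic crossover ... */
                          return genphy_setup_forced(phydev);
                  }

                  return genphy_config_aneg(phydev);
          }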
    • net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426
      Authored by Hannes Frederic Sowa
      This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
      set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
      to return msg_name to the user.
      
      This prevents numerous uninitialized memory leaks we had in the
      recvmsg handlers and makes it harder for new code to accidentally leak
      uninitialized memory.
      
      Optimize for the case where recvfrom is called with NULL as the
      address. We don't need to copy the address at all, so set it to NULL
      before invoking the recvmsg handler. We can do so because all the
      recvmsg handlers must cope with the case where a plain read() is
      called on them; read() also sets msg_name to NULL.
      
      Also document these changes in include/linux/net.h as suggested by David
      Miller.
      
      Changes since RFC:
      
      Set msg->msg_name = NULL if the user specified a NULL msg_name but a
      non-zero msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      affect sendto as it would bail out earlier while trying to copy in the
      address. It also more naturally reflects the logic of the callers of
      verify_iovec.
      
      With this change in place I could remove:
      
      	if (!uaddr || msg_sys->msg_namelen == 0)
      		msg->msg_name = NULL;
      
      This change does not alter the user visible error logic as we ignore
      msg_namelen as long as msg_name is NULL.
      
      Also remove two unnecessary curly brackets in ___sys_recvmsg and change
      comments to netdev style.
      
      Cc: David Miller <davem@davemloft.net>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
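      The resulting contract for a recvmsg handler can be sketched like this
      (illustrative handler, not from the patch; the sockaddr type is an
      arbitrary example):

          static int example_recvmsg(struct kiocb *iocb, struct socket *sock,
                                     struct msghdr *msg, size_t len, int flags)
          {
                  /* msg->msg_namelen arrives as 0, and msg->msg_name may be
                   * NULL (plain read(), or recvfrom() with a NULL address). */
                  if (msg->msg_name) {
                          struct sockaddr_in *sin = msg->msg_name;

                          sin->sin_family = AF_INET;
                          /* ... fill in address and port ... */
                          msg->msg_namelen = sizeof(*sin); /* set only when filled */
                  }

                  /* ... copy payload, return number of bytes received ... */
                  return 0;
          }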
    • btrfs: Use trace condition for get_extent tracepoint · 4cd8587c
      Authored by Steven Rostedt
      Using an if statement to test some condition to decide whether to
      trigger a tracepoint is pointless when tracing is disabled; it just
      adds overhead and wastes a branch prediction slot. This is why
      TRACE_EVENT_CONDITION() was created: it places the check inside the
      jump label so that the branch does not happen unless tracing is
      enabled.
      
      That is, instead of doing:
      
      	if (em)
      		trace_btrfs_get_extent(root, em);
      
      Which is basically this:
      
      	if (em)
      		if (static_key(trace_btrfs_get_extent)) {
      
      Using a TRACE_EVENT_CONDITION() we can just do:
      
      	trace_btrfs_get_extent(root, em);
      
      And the condition trace event will do:
      
      	if (static_key(trace_btrfs_get_extent)) {
      		if (em) {
      			...
      
      The static key is an unconditional jump (or nop) that is faster than
      having to check whether em is NULL or not.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
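      On the definition side, the tracepoint switches from TRACE_EVENT() to
      TRACE_EVENT_CONDITION() with a TP_CONDITION() clause; a condensed
      sketch (field list trimmed to a few representative entries):

          TRACE_EVENT_CONDITION(btrfs_get_extent,

                  TP_PROTO(struct btrfs_root *root, struct extent_map *map),

                  TP_ARGS(root, map),

                  TP_CONDITION(map),      /* replaces the open-coded if (em) */

                  TP_STRUCT__entry(
                          __field(u64, root_objectid)
                          __field(u64, start)
                          __field(u64, len)
                  ),

                  TP_fast_assign(
                          __entry->root_objectid = root->root_key.objectid;
                          __entry->start = map->start;
                          __entry->len = map->len;
                  ),

                  TP_printk("root=%llu start=%llu len=%llu",
                            __entry->root_objectid, __entry->start, __entry->len)
          );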