Commits · 7beceebf5b9d14e333ab6025a6feccdc8e765225 · openanolis / cloud-kernel

04 Jan, 2015 6 commits

rhashtable: Supports for nulls marker · f89bd6f8

Committed by Thomas Graf on Jan 02, 2015

In order to allow for wider usage of rhashtable, use a special nulls
marker to terminate each chain. The reason for not using the existing
nulls_list is that the prev pointer usage would not be valid as entries
can be linked in two different buckets at the same time.

The 4 nulls base bits can be set through the rhashtable_params structure
like this:

struct rhashtable_params params = {
        [...]
        .nulls_base = (1U << RHT_BASE_SHIFT),
};

This reduces the hash length from 32 bits to 27 bits.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

f89bd6f8
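
As a rough illustration of where the hash bits go in f89bd6f8, here is a minimal user-space sketch of a nulls marker word. It assumes a 32-bit word in which bit 0 flags "this is a nulls marker, not a pointer", followed by 27 hash bits and 4 base bits. The `DEMO_*` macros and `demo_*` helpers are hypothetical names for illustration and are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed layout for illustration only (not the kernel's macros):
 *   bit 0       : set when the word is a nulls marker rather than a pointer
 *   bits 1..27  : 27-bit bucket hash
 *   bits 28..31 : 4 nulls base bits chosen by the table owner
 */
#define DEMO_HASH_BITS 27
#define DEMO_BASE_BITS 4
#define DEMO_HASH_MASK ((UINT32_C(1) << DEMO_HASH_BITS) - 1)

/* Build a marker from a base (0..15) and a hash (truncated to 27 bits). */
static uint32_t demo_nulls_marker(uint32_t base, uint32_t hash)
{
	uint32_t value = (base << DEMO_HASH_BITS) | (hash & DEMO_HASH_MASK);

	return (value << 1) | 1u;
}

/* A chain walker stops when it sees a word with bit 0 set. */
static bool demo_is_nulls(uint32_t word)
{
	return (word & 1u) != 0;
}

static uint32_t demo_nulls_hash(uint32_t word)
{
	return (word >> 1) & DEMO_HASH_MASK;
}

static uint32_t demo_nulls_base(uint32_t word)
{
	return word >> (1 + DEMO_HASH_BITS);
}
```

Under this assumed layout, 1 flag bit plus 4 base bits leave 27 of the 32 bits for the hash, which matches the reduction the commit message describes.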

rhashtable: Per bucket locks & deferred expansion/shrinking · 97defe1e

Committed by Thomas Graf on Jan 02, 2015

Introduces an array of spinlocks to protect bucket mutations. The number
of spinlocks per CPU is configurable, and the lock for a bucket is
selected based on the hash of the bucket. This allows for parallel
insertions and removals of entries which do not share a lock.

The patch also defers expansion and shrinking to a worker queue which
allows insertion and removal from atomic context. Insertions and
deletions may occur in parallel to it and are only held up briefly
while the particular bucket is linked or unzipped.

Mutations of the bucket table pointer are protected by a new mutex; read
access is RCU protected.

In the event of an expansion or shrinking, the new bucket table allocated
is exposed as a so-called future table as soon as the resize process
starts.  Lookups, deletions, and insertions will briefly use both tables.
The future table becomes the main table after an RCU grace period and
after the initial linking of the old to the new table has been performed.
Optimization of the chains to make use of the new number of buckets
follows only once the new table is in use.

The side effect of this is that during that RCU grace period, a bucket
traversal using any rht_for_each() variant on the main table will not see
any insertions performed during the RCU grace period which would at that
point land in the future table. The lookup will see them as it searches
both tables if needed.

Having multiple insertions and removals occur in parallel requires nelems
to become an atomic counter.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

97defe1e
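
The lock selection described in 97defe1e can be sketched roughly as follows. This is a simplified user-space illustration that assumes the lock array size is a power of two; `struct demo_lock`, `struct demo_table` and `demo_bucket_lock()` are hypothetical names, and the lock type is only a stand-in for a real spinlock.

```c
#include <stdint.h>

/* Stand-in for a spinlock; a real table would use an actual lock type. */
struct demo_lock {
	int held;
};

struct demo_table {
	struct demo_lock *locks;  /* array of locks shared by all buckets */
	uint32_t locks_mask;      /* number of locks minus one (power of two) */
};

/*
 * Map a bucket hash to one lock of the array. Buckets whose hashes differ
 * in the masked bits get different locks and can be mutated in parallel.
 */
static struct demo_lock *demo_bucket_lock(const struct demo_table *tbl,
					  uint32_t hash)
{
	return &tbl->locks[hash & tbl->locks_mask];
}
```

With four locks, hashes 5 and 9 share a lock while hash 6 uses a different one, so only the first pair would serialize.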

spinlock: Add spin_lock_bh_nested() · 113948d8

Committed by Thomas Graf on Jan 02, 2015

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

113948d8

nft_hash: Remove rhashtable_remove_pprev() · 897362e4

Committed by Thomas Graf on Jan 02, 2015

The removal function of nft_hash currently stores a reference to the
previous element during lookup which is used to optimize removal later
on. This was possible because a lock is held throughout calling
rhashtable_lookup() and rhashtable_remove().

With the introduction of deferred table resizing in parallel to lookups
and insertions, the nftables lock will no longer synchronize all
table mutations and the stored pprev may become invalid.

Removing this optimization makes removal slightly more expensive on
average but allows taking the resize cost out of the insert and
remove path.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: netfilter-devel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

897362e4

rhashtable: Convert bucket iterators to take table and index · 88d6ed15

Committed by Thomas Graf on Jan 02, 2015

This patch is in preparation to introduce per bucket spinlocks. It
extends all iterator macros to take the bucket table and bucket
index. It also introduces a new rht_dereference_bucket() to
handle protected accesses to buckets.

It introduces a barrier() to the RCU iterators to prevent
the compiler from caching the first element.

The lockdep verifier is introduced as a stub which always succeeds
and is properly implemented in the next patch when the locks are
introduced.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

88d6ed15

rhashtable: Do hashing inside of rhashtable_lookup_compare() · 8d24c0b4

Committed by Thomas Graf on Jan 02, 2015

Hash the key inside of rhashtable_lookup_compare() like
rhashtable_lookup() does. This allows simplifying the hashing
functions and keeping them private.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: netfilter-devel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

8d24c0b4

03 Jan, 2015 2 commits

timecounter: provide a macro to initialize the cyclecounter mask field. · 1891172a

Committed by Richard Cochran on Jan 02, 2015

There is no need for users of the timecounter/cyclecounter code to include
clocksource.h just for a single macro.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1891172a
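
For readers unfamiliar with the mask field mentioned in 1891172a, a mask-building macro of this kind might look like the sketch below. `DEMO_COUNTER_MASK` is a hypothetical name used for illustration; the sketch assumes the mask simply has the low `bits` bits set, with a special case for a full 64-bit counter to avoid an undefined shift.

```c
#include <stdint.h>

/*
 * Hypothetical mask macro: yields a 64-bit value with the low 'bits' bits
 * set. A shift by 64 would be undefined, so a 64-bit counter is handled
 * separately. Being a macro, it can be used in static initializers.
 */
#define DEMO_COUNTER_MASK(bits) \
	((uint64_t)((bits) < 64 ? ((UINT64_C(1) << (bits)) - 1) : UINT64_MAX))

/* Example: a driver-style description of a 48-bit free-running counter. */
struct demo_counter_desc {
	uint64_t mask;
};

static const struct demo_counter_desc demo_48bit_counter = {
	.mask = DEMO_COUNTER_MASK(48),
};
```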

net: Add Transparent Ethernet Bridging GRO support. · 9b174d88

Committed by Jesse Gross on Dec 30, 2014

Currently the only tunnel protocol that supports GRO with encapsulated
Ethernet is VXLAN. This pulls out the Ethernet code into a proper layer
so that it can be used by other tunnel protocols such as GRE and Geneve.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9b174d88

01 Jan, 2015 1 commit

net: fec: add Wake-on-LAN support · de40ed31

Committed by Nimrod Andy on Dec 24, 2014

Support for Wake-on-LAN using Magic Packet. The ENET IP supports sleep mode
in low power status; when the system enters suspend, a Magic Packet can
wake up the system even if all SoC clocks are gated. The patch does the
following:
- flags the device as a wakeup source for the system, as well as
  its Wake-on-LAN interrupt
- prepares the hardware for entering WoL mode
- adds the standard ethtool WOL interface
- enables the ENET interrupt to wake us

Tested on i.MX6q/dl sabresd, sabreauto boards, i.MX6SX arm2 boards.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de40ed31

31 Dec, 2014 4 commits

arm: sa1100: move irda header to linux/platform_data · dd450777

Committed by Dmitry Eremin-Solenikov on Dec 24, 2014

In the end asm/mach/irda.h header is not used by anybody except sa1100.
Move the header to the platform data includes dir and rename it to
irda-sa11x0.h.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd450777

timecounter: keep track of accumulated fractional nanoseconds · 2eebdde6

Committed by Richard Cochran on Dec 21, 2014

The current timecounter implementation will drop a variable amount
of resolution, depending on the magnitude of the time delta. In
other words, reading the clock too often or too close to a time
stamp conversion will introduce errors into the time values. This
patch fixes the issue by introducing a fractional nanosecond field
that accumulates the low order bits.
Reported-by: Janusz Użycki <j.uzycki@elproma.com.pl>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2eebdde6
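
The resolution loss that 2eebdde6 addresses can be sketched in user space as below. The sketch assumes a cycles-to-nanoseconds conversion of the form `(cycles * mult) >> shift`; the struct and function names are hypothetical and only illustrate the idea of carrying the shifted-out low-order bits into the next conversion.

```c
#include <stdint.h>

/* Hypothetical counter description: ns = (cycles * mult) >> shift. */
struct demo_cyclecounter {
	uint32_t mult;
	uint32_t shift;
};

/* Lossy variant: the low 'shift' bits are discarded on every call. */
static uint64_t demo_cyc2ns_lossy(const struct demo_cyclecounter *cc,
				  uint64_t cycles)
{
	return (cycles * cc->mult) >> cc->shift;
}

/*
 * Accumulating variant: the bits that would be shifted out are kept in
 * *frac and added back in on the next conversion.
 */
static uint64_t demo_cyc2ns_frac(const struct demo_cyclecounter *cc,
				 uint64_t cycles, uint64_t *frac)
{
	uint64_t mask = (UINT64_C(1) << cc->shift) - 1;
	uint64_t ns = cycles * cc->mult + *frac;

	*frac = ns & mask;
	return ns >> cc->shift;
}
```

With `mult = 3` and `shift = 2` (0.75 ns per cycle), converting one cycle at a time four times gives 0 ns in total with the lossy variant and 3 ns with the accumulating one.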

timecounter: provide a helper function to shift the time. · 796c1efd

Committed by Richard Cochran on Dec 21, 2014

Some PTP Hardware Clock drivers use a struct timecounter to represent
their clock. To adjust the time by a given offset, these drivers all
perform a two step read/write of their timecounter. However, it is
better and simpler just to adjust the offset in one step. This patch
introduces a little routine to help drivers implement the adjtime
method.
Suggested-by: Janusz Użycki <j.uzycki@elproma.com.pl>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

796c1efd
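
A one-step adjustment of the kind 796c1efd describes might look like the following sketch. It assumes the time counter keeps its current time as a nanosecond field; `struct demo_timecounter` and `demo_timecounter_adjtime()` are hypothetical names, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical, reduced time counter: only the accumulated time is shown. */
struct demo_timecounter {
	uint64_t nsec;
};

/*
 * Shift the time by a signed offset in a single step, instead of reading
 * the counter, adding the offset and re-initializing the counter.
 * Unsigned arithmetic wraps modulo 2^64, so a negative delta subtracts.
 */
static void demo_timecounter_adjtime(struct demo_timecounter *tc,
				     int64_t delta)
{
	tc->nsec += (uint64_t)delta;
}
```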

time: move the timecounter/cyclecounter code into its own file. · 74d23cc7

Committed by Richard Cochran on Dec 21, 2014

The timecounter code has almost nothing to do with the clocksource
code. Let it live in its own file. This will help isolate the
timecounter users from the clocksource users in the source tree.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

74d23cc7

30 Dec, 2014 1 commit

mm: get rid of radix tree gfp mask for pagecache_get_page · 45f87de5

Committed by Michal Hocko on Dec 29, 2014

Commit 2457aec6 ("mm: non-atomically mark page accessed during page
cache allocation where possible") has added a separate parameter for
specifying gfp mask for radix tree allocations.

Not only is this less than optimal from the API point of view because it
is error prone, it is also buggy currently because
grab_cache_page_write_begin is using GFP_KERNEL for radix tree and if
fgp_flags doesn't contain FGP_NOFS (mostly controlled by fs by
AOP_FLAG_NOFS flag) but the mapping_gfp_mask has __GFP_FS cleared then
the radix tree allocation wouldn't obey the restriction and might
recurse into filesystem and cause deadlocks.  This is the case for most
filesystems unfortunately because only ext4 and gfs2 are using
AOP_FLAG_NOFS.

Let's simply remove the radix_gfp_mask parameter because the allocation
context is the same for both page cache and for the radix tree.  Just make
sure that the radix tree gets only the sane subset of the mask (e.g.  do
not pass __GFP_WRITE).

Long term it is preferable to convert remaining users of
AOP_FLAG_NOFS to use mapping_gfp_mask instead and simplify this
interface even further.
Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

45f87de5

27 Dec, 2014 2 commits

netlink/genetlink: pass network namespace to bind/unbind · 023e2cfa

Committed by Johannes Berg on Dec 23, 2014

Netlink families can exist in multiple namespaces, and for the most
part multicast subscriptions are per network namespace. Thus it only
makes sense to have bind/unbind notifications per network namespace.

To achieve this, pass the network namespace of a given client socket
to the bind/unbind functions.

Also do this in generic netlink, and there also make sure that any
bind for multicast groups that only exist in init_net is rejected.
This isn't really a problem if it is accepted since a client in a
different namespace will never receive any notifications from such
a group, but it can confuse the family if not rejected (it's also
possible to silently (without telling the family) accept it, but it
would also have to be ignored on unbind so families that take any
kind of action on bind/unbind won't do unnecessary work for invalid
clients like that).
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

023e2cfa

net: Generalize ndo_gso_check to ndo_features_check · 5f35227e

Committed by Jesse Gross on Dec 23, 2014

GSO isn't the only offload feature with restrictions that
potentially can't be expressed with the current features mechanism.
Checksum is another although it's a general issue that could in
theory apply to anything. Even if it may be possible to
implement these restrictions in other ways, it can result in
duplicate code or inefficient per-packet behavior.

This generalizes ndo_gso_check so that drivers can remove any
features that don't make sense for a given packet, similar to
netif_skb_features(). It also converts existing driver
restrictions to the new format, completing the work that was
done to support tunnel protocols since the issues apply to
checksums as well.

By actually removing features from the set that are used to do
offloading, it solves another problem with the existing
interface. In these cases, GSO would run with the original set
of features and not do anything because it appears that
segmentation is not required.

CC: Tom Herbert <therbert@google.com>
CC: Joe Stringer <joestringer@nicira.com>
CC: Eric Dumazet <edumazet@google.com>
CC: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Tom Herbert <therbert@google.com>
Fixes: 04ffcb25 ("net: Add ndo_gso_check")
Tested-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5f35227e
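
The per-packet feature filtering described in 5f35227e can be illustrated with a small user-space sketch. Everything here is hypothetical: the feature flags, the packet struct, the header-length limit and `demo_features_check()` are invented for illustration and do not reproduce the kernel's types or callback signature. The point is only that the callback returns the feature set with unsupported offloads masked out.

```c
#include <stdint.h>

typedef uint64_t demo_features_t;

/* Hypothetical offload feature bits. */
#define DEMO_F_CSUM (UINT64_C(1) << 0)
#define DEMO_F_GSO  (UINT64_C(1) << 1)
#define DEMO_F_SG   (UINT64_C(1) << 2)

/* Hypothetical hardware limit on encapsulation header length. */
#define DEMO_MAX_ENCAP_HDR_LEN 80u

struct demo_packet {
	int encapsulated;           /* non-zero for tunneled packets */
	unsigned int encap_hdr_len; /* outer header length in bytes */
};

/*
 * Return the subset of 'features' the (imaginary) device can apply to
 * this packet. Removing bits here lets the stack fall back to software
 * for exactly the offloads the device cannot handle.
 */
static demo_features_t demo_features_check(const struct demo_packet *pkt,
					   demo_features_t features)
{
	if (pkt->encapsulated && pkt->encap_hdr_len > DEMO_MAX_ENCAP_HDR_LEN)
		return features & ~(DEMO_F_CSUM | DEMO_F_GSO);

	return features;
}
```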

24 Dec, 2014 1 commit

audit: restore AUDIT_LOGINUID unset ABI · 041d7b98

Committed by Richard Guy Briggs on Dec 23, 2014

A regression was caused by commit 780a7654:
	 audit: Make testing for a valid loginuid explicit.
(which in turn attempted to fix a regression caused by e1760bd5)

When audit_krule_to_data() fills in the rules to get a listing, there was a
missing clause to convert back from AUDIT_LOGINUID_SET to AUDIT_LOGINUID.

This broke userspace by not returning the same information that was sent and
expected.

The rule:
	auditctl -a exit,never -F auid=-1
gives:
	auditctl -l
		LIST_RULES: exit,never f24=0 syscall=all
when it should give:
		LIST_RULES: exit,never auid=-1 (0xffffffff) syscall=all

Tag it so that it is reported the same way it was set.  Create a new
private flags audit_krule field (pflags) to store it that won't interact with
the public one from the API.

Cc: stable@vger.kernel.org # v3.10-rc1+
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>

041d7b98

20 Dec, 2014 1 commit

PM: Eliminate CONFIG_PM_RUNTIME · 464ed18e

Committed by Rafael J. Wysocki on Dec 19, 2014

Having switched over all of the users of CONFIG_PM_RUNTIME to use
CONFIG_PM directly, turn the latter into a user-selectable option
and drop the former entirely from the tree.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>

464ed18e

19 Dec, 2014 1 commit

mm: cma: split cma-reserved in dmesg log · e48322ab

Committed by Pintu Kumar on Dec 18, 2014

When the system boots up, in the dmesg logs we can see the memory
statistics along with total reserved as below:

  Memory: 458840k/458840k available, 65448k reserved, 0K highmem

When CMA is enabled, the total reserved memory still remains the same.
However, the CMA memory is not considered as reserved.  But when we look at
/proc/meminfo, the CMA memory is part of free memory.  This creates
confusion.  This patch corrects the problem by properly subtracting the
CMA reserved memory from the total reserved memory in dmesg logs.

Below is the dmesg snapshot from an arm based device with 512MB RAM and
12MB single CMA region.

Before this change:
  Memory: 458840k/458840k available, 65448k reserved, 0K highmem

After this change:
  Memory: 458840k/458840k available, 53160k reserved, 12288k cma-reserved, 0K highmem
Signed-off-by: Pintu Kumar <pintu.k@samsung.com>
Signed-off-by: Vishnu Pratap Singh <vishnu.ps@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e48322ab
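
The numbers in the e48322ab example can be checked with a few lines of C. This is only a worked version of the arithmetic from the commit message (12MB of CMA is 12288k, and 65448k - 12288k = 53160k); the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical report: reserved memory split into non-CMA and CMA parts. */
struct demo_mem_report {
	uint64_t reserved_kb;     /* reserved memory excluding CMA */
	uint64_t cma_reserved_kb; /* CMA region, reported separately */
};

/* Subtract the CMA region from the total so each part is shown once. */
static struct demo_mem_report demo_split_reserved(uint64_t total_reserved_kb,
						  uint64_t cma_kb)
{
	struct demo_mem_report r;

	r.cma_reserved_kb = cma_kb;
	r.reserved_kb = total_reserved_kb - cma_kb;
	return r;
}
```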

18 Dec, 2014 12 commits

kernel: Provide READ_ONCE and ASSIGN_ONCE · 230fa253

Committed by Christian Borntraeger on Nov 25, 2014

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145).

Let's provide READ_ONCE/ASSIGN_ONCE that will do all accesses via
scalar types as suggested by Linus Torvalds. Accesses larger than
the machine's word size cannot be guaranteed to be atomic. For such
accesses, these macros will use memcpy and emit a build warning.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

230fa253
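
The "all accesses via scalar types" idea from 230fa253 can be sketched in user space roughly as follows. This is not the kernel's implementation: `demo_read_once_size()` is a hypothetical helper, the build warning for oversized accesses is omitted, and the scalar casts assume the object is suitably aligned and that the code is built the way the kernel is (without strict-aliasing optimizations) when the source is not itself a scalar of that width.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy 'size' bytes from *p to *res. Sizes that match a scalar type are
 * read with a single volatile scalar load, so the compiler cannot split,
 * merge or repeat the access. Larger sizes fall back to memcpy, which is
 * not guaranteed to be atomic.
 */
static void demo_read_once_size(const volatile void *p, void *res,
				size_t size)
{
	switch (size) {
	case 1:
		*(uint8_t *)res = *(const volatile uint8_t *)p;
		break;
	case 2:
		*(uint16_t *)res = *(const volatile uint16_t *)p;
		break;
	case 4:
		*(uint32_t *)res = *(const volatile uint32_t *)p;
		break;
	case 8:
		*(uint64_t *)res = *(const volatile uint64_t *)p;
		break;
	default:
		/* Wider than any scalar: plain copy, no atomicity. */
		memcpy(res, (const void *)p, size);
		break;
	}
}
```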

KVM: move APIC types to arch/x86/ · cb5281a5

Committed by Paolo Bonzini on Dec 17, 2014

They are no longer used by IA64; move them away.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cb5281a5

libceph: fixup includes in pagelist.h · 84a1d2d1

Committed by Ilya Dryomov on Nov 17, 2014

pagelist.h needs to include linux/types.h and asm/byteorder.h and not
rely on other headers pulling yet another set of headers.
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>

84a1d2d1

ceph: use getattr request to fetch inline data · 01deead0

Committed by Yan, Zheng on Nov 14, 2014

Add a new parameter 'locked_page' to ceph_do_getattr(). Inline data
in the getattr reply, if any, will be copied to the page.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

01deead0

ceph: add inline data to pagecache · 31c542a1

Committed by Yan, Zheng on Nov 14, 2014

Request replies and cap messages can contain inline data. Add inline data
to the page cache if there is an Fc cap.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

31c542a1

libceph: specify position of extent operation · 715e4cd4

Committed by Yan, Zheng on Nov 13, 2014

Allow specifying the position of an extent operation in a multi-operation
osd request. This is required for cephfs to convert inline data to
normal data (compare xattr, then write object).
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

715e4cd4

libceph: add SETXATTR/CMPXATTR osd operations support · d74b50be

Committed by Yan, Zheng on Nov 12, 2014

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

d74b50be

libceph: require cephx message signature by default · a3fc9800

Committed by Yan, Zheng on Nov 11, 2014

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

a3fc9800

libceph: update ceph_msg_header structure · d4e1a4e0

Committed by John Spray on Oct 16, 2014

2 bytes of what was reserved space is now used by userspace for the
compat_version field.
Signed-off-by: John Spray <john.spray@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>

d4e1a4e0

libceph: message signature support · 33d07337

Committed by Yan, Zheng on Nov 04, 2014

Signed-off-by: Yan, Zheng <zyan@redhat.com>

33d07337

libceph: nuke ceph_kvfree() · 4965fc38

Committed by Ilya Dryomov on Oct 23, 2014

Use kvfree() from linux/mm.h instead, which is identical.  Also fix the
ceph_buffer comment: we will allocate with kmalloc() up to 32k - the
value of PAGE_ALLOC_COSTLY_ORDER, but that really is just an
implementation detail so don't mention it at all.
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>

4965fc38

ceph: fix file lock interruption · 9280be24

Committed by Yan, Zheng on Oct 14, 2014

When a lock operation is interrupted, current code sends an unlock request to
MDS to undo the lock operation. This method does not work as expected because
the unlock request can drop locks that have already been acquired.

The fix is to use the newly introduced CEPH_LOCK_FCNTL_INTR/CEPH_LOCK_FLOCK_INTR
requests to interrupt blocked file lock requests. These requests do not drop
locks that have already been acquired; they only interrupt blocked file lock
requests.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

9280be24

17 Dec, 2014 6 commits

vm_area_operations: kill ->migrate() · 50062175

Committed by Al Viro on May 15, 2014

The only instance this method has ever grown was one in kernfs -
one that calls ->migrate() of another vm_ops if it exists.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

50062175

thermal: cpu_cooling: return ERR_PTR() for !CPU_THERMAL or !THERMAL_OF · 503ccc3f

Committed by Javi Merino on Dec 17, 2014

The documentation of of_cpufreq_cooling_register() and
cpufreq_cooling_register() says that they return ERR_PTR() on error.
Accordingly, callers only check for IS_ERR().  Therefore, make them
return ERR_PTR(-ENOSYS) as is customary in the kernel when config
options are missing.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>

503ccc3f
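
Why the return value of a stub matters when callers only check IS_ERR() can be shown with a user-space imitation of the error-pointer convention in 503ccc3f. The `demo_*` helpers below are hypothetical re-creations written for this illustration, not the kernel's macros, and `demo_cooling_register()` stands in for a registration function compiled out by a missing config option.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the top DEMO_MAX_ERRNO addresses. */
static inline bool demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

struct demo_cooling_device;

/* Stub used when the feature is not built in: report -ENOSYS. */
static inline struct demo_cooling_device *demo_cooling_register(void)
{
	return demo_err_ptr(-ENOSYS);
}
```

Note that `demo_is_err(NULL)` is false, so a stub returning NULL, for example, would look like success to a caller that checks only for an error pointer.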

new helper: iter_is_iovec() · 777eda2c

Committed by Al Viro on Dec 17, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

777eda2c

PM / Domains: Export of_genpd_get_from_provider function · 7496fcbe

Committed by Amit Daniel Kachhap on Dec 15, 2014

This function looks up a PM domain from the provider. This will be
useful to add parent/child domain relationships from the SoC specific
code. The caller of the function must make sure that the PM domain provider
is already registered.
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Amit Daniel Kachhap <amit.daniel@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

7496fcbe

cpuidle / ACPI: remove unused CPUIDLE_FLAG_TIME_INVALID · 62c4cf97

Committed by Len Brown on Dec 16, 2014

CPUIDLE_FLAG_TIME_INVALID is no longer checked
by menu or ladder cpuidle governors, so don't
bother setting or defining it.

It was originally invented to account for the fact that
acpi_safe_halt() enables interrupts to invoke HLT.
That would allow interrupt service routines to be included
in the last_idle duration measurements made in cpuidle_enter_state(),
potentially returning a duration much larger than reality.

But menu and ladder can gracefully handle erroneously large duration
intervals without checking for CPUIDLE_FLAG_TIME_INVALID.
Further, if they don't check CPUIDLE_FLAG_TIME_INVALID, they
can also benefit from the instances when the duration interval
is not erroneously large.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

62c4cf97

net: Allow FIXED_PHY to be modular. · 6539c44d

Committed by David S. Miller on Dec 16, 2014

Otherwise we get things like:

warning: (NET_DSA_BCM_SF2 && BCMGENET && SYSTEMPORT) selects FIXED_PHY which has unmet direct dependencies (NETDEVICES && PHYLIB=y)

In order to make this work we have to rename fixed.c to fixed_phy.c
because the regulator drivers already have a module named "fixed.o".
Signed-off-by: David S. Miller <davem@davemloft.net>

6539c44d

16 Dec, 2014 3 commits

x86, irq: Introduce helper to check whether an IOAPIC has been registered · e89900c9

Committed by Jiang Liu on Oct 27, 2014

Introduce acpi_ioapic_registered() to check whether an IOAPIC has already
been registered; it will be used when enabling IOAPIC hotplug.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Len Brown <len.brown@intel.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1414387308-27148-18-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

e89900c9

x86, irq: Keep balance of IOAPIC pin reference count · cffe0a2b

Committed by Jiang Liu on Oct 27, 2014

To keep balance of IOAPIC pin reference count, we need to protect
pirq_enable_irq(), acpi_pci_irq_enable() and intel_mid_pci_irq_enable()
from reentrance. There are two cases which will cause reentrance.

The first case is caused by suspend/hibernation. If pcibios_disable_irq
is called during suspending/hibernating, we don't release the assigned
IRQ number, otherwise it may break the suspend/hibernation. So later, when
pcibios_enable_irq is called during resume, we shouldn't allocate an IRQ
number again.

The second case is that function acpi_pci_irq_enable() may be called
twice for PCI devices present at boot time as below:
1) pci_acpi_init()
	--> acpi_pci_irq_enable() if pci_routeirq is true
2) pci_enable_device()
	--> pcibios_enable_device()
		--> acpi_pci_irq_enable()
We can't kill the kernel parameter pci_routeirq yet because it's still
needed for debugging purposes.

So flag irq_managed is introduced to track whether IRQ number is
assigned by OS and to protect pirq_enable_irq(), acpi_pci_irq_enable()
and intel_mid_pci_irq_enable() from reentrance.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/1414387308-27148-13-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

cffe0a2b
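
The reentrance guard described in cffe0a2b can be sketched as below. This is a heavily simplified user-space model: the device struct, the fixed IRQ number and the per-device reference counter are hypothetical stand-ins, and the suspend/hibernation path is left out. It only shows how a flag like irq_managed keeps a second enable call from taking a second pin reference.

```c
#include <stdbool.h>

/* Hypothetical, reduced device state for illustration. */
struct demo_pci_dev {
	int irq;           /* assigned IRQ number, 0 if none */
	bool irq_managed;  /* set once the OS has assigned the IRQ */
	int pin_refcount;  /* stand-in for the IOAPIC pin reference count */
};

/* Enable the IRQ; a repeated call is a no-op thanks to irq_managed. */
static int demo_irq_enable(struct demo_pci_dev *dev)
{
	if (dev->irq_managed && dev->irq > 0)
		return 0;

	dev->irq = 42;         /* stand-in for a real IRQ allocation */
	dev->pin_refcount++;
	dev->irq_managed = true;
	return 0;
}

/* Disable the IRQ; only releases what a managed enable acquired. */
static void demo_irq_disable(struct demo_pci_dev *dev)
{
	if (!dev->irq_managed)
		return;

	dev->pin_refcount--;
	dev->irq = 0;
	dev->irq_managed = false;
}
```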

IB/mlx5: Handle page faults · 7bdf65d4

Committed by Haggai Eran on Dec 11, 2014

This patch implements a page fault handler (leaving the pages pinned for
the time being).  The page fault handler handles initiator and responder
page faults for UD/RC transports, for send/receive operations, as well
as RDMA read/write initiator support.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

7bdf65d4

openanolis / cloud-kernel: synced successfully nearly 2 years ago