1. 15 Jan 2011 (1 commit)
  2. 24 Dec 2010 (6 commits)
  3. 23 Dec 2010 (2 commits)
    • taskstats: pad taskstats netlink response for aligment issues on ia64 · 4be2c95d
      Authored by Jeff Mahoney
      The taskstats structure is internally aligned on 8 byte boundaries but the
      layout of the aggregate reply, with two NLA headers and the pid (each 4
      bytes), actually forces the entire structure to be unaligned.  This causes
      the kernel to issue unaligned access warnings on some architectures like
      ia64.  Unfortunately, some software out there doesn't properly unroll the
      NLA packet and assumes that the start of the taskstats structure will
      always be 20 bytes from the start of the netlink payload.  Aligning the
      start of the taskstats structure breaks this software, which we don't
      want.  So, for now the alignment only happens on architectures that
      require it and those users will have to update to fixed versions of those
      packages.  Space is reserved in the packet only when needed.  This ifdef
      should be removed in several years e.g.  2012 once we can be confident
      that fixed versions are installed on most systems.  We add the padding
      before the aggregate since the aggregate is already a defined type.
      
      Commit 85893120 ("delayacct: align to 8 byte boundary on 64-bit systems")
      previously addressed the alignment issues by padding out the pid field.
      This was supposed to be a compatible change but the circumstances
      described above mean that it wasn't.  This patch backs out that change,
      since it was a hack, and introduces a new NULL attribute type to provide
      the padding.  Padding the response with 4 bytes avoids allocating an
      aligned taskstats structure and copying it back.  Since the structure
      weighs in at 328 bytes, it's too big to do it on the stack.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reported-by: Brian Rogers <brian@xyzw.org>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Guillaume Chazarain <guichaz@gmail.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4be2c95d
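
      A minimal sketch of the reply-construction side, assuming the padding
      attribute is named TASKSTATS_TYPE_NULL and the arch guard
      TASKSTATS_NEEDS_PADDING; the surrounding reply builder is abbreviated
      and may differ in detail from the final patch:

        static struct taskstats *mk_reply(struct sk_buff *skb, int type, u32 pid)
        {
                struct nlattr *na, *ret;
                int aggr = (type == TASKSTATS_TYPE_PID) ?
                                TASKSTATS_TYPE_AGGR_PID : TASKSTATS_TYPE_AGGR_TGID;

        #ifdef TASKSTATS_NEEDS_PADDING
                /* A zero-length NULL attribute contributes only its 4-byte NLA
                 * header, which pushes the taskstats payload that follows onto
                 * an 8-byte boundary on the architectures that need it. */
                if (nla_put(skb, TASKSTATS_TYPE_NULL, 0, NULL) < 0)
                        goto err;
        #endif
                na = nla_nest_start(skb, aggr);
                if (!na)
                        goto err;
                if (nla_put(skb, type, sizeof(pid), &pid) < 0)
                        goto err;
                /* reserve space for the stats payload; it is filled in later */
                ret = nla_reserve(skb, TASKSTATS_TYPE_STATS,
                                  sizeof(struct taskstats));
                if (!ret)
                        goto err;
                nla_nest_end(skb, na);
                return nla_data(ret);
        err:
                return NULL;
        }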
    • include/linux/unaligned: pack the whole struct rather than just the field · 4e06fd14
      Authored by Will Newton
      The current packed struct implementation of unaligned access adds the
      packed attribute only to the field within the unaligned struct rather than
      to the struct as a whole.  This is not sufficient to enforce proper
      behaviour on architectures with a default struct alignment of more than
      one byte.
      
      For example, the current implementation of __get_unaligned_cpu16, when
      compiled for arm with gcc -O1 -mstructure-size-boundary=32, assumes the
      struct is on a 4 byte boundary and so performs the load of the 16-bit
      packed field as if it were on a 4 byte boundary:
      
      __get_unaligned_cpu16:
              ldrh    r0, [r0, #0]
              bx      lr
      
      Moving the packed attribute to the struct rather than the field causes the
      proper unaligned access code to be generated:
      
      __get_unaligned_cpu16:
      	ldrb	r3, [r0, #0]	@ zero_extendqisi2
      	ldrb	r0, [r0, #1]	@ zero_extendqisi2
      	orr	r0, r3, r0, asl #8
      	bx	lr
      Signed-off-by: Will Newton <will.newton@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4e06fd14
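
      For reference, the fixed helper looks roughly like this (a sketch of the
      packed-struct idiom described above, not a verbatim copy of
      include/linux/unaligned/packed_struct.h):

        /* Pack the whole wrapper struct, not just the u16 member, so the
         * compiler cannot assume any alignment for the pointed-to data. */
        struct __una_u16 { u16 x; } __attribute__((packed));

        static inline u16 __get_unaligned_cpu16(const void *p)
        {
                const struct __una_u16 *ptr = (const struct __una_u16 *)p;

                /* emitted as byte loads on strict-alignment targets */
                return ptr->x;
        }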
  4. 21 Dec 2010 (1 commit)
  5. 18 Dec 2010 (3 commits)
  6. 17 Dec 2010 (4 commits)
    • block: max hardware sectors limit wrapper · 72d4cd9f
      Authored by Mike Snitzer
      Implement blk_limits_max_hw_sectors() and make
      blk_queue_max_hw_sectors() a wrapper around it.
      
      DM needs this to avoid setting queue_limits' max_hw_sectors and
      max_sectors directly.  dm_set_device_limits() now leverages
      blk_limits_max_hw_sectors() logic to establish the appropriate
      max_hw_sectors minimum (PAGE_SIZE).  Fixes an issue where DM was
      incorrectly setting max_sectors rather than max_hw_sectors (which
      caused dm_merge_bvec()'s max_hw_sectors check to be ineffective).
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@kernel.org
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      72d4cd9f
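
      A rough sketch of the split; the exact minimum enforcement and warning
      follow the description above and may differ from the final code:

        /* Operates on bare queue_limits, so stacking drivers like DM can use
         * it before a request_queue exists. */
        void blk_limits_max_hw_sectors(struct queue_limits *limits,
                                       unsigned int max_hw_sectors)
        {
                /* never allow less than one page worth of sectors */
                if ((max_hw_sectors << 9) < PAGE_SIZE)
                        max_hw_sectors = 1 << (PAGE_SHIFT - 9);

                limits->max_hw_sectors = max_hw_sectors;
                limits->max_sectors = min_t(unsigned int, max_hw_sectors,
                                            BLK_DEF_MAX_SECTORS);
        }

        void blk_queue_max_hw_sectors(struct request_queue *q,
                                      unsigned int max_hw_sectors)
        {
                blk_limits_max_hw_sectors(&q->limits, max_hw_sectors);
        }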
    • block: Deprecate QUEUE_FLAG_CLUSTER and use queue_limits instead · e692cb66
      Authored by Martin K. Petersen
      When stacking devices, a request_queue is not always available. This
      forced us to have a no_cluster flag in the queue_limits that could be
      used as a carrier until the request_queue had been set up for a
      metadevice.
      
      There were several problems with that approach. First of all it was up
      to the stacking device to remember to set the queue flag after stacking
      had completed. Also, the queue flag and the queue limits had to be kept in
      sync at all times. We got that wrong, which could lead to us issuing
      commands that went beyond the max scatterlist limit set by the driver.
      
      The proper fix is to avoid having two flags for tracking the same thing.
      We deprecate QUEUE_FLAG_CLUSTER and use the queue limit directly in the
      block layer merging functions. The queue_limit 'no_cluster' is turned
      into 'cluster' to avoid double negatives and to ease stacking.
      Clustering defaults to being enabled as before. The queue flag logic is
      removed from the stacking function, and explicitly setting the cluster
      flag is no longer necessary in DM and MD.
      Reported-by: Ed Lin <ed.lin@promise.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      e692cb66
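
      The merging code can then ask the limits directly instead of testing a
      queue flag; a sketch of the accessor this change introduces:

        /* Replaces test_bit(QUEUE_FLAG_CLUSTER, &q->queue_flags) at the merge
         * sites; the limit stacks naturally via blk_stack_limits(). */
        static inline int blk_queue_cluster(struct request_queue *q)
        {
                return q->limits.cluster;
        }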
    • SSB: Fix nvram_get on BCM47xx platform · 3f84622d
      Authored by Hauke Mehrtens
      The nvram_get function was never in the mainline kernel, it only existed in
      an external OpenWrt patch. Use the nvram_getenv function, which is in
      mainline, and use an include instead of an extra function declaration.
      et0macaddr contains the mac address in text form, like 00:11:22:33:44:55.
      We have to parse it before adding it into macaddr.
      
      nvram_parse_macaddr will be merged into asm/mach-bcm47xx/nvram.h through
      the MIPS git tree and will be available soon. This will not build without
      nvram_parse_macaddr, but it did not build before either.
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      To: linux-mips@linux-mips.org
      Cc: mb@bu3sch.de
      Cc: netdev@vger.kernel.org
      Cc: Hauke Mehrtens <hauke@hauke-m.de>
      Acked-by: Michael Buesch <mb@bu3sch.de>
      Patchwork: https://patchwork.linux-mips.org/patch/1849/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      3f84622d
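
      A hedged sketch of the resulting lookup; the helper name
      ssb_get_bcm47xx_mac below is made up for illustration, while nvram_getenv
      and nvram_parse_macaddr are the interfaces described above:

        #include <asm/mach-bcm47xx/nvram.h>

        /* Illustrative helper: read "00:11:22:33:44:55"-style text from nvram
         * and convert it into a 6-byte MAC address. */
        static void ssb_get_bcm47xx_mac(char *var, u8 *macaddr)
        {
                char buf[20];

                if (nvram_getenv(var, buf, sizeof(buf)) >= 0)
                        nvram_parse_macaddr(buf, macaddr);
        }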
    • PM / Runtime: Fix pm_runtime_suspended() · f08f5a0a
      Authored by Rafael J. Wysocki
      There are some situations (e.g. in __pm_generic_call()), where
      pm_runtime_suspended() is used to decide whether or not to execute
      a device's (system) ->suspend() callback.  The callback is not
      executed if pm_runtime_suspended() returns true, but it does so
      for devices that don't even support runtime PM, because the
      power.disable_depth device field is ignored by it.  This leads to
      problems (i.e. devices are not suspended when they should be), so rework
      pm_runtime_suspended() so that it returns false if the device's
      power.disable_depth field is different from zero.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@kernel.org
      f08f5a0a
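
      The reworked check amounts to something like the following sketch:

        /* True only if the device is runtime-suspended AND runtime PM is not
         * disabled for it; devices with power.disable_depth != 0 now report
         * false, so their system ->suspend() callback is not skipped. */
        static inline bool pm_runtime_suspended(struct device *dev)
        {
                return dev->power.runtime_status == RPM_SUSPENDED
                        && !dev->power.disable_depth;
        }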
  7. 16 Dec 2010 (2 commits)
  8. 15 Dec 2010 (1 commit)
    • Input: define separate EVIOCGKEYCODE_V2/EVIOCSKEYCODE_V2 · ab4e0192
      Authored by Dmitry Torokhov
      The desire to keep old names for the EVIOCGKEYCODE/EVIOCSKEYCODE while
      extending them to support large scancodes was a mistake. While we tried
      to keep ABI intact (and we succeeded in doing that, programs compiled
      on older kernels will work on newer ones) there is still a problem with
      recompiling existing software with newer kernel headers.
      
      New kernel headers will supply updated ioctl numbers and the kernel will
      expect that userspace will use struct input_keymap_entry to set and
      retrieve keymap data. But since the names of the ioctls are still the same,
      userspace will happily compile even if it has not been adjusted to make use
      of the new structure, and will then mysteriously start failing in the field.
      
      To avoid this issue let's revert the EVIOCGKEYCODE/EVIOCSKEYCODE definitions
      and add EVIOCGKEYCODE_V2/EVIOCSKEYCODE_V2 so that userspace can explicitly
      select the style of ioctls it wants to employ.
      Reviewed-by: Henrik Rydberg <rydberg@euromail.se>
      Acked-by: Jarod Wilson <jarod@redhat.com>
      Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
      ab4e0192
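
      The ioctl numbers end up along these lines (sketched from the description
      above; consult include/linux/input.h for the authoritative definitions):

        /* legacy pair: unsigned int[2] = { scancode, keycode } */
        #define EVIOCGKEYCODE           _IOR('E', 0x04, unsigned int[2])
        #define EVIOCSKEYCODE           _IOW('E', 0x04, unsigned int[2])

        /* new pair: callers must opt in to the larger structure explicitly */
        #define EVIOCGKEYCODE_V2        _IOR('E', 0x04, struct input_keymap_entry)
        #define EVIOCSKEYCODE_V2        _IOW('E', 0x04, struct input_keymap_entry)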
  9. 14 Dec 2010 (1 commit)
  10. 11 Dec 2010 (4 commits)
  11. 09 Dec 2010 (3 commits)
  12. 08 Dec 2010 (5 commits)
    • nfs: remove extraneous and problematic calls to nfs_clear_request · 2df485a7
      Authored by Trond Myklebust
      When a nfs_page is freed, nfs_free_request is called which also calls
      nfs_clear_request to clean out the lock and open contexts and free the
      pagecache page.
      
      However, a couple of places in the nfs code call nfs_clear_request
      themselves. What happens here if the refcount on the request is still high?
      We'll be releasing contexts and freeing pointers while the request is
      possibly still in use.
      
      Remove those bare calls to nfs_clear_request. That should only be done when
      the request is being freed.
      
      Note that when doing this, we need to watch out for tests of req->wb_page.
      Previously, nfs_set_page_tag_locked() and nfs_clear_page_tag_locked()
      would check the value of req->wb_page to figure out if the page is mapped
      into the nfsi->nfs_page_tree. We now indicate that the page is mapped using
      the new bit PG_MAPPED in req->wb_flags.
      Reported-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      2df485a7
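
      Conceptually the check changes like this (a simplified sketch, not the
      literal diff):

        /* before: "is this request in nfsi->nfs_page_tree?" was inferred from
         * the page pointer, which nfs_clear_request() used to NULL out */
        if (req->wb_page != NULL)
                radix_tree_tag_set(&nfsi->nfs_page_tree, req->wb_index,
                                   NFS_PAGE_TAG_LOCKED);

        /* after: membership is tracked explicitly with a flag bit */
        if (test_bit(PG_MAPPED, &req->wb_flags))
                radix_tree_tag_set(&nfsi->nfs_page_tree, req->wb_index,
                                   NFS_PAGE_TAG_LOCKED);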
    • fanotify: Introduce FAN_NOFD · e9a3854f
      Authored by Lino Sanfilippo
      FAN_NOFD is used in fanotify events that do not provide an open file
      descriptor (like the overflow_event).
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      e9a3854f
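
      From userspace, the constant shows up in the fd field of struct
      fanotify_event_metadata and means "no descriptor to read or close".  An
      illustrative listener-side check:

        #include <linux/fanotify.h>
        #include <unistd.h>

        /* illustrative: handle one event from the fanotify read buffer */
        static void handle_event(const struct fanotify_event_metadata *meta)
        {
                if (meta->fd == FAN_NOFD) {
                        /* e.g. the overflow event: no file was opened for us,
                         * so there is nothing to read from or close */
                        return;
                }
                /* normal event: use and then release the descriptor */
                close(meta->fd);
        }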
    • fanotify: on group destroy allow all waiters to bypass permission check · 09e5f14e
      Authored by Lino Sanfilippo
      When fanotify_release() is called, there may still be processes waiting for
      access permission. Currently only processes for which an event has already
      been queued into the group's access list will be woken up.  Processes for
      which no event has been queued will continue to sleep and thus cause a
      deadlock when fsnotify_put_group() is called.
      Furthermore there is a race allowing further processes to be waiting on the
      access wait queue after wake_up (if they arrive before clear_marks_by_group()
      is called).
      This patch corrects this by setting a flag to inform processes that the group
      is about to be destroyed and thus not to wait for access permission.
      
      [additional changelog from eparis]
      Let's think about the 4 relevant code paths from the PoV of the
      'operator', 'listener', 'responder' and 'closer'.  The 'operator' is the
      process doing an action (like open/read) which could require permission.
      The 'listener' is the task (or in this case thread) tasked with reading
      from the fanotify file descriptor.  The 'responder' is the thread
      responsible for responding to access requests.  The 'closer' is the
      thread attempting to close the fanotify file descriptor.
      
      The 'operator' is going to end up in:
      fanotify_handle_event()
        get_response_from_access()
          (THIS BLOCKS WAITING ON USERSPACE)
      
      The 'listener' code path:
      fanotify_read()
        copy_event_to_user()
          prepare_for_access_response()
            (THIS CREATES AN fanotify_response_event)
      
      The 'responder' code path:
      fanotify_write()
        process_access_response()
          (REMOVE A fanotify_response_event, SET RESPONSE, WAKE UP 'operator')
      
      The 'closer':
      fanotify_release()
        (SUPPOSED TO CLEAN UP THE REST OF THIS MESS)
      
      What we have today is that in the closer we remove all of the
      fanotify_response_events and set a bit so no more response events are
      ever created in prepare_for_access_response().
      
      The bug is that we never wake all of the operators up and tell them to
      move along.  You fix that in fanotify_get_response_from_access().  You
      also fix other operators which haven't gotten there yet.  So I agree
      that's a good fix.
      [/additional changelog from eparis]
      
      [remove additional changes to minimize patch size]
      [move initialization so it was inside CONFIG_FANOTIFY_PERMISSION]
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      09e5f14e
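
      A sketch of the flag-based bypass (the field name bypass_perm is taken
      from the patch discussion; the real structure layout may differ):

        /* 'closer': fanotify_release() tells every sleeping 'operator' to stop
         * waiting for a response and then wakes them all up. */
        atomic_inc(&group->fanotify_data.bypass_perm);
        wake_up(&group->fanotify_data.access_waitq);

        /* 'operator': get_response_from_access() no longer sleeps forever once
         * the group is going away. */
        wait_event(group->fanotify_data.access_waitq,
                   event->response ||
                   atomic_read(&group->fanotify_data.bypass_perm));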
    • fanotify: if set by user unset FMODE_NONOTIFY before fsnotify_perm() is called · b1085ba8
      Authored by Lino Sanfilippo
      Unsetting FMODE_NONOTIFY in fsnotify_open() is too late, since fsnotify_perm()
      is called before it. If FMODE_NONOTIFY is set, fsnotify_perm() will skip
      permission checks, so a user can still disable permission checks by setting
      this flag in an open() call.
      This patch corrects this by unsetting the flag before fsnotify_perm() is called.
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      b1085ba8
    • fanotify: remove packed from access response message · 88d60c32
      Authored by Eric Paris
      fanotify has decided to be careful about alignment and packing rather
      than rely on __attribute__((packed)) for multiarch support.  Since this
      attribute isn't doing anything for fanotify_response, we just drop it.
      This does not break API/ABI.
      Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      88d60c32
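
      For reference, the message is two naturally aligned 32-bit fields, so the
      attribute was a no-op anyway (definition as described above; see
      include/linux/fanotify.h for the authoritative layout):

        /* 8 bytes, laid out identically on all supported architectures with or
         * without __attribute__((packed)) */
        struct fanotify_response {
                __s32 fd;
                __u32 response;         /* FAN_ALLOW or FAN_DENY */
        };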
  13. 07 Dec 2010 (3 commits)
    • Input: add input driver for polled GPIO buttons · 0e7d0c86
      Authored by Gabor Juhos
      The existing gpio-keys driver is usable only for GPIO lines with
      interrupt support. Several devices have buttons connected to a GPIO
      line which is not capable of generating interrupts. This patch adds a
      new input driver using the generic GPIO layer and the input-polldev
      to support such buttons.
      
      [Ben Gardiner <bengardiner@nanometrics.ca>: fold code to use more
       of the original gpio_keys infrastructure; cleanups and other
       improvements.]
      Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
      Signed-off-by: Ben Gardiner <bengardiner@nanometrics.ca>
      Tested-by: Ben Gardiner <bengardiner@nanometrics.ca>
      Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
      0e7d0c86
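
      Board code would describe such buttons roughly as follows.  This is a
      hedged sketch reusing the gpio_keys platform data plus a poll interval;
      the device name and fields follow the mainline driver as described, but
      check the driver for the exact binding:

        #include <linux/gpio_keys.h>
        #include <linux/input.h>
        #include <linux/platform_device.h>

        static struct gpio_keys_button board_buttons[] = {
                {
                        .desc           = "reset",
                        .code           = KEY_RESTART,
                        .gpio           = 7,    /* line without interrupt support */
                        .active_low     = 1,
                },
        };

        static struct gpio_keys_platform_data board_button_data = {
                .buttons        = board_buttons,
                .nbuttons       = ARRAY_SIZE(board_buttons),
                .poll_interval  = 20,   /* ms between GPIO polls */
        };

        static struct platform_device board_button_device = {
                .name           = "gpio-keys-polled",
                .id             = -1,
                .dev            = {
                        .platform_data  = &board_button_data,
                },
        };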
    • PM / Hibernate: Fix memory corruption related to swap · c9e664f1
      Authored by Rafael J. Wysocki
      There is a problem that swap pages allocated before the creation of
      a hibernation image can be released and used for storing the contents
      of different memory pages while the image is being saved.  Since the
      kernel stored in the image doesn't know of that, it causes memory
      corruption to occur after resume from hibernation, especially on
      systems with relatively small RAM that need to swap often.
      
      This issue can be addressed by keeping the GFP_IOFS bits clear
      in gfp_allowed_mask during the entire hibernation, including the
      saving of the image, until the system is finally turned off or
      the hibernation is aborted.  Unfortunately, for this purpose
      it's necessary to rework the way in which the hibernate and
      suspend code manipulates gfp_allowed_mask.
      
      This change is based on an earlier patch from Hugh Dickins.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Reported-by: Ondrej Zary <linux@rainbow-software.org>
      Acked-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: stable@kernel.org
      c9e664f1
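
      The reworked helpers boil down to something like this sketch (the
      save/restore detail may differ from the final code):

        /* mm/page_alloc.c (sketch) */
        static gfp_t saved_gfp_mask;

        void pm_restrict_gfp_mask(void)
        {
                BUG_ON(saved_gfp_mask);
                saved_gfp_mask = gfp_allowed_mask;
                /* no I/O or FS allocations while the image is being written */
                gfp_allowed_mask &= ~GFP_IOFS;
        }

        void pm_restore_gfp_mask(void)
        {
                if (saved_gfp_mask) {
                        gfp_allowed_mask = saved_gfp_mask;
                        saved_gfp_mask = 0;
                }
        }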
    • filter: fix sk_filter rcu handling · 46bcf14f
      Authored by Eric Dumazet
      Pavel Emelyanov tried to fix a race between sk_filter_(de|at)tach and
      sk_clone() in commit 47e958ea.
      
      Problem is we can have several clones sharing a common sk_filter, and
      these clones might want to sk_filter_attach() their own filters at the
      same time, and can overwrite old_filter->rcu, corrupting RCU queues.
      
      We can not use filter->rcu without being sure no other thread could do
      the same thing.
      
      Switch the code to a more conventional ref-counting technique: do the
      atomic decrement immediately and queue one RCU callback when the last
      reference is released.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      46bcf14f
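
      The resulting pattern, sketched (whether plain call_rcu() or call_rcu_bh()
      is used is a detail of the networking paths involved):

        /* last reference dropped -> free after a grace period; fp->rcu is
         * only ever queued once, so concurrent attach/detach can no longer
         * corrupt the RCU queues */
        void sk_filter_release_rcu(struct rcu_head *rcu)
        {
                struct sk_filter *fp = container_of(rcu, struct sk_filter, rcu);

                kfree(fp);
        }

        static inline void sk_filter_release(struct sk_filter *fp)
        {
                if (atomic_dec_and_test(&fp->refcnt))
                        call_rcu_bh(&fp->rcu, sk_filter_release_rcu);
        }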
  14. 06 Dec 2010 (1 commit)
  15. 05 Dec 2010 (1 commit)
  16. 03 Dec 2010 (2 commits)
    • mem-hotplug: introduce {un}lock_memory_hotplug() · 20d6c96b
      Authored by KOSAKI Motohiro
      Presently hwpoison is using lock_system_sleep() to prevent a race with
      memory hotplug.  However lock_system_sleep() is a no-op if
      CONFIG_HIBERNATION=n.  Therefore we need a new lock.
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Suggested-by: Hugh Dickins <hughd@google.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      20d6c96b
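
      A sketch of the new lock pair, which hwpoison and the hotplug paths take
      instead of relying on lock_system_sleep() alone:

        /* mm/memory_hotplug.c (sketch) */
        static DEFINE_MUTEX(mem_hotplug_mutex);

        void lock_memory_hotplug(void)
        {
                mutex_lock(&mem_hotplug_mutex);
                /* still block hibernation too, where it is configured in */
                lock_system_sleep();
        }

        void unlock_memory_hotplug(void)
        {
                unlock_system_sleep();
                mutex_unlock(&mem_hotplug_mutex);
        }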
    • vmalloc: eagerly clear ptes on vunmap · 64141da5
      Authored by Jeremy Fitzhardinge
      On stock 2.6.37-rc4, running:
      
        # mount lilith:/export /mnt/lilith
        # find  /mnt/lilith/ -type f -print0 | xargs -0 file
      
      crashes the machine fairly quickly under Xen.  Often it results in oops
      messages, but the couple of times I tried just now, it just hung quietly
      and made Xen print some rude messages:
      
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
          3000000000000000) for mfn 1d7058 (pfn 18fa7)
          (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
          1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
          (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04
      
      Which means the domain tried to map a pagetable page RW, which would
      allow it to map arbitrary memory, so Xen stopped it.  This is because
      vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
      finished with them, and those pages got recycled as pagetable pages
      while still having these RW aliases.
      
      Removing those mappings immediately removes the Xen-visible aliases, and
      so it has no problem with those pages being reused as pagetable pages.
      Deferring the TLB flush doesn't upset Xen because it can flush the TLB
      itself as needed to maintain its invariants.
      
      When unmapping a region in the vmalloc space, clear the ptes
      immediately.  There's no point in deferring this because there's no
      amortization benefit.
      
      The TLBs are left dirty, and they are flushed lazily to amortize the
      cost of the IPIs.
      
      The specific motivation for this patch is an oops-causing regression
      since 2.6.36 when using NFS under Xen, triggered by the NFS client's use
      of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped
      pages").  XFS also uses vm_map_ram() and could cause similar problems.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Bryan Schumaker <bjschuma@netapp.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      64141da5
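
      The core of the change, sketched: clear the ptes when the area is
      unmapped and defer only the TLB flush (function names follow mm/vmalloc.c
      of that era; the exact call sites may differ):

        /* mm/vmalloc.c (sketch) */
        static void free_unmap_vmap_area_noflush(struct vmap_area *va)
        {
                /* remove the ptes now, so no stale RW alias of the pages
                 * survives until the lazy purge... */
                unmap_vmap_area(va);
                /* ...but keep batching the TLB flush to amortize the IPIs */
                free_vmap_area_noflush(va);
        }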