Commits · ae191838b0251d73b9d0a7254c6938406f5f6320 · openanolis / cloud-kernel

08 Mar, 2011 2 commits

nilfs2: optimize rec_len functions · ae191838

Authored by Ryusuke Konishi on Feb 04, 2011

This is a similar change to those in ext2/ext3 codebase (commit
40a063f6 and a4ae3094, respectively).

The addition of 64k block capability in the rec_len_from_disk and
rec_len_to_disk functions added a bit of math overhead which slows
down file create workloads needlessly when the architecture cannot
even support 64k blocks.  This will cut the corner.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

ae191838
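
The optimization described in the nilfs2 commit above can be sketched roughly as follows. This is a minimal userspace illustration, not the nilfs2 source: the `sk_` names and `SK_PAGE_SIZE` are hypothetical stand-ins, and endianness conversion is left out. The idea is that the special case for 64k records is compiled in only when the page size can actually reach 64k.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's page-size constant. */
#ifndef SK_PAGE_SIZE
#define SK_PAGE_SIZE 4096u
#endif

/* A 16-bit on-disk field cannot hold 65536, so in this sketch the
 * largest value is reserved to mean "64 KiB". */
#define SK_MAX_REC_LEN 0xFFFFu

static inline unsigned sk_rec_len_from_disk(uint16_t dlen)
{
    unsigned len = dlen;

#if SK_PAGE_SIZE >= 65536
    /* Only built when 64k blocks are possible at all. */
    if (len == SK_MAX_REC_LEN)
        return 1u << 16;
#endif
    return len;
}

static inline uint16_t sk_rec_len_to_disk(unsigned len)
{
#if SK_PAGE_SIZE >= 65536
    if (len == (1u << 16))
        return (uint16_t)SK_MAX_REC_LEN;
#endif
    return (uint16_t)len;
}
```

With the default 4096-byte page size assumed here, both helpers reduce to a plain conversion with no branch, which is the saving the commit message refers to.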

nilfs2: use common file attribute macros · f0c9f242

Authored by Ryusuke Konishi on Jan 20, 2011

Replaces uses of own inode flags (i.e. NILFS_SECRM_FL, NILFS_UNRM_FL,
NILFS_COMPR_FL, and so forth) with common inode flags, and removes the
own flag declarations.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

f0c9f242

05 Mar, 2011 5 commits

mm: add alloc_page_vma_node() · 236344d6

Authored by Andi Kleen on Mar 04, 2011

Add an alloc_page_vma_node() that allows passing the "local" node in.  Used
in a follow-on patch.
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

236344d6

mm: change alloc_pages_vma to pass down the policy node for local policy · 2f5f9486

Authored by Andi Kleen on Mar 04, 2011

Currently alloc_pages_vma() always uses the local node as policy node for
the LOCAL policy.  Pass this node down as an argument instead.

No behaviour change from this patch, but will be needed for followons.
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2f5f9486

libceph: fix msgr keepalive flag · e76661d0

Authored by Sage Weil on Mar 03, 2011

There was some broken keepalive code using a dead variable.  Shift to using
the proper bit flag.
Signed-off-by: Sage Weil <sage@newdream.net>

e76661d0

libceph: fix msgr backoff · 60bf8bf8

Authored by Sage Weil on Mar 04, 2011

With commit f363e45f we replaced a bunch of hacky workqueue mutual
exclusion logic with the WQ_NON_REENTRANT flag.  One piece of fallout is
that the exponential backoff breaks in certain cases:

 * con_work attempts to connect.
 * we get an immediate failure, and the socket state change handler queues
   immediate work.
 * con_work calls con_fault, we decide to back off, but can't queue delayed
   work.

In this case, we add a BACKOFF bit to make con_work reschedule delayed work
next time it runs (which should be immediately).
Signed-off-by: Sage Weil <sage@newdream.net>

60bf8bf8
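
The sequence in the libceph backoff commit above can be modelled with a small state machine. This is only a sketch under stated assumptions: the struct, the `sk_` names and the delay constants are hypothetical, and the workqueue is reduced to a single "work pending" flag. It shows the fault path falling back to a BACKOFF bit when delayed work cannot be queued, and the worker acting on that bit the next time it runs.

```c
#include <stdbool.h>

enum { SK_BACKOFF = 1u << 0 };          /* hypothetical state bit */

#define SK_BASE_DELAY_MS 250u           /* assumed values, for illustration */
#define SK_MAX_DELAY_MS  300000u

struct sk_con {
    unsigned flags;
    unsigned delay_ms;         /* current backoff delay, 0 before any fault */
    bool     work_pending;     /* a work item is already queued */
    unsigned queued_delay_ms;  /* delay of the most recently queued item */
};

/* Queueing fails if work is already pending, as with a real work item. */
static bool sk_queue_work(struct sk_con *c, unsigned delay_ms)
{
    if (c->work_pending)
        return false;
    c->work_pending = true;
    c->queued_delay_ms = delay_ms;
    return true;
}

/* Fault path: grow the delay; if delayed work cannot be queued,
 * remember that a backoff is still owed. */
static void sk_con_fault(struct sk_con *c)
{
    if (c->delay_ms == 0)
        c->delay_ms = SK_BASE_DELAY_MS;
    else if (c->delay_ms < SK_MAX_DELAY_MS)
        c->delay_ms *= 2;

    if (!sk_queue_work(c, c->delay_ms))
        c->flags |= SK_BACKOFF;
}

/* Worker: returns true if it only rescheduled itself with the delay. */
static bool sk_con_work(struct sk_con *c)
{
    c->work_pending = false;            /* this run consumes the queued item */
    if (c->flags & SK_BACKOFF) {
        c->flags &= ~SK_BACKOFF;
        sk_queue_work(c, c->delay_ms);
        return true;
    }
    return false;                       /* a connect attempt would go here */
}
```

In this model, a fault that arrives while immediate work is already pending sets `SK_BACKOFF`, and the next `sk_con_work()` call queues the delayed retry instead of reconnecting straight away.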

Mark ptrace_{traceme,attach,detach} static · e3e89cc5

Authored by Linus Torvalds on Mar 04, 2011

They are only used inside kernel/ptrace.c, and have been for a long
time. We don't want to go back to the bad-old-days when architectures
did things on their own, so make them static and private.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e3e89cc5

03 Mar, 2011 1 commit

blktrace: Remove blk_fill_rwbs_rq. · 2d3a8497

Authored by Tao Ma on Mar 03, 2011

If we enable trace events to trace block actions, we use
blk_fill_rwbs_rq to analyze the corresponding actions
in the request's cmd_flags, but we only take the low 2 bits
from it, so most of the other flags (e.g. REQ_SYNC) are missing.
For example, with a sync write we get:
write_test-2409  [001]   160.013869: block_rq_insert: 3,64 W 0 () 258135 + 8 [write_test]

Since now we have integrated the flags of both bio and request,
it is safe to pass rq->cmd_flags directly to blk_fill_rwbs and
blk_fill_rwbs_rq isn't needed any more.

With this patch, after a sync write we get:
write_test-2417  [000]   226.603878: block_rq_insert: 3,64 WS 0 () 258135 + 8 [write_test]
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

2d3a8497
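
To illustrate why the sync flag went missing in the blktrace commit above, here is a small sketch of building the action string from request flags. The flag bit values and `sk_` function names are hypothetical (the kernel's real REQ_* values differ), and only three flags are modelled.

```c
#include <stddef.h>

/* Hypothetical flag bits, chosen so that SYNC lies above the low two bits. */
enum {
    SK_REQ_WRITE = 1u << 0,
    SK_REQ_SYNC  = 1u << 4,
    SK_REQ_META  = 1u << 5,
};

/* Build a short action string such as "WS" from the request flags.
 * rwbs must have room for at least 4 bytes. */
static void sk_fill_rwbs(char *rwbs, unsigned flags)
{
    size_t i = 0;

    rwbs[i++] = (flags & SK_REQ_WRITE) ? 'W' : 'R';
    if (flags & SK_REQ_SYNC)
        rwbs[i++] = 'S';
    if (flags & SK_REQ_META)
        rwbs[i++] = 'M';
    rwbs[i] = '\0';
}

/* Old behaviour in this sketch: only the low two bits are kept,
 * so the SYNC and META flags never reach the string builder. */
static void sk_fill_rwbs_masked(char *rwbs, unsigned flags)
{
    sk_fill_rwbs(rwbs, flags & 0x3u);
}
```

Passing the full flag word to `sk_fill_rwbs()` yields "WS" for a sync write, while the masked variant yields only "W", which mirrors the two trace lines quoted in the commit message.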

02 Mar, 2011 3 commits

block: add @force_kblockd to __blk_run_queue() · 1654e741

Authored by Tejun Heo on Mar 02, 2011

__blk_run_queue() automatically either calls q->request_fn() directly
or schedules kblockd depending on whether the function is recursed.
blk-flush implementation needs to be able to explicitly choose
kblockd.  Add @force_kblockd.

All the current users are converted to specify %false for the
parameter and this patch doesn't introduce any behavior change.

stable: This is prerequisite for fixing ide oops caused by the new
        blk-flush implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

1654e741

mfd: Don't suspend WM8994 if the CODEC is not suspended · 77bd70e9

Authored by Mark Brown on Feb 04, 2011

ASoC supports keeping the audio subsystem active over suspend in order
to support use cases such as audio passthrough from a cellular modem
with the main CPU suspended. Ensure that we don't power down the CODEC
when this is happening by checking to see if VMID is up and skipping
suspend and resume when it is. If the CODEC has suspended then it'll
turn VMID off before the core suspend() gets called.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>

77bd70e9

blk-throttle: Do not use kblockd workqueue for throtl work · 450adcbe

Authored by Vivek Goyal on Mar 01, 2011

o Dominik Klein reported a system hang issue while doing some blkio
  throttling testing.

  https://lkml.org/lkml/2011/2/24/173

o Some tracing revealed that CFQ was not dispatching any more jobs as
  queue unplug was not happening. And queue unplug was not happening
  because unplug work was not being called as there was one throttling
  work on the same cpu which had not finished yet. And the throttling
  work had not finished as it was trying to dispatch a bio to CFQ but
  all the request descriptors were consumed, so it was put to sleep.

o So basically it is a cyclic dependency between CFQ unplug work and
  throtl dispatch work. Tejun suggested using a separate workqueue for
  such cases.

o This patch uses a separate workqueue for throttle related work and
  does not rely on kblockd workqueue anymore.

Cc: stable@kernel.org
Reported-by: Dominik Klein <dk@in-telegence.net>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

450adcbe

01 Mar, 2011 1 commit

ACPI: Fix build for CONFIG_NET unset · af06216a

Authored by Rafael J. Wysocki on Mar 01, 2011

Several ACPI drivers fail to build if CONFIG_NET is unset, because
they refer to things depending on CONFIG_THERMAL that in turn depends
on CONFIG_NET.  However, CONFIG_THERMAL doesn't really need to depend
on CONFIG_NET, because the only part of it requiring CONFIG_NET is
the netlink interface in thermal_sys.c.

Put the netlink interface in thermal_sys.c under #ifdef CONFIG_NET
and remove the dependency of CONFIG_THERMAL on CONFIG_NET from
drivers/thermal/Kconfig.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Len Brown <lenb@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Luming Yu <luming.yu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

af06216a

26 Feb, 2011 1 commit

rapidio: fix sysfs config attribute to access 16MB of maint space · fe41947e

Authored by Alexandre Bounine on Feb 25, 2011

Fixes sysfs config attribute to allow access to entire 16MB maintenance
space of RapidIO devices.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fe41947e

25 Feb, 2011 1 commit

PM: Make ACPI wakeup from S5 work again when CONFIG_PM_SLEEP is unset · 805bdaec

Authored by Rafael J. Wysocki on Feb 24, 2011

Commit 074037ec (PM / Wakeup: Introduce wakeup source objects and
event statistics (v3)) caused ACPI wakeup to only work if
CONFIG_PM_SLEEP is set, but it also worked for CONFIG_PM_SLEEP unset
before.  This can be fixed by making device_set_wakeup_enable(),
device_init_wakeup() and device_may_wakeup() work in the same way
as before commit 074037ec when CONFIG_PM_SLEEP is unset.
Reported-and-tested-by: Justin Maggard <jmaggard10@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

805bdaec

24 Feb, 2011 2 commits

Fix over-zealous flush_disk when changing device size. · 93b270f7

Authored by NeilBrown on Feb 24, 2011

There are two cases when we call flush_disk.
In one, the device has disappeared (check_disk_change) so any
data we hold becomes irrelevant.
In the other, the device has changed size (check_disk_size_change)
so data we hold may be irrelevant.

In both cases it makes sense to discard any 'clean' buffers,
so they will be read back from the device if needed.

In the former case it makes sense to discard 'dirty' buffers
as there will never be anywhere safe to write the data.  In the
second case it *does*not* make sense to discard dirty buffers
as that will lead to file system corruption when you simply enlarge
the containing devices.

flush_disk calls __invalidate_device.
__invalidate_device calls both invalidate_inodes and invalidate_bdev.

invalidate_inodes *does* discard I_DIRTY inodes and this does lead
to fs corruption.

invalidate_bdev *does*not* discard dirty pages, but I don't really care
about that at present.

So this patch adds a flag to __invalidate_device (calling it
__invalidate_device2) to indicate whether dirty buffers should be
killed, and this is passed to invalidate_inodes which can choose to
skip dirty inodes.

flush_disk then passes true from check_disk_change and false from
check_disk_size_change.

dm avoids tripping over this problem by calling i_size_write directly
rather than using check_disk_size_change.

md does use check_disk_size_change and so is affected.

This regression was introduced by commit 608aeef1 which causes
check_disk_size_change to call flush_disk, so it is suitable for any
kernel since 2.6.27.

Cc: stable@kernel.org
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Andrew Patterson <andrew.patterson@hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NeilBrown <neilb@suse.de>

93b270f7
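
The flag described in the flush_disk commit above can be pictured with a toy model. This is a sketch under assumptions, not the VFS code: `struct sk_inode` and `sk_invalidate_inodes()` are hypothetical, and an "inode" is reduced to two booleans. The point is that clean cached entries are always dropped, while dirty ones are dropped only when the caller asks for it.

```c
#include <stdbool.h>
#include <stddef.h>

struct sk_inode {
    bool cached;   /* still held in memory */
    bool dirty;    /* has unwritten changes */
};

/* Drop cached inodes and return how many were discarded.
 * kill_dirty == true:  device is gone, dirty data can never be written.
 * kill_dirty == false: device only changed size, dirty data must survive. */
static size_t sk_invalidate_inodes(struct sk_inode *inodes, size_t n,
                                   bool kill_dirty)
{
    size_t dropped = 0;

    for (size_t i = 0; i < n; i++) {
        if (!inodes[i].cached)
            continue;
        if (inodes[i].dirty && !kill_dirty)
            continue;               /* keep it so it can be written back */
        inodes[i].cached = false;
        dropped++;
    }
    return dropped;
}
```

In terms of the commit message, the device-removal path would pass `true` and the size-change path would pass `false`.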

mm: prevent concurrent unmap_mapping_range() on the same inode · 2aa15890

Authored by Miklos Szeredi on Feb 23, 2011

Michael Leun reported that running parallel opens on a fuse filesystem
can trigger a "kernel BUG at mm/truncate.c:475"

Gurudas Pai reported the same bug on NFS.

The reason is, unmap_mapping_range() is not prepared for more than
one concurrent invocation per inode.  For example:

  thread1: going through a big range, stops in the middle of a vma and
     stores the restart address in vm_truncate_count.

  thread2: comes in with a small (e.g. single page) unmap request on
     the same vma, somewhere before restart_address, finds that the
     vma was already unmapped up to the restart address and happily
     returns without doing anything.

Another scenario would be two big unmap requests, both having to
restart the unmapping and each one setting vm_truncate_count to its
own value.  This could go on forever without any of them being able to
finish.

Truncate and hole punching already serialize with i_mutex.  Other
callers of unmap_mapping_range() do not, and it's difficult to get
i_mutex protection for all callers.  In particular ->d_revalidate(),
which calls invalidate_inode_pages2_range() in fuse, may be called
with or without i_mutex.

This patch adds a new mutex to 'struct address_space' to prevent
running multiple concurrent unmap_mapping_range() on the same mapping.

[ We'll hopefully get rid of all this with the upcoming mm
  preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex
  lockbreak" patch in particular.  But that is for 2.6.39 ]
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reported-by: Michael Leun <lkml20101129@newton.leun.net>
Reported-by: Gurudas Pai <gurudas.pai@oracle.com>
Tested-by: Gurudas Pai <gurudas.pai@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2aa15890

22 Feb, 2011 1 commit

module: explicitly align module_version_attribute structure · 98562ad8

Authored by Dmitry Torokhov on Feb 04, 2011

We force particular alignment when we generate attribute structures
when generating MODULE_VERSION() data and we need to make sure that
this alignment is followed when we iterate over these structures,
otherwise we may crash on platforms whose natural alignment is not
sizeof(void *), such as m68k.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
[ There are more issues here, but the fixes are incredibly ugly - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

98562ad8

20 Feb, 2011 1 commit

net: dcb: match dcb_app protocol field with 802.1Qaz spec · 226111d1

Authored by John Fastabend on Feb 18, 2011

The dcb_app protocol field is a __u32 however the 802.1Qaz
specification defines it as a 16 bit field. This patch brings
the structure in line with the spec, making it a __u16.

CC: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

226111d1

19 Feb, 2011 1 commit

Expand CONFIG_DEBUG_LIST to several other list operations · 3c18d4de

Authored by Linus Torvalds on Feb 18, 2011

When list debugging is enabled, we aim to readably show list corruption
errors, and the basic list_add/list_del operations end up having extra
debugging code in them to do some basic validation of the list entries.

However, "list_del_init()" and "list_move[_tail]()" ended up avoiding
the debug code due to how they were written. This fixes that.

So the _next_ time we have list_move() problems with stale list entries,
we'll hopefully have an easier time finding them..
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3c18d4de
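
A rough userspace sketch of the idea in the list-debugging commit above: route every removal, including the one hidden inside "delete and re-init" and "move", through a single checked helper. The `sk_` names are hypothetical, and where the kernel would print a warning this sketch just returns `false`.

```c
#include <stdbool.h>

struct sk_list {
    struct sk_list *next, *prev;
};

static void sk_list_init(struct sk_list *h)
{
    h->next = h;
    h->prev = h;
}

static void sk_list_add(struct sk_list *entry, struct sk_list *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Basic validation: both neighbours must point back at the entry. */
static bool sk_entry_valid(const struct sk_list *e)
{
    return e->prev->next == e && e->next->prev == e;
}

/* The one checked unlink helper that every removal path shares. */
static bool sk_list_del_entry(struct sk_list *e)
{
    if (!sk_entry_valid(e))
        return false;               /* corruption detected, do nothing */
    e->next->prev = e->prev;
    e->prev->next = e->next;
    return true;
}

static bool sk_list_del_init(struct sk_list *e)
{
    if (!sk_list_del_entry(e))
        return false;
    sk_list_init(e);
    return true;
}

static bool sk_list_move(struct sk_list *e, struct sk_list *head)
{
    if (!sk_list_del_entry(e))
        return false;
    sk_list_add(e, head);
    return true;
}
```

Because `sk_list_del_init()` and `sk_list_move()` both call `sk_list_del_entry()`, a stale entry is caught on those paths too rather than only on a plain delete.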

18 Feb, 2011 2 commits

RTC: Re-enable UIE timer/polling emulation · 456d66ec

Authored by John Stultz on Feb 11, 2011

This patch re-enables UIE timer/polling emulation for rtc devices
that do not support alarm irqs.

CC: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>

456d66ec

RTC: Revert UIE emulation removal · 6e57b1d6

Authored by John Stultz on Feb 11, 2011

Uwe pointed out that my alarm based UIE emulation is not sufficient
to replace the older timer/polling based UIE emulation on devices
where there is no alarm irq. This causes rtc devices without alarms
to return -EINVAL to UIE ioctls. The fix is to re-instate the old
timer/polling method for devices without alarm irqs.

This patch reverts the following commits:
042620a0 - Remove UIE emulation
1daeddd5 - Cleanup removed UIE emulation declaration
b5cc8ca1 - Remove Kconfig symbol for UIE emulation

The emulation mode will still need to be wired-in with a following
patch before it will work.

CC: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>

6e57b1d6

17 Feb, 2011 1 commit

workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable' · 58a69cb4

Authored by Tejun Heo on Feb 16, 2011

There are two spellings in use for 'freeze' + 'able' - 'freezable' and
'freezeable'.  The former is the more prominent one.  The latter is
mostly used by workqueue and in a few other odd places.  Unify the
spelling to 'freezable'.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Steven Whitehouse <swhiteho@redhat.com>

58a69cb4

16 Feb, 2011 1 commit

thp: prevent hugepages during args/env copying into the user stack · a7d6e4ec

Authored by Andrea Arcangeli on Feb 15, 2011

Transparent hugepages can only be created if rmap is fully
functional. So we must prevent hugepages from being created while
is_vma_temporary_stack() is true.

This also optimizes away some harmless but unnecessary setting of
khugepaged_scan.address and it switches some BUG_ON to VM_BUG_ON.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a7d6e4ec

14 Feb, 2011 1 commit

klist: Fix object alignment on 64-bit. · 795abaf1

Authored by David Miller on Feb 13, 2011

Commit c0e69a5b ("klist.c: bit 0 in pointer can't be used as flag")
intended to make sure that all klist objects were at least pointer size
aligned, but used the constant "4" which only works on 32-bit.

Use "sizeof(void *)" which is correct in all cases.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: stable <stable@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
795abaf1
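
The alignment requirement in the klist commit above exists because bit 0 of a pointer to the object is reused as a flag. A minimal sketch of that technique, assuming GCC or Clang for the `aligned` attribute; the `sk_` names are hypothetical and not the klist API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pointer-size alignment guarantees that the low bit of any valid
 * pointer to this type is zero, on 32-bit and 64-bit targets alike. */
struct sk_klist {
    int payload;
} __attribute__((aligned(sizeof(void *))));

#define SK_DEAD ((uintptr_t)1)

/* Store the "dead" flag in bit 0 of the pointer value. */
static void *sk_mark_dead(struct sk_klist *k)
{
    return (void *)((uintptr_t)k | SK_DEAD);
}

static bool sk_is_dead(const void *tagged)
{
    return ((uintptr_t)tagged & SK_DEAD) != 0;
}

/* Strip the flag to recover the real object address. */
static struct sk_klist *sk_untag(void *tagged)
{
    return (struct sk_klist *)((uintptr_t)tagged & ~SK_DEAD);
}
```

Writing the alignment as `sizeof(void *)` rather than a literal 4 keeps the declared alignment equal to the pointer size on every target, which is the change the commit describes.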

11 Feb, 2011 2 commits

Input: matrix_keypad - increase the limit of rows and columns · cfaea567

Authored by Trilok Soni on Feb 11, 2011

Some keyboard controllers support more than 16 columns and rows.
Increase the limit to 32.
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

cfaea567

security: add cred argument to security_capable() · 6037b715

Authored by Chris Wright on Feb 09, 2011

Expand security_capable() to include cred, so that it can be usable in a
wider range of call sites.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>

6037b715

09 Feb, 2011 2 commits

CDC NCM errata updates for cdc.h · 3a9dda76

Authored by Alexey Orishko on Feb 07, 2011

Changes are based on the following documents:
- CDC NCM errata:
http://www.usb.org/developers/devclass_docs/NCM10_012011.zip
- CDC and WMC errata link:
http://www.usb.org/developers/devclass_docs/CDC1.2_WMC1.1_012011.zip
Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

3a9dda76

virtio: console: Update Copyright · 5084f893

Authored by Amit Shah on Jan 31, 2011

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

5084f893

05 Feb, 2011 2 commits

genirq: Add missing status flags to modification mask · 872434d6

Authored by Thomas Gleixner on Feb 05, 2011

The mask which filters out the valid bits which can be set via
irq_modify_status() is missing IRQ_NO_BALANCING, which breaks UV.

Add IRQ_PER_CPU as well to avoid another one line patch for 39.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

872434d6

USB: Fix trout build failure with ci13xxx_msm gadget · 8cf28f1f

Authored by Pavankumar Kondeti on Feb 04, 2011

This patch fixes the below compilation errors.

CC drivers/usb/gadget/ci13xxx_msm.o
CC net/mac80211/led.o
drivers/usb/gadget/ci13xxx_msm.c: In function 'ci13xxx_msm_notify_event':
drivers/usb/gadget/ci13xxx_msm.c:42: error: 'USB_AHBBURST' undeclared (first use in this function)
drivers/usb/gadget/ci13xxx_msm.c:42: error: (Each undeclared identifier is reported only once
drivers/usb/gadget/ci13xxx_msm.c:42: error: for each function it appears in.)
drivers/usb/gadget/ci13xxx_msm.c:43: error: 'USB_AHBMODE' undeclared (first use in this function)
make[4]: *** [drivers/usb/gadget/ci13xxx_msm.o] Error 1
make[3]: *** [drivers/usb/gadget] Error 2

MSM USB driver is not supported on boards like trout (MSM7201) which
has an external PHY.
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

8cf28f1f

04 Feb, 2011 1 commit

net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6. · e2d57766

Authored by David S. Miller on Feb 03, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

e2d57766

03 Feb, 2011 6 commits

tracing: Replace syscall_meta_data struct array with pointer array · 3d56e331

Authored by Steven Rostedt on Feb 02, 2011

Currently the syscall_meta structures for the syscall tracepoints are
placed in the __syscall_metadata section, and at link time, the linker
makes one large array of all these syscall metadata structures. On boot
up, this array is read (much like the initcall sections) and the syscall
data is processed.

The problem is that there is no guarantee that gcc will place complex
structures nicely together in an array format. Two structures in the
same file may be placed awkwardly, because gcc has no clue that they
are supposed to be in an array.

A hack was previously used to force the alignment to 4, to pack the
structures together. But this caused alignment issues with other
architectures (sparc).

Instead of packing the structures into an array, the structures' addresses
are now put into the __syscall_metadata section. As pointers are always the
natural alignment, gcc should always pack them tightly together
(otherwise initcall, extable, etc would also fail).

By having the pointers to the structures in the section, we can still
iterate the trace_events without causing unnecessary alignment problems
with other architectures, or depending on the current behaviour of
gcc that will likely change in the future just to tick us kernel developers
off a little more.

The __syscall_metadata section is also moved into the .init.data section
as it is now only needed at boot up.
Suggested-by: David Miller <davem@davemloft.net>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

3d56e331

tracepoints: Fix section alignment using pointer array · 65498646

Authored by Mathieu Desnoyers on Jan 26, 2011

Make the tracepoints more robust, making them solid enough to handle compiler
changes by not relying on anything based on compiler-specific behavior with
respect to structure alignment. Implement an approach proposed by David Miller:
use an array of const pointers to refer to the individual structures, and export
this pointer array through the linker script rather than the structures per se.
It will consume 32 extra bytes per tracepoint (24 for structure padding and 8
for the pointers), but is less likely to break due to compiler changes.

History:

commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE()
added the aligned(32) type and variable attribute to the tracepoint structures
to deal with gcc happily aligning statically defined structures on 32-byte
multiples.

One attempt was to use an 8-byte alignment for tracepoint structures by applying
both the variable and type attribute to tracepoint structures definitions and
declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5.

The reason is that the "aligned" attribute only specifies the _minimum_ alignment
for a structure, leaving both the compiler and the linker free to align on
larger multiples. Because tracepoint.c expects the structures to be placed as an
array within each section, up-alignment causes NULL-pointer exceptions due to the
extra unexpected padding.

(this patch applies on top of -tip)
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: David S. Miller <davem@davemloft.net>
LKML-Reference: <20110126222622.GA10794@Krystal>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

65498646
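
The pointer-array approach that this commit and the two neighbouring tracing commits describe can be sketched in userspace. This is a rough illustration under several assumptions: it relies on GCC or Clang attributes and on an ELF linker (such as GNU ld) that provides `__start_`/`__stop_` symbols for sections whose names are valid C identifiers. The `sk_` names are hypothetical and none of this is the kernel's tracepoint code.

```c
#include <stddef.h>

struct sk_event {
    const char *name;
    int id;
    /* A real structure may be padded or up-aligned by the compiler. */
};

/* Define the structure wherever the compiler likes, and place only a
 * pointer to it in the dedicated section. Pointers have natural
 * alignment, so the section forms a densely packed array. */
#define SK_DEFINE_EVENT(sym, ev_name, ev_id)                        \
    static struct sk_event sym = { ev_name, ev_id };                \
    static struct sk_event *const sym##_ptr                         \
        __attribute__((section("sk_events"), used)) = &sym

SK_DEFINE_EVENT(sk_ev_open, "open", 1);
SK_DEFINE_EVENT(sk_ev_close, "close", 2);

/* Section bounds supplied by the linker (an assumption of this sketch). */
extern struct sk_event *const __start_sk_events[];
extern struct sk_event *const __stop_sk_events[];

static int sk_count_events(void)
{
    int n = 0;

    for (struct sk_event *const *p = __start_sk_events;
         p < __stop_sk_events; p++)
        n++;
    return n;
}

static const struct sk_event *sk_find_event(int id)
{
    for (struct sk_event *const *p = __start_sk_events;
         p < __stop_sk_events; p++)
        if ((*p)->id == id)
            return *p;
    return NULL;
}
```

Iterating the section with a stride of `sizeof(void *)` no longer depends on how the compiler chose to align each `struct sk_event`, at the cost of one extra pointer per event.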

tracing: Replace trace_event struct array with pointer array · e4a9ea5e

Authored by Steven Rostedt on Jan 27, 2011

Currently the trace_event structures are placed in the _ftrace_events
section, and at link time, the linker makes one large array of all
the trace_event structures. On boot up, this array is read (much like
the initcall sections) and the events are processed.

The problem is that there is no guarantee that gcc will place complex
structures nicely together in an array format. Two structures in the
same file may be placed awkwardly, because gcc has no clue that they
are supposed to be in an array.

A hack was previously used to force the alignment to 4, to pack the
structures together. But this caused alignment issues with other
architectures (sparc).

Instead of packing the structures into an array, the structures' addresses
are now put into the _ftrace_event section. As pointers are always the
natural alignment, gcc should always pack them tightly together
(otherwise initcall, extable, etc would also fail).

By having the pointers to the structures in the section, we can still
iterate the trace_events without causing unnecessary alignment problems
with other architectures, or depending on the current behaviour of
gcc that will likely change in the future just to tick us kernel developers
off a little more.

The _ftrace_event section is also moved into the .init.data section
as it is now only needed at boot up.
Suggested-by: David Miller <davem@davemloft.net>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

e4a9ea5e

vfs: sparse: add __FMODE_EXEC · 3cd90ea4

Authored by Namhyung Kim on Feb 01, 2011

FMODE_EXEC is a constant type of fmode_t but was used with normal integer
constants.  This results in following warnings from sparse.  Fix it using
new macro __FMODE_EXEC.

 fs/exec.c:116:58: warning: restricted fmode_t degrades to integer
 fs/exec.c:689:58: warning: restricted fmode_t degrades to integer
 fs/fcntl.c:777:9: warning: restricted fmode_t degrades to integer
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3cd90ea4

vfs: sparse: remove a warning on OPEN_FMODE() · 1a44bc8c

Authored by Namhyung Kim on Feb 01, 2011

AND-ing FMODE_* constant with normal integer results in following
sparse warnings. Fix it.

 fs/open.c:662:21: warning: restricted fmode_t degrades to integer
 fs/anon_inodes.c:123:34: warning: restricted fmode_t degrades to integer
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1a44bc8c

memcg: prevent endless loop when charging huge pages to near-limit group · 19942822

Authored by Johannes Weiner on Feb 01, 2011

If reclaim after a failed charging was unsuccessful, the limits are
checked again, just in case they settled by means of other tasks.

This is all fine as long as every charge is of size PAGE_SIZE, because in
that case, being below the limit means having at least PAGE_SIZE bytes
available.

But with transparent huge pages, we may end up in an endless loop where
charging and reclaim fail, but we keep going because the limits are not
yet exceeded, although not allowing for a huge page.

Fix this up by explicitly checking for enough room, not just whether we
are within limits.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

19942822
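
The difference between "within limits" and "enough room" in the memcg commit above fits in a few lines. This is a simplified sketch: `struct sk_counter` and both helpers are hypothetical names, not the memcg or res_counter API.

```c
#include <stdbool.h>
#include <stddef.h>

struct sk_counter {
    size_t usage;   /* bytes currently charged */
    size_t limit;   /* maximum bytes allowed */
};

/* Old-style check: true whenever the group is below its limit at all. */
static bool sk_under_limit(const struct sk_counter *c)
{
    return c->usage < c->limit;
}

/* Check described by the commit: is there room for this charge size? */
static bool sk_has_room(const struct sk_counter *c, size_t bytes)
{
    return c->usage <= c->limit && c->limit - c->usage >= bytes;
}
```

With one 4 KiB page of headroom left, `sk_under_limit()` keeps saying "retry" for a 2 MiB huge-page charge that can never fit, whereas `sk_has_room()` reports the shortfall and lets the retry loop terminate.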

01 Feb, 2011 1 commit

kernel.h: fix kernel-doc warning · ffbbf2da

Authored by Randy Dunlap on Jan 31, 2011

Fix kernel-doc warning in kernel.h from commit 7ef88ad5
("BUILD_BUG_ON: make it handle more cases"):

  Warning(include/linux/kernel.h:605): No description found for parameter 'condition'
  Warning(include/linux/kernel.h:605): Excess function parameter 'cond' description in 'BUILD_BUG_ON'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ffbbf2da

30 Jan, 2011 2 commits

net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT · 709b46e8

Authored by Eric W. Biederman on Jan 29, 2011

SIOCGETSGCNT is not a unique ioctl value as it maps to SIOCPROTOPRIVATE + 1,
which unfortunately means the existing infrastructure for compat networking
ioctls is insufficient.  A trivial compat ioctl implementation would conflict
with:

SIOCAX25ADDUID
SIOCAIPXPRISLT
SIOCGETSGCNT_IN6
SIOCGETSGCNT
SIOCRSSCAUSE
SIOCX25SSUBSCRIP
SIOCX25SDTEFACILITIES

To make this work I have updated the compat_ioctl decode path to mirror the
normal ioctl decode path.  I have added an ipv4 inet_compat_ioctl function
so that I can have ipv4 specific compat ioctls.  I have added a compat_ioctl
function into struct proto so I can break out ioctls by which kind of ip socket
I am using.  I have added a compat_raw_ioctl function because SIOCGETSGCNT only
works on raw sockets.  I have added an ipmr_compat_ioctl that mirrors the normal
ipmr_ioctl.

This was necessary because unfortunately the struct layout for the SIOCGETSGCNT
has unsigned longs in it so changes between 32bit and 64bit kernels.

This change was sufficient to run a 32bit ip multicast routing daemon on a
64bit kernel.
Reported-by: Bill Fenner <fenner@aristanetworks.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

709b46e8
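
The layout problem mentioned in the SIOCGETSGCNT commit above comes from `unsigned long` fields, which are 4 bytes wide for a 32-bit process and 8 bytes wide in a 64-bit kernel. A sketch of the usual remedy follows, with hypothetical struct and function names that only approximate the real request layout: keep a second struct with fixed 32-bit fields for the compat path and convert between the two.

```c
#include <stdint.h>

/* Native layout: the counter width follows the kernel's word size. */
struct sk_sg_req {
    uint32_t src;
    uint32_t grp;
    unsigned long pktcnt;
    unsigned long bytecnt;
    unsigned long wrong_if;
};

/* Layout as a 32-bit userspace program sees it: always 20 bytes. */
struct sk_compat_sg_req {
    uint32_t src;
    uint32_t grp;
    uint32_t pktcnt;
    uint32_t bytecnt;
    uint32_t wrong_if;
};

/* Convert a native result for a 32-bit caller. The counters are
 * truncated to 32 bits, which is all that caller can represent. */
static void sk_sg_req_to_compat(const struct sk_sg_req *in,
                                struct sk_compat_sg_req *out)
{
    out->src = in->src;
    out->grp = in->grp;
    out->pktcnt = (uint32_t)in->pktcnt;
    out->bytecnt = (uint32_t)in->bytecnt;
    out->wrong_if = (uint32_t)in->wrong_if;
}
```

A compat handler built this way would read and write the fixed-width struct at the user boundary and use the native struct internally, so the same multicast routing daemon binary can run on either kernel width.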

caif: bugfix - add caif headers for userspace usage. · 52fe7c9c

Authored by sjur.brandeland@stericsson.com on Jan 29, 2011

Add caif_socket.h and if_caif.h to the kernel header files
exported for use by userspace.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

52fe7c9c

openanolis / cloud-kernel synced successfully about 1 year ago
