1. 11 Oct 2007, 12 commits
  2. 09 Oct 2007, 1 commit
  3. 08 Oct 2007, 1 commit
    • Don't do load-average calculations at even 5-second intervals · 0c2043ab
      Committed by Linus Torvalds
      It turns out that there are a few other five-second timers in the
      kernel, and if the timers get in sync, the load-average can get
      artificially inflated by events that just happen to coincide.
      
      So just offset the load-average calculation by a timer tick.
      
      Noticed by Anders Boström, for whom the coincidence started triggering
      on one of his machines with the JBD jiffies rounding code (JBD is one of
      the subsystems that also end up using a 5-second timer by default).
      Tested-by: Anders Boström <anders@bostrom.dyndns.org>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0c2043ab
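      For reference, the fix described above is tiny: instead of sampling the load
      average at an exact multiple of 5*HZ, the sampling interval is offset by one
      tick so it drifts out of phase with the other five-second timers. A hedged
      sketch of the idea (constant name as in include/linux/sched.h of that era;
      treat the exact value as illustrative):

          /* Sample the load average every 5 seconds plus one tick, so the
           * sampling point is never phase-locked to other 5*HZ timers. */
          #define LOAD_FREQ  (5*HZ+1)   /* previously (5*HZ) */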
  4. 27 Sep 2007, 1 commit
  5. 21 Sep 2007, 1 commit
    • signalfd simplification · b8fceee1
      Committed by Davide Libenzi
      This simplifies the signalfd code by no longer keeping it attached to the
      sighand for its whole lifetime.
      
      This way, the signalfd remains attached to the sighand only during
      poll(2) (and select and epoll) and read(2).  This also allows us to remove
      all the custom "tsk == current" checks in kernel/signal.c, since
      dequeue_signal() will only be called by "current".
      
      I think this is also what Ben was suggesting some time ago.
      
      The external effect of this is that a thread can extract only its own
      private signals and the group ones.  I think this is acceptable
      behaviour, in that those are the signals the thread would be able to
      fetch without signalfd.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b8fceee1
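      A minimal user-space illustration of the externally visible behaviour
      described above, assuming the glibc signalfd(2) wrapper (error handling
      omitted): the reading thread only ever dequeues its own private signals
      and group-directed ones.

          /* Minimal sketch: block SIGUSR1, attach a signalfd, send the signal
           * to ourselves (a private, thread-directed signal) and read it back. */
          #include <sys/signalfd.h>
          #include <signal.h>
          #include <stdio.h>
          #include <unistd.h>

          int main(void)
          {
                  sigset_t mask;
                  struct signalfd_siginfo si;

                  sigemptyset(&mask);
                  sigaddset(&mask, SIGUSR1);
                  sigprocmask(SIG_BLOCK, &mask, NULL);   /* must be blocked first */

                  int sfd = signalfd(-1, &mask, 0);
                  raise(SIGUSR1);                        /* private signal to this thread */

                  if (read(sfd, &si, sizeof(si)) == sizeof(si))
                          printf("dequeued signal %u\n", si.ssi_signo);
                  return 0;
          }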
  6. 20 Sep 2007, 4 commits
    • sched: add /proc/sys/kernel/sched_compat_yield · 1799e35d
      Committed by Ingo Molnar
      Add /proc/sys/kernel/sched_compat_yield to make sys_sched_yield()
      more aggressive, by moving the yielding task to the last position
      in the rbtree.
      
      with sched_compat_yield=0:
      
         PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        2539 mingo     20   0  1576  252  204 R   50  0.0   0:02.03 loop_yield
        2541 mingo     20   0  1576  244  196 R   50  0.0   0:02.05 loop
      
      with sched_compat_yield=1:
      
         PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        2584 mingo     20   0  1576  248  196 R   99  0.0   0:52.45 loop
        2582 mingo     20   0  1576  256  204 R    0  0.0   0:00.00 loop_yield
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      1799e35d
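      A hedged reproduction of the test shown in the top output above: the
      loop_yield task is just a tight sched_yield() loop run against a plain busy
      loop, and flipping the new sysctl changes how much CPU the yielder keeps.
      The effect quoted is from the commit message, not re-measured here.

          /* Sketch of the "loop_yield" half of the test (run a plain busy
           * loop alongside it, then compare %CPU in top):
           *   echo 1 > /proc/sys/kernel/sched_compat_yield   # aggressive yield
           *   echo 0 > /proc/sys/kernel/sched_compat_yield   # default CFS yield
           */
          #include <sched.h>

          int main(void)
          {
                  for (;;)
                          sched_yield();
          }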
    • Fix NUMA Memory Policy Reference Counting · 480eccf9
      Committed by Lee Schermerhorn
      This patch proposes fixes to the reference counting of memory policy in the
      page allocation paths and in show_numa_map().  Extracted from my "Memory
      Policy Cleanups and Enhancements" series as stand-alone.
      
      Shared policy lookup [shmem] has always added a reference to the policy,
      but this was never unrefed after page allocation or after formatting the
      numa map data.
      
      Default system policy should not require additional ref counting, nor
      should the current task's task policy.  However, show_numa_map() calls
      get_vma_policy() to examine what may be [likely is] another task's policy.
      The latter case needs protection against freeing of the policy.
      
      This patch adds a reference count to a mempolicy returned by
      get_vma_policy() when the policy is a vma policy or another task's
      mempolicy.  Again, shared policy is already reference counted on lookup.  A
      matching "unref" [__mpol_free()] is performed in alloc_page_vma() for
      shared and vma policies, and in show_numa_map() for shared and another
      task's mempolicy.  We can call __mpol_free() directly, saving an admittedly
      inexpensive inline NULL test, because we know we have a non-NULL policy.
      
      Handling policy ref counts for hugepages is a bit trickier.
      huge_zonelist() returns a zone list that might come from a shared or vma
      'BIND' policy.  In this case, we should hold the reference until after the
      huge page allocation in dequeue_hugepage().  The patch modifies
      huge_zonelist() to return a pointer to the mempolicy if it needs to be
      unref'd after allocation.
      
      Kernel Build [16cpu, 32GB, ia64] - average of 10 runs:
      
      		w/o patch	w/ refcount patch
      	    Avg	  Std Devn	   Avg	  Std Devn
      Real:	 100.59	    0.38	 100.63	    0.43
      User:	1209.60	    0.37	1209.91	    0.31
      System:   81.52	    0.42	  81.64	    0.34
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Acked-by: Andi Kleen <ak@suse.de>
      Cc: Christoph Lameter <clameter@sgi.com>
      Acked-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      480eccf9
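      A hedged sketch of the ref/unref pattern described above, with simplified
      bodies and approximate signatures (not the literal patch): the policy
      returned by get_vma_policy() is dropped with __mpol_free() once the
      allocation that used it has finished, but only when it is not the default
      policy or the current task's own policy. do_policy_allocation() is a
      hypothetical placeholder for the actual allocation path.

          /* Simplified sketch of the pattern, not the actual kernel code. */
          struct page *alloc_page_vma(gfp_t gfp, struct vm_area_struct *vma,
                                      unsigned long addr)
          {
                  /* may have taken a reference for shared/vma/other-task policies */
                  struct mempolicy *pol = get_vma_policy(current, vma, addr);
                  struct page *page = do_policy_allocation(gfp, pol);  /* hypothetical helper */

                  if (pol != &default_policy && pol != current->mempolicy)
                          __mpol_free(pol);   /* matching unref; pol is known non-NULL */
                  return page;
          }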
    • Fix user namespace exiting OOPs · 28f300d2
      Committed by Pavel Emelyanov
      It turned out that the user namespace is released during do_exit(), in
      exit_task_namespaces(), but the struct user_struct is released only during
      put_task_struct(), i.e. MUCH later.
      
      On debug kernels with poisoned slabs this will cause an oops in
      uid_hash_remove(), because the head of the chain, which resides inside the
      struct user_namespace, will already have been freed and poisoned.
      
      Since the uid hash itself is required only while someone can search it, i.e.
      while the namespace is alive, we can safely unhash all the user_structs from
      it when the namespace exits.  The subsequent free_uid() will complete the
      user_struct destruction.
      
      For example, the following simple program
      
         #define _GNU_SOURCE          /* clone() is a GNU extension */
         #include <sched.h>
      
         char stack[2 * 1024 * 1024];
      
         int f(void *foo)
         {
         	return 0;
         }
      
         int main(void)
         {
         	/* 0x10000000 is CLONE_NEWUSER: clone into a new user namespace */
         	clone(f, stack + 1 * 1024 * 1024, 0x10000000, 0);
         	return 0;
         }
      
      will oops the kernel immediately when run on a kernel with CONFIG_USER_NS
      enabled.
      
      This was spotted during OpenVZ kernel testing.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
      Acked-by: "Serge E. Hallyn" <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      28f300d2
    • Convert uid hash to hlist · 735de223
      Committed by Pavel Emelyanov
      Surprisingly (spotted by Alexey Dobriyan), the uid hash still uses
      list_heads, thus occupying twice as much space as it needs to.  Convert it to
      hlist_heads.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      735de223
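      To see where the "twice as much space" comes from, here is a small
      stand-alone comparison of the two bucket-head shapes. These are user-space
      re-statements of the kernel's list.h types for illustration only, not the
      kernel structs themselves.

          /* Each hash bucket head shrinks from two pointers to one. */
          #include <stdio.h>

          struct list_head  { struct list_head *next, *prev; };   /* doubly linked, 2 ptrs */
          struct hlist_node { struct hlist_node *next, **pprev; };
          struct hlist_head { struct hlist_node *first; };        /* 1 ptr per bucket */

          int main(void)
          {
                  printf("list_head bucket:  %zu bytes\n", sizeof(struct list_head));
                  printf("hlist_head bucket: %zu bytes\n", sizeof(struct hlist_head));
                  return 0;
          }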
  7. 17 Sep 2007, 2 commits
  8. 12 Sep 2007, 4 commits
  9. 11 Sep 2007, 4 commits
    • PCI: irq and pci_ids patch for Intel Tolapai · 99fa9844
      Committed by Jason Gaston
      This patch adds the Intel Tolapai LPC and SMBus controller DIDs.
      Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      99fa9844
    • PCI AER: fix warnings when PCIEAER=n · 5547bbee
      Committed by Randy Dunlap
      Fix warnings when CONFIG_PCIEAER=n:
      
      drivers/pci/pcie/portdrv_pci.c:105: warning: statement with no effect
      drivers/pci/pcie/portdrv_pci.c:226: warning: statement with no effect
      drivers/scsi/arcmsr/arcmsr_hba.c:352: warning: statement with no effect
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Acked-by: Linas Vepstas <linas@austin.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      5547bbee
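      This class of warning typically comes from a config-off stub written as a
      bare expression macro. A hedged, generic illustration of the cause and the
      usual cure (hypothetical names, not taken from this patch):

          #ifdef CONFIG_SOME_FEATURE
          int some_feature_enable(void *dev);
          #else
          /* #define some_feature_enable(dev) (0)  -- expands to "(0);" at call
           * sites used as statements, which gcc flags as "statement with no
           * effect"; an inline stub keeps callers warning-free. */
          static inline int some_feature_enable(void *dev) { (void)dev; return 0; }
          #endif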
    • [NETFILTER]: Fix/improve deadlock condition on module removal netfilter · 16fcec35
      Committed by Neil Horman
      So I've had a deadlock reported to me.  I've found that the sequence of
      events goes like this:
      
      1) process A (modprobe) runs to remove ip_tables.ko
      
      2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket,
      increasing the ip_tables socket_ops use count
      
      3) process A acquires a file lock on the file ip_tables.ko, calls remove_module
      in the kernel, which in turn executes the ip_tables module cleanup routine,
      which calls nf_unregister_sockopt
      
      4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the
      calling process into uninterruptible sleep, expecting the process using the
      socket option code to wake it up when it exits the kernel
      
      5) the user of the socket option code (process B), in do_ipt_get_ctl, calls
      ipt_find_table_lock, which in this case calls request_module to load
      ip_tables_nat.ko
      
      6) request_module forks a copy of modprobe (process C) to load the module and
      blocks until modprobe exits.
      
      7) Process C, forked by request_module, processes the dependencies of
      ip_tables_nat.ko, of which ip_tables.ko is one.
      
      8) Process C attempts to lock the requested module and all its dependencies; it
      blocks when it attempts to lock ip_tables.ko (which was previously locked in
      step 3)
      
      There's not really any great permanent solution to this that I can see, but I've
      developed a two-part solution that corrects the problem.
      
      Part 1) Modifies the nf_sockopt registration code so that, instead of using a
      use counter internal to the nf_sockopt_ops structure, we use a pointer
      to the registering module's owner to do module reference counting when nf_sockopt
      calls a module's set/get routine.  This prevents the deadlock by preventing step 4
      from happening.
      
      Part 2) Enhances the modprobe utility so that by default it performs non-blocking
      remove operations (the same way rmmod does), and adds an option to explicitly
      request a blocking operation.  So if you select blocking operation in modprobe you
      can still cause the above deadlock, but only if you explicitly try (and since
      root can do any old stupid thing it would like....  :)  ).
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      16fcec35
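      A hedged sketch of Part 1 (simplified, with a hypothetical lookup/call helper;
      not the literal patch): pin the registering module via its ->owner for the
      duration of a single set/get call, so there is no private use count that
      module removal would have to sleep on.

          /* nf_sockopt_call() is an illustrative name, not the kernel's. */
          static int nf_sockopt_call(struct sock *sk, struct nf_sockopt_ops *ops,
                                     int val, char __user *opt, int *len, int get)
          {
                  int ret;

                  /* Fails if the owning module is already being unloaded,
                   * so nf_unregister_sockopt() never has to wait for us. */
                  if (!try_module_get(ops->owner))
                          return -ENOPROTOOPT;

                  ret = get ? ops->get(sk, val, opt, len)
                            : ops->set(sk, val, opt, *len);

                  module_put(ops->owner);   /* drop the pin; rmmod can now proceed */
                  return ret;
          }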
    • b311ec4a
  10. 05 Sep 2007, 1 commit
  11. 01 Sep 2007, 1 commit
    • NFS: Fix a write request leak in nfs_invalidate_page() · 1b3b4a1a
      Committed by Trond Myklebust
      Ryusuke Konishi says:
      
      The recent truncate_complete_page() clears the dirty flag from a page
      before calling a_ops->invalidatepage(),
      static void
      truncate_complete_page(struct address_space *mapping, struct page *page)
      {
              ...
              cancel_dirty_page(page, PAGE_CACHE_SIZE);   <--- inserted here in kernel 2.6.20
      
              if (PagePrivate(page))
                      do_invalidatepage(page, 0);         ---> will call a_ops->invalidatepage()
              ...
      }
      
      and this prevents nfs_wb_page_priority() from calling
      nfs_writepage_locked(), which is expected to handle the pending
      request (=nfs_page) associated with the page.
      
      int nfs_wb_page_priority(struct inode *inode, struct page *page, int how)
      {
              ...
              if (clear_page_dirty_for_io(page)) {
                      ret = nfs_writepage_locked(page, &wbc);
                      if (ret < 0)
                              goto out;
              }
              ...
      }
      
      Since truncate_complete_page() will get rid of the page after
      a_ops->invalidatepage() returns, the request (=nfs_page) associated
      with the page becomes garbage in nfs_inode->nfs_page_tree.
      ------------------------
      
      Fix this by ensuring that nfs_wb_page_priority() recognises that it may
      also need to clear out non-dirty pages that have an nfs_page associated
      with them.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      1b3b4a1a
  12. 31 Aug 2007, 7 commits
  13. 28 Aug 2007, 1 commit
    • sched: make the scheduler converge to the ideal latency · f6cf891c
      Committed by Ingo Molnar
      The de-HZ-ification of the granularity defaults unearthed a pre-existing
      property of CFS: while it correctly converges to the granularity goal,
      it does not prevent run-time fluctuations in the range of
      [-gran ... 0 ... +gran].
      
      With the increase of the granularity due to the removal of HZ
      dependencies, this becomes visible in chew-max output (with 5 tasks
      running):
      
       out:  28 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   37 .   40
       out:  27 . 27. 32 | flu:  0 .  0 | ran:   17 .   13 | per:   44 .   40
       out:  27 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   36 .   40
       out:  29 . 27. 32 | flu:  2 .  0 | ran:   17 .   13 | per:   46 .   40
       out:  28 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   37 .   40
       out:  29 . 27. 32 | flu:  0 .  0 | ran:   18 .   13 | per:   47 .   40
       out:  28 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   37 .   40
      
      The average slice is the ideal 13 msecs and the period is a picture-perfect 40
      msecs.  But the 'ran' field fluctuates around 13.33 msecs, and there's no
      mechanism in CFS to keep that from happening: it's a perfectly valid
      solution that CFS finds.
      
      To fix this, we add a granularity/preemption rule that knows about
      the "target latency", which makes tasks that run longer than the ideal
      latency run a bit less.  The simplest approach is to decrease the
      preemption granularity when a task overruns its ideal latency.  For this
      we have to track how much the task has executed since its last preemption.
      
      ( this adds a new field to task_struct, but we can eliminate that
        overhead in 2.6.24 by putting all the scheduler timestamps into an
        anonymous union. )
      
      With this change in place, the chew-max output is fluctuation-free all
      around:
      
       out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  1 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  1 | ran:   13 .   13 | per:   41 .   40
      
      This patch has no impact on any fastpath or on any globally observable
      scheduling property (unless you have sharp enough eyes to see
      millisecond-level ripples in glxgears smoothness :-).
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      f6cf891c
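      A hedged, self-contained restatement of the rule described above (all names
      are illustrative; this is not the CFS code): track how much the current task
      has run since it was last preempted, and once that exceeds its ideal latency
      share, stop granting it the usual preemption granularity.

          #include <stdint.h>

          struct demo_entity {
                  uint64_t sum_exec_runtime;       /* total time executed, in ns */
                  uint64_t exec_at_last_preempt;   /* snapshot taken when last preempted */
          };

          /* Should the currently running task be preempted now? */
          static int should_preempt(const struct demo_entity *curr,
                                    uint64_t ideal_slice_ns, uint64_t gran_ns)
          {
                  uint64_t ran = curr->sum_exec_runtime - curr->exec_at_last_preempt;

                  if (ran > ideal_slice_ns)   /* overran the ideal latency: no more slack */
                          return 1;
                  return ran >= gran_ns;      /* otherwise apply the normal granularity */
          }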