1. 03 2月, 2010 22 次提交
    • R
      uartlite: fix crash when using as console · 03eac7bb
      Richard Röjfors 提交于
      Move the ulite_console_setup to the .devinit section since it might be
      called on probe, which is in devinit.  Fixes the crash below where the
      uartlite hw is probed after the .init section is freed from the kernel.
      
      uartlite: ttyUL0 at MMIO 0xc8000100 (irq = 30) is a uartlite
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<c176720e>] ulite_console_setup+0x6f/0xa8
      *pdpt = 0000000036fb0001 *pde = 0000000000000000
      Oops: 0000 [#1] PREEMPT SMP
      last sysfs file: /sys/devices/pci0000:00/0000:00:1f.1/host0/uevent
      Modules linked in: puffin(+) serio_raw
      
      Pid: 151, comm: modprobe Not tainted (2.6.31.5-1.0.b1-b1 #1) POULSBO
      EIP: 0060:[<c176720e>] EFLAGS: 00010246 CPU: 0
      EIP is at ulite_console_setup+0x6f/0xa8
      EAX: c16ec824 EBX: c16ec824 ECX: c176719f EDX: 00000000
      ESI: 00000000 EDI: c17b42c4 EBP: f6fd1cf0 ESP: f6fd1cd8
       DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      Process modprobe (pid: 151, ti=f6fd0000 task=f6fa1020 task.ti=f6fd0000)
      Stack:
       c1031f51 00000000 00000000 00000246 c182237c f7742000 f6fd1d5c c11fd316
      <0> c16ec85c f77420d4 0000001e 00000000 00000000 c1633e78 4f494d4d 63783020
      <0> 30303038 00303031 f6fd1d3c c10e0786 f6fd1d48 00000000 f6fd1d48 00000000
      Call Trace:
       [<c1031f51>] ? register_console+0xf6/0x1fc
       [<c11fd316>] ? uart_add_one_port+0x237/0x2bb
       [<c10e0786>] ? sysfs_add_one+0x13/0xd3
       [<c10e142f>] ? sysfs_do_create_link+0xba/0xfc
       [<c146f200>] ? ulite_probe+0x198/0x1eb
       [<c12064ee>] ? platform_drv_probe+0xc/0xe
       [<c120597b>] ? driver_probe_device+0x79/0x105
       [<c1205a8e>] ? __device_attach+0x28/0x30
       [<c120511f>] ? bus_for_each_drv+0x3d/0x67
       [<c1205af9>] ? device_attach+0x44/0x58
       [<c1205a66>] ? __device_attach+0x0/0x30
       [<c1204fb8>] ? bus_probe_device+0x1f/0x34
       [<c1203e68>] ? device_add+0x385/0x4c0
       [<c148491f>] ? _write_unlock+0x8/0x1f
       [<c1206aac>] ? platform_device_add+0xd9/0x11c
       [<c120c685>] ? mfd_add_devices+0x165/0x1bc
       [<f831b378>] ? puffin_probe+0x2d0/0x390 [puffin]
       [<c11a08ef>] ? pci_match_device+0xa0/0xa7
       [<c11a07bc>] ? local_pci_probe+0xe/0x10
       [<c11a11db>] ? pci_device_probe+0x43/0x66
       [<c120597b>] ? driver_probe_device+0x79/0x105
       [<c1205a4a>] ? __driver_attach+0x43/0x5f
       [<c120535d>] ? bus_for_each_dev+0x3d/0x67
       [<c1205852>] ? driver_attach+0x14/0x16
       [<c1205a07>] ? __driver_attach+0x0/0x5f
       [<c1204dea>] ? bus_add_driver+0xf9/0x220
       [<c1205c8f>] ? driver_register+0x8b/0xeb
       [<c11a1518>] ? __pci_register_driver+0x43/0x9f
       [<c10477ef>] ? __blocking_notifier_call_chain+0x40/0x4c
       [<f831f000>] ? puffin_init+0x0/0x48 [puffin]
       [<f831f017>] ? puffin_init+0x17/0x48 [puffin]
       [<c1001139>] ? do_one_initcall+0x4c/0x131
       [<c105607b>] ? sys_init_module+0xa7/0x1b7
       [<c1002a61>] ? syscall_call+0x7/0xb
       Code: 6e 74 00 00 00 92 33 00 00 18 00 0e 01 73 79 6e 63 65 2d 72 65 67 69 73 74 72 79 0c 00 49 32
      00 00 14 00 09 01 61 6c 73 61 2d 69 <6e> 66 6f 00 00 00 42 37 00 00 10 00 07 01 6b 69 6c 6c 61 6c 6c
      EIP: [<c176720e>] ulite_console_setup+0x6f/0xa8 SS:ESP 0068:f6fd1cd8
      CR2: 0000000000000000
      Signed-off-by: NRichard Röjfors <richard.rojfors@pelagicore.com>
      Acked-by: NPeter Korsgaard <jacmet@sunsite.dk>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      03eac7bb
    • U
      imxfb: correct location of callbacks in suspend and resume · 1ec56203
      Uwe Kleine-König 提交于
      The probe function passes a pointer to a struct fb_info to
      platform_set_drvdata(), so don't interpret the return value of
      platform_get_drvdata() as a pointer to struct imxfb_info.
      
      The original imxfb_info *fbi backlight_power was NULL but in imxfb_suspend
      it was 4 resulting in an oops as imxfb_suspend calls
      imxfb_disable_controller(fbi) which in turn has
      
      	if (fbi->backlight_power)
      			fbi->backlight_power(0);
      Signed-off-by: NUwe Kleine-König  <u.kleine-koenig@pengutronix.de>
      Acked-by: NSascha Hauer <kernel@pengutronix.de>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1ec56203
    • L
      cgroups: fix to return errno in a failure path · 4528fd05
      Li Zefan 提交于
      In cgroup_create(), if alloc_css_id() returns failure, the errno is not
      propagated to userspace, so mkdir will fail silently.
      
      To trigger this bug, we mount blkio (or memory subsystem), and create more
      then 65534 cgroups.  (The number of cgroups is limited to 65535 if a
      subsystem has use_id == 1)
      
       # mount -t cgroup -o blkio xxx /mnt
       # for ((i = 0; i < 65534; i++)); do mkdir /mnt/$i; done
       # mkdir /mnt/65534
       (should return ENOSPC)
       #
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Acked-by: NPaul Menage <menage@google.com>
      Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4528fd05
    • H
      markup_oops.pl: fix $func_offset error with x86_64 · ef2b9b05
      Hui Zhu 提交于
      When I use markup_oops.pl parse a x8664 oops, I got:
      
      objdump: --start-address: bad number: NaN
      No matching code found
      This is because:
      main::(./m.pl:228):	open(FILE, "objdump -dS --adjust-vma=$vmaoffset --start-address=$decodestart --stop-address=$decodestop $filename |") || die "Cannot start objdump";
        DB<3> p $decodestart
      NaN
      
      This NaN is from:
      main::(./m.pl:176):	my $decodestart = Math::BigInt->from_hex("0x$target") - Math::BigInt->from_hex("0x$func_offset");
        DB<2> p $func_offset
      0x175
      
      There is already a "0x" in $func_offset, another 0x makes it a NaN.
      
      The $func_offset is from line:
      
      	if ($line =~ /RIP: 0010:\[\<[0-9a-f]+\>\]  \[\<[0-9a-f]+\>\] ([a-zA-Z0-9\_]+)\+(0x[0-9a-f]+)\/0x[a-f0-9]/) {
      		$function = $1;
      		$func_offset = $2;
      	}
      
      I make a patch to change "(0x[0-9a-f]+)\/0x[a-f0-9]/)" to "0x([0-9a-f]+)\/0x[a-f0-9]/)".
      Signed-off-by: NHui Zhu <teawater@gmail.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Michal Marek <mmarek@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ef2b9b05
    • R
      get_maintainer.pl: teach git log to use --no-color · 99cf6116
      Richard Kennedy 提交于
      When git has been set to always use color in .gitconfig then I get the
      warning message
      
              Bad divisor in main::vcs_assign: 0
      
      This is caused by vcs_file_signoffs not matching any commits due to the
      pattern not understand the colour codes.  Fix this by telling git log to
      never use colour.
      Signed-off-by: NRichard Kennedy <richard@rsk.demon.co.uk>
      Acked-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      99cf6116
    • W
      devmem: fix kmem write bug on memory holes · c85e9a97
      Wu Fengguang 提交于
      write_kmem() used to assume vwrite() always return the full buffer length.
      However now vwrite() could return 0 to indicate memory hole.  This
      creates a bug that "buf" is not advanced accordingly.
      
      Fix it to simply ignore the return value, hence the memory hole.
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c85e9a97
    • K
      devmem: check vmalloc address on kmem read/write · 325fda71
      KAMEZAWA Hiroyuki 提交于
      Otherwise vmalloc_to_page() will BUG().
      
      This also makes the kmem read/write implementation aligned with mem(4):
      "References to nonexistent locations cause errors to be returned." Here we
      return -ENXIO (inspired by Hugh) if no bytes have been transfered to/from
      user space, otherwise return partial read/write results.
      Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      325fda71
    • A
      mm: flush dcache before writing into page to avoid alias · 931e80e4
      anfei zhou 提交于
      The cache alias problem will happen if the changes of user shared mapping
      is not flushed before copying, then user and kernel mapping may be mapped
      into two different cache line, it is impossible to guarantee the coherence
      after iov_iter_copy_from_user_atomic.  So the right steps should be:
      
      	flush_dcache_page(page);
      	kmap_atomic(page);
      	write to page;
      	kunmap_atomic(page);
      	flush_dcache_page(page);
      
      More precisely, we might create two new APIs flush_dcache_user_page and
      flush_dcache_kern_page to replace the two flush_dcache_page accordingly.
      
      Here is a snippet tested on omap2430 with VIPT cache, and I think it is
      not ARM-specific:
      
      	int val = 0x11111111;
      	fd = open("abc", O_RDWR);
      	addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
      	*(addr+0) = 0x44444444;
      	tmp = *(addr+0);
      	*(addr+1) = 0x77777777;
      	write(fd, &val, sizeof(int));
      	close(fd);
      
      The results are not always 0x11111111 0x77777777 at the beginning as expected.  Sometimes we see 0x44444444 0x77777777.
      Signed-off-by: NAnfei <anfei.zhou@gmail.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: <linux-arch@vger.kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      931e80e4
    • R
      kfifo: fix kernel-doc notation · bc173f70
      Randy Dunlap 提交于
      Fix kfifo kernel-doc warnings:
      
      Warning(kernel/kfifo.c:361): No description found for parameter 'total'
      Warning(kernel/kfifo.c:402): bad line:  @ @lenout: pointer to output variable with copied data
      Warning(kernel/kfifo.c:412): No description found for parameter 'lenout'
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Cc: Stefani Seibold <stefani@seibold.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bc173f70
    • S
      rtc-fm3130: add missing braces · f4b51628
      Sergey Matyukevich 提交于
      Add missing braces for multiline 'if' statements in fm3130_probe.
      Signed-off-by: NSergey Matyukevich <geomatsi@gmail.com>
      Signed-off-by: NAlessandro Zummo <a.zummo@towertech.it>
      Cc: Sergey Lapin <slapin@ossfans.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f4b51628
    • A
      mx3fb: some debug and initialisation fixes · b3cb5372
      Alberto Panizzo 提交于
      Fix the kernel oops when dev_dbg is called with mx3_fbi->txd == NULL
      
      Fix the late initialisation of mx3fb->backlight_level.  If not, in the
      chain of function started by init_fb_chan(), in __blank() call
      sdc_set_brightness(mx3fb, mx3fb->backlight_level) that will shut down the
      CONTRAST PWM output.
      Signed-off-by: NAlberto Panizzo <maramaopercheseimorto@gmail.com>
      Acked-by: Guennadi Liakhovetski <g.liakhovetski <at> gmx.de>
      Cc: Sascha Hauer <s.hauer@pengutronix.de>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b3cb5372
    • T
      idr: fix a critical misallocation bug · 859ddf09
      Tejun Heo 提交于
      Eric Paris located a bug in idr.  With IDR_BITS of 6, it grows to three
      layers when id 4096 is first allocated.  When that happens, idr wraps
      incorrectly and searches the idr array ignoring the high bits.  The
      following test code from Eric demonstrates the bug nicely.
      
      #include <linux/idr.h>
      #include <linux/kernel.h>
      #include <linux/module.h>
      
      static DEFINE_IDR(test_idr);
      
      int init_module(void)
      {
      	int ret, forty95, forty96;
      	void *addr;
      
      	/* add 2 entries both with 4095 as the start address */
      again1:
      	if (!idr_pre_get(&test_idr, GFP_KERNEL))
      		return -ENOMEM;
      	ret = idr_get_new_above(&test_idr, (void *)4095, 4095, &forty95);
      	if (ret) {
      		if (ret == -EAGAIN)
      			goto again1;
      		return ret;
      	}
      	if (forty95 != 4095)
      		printk(KERN_ERR "hmmm, forty95=%d\n", forty95);
      
      again2:
      	if (!idr_pre_get(&test_idr, GFP_KERNEL))
      		return -ENOMEM;
      	ret = idr_get_new_above(&test_idr, (void *)4096, 4095, &forty96);
      	if (ret) {
      		if (ret == -EAGAIN)
      			goto again2;
      		return ret;
      	}
      	if (forty96 != 4096)
      		printk(KERN_ERR "hmmm, forty96=%d\n", forty96);
      
      	/* try to find the 2 entries, noticing that 4096 broke */
      	addr = idr_find(&test_idr, forty95);
      	if ((int)addr != forty95)
      		printk(KERN_ERR "hmmm, after find forty95=%d addr=%d\n", forty95, (int)addr);
      	addr = idr_find(&test_idr, forty96);
      	if ((int)addr != forty96)
      		printk(KERN_ERR "hmmm, after find forty96=%d addr=%d\n", forty96, (int)addr);
      	/* really weird, the entry which should be at 4096 is actually at 0!! */
      	addr = idr_find(&test_idr, 0);
      	if ((int)addr)
      		printk(KERN_ERR "found an entry at id=0 for addr=%d\n", (int)addr);
      
      	idr_remove(&test_idr, forty95);
      	idr_remove(&test_idr, forty96);
      
      	return 0;
      }
      
      void cleanup_module(void)
      {
      }
      
      MODULE_AUTHOR("Eric Paris <eparis@redhat.com>");
      MODULE_DESCRIPTION("Simple idr test");
      MODULE_LICENSE("GPL");
      
      This happens because when sub_alloc() back tracks it doesn't always do it
      step-by-step while the over-the-limit detection assumes step-by-step
      backtracking.  The logic in sub_alloc() looks like the following.
      
        restart:
          clear pa[top level + 1] for end cond detection
          l = top level
          while (true) {
      	search for empty slot at this level
      	if (not found) {
      	    push id to the next possible value
      	    l++
      A:	    if (pa[l] is clear)
      	        failed, return asking caller to grow the tree
      	    if (going up 1 level gives more slots to search)
      	        continue the while loop above with the incremented l
      	    else
      C:	        goto restart
      	}
      	adjust id accordingly to the found slot
      	if (l == 0)
      	    return found id;
      	create lower level if not there yet
      	record pa[l] and l--
          }
      
      Test A is the fail exit condition but this assumes that failure is
      propagated upwared one level at a time but the B optimization path breaks
      the assumption and restarts the whole thing with a start value which is
      above the possible limit with the current layers.  sub_alloc() assumes the
      start id value is inside the limit when called and test A is the only exit
      condition check, so it ends up searching for empty slot while ignoring
      high set bit.
      
      So, for 4095->4096 test, level0 search fails but pa[1] contains a valid
      pointer.  However, going up 1 level wouldn't give any more empty slot so
      it takes C and when the whole thing restarts nobody notices the high bit
      set beyond the top level.
      
      This patch fixes the bug by changing the fail exit condition check to full
      id limit check.
      
      Based-on-patch-from: Eric Paris <eparis@redhat.com>
      Reported-by: NEric Paris <eparis@redhat.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      859ddf09
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · 1a45dcfe
      Linus Torvalds 提交于
      * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
        cfq-iosched: Do not idle on async queues
        blk-cgroup: Fix potential deadlock in blk-cgroup
        block: fix bugs in bio-integrity mempool usage
        block: fix bio_add_page for non trivial merge_bvec_fn case
        drbd: null dereference bug
        drbd: fix max_segment_size initialization
      1a45dcfe
    • N
      mm: purge fragmented percpu vmap blocks · 02b709df
      Nick Piggin 提交于
      Improve handling of fragmented per-CPU vmaps.  We previously don't free
      up per-CPU maps until all its addresses have been used and freed.  So
      fragmented blocks could fill up vmalloc space even if they actually had
      no active vmap regions within them.
      
      Add some logic to allow all CPUs to have these blocks purged in the case
      of failure to allocate a new vm area, and also put some logic to trim
      such blocks of a current CPU if we hit them in the allocation path (so
      as to avoid a large build up of them).
      
      Christoph reported some vmap allocation failures when using the per CPU
      vmap APIs in XFS, which cannot be reproduced after this patch and the
      previous bug fix.
      
      Cc: linux-mm@kvack.org
      Cc: stable@kernel.org
      Tested-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      --
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      02b709df
    • N
      mm: percpu-vmap fix RCU list walking · de560423
      Nick Piggin 提交于
      RCU list walking of the per-cpu vmap cache was broken.  It did not use
      RCU primitives, and also the union of free_list and rcu_head is
      obviously wrong (because free_list is indeed the list we are RCU
      walking).
      
      While we are there, remove a couple of unused fields from an earlier
      iteration.
      
      These APIs aren't actually used anywhere, because of problems with the
      XFS conversion.  Christoph has now verified that the problems are solved
      with these patches.  Also it is an exported interface, so I think it
      will be good to be merged now (and Christoph wants to get the XFS
      changes into their local tree).
      
      Cc: stable@kernel.org
      Cc: linux-mm@kvack.org
      Tested-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      --
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      de560423
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 489b24f2
      Linus Torvalds 提交于
      * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        random: Remove unused inode variable
        crypto: padlock-sha - Add import/export support
        random: drop weird m_time/a_time manipulation
      489b24f2
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes · 4dab75ec
      Linus Torvalds 提交于
      * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
        GFS2: Use GFP_NOFS for alloc structure
        GFS2: Fix previous patch
        GFS2: Don't withdraw on partial rindex entries
        GFS2: Fix refcnt leak on gfs2_follow_link() error path
      4dab75ec
    • L
      Merge branch 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 · 7fbcca25
      Linus Torvalds 提交于
      * 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
        sh: Fix access to released memory in clk_debugfs_register_one()
        sh: Fix access to released memory in dwarf_unwinder_cleanup()
        usb: r8a66597-hdc disable interrupts fix
        spi: spi_sh_msiof: Fixed data sampling on the correct edge
      7fbcca25
    • L
      Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus · e770a0f1
      Linus Torvalds 提交于
      * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
        MIPS: 64-bit: Detect virtual memory size
        MIPS: AR7: Fix USB slave mem range typo
        MIPS: Alchemy: Fix dbdma ring destruction memory debugcheck.
      e770a0f1
    • L
      Fix 'flush_old_exec()/setup_new_exec()' split · 7ab02af4
      Linus Torvalds 提交于
      Commit 221af7f8 ("Split 'flush_old_exec' into two functions") split
      the function at the point of no return - ie right where there were no
      more error cases to check.  That made sense from a technical standpoint,
      but when we then also combined it with the actual personality setting
      going in between flush_old_exec() and setup_new_exec(), it needs to be a
      bit more careful.
      
      In particular, we need to make sure that we really flush the old
      personality bits in the 'flush' stage, rather than later in the 'setup'
      stage, since otherwise we might be flushing the _new_ personality state
      that we're just setting up.
      
      So this moves the flags and personality flushing (and 'flush_thread()',
      which is the arch-specific function that generally resets lazy FP state
      etc) of the old process into flush_old_exec(), so that it doesn't affect
      any state that execve() is setting up for the new process environment.
      
      This was reported by Michal Simek as breaking his Microblaze qemu
      environment.
      Reported-and-tested-by: NMichal Simek <michal.simek@petalogix.com>
      Cc: Peter Anvin <hpa@zytor.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7ab02af4
    • V
      cfq-iosched: Do not idle on async queues · 1efe8fe1
      Vivek Goyal 提交于
      Few weeks back, Shaohua Li had posted similar patch. I am reposting it
      with more test results.
      
      This patch does two things.
      
      - Do not idle on async queues.
      
      - It also changes the write queue depth CFQ drives (cfq_may_dispatch()).
        Currently, we seem to driving queue depth of 1 always for WRITES. This is
        true even if there is only one write queue in the system and all the logic
        of infinite queue depth in case of single busy queue as well as slowly
        increasing queue depth based on last delayed sync request does not seem to
        be kicking in at all.
      
      This patch will allow deeper WRITE queue depths (subjected to the other
      WRITE queue depth contstraints like cfq_quantum and last delayed sync
      request).
      
      Shaohua Li had reported getting more out of his SSD. For me, I have got
      one Lun exported from an HP EVA and when pure buffered writes are on, I
      can get more out of the system. Following are test results of pure
      buffered writes (with end_fsync=1) with vanilla and patched kernel. These
      results are average of 3 sets of run with increasing number of threads.
      
      AVERAGE[bufwfs][vanilla]
      -------
      job       Set NR  ReadBW(KB/s)   MaxClat(us)    WriteBW(KB/s)  MaxClat(us)
      ---       --- --  ------------   -----------    -------------  -----------
      bufwfs    3   1   0              0              95349          474141
      bufwfs    3   2   0              0              100282         806926
      bufwfs    3   4   0              0              109989         2.7301e+06
      bufwfs    3   8   0              0              116642         3762231
      bufwfs    3   16  0              0              118230         6902970
      
      AVERAGE[bufwfs] [patched kernel]
      -------
      bufwfs    3   1   0              0              270722         404352
      bufwfs    3   2   0              0              206770         1.06552e+06
      bufwfs    3   4   0              0              195277         1.62283e+06
      bufwfs    3   8   0              0              260960         2.62979e+06
      bufwfs    3   16  0              0              299260         1.70731e+06
      
      I also ran buffered writes along with some sequential reads and some
      buffered reads going on in the system on a SATA disk because the potential
      risk could be that we should not be driving queue depth higher in presence
      of sync IO going to keep the max clat low.
      
      With some random and sequential reads going on in the system on one SATA
      disk I did not see any significant increase in max clat. So it looks like
      other WRITE queue depth control logic is doing its job. Here are the
      results.
      
      AVERAGE[brr, bsr, bufw together] [vanilla]
      -------
      job       Set NR  ReadBW(KB/s)   MaxClat(us)    WriteBW(KB/s)  MaxClat(us)
      ---       --- --  ------------   -----------    -------------  -----------
      brr       3   1   850            546345         0              0
      bsr       3   1   14650          729543         0              0
      bufw      3   1   0              0              23908          8274517
      
      brr       3   2   981.333        579395         0              0
      bsr       3   2   14149.7        1175689        0              0
      bufw      3   2   0              0              21921          1.28108e+07
      
      brr       3   4   898.333        1.75527e+06    0              0
      bsr       3   4   12230.7        1.40072e+06    0              0
      bufw      3   4   0              0              19722.3        2.4901e+07
      
      brr       3   8   900            3160594        0              0
      bsr       3   8   9282.33        1.91314e+06    0              0
      bufw      3   8   0              0              18789.3        23890622
      
      AVERAGE[brr, bsr, bufw mixed] [patched kernel]
      -------
      job       Set NR  ReadBW(KB/s)   MaxClat(us)    WriteBW(KB/s)  MaxClat(us)
      ---       --- --  ------------   -----------    -------------  -----------
      brr       3   1   837            417973         0              0
      bsr       3   1   14357.7        591275         0              0
      bufw      3   1   0              0              24869.7        8910662
      
      brr       3   2   1038.33        543434         0              0
      bsr       3   2   13351.3        1205858        0              0
      bufw      3   2   0              0              18626.3        13280370
      
      brr       3   4   913            1.86861e+06    0              0
      bsr       3   4   12652.3        1430974        0              0
      bufw      3   4   0              0              15343.3        2.81305e+07
      
      brr       3   8   890            2.92695e+06    0              0
      bsr       3   8   9635.33        1.90244e+06    0              0
      bufw      3   8   0              0              17200.3        24424392
      
      So looks like it might make sense to include this patch.
      
      Thanks
      Vivek
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      1efe8fe1
    • G
      MIPS: 64-bit: Detect virtual memory size · 91dfc423
      Guenter Roeck 提交于
      Linux kernel 2.6.32 and later allocate address space from the top of the
      kernel virtual memory address space.
      
      This patch implements virtual memory size detection for 64 bit MIPS CPUs
      to avoid resulting crashes.
      Signed-off-by: NGuenter Roeck <guenter.roeck@ericsson.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: http://patchwork.linux-mips.org/patch/935/Reviewed-by: NDavid Daney <ddaney@caviumnetworks.com>
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      91dfc423
  2. 02 2月, 2010 18 次提交