1. 22 9月, 2010 3 次提交
    • J
      bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends · 692ebd17
      Jan Kara 提交于
      Inodes of devices such as /dev/zero can get dirty for example via
      utime(2) syscall or due to atime update. Backing device of such inodes
      (zero_bdi, etc.) is however unable to handle dirty inodes and thus
      __mark_inode_dirty complains.  In fact, inode should be rather dirtied
      against backing device of the filesystem holding it. This is generally a
      good rule except for filesystems such as 'bdev' or 'mtd_inodefs'. Inodes
      in these pseudofilesystems are referenced from ordinary filesystem
      inodes and carry mapping with real data of the device. Thus for these
      inodes we have to use inode->i_mapping->backing_dev_info as we did so
      far. We distinguish these filesystems by checking whether sb->s_bdi
      points to a non-trivial backing device or not.
      
      Example: Assume we have an ext3 filesystem on /dev/sda1 mounted on /.
      There's a device inode A described by a path "/dev/sdb" on this
      filesystem. This inode will be dirtied against backing device "8:0"
      after this patch. bdev filesystem contains block device inode B coupled
      with our inode A. When someone modifies a page of /dev/sdb, it's B that
      gets dirtied and the dirtying happens against the backing device "8:16".
      Thus both inodes get filed to a correct bdi list.
      
      Cc: stable@kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      692ebd17
    • J
      char: Mark /dev/zero and /dev/kmem as not capable of writeback · 371d217e
      Jan Kara 提交于
      These devices don't do any writeback but their device inodes still can get
      dirty so mark bdi appropriately so that bdi code does the right thing and files
      inodes to lists of bdi carrying the device inodes.
      
      Cc: stable@kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      371d217e
    • J
      bdi: Initialize noop_backing_dev_info properly · 976e48f8
      Jan Kara 提交于
      Properly initialize this backing dev info so that writeback code does not
      barf when getting to it e.g. via sb->s_bdi.
      
      Cc: stable@kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      976e48f8
  2. 21 9月, 2010 3 次提交
    • V
      cfq-iosched: fix a kernel OOPs when usb key is inserted · 180be2a0
      Vivek Goyal 提交于
      Mike reported a kernel crash when a usb key hotplug is performed while all
      kernel thrads are not in a root cgroup and are running in one of the child
      cgroups of blkio controller.
      
      	BUG: unable to handle kernel NULL pointer dereference at 0000002c
      	IP: [<c11c7b08>] cfq_get_queue+0x232/0x412
      	*pde = 00000000
      	Oops: 0000 [#1] PREEMPT
      	last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/2-1:1.0/host3/scsi_host/host3/uevent
      
      	[..]
      	Pid: 30039, comm: scsi_scan_3 Not tainted 2.6.35.2-fg.roam #1 Volvi2                         /Aspire 4315
      	EIP: 0060:[<c11c7b08>] EFLAGS: 00010086 CPU: 0
      	EIP is at cfq_get_queue+0x232/0x412
      	EAX: f705f9c0 EBX: e977abac ECX: 00000000 EDX: 00000000
      	ESI: f00da400 EDI: f00da4ec EBP: e977a800 ESP: dff8fd00
      	 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
      	Process scsi_scan_3 (pid: 30039, ti=dff8e000 task=f6b6c9a0 task.ti=dff8e000)
      	Stack:
      	 00000000 00000000 00000001 01ff0000 f00da508 00000000 f00da524 f00da540
      	<0> e7994940 dd631750 f705f9c0 e977a820 e977ac44 f00da4d0 00000001 f6b6c9a0
      	<0> 00000010 00008010 0000000b 00000000 00000001 e977a800 dd76fac0 00000246
      	Call Trace:
      	 [<c11c7f10>] ? cfq_set_request+0x228/0x34c
      	 [<c11c7ce8>] ? cfq_set_request+0x0/0x34c
      	 [<c11bb3b9>] ? elv_set_request+0xf/0x1c
      	 [<c11bdd51>] ? get_request+0x1ad/0x22f
      	 [<c11bddf2>] ? get_request_wait+0x1f/0x11a
      	 [<c11d013b>] ? kvasprintf+0x33/0x3b
      	 [<c127b537>] ? scsi_execute+0x1d/0x103
      	 [<c127b675>] ? scsi_execute_req+0x58/0x83
      	 [<c127c391>] ? scsi_probe_and_add_lun+0x188/0x7c2
      	 [<c12718c6>] ? attribute_container_add_device+0x15/0xfa
      	 [<c11c95d1>] ? kobject_get+0xf/0x13
      	 [<c126d1db>] ? get_device+0x10/0x14
      	 [<c127be93>] ? scsi_alloc_target+0x217/0x24d
      	 [<c127cbd8>] ? __scsi_scan_target+0x95/0x480
      	 [<c10204eb>] ? dequeue_entity+0x14/0x1fe
      	 [<c1020491>] ? update_curr+0x165/0x1ab
      	 [<c1020491>] ? update_curr+0x165/0x1ab
      	 [<c127d00d>] ? scsi_scan_channel+0x4a/0x76
      	 [<c127d0b0>] ? scsi_scan_host_selected+0x77/0xad
      	 [<c127d13c>] ? do_scan_async+0x0/0x11a
      	 [<c127d137>] ? do_scsi_scan_host+0x51/0x56
      	 [<c127d13c>] ? do_scan_async+0x0/0x11a
      	 [<c127d14a>] ? do_scan_async+0xe/0x11a
      	 [<c127d13c>] ? do_scan_async+0x0/0x11a
      	 [<c10354c5>] ? kthread+0x5e/0x63
      	 [<c1035467>] ? kthread+0x0/0x63
      	 [<c1002af6>] ? kernel_thread_helper+0x6/0x10
      	Code: 44 24 1c 54 83 44 24 18 54 83 fa 03 75 94 8b 06 c7 86 64 02 00 00 01 00 00 00 83 e0 03 09 f0 89 06 8b 44 24 28 8b 90 58 01 00 00 <8b> 42 2c 85 c0 75 03 8b 42 08 8d 54 24 48 52 8d 4c 24 50 51 68
      	EIP: [<c11c7b08>] cfq_get_queue+0x232/0x412 SS:ESP 0068:dff8fd00
      	CR2: 000000000000002c
      	---[ end trace 9a88306573f69b12 ]---
      
      The problem here is that we don't have bdi->dev information available when
      thread does some IO.  Hence when dev_name() tries to access bdi->dev, it
      crashes.
      
      This problem does not happen if kernel threads are in root group as root
      group is statically allocated at device initialization time and we don't
      hit this piece of code.
      
      Fix it by delaying the filling of major and minor number information of
      device in blk_group.  Initially a blk_group is created with 0 as device
      information and this information is filled later once some more IO comes
      in from same group.
      Reported-by: NMike Kazantsev <mk.fraggod@gmail.com>
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      180be2a0
    • B
      block: fix blk_rq_map_kern bio direction flag · a45dc2d2
      Benny Halevy 提交于
      This bug was introduced in 7b6d91da
      "block: unify flags for struct bio and struct request"
      
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: NBenny Halevy <bhalevy@panasas.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      a45dc2d2
    • D
      cciss: freeing uninitialized data on error path · b0722cb1
      Dan Carpenter 提交于
      The "h->scatter_list" is allocated inside a for loop.  If any of those
      allocations fail, then the rest of the list is uninitialized data.  When
      we free it we should start from the top and free backwards so that we
      don't call kfree() on uninitialized pointers.
      
      Also if the allocation for "h->scatter_list" fails then we would get an
      Oops here.  I should have noticed this when I send: 4ee69851 "cciss:
      handle allocation failure."  but I didn't.  Sorry about that.
      Signed-off-by: NDan Carpenter <error27@gmail.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      b0722cb1
  3. 20 9月, 2010 5 次提交
  4. 19 9月, 2010 10 次提交
    • A
      alpha: deal with multiple simultaneously pending signals · 494486a1
      Al Viro 提交于
      Unlike the other targets, alpha sets _one_ sigframe and
      buggers off until the next syscall/interrupt, even if
      more signals are pending.  It leads to quite a few unpleasant
      inconsistencies, starting with SIGSEGV potentially arriving
      not where it should and including e.g. mess with sigsuspend();
      consider two pending signals blocked until sigsuspend()
      unblocks them.  We pick the first one; then, if we are hit
      by interrupt while in the handler, we process the second one
      as well.  If we are not, and if no syscalls had been made,
      we get out of the first handler and leave the second signal
      pending; normally sigreturn() would've picked it anyway, but
      here it starts with restoring the original mask and voila -
      the second signal is blocked again.  On everything else we
      get both delivered consistently.
      
      It's actually easy to fix; the only thing to watch out for
      is prevention of double syscall restart.  Fortunately, the
      idea I've nicked from arm fix by rmk works just fine...
      
      Testcase demonstrating the behaviour in question; on alpha
      we get one or both flags set (usually one), on everything
      else both are always set.
      	#include <signal.h>
      	#include <stdio.h>
      	int had1, had2;
      	void f1(int sig) { had1 = 1; }
      	void f2(int sig) { had2 = 1; }
      	main()
      	{
      		sigset_t set1, set2;
      		sigemptyset(&set1);
      		sigemptyset(&set2);
      		sigaddset(&set2, 1);
      		sigaddset(&set2, 2);
      		signal(1, f1);
      		signal(2, f2);
      		sigprocmask(SIG_SETMASK, &set2, NULL);
      		raise(1);
      		raise(2);
      		sigsuspend(&set1);
      		printf("had1:%d had2:%d\n", had1, had2);
      	}
      Tested-by: NMichael Cree <mcree@orcon.net.nz>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NMatt Turner <mattst88@gmail.com>
      494486a1
    • A
      alpha: fix a 14 years old bug in sigreturn tracing · 53293638
      Al Viro 提交于
      The way sigreturn() is implemented on alpha breaks PTRACE_SYSCALL,
      all way back to 1.3.95 when alpha has grown PTRACE_SYSCALL support.
      
      What happens is direct return to ret_from_syscall, in order to bypass
      mangling of a3 (error indicator) and prevent other mutilations of
      registers (e.g. by syscall restart).  That's fine, but... the entire
      TIF_SYSCALL_TRACE codepath is kept separate on alpha and post-syscall
      stopping/notifying the tracer is after the syscall.  And the normal
      path we are forcibly switching to doesn't have it.
      
      So we end up with *one* stop in traced sigreturn() vs. two in other
      syscalls.  And yes, strace is visibly broken by that; try to strace
      the following
      	#include <signal.h>
      	#include <stdio.h>
      	void f(int sig) {}
      	main()
      	{
      		signal(SIGHUP, f);
      		raise(SIGHUP);
      		write(1, "eeeek\n", 6);
      	}
      and watch the show.  The
      	close(1)                                = 405
      in the end of strace output is coming from return value of write() (6 ==
      __NR_close on alpha) and syscall number of exit_group() (__NR_exit_group ==
      405 there).
      
      The fix is fairly simple - the only thing we end up missing is the call
      of syscall_trace() and we can tell whether we'd been called from the
      SYSCALL_TRACE path by checking ra value.  Since we are setting the
      switch_stack up (that's what sys_sigreturn() does), we have the right
      environment for calling syscall_trace() - just before we call
      undo_switch_stack() and return.  Since undo_switch_stack() will overwrite
      s0 anyway, we can use it to store the result of "has it been called from
      SYSCALL_TRACE path?" check.  The same thing applies in rt_sigreturn().
      Tested-by: NMichael Cree <mcree@orcon.net.nz>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NMatt Turner <mattst88@gmail.com>
      53293638
    • A
      alpha: unb0rk sigsuspend() and rt_sigsuspend() · 392fb6e3
      Al Viro 提交于
      Old code used to set regs->r0 and regs->r19 to force the right
      return value.  Leaving that after switch to ERESTARTNOHAND
      was a Bad Idea(tm), since now that screws the restart - if we
      hit the case when get_signal_to_deliver() returns 0, we will
      step back to syscall insn, with v0 set to EINTR and a3 to 1.
      The latter won't matter, since EINTR is 4, aka __NR_write.
      
      Testcase:
      
      	#include <signal.h>
      	#define _GNU_SOURCE
      	#include <unistd.h>
      	#include <sys/syscall.h>
      
      	main()
      	{
      		sigset_t mask;
      		sigemptyset(&mask);
      		sigaddset(&mask, SIGCONT);
      		sigprocmask(SIG_SETMASK, &mask, NULL);
      		kill(0, SIGCONT);
      		syscall(__NR_sigsuspend, 1, "b0rken\n", 7);
      	}
      
      results on alpha in immediate message to stdout...
      
      Fix is obvious; moreover, since we don't need regs anymore, we can
      switch to normal prototypes for these guys and lose the wrappers.
      Even better, rt_sigsuspend() is identical to generic version in
      kernel/signal.c now.
      Tested-by: NMichael Cree <mcree@orcon.net.nz>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NMatt Turner <mattst88@gmail.com>
      392fb6e3
    • A
      alpha: belated ERESTART_RESTARTBLOCK race fix · 2deba1bd
      Al Viro 提交于
      same thing as had been done on other targets back in 2003 -
      move setting ->restart_block.fn into {rt_,}sigreturn().
      Tested-by: NMichael Cree <mcree@orcon.net.nz>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NMatt Turner <mattst88@gmail.com>
      2deba1bd
    • M
      alpha: Shift perf event pending work earlier in timer interrupt · bdc8b891
      Michael Cree 提交于
      Pending work from the performance event subsystem is executed in
      the timer interrupt.  This patch shifts the call to
      perf_event_do_pending() before the call to update_process_times()
      as the latter may call back into the perf event subsystem and it
      is prudent to have the pending work executed first.
      Signed-off-by: NMichael Cree <mcree@orcon.net.nz>
      Signed-off-by: NMatt Turner <mattst88@gmail.com>
      bdc8b891
    • M
      alpha: wire up fanotify and prlimit64 syscalls · 531f0474
      Mikael Pettersson 提交于
      The 2.6.36-rc kernel added three new system calls:
      fanotify_init, fanotify_mark, and prlimit64.  This
      patch wires them up on Alpha.
      
      Built and booted on an XP900.  Untested beyond that.
      Signed-off-by: NMikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: NMatt Turner <mattst88@gmail.com>
      531f0474
    • A
      alpha: kill big kernel lock · 12e750d9
      Arnd Bergmann 提交于
      All uses of the BKL on alpha are totally bogus, nothing
      is really protected by this. Remove the remaining users
      so we don't have to mark alpha as 'depends on BKL'.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: linux-alpha@vger.kernel.org
      Signed-off-by: NMatt Turner <mattst88@gmail.com>
      12e750d9
    • T
      alpha: fix build breakage in asm/cacheflush.h · b97f897d
      Tejun Heo 提交于
      Alpha SMP flush_icache_user_range() is implemented as an inline
      function inside include/asm/cacheflush.h.  It dereferences @current
      but doesn't include linux/sched.h and thus causes build failure if
      linux/sched.h wasn't included previously.  Fix it by including the
      needed header file explicitly.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NMatt Turner <mattst88@gmail.com>
      b97f897d
    • M
    • J
      31019075
  5. 18 9月, 2010 12 次提交
  6. 17 9月, 2010 7 次提交