1. 02 Mar, 2009 1 commit
  2. 01 Mar, 2009 2 commits
  3. 28 Feb, 2009 1 commit
  4. 27 Feb, 2009 1 commit
  5. 26 Feb, 2009 3 commits
    •
      block: reduce stack footprint of blk_recount_segments() · 1e428079
      Jens Axboe committed
      blk_recalc_rq_segments() requires a request structure passed in, which
      we don't have from blk_recount_segments(). So the latter allocates one on
      the stack, using > 400 bytes of stack for that. This can cause us to spill
      over one page of stack from ext4 at least:
      
       0)     4560     400   blk_recount_segments+0x43/0x62
       1)     4160      32   bio_phys_segments+0x1c/0x24
       2)     4128      32   blk_rq_bio_prep+0x2a/0xf9
       3)     4096      32   init_request_from_bio+0xf9/0xfe
       4)     4064     112   __make_request+0x33c/0x3f6
       5)     3952     144   generic_make_request+0x2d1/0x321
       6)     3808      64   submit_bio+0xb9/0xc3
       7)     3744      48   submit_bh+0xea/0x10e
       8)     3696     368   ext4_mb_init_cache+0x257/0xa6a [ext4]
       9)     3328     288   ext4_mb_regular_allocator+0x421/0xcd9 [ext4]
      10)     3040     160   ext4_mb_new_blocks+0x211/0x4b4 [ext4]
      11)     2880     336   ext4_ext_get_blocks+0xb61/0xd45 [ext4]
      12)     2544      96   ext4_get_blocks_wrap+0xf2/0x200 [ext4]
      13)     2448      80   ext4_da_get_block_write+0x6e/0x16b [ext4]
      14)     2368     352   mpage_da_map_blocks+0x7e/0x4b3 [ext4]
      15)     2016     352   ext4_da_writepages+0x2ce/0x43c [ext4]
      16)     1664      32   do_writepages+0x2d/0x3c
      17)     1632     144   __writeback_single_inode+0x162/0x2cd
      18)     1488      96   generic_sync_sb_inodes+0x1e3/0x32b
      19)     1392      16   sync_sb_inodes+0xe/0x10
      20)     1376      48   writeback_inodes+0x69/0xb3
      21)     1328     208   balance_dirty_pages_ratelimited_nr+0x187/0x2f9
      22)     1120     224   generic_file_buffered_write+0x1d4/0x2c4
      23)      896     176   __generic_file_aio_write_nolock+0x35f/0x393
      24)      720      80   generic_file_aio_write+0x6c/0xc8
      25)      640      80   ext4_file_write+0xa9/0x137 [ext4]
      26)      560     320   do_sync_write+0xf0/0x137
      27)      240      48   vfs_write+0xb3/0x13c
      28)      192      64   sys_write+0x4c/0x74
      29)      128     128   system_call_fastpath+0x16/0x1b
      
      Split the segment counting out into a __blk_recalc_rq_segments() helper
      to avoid allocating an onstack request just for checking the physical
      segment count.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
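The refactoring described in this commit message can be sketched in user space. This is only an illustration under stated assumptions, not the kernel code: the structures, their fields, the function names, and the rule "one vec equals one segment" are all simplified stand-ins. Only the idea follows the message: move the counting into a helper that takes the bio chain directly, so the bio-based caller no longer has to build a large request on its stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct bio {
    struct bio *next;
    unsigned int nr_vecs;          /* assumption: each vec is one segment */
};

struct request {
    struct bio *bio;
    unsigned int nr_phys_segments;
    char padding[400];             /* models the large on-stack footprint */
};

/* Helper: counts segments over a bio chain; needs no request at all. */
static unsigned int recalc_segments_from_bio(const struct bio *bio)
{
    unsigned int segs = 0;

    for (; bio != NULL; bio = bio->next)
        segs += bio->nr_vecs;
    return segs;
}

/* Request-based entry point becomes a thin wrapper around the helper. */
static void recalc_rq_segments(struct request *rq)
{
    rq->nr_phys_segments = recalc_segments_from_bio(rq->bio);
}

/* Bio-based entry point: counts a single bio without a dummy request. */
static unsigned int recount_segments(struct bio *bio)
{
    struct bio *saved;
    unsigned int segs;

    assert(bio != NULL);
    saved = bio->next;
    bio->next = NULL;              /* temporarily detach the rest of the chain */
    segs = recalc_segments_from_bio(bio);
    bio->next = saved;
    return segs;
}
```

In this sketch the bio-based path uses only a few words of stack, while the request-based path still works for callers that already own a request.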
    •
      rcu: Teach RCU that idle task is not quiescent state at boot · a6826048
      Paul E. McKenney committed
      This patch fixes a bug located by Vegard Nossum with the aid of
      kmemcheck, updated based on review comments from Nick Piggin,
      Ingo Molnar, and Andrew Morton.  It also cleans up the variable-name
      and function-name language.  ;-)
      
      The boot CPU runs in the context of its idle thread during boot-up.
      During this time, idle_cpu(0) will always return nonzero, which will
      fool Classic and Hierarchical RCU into deciding that a large chunk of
      the boot-up sequence is a big long quiescent state.  This in turn causes
      RCU to prematurely end grace periods during this time.
      
      This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
      function to ignore the idle task as a quiescent state until the
      system has started up the scheduler in rest_init(), introducing a
      new non-API function rcu_idle_now_means_idle() to inform RCU of this
      transition.  RCU maintains an internal rcu_idle_cpu_truthful variable
      to track this state, which is then used by rcu_check_callbacks() to
      determine if it should believe idle_cpu().
      
      Because this patch has the effect of disallowing RCU grace periods
      during long stretches of the boot-up sequence, this patch also introduces
      Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
      no-op if num_online_cpus() returns 1.  This allows boot-time code that
      calls synchronize_rcu() to proceed normally.  Note, however, that RCU
      callbacks registered by call_rcu() will likely queue up until later in
      the boot sequence.  Although rcuclassic and rcutree can also use this
      same optimization after boot completes, rcupreempt must restrict its
      use of this optimization to the portion of the boot sequence before the
      scheduler starts up, given that an rcupreempt RCU read-side critical
      section may be preempted.
      
      In addition, this patch takes Nick Piggin's suggestion to make the
      system_state global variable be __read_mostly.
      
      Changes since v4:
      
      o	Changes the name of the introduced function and variable to
      	be less emotional.  ;-)
      
      Changes since v3:
      
      o	WARN_ON(nr_context_switches() > 0) to verify that RCU
      	switches out of boot-time mode before the first context
      	switch, as suggested by Nick Piggin.
      
      Changes since v2:
      
      o	Created rcu_blocking_is_gp() internal-to-RCU API that
      	determines whether a call to synchronize_rcu() is itself
      	a grace period.
      
      o	The definition of rcu_blocking_is_gp() for rcuclassic and
      	rcutree checks to see if but a single CPU is online.
      
      o	The definition of rcu_blocking_is_gp() for rcupreempt
      	checks to see both if but a single CPU is online and if
      	the system is still in early boot.
      
      	This allows rcupreempt to again work correctly if running
      	on a single CPU after booting is complete.
      
      o	Added check to rcupreempt's synchronize_sched() for there
      	being but one online CPU.
      
      Tested all three variants both SMP and !SMP, booted fine, passed a short
      rcutorture test on both x86 and Power.
      Located-by: Vegard Nossum <vegard.nossum@gmail.com>
      Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
      Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
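The decision logic described in this commit message can be modelled in a few lines of user-space C. This is a sketch under assumptions, not the kernel implementation: every variable and function below is a hypothetical stand-in (the message itself only names rcu_idle_now_means_idle(), rcu_idle_cpu_truthful, and rcu_blocking_is_gp()), and the state that the kernel derives from num_online_cpus() and system_state is reduced to two plain variables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state. */
static int online_cpus = 1;
static bool scheduler_running = false;  /* false models early boot */
static bool idle_cpu_truthful = false;  /* models rcu_idle_cpu_truthful */

/* Models rcu_idle_now_means_idle(): called once the scheduler is about to start. */
static void idle_now_means_idle(void)
{
    /* Models the WARN_ON(): this must happen before the first context switch. */
    assert(!scheduler_running);
    idle_cpu_truthful = true;
}

/* An idle CPU counts as a quiescent state only once idle really means idle. */
static bool idle_counts_as_quiescent(bool cpu_is_idle)
{
    return idle_cpu_truthful && cpu_is_idle;
}

/* rcuclassic / rcutree flavour: a single online CPU is sufficient. */
static bool blocking_is_gp_nonpreempt(void)
{
    return online_cpus == 1;
}

/* rcupreempt flavour: additionally requires that the scheduler has not started. */
static bool blocking_is_gp_preempt(void)
{
    return online_cpus == 1 && !scheduler_running;
}
```

The two blocking_is_gp variants differ only in the extra early-boot condition, which reflects the message's point that an rcupreempt read-side critical section may be preempted once the scheduler is running.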
    •
      ide: fix refcounting in device drivers · 8fed4368
      Bartlomiej Zolnierkiewicz committed
      During host driver module removal del_gendisk() results in a final
      put on drive->gendev and freeing the drive by drive_release_dev().
      
      Convert device drivers from using struct kref to use struct device
      so device driver's object holds reference on ->gendev and prevents
      drive from prematurely going away.
      
      Also fix ->remove methods to not erroneously drop reference on a
      host driver by using only put_device() instead of ide*_put().
      Reported-by: Stanislaw Gruszka <stf_xl@wp.pl>
      Tested-by: Stanislaw Gruszka <stf_xl@wp.pl>
      Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
  6. 25 Feb, 2009 4 commits
  7. 23 Feb, 2009 1 commit
  8. 22 Feb, 2009 1 commit
  9. 21 Feb, 2009 3 commits
    •
      8250: fix boot hang with serial console when using with Serial Over Lan port · b6adea33
      Mauro Carvalho Chehab committed
      Intel 8257x Ethernet boards have a feature called Serial Over Lan.
      
      This feature works by emulating a serial port, and it is detected by
      the kernel as a normal 8250 port.  However, this emulation is not perfect,
      as also noted in changeset 7500b1f6.
      
      Before this patch, the kernel tried to check whether the serial TX is
      capable of working with IRQs.
      
      This was done with code similar to this:
      
              serial_outp(up, UART_IER, UART_IER_THRI);
              lsr = serial_in(up, UART_LSR);
              iir = serial_in(up, UART_IIR);
              serial_outp(up, UART_IER, 0);
      
              if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
                      up->bugs |= UART_BUG_TXEN;
      
      This works fine for other 8250 ports, but on the 8250-emulated SoL port the
      chip is slow to clear UART_IIR_NO_INT in the UART_IIR register.
      
      Because of that, UART_BUG_TXEN is sometimes set.  However, as the TX IRQ keeps
      working and TX polling is now enabled, the driver misinterprets the
      IRQ received later, hanging the machine until a key is pressed at the
      serial console.
      
      This is the sixth version of this patch.  Previous versions tried to
      introduce a delay between serial_outp and serial_in(up, UART_IIR) that was
      large enough but did not take forever.  However, the needed delay couldn't
      be safely determined.
      
      In experimental tests, a delay of 1us solved most cases, but the machine
      still hung sometimes.  Increasing the delay to 5us was better, but still
      did not solve the problem.  A very high delay of 50 ms seemed to work
      every time.
      
      However, poking around with delays and praying for them to be enough
      doesn't seem to be a good approach, even for a quirk.
      
      So, instead of playing with large arbitrary delays, let's just
      disable UART_BUG_TXEN for all SoL ports.
      
      [akpm@linux-foundation.org: fix warnings]
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
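The approach chosen in this commit message, skipping the TX-enable probe for ports known to emulate it badly, can be sketched as follows. This is a user-space model under assumptions: the structure, the skip_txen_test field, and the MODEL_* names are hypothetical and are not taken from the 8250 driver; the register bit values are the conventional 16550 ones. The probe result is passed in as plain values rather than read from hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Conventional 16550 bit values, renamed to mark them as part of the model. */
#define MODEL_LSR_TEMT    0x40u    /* transmitter empty */
#define MODEL_IIR_NO_INT  0x01u    /* no interrupt pending */
#define MODEL_BUG_TXEN    (1u << 1)

/* Hypothetical port description; skip_txen_test is an assumed name. */
struct model_port {
    bool skip_txen_test;           /* set for Serial Over LAN style ports */
    unsigned int bugs;
};

/*
 * Applies the probe result to the port. For ports flagged to skip the
 * test, the possibly late interrupt indication is never consulted, so
 * the TXEN workaround cannot be enabled by mistake.
 */
static void detect_txen_bug(struct model_port *p, unsigned int lsr,
                            unsigned int iir)
{
    assert(p != NULL);
    if (p->skip_txen_test)
        return;
    if ((lsr & MODEL_LSR_TEMT) && (iir & MODEL_IIR_NO_INT))
        p->bugs |= MODEL_BUG_TXEN;
}
```

Compared with inserting a delay before reading the interrupt register, a per-port flag removes the timing dependency entirely, which matches the reasoning in the message.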
    •
      spi_bitbang: add more lowlevel function documentation · 01b24fee
      Michael Buesch committed
      This adds more documentation of the lowlevel API to avoid future bugs.
      Signed-off-by: Michael Buesch <mb@bu3sch.de>
      Acked-by: David Brownell <dbrownell@users.sourceforge.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      slab: introduce kzfree() · 3ef0e5ba
      Johannes Weiner committed
      kzfree() is a wrapper for kfree() that additionally zeroes the underlying
      memory before releasing it to the slab allocator.
      
      Currently there is code which memset()s the memory region of an object
      before releasing it back to the slab allocator to make sure
      security-sensitive data are really zeroed out after use.
      
      These callsites can then just use kzfree() which saves some code, makes
      users greppable and allows for a stupid destructor that isn't necessarily
      aware of the actual object size.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Matt Mackall <mpm@selenic.com>
      Acked-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
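A user-space analogue of the zero-then-free idea might look like the sketch below. It is not the kernel's kzfree(): all names are hypothetical, and because portable C has no way to ask the allocator for an object's size, the sketch assumes a small header in front of each allocation that records the size. The zeroing goes through a volatile pointer so the compiler is less likely to drop the stores as dead writes before free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of each tracked allocation (an assumption of this sketch). */
typedef union {
    size_t size;
    max_align_t align;             /* keeps the payload suitably aligned */
} zhdr;

/* Allocates size bytes and remembers the size for later wiping. */
static void *zalloc_tracked(size_t size)
{
    zhdr *h = malloc(sizeof(*h) + size);

    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;
}

/* Zeroes the payload of a tracked allocation and returns its size. */
static size_t wipe_tracked(void *p)
{
    zhdr *h;
    volatile unsigned char *bytes = p;
    size_t i;

    assert(p != NULL);
    h = (zhdr *)p - 1;
    for (i = 0; i < h->size; i++)
        bytes[i] = 0;
    return h->size;
}

/* Wrapper in the spirit of kzfree(): wipe first, then release. NULL is a no-op. */
static void zfree_tracked(void *p)
{
    if (p == NULL)
        return;
    wipe_tracked(p);
    free((zhdr *)p - 1);
}
```

As in the commit message, callers do not need to know the object size at the point of release, so a generic destructor can call zfree_tracked() on any tracked pointer.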
  10. 20 Feb, 2009 3 commits
  11. 19 Feb, 2009 11 commits
  12. 18 Feb, 2009 6 commits
  13. 15 Feb, 2009 2 commits
  14. 13 Feb, 2009 1 commit
    •
      net: don't use in_atomic() in gfp_any() · 99709372
      Andrew Morton committed
      The problem is that in_atomic() will return false inside spinlocks if
      CONFIG_PREEMPT=n.  This will lead to deadlockable GFP_KERNEL allocations
      from spinlocked regions.
      
      Secondly, if CONFIG_PREEMPT=y, this bug solves itself because networking
      will instead use GFP_ATOMIC from this callsite.  Hence we won't get the
      might_sleep() debugging warnings which would have informed us of the buggy
      callsites.
      
      Solve both these problems by switching to in_interrupt().  Now, if someone
      runs a gfp_any() allocation from inside spinlock we will get the warning
      if CONFIG_PREEMPT=y.
      
      I reviewed all callsites and most of them were too complex for my little
      brain and none of them documented their interface requirements.  I have no
      idea what this patch will do.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
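The two problems named in this commit message can be reproduced in a toy model. This is a sketch under assumptions, not kernel code: the context structure, the model_* helpers, and the enum are hypothetical, and the only behaviour modelled is that taking a spinlock is visible to the "atomic" check solely on a preemptible build, while the "interrupt" check never looks at locks. The new variant follows the in_interrupt() wording of the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_gfp { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC };

/* Hypothetical execution context. */
struct model_ctx {
    bool preemptible_build;        /* models CONFIG_PREEMPT=y */
    unsigned int lock_depth;       /* only tracked on preemptible builds */
    unsigned int irq_depth;        /* nonzero while in interrupt context */
};

/* Taking a spinlock leaves a trace only when the build is preemptible. */
static void model_spin_lock(struct model_ctx *c)
{
    assert(c != NULL);
    if (c->preemptible_build)
        c->lock_depth++;
}

static bool model_in_interrupt(const struct model_ctx *c)
{
    return c->irq_depth != 0;
}

static bool model_in_atomic(const struct model_ctx *c)
{
    return c->irq_depth != 0 || c->lock_depth != 0;
}

/* Old behaviour: the answer depends on the build configuration under a lock. */
static enum model_gfp gfp_any_old(const struct model_ctx *c)
{
    return model_in_atomic(c) ? MODEL_GFP_ATOMIC : MODEL_GFP_KERNEL;
}

/* New behaviour as described in the message: only interrupt context matters. */
static enum model_gfp gfp_any_new(const struct model_ctx *c)
{
    return model_in_interrupt(c) ? MODEL_GFP_ATOMIC : MODEL_GFP_KERNEL;
}
```

In the model, the old variant returns the sleeping flag under a lock on a non-preemptible build and the atomic flag on a preemptible one, which hides the buggy caller. The new variant returns the sleeping flag in both cases, so a debugging check for sleeping in atomic context has a chance to fire on preemptible builds.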