1. 30 Oct 2012, 5 commits
  2. 26 Oct 2012, 1 commit
    • block: Add blk_rq_pos(rq) to sort rq when flushing · 975927b9
      Authored by Jianpeng Ma
      My workload is a RAID5 array with 16 disks, written through our
      filesystem in direct-I/O mode.
      
      I used blktrace and found these messages:
      8,16   0     6647     2.453665504  2579  M   W 7493152 + 8 [md0_raid5]
      8,16   0     6648     2.453672411  2579  Q   W 7493160 + 8 [md0_raid5]
      8,16   0     6649     2.453672606  2579  M   W 7493160 + 8 [md0_raid5]
      8,16   0     6650     2.453679255  2579  Q   W 7493168 + 8 [md0_raid5]
      8,16   0     6651     2.453679441  2579  M   W 7493168 + 8 [md0_raid5]
      8,16   0     6652     2.453685948  2579  Q   W 7493176 + 8 [md0_raid5]
      8,16   0     6653     2.453686149  2579  M   W 7493176 + 8 [md0_raid5]
      8,16   0     6654     2.453693074  2579  Q   W 7493184 + 8 [md0_raid5]
      8,16   0     6655     2.453693254  2579  M   W 7493184 + 8 [md0_raid5]
      8,16   0     6656     2.453704290  2579  Q   W 7493192 + 8 [md0_raid5]
      8,16   0     6657     2.453704482  2579  M   W 7493192 + 8 [md0_raid5]
      8,16   0     6658     2.453715016  2579  Q   W 7493200 + 8 [md0_raid5]
      8,16   0     6659     2.453715247  2579  M   W 7493200 + 8 [md0_raid5]
      8,16   0     6660     2.453721730  2579  Q   W 7493208 + 8 [md0_raid5]
      8,16   0     6661     2.453721974  2579  M   W 7493208 + 8 [md0_raid5]
      8,16   0     6662     2.453728202  2579  Q   W 7493216 + 8 [md0_raid5]
      8,16   0     6663     2.453728436  2579  M   W 7493216 + 8 [md0_raid5]
      8,16   0     6664     2.453734782  2579  Q   W 7493224 + 8 [md0_raid5]
      8,16   0     6665     2.453735019  2579  M   W 7493224 + 8 [md0_raid5]
      8,16   0     6666     2.453741401  2579  Q   W 7493232 + 8 [md0_raid5]
      8,16   0     6667     2.453741632  2579  M   W 7493232 + 8 [md0_raid5]
      8,16   0     6668     2.453748148  2579  Q   W 7493240 + 8 [md0_raid5]
      8,16   0     6669     2.453748386  2579  M   W 7493240 + 8 [md0_raid5]
      8,16   0     6670     2.453851843  2579  I   W 7493144 + 104 [md0_raid5]
      8,16   0        0     2.453853661     0  m   N cfq2579 insert_request
      8,16   0     6671     2.453854064  2579  I   W 7493120 + 24 [md0_raid5]
      8,16   0        0     2.453854439     0  m   N cfq2579 insert_request
      8,16   0     6672     2.453854793  2579  U   N [md0_raid5] 2
      8,16   0        0     2.453855513     0  m   N cfq2579 Not idling.st->count:1
      8,16   0        0     2.453855927     0  m   N cfq2579 dispatch_insert
      8,16   0        0     2.453861771     0  m   N cfq2579 dispatched a request
      8,16   0        0     2.453862248     0  m   N cfq2579 activate rq,drv=1
      8,16   0     6673     2.453862332  2579  D   W 7493120 + 24 [md0_raid5]
      8,16   0        0     2.453865957     0  m   N cfq2579 Not idling.st->count:1
      8,16   0        0     2.453866269     0  m   N cfq2579 dispatch_insert
      8,16   0        0     2.453866707     0  m   N cfq2579 dispatched a request
      8,16   0        0     2.453867061     0  m   N cfq2579 activate rq,drv=2
      8,16   0     6674     2.453867145  2579  D   W 7493144 + 104 [md0_raid5]
      8,16   0     6675     2.454147608     0  C   W 7493120 + 24 [0]
      8,16   0        0     2.454149357     0  m   N cfq2579 complete rqnoidle 0
      8,16   0     6676     2.454791505     0  C   W 7493144 + 104 [0]
      8,16   0        0     2.454794803     0  m   N cfq2579 complete rqnoidle 0
      8,16   0        0     2.454795160     0  m   N cfq schedule dispatch
      
      From the messages above, we can see that rq[W 7493144 + 104] and
      rq[W 7493120 + 24] do not merge, because the bio order is:
        8,16   0     6638     2.453619407  2579  Q   W 7493144 + 8 [md0_raid5]
        8,16   0     6639     2.453620460  2579  G   W 7493144 + 8 [md0_raid5]
        8,16   0     6640     2.453639311  2579  Q   W 7493120 + 8 [md0_raid5]
        8,16   0     6641     2.453639842  2579  G   W 7493120 + 8 [md0_raid5]
      bio(7493144) arrives first and bio(7493120) later, so the subsequent
      bios are split into two runs. When the plug list is flushed,
      elv_attempt_insert_merge() only supports back-merging, not
      front-merging, so rq[7493120 + 24] can't merge with rq[7493144 + 104].
      
      In my tests, this situation accounted for about 25% of requests on
      our system. With this patch it no longer occurs; a sketch of the idea
      follows below.
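      
      The fix, as a minimal sketch: sort the plugged requests by queue and
      then by starting sector before insertion, so physically adjacent
      requests land next to each other and back-merging catches what
      front-merging would have. The comparator below mirrors the plug-flush
      path's list_sort() callback; treat it as illustrative rather than the
      exact patch hunk.
      
      /*
       * list_sort() comparator for the plugged request list: order by
       * queue first, then by blk_rq_pos(), the request's starting sector.
       * A non-zero return means "a sorts after b".
       */
      static int plug_rq_cmp(void *priv, struct list_head *a,
                             struct list_head *b)
      {
              struct request *rqa = container_of(a, struct request, queuelist);
              struct request *rqb = container_of(b, struct request, queuelist);
      
              return !(rqa->q < rqb->q ||
                      (rqa->q == rqb->q && blk_rq_pos(rqa) < blk_rq_pos(rqb)));
      }
      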
      Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
      Cc: Shaohua Li <shli@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 24 Oct 2012, 2 commits
  4. 23 Oct 2012, 3 commits
    • vfs: fix: don't increase bio_slab_max if krealloc() fails · 386bc35a
      Authored by Anna Leuschner
      Without the patch, bio_slab_max, which tracks the capacity of the
      bio_slabs array, is increased before the krealloc() of bio_slabs; if
      krealloc() then fails, bio_slab_max is left too high. Fix that by
      updating bio_slab_max only when krealloc() succeeds, as sketched below.
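      
      A minimal sketch of the pattern (fragment only; new_bio_slab_max,
      new_bio_slabs, and the out_unlock label are illustrative names, not
      necessarily those used in fs/bio.c):
      
      struct bio_slab *new_bio_slabs;
      unsigned int new_bio_slab_max = bio_slab_max << 1;  /* grow capacity */
      
      new_bio_slabs = krealloc(bio_slabs,
                               new_bio_slab_max * sizeof(struct bio_slab),
                               GFP_KERNEL);
      if (!new_bio_slabs)
              goto out_unlock;  /* failure: bio_slab_max stays unchanged */
      bio_slab_max = new_bio_slab_max;  /* commit capacity only on success */
      bio_slabs = new_bio_slabs;
      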
      Signed-off-by: Anna Leuschner <anna.m.leuschner@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: stop iteration early if root_rl is the only request list · 65c77fd9
      Authored by Jun'ichi Nomura
      __blk_queue_next_rl() finds the next request list by walking
      blkg_list while skipping root_blkg. root_rl, however, is special: it
      may exist even without a root_blkg.
      
      Although the later part of the function already handles that case
      correctly, exiting early improves the readability of the code; see
      the sketch below.
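      
      A minimal sketch of the early exit (abridged; the surrounding
      iteration in block/blk-cgroup.c is elided with comments):
      
      static struct request_list *__blk_queue_next_rl(struct request_list *rl,
                                                      struct request_queue *q)
      {
              struct list_head *ent;
      
              if (rl == &q->root_rl) {
                      ent = &q->blkg_list;
                      /* no blkgs: root_rl was the only request list */
                      if (list_empty(ent))
                              return NULL;
              } else {
                      /* ... find the blkg owning @rl and step past it ... */
              }
              /* ... map the next blkg_list entry to its request list ... */
      }
      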
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: Tejun Heo <tj@kernel.org>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blkcg: Fix use-after-free of q->root_blkg and q->root_rl.blkg · 65635cbc
      Authored by Jun'ichi Nomura
      blk_put_rl() does not call blkg_put() for q->root_rl because we do
      not take a request-list reference on q->root_blkg. However, if
      root_blkg is attached and then detached (freed), blk_put_rl() is
      confused by the stale pointer left behind in q->root_blkg.
      
      For example, with !CONFIG_BLK_DEV_THROTTLING &&
      CONFIG_CFQ_GROUP_IOSCHED, switching the I/O scheduler from cfq to
      deadline causes a system stall after the following warning on 3.6:
      
      > WARNING: at /work/build/linux/block/blk-cgroup.h:250
      > blk_put_rl+0x4d/0x95()
      > Modules linked in: bridge stp llc sunrpc acpi_cpufreq freq_table mperf
      > ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
      > Pid: 0, comm: swapper/0 Not tainted 3.6.0 #1
      > Call Trace:
      >  <IRQ>  [<ffffffff810453bd>] warn_slowpath_common+0x85/0x9d
      >  [<ffffffff810453ef>] warn_slowpath_null+0x1a/0x1c
      >  [<ffffffff811d5f8d>] blk_put_rl+0x4d/0x95
      >  [<ffffffff811d614a>] __blk_put_request+0xc3/0xcb
      >  [<ffffffff811d71a3>] blk_finish_request+0x232/0x23f
      >  [<ffffffff811d76c3>] ? blk_end_bidi_request+0x34/0x5d
      >  [<ffffffff811d76d1>] blk_end_bidi_request+0x42/0x5d
      >  [<ffffffff811d7728>] blk_end_request+0x10/0x12
      >  [<ffffffff812cdf16>] scsi_io_completion+0x207/0x4d5
      >  [<ffffffff812c6fcf>] scsi_finish_command+0xfa/0x103
      >  [<ffffffff812ce2f8>] scsi_softirq_done+0xff/0x108
      >  [<ffffffff811dcea5>] blk_done_softirq+0x8d/0xa1
      >  [<ffffffff810915d5>] ?
      >  generic_smp_call_function_single_interrupt+0x9f/0xd7
      >  [<ffffffff8104cf5b>] __do_softirq+0x102/0x213
      >  [<ffffffff8108a5ec>] ? lock_release_holdtime+0xb6/0xbb
      >  [<ffffffff8104d2b4>] ? raise_softirq_irqoff+0x9/0x3d
      >  [<ffffffff81424dfc>] call_softirq+0x1c/0x30
      >  [<ffffffff81011beb>] do_softirq+0x4b/0xa3
      >  [<ffffffff8104cdb0>] irq_exit+0x53/0xd5
      >  [<ffffffff8102d865>] smp_call_function_single_interrupt+0x34/0x36
      >  [<ffffffff8142486f>] call_function_single_interrupt+0x6f/0x80
      >  <EOI>  [<ffffffff8101800b>] ? mwait_idle+0x94/0xcd
      >  [<ffffffff81018002>] ? mwait_idle+0x8b/0xcd
      >  [<ffffffff81017811>] cpu_idle+0xbb/0x114
      >  [<ffffffff81401fbd>] rest_init+0xc1/0xc8
      >  [<ffffffff81401efc>] ? csum_partial_copy_generic+0x16c/0x16c
      >  [<ffffffff81cdbd3d>] start_kernel+0x3d4/0x3e1
      >  [<ffffffff81cdb79e>] ? kernel_init+0x1f7/0x1f7
      >  [<ffffffff81cdb2dd>] x86_64_start_reservations+0xb8/0xbd
      >  [<ffffffff81cdb3e3>] x86_64_start_kernel+0x101/0x110
      
      This patch clears q->root_blkg and q->root_rl.blkg when the root
      blkg is destroyed; see the sketch below.
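      
      A minimal sketch of the fix (placing the cleanup in blkg_destroy() is
      an assumption; only the root-blkg check is shown):
      
      static void blkg_destroy(struct blkcg_gq *blkg)
      {
              struct request_queue *q = blkg->q;
      
              /* ... unlink @blkg and drop its reference ... */
      
              /*
               * root_rl holds no reference on the root blkg, so clear the
               * queue's cached pointers before the blkg is freed; otherwise
               * blk_put_rl() would later dereference freed memory.
               */
              if (q->root_blkg == blkg) {
                      q->root_blkg = NULL;
                      q->root_rl.blkg = NULL;
              }
      }
      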
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. 22 Oct 2012, 4 commits
  6. 21 Oct 2012, 2 commits
  7. 20 Oct 2012, 23 commits