1. 03 Jul, 2008 19 commits
  2. 02 Jul, 2008 14 commits
  3. 01 Jul, 2008 7 commits
    • I2C: S3C2410: Add MODULE_ALIAS() for s3c2440 device. · d150a4bb
      Committed by Ben Dooks
      Add a MODULE_ALIAS() statement for the i2c-s3c2410 controller
      to ensure that it can be autoloaded on the S3C2440 systems that
      we support.
      Signed-off-by: Ben Dooks <ben-linux@fluff.org>
      d150a4bb
    • I2C: S3C2410: Fixup error codes returned from a transfer. · 63f5c289
      Committed by Ben Dooks
      The driver should be returning -ENXIO for transfers that do not
      pass the initial address byte stage.
      
      Note: this also makes small tidy-ups to the driver comments in this area.
      Signed-off-by: Ben Dooks <ben-linux@fluff.org>
      63f5c289
    • I2C: S3C2410: Check ACK on byte transmission · 2709781b
      Committed by Ben Dooks
      We should check for the reception of an ACK after transmitting each
      data byte. The address send has been correctly checking this, but the
      data write byte state should have also been checking for these failures.
      
      As part of the same fix, we remove the ACK checking from the receive
      path where it should not have been checking for an ACK which our hardware
      was sending.
      Signed-off-by: Ben Dooks <ben-linux@fluff.org>
      2709781b
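The ACK handling described in the two I2C fixes above can be sketched in user space. This is a toy model, not the i2c-s3c2410 driver code: `sim_i2c_write()` and its `acks` array are hypothetical, and the `-EIO` returned for a NAKed data byte is an assumption, since the commit text only names `-ENXIO` for the address stage.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model of a master-side I2C write (hypothetical helper).
 * acks[0] is the slave's response to the address byte,
 * acks[1..len] are the responses to each data byte.
 */
static int sim_i2c_write(const bool *acks, size_t len)
{
	size_t i;

	assert(acks != NULL);

	/* Address stage: no ACK means nothing answered at this address. */
	if (!acks[0])
		return -ENXIO;

	/* Data stage: check for an ACK after every transmitted byte. */
	for (i = 1; i <= len; i++) {
		if (!acks[i])
			return -EIO;	/* assumed code for a NAKed data byte */
	}

	return (int)len;	/* number of data bytes the slave accepted */
}
```

On the receive path the master itself drives the ACK bit, so a model of a read would have no slave ACK to check, which matches the removal described in the commit.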
    • rcu: fix hotplug vs rcu race · 8558f8f8
      Committed by Gautham R Shenoy
      Dhaval Giani reported this warning during cpu hotplug stress-tests:
      
      | On running kernel compiles in parallel with cpu hotplug:
      |
      | WARNING: at arch/x86/kernel/smp.c:118
      | native_smp_send_reschedule+0x21/0x36()
      | Modules linked in:
      | Pid: 27483, comm: cc1 Not tainted 2.6.26-rc7 #1
      | [...]
      |  [<c0110355>] native_smp_send_reschedule+0x21/0x36
      |  [<c014fe8f>] force_quiescent_state+0x47/0x57
      |  [<c014fef0>] call_rcu+0x51/0x6d
      |  [<c01713b3>] __fput+0x130/0x158
      |  [<c0171231>] fput+0x17/0x19
      |  [<c016fd99>] filp_close+0x4d/0x57
      |  [<c016fdff>] sys_close+0x5c/0x97
      
      IMHO the warning is a spurious one.
      
      cpu_online_map is updated by the _cpu_down() using stop_machine_run().
      Since force_quiescent_state is invoked from irqs disabled section,
      stop_machine_run() won't be executing while a cpu is executing
      force_quiescent_state(). Hence the cpu_online_map is stable while we're
      in the irq disabled section.
      
      However, a cpu might have been offlined _just_ before we disabled irqs
      while entering force_quiescent_state(). And rcu subsystem might not yet
      have handled the CPU_DEAD notification, leading to the offlined cpu's
      bit being set in the rcp->cpumask.
      
      Hence cpumask = (rcp->cpumask & cpu_online_map) to prevent sending
      smp_reschedule() to an offlined CPU.
      
      Here's the timeline:
      
      CPU_A						 CPU_B
      --------------------------------------------------------------
      cpu_down():					.
      .					   	.
      .						.
      stop_machine(): /* disables preemption,		.
      		 * and irqs */			.
      .						.
      .						.
      take_cpu_down();				.
      .						.
      .						.
      .						.
      cpu_disable(); /*this removes cpu 		.
      		*from cpu_online_map 		.
      		*/				.
      .						.
      .						.
      restart_machine(); /* enables irqs */		.
      ------WINDOW DURING WHICH rcp->cpumask is stale ---------------
      .						call_rcu();
      .						/* disables irqs here */
      .						.force_quiescent_state();
      .CPU_DEAD:					.for_each_cpu(rcp->cpumask)
      .						.   smp_send_reschedule();
      .						.
      .						.   WARN_ON() for offlined CPU!
      .
      .
      .
      rcu_cpu_notify:
      .
      -------- WINDOW ENDS ------------------------------------------
      rcu_offline_cpu() /* Which calls cpu_quiet()
      		   * which removes
      		   * cpu from rcp->cpumask.
      		   */
      
      If a new batch was started just before calling stop_machine_run(), the
      "to-be-offlined" cpu is still present in rcp->cpumask.
      
      During a cpu-offline, from take_cpu_down(), we queue an rt-prio idle
      task as the next task to be picked by the scheduler. We also call
      cpu_disable() which will disable any further interrupts and remove the
      cpu's bit from the cpu_online_map.
      
      Once the stop_machine_run() successfully calls take_cpu_down(), it calls
      schedule(). That's the last time a schedule is called on the offlined
      cpu, and hence the last time when rdp->passed_quiesc will be set to 1
      through rcu_qsctr_inc().
      
      But cpu_quiet() will be called on this cpu only when the next
      RCU_SOFTIRQ occurs on this CPU. So at this time, the offlined CPU
      is still set in rcp->cpumask.

      Now coming back to the idle_task which truly offlines the CPU, it does
      check for a pending RCU and raises the softirq, since it will find
      rdp->passed_quiesc to be 0 in this case. However, since the cpu is
      offline I am not sure if the softirq will trigger on the CPU.

      Even if it doesn't, rcu_offline_cpu() will find that rcp->completed
      is not the same as rcp->cur, which means that our cpu could be holding
      up the grace period progression. Hence we call cpu_quiet() and move
      ahead.

      But because of the window explained in the timeline, we could still have
      a call_rcu() before the RCU subsystem executes its CPU_DEAD
      notification, and we send smp_send_reschedule() to the offlined cpu while
      trying to force the quiescent states. The appended patch adds comments
      and prevents checking for an offlined cpu every time.
      
      Reported-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
      Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8558f8f8
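The masking step from the rcu commit above (cpumask = rcp->cpumask & cpu_online_map) can be illustrated with plain integers. This is only a sketch: the kernel uses cpumask_t and its helpers, while `toy_cpumask_t` and `cpus_to_kick()` are hypothetical names, and excluding the calling CPU is an assumption of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Toy 64-CPU bitmask: bit N set means CPU N is in the mask. */
typedef uint64_t toy_cpumask_t;

/*
 * Return the CPUs that should receive a reschedule IPI: those the
 * current RCU batch is still waiting on, restricted to CPUs that are
 * online. The caller's own CPU is dropped (assumption of this sketch).
 */
static toy_cpumask_t cpus_to_kick(toy_cpumask_t rcp_cpumask,
				  toy_cpumask_t online_map,
				  unsigned int self)
{
	assert(self < 64);
	return rcp_cpumask & online_map & ~(UINT64_C(1) << self);
}
```

For example, if the batch is waiting on CPUs 1, 2 and 3 (mask 0xE) but CPU 3 has just gone offline (online map 0x7), a call from CPU 1 yields 0x4: only CPU 2 is kicked, and the stale bit for CPU 3 is filtered out.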
    • Properly notify block layer of sync writes · 18ce3751
      Committed by Jens Axboe
      fsync_buffers_list() and sync_dirty_buffer() both issue async writes and
      then immediately wait on them. Conceptually, that makes them sync writes
      and we should treat them as such so that the IO schedulers can handle
      them appropriately.
      
      This patch fixes a write starvation issue that Lin Ming reported, where
      xx is stuck for more than 2 minutes because of a large number of
      synchronous IO in the system:
      
      INFO: task kjournald:20558 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
      message.
      kjournald     D ffff810010820978  6712 20558      2
      ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2
      ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb
      0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537
      Call Trace:
      [<ffffffff803ba6f2>] kobject_get+0x12/0x17
      [<ffffffff80247537>] getnstimeofday+0x2f/0x83
      [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
      [<ffffffff8066d195>] io_schedule+0x5d/0x9f
      [<ffffffff8029c1e7>] sync_buffer+0x3b/0x3f
      [<ffffffff8066d3f0>] __wait_on_bit+0x40/0x6f
      [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
      [<ffffffff8066d48b>] out_of_line_wait_on_bit+0x6c/0x78
      [<ffffffff80243909>] wake_bit_function+0x0/0x23
      [<ffffffff8029e3ad>] sync_dirty_buffer+0x98/0xcb
      [<ffffffff8030056b>] journal_commit_transaction+0x97d/0xcb6
      [<ffffffff8023a676>] lock_timer_base+0x26/0x4b
      [<ffffffff8030300a>] kjournald+0xc1/0x1fb
      [<ffffffff802438db>] autoremove_wake_function+0x0/0x2e
      [<ffffffff80302f49>] kjournald+0x0/0x1fb
      [<ffffffff802437bb>] kthread+0x47/0x74
      [<ffffffff8022de51>] schedule_tail+0x28/0x5d
      [<ffffffff8020cac8>] child_rip+0xa/0x12
      [<ffffffff80243774>] kthread+0x0/0x74
      [<ffffffff8020cabe>] child_rip+0x0/0x12
      
      Lin Ming confirms that this patch fixes the issue. I've run tests with
      it for the past week and no ill effects have been observed, so I'm
      proposing it for inclusion into 2.6.26.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      18ce3751
    • block: Fix the starving writes bug in the anticipatory IO scheduler · d585d0b9
      Committed by Divyesh Shah
      The AS scheduler alternates between issuing read and write batches. It
      switches batches only after all requests from the previous batch have
      completed.

      When switching to a write batch while a read request is still in flight,
      it waits for that request to complete and signals its intention to switch
      by setting ad->changed_batch and the new direction. However, it does not
      update batch_expire_time for the new write batch, which it does when
      there are no pending requests from the previous batch.
      On completion of the read request, it sees that we were waiting for the
      switch, schedules work for kblockd right away and resets the
      ad->changed_batch flag.
      Now when kblockd enters dispatch_request, where it is expected to pick
      up a write request, it instead ends the write batch because
      batch_expire_time was not updated and still holds the expiry timestamp
      of the previous batch.
      
      This results in the write starvation for all the cases where there is
      the intention for switching to a write batch, but there is a previous
      in-flight read request and the batch gets reverted to a read_batch
      right away.
      
      This also holds true in the reverse case (switching from a write batch
      to a read batch with an in-flight write request).
      
      I've checked that this bug exists on 2.6.11, 2.6.18, 2.6.24 and
      linux-2.6-block git HEAD. I've tested the fix on x86 platforms with
      SCSI drives where the driver asks for the next request while a current
      request is in-flight.
      
      This patch is based off linux-2.6-block git HEAD.
      
      Bug reproduction:
      A simple scenario which reproduces this bug is:
      - dd if=/dev/hda3 of=/dev/null &
      - lilo
         The lilo takes forever to complete.
      
      This can also be reproduced fairly easily with the earlier dd and
      another test program doing msync().

      The example test program below should print out a message after every
      iteration but it simply hangs forever. With this bugfix it makes forward
      progress.
      
      ====
      Example test program using msync() (thanks to suleiman AT google DOT com)
      
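      #define _GNU_SOURCE     /* for O_NOATIME */
      #include <err.h>
      #include <fcntl.h>
      #include <inttypes.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <sys/mman.h>
      #include <sys/stat.h>

      /* Read the x86 time-stamp counter. */
      static inline uint64_t
      rdtsc(void)
      {
               uint32_t lo, hi;

               /* "=a"/"=d" works on both i386 and x86-64, unlike "=A". */
               __asm __volatile("rdtsc" : "=a" (lo), "=d" (hi));
               return (((uint64_t)hi << 32) | lo);
      }

      int
      main(int argc, char **argv)
      {
               struct stat st;
               uint64_t e, s, t;
               char *p;
               long i;
               int fd;

               if (argc < 2) {
                       printf("Usage: %s <file>\n", argv[0]);
                       return (1);
               }

               if ((fd = open(argv[1], O_RDWR | O_NOATIME)) < 0)
                       err(1, "open");

               if (fstat(fd, &st) < 0)
                       err(1, "fstat");

               p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
               if (p == MAP_FAILED)
                       err(1, "mmap");

               t = 0;
               for (i = 0; i < 1000; i++) {
                       /* Dirty the first page and write it back synchronously. */
                       *p = 0;
                       msync(p, 4096, MS_SYNC);
                       s = rdtsc();
                       *p = 0;
                       __asm __volatile(""::: "memory");
                       e = rdtsc();
                       if (argc > 2)
                               printf("%ld: %" PRIu64 " cycles %jd %jd\n",
                                      i, e - s, (intmax_t)s, (intmax_t)e);
                       t += e - s;
               }
               printf("average time: %" PRIu64 " cycles\n", t / 1000);
               return (0);
      }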
      
      Cc: <stable@kernel.org>
      Acked-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      d585d0b9
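The batch-switch starvation described in the anticipatory scheduler commit above can be modelled with a few lines of state. This is a simplified sketch, not the as-iosched code: `struct toy_as`, its field names, the batch length and the helper functions are all hypothetical, and refreshing the expiry time when the in-flight read completes is one plausible reading of the fix, since the commit text does not spell out where the update is made.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_dir { TOY_READ, TOY_WRITE };

/* Toy scheduler state; field names are hypothetical. */
struct toy_as {
	enum toy_dir batch_dir;	/* direction of the current batch */
	long batch_expires;	/* tick at which the current batch ends */
	bool changed_batch;	/* switch pending on an in-flight request */
};

#define TOY_WRITE_BATCH_LEN 10	/* arbitrary batch length in ticks */

/* Ask for a write batch while a read is still in flight. */
static void toy_switch_to_write(struct toy_as *ad)
{
	ad->batch_dir = TOY_WRITE;
	ad->changed_batch = true;
	/* As in the bug description, the expiry time is not refreshed. */
}

/* The in-flight read completes, so the pending switch takes effect. */
static void toy_read_completed(struct toy_as *ad, long now, bool fixed)
{
	assert(ad->changed_batch);
	ad->changed_batch = false;
	if (fixed)	/* assumed fix: start the new batch's clock here */
		ad->batch_expires = now + TOY_WRITE_BATCH_LEN;
}

/* Returns true if a write may be dispatched at tick 'now'. */
static bool toy_dispatch_write(struct toy_as *ad, long now)
{
	if (now >= ad->batch_expires) {
		ad->batch_dir = TOY_READ;	/* batch looks expired: revert */
		return false;
	}
	return ad->batch_dir == TOY_WRITE;
}
```

Without the refresh, the write batch inherits the already-expired timestamp of the read batch and is reverted before a single write is dispatched, which is the starvation pattern the commit describes.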
    • x86: fix NODES_SHIFT Kconfig range · efac4189
      Committed by Thomas Gleixner
      commit 43238382
             x86: change size of node ids from u8 to s16
      
      set the range for NODES_SHIFT to 1..15.
      
      The possible range is 1..9
      
      Fixes Bugzilla #10726
      Reported-by: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      efac4189