1. 09 Apr, 2014 1 commit
  2. 31 Mar, 2014 1 commit
  3. 29 Mar, 2014 4 commits
  4. 28 Mar, 2014 2 commits
  5. 26 Mar, 2014 4 commits
    • cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static · 236a9800
      Viresh Kumar committed
      cpufreq_notify_transition() and cpufreq_notify_post_transition() shouldn't be
      called directly by cpufreq drivers anymore and so these should be marked static.
      Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: Make sure frequency transitions are serialized · 12478cf0
      Srivatsa S. Bhat committed
      Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE
      notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers
      should strictly alternate, thereby preventing two different sets of PRECHANGE or
      POSTCHANGE notifiers from interleaving arbitrarily.
      
      The following examples illustrate why this is important:
      
      Scenario 1:
      -----------
      A thread reading the value of cpuinfo_cur_freq will call
      __cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition()
      
      The ondemand governor can decide to change the frequency of the CPU at the same
      time and hence it can end up sending the notifications via ->target().
      
      If the notifiers are not serialized, the following sequence can occur:
      - PRECHANGE Notification for freq A (from cpuinfo_cur_freq)
      - PRECHANGE Notification for freq B (from target())
      - Freq changed by target() to B
      - POSTCHANGE Notification for freq B
      - POSTCHANGE Notification for freq A
      
      We can see from the above that the last POSTCHANGE Notification happens for freq
      A but the hardware is set to run at freq B.
      
      Where would this break? In adjust_jiffies() in cpufreq.c and in
      cpufreq_callback() in arch/arm/kernel/smp.c (which also adjusts the jiffies):
      all the loops_per_jiffy calculations would come out wrong.
      
      Scenario 2:
      -----------
      The governor calls __cpufreq_driver_target() to change the frequency. At the
      same time, if we change scaling_{min|max}_freq from sysfs, it will end up
      calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call
      __cpufreq_driver_target(). And hence we end up issuing concurrent calls to
      ->target().
      
      Typically, platforms have the following logic in their ->target() routines
      (e.g. cpufreq-cpu0, omap, exynos, etc.):

      A. If new freq is more than old: increase voltage
      B. Change freq
      C. If new freq is less than old: decrease voltage
      
      Now, if the two concurrent calls to ->target() are X and Y, where X is trying to
      increase the freq and Y is trying to decrease it, we get the following race
      condition:
      
      X.A: voltage gets increased for larger freq
      Y.A: nothing happens
      Y.B: freq gets decreased
      Y.C: voltage gets decreased
      X.B: freq gets increased
      X.C: nothing happens
      
      Thus we can end up setting a freq which is not supported by the voltage we have
      set. That will probably make the clock to the CPU unstable and the system might
      not work properly anymore.
      
      This patch introduces a set of synchronization primitives to serialize frequency
      transitions, which are to be used as shown below:
      
      cpufreq_freq_transition_begin();
      
      //Perform the frequency change
      
      cpufreq_freq_transition_end();
      
      The _begin() call sends the PRECHANGE notification whereas the _end() call sends
      the POSTCHANGE notification. All the necessary synchronization is handled
      within these calls. In particular, drivers which set the ASYNC_NOTIFICATION
      flag can also use these APIs for performing frequency transitions (i.e., you can
      call _begin() from one task, and call the corresponding _end() from a different
      task).
      
      The actual synchronization underneath is not that complicated:
      
      The key challenge is to allow drivers to begin the transition from one thread
      and end it in a completely different thread (this is to enable drivers that do
      asynchronous POSTCHANGE notification from bottom-halves, to also use the same
      interface).
      
      To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a
      wait-queue are added per-policy. The flag and the wait-queue are used in
      conjunction to create an "uninterrupted flow" from _begin() to _end(). The
      spinlock is used to ensure that only one such "flow" is in flight at any given
      time. Put together, this provides us all the necessary synchronization.
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
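The flag, lock and wait-queue arrangement described in the "cpufreq: Make sure frequency transitions are serialized" commit can be modelled in userspace. The following is a minimal sketch, assuming POSIX threads: a mutex stands in for the transition_lock spinlock and a condition variable stands in for the wait-queue. The names policy_model, transition_begin() and transition_end(), as well as the two notification counters, are hypothetical and exist only for illustration; this is not the kernel implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of the per-policy state; not kernel code. */
struct policy_model {
    pthread_mutex_t transition_lock;  /* stands in for the spinlock */
    pthread_cond_t  transition_wait;  /* stands in for the wait-queue */
    bool            transition_ongoing;
    int             prechange_count;  /* PRECHANGE notifications sent */
    int             postchange_count; /* POSTCHANGE notifications sent */
};

static void policy_model_init(struct policy_model *p)
{
    pthread_mutex_init(&p->transition_lock, NULL);
    pthread_cond_init(&p->transition_wait, NULL);
    p->transition_ongoing = false;
    p->prechange_count = 0;
    p->postchange_count = 0;
}

/* Wait until no transition is in flight, then claim the "flow". */
static void transition_begin(struct policy_model *p)
{
    pthread_mutex_lock(&p->transition_lock);
    while (p->transition_ongoing)
        pthread_cond_wait(&p->transition_wait, &p->transition_lock);
    p->transition_ongoing = true;
    p->prechange_count++;   /* the PRECHANGE notification would be sent here */
    pthread_mutex_unlock(&p->transition_lock);
}

/* May be called from a different thread than the matching begin. */
static void transition_end(struct policy_model *p)
{
    pthread_mutex_lock(&p->transition_lock);
    assert(p->transition_ongoing);  /* end must pair with an earlier begin */
    p->postchange_count++;  /* the POSTCHANGE notification would be sent here */
    p->transition_ongoing = false;
    pthread_cond_broadcast(&p->transition_wait);
    pthread_mutex_unlock(&p->transition_lock);
}
```

In this model the lock is never held between transition_begin() and transition_end(); only the flag carries ownership of the transition. That is what allows the two calls to come from different threads, while a second caller of transition_begin() sleeps until the flag is cleared.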
    • Revert "sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()" · 72099304
      Greg Kroah-Hartman committed
      This reverts commit d1ba277e.
      
      As reported by Stephen, this patch breaks linux-next because a ppc patch
      suddenly (after 2 years) started using this old API call.  So revert it
      for now; it will go away in 3.15-rc2, when we can change the PPC call to
      the new API.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • workqueue: Provide destroy_delayed_work_on_stack() · ea2e64f2
      Thomas Gleixner committed
      If a delayed or deferrable work is on stack we need to tell debug
      objects that we are destroying the timer and the work. Otherwise we
      leak the tracking object.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Acked-by: Tejun Heo <tj@kernel.org>
      Link: http://lkml.kernel.org/r/20140323141939.911487677@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  6. 25 Mar, 2014 1 commit
  7. 21 Mar, 2014 7 commits
    • blk-mq: merge blk_mq_insert_request and blk_mq_run_request · eeabc850
      Christoph Hellwig committed
      It's almost identical to blk_mq_insert_request, so fold the two into one
      slightly more generic function by making the flush special case a bit
      smarter.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • KVM: Bump KVM_MAX_IRQ_ROUTES for s390 · f3f710bc
      Cornelia Huck committed
      The maximum number of irq routes is currently 1024, which is a bit on
      the small side for s390: We support up to 4 x 64k virtual devices with
      up to 64 queues, and we need one route for each of the queues if we want
      to operate it via irqfd.
      
      Let's bump this to 4k on s390 for now, as this at least covers the saner
      setups.
      
      We need to find a more general solution, though, as we can't just grow
      the routing table indefinitely.
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
    • KVM: s390: irq routing for adapter interrupts. · 84223598
      Cornelia Huck committed
      Introduce a new interrupt class for s390 adapter interrupts and enable
      irqfds for s390.
      
      This depends on a new s390-specific VM capability, KVM_CAP_S390_IRQCHIP,
      that needs to be enabled by userspace.
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
    • mm: fix swapops.h:131 bug if remap_file_pages raced migration · 7e09e738
      Hugh Dickins committed
      Add remove_linear_migration_ptes_from_nonlinear(), to fix an interesting
      little include/linux/swapops.h:131 BUG_ON(!PageLocked) found by trinity:
      indicating that remove_migration_ptes() failed to find one of the
      migration entries that was temporarily inserted.
      
      The problem comes from remap_file_pages()'s switch from vma_interval_tree
      (good for inserting the migration entry) to i_mmap_nonlinear list (no good
      for locating it again); but can only be a problem if the remap_file_pages()
      range does not cover the whole of the vma (zap_pte() clears the range).
      
      remove_migration_ptes() needs a file_nonlinear method to go down the
      i_mmap_nonlinear list, applying linear location to look for migration
      entries in those vmas too, just in case there was this race.
      
      The file_nonlinear method does need rmap_walk_control.arg to do this;
      but it never needed vma passed in - vma comes from its own iteration.
      Reported-and-tested-by: Dave Jones <davej@redhat.com>
      Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rcu: Provide grace-period piggybacking API · 765a3f4f
      Paul E. McKenney committed
      The following pattern is currently not well supported by RCU:
      
      1.	Make data element inaccessible to RCU readers.
      
      2.	Do work that probably lasts for more than one grace period.
      
      3.	Do something to make sure RCU readers in flight before #1 above
      	have completed.
      
      Here are some things that could currently be done:
      
      a.	Do a synchronize_rcu() unconditionally at either #1 or #3 above.
      	This works, but imposes needless work and latency.
      
      b.	Post an RCU callback at #1 above that does a wakeup, then
      	wait for the wakeup at #3.  This works well, but likely results
      	in an extra unneeded grace period.  Open-coding this is also
      	a bit more semi-tricky code than would be good.
      
      This commit therefore adds get_state_synchronize_rcu() and
      cond_synchronize_rcu() APIs.  Call get_state_synchronize_rcu() at #1
      above and pass its return value to cond_synchronize_rcu() at #3 above.
      This results in a call to synchronize_rcu() if no grace period has
      elapsed between #1 and #3, but requires only a load, comparison, and
      memory barrier if a full grace period did elapse.
      Requested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
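The snapshot-then-check pattern from the "rcu: Provide grace-period piggybacking API" commit can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: grace periods are represented by two plain counters, and the names gp_model, gp_start(), gp_complete(), model_get_state(), model_synchronize() and model_cond_synchronize() are hypothetical stand-ins, not the RCU implementation. Memory barriers, concurrency and counter wrap-around are deliberately left out.

```c
#include <assert.h>

/* Hypothetical model: grace periods are numbered 1, 2, 3, ... */
struct gp_model {
    unsigned long gp_started;   /* number of the most recently started GP */
    unsigned long gp_completed; /* number of the most recently completed GP */
    unsigned long forced_syncs; /* how often the slow path had to run */
};

static void gp_start(struct gp_model *m)
{
    assert(m->gp_started == m->gp_completed); /* at most one GP in flight */
    m->gp_started++;
}

static void gp_complete(struct gp_model *m)
{
    assert(m->gp_started == m->gp_completed + 1);
    m->gp_completed = m->gp_started;
}

/* Slow path: finish any in-flight GP, then run one complete fresh GP. */
static void model_synchronize(struct gp_model *m)
{
    if (m->gp_started != m->gp_completed)
        gp_complete(m);
    gp_start(m);
    gp_complete(m);
    m->forced_syncs++;
}

/* Step #1: snapshot. Any GP numbered above the snapshot starts after it. */
static unsigned long model_get_state(const struct gp_model *m)
{
    return m->gp_started;
}

/* Step #3: wait only if no GP that started after the snapshot has completed. */
static void model_cond_synchronize(struct gp_model *m, unsigned long oldstate)
{
    if (m->gp_completed <= oldstate)
        model_synchronize(m);
}
```

The snapshot is taken from the "started" counter, not the "completed" one. A grace period that was already in flight at step #1 therefore does not satisfy the check on its own, because readers that began after it started may still be running.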
    • Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC · 8c90487c
      Dave Jones committed
      Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC, so we can repurpose
      the flag to encompass a wider range of pushing the CPU beyond its
      warranty.
      Signed-off-by: Dave Jones <davej@fedoraproject.org>
      Link: http://lkml.kernel.org/r/20140226154949.GA770@redhat.com
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • tracing: Fix array size mismatch in format string · 87291347
      Vaibhav Nagarnaik committed
      In event format strings, the array size is reported in two locations:
      once in the array subscript and once via the "size:" attribute. The values
      reported there do not match.

      For example, in sched:sched_switch the prev_comm and next_comm character
      arrays have subscript values of [32], whereas the actual field size is
      16.
      
      name: sched_switch
      ID: 301
      format:
              field:unsigned short common_type;       offset:0;       size:2; signed:0;
              field:unsigned char common_flags;       offset:2;       size:1; signed:0;
              field:unsigned char common_preempt_count;       offset:3;       size:1;signed:0;
              field:int common_pid;   offset:4;       size:4; signed:1;
      
              field:char prev_comm[32];       offset:8;       size:16;        signed:1;
              field:pid_t prev_pid;   offset:24;      size:4; signed:1;
              field:int prev_prio;    offset:28;      size:4; signed:1;
              field:long prev_state;  offset:32;      size:8; signed:1;
              field:char next_comm[32];       offset:40;      size:16;        signed:1;
              field:pid_t next_pid;   offset:56;      size:4; signed:1;
              field:int next_prio;    offset:60;      size:4; signed:1;
      
      After bisection, the following commit was blamed:
      92edca07 tracing: Use direct field, type and system names
      
      This commit removes the duplication of strings for field->name and
      field->type assuming that all the strings passed in
      __trace_define_field() are immutable. This is not true for arrays, where
      the type string is created in event_storage variable and field->type for
      all array fields points to event_storage.
      
      Use __stringify() to create a string constant for the type string.
      
      Also, get rid of event_storage and event_storage_mutex that are not
      needed anymore.
      
      Also, an added benefit is that this reduces the overhead of events a bit more:
      
         text    data     bss     dec     hex filename
      8424787 2036472 1302528 11763787         b3804b vmlinux
      8420814 2036408 1302528 11759750         b37086 vmlinux.patched
      
      Link: http://lkml.kernel.org/r/1392349908-29685-1-git-send-email-vnagarnaik@google.com
      
      Cc: Laurent Chavey <chavey@google.com>
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
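The "tracing: Fix array size mismatch in format string" commit builds the array type string as a compile-time constant instead of formatting it into a shared buffer. A minimal userspace sketch of that idea follows. The macros STRINGIFY() and ARRAY_TYPE_STRING() and the two accessor functions are hypothetical names chosen for illustration (the kernel's own helper is __stringify()), and the value 16 for TASK_COMM_LEN is taken from the field size quoted in the commit message.

```c
#include <stddef.h>

/* Two-level expansion so that macro arguments are expanded before
 * being turned into a string (same idea as the kernel's __stringify()). */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x)   STRINGIFY_1(x)

#define TASK_COMM_LEN 16  /* assumption: matches the "size:16" fields above */

/* Adjacent string literals are concatenated by the compiler, so every
 * use yields its own immutable constant such as "char[16]". */
#define ARRAY_TYPE_STRING(type, len) #type "[" STRINGIFY(len) "]"

/* Hypothetical accessors returning the type strings of two array fields. */
static const char *comm_field_type(void)
{
    return ARRAY_TYPE_STRING(char, TASK_COMM_LEN);
}

static const char *other_field_type(void)
{
    return ARRAY_TYPE_STRING(char, 32);
}
```

Because each string is a separate constant, storing a pointer to one of them is safe. With a single reused scratch buffer, every stored pointer would end up showing whatever was formatted last, which is consistent with the mismatched [32] subscripts shown in the commit message.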
  8. 20 Mar, 2014 11 commits
  9. 19 Mar, 2014 7 commits
  10. 18 Mar, 2014 2 commits