1. 28 7月, 2009 1 次提交
    • B
      mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() · 9e1b32ca
      Benjamin Herrenschmidt 提交于
      mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()
      
      Upcoming paches to support the new 64-bit "BookE" powerpc architecture
      will need to have the virtual address corresponding to PTE page when
      freeing it, due to the way the HW table walker works.
      
      Basically, the TLB can be loaded with "large" pages that cover the whole
      virtual space (well, sort-of, half of it actually) represented by a PTE
      page, and which contain an "indirect" bit indicating that this TLB entry
      RPN points to an array of PTEs from which the TLB can then create direct
      entries. Thus, in order to invalidate those when PTE pages are deleted,
      we need the virtual address to pass to tlbilx or tlbivax instructions.
      
      The old trick of sticking it somewhere in the PTE page struct page sucks
      too much, the address is almost readily available in all call sites and
      almost everybody implemets these as macros, so we may as well add the
      argument everywhere. I added it to the pmd and pud variants for consistency.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
      Acked-by: NNick Piggin <npiggin@suse.de>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9e1b32ca
  2. 23 7月, 2009 1 次提交
    • P
      perf_counter: PERF_SAMPLE_ID and inherited counters · 7f453c24
      Peter Zijlstra 提交于
      Anton noted that for inherited counters the counter-id as provided by
      PERF_SAMPLE_ID isn't mappable to the id found through PERF_RECORD_ID
      because each inherited counter gets its own id.
      
      His suggestion was to always return the parent counter id, since that
      is the primary counter id as exposed. However, these inherited
      counters have a unique identifier so that events like
      PERF_EVENT_PERIOD and PERF_EVENT_THROTTLE can be specific about which
      counter gets modified, which is important when trying to normalize the
      sample streams.
      
      This patch removes PERF_EVENT_PERIOD in favour of PERF_SAMPLE_PERIOD,
      which is more useful anyway, since changing periods became a lot more
      common than initially thought -- rendering PERF_EVENT_PERIOD the less
      useful solution (also, PERF_SAMPLE_PERIOD reports the more accurate
      value, since it reports the value used to trigger the overflow,
      whereas PERF_EVENT_PERIOD simply reports the requested period changed,
      which might only take effect on the next cycle).
      
      This still leaves us PERF_EVENT_THROTTLE to consider, but since that
      _should_ be a rare occurrence, and linking it to a primary id is the
      most useful bit to diagnose the problem, we introduce a
      PERF_SAMPLE_STREAM_ID, for those few cases where the full
      reconstruction is important.
      
      [Does change the ABI a little, but I see no other way out]
      Suggested-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1248095846.15751.8781.camel@twins>
      7f453c24
  3. 22 7月, 2009 1 次提交
    • P
      softirq: introduce tasklet_hrtimer infrastructure · 9ba5f005
      Peter Zijlstra 提交于
      commit ca109491 (hrtimer: removing all ur callback modes) moved all
      hrtimer callbacks into hard interrupt context when high resolution
      timers are active. That breaks code which relied on the assumption
      that the callback happens in softirq context.
      
      Provide a generic infrastructure which combines tasklets and hrtimers
      together to provide an in-softirq hrtimer experience.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: torvalds@linux-foundation.org
      Cc: kaber@trash.net
      Cc: David Miller <davem@davemloft.net>
      LKML-Reference: <1248265724.27058.1366.camel@twins>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      9ba5f005
  4. 21 7月, 2009 1 次提交
    • T
      genirq: Delegate irq affinity setting to the irq thread · 591d2fb0
      Thomas Gleixner 提交于
      irq_set_thread_affinity() calls set_cpus_allowed_ptr() which might
      sleep, but irq_set_thread_affinity() is called with desc->lock held
      and can be called from hard interrupt context as well. The code has
      another bug as it does not hold a ref on the task struct as required
      by set_cpus_allowed_ptr().
      
      Just set the IRQTF_AFFINITY bit in action->thread_flags. The next time
      the thread runs it migrates itself. Solves all of the above problems
      nicely.
      
      Add kerneldoc to irq_set_thread_affinity() while at it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <new-submission>
      591d2fb0
  5. 20 7月, 2009 1 次提交
  6. 18 7月, 2009 2 次提交
    • T
      sched: fix nr_uninterruptible accounting of frozen tasks really · 6301cb95
      Thomas Gleixner 提交于
      commit e3c8ca83 (sched: do not count frozen tasks toward load) broke
      the nr_uninterruptible accounting on freeze/thaw. On freeze the task
      is excluded from accounting with a check for (task->flags &
      PF_FROZEN), but that flag is cleared before the task is thawed. So
      while we prevent that the task with state TASK_UNINTERRUPTIBLE
      is accounted to nr_uninterruptible on freeze we decrement
      nr_uninterruptible on thaw.
      
      Use a separate flag which is handled by the freezing task itself. Set
      it before calling the scheduler with TASK_UNINTERRUPTIBLE state and
      clear it after we return from frozen state.
      
      Cc: <stable@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      6301cb95
    • T
      vmlinux.lds.h: restructure BSS linker script macros · 04e448d9
      Tim Abbott 提交于
      The BSS section macros in vmlinux.lds.h currently place the .sbss
      input section outside the bounds of [__bss_start, __bss_end].  On all
      architectures except for microblaze that handle both .sbss and
      __bss_start/__bss_end, this is wrong: the .sbss input section is
      within the range [__bss_start, __bss_end].  Relatedly, the example
      code at the top of the file actually has __bss_start/__bss_end defined
      twice; I believe the right fix here is to define them in the
      BSS_SECTION macro but not in the BSS macro.
      
      Another problem with the current macros is that several
      architectures have an ALIGN(4) or some other small number just before
      __bss_stop in their linker scripts.  The BSS_SECTION macro currently
      hardcodes this to 4; while it should really be an argument.  It also
      ignores its sbss_align argument; fix that.
      
      mn10300 is the only user at present of any of the macros touched by
      this patch.  It looks like mn10300 actually was incorrectly converted
      to use the new BSS() macro (the alignment of 4 prior to conversion was
      a __bss_stop alignment, but the argument to the BSS macro is a start
      alignment).  So fix this as well.
      
      I'd like acks from Sam and David on this one.  Also CCing Paul, since
      he has a patch from me which will need to be updated to use
      BSS_SECTION(0, PAGE_SIZE, 4) once this gets merged.
      Signed-off-by: NTim Abbott <tabbott@ksplice.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
      04e448d9
  7. 17 7月, 2009 4 次提交
  8. 15 7月, 2009 3 次提交
  9. 13 7月, 2009 6 次提交
    • L
      tracing/events: Move TRACE_SYSTEM outside of include guard · d0b6e04a
      Li Zefan 提交于
      If TRACE_INCLDUE_FILE is defined, <trace/events/TRACE_INCLUDE_FILE.h>
      will be included and compiled, otherwise it will be
      <trace/events/TRACE_SYSTEM.h>
      
      So TRACE_SYSTEM should be defined outside of #if proctection,
      just like TRACE_INCLUDE_FILE.
      
      Imaging this scenario:
      
       #include <trace/events/foo.h>
          -> TRACE_SYSTEM == foo
       ...
       #include <trace/events/bar.h>
          -> TRACE_SYSTEM == bar
       ...
       #define CREATE_TRACE_POINTS
       #include <trace/events/foo.h>
          -> TRACE_SYSTEM == bar !!!
      
      and then bar.h will be included and compiled.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A5A9CF1.2010007@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d0b6e04a
    • R
      USB: usb.h: fix kernel-doc notation · e376bbbb
      Randy Dunlap 提交于
      Fix usb.h kernel-doc warnings:
      
      Warning(include/linux/usb.h:918): Excess struct/union/enum/typedef member 'nodename' description in 'usb_device_driver'
      Warning(include/linux/usb.h:939): No description found for parameter 'nodename'
      Warning(include/linux/usb.h:1219): No description found for parameter 'sg'
      Warning(include/linux/usb.h:1219): No description found for parameter 'num_sgs'
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      e376bbbb
    • G
      Revert "USB: Add Intel Langwell USB OTG Transceiver Drive" · 1a74826f
      Greg Kroah-Hartman 提交于
      This reverts commit 453f7755.
      
      The driver should not have been accepted as the MSRT code is not
      in the main kernel yet, which this depends on.
      
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Hao Wu <hao.wu@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      1a74826f
    • K
      Driver Core: remove BUS_ID_SIZE · 4ead0a2b
      Kay Sievers 提交于
      The name size limit is gone from the driver-core, this is
      the removal of the last left-over.
      Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      4ead0a2b
    • A
      headers: smp_lock.h redux · 405f5571
      Alexey Dobriyan 提交于
      * Remove smp_lock.h from files which don't need it (including some headers!)
      * Add smp_lock.h to files which do need it
      * Make smp_lock.h include conditional in hardirq.h
        It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT
      
        This will make hardirq.h inclusion cheaper for every PREEMPT=n config
        (which includes allmodconfig/allyesconfig, BTW)
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      405f5571
    • J
      personality: fix PER_CLEAR_ON_SETID · f9fabcb5
      Julien Tinnes 提交于
      We have found that the current PER_CLEAR_ON_SETID mask on Linux doesn't
      include neither ADDR_COMPAT_LAYOUT, nor MMAP_PAGE_ZERO.
      
      The current mask is READ_IMPLIES_EXEC|ADDR_NO_RANDOMIZE.
      
      We believe it is important to add MMAP_PAGE_ZERO, because by using this
      personality it is possible to have the first page mapped inside a
      process running as setuid root.  This could be used in those scenarios:
      
       - Exploiting a NULL pointer dereference issue in a setuid root binary
       - Bypassing the mmap_min_addr restrictions of the Linux kernel: by
         running a setuid binary that would drop privileges before giving us
         control back (for instance by loading a user-supplied library), we
         could get the first page mapped in a process we control.  By further
         using mremap and mprotect on this mapping, we can then completely
         bypass the mmap_min_addr restrictions.
      
      Less importantly, we believe ADDR_COMPAT_LAYOUT should also be added
      since on x86 32bits it will in practice disable most of the address
      space layout randomization (only the stack will remain randomized).
      Signed-off-by: NJulien Tinnes <jt@cr0.org>
      Signed-off-by: NTavis Ormandy <taviso@sdf.lonestar.org>
      Cc: stable@kernel.org
      Acked-by: NChristoph Hellwig <hch@infradead.org>
      Acked-by: NKees Cook <kees@ubuntu.com>
      Acked-by: NEugene Teo <eugene@redhat.com>
      [ Shortened lines and fixed whitespace as per Christophs' suggestion ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f9fabcb5
  10. 12 7月, 2009 1 次提交
  11. 11 7月, 2009 6 次提交
  12. 10 7月, 2009 3 次提交
    • T
      hrtimer: Fix migration expiry check · 6ff7041d
      Thomas Gleixner 提交于
      The timer migration expiry check should prevent the migration of a
      timer to another CPU when the timer expires before the next event is
      scheduled on the other CPU. Migrating the timer might delay it because
      we can not reprogram the clock event device on the other CPU. But the
      code implementing that check has two flaws:
      
      - for !HIGHRES the check compares the expiry value with the clock
        events device expiry value which is wrong for CLOCK_REALTIME based
        timers.
      
      - the check is racy. It holds the hrtimer base lock of the target CPU,
        but the clock event device expiry value can be modified
        nevertheless, e.g. by an timer interrupt firing.
      
      The !HIGHRES case is easy to fix as we can enqueue the timer on the
      cpu which was selected by the load balancer. It runs the idle
      balancing code once per jiffy anyway. So the maximum delay for the
      timer is the same as when we keep the tick on the current cpu going.
      
      In the HIGHRES case we can get the next expiry value from the hrtimer
      cpu_base of the target CPU and serialize the update with the cpu_base
      lock. This moves the lock section in hrtimer_interrupt() so we can set
      next_event to KTIME_MAX while we are handling the expired timers and
      set it to the next expiry value after we handled the timers under the
      base lock. While the expired timers are processed timer migration is
      blocked because the expiry time of the timer is always <= KTIME_MAX.
      
      Also remove the now useless clockevents_get_next_event() function.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      6ff7041d
    • J
      memory barrier: adding smp_mb__after_lock · ad462769
      Jiri Olsa 提交于
      Adding smp_mb__after_lock define to be used as a smp_mb call after
      a lock.
      
      Making it nop for x86, since {read|write|spin}_lock() on x86 are
      full memory barriers.
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad462769
    • J
      net: adding memory barrier to the poll and receive callbacks · a57de0b4
      Jiri Olsa 提交于
      Adding memory barrier after the poll_wait function, paired with
      receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper
      to wrap the memory barrier.
      
      Without the memory barrier, following race can happen.
      The race fires, when following code paths meet, and the tp->rcv_nxt
      and __add_wait_queue updates stay in CPU caches.
      
      CPU1                         CPU2
      
      sys_select                   receive packet
        ...                        ...
        __add_wait_queue           update tp->rcv_nxt
        ...                        ...
        tp->rcv_nxt check          sock_def_readable
        ...                        {
        schedule                      ...
                                      if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
                                              wake_up_interruptible(sk->sk_sleep)
                                      ...
                                   }
      
      If there was no cache the code would work ok, since the wait_queue and
      rcv_nxt are opposit to each other.
      
      Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
      passed the tp->rcv_nxt check and sleeps, or will get the new value for
      tp->rcv_nxt and will return with new data mask.
      In both cases the process (CPU1) is being added to the wait queue, so the
      waitqueue_active (CPU2) call cannot miss and will wake up CPU1.
      
      The bad case is when the __add_wait_queue changes done by CPU1 stay in its
      cache, and so does the tp->rcv_nxt update on CPU2 side.  The CPU1 will then
      endup calling schedule and sleep forever if there are no more data on the
      socket.
      
      Calls to poll_wait in following modules were ommited:
      	net/bluetooth/af_bluetooth.c
      	net/irda/af_irda.c
      	net/irda/irnet/irnet_ppp.c
      	net/mac80211/rc80211_pid_debugfs.c
      	net/phonet/socket.c
      	net/rds/af_rds.c
      	net/rfkill/core.c
      	net/sunrpc/cache.c
      	net/sunrpc/rpc_pipe.c
      	net/tipc/socket.c
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a57de0b4
  13. 09 7月, 2009 3 次提交
  14. 08 7月, 2009 2 次提交
  15. 07 7月, 2009 3 次提交
    • A
      signals: declare sys_rt_tgsigqueueinfo in syscalls.h · 7afdbf23
      Arnd Bergmann 提交于
      sys_rt_tgsigqueueinfo needs to be declared in linux/syscalls.h so that
      architectures defining the system call table in C can reference it.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      LKML-Reference: <200907071023.44008.arnd@arndb.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      7afdbf23
    • T
      linux/sysrq.h needs linux/errno.h · 82e3310a
      Tobias Doerffel 提交于
      In include/linux/sysrq.h the constant EINVAL is being used but is undefined
      if include/linux/errno.h is not included before.
      
      Fix this by adding #include <linux/errno.h> at the beginning.
      Signed-off-by: NTobias Doerffel <tobias.doerffel@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      82e3310a
    • H
      elf: fix multithreaded program core dumping on arm · a65e7bfc
      Hui Zhu 提交于
      Fix the multithread program core thread message error.
      
      This issue affects arches with neither has CORE_DUMP_USE_REGSET nor
      ELF_CORE_COPY_TASK_REGS, ARM is one of them.
      
      The thread message of core file is generated in elf_dump_thread_status.
      The register values is set by elf_core_copy_task_regs in this function.
      
      If an arch doesn't define ELF_CORE_COPY_TASK_REGS,
      elf_core_copy_task_regs() will do nothing.  Then the core file will not
      have the register message of thread.
      
      So add elf_core_copy_regs to set regiser values if ELF_CORE_COPY_TASK_REGS
      doesn't define.
      
      The following is how to reproduce this issue:
      
      cat 1.c
      #include <stdio.h>
      #include <pthread.h>
      #include <assert.h>
      
      void td1(void * i)
      {
             while (1)
             {
                     printf ("1\n");
                     sleep (1);
             }
      
             return;
      }
      
      void td2(void * i)
      {
             while (1)
             {
                     printf ("2\n");
                     sleep (1);
             }
      
             return;
      }
      
      int
      main(int argc,char *argv[],char *envp[])
      {
             pthread_t       t1,t2;
      
             pthread_create(&t1, NULL, (void*)td1, NULL);
             pthread_create(&t2, NULL, (void*)td2, NULL);
      
             sleep (10);
      
             assert(0);
      
             return (0);
      }
      arm-xxx-gcc -g -lpthread 1.c -o 1
      copy 1.c and 1 to a arm board.
      Goto this board.
      ulimit -c 1800000
      ./1
      # ./1
      1
      2
      1
      ...
      ...
      1
      1: 1.c:37: main: Assertion `0' failed.
      Aborted (core dumped)
      Then you can get a core file.
      gdb 1 core.xxx
      Without the patch:
      (gdb) info threads
       3 process 909  0x00000000 in ?? ()
       2 process 908  0x00000000 in ?? ()
      * 1 process 907  0x4a6e2238 in raise () from /lib/libc.so.6
      You can found that the pc of 909 and 908 is 0x00000000.
      With the patch:
      (gdb) info threads
       3 process 885  0x4a749974 in nanosleep () from /lib/libc.so.6
       2 process 884  0x4a749974 in nanosleep () from /lib/libc.so.6
      * 1 process 883  0x4a6e2238 in raise () from /lib/libc.so.6
      The pc of 885 and 884 is right.
      Signed-off-by: NHui Zhu <teawater@gmail.com>
      Cc: Amerigo Wang <xiyou.wangcong@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a65e7bfc
  16. 06 7月, 2009 2 次提交