1. 17 5月, 2012 3 次提交
  2. 09 5月, 2012 1 次提交
    • P
      sched/numa: Rewrite the CONFIG_NUMA sched domain support · cb83b629
      Peter Zijlstra 提交于
      The current code groups up to 16 nodes in a level and then puts an
      ALLNODES domain spanning the entire tree on top of that. This doesn't
      reflect the numa topology and esp for the smaller not-fully-connected
      machines out there today this might make a difference.
      
      Therefore, build a proper numa topology based on node_distance().
      
      Since there's no fixed numa layers anymore, the static SD_NODE_INIT
      and SD_ALLNODES_INIT aren't usable anymore, the new code tries to
      construct something similar and scales some values either on the
      number of cpus in the domain and/or the node_distance() ratio.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-sh@vger.kernel.org
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: sparclinux@vger.kernel.org
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86@kernel.org
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Greg Pearson <greg.pearson@hp.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: bob.picco@oracle.com
      Cc: chris.mason@oracle.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-r74n3n8hhuc2ynbrnp3vt954@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cb83b629
  3. 08 5月, 2012 2 次提交
  4. 07 5月, 2012 3 次提交
  5. 17 4月, 2012 1 次提交
  6. 14 4月, 2012 1 次提交
  7. 31 3月, 2012 1 次提交
  8. 30 3月, 2012 1 次提交
    • L
      Fix ia64 build errors (fallout from system.h disintegration) · 93f37888
      Luck, Tony 提交于
      Fix this build error on ia64:
      
        In file included from include/linux/sched.h:92,
                        from arch/ia64/kernel/asm-offsets.c:9:
        include/linux/llist.h:59:25: error: asm/cmpxchg.h: No such file or directory
        make[1]: *** [arch/ia64/kernel/asm-offsets.s] Error 1
      
      Right now we don't seem to need any actual contents for the
      asm/cmpxchg.h to make the build work ...  so leave the migration of
      xchg() and cmpxchg() to this new header file for a future patch.
      
      Also process.c needs <asm/switch_to.h> (for definition of pfm_syst_info).
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      93f37888
  9. 29 3月, 2012 2 次提交
  10. 28 3月, 2012 1 次提交
  11. 11 3月, 2012 1 次提交
    • K
      xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it. · 73c154c6
      Konrad Rzeszutek Wilk 提交于
      For the hypervisor to take advantage of the MWAIT support it needs
      to extract from the ACPI _CST the register address. But the
      hypervisor does not have the support to parse DSDT so it relies on
      the initial domain (dom0) to parse the ACPI Power Management information
      and push it up to the hypervisor. The pushing of the data is done
      by the processor_harveset_xen module which parses the information that
      the ACPI parser has graciously exposed in 'struct acpi_processor'.
      
      For the ACPI parser to also expose the Cx states for MWAIT, we need
      to expose the MWAIT capability (leaf 1). Furthermore we also need to
      expose the MWAIT_LEAF capability (leaf 5) for cstate.c to properly
      function.
      
      The hypervisor could expose these flags when it traps the XEN_EMULATE_PREFIX
      operations, but it can't do it since it needs to be backwards compatible.
      Instead we choose to use the native CPUID to figure out if the MWAIT
      capability exists and use the XEN_SET_PDC query hypercall to figure out
      if the hypervisor wants us to expose the MWAIT_LEAF capability or not.
      
      Note: The XEN_SET_PDC query was implemented in c/s 23783:
      "ACPI: add _PDC input override mechanism".
      
      With this in place, instead of
       C3 ACPI IOPORT 415
      we get now
       C3:ACPI FFH INTEL MWAIT 0x20
      
      Note: The cpu_idle which would be calling the mwait variants for idling
      never gets set b/c we set the default pm_idle to be the hypercall variant.
      Acked-by: NJan Beulich <JBeulich@suse.com>
      [v2: Fix missing header file include and #ifdef]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      73c154c6
  12. 09 3月, 2012 1 次提交
    • J
      hpsim, initialize chip for assigned irqs · 6efb6b77
      Jiri Slaby 提交于
      Currently, when assign_irq_vector is called and the irq connected in
      the simulator, the irq is not ready. request_irq will return ENOSYS
      immediately. It is because the irq chip is unset.
      
      Hence set the chip properly to irq_type_hp_sim. And make sure this is
      done from both users of simulated interrupts.
      
      Also we have to set handler here, otherwise we end up in
      handle_bad_int resulting in spam in logs and no irqs handled. We use
      handle_simple_irq as these are SW interrupts that need no ACK or
      anything.
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6efb6b77
  13. 08 3月, 2012 1 次提交
  14. 05 3月, 2012 1 次提交
    • C
      KVM: provide synchronous registers in kvm_run · b9e5dc8d
      Christian Borntraeger 提交于
      On some cpus the overhead for virtualization instructions is in the same
      range as a system call. Having to call multiple ioctls to get set registers
      will make certain userspace handled exits more expensive than necessary.
      Lets provide a section in kvm_run that works as a shared save area
      for guest registers.
      We also provide two 64bit flags fields (architecture specific), that will
      specify
      1. which parts of these fields are valid.
      2. which registers were modified by userspace
      
      Each bit for these flag fields will define a group of registers (like
      general purpose) or a single register.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      b9e5dc8d
  15. 25 2月, 2012 1 次提交
    • J
      [IA64] hpsim, initialize chip for assigned irqs · cb534855
      Jiri Slaby 提交于
      Currently, when assign_irq_vector is called and the irq connected in
      the simulator, the irq is not ready. request_irq will return ENOSYS
      immediately. It is because the irq chip is unset.
      
      Hence set the chip properly to irq_type_hp_sim. And make sure this is
      done from both users of simulated interrupts.
      
      Also we have to set handler here, otherwise we end up in
      handle_bad_int resulting in spam in logs and no irqs handled. We use
      handle_simple_irq as these are SW interrupts that need no ACK or
      anything.
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      cb534855
  16. 24 2月, 2012 4 次提交
  17. 22 2月, 2012 1 次提交
    • P
      sock: Introduce the SO_PEEK_OFF sock option · ef64a54f
      Pavel Emelyanov 提交于
      This one specifies where to start MSG_PEEK-ing queue data from. When
      set to negative value means that MSG_PEEK works as ususally -- peeks
      from the head of the queue always.
      
      When some bytes are peeked from queue and the peeking offset is non
      negative it is moved forward so that the next peek will return next
      portion of data.
      
      When non-peeking recvmsg occurs and the peeking offset is non negative
      is is moved backward so that the next peek will still peek the proper
      data (i.e. the one that would have been picked if there were no non
      peeking recv in between).
      
      The offset is set using per-proto opteration to let the protocol handle
      the locking issues and to check whether the peeking offset feature is
      supported by the protocol the socket belongs to.
      Signed-off-by: NPavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ef64a54f
  18. 15 2月, 2012 1 次提交
  19. 18 1月, 2012 1 次提交
    • E
      Audit: push audit success and retcode into arch ptrace.h · d7e7528b
      Eric Paris 提交于
      The audit system previously expected arches calling to audit_syscall_exit to
      supply as arguments if the syscall was a success and what the return code was.
      Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things
      by converting from negative retcodes to an audit internal magic value stating
      success or failure.  This helper was wrong and could indicate that a valid
      pointer returned to userspace was a failed syscall.  The fix is to fix the
      layering foolishness.  We now pass audit_syscall_exit a struct pt_reg and it
      in turns calls back into arch code to collect the return value and to
      determine if the syscall was a success or failure.  We also define a generic
      is_syscall_success() macro which determines success/failure based on if the
      value is < -MAX_ERRNO.  This works for arches like x86 which do not use a
      separate mechanism to indicate syscall failure.
      
      We make both the is_syscall_success() and regs_return_value() static inlines
      instead of macros.  The reason is because the audit function must take a void*
      for the regs.  (uml calls theirs struct uml_pt_regs instead of just struct
      pt_regs so audit_syscall_exit can't take a struct pt_regs).  Since the audit
      function takes a void* we need to use static inlines to cast it back to the
      arch correct structure to dereference it.
      
      The other major change is that on some arches, like ia64, MIPS and ppc, we
      change regs_return_value() to give us the negative value on syscall failure.
      THE only other user of this macro, kretprobe_example.c, won't notice and it
      makes the value signed consistently for the audit functions across all archs.
      
      In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
      audit code as the return value.  But the ptrace_64.h code defined the macro
      regs_return_value() as regs[3].  I have no idea which one is correct, but this
      patch now uses the regs_return_value() function, so it now uses regs[3].
      
      For powerpc we previously used regs->result but now use the
      regs_return_value() function which uses regs->gprs[3].  regs->gprs[3] is
      always positive so the regs_return_value(), much like ia64 makes it negative
      before calling the audit code when appropriate.
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
      Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
      Acked-by: Richard Weinberger <richard@nod.at> [for uml]
      Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
      d7e7528b
  20. 13 1月, 2012 1 次提交
  21. 10 1月, 2012 1 次提交
  22. 07 1月, 2012 1 次提交
    • M
      PCI: IA64: convert pcibios_set_master() to a non-inlined function · 91e86df1
      Myron Stowe 提交于
      This patch converts IA64's architecture-specific 'pcibios_set_master()'
      routine to a non-inlined function.  This will allow follow on
      patches to create a generic 'pcibios_set_master()' function using the
      '__weak' attribute which can be used by all architectures as a default
      which, if necessary, can then be over-ridden by architecture-
      specific code.
      
      Converting 'pci_bios_set_master()' to a non-inlined function will allow
      IA64's 'pcibios_set_master()' implementation to remain architecture-
      specific after the generic version is introduced and thus, not change
      current behavior.
      
      No functional change.
      Signed-off-by: NMyron Stowe <myron.stowe@redhat.com>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      91e86df1
  23. 04 1月, 2012 2 次提交
  24. 30 12月, 2011 1 次提交
    • A
      procfs: do not confuse jiffies with cputime64_t · 34845636
      Andreas Schwab 提交于
      Commit 2a95ea6c ("procfs: do not overflow get_{idle,iowait}_time
      for nohz") did not take into account that one some architectures jiffies
      and cputime use different units.
      
      This causes get_idle_time() to return numbers in the wrong units, making
      the idle time fields in /proc/stat wrong.
      
      Instead of converting the usec value returned by
      get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
      usecs_to_cputime64 to convert it to the correct unit of cputime64_t.
      Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "Artem S. Tashkinov" <t.artem@mailcity.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      34845636
  25. 15 12月, 2011 1 次提交
  26. 13 12月, 2011 1 次提交
  27. 01 12月, 2011 1 次提交
  28. 22 11月, 2011 1 次提交
  29. 15 11月, 2011 1 次提交
  30. 10 11月, 2011 1 次提交
    • J
      net: add wireless TX status socket option · 6e3e939f
      Johannes Berg 提交于
      The 802.1X EAPOL handshake hostapd does requires
      knowing whether the frame was ack'ed by the peer.
      Currently, we fudge this pretty badly by not even
      transmitting the frame as a normal data frame but
      injecting it with radiotap and getting the status
      out of radiotap monitor as well. This is rather
      complex, confuses users (mon.wlan0 presence) and
      doesn't work with all hardware.
      
      To get rid of that hack, introduce a real wifi TX
      status option for data frame transmissions.
      
      This works similar to the existing TX timestamping
      in that it reflects the SKB back to the socket's
      error queue with a SCM_WIFI_STATUS cmsg that has
      an int indicating ACK status (0/1).
      
      Since it is possible that at some point we will
      want to have TX timestamping and wifi status in a
      single errqueue SKB (there's little point in not
      doing that), redefine SO_EE_ORIGIN_TIMESTAMPING
      to SO_EE_ORIGIN_TXSTATUS which can collect more
      than just the timestamp; keep the old constant
      as an alias of course. Currently the internal APIs
      don't make that possible, but it wouldn't be hard
      to split them up in a way that makes it possible.
      
      Thanks to Neil Horman for helping me figure out
      the functions that add the control messages.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      6e3e939f