1. 03 Oct, 2012 1 commit
  2. 11 Sep, 2012 1 commit
    • J
      [IA64] Fix a node distance bug · 7cd10a60
      Jianguo Wu committed
      The ia64 architecture has the following definitions:
      extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
      #define node_distance(from,to) (numa_slit[(from) * num_online_nodes() + (to)])
      
      num_online_nodes() is not constant; its value changes after a node is
      hot-removed or hot-added.
      
      In practice, I found that the node distance is wrong after offlining
      a node on an IA64 platform. For example, on a system with 4 nodes:
      node distances:
      node   0   1   2   3
        0:  10  21  21  32
        1:  21  10  32  21
        2:  21  32  10  21
        3:  32  21  21  10
      
      linux-drf:/sys/devices/system/node/node0 # cat distance
      10  21  21  32
      linux-drf:/sys/devices/system/node/node1 # cat distance
      21  10  32  21
      
      After offlining node2:
      linux-drf:/sys/devices/system/node/node0 # cat distance
      10 21 32
      linux-drf:/sys/devices/system/node/node1 # cat distance
      32 21 32	--------->expected value is: 21  10  21
      Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      7cd10a60
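The wrong values in the commit above can be reproduced with a small user-space sketch. It assumes the table was filled with a stride of 4 at boot and is then indexed with the current online count, as in the quoted macro. The names `BOOT_NODES`, `distance_online_stride` and `distance_fixed_stride` are hypothetical, and the fixed-stride variant is only one possible remedy, not necessarily what the commit does.

```c
#include <assert.h>
#include <stddef.h>

#define BOOT_NODES 4 /* hypothetical: number of nodes present at boot */

/* Distance table from the example, stored row-major with stride 4. */
static const unsigned char slit[BOOT_NODES * BOOT_NODES] = {
    10, 21, 21, 32,
    21, 10, 32, 21,
    21, 32, 10, 21,
    32, 21, 21, 10,
};

/* Stride follows the current online count, as in the quoted macro. */
static int distance_online_stride(int from, int to, int online_nodes)
{
    size_t idx = (size_t)from * (size_t)online_nodes + (size_t)to;

    assert(idx < sizeof(slit));
    return slit[idx];
}

/* Stride fixed at the value the table was built with. */
static int distance_fixed_stride(int from, int to)
{
    assert(from >= 0 && from < BOOT_NODES);
    assert(to >= 0 && to < BOOT_NODES);
    return slit[from * BOOT_NODES + to];
}
```

With three nodes online, `distance_online_stride(1, 0, 3)` reads index 3, which belongs to row 0 of the table; this is where the leading 32 in the reported node1 output comes from.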
  3. 20 Aug, 2012 1 commit
    • F
      cputime: Consolidate vtime handling on context switch · baa36046
      Frederic Weisbecker committed
      The archs that implement virtual cputime accounting all
      flush the cputime of a task when it gets descheduled
      and sometimes set up some ground initialization for the
      next task to account its cputime.
      
      These archs all put their own hooks in their context
      switch callbacks and handle the off-case themselves.
      
      Consolidate this by creating a new account_switch_vtime()
      callback called in generic code right after a context switch
      and that these archs must implement to flush the prev task
      cputime and initialize the next task cputime related state.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      baa36046
  4. 27 Jul, 2012 1 commit
    • T
      [IA64] Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts · a1193655
      Tony Luck committed
      The following build error occurred during an ia64 build with
      swap-over-NFS patches applied.
      
      net/core/sock.c:274:36: error: initializer element is not constant
      net/core/sock.c:274:36: error: (near initialization for 'memalloc_socks')
      net/core/sock.c:274:36: error: initializer element is not constant
      
      This is identical to a parisc build error. Fengguang Wu, Mel Gorman
      and James Bottomley did all the legwork to track the root cause of
      the problem. This fix and entire commit log is shamelessly copied
      from them with one extra detail to change a dubious runtime use of
      ATOMIC_INIT() to atomic_set() in drivers/char/mspec.c
      
      Dave Anglin says:
      > Here is the line in sock.i:
      >
      > struct static_key memalloc_socks = ((struct static_key) { .enabled =
      > ((atomic_t) { (0) }) });
      
      The above line contains two compound literals.  It also uses a designated
      initializer to initialize the field enabled.  A compound literal is not a
      constant expression.
      
      The location of the above statement isn't fully clear, but if a compound
      literal occurs outside the body of a function, the initializer list must
      consist of constant expressions.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      a1193655
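The constant-expression issue in the commit above can be illustrated with a reduced sketch. The types below are simplified stand-ins, not the kernel definitions, and `STATIC_KEY_INIT` is written without a cast purely for illustration. The cast form of `ATOMIC_INIT` appears only in a comment, because a compiler may reject it at file scope, as the quoted error shows.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
typedef struct { int counter; } atomic_t;
struct static_key { atomic_t enabled; };

/*
 * Cast form, which expands to a compound literal and is therefore not a
 * constant expression:
 *     #define ATOMIC_INIT(i) ((atomic_t) { (i) })
 * Brace form, which is an ordinary initializer list:
 */
#define ATOMIC_INIT(i) { (i) }
#define STATIC_KEY_INIT { .enabled = ATOMIC_INIT(0) }

/* File-scope object: its initializer must consist of constant expressions. */
static struct static_key memalloc_socks = STATIC_KEY_INIT;

/* A brace initializer cannot be assigned at run time, so use a setter. */
static void atomic_set(atomic_t *v, int i)
{
    assert(v);
    v->counter = i;
}
```

The setter mirrors the point made about drivers/char/mspec.c: once the cast is dropped, `ATOMIC_INIT()` is only usable in initializers, so a runtime store has to go through `atomic_set()`.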
  5. 26 Jul, 2012 1 commit
  6. 25 Jun, 2012 1 commit
  7. 18 Jun, 2012 1 commit
  8. 02 Jun, 2012 2 commits
  9. 31 May, 2012 1 commit
  10. 17 May, 2012 3 commits
  11. 12 May, 2012 1 commit
  12. 09 May, 2012 1 commit
    • P
      sched/numa: Rewrite the CONFIG_NUMA sched domain support · cb83b629
      Peter Zijlstra committed
      The current code groups up to 16 nodes in a level and then puts an
      ALLNODES domain spanning the entire tree on top of that. This doesn't
      reflect the NUMA topology, and especially for the smaller
      not-fully-connected machines out there today this might make a difference.
      
      Therefore, build a proper NUMA topology based on node_distance().
      
      Since there are no fixed NUMA layers anymore, the static SD_NODE_INIT
      and SD_ALLNODES_INIT aren't usable anymore. The new code tries to
      construct something similar and scales some values on the number of
      cpus in the domain and/or the node_distance() ratio.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-sh@vger.kernel.org
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: sparclinux@vger.kernel.org
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86@kernel.org
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Greg Pearson <greg.pearson@hp.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: bob.picco@oracle.com
      Cc: chris.mason@oracle.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-r74n3n8hhuc2ynbrnp3vt954@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cb83b629
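One way to derive topology levels from node distances, as the commit above describes in general terms, is to collect the distinct distance values and treat each as one level. The sketch below is a user-space model under that assumption. The `dist` table stands in for node_distance(), and `collect_levels` and `nodes_within` are hypothetical names rather than scheduler functions.

```c
#include <assert.h>

#define NR_NODES 4 /* hypothetical node count for the sketch */

/* Hypothetical distance table standing in for node_distance(). */
static const int dist[NR_NODES][NR_NODES] = {
    { 10, 21, 21, 32 },
    { 21, 10, 32, 21 },
    { 21, 32, 10, 21 },
    { 32, 21, 21, 10 },
};

/*
 * Collect the distinct distances in ascending order into levels[].
 * Returns the number of values stored (at most max_levels).
 */
static int collect_levels(int levels[], int max_levels)
{
    int count = 0;

    for (int i = 0; i < NR_NODES; i++) {
        for (int j = 0; j < NR_NODES; j++) {
            int d = dist[i][j];
            int pos = 0;

            /* Find the insertion point in the sorted array. */
            while (pos < count && levels[pos] < d)
                pos++;
            if (pos < count && levels[pos] == d)
                continue; /* already recorded */
            if (count == max_levels)
                continue; /* no room; extra values are dropped here */

            /* Shift larger entries up and insert. */
            for (int k = count; k > pos; k--)
                levels[k] = levels[k - 1];
            levels[pos] = d;
            count++;
        }
    }
    return count;
}

/* Bitmask of nodes within `limit` of `node`: the span of one level. */
static unsigned int nodes_within(int node, int limit)
{
    unsigned int mask = 0;

    assert(node >= 0 && node < NR_NODES);
    for (int j = 0; j < NR_NODES; j++)
        if (dist[node][j] <= limit)
            mask |= 1u << j;
    return mask;
}
```

For this table the distinct distances are 10, 21 and 32, so node 0 gets three nested spans: itself, then nodes 0 to 2, then all four nodes.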
  13. 08 May, 2012 2 commits
  14. 07 May, 2012 3 commits
  15. 20 Apr, 2012 1 commit
    • A
      KVM: Fix page-crossing MMIO · f78146b0
      Avi Kivity committed
      MMIO accesses that are split across a page boundary are currently broken:
      the code does not expect to be aborted by the exit to userspace for the
      first MMIO fragment.
      
      This patch fixes the problem by generalizing the current code for handling
      16-byte MMIOs to handle a number of "fragments", and changes the MMIO
      code to create those fragments.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      f78146b0
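The fragment idea in the commit above can be sketched as a function that cuts an access at page boundaries. This only models the splitting step, not the exits to userspace. The 4096-byte page size, the `mmio_fragment` layout and `split_mmio` are assumptions made for illustration and are not taken from the KVM source.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumed page size for the sketch */

/* Hypothetical fragment descriptor: one piece that stays within a page. */
struct mmio_fragment {
    uint64_t gpa;     /* guest physical address of this piece */
    unsigned int len; /* length in bytes */
};

/*
 * Split [gpa, gpa + len) at page boundaries.
 * Returns the number of fragments written, or -1 if frags[] is too small.
 */
static int split_mmio(uint64_t gpa, unsigned int len,
                      struct mmio_fragment frags[], int max_frags)
{
    int n = 0;

    assert(frags);
    while (len > 0) {
        /* Bytes left before the next page boundary. */
        unsigned int room = PAGE_SIZE - (unsigned int)(gpa & (PAGE_SIZE - 1));
        unsigned int now = len < room ? len : room;

        if (n == max_frags)
            return -1;
        frags[n].gpa = gpa;
        frags[n].len = now;
        n++;

        gpa += now;
        len -= now;
    }
    return n;
}
```

An 8-byte access at 0x1ffc, for example, becomes two 4-byte fragments at 0x1ffc and 0x2000, each of which can then be handled on its own.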
  16. 17 Apr, 2012 1 commit
  17. 14 Apr, 2012 1 commit
  18. 08 Apr, 2012 2 commits
  19. 31 Mar, 2012 1 commit
  20. 30 Mar, 2012 1 commit
    • L
      Fix ia64 build errors (fallout from system.h disintegration) · 93f37888
      Luck, Tony committed
      Fix this build error on ia64:
      
        In file included from include/linux/sched.h:92,
                        from arch/ia64/kernel/asm-offsets.c:9:
        include/linux/llist.h:59:25: error: asm/cmpxchg.h: No such file or directory
        make[1]: *** [arch/ia64/kernel/asm-offsets.s] Error 1
      
      Right now we don't seem to need any actual contents for the
      asm/cmpxchg.h to make the build work ...  so leave the migration of
      xchg() and cmpxchg() to this new header file for a future patch.
      
      Also process.c needs <asm/switch_to.h> (for definition of pfm_syst_info).
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      93f37888
  21. 29 Mar, 2012 2 commits
  22. 28 Mar, 2012 1 commit
  23. 11 Mar, 2012 1 commit
    • K
      xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it. · 73c154c6
      Konrad Rzeszutek Wilk committed
      For the hypervisor to take advantage of the MWAIT support it needs
      to extract from the ACPI _CST the register address. But the
      hypervisor does not have the support to parse DSDT so it relies on
      the initial domain (dom0) to parse the ACPI Power Management information
      and push it up to the hypervisor. The pushing of the data is done
      by the processor_harveset_xen module which parses the information that
      the ACPI parser has graciously exposed in 'struct acpi_processor'.
      
      For the ACPI parser to also expose the Cx states for MWAIT, we need
      to expose the MWAIT capability (leaf 1). Furthermore we also need to
      expose the MWAIT_LEAF capability (leaf 5) for cstate.c to properly
      function.
      
      The hypervisor could expose these flags when it traps the XEN_EMULATE_PREFIX
      operations, but it can't do it since it needs to be backwards compatible.
      Instead we choose to use the native CPUID to figure out if the MWAIT
      capability exists and use the XEN_SET_PDC query hypercall to figure out
      if the hypervisor wants us to expose the MWAIT_LEAF capability or not.
      
      Note: The XEN_SET_PDC query was implemented in c/s 23783:
      "ACPI: add _PDC input override mechanism".
      
      With this in place, instead of
       C3 ACPI IOPORT 415
      we now get
       C3:ACPI FFH INTEL MWAIT 0x20
      
      Note: The cpu_idle which would be calling the mwait variants for idling
      never gets set because we set the default pm_idle to be the hypercall
      variant.
      Acked-by: Jan Beulich <JBeulich@suse.com>
      [v2: Fix missing header file include and #ifdef]
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      73c154c6
  24. 09 Mar, 2012 1 commit
    • J
      hpsim, initialize chip for assigned irqs · 6efb6b77
      Jiri Slaby committed
      Currently, when assign_irq_vector is called and the irq is connected in
      the simulator, the irq is not ready. request_irq will return ENOSYS
      immediately. This is because the irq chip is unset.
      
      Hence set the chip properly to irq_type_hp_sim. And make sure this is
      done from both users of simulated interrupts.
      
      Also we have to set handler here, otherwise we end up in
      handle_bad_int resulting in spam in logs and no irqs handled. We use
      handle_simple_irq as these are SW interrupts that need no ACK or
      anything.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6efb6b77
  25. 08 Mar, 2012 1 commit
  26. 05 Mar, 2012 1 commit
    • C
      KVM: provide synchronous registers in kvm_run · b9e5dc8d
      Christian Borntraeger committed
      On some cpus the overhead for virtualization instructions is in the same
      range as a system call. Having to call multiple ioctls to get or set
      registers makes certain userspace-handled exits more expensive than
      necessary. Let's provide a section in kvm_run that works as a shared
      save area for guest registers.
      We also provide two 64-bit flag fields (architecture specific) that
      specify:
      1. which parts of these fields are valid
      2. which registers were modified by userspace
      
      Each bit in these flag fields defines a group of registers (like
      general purpose) or a single register.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      b9e5dc8d
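The two flag fields described in the commit above can be modeled with a small sketch. The structure is a simplified stand-in for the shared area and not the real kvm_run layout. The `SYNC_GPRS` bit, the field names and both helper functions are hypothetical, since the real bit assignments are architecture specific.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register group bit; real assignments are per architecture. */
#define SYNC_GPRS (1ull << 0)

/* Simplified stand-in for the shared save area, not the real kvm_run. */
struct sync_area {
    uint64_t valid_regs; /* set by the kernel: groups holding current values */
    uint64_t dirty_regs; /* set by userspace: groups it has modified */
    uint64_t gprs[16];
};

/* Userspace side: write a register only if its group is marked valid. */
static int set_gpr(struct sync_area *s, unsigned int idx, uint64_t value)
{
    assert(s);
    if (idx >= 16 || !(s->valid_regs & SYNC_GPRS))
        return -1;
    s->gprs[idx] = value;
    s->dirty_regs |= SYNC_GPRS; /* ask the kernel to reload this group */
    return 0;
}

/* Kernel side (modeled): consume and clear the dirty flags before re-entry. */
static uint64_t take_dirty(struct sync_area *s)
{
    uint64_t dirty;

    assert(s);
    dirty = s->dirty_regs;
    s->dirty_regs = 0;
    return dirty;
}
```

The point of the shared area is that both steps are plain memory accesses, so no extra ioctl is needed between an exit and the next entry.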
  27. 25 Feb, 2012 1 commit
    • J
      [IA64] hpsim, initialize chip for assigned irqs · cb534855
      Jiri Slaby committed
      Currently, when assign_irq_vector is called and the irq is connected in
      the simulator, the irq is not ready. request_irq will return ENOSYS
      immediately. This is because the irq chip is unset.
      
      Hence set the chip properly to irq_type_hp_sim. And make sure this is
      done from both users of simulated interrupts.
      
      Also we have to set handler here, otherwise we end up in
      handle_bad_int resulting in spam in logs and no irqs handled. We use
      handle_simple_irq as these are SW interrupts that need no ACK or
      anything.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      cb534855
  28. 24 Feb, 2012 4 commits
  29. 22 Feb, 2012 1 commit
    • P
      sock: Introduce the SO_PEEK_OFF sock option · ef64a54f
      Pavel Emelyanov committed
      This option specifies where to start MSG_PEEK-ing queue data from. A
      negative value means that MSG_PEEK works as usual: it always peeks
      from the head of the queue.
      
      When some bytes are peeked from the queue and the peeking offset is
      non-negative, the offset is moved forward so that the next peek returns
      the next portion of data.
      
      When a non-peeking recvmsg occurs and the peeking offset is non-negative,
      the offset is moved backward so that the next peek still peeks the proper
      data (i.e. the data that would have been picked if there had been no
      non-peeking recv in between).
      
      The offset is set using a per-proto operation to let the protocol handle
      the locking issues and to check whether the peeking offset feature is
      supported by the protocol the socket belongs to.
      Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ef64a54f
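The offset rules in the commit above can be followed in a toy user-space model of a receive queue. It is not the kernel implementation: `toy_queue`, `toy_peek` and `toy_recv` are hypothetical names, and clamping the offset at zero when a read consumes more than has been peeked is an assumption of this sketch.

```c
#include <assert.h>
#include <string.h>

/* Toy receive queue; a user-space model, not the kernel implementation. */
struct toy_queue {
    const char *data; /* unread bytes */
    size_t len;       /* number of unread bytes */
    int peek_off;     /* negative: every peek starts at the head */
};

/* MSG_PEEK-style read: copies data without consuming it. */
static size_t toy_peek(struct toy_queue *q, char *out, size_t n)
{
    size_t start = q->peek_off < 0 ? 0 : (size_t)q->peek_off;
    size_t avail = start < q->len ? q->len - start : 0;

    if (n > avail)
        n = avail;
    memcpy(out, q->data + start, n);
    if (q->peek_off >= 0)
        q->peek_off += (int)n; /* the next peek continues after these bytes */
    return n;
}

/* Ordinary read: consumes from the head and pulls the offset back. */
static size_t toy_recv(struct toy_queue *q, char *out, size_t n)
{
    if (n > q->len)
        n = q->len;
    memcpy(out, q->data, n);
    q->data += n;
    q->len -= n;
    if (q->peek_off >= 0) {
        q->peek_off -= (int)n;
        if (q->peek_off < 0)
            q->peek_off = 0; /* assumed clamp: stay in offset mode */
    }
    return n;
}
```

Starting with "abcdefgh" and an offset of 0, peeking three bytes yields "abc" and moves the offset to 3. Receiving two bytes then moves it back to 1, so the next peek of three bytes yields "def", the same data it would have returned without the intervening read.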