1. 09 Jun, 2010 8 commits
    • x86, UV: Remove BAU check for stay-busy · 90cc7d94
      Cliff Wickman committed
      Remove a faulty assumption that a long running BAU request has
      encountered a hardware problem and will never finish.
      
      Numalink congestion can make a request appear to have
      encountered such a problem, but it is not safe to cancel the
      request.  If such a cancel is done but a reply is later received
      we can miss a TLB shootdown.
      
      We depend upon the max_bau_concurrent 'throttle' to prevent the
      stay-busy case from happening.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: gregkh@suse.de
      LKML-Reference: <E1OJvNy-0004ad-BV@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Correct BAU discovery of hubs and sockets · a8328ee5
      Cliff Wickman committed
      Correct the initialization-time assumption of contiguous blade
      numbers and of sockets numbered from zero.
      
      There may be hubs present with no cpu's enabled.
      There may be disabled sockets such that the active socket is not
      number zero.
      
      And assign a 'socket master' by assuming that a socket is a
      node. (it is not safe to extract socket number from an apicid)
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: gregkh@suse.de
      LKML-Reference: <E1OJvNy-0004aW-9S@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Correct BAU software acknowledge · 39847e7f
      Cliff Wickman committed
      Correct the acknowledgment and the reset of a BAU
      software-acknowledged message.
      
      A retry message should be testing only for timed-out resources
      (mask << 8). (And we delete a log message that might cause
      unnecessary concern) The acknowledge MMR is
      |--timed-out--|---pending--|,  each is 8 bits.
      
      The IPI-driven reset of software acknowledge resources frees
      both timed out and pending resources.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: gregkh@suse.de
      LKML-Reference: <E1OJvNy-0004aP-7O@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
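The register layout described in 39847e7f can be sketched in plain C. This is only an illustration, assuming a 16-bit view of the acknowledge MMR with the timed-out bits in the upper byte and the pending bits in the lower byte, as the commit text states. The names `SW_ACK_TIMEOUT_SHIFT`, `sw_ack_timed_out()` and `sw_ack_reset_bits()` are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (from the commit text): |--timed-out--|---pending--|,
 * 8 bits each. The shift name below is hypothetical. */
#define SW_ACK_TIMEOUT_SHIFT 8

/* Nonzero if any resource selected by `mask` (one bit per resource)
 * is marked timed-out. A retry message tests only this half. */
static int sw_ack_timed_out(uint16_t mmr, uint8_t mask)
{
    return (mmr & ((uint16_t)mask << SW_ACK_TIMEOUT_SHIFT)) != 0;
}

/* Bits that a reset must clear so that both the timed-out and the
 * pending state of the selected resources are freed. */
static uint16_t sw_ack_reset_bits(uint8_t mask)
{
    return (uint16_t)(((uint16_t)mask << SW_ACK_TIMEOUT_SHIFT) | mask);
}
```

A resource that is merely pending does not satisfy the timed-out test, which is the distinction the commit corrects.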
    • x86, UV: BAU structure rearranging · 4faca155
      Cliff Wickman committed
      Move some structure definitions from the C code to the BAU
      header file, and change the organization of that header file a
      little.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: gregkh@suse.de
      LKML-Reference: <E1OJvNy-0004aI-54@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Shorten access to BAU statistics structure · 712157aa
      Cliff Wickman committed
      Use a pointer from the per-cpu BAU control structure to the
      per-cpu BAU statistics structure.
      We nearly always know the first before needing the second.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: gregkh@suse.de
      LKML-Reference: <E1OJvNy-0004aB-2k@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Disable BAU on network congestion · 50fb55ac
      Cliff Wickman committed
      The numalink network can become so congested that TLB shootdown
      using the Broadcast Assist Unit becomes slower than using IPI's.
      
      In that case, disable the use of the BAU for a period of time.
      The period is tunable.  When the period expires the use of the
      BAU is re-enabled. A count of these actions is added to the
      statistics file.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: gregkh@suse.de
      LKML-Reference: <E1OJvNy-0004a4-0a@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
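The disable-and-re-enable behaviour described in 50fb55ac might look roughly like the sketch below. It is a user-space illustration under stated assumptions: time is an abstract tick counter, and the structure and function names (`bau_state`, `bau_note_congestion()`, `bau_usable()`) are hypothetical rather than the driver's own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-cpu state; field names are illustrative only. */
struct bau_state {
    int enabled;             /* 1 while the BAU may be used */
    uint64_t reenable_at;    /* tick at which the BAU is re-enabled */
    uint64_t disable_period; /* the tunable period, in ticks */
    unsigned long disables;  /* statistics: times the BAU was disabled */
    unsigned long enables;   /* statistics: times it was re-enabled */
};

/* Called when a broadcast was judged slower than IPIs would have been. */
static void bau_note_congestion(struct bau_state *s, uint64_t now)
{
    if (!s->enabled)
        return;
    s->enabled = 0;
    s->reenable_at = now + s->disable_period;
    s->disables++;
}

/* Returns 1 if the BAU should be used, 0 if the caller should fall
 * back to IPIs. Re-enables the BAU once the period has expired. */
static int bau_usable(struct bau_state *s, uint64_t now)
{
    if (!s->enabled && now >= s->reenable_at) {
        s->enabled = 1;
        s->enables++;
    }
    return s->enabled;
}
```

How congestion is detected in the first place is not specified in the commit message, so the sketch leaves that decision to the caller.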
    • x86, UV: BAU tunables into a debugfs file · e8e5e8a8
      Cliff Wickman committed
      Make the Broadcast Assist Unit driver's nine tuning values variable by
      making them accessible through a read/write debugfs file.
      
      The file will normally be mounted as
      /sys/kernel/debug/sgi_uv/bau_tunables. The tunables are kept in each
      cpu's per-cpu BAU structure.
      
      The patch also does a little name improvement, and corrects the reset of
      two destination timeout counters.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: gregkh@suse.de
      LKML-Reference: <E1OJvNx-0004Zx-Uo@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Calculate BAU destination timeout · 12a6611f
      Cliff Wickman committed
      Calculate the Broadcast Assist Unit's destination timeout period from the
      values in the relevant MMR's.
      
      Store it in each cpu's per-cpu BAU structure so that a destination
      timeout can be differentiated from a 'plugged' situation in which all
      software ack resources are already allocated and a timeout is pending.
      That case returns an immediate destination error.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: gregkh@suse.de
      LKML-Reference: <E1OJvNx-0004Zq-RK@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
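One way to read 12a6611f is that a destination error arriving sooner than the hardware timeout period must be the 'plugged' case, since a genuine timeout cannot be reported before the period has elapsed. The sketch below illustrates that comparison only. The enum, the function name and the use of abstract time units are assumptions, and the calculation of the period from the MMR values is deliberately not shown because the commit message does not give it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of a destination error. */
enum dest_error {
    DEST_TIMEOUT, /* the destination really timed out */
    DEST_PLUGGED  /* all software ack resources were already allocated */
};

/* `elapsed` is the time between sending the message and seeing the
 * error; `timeout_period` is the stored per-cpu timeout value. Both
 * are in the same (unspecified) units. */
static enum dest_error classify_dest_error(uint64_t elapsed,
                                           uint64_t timeout_period)
{
    return elapsed < timeout_period ? DEST_PLUGGED : DEST_TIMEOUT;
}
```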
  2. 15 Apr, 2010 1 commit
    • x86, UV: Improve BAU performance and error recovery · b8f7fb13
      Cliff Wickman committed
      - increase performance of the interrupt handler
      
      - release timed-out software acknowledge resources
      
      - recover from continuous-busy status due to a hardware issue
      
      - add a 'throttle' to keep a uvhub from sending more than a
        specified number of broadcasts concurrently (work around the hardware issue)
      
      - provide a 'nobau' boot command line option
      
      - rename 'pnode' and 'node' to 'uvhub' (the 'node' terminology
        is ambiguous)
      
      - add some new statistics about the scope of broadcasts, retries, the
        hardware issue and the 'throttle'
      
      - split off new function uv_bau_retry_msg() from
        uv_bau_process_message() per community coding style feedback.
      
      - simplify the argument list to uv_bau_process_message(), per
        community coding style feedback.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: linux-mm@kvack.org
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Mike Travis <travis@sgi.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <E1O25Z4-0004Ur-PB@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
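The 'throttle' introduced in b8f7fb13 caps how many broadcasts a uvhub has in flight at once. A minimal sketch of such a cap, using C11 atomics in user space, is shown below. The names `hub_throttle`, `throttle_try_acquire()` and `throttle_release()` are hypothetical, and the real driver may wait or spin rather than simply report failure.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-uvhub throttle state. */
struct hub_throttle {
    atomic_int active;  /* broadcasts currently in flight */
    int max_concurrent; /* the configured limit */
};

/* Try to start a broadcast. Returns 1 if the caller may send, or 0
 * if the hub is already at its limit and the caller must back off. */
static int throttle_try_acquire(struct hub_throttle *t)
{
    if (atomic_fetch_add(&t->active, 1) >= t->max_concurrent) {
        atomic_fetch_sub(&t->active, 1); /* undo the reservation */
        return 0;
    }
    return 1;
}

/* Called when a broadcast completes. */
static void throttle_release(struct hub_throttle *t)
{
    atomic_fetch_sub(&t->active, 1);
}
```

Incrementing first and undoing on failure keeps the check and the reservation in a single atomic step, so two senders cannot both claim the last slot.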
  3. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Tejun Heo committed
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  4. 11 Mar, 2010 1 commit
  5. 24 Nov, 2009 1 commit
    • x86: SGI UV: Fix BAU initialization · e38e2af1
      Cliff Wickman committed
      A memory mapped register that affects the SGI UV Broadcast
      Assist Unit's interrupt handling may sometimes be uninitialized.
      
      Remove the condition on its initialization, as that condition
      can be randomly satisfied by a hardware reset.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <E1NBGB9-0005nU-Dp@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 16 Oct, 2009 2 commits
    • x86, UV: Fix and clean up bau code to use uv_gpa_to_pnode() · 1d21e6e3
      Robin Holt committed
      Create an inline function to extract the pnode from a global
      physical address and then convert the broadcast assist unit to
      use the newly created uv_gpa_to_pnode function.
      
      The open-coded code was wrong as well - it might explain a
      few of our unexplained bau hangs.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Acked-by: Cliff Whickman <cpw@sgi.com>
      Cc: linux-mm@kvack.org
      Cc: Jack Steiner <steiner@sgi.com>
      LKML-Reference: <20091016112920.GZ8903@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
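The idea behind a helper such as uv_gpa_to_pnode() can be illustrated as follows. This is a sketch, not the kernel's definition: it assumes that a global physical address carries the memory offset in its low `m_val` bits and the node number in the next `n_val` bits, and it passes those widths as parameters instead of reading them from uv_hub_info. The function name `gpa_to_pnode()` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the node field from a global physical address.
 * Assumed layout: [ ...upper bits... | n_val node bits | m_val offset bits ] */
static unsigned int gpa_to_pnode(uint64_t gpa, unsigned int m_val,
                                 unsigned int n_val)
{
    uint64_t n_mask;

    assert(m_val < 64 && n_val < 64);
    n_mask = (UINT64_C(1) << n_val) - 1;

    /* Shift away the offset, then mask so that any bits above the
     * node field cannot leak into the result. */
    return (unsigned int)((gpa >> m_val) & n_mask);
}
```

Masking after the shift is the step that open-coded versions tend to omit, and it is why a single shared helper is less error-prone.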
    • x86, UV: Fix information in __uv_hub_info structure · 036ed8ba
      Robin Holt committed
      A few parts of the uv_hub_info structure are initialized
      incorrectly.
      
       - n_val is being loaded with m_val.
       - gpa_mask is initialized with a byte instead of an unsigned long.
       - Handle the case where none of the alias registers are used.
      
      Lastly I converted the bau over to using the uv_hub_info->m_val
      which is the correct value.
      
      Without this patch, booting a large configuration hits a
      problem where the upper bits of the gnode affect the pnode
      and the bau will not operate.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Acked-by: Jack Steiner <steiner@sgi.com>
      Cc: Cliff Whickman <cpw@sgi.com>
      Cc: stable@kernel.org
      LKML-Reference: <20091015224946.396355000@alcatraz.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 24 Aug, 2009 1 commit
  8. 15 Aug, 2009 1 commit
  9. 24 Jun, 2009 1 commit
    • x86: Fix uv bau sending buffer initialization · 9c26f52b
      Cliff Wickman committed
      The initialization of the UV Broadcast Assist Unit's sending
      buffers was making an invalid assumption about the
      initialization of an MMR that defines its address.
      
      The BIOS will not be providing that MMR.  So
      uv_activation_descriptor_init() should unconditionally set it.
      
      Tested on UV simulator.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org> # for v2.6.30.x
      LKML-Reference: <E1MJTfj-0005i1-W8@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. 09 Jun, 2009 1 commit
  11. 03 Jun, 2009 1 commit
    • x86: Fix UV BAU activation descriptor init · 0e2595cd
      Cliff Wickman committed
      The UV tlb shootdown code has a serious initialization error.
      
      An array of structures [32*8] is initialized as if it were [32].
      The array is indexed by (cpu number on the blade)*8, so the short
      initialization works for up to 4 cpus on a blade.
      But above that, we provide an invalid opcode to the hub's
      broadcast assist unit.
      
      This patch changes the allocation of the array to use its symbolic
      dimensions for better clarity. And initializes all 32*8 entries.
      
      Shortened 'UV_ACTIVATION_DESCRIPTOR_SIZE' to 'UV_ADP_SIZE' per Ingo's
      recommendation.
      
      Tested on the UV simulator.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <E1M6lZR-0007kV-Aq@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
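The indexing problem fixed in 0e2595cd is easy to reproduce in miniature. The sketch below assumes a table of 32*8 entries indexed by (cpu number on the blade)*8, as the commit describes. `ADP_SIZE` stands in for the commit's UV_ADP_SIZE, while `ITEMS_PER_DESC`, `struct desc` and the two functions are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define ADP_SIZE 32      /* stands in for UV_ADP_SIZE */
#define ITEMS_PER_DESC 8 /* hypothetical name for the factor of 8 */

/* Hypothetical descriptor; `valid` models "carries a valid opcode". */
struct desc {
    int valid;
};

/* Initialize the first `n` entries of the table. */
static void init_descriptors(struct desc *tab, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        tab[i].valid = 1;
}

/* Each cpu on the blade owns the entry at cpu * ITEMS_PER_DESC. */
static struct desc *desc_for_cpu(struct desc *tab, unsigned int cpu)
{
    assert(cpu < ADP_SIZE);
    return &tab[(size_t)cpu * ITEMS_PER_DESC];
}
```

Initializing only `ADP_SIZE` entries covers indices 0, 8, 16 and 24, which is why the short initialization appeared to work for up to four cpus. Passing `ADP_SIZE * ITEMS_PER_DESC` covers every cpu's entry.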
  12. 17 Apr, 2009 1 commit
    • x86: UV BAU distribution and payload MMRs · 4ea3c51d
      Cliff Wickman committed
      This patch correctly sets BAU memory mapped registers to point
      to the sending activation descriptor table and target payload table.
      
      The "Broadcast Assist Unit" is used for TLB shootdown in UV.
      
      The memory mapped registers that point to sending and receiving
      memory structures contain node numbers.
      
      In one case the __pa() function did not provide the node id of
      memory on blade zero in configurations where that id is nonzero.
      In another case, it was assumed that memory was allocated on
      the local node.  That assumption is not true in a configuration
      in which the node has no memory.
      
      Tested on the UV hardware simulator.
      
      [ Impact: fix possible runtime crash due to incorrect TLB logic ]
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      LKML-Reference: <E1LuR5Z-0007An-B8@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. 15 Apr, 2009 1 commit
    • x86: UV: BAU partition-relative distribution map · 94ca8e48
      Cliff Wickman committed
      This patch enables each partition's BAU distribution bit map
      to be partition-relative.
      
      The distribution bitmap had been constructed assuming 0 as the base
      node number.  That construct would not have allowed a total system of
      greater than 256 nodes.
      It also corrects an error that occurred when the first blade's nasid
      was not zero.  That nasid was stored as the base node.
      The base node number gets added by hardware to the node numbers implied
      in the distribution bitmap, resulting in invalid target nasids.
      
      Tested on the UV hardware simulator.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      LKML-Reference: <E1Ltl0C-0004Ob-37@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
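A partition-relative distribution map, as described in 94ca8e48, stores each target as an offset from the partition's base node, and the hardware adds the base back when it delivers the message. The sketch below illustrates the bookkeeping only. The 256-bit map size is inferred from the "256 nodes" limit mentioned in the commit, and `dist_map`, `dist_set()` and `dist_test()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define DIST_BITS 256 /* assumed map size, inferred from the commit text */

struct dist_map {
    uint64_t bits[DIST_BITS / 64];
};

/* Mark `pnode` as a target, relative to the partition's base pnode.
 * Returns 0 on success, -1 if the node cannot be represented. */
static int dist_set(struct dist_map *m, unsigned int pnode,
                    unsigned int base_pnode)
{
    unsigned int rel;

    if (pnode < base_pnode)
        return -1;
    rel = pnode - base_pnode;
    if (rel >= DIST_BITS)
        return -1;
    m->bits[rel / 64] |= UINT64_C(1) << (rel % 64);
    return 0;
}

/* Nonzero if relative bit `rel` is set in the map. */
static int dist_test(const struct dist_map *m, unsigned int rel)
{
    assert(rel < DIST_BITS);
    return (m->bits[rel / 64] >> (rel % 64)) & 1;
}
```

With a base of zero, node 300 does not fit in the map. With the partition's own base, it does, which is what lifts the overall system size limit.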
  14. 04 Apr, 2009 2 commits
    • x86: UV BAU messaging timeouts · c4c4688f
      Cliff Wickman committed
      This patch replaces a 'nop' uv_enable_timeouts() in the
      UV TLB shootdown code. (somehow, long ago that function got
      eviscerated)
      
      If any cpu in the destination node does not get interrupted by the
      message and post completion in a reasonable time the hardware
      should respond to the sender with an error.  This function
      enables such timeouts.
      
      Tested on the UV hardware simulator.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      LKML-Reference: <E1LpjXU-00007e-Qh@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: UV BAU and nodes with no memory · 9674f35b
      Cliff Wickman committed
      This patch fixes BAU initialization for systems containing
      nodes with no memory and for systems with non-consecutive
      node numbers.
      
      Fixes and clarifies situations where pnode should be used instead
      of node id.
      
      Tested on the UV hardware simulator.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      LKML-Reference: <E1LpjX3-00007N-12@eag09.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 18 Mar, 2009 1 commit
  16. 13 Mar, 2009 1 commit
  17. 08 Mar, 2009 1 commit
    • x86: UV: remove uv_flush_tlb_others() WARN_ON · 3a450de1
      Cliff Wickman committed
      In uv_flush_tlb_others() (arch/x86/kernel/tlb_uv.c),
      the "WARN_ON(!in_atomic())" fails if CONFIG_PREEMPT is not enabled.
      
      And CONFIG_PREEMPT is not enabled by default in the distribution that
      most UV owners will use.
      
      We could #ifdef CONFIG_PREEMPT the warning, but that is not good form.
      And there seems to be no suitable fix to in_atomic() when CONFIG_PREEMPT
      is not on.
      
      As Ingo commented:
      
        > and we have no proper primitive to test for atomicity. (mainly
        > because we dont know about atomicity on a non-preempt kernel)
      
      So we drop the WARN_ON.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  18. 18 Feb, 2009 2 commits
  19. 31 Jan, 2009 1 commit
  20. 29 Jan, 2009 1 commit
  21. 21 Jan, 2009 1 commit
    • x86: uv cleanup · bdbcdd48
      Tejun Heo committed
      Impact: cleanup
      
      Make the following uv related cleanups.
      
      * collect visible uv related definitions and interfaces into uv/uv.h
        and use it.  this cleans up the messy situation where on 64bit, uv
        is defined properly, on 32bit generic it's dummy and on the rest
        undefined.  after this clean up, uv is defined on 64 and dummy on
        32.
      
      * update uv_flush_tlb_others() such that it takes cpumask of
        to-be-flushed cpus as argument, instead of that minus self, and
        returns yet-to-be-flushed cpumask, instead of modifying the passed
        in parameter.  this interface change will ease dummy implementation
        of uv_flush_tlb_others() and makes uv tlb flush related stuff
        defined in tlb_uv proper.
      Signed-off-by: Tejun Heo <tj@kernel.org>
  22. 16 Jan, 2009 1 commit
  23. 12 Jan, 2009 1 commit
    • x86: change flush_tlb_others to take a const struct cpumask · 4595f962
      Rusty Russell committed
      Impact: reduce stack usage, use new cpumask API.
      
      This is made a little more tricky by uv_flush_tlb_others which
      actually alters its argument, for an IPI to be sent to the remaining
      cpus in the mask.
      
      I solve this by allocating a cpumask_var_t for this case and falling back
      to IPI should this fail.
      
      To eliminate temporaries in the caller, all flush_tlb_others implementations
      now do the this-cpu-elimination step themselves.
      
      Note also the curious "cpus_or(f->flush_cpumask, cpumask, f->flush_cpumask)"
      which has been there since pre-git and yet f->flush_cpumask is always zero
      at this point.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Mike Travis <travis@sgi.com>
  24. 03 Jan, 2009 1 commit
    • x86, UV: remove erroneous BAU initialization · 46814dde
      Cliff Wickman committed
      Impact: fix crash on x86/UV
      
      UV is the SGI "UltraViolet" machine, which is x86_64 based.
      BAU is the "Broadcast Assist Unit", used for TLB shootdown in UV.
      
      This patch removes the allocation and initialization of an unused table.
      
      This table is left over from a development test mode.  It is unused in
      the present code.
      
      And it was incorrectly initialized: 8 entries allocated but 17 initialized,
      causing slab corruption.
      
      This patch should go into 2.6.27 and 2.6.28 as well as the current tree.
      
      Diffed against 2.6.28 (linux-next, 12/30/08)
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  25. 11 Nov, 2008 1 commit
  26. 22 Oct, 2008 1 commit
  27. 20 Aug, 2008 1 commit
  28. 08 Jul, 2008 3 commits
    • x86, SGI UV: uv_ptc_proc_write fix · e7eb8726
      Cliff Wickman committed
      Someone could write 0 bytes to /proc/sgi_uv/ptc_statistics,
      causing
        optstr[count - 1] = '\0';
      to write to who-knows-where.
      
      (Andi Kleen noticed this need from a patch I sent for
       similar code in the ia64 world (sn2_ptc_proc_write()).)
      
      (count less than zero is not possible here, as count is unsigned)
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, SGI UV: TLB shootdown using broadcast assist unit, v6 · cef53278
      Cliff Wickman committed
      v6 (6/19): close the security hole in uv_ptc_proc_write()
      
        > Found a potential security hole while doing that:
        > static ssize_t uv_ptc_proc_write(struct file *file, const char __user *user,
        >                              size_t count, loff_t *data)
        >     if (copy_from_user(optstr, user, count))
        >             return -EFAULT;
        >
        > is count guaranteed to never be larger than 64?
      
      is fixed below.
      
      It adds tlb_uv.o to the Makefile.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: mingo@elte.hu
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
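The two uv_ptc_proc_write() fixes above (the zero-length write and the unbounded count) amount to validating `count` before touching the buffer. A user-space sketch of that validation follows. It assumes a 64-byte option buffer, as the quoted review comment implies, uses memcpy() as a stand-in for copy_from_user(), and the names `OPTSTR_SIZE` and `copy_option()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OPTSTR_SIZE 64 /* assumed buffer size, from the quoted review */

/* Copy `count` bytes of user input into `optstr` and replace the last
 * byte (normally the newline) with a terminator.
 * Returns 0 on success, -1 if `count` is unusable. */
static int copy_option(char optstr[OPTSTR_SIZE], const char *user,
                       size_t count)
{
    /* count == 0 would make optstr[count - 1] index out of bounds;
     * count > OPTSTR_SIZE would overflow the buffer. */
    if (count == 0 || count > OPTSTR_SIZE)
        return -1;

    memcpy(optstr, user, count); /* stand-in for copy_from_user() */
    optstr[count - 1] = '\0';
    return 0;
}
```

Because `count` is unsigned, `count - 1` with a zero count wraps to a very large index rather than going negative, which is why the zero case needs its own check.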
    • SGI UV: TLB shootdown using broadcast assist unit, fix · d400524a
      Ingo Molnar committed
      fix:
      
      arch/x86/kernel/tlb_uv.c: In function ‘uv_table_bases_init’:
      arch/x86/kernel/tlb_uv.c:612: error: ‘bau_tabsp’ undeclared (first use in this function)
      arch/x86/kernel/tlb_uv.c:612: error: (Each undeclared identifier is reported only once
      arch/x86/kernel/tlb_uv.c:612: error: for each function it appears in.)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>