1. 01 Nov, 2007 1 commit
    • V
      [POWERPC] 4xx: Workaround for the 440EP(x)/GR(x) processors identical PVR issue. · d1dfc35d
      Valentine Barshak committed
      PowerPC 440EP(x) and 440GR(x) processors have the same PVR values, since
      they have identical cores. However, FPU is not supported on GR(x) and
      enabling APU instruction broadcast in the CCR0 register (to enable FPU)
      may cause unpredictable results. There's no safe way to detect FPU
      support at runtime. This patch provides a workaround for the issue.
      
      We use a POWER6 "logical PVR approach". First, we identify all EP(x)
      and GR(x) processors as GR(x) ones (which is safe). Then we check
      the device tree cpu path. If we have an EP(x) processor entry,
      we call identify_cpu again with PVR | 0x8. This bit is always 0
      in the real PVR. This way we enable FPU only for 440EP(x).
      Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com>
      Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
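The two-pass identification described in this commit can be sketched in plain C. This is a minimal userspace sketch under stated assumptions, not the kernel code: `struct cpu_entry`, `identify_cpu_sketch`, `select_cpu`, `DEMO_PVR` and the `dt_says_ep` flag are hypothetical names, and the PVR value is made up. Only the `PVR | 0x8` idea comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a cputable entry. */
struct cpu_entry {
    uint32_t pvr_mask;
    uint32_t pvr_value;
    const char *name;
    int has_fpu;
};

#define DEMO_PVR        0x12340000u  /* made-up value shared by EP(x) and GR(x) */
#define LOGICAL_EP_BIT  0x8u         /* always 0 in the real PVR, per the commit */

static const struct cpu_entry demo_table[] = {
    { 0xffffffffu, DEMO_PVR | LOGICAL_EP_BIT, "440EP(x)", 1 },
    { 0xffffffffu, DEMO_PVR,                  "440GR(x)", 0 },
};

/* Return the first table entry whose masked PVR matches, or NULL. */
const struct cpu_entry *identify_cpu_sketch(uint32_t pvr)
{
    for (size_t i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++)
        if ((pvr & demo_table[i].pvr_mask) == demo_table[i].pvr_value)
            return &demo_table[i];
    return NULL;
}

/*
 * First pass: the real PVR always resolves to the GR(x) entry (no FPU).
 * Second pass: only if the device tree names an EP(x) CPU do we look up
 * the "logical" PVR with bit 0x8 set, which selects the FPU-capable entry.
 */
const struct cpu_entry *select_cpu(uint32_t real_pvr, int dt_says_ep)
{
    assert((real_pvr & LOGICAL_EP_BIT) == 0);
    const struct cpu_entry *spec = identify_cpu_sketch(real_pvr);
    if (dt_says_ep)
        spec = identify_cpu_sketch(real_pvr | LOGICAL_EP_BIT);
    return spec;
}
```

Calling `select_cpu(DEMO_PVR, 0)` yields the GR(x) entry, and `select_cpu(DEMO_PVR, 1)` yields the EP(x) entry, so the FPU is enabled only when the device tree asks for it.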
  2. 23 Oct, 2007 1 commit
  3. 20 Oct, 2007 5 commits
  4. 19 Oct, 2007 1 commit
    • M
      powerpc: add scaled time accounting · 4603ac18
      Michael Neuling committed
      This adds POWERPC specific hooks for scaled time accounting.
      
      POWER6 includes a SPURR register.  The SPURR is based off the PURR register
      but is scaled based on CPU frequency and issue rates.  This gives a more
      accurate account of the instructions used per task.  The PURR and timebase
      will be constant relative to the wall clock, irrespective of the CPU
      frequency.
      
      This implementation reads the SPURR register in account_system_vtime which
      is only called on context switch and on hard and soft irq entry and exit.
      The percentage of user and system time is then estimated using the ratio of
      these accounted by the PURR.  If the SPURR is not present, the PURR is read.
      
      An earlier implementation of this patch read the SPURR whenever the PURR
      was read, which included the system call entry and exit path.
      Unfortunately this showed a performance regression on lmbench runs, so was
      re-implemented.
      
      I've included the lmbench results here when run bare metal on POWER6.  1st
      column is the unpatched results.  2nd column is the results using the below
      patch and the 3rd is the % diff of these results from the base.  4th and
      5th columns are the results and % difference from the base using the older
      patch (SPURR read in syscall entry/exit path).
      
                                    Base        Scaled-Acct     SPURR-in-syscall
                                   Result      Result  % diff    Result % diff
      Simple syscall:              0.3086      0.3086  0.0000    0.3452 11.8600
      Simple read:                 0.4591      0.4671  1.7425    0.5044 9.86713
      Simple write:                0.4364      0.4366  0.0458    0.4731 8.40971
      Simple stat:                 2.0055      2.0295  1.1967    2.0669 3.06158
      Simple fstat:                0.5962      0.5876  -1.442    0.6368 6.80979
      Simple open/close:           3.1283      3.1009  -0.875    3.2088 2.57328
      Select on 10 fd's:           0.8554      0.8457  -1.133    0.8667 1.32101
      Select on 100 fd's:          3.5292      3.6329  2.9383    3.6664 3.88756
      Select on 250 fd's:          7.9097      8.1881  3.5197    8.2242 3.97613
      Select on 500 fd's:          15.2659     15.836  3.7357    15.873 3.97814
      Select on 10 tcp fd's:       0.9576      0.9416  -1.670    0.9752 1.83792
      Select on 100 tcp fd's:      7.248       7.2254  -0.311    7.2685 0.28283
      Select on 250 tcp fd's:      17.7742     17.707  -0.375    17.749 -0.1406
      Select on 500 tcp fd's:      35.4258     35.25   -0.496    35.286 -0.3929
      Signal handler installation: 0.6131      0.6075  -0.913    0.647  5.52927
      Signal handler overhead:     2.0919      2.1078  0.7600    2.1831 4.35967
      Protection fault:            0.7345      0.7478  1.8107    0.8031 9.33968
      Pipe latency:                33.006      16.398  -50.31    33.475 1.42368
      AF_UNIX sock stream latency: 14.5093     30.910  113.03    30.715 111.692
      Process fork+exit:           219.8       222.8   1.3648    229.37 4.35623
      Process fork+execve:         876.14      873.28  -0.32     868.66 -0.8533
      Process fork+/bin/sh -c:     2830        2876.5  1.6431    2958   4.52296
      File /var/tmp/XXX write bw:  1193497     1195536 0.1708    118657 -0.5799
      Pagefaults on /var/tmp/XXX:  3.1272      3.2117  2.7020    3.2521 3.99398
      
      Also, kernel compile times show no difference with this patch applied.
      
      [pbadari@us.ibm.com: Avoid unnecessary PURR reading]
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Jay Lan <jlan@engr.sgi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
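The estimation step in this commit (splitting the SPURR delta between user and system time using the ratio already accounted by the PURR) might look roughly like the sketch below. The names `struct scaled_split` and `split_scaled_time` are hypothetical, and charging the whole delta to system time when no PURR time was accounted is an arbitrary choice made for this sketch, not something the commit states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: scaled time attributed to each mode. */
struct scaled_split {
    uint64_t user;
    uint64_t system;
};

/*
 * Divide spurr_delta between user and system in the same proportion as
 * the PURR-accounted user and system time. Assumes the product
 * spurr_delta * user_purr fits in 64 bits (fine for a sketch).
 */
struct scaled_split split_scaled_time(uint64_t spurr_delta,
                                      uint64_t user_purr,
                                      uint64_t system_purr)
{
    struct scaled_split out = { 0, 0 };
    uint64_t total = user_purr + system_purr;

    if (total == 0) {
        /* Assumption for this sketch: charge everything to system. */
        out.system = spurr_delta;
        return out;
    }

    out.user = spurr_delta * user_purr / total;
    out.system = spurr_delta - out.user;   /* remainder, so nothing is lost */
    assert(out.user + out.system == spurr_delta);
    return out;
}
```

For example, a SPURR delta of 1000 with PURR-accounted user and system times of 30 and 10 is split 750 to user and 250 to system.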
  5. 17 Oct, 2007 17 commits
  6. 16 Oct, 2007 1 commit
  7. 13 Oct, 2007 1 commit
    • K
      Driver core: change add_uevent_var to use a struct · 7eff2e7a
      Kay Sievers committed
      This changes the uevent buffer functions to use a struct instead of a
      long list of parameters. It no longer requires the caller to do the
      proper buffer termination and size accounting, which is currently wrong
      in some places. It fixes a known bug where parts of the uevent
      environment are overwritten because of wrong index calculations.
      
      Many thanks to Mathieu Desnoyers for finding bugs and improving the
      error handling.
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
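To illustrate why moving the buffer state into a struct removes the caller-side bookkeeping, here is a small userspace sketch. It is loosely modeled on the idea in this commit and is not the kernel implementation: `struct uevent_env_sketch`, `env_add_var`, and the two size constants are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define ENV_MAX_VARS 32     /* illustrative limits, not the kernel's */
#define ENV_BUF_SIZE 2048

/* All state the caller used to track by hand lives in one place. */
struct uevent_env_sketch {
    char *envp[ENV_MAX_VARS];  /* pointers into buf, kept NULL-terminated */
    int envp_idx;              /* number of variables stored so far */
    char buf[ENV_BUF_SIZE];    /* packed "KEY=value\0" strings */
    int buflen;                /* bytes of buf in use */
};

/* Append one formatted variable; returns 0 on success, -1 if it does not fit. */
int env_add_var(struct uevent_env_sketch *env, const char *fmt, ...)
{
    va_list args;
    size_t room = sizeof(env->buf) - (size_t)env->buflen;
    int len;

    if (env->envp_idx >= ENV_MAX_VARS - 1)  /* keep a slot for the NULL */
        return -1;

    va_start(args, fmt);
    len = vsnprintf(&env->buf[env->buflen], room, fmt, args);
    va_end(args);

    if (len < 0 || (size_t)len >= room)     /* error or truncated output */
        return -1;

    env->envp[env->envp_idx++] = &env->buf[env->buflen];
    env->envp[env->envp_idx] = NULL;        /* termination handled here */
    env->buflen += len + 1;                 /* size accounting handled here */
    return 0;
}
```

A caller zero-initializes the struct and then simply calls `env_add_var(&env, "ACTION=%s", "add")`. The index and length arithmetic that was easy to get wrong at each call site now exists in exactly one function.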
  8. 12 Oct, 2007 5 commits
  9. 11 Oct, 2007 8 commits
    • P
      [POWERPC] Make clockevents work on PPC601 processors · cdec12ae
      Paul Mackerras committed
      In testing the new clocksource and clockevent code on a PPC601
      processor, I discovered that the clockevent multiplier value for the
      decrementer clockevent was overflowing.  Because the RTCL register in
      the 601 effectively counts at 1GHz (it doesn't actually, but it
      increases by 128 every 128ns), and the shift value was 32, that meant
      the multiplier value had to be 2^32, which won't fit in an unsigned
      long on 32-bit.  The same problem would arise on any platform where
      the timebase frequency was 1GHz or more (not that we actually have any
      such machines today).
      
      This fixes it by reducing the shift value to 16.  Doing the
      calculations with a resolution of 2^-16 nanoseconds (15 femtoseconds)
      should be quite adequate.  :)
      Signed-off-by: Paul Mackerras <paulus@samba.org>
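The overflow described in this commit can be checked with a few lines of arithmetic. The sketch below assumes the conventional scaled-math relation `mult = (freq << shift) / NSEC_PER_SEC`, so that `ticks = (ns * mult) >> shift`. The function names `clockevent_mult` and `fits_in_u32` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/*
 * Multiplier that converts nanoseconds to timer ticks via
 * ticks = (ns * mult) >> shift. Computed in 64 bits so the result can be
 * inspected even when it would not fit a 32-bit unsigned long.
 */
uint64_t clockevent_mult(uint64_t freq_hz, unsigned shift)
{
    /* Keeps (freq_hz << shift) within 64 bits. */
    assert(freq_hz <= UINT32_MAX && shift <= 32);
    return (freq_hz << shift) / NSEC_PER_SEC;
}

/* Would the multiplier fit in a 32-bit unsigned long? */
int fits_in_u32(uint64_t mult)
{
    return mult <= UINT32_MAX;
}
```

With an effective 1 GHz rate and a shift of 32, the multiplier is exactly 2^32 = 4294967296, one more than a 32-bit unsigned long can hold. With a shift of 16 it is 65536, and the resolution is 1/65536 ns, which is roughly the 15 femtoseconds mentioned in the commit.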
    • P
      [POWERPC] Prevent decrementer clockevents from firing early · d968014b
      Paul Mackerras committed
      On old powermacs, we sometimes set the decrementer to 1 in order to
      trigger a decrementer interrupt, which we use to handle an interrupt
      that was pending at the time when it was re-enabled.  This was causing
      the decrementer clock event device to call the event function for the
      next event early, which was causing problems when high-res timers were
      not enabled.
      
      This fixes the problem by recording the timebase value at which the
      next event should occur, and checking the current timebase against the
      recorded value in timer_interrupt.  If it isn't time for the next
      event, it just reprograms the decrementer and returns.
      
      This also subtracts 1 from the value stored into the decrementer,
      which is appropriate because the decrementer interrupts on the
      transition from 0 to -1, not when the decrementer reaches 0.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
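The guard against early decrementer interrupts can be modeled with a small simulation. This is a sketch under assumptions, not the kernel's `timer_interrupt`: `struct dec_sim`, `program_next_event`, and `timer_interrupt_sketch` are hypothetical names, and the timebase is a plain field that the caller advances by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated timer state (hypothetical, for illustration only). */
struct dec_sim {
    uint64_t timebase;    /* current timebase value */
    uint64_t next_tb;     /* timebase value at which the next event is due */
    int32_t decrementer;  /* value last written to the decrementer */
    int events_run;       /* how many times the event handler ran */
};

/* Schedule the next event delta_ticks from now. */
void program_next_event(struct dec_sim *s, uint32_t delta_ticks)
{
    assert(delta_ticks >= 1 && delta_ticks <= INT32_MAX);
    s->next_tb = s->timebase + delta_ticks;
    /* The interrupt fires on the 0 -> -1 transition, hence the -1. */
    s->decrementer = (int32_t)delta_ticks - 1;
}

/* Returns 1 if the event handler ran, 0 if the interrupt came early. */
int timer_interrupt_sketch(struct dec_sim *s)
{
    if (s->timebase < s->next_tb) {
        /* Too early: reprogram for the remaining time and return. */
        uint64_t remaining = s->next_tb - s->timebase;
        s->decrementer = (int32_t)remaining - 1;
        return 0;
    }
    s->events_run++;  /* stand-in for calling the clockevent handler */
    return 1;
}
```

If an event is scheduled 50 ticks ahead and a forced interrupt arrives after only 20 ticks, the sketch reprograms the decrementer with the remaining 30 ticks (minus 1) and does not run the event handler.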
    • P
      [POWERPC] Fix performance monitor on machines with logical PVR · 87a72f9e
      Paul Mackerras committed
      Some IBM machines supply a "logical" PVR (processor version register)
      value in the device tree in the cpu nodes rather than the real PVR.
      This is used for instance to indicate that the processors in a POWER6
      partition have been configured by the hypervisor to run in POWER5+
      mode rather than POWER6 mode.  To cope with this, we call identify_cpu
      a second time with the logical PVR value (the first call is with the
      real PVR value in the very early setup code).
      
      However, POWER5+ machines can also supply a logical PVR value, and use
      the same value (the value that indicates a v2.04 architecture
      compliant processor).  This causes problems for code that uses the
      performance monitor (such as oprofile), because the PMU registers are
      different in POWER6 (even in POWER5+ mode) from the real POWER5+.
      
      This change works around this problem by taking out the PMU
      information from the cputable entries for the logical PVR values, and
      changing identify_cpu so that the second call to it won't overwrite
      the PMU information that was established by the first call (the one
      with the real PVR), but does update the other fields.  Specifically,
      if the cputable entry for the logical PVR value has num_pmcs == 0,
      none of the PMU-related fields get used.
      
      So that we can create a mixed cputable entry, we now make cur_cpu_spec
      point to a single static struct cpu_spec, and copy stuff from
      cpu_specs[i] into it.  This has the side-effect that we can now make
      cpu_specs[] be initdata.
      
      Ultimately it would be good to move the PMU-related fields out to a
      separate structure, pointed to by the cputable entries, and change
      identify_cpu so that it saves the PMU info pointer, copies the whole
      structure, and restores the PMU info pointer, rather than identify_cpu
      having to list all the fields that are *not* PMU-related.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
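The "mixed" cputable entry described in this commit can be sketched as follows. This is a simplified userspace model, not the kernel source: the `_sketch` names, the reduced set of fields, the made-up PVR values, and the single `oprofile_cpu_type` string standing in for the PMU-related fields are all assumptions for illustration. The `num_pmcs == 0` convention is the one the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical version of a cputable entry. */
struct cpu_spec_sketch {
    uint32_t pvr_mask;
    uint32_t pvr_value;
    const char *cpu_name;
    unsigned long cpu_features;
    /* PMU-related fields */
    unsigned int num_pmcs;
    const char *oprofile_cpu_type;
};

#define DEMO_REAL_PVR    0x00aa0000u  /* made-up "real" PVR family */
#define DEMO_LOGICAL_PVR 0x0f000001u  /* made-up "logical" PVR */

static const struct cpu_spec_sketch cpu_specs_sketch[] = {
    { 0xffff0000u, DEMO_REAL_PVR,    "demo real CPU",    0x1, 6, "demo/real" },
    { 0xffffffffu, DEMO_LOGICAL_PVR, "demo logical CPU", 0x2, 0, NULL },
};

/* Single static copy that the current-spec pointer refers to. */
static struct cpu_spec_sketch the_cpu_spec;
struct cpu_spec_sketch *cur_cpu_spec_sketch;

struct cpu_spec_sketch *identify_cpu_sketch(uint32_t pvr)
{
    size_t n = sizeof(cpu_specs_sketch) / sizeof(cpu_specs_sketch[0]);

    for (size_t i = 0; i < n; i++) {
        const struct cpu_spec_sketch *s = &cpu_specs_sketch[i];

        if ((pvr & s->pvr_mask) != s->pvr_value)
            continue;

        if (the_cpu_spec.num_pmcs != 0 && s->num_pmcs == 0) {
            /* Logical-PVR override: keep the PMU info from the real PVR. */
            the_cpu_spec.pvr_mask = s->pvr_mask;
            the_cpu_spec.pvr_value = s->pvr_value;
            the_cpu_spec.cpu_name = s->cpu_name;
            the_cpu_spec.cpu_features = s->cpu_features;
        } else {
            the_cpu_spec = *s;  /* first call: copy the whole entry */
        }
        cur_cpu_spec_sketch = &the_cpu_spec;
        assert(cur_cpu_spec_sketch->cpu_name != NULL);
        return cur_cpu_spec_sketch;
    }
    return NULL;
}
```

Listing the non-PMU fields by hand in the override branch is the awkward part that the commit's final paragraph proposes to remove by moving the PMU fields into a separate structure.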
    • S
      [POWERPC] iSeries: Move detection of virtual cdroms · b833b481
      Stephen Rothwell committed
      Now we will only have entries in the device tree for the actual existing
      devices (including their OS/400 properties).  This way viocd.c gets all
      the information about the devices from the device tree.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • S
    • S
      [POWERPC] Remove iSeries_vio_dev · 1670b2b2
      Stephen Rothwell committed
      It was only being used to carry around dma_iommu_ops and vio_iommu_table
      which we can use directly instead.  This also means that vio_bus_device
      doesn't need to refer to them either.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • S
      [POWERPC] Clean up vio.h · b707f517
      Stephen Rothwell committed
      Remove vio_dma_ops declaration (since it no longer exists) and some
      unused fields from struct vio_driver.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • G
      [POWERPC] Only call ppc_md.setup_arch() if it is provided · 38db7e74
      Grant Likely committed
      This allows platforms which don't have anything to do at setup_arch time
      (like a bunch of the 4xx platforms) to eliminate an empty setup_arch hook.
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
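The optional-hook pattern from this commit amounts to a NULL check before the indirect call. The sketch below is a standalone illustration: `struct machdep_calls_sketch`, `run_setup_arch`, `demo_setup_arch`, and the `setup_calls` counter are hypothetical names, and only the idea of skipping a missing `setup_arch` hook comes from the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced platform descriptor with one optional hook. */
struct machdep_calls_sketch {
    const char *name;
    void (*setup_arch)(void);  /* may be NULL if the platform needs nothing */
};

int setup_calls;  /* counts hook invocations, for demonstration */

void demo_setup_arch(void)
{
    setup_calls++;
}

/* Call the hook only if the platform provides one; returns 1 if called. */
int run_setup_arch(const struct machdep_calls_sketch *md)
{
    assert(md != NULL);
    if (md->setup_arch) {
        md->setup_arch();
        return 1;
    }
    return 0;
}
```

A platform with nothing to do at this stage can leave `setup_arch` as NULL instead of supplying an empty function.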