1. 16 Nov, 2008 2 commits
    • Fix inotify watch removal/umount races · 8f7b0ba1
      Committed by Al Viro
      Inotify watch removals suck violently.
      
      To kick the watch out we need (in this order) inode->inotify_mutex and
      ih->mutex.  That's fine if we have a hold on inode; however, for all
      other cases we need to make damn sure we don't race with umount.  We can
      *NOT* just grab a reference to a watch - inotify_unmount_inodes() will
      happily sail past it and we'll end with reference to inode potentially
      outliving its superblock.
      
      Ideally we just want to grab an active reference to the superblock if we
      can; that will make sure we won't go into inotify_unmount_inodes() until
      we are done.  Cleanup is just deactivate_super().
      
      However, that leaves a messy case - what if we *are* racing with
      umount() and active references to superblock can't be acquired anymore?
      We can bump ->s_count, grab ->s_umount, which will almost certainly wait
      until the superblock is shut down and the watch in question is pining
      for fjords.  That's fine, but there is a problem - we might have hit the
      window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e.
      the moment when superblock is past the point of no return and is heading
      for shutdown) and the moment when deactivate_super() acquires
      ->s_umount.
      
      We could just do drop_super(), yield() and retry, but that's rather
      antisocial and this stuff is luser-triggerable.  OTOH, having grabbed
      ->s_umount and having found that we'd got there first (i.e.  that
      ->s_root is non-NULL) we know that we won't race with
      inotify_unmount_inodes().
      
      So we could grab a reference to watch and do the rest as above, just
      with drop_super() instead of deactivate_super(), right? Wrong.  We had
      to drop ih->mutex before we could grab ->s_umount.  So the watch
      could've been gone already.
      
      That still can be dealt with - we need to save watch->wd, do idr_find()
      and compare its result with our pointer.  If they match, we either have
      the damn thing still alive or we'd lost not one but two races at once,
      the watch had been killed and a new one got created with the same ->wd
      at the same address.  That couldn't have happened in inotify_destroy(),
      but inotify_rm_wd() could run into that.  Still, "new one got created"
      is not a problem - we have every right to kill it or leave it alone,
      whatever's more convenient.
      
      So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
      the "grab it and kill it" check.  If it's been our original watch, we are
      fine; if it's a newcomer - never mind, just pretend that we'd won the
      race and kill the fscker anyway; we are safe since we know that its
      superblock won't be going away.
      
      And yes, this is far beyond mere "not very pretty"; so's the entire
      concept of inotify to start with.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: Greg KH <greg@kroah.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
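The "save ->wd, look it up again, compare pointers" check from the inotify message above can be sketched in userspace C. This is only an illustration under stated assumptions: the fixed-size table stands in for the kernel's idr, and every name here (`MAX_WD`, `lookup_wd`, `watch_still_killable`, and the trimmed-down structs) is hypothetical, not the kernel's code.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_WD 8	/* hypothetical table size; the kernel uses an idr */

/* Trimmed-down stand-ins for the kernel structures named in the message. */
struct super_block { int id; };
struct inode { struct super_block *i_sb; };
struct watch { int wd; struct inode *inode; };
struct handle { struct watch *slots[MAX_WD]; };

/* Stand-in for idr_find(): map a watch descriptor to its watch, or NULL. */
static struct watch *lookup_wd(const struct handle *ih, int wd)
{
	if (wd < 0 || wd >= MAX_WD)
		return NULL;
	return ih->slots[wd];
}

/*
 * After the lock was dropped and retaken, decide whether the watch we
 * remembered may be killed: the descriptor must still resolve to the same
 * pointer, and that watch must sit on the superblock we are holding.
 */
static bool watch_still_killable(const struct handle *ih, int saved_wd,
				 const struct watch *watch,
				 const struct super_block *sb)
{
	const struct watch *found = lookup_wd(ih, saved_wd);

	return found != NULL && found == watch && found->inode->i_sb == sb;
}
```

As the message notes, a match may also mean a new watch was created with the same descriptor at the same address; the check deliberately treats that case the same way.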
    • Add 'pr_fmt()' format modifier to pr_xyz macros. · d091c2f5
      Committed by Martin Schwidefsky
      A common reason for device drivers to implement their own printk macros
      is the lack of a printk prefix with the standard pr_xyz macros.
      Introduce a pr_fmt() macro that every pr_xyz macro applies to its
      format string.
      
      The most common use of the pr_fmt macro would be to add the name of the
      device driver to all pr_xyz messages in a source file.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
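The prefixing idea in the pr_fmt() message can be tried out in userspace. In this minimal sketch, `pr_info_to_buf` and `report_irq` are hypothetical stand-ins (a real pr_xyz macro calls printk), the "mydriver: " prefix is an invented example, and `##__VA_ARGS__` is a GNU C extension.

```c
#include <stdio.h>

/* A source file defines pr_fmt() once, before using the logging macros. */
#define pr_fmt(fmt) "mydriver: " fmt

/*
 * Userspace stand-in for a pr_xyz macro: it writes into a buffer instead of
 * the kernel log, but applies pr_fmt() to the format string the same way.
 */
#define pr_info_to_buf(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), ##__VA_ARGS__)

/* Example caller: the message comes out prefixed with the driver name. */
static int report_irq(char *buf, size_t size, int irq)
{
	return pr_info_to_buf(buf, size, "using irq %d\n", irq);
}
```

Because the prefix is glued on by string-literal concatenation at compile time, pr_fmt() only works with literal format strings.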
  2. 14 Nov, 2008 2 commits
  3. 13 Nov, 2008 4 commits
  4. 12 Nov, 2008 3 commits
  5. 11 Nov, 2008 3 commits
  6. 10 Nov, 2008 2 commits
  7. 09 Nov, 2008 1 commit
  8. 08 Nov, 2008 1 commit
    • ACPI video: if no ACPI backlight support, use vendor drivers · c3d6de69
      Committed by Thomas Renninger
      If an ACPI graphics device supports backlight brightness functions (cf. the
      latest ACPI spec, Appendix B), let the ACPI video driver control the backlight
      and switch backlight control off in vendor-specific ACPI drivers (asus_acpi,
      thinkpad_acpi, eeepc, fujitsu_laptop, msi_laptop, sony_laptop, acer-wmi).
      
      Currently it is possible to load the above drivers and have both the video
      driver and a vendor-specific ACPI driver poke at the brightness HW registers,
      which is bad.
      
      This patch provides the basic support to check for BIOS capabilities before
      driver loading time. Driver specific modifications are in separate follow up
      patches.
      
      "acpi_backlight=vendor"
      	Prefer vendor driver over ACPI driver for backlight.
      "acpi_backlight=video" (default)
      	Prefer ACPI driver over vendor driver for backlight.
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Acked-by: Zhang Rui <rui.zhang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
  9. 07 Nov, 2008 9 commits
  10. 06 Nov, 2008 4 commits
    • cpumask: introduce new API, without changing anything · 2d3854a3
      Committed by Rusty Russell
      Impact: introduce new APIs
      
      We want to deprecate cpumasks on the stack, as we are headed for
      gynormous numbers of CPUs.  Eventually, we want to head towards an
      undefined 'struct cpumask' so they can never be declared on stack.
      
      1) New cpumask functions which take pointers instead of copies.
         (cpus_* -> cpumask_*)
      
      2) Several new helpers to reduce requirements for temporary cpumasks
         (cpumask_first_and, cpumask_next_and, cpumask_any_and)
      
      3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
         (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)
      
      4) 'struct cpumask' for explicitness and to mark new-style code.
      
      5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
         not NR_CPUS for time efficiency and for smaller dynamic allocations
         in future.
      
      6) cpumask_copy() so we can allocate less than a full cpumask eventually
         (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
         definition eventually.
      
      7) work_on_cpu() helper for running a task on a CPU, rather than saving the
         old cpumask for the current thread and manipulating it.
      
      8) smp_call_function_many() which is smp_call_function_mask() except
         taking a cpumask pointer.
      
      Note that this patch simply introduces the new functions and leaves
      the obsolescent ones in place.  This is to simplify the transition
      patches.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
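Points 1, 2 and 5 of the cpumask message (pointer arguments, an "and" helper that avoids a temporary mask, iteration bounded by a runtime CPU count) can be sketched in userspace. All `sketch_` names and the chosen constants are assumptions made for illustration; the real definitions live in the kernel's cpumask header and differ in detail.

```c
#include <limits.h>
#include <stddef.h>

#define NR_CPUS 128	/* compile-time upper bound (illustrative value) */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

struct sketch_cpumask { unsigned long bits[MASK_LONGS]; };

/* Runtime count of possible CPUs; iteration stops here, not at NR_CPUS. */
static unsigned int sketch_nr_cpu_ids = 8;

/* Masks are always passed by pointer, never copied onto the stack. */
static void sketch_cpumask_set_cpu(unsigned int cpu, struct sketch_cpumask *m)
{
	m->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

/*
 * First CPU set in both masks, found without building a temporary
 * "a & b" mask. Returns sketch_nr_cpu_ids when there is no such CPU.
 */
static unsigned int sketch_cpumask_first_and(const struct sketch_cpumask *a,
					     const struct sketch_cpumask *b)
{
	unsigned int cpu;

	for (cpu = 0; cpu < sketch_nr_cpu_ids; cpu++) {
		unsigned long bit = 1UL << (cpu % BITS_PER_LONG);

		if (a->bits[cpu / BITS_PER_LONG] &
		    b->bits[cpu / BITS_PER_LONG] & bit)
			return cpu;
	}
	return sketch_nr_cpu_ids;
}
```

The bit-by-bit loop keeps the sketch short; a production implementation would scan a word at a time.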
    • Add round_jiffies_up and related routines · 9c133c46
      Committed by Alan Stern
      This patch (as1158b) adds round_jiffies_up() and friends.  These
      routines work like the analogous round_jiffies() functions, except
      that they will never round down.
      
      The new routines will be useful for timeouts where we don't care
      exactly when the timer expires, provided it doesn't expire too soon.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
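The difference between rounding a timeout and rounding it up only can be shown with plain integer arithmetic. This sketch is not the kernel's implementation: the tick rate `SKETCH_HZ` and both function names are hypothetical, and the nearest-second rule is a simplification chosen for contrast.

```c
#define SKETCH_HZ 250UL	/* hypothetical tick rate: ticks per second */

/* Round to the nearest whole second; this may move the expiry earlier. */
static unsigned long sketch_round_nearest(unsigned long j)
{
	unsigned long rem = j % SKETCH_HZ;

	return rem < SKETCH_HZ / 2 ? j - rem : j - rem + SKETCH_HZ;
}

/* Round to a whole second without ever moving the expiry earlier. */
static unsigned long sketch_round_up(unsigned long j)
{
	unsigned long rem = j % SKETCH_HZ;

	return rem ? j - rem + SKETCH_HZ : j;
}
```

A timeout that must not fire too soon would use the second form: the result is never smaller than the input.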
    • bio: define __BIOVEC_PHYS_MERGEABLE · f92131c3
      Committed by Jeremy Fitzhardinge
      Define __BIOVEC_PHYS_MERGEABLE as the default implementation of
      BIOVEC_PHYS_MERGEABLE, so that it's available for reuse within an
      arch-specific definition of BIOVEC_PHYS_MERGEABLE.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • sched: re-tune balancing · 9fcd18c9
      Committed by Ingo Molnar
      Impact: improve wakeup affinity on NUMA systems, tweak SMP systems
      
      Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain
      balancing defaults on NUMA and SMP systems.
      
      Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason
      why we would not want to have wakeup affinity across nodes as well.
      (we already do this in the standard NUMA template.)
      
      lat_ctx on a NUMA box is particularly happy about this change:
      
      before:
      
       |   phoenix:~/l> ./lat_ctx -s 0 2
       |   "size=0k ovr=2.60
       |   2 5.70
      
      after:
      
       |   phoenix:~/l> ./lat_ctx -s 0 2
       |   "size=0k ovr=2.65
       |   2 2.07
      
      a 2.75x speedup.
      
      pipe-test is similarly happy about it too:
      
       |  phoenix:~/sched-tests> ./pipe-test
       |   18.26 usecs/loop.
       |   14.70 usecs/loop.
       |   14.38 usecs/loop.
       |   10.55 usecs/loop.              # +WAKE_AFFINE on domain0+domain1
       |   8.63 usecs/loop.
       |   8.59 usecs/loop.
       |   9.03 usecs/loop.
       |   8.94 usecs/loop.
       |   8.96 usecs/loop.
       |   8.63 usecs/loop.
      
      Also:
      
       - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings)
       - enable SD_WAKE_BALANCE on SMP domains
      
      Sysbench+postgresql improves across the board, quite significantly:
      
                 .28-rc3-11474e2c  .28-rc3-11474e2c-tune
      -------------------------------------------------
          1:             571              688    +17.08%
          2:            1236             1206    -2.55%
          4:            2381             2642    +9.89%
          8:            4958             5164    +3.99%
         16:            9580             9574    -0.07%
         32:            7128             8118    +12.20%
         64:            7342             8266    +11.18%
        128:            7342             8064    +8.95%
        256:            7519             7884    +4.62%
        512:            7350             7731    +4.93%
      -------------------------------------------------
        SUM:           55412            59341    +6.62%
      
      So it's a win both for the runup portion, the peak area and the tail.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 05 Nov, 2008 2 commits
    • [MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4) · 467622ef
      Committed by Eric W. Biederman
      For "unlock" cycles to 16bit devices in 8bit compatibility mode we need
      to use the byte addresses 0xaaa and 0x555. These effectively match
      the word addresses 0x555 and 0x2aa, except the latter has its low bit set.
      
      Most chips don't care about the value of the 'A-1' pin in x8 mode,
      but some -- like the ST M29W320D -- do. So we need to be careful to
      set it where appropriate.
      
      cfi_send_gen_cmd is only ever passed addresses where the low byte
      is 0x00, 0x55 or 0xaa. Of those, only addresses ending 0xaa are
      affected by this patch, by masking in the extra low bit when the device
      is known to be in compatibility mode.
      
      [dwmw2: Do it only when (cmd_ofs & 0xff) == 0xaa]
      v4: Fix stupid typo in cfi_build_cmd_addr that failed to compile.
          I'm writing this patch way too late at night.
      v3: Bring all of the work back into cfi_build_cmd_addr,
          including calling of map_bankwidth(map) and cfi_interleave(cfi),
          so every caller doesn't need to.
      v2: Only modify the address if our device_type is larger than our
          bus width.
      
      Cc: stable@kernel.org
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
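The address arithmetic in the CFI message can be checked with a small standalone function. This is a sketch modelled on the description above, not the patched kernel code: the name `sketch_build_cmd_addr` and its flat parameter list are assumptions (the message says the real cfi_build_cmd_addr derives bank width and interleave from the map and cfi arguments).

```c
#include <stdint.h>

/*
 * Scale a command offset (a word address) to a bus byte address.
 * device_type and bankwidth are in bytes; interleave is the chip count.
 */
static uint32_t sketch_build_cmd_addr(uint32_t cmd_ofs,
				      unsigned int device_type,
				      unsigned int interleave,
				      unsigned int bankwidth)
{
	uint32_t addr = cmd_ofs * device_type * interleave;

	/*
	 * Compatibility mode: the device is wider than the bus it sits on.
	 * Only offsets whose low byte is 0xaa get the extra low bit, which
	 * continues the alternating bit pattern (0x554 becomes 0x555).
	 */
	if (device_type * interleave > bankwidth && (cmd_ofs & 0xff) == 0xaa)
		addr |= (device_type >> 1) * interleave;

	return addr;
}
```

For an x16 chip on an 8-bit bus (`device_type` 2, `interleave` 1, `bankwidth` 1), word address 0x555 maps to byte address 0xaaa unchanged, while 0x2aa maps to 0x554 and then gains the low bit to become 0x555, matching the two unlock addresses named in the message.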
    • net: fix packet socket delivery in rx irq handler · 9b22ea56
      Committed by Patrick McHardy
      The changes to deliver hardware accelerated VLAN packets to packet
      sockets (commit bc1d0411) caused a warning for non-NAPI drivers.
      The __vlan_hwaccel_rx() function is called directly from the driver's
      RX function; for non-NAPI drivers that means it's still in RX IRQ
      context:
      
      [   27.779463] ------------[ cut here ]------------
      [   27.779509] WARNING: at kernel/softirq.c:136 local_bh_enable+0x37/0x81()
      ...
      [   27.782520]  [<c0264755>] netif_nit_deliver+0x5b/0x75
      [   27.782590]  [<c02bba83>] __vlan_hwaccel_rx+0x79/0x162
      [   27.782664]  [<f8851c1d>] atl1_intr+0x9a9/0xa7c [atl1]
      [   27.782738]  [<c0155b17>] handle_IRQ_event+0x23/0x51
      [   27.782808]  [<c015692e>] handle_edge_irq+0xc2/0x102
      [   27.782878]  [<c0105fd5>] do_IRQ+0x4d/0x64
      
      Split hardware accelerated VLAN reception into two parts to fix this:
      
      - __vlan_hwaccel_rx just stores the VLAN TCI and performs the VLAN
        device lookup, then calls netif_receive_skb()/netif_rx()
      
      - vlan_hwaccel_do_receive(), which is invoked by netif_receive_skb()
        in softirq context, performs the real reception and delivery to
        packet sockets.
      Reported-and-tested-by: Ramon Casellas <ramon.casellas@cttc.es>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 04 Nov, 2008 3 commits
  13. 03 Nov, 2008 1 commit
  14. 31 Oct, 2008 3 commits
    • resources: add io-mapping functions to dynamically map large device apertures · 9663f2e6
      Committed by Keith Packard
      Impact: add new generic io_map_*() APIs
      
      Graphics devices have large PCI apertures which would consume a significant
      fraction of a 32-bit address space if mapped during driver initialization.
      Using ioremap at runtime is impractical as it is too slow.
      
      This new set of interfaces uses atomic mappings on 32-bit processors and a
      large static mapping on 64-bit processors to provide reasonable 32-bit
      performance and optimal 64-bit performance.
      
      The current implementation sits atop the io_map_atomic fixmap-based
      mechanism for 32-bit processors.
      
      This includes some editorial suggestions from Randy Dunlap for
      Documentation/io-mapping.txt
      Signed-off-by: Keith Packard <keithp@keithp.com>
      Signed-off-by: Eric Anholt <eric@anholt.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • net: delete excess kernel-doc notation · ad1d967c
      Committed by Randy Dunlap
      Remove excess kernel-doc function parameters from networking header
      & driver files:
      
      Warning(include/net/sock.h:946): Excess function parameter or struct member 'sk' description in 'sk_filter_release'
      Warning(include/linux/netdevice.h:1545): Excess function parameter or struct member 'cpu' description in 'netif_tx_lock'
      Warning(drivers/net/wan/z85230.c:712): Excess function parameter or struct member 'regs' description in 'z8530_interrupt'
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • libata: add whitelist for devices with known good pata-sata bridges · 9ce8e307
      Committed by Jens Axboe
      libata currently imposes a UDMA5 max transfer rate and 200 sector max
      transfer size for SATA devices that sit behind a pata-sata bridge. Lots
      of devices have known good bridges that don't need this limit applied.
      The MTRON SSD disks are such devices. Transfer rates are increased by
      20-30% with the restriction removed.
      
      So add a "blacklist" entry for the MTRON devices, with a flag indicating
      that the bridge is known good.
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
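The "blacklist entry with a known-good flag" arrangement from the libata message amounts to a quirk table consulted before the bridge limits are applied. The sketch below is illustrative only: the table layout, the prefix matching, the flag name `SKETCH_BRIDGE_OK` and the functions are all hypothetical and do not reproduce libata's actual table or matching rules.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_BRIDGE_OK (1u << 0)	/* hypothetical "bridge is known good" flag */

struct sketch_quirk {
	const char *model_prefix;	/* matched against the start of the model string */
	unsigned int flags;
};

/* One shared quirk table; an entry may relax a limit instead of adding one. */
static const struct sketch_quirk sketch_quirks[] = {
	{ "MTRON", SKETCH_BRIDGE_OK },
};

/* Return the flags of the first matching entry, or 0 if nothing matches. */
static unsigned int sketch_lookup_flags(const char *model)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_quirks) / sizeof(sketch_quirks[0]); i++) {
		const char *p = sketch_quirks[i].model_prefix;

		if (strncmp(model, p, strlen(p)) == 0)
			return sketch_quirks[i].flags;
	}
	return 0;
}

/* Bridge limits (UDMA5, 200 sectors) apply unless the device is flagged. */
static int sketch_needs_bridge_limits(const char *model)
{
	return !(sketch_lookup_flags(model) & SKETCH_BRIDGE_OK);
}
```

Reusing the existing table for a positive flag is why the message puts "blacklist" in quotes: the entry marks the device as good rather than broken.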