1. 04 Oct, 2008 1 commit
    • svcrdma: Add Fast Reg MR Data Types · 0d3ebb9a
      Tom Tucker committed
      Add data types to track Fast Reg Memory Regions. The core data type is
      svc_rdma_fastreg_mr that associates a device MR with a host kva and page
      list. A field is added to the WR context to keep track of the FRMR
      used to map the local memory for an RPC.
      
      An FRMR list and spin lock are added to the transport instance to keep
      track of all FRMR allocated for the transport. Also added are device
      capability flags to indicate what the memory registration
      capabilities are for the underlying device and whether or not fast
      memory registration is supported.
      Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
  2. 14 Sep, 2008 4 commits
    • memstick: fix MSProHG 8-bit interface mode support · 8e82f8c3
      Alex Dubov committed
      - 8-bit interface mode never worked properly.  The only adapter I have
        which supports the 8b mode (the Jmicron) had some problems with its
        clock wiring and they discovered it only now.  We also discovered that
        ProHG media is more sensitive to the ordering of initialization
        commands.
      
      - Make the driver fall back to highest supported mode instead of always
        falling back to serial.  The driver will attempt the switch to 8b mode
        for any new MSPro card, but not all of them support it.  Previously,
        these new cards ended up in serial mode, which is not the best idea
        (they work fine with 4b, after all).
      
      - Edit some macros for better conformance to Sony documentation
      Signed-off-by: Alex Dubov <oakad@yahoo.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: mark the correct zone as full when scanning zonelists · 5bead2a0
      Mel Gorman committed
      The iterator for_each_zone_zonelist() uses a struct zoneref *z cursor when
      scanning zonelists to keep track of where in the zonelist it is.  The
      zoneref that is returned corresponds to the next zone that is to be
      scanned, not the current one.  It was intended to be treated as an opaque
      list.
      
      When the page allocator is scanning a zonelist, it marks elements in the
      zonelist corresponding to zones that are temporarily full.  As the
      zonelist is being updated, it uses the cursor here:
      
        if (NUMA_BUILD)
              zlc_mark_zone_full(zonelist, z);
      
      This is intended to prevent rescanning in the near future but the zoneref
      cursor does not correspond to the zone that has been found to be full.
      This is an easy misunderstanding to make so this patch corrects the
      problem by changing zoneref cursor to be the current zone being scanned
      instead of the next one.
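
The difference between a cursor that names the next entry and one that names the current entry can be modelled in a few lines. This is a minimal sketch, not the kernel code: `zonelist_model`, `next_zone_old`, `next_zone_new` and `mark_zone_full` are hypothetical names, and zones are reduced to plain array indices.

```c
#include <stdbool.h>

#define NR_ZONES 4

/* Hypothetical stand-in for a zonelist: one "full" flag per zone. */
struct zonelist_model {
	bool full[NR_ZONES];
};

/*
 * Old-style iterator step: returns the index of the zone to scan and
 * leaves *cursor pointing at the NEXT entry.
 */
static int next_zone_old(int *cursor)
{
	int zone = *cursor;

	*cursor = zone + 1;
	return zone;
}

/*
 * New-style iterator step: the cursor itself names the zone being
 * scanned; the caller advances it separately before the next step.
 */
static int next_zone_new(const int *cursor)
{
	return *cursor;
}

/* Marks whatever entry the cursor names, as the allocator does. */
static void mark_zone_full(struct zonelist_model *zl, int cursor)
{
	if (cursor >= 0 && cursor < NR_ZONES)
		zl->full[cursor] = true;
}
```

With the old step, marking through the cursor after scanning zone 1 flags zone 2; with the new step the same call flags zone 1, which is the zone that was actually found to be full.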
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: <stable@kernel.org> [2.6.26.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/ioport.h: add missing macro argument for devm_release_* family · dea420ce
      Hiroshi DOYU committed
      akpm: these have no callers at this time, but they shall soon, so let's
      get them right.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Hiroshi DOYU <Hiroshi.DOYU@nokia.com>
      Cc: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [libata] LBA28/LBA48 off-by-one bug in ata.h · 97b697a1
      Taisuke Yamada committed
      I recently bought 3 HGST P7K500-series 500GB SATA drives and
      had trouble accessing the block right on the LBA28-LBA48 border.
      Here's how it fails (same for all 3 drives):
      
        # dd if=/dev/sdc bs=512 count=1 skip=268435455 > /dev/null
        dd: reading `/dev/sdc': Input/output error
        0+0 records in
        0+0 records out
        0 bytes (0 B) copied, 0.288033 seconds, 0.0 kB/s
        # dmesg
        ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
        ata1.00: BMDMA stat 0x25
        ata1.00: cmd c8/00:08:f8:ff:ff/00:00:00:00:00/ef tag 0 dma 4096 in
        res 51/04:08:f8:ff:ff/00:00:00:00:00/ef Emask 0x1 (device error)
        ata1.00: status: { DRDY ERR }
        ata1.00: error: { ABRT }
        ata1.00: configured for UDMA/33
        ata1: EH complete
        ...
      
      After some investigation, it turned out that this seems to be caused
      by a misinterpretation of the ATA specification on LBA28 access.
      The following is the code in question:
      
        === include/linux/ata.h ===
        static inline int lba_28_ok(u64 block, u32 n_block)
        {
          /* check the ending block number */
          return ((block + n_block - 1) < ((u64)1 << 28)) && (n_block <= 256);
        }
      
      The HGST drive (sometimes) fails on an LBA28 access of {block = 0xfffffff,
      n_block = 1}, and this behavior seems to be conformant. Other drives,
      including other HGST drives, are not that strict, though.
      
      From the ATA specification:
      (http://www.t13.org/Documents/UploadedDocuments/project/d1410r3b-ATA-ATAPI-6.pdf)
      
        8.15.29  Word (61:60): Total number of user addressable sectors
        This field contains a value that is one greater than the total number
        of user addressable sectors (see 6.2). The maximum value that shall
        be placed in this field is 0FFFFFFFh.
      
      So the driver shouldn't use the value 0xfffffff for an LBA28 request,
      as this exceeds the maximum user addressable sector. The logical maximum
      value for LBA28 is 0xffffffe.
      
      The obvious fix is to cut the "- 1" part, and the attached patch does just
      that. I've been using the patched kernel for about a month now, and
      the same fix has also been floating around the net for some time. So I
      believe this fix works reliably.
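
For reference, the check with the "- 1" removed can be sketched as follows. This is a user-space rendering of the change described above, assuming `uint64_t`/`uint32_t` in place of the kernel's `u64`/`u32`; it is not copied from the patch itself.

```c
#include <stdint.h>

/*
 * LBA28 range check with the "- 1" dropped: block + n_block must stay
 * below 2^28, so the last block touched is at most 0xffffffe.
 */
static inline int lba_28_ok(uint64_t block, uint32_t n_block)
{
	return ((block + n_block) < ((uint64_t)1 << 28)) && (n_block <= 256);
}
```

With this version a one-sector request at block 0xfffffff is no longer treated as LBA28-safe, while a one-sector request at 0xffffffe still is.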
      
      Just FYI, many Windows/Intel platform users also seem to be struck
      by this, and HGST has issued a note pointing to the Intel ICH8/9 driver.
      
        "28-bit LBA command is being used to access LBAs 29-bits in length"
      http://www.hitachigst.com/hddt/knowtree.nsf/cffe836ed7c12018862565b000530c74/b531b8bce8745fb78825740f00580e23
      
      Also, the *BSDs seem to have included a similar fix sometime around 2004,
      though I have not checked the exact portion of the code.
      Signed-off-by: Taisuke Yamada <tai@rakugaki.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
  3. 11 Sep, 2008 1 commit
  4. 07 Sep, 2008 1 commit
    • sched: arch_reinit_sched_domains() must destroy domains to force rebuild · dfb512ec
      Max Krasnyansky committed
      What I realized recently is that calling rebuild_sched_domains() in
      arch_reinit_sched_domains() by itself is not enough when cpusets are enabled.
      partition_sched_domains() code is trying to avoid unnecessary domain rebuilds
      and will not actually rebuild anything if new domain masks match the old ones.
      
      What this means is that doing
           echo 1 > /sys/devices/system/cpu/sched_mc_power_savings
      on a system with cpusets enabled will not take effect until something changes
      in the cpuset setup (i.e. new sets created or deleted).
      
      This patch restores the correct behaviour, where domains must be rebuilt in
      order to enable MC powersaving flags.
      
      Tested on a quad-core Core2 box with both CONFIG_CPUSETS and !CONFIG_CPUSETS.
      Also tested on a dual-core Core2 laptop. Lockdep is happy and things are working
      as expected.
      Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
      Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 06 Sep, 2008 3 commits
  6. 05 Sep, 2008 2 commits
    • Fix conditional export of kvh.h and a.out.h to userspace. · afbc8d8e
      Khem Raj committed
      Some architectures have moved the asm/ into arch/ and some have not.
      This patch checks for a.out.h and kvh.h in both places before exporting
      the corresponding file from linux/
      
      [dwmw2: simplified a little]
      Signed-off-by: Khem Raj <raj.khem@gmail.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • clockevents: prevent clockevent event_handler ending up handler_noop · 7c1e7689
      Venkatesh Pallipadi committed
      There is an ordering-related problem in the clockevents code, due to which
      clockevents_register_device() called after the tickless/highres switch
      will not work. The new clockevent ends up with clockevents_handle_noop as
      its event handler, resulting in no timer activity.
      
      The problematic path seems to be:
      
      * old device already has hrtimer_interrupt as the event_handler
      * new clockevent device registers with a higher rating
      * tick_check_new_device() is called
        * clockevents_exchange_device() gets called
          * old->event_handler is set to clockevents_handle_noop
        * tick_setup_device() is called for the new device
          * which sets new->event_handler using the old->event_handler which is noop.
      
      Change the ordering so that new device inherits the proper handler.
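
The ordering problem in the path above can be reproduced in a small model. This is only a sketch under simplifying assumptions: `clock_dev`, `handle_noop`, `handle_highres`, `switch_device_buggy` and `switch_device_fixed` are hypothetical names, and the exchange and setup steps are collapsed into one function each.

```c
#include <stddef.h>

struct clock_dev;
typedef void (*event_handler_t)(struct clock_dev *);

/* Hypothetical, cut-down clock event device. */
struct clock_dev {
	event_handler_t event_handler;
	int ticks; /* incremented by the real handler only */
};

static void handle_noop(struct clock_dev *dev)
{
	(void)dev; /* deliberately does nothing */
}

static void handle_highres(struct clock_dev *dev)
{
	dev->ticks++;
}

/* Buggy order: the old handler is overwritten before it is read. */
static void switch_device_buggy(struct clock_dev *olddev,
				struct clock_dev *newdev)
{
	olddev->event_handler = handle_noop;           /* exchange step */
	newdev->event_handler = olddev->event_handler; /* setup step copies noop */
}

/* Fixed order: read the old handler first, then neutralise the old device. */
static void switch_device_fixed(struct clock_dev *olddev,
				struct clock_dev *newdev)
{
	event_handler_t handler = olddev->event_handler;

	olddev->event_handler = handle_noop;
	newdev->event_handler = handler;
}
```

In the buggy variant the new device inherits the no-op handler and never ticks; in the fixed variant it inherits whatever handler the old device had before it was retired.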
      
      This is not an issue in the normal case, as most likely all the clockevent
      devices are set up before the highres switch. But it can potentially affect
      corner cases where HPET force detect happens after the highres switch.
      This was a problem with the HPET in MSI mode code that we have been
      experimenting with.
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 04 Sep, 2008 4 commits
  8. 03 Sep, 2008 3 commits
  9. 01 Sep, 2008 1 commit
    • debugobjects: fix lockdep warning · 673d62cc
      Vegard Nossum committed
      Daniel J. Blueman reported:
      > =======================================================
      > [ INFO: possible circular locking dependency detected ]
      > 2.6.27-rc4-224c #1
      > -------------------------------------------------------
      > hald/4680 is trying to acquire lock:
      >  (&n->list_lock){++..}, at: [<ffffffff802bfa26>] add_partial+0x26/0x80
      >
      > but task is already holding lock:
      >  (&obj_hash[i].lock){++..}, at: [<ffffffff8041cfdc>]
      > debug_object_free+0x5c/0x120
      
      We fix it by moving the actual freeing to outside the lock (the lock
      now only protects the list).
      
      The pool lock is also promoted to irq-safe (suggested by Dan). It's
      necessary because free_pool is now called outside the irq disabled
      region. So we need to protect against an interrupt handler which calls
      debug_object_init().
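
The "detach under the lock, free outside it" pattern described above can be sketched in user space. This is a minimal illustration, not the debugobjects code: `bucket`, `bucket_add`, `bucket_drain` and the boolean stand-in for the spinlock are all hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

struct obj {
	struct obj *next;
};

/* Hypothetical hash bucket: a flag stands in for the spinlock. */
struct bucket {
	bool locked;
	struct obj *head;
};

static int frees_under_lock; /* counts frees done while the lock is held */

static void bucket_lock(struct bucket *b)   { b->locked = true; }
static void bucket_unlock(struct bucket *b) { b->locked = false; }

static bool bucket_add(struct bucket *b)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return false;
	bucket_lock(b);
	o->next = b->head;
	b->head = o;
	bucket_unlock(b);
	return true;
}

static void free_obj(const struct bucket *b, struct obj *o)
{
	if (b->locked)
		frees_under_lock++;
	free(o);
}

/* Move the whole list to a private head under the lock, free afterwards. */
static int bucket_drain(struct bucket *b)
{
	struct obj *list, *next;
	int freed = 0;

	bucket_lock(b);
	list = b->head; /* the lock only protects the list itself */
	b->head = NULL;
	bucket_unlock(b);

	for (; list; list = next) {
		next = list->next;
		free_obj(b, list);
		freed++;
	}
	return freed;
}
```

Because the list is moved in one step, it is walked only once, and the allocator is never entered while the bucket lock is held.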
      
      [tglx@linutronix.de: added hlist_move_list helper to avoid looping
      		     through the list twice]
      Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
      Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  10. 30 Aug, 2008 2 commits
    • Resource handling: add 'insert_resource_expand_to_fit()' function · bef69ea0
      Linus Torvalds committed
      Not used anywhere yet, but this complements the existing plain
      'insert_resource()' functionality with a version that can expand the
      resource we are adding in order to fix up any conflicts it has with
      existing resources.
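
The idea of expanding a new resource until it no longer partially overlaps anything can be illustrated on a flat array of ranges. This is a sketch under a strong simplifying assumption: the kernel keeps resources in a tree, whereas `range`, `overlaps` and `expand_to_fit` here are hypothetical helpers over a plain array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range with inclusive bounds. */
struct range {
	uint64_t start;
	uint64_t end;
};

static int overlaps(const struct range *a, const struct range *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/* Grow *res until it fully covers every existing range it overlaps. */
static void expand_to_fit(struct range *res,
			  const struct range *existing, size_t n)
{
	int changed;

	do {
		changed = 0;
		for (size_t i = 0; i < n; i++) {
			if (!overlaps(res, &existing[i]))
				continue;
			if (existing[i].start < res->start) {
				res->start = existing[i].start;
				changed = 1;
			}
			if (existing[i].end > res->end) {
				res->end = existing[i].end;
				changed = 1;
			}
		}
	} while (changed); /* growing may create new overlaps, so repeat */
}
```

Ranges that do not touch the new resource are left alone; ranges that straddle one of its edges pull that edge outwards until they are fully contained.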
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • net: Unbreak userspace usage of linux/mroute.h · 7c19a3d2
      David S. Miller committed
      Nothing in linux/pim.h should be exported to userspace.
      
      This should fix the XORP build failure reported by
      Jose Calhariz, the Debian package maintainer.
      
      Nothing originally in linux/mroute.h was exported to userspace
      ever, but some of this stuff started to be when it was moved into
      this new linux/pim.h, and that was wrong.  If we didn't provide these
      definitions for 10 years we can reasonably expect that applications
      defined this stuff locally or used GLIBC headers providing the
      protocol definitions.  And as such the only result of this can
      be conflict and userland build breakage.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 28 Aug, 2008 1 commit
  12. 27 Aug, 2008 4 commits
  13. 25 Aug, 2008 3 commits
  14. 24 Aug, 2008 1 commit
  15. 22 Aug, 2008 6 commits
    • libata: Fix a large collection of DMA mode mismatches · b15b3eba
      Alan Cox committed
      Dave Müller sent a diff for the pata_oldpiix that highlighted a problem
      where a lot of the ATA drivers assume dma_mode == 0 means "no DMA" while
      the core code uses 0xFF.
      
      This turns out to have other consequences such as code doing >= XFER_UDMA_0
      also catching 0xFF as UDMAlots. Fortunately it doesn't generally affect
      set_dma_mode, although some drivers call back into their own set mode code
      from other points.
      
      Having been through the drivers I've added helpers for using_udma,
      using_mwdma and dma_enabled so that people don't open code ranges that may
      change (eg if UDMA8 appears somewhere).
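
Helpers of this kind might look like the sketch below. It is an illustration rather than the code from the patch: the `XFER_*` values are assumed to match include/linux/ata.h, while `DMA_MODE_NONE`, `ata_device_model` and the unprefixed helper names are hypothetical.

```c
#include <stdint.h>

/* Transfer mode encodings, assumed to match include/linux/ata.h. */
#define XFER_UDMA_7   0x47
#define XFER_UDMA_0   0x40
#define XFER_MW_DMA_4 0x24
#define XFER_MW_DMA_0 0x20
#define DMA_MODE_NONE 0xFF /* hypothetical name for the core's "no DMA" value */

/* Hypothetical, cut-down device: only the field the helpers look at. */
struct ata_device_model {
	uint8_t dma_mode;
};

/* True only for modes inside the multiword DMA range. */
static inline int using_mwdma(const struct ata_device_model *adev)
{
	return adev->dma_mode >= XFER_MW_DMA_0 &&
	       adev->dma_mode <= XFER_MW_DMA_4;
}

/* True only for modes inside the UDMA range, so 0xFF is not "UDMAlots". */
static inline int using_udma(const struct ata_device_model *adev)
{
	return adev->dma_mode >= XFER_UDMA_0 &&
	       adev->dma_mode <= XFER_UDMA_7;
}

/* "No DMA" is 0xFF in the core, not 0. */
static inline int dma_enabled(const struct ata_device_model *adev)
{
	return adev->dma_mode != DMA_MODE_NONE;
}
```

Bounding the range on both sides is what keeps the 0xFF sentinel from passing an open-coded `>= XFER_UDMA_0` test.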
      
      Thanks to David for the initial bits
      [and added fix for pata_oldpiix from and signed-off-by Dave Mueller
       <dave.mueller@gmx.ch>  -jg]
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • libata: restore SControl on detach · d127ea7b
      Tejun Heo committed
      Save SControl during probing and restore it on detach.  This prevents
      adjustments made by libata drivers to seep into the next driver which
      gets attached (be it a libata one or not).
      
      It's not clear whether SControl also needs to be restored on suspend.
      The next system to have control (ACPI or kexec'd kernel) would
      probably like to see the original SControl value but there's no
      guarantee that a link is gonna keep working after SControl is adjusted
      without a reset, and adding a reset and modified recovery cycle solely
      for this is overkill.  For now, do it only for detach.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • libata: implement no[hs]rst force params · 05944bdf
      Tejun Heo committed
      Implement force params nohrst, nosrst and norst.  This is to work
      around reset related problems and ease debugging.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • USB: Defer Set-Interface for suspended devices · 55151d7d
      Alan Stern committed
      This patch (as1128) fixes one of the problems related to the new PM
      infrastructure.  We are not allowed to register new child devices
      during the middle of a system sleep transition, but unbinding a USB
      driver causes the core to automatically install altsetting 0 and
      thereby create new endpoint pseudo-devices.
      
      The patch fixes this problem (and the related problem that installing
      altsetting 0 will fail if the device is suspended) by deferring the
      Set-Interface call until some later time when it is legal and can
      succeed.  Possible later times are: when a new driver is being probed
      for the interface, and when the interface is being resumed.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • driver core: add init_name to struct device · c906a48a
      Greg Kroah-Hartman committed
      This gives us a way to handle both the bus_id and init_name values being
      used for a while during the transition period.
      
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • dev_printk(): constify the `dev' argument · bf9ca69f
      Jean Delvare committed
      Add const markings to dev_name and dev_driver_string to make it clear that
      dev_printk doesn't modify dev.  This is a prerequisite to adding more
      const markings to other functions to make it clearer which functions can
      modify dev and which can't.
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  16. 21 Aug, 2008 3 commits
    • nohz: fix wrong event handler after online an offlined cpu · 3c4fbe5e
      Miao Xie committed
      On a tickless system (CONFIG_NO_HZ=y and CONFIG_HIGH_RES_TIMERS=n), after
      I brought an offlined cpu back online, I found this cpu's event handler was
      tick_handle_periodic, not tick_nohz_handler.
      
      After debugging, I found this bug was caused by the wrong tick mode.  The
      tick mode is not changed to NOHZ_MODE_INACTIVE when the cpu is offline.
      
      This patch fixes this bug.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • fbdefio: add set_page_dirty handler to deferred IO FB · d847471d
      Ian Campbell committed
      Fixes kernel BUG at lib/radix-tree.c:473.
      
      Previously the handler was incidentally provided by tmpfs but this was
      removed with:
      
        commit 14fcc23f
        Author: Hugh Dickins <hugh@veritas.com>
        Date:   Mon Jul 28 15:46:19 2008 -0700
      
          tmpfs: fix kernel BUG in shmem_delete_inode
      
      Relying on this behaviour was incorrect in any case, and the BUG also
      appeared when the device node was on an ext3 filesystem.
      
      v2: override a_ops at open() time rather than mmap() time to minimise
      races per AKPM's concerns.
      Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
      Cc: Jaya Kumar <jayakumar.lkml@gmail.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Johannes Weiner <hannes@saeurebad.de>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Kel Modderman <kel@otaku42.de>
      Cc: Markus Armbruster <armbru@redhat.com>
      Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
      Cc: <stable@kernel.org> [14fcc23f is in 2.6.25.14 and 2.6.26.1]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: dirty page tracking race fix · 479db0bf
      Nick Piggin committed
      There is a race with dirty page accounting where a page may not properly
      be accounted for.
      
      clear_page_dirty_for_io() calls page_mkclean; then TestClearPageDirty.
      
      page_mkclean walks the rmaps for that page, and for each one it cleans and
      write protects the pte if it was dirty.  It uses page_check_address to
      find the pte.  That function has a shortcut to avoid the ptl if the pte is
      not present.  Unfortunately, the pte can be switched to not-present then
      back to present by other code while holding the page table lock -- this
      should not be a signal for page_mkclean to ignore that pte, because it may
      be dirty.
      
      For example, powerpc64's set_pte_at will clear a previously present pte
      before setting it to the desired value.  There may also be other code in
      core mm or in arch which do similar things.
      
      The consequence of the bug is loss of data integrity due to msync, and
      loss of dirty page accounting accuracy.  XIP's __xip_unmap could easily
      also be unreliable (depending on the exact XIP locking scheme), which can
      lead to data corruption.
      
      Fix this by having an option to always take ptl to check the pte in
      page_check_address.
      
      It's possible to retain this optimization for page_referenced and
      try_to_unmap.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Jared Hulbert <jaredeh@gmail.com>
      Cc: Carsten Otte <cotte@freenet.de>
      Cc: Hugh Dickins <hugh@veritas.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>