1. 30 Apr, 2013 3 commits
    • mem hotunplug: fix kfree() of bootmem memory · ebff7d8f
      Authored by Yasuaki Ishimatsu
      When hot-removing memory that was present at boot time, the following messages are shown:
      
        kernel BUG at mm/slub.c:3409!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge stp llc ipmi_devintf ipmi_msghandler sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc vfat fat dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode pcspkr sg i2c_i801 lpc_ich mfd_core igb i2c_algo_bit i2c_core e1000e ptp pps_core tpm_infineon ioatdma dca sr_mod cdrom sd_mod crc_t10dif usb_storage megaraid_sas lpfc scsi_transport_fc scsi_tgt scsi_mod
        CPU 0
        Pid: 5091, comm: kworker/0:2 Tainted: G        W    3.9.0-rc6+ #15
        RIP: kfree+0x232/0x240
        Process kworker/0:2 (pid: 5091, threadinfo ffff88084678c000, task ffff88083928ca80)
        Call Trace:
          __release_region+0xd4/0xe0
          __remove_pages+0x52/0x110
          arch_remove_memory+0x89/0xd0
          remove_memory+0xc4/0x100
          acpi_memory_device_remove+0x6d/0xb1
          acpi_device_remove+0x89/0xab
          __device_release_driver+0x7c/0xf0
          device_release_driver+0x2f/0x50
          acpi_bus_device_detach+0x6c/0x70
          acpi_ns_walk_namespace+0x11a/0x250
          acpi_walk_namespace+0xee/0x137
          acpi_bus_trim+0x33/0x7a
          acpi_bus_hot_remove_device+0xc4/0x1a1
          acpi_os_execute_deferred+0x27/0x34
          process_one_work+0x1f7/0x590
          worker_thread+0x11a/0x370
          kthread+0xee/0x100
          ret_from_fork+0x7c/0xb0
        RIP  [<ffffffff811c41d2>] kfree+0x232/0x240
         RSP <ffff88084678d968>
      
      The messages appear because a resource structure allocated by bootmem
      is released with kfree().  So when we release a resource structure, we
      should check whether it was allocated by bootmem or not.
      
      But even if we know a resource structure was allocated by bootmem, we
      cannot release it, since SLxB cannot handle it.  So, in order to reuse
      such a resource structure, this patch remembers it in bootmem_resource
      as follows:
      
      When releasing a resource structure with free_resource(), free_resource()
      checks whether the resource structure was allocated by bootmem or not.
      If it was allocated by bootmem, free_resource() adds it to
      bootmem_resource.  If it was not allocated by bootmem, free_resource()
      releases it with kfree().
      
      And when getting a new resource structure with get_resource(),
      get_resource() checks whether bootmem_resource holds any released
      resource structures.  If there is a released resource structure,
      get_resource() returns it.  If there is no released resource
      structure, get_resource() returns a new resource structure allocated by
      kzalloc().
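
The reuse scheme described above can be sketched in user-space C. This is only an illustration, not the kernel code: the `from_bootmem` flag is a hypothetical stand-in for however the kernel tells boot-time allocations apart, `free()`/`calloc()` stand in for `kfree()`/`kzalloc()`, locking is omitted, and the getter uses the name `alloc_resource()` that the commit ended up with.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's struct resource. */
struct resource {
	unsigned long start, end;
	bool from_bootmem;        /* hypothetical flag, an assumption of this sketch */
	struct resource *sibling; /* reused here as the free-list link */
};

/* Released boot-time entries, kept for reuse instead of being freed. */
static struct resource *bootmem_resource_free;

static void free_resource(struct resource *res)
{
	if (!res)
		return;
	if (res->from_bootmem) {
		/* The heap allocator cannot take this back: park it on the list. */
		res->sibling = bootmem_resource_free;
		bootmem_resource_free = res;
	} else {
		free(res);
	}
}

static struct resource *alloc_resource(void)
{
	struct resource *res = bootmem_resource_free;

	if (res) {
		/* Reuse a parked entry: unlink it and clear its contents. */
		bootmem_resource_free = res->sibling;
		*res = (struct resource){ .from_bootmem = true };
		return res;
	}
	return calloc(1, sizeof(*res));
}
```

With this arrangement a boot-time entry is never handed to the heap allocator, and the first allocation after a release gets the parked entry back.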
      
      [akpm@linux-foundation.org: s/get_resource/alloc_resource/]
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Reviewed-by: Toshi Kani <toshi.kani@hp.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Ram Pai <linuxram@us.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • resource: add release_mem_region_adjustable() · 825f787b
      Authored by Toshi Kani
      Add release_mem_region_adjustable(), which releases a requested region
      from a currently busy memory resource.  This interface adjusts the
      matched memory resource accordingly even if the requested region does
      not match it exactly but still fits within it.
      
      This new interface is intended for memory hot-delete.  During bootup,
      memory resources are inserted from the boot descriptor table, such as
      EFI Memory Table and e820.  Each memory resource entry usually covers
      the whole contiguous memory range.  A memory hot-delete request, on the
      other hand, may target a particular range of a memory resource, and its
      size can be much smaller than the whole contiguous memory.  Since the
      existing release interfaces like __release_region() require a requested
      region to match a resource entry exactly, they do not allow a
      partial resource to be released.
      
      This new interface is restrictive (i.e.  release under certain
      conditions), which is consistent with other release interfaces,
      __release_region() and __release_resource().  Additional release
      conditions, such as an overlapping region to a resource entry, can be
      supported after they are confirmed as valid cases.
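
As a rough illustration of the adjustment, the sketch below handles a single busy entry in user-space C. The names `struct range`, `enum release_result` and `release_adjustable()` are hypothetical; the real interface works on the kernel's resource tree under its lock, which is not modeled here. A request that only overlaps the entry is rejected, matching the restrictive behavior described above.

```c
#include <stdint.h>

struct range { uint64_t start, end; };  /* inclusive bounds */

enum release_result {
	REL_INVALID,    /* request does not fit inside the entry */
	REL_WHOLE,      /* exact match: caller drops the entry */
	REL_TRIM_HEAD,  /* entry now starts after the released part */
	REL_TRIM_TAIL,  /* entry now ends before the released part */
	REL_SPLIT       /* entry keeps the low part, *upper gets the high part */
};

static enum release_result release_adjustable(struct range *res,
					      struct range *upper,
					      uint64_t start, uint64_t size)
{
	uint64_t end = start + size - 1;

	if (size == 0 || end < start)
		return REL_INVALID;
	if (start < res->start || end > res->end)
		return REL_INVALID;

	if (start == res->start && end == res->end)
		return REL_WHOLE;
	if (start == res->start) {
		res->start = end + 1;
		return REL_TRIM_HEAD;
	}
	if (end == res->end) {
		res->end = start - 1;
		return REL_TRIM_TAIL;
	}
	/* Released region sits in the middle: split into two entries. */
	upper->start = end + 1;
	upper->end = res->end;
	res->end = start - 1;
	return REL_SPLIT;
}
```

The split case is the one that needs a new entry, which is presumably why allocation context comes up in the notes that follow.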
      
      There is no change to the existing interfaces since their restriction is
      valid for I/O resources.
      
      [akpm@linux-foundation.org: use GFP_ATOMIC under write_lock()]
      [akpm@linux-foundation.org: switch back to GFP_KERNEL, less buggily]
      [akpm@linux-foundation.org: remove unneeded and wrong kfree(), per Toshi]
      Signed-off-by: Toshi Kani <toshi.kani@hp.com>
      Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Reviewed-by: Ram Pai <linuxram@us.ibm.com>
      Cc: T Makphaibulchoke <tmac@hp.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • resource: add __adjust_resource() for internal use · ae8e3a91
      Authored by Toshi Kani
      Add __adjust_resource(), which is called by adjust_resource() internally
      after the resource_lock is held.  There is no interface change to
      adjust_resource().  This change allows other functions to call
      __adjust_resource() internally while the resource_lock is held.
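
The locked-helper pattern this describes might look like the following user-space sketch. Everything here is an assumption made for illustration: a boolean stands in for resource_lock, `adjust_resource_locked()` stands in for `__adjust_resource()` (renamed because double-underscore names are reserved in user space), and the adjustment itself does none of the parent or sibling checks a real implementation would need.

```c
#include <stdbool.h>

struct resource { unsigned long start, end; };

/* Toy lock standing in for the kernel's resource_lock. */
static bool resource_lock_held;
static void lock_resources(void)   { resource_lock_held = true; }
static void unlock_resources(void) { resource_lock_held = false; }

/* Internal helper: the caller must already hold the lock. */
static int adjust_resource_locked(struct resource *res,
				  unsigned long start, unsigned long size)
{
	if (!resource_lock_held || size == 0)
		return -1;
	res->start = start;
	res->end = start + size - 1;
	return 0;
}

/* Public entry point: same interface as before, takes the lock itself. */
static int adjust_resource(struct resource *res,
			   unsigned long start, unsigned long size)
{
	int ret;

	lock_resources();
	ret = adjust_resource_locked(res, start, size);
	unlock_resources();
	return ret;
}
```

Other code that already holds the lock can call the helper directly instead of deadlocking on the public function.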
      Signed-off-by: Toshi Kani <toshi.kani@hp.com>
      Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Ram Pai <linuxram@us.ibm.com>
      Cc: T Makphaibulchoke <tmac@hp.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 06 Oct, 2012 1 commit
  3. 31 Jul, 2012 1 commit
    • resource: make sure requested range is included in the root range · 65fed8f6
      Authored by Octavian Purdila
      When the requested range is outside of the root range, the logic in
      __reserve_region_with_split will cause an infinite recursion which will
      overflow the stack, as seen in the warning below.
      
      This particular stack overflow was caused by requesting the
      (100000000-107ffffff) range while the root range was (0-ffffffff).  In
      this case __request_resource would return the whole root range as
      conflict range (i.e.  0-ffffffff).  Then, the logic in
      __reserve_region_with_split would continue the recursion requesting the
      new range as (conflict->end+1, end) which incidentally in this case
      equals the originally requested range.
      
      This patch aborts looking for a usable range when the request does not
      intersect with the root range.  When the request partially overlaps with
      the root range, it adjusts the request to fall within the root range and
      then continues with the new request.
      
      When the request is modified or aborted, errors and a stack trace are
      logged to allow catching the errors in the upper layers.
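
A minimal sketch of the abort-or-clamp step, in user-space C. `struct range` and `clamp_to_root()` are hypothetical names, and the error and stack-trace logging mentioned above is left out.

```c
#include <stdbool.h>
#include <stdint.h>

struct range { uint64_t start, end; };  /* inclusive bounds */

/*
 * Returns false when the request misses the root range entirely.
 * Otherwise trims *req so that it lies within root and returns true.
 */
static bool clamp_to_root(const struct range *root, struct range *req)
{
	if (req->start > root->end || req->end < root->start)
		return false;           /* no intersection: give up */
	if (req->start < root->start)
		req->start = root->start;
	if (req->end > root->end)
		req->end = root->end;
	return true;
}
```

With a root of 0-ffffffff, the request 100000000-107ffffff from the report is rejected up front, so the recursion never starts.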
      
      [    5.968374] WARNING: at kernel/sched.c:4129 sub_preempt_count+0x63/0x89()
      [    5.975150] Modules linked in:
      [    5.978184] Pid: 1, comm: swapper Not tainted 3.0.22-mid27-00004-gb72c817 #46
      [    5.985324] Call Trace:
      [    5.987759]  [<c1039dfc>] ? console_unlock+0x17b/0x18d
      [    5.992891]  [<c1039620>] warn_slowpath_common+0x48/0x5d
      [    5.998194]  [<c1031758>] ? sub_preempt_count+0x63/0x89
      [    6.003412]  [<c1039644>] warn_slowpath_null+0xf/0x13
      [    6.008453]  [<c1031758>] sub_preempt_count+0x63/0x89
      [    6.013499]  [<c14d60c4>] _raw_spin_unlock+0x27/0x3f
      [    6.018453]  [<c10c6349>] add_partial+0x36/0x3b
      [    6.022973]  [<c10c7c0a>] deactivate_slab+0x96/0xb4
      [    6.027842]  [<c14cf9d9>] __slab_alloc.isra.54.constprop.63+0x204/0x241
      [    6.034456]  [<c103f78f>] ? kzalloc.constprop.5+0x29/0x38
      [    6.039842]  [<c103f78f>] ? kzalloc.constprop.5+0x29/0x38
      [    6.045232]  [<c10c7dc9>] kmem_cache_alloc_trace+0x51/0xb0
      [    6.050710]  [<c103f78f>] ? kzalloc.constprop.5+0x29/0x38
      [    6.056100]  [<c103f78f>] kzalloc.constprop.5+0x29/0x38
      [    6.061320]  [<c17b45e9>] __reserve_region_with_split+0x1c/0xd1
      [    6.067230]  [<c17b4693>] __reserve_region_with_split+0xc6/0xd1
      ...
      [    7.179057]  [<c17b4693>] __reserve_region_with_split+0xc6/0xd1
      [    7.184970]  [<c17b4779>] reserve_region_with_split+0x30/0x42
      [    7.190709]  [<c17a8ebf>] e820_reserve_resources_late+0xd1/0xe9
      [    7.196623]  [<c17c9526>] pcibios_resource_survey+0x23/0x2a
      [    7.202184]  [<c17cad8a>] pcibios_init+0x23/0x35
      [    7.206789]  [<c17ca574>] pci_subsys_init+0x3f/0x44
      [    7.211659]  [<c1002088>] do_one_initcall+0x72/0x122
      [    7.216615]  [<c17ca535>] ? pci_legacy_init+0x3d/0x3d
      [    7.221659]  [<c17a27ff>] kernel_init+0xa6/0x118
      [    7.226265]  [<c17a2759>] ? start_kernel+0x334/0x334
      [    7.231223]  [<c14d7482>] kernel_thread_helper+0x6/0x10
      Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 14 Jun, 2012 1 commit
  5. 01 Jun, 2012 1 commit
  6. 04 Feb, 2012 1 commit
  7. 31 Oct, 2011 1 commit
  8. 30 Sep, 2011 1 commit
    • Resource: fix wrong resource window calculation · 47ea91b4
      Authored by Ram Pai
      __find_resource() incorrectly returns a resource window which overlaps
      an existing allocated window.  This happens when the parent's
      resource-window spans 0x00000000 to 0xffffffff and is entirely allocated
      to all its children resource-windows.
      
      __find_resource() looks for gaps in resource allocation among the
      children resource windows.  When it encounters the last child window it
      blindly tries the range next to one allocated to the last child.  Since
      the last child's window ends at 0xffffffff the calculation overflows,
      leading the algorithm to believe that any window in the range 0x0000000
      to 0xfffffff is available for allocation.  This leads to a conflicting
      window allocation.
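
The wrap-around can be reproduced with plain 32-bit arithmetic. The sketch below is not the kernel fix; it only contrasts a naive "next address" computation with a guarded one. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive: wraps to 0 when the last child ends at the top of the range. */
static uint32_t next_start_naive(uint32_t child_end)
{
	return (uint32_t)(child_end + 1u);
}

/* Guarded: reports "no room left" instead of wrapping around. */
static bool next_start_checked(uint32_t child_end, uint32_t parent_end,
			       uint32_t *next)
{
	if (child_end >= parent_end)
		return false;
	*next = child_end + 1u;
	return true;
}
```

When the last child ends at 0xffffffff, the naive version yields 0, which makes the whole parent window look free again.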
      
      Michal Ludvig reported this issue seen on his platform.  The following
      patch fixes the problem and has been verified by Michal.  I believe this
      bug has been there for ages.  It got exposed by git commit 2bbc6942
      ("PCI : ability to relocate assigned pci-resources")
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Tested-by: Michal Ludvig <mludvig@logix.net.nz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 31 Jul, 2011 1 commit
  10. 07 Jul, 2011 1 commit
  11. 18 Dec, 2010 2 commits
  12. 28 Oct, 2010 1 commit
  13. 27 Oct, 2010 5 commits
  14. 12 May, 2010 1 commit
    • resource: shared I/O region support · 8b6d043b
      Authored by Alan Cox
      SuperIO devices share regions and use lock/unlock operations to chip
      select.  We therefore need to be able to request a resource and wait for
      it to be freed by whichever other SuperIO device currently hogs it.
      Right now you have to poll which is horrible.
      
      Add a MUXED field to IO port resources. If the MUXED field is set on the
      resource and on the request (via request_muxed_region) then we block
      until the previous owner of the muxed resource releases their region.
      
      This allows us to implement proper resource sharing and locking for
      superio chips using code of the form
      
      enable_my_superio_dev() {
      	/* blocks until the current owner releases the muxed region */
      	request_muxed_region(0x44, 0x02, "superio:watchdog");
      	outb() .. sequence to enable chip
      }
      
      disable_my_superio_dev() {
      	outb() .. sequence to disable chip
      	/* lets the next waiter in */
      	release_region(0x44, 0x02);
      }
      Signed-off-by: Giel van Schijndel <me@mortis.eu>
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  15. 24 Mar, 2010 1 commit
  16. 03 Mar, 2010 1 commit
  17. 02 Mar, 2010 1 commit
    • resource: Fix generic page_is_ram() for partial RAM pages · 37b99dd5
      Authored by Wu Fengguang
      The System RAM walk shall skip partial RAM pages and avoid calling
      func() on them, so that page_is_ram() returns 0 for a partial RAM page.
      
      In particular, it shall not call func() with len=0.
      This fixes a boot-time bug reported by Sachin and root-caused by Thomas:
      
      > >>> WARNING: at arch/x86/mm/ioremap.c:111 __ioremap_caller+0x169/0x2f1()
      > >>> Hardware name: BladeCenter LS21 -[79716AA]-
      > >>> Modules linked in:
      > >>> Pid: 0, comm: swapper Not tainted 2.6.33-git6-autotest #1
      > >>> Call Trace:
      > >>> [<ffffffff81047cff>] ? __ioremap_caller+0x169/0x2f1
      > >>> [<ffffffff81063b7d>] warn_slowpath_common+0x77/0xa4
      > >>> [<ffffffff81063bb9>] warn_slowpath_null+0xf/0x11
      > >>> [<ffffffff81047cff>] __ioremap_caller+0x169/0x2f1
      > >>> [<ffffffff813747a3>] ? acpi_os_map_memory+0x12/0x1b
      > >>> [<ffffffff81047f10>] ioremap_nocache+0x12/0x14
      > >>> [<ffffffff813747a3>] acpi_os_map_memory+0x12/0x1b
      > >>> [<ffffffff81282fa0>] acpi_tb_verify_table+0x29/0x5b
      > >>> [<ffffffff812827f0>] acpi_load_tables+0x39/0x15a
      > >>> [<ffffffff8191c8f8>] acpi_early_init+0x60/0xf5
      > >>> [<ffffffff818f2cad>] start_kernel+0x397/0x3a7
      > >>> [<ffffffff818f2295>] x86_64_start_reservations+0xa5/0xa9
      > >>> [<ffffffff818f237a>] x86_64_start_kernel+0xe1/0xe8
      > >>> ---[ end trace 4eaa2a86a8e2da22 ]---
      > >>> ioremap reserve_memtype failed -22
      
      The return code is -EINVAL, so it failed in the is_ram check, which is
      not too surprising
      
      > BIOS-provided physical RAM map:
      >  BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
      >  BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
      >  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
      >  BIOS-e820: 0000000000100000 - 00000000cffa3900 (usable)
      >  BIOS-e820: 00000000cffa3900 - 00000000cffa7400 (ACPI data)
      
      The ACPI data is not starting on a page boundary and neither does the
      usable RAM area end on a page boundary. Very useful !
      
      > ACPI: DSDT 00000000cffa3900 036CE (v01 IBM    SERLEWIS 00001000 INTL 20060912)
      
      ACPI is trying to map DSDT at cffa3900, which results in a check
      vs. cffa3000 which is the relevant page boundary. The generic is_ram
      check correctly identifies that as RAM because it's in the usable
      resource area. The old e820 based is_ram check does not take
      overlapping resource areas into account. That's why it works.
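
The rounding rule behind the fix can be sketched as follows. This is a user-space illustration with assumed details: 4 KiB pages, a byte range whose end is exclusive (as in the e820 listing above), and hypothetical names `walk_whole_pages()` and `count_pages()`.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

typedef int (*walk_fn)(uint64_t start_pfn, uint64_t nr_pages, void *arg);

/*
 * Call func() on the whole pages inside the byte range [start, end).
 * Pages only partly covered by the range are skipped.
 */
static int walk_whole_pages(uint64_t start, uint64_t end, void *arg,
			    walk_fn func)
{
	uint64_t pfn = (start + PAGE_SIZE - 1) >> PAGE_SHIFT; /* round up */
	uint64_t end_pfn = end >> PAGE_SHIFT;                  /* round down */

	if (end_pfn <= pfn)
		return 0;  /* only partial pages: never call func() */
	return func(pfn, end_pfn - pfn, arg);
}

/* Example callback: add up the number of pages seen. */
static int count_pages(uint64_t start_pfn, uint64_t nr_pages, void *arg)
{
	(void)start_pfn;
	*(uint64_t *)arg += nr_pages;
	return 1;
}
```

For the usable range ending at cffa3900, the walk stops at page frame cffa3, so the page that also holds ACPI data is not reported as RAM.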
      
      CC: Sachin Sant <sachinp@in.ibm.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      LKML-Reference: <20100301135551.GA9998@localhost>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  18. 23 Feb, 2010 3 commits
  19. 02 Feb, 2010 2 commits
  20. 22 Dec, 2009 1 commit
  21. 05 Nov, 2009 1 commit
  22. 23 Sep, 2009 1 commit
    • walk system ram range · 908eedc6
      Authored by KAMEZAWA Hiroyuki
      Originally, walk_memory_resource() was introduced to traverse all memory
      of "System RAM" in order to detect memory hotplug/unplug ranges.  For that
      purpose, the flags IORESOURCE_MEM|IORESOURCE_BUSY were used, and this was
      enough for memory hotplug.
      
      But when used for another purpose, /proc/kcore, this may include some
      firmware areas marked as IORESOURCE_BUSY | IORESOURCE_MEM.  This patch
      makes the check strict so that it finds only busy "System RAM".
      
      Note: PPC64 keeps its own walk_memory_resource(), which walks through
      ppc64's lmb information.  Because the old kclist_add() is called per lmb,
      this patch makes no difference in behavior there.
      
      This patch also removes the CONFIG_MEMORY_HOTPLUG check from this function.
      Because pfn_valid() only shows "there is a memmap or not" and cannot be
      used for "there is physical memory or not", this function is generally
      useful for scanning physical memory ranges.
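
One way to picture the stricter check is a predicate that requires both flags and the "System RAM" name. The sketch below is a user-space illustration, not the kernel walker: `is_busy_system_ram()` is a hypothetical name, and the flag values are defined locally for the example.

```c
#include <stdbool.h>
#include <string.h>

/* Flag values defined locally for this sketch. */
#define IORESOURCE_MEM  0x00000200UL
#define IORESOURCE_BUSY 0x80000000UL

struct resource {
	const char *name;
	unsigned long flags;
};

/* Flags alone also match firmware areas, so the name is checked too. */
static bool is_busy_system_ram(const struct resource *res)
{
	const unsigned long want = IORESOURCE_MEM | IORESOURCE_BUSY;

	if ((res->flags & want) != want)
		return false;
	return res->name && strcmp(res->name, "System RAM") == 0;
}
```

A busy memory entry with any other name, such as a firmware table, passes the flag test but fails the name test.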
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: WANG Cong <xiyou.wangcong@gmail.com>
      Cc: Américo Wang <xiyou.wangcong@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 01 Jul, 2009 1 commit
  24. 19 Apr, 2009 1 commit
  25. 16 Jan, 2009 1 commit
  26. 08 Jan, 2009 1 commit
    • resource: allow MMIO exclusivity for device drivers · e8de1481
      Authored by Arjan van de Ven
      Device drivers that use pci_request_regions() (and similar APIs) have a
      reasonable expectation that they are the only ones accessing their device.
      As part of the e1000e hunt, we were afraid that some userland (X or some
      bootsplash stuff) was mapping the MMIO region that the driver thought it
      had exclusively via /dev/mem or via various sysfs resource mappings.
      
      This patch adds the option for device drivers to add their reserved
      regions to the "banned from /dev/mem use" list, so now both kernel memory
      and device-exclusive MMIO regions are banned.
      NOTE: This is only active when CONFIG_STRICT_DEVMEM is set.
      
      In addition to the config option, a kernel parameter iomem=relaxed is
      provided for the cases where developers want to diagnose, in the field,
      driver issues from userspace.
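
The resulting decision can be pictured as a small predicate. This sketch rests on assumptions: a single region instead of the resource tree, a `strict_checks` boolean standing for "CONFIG_STRICT_DEVMEM is set and iomem=relaxed was not given", a hypothetical function name `devmem_blocked()`, and locally defined flag values.

```c
#include <stdbool.h>

/* Flag values defined locally for this sketch. */
#define IORESOURCE_BUSY      0x80000000UL
#define IORESOURCE_EXCLUSIVE 0x08000000UL

struct resource { unsigned long start, end, flags; };

/* True when a /dev/mem mapping of addr should be refused. */
static bool devmem_blocked(const struct resource *res, unsigned long addr,
			   bool strict_checks)
{
	if (!strict_checks)
		return false;  /* option off, or relaxed on the command line */
	if (addr < res->start || addr > res->end)
		return false;  /* address is outside this region */
	return (res->flags & IORESOURCE_BUSY) &&
	       (res->flags & IORESOURCE_EXCLUSIVE);
}
```

A region a driver has reserved without asking for exclusivity stays mappable, which keeps existing userspace tools working.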
      Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  27. 17 Dec, 2008 1 commit
    • resources: skip sanity check of busy resources · 3ac52669
      Authored by Arjan van de Ven
      Impact: reduce false positives in iomem_map_sanity_check()
      
      Some drivers (vesafb) only map/reserve a portion of a resource.
      If then some other driver comes in and maps the whole resource,
      the current code WARN_ON's. This is not the intent of the checks
      in iomem_map_sanity_check(); rather these checks want to
      warn when crossing *hardware* resources only.
      
      This patch skips BUSY resources as suggested by Linus.
      
      Note: having two drivers talk to the same hardware at the same
      time is obviously not optimal behavior, but that's a separate story.
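
A simplified picture of the check with the BUSY exemption, in user-space C. `should_warn()` is a hypothetical name, only one resource is examined rather than the whole tree, and the flag value is defined locally for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag value defined locally for this sketch. */
#define IORESOURCE_BUSY 0x80000000UL

struct resource { uint64_t start, end; unsigned long flags; };

/* True when mapping [addr, addr + size - 1] should warn about res. */
static bool should_warn(const struct resource *res, uint64_t addr,
			uint64_t size)
{
	uint64_t last = addr + size - 1;

	if (res->start > last || res->end < addr)
		return false;  /* no overlap at all */
	if (res->start <= addr && res->end >= last)
		return false;  /* mapping lies entirely inside the resource */
	if (res->flags & IORESOURCE_BUSY)
		return false;  /* driver reservation, not a hardware boundary */
	return true;
}
```

A mapping that spills over a partial driver reservation no longer warns, while one that crosses a non-busy hardware resource still does.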
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  28. 02 Nov, 2008 1 commit
  29. 29 Oct, 2008 1 commit
    • resources: fix x86info results ioremap.c:226 __ioremap_caller+0xf2/0x2d6() WARNINGs · d68612b2
      Authored by Suresh Siddha
      Impact: avoid false-positive WARN_ON()
      
      Andi Kleen reported:
      > When running x86info on a 2.6.27-git8 system I get
      >
      > resource map sanity check conflict: 0x9e000 0x9efff 0x10000 0x9e7ff System RAM
      > ------------[ cut here ]------------
      > WARNING: at /home/lsrc/linux/arch/x86/mm/ioremap.c:226 __ioremap_caller+0xf2/0x2d6()
      > ...
      
      Some of the pages below the 1MB ISA addresses will be shared typically by both
      BIOS and system usable RAM. For example:
      	BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
      	BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
      
      x86info reads the low physical address using /dev/mem, which internally
      uses ioremap() for accessing non RAM pages. ioremap() of such low
      pages conflicts with multiple resource entities leading to the
      above warning.
      
      Change the iomem_map_sanity_check() to allow mapping a page spanning multiple
      resource entities (minimum granularity that one can map is a page anyhow).
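
The page-granularity comparison can be sketched like this, assuming 4 KiB pages. `map_conflicts()` and `struct range` are hypothetical names, and only a single resource is checked rather than the whole tree.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PFN_DOWN(x) ((uint64_t)(x) >> PAGE_SHIFT)

struct range { uint64_t start, end; };  /* inclusive bounds */

/* True when mapping [addr, addr + size - 1] should warn about res. */
static bool map_conflicts(const struct range *res, uint64_t addr,
			  uint64_t size)
{
	uint64_t last = addr + size - 1;

	if (res->start > last || res->end < addr)
		return false;  /* no overlap at all */
	if (PFN_DOWN(res->start) <= PFN_DOWN(addr) &&
	    PFN_DOWN(res->end) >= PFN_DOWN(last))
		return false;  /* covered when compared page by page */
	return true;
}
```

With the values from the report, mapping 0x9e000-0x9efff against the System RAM entry 0x10000-0x9e7ff no longer counts as a conflict, because both ends fall in page frame 0x9e.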
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  30. 24 Oct, 2008 1 commit