1. 05 Mar, 2014 3 commits
    • iommu/vt-d: Avoid caching stale domain_device_info when hot-removing PCI device · 7e7dfab7
      Jiang Liu committed
      Function device_notifier() in intel-iommu.c only removes the
      domain_device_info data structure associated with a PCI device when
      handling PCI device driver unbinding events. If a PCI device has never
      been bound to a PCI device driver, there won't be a
      BUS_NOTIFY_UNBOUND_DRIVER event when hot-removing the PCI device, so
      the associated domain_device_info data structure may be leaked.
      
      On the other hand, if iommu_pass_through is enabled, function
      iommu_prepare_static_identity_mapping() will create a domain_device_info
      data structure for each PCIe to PCIe bridge and PCIe endpoint,
      regardless of whether drivers are associated with those PCIe devices.
      So those domain_device_info data structures will be leaked when
      hot-removing the associated PCIe devices if they have never been bound
      to any PCI device driver.
      
      To make matters worse, this is not only a memory leak but also a
      stale-information caching bug, because the memory is kept in the
      device_domain_list and domain->devices lists.
      
      Fix the bug by trying to remove the domain_device_info data structure
      when handling the BUS_NOTIFY_DEL_DEVICE event.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: Joerg Roedel <joro@8bytes.org>
    • iommu/vt-d: Avoid caching stale domain_device_info and fix memory leak · 816997d0
      Jiang Liu committed
      Function device_notifier() in intel-iommu.c fails to remove
      device_domain_info data structures for PCI devices if they are
      associated with si_domain because iommu_no_mapping() returns true
      for those PCI devices. This causes a memory leak and caching of
      stale information in the domain->devices list.
      
      So fix the issue by not calling iommu_no_mapping() and skipping the
      check of iommu_pass_through.
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: Joerg Roedel <joro@8bytes.org>
    • iommu/vt-d: Avoid double free of g_iommus on error recovery path · 989d51fc
      Jiang Liu committed
      Array 'g_iommus' may be freed twice on the error recovery path, once in
      function init_dmars() and once in free_dmar_iommu(), which causes
      random system crashes such as the one below.
      
      [    6.774301] IOMMU: dmar init failed
      [    6.778310] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
      [    6.785615] software IO TLB [mem 0x76bcf000-0x7abcf000] (64MB) mapped at [ffff880076bcf000-ffff88007abcefff]
      [    6.796887] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [    6.804173] Modules linked in:
      [    6.807731] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc1+ #108
      [    6.815122] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.R00.1402050741 02/05/2014
      [    6.836000] task: ffff880455a80000 ti: ffff880455a88000 task.ti: ffff880455a88000
      [    6.844487] RIP: 0010:[<ffffffff8143eea6>]  [<ffffffff8143eea6>] memcpy+0x6/0x110
      [    6.853039] RSP: 0000:ffff880455a89cc8  EFLAGS: 00010293
      [    6.859064] RAX: ffff006568636163 RBX: ffff00656863616a RCX: 0000000000000005
      [    6.867134] RDX: 0000000000000005 RSI: ffffffff81cdc439 RDI: ffff006568636163
      [    6.875205] RBP: ffff880455a89d30 R08: 000000000001bc3b R09: 0000000000000000
      [    6.883275] R10: 0000000000000000 R11: ffffffff81cdc43e R12: ffff880455a89da8
      [    6.891338] R13: ffff006568636163 R14: 0000000000000005 R15: ffffffff81cdc439
      [    6.899408] FS:  0000000000000000(0000) GS:ffff88045b800000(0000) knlGS:0000000000000000
      [    6.908575] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    6.915088] CR2: ffff88047e1ff000 CR3: 0000000001e0e000 CR4: 00000000001407f0
      [    6.923160] Stack:
      [    6.925487]  ffffffff8143c904 ffff88045b407e00 ffff006568636163 ffff006568636163
      [    6.934113]  ffffffff8120a1a9 ffffffff81cdc43e 0000000000000007 0000000000000000
      [    6.942747]  ffff880455a89da8 ffff006568636163 0000000000000007 ffffffff81cdc439
      [    6.951382] Call Trace:
      [    6.954197]  [<ffffffff8143c904>] ? vsnprintf+0x124/0x6f0
      [    6.960323]  [<ffffffff8120a1a9>] ? __kmalloc_track_caller+0x169/0x360
      [    6.967716]  [<ffffffff81440e1b>] kvasprintf+0x6b/0x80
      [    6.973552]  [<ffffffff81432bf1>] kobject_set_name_vargs+0x21/0x70
      [    6.980552]  [<ffffffff8143393d>] kobject_init_and_add+0x4d/0x90
      [    6.987364]  [<ffffffff812067c9>] ? __kmalloc+0x169/0x370
      [    6.993492]  [<ffffffff8102dbbc>] ? cache_add_dev+0x17c/0x4f0
      [    7.000005]  [<ffffffff8102ddfa>] cache_add_dev+0x3ba/0x4f0
      [    7.006327]  [<ffffffff821a87ca>] ? i8237A_init_ops+0x14/0x14
      [    7.012842]  [<ffffffff821a87f8>] cache_sysfs_init+0x2e/0x61
      [    7.019260]  [<ffffffff81002162>] do_one_initcall+0xf2/0x220
      [    7.025679]  [<ffffffff810a4a29>] ? parse_args+0x2c9/0x450
      [    7.031903]  [<ffffffff8219d1b1>] kernel_init_freeable+0x1c9/0x25b
      [    7.038904]  [<ffffffff8219c8d2>] ? do_early_param+0x8a/0x8a
      [    7.045322]  [<ffffffff8184d5e0>] ? rest_init+0x150/0x150
      [    7.051447]  [<ffffffff8184d5ee>] kernel_init+0xe/0x100
      [    7.057380]  [<ffffffff8187b87c>] ret_from_fork+0x7c/0xb0
      [    7.063503]  [<ffffffff8184d5e0>] ? rest_init+0x150/0x150
      [    7.069628] Code: 89 e5 53 48 89 fb 75 16 80 7f 3c 00 75 05 e8 d2 f9 ff ff 48 8b 43 58 48 2b 43 50 88 43 4e 5b 5d c3 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 c3 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b
      [    7.094960] RIP  [<ffffffff8143eea6>] memcpy+0x6/0x110
      [    7.100856]  RSP <ffff880455a89cc8>
      [    7.104864] ---[ end trace b5d3fdc6c6c28083 ]---
      [    7.110142] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
      [    7.110142]
      [    7.120540] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: Joerg Roedel <joro@8bytes.org>
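The commit message above does not spell out how the double free was removed. As a generic illustration only, and not the kernel change itself, one common way to make two cleanup paths safe is to reset the pointer after freeing it, since `free(NULL)` is a no-op. Every `demo_` name below is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a global table such as g_iommus. */
static int **demo_table;

/* Free the table at most once: free(NULL) is a no-op, so resetting the
 * pointer makes a second call harmless. */
static void demo_free_table(void)
{
    free(demo_table);
    demo_table = NULL;
}

/* Simulate an init failure followed by the regular teardown path.
 * Returns 1 if the table pointer is NULL after both cleanups ran. */
int demo_init_fail_then_teardown(void)
{
    demo_table = calloc(4, sizeof(*demo_table));
    if (!demo_table)
        return 1;           /* nothing was allocated, nothing to free */

    demo_free_table();      /* error recovery path */
    demo_free_table();      /* later teardown path: pointer is already NULL */
    return demo_table == NULL;
}
```

Without the reset in `demo_free_table()`, the second call would free the same block again, which is the kind of heap corruption that can surface later in unrelated code, as in the trace above.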
  2. 03 Mar, 2014 8 commits
  3. 02 Mar, 2014 8 commits
  4. 01 Mar, 2014 10 commits
  5. 28 Feb, 2014 11 commits
    • arm64: mm: Add double logical invert to pte accessors · 84fe6826
      Steve Capper committed
      Page table entries on ARM64 are 64 bits, and some pte functions such as
      pte_dirty return a bitwise-and of a flag with the pte value. If the
      flag to be tested resides in the upper 32 bits of the pte, then we run
      into the danger of the result being dropped if downcast.
      
      For example:
      	gather_stats(page, md, pte_dirty(*pte), 1);
      where pte_dirty(*pte) is downcast to an int.
      
      This patch adds a double logical invert to all the pte_ accessors to
      ensure predictable downcasting.
      Signed-off-by: Steve Capper <steve.capper@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
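The truncation described in the commit above can be reproduced in plain C. The sketch below is an assumption-laden model, not the arm64 code: `DEMO_PTE_DIRTY` and the `demo_` functions are hypothetical, and the downcast is modelled with `uint32_t` so the behaviour is fully defined by the C standard.

```c
#include <stdint.h>

/* Hypothetical flag that lives in the upper 32 bits of a 64-bit entry. */
#define DEMO_PTE_DIRTY (UINT64_C(1) << 55)

/* Accessor without the double invert: returns the masked 64-bit value. */
static uint64_t demo_pte_dirty_raw(uint64_t pte)
{
    return pte & DEMO_PTE_DIRTY;
}

/* Accessor with the double invert: always returns exactly 0 or 1. */
static uint64_t demo_pte_dirty(uint64_t pte)
{
    return !!(pte & DEMO_PTE_DIRTY);
}

/* Model a caller that keeps only the low 32 bits of the raw result. */
uint32_t demo_downcast_raw(uint64_t pte)
{
    return (uint32_t)demo_pte_dirty_raw(pte);
}

/* Same downcast applied to the normalised result. */
uint32_t demo_downcast_fixed(uint64_t pte)
{
    return (uint32_t)demo_pte_dirty(pte);
}
```

With the flag set, `demo_downcast_raw()` yields 0 because bit 55 is discarded, while `demo_downcast_fixed()` yields 1.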
    • dm cache: fix truncation bug when mapping I/O to >2TB fast device · e0d849fa
      Heinz Mauelshagen committed
      When remapping a block to the cache's fast device that is larger than
      2TB we must not truncate the destination sector to 32bits.  The 32bit
      temporary result of from_cblock() was being overflowed in
      remap_to_cache() due to the logical left shift.
      
      Use an intermediate 64bit type to store the 32bit from_cblock() result
      to fix the overflow.
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
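The overflow described in the commit above comes from shifting a 32-bit value before widening it. A minimal sketch, assuming a 32-bit `unsigned int` and using hypothetical `demo_` names rather than the dm-cache code:

```c
#include <stdint.h>

/* Truncating variant: the shift is evaluated in 32 bits, so the high bits
 * are lost before the result is widened to 64 bits on return. */
uint64_t demo_remap_truncating(uint32_t cblock, unsigned int shift)
{
    return cblock << shift;
}

/* Fixed variant: store the 32-bit value in a 64-bit intermediate first,
 * then shift. */
uint64_t demo_remap_widened(uint32_t cblock, unsigned int shift)
{
    uint64_t block = cblock;
    return block << shift;
}
```

With 512-byte sectors, sector 2^32 is exactly the 2 TiB boundary, so any destination at or beyond it needs more than 32 bits.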
    • perf tools: Fix strict alias issue for find_first_bit · b39c2a57
      Jiri Olsa committed
      When compiling perf tool code with gcc 4.4.7 I'm getting the
      following error:
      
          CC       util/session.o
        cc1: warnings being treated as errors
        util/session.c: In function ‘perf_session_deliver_event’:
        tools/perf/util/include/linux/bitops.h:109: error: dereferencing pointer ‘p’ does break strict-aliasing rules
        tools/perf/util/include/linux/bitops.h:101: error: dereferencing pointer ‘p’ does break strict-aliasing rules
        util/session.c:697: note: initialized from here
        tools/perf/util/include/linux/bitops.h:101: note: initialized from here
        make[1]: *** [util/session.o] Error 1
        make: *** [util/session.o] Error 2
      
      The aliased types here are u64 and unsigned long pointers, which is safe
      for the find_first_bit processing.
      
      This error shows up for me only for gcc 4.4 on 32bit x86, even for
      -Wstrict-aliasing=3, while newer gcc versions stay quiet at that level
      and only complain for -Wstrict-aliasing={2,1}. It looks like newer gcc
      changed the rules for strict alias warnings.
      
      The gcc documentation offers a workaround for valid aliasing by using
      the __may_alias__ attribute:
      
        http://gcc.gnu.org/onlinedocs/gcc-4.4.0/gcc/Type-Attributes.html
      
      Use this workaround for the find_first_bit function.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1393434867-20271-1-git-send-email-jolsa@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
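The `__may_alias__` workaround mentioned in the commit above can be sketched as follows. This is not the perf implementation: the typedef and function names are hypothetical, the attribute is GCC/Clang specific, and the word order assumes a little-endian layout whenever `unsigned long` is narrower than 64 bits.

```c
#include <limits.h>
#include <stdint.h>

/* unsigned long variant that is allowed to alias objects of other types. */
typedef unsigned long __attribute__((__may_alias__)) demo_aliasing_ulong;

/* Return the index of the lowest set bit in *value, or 64 if no bit is set.
 * The u64 is read through unsigned long words, which is the aliasing the
 * attribute makes legal. */
unsigned int demo_find_first_bit64(const uint64_t *value)
{
    const demo_aliasing_ulong *p = (const demo_aliasing_ulong *)value;
    const unsigned int bits_per_long = sizeof(unsigned long) * CHAR_BIT;
    const unsigned int nwords = sizeof(uint64_t) / sizeof(unsigned long);

    for (unsigned int w = 0; w < nwords; w++) {
        unsigned long word = p[w];

        if (!word)
            continue;
        for (unsigned int b = 0; b < bits_per_long; b++) {
            if (word & (1UL << b))
                return w * bits_per_long + b;
        }
    }
    return 64;
}
```

Declaring the alias through a typedef keeps the attribute in one place, so the cast at the top of the function is the only spot where the two types meet.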
    • powerpc/powernv: Fix indirect XSCOM unmangling · e0cf9576
      Benjamin Herrenschmidt committed
      We need to unmangle the full address, not just the register
      number, and we also need to support the real indirect bit
      being set for in-kernel uses.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org> [v3.13]
    • powerpc/powernv: Fix opal_xscom_{read,write} prototype · 2f3f38e4
      Benjamin Herrenschmidt committed
      The OPAL firmware functions opal_xscom_read and opal_xscom_write
      take a 64-bit argument for the XSCOM (PCB) address in order to
      support the indirect mode on P8.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org> [v3.13]
    • powerpc/powernv: Refactor PHB diag-data dump · af87d2fe
      Gavin Shan committed
      As Ben suggested, the patch prints PHB diag-data with multiple
      fields in one line and omits the line if the fields of that
      line are all zero.
      
      With the patch applied, the PHB3 diag-data dump looks like:
      
      PHB3 PHB#3 Diag-data (Version: 1)
      
        brdgCtl:     00000002
        RootSts:     0000000f 00400000 b0830008 00100147 00002000
        nFir:        0000000000000000 0030006e00000000 0000000000000000
        PhbSts:      0000001c00000000 0000000000000000
        Lem:         0000000000100000 42498e327f502eae 0000000000000000
        InAErr:      8000000000000000 8000000000000000 0402030000000000 0000000000000000
        PE[  8] A/B: 8480002b00000000 8000000000000000
      
      [ The current diag data is so big that it overflows the printk
        buffer pretty quickly in cases when we get a handful of errors
        at once which can happen. --BenH
      ]
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/powernv: Dump PHB diag-data immediately · 94716604
      Gavin Shan committed
      The PHB diag-data is important for locating the root cause of
      EEH errors such as frozen PE or fenced PHB. However, the EEH core
      enables the IO path by clearing part of the HW registers before
      collecting this data, which corrupts it.
      
      This patch fixes this by dumping the PHB diag-data immediately when
      frozen/fenced state on PE or PHB is detected for the first time in
      eeh_ops::get_state() or next_error() backend.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Increase stack redzone for 64-bit userspace to 512 bytes · 573ebfa6
      Paul Mackerras committed
      The new ELFv2 little-endian ABI increases the stack redzone -- the
      area below the stack pointer that can be used for storing data --
      from 288 bytes to 512 bytes.  This means that we need to allow more
      space on the user stack when delivering a signal to a 64-bit process.
      
      To make the code a bit clearer, we define new USER_REDZONE_SIZE and
      KERNEL_REDZONE_SIZE symbols in ptrace.h.  For now, we leave the
      kernel redzone size at 288 bytes, since increasing it to 512 bytes
      would increase the size of interrupt stack frames correspondingly.
      
      Gcc currently only makes use of 288 bytes of redzone even when
      compiling for the new little-endian ABI, and the kernel cannot
      currently be compiled with the new ABI anyway.
      
      In the future, hopefully gcc will provide an option to control the
      amount of redzone used, and then we could reduce it even more.
      
      This also changes the code in arch_compat_alloc_user_space() to
      preserve the expanded redzone.  It is not clear why this function would
      ever be used on a 64-bit process, though.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      CC: <stable@vger.kernel.org> [v3.13]
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
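To see why a larger redzone requires more space on the user stack, consider where a signal frame can be placed. The sketch below is illustrative only: the 512-byte value comes from the commit message above, while the function name, the `DEMO_` macro, and the 16-byte alignment are assumptions of this example.

```c
#include <stdint.h>

/* User redzone size quoted in the commit message for the ELFv2 ABI. */
#define DEMO_USER_REDZONE_SIZE 512

/* Pick the address of a signal frame below the interrupted stack pointer,
 * skipping the redzone so that data the program stored there survives.
 * The 16-byte alignment is an assumption of this sketch. */
uint64_t demo_signal_frame_addr(uint64_t sp, uint64_t frame_size)
{
    uint64_t addr = sp - DEMO_USER_REDZONE_SIZE - frame_size;

    return addr & ~UINT64_C(15);
}
```

If the kernel skipped only 288 bytes, a frame could land on the bytes between 288 and 512 below the stack pointer, which code built for the new ABI is entitled to use.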
    • powerpc/ftrace: bugfix for test_24bit_addr · a95fc585
      Liu Ping Fan committed
      The branch target should be the func addr, not the addr of func_descr_t.
      So use ppc_function_entry() to generate the right target addr.
      Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/crashdump : Fix page frame number check in copy_oldmem_page · f5295bd8
      Laurent Dufour committed
      In copy_oldmem_page, the current check, which uses max_pfn and
      min_low_pfn to decide whether the page is backed, is not valid when
      the memory layout is not contiguous.
      
      This happens when running as a QEMU/KVM guest, where RTAS is mapped higher
      in the memory. In that case max_pfn points to the end of RTAS, and a hole
      between the end of the kdump kernel and RTAS is not backed by PTEs. As a
      consequence, the kdump kernel crashes in copy_oldmem_page when directly
      accessing pages in that hole.
      
      This fix relies on the memblock service memblock_is_region_memory to
      check whether the page being read is part of the directly accessible
      memory.
      Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
      Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
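The difference between a min/max range check and a per-region check can be shown with a small userspace model. Everything here is hypothetical: the layout, the sizes, and the `demo_` names are made up for illustration and do not reproduce memblock or the kernel's data structures.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_region {
    uint64_t base;
    uint64_t size;
};

/* Hypothetical layout: kdump kernel memory at the bottom, a second region
 * standing in for RTAS mapped higher, and an unbacked hole in between. */
static const struct demo_region demo_memory[] = {
    { 0x00000000, 0x08000000 },
    { 0x2ff00000, 0x00100000 },
};
#define DEMO_NR_REGIONS (sizeof(demo_memory) / sizeof(demo_memory[0]))

/* Range-style check: anything below the end of the last region passes,
 * including addresses that fall into the hole. */
int demo_backed_by_range_check(uint64_t addr)
{
    const struct demo_region *last = &demo_memory[DEMO_NR_REGIONS - 1];

    return addr < last->base + last->size;
}

/* Region-style check: [addr, addr + size) must lie inside one region. */
int demo_backed_by_region_check(uint64_t addr, uint64_t size)
{
    for (size_t i = 0; i < DEMO_NR_REGIONS; i++) {
        const struct demo_region *r = &demo_memory[i];

        if (addr >= r->base && addr + size <= r->base + r->size)
            return 1;
    }
    return 0;
}
```

An address such as 0x10000000 sits in the hole: the range-style check accepts it, while the region-style check rejects it.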
    • powerpc/le: Ensure that the 'stop-self' RTAS token is handled correctly · 41dd03a9
      Tony Breeds committed
      Currently we're storing a host endian RTAS token in
      rtas_stop_self_args.token.  We then pass that directly to rtas.  This is
      fine on big endian; on little endian, however, the token is not what we
      expect.
      
      This will typically result in hitting:
      	panic("Alas, I survived.\n");
      
      To fix this we always use the stop-self token in host order and always
      convert it to be32 before passing this to rtas.
      Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
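The conversion described in the commit above, keeping the token in host order and converting it to big-endian only when it is handed over, can be sketched portably. The struct and function names are hypothetical stand-ins, not the kernel's `cpu_to_be32()` or its RTAS argument layout.

```c
#include <stdint.h>
#include <string.h>

/* Convert a host-order value to big-endian storage order on any host. */
static uint32_t demo_cpu_to_be32(uint32_t host)
{
    const uint8_t bytes[4] = {
        (uint8_t)(host >> 24), (uint8_t)(host >> 16),
        (uint8_t)(host >> 8),  (uint8_t)host,
    };
    uint32_t be;

    memcpy(&be, bytes, sizeof(be));
    return be;
}

/* Hypothetical argument block that firmware reads as big-endian. */
struct demo_rtas_args {
    uint32_t token;
};

/* Return byte `index` (0 = first in memory) of the token as the firmware
 * would see it after the conversion. */
uint8_t demo_token_byte(uint32_t host_token, unsigned int index)
{
    struct demo_rtas_args args = { .token = demo_cpu_to_be32(host_token) };
    uint8_t raw[sizeof(args.token)];

    memcpy(raw, &args.token, sizeof(raw));
    return raw[index];
}
```

On a little-endian host, storing the token without the conversion would put its least significant byte first, so the firmware would read a different token.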