1. 22 2月, 2018 2 次提交
    • H
      mm, swap, frontswap: fix THP swap if frontswap enabled · 7ba71669
      Huang Ying 提交于
      It was reported by Sergey Senozhatsky that if THP (Transparent Huge
      Page) and frontswap (via zswap) are both enabled, when memory goes low
      so that swap is triggered, segfault and memory corruption will occur in
      random user space applications as follow,
      
      kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
       #0  0x00007fc08889ae0d _int_malloc (libc.so.6)
       #1  0x00007fc08889c2f3 malloc (libc.so.6)
       #2  0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
       #3  0x0000560e6005e75c n/a (urxvt)
       #4  0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
       #5  0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
       #6  0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
       #7  0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
       #8  0x0000560e6005cb55 ev_run (urxvt)
       #9  0x0000560e6003b9b9 main (urxvt)
       #10 0x00007fc08883af4a __libc_start_main (libc.so.6)
       #11 0x0000560e6003f9da _start (urxvt)
      
      After bisection, it was found the first bad commit is bd4c82c2 ("mm,
      THP, swap: delay splitting THP after swapped out").
      
      The root cause is as follows:
      
      When the pages are written to swap device during swapping out in
      swap_writepage(), zswap (fontswap) is tried to compress the pages to
      improve performance.  But zswap (frontswap) will treat THP as a normal
      page, so only the head page is saved.  After swapping in, tail pages
      will not be restored to their original contents, causing memory
      corruption in the applications.
      
      This is fixed by refusing to save page in the frontswap store functions
      if the page is a THP.  So that the THP will be swapped out to swap
      device.
      
      Another choice is to split THP if frontswap is enabled.  But it is found
      that the frontswap enabling isn't flexible.  For example, if
      CONFIG_ZSWAP=y (cannot be module), frontswap will be enabled even if
      zswap itself isn't enabled.
      
      Frontswap has multiple backends, to make it easy for one backend to
      enable THP support, the THP checking is put in backend frontswap store
      functions instead of the general interfaces.
      
      Link: http://lkml.kernel.org/r/20180209084947.22749-1-ying.huang@intel.com
      Fixes: bd4c82c2 ("mm, THP, swap: delay splitting THP after swapped out")
      Signed-off-by: N"Huang, Ying" <ying.huang@intel.com>
      Reported-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Tested-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Suggested-by: Minchan Kim <minchan@kernel.org>	[put THP checking in backend]
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: <stable@vger.kernel.org>	[4.14]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7ba71669
    • L
      RDMA/uverbs: Fix kernel panic while using XRC_TGT QP type · f4576587
      Leon Romanovsky 提交于
      Attempt to modify XRC_TGT QP type from the user space (ibv_xsrq_pingpong
      invocation) will trigger the following kernel panic. It is caused by the
      fact that such QPs missed uobject initialization.
      
      [   17.408845] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
      [   17.412645] IP: rdma_lookup_put_uobject+0x9/0x50
      [   17.416567] PGD 0 P4D 0
      [   17.419262] Oops: 0000 [#1] SMP PTI
      [   17.422915] CPU: 0 PID: 455 Comm: ibv_xsrq_pingpo Not tainted 4.16.0-rc1+ #86
      [   17.424765] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      [   17.427399] RIP: 0010:rdma_lookup_put_uobject+0x9/0x50
      [   17.428445] RSP: 0018:ffffb8c7401e7c90 EFLAGS: 00010246
      [   17.429543] RAX: 0000000000000000 RBX: ffffb8c7401e7cf8 RCX: 0000000000000000
      [   17.432426] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
      [   17.437448] RBP: 0000000000000000 R08: 00000000000218f0 R09: ffffffff8ebc4cac
      [   17.440223] R10: fffff6038052cd80 R11: ffff967694b36400 R12: ffff96769391f800
      [   17.442184] R13: ffffb8c7401e7cd8 R14: 0000000000000000 R15: ffff967699f60000
      [   17.443971] FS:  00007fc29207d700(0000) GS:ffff96769fc00000(0000) knlGS:0000000000000000
      [   17.446623] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   17.448059] CR2: 0000000000000048 CR3: 000000001397a000 CR4: 00000000000006b0
      [   17.449677] Call Trace:
      [   17.450247]  modify_qp.isra.20+0x219/0x2f0
      [   17.451151]  ib_uverbs_modify_qp+0x90/0xe0
      [   17.452126]  ib_uverbs_write+0x1d2/0x3c0
      [   17.453897]  ? __handle_mm_fault+0x93c/0xe40
      [   17.454938]  __vfs_write+0x36/0x180
      [   17.455875]  vfs_write+0xad/0x1e0
      [   17.456766]  SyS_write+0x52/0xc0
      [   17.457632]  do_syscall_64+0x75/0x180
      [   17.458631]  entry_SYSCALL_64_after_hwframe+0x21/0x86
      [   17.460004] RIP: 0033:0x7fc29198f5a0
      [   17.460982] RSP: 002b:00007ffccc71f018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [   17.463043] RAX: ffffffffffffffda RBX: 0000000000000078 RCX: 00007fc29198f5a0
      [   17.464581] RDX: 0000000000000078 RSI: 00007ffccc71f050 RDI: 0000000000000003
      [   17.466148] RBP: 0000000000000000 R08: 0000000000000078 R09: 00007ffccc71f050
      [   17.467750] R10: 000055b6cf87c248 R11: 0000000000000246 R12: 00007ffccc71f300
      [   17.469541] R13: 000055b6cf8733a0 R14: 0000000000000000 R15: 0000000000000000
      [   17.471151] Code: 00 00 0f 1f 44 00 00 48 8b 47 48 48 8b 00 48 8b 40 10 e9 0b 8b 68 00 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 53 89 f5 <48> 8b 47 48 48 89 fb 40 0f b6 f6 48 8b 00 48 8b 40 20 e8 e0 8a
      [   17.475185] RIP: rdma_lookup_put_uobject+0x9/0x50 RSP: ffffb8c7401e7c90
      [   17.476841] CR2: 0000000000000048
      [   17.477764] ---[ end trace 1dbcc5354071a712 ]---
      [   17.478880] Kernel panic - not syncing: Fatal exception
      [   17.480277] Kernel Offset: 0xd000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      
      Fixes: 2f08ee36 ("RDMA/restrack: don't use uaccess_kernel()")
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      f4576587
  2. 21 2月, 2018 5 次提交
  3. 20 2月, 2018 9 次提交
  4. 18 2月, 2018 1 次提交
  5. 17 2月, 2018 13 次提交
    • L
      iio: adis_lib: Initialize trigger before requesting interrupt · f027e0b3
      Lars-Peter Clausen 提交于
      The adis_probe_trigger() creates a new IIO trigger and requests an
      interrupt associated with the trigger. The interrupt uses the generic
      iio_trigger_generic_data_rdy_poll() function as its interrupt handler.
      
      Currently the driver initializes some fields of the trigger structure after
      the interrupt has been requested. But an interrupt can fire as soon as it
      has been requested. This opens up a race condition.
      
      iio_trigger_generic_data_rdy_poll() will access the trigger data structure
      and dereference the ops field. If the ops field is not yet initialized this
      will result in a NULL pointer deref.
      
      It is not expected that the device generates an interrupt at this point, so
      typically this issue did not surface unless e.g. due to a hardware
      misconfiguration (wrong interrupt number, wrong polarity, etc.).
      
      But some newer devices from the ADIS family start to generate periodic
      interrupts in their power-on reset configuration and unfortunately the
      interrupt can not be masked in the device.  This makes the race condition
      much more visible and the following crash has been observed occasionally
      when booting a system using the ADIS16460.
      
      	Unable to handle kernel NULL pointer dereference at virtual address 00000008
      	pgd = c0004000
      	[00000008] *pgd=00000000
      	Internal error: Oops: 5 [#1] PREEMPT SMP ARM
      	Modules linked in:
      	CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-04126-gf9739f0-dirty #257
      	Hardware name: Xilinx Zynq Platform
      	task: ef04f640 task.stack: ef050000
      	PC is at iio_trigger_notify_done+0x30/0x68
      	LR is at iio_trigger_generic_data_rdy_poll+0x18/0x20
      	pc : [<c042d868>]    lr : [<c042d924>]    psr: 60000193
      	sp : ef051bb8  ip : 00000000  fp : ef106400
      	r10: c081d80a  r9 : ef3bfa00  r8 : 00000087
      	r7 : ef051bec  r6 : 00000000  r5 : ef3bfa00  r4 : ee92ab00
      	r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : ee97e400
      	Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
      	Control: 18c5387d  Table: 0000404a  DAC: 00000051
      	Process swapper/0 (pid: 1, stack limit = 0xef050210)
      	[<c042d868>] (iio_trigger_notify_done) from [<c0065b10>] (__handle_irq_event_percpu+0x88/0x118)
      	[<c0065b10>] (__handle_irq_event_percpu) from [<c0065bbc>] (handle_irq_event_percpu+0x1c/0x58)
      	[<c0065bbc>] (handle_irq_event_percpu) from [<c0065c30>] (handle_irq_event+0x38/0x5c)
      	[<c0065c30>] (handle_irq_event) from [<c0068e28>] (handle_level_irq+0xa4/0x130)
      	[<c0068e28>] (handle_level_irq) from [<c0064e74>] (generic_handle_irq+0x24/0x34)
      	[<c0064e74>] (generic_handle_irq) from [<c021ab7c>] (zynq_gpio_irqhandler+0xb8/0x13c)
      	[<c021ab7c>] (zynq_gpio_irqhandler) from [<c0064e74>] (generic_handle_irq+0x24/0x34)
      	[<c0064e74>] (generic_handle_irq) from [<c0065370>] (__handle_domain_irq+0x5c/0xb4)
      	[<c0065370>] (__handle_domain_irq) from [<c000940c>] (gic_handle_irq+0x48/0x8c)
      	[<c000940c>] (gic_handle_irq) from [<c0013e8c>] (__irq_svc+0x6c/0xa8)
      
      To fix this make sure that the trigger is fully initialized before
      requesting the interrupt.
      
      Fixes: ccd2b52f ("staging:iio: Add common ADIS library")
      Reported-by: NRobin Getz <Robin.Getz@analog.com>
      Signed-off-by: NLars-Peter Clausen <lars@metafoo.de>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>
      f027e0b3
    • S
      pvcalls-front: wait for other operations to return when release passive sockets · d1a75e08
      Stefano Stabellini 提交于
      Passive sockets can have ongoing operations on them, specifically, we
      have two wait_event_interruptable calls in pvcalls_front_accept.
      
      Add two wake_up calls in pvcalls_front_release, then wait for the
      potential waiters to return and release the sock_mapping refcount.
      Signed-off-by: NStefano Stabellini <stefano@aporeto.com>
      Acked-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      d1a75e08
    • S
      pvcalls-front: introduce a per sock_mapping refcount · 64d68718
      Stefano Stabellini 提交于
      Introduce a per sock_mapping refcount, in addition to the existing
      global refcount. Thanks to the sock_mapping refcount, we can safely wait
      for it to be 1 in pvcalls_front_release before freeing an active socket,
      instead of waiting for the global refcount to be 1.
      Signed-off-by: NStefano Stabellini <stefano@aporeto.com>
      Acked-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      64d68718
    • J
      xenbus: track caller request id · 29fee6ee
      Joao Martins 提交于
      Commit fd8aa909 ("xen: optimize xenbus driver for multiple concurrent
      xenstore accesses") optimized xenbus concurrent accesses but in doing so
      broke UABI of /dev/xen/xenbus. Through /dev/xen/xenbus applications are in
      charge of xenbus message exchange with the correct header and body. Now,
      after the mentioned commit the replies received by application will no
      longer have the header req_id echoed back as it was on request (see
      specification below for reference), because that particular field is being
      overwritten by kernel.
      
      struct xsd_sockmsg
      {
        uint32_t type;  /* XS_??? */
        uint32_t req_id;/* Request identifier, echoed in daemon's response.  */
        uint32_t tx_id; /* Transaction id (0 if not related to a transaction). */
        uint32_t len;   /* Length of data following this. */
      
        /* Generally followed by nul-terminated string(s). */
      };
      
      Before there was only one request at a time so req_id could simply be
      forwarded back and forth. To allow simultaneous requests we need a
      different req_id for each message thus kernel keeps a monotonic increasing
      counter for this field and is written on every request irrespective of
      userspace value.
      
      Forwarding again the req_id on userspace requests is not a solution because
      we would open the possibility of userspace-generated req_id colliding with
      kernel ones. So this patch instead takes another route which is to
      artificially keep user req_id while keeping the xenbus logic as is. We do
      that by saving the original req_id before xs_send(), use the private kernel
      counter as req_id and then once reply comes and was validated, we restore
      back the original req_id.
      
      Cc: <stable@vger.kernel.org> # 4.11
      Fixes: fd8aa909 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
      Reported-by: NBhavesh Davda <bhavesh.davda@oracle.com>
      Signed-off-by: NJoao Martins <joao.m.martins@oracle.com>
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      29fee6ee
    • E
      tun: fix tun_napi_alloc_frags() frag allocator · 43a08e0f
      Eric Dumazet 提交于
      <Mark Rutland reported>
          While fuzzing arm64 v4.16-rc1 with Syzkaller, I've been hitting a
          misaligned atomic in __skb_clone:
      
              atomic_inc(&(skb_shinfo(skb)->dataref));
      
         where dataref doesn't have the required natural alignment, and the
         atomic operation faults. e.g. i often see it aligned to a single
         byte boundary rather than a four byte boundary.
      
         AFAICT, the skb_shared_info is misaligned at the instant it's
         allocated in __napi_alloc_skb()  __napi_alloc_skb()
      </end of report>
      
      Problem is caused by tun_napi_alloc_frags() using
      napi_alloc_frag() with user provided seg sizes,
      leading to other users of this API getting unaligned
      page fragments.
      
      Since we would like to not necessarily add paddings or alignments to
      the frags that tun_napi_alloc_frags() attaches to the skb, switch to
      another page frag allocator.
      
      As a bonus skb_page_frag_refill() can use GFP_KERNEL allocations,
      meaning that we can not deplete memory reserves as easily.
      
      Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      43a08e0f
    • C
      PCI/cxgb4: Extend T3 PCI quirk to T4+ devices · 7dcf688d
      Casey Leedom 提交于
      We've run into a problem where our device is attached
      to a Virtual Machine and the use of the new pci_set_vpd_size()
      API doesn't help.  The VM kernel has been informed that
      the accesses are okay, but all of the actual VPD Capability
      Accesses are trapped down into the KVM Hypervisor where it
      goes ahead and imposes the silent denials.
      
      The right idea is to follow the kernel.org
      commit 1c7de2b4 ("PCI: Enable access to non-standard VPD for
      Chelsio devices (cxgb3)") which Alexey Kardashevskiy authored
      to establish a PCI Quirk for our T3-based adapters. This commit
      extends that PCI Quirk to cover Chelsio T4 devices and later.
      
      The advantage of this approach is that the VPD Size gets set early
      in the Base OS/Hypervisor Boot and doesn't require that the cxgb4
      driver even be available in the Base OS/Hypervisor.  Thus PF4 can
      be exported to a Virtual Machine and everything should work.
      
      Fixes: 67e65879 ("cxgb4: Set VPD size so we can read both VPD structures")
      Cc: <stable@vger.kernel.org>  # v4.9+
      Signed-off-by: NCasey Leedom <leedom@chelsio.com>
      Signed-off-by: NArjun Vynipadath <arjun@chelsio.com>
      Signed-off-by: NGanesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7dcf688d
    • R
      cxgb4: fix trailing zero in CIM LA dump · e6f02a4d
      Rahul Lakkireddy 提交于
      Set correct size of the CIM LA dump for T6.
      
      Fixes: 27887bc7 ("cxgb4: collect hardware LA dumps")
      Signed-off-by: NRahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: NGanesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e6f02a4d
    • G
      cxgb4: free up resources of pf 0-3 · c4e43e14
      Ganesh Goudar 提交于
      free pf 0-3 resources, commit baf50868 ("cxgb4:
      restructure VF mgmt code") erroneously removed the
      code which frees the pf 0-3 resources, causing the
      probe of pf 0-3 to fail in case of driver reload.
      
      Fixes: baf50868 ("cxgb4: restructure VF mgmt code")
      Signed-off-by: NGanesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c4e43e14
    • S
      RDMA/restrack: don't use uaccess_kernel() · 2f08ee36
      Steve Wise 提交于
      uaccess_kernel() isn't sufficient to determine if an rdma resource is
      user-mode or not.  For example, resources allocated in the add_one()
      function of an ib_client get falsely labeled as user mode, when they
      are kernel mode allocations.  EG: mad qps.
      
      The result is that these qps are skipped over during a nldev query
      because of an erroneous namespace mismatch.
      
      So now we determine if the resource is user-mode by looking at the object
      struct's uobject or similar pointer to know if it was allocated for user
      mode applications.
      
      Fixes: 02d8883f ("RDMA/restrack: Add general infrastructure to track RDMA resources")
      Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      2f08ee36
    • L
      staging: android: ion: Zero CMA allocated memory · 6d79bd5b
      Liam Mark 提交于
      Since commit 204f6722 ("staging: android: ion: Use CMA APIs directly")
      the CMA API is now used directly and therefore the allocated memory is no
      longer automatically zeroed.
      
      Explicitly zero CMA allocated memory to ensure that no data is exposed to
      userspace.
      
      Fixes: 204f6722 ("staging: android: ion: Use CMA APIs directly")
      Signed-off-by: NLiam Mark <lmark@codeaurora.org>
      Acked-by: NLaura Abbott <labbott@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d79bd5b
    • B
      staging: android: ashmem: Fix a race condition in pin ioctls · ce8a3a9e
      Ben Hutchings 提交于
      ashmem_pin_unpin() reads asma->file and asma->size before taking the
      ashmem_mutex, so it can race with other operations that modify them.
      
      Build-tested only.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ce8a3a9e
    • A
      staging: fsl-mc: fix build testing on x86 · 02b7b284
      Arnd Bergmann 提交于
      Selecting GENERIC_MSI_IRQ_DOMAIN on x86 causes a compile-time error in
      some configurations:
      
      drivers/base/platform-msi.c:37:19: error: field 'arg' has incomplete type
      
      On the other architectures, we are fine, but here we should have an additional
      dependency on X86_LOCAL_APIC so we can get the PCI_MSI_IRQ_DOMAIN symbol.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02b7b284
    • L
      RDMA/verbs: Check existence of function prior to accessing it · 21885586
      Leon Romanovsky 提交于
      Update all the flows to ensure that function pointer exists prior
      to accessing it.
      
      This is much safer than checking the uverbs_ex_mask variable, especially
      since we know that test isn't working properly and will be removed
      in -next.
      
      This prevents a user triggereable oops.
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      21885586
  6. 16 2月, 2018 10 次提交
    • N
      dm: correctly handle chained bios in dec_pending() · 8dd601fa
      NeilBrown 提交于
      dec_pending() is given an error status (possibly 0) to be recorded
      against a bio.  It can be called several times on the one 'struct
      dm_io', and it is careful to only assign a non-zero error to
      io->status.  However when it then assigned io->status to bio->bi_status,
      it is not careful and could overwrite a genuine error status with 0.
      
      This can happen when chained bios are in use.  If a bio is chained
      beneath the bio that this dm_io is handling, the child bio might
      complete and set bio->bi_status before the dm_io completes.
      
      This has been possible since chained bios were introduced in 3.14, and
      has become a lot easier to trigger with commit 18a25da8 ("dm: ensure
      bio submission follows a depth-first tree walk") as that commit caused
      dm to start using chained bios itself.
      
      A particular failure mode is that if a bio spans an 'error' target and a
      working target, the 'error' fragment will complete instantly and set the
      ->bi_status, and the other fragment will normally complete a little
      later, and will clear ->bi_status.
      
      The fix is simply to only assign io_error to bio->bi_status when
      io_error is not zero.
      Reported-and-tested-by: NMilan Broz <gmazyland@gmail.com>
      Cc: stable@vger.kernel.org (v3.14+)
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      8dd601fa
    • J
      irqchip/bcm: Remove hashed address printing · 2d02424e
      Jaedon Shin 提交于
      Since commit ad67b74d ("printk: hash addresses printed with %p")
      pointers are being hashed when printed. Displaying the virtual memory at
      bootup time is not helpful. so delete the prints.
      Acked-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJaedon Shin <jaedon.shin@gmail.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      2d02424e
    • M
      irqchip/gic-v2m: Add PCI Multi-MSI support · de337ee3
      Marc Zyngier 提交于
      We'd never implemented Multi-MSI support with GICv2m, because
      it is weird and clunky, and you'd think people would rather use
      MSI-X.
      
      Turns out there is still plenty of devices out there that rely
      on Multi-MSI. Oh well, let's teach that trick to the v2m widget,
      it is not a big deal anyway.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      de337ee3
    • S
      irqchip/gic-v3: Ignore disabled ITS nodes · 95a25625
      Stephen Boyd 提交于
      On some platforms there's an ITS available but it's not enabled
      because reading or writing the registers is denied by the
      firmware. In fact, reading or writing them will cause the system
      to reset. We could remove the node from DT in such a case, but
      it's better to skip nodes that are marked as "disabled" in DT so
      that we can describe the hardware that exists and use the status
      property to indicate how the firmware has configured things.
      
      Cc: Stuart Yoder <stuyoder@gmail.com>
      Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Rajendra Nayak <rnayak@codeaurora.org>
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      95a25625
    • S
      irqchip/gic-v3: Use wmb() instead of smb_wmb() in gic_raise_softirq() · 21ec30c0
      Shanker Donthineni 提交于
      A DMB instruction can be used to ensure the relative order of only
      memory accesses before and after the barrier. Since writes to system
      registers are not memory operations, barrier DMB is not sufficient
      for observability of memory accesses that occur before ICC_SGI1R_EL1
      writes.
      
      A DSB instruction ensures that no instructions that appear in program
      order after the DSB instruction, can execute until the DSB instruction
      has completed.
      
      Cc: stable@vger.kernel.org
      Acked-by: Will Deacon <will.deacon@arm.com>,
      Signed-off-by: NShanker Donthineni <shankerd@codeaurora.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      21ec30c0
    • M
      irqchip/gic-v3: Change pr_debug message to pr_devel · b6dd4d83
      Mark Salter 提交于
      The pr_debug() in gic-v3 gic_send_sgi() can trigger a circular locking
      warning:
      
       GICv3: CPU10: ICC_SGI1R_EL1 5000400
       ======================================================
       WARNING: possible circular locking dependency detected
       4.15.0+ #1 Tainted: G        W
       ------------------------------------------------------
       dynamic_debug01/1873 is trying to acquire lock:
        ((console_sem).lock){-...}, at: [<0000000099c891ec>] down_trylock+0x20/0x4c
      
       but task is already holding lock:
        (&rq->lock){-.-.}, at: [<00000000842e1587>] __task_rq_lock+0x54/0xdc
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #2 (&rq->lock){-.-.}:
              __lock_acquire+0x3b4/0x6e0
              lock_acquire+0xf4/0x2a8
              _raw_spin_lock+0x4c/0x60
              task_fork_fair+0x3c/0x148
              sched_fork+0x10c/0x214
              copy_process.isra.32.part.33+0x4e8/0x14f0
              _do_fork+0xe8/0x78c
              kernel_thread+0x48/0x54
              rest_init+0x34/0x2a4
              start_kernel+0x45c/0x488
      
       -> #1 (&p->pi_lock){-.-.}:
              __lock_acquire+0x3b4/0x6e0
              lock_acquire+0xf4/0x2a8
              _raw_spin_lock_irqsave+0x58/0x70
              try_to_wake_up+0x48/0x600
              wake_up_process+0x28/0x34
              __up.isra.0+0x60/0x6c
              up+0x60/0x68
              __up_console_sem+0x4c/0x7c
              console_unlock+0x328/0x634
              vprintk_emit+0x25c/0x390
              dev_vprintk_emit+0xc4/0x1fc
              dev_printk_emit+0x88/0xa8
              __dev_printk+0x58/0x9c
              _dev_info+0x84/0xa8
              usb_new_device+0x100/0x474
              hub_port_connect+0x280/0x92c
              hub_event+0x740/0xa84
              process_one_work+0x240/0x70c
              worker_thread+0x60/0x400
              kthread+0x110/0x13c
              ret_from_fork+0x10/0x18
      
       -> #0 ((console_sem).lock){-...}:
              validate_chain.isra.34+0x6e4/0xa20
              __lock_acquire+0x3b4/0x6e0
              lock_acquire+0xf4/0x2a8
              _raw_spin_lock_irqsave+0x58/0x70
              down_trylock+0x20/0x4c
              __down_trylock_console_sem+0x3c/0x9c
              console_trylock+0x20/0xb0
              vprintk_emit+0x254/0x390
              vprintk_default+0x58/0x90
              vprintk_func+0xbc/0x164
              printk+0x80/0xa0
              __dynamic_pr_debug+0x84/0xac
              gic_raise_softirq+0x184/0x18c
              smp_cross_call+0xac/0x218
              smp_send_reschedule+0x3c/0x48
              resched_curr+0x60/0x9c
              check_preempt_curr+0x70/0xdc
              wake_up_new_task+0x310/0x470
              _do_fork+0x188/0x78c
              SyS_clone+0x44/0x50
              __sys_trace_return+0x0/0x4
      
       other info that might help us debug this:
      
       Chain exists of:
         (console_sem).lock --> &p->pi_lock --> &rq->lock
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&rq->lock);
                                      lock(&p->pi_lock);
                                      lock(&rq->lock);
         lock((console_sem).lock);
      
        *** DEADLOCK ***
      
       2 locks held by dynamic_debug01/1873:
        #0:  (&p->pi_lock){-.-.}, at: [<000000001366df53>] wake_up_new_task+0x40/0x470
        #1:  (&rq->lock){-.-.}, at: [<00000000842e1587>] __task_rq_lock+0x54/0xdc
      
       stack backtrace:
       CPU: 10 PID: 1873 Comm: dynamic_debug01 Tainted: G        W        4.15.0+ #1
       Hardware name: GIGABYTE R120-T34-00/MT30-GS2-00, BIOS T48 10/02/2017
       Call trace:
        dump_backtrace+0x0/0x188
        show_stack+0x24/0x2c
        dump_stack+0xa4/0xe0
        print_circular_bug.isra.31+0x29c/0x2b8
        check_prev_add.constprop.39+0x6c8/0x6dc
        validate_chain.isra.34+0x6e4/0xa20
        __lock_acquire+0x3b4/0x6e0
        lock_acquire+0xf4/0x2a8
        _raw_spin_lock_irqsave+0x58/0x70
        down_trylock+0x20/0x4c
        __down_trylock_console_sem+0x3c/0x9c
        console_trylock+0x20/0xb0
        vprintk_emit+0x254/0x390
        vprintk_default+0x58/0x90
        vprintk_func+0xbc/0x164
        printk+0x80/0xa0
        __dynamic_pr_debug+0x84/0xac
        gic_raise_softirq+0x184/0x18c
        smp_cross_call+0xac/0x218
        smp_send_reschedule+0x3c/0x48
        resched_curr+0x60/0x9c
        check_preempt_curr+0x70/0xdc
        wake_up_new_task+0x310/0x470
        _do_fork+0x188/0x78c
        SyS_clone+0x44/0x50
        __sys_trace_return+0x0/0x4
       GICv3: CPU0: ICC_SGI1R_EL1 12000
      
      This could be fixed with printk_deferred() but that might lessen its
      usefulness for debugging. So change it to pr_devel to keep it out of
      production kernels. Developers working on gic-v3 can enable it as
      needed in their kernels.
      Signed-off-by: NMark Salter <msalter@redhat.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      b6dd4d83
    • M
      irqchip/mips-gic: Avoid spuriously handling masked interrupts · 285cb4f6
      Matt Redfearn 提交于
      Commit 7778c4b2 ("irqchip: mips-gic: Use pcpu_masks to avoid reading
      GIC_SH_MASK*") removed the read of the hardware mask register when
      handling shared interrupts, instead using the driver's shadow pcpu_masks
      entry as the effective mask. Unfortunately this did not take account of
      the write to pcpu_masks during gic_shared_irq_domain_map, which
      effectively unmasks the interrupt early. If an interrupt is asserted,
      gic_handle_shared_int decodes and processes the interrupt even though it
      has not yet been unmasked via gic_unmask_irq, which also sets the
      appropriate bit in pcpu_masks.
      
      On the MIPS Boston board, when a console command line of
      "console=ttyS0,115200n8r" is passed, the modem status IRQ is enabled in
      the UART, which is immediately raised to the GIC. The interrupt has been
      mapped, but no handler has yet been registered, nor is it expected to be
      unmasked. However, the write to pcpu_masks in gic_shared_irq_domain_map
      has effectively unmasked it, resulting in endless reports of:
      
      [    5.058454] irq 13, desc: ffffffff80a7ad80, depth: 1, count: 0, unhandled: 0
      [    5.062057] ->handle_irq():  ffffffff801b1838,
      [    5.062175] handle_bad_irq+0x0/0x2c0
      
      Where IRQ 13 is the UART interrupt.
      
      To fix this, just remove the write to pcpu_masks in
      gic_shared_irq_domain_map. The existing write in gic_unmask_irq is the
      correct place for what is now the effective unmasking.
      
      Cc: stable@vger.kernel.org
      Fixes: 7778c4b2 ("irqchip: mips-gic: Use pcpu_masks to avoid reading GIC_SH_MASK*")
      Signed-off-by: NMatt Redfearn <matt.redfearn@mips.com>
      Reviewed-by: NPaul Burton <paul.burton@mips.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      285cb4f6
    • T
      mei: set device client to the disconnected state upon suspend. · 7ae079ac
      Tomas Winkler 提交于
      This fixes regression introduced by
      commit 8d52af67 ("mei: speed up the power down flow")
      
      In mei_cldev_disable during device power down flow, such as
      suspend or system power off, it jumps over disconnecting function
      to speed up the power down process, however, because the client is
      unlinked from the file_list (mei_cl_unlink) mei_cl_set_disconnected
      is not called from mei_cl_all_disconnect leaving resource leaking.
      The most visible is reference counter on underlying HW module is
      not decreased preventing to remove modules after suspend/resume cycles.
      Signed-off-by: NTomas Winkler <tomas.winkler@intel.com>
      Fixes: 8d52af67 ("mei: speed up the power down flow")
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ae079ac
    • M
      ANDROID: binder: synchronize_rcu() when using POLLFREE. · 5eeb2ca0
      Martijn Coenen 提交于
      To prevent races with ep_remove_waitqueue() removing the
      waitqueue at the same time.
      
      Reported-by: syzbot+a2a3c4909716e271487e@syzkaller.appspotmail.com
      Signed-off-by: NMartijn Coenen <maco@android.com>
      Cc: stable <stable@vger.kernel.org> # 4.14+
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5eeb2ca0
    • T
      binder: replace "%p" with "%pK" · 8ca86f16
      Todd Kjos 提交于
      The format specifier "%p" can leak kernel addresses. Use
      "%pK" instead. There were 4 remaining cases in binder.c.
      Signed-off-by: NTodd Kjos <tkjos@google.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8ca86f16