1. 07 4月, 2019 2 次提交
    • K
      fs: stream_open - opener for stream-like files so that read and write can run... · 10dce8af
      Kirill Smelkov 提交于
      fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
      
      Commit 9c225f26 ("vfs: atomic f_pos accesses as per POSIX") added
      locking for file.f_pos access and in particular made concurrent read and
      write not possible - now both those functions take f_pos lock for the
      whole run, and so if e.g. a read is blocked waiting for data, write will
      deadlock waiting for that read to complete.
      
      This caused regression for stream-like files where previously read and
      write could run simultaneously, but after that patch could not do so
      anymore. See e.g. commit 581d21a2 ("xenbus: fix deadlock on writes
      to /proc/xen/xenbus") which fixes such regression for particular case of
      /proc/xen/xenbus.
      
      The patch that added f_pos lock in 2014 did so to guarantee POSIX thread
      safety for read/write/lseek and added the locking to file descriptors of
      all regular files. In 2014 that thread-safety problem was not new as it
      was already discussed earlier in 2006.
      
      However even though 2006'th version of Linus's patch was adding f_pos
      locking "only for files that are marked seekable with FMODE_LSEEK (thus
      avoiding the stream-like objects like pipes and sockets)", the 2014
      version - the one that actually made it into the tree as 9c225f26 -
      is doing so irregardless of whether a file is seekable or not.
      
      See
      
          https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/
          https://lwn.net/Articles/180387
          https://lwn.net/Articles/180396
      
      for historic context.
      
      The reason that it did so is, probably, that there are many files that
      are marked non-seekable, but e.g. their read implementation actually
      depends on knowing current position to correctly handle the read. Some
      examples:
      
      	kernel/power/user.c		snapshot_read
      	fs/debugfs/file.c		u32_array_read
      	fs/fuse/control.c		fuse_conn_waiting_read + ...
      	drivers/hwmon/asus_atk0110.c	atk_debugfs_ggrp_read
      	arch/s390/hypfs/inode.c		hypfs_read_iter
      	...
      
      Despite that, many nonseekable_open users implement read and write with
      pure stream semantics - they don't depend on passed ppos at all. And for
      those cases where read could wait for something inside, it creates a
      situation similar to xenbus - the write could be never made to go until
      read is done, and read is waiting for some, potentially external, event,
      for potentially unbounded time -> deadlock.
      
      Besides xenbus, there are 14 such places in the kernel that I've found
      with semantic patch (see below):
      
      	drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write()
      	drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write()
      	drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write()
      	drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write()
      	net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write()
      	drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write()
      	drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write()
      	drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write()
      	net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write()
      	drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write()
      	drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write()
      	drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write()
      	drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write()
      	drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write()
      
      In addition to the cases above another regression caused by f_pos
      locking is that now FUSE filesystems that implement open with
      FOPEN_NONSEEKABLE flag, can no longer implement bidirectional
      stream-like files - for the same reason as above e.g. read can deadlock
      write locking on file.f_pos in the kernel.
      
      FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990 ("fuse:
      implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp
      in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and
      write routines not depending on current position at all, and with both
      read and write being potentially blocking operations:
      
      See
      
          https://github.com/libfuse/osspd
          https://lwn.net/Articles/308445
      
          https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406
          https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477
          https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510
      
      Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as
      "somewhat pipe-like files ..." with read handler not using offset.
      However that test implements only read without write and cannot exercise
      the deadlock scenario:
      
          https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131
          https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L146-L163
          https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L209-L216
      
      I've actually hit the read vs write deadlock for real while implementing
      my FUSE filesystem where there is /head/watch file, for which open
      creates separate bidirectional socket-like stream in between filesystem
      and its user with both read and write being later performed
      simultaneously. And there it is semantically not easy to split the
      stream into two separate read-only and write-only channels:
      
          https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169
      
      Let's fix this regression. The plan is:
      
      1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS -
         doing so would break many in-kernel nonseekable_open users which
         actually use ppos in read/write handlers.
      
      2. Add stream_open() to kernel to open stream-like non-seekable file
         descriptors. Read and write on such file descriptors would never use
         nor change ppos. And with that property on stream-like files read and
         write will be running without taking f_pos lock - i.e. read and write
         could be running simultaneously.
      
      3. With semantic patch search and convert to stream_open all in-kernel
         nonseekable_open users for which read and write actually do not
         depend on ppos and where there is no other methods in file_operations
         which assume @offset access.
      
      4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via
         steam_open if that bit is present in filesystem open reply.
      
         It was tempting to change fs/fuse/ open handler to use stream_open
         instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but
         grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE,
         and in particular GVFS which actually uses offset in its read and
         write handlers
      
      	https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
      	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
      	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
      	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481
      
         so if we would do such a change it will break a real user.
      
      5. Add stream_open and FOPEN_STREAM handling to stable kernels starting
         from v3.14+ (the kernel where 9c225f26 first appeared).
      
         This will allow to patch OSSPD and other FUSE filesystems that
         provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE
         in their open handler and this way avoid the deadlock on all kernel
         versions. This should work because fs/fuse/ ignores unknown open
         flags returned from a filesystem and so passing FOPEN_STREAM to a
         kernel that is not aware of this flag cannot hurt. In turn the kernel
         that is not aware of FOPEN_STREAM will be < v3.14 where just
         FOPEN_NONSEEKABLE is sufficient to implement streams without read vs
         write deadlock.
      
      This patch adds stream_open, converts /proc/xen/xenbus to it and adds
      semantic patch to automatically locate in-kernel places that are either
      required to be converted due to read vs write deadlock, or that are just
      safe to be converted because read and write do not use ppos and there
      are no other funky methods in file_operations.
      
      Regarding semantic patch I've verified each generated change manually -
      that it is correct to convert - and each other nonseekable_open instance
      left - that it is either not correct to convert there, or that it is not
      converted due to current stream_open.cocci limitations.
      
      The script also does not convert files that should be valid to convert,
      but that currently have .llseek = noop_llseek or generic_file_llseek for
      unknown reason despite file being opened with nonseekable_open (e.g.
      drivers/input/mousedev.c)
      
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yongzhi Pan <panyongzhi@gmail.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Nikolaus Rath <Nikolaus@rath.org>
      Cc: Han-Wen Nienhuys <hanwen@google.com>
      Signed-off-by: NKirill Smelkov <kirr@nexedi.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      10dce8af
    • L
      Merge tag 'rtc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · be76865d
      Linus Torvalds 提交于
      Pull RTC fixes from Alexandre Belloni:
      
       - Various alarm fixes for da9063, cros-ec and sh
      
       - sd3078 manufacturer name fix as this was introduced this cycle
      
      * tag 'rtc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
        rtc: da9063: set uie_unsupported when relevant
        rtc: sd3078: fix manufacturer name
        rtc: sh: Fix invalid alarm warning for non-enabled alarm
        rtc: cros-ec: Fail suspend/resume if wake IRQ can't be configured
      be76865d
  2. 06 4月, 2019 26 次提交
  3. 05 4月, 2019 12 次提交
    • G
      tty: mark Siemens R3964 line discipline as BROKEN · c7084edc
      Greg Kroah-Hartman 提交于
      The n_r3964 line discipline driver was written in a different time, when
      SMP machines were rare, and users were trusted to do the right thing.
      Since then, the world has moved on but not this code, it has stayed
      rooted in the past with its lovely hand-crafted list structures and
      loads of "interesting" race conditions all over the place.
      
      After attempting to clean up most of the issues, I just gave up and am
      now marking the driver as BROKEN so that hopefully someone who has this
      hardware will show up out of the woodwork (I know you are out there!)
      and will help with debugging a raft of changes that I had laying around
      for the code, but was too afraid to commit as odds are they would break
      things.
      
      Many thanks to Jann and Linus for pointing out the initial problems in
      this codebase, as well as many reviews of my attempts to fix the issues.
      It was a case of whack-a-mole, and as you can see, the mole won.
      Reported-by: NJann Horn <jannh@google.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c7084edc
    • S
      syscalls: Remove start and number from syscall_set_arguments() args · 32d92586
      Steven Rostedt (VMware) 提交于
      After removing the start and count arguments of syscall_get_arguments() it
      seems reasonable to remove them from syscall_set_arguments(). Note, as of
      today, there are no users of syscall_set_arguments(). But we are told that
      there will be soon. But for now, at least make it consistent with
      syscall_get_arguments().
      
      Link: http://lkml.kernel.org/r/20190327222014.GA32540@altlinux.org
      
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: "Dmitry V. Levin" <ldv@altlinux.org>
      Cc: x86@kernel.org
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      Cc: linux-hexagon@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: nios2-dev@lists.rocketboards.org
      Cc: openrisc@lists.librecores.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-riscv@lists.infradead.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Cc: linux-um@lists.infradead.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: linux-arch@vger.kernel.org
      Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes
      Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86
      Reviewed-by: NDmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      32d92586
    • S
      syscalls: Remove start and number from syscall_get_arguments() args · b35f549d
      Steven Rostedt (Red Hat) 提交于
      At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
      function call syscall_get_arguments() implemented in x86 was horribly
      written and not optimized for the standard case of passing in 0 and 6 for
      the starting index and the number of system calls to get. When looking at
      all the users of this function, I discovered that all instances pass in only
      0 and 6 for these arguments. Instead of having this function handle
      different cases that are never used, simply rewrite it to return the first 6
      arguments of a system call.
      
      This should help out the performance of tracing system calls by ptrace,
      ftrace and perf.
      
      Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org
      
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: "Dmitry V. Levin" <ldv@altlinux.org>
      Cc: x86@kernel.org
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      Cc: linux-hexagon@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: nios2-dev@lists.rocketboards.org
      Cc: openrisc@lists.librecores.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-riscv@lists.infradead.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Cc: linux-um@lists.infradead.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: linux-arch@vger.kernel.org
      Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts
      Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes
      Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86
      Reviewed-by: NDmitry V. Levin <ldv@altlinux.org>
      Reported-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      b35f549d
    • L
      Merge tag 'drm-fixes-2019-04-05' of git://anongit.freedesktop.org/drm/drm · ea2cec24
      Linus Torvalds 提交于
      Pull drm fixes from Dave Airlie:
       "Pretty quiet week, just some amdgpu and i915 fixes.
      
        i915:
         - deadlock fix
         - gvt fixes
      
        amdgpu:
         - PCIE dpm feature fix
         - Powerplay fixes"
      
      * tag 'drm-fixes-2019-04-05' of git://anongit.freedesktop.org/drm/drm:
        drm/i915/gvt: Fix kerneldoc typo for intel_vgpu_emulate_hotplug
        drm/i915/gvt: Correct the calculation of plane size
        drm/amdgpu: remove unnecessary rlc reset function on gfx9
        drm/i915: Always backoff after a drm_modeset_lock() deadlock
        drm/i915/gvt: do not let pin count of shadow mm go negative
        drm/i915/gvt: do not deliver a workload if its creation fails
        drm/amd/display: VBIOS can't be light up HDMI when restart system
        drm/amd/powerplay: fix possible hang with 3+ 4K monitors
        drm/amd/powerplay: correct data type to avoid overflow
        drm/amd/powerplay: add ECC feature bit
        drm/amd/amdgpu: fix PCIe dpm feature issue (v3)
      ea2cec24
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 0548740e
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Several hash table refcount fixes in batman-adv, from Sven
          Eckelmann.
      
       2) Use after free in bpf_evict_inode(), from Daniel Borkmann.
      
       3) Fix mdio bus registration in ixgbe, from Ivan Vecera.
      
       4) Unbounded loop in __skb_try_recv_datagram(), from Paolo Abeni.
      
       5) ila rhashtable corruption fix from Herbert Xu.
      
       6) Don't allow upper-devices to be added to vrf devices, from Sabrina
          Dubroca.
      
       7) Add qmi_wwan device ID for Olicard 600, from Bjørn Mork.
      
       8) Don't leave skb->next poisoned in __netif_receive_skb_list_ptype,
          from Alexander Lobakin.
      
       9) Missing IDR checks in mlx5 driver, from Aditya Pakki.
      
      10) Fix false connection termination in ktls, from Jakub Kicinski.
      
      11) Work around some ASPM issues with r8169 by disabling rx interrupt
          coalescing on certain chips. From Heiner Kallweit.
      
      12) Properly use per-cpu qstat values on NOLOCK qdiscs, from Paolo
          Abeni.
      
      13) Fully initialize sockaddr_in structures in SCTP, from Xin Long.
      
      14) Various BPF flow dissector fixes from Stanislav Fomichev.
      
      15) Divide by zero in act_sample, from Davide Caratti.
      
      16) Fix bridging multicast regression introduced by rhashtable
          conversion, from Nikolay Aleksandrov.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
        ibmvnic: Fix completion structure initialization
        ipv6: sit: reset ip header pointer in ipip6_rcv
        net: bridge: always clear mcast matching struct on reports and leaves
        libcxgb: fix incorrect ppmax calculation
        vlan: conditional inclusion of FCoE hooks to match netdevice.h and bnx2x
        sch_cake: Make sure we can write the IP header before changing DSCP bits
        sch_cake: Use tc_skb_protocol() helper for getting packet protocol
        tcp: Ensure DCTCP reacts to losses
        net/sched: act_sample: fix divide by zero in the traffic path
        net: thunderx: fix NULL pointer dereference in nicvf_open/nicvf_stop
        net: hns: Fix sparse: some warnings in HNS drivers
        net: hns: Fix WARNING when remove HNS driver with SMMU enabled
        net: hns: fix ICMP6 neighbor solicitation messages discard problem
        net: hns: Fix probabilistic memory overwrite when HNS driver initialized
        net: hns: Use NAPI_POLL_WEIGHT for hns driver
        net: hns: fix KASAN: use-after-free in hns_nic_net_xmit_hw()
        flow_dissector: rst'ify documentation
        ipv6: Fix dangling pointer when ipv6 fragment
        net-gro: Fix GRO flush when receiving a GSO packet.
        flow_dissector: document BPF flow dissector environment
        ...
      0548740e
    • T
      ibmvnic: Fix completion structure initialization · bbd669a8
      Thomas Falcon 提交于
      Fix device initialization completion handling for vNIC adapters.
      Initialize the completion structure on probe and reinitialize when needed.
      This also fixes a race condition during kdump where the driver can attempt
      to access the completion struct before it is initialized:
      
      Unable to handle kernel paging request for data at address 0x00000000
      Faulting instruction address: 0xc0000000081acbe0
      Oops: Kernel access of bad area, sig: 11 [#1]
      LE SMP NR_CPUS=2048 NUMA pSeries
      Modules linked in: ibmvnic(+) ibmveth sunrpc overlay squashfs loop
      CPU: 19 PID: 301 Comm: systemd-udevd Not tainted 4.18.0-64.el8.ppc64le #1
      NIP:  c0000000081acbe0 LR: c0000000081ad964 CTR: c0000000081ad900
      REGS: c000000027f3f990 TRAP: 0300   Not tainted  (4.18.0-64.el8.ppc64le)
      MSR:  800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28228288  XER: 00000006
      CFAR: c000000008008934 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1
      GPR00: c0000000081ad964 c000000027f3fc10 c0000000095b5800 c0000000221b4e58
      GPR04: 0000000000000003 0000000000000001 000049a086918581 00000000000000d4
      GPR08: 0000000000000007 0000000000000000 ffffffffffffffe8 d0000000014dde28
      GPR12: c0000000081ad900 c000000009a00c00 0000000000000001 0000000000000100
      GPR16: 0000000000000038 0000000000000007 c0000000095e2230 0000000000000006
      GPR20: 0000000000400140 0000000000000001 c00000000910c880 0000000000000000
      GPR24: 0000000000000000 0000000000000006 0000000000000000 0000000000000003
      GPR28: 0000000000000001 0000000000000001 c0000000221b4e60 c0000000221b4e58
      NIP [c0000000081acbe0] __wake_up_locked+0x50/0x100
      LR [c0000000081ad964] complete+0x64/0xa0
      Call Trace:
      [c000000027f3fc10] [c000000027f3fc60] 0xc000000027f3fc60 (unreliable)
      [c000000027f3fc60] [c0000000081ad964] complete+0x64/0xa0
      [c000000027f3fca0] [d0000000014dad58] ibmvnic_handle_crq+0xce0/0x1160 [ibmvnic]
      [c000000027f3fd50] [d0000000014db270] ibmvnic_tasklet+0x98/0x130 [ibmvnic]
      [c000000027f3fda0] [c00000000813f334] tasklet_action_common.isra.3+0xc4/0x1a0
      [c000000027f3fe00] [c000000008cd13f4] __do_softirq+0x164/0x400
      [c000000027f3fef0] [c00000000813ed64] irq_exit+0x184/0x1c0
      [c000000027f3ff20] [c0000000080188e8] __do_irq+0xb8/0x210
      [c000000027f3ff90] [c00000000802d0a4] call_do_irq+0x14/0x24
      [c000000026a5b010] [c000000008018adc] do_IRQ+0x9c/0x130
      [c000000026a5b060] [c000000008008ce4] hardware_interrupt_common+0x114/0x120
      Signed-off-by: NThomas Falcon <tlfalcon@linux.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bbd669a8
    • L
      ipv6: sit: reset ip header pointer in ipip6_rcv · bb9bd814
      Lorenzo Bianconi 提交于
      ipip6 tunnels run iptunnel_pull_header on received skbs. This can
      determine the following use-after-free accessing iph pointer since
      the packet will be 'uncloned' running pskb_expand_head if it is a
      cloned gso skb (e.g if the packet has been sent though a veth device)
      
      [  706.369655] BUG: KASAN: use-after-free in ipip6_rcv+0x1678/0x16e0 [sit]
      [  706.449056] Read of size 1 at addr ffffe01b6bd855f5 by task ksoftirqd/1/=
      [  706.669494] Hardware name: HPE ProLiant m400 Server/ProLiant m400 Server, BIOS U02 08/19/2016
      [  706.771839] Call trace:
      [  706.801159]  dump_backtrace+0x0/0x2f8
      [  706.845079]  show_stack+0x24/0x30
      [  706.884833]  dump_stack+0xe0/0x11c
      [  706.925629]  print_address_description+0x68/0x260
      [  706.982070]  kasan_report+0x178/0x340
      [  707.025995]  __asan_report_load1_noabort+0x30/0x40
      [  707.083481]  ipip6_rcv+0x1678/0x16e0 [sit]
      [  707.132623]  tunnel64_rcv+0xd4/0x200 [tunnel4]
      [  707.185940]  ip_local_deliver_finish+0x3b8/0x988
      [  707.241338]  ip_local_deliver+0x144/0x470
      [  707.289436]  ip_rcv_finish+0x43c/0x14b0
      [  707.335447]  ip_rcv+0x628/0x1138
      [  707.374151]  __netif_receive_skb_core+0x1670/0x2600
      [  707.432680]  __netif_receive_skb+0x28/0x190
      [  707.482859]  process_backlog+0x1d0/0x610
      [  707.529913]  net_rx_action+0x37c/0xf68
      [  707.574882]  __do_softirq+0x288/0x1018
      [  707.619852]  run_ksoftirqd+0x70/0xa8
      [  707.662734]  smpboot_thread_fn+0x3a4/0x9e8
      [  707.711875]  kthread+0x2c8/0x350
      [  707.750583]  ret_from_fork+0x10/0x18
      
      [  707.811302] Allocated by task 16982:
      [  707.854182]  kasan_kmalloc.part.1+0x40/0x108
      [  707.905405]  kasan_kmalloc+0xb4/0xc8
      [  707.948291]  kasan_slab_alloc+0x14/0x20
      [  707.994309]  __kmalloc_node_track_caller+0x158/0x5e0
      [  708.053902]  __kmalloc_reserve.isra.8+0x54/0xe0
      [  708.108280]  __alloc_skb+0xd8/0x400
      [  708.150139]  sk_stream_alloc_skb+0xa4/0x638
      [  708.200346]  tcp_sendmsg_locked+0x818/0x2b90
      [  708.251581]  tcp_sendmsg+0x40/0x60
      [  708.292376]  inet_sendmsg+0xf0/0x520
      [  708.335259]  sock_sendmsg+0xac/0xf8
      [  708.377096]  sock_write_iter+0x1c0/0x2c0
      [  708.424154]  new_sync_write+0x358/0x4a8
      [  708.470162]  __vfs_write+0xc4/0xf8
      [  708.510950]  vfs_write+0x12c/0x3d0
      [  708.551739]  ksys_write+0xcc/0x178
      [  708.592533]  __arm64_sys_write+0x70/0xa0
      [  708.639593]  el0_svc_handler+0x13c/0x298
      [  708.686646]  el0_svc+0x8/0xc
      
      [  708.739019] Freed by task 17:
      [  708.774597]  __kasan_slab_free+0x114/0x228
      [  708.823736]  kasan_slab_free+0x10/0x18
      [  708.868703]  kfree+0x100/0x3d8
      [  708.905320]  skb_free_head+0x7c/0x98
      [  708.948204]  skb_release_data+0x320/0x490
      [  708.996301]  pskb_expand_head+0x60c/0x970
      [  709.044399]  __iptunnel_pull_header+0x3b8/0x5d0
      [  709.098770]  ipip6_rcv+0x41c/0x16e0 [sit]
      [  709.146873]  tunnel64_rcv+0xd4/0x200 [tunnel4]
      [  709.200195]  ip_local_deliver_finish+0x3b8/0x988
      [  709.255596]  ip_local_deliver+0x144/0x470
      [  709.303692]  ip_rcv_finish+0x43c/0x14b0
      [  709.349705]  ip_rcv+0x628/0x1138
      [  709.388413]  __netif_receive_skb_core+0x1670/0x2600
      [  709.446943]  __netif_receive_skb+0x28/0x190
      [  709.497120]  process_backlog+0x1d0/0x610
      [  709.544169]  net_rx_action+0x37c/0xf68
      [  709.589131]  __do_softirq+0x288/0x1018
      
      [  709.651938] The buggy address belongs to the object at ffffe01b6bd85580
                      which belongs to the cache kmalloc-1024 of size 1024
      [  709.804356] The buggy address is located 117 bytes inside of
                      1024-byte region [ffffe01b6bd85580, ffffe01b6bd85980)
      [  709.946340] The buggy address belongs to the page:
      [  710.003824] page:ffff7ff806daf600 count:1 mapcount:0 mapping:ffffe01c4001f600 index:0x0
      [  710.099914] flags: 0xfffff8000000100(slab)
      [  710.149059] raw: 0fffff8000000100 dead000000000100 dead000000000200 ffffe01c4001f600
      [  710.242011] raw: 0000000000000000 0000000000380038 00000001ffffffff 0000000000000000
      [  710.334966] page dumped because: kasan: bad access detected
      
      Fix it resetting iph pointer after iptunnel_pull_header
      
      Fixes: a09a4c8d ("tunnels: Remove encapsulation offloads on decap")
      Tested-by: NJianlin Shi <jishi@redhat.com>
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bb9bd814
    • L
      Merge tag 'riscv-for-linus-5.1-rc4' of... · 8e22ba96
      Linus Torvalds 提交于
      Merge tag 'riscv-for-linus-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
      
      Pull RISC-V fixes from Palmer Dabbelt:
       "I dropped the ball a bit here: these patches should all probably have
        been part of rc2, but I wanted to get around to properly testing them
        in the various configurations (qemu32, qeum64, unleashed) first.
      
        Unfortunately I've been traveling and didn't have time to actually do
        that, but since these fix concrete bugs and pass my old set of tests I
        don't want to delay the fixes any longer.
      
        There are four independent fixes here:
      
         - A fix for the rv32 port that corrects the 64-bit user accesor's
           fixup label address.
      
         - A fix for a regression introduced during the merge window that
           broke medlow configurations at run time. This patch also includes a
           fix that disables ftrace for the same set of functions, which was
           found by inspection at the same time.
      
         - A modification of the memory map to avoid overlapping the FIXMAP
           and VMALLOC regions on systems with small memory maps.
      
         - A fix to the module handling code to use the correct syntax for
           probing Kconfig entries.
      
        These have passed my standard test flow, but I didn't have time to
        expand that testing like I said I would"
      
      * tag 'riscv-for-linus-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
        RISC-V: Use IS_ENABLED(CONFIG_CMODEL_MEDLOW)
        RISC-V: Fix FIXMAP_TOP to avoid overlap with VMALLOC area
        RISC-V: Always compile mm/init.c with cmodel=medany and notrace
        riscv: fix accessing 8-byte variable from RV32
      8e22ba96
    • N
      net: bridge: always clear mcast matching struct on reports and leaves · 1515a63f
      Nikolay Aleksandrov 提交于
      We need to be careful and always zero the whole br_ip struct when it is
      used for matching since the rhashtable change. This patch fixes all the
      places which didn't properly clear it which in turn might've caused
      mismatches.
      
      Thanks for the great bug report with reproducing steps and bisection.
      
      Steps to reproduce (from the bug report):
      ip link add br0 type bridge mcast_querier 1
      ip link set br0 up
      
      ip link add v2 type veth peer name v3
      ip link set v2 master br0
      ip link set v2 up
      ip link set v3 up
      ip addr add 3.0.0.2/24 dev v3
      
      ip netns add test
      ip link add v1 type veth peer name v1 netns test
      ip link set v1 master br0
      ip link set v1 up
      ip -n test link set v1 up
      ip -n test addr add 3.0.0.1/24 dev v1
      
      # Multicast receiver
      ip netns exec test socat
      UDP4-RECVFROM:5588,ip-add-membership=224.224.224.224:3.0.0.1,fork -
      
      # Multicast sender
      echo hello | nc -u -s 3.0.0.2 224.224.224.224 5588
      
      Reported-by: liam.mcbirnie@boeing.com
      Fixes: 19e3a9c9 ("net: bridge: convert multicast to generic rhashtable")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1515a63f
    • L
      Merge tag 'pm-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 20ad5494
      Linus Torvalds 提交于
      Pull power management fixes from Rafael Wysocki:
       "These fix up the intel_pstate driver after recent changes to prevent
        it from printing pointless messages and update the turbostat utility
        (mostly fixes and new hardware support).
      
        Specifics:
      
         - Make intel_pstate only load on Intel processors and prevent it from
           printing pointless failure messages (Borislav Petkov).
      
         - Update the turbostat utility:
            * Assorted fixes (Ben Hutchings, Len Brown, Prarit Bhargava).
            * Support for AMD Fam 17h (Zen) RAPL and package power (Calvin
              Walton).
            * Support for Intel Icelake and for systems with more than one die
              per package (Len Brown).
            * Cleanups (Len Brown)"
      
      * tag 'pm-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq/intel_pstate: Load only on Intel hardware
        tools/power turbostat: update version number
        tools/power turbostat: Warn on bad ACPI LPIT data
        tools/power turbostat: Add checks for failure of fgets() and fscanf()
        tools/power turbostat: Also read package power on AMD F17h (Zen)
        tools/power turbostat: Add support for AMD Fam 17h (Zen) RAPL
        tools/power turbostat: Do not display an error on systems without a cpufreq driver
        tools/power turbostat: Add Die column
        tools/power turbostat: Add Icelake support
        tools/power turbostat: Cleanup CNL-specific code
        tools/power turbostat: Cleanup CC3-skip code
        tools/power turbostat: Restore ability to execute in topology-order
      20ad5494
    • L
      Merge tag 'acpi-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · b512f712
      Linus Torvalds 提交于
      Pull ACPI fix from Rafael Wysocki:
       "Prevent stale GPE events from triggering spurious system wakeups from
        suspend-to-idle (Furquan Shaikh)"
      
      * tag 'acpi-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPICA: Clear status of GPEs before enabling them
      b512f712
    • D
      Merge tag 'drm-intel-fixes-2019-04-04' of... · 23b5f422
      Dave Airlie 提交于
      Merge tag 'drm-intel-fixes-2019-04-04' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      Only one fix for DSC (backoff after drm_modeset_lock deadlock)
      and GVT's fixes including vGPU display plane size calculation,
      shadow mm pin count, error recovery path for workload create
      and one kerneldoc fix.
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      
      From: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190404161116.GA14522@intel.com
      23b5f422