1. 17 May, 2019 40 commits
    • D
      PCI: hv: Add pci_destroy_slot() in pci_devices_present_work(), if necessary · 9fa23ea1
      Dexuan Cui authored
      commit 340d455699400f2c2c0f9b3f703ade3085cdb501 upstream.
      
      When we hot-remove a device, usually the host sends us a PCI_EJECT message,
      and a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
      
      When we execute the quick hot-add/hot-remove test, the host may not send
      us the PCI_EJECT message if the guest has not fully finished the
      initialization by sending the PCI_RESOURCES_ASSIGNED* message to the
      host, so it's potentially unsafe to only depend on the
      pci_destroy_slot() in hv_eject_device_work() because the code path
      
      create_root_hv_pci_bus()
       -> hv_pci_assign_slots()
      
      is not called in this case. Note: in this case, the host still sends the
      guest a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
      
      In the quick hot-add/hot-remove test, we can hit the following race: before
      the code path
      
      pci_devices_present_work()
       -> new_pcichild_device()
      
      adds the new device into the hbus->children list, we may have already
      received the PCI_EJECT message, and since the tasklet handler
      
      hv_pci_onchannelcallback()
      
      may fail to find the "hpdev" by calling
      
      get_pcichild_wslot(hbus, dev_message->wslot.slot)
      
      hv_pci_eject_device() is not called. Later, the continuing execution of
      
      create_root_hv_pci_bus()
       -> hv_pci_assign_slots()
      
      creates the slot and the PCI_BUS_RELATIONS message with
      bus_rel->device_count == 0 removes the device from hbus->children, and
      we end up being unable to remove the slot in
      
      hv_pci_remove()
       -> hv_pci_remove_slots()
      
      Remove the slot in pci_devices_present_work() when the device
      is removed to address this race.
      
      pci_devices_present_work() and hv_eject_device_work() run in the
      single-threaded hbus->wq, so there is no double-remove issue for the
      slot.
      
      We cannot offload hv_pci_eject_device() from hv_pci_onchannelcallback()
      to the workqueue, because we need hv_pci_onchannelcallback() to
      synchronously call hv_pci_eject_device() to poll the channel
      ring buffer to work around the "hangs in hv_compose_msi_msg()" issue
      fixed in commit de0aa7b2 ("PCI: hv: Fix 2 hang issues in
      hv_compose_msi_msg()").
      
      Fixes: a15f2c08 ("PCI: hv: support reporting serial number as slot information")
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      [lorenzo.pieralisi@arm.com: rewritten commit log]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
      Reviewed-by: Michael Kelley <mikelley@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • D
      PCI: hv: Add hv_pci_remove_slots() when we unload the driver · 76888d13
      Dexuan Cui authored
      commit 15becc2b56c6eda3d9bf5ae993bafd5661c1fad1 upstream.
      
      When we unload the pci-hyperv host controller driver, the host does not
      send us a PCI_EJECT message.
      
      In this case we also need to make sure the sysfs PCI slot directory is
      removed; otherwise a command on a slot file, e.g.:
      
      "cat /sys/bus/pci/slots/2/address"
      
      will trigger a
      
      "BUG: unable to handle kernel paging request"
      
      and, if we unload/reload the driver several times, we would end up with
      stale slot entries in /sys/bus/pci/slots/:
      
      root@localhost:~# ls -rtl  /sys/bus/pci/slots/
      total 0
      drwxr-xr-x 2 root root 0 Feb  7 10:49 2
      drwxr-xr-x 2 root root 0 Feb  7 10:49 2-1
      drwxr-xr-x 2 root root 0 Feb  7 10:51 2-2
      
      Add the missing code to remove the PCI slot and fix the current
      behaviour.
      
      Fixes: a15f2c08 ("PCI: hv: support reporting serial number as slot information")
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      [lorenzo.pieralisi@arm.com: reformatted the log]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
      Reviewed-by: Michael Kelley <mikelley@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • D
      PCI: hv: Fix a memory leak in hv_eject_device_work() · a47e0054
      Dexuan Cui authored
      commit 05f151a73ec2b23ffbff706e5203e729a995cdc2 upstream.
      
      When a device is created in new_pcichild_device(), hpdev->refs is set
      to 2 (i.e. the initial value of 1 plus the get_pcichild()).
      
      When we hot remove the device from the host, in a Linux VM we first call
      hv_pci_eject_device(), which increases hpdev->refs by get_pcichild() and
      then schedules a work of hv_eject_device_work(), so hpdev->refs becomes
      3 (let's ignore the paired get/put_pcichild() in other places). But in
      hv_eject_device_work(), currently we only call put_pcichild() twice,
      meaning the 'hpdev' struct can't be freed in put_pcichild().
      
      Add one put_pcichild() to fix the memory leak.
      
      The device can also be removed when we run "rmmod pci-hyperv". On this
      path (hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_devices_present()),
      hpdev->refs is 2, and we do correctly call put_pcichild() twice in
      pci_devices_present_work().
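
The reference counting described above can be replayed in a small userspace model. This is only a sketch: `struct child` and the `child_*` helpers are hypothetical stand-ins for the driver's `hpdev` handling, not the driver's code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hpdev object; only the refcount is modelled. */
struct child {
    int refs;
    bool freed;
};

static void child_init(struct child *c) { c->refs = 1; c->freed = false; }
static void child_get(struct child *c) { c->refs++; }

/* Drops one reference and marks the object freed when the last one goes. */
static void child_put(struct child *c)
{
    if (--c->refs == 0)
        c->freed = true;
}

/*
 * Replays the hot-remove path from the commit message.
 * puts_in_eject_work is 2 for the old behaviour and 3 with the extra put.
 * Returns true when the object ends up released.
 */
static bool hot_remove_releases(int puts_in_eject_work)
{
    struct child c;

    child_init(&c);   /* initial value: refs = 1 */
    child_get(&c);    /* taken at device creation: refs = 2 */
    child_get(&c);    /* taken when the eject is scheduled: refs = 3 */

    for (int i = 0; i < puts_in_eject_work; i++)
        child_put(&c);

    return c.freed;
}
```

With two puts the model keeps one reference outstanding, which corresponds to the leak; the third put releases the object.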
      
      Fixes: 4daace0d ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      [lorenzo.pieralisi@arm.com: commit log rework]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
      Reviewed-by: Michael Kelley <mikelley@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • L
      powerpc/booke64: set RI in default MSR · 4179b858
      Laurentiu Tudor authored
      commit 5266e58d6cd90ac85c187d673093ad9cb649e16d upstream.
      
      Set RI in the default kernel's MSR so that the architected way of
      detecting unrecoverable machine check interrupts has a chance to work.
      This is in line with the MSR setup of the rest of booke powerpc
      architectures configured here.
      
      Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • R
      powerpc/powernv/idle: Restore IAMR after idle · 71b20cdb
      Russell Currey authored
      commit a3f3072db6cad40895c585dce65e36aab997f042 upstream.
      
      Without restoring the IAMR after idle, execution prevention on POWER9
      with Radix MMU is overwritten and the kernel can freely execute
      userspace without faulting.
      
      This is necessary when returning from any stop state that modifies
      user state, as well as hypervisor state.
      
      To test how this fails without this patch, load the lkdtm driver and
      do the following:
      
        $ echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT
      
      which won't fault, then boot the kernel with powersave=off, where it
      will fault. Applying this patch will fix this.
      
      Fixes: 3b10d009 ("powerpc/mm/radix: Prevent kernel execution of user space")
      Cc: stable@vger.kernel.org # v4.10+
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Reviewed-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • R
      powerpc/book3s/64: check for NULL pointer in pgd_alloc() · 69c2b71c
      Rick Lindsley authored
      commit f39356261c265a0689d7ee568132d516e8b6cecc upstream.
      
      When the memset code was added to pgd_alloc(), it failed to consider
      that kmem_cache_alloc() can return NULL. It's uncommon, but not
      impossible under heavy memory contention. Example oops:
      
        Unable to handle kernel paging request for data at address 0x00000000
        Faulting instruction address: 0xc0000000000a4000
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE SMP NR_CPUS=2048 NUMA pSeries
        CPU: 70 PID: 48471 Comm: entrypoint.sh Kdump: loaded Not tainted 4.14.0-115.6.1.el7a.ppc64le #1
        task: c000000334a00000 task.stack: c000000331c00000
        NIP:  c0000000000a4000 LR: c00000000012f43c CTR: 0000000000000020
        REGS: c000000331c039c0 TRAP: 0300   Not tainted  (4.14.0-115.6.1.el7a.ppc64le)
        MSR:  800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022840  XER: 20040000
        CFAR: c000000000008874 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
        ...
        NIP [c0000000000a4000] memset+0x68/0x104
        LR [c00000000012f43c] mm_init+0x27c/0x2f0
        Call Trace:
          mm_init+0x260/0x2f0 (unreliable)
          copy_mm+0x11c/0x638
          copy_process.isra.28.part.29+0x6fc/0x1080
          _do_fork+0xdc/0x4c0
          ppc_clone+0x8/0xc
        Instruction dump:
        409e000c b0860000 38c60002 409d000c 90860000 38c60004 78a0d183 78a506a0
        7c0903a6 41820034 60000000 60420000 <f8860000> f8860008 f8860010 f8860018
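
The missing check can be illustrated in userspace. This is a minimal sketch, assuming a hypothetical `alloc_fn` hook in place of the kernel allocator so that the failure path can be exercised; it is not the powerpc code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator hook, standing in for the kernel allocation call. */
typedef void *(*alloc_fn)(size_t);

/* Allocates a table and zeroes it, but only when the allocation succeeded. */
static void *alloc_zeroed_table(alloc_fn alloc, size_t size)
{
    void *table = alloc(size);

    if (!table)
        return NULL;            /* never hand NULL to memset() */

    memset(table, 0, size);
    return table;
}

/* Test helper that simulates allocation failure under memory pressure. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

Passing `failing_alloc` shows that the caller gets NULL back instead of faulting inside `memset()`, which is the failure shown in the trace above.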
      
      Fixes: fc5c2f4a ("powerpc/mm/hash64: Zero PGD pages on allocation")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: Rick Lindsley <ricklind@vnet.linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • D
      drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctl · e9ec5073
      Dan Carpenter authored
      commit 6a024330650e24556b8a18cc654ad00cfecf6c6c upstream.
      
      The "param.count" value is a u64 that comes from the user.  The code
      later in the function assumes that param.count is at least one and if
      it's not then it leads to an Oops when we dereference the ZERO_SIZE_PTR.
      
      Also the addition can have an integer overflow which would lead us to
      allocate a smaller "pages" array than required.  I can't immediately
      tell what the possible runtime implications are, but it's safest to
      prevent the overflow.
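
Both checks, a non-zero count and an addition that cannot wrap, can be sketched as follows. The helper name, the 4 KiB page size and the `offset` parameter are assumptions made for illustration; this is not the driver's code.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed page geometry for the sketch (4 KiB pages). */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Computes how many pages cover `count` bytes that start `offset` bytes
 * into the first page. Returns 0 and stores the result in *num_pages,
 * or -EINVAL when the count is zero or the arithmetic would wrap.
 */
static int pages_needed(uint64_t count, uint64_t offset, uint64_t *num_pages)
{
    if (count == 0)
        return -EINVAL;                 /* later code assumes at least one byte */

    if (count > UINT64_MAX - offset)
        return -EINVAL;                 /* count + offset would wrap */

    if (count + offset > UINT64_MAX - (SKETCH_PAGE_SIZE - 1))
        return -EINVAL;                 /* rounding up would wrap */

    *num_pages = (count + offset + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
    return 0;
}
```

Rejecting the request before the addition keeps the page count from coming out smaller than the range it has to cover.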
      
      Link: http://lkml.kernel.org/r/20181218082129.GE32567@kadam
      Fixes: 6db71994 ("drivers/virt: introduce Freescale hypervisor management driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Timur Tabi <timur@freescale.com>
      Cc: Mihai Caraman <mihai.caraman@freescale.com>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • D
      drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl · ee3b53d8
      Dan Carpenter authored
      commit c8ea3663f7a8e6996d44500ee818c9330ac4fd88 upstream.
      
      strndup_user() returns error pointers on error, and then in the error
      handling we pass the error pointers to kfree().  It will cause an Oops.
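
The hazard can be modelled in userspace. In this sketch all `sketch_*` helpers are hypothetical imitations of the kernel's error-pointer convention, and `sketch_dup()` stands in for a duplication call that reports failure through an error pointer instead of NULL.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_ERRNO 4095

/* Hypothetical userspace imitations of the error-pointer helpers. */
static void *sketch_err_ptr(long err) { return (void *)(intptr_t)err; }
static long sketch_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Duplicates a string; failures come back as error pointers, never NULL. */
static char *sketch_dup(const char *src, size_t max)
{
    size_t len;
    char *copy;

    if (!src)
        return sketch_err_ptr(-EFAULT);
    len = strlen(src);
    if (len >= max)
        return sketch_err_ptr(-EINVAL);
    copy = malloc(len + 1);
    if (!copy)
        return sketch_err_ptr(-ENOMEM);
    memcpy(copy, src, len + 1);
    return copy;
}

/* Duplicates two strings and frees only real allocations on failure. */
static long dup_pair(const char *a, const char *b, char **out_a, char **out_b)
{
    char *pa, *pb;

    pa = sketch_dup(a, 64);
    if (sketch_is_err(pa))
        return sketch_ptr_err(pa);      /* nothing allocated yet */

    pb = sketch_dup(b, 64);
    if (sketch_is_err(pb)) {
        free(pa);                       /* pa is real memory, pb is not */
        return sketch_ptr_err(pb);
    }

    *out_a = pa;
    *out_b = pb;
    return 0;
}
```

The key point is that an error pointer is an encoded errno value, so it must be returned or converted, never handed to the free routine.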
      
      Link: http://lkml.kernel.org/r/20181218082003.GD32567@kadam
      Fixes: 6db71994 ("drivers/virt: introduce Freescale hypervisor management driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Timur Tabi <timur@freescale.com>
      Cc: Mihai Caraman <mihai.caraman@freescale.com>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • P
      tipc: fix hanging clients using poll with EPOLLOUT flag · afa485dc
      Parthasarathy Bhuvaragan authored
      [ Upstream commit ff946833b70e0c7f93de9a3f5b329b5ae2287b38 ]
      
      Commit 517d7c79 ("tipc: fix hanging poll() for stream sockets")
      introduced a regression for clients using non-blocking sockets.
      After that commit, we send an EPOLLOUT event to the client even in
      the TIPC_CONNECTING state. This causes the subsequent send() to fail
      with ENOTCONN, as the socket is not yet in the TIPC_ESTABLISHED state.
      
      In this commit, we:
      - improve the fix for hanging poll() by replacing sk_data_ready()
        with sk_state_change() to wake up all clients.
      - revert the faulty updates introduced by commit 517d7c79
        ("tipc: fix hanging poll() for stream sockets").
      
      Fixes: 517d7c79 ("tipc: fix hanging poll() for stream sockets")
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.se>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • P
      isdn: bas_gigaset: use usb_fill_int_urb() properly · 98652e0b
      Paul Bolle authored
      [ Upstream commit 4014dfae3ccaaf3ec19c9ae0691a3f14e7132eae ]
      
      The switch to make bas_gigaset use usb_fill_int_urb() - instead of
      filling that urb "by hand" - missed the subtle ordering of the previous
      code.
      
      See, before the switch urb->dev was set to a member somewhere deep in a
      complicated structure and then supplied to usb_rcvisocpipe() and
      usb_sndisocpipe(). After that switch urb->dev wasn't set to anything
      specific before being supplied to those two macros. This triggers a
      nasty oops:
      
          BUG: unable to handle kernel NULL pointer dereference at 00000000
          #PF error: [normal kernel read fault]
          *pde = 00000000
          Oops: 0000 [#1] SMP
          CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.1.0-0.rc4.1.local0.fc28.i686 #1
          Hardware name: IBM 2525FAG/2525FAG, BIOS 74ET64WW (2.09 ) 12/14/2006
          EIP: gigaset_init_bchannel+0x89/0x320 [bas_gigaset]
          Code: 75 07 83 8b 84 00 00 00 40 8d 47 74 c7 07 01 00 00 00 89 45 f0 8b 44 b7 68 85 c0 0f 84 6a 02 00 00 8b 48 28 8b 93 88 00 00 00 <8b> 09 8d 54 12 03 c1 e2 0f c1 e1 08 09 ca 8b 8b 8c 00 00 00 80 ca
          EAX: f05ec200 EBX: ed404200 ECX: 00000000 EDX: 00000000
          ESI: 00000000 EDI: f065a000 EBP: f30c9f40 ESP: f30c9f20
          DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
          CR0: 80050033 CR2: 00000000 CR3: 0ddc7000 CR4: 000006d0
          Call Trace:
           <SOFTIRQ>
           ? gigaset_isdn_connD+0xf6/0x140 [gigaset]
           gigaset_handle_event+0x173e/0x1b90 [gigaset]
           tasklet_action_common.isra.16+0x4e/0xf0
           tasklet_action+0x1e/0x20
           __do_softirq+0xb2/0x293
           ? __irqentry_text_end+0x3/0x3
           call_on_stack+0x45/0x50
           </SOFTIRQ>
           ? irq_exit+0xb5/0xc0
           ? do_IRQ+0x78/0xd0
           ? acpi_idle_enter_s2idle+0x50/0x50
           ? common_interrupt+0xd4/0xdc
           ? acpi_idle_enter_s2idle+0x50/0x50
           ? sched_cpu_activate+0x1b/0xf0
           ? acpi_fan_resume.cold.7+0x9/0x18
           ? cpuidle_enter_state+0x152/0x4c0
           ? cpuidle_enter+0x14/0x20
           ? call_cpuidle+0x21/0x40
           ? do_idle+0x1c8/0x200
           ? cpu_startup_entry+0x25/0x30
           ? rest_init+0x88/0x8a
           ? arch_call_rest_init+0xd/0x19
           ? start_kernel+0x42f/0x448
           ? i386_start_kernel+0xac/0xb0
           ? startup_32_smp+0x164/0x168
          Modules linked in: ppp_generic slhc capi bas_gigaset gigaset kernelcapi nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc ipw2200 iTCO_wdt gpio_ich snd_intel8x0 libipw iTCO_vendor_support snd_ac97_codec lib80211 ppdev ac97_bus snd_seq cfg80211 snd_seq_device pcspkr thinkpad_acpi lpc_ich snd_pcm i2c_i801 snd_timer ledtrig_audio snd soundcore rfkill parport_pc parport pcc_cpufreq acpi_cpufreq i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sdhci_pci sysimgblt cqhci fb_sys_fops drm sdhci mmc_core tg3 ata_generic serio_raw yenta_socket pata_acpi video
          CR2: 0000000000000000
          ---[ end trace 1fe07487b9200c73 ]---
          EIP: gigaset_init_bchannel+0x89/0x320 [bas_gigaset]
          Code: 75 07 83 8b 84 00 00 00 40 8d 47 74 c7 07 01 00 00 00 89 45 f0 8b 44 b7 68 85 c0 0f 84 6a 02 00 00 8b 48 28 8b 93 88 00 00 00 <8b> 09 8d 54 12 03 c1 e2 0f c1 e1 08 09 ca 8b 8b 8c 00 00 00 80 ca
          EAX: f05ec200 EBX: ed404200 ECX: 00000000 EDX: 00000000
          ESI: 00000000 EDI: f065a000 EBP: f30c9f40 ESP: cddcb3bc
          DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
          CR0: 80050033 CR2: 00000000 CR3: 0ddc7000 CR4: 000006d0
          Kernel panic - not syncing: Fatal exception in interrupt
          Kernel Offset: 0xcc00000 from 0xc0400000 (relocation range: 0xc0000000-0xf6ffdfff)
          ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      No-one noticed because this Oops is apparently only triggered by setting
      up an ISDN data connection on a live ISDN line on a gigaset base (i.e.,
      the PBX that the gigaset driver supports). Very few people do that
      running present day kernels.
      
      Anyhow, a little code reorganization makes this problem go away, while
      avoiding the subtle ordering that was used in the past. So let's do
      that.
      
      Fixes: 78c696c1 ("isdn: gigaset: use usb_fill_int_urb()")
      Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      tuntap: synchronize through tfiles array instead of tun->numqueues · 17d8a9eb
      Jason Wang authored
      [ Upstream commit 9871a9e47a2646fe30ae7fd2e67668a8d30912f6 ]
      
      When a queue (tfile) is detached through __tun_detach(), we move the
      last enabled tfile to the position where the detached one sat but don't
      NULL out the last position. We expect to synchronize the datapath through
      tun->numqueues. Unfortunately, this won't work since we lack a
      sufficient mechanism to order or synchronize access to
      tun->numqueues.
      
      To fix this, NULL out the last position during detaching and check
      RCU protected tfile against NULL instead of checking tun->numqueues in
      datapath.
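
The detach scheme can be sketched with a plain array. The `sketch_*` types are hypothetical, and the RCU publication and read-side primitives that the real datapath relies on are deliberately left out of this single-threaded model.

```c
#include <stddef.h>

#define SKETCH_MAX_QUEUES 8

struct sketch_queue { int id; };

/* Hypothetical model of the device: a slot array plus a queue counter. */
struct sketch_dev {
    struct sketch_queue *queues[SKETCH_MAX_QUEUES];
    unsigned int numqueues;
};

/*
 * Detaches the queue in slot `index`: the last enabled queue moves into
 * the gap and the vacated last slot is cleared, so no stale pointer
 * remains visible there.
 */
static void sketch_detach(struct sketch_dev *dev, unsigned int index)
{
    unsigned int last;

    if (index >= dev->numqueues)
        return;

    last = dev->numqueues - 1;
    dev->queues[index] = dev->queues[last];
    dev->queues[last] = NULL;
    dev->numqueues = last;
}

/* Datapath lookup that trusts only the slot contents, not numqueues. */
static struct sketch_queue *sketch_lookup(const struct sketch_dev *dev,
                                          unsigned int index)
{
    if (index >= SKETCH_MAX_QUEUES)
        return NULL;
    return dev->queues[index];      /* NULL means "no queue here" */
}
```

Because a reader decides from the pointer itself, it no longer matters whether it observed the old or the new counter value.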
      
      Cc: YueHaibing <yuehaibing@huawei.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: weiyongjun (A) <weiyongjun1@huawei.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: c8d68e6b ("tuntap: multiqueue support")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      tuntap: fix dividing by zero in ebpf queue selection · 9c79732f
      Jason Wang authored
      [ Upstream commit a35d310f03a692bf4798eb309a1950a06a150620 ]
      
      We need to check whether tun->numqueues is zero (e.g. for a persist device)
      before trying to use it for modular arithmetic.
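
A guard of this kind might look like the sketch below. The function name and the choice to fall back to queue 0 are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Maps a selector value to a queue index. With no queues attached the
 * modulo would divide by zero, so the sketch falls back to index 0.
 */
static uint16_t sketch_select_queue(uint32_t hash, uint32_t numqueues)
{
    if (numqueues == 0)
        return 0;

    return (uint16_t)(hash % numqueues);
}
```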
      
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: 96f84061 ("tun: add eBPF based queue selection method")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      vrf: sit mtu should not be updated when vrf netdev is the link · 737713e6
      Stephen Suryaputra authored
      [ Upstream commit ff6ab32bd4e073976e4d8797b4d514a172cfe6cb ]
      
      A VRF netdev's mtu isn't typically set, so it has an mtu of 65536. When the
      link of a tunnel is set, the tunnel mtu is changed from 1480 to the link
      mtu minus the tunnel header. When a VRF netdev is the link, the
      tunnel mtu becomes 65516. So, fix it by not setting the tunnel mtu in
      this case.
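
The arithmetic works out as follows. This is a sketch: the 20-byte header length is inferred from 65536 - 65516 in the text above, and the function and its `link_is_vrf` flag are hypothetical.

```c
#include <stdbool.h>

/* Tunnel header size implied by the numbers in the commit message. */
#define SKETCH_TUNNEL_HDR_LEN 20

/*
 * Returns the MTU the tunnel should use after its link is set.
 * When the link is a VRF device the current MTU is left alone.
 */
static int sketch_tunnel_mtu(int current_mtu, int link_mtu, bool link_is_vrf)
{
    if (link_is_vrf)
        return current_mtu;

    return link_mtu - SKETCH_TUNNEL_HDR_LEN;
}
```

For an ordinary 1500-byte link this yields the usual 1480; only the VRF case is skipped.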
      
      Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • H
      vlan: disable SIOCSHWTSTAMP in container · e3840607
      Hangbin Liu authored
      [ Upstream commit 873017af778439f2f8e3d87f28ddb1fcaf244a76 ]
      
      With NET_ADMIN enabled in a container, a normal user could be mapped to
      root and would be able to change the real device's rx filter via ioctl on
      the vlan, which would affect other ptp processes on the host. Fix it by
      disabling SIOCSHWTSTAMP in containers.
      
      Fixes: a6111d3c ("vlan: Pass SIOC[SG]HWTSTAMP ioctls to real device")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • P
      selinux: do not report error on connect(AF_UNSPEC) · dfdfad3d
      Paolo Abeni authored
      [ Upstream commit c7e0d6cca86581092cbbf2cd868b3601495554cf ]
      
      Calling connect(AF_UNSPEC) on an already connected TCP socket is an
      established way to disconnect() such a socket. After commit 68741a8a
      ("selinux: Fix ltp test connect-syscall failure") it no longer works
      and, in the above scenario, connect() fails with EAFNOSUPPORT.
      
      Fix the above by falling back to the generic/old code when the address family
      is not AF_INET{4,6}, but leave the SCTP code path untouched, as it has
      specific constraints.
      
      Fixes: 68741a8a ("selinux: Fix ltp test connect-syscall failure")
      Reported-by: Tom Deseyn <tdeseyn@redhat.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Y
      packet: Fix error path in packet_init · 9f51d6f7
      YueHaibing authored
      [ Upstream commit 36096f2f4fa05f7678bc87397665491700bae757 ]
      
      kernel BUG at lib/list_debug.c:47!
      invalid opcode: 0000 [#1]
      CPU: 0 PID: 12914 Comm: rmmod Tainted: G        W         5.1.0+ #47
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:__list_del_entry_valid+0x53/0x90
      Code: 48 8b 32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 00 5d c3 48 89 fe 48 89 c2 48 c7 c7 18 75 fe 82 e8 cb 34 78 ff <0f> 0b 48 89 fe 48 c7 c7 50 75 fe 82 e8 ba 34 78 ff 0f 0b 48 89 f2
      RSP: 0018:ffffc90001c2fe40 EFLAGS: 00010286
      RAX: 000000000000004e RBX: ffffffffa0184000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff888237a17788 RDI: 00000000ffffffff
      RBP: ffffc90001c2fe40 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffc90001c2fe10 R11: 0000000000000000 R12: 0000000000000000
      R13: ffffc90001c2fe50 R14: ffffffffa0184000 R15: 0000000000000000
      FS:  00007f3d83634540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000555c350ea818 CR3: 0000000231677000 CR4: 00000000000006f0
      Call Trace:
       unregister_pernet_operations+0x34/0x120
       unregister_pernet_subsys+0x1c/0x30
       packet_exit+0x1c/0x369 [af_packet]
       __x64_sys_delete_module+0x156/0x260
       ? lockdep_hardirqs_on+0x133/0x1b0
       ? do_syscall_64+0x12/0x1f0
       do_syscall_64+0x6e/0x1f0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      When af_packet is loaded with modprobe and register_pernet_subsys()
      fails and does a cleanup, ops->list is set to LIST_POISON1, but the
      module init is still considered to have succeeded. A later rmmod then
      triggers BUG() in __list_del_entry_valid(), which is called from
      unregister_pernet_subsys(). This patch fixes the error handling path in
      packet_init() to avoid possible issues if an error occurs.
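
The usual shape of such an error path is a chain of labels that undo the completed steps in reverse order. The sketch below uses hypothetical step names and a `fail_at` test knob; it is not the af_packet code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical registration state: one flag per completed init step. */
struct sketch_state {
    bool proto, sock, pernet, notifier;
};

/* Performs one step, or fails with -ENOMEM when asked to. */
static int sketch_step(bool *flag, bool fail)
{
    if (fail)
        return -ENOMEM;
    *flag = true;
    return 0;
}

/*
 * Runs four init steps. fail_at (1..4) forces that step to fail and 0
 * lets everything succeed. On failure every earlier step is undone and
 * the error is propagated, so init is never reported as successful
 * with half-registered state.
 */
static int sketch_init(struct sketch_state *s, int fail_at)
{
    int rc;

    rc = sketch_step(&s->proto, fail_at == 1);
    if (rc)
        goto out;
    rc = sketch_step(&s->sock, fail_at == 2);
    if (rc)
        goto out_proto;
    rc = sketch_step(&s->pernet, fail_at == 3);
    if (rc)
        goto out_sock;
    rc = sketch_step(&s->notifier, fail_at == 4);
    if (rc)
        goto out_pernet;
    return 0;

out_pernet:
    s->pernet = false;
out_sock:
    s->sock = false;
out_proto:
    s->proto = false;
out:
    return rc;
}
```

Returning the error makes the module load fail, so the exit handler never runs against state that was already torn down.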
      
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • C
      net: ucc_geth - fix Oops when changing number of buffers in the ring · 2e95eb9c
      Christophe Leroy authored
      [ Upstream commit ee0df19305d9fabd9479b785918966f6e25b733b ]
      
      When changing the number of buffers in the RX ring while the interface
      is running, the following Oops is encountered due to the new number
      of buffers being taken into account immediately while their allocation
      is done when opening the device only.
      
      [   69.882706] Unable to handle kernel paging request for data at address 0xf0000100
      [   69.890172] Faulting instruction address: 0xc033e164
      [   69.895122] Oops: Kernel access of bad area, sig: 11 [#1]
      [   69.900494] BE PREEMPT CMPCPRO
      [   69.907120] CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.115-00006-g179ade8ce3-dirty #269
      [   69.915956] task: c0684310 task.stack: c06da000
      [   69.920470] NIP:  c033e164 LR: c02e44d0 CTR: c02e41fc
      [   69.925504] REGS: dfff1e20 TRAP: 0300   Not tainted  (4.14.115-00006-g179ade8ce3-dirty)
      [   69.934161] MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 22004428  XER: 20000000
      [   69.940869] DAR: f0000100 DSISR: 20000000
      [   69.940869] GPR00: c0352d70 dfff1ed0 c0684310 f00000a4 00000040 dfff1f68 00000000 0000001f
      [   69.940869] GPR08: df53f410 1cc00040 00000021 c0781640 42004424 100c82b6 f00000a4 df53f5b0
      [   69.940869] GPR16: df53f6c0 c05daf84 00000040 00000000 00000040 c0782be4 00000000 00000001
      [   69.940869] GPR24: 00000000 df53f400 000001b0 df53f410 df53f000 0000003f df708220 1cc00044
      [   69.978348] NIP [c033e164] skb_put+0x0/0x5c
      [   69.982528] LR [c02e44d0] ucc_geth_poll+0x2d4/0x3f8
      [   69.987384] Call Trace:
      [   69.989830] [dfff1ed0] [c02e4554] ucc_geth_poll+0x358/0x3f8 (unreliable)
      [   69.996522] [dfff1f20] [c0352d70] net_rx_action+0x248/0x30c
      [   70.002099] [dfff1f80] [c04e93e4] __do_softirq+0xfc/0x310
      [   70.007492] [dfff1fe0] [c0021124] irq_exit+0xd0/0xd4
      [   70.012458] [dfff1ff0] [c000e7e0] call_do_irq+0x24/0x3c
      [   70.017683] [c06dbe80] [c0006bac] do_IRQ+0x64/0xc4
      [   70.022474] [c06dbea0] [c001097c] ret_from_except+0x0/0x14
      [   70.027964] --- interrupt: 501 at rcu_idle_exit+0x84/0x90
      [   70.027964]     LR = rcu_idle_exit+0x74/0x90
      [   70.037585] [c06dbf60] [20000000] 0x20000000 (unreliable)
      [   70.042984] [c06dbf80] [c004bb0c] do_idle+0xb4/0x11c
      [   70.047945] [c06dbfa0] [c004bd14] cpu_startup_entry+0x18/0x1c
      [   70.053682] [c06dbfb0] [c05fb034] start_kernel+0x370/0x384
      [   70.059153] [c06dbff0] [00003438] 0x3438
      [   70.063062] Instruction dump:
      [   70.066023] 38a00000 38800000 90010014 4bfff015 80010014 7c0803a6 3123ffff 7c691910
      [   70.073767] 38210010 4e800020 38600000 4e800020 <80e3005c> 80c30098 3107ffff 7d083910
      [   70.081690] ---[ end trace be7ccd9c1e1a9f12 ]---
      
      This patch forbids the modification of the number of buffers in the
      ring while the interface is running.
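
Refusing the change while the interface is up can be sketched as below. `struct sketch_nic`, its fields, and the choice of -EBUSY as the error code are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device model: running flag plus configured RX ring length. */
struct sketch_nic {
    bool running;
    unsigned int rx_ring_len;
};

/*
 * Updates the RX ring length only while the interface is down, because
 * the buffers themselves are allocated when the device is opened.
 */
static int sketch_set_rx_ring(struct sketch_nic *nic, unsigned int len)
{
    if (nic->running)
        return -EBUSY;

    nic->rx_ring_len = len;
    return 0;
}
```

The new length then takes effect on the next open, when the matching number of buffers is allocated.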
      
      Fixes: ac421852 ("ucc_geth: add ethtool support")
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • T
      net: seeq: fix crash caused by not set dev.parent · 210057b7
      Thomas Bogendoerfer authored
      [ Upstream commit 5afcd14cfc7fed1bcc8abcee2cef82732772bfc2 ]
      
      The old MIPS implementation of dma_cache_sync() didn't use the dev argument,
      but commit c9eb6172 ("dma-mapping: turn dma_cache_sync into a
      dma_map_ops method") changed that, so we now need to set dev.parent.
      
      Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • H
      net: macb: Change interrupt and napi enable order in open · dfd91928
      Harini Katakam authored
      [ Upstream commit 0504453139ef5a593c9587e1e851febee859c7d8 ]
      
      Current order in open:
      -> Enable interrupts (macb_init_hw)
      -> Enable NAPI
      -> Start PHY
      
      Sequence of RX handling:
      -> RX interrupt occurs
      -> Interrupt is cleared and interrupt bits disabled in handler
      -> NAPI is scheduled
      -> In NAPI, RX budget is processed and RX interrupts are re-enabled
      
      With the above, on QEMU or fixed link setups (where PHY state doesn't
      matter), there's a chance a macb RX interrupt occurs before NAPI is
      enabled. This will result in NAPI being scheduled before it is enabled.
      Fix this in macb open by changing the order.
      
      Fixes: ae1f2a56 ("net: macb: Added support for many RX queues")
      Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • C
      net: ethernet: stmmac: dwmac-sun8i: enable support of unicast filtering · 68df8383
      Corentin Labbe authored
      [ Upstream commit d4c26eb6e721683a0f93e346ce55bc8dc3cbb175 ]
      
      When adding more MAC addresses to a dwmac-sun8i interface, the device goes
      directly into promiscuous mode.
      This is due to the missing IFF_UNICAST_FLT flag.
      
      Since the hardware supports unicast filtering, let's add IFF_UNICAST_FLT.
      
      Fixes: 9f93ac8d ("net-next: stmmac: Add dwmac-sun8i")
      Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Y
      net: dsa: Fix error cleanup path in dsa_init_module · 9284895b
      YueHaibing authored
      [ Upstream commit 68be930249d051fd54d3d99156b3dcadcb2a1f9b ]
      
      BUG: unable to handle kernel paging request at ffffffffa01c5430
      PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bc5067 PTE 0
      Oops: 0000 [#1]
      CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ #33
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:raw_notifier_chain_register+0x16/0x40
      Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08
      RSP: 0018:ffffc90001c33c08 EFLAGS: 00010282
      RAX: ffffffffa01c5420 RBX: ffffffffa01db420 RCX: 4fcef45928070a8b
      RDX: 0000000000000000 RSI: ffffffffa01db420 RDI: ffffffffa01b0068
      RBP: ffffc90001c33c08 R08: 000000003e0a33d0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000094443661 R12: ffff88822c320700
      R13: ffff88823109be80 R14: 0000000000000000 R15: ffffc90001c33e78
      FS:  00007fab8bd08540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffa01c5430 CR3: 00000002297ea000 CR4: 00000000000006f0
      Call Trace:
       register_netdevice_notifier+0x43/0x250
       ? 0xffffffffa01e0000
       dsa_slave_register_notifier+0x13/0x70 [dsa_core
       ? 0xffffffffa01e0000
       dsa_init_module+0x2e/0x1000 [dsa_core
       do_one_initcall+0x6c/0x3cc
       ? do_init_module+0x22/0x1f1
       ? rcu_read_lock_sched_held+0x97/0xb0
       ? kmem_cache_alloc_trace+0x325/0x3b0
       do_init_module+0x5b/0x1f1
       load_module+0x1db1/0x2690
       ? m_show+0x1d0/0x1d0
       __do_sys_finit_module+0xc5/0xd0
       __x64_sys_finit_module+0x15/0x20
       do_syscall_64+0x6b/0x1d0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Clean up allocated resources if there are errors,
      otherwise it will trigger a memory leak.
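
The cleanup described above follows the usual goto-based unwind pattern for init functions. The following is a minimal user-space sketch of that pattern, not the dsa_core code: every helper name (acquire_workqueue, register_notifier, register_legacy, model_init) and the choice of three steps are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for resources acquired during module init. */
static bool workqueue_live;
static bool notifier_live;

static int acquire_workqueue(bool fail)
{
    if (fail)
        return -ENOMEM;
    workqueue_live = true;
    return 0;
}

static void release_workqueue(void) { workqueue_live = false; }

static int register_notifier(bool fail)
{
    if (fail)
        return -EINVAL;
    notifier_live = true;
    return 0;
}

static void unregister_notifier(void) { notifier_live = false; }

/* The third step holds no state of its own in this model. */
static int register_legacy(bool fail) { return fail ? -EINVAL : 0; }

/* Each failure jumps to a label that undoes only the steps already done. */
int model_init(bool fail_wq, bool fail_notifier, bool fail_legacy)
{
    int rc;

    rc = acquire_workqueue(fail_wq);
    if (rc)
        return rc;

    rc = register_notifier(fail_notifier);
    if (rc)
        goto err_workqueue;

    rc = register_legacy(fail_legacy);
    if (rc)
        goto err_notifier;

    return 0;

err_notifier:
    unregister_notifier();
err_workqueue:
    release_workqueue();
    return rc;
}

/* Number of resources still held; 0 after a failed init means no leak. */
int live_resources(void)
{
    return (int)workqueue_live + (int)notifier_live;
}
```

Calling model_init(false, false, true) makes the last step fail; the two earlier steps are then undone in reverse order, so live_resources() reports 0.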
      
      Fixes: c9eb3e0f ("net: dsa: Add support for learning FDB through notification")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • D
      ipv4: Fix raw socket lookup for local traffic · da2e770f
      David Ahern committed
      [ Upstream commit 19e4e768064a87b073a4b4c138b55db70e0cfb9f ]
      
      inet_iif should be used for the raw socket lookup. inet_iif considers
      rt_iif which handles the case of local traffic.
      
      As it stands, ping to a local address with the '-I <dev>' option fails
      ever since ping was changed to use SO_BINDTODEVICE instead of
      cmsg + IP_PKTINFO.
      
      IPv6 works fine.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • H
      fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied · 947fec63
      Hangbin Liu committed
      [ Upstream commit e9919a24d3022f72bcadc407e73a6ef17093a849 ]
      
      With commit 153380ec ("fib_rules: Added NLM_F_EXCL support to
      fib_nl_newrule") we are now able to check if a rule already exists. But this
      only works with iproute2. Other tools, such as libnl and NetworkManager,
      can still add duplicate rules with only the NLM_F_CREATE flag, like
      
      [localhost ~ ]# ip rule
      0:      from all lookup local
      32766:  from all lookup main
      32767:  from all lookup default
      100000: from 192.168.7.5 lookup 5
      100000: from 192.168.7.5 lookup 5
      
      As it doesn't make sense to create two duplicate rules, let's just return
      0 if the rule exists.
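
The intended behaviour can be summarised as a small decision function. This is a user-space sketch under stated assumptions, not the fib_rules code: model_newrule, MODEL_F_EXCL and MODEL_F_CREATE are hypothetical names that stand in for the kernel handler and the NLM_F_EXCL / NLM_F_CREATE netlink flags.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NLM_F_EXCL and NLM_F_CREATE flags. */
#define MODEL_F_EXCL   0x200u
#define MODEL_F_CREATE 0x400u

/*
 * Decide what a "new rule" request should do.
 * Returns 0 on success or -EEXIST; *inserted tells the caller whether
 * a new rule would actually be added.
 */
int model_newrule(bool rule_exists, unsigned int flags, bool *inserted)
{
    *inserted = false;

    if (rule_exists) {
        /* Exclusive create of an existing rule is an error. */
        if (flags & MODEL_F_EXCL)
            return -EEXIST;
        /* Otherwise report success without adding a duplicate. */
        return 0;
    }

    *inserted = true;
    return 0;
}
```

With only MODEL_F_CREATE set and an identical rule already present, the function returns 0 and leaves *inserted false, which is the "return 0 directly" case from the subject line.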
      
      Fixes: 153380ec ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule")
      Reported-by: Thomas Haller <thaller@redhat.com>
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • L
      dpaa_eth: fix SG frame cleanup · c7b5e55b
      Laurentiu Tudor committed
      [ Upstream commit 17170e6570c082717c142733d9a638bcd20551f8 ]
      
      Fix issue with the entry indexing in the sg frame cleanup code being
      off-by-1. This problem showed up when doing some basic iperf tests and
      manifested in traffic coming to a halt.
      
      Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
      Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • T
      bridge: Fix error path for kobject_init_and_add() · a79feef3
      Tobin C. Harding committed
      [ Upstream commit bdfad5aec1392b93495b77b864d58d7f101dc1c1 ]
      
      Currently error return from kobject_init_and_add() is not followed by a
      call to kobject_put().  This means there is a memory leak.  We currently
      set p to NULL so that kfree() may be called on it as a noop, the code is
      arguably clearer if we move the kfree() up closer to where it is
      called (instead of after goto jump).
      
      Remove a goto label 'err1' and jump to call to kobject_put() in error
      return from kobject_init_and_add() fixing the memory leak.  Re-name goto
      label 'put_back' to 'err1' now that we don't use err1, following current
      nomenclature (err1, err2 ...).  Move call to kfree out of the error
      code at bottom of function up to closer to where memory was allocated.
      Add comment to clarify call to kfree().
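
The rule behind this fix is that once an object's reference count has been initialised, a failed "add" step has to be undone by dropping that reference, so that the release path runs. The sketch below models this in user space with a hypothetical model_obj type; none of these names come from the bridge code or the kobject API, and the release function freeing the object is an assumption of the model.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object, loosely modelled on a kobject. */
struct model_obj {
    int refcount;
};

int released_count; /* how many objects the release path has freed */

static void model_release(struct model_obj *obj)
{
    free(obj);
    released_count++;
}

/* Initialises the count to 1, then optionally fails the "add" step. */
static int model_init_and_add(struct model_obj *obj, int fail_add)
{
    obj->refcount = 1;
    return fail_add ? -1 : 0;
}

static void model_put(struct model_obj *obj)
{
    if (--obj->refcount == 0)
        model_release(obj);
}

/* Returns the live object, or NULL after cleaning up a failed add. */
struct model_obj *model_create(int fail_add)
{
    struct model_obj *obj = malloc(sizeof(*obj));

    if (!obj)
        return NULL;

    if (model_init_and_add(obj, fail_add)) {
        model_put(obj); /* drop the initial reference; release frees obj */
        return NULL;
    }
    return obj;
}

void model_destroy(struct model_obj *obj)
{
    model_put(obj);
}
```

Skipping model_put() on the failure branch would leave the object allocated with a count of 1 and no owner, which is the leak the commit message describes.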
      
      Signed-off-by: Tobin C. Harding <tobin@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      bonding: fix arp_validate toggling in active-backup mode · 9c2cda31
      Jarod Wilson committed
      [ Upstream commit a9b8a2b39ce65df45687cf9ef648885c2a99fe75 ]
      
      There's currently a problem with toggling arp_validate on and off with an
      active-backup bond. At the moment, you can start up a bond, like so:
      
      modprobe bonding mode=1 arp_interval=100 arp_validate=0 arp_ip_targets=192.168.1.1
      ip link set bond0 down
      echo "ens4f0" > /sys/class/net/bond0/bonding/slaves
      echo "ens4f1" > /sys/class/net/bond0/bonding/slaves
      ip link set bond0 up
      ip addr add 192.168.1.2/24 dev bond0
      
      Pings to 192.168.1.1 work just fine. Now turn on arp_validate:
      
      echo 1 > /sys/class/net/bond0/bonding/arp_validate
      
      Pings to 192.168.1.1 continue to work just fine. Now when you go to turn
      arp_validate off again, the link falls flat on its face:
      
      echo 0 > /sys/class/net/bond0/bonding/arp_validate
      dmesg
      ...
      [133191.911987] bond0: Setting arp_validate to none (0)
      [133194.257793] bond0: bond_should_notify_peers: slave ens4f0
      [133194.258031] bond0: link status definitely down for interface ens4f0, disabling it
      [133194.259000] bond0: making interface ens4f1 the new active one
      [133197.330130] bond0: link status definitely down for interface ens4f1, disabling it
      [133197.331191] bond0: now running without any active interface!
      
      The problem lies in bond_options.c, where passing in arp_validate=0
      results in bond->recv_probe getting set to NULL. This flies directly in
      the face of commit 3fe68df9, which says we need to set recv_probe =
      bond_arp_recv, even if we're not using arp_validate. Said commit fixed
      this in bond_option_arp_interval_set, but missed that we can get to that
      same state in bond_option_arp_validate_set as well.
      
      One solution would be to universally set recv_probe = bond_arp_recv here
      as well, but I don't think bond_option_arp_validate_set has any business
      touching recv_probe at all, and that should be left to the arp_interval
      code, so we can just make things much tidier here.
      
      Fixes: 3fe68df9 ("bonding: always set recv_probe to bond_arp_rcv in arp monitor")
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • B
      powerpc/64s: Include cpu header · 0dc9ad4e
      Breno Leitao committed
      commit 42e2acde1237878462b028f5a27d9cc5bea7502c upstream.
      
      Current powerpc security.c file is defining functions, as
      cpu_show_meltdown(), cpu_show_spectre_v{1,2} and others, that are being
      declared at linux/cpu.h header without including the header file that
      contains these declarations.
      
      This is being reported by sparse, which thinks that these functions are
      static, due to the lack of declaration:
      
      	arch/powerpc/kernel/security.c:105:9: warning: symbol 'cpu_show_meltdown' was not declared. Should it be static?
      	arch/powerpc/kernel/security.c:139:9: warning: symbol 'cpu_show_spectre_v1' was not declared. Should it be static?
      	arch/powerpc/kernel/security.c:161:9: warning: symbol 'cpu_show_spectre_v2' was not declared. Should it be static?
      	arch/powerpc/kernel/security.c:209:6: warning: symbol 'stf_barrier' was not declared. Should it be static?
      	arch/powerpc/kernel/security.c:289:9: warning: symbol 'cpu_show_spec_store_bypass' was not declared. Should it be static?
      
      This patch simply includes the proper header (linux/cpu.h) to match
      function definition and declaration.
      
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Joel Stanley <joel@jms.id.au>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Major Hayden <major@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • R
      um: Don't hardcode path as it is architecture dependent · db1b4aa6
      Ritesh Raj Sarraf committed
      commit 9ca19a3a3e2482916c475b90f3d7fa2a03d8e5ed upstream.
      
      The current code fails to run on amd64 because of a hardcoded reference to
      i386.
      
      Signed-off-by: Ritesh Raj Sarraf <rrs@researchut.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • N
      Don't jump to compute_result state from check_result state · 85f34794
      Nigel Croxon committed
      commit 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef upstream.
      
      Changing state from check_state_check_result to
      check_state_compute_result not only is unsafe but also doesn't
      appear to serve a valid purpose.  A raid6 check should only be
      pushing out extra writes if doing repair and a mis-match occurs.
      The stripe dev management will already try and do repair writes
      for failing sectors.
      
      This patch makes the raid6 check_state_check_result handling
      work more like raid5's.  If somehow too many failures for a
      check, just quit the check operation for the stripe.  When any
      checks pass, don't try and use check_state_compute_result for
      a purpose it isn't needed for and is unsafe for.  Just mark the
      stripe as in sync for passing its parity checks and let the
      stripe dev read/write code and the bad blocks list do their
      job handling I/O errors.
      
      Repro steps from Xiao:
      
      These are the steps to reproduce this problem:
      1. redefined OPT_MEDIUM_ERR_ADDR to 12000 in scsi_debug.c
      2. insmod scsi_debug.ko dev_size_mb=11000  max_luns=1 num_tgts=1
      3. mdadm --create /dev/md127 --level=6 --raid-devices=5 /dev/sde1 /dev/sde2 /dev/sde3 /dev/sde5 /dev/sde6
      sde is the disk created by scsi_debug
      4. echo "2" >/sys/module/scsi_debug/parameters/opts
      5. raid-check
      
      It panics:
      [ 4854.730899] md: data-check of RAID array md127
      [ 4854.857455] sd 5:0:0:0: [sdr] tag#80 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
      [ 4854.859246] sd 5:0:0:0: [sdr] tag#80 Sense Key : Medium Error [current]
      [ 4854.860694] sd 5:0:0:0: [sdr] tag#80 Add. Sense: Unrecovered read error
      [ 4854.862207] sd 5:0:0:0: [sdr] tag#80 CDB: Read(10) 28 00 00 00 2d 88 00 04 00 00
      [ 4854.864196] print_req_error: critical medium error, dev sdr, sector 11656 flags 0
      [ 4854.867409] sd 5:0:0:0: [sdr] tag#100 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
      [ 4854.869469] sd 5:0:0:0: [sdr] tag#100 Sense Key : Medium Error [current]
      [ 4854.871206] sd 5:0:0:0: [sdr] tag#100 Add. Sense: Unrecovered read error
      [ 4854.872858] sd 5:0:0:0: [sdr] tag#100 CDB: Read(10) 28 00 00 00 2e e0 00 00 08 00
      [ 4854.874587] print_req_error: critical medium error, dev sdr, sector 12000 flags 4000
      [ 4854.876456] sd 5:0:0:0: [sdr] tag#101 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
      [ 4854.878552] sd 5:0:0:0: [sdr] tag#101 Sense Key : Medium Error [current]
      [ 4854.880278] sd 5:0:0:0: [sdr] tag#101 Add. Sense: Unrecovered read error
      [ 4854.881846] sd 5:0:0:0: [sdr] tag#101 CDB: Read(10) 28 00 00 00 2e e8 00 00 08 00
      [ 4854.883691] print_req_error: critical medium error, dev sdr, sector 12008 flags 4000
      [ 4854.893927] sd 5:0:0:0: [sdr] tag#166 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
      [ 4854.896002] sd 5:0:0:0: [sdr] tag#166 Sense Key : Medium Error [current]
      [ 4854.897561] sd 5:0:0:0: [sdr] tag#166 Add. Sense: Unrecovered read error
      [ 4854.899110] sd 5:0:0:0: [sdr] tag#166 CDB: Read(10) 28 00 00 00 2e e0 00 00 10 00
      [ 4854.900989] print_req_error: critical medium error, dev sdr, sector 12000 flags 0
      [ 4854.902757] md/raid:md127: read error NOT corrected!! (sector 9952 on sdr1).
      [ 4854.904375] md/raid:md127: read error NOT corrected!! (sector 9960 on sdr1).
      [ 4854.906201] ------------[ cut here ]------------
      [ 4854.907341] kernel BUG at drivers/md/raid5.c:4190!
      
      raid5.c:4190 above is this BUG_ON:
      
          handle_parity_checks6()
              ...
              BUG_ON(s->uptodate < disks - 1); /* We don't need Q to recover */
      
      Cc: <stable@vger.kernel.org> # v3.16+
      OriginalAuthor: David Jeffery <djeffery@redhat.com>
      Cc: Xiao Ni <xni@redhat.com>
      Tested-by: David Jeffery <djeffery@redhat.com>
      Signed-off-by: David Jeffy <djeffery@redhat.com>
      Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • G
      rtlwifi: rtl8723ae: Fix missing break in switch statement · ace28a8e
      Gustavo A. R. Silva committed
      commit 84242b82d81c54e009a2aaa74d3d9eff70babf56 upstream.
      
      Add missing break statement in order to prevent the code from falling
      through to case 0x1025, and erroneously setting rtlhal->oem_id to
      RT_CID_819X_ACER when rtlefuse->eeprom_svid is equal to 0x10EC and
      none of the cases in switch (rtlefuse->eeprom_smid) match.
      
      This bug was found thanks to the ongoing efforts to enable
      -Wimplicit-fallthrough.
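
The effect of the added break can be shown with a reduced nested switch. This is an illustrative sketch only: the enum values, the function name pick_oem_id and the inner subsystem ID 0x0001 are hypothetical; only the outer case values 0x10EC and 0x1025 come from the description above.

```c
/* Hypothetical OEM identifiers for the model. */
enum oem_id { OEM_DEFAULT, OEM_VENDOR_X, OEM_ACER };

enum oem_id pick_oem_id(unsigned int svid, unsigned int smid)
{
    enum oem_id id = OEM_DEFAULT;

    switch (svid) {
    case 0x10EC:
        switch (smid) {
        case 0x0001: /* hypothetical subsystem ID */
            id = OEM_VENDOR_X;
            break;
        default:
            break; /* no match: keep OEM_DEFAULT */
        }
        /*
         * This is the kind of break the fix adds. Without it, control
         * would continue into case 0x1025 and overwrite id.
         */
        break;
    case 0x1025:
        id = OEM_ACER;
        break;
    default:
        break;
    }

    return id;
}
```

With the break in place, pick_oem_id(0x10EC, 0xFFFF) keeps OEM_DEFAULT instead of ending up as OEM_ACER.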
      
      Fixes: 238ad2dd ("rtlwifi: rtl8723ae: Clean up the hardware info routine")
      Cc: stable@vger.kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • P
      mwl8k: Fix rate_idx underflow · d756d1de
      Petr Štetiar committed
      commit 6b583201fa219b7b1b6aebd8966c8fd9357ef9f4 upstream.
      
      It was reported on OpenWrt bug tracking system[1], that several users
      are affected by the endless reboot of their routers if they configure
      5GHz interface with channel 44 or 48.
      
      The reboot loop is caused by the following excessive number of WARN_ON
      messages:
      
       WARNING: CPU: 0 PID: 0 at backports-4.19.23-1/net/mac80211/rx.c:4516
                                   ieee80211_rx_napi+0x1fc/0xa54 [mac80211]
      
      as the messages are being correctly emitted by the following guard:
      
       case RX_ENC_LEGACY:
            if (WARN_ON(status->rate_idx >= sband->n_bitrates))
      
      as the rate_idx is in this case erroneously set to 251 (0xfb). This fix
      simply converts previously used magic number to proper constant and
      guards against subtraction which is leading to the currently observed
      underflow.
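
The value 251 (0xfb) is what an 8-bit unsigned index holds after subtracting 5 from 0. The sketch below reproduces that arithmetic in user space. The offset of 5 is an assumption inferred from 256 - 5 = 251, and LEGACY_5G_RATE_OFFSET, adjust_unguarded and adjust_guarded are hypothetical names, not the driver's.

```c
#include <stdint.h>

/* Assumed offset between the two rate tables; not stated in the message. */
#define LEGACY_5G_RATE_OFFSET 5u

/* Unconditional subtraction: wraps around for indexes below the offset. */
uint8_t adjust_unguarded(uint8_t rate_idx)
{
    return (uint8_t)(rate_idx - LEGACY_5G_RATE_OFFSET);
}

/* Guarded subtraction: only adjust when the result stays non-negative. */
uint8_t adjust_guarded(uint8_t rate_idx)
{
    if (rate_idx >= LEGACY_5G_RATE_OFFSET)
        rate_idx = (uint8_t)(rate_idx - LEGACY_5G_RATE_OFFSET);
    return rate_idx;
}
```

adjust_unguarded(0) yields 251, the value reported above, while adjust_guarded(0) leaves the index at 0.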
      
      1. https://bugs.openwrt.org/index.php?do=details&task_id=2218
      
      Fixes: 85478344 ("mwl8k: properly set receive status rate index on 5 GHz receive")
      Cc: <stable@vger.kernel.org>
      Tested-by: Eubert Bao <bunnier@gmail.com>
      Reported-by: Eubert Bao <bunnier@gmail.com>
      Signed-off-by: Petr Štetiar <ynezz@true.cz>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • W
      cw1200: fix missing unlock on error in cw1200_hw_scan() · c300c98a
      Wei Yongjun committed
      commit 51c8d24101c79ffce3e79137e2cee5dfeb956dd7 upstream.
      
      Add the missing unlock before return from function cw1200_hw_scan()
      in the error handling case.
      
      Fixes: 4f68ef64cd7f ("cw1200: Fix concurrency use-after-free bugs in cw1200_hw_scan()")
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Acked-by: Jia-Ju Bai <baijiaju1990@gmail.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • M
      x86/kprobes: Avoid kretprobe recursion bug · 57526050
      Masami Hiramatsu committed
      [ Upstream commit b191fa96ea6dc00d331dcc28c1f7db5e075693a0 ]
      
      Avoid kretprobe recursion loop bug by setting a dummy
      kprobes to current_kprobe per-CPU variable.
      
      This bug has been introduced with the asm-coded trampoline
      code, since previously it used another kprobe for hooking
      the function return placeholder (which only has a nop) and
      trampoline handler was called from that kprobe.
      
      This revives the old lost kprobe again.
      
      With this fix, we don't see deadlock anymore.
      
      And you can see that all inner-called kretprobe are skipped.
      
        event_1                                  235               0
        event_2                                19375           19612
      
      The 1st column is recorded count and the 2nd is missed count.
      Above shows (event_1 rec) + (event_2 rec) ~= (event_2 missed)
      (some differences are here because the counter is racy)
      
      Reported-by: Andrea Righi <righi.andrea@gmail.com>
      Tested-by: Andrea Righi <righi.andrea@gmail.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: c9becf58 ("[PATCH] kretprobe: kretprobe-booster")
      Link: http://lkml.kernel.org/r/155094064889.6137.972160690963039.stgit@devbox
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
    • D
      nfc: nci: Potential off by one in ->pipes[] array · 322a5755
      Dan Carpenter committed
      [ Upstream commit 6491d698396fd5da4941980a35ca7c162a672016 ]
      
      This is similar to commit e285d5bf ("NFC: Fix the number of pipes")
      where we changed NFC_HCI_MAX_PIPES from 127 to 128.
      
      As the comment next to the define explains, the pipe identifier is 7
      bits long.  The highest possible pipe is 127, but the number of possible
      pipes is 128.  As the code is now, there is potential for an
      out of bounds array access:
      
          net/nfc/nci/hci.c:297 nci_hci_cmd_received() warn: array off by one?
          'ndev->hci_dev->pipes[pipe]' '0-127 == 127'
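
The sizing argument (a 7-bit identifier needs 128 slots, not 127) and the matching bounds check can be sketched as follows. This is a user-space model under stated assumptions: PIPE_ID_BITS, MAX_PIPES, struct pipe_info and the helper functions are hypothetical names, not the NFC subsystem's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define PIPE_ID_BITS 7
#define MAX_PIPES (1u << PIPE_ID_BITS) /* 128 slots for identifiers 0..127 */

/* Hypothetical per-pipe record. */
struct pipe_info {
    uint8_t gate;
};

static struct pipe_info pipes[MAX_PIPES];

/* Store a gate for a pipe, rejecting identifiers outside the table. */
bool pipe_set_gate(unsigned int pipe, uint8_t gate)
{
    if (pipe >= MAX_PIPES)
        return false;
    pipes[pipe].gate = gate;
    return true;
}

/* Number of slots in the table, for checking the sizing. */
unsigned int pipe_slots(void)
{
    return (unsigned int)(sizeof(pipes) / sizeof(pipes[0]));
}
```

If the table were declared with 127 entries instead, pipe 127 would pass a "pipe <= 127" style check yet index one element past the end.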
      
      Fixes: 11f54f22 ("NFC: nci: Add HCI over NCI protocol support")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
    • D
      NFC: nci: Add some bounds checking in nci_hci_cmd_received() · f5e60565
      Dan Carpenter committed
      [ Upstream commit d7ee81ad09f072eab1681877fc71ec05f9c1ae92 ]
      
      This is similar to commit 674d9de0 ("NFC: Fix possible memory
      corruption when handling SHDLC I-Frame commands").
      
      I'm not totally sure, but I think that commit description may have
      overstated the danger.  I was under the impression that this data came
      from the firmware?  If you can't trust your networking firmware, then
      you're already in trouble.
      
      Anyway, these days we add bounds checking wherever we can and we call
      it kernel hardening.  Better safe than sorry.
      
      Fixes: 11f54f22 ("NFC: nci: Add HCI over NCI protocol support")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
    • J
      net: strparser: partially revert "strparser: Call skb_unclone conditionally" · 21e9515b
      Jakub Kicinski committed
      [ Upstream commit 4a9c2e3746e6151fd5d077259d79ce9ca86d47d7 ]
      
      This reverts the first part of commit 4e485d06 ("strparser: Call
      skb_unclone conditionally").  To build a message with multiple
      fragments we need our own root of frag_list.  We can't simply
      use the frag_list of orig_skb, because it will lead to linking
      all orig_skbs together creating very long frag chains, and causing
      stack overflow on kfree_skb() (which is called recursively on
      the frag_lists).
      
      BUG: stack guard page was hit at 00000000d40fad41 (stack is 0000000029dde9f4..000000008cce03d5)
      kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP
      RIP: 0010:free_one_page+0x2b/0x490
      
      Call Trace:
        __free_pages_ok+0x143/0x2c0
        skb_release_data+0x8e/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
      
        [...]
      
        skb_release_data+0xad/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
        skb_release_data+0xad/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
        skb_release_data+0xad/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
        skb_release_data+0xad/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
        skb_release_data+0xad/0x140
        __kfree_skb+0xe/0x20
        tcp_disconnect+0xd6/0x4d0
        tcp_close+0xf4/0x430
        ? tcp_check_oom+0xf0/0xf0
        tls_sk_proto_close+0xe4/0x1e0 [tls]
        inet_release+0x36/0x60
        __sock_release+0x37/0xa0
        sock_close+0x11/0x20
        __fput+0xa2/0x1d0
        task_work_run+0x89/0xb0
        exit_to_usermode_loop+0x9a/0xa0
        do_syscall_64+0xc0/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Let's leave the second unclone conditional, as I'm not entirely
      sure what is its purpose :)
      
      Fixes: 4e485d06 ("strparser: Call skb_unclone conditionally")
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
    • J
      net/tls: fix the IV leaks · 85b9e869
      Jakub Kicinski committed
      [ Upstream commit 5a03bc73abed6ae196c15e9950afde19d48be12c ]
      
      Commit f66de3ee ("net/tls: Split conf to rx + tx") made
      freeing of IV and record sequence number conditional to SW
      path only, but commit e8f69799 ("net/tls: Add generic NIC
      offload infrastructure") also allocates that state for the
      device offload configuration.  Remember to free it.
      
      Fixes: e8f69799 ("net/tls: Add generic NIC offload infrastructure")
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
    • I
      mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw workqueue · e38c6748
      Ido Schimmel committed
      [ Upstream commit b442fed1b724af0de087912a5718ddde1b87acbb ]
      
      The workqueue is used to periodically update the networking stack about
      activity / statistics of various objects such as neighbours and TC
      actions.
      
      It should not be called as part of memory reclaim path, so remove the
      WQ_MEM_RECLAIM flag.
      
      Fixes: 3d5479e9 ("mlxsw: core: Remove deprecated create_workqueue")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
    • I
      mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw ordered workqueue · 835ae6cc
      Ido Schimmel committed
      [ Upstream commit 4af0699782e2cc7d0d89db9eb6f8844dd3df82dc ]
      
      The ordered workqueue is used to offload various objects such as routes
      and neighbours in the order they are notified.
      
      It should not be called as part of memory reclaim path, so remove the
      WQ_MEM_RECLAIM flag. This can also result in a warning [1], if a worker
      tries to flush a non-WQ_MEM_RECLAIM workqueue.
      
      [1]
      [97703.542861] workqueue: WQ_MEM_RECLAIM mlxsw_core_ordered:mlxsw_sp_router_fib6_event_work [mlxsw_spectrum] is flushing !WQ_MEM_RECLAIM events:rht_deferred_worker
      [97703.542884] WARNING: CPU: 1 PID: 32492 at kernel/workqueue.c:2605 check_flush_dependency+0xb5/0x130
      ...
      [97703.542988] Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
      [97703.543049] Workqueue: mlxsw_core_ordered mlxsw_sp_router_fib6_event_work [mlxsw_spectrum]
      [97703.543061] RIP: 0010:check_flush_dependency+0xb5/0x130
      ...
      [97703.543071] RSP: 0018:ffffb3f08137bc00 EFLAGS: 00010086
      [97703.543076] RAX: 0000000000000000 RBX: ffff96e07740ae00 RCX: 0000000000000000
      [97703.543080] RDX: 0000000000000094 RSI: ffffffff82dc1934 RDI: 0000000000000046
      [97703.543084] RBP: ffffb3f08137bc20 R08: ffffffff82dc18a0 R09: 00000000000225c0
      [97703.543087] R10: 0000000000000000 R11: 0000000000007eec R12: ffffffff816e4ee0
      [97703.543091] R13: ffff96e06f6a5c00 R14: ffff96e077ba7700 R15: ffffffff812ab0c0
      [97703.543097] FS: 0000000000000000(0000) GS:ffff96e077a80000(0000) knlGS:0000000000000000
      [97703.543101] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [97703.543104] CR2: 00007f8cd135b280 CR3: 00000001e860e003 CR4: 00000000003606e0
      [97703.543109] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [97703.543112] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [97703.543115] Call Trace:
      [97703.543129] __flush_work+0xbd/0x1e0
      [97703.543137] ? __cancel_work_timer+0x136/0x1b0
      [97703.543145] ? pwq_dec_nr_in_flight+0x49/0xa0
      [97703.543154] __cancel_work_timer+0x136/0x1b0
      [97703.543175] ? mlxsw_reg_trans_bulk_wait+0x145/0x400 [mlxsw_core]
      [97703.543184] cancel_work_sync+0x10/0x20
      [97703.543191] rhashtable_free_and_destroy+0x23/0x140
      [97703.543198] rhashtable_destroy+0xd/0x10
      [97703.543254] mlxsw_sp_fib_destroy+0xb1/0xf0 [mlxsw_spectrum]
      [97703.543310] mlxsw_sp_vr_put+0xa8/0xc0 [mlxsw_spectrum]
      [97703.543364] mlxsw_sp_fib_node_put+0xbf/0x140 [mlxsw_spectrum]
      [97703.543418] ? mlxsw_sp_fib6_entry_destroy+0xe8/0x110 [mlxsw_spectrum]
      [97703.543475] mlxsw_sp_router_fib6_event_work+0x6cd/0x7f0 [mlxsw_spectrum]
      [97703.543484] process_one_work+0x1fd/0x400
      [97703.543493] worker_thread+0x34/0x410
      [97703.543500] kthread+0x121/0x140
      [97703.543507] ? process_one_work+0x400/0x400
      [97703.543512] ? kthread_park+0x90/0x90
      [97703.543523] ret_from_fork+0x35/0x40
      
      Fixes: a3832b31 ("mlxsw: core: Create an ordered workqueue for FIB offload")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: Semion Lisyansky <semionl@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
    • I
      mlxsw: core: Do not use WQ_MEM_RECLAIM for EMAD workqueue · 880a328e
      Ido Schimmel committed
      [ Upstream commit a8c133b06183c529c51cd0d54eb57d6b7078370c ]
      
      The EMAD workqueue is used to handle retransmission of EMAD packets that
      contain configuration data for the device's firmware.
      
      Given the workers need to allocate these packets and that the code is
      not called as part of memory reclaim path, remove the WQ_MEM_RECLAIM
      flag.
      
      Fixes: d965465b ("mlxsw: core: Fix possible deadlock")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>