1. 22 5月, 2019 17 次提交
  2. 17 5月, 2019 23 次提交
    • G
      Linux 4.19.44 · dafc674b
      Greg Kroah-Hartman 提交于
      dafc674b
    • D
      PCI: hv: Add pci_destroy_slot() in pci_devices_present_work(), if necessary · 9fa23ea1
      Dexuan Cui 提交于
      commit 340d455699400f2c2c0f9b3f703ade3085cdb501 upstream.
      
      When we hot-remove a device, usually the host sends us a PCI_EJECT message,
      and a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
      
      When we execute the quick hot-add/hot-remove test, the host may not send
      us the PCI_EJECT message if the guest has not fully finished the
      initialization by sending the PCI_RESOURCES_ASSIGNED* message to the
      host, so it's potentially unsafe to only depend on the
      pci_destroy_slot() in hv_eject_device_work() because the code path
      
      create_root_hv_pci_bus()
       -> hv_pci_assign_slots()
      
      is not called in this case. Note: in this case, the host still sends the
      guest a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
      
      In the quick hot-add/hot-remove test, we can have such a race before
      the code path
      
      pci_devices_present_work()
       -> new_pcichild_device()
      
      adds the new device into the hbus->children list, we may have already
      received the PCI_EJECT message, and since the tasklet handler
      
      hv_pci_onchannelcallback()
      
      may fail to find the "hpdev" by calling
      
      get_pcichild_wslot(hbus, dev_message->wslot.slot)
      
      hv_pci_eject_device() is not called; Later, by continuing execution
      
      create_root_hv_pci_bus()
       -> hv_pci_assign_slots()
      
      creates the slot and the PCI_BUS_RELATIONS message with
      bus_rel->device_count == 0 removes the device from hbus->children, and
      we end up being unable to remove the slot in
      
      hv_pci_remove()
       -> hv_pci_remove_slots()
      
      Remove the slot in pci_devices_present_work() when the device
      is removed to address this race.
      
      pci_devices_present_work() and hv_eject_device_work() run in the
      singled-threaded hbus->wq, so there is not a double-remove issue for the
      slot.
      
      We cannot offload hv_pci_eject_device() from hv_pci_onchannelcallback()
      to the workqueue, because we need the hv_pci_onchannelcallback()
      synchronously call hv_pci_eject_device() to poll the channel
      ringbuffer to work around the "hangs in hv_compose_msi_msg()" issue
      fixed in commit de0aa7b2 ("PCI: hv: Fix 2 hang issues in
      hv_compose_msi_msg()")
      
      Fixes: a15f2c08 ("PCI: hv: support reporting serial number as slot information")
      Signed-off-by: NDexuan Cui <decui@microsoft.com>
      [lorenzo.pieralisi@arm.com: rewritten commit log]
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: NStephen Hemminger <stephen@networkplumber.org>
      Reviewed-by: NMichael Kelley <mikelley@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9fa23ea1
    • D
      PCI: hv: Add hv_pci_remove_slots() when we unload the driver · 76888d13
      Dexuan Cui 提交于
      commit 15becc2b56c6eda3d9bf5ae993bafd5661c1fad1 upstream.
      
      When we unload the pci-hyperv host controller driver, the host does not
      send us a PCI_EJECT message.
      
      In this case we also need to make sure the sysfs PCI slot directory is
      removed, otherwise a command on a slot file eg:
      
      "cat /sys/bus/pci/slots/2/address"
      
      will trigger a
      
      "BUG: unable to handle kernel paging request"
      
      and, if we unload/reload the driver several times we would end up with
      stale slot entries in PCI slot directories in /sys/bus/pci/slots/
      
      root@localhost:~# ls -rtl  /sys/bus/pci/slots/
      total 0
      drwxr-xr-x 2 root root 0 Feb  7 10:49 2
      drwxr-xr-x 2 root root 0 Feb  7 10:49 2-1
      drwxr-xr-x 2 root root 0 Feb  7 10:51 2-2
      
      Add the missing code to remove the PCI slot and fix the current
      behaviour.
      
      Fixes: a15f2c08 ("PCI: hv: support reporting serial number as slot information")
      Signed-off-by: NDexuan Cui <decui@microsoft.com>
      [lorenzo.pieralisi@arm.com: reformatted the log]
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: NStephen Hemminger <sthemmin@microsoft.com>
      Reviewed-by: NMichael Kelley <mikelley@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76888d13
    • D
      PCI: hv: Fix a memory leak in hv_eject_device_work() · a47e0054
      Dexuan Cui 提交于
      commit 05f151a73ec2b23ffbff706e5203e729a995cdc2 upstream.
      
      When a device is created in new_pcichild_device(), hpdev->refs is set
      to 2 (i.e. the initial value of 1 plus the get_pcichild()).
      
      When we hot remove the device from the host, in a Linux VM we first call
      hv_pci_eject_device(), which increases hpdev->refs by get_pcichild() and
      then schedules a work of hv_eject_device_work(), so hpdev->refs becomes
      3 (let's ignore the paired get/put_pcichild() in other places). But in
      hv_eject_device_work(), currently we only call put_pcichild() twice,
      meaning the 'hpdev' struct can't be freed in put_pcichild().
      
      Add one put_pcichild() to fix the memory leak.
      
      The device can also be removed when we run "rmmod pci-hyperv". On this
      path (hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_devices_present()),
      hpdev->refs is 2, and we do correctly call put_pcichild() twice in
      pci_devices_present_work().
      
      Fixes: 4daace0d ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
      Signed-off-by: NDexuan Cui <decui@microsoft.com>
      [lorenzo.pieralisi@arm.com: commit log rework]
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: NStephen Hemminger <stephen@networkplumber.org>
      Reviewed-by: NMichael Kelley <mikelley@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a47e0054
    • L
      powerpc/booke64: set RI in default MSR · 4179b858
      Laurentiu Tudor 提交于
      commit 5266e58d6cd90ac85c187d673093ad9cb649e16d upstream.
      
      Set RI in the default kernel's MSR so that the architected way of
      detecting unrecoverable machine check interrupts has a chance to work.
      This is inline with the MSR setup of the rest of booke powerpc
      architectures configured here.
      Signed-off-by: NLaurentiu Tudor <laurentiu.tudor@nxp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4179b858
    • R
      powerpc/powernv/idle: Restore IAMR after idle · 71b20cdb
      Russell Currey 提交于
      commit a3f3072db6cad40895c585dce65e36aab997f042 upstream.
      
      Without restoring the IAMR after idle, execution prevention on POWER9
      with Radix MMU is overwritten and the kernel can freely execute
      userspace without faulting.
      
      This is necessary when returning from any stop state that modifies
      user state, as well as hypervisor state.
      
      To test how this fails without this patch, load the lkdtm driver and
      do the following:
      
        $ echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT
      
      which won't fault, then boot the kernel with powersave=off, where it
      will fault. Applying this patch will fix this.
      
      Fixes: 3b10d009 ("powerpc/mm/radix: Prevent kernel execution of user space")
      Cc: stable@vger.kernel.org # v4.10+
      Signed-off-by: NRussell Currey <ruscur@russell.cc>
      Reviewed-by: NAkshay Adiga <akshay.adiga@linux.vnet.ibm.com>
      Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      71b20cdb
    • R
      powerpc/book3s/64: check for NULL pointer in pgd_alloc() · 69c2b71c
      Rick Lindsley 提交于
      commit f39356261c265a0689d7ee568132d516e8b6cecc upstream.
      
      When the memset code was added to pgd_alloc(), it failed to consider
      that kmem_cache_alloc() can return NULL. It's uncommon, but not
      impossible under heavy memory contention. Example oops:
      
        Unable to handle kernel paging request for data at address 0x00000000
        Faulting instruction address: 0xc0000000000a4000
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE SMP NR_CPUS=2048 NUMA pSeries
        CPU: 70 PID: 48471 Comm: entrypoint.sh Kdump: loaded Not tainted 4.14.0-115.6.1.el7a.ppc64le #1
        task: c000000334a00000 task.stack: c000000331c00000
        NIP:  c0000000000a4000 LR: c00000000012f43c CTR: 0000000000000020
        REGS: c000000331c039c0 TRAP: 0300   Not tainted  (4.14.0-115.6.1.el7a.ppc64le)
        MSR:  800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022840  XER: 20040000
        CFAR: c000000000008874 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
        ...
        NIP [c0000000000a4000] memset+0x68/0x104
        LR [c00000000012f43c] mm_init+0x27c/0x2f0
        Call Trace:
          mm_init+0x260/0x2f0 (unreliable)
          copy_mm+0x11c/0x638
          copy_process.isra.28.part.29+0x6fc/0x1080
          _do_fork+0xdc/0x4c0
          ppc_clone+0x8/0xc
        Instruction dump:
        409e000c b0860000 38c60002 409d000c 90860000 38c60004 78a0d183 78a506a0
        7c0903a6 41820034 60000000 60420000 <f8860000> f8860008 f8860010 f8860018
      
      Fixes: fc5c2f4a ("powerpc/mm/hash64: Zero PGD pages on allocation")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: NRick Lindsley <ricklind@vnet.linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      69c2b71c
    • D
      drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctl · e9ec5073
      Dan Carpenter 提交于
      commit 6a024330650e24556b8a18cc654ad00cfecf6c6c upstream.
      
      The "param.count" value is a u64 thatcomes from the user.  The code
      later in the function assumes that param.count is at least one and if
      it's not then it leads to an Oops when we dereference the ZERO_SIZE_PTR.
      
      Also the addition can have an integer overflow which would lead us to
      allocate a smaller "pages" array than required.  I can't immediately
      tell what the possible run times implications are, but it's safest to
      prevent the overflow.
      
      Link: http://lkml.kernel.org/r/20181218082129.GE32567@kadam
      Fixes: 6db71994 ("drivers/virt: introduce Freescale hypervisor management driver")
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Timur Tabi <timur@freescale.com>
      Cc: Mihai Caraman <mihai.caraman@freescale.com>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9ec5073
    • D
      drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl · ee3b53d8
      Dan Carpenter 提交于
      commit c8ea3663f7a8e6996d44500ee818c9330ac4fd88 upstream.
      
      strndup_user() returns error pointers on error, and then in the error
      handling we pass the error pointers to kfree().  It will cause an Oops.
      
      Link: http://lkml.kernel.org/r/20181218082003.GD32567@kadam
      Fixes: 6db71994 ("drivers/virt: introduce Freescale hypervisor management driver")
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Timur Tabi <timur@freescale.com>
      Cc: Mihai Caraman <mihai.caraman@freescale.com>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee3b53d8
    • P
      tipc: fix hanging clients using poll with EPOLLOUT flag · afa485dc
      Parthasarathy Bhuvaragan 提交于
      [ Upstream commit ff946833b70e0c7f93de9a3f5b329b5ae2287b38 ]
      
      commit 517d7c79 ("tipc: fix hanging poll() for stream sockets")
      introduced a regression for clients using non-blocking sockets.
      After the commit, we send EPOLLOUT event to the client even in
      TIPC_CONNECTING state. This causes the subsequent send() to fail
      with ENOTCONN, as the socket is still not in TIPC_ESTABLISHED state.
      
      In this commit, we:
      - improve the fix for hanging poll() by replacing sk_data_ready()
        with sk_state_change() to wake up all clients.
      - revert the faulty updates introduced by commit 517d7c79
        ("tipc: fix hanging poll() for stream sockets").
      
      Fixes: 517d7c79 ("tipc: fix hanging poll() for stream sockets")
      Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com>
      Acked-by: NJon Maloy <jon.maloy@ericsson.se>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      afa485dc
    • P
      isdn: bas_gigaset: use usb_fill_int_urb() properly · 98652e0b
      Paul Bolle 提交于
      [ Upstream commit 4014dfae3ccaaf3ec19c9ae0691a3f14e7132eae ]
      
      The switch to make bas_gigaset use usb_fill_int_urb() - instead of
      filling that urb "by hand" - missed the subtle ordering of the previous
      code.
      
      See, before the switch urb->dev was set to a member somewhere deep in a
      complicated structure and then supplied to usb_rcvisocpipe() and
      usb_sndisocpipe(). After that switch urb->dev wasn't set to anything
      specific before being supplied to those two macros. This triggers a
      nasty oops:
      
          BUG: unable to handle kernel NULL pointer dereference at 00000000
          #PF error: [normal kernel read fault]
          *pde = 00000000
          Oops: 0000 [#1] SMP
          CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.1.0-0.rc4.1.local0.fc28.i686 #1
          Hardware name: IBM 2525FAG/2525FAG, BIOS 74ET64WW (2.09 ) 12/14/2006
          EIP: gigaset_init_bchannel+0x89/0x320 [bas_gigaset]
          Code: 75 07 83 8b 84 00 00 00 40 8d 47 74 c7 07 01 00 00 00 89 45 f0 8b 44 b7 68 85 c0 0f 84 6a 02 00 00 8b 48 28 8b 93 88 00 00 00 <8b> 09 8d 54 12 03 c1 e2 0f c1 e1 08 09 ca 8b 8b 8c 00 00 00 80 ca
          EAX: f05ec200 EBX: ed404200 ECX: 00000000 EDX: 00000000
          ESI: 00000000 EDI: f065a000 EBP: f30c9f40 ESP: f30c9f20
          DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
          CR0: 80050033 CR2: 00000000 CR3: 0ddc7000 CR4: 000006d0
          Call Trace:
           <SOFTIRQ>
           ? gigaset_isdn_connD+0xf6/0x140 [gigaset]
           gigaset_handle_event+0x173e/0x1b90 [gigaset]
           tasklet_action_common.isra.16+0x4e/0xf0
           tasklet_action+0x1e/0x20
           __do_softirq+0xb2/0x293
           ? __irqentry_text_end+0x3/0x3
           call_on_stack+0x45/0x50
           </SOFTIRQ>
           ? irq_exit+0xb5/0xc0
           ? do_IRQ+0x78/0xd0
           ? acpi_idle_enter_s2idle+0x50/0x50
           ? common_interrupt+0xd4/0xdc
           ? acpi_idle_enter_s2idle+0x50/0x50
           ? sched_cpu_activate+0x1b/0xf0
           ? acpi_fan_resume.cold.7+0x9/0x18
           ? cpuidle_enter_state+0x152/0x4c0
           ? cpuidle_enter+0x14/0x20
           ? call_cpuidle+0x21/0x40
           ? do_idle+0x1c8/0x200
           ? cpu_startup_entry+0x25/0x30
           ? rest_init+0x88/0x8a
           ? arch_call_rest_init+0xd/0x19
           ? start_kernel+0x42f/0x448
           ? i386_start_kernel+0xac/0xb0
           ? startup_32_smp+0x164/0x168
          Modules linked in: ppp_generic slhc capi bas_gigaset gigaset kernelcapi nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc ipw2200 iTCO_wdt gpio_ich snd_intel8x0 libipw iTCO_vendor_support snd_ac97_codec lib80211 ppdev ac97_bus snd_seq cfg80211 snd_seq_device pcspkr thinkpad_acpi lpc_ich snd_pcm i2c_i801 snd_timer ledtrig_audio snd soundcore rfkill parport_pc parport pcc_cpufreq acpi_cpufreq i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sdhci_pci sysimgblt cqhci fb_sys_fops drm sdhci mmc_core tg3 ata_generic serio_raw yenta_socket pata_acpi video
          CR2: 0000000000000000
          ---[ end trace 1fe07487b9200c73 ]---
          EIP: gigaset_init_bchannel+0x89/0x320 [bas_gigaset]
          Code: 75 07 83 8b 84 00 00 00 40 8d 47 74 c7 07 01 00 00 00 89 45 f0 8b 44 b7 68 85 c0 0f 84 6a 02 00 00 8b 48 28 8b 93 88 00 00 00 <8b> 09 8d 54 12 03 c1 e2 0f c1 e1 08 09 ca 8b 8b 8c 00 00 00 80 ca
          EAX: f05ec200 EBX: ed404200 ECX: 00000000 EDX: 00000000
          ESI: 00000000 EDI: f065a000 EBP: f30c9f40 ESP: cddcb3bc
          DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
          CR0: 80050033 CR2: 00000000 CR3: 0ddc7000 CR4: 000006d0
          Kernel panic - not syncing: Fatal exception in interrupt
          Kernel Offset: 0xcc00000 from 0xc0400000 (relocation range: 0xc0000000-0xf6ffdfff)
          ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      No-one noticed because this Oops is apparently only triggered by setting
      up an ISDN data connection on a live ISDN line on a gigaset base (ie,
      the PBX that the gigaset driver support). Very few people do that
      running present day kernels.
      
      Anyhow, a little code reorganization makes this problem go away, while
      avoiding the subtle ordering that was used in the past. So let's do
      that.
      
      Fixes: 78c696c1 ("isdn: gigaset: use usb_fill_int_urb()")
      Signed-off-by: NPaul Bolle <pebolle@tiscali.nl>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      98652e0b
    • J
      tuntap: synchronize through tfiles array instead of tun->numqueues · 17d8a9eb
      Jason Wang 提交于
      [ Upstream commit 9871a9e47a2646fe30ae7fd2e67668a8d30912f6 ]
      
      When a queue(tfile) is detached through __tun_detach(), we move the
      last enabled tfile to the position where detached one sit but don't
      NULL out last position. We expect to synchronize the datapath through
      tun->numqueues. Unfortunately, this won't work since we're lacking
      sufficient mechanism to order or synchronize the access to
      tun->numqueues.
      
      To fix this, NULL out the last position during detaching and check
      RCU protected tfile against NULL instead of checking tun->numqueues in
      datapath.
      
      Cc: YueHaibing <yuehaibing@huawei.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: weiyongjun (A) <weiyongjun1@huawei.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: c8d68e6b ("tuntap: multiqueue support")
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Reviewed-by: NWei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      17d8a9eb
    • J
      tuntap: fix dividing by zero in ebpf queue selection · 9c79732f
      Jason Wang 提交于
      [ Upstream commit a35d310f03a692bf4798eb309a1950a06a150620 ]
      
      We need check if tun->numqueues is zero (e.g for the persist device)
      before trying to use it for modular arithmetic.
      Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
      Fixes: 96f84061("tun: add eBPF based queue selection method")
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c79732f
    • S
      vrf: sit mtu should not be updated when vrf netdev is the link · 737713e6
      Stephen Suryaputra 提交于
      [ Upstream commit ff6ab32bd4e073976e4d8797b4d514a172cfe6cb ]
      
      VRF netdev mtu isn't typically set and have an mtu of 65536. When the
      link of a tunnel is set, the tunnel mtu is changed from 1480 to the link
      mtu minus tunnel header. In the case of VRF netdev is the link, then the
      tunnel mtu becomes 65516. So, fix it by not setting the tunnel mtu in
      this case.
      Signed-off-by: NStephen Suryaputra <ssuryaextr@gmail.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      737713e6
    • H
      vlan: disable SIOCSHWTSTAMP in container · e3840607
      Hangbin Liu 提交于
      [ Upstream commit 873017af778439f2f8e3d87f28ddb1fcaf244a76 ]
      
      With NET_ADMIN enabled in container, a normal user could be mapped to
      root and is able to change the real device's rx filter via ioctl on
      vlan, which would affect the other ptp process on host. Fix it by
      disabling SIOCSHWTSTAMP in container.
      
      Fixes: a6111d3c ("vlan: Pass SIOC[SG]HWTSTAMP ioctls to real device")
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Acked-by: NRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e3840607
    • P
      selinux: do not report error on connect(AF_UNSPEC) · dfdfad3d
      Paolo Abeni 提交于
      [ Upstream commit c7e0d6cca86581092cbbf2cd868b3601495554cf ]
      
      calling connect(AF_UNSPEC) on an already connected TCP socket is an
      established way to disconnect() such socket. After commit 68741a8a
      ("selinux: Fix ltp test connect-syscall failure") it no longer works
      and, in the above scenario connect() fails with EAFNOSUPPORT.
      
      Fix the above falling back to the generic/old code when the address family
      is not AF_INET{4,6}, but leave the SCTP code path untouched, as it has
      specific constraints.
      
      Fixes: 68741a8a ("selinux: Fix ltp test connect-syscall failure")
      Reported-by: NTom Deseyn <tdeseyn@redhat.com>
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dfdfad3d
    • Y
      packet: Fix error path in packet_init · 9f51d6f7
      YueHaibing 提交于
      [ Upstream commit 36096f2f4fa05f7678bc87397665491700bae757 ]
      
      kernel BUG at lib/list_debug.c:47!
      invalid opcode: 0000 [#1
      CPU: 0 PID: 12914 Comm: rmmod Tainted: G        W         5.1.0+ #47
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:__list_del_entry_valid+0x53/0x90
      Code: 48 8b 32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 00 5d c3 48
      89 fe 48 89 c2 48 c7 c7 18 75 fe 82 e8 cb 34 78 ff <0f> 0b 48 89 fe 48 c7 c7 50 75 fe 82 e8 ba 34 78 ff 0f 0b 48 89 f2
      RSP: 0018:ffffc90001c2fe40 EFLAGS: 00010286
      RAX: 000000000000004e RBX: ffffffffa0184000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff888237a17788 RDI: 00000000ffffffff
      RBP: ffffc90001c2fe40 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffc90001c2fe10 R11: 0000000000000000 R12: 0000000000000000
      R13: ffffc90001c2fe50 R14: ffffffffa0184000 R15: 0000000000000000
      FS:  00007f3d83634540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000555c350ea818 CR3: 0000000231677000 CR4: 00000000000006f0
      Call Trace:
       unregister_pernet_operations+0x34/0x120
       unregister_pernet_subsys+0x1c/0x30
       packet_exit+0x1c/0x369 [af_packet
       __x64_sys_delete_module+0x156/0x260
       ? lockdep_hardirqs_on+0x133/0x1b0
       ? do_syscall_64+0x12/0x1f0
       do_syscall_64+0x6e/0x1f0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      When modprobe af_packet, register_pernet_subsys
      fails and does a cleanup, ops->list is set to LIST_POISON1,
      but the module init is considered to success, then while rmmod it,
      BUG() is triggered in __list_del_entry_valid which is called from
      unregister_pernet_subsys. This patch fix error handing path in
      packet_init to avoid possilbe issue if some error occur.
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f51d6f7
    • C
      net: ucc_geth - fix Oops when changing number of buffers in the ring · 2e95eb9c
      Christophe Leroy 提交于
      [ Upstream commit ee0df19305d9fabd9479b785918966f6e25b733b ]
      
      When changing the number of buffers in the RX ring while the interface
      is running, the following Oops is encountered due to the new number
      of buffers being taken into account immediately while their allocation
      is done when opening the device only.
      
      [   69.882706] Unable to handle kernel paging request for data at address 0xf0000100
      [   69.890172] Faulting instruction address: 0xc033e164
      [   69.895122] Oops: Kernel access of bad area, sig: 11 [#1]
      [   69.900494] BE PREEMPT CMPCPRO
      [   69.907120] CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.115-00006-g179ade8ce3-dirty #269
      [   69.915956] task: c0684310 task.stack: c06da000
      [   69.920470] NIP:  c033e164 LR: c02e44d0 CTR: c02e41fc
      [   69.925504] REGS: dfff1e20 TRAP: 0300   Not tainted  (4.14.115-00006-g179ade8ce3-dirty)
      [   69.934161] MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 22004428  XER: 20000000
      [   69.940869] DAR: f0000100 DSISR: 20000000
      [   69.940869] GPR00: c0352d70 dfff1ed0 c0684310 f00000a4 00000040 dfff1f68 00000000 0000001f
      [   69.940869] GPR08: df53f410 1cc00040 00000021 c0781640 42004424 100c82b6 f00000a4 df53f5b0
      [   69.940869] GPR16: df53f6c0 c05daf84 00000040 00000000 00000040 c0782be4 00000000 00000001
      [   69.940869] GPR24: 00000000 df53f400 000001b0 df53f410 df53f000 0000003f df708220 1cc00044
      [   69.978348] NIP [c033e164] skb_put+0x0/0x5c
      [   69.982528] LR [c02e44d0] ucc_geth_poll+0x2d4/0x3f8
      [   69.987384] Call Trace:
      [   69.989830] [dfff1ed0] [c02e4554] ucc_geth_poll+0x358/0x3f8 (unreliable)
      [   69.996522] [dfff1f20] [c0352d70] net_rx_action+0x248/0x30c
      [   70.002099] [dfff1f80] [c04e93e4] __do_softirq+0xfc/0x310
      [   70.007492] [dfff1fe0] [c0021124] irq_exit+0xd0/0xd4
      [   70.012458] [dfff1ff0] [c000e7e0] call_do_irq+0x24/0x3c
      [   70.017683] [c06dbe80] [c0006bac] do_IRQ+0x64/0xc4
      [   70.022474] [c06dbea0] [c001097c] ret_from_except+0x0/0x14
      [   70.027964] --- interrupt: 501 at rcu_idle_exit+0x84/0x90
      [   70.027964]     LR = rcu_idle_exit+0x74/0x90
      [   70.037585] [c06dbf60] [20000000] 0x20000000 (unreliable)
      [   70.042984] [c06dbf80] [c004bb0c] do_idle+0xb4/0x11c
      [   70.047945] [c06dbfa0] [c004bd14] cpu_startup_entry+0x18/0x1c
      [   70.053682] [c06dbfb0] [c05fb034] start_kernel+0x370/0x384
      [   70.059153] [c06dbff0] [00003438] 0x3438
      [   70.063062] Instruction dump:
      [   70.066023] 38a00000 38800000 90010014 4bfff015 80010014 7c0803a6 3123ffff 7c691910
      [   70.073767] 38210010 4e800020 38600000 4e800020 <80e3005c> 80c30098 3107ffff 7d083910
      [   70.081690] ---[ end trace be7ccd9c1e1a9f12 ]---
      
      This patch forbids the modification of the number of buffers in the
      ring while the interface is running.
      
      Fixes: ac421852 ("ucc_geth: add ethtool support")
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e95eb9c
    • T
      net: seeq: fix crash caused by not set dev.parent · 210057b7
      Thomas Bogendoerfer 提交于
      [ Upstream commit 5afcd14cfc7fed1bcc8abcee2cef82732772bfc2 ]
      
      The old MIPS implementation of dma_cache_sync() didn't use the dev argument,
      but commit c9eb6172 ("dma-mapping: turn dma_cache_sync into a
      dma_map_ops method") changed that, so we now need to set dev.parent.
      Signed-off-by: NThomas Bogendoerfer <tbogendoerfer@suse.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      210057b7
    • H
      net: macb: Change interrupt and napi enable order in open · dfd91928
      Harini Katakam 提交于
      [ Upstream commit 0504453139ef5a593c9587e1e851febee859c7d8 ]
      
      Current order in open:
      -> Enable interrupts (macb_init_hw)
      -> Enable NAPI
      -> Start PHY
      
      Sequence of RX handling:
      -> RX interrupt occurs
      -> Interrupt is cleared and interrupt bits disabled in handler
      -> NAPI is scheduled
      -> In NAPI, RX budget is processed and RX interrupts are re-enabled
      
      With the above, on QEMU or fixed link setups (where PHY state doesn't
      matter), there's a chance macb RX interrupt occurs before NAPI is
      enabled. This will result in NAPI being scheduled before it is enabled.
      Fix this macb open by changing the order.
      
      Fixes: ae1f2a56 ("net: macb: Added support for many RX queues")
      Signed-off-by: NHarini Katakam <harini.katakam@xilinx.com>
      Acked-by: NNicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dfd91928
    • C
      net: ethernet: stmmac: dwmac-sun8i: enable support of unicast filtering · 68df8383
      Corentin Labbe 提交于
      [ Upstream commit d4c26eb6e721683a0f93e346ce55bc8dc3cbb175 ]
      
      When adding more MAC addresses to a dwmac-sun8i interface, the device goes
      directly in promiscuous mode.
      This is due to IFF_UNICAST_FLT missing flag.
      
      So since the hardware support unicast filtering, let's add IFF_UNICAST_FLT.
      
      Fixes: 9f93ac8d ("net-next: stmmac: Add dwmac-sun8i")
      Signed-off-by: NCorentin Labbe <clabbe@baylibre.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      68df8383
    • Y
      net: dsa: Fix error cleanup path in dsa_init_module · 9284895b
      YueHaibing 提交于
      [ Upstream commit 68be930249d051fd54d3d99156b3dcadcb2a1f9b ]
      
      BUG: unable to handle kernel paging request at ffffffffa01c5430
      PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bc5067 PTE 0
      Oops: 0000 [#1
      CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ #33
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:raw_notifier_chain_register+0x16/0x40
      Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08
      RSP: 0018:ffffc90001c33c08 EFLAGS: 00010282
      RAX: ffffffffa01c5420 RBX: ffffffffa01db420 RCX: 4fcef45928070a8b
      RDX: 0000000000000000 RSI: ffffffffa01db420 RDI: ffffffffa01b0068
      RBP: ffffc90001c33c08 R08: 000000003e0a33d0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000094443661 R12: ffff88822c320700
      R13: ffff88823109be80 R14: 0000000000000000 R15: ffffc90001c33e78
      FS:  00007fab8bd08540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffa01c5430 CR3: 00000002297ea000 CR4: 00000000000006f0
      Call Trace:
       register_netdevice_notifier+0x43/0x250
       ? 0xffffffffa01e0000
       dsa_slave_register_notifier+0x13/0x70 [dsa_core
       ? 0xffffffffa01e0000
       dsa_init_module+0x2e/0x1000 [dsa_core
       do_one_initcall+0x6c/0x3cc
       ? do_init_module+0x22/0x1f1
       ? rcu_read_lock_sched_held+0x97/0xb0
       ? kmem_cache_alloc_trace+0x325/0x3b0
       do_init_module+0x5b/0x1f1
       load_module+0x1db1/0x2690
       ? m_show+0x1d0/0x1d0
       __do_sys_finit_module+0xc5/0xd0
       __x64_sys_finit_module+0x15/0x20
       do_syscall_64+0x6b/0x1d0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Cleanup allocated resourses if there are errors,
      otherwise it will trgger memleak.
      
      Fixes: c9eb3e0f ("net: dsa: Add support for learning FDB through notification")
      Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
      Reviewed-by: NVivien Didelot <vivien.didelot@gmail.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9284895b
    • D
      ipv4: Fix raw socket lookup for local traffic · da2e770f
      David Ahern 提交于
      [ Upstream commit 19e4e768064a87b073a4b4c138b55db70e0cfb9f ]
      
      inet_iif should be used for the raw socket lookup. inet_iif considers
      rt_iif which handles the case of local traffic.
      
      As it stands, ping to a local address with the '-I <dev>' option fails
      ever since ping was changed to use SO_BINDTODEVICE instead of
      cmsg + IP_PKTINFO.
      
      IPv6 works fine.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      da2e770f