1. 01 7月, 2019 4 次提交
  2. 30 6月, 2019 9 次提交
    • B
      net: dsa: mv88e6xxx: wait after reset deactivation · 7b75e49d
      Baruch Siach 提交于
      Add a 1ms delay after reset deactivation. Otherwise the chip returns
      bogus ID value. This is observed with 88E6390 (Peridot) chip.
      Signed-off-by: NBaruch Siach <baruch@tkos.co.il>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7b75e49d
    • G
      bnx2x: Prevent ptp_task to be rescheduled indefinitely · 3c91f25c
      Guilherme G. Piccoli 提交于
      Currently bnx2x ptp worker tries to read a register with timestamp
      information in case of TX packet timestamping and in case it fails,
      the routine reschedules itself indefinitely. This was reported as a
      kworker always at 100% of CPU usage, which was narrowed down to be
      bnx2x ptp_task.
      
      By following the ioctl handler, we could narrow down the problem to
      an NTP tool (chrony) requesting HW timestamping from bnx2x NIC with
      RX filter zeroed; this isn't reproducible for example with ptp4l
      (from linuxptp) since this tool requests a supported RX filter.
      It seems NIC FW timestamp mechanism cannot work well with
      RX_FILTER_NONE - driver's PTP filter init routine skips a register
      write to the adapter if there's not a supported filter request.
      
      This patch addresses the problem of bnx2x ptp thread's everlasting
      reschedule by retrying the register read 10 times; between the read
      attempts the thread sleeps for an increasing amount of time starting
      in 1ms to give FW some time to perform the timestamping. If it still
      fails after all retries, we bail out in order to prevent an unbound
      resource consumption from bnx2x.
      
      The patch also adds an ethtool statistic for accounting the skipped
      TX timestamp packets and it reduces the priority of timestamping
      error messages to prevent log flooding. The code was tested using
      both linuxptp and chrony.
      Reported-and-tested-by: NPrzemyslaw Hausman <przemyslaw.hausman@canonical.com>
      Suggested-by: NSudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: NGuilherme G. Piccoli <gpiccoli@canonical.com>
      Acked-by: NSudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3c91f25c
    • E
      igmp: fix memory leak in igmpv3_del_delrec() · e5b1c6c6
      Eric Dumazet 提交于
      im->tomb and/or im->sources might not be NULL, but we
      currently overwrite their values blindly.
      
      Using swap() will make sure the following call to kfree_pmc(pmc)
      will properly free the psf structures.
      
      Tested with the C repro provided by syzbot, which basically does :
      
       socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
       setsockopt(3, SOL_IP, IP_ADD_MEMBERSHIP, "\340\0\0\2\177\0\0\1\0\0\0\0", 12) = 0
       ioctl(3, SIOCSIFFLAGS, {ifr_name="lo", ifr_flags=0}) = 0
       setsockopt(3, SOL_IP, IP_MSFILTER, "\340\0\0\2\177\0\0\1\1\0\0\0\1\0\0\0\377\377\377\377", 20) = 0
       ioctl(3, SIOCSIFFLAGS, {ifr_name="lo", ifr_flags=IFF_UP}) = 0
       exit_group(0)                    = ?
      
      BUG: memory leak
      unreferenced object 0xffff88811450f140 (size 64):
        comm "softirq", pid 0, jiffies 4294942448 (age 32.070s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00  ................
          00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000c7bad083>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
          [<00000000c7bad083>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<00000000c7bad083>] slab_alloc mm/slab.c:3326 [inline]
          [<00000000c7bad083>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
          [<000000009acc4151>] kmalloc include/linux/slab.h:547 [inline]
          [<000000009acc4151>] kzalloc include/linux/slab.h:742 [inline]
          [<000000009acc4151>] ip_mc_add1_src net/ipv4/igmp.c:1976 [inline]
          [<000000009acc4151>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2100
          [<000000004ac14566>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2484
          [<0000000052d8f995>] do_ip_setsockopt.isra.0+0x1795/0x1930 net/ipv4/ip_sockglue.c:959
          [<000000004ee1e21f>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1248
          [<0000000066cdfe74>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2618
          [<000000009383a786>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3126
          [<00000000d8ac0c94>] __sys_setsockopt+0x98/0x120 net/socket.c:2072
          [<000000001b1e9666>] __do_sys_setsockopt net/socket.c:2083 [inline]
          [<000000001b1e9666>] __se_sys_setsockopt net/socket.c:2080 [inline]
          [<000000001b1e9666>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2080
          [<00000000420d395e>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
          [<000000007fd83a4b>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 24803f38 ("igmp: do not remove igmp souce list info when set link down")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Hangbin Liu <liuhangbin@gmail.com>
      Reported-by: syzbot+6ca1abd0db68b5173a4f@syzkaller.appspotmail.com
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e5b1c6c6
    • D
      Merge branch 'Sub-ns-increment-fixes-in-Macb-PTP' · c09fedd6
      David S. Miller 提交于
      Harini Katakam says:
      
      ====================
      Sub ns increment fixes in Macb PTP
      
      The subns increment register fields are not captured correctly in the
      driver. Fix the same and also increase the subns incr resolution.
      
      Sub ns resolution was increased to 24 bits in r1p06f2 version. To my
      knowledge, this PTP driver, with its current BD time stamp
      implementation, is only useful to that version or above. So, I have
      increased the resolution unconditionally. Please let me know if there
      is any IP versions incompatible with this - there is no register to
      obtain this information from.
      
      Changes from RFC:
      None
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c09fedd6
    • H
      net: macb: Fix SUBNS increment and increase resolution · 7ad342bc
      Harini Katakam 提交于
      The subns increment register has 24 bits as follows:
      RegBit[15:0] = Subns[23:8]; RegBit[31:24] = Subns[7:0]
      
      Fix the same in the driver and increase sub ns resolution to the
      best capable, 24 bits. This should be the case on all GEM versions
      that this PTP driver supports.
      Signed-off-by: NHarini Katakam <harini.katakam@xilinx.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7ad342bc
    • H
      net: macb: Add separate definition for PPM fraction · a8ee4dc1
      Harini Katakam 提交于
      The scaled ppm parameter passed to _adjfine() contains a 16 bit
      fraction. This just happens to be the same as SUBNSINCR_SIZE now.
      Hence define this separately.
      Signed-off-by: NHarini Katakam <harini.katakam@xilinx.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a8ee4dc1
    • J
      packet: Fix undefined behavior in bit shift · 79293f49
      Jiunn Chang 提交于
      Shifting signed 32-bit value by 31 bits is undefined.  Changing most
      significant bit to unsigned.
      
      Changes included in v2:
        - use subsystem specific subject lines
        - CC required mailing lists
      Signed-off-by: NJiunn Chang <c0d1n61at3@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79293f49
    • F
      net: make skb_dst_force return true when dst is refcounted · b60a7738
      Florian Westphal 提交于
      netfilter did not expect that skb_dst_force() can cause skb to lose its
      dst entry.
      
      I got a bug report with a skb->dst NULL dereference in netfilter
      output path.  The backtrace contains nf_reinject(), so the dst might have
      been cleared when skb got queued to userspace.
      
      Other users were fixed via
      if (skb_dst(skb)) {
      	skb_dst_force(skb);
      	if (!skb_dst(skb))
      		goto handle_err;
      }
      
      But I think its preferable to make the 'dst might be cleared' part
      of the function explicit.
      
      In netfilter case, skb with a null dst is expected when queueing in
      prerouting hook, so drop skb for the other hooks.
      
      v2:
       v1 of this patch returned true in case skb had no dst entry.
       Eric said:
         Say if we have two skb_dst_force() calls for some reason
         on the same skb, only the first one will return false.
      
       This now returns false even when skb had no dst, as per Erics
       suggestion, so callers might need to check skb_dst() first before
       skb_dst_force().
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b60a7738
    • X
      sctp: not bind the socket in sctp_connect · 9b6c0887
      Xin Long 提交于
      Now when sctp_connect() is called with a wrong sa_family, it binds
      to a port but doesn't set bp->port, then sctp_get_af_specific will
      return NULL and sctp_connect() returns -EINVAL.
      
      Then if sctp_bind() is called to bind to another port, the last
      port it has bound will leak due to bp->port is NULL by then.
      
      sctp_connect() doesn't need to bind ports, as later __sctp_connect
      will do it if bp->port is NULL. So remove it from sctp_connect().
      While at it, remove the unnecessary sockaddr.sa_family len check
      as it's already done in sctp_inet_connect.
      
      Fixes: 644fbdea ("sctp: fix the issue that flags are ignored when using kernel_connect")
      Reported-by: syzbot+079bf326b38072f849d9@syzkaller.appspotmail.com
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b6c0887
  3. 29 6月, 2019 9 次提交
  4. 28 6月, 2019 10 次提交
    • J
      nl80211: Fix undefined behavior in bit shift · d2ce8d6b
      Jiunn Chang 提交于
      Shifting signed 32-bit value by 31 bits is undefined.  Changing most
      significant bit to unsigned.
      Signed-off-by: NJiunn Chang <c0d1n61at3@gmail.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      d2ce8d6b
    • L
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 556e2f60
      Linus Torvalds 提交于
      Pull clk fixes from Stephen Boyd:
       "A handful of clk driver fixes and one core framework fix
      
         - Do a DT/firmware lookup in clk_core_get() even when the DT index is
           a nonsensical value
      
         - Fix some clk data typos in the Amlogic DT headers/code
      
         - Avoid returning junk in the TI clk driver when an invalid clk is
           looked for
      
         - Fix dividers for the emac clks on Stratix10 SoCs
      
         - Fix default HDA rates on Tegra210 to correct distorted audio"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: socfpga: stratix10: fix divider entry for the emac clocks
        clk: Do a DT parent lookup even when index < 0
        clk: tegra210: Fix default rates for HDA clocks
        clk: ti: clkctrl: Fix returning uninitialized data
        clk: meson: meson8b: fix a typo in the VPU parent names array variable
        clk: meson: fix MPLL 50M binding id typo
      556e2f60
    • L
      Merge tag 'for-5.2/dm-fixes-2' of... · 65ee21eb
      Linus Torvalds 提交于
      Merge tag 'for-5.2/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix incorrect uses of kstrndup and DM logging macros in DM's early
         init code.
      
       - Fix DM log-writes target's handling of super block sectors so updates
         are made in order through use of completion.
      
       - Fix DM core's argument splitting code to avoid undefined behaviour
         reported as a side-effect of UBSAN analysis on ppc64le.
      
       - Fix DM verity target to limit the amount of error messages that can
         result from a corrupt block being found.
      
      * tag 'for-5.2/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm verity: use message limit for data block corruption message
        dm table: don't copy from a NULL pointer in realloc_argv()
        dm log writes: make sure super sector log updates are written in order
        dm init: remove trailing newline from calls to DMERR() and DMINFO()
        dm init: fix incorrect uses of kstrndup()
      65ee21eb
    • L
      Merge tag 'for-linus-20190627' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux · 7a702b4e
      Linus Torvalds 提交于
      Pull pidfd fixes from Christian Brauner:
       "Userspace tools and libraries such as strace or glibc need a cheap and
        reliable way to tell whether CLONE_PIDFD is supported. The easiest way
        is to pass an invalid fd value in the return argument, perform the
        syscall and verify the value in the return argument has been changed
        to a valid fd.
      
        However, if CLONE_PIDFD is specified we currently check if pidfd == 0
        and return EINVAL if not.
      
        The check for pidfd == 0 was originally added to enable us to abuse
        the return argument for passing additional flags along with
        CLONE_PIDFD in the future.
      
        However, extending legacy clone this way would be a terrible idea and
        with clone3 on the horizon and the ability to reuse CLONE_DETACHED
        with CLONE_PIDFD there's no real need for this clutch. So remove the
        pidfd == 0 check and help userspace out.
      
        Also, accordig to Al, anon_inode_getfd() should only be used past the
        point of no failure and ksys_close() should not be used at all since
        it is far too easy to get wrong. Al's motto being "basically, once
        it's in descriptor table, it's out of your control". So Al's patch
        switches back to what we already had in v1 of the original patchset
        and uses a anon_inode_getfile() + put_user() + fd_install() sequence
        in the success path and a fput() + put_unused_fd() in the failure
        path.
      
        The other two changes should be trivial"
      
      * tag 'for-linus-20190627' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
        proc: remove useless d_is_dir() check
        copy_process(): don't use ksys_close() on cleanups
        samples: make pidfd-metadata fail gracefully on older kernels
        fork: don't check parent_tidptr with CLONE_PIDFD
      7a702b4e
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · 763cf1f2
      Linus Torvalds 提交于
      Pull HID fixes from Jiri Kosina:
      
       - fix for one corner case in HID++ protocol with respect to handling
         very long reports, from Hans de Goede
      
       - power management fix in Intel-ISH driver, from Hyungwoo Yang
      
       - use-after-free fix in Intel-ISH driver, from Dan Carpenter
      
       - a couple of new device IDs/quirks from Kai-Heng Feng, Kyle Godbey and
         Oleksandr Natalenko
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: intel-ish-hid: fix wrong driver_data usage
        HID: multitouch: Add pointstick support for ALPS Touchpad
        HID: logitech-dj: Fix forwarding of very long HID++ reports
        HID: uclogic: Add support for Huion HS64 tablet
        HID: chicony: add another quirk for PixArt mouse
        HID: intel-ish-hid: Fix a use after free in load_fw_from_host()
      763cf1f2
    • L
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · fe2da896
      Linus Torvalds 提交于
      Pull ARM SoC fixes from Olof Johansson:
       "A smaller batch of fixes, nothing that stands out as risky or scary.
      
        Mostly DTS tweaks for a few issues:
      
         - GPU fixlets for Meson
      
         - CPU idle fix for LS1028A
      
         - PWM interrupt fixes for i.MX6UL
      
        Also, enable a driver (FSL_EDMA) on arm64 defconfig, and a warning and
        two MAINTAINER tweaks"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
        ARM: dts: imx6ul: fix PWM[1-4] interrupts
        ARM: omap2: remove incorrect __init annotation
        ARM: dts: gemini Fix up DNS-313 compatible string
        ARM: dts: Blank D-Link DIR-685 console
        arm64: defconfig: Enable FSL_EDMA driver
        arm64: dts: ls1028a: Fix CPU idle fail.
        MAINTAINERS: BCM53573: Add internal Broadcom mailing list
        MAINTAINERS: BCM2835: Add internal Broadcom mailing list
        ARM: dts: meson8b: fix the operating voltage of the Mali GPU
        ARM: dts: meson8b: drop undocumented property from the Mali GPU node
        ARM: dts: meson8: fix GPU interrupts and drop an undocumented property
      fe2da896
    • L
      Merge tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · cd0f3aae
      Linus Torvalds 提交于
      Pull AFS fixes from David Howells:
       "The in-kernel AFS client has been undergoing testing on opendev.org on
        one of their mirror machines. They are using AFS to hold data that is
        then served via apache, and Ian Wienand had reported seeing oopses,
        spontaneous machine reboots and updates to volumes going missing. This
        patch series appears to have fixed the problem, very probably due to
        patch (2), but it's not 100% certain.
      
        (1) Fix the printing of the "vnode modified" warning to exclude checks
            on files for which we don't have a callback promise from the
            server (and so don't expect the server to tell us when it
            changes).
      
            Without this, for every file or directory for which we still have
            an in-core inode that gets changed on the server, we may get a
            message logged when we next look at it. This can happen in bulk
            if, for instance, someone does "vos release" to update a R/O
            volume from a R/W volume and a whole set of files are all changed
            together.
      
            We only really want to log a message if the file changed and the
            server didn't tell us about it or we failed to track the state
            internally.
      
        (2) Fix accidental corruption of either afs_vlserver struct objects or
            the the following memory locations (which could hold anything).
            The issue is caused by a union that points to two different
            structs in struct afs_call (to save space in the struct). The call
            cleanup code assumes that it can simply call the cleanup for one
            of those structs if not NULL - when it might be actually pointing
            to the other struct.
      
            This means that every Volume Location RPC op is going to corrupt
            something.
      
        (3) Fix an uninitialised spinlock. This isn't too bad, it just causes
            a one-off warning if lockdep is enabled when "vos release" is
            called, but the spinlock still behaves correctly.
      
        (4) Fix the setting of i_block in the inode. This causes du, for
            example, to produce incorrect results, but otherwise should not be
            dangerous to the kernel"
      
      * tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: Fix setting of i_blocks
        afs: Fix uninitialised spinlock afs_volume::cb_break_lock
        afs: Fix vlserver record corruption
        afs: Fix over zealous "vnode modified" warnings
      cd0f3aae
    • L
      Merge tag 'csky-for-linus-5.2-fixup-gcc-unwind' of git://github.com/c-sky/csky-linux · 139ca258
      Linus Torvalds 提交于
      Pull arch/csky fixup from Guo Ren:
       "A fixup patch for rt_sigframe in signal.c"
      
      * tag 'csky-for-linus-5.2-fixup-gcc-unwind' of git://github.com/c-sky/csky-linux:
        csky: Fixup libgcc unwind error
      139ca258
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c84afab0
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Fix ppp_mppe crypto soft dependencies, from Takashi Iawi.
      
       2) Fix TX completion to be finite, from Sergej Benilov.
      
       3) Use register_pernet_device to avoid a dst leak in tipc, from Xin
          Long.
      
       4) Double free of TX cleanup in Dirk van der Merwe.
      
       5) Memory leak in packet_set_ring(), from Eric Dumazet.
      
       6) Out of bounds read in qmi_wwan, from Bjørn Mork.
      
       7) Fix iif used in mcast/bcast looped back packets, from Stephen
          Suryaputra.
      
       8) Fix neighbour resolution on raw ipv6 sockets, from Nicolas Dichtel.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (25 commits)
        af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
        sctp: change to hold sk after auth shkey is created successfully
        ipv6: fix neighbour resolution with raw socket
        ipv6: constify rt6_nexthop()
        net: dsa: microchip: Use gpiod_set_value_cansleep()
        net: aquantia: fix vlans not working over bridged network
        ipv4: reset rt_iif for recirculated mcast/bcast out pkts
        team: Always enable vlan tx offload
        net/smc: Fix error path in smc_init
        net/smc: hold conns_lock before calling smc_lgr_register_conn()
        bonding: Always enable vlan tx offload
        net/ipv6: Fix misuse of proc_dointvec "skip_notify_on_dev_down"
        ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop
        qmi_wwan: Fix out-of-bounds read
        tipc: check msg->req data len in tipc_nl_compat_bearer_disable
        net: macb: do not copy the mac address if NULL
        net/packet: fix memory leak in packet_set_ring()
        net/tls: fix page double free on TX cleanup
        net/sched: cbs: Fix error path of cbs_module_init
        tipc: change to use register_pernet_device
        ...
      c84afab0
    • L
      mt76: usb: fix rx A-MSDU support · 2a92b08b
      Lorenzo Bianconi 提交于
      Commit f8f527b1 ("mt76: usb: use EP max packet aligned buffer sizes
      for rx") breaks A-MSDU support. When A-MSDU is enable the device can
      receive frames up to q->buf_size but they will be discarded in
      mt76u_process_rx_entry since there is no enough room for
      skb_shared_info. Fix the issue reallocating the skb and copying in the
      linear area the first 128B of the received frames and in the frag_list
      the remaining part
      
      Fixes: f8f527b1 ("mt76: usb: use EP max packet aligned buffer sizes for rx")
      Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      2a92b08b
  5. 27 6月, 2019 8 次提交
    • C
      proc: remove useless d_is_dir() check · 30d158b1
      Christian Brauner 提交于
      Remove the d_is_dir() check from tgid_pidfd_to_pid().
      
      It is pointless since you should never get &proc_tgid_base_operations
      for f_op on a non-directory.
      Suggested-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NChristian Brauner <christian@brauner.io>
      30d158b1
    • A
      copy_process(): don't use ksys_close() on cleanups · 6fd2fe49
      Al Viro 提交于
      anon_inode_getfd() should be used *ONLY* in situations when we are
      guaranteed to be past the last failure point (including copying the
      descriptor number to userland, at that).  And ksys_close() should
      not be used for cleanups at all.
      
      anon_inode_getfile() is there for all nontrivial cases like that.
      Just use that...
      
      Fixes: b3e58382 ("clone: add CLONE_PIDFD")
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Reviewed-by: NJann Horn <jannh@google.com>
      Signed-off-by: NChristian Brauner <christian@brauner.io>
      6fd2fe49
    • N
      af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET · 89ed5b51
      Neil Horman 提交于
      When an application is run that:
      a) Sets its scheduler to be SCHED_FIFO
      and
      b) Opens a memory mapped AF_PACKET socket, and sends frames with the
      MSG_DONTWAIT flag cleared, its possible for the application to hang
      forever in the kernel.  This occurs because when waiting, the code in
      tpacket_snd calls schedule, which under normal circumstances allows
      other tasks to run, including ksoftirqd, which in some cases is
      responsible for freeing the transmitted skb (which in AF_PACKET calls a
      destructor that flips the status bit of the transmitted frame back to
      available, allowing the transmitting task to complete).
      
      However, when the calling application is SCHED_FIFO, its priority is
      such that the schedule call immediately places the task back on the cpu,
      preventing ksoftirqd from freeing the skb, which in turn prevents the
      transmitting task from detecting that the transmission is complete.
      
      We can fix this by converting the schedule call to a completion
      mechanism.  By using a completion queue, we force the calling task, when
      it detects there are no more frames to send, to schedule itself off the
      cpu until such time as the last transmitted skb is freed, allowing
      forward progress to be made.
      
      Tested by myself and the reporter, with good results
      
      Change Notes:
      
      V1->V2:
      	Enhance the sleep logic to support being interruptible and
      allowing for honoring to SK_SNDTIMEO (Willem de Bruijn)
      
      V2->V3:
      	Rearrage the point at which we wait for the completion queue, to
      avoid needing to check for ph/skb being null at the end of the loop.
      Also move the complete call to the skb destructor to avoid needing to
      modify __packet_set_status.  Also gate calling complete on
      packet_read_pending returning zero to avoid multiple calls to complete.
      (Willem de Bruijn)
      
      	Move timeo computation within loop, to re-fetch the socket
      timeout since we also use the timeo variable to record the return code
      from the wait_for_complete call (Neil Horman)
      
      V3->V4:
      	Willem has requested that the control flow be restored to the
      previous state.  Doing so lets us eliminate the need for the
      po->wait_on_complete flag variable, and lets us get rid of the
      packet_next_frame function, but introduces another complexity.
      Specifically, but using the packet pending count, we can, if an
      applications calls sendmsg multiple times with MSG_DONTWAIT set, each
      set of transmitted frames, when complete, will cause
      tpacket_destruct_skb to issue a complete call, for which there will
      never be a wait_on_completion call.  This imbalance will lead to any
      future call to wait_for_completion here to return early, when the frames
      they sent may not have completed.  To correct this, we need to re-init
      the completion queue on every call to tpacket_snd before we enter the
      loop so as to ensure we wait properly for the frames we send in this
      iteration.
      
      	Change the timeout and interrupted gotos to out_put rather than
      out_status so that we don't try to free a non-existant skb
      	Clean up some extra newlines (Willem de Bruijn)
      Reviewed-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Reported-by: NMatteo Croce <mcroce@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89ed5b51
    • X
      sctp: change to hold sk after auth shkey is created successfully · 25bff6d5
      Xin Long 提交于
      Now in sctp_endpoint_init(), it holds the sk then creates auth
      shkey. But when the creation fails, it doesn't release the sk,
      which causes a sk defcnf leak,
      
      Here to fix it by only holding the sk when auth shkey is created
      successfully.
      
      Fixes: a29a5bd4 ("[SCTP]: Implement SCTP-AUTH initializations.")
      Reported-by: syzbot+afabda3890cc2f765041@syzkaller.appspotmail.com
      Reported-by: syzbot+276ca1c77a19977c0130@syzkaller.appspotmail.com
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      25bff6d5
    • D
      Merge branch 'ipv6-fix-neighbour-resolution-with-raw-socket' · 13696531
      David S. Miller 提交于
      Nicolas Dichtel says:
      
      ====================
      ipv6: fix neighbour resolution with raw socket
      
      The first patch prepares the fix, it constify rt6_nexthop().
      The detail of the bug is explained in the second patch.
      
      v1 -> v2:
       - fix compilation warnings
       - split the initial patch
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      13696531
    • N
      ipv6: fix neighbour resolution with raw socket · 2c6b55f4
      Nicolas Dichtel 提交于
      The scenario is the following: the user uses a raw socket to send an ipv6
      packet, destinated to a not-connected network, and specify a connected nh.
      Here is the corresponding python script to reproduce this scenario:
      
       import socket
       IPPROTO_RAW = 255
       send_s = socket.socket(socket.AF_INET6, socket.SOCK_RAW, IPPROTO_RAW)
       # scapy
       # p = IPv6(src='fd00:100::1', dst='fd00:200::fa')/ICMPv6EchoRequest()
       # str(p)
       req = b'`\x00\x00\x00\x00\x08:@\xfd\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xfd\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfa\x80\x00\x81\xc0\x00\x00\x00\x00'
       send_s.sendto(req, ('fd00:175::2', 0, 0, 0))
      
      fd00:175::/64 is a connected route and fd00:200::fa is not a connected
      host.
      
      With this scenario, the kernel starts by sending a NS to resolve
      fd00:175::2. When it receives the NA, it flushes its queue and try to send
      the initial packet. But instead of sending it, it sends another NS to
      resolve fd00:200::fa, which obvioulsy fails, thus the packet is dropped. If
      the user sends again the packet, it now uses the right nh (fd00:175::2).
      
      The problem is that ip6_dst_lookup_neigh() uses the rt6i_gateway, which is
      :: because the associated route is a connected route, thus it uses the dst
      addr of the packet. Let's use rt6_nexthop() to choose the right nh.
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c6b55f4
    • N
      ipv6: constify rt6_nexthop() · 9b1c1ef1
      Nicolas Dichtel 提交于
      There is no functional change in this patch, it only prepares the next one.
      
      rt6_nexthop() will be used by ip6_dst_lookup_neigh(), which uses const
      variables.
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Acked-by: NNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b1c1ef1
    • M
      net: dsa: microchip: Use gpiod_set_value_cansleep() · 22e72b5e
      Marek Vasut 提交于
      Replace gpiod_set_value() with gpiod_set_value_cansleep(), as the switch
      reset GPIO can be connected to e.g. I2C GPIO expander and it is perfectly
      fine for the kernel to sleep for a bit in ksz_switch_register().
      Signed-off-by: NMarek Vasut <marex@denx.de>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Tristram Ha <Tristram.Ha@microchip.com>
      Cc: Woojung Huh <Woojung.Huh@microchip.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      22e72b5e