1. 14 Sep, 2018 1 commit
  2. 13 Sep, 2018 15 commits
    • J
      xen/netfront: don't bug in case of too many frags · ad4f15dc
      Juergen Gross committed
      Commit 57f230ab ("xen/netfront: raise max number of slots in
      xennet_get_responses()") raised the max number of allowed slots by one.
      This seems to be problematic in some configurations with netback using
      a larger MAX_SKB_FRAGS value (e.g. old Linux kernel with MAX_SKB_FRAGS
      defined as 18 instead of nowadays 17).
      
      Instead of BUG_ON() in this case just fall back to retransmission.
      
      Fixes: 57f230ab ("xen/netfront: raise max number of slots in xennet_get_responses()")
      Cc: stable@vger.kernel.org
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ad4f15dc
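The fix above swaps a fatal assertion for a recoverable error path. Below is a minimal userspace sketch of that pattern, not the netfront code itself: `MAX_FRAGS`, `struct rx_packet`, and `add_frag()` are hypothetical names chosen for illustration, and the limit of 17 is taken from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical limit standing in for the frontend's MAX_SKB_FRAGS. */
#define MAX_FRAGS 17

struct rx_packet {
	size_t nr_frags;  /* fragments attached so far */
	int dropped;      /* set once the packet has been discarded */
};

/*
 * Attach one more fragment. If the peer sent more fragments than we can
 * hold, drop the packet and report an error instead of aborting; the
 * sender is then expected to retransmit.
 */
static int add_frag(struct rx_packet *pkt)
{
	assert(pkt != NULL);

	if (pkt->nr_frags >= MAX_FRAGS) {
		pkt->dropped = 1;
		return -E2BIG;
	}
	pkt->nr_frags++;
	return 0;
}
```

A caller would treat a negative return as "packet lost" and carry on, which is the behavior the commit message describes as falling back to retransmission.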
    • X
      ipv6: use rt6_info members when dst is set in rt6_fill_node · 22d0bd82
      Xin Long committed
      In inet6_rtm_getroute, since Commit 93531c67 ("net/ipv6: separate
      handling of FIB entries from dst based routes"), it has used rt->from
      to dump route info instead of rt.
      
      However, for some routes, such as cached ones, information like flags
      or gateway differs from that of the 'from' route. This caused 'ip
      route get' to dump the wrong route information.
      
      In Jianlin's testing, the output information even lost the expiration
      time for a pmtu route cache due to the wrong fib6_flags.
      
      So change to use rt6_info members for dst addr, src addr, flags and
      gateway when it tries to dump a route entry without fibmatch set.
      
      v1->v2:
        - not use rt6i_prefsrc.
        - also fix the gw dump issue.
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      22d0bd82
    • L
      Merge tag 'drm-fixes-2018-09-12' of git://anongit.freedesktop.org/drm/drm · 7428b2e5
      Linus Torvalds committed
      Pull drm nouveau fixes from Dave Airlie:
       "I'm sending this separately as it's a bit larger than I generally like
        for one driver, but it does contain a bunch of make my nvidia laptop
        not die (runpm) and a bunch to make my docking station and monitor
        display stuff (mst) fixes.
      
        Lyude has spent a lot of time on these, and we are putting the fixes
        into distro kernels as well asap, as it helps a bunch of standard
        Lenovo laptops, so I'm fairly happy things are better than they were
        before these patches, but I decided to split them out just for
        clarification"
      
      * tag 'drm-fixes-2018-09-12' of git://anongit.freedesktop.org/drm/drm:
        drm/nouveau/disp/gm200-: enforce identity-mapped SOR assignment for LVDS/eDP panels
        drm/nouveau/disp: fix DP disable race
        drm/nouveau/disp: move eDP panel power handling
        drm/nouveau/disp: remove unused struct member
        drm/nouveau/TBDdevinit: don't fail when PMU/PRE_OS is missing from VBIOS
        drm/nouveau/mmu: don't attempt to dereference vmm without valid instance pointer
        drm/nouveau: fix oops in client init failure path
        drm/nouveau: Fix nouveau_connector_ddc_detect()
        drm/nouveau/drm/nouveau: Don't forget to cancel hpd_work on suspend/unload
        drm/nouveau/drm/nouveau: Prevent handling ACPI HPD events too early
        drm/nouveau: Reset MST branching unit before enabling
        drm/nouveau: Only write DP_MSTM_CTRL when needed
        drm/nouveau: Remove useless poll_enable() call in drm_load()
        drm/nouveau: Remove useless poll_disable() call in switcheroo_set_state()
        drm/nouveau: Remove useless poll_enable() call in switcheroo_set_state()
        drm/nouveau: Fix deadlocks in nouveau_connector_detect()
        drm/nouveau/drm/nouveau: Use pm_runtime_get_noresume() in connector_detect()
        drm/nouveau/drm/nouveau: Fix deadlock with fb_helper with async RPM requests
        drm/nouveau: Remove duplicate poll_enable() in pmops_runtime_suspend()
        drm/nouveau/drm/nouveau: Fix bogus drm_kms_helper_poll_enable() placement
      7428b2e5
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 67b07609
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix up several Kconfig dependencies in netfilter, from Martin Willi
          and Florian Westphal.
      
       2) Memory leak in be2net driver, from Petr Oros.
      
       3) Memory leak in E-Switch handling of mlx5 driver, from Raed Salem.
      
       4) mlx5_attach_interface needs to check for errors, from Huy Nguyen.
      
       5) tipc_release() needs to orphan the sock, from Cong Wang.
      
       6) Need to program TxConfig register after TX/RX is enabled in r8169
          driver, not beforehand, from Maciej S. Szmigiero.
      
       7) Handle 64K PAGE_SIZE properly in ena driver, from Netanel Belgazal.
      
       8) Fix crash regression in ip_do_fragment(), from Taehee Yoo.
      
       9) syzbot can create conditions where kernel log is flooded with
          synflood warnings due to creation of many listening sockets, fix
          that. From Willem de Bruijn.
      
      10) Fix RCU issues in rds socket layer, from Cong Wang.
      
      11) Fix vlan matching in nfp driver, from Pieter Jansen van Vuuren.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits)
        nfp: flower: reject tunnel encap with ipv6 outer headers for offloading
        nfp: flower: fix vlan match by checking both vlan id and vlan pcp
        tipc: check return value of __tipc_dump_start()
        s390/qeth: don't dump past end of unknown HW header
        s390/qeth: use vzalloc for QUERY OAT buffer
        s390/qeth: switch on SG by default for IQD devices
        s390/qeth: indicate error when netdev allocation fails
        rds: fix two RCU related problems
        r8169: Clear RTL_FLAG_TASK_*_PENDING when clearing RTL_FLAG_TASK_ENABLED
        erspan: fix error handling for erspan tunnel
        erspan: return PACKET_REJECT when the appropriate tunnel is not found
        tcp: rate limit synflood warnings further
        MIPS: lantiq: dma: add dev pointer
        netfilter: xt_hashlimit: use s->file instead of s->private
        netfilter: nfnetlink_queue: Solve the NFQUEUE/conntrack clash for NF_REPEAT
        netfilter: cttimeout: ctnl_timeout_find_get() returns incorrect pointer to type
        netfilter: conntrack: timeout interface depend on CONFIG_NF_CONNTRACK_TIMEOUT
        netfilter: conntrack: reset tcp maxwin on re-register
        qmi_wwan: Support dynamic config on Quectel EP06
        ethernet: renesas: convert to SPDX identifiers
        ...
      67b07609
    • D
      Merge branch 'nfp-flower-fixes' · 4851bfd6
      David S. Miller committed
      Jakub Kicinski says:
      
      ====================
      nfp: flower: fixes for flower offload
      
      Two fixes for flower matching and tunnel encap.  Pieter fixes
      VLAN matching if the entire VLAN id is masked out and match
      is only performed on the PCP field.  Louis adds validation of
      tunnel flags for encap, most importantly we should not offload
      actions on IPv6 tunnels if it's not supported.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4851bfd6
    • L
      nfp: flower: reject tunnel encap with ipv6 outer headers for offloading · 224de549
      Louis Peens committed
      This fixes a bug where ipv6 tunnels would report that it is
      getting offloaded to hardware but would actually be rejected
      by hardware.
      
      Fixes: b27d6a95 ("nfp: compile flower vxlan tunnel set actions")
      Signed-off-by: Louis Peens <louis.peens@netronome.com>
      Reviewed-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      224de549
    • P
      nfp: flower: fix vlan match by checking both vlan id and vlan pcp · db191db8
      Pieter Jansen van Vuuren committed
      Previously we only checked if the vlan id field is present when trying
      to match a vlan tag. The vlan id and vlan pcp field should be treated
      independently.
      
      Fixes: 5571e8c9 ("nfp: extend flower matching capabilities")
      Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      db191db8
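The difference between checking only the VLAN id and checking both fields can be sketched with the 802.1Q Tag Control Information layout (3-bit PCP, 1-bit DEI, 12-bit VID). This is an illustration only; the macro and function names are assumptions and do not come from the nfp driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 802.1Q TCI layout: PCP in bits 15..13, DEI in bit 12, VID in bits 11..0. */
#define TCI_VID_MASK 0x0fffu
#define TCI_PCP_MASK 0xe000u

/* Old behavior: a VLAN match is recognised only if VID bits are masked in. */
static bool vlan_match_vid_only(uint16_t tci_mask)
{
	return (tci_mask & TCI_VID_MASK) != 0;
}

/* Fixed behavior: either field on its own is enough to require a VLAN match. */
static bool vlan_match_vid_or_pcp(uint16_t tci_mask)
{
	return (tci_mask & (TCI_VID_MASK | TCI_PCP_MASK)) != 0;
}
```

With a mask of `0xe000` (match on PCP only, VLAN id fully masked out), the first check misses the VLAN match while the second one catches it.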
    • C
      tipc: check return value of __tipc_dump_start() · 12a78b02
      Cong Wang committed
      When __tipc_dump_start() fails with running out of memory,
      we have no reason to continue, especially we should avoid
      calling tipc_dump_done().
      
      Fixes: 8f5c5fcf ("tipc: call start and done ops directly in __tipc_nl_compat_dumpit()")
      Reported-and-tested-by: syzbot+3f8324abccfbf8c74a9f@syzkaller.appspotmail.com
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      12a78b02
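The control flow this commit asks for, stop when the start step fails and do not run the matching done step, can be sketched in userspace as follows. `struct dump_ctx`, `dump_start()`, `dump_done()`, and `run_dump()` are hypothetical stand-ins, not the TIPC functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct dump_ctx {
	void *iter;      /* state allocated by dump_start() */
	int done_calls;  /* how many times dump_done() ran */
};

/* Hypothetical start op: allocates iterator state and may fail. */
static int dump_start(struct dump_ctx *ctx, int simulate_oom)
{
	ctx->iter = simulate_oom ? NULL : malloc(16);
	return ctx->iter ? 0 : -ENOMEM;
}

/* Hypothetical done op: releases what dump_start() set up. */
static void dump_done(struct dump_ctx *ctx)
{
	free(ctx->iter);
	ctx->iter = NULL;
	ctx->done_calls++;
}

static int run_dump(struct dump_ctx *ctx, int simulate_oom)
{
	int err;

	assert(ctx != NULL);

	err = dump_start(ctx, simulate_oom);
	if (err)
		return err;  /* nothing was set up, so skip dump_done() */

	/* ... the dump itself would run here ... */
	dump_done(ctx);
	return 0;
}
```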
    • D
      Merge branch 'qeth-fixes' · 6b4d24de
      David S. Miller committed
      Julian Wiedmann says:
      
      ====================
      s390/qeth: fixes 2018-09-12
      
      please apply the following qeth fixes for -net.
      
      Patch 1 resolves a regression in an error path, while patch 2 enables
      the SG support by default that was newly introduced with 4.19.
      Patch 3 takes care of a longstanding problem with large-order
      allocations, and patch 4 fixes a potential out-of-bounds access.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b4d24de
    • J
      s390/qeth: don't dump past end of unknown HW header · 0ac1487c
      Julian Wiedmann committed
      For inbound data with an unsupported HW header format, only dump the
      actual HW header. We have no idea how much payload follows it, and what
      it contains. Worst case, we dump past the end of the Inbound Buffer and
      access whatever is located next in memory.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ac1487c
    • W
      s390/qeth: use vzalloc for QUERY OAT buffer · aec45e85
      Wenjia Zhang committed
      qeth_query_oat_command() currently allocates the kernel buffer for
      the SIOC_QETH_QUERY_OAT ioctl with kzalloc. So on systems with
      fragmented memory, large allocations may fail (eg. the qethqoat tool by
      default uses 132KB).
      
      Solve this issue by using vzalloc, backing the allocation with
      non-contiguous memory.
      Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
      Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aec45e85
    • J
      s390/qeth: switch on SG by default for IQD devices · 04db741d
      Julian Wiedmann committed
      Scatter-gather transmit brings a nice performance boost. Considering the
      rather large MTU sizes at play, it's also totally the Right Thing To Do.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      04db741d
    • J
      s390/qeth: indicate error when netdev allocation fails · 778b1ac7
      Julian Wiedmann committed
      Bailing out on allocation error is nice, but we also need to tell the
      ccwgroup core that creating the qeth groupdev failed.
      
      Fixes: d3d1b205 ("s390/qeth: allocate netdevice early")
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      778b1ac7
    • L
      Merge tag 'riscv-for-linus-4.19-rc3' of... · 96eddb81
      Linus Torvalds committed
      Merge tag 'riscv-for-linus-4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
      
      Pull RISC-V fix from Palmer Dabbelt:
       "This contains what I hope to be the last RISC-V patch for 4.19.
      
        It fixes a bug in our initramfs support by removing some broken and
        obsolete code"
      
      * tag 'riscv-for-linus-4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
        riscv: Do not overwrite initrd_start and initrd_end
      96eddb81
    • L
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · aeb54272
      Linus Torvalds committed
      Pull SCSI fixes from James Bottomley:
       "Three fixes, all in drivers (qedi and iscsi target) so no wider impact
        even if the code changes are a bit extensive"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qedi: Add the CRC size within iSCSI NVM image
        scsi: iscsi: target: Fix conn_ops double free
        scsi: iscsi: target: Set conn->sess to NULL when iscsi_login_set_conn_values fails
      aeb54272
  3. 12 Sep, 2018 9 commits
    • C
      rds: fix two RCU related problems · cc4dfb7f
      Cong Wang committed
      When a rds sock is bound, it is inserted into the bind_hash_table
      which is protected by RCU. But when releasing rds sock, after it
      is removed from this hash table, it is freed immediately without
      respecting RCU grace period. This could cause some use-after-free
      as reported by syzbot.
      
      Mark the rds sock with SOCK_RCU_FREE before inserting it into the
      bind_hash_table, so that it would be always freed after a RCU grace
      period.
      
      The other problem is in rds_find_bound(), the rds sock could be
      freed in between rhashtable_lookup_fast() and rds_sock_addref(),
      so we need to extend RCU read lock protection in rds_find_bound()
      to close this race condition.
      
      Reported-and-tested-by: syzbot+8967084bcac563795dc6@syzkaller.appspotmail.com
      Reported-by: syzbot+93a5839deb355537440f@syzkaller.appspotmail.com
      Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Cc: rds-devel@oss.oracle.com
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cc4dfb7f
    • K
      r8169: Clear RTL_FLAG_TASK_*_PENDING when clearing RTL_FLAG_TASK_ENABLED · 6ad56901
      Kai-Heng Feng committed
      After system suspend, sometimes the r8169 doesn't work when the ethernet
      cable gets plugged in.
      
      This issue happens because rtl_reset_work() doesn't get called from
      rtl8169_runtime_resume(), after system suspend.
      
      In rtl_task(), RTL_FLAG_TASK_* only gets cleared if this condition is
      met:
      if (!netif_running(dev) ||
          !test_bit(RTL_FLAG_TASK_ENABLED, tp->wk.flags))
          ...
      
      If RTL_FLAG_TASK_ENABLED was cleared during system suspend while
      RTL_FLAG_TASK_RESET_PENDING was set, the next rtl_schedule_task() won't
      schedule task as the flag is still there.
      
      So in addition to clearing RTL_FLAG_TASK_ENABLED, also clear the other
      flags.
      
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ad56901
    • H
      erspan: fix error handling for erspan tunnel · 51dc63e3
      Haishuang Yan committed
      When processing icmp unreachable message for erspan tunnel, tunnel id
      should be erspan_net_id instead of ipgre_net_id.
      
      Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN")
      Cc: William Tu <u9012063@gmail.com>
      Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      51dc63e3
    • H
      erspan: return PACKET_REJECT when the appropriate tunnel is not found · 5a64506b
      Haishuang Yan committed
      If the erspan tunnel hasn't been established, we'd better send an icmp port
      unreachable message after receiving erspan packets.
      
      Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN")
      Cc: William Tu <u9012063@gmail.com>
      Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5a64506b
    • W
      tcp: rate limit synflood warnings further · 0297c1c2
      Willem de Bruijn committed
      Convert pr_info to net_info_ratelimited to limit the total number of
      synflood warnings.
      
      Commit 946cedcc ("tcp: Change possible SYN flooding messages")
      rate limits synflood warnings to one per listener.
      
      Workloads that open many listener sockets can still see a high rate of
      log messages. Syzkaller is one frequent example.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0297c1c2
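A per-listener "warn once" rule still produces one message per socket, so the total grows with the number of listeners. A single shared limiter caps the total instead. The following is a simplified window-based limiter written for illustration; it is not the kernel's implementation behind net_info_ratelimited, and `struct ratelimit` and `ratelimit_allow()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified limiter: at most `burst` messages per `interval` time units. */
struct ratelimit {
	long interval;      /* window length, in arbitrary time units */
	int burst;          /* messages allowed per window */
	long window_start;  /* start of the current window */
	int printed;        /* messages emitted in the current window */
};

/* Returns true if a message may be logged at time `now`. */
static bool ratelimit_allow(struct ratelimit *rl, long now)
{
	assert(rl->interval > 0 && rl->burst > 0);

	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;  /* open a new window */
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	return false;
}
```

Because every caller shares one `struct ratelimit`, opening thousands of listeners cannot raise the message rate above `burst` per `interval`.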
    • H
      MIPS: lantiq: dma: add dev pointer · 2d946e5b
      Hauke Mehrtens committed
      dma_zalloc_coherent() now crashes if no dev pointer is given.
      Add a dev pointer to the ltq_dma_channel structure and fill it in the
      driver using it.
      
      This fixes a bug introduced in kernel 4.19.
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2d946e5b
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 4ecdf770
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for your net tree:
      
      1) Remove duplicated include at the end of UDP conntrack, from Yue Haibing.
      
      2) Restore conntrack dependency on xt_cluster, from Martin Willi.
      
      3) Fix splat with GSO skbs from the checksum target, from Florian Westphal.
      
      4) Rework ct timeout support, the template strategy to attach custom timeouts
         is not correct since it will not work in conjunction with conntrack zones
         and we have a possible use-after-free when removing the rule due to missing
         refcounting. To fix these problems, do not use conntrack template at all
         and set custom timeout on the already valid conntrack object. This
         fix comes with a preparation patch to simplify timeout adjustment by
         initializing the first position of the timeout array for all of the
         existing trackers. Patchset from Florian Westphal.
      
      5) Fix missing dependency from IPv4 chain NAT type, from Florian.
      
      6) Release chain reference counter from the flush path, from Taehee Yoo.
      
      7) After flushing an iptables ruleset, conntrack hooks are unregistered
         and entries are left stale to be cleaned up by the timeout garbage
         collector. No TCP tracking is done on established flows by this time.
         If ruleset is reloaded, then hooks are registered again and TCP
         tracking is restored, which considers packets to be invalid. Clear
         window tracking to exercise TCP flow pickup from the middle given that
         history is lost for us. Again from Florian.
      
      8) Fix crash from netlink interface with CONFIG_NF_CONNTRACK_TIMEOUT=y
         and CONFIG_NF_CT_NETLINK_TIMEOUT=n.
      
      9) Broken CT target due to returning incorrect type from
         ctnl_timeout_find_get().
      
      10) Solve conntrack clash on NF_REPEAT verdicts too, from Michal Vaner.
      
      11) Missing conversion of hashlimit sysctl interface to new API, from
          Cong Wang.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ecdf770
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 5e335542
      Linus Torvalds committed
      Pull HID fixes from Jiri Kosina:
      
       - functional regression fix for sensor-hub driver from Hans de Goede
      
       - stop doing device reset for i2c-hid devices, which unbreaks some of
         them (and is in line with the specification), from Kai-Heng Feng
      
       - error handling fix for hid-core from Gustavo A. R. Silva
      
       - functional regression fix for some Elan panels from Benjamin
         Tissoires
      
       - a few new device ID additions and misc small fixes
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: i2c-hid: Don't reset device upon system resume
        HID: sensor-hub: Restore fixup for Lenovo ThinkPad Helix 2 sensor hub report
        HID: core: fix NULL pointer dereference
        HID: core: fix grouping by application
        HID: multitouch: fix Elan panels with 2 input modes declaration
        HID: hid-saitek: Add device ID for RAT 7 Contagion
        HID: core: fix memory leak on probe
        HID: input: fix leaking custom input node name
        HID: add support for Apple Magic Keyboards
        HID: i2c-hid: Fix flooded incomplete report after S3 on Rayd touchscreen
        HID: intel-ish-hid: Enable Sunrise Point-H ish driver
      5e335542
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 28a0ea77
      Linus Torvalds committed
      Pull rdma fixes from Jason Gunthorpe:
       "This fixes one major regression with NFS and mlx4 due to the max_sg
        rework in this merge window, tidies a few minor error_path
        regressions, and various small fixes.
      
        The HFI1 driver is broken this cycle due to a regression caused by a
        PCI change, it is looking like Bjorn will merge a fix for this. Also,
        the lingering ipoib issue I mentioned earlier still remains unfixed.
      
        Summary:
      
         - Fix possible FD type confusion crash
      
         - Fix a user trigger-able crash in cxgb4
      
         - Fix bad handling of IOMMU resources causing user controlled leaking
           in bnxt
      
         - Add missing locking in ipoib to fix a rare 'stuck tx' situation
      
         - Add missing locking in cma
      
         - Add two missing uverbs cleanups on failure paths,
           regressions from this merge window
      
         - Fix a regression from this merge window that caused RDMA NFS to not
           work with the mlx4 driver due to the max_sg changes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/mlx4: Ensure that maximal send/receive SGE less than supported by HW
        RDMA/cma: Protect cma dev list with lock
        RDMA/uverbs: Fix error cleanup path of ib_uverbs_add_one()
        bnxt_re: Fix couple of memory leaks that could lead to IOMMU call traces
        IB/ipoib: Avoid a race condition between start_xmit and cm_rep_handler
        iw_cxgb4: only allow 1 flush on user qps
        IB/core: Release object lock if destroy failed
        RDMA/ucma: check fd type in ucma_migrate_id()
      28a0ea77
  4. 11 Sep, 2018 8 commits
  5. 10 Sep, 2018 2 commits
    • L
      Linux 4.19-rc3 · 11da3a7f
      Linus Torvalds committed
      11da3a7f
    • T
      ip: frags: fix crash in ip_do_fragment() · 5d407b07
      Taehee Yoo committed
      A kernel crash occurs when a defragmented packet is fragmented
      in ip_do_fragment().
      In the defragment routine, skb_orphan() is called and
      skb->ip_defrag_offset is set, but skb->sk and
      skb->ip_defrag_offset are members of the same union, so
      frag->sk is not NULL.
      Hence the crash occurs in the skb->sk check in ip_do_fragment() when
      a defragmented packet is fragmented.
      
      test commands:
         %iptables -t nat -I POSTROUTING -j MASQUERADE
         %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000
      
      splat looks like:
      [  261.069429] kernel BUG at net/ipv4/ip_output.c:636!
      [  261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [  261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
      [  261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
      [  261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
      [  261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
      [  261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
      [  261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
      [  261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
      [  261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
      [  261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
      [  261.174169] FS:  00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
      [  261.183012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
      [  261.198158] Call Trace:
      [  261.199018]  ? dst_output+0x180/0x180
      [  261.205011]  ? save_trace+0x300/0x300
      [  261.209018]  ? ip_copy_metadata+0xb00/0xb00
      [  261.213034]  ? sched_clock_local+0xd4/0x140
      [  261.218158]  ? kill_l4proto+0x120/0x120 [nf_conntrack]
      [  261.223014]  ? rt_cpu_seq_stop+0x10/0x10
      [  261.227014]  ? find_held_lock+0x39/0x1c0
      [  261.233008]  ip_finish_output+0x51d/0xb50
      [  261.237006]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.243011]  ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
      [  261.250152]  ? rcu_is_watching+0x77/0x120
      [  261.255010]  ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
      [  261.261033]  ? nf_hook_slow+0xb1/0x160
      [  261.265007]  ip_output+0x1c7/0x710
      [  261.269005]  ? ip_mc_output+0x13f0/0x13f0
      [  261.273002]  ? __local_bh_enable_ip+0xe9/0x1b0
      [  261.278152]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.282996]  ? nf_hook_slow+0xb1/0x160
      [  261.287007]  raw_sendmsg+0x21f9/0x4420
      [  261.291008]  ? dst_output+0x180/0x180
      [  261.297003]  ? sched_clock_cpu+0x126/0x170
      [  261.301003]  ? find_held_lock+0x39/0x1c0
      [  261.306155]  ? stop_critical_timings+0x420/0x420
      [  261.311004]  ? check_flags.part.36+0x450/0x450
      [  261.315005]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.320995]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.326142]  ? cyc2ns_read_end+0x10/0x10
      [  261.330139]  ? raw_bind+0x280/0x280
      [  261.334138]  ? sched_clock_cpu+0x126/0x170
      [  261.338995]  ? check_flags.part.36+0x450/0x450
      [  261.342991]  ? __lock_acquire+0x4500/0x4500
      [  261.348994]  ? inet_sendmsg+0x11c/0x500
      [  261.352989]  ? dst_output+0x180/0x180
      [  261.357012]  inet_sendmsg+0x11c/0x500
      [ ... ]
      
      v2:
       - clear skb->sk at reassembly routine. (Eric Dumazet)
      
      Fixes: fa0f5273 ("ip: use rb trees for IP frag queue.")
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d407b07
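The root cause, two fields sharing storage in a union, can be shown in a few lines of userspace C. `struct mini_skb` and the helpers below are a hypothetical miniature, not the kernel's sk_buff. The sketch also assumes a platform where a null pointer is all-zero bits, which holds on mainstream systems.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of the relevant part of a socket buffer. */
struct mini_skb {
	union {
		void *sk;              /* owning socket, or NULL */
		int ip_defrag_offset;  /* reuses the same storage during reassembly */
	};
};

static bool has_owner(const struct mini_skb *skb)
{
	return skb->sk != NULL;
}

/* Reassembly stores an offset in the shared storage. */
static void defrag_store_offset(struct mini_skb *skb, int offset)
{
	assert(skb != NULL);
	skb->ip_defrag_offset = offset;
}

/* The approach from the v2 note: clear sk once reassembly is finished. */
static void reassembly_finish(struct mini_skb *skb)
{
	assert(skb != NULL);
	skb->sk = NULL;
}
```

After `defrag_store_offset()` writes a non-zero offset, `has_owner()` reports true even though no socket was ever attached, which is the condition that trips the check in ip_do_fragment(). Calling `reassembly_finish()` restores the expected NULL.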
  6. 09 Sep, 2018 5 commits