1. 30 7月, 2018 2 次提交
  2. 25 7月, 2018 2 次提交
  3. 24 7月, 2018 1 次提交
  4. 23 7月, 2018 1 次提交
    • R
      rtnetlink: add rtnl_link_state check in rtnl_configure_link · 5025f7f7
      Roopa Prabhu 提交于
      rtnl_configure_link sets dev->rtnl_link_state to
      RTNL_LINK_INITIALIZED and unconditionally calls
      __dev_notify_flags to notify user-space of dev flags.
      
      current call sequence for rtnl_configure_link
      rtnetlink_newlink
          rtnl_link_ops->newlink
          rtnl_configure_link (unconditionally notifies userspace of
                               default and new dev flags)
      
      If a newlink handler wants to call rtnl_configure_link
      early, we will end up with duplicate notifications to
      user-space.
      
      This patch fixes rtnl_configure_link to check rtnl_link_state
      and call __dev_notify_flags with gchanges = 0 if already
      RTNL_LINK_INITIALIZED.
      
      Later in the series, this patch will help the following sequence
      where a driver implementing newlink can call rtnl_configure_link
      to initialize the link early.
      
      makes the following call sequence work:
      rtnetlink_newlink
          rtnl_link_ops->newlink (vxlan) -> rtnl_configure_link (initializes
                                                      link and notifies
                                                      user-space of default
                                                      dev flags)
          rtnl_configure_link (updates dev flags if requested by user ifm
                               and notifies user-space of new dev flags)
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5025f7f7
  5. 22 7月, 2018 1 次提交
    • E
      net: skb_segment() should not return NULL · ff907a11
      Eric Dumazet 提交于
      syzbot caught a NULL deref [1], caused by skb_segment()
      
      skb_segment() has many "goto err;" that assume the @err variable
      contains -ENOMEM.
      
      A successful call to __skb_linearize() should not clear @err,
      otherwise a subsequent memory allocation error could return NULL.
      
      While we are at it, we might use -EINVAL instead of -ENOMEM when
      MAX_SKB_FRAGS limit is reached.
      
      [1]
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      CPU: 0 PID: 13285 Comm: syz-executor3 Not tainted 4.18.0-rc4+ #146
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:tcp_gso_segment+0x3dc/0x1780 net/ipv4/tcp_offload.c:106
      Code: f0 ff ff 0f 87 1c fd ff ff e8 00 88 0b fb 48 8b 75 d0 48 b9 00 00 00 00 00 fc ff df 48 8d be 90 00 00 00 48 89 f8 48 c1 e8 03 <0f> b6 14 08 48 8d 86 94 00 00 00 48 89 c6 83 e0 07 48 c1 ee 03 0f
      RSP: 0018:ffff88019b7fd060 EFLAGS: 00010206
      RAX: 0000000000000012 RBX: 0000000000000020 RCX: dffffc0000000000
      RDX: 0000000000040000 RSI: 0000000000000000 RDI: 0000000000000090
      RBP: ffff88019b7fd0f0 R08: ffff88019510e0c0 R09: ffffed003b5c46d6
      R10: ffffed003b5c46d6 R11: ffff8801dae236b3 R12: 0000000000000001
      R13: ffff8801d6c581f4 R14: 0000000000000000 R15: ffff8801d6c58128
      FS:  00007fcae64d6700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004e8664 CR3: 00000001b669b000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       tcp4_gso_segment+0x1c3/0x440 net/ipv4/tcp_offload.c:54
       inet_gso_segment+0x64e/0x12d0 net/ipv4/af_inet.c:1342
       inet_gso_segment+0x64e/0x12d0 net/ipv4/af_inet.c:1342
       skb_mac_gso_segment+0x3b5/0x740 net/core/dev.c:2792
       __skb_gso_segment+0x3c3/0x880 net/core/dev.c:2865
       skb_gso_segment include/linux/netdevice.h:4099 [inline]
       validate_xmit_skb+0x640/0xf30 net/core/dev.c:3104
       __dev_queue_xmit+0xc14/0x3910 net/core/dev.c:3561
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3602
       neigh_hh_output include/net/neighbour.h:473 [inline]
       neigh_output include/net/neighbour.h:481 [inline]
       ip_finish_output2+0x1063/0x1860 net/ipv4/ip_output.c:229
       ip_finish_output+0x841/0xfa0 net/ipv4/ip_output.c:317
       NF_HOOK_COND include/linux/netfilter.h:276 [inline]
       ip_output+0x223/0x880 net/ipv4/ip_output.c:405
       dst_output include/net/dst.h:444 [inline]
       ip_local_out+0xc5/0x1b0 net/ipv4/ip_output.c:124
       iptunnel_xmit+0x567/0x850 net/ipv4/ip_tunnel_core.c:91
       ip_tunnel_xmit+0x1598/0x3af1 net/ipv4/ip_tunnel.c:778
       ipip_tunnel_xmit+0x264/0x2c0 net/ipv4/ipip.c:308
       __netdev_start_xmit include/linux/netdevice.h:4148 [inline]
       netdev_start_xmit include/linux/netdevice.h:4157 [inline]
       xmit_one net/core/dev.c:3034 [inline]
       dev_hard_start_xmit+0x26c/0xc30 net/core/dev.c:3050
       __dev_queue_xmit+0x29ef/0x3910 net/core/dev.c:3569
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3602
       neigh_direct_output+0x15/0x20 net/core/neighbour.c:1403
       neigh_output include/net/neighbour.h:483 [inline]
       ip_finish_output2+0xa67/0x1860 net/ipv4/ip_output.c:229
       ip_finish_output+0x841/0xfa0 net/ipv4/ip_output.c:317
       NF_HOOK_COND include/linux/netfilter.h:276 [inline]
       ip_output+0x223/0x880 net/ipv4/ip_output.c:405
       dst_output include/net/dst.h:444 [inline]
       ip_local_out+0xc5/0x1b0 net/ipv4/ip_output.c:124
       ip_queue_xmit+0x9df/0x1f80 net/ipv4/ip_output.c:504
       tcp_transmit_skb+0x1bf9/0x3f10 net/ipv4/tcp_output.c:1168
       tcp_write_xmit+0x1641/0x5c20 net/ipv4/tcp_output.c:2363
       __tcp_push_pending_frames+0xb2/0x290 net/ipv4/tcp_output.c:2536
       tcp_push+0x638/0x8c0 net/ipv4/tcp.c:735
       tcp_sendmsg_locked+0x2ec5/0x3f00 net/ipv4/tcp.c:1410
       tcp_sendmsg+0x2f/0x50 net/ipv4/tcp.c:1447
       inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
       sock_sendmsg_nosec net/socket.c:641 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:651
       __sys_sendto+0x3d7/0x670 net/socket.c:1797
       __do_sys_sendto net/socket.c:1809 [inline]
       __se_sys_sendto net/socket.c:1805 [inline]
       __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1805
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x455ab9
      Code: 1d ba fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b9 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fcae64d5c68 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007fcae64d66d4 RCX: 0000000000455ab9
      RDX: 0000000000000001 RSI: 0000000020000200 RDI: 0000000000000013
      RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000014
      R13: 00000000004c1145 R14: 00000000004d1818 R15: 0000000000000006
      Modules linked in:
      Dumping ftrace buffer:
         (ftrace buffer empty)
      
      Fixes: ddff00d4 ("net: Move skb_has_shared_frag check out of GRE code and into segmentation")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Acked-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ff907a11
  6. 21 7月, 2018 4 次提交
  7. 20 7月, 2018 2 次提交
    • O
      flow_dissector: Dissect tos and ttl from the tunnel info · 5544adb9
      Or Gerlitz 提交于
      Add dissection of the tos and ttl from the ip tunnel headers
      fields in case a match is needed on them.
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: NRoi Dayan <roid@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5544adb9
    • T
      net/page_pool: Fix inconsistent lock state warning · 4905bd9a
      Tariq Toukan 提交于
      Fix the warning below by calling the ptr_ring_consume_bh,
      which uses spin_[un]lock_bh.
      
      [  179.064300] ================================
      [  179.069073] WARNING: inconsistent lock state
      [  179.073846] 4.18.0-rc2+ #18 Not tainted
      [  179.078133] --------------------------------
      [  179.082907] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [  179.089637] swapper/21/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
      [  179.095478] 00000000963d1995 (&(&r->consumer_lock)->rlock){+.?.}, at:
      __page_pool_empty_ring+0x61/0x100
      [  179.105988] {SOFTIRQ-ON-W} state was registered at:
      [  179.111443]   _raw_spin_lock+0x35/0x50
      [  179.115634]   __page_pool_empty_ring+0x61/0x100
      [  179.120699]   page_pool_destroy+0x32/0x50
      [  179.125204]   mlx5e_free_rq+0x38/0xc0 [mlx5_core]
      [  179.130471]   mlx5e_close_channel+0x20/0x120 [mlx5_core]
      [  179.136418]   mlx5e_close_channels+0x26/0x40 [mlx5_core]
      [  179.142364]   mlx5e_close_locked+0x44/0x50 [mlx5_core]
      [  179.148509]   mlx5e_close+0x42/0x60 [mlx5_core]
      [  179.153936]   __dev_close_many+0xb1/0x120
      [  179.158749]   dev_close_many+0xa2/0x170
      [  179.163364]   rollback_registered_many+0x148/0x460
      [  179.169047]   rollback_registered+0x56/0x90
      [  179.174043]   unregister_netdevice_queue+0x7e/0x100
      [  179.179816]   unregister_netdev+0x18/0x20
      [  179.184623]   mlx5e_remove+0x2a/0x50 [mlx5_core]
      [  179.190107]   mlx5_remove_device+0xe5/0x110 [mlx5_core]
      [  179.196274]   mlx5_unregister_interface+0x39/0x90 [mlx5_core]
      [  179.203028]   cleanup+0x5/0xbfc [mlx5_core]
      [  179.208031]   __x64_sys_delete_module+0x16b/0x240
      [  179.213640]   do_syscall_64+0x5a/0x210
      [  179.218151]   entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  179.224218] irq event stamp: 334398
      [  179.228438] hardirqs last  enabled at (334398): [<ffffffffa511d8b7>]
      rcu_process_callbacks+0x1c7/0x790
      [  179.239178] hardirqs last disabled at (334397): [<ffffffffa511d872>]
      rcu_process_callbacks+0x182/0x790
      [  179.249931] softirqs last  enabled at (334386): [<ffffffffa509732e>] irq_enter+0x5e/0x70
      [  179.259306] softirqs last disabled at (334387): [<ffffffffa509741c>] irq_exit+0xdc/0xf0
      [  179.268584]
      [  179.268584] other info that might help us debug this:
      [  179.276572]  Possible unsafe locking scenario:
      [  179.276572]
      [  179.283877]        CPU0
      [  179.286954]        ----
      [  179.290033]   lock(&(&r->consumer_lock)->rlock);
      [  179.295546]   <Interrupt>
      [  179.298830]     lock(&(&r->consumer_lock)->rlock);
      [  179.304550]
      [  179.304550]  *** DEADLOCK ***
      
      Fixes: ff7d6b27 ("page_pool: refurbish version of page_pool code")
      Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4905bd9a
  8. 19 7月, 2018 3 次提交
    • J
      pktgen: convert safe uses of strncpy() to strcpy() to avoid string truncation warning · f15f084f
      Jakub Kicinski 提交于
      GCC 8 complains:
      
      net/core/pktgen.c: In function ‘pktgen_if_write’:
      net/core/pktgen.c:1419:4: warning: ‘strncpy’ output may be truncated copying between 0 and 31 bytes from a string of length 127 [-Wstringop-truncation]
          strncpy(pkt_dev->src_max, buf, len);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      net/core/pktgen.c:1399:4: warning: ‘strncpy’ output may be truncated copying between 0 and 31 bytes from a string of length 127 [-Wstringop-truncation]
          strncpy(pkt_dev->src_min, buf, len);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      net/core/pktgen.c:1290:4: warning: ‘strncpy’ output may be truncated copying between 0 and 31 bytes from a string of length 127 [-Wstringop-truncation]
          strncpy(pkt_dev->dst_max, buf, len);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      net/core/pktgen.c:1268:4: warning: ‘strncpy’ output may be truncated copying between 0 and 31 bytes from a string of length 127 [-Wstringop-truncation]
          strncpy(pkt_dev->dst_min, buf, len);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      There is no bug here, but the code is not perfect either.  It copies
      sizeof(pkt_dev->/member/) - 1 from user space into buf, and then does
      a strcmp(pkt_dev->/member/, buf) hence assuming buf will be null-terminated
      and shorter than pkt_dev->/member/ (pkt_dev->/member/ is never
      explicitly null-terminated, and strncpy() doesn't have to null-terminate
      so the assumption must be on buf).  The use of strncpy() without explicit
      null-termination looks suspicious.  Convert to use straight strcpy().
      
      strncpy() would also null-pad the output, but that's clearly unnecessary
      since the author calls memset(pkt_dev->/member/, 0, sizeof(..)); prior
      to strncpy(), anyway.
      
      While at it format the code for "dst_min", "dst_max", "src_min" and
      "src_max" in the same way by removing extra new lines in one case.
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: NJiong Wang <jiong.wang@netronome.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f15f084f
    • S
      net: Move skb decrypted field, avoid explicity copy · a48d189e
      Stefano Brivio 提交于
      Commit 784abe24 ("net: Add decrypted field to skb")
      introduced a 'decrypted' field that is explicitly copied on skb
      copy and clone.
      
      Move it between headers_start[0] and headers_end[0], so that we
      don't need to copy it explicitly as it's copied by the memcpy()
      in __copy_skb_header().
      
      While at it, drop the assignment in __skb_clone(), it was
      already redundant.
      
      This doesn't change the size of sk_buff or cacheline boundaries.
      
      The 15-bits hole before tc_index becomes a 14-bits hole, and
      will be again a 15-bits hole when this change is merged with
      commit 8b700862 ("net: Don't copy pfmemalloc flag in
      __copy_skb_header()").
      
      v2: as reported by kbuild test robot (oops, I forgot to build
          with CONFIG_TLS_DEVICE it seems), we can't use
          CHECK_SKB_FIELD() on a bit-field member. Just drop the
          check for the moment being, perhaps we could think of some
          magic to also check bit-field members one day.
      
      Fixes: 784abe24 ("net: Add decrypted field to skb")
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a48d189e
    • J
      xdp: fix uninitialized 'err' variable · 202aabe8
      Jakub Kicinski 提交于
      Smatch caught an uninitialized variable error which GCC seems
      to miss.
      
      Fixes: a25717d2 ("xdp: support simultaneous driver and hw XDP attachment")
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      202aabe8
  9. 17 7月, 2018 2 次提交
  10. 16 7月, 2018 2 次提交
  11. 14 7月, 2018 5 次提交
  12. 13 7月, 2018 10 次提交
    • A
      devlink: Add generic parameters region_snapshot · f6a69885
      Alex Vesker 提交于
      region_snapshot - When set enables capturing region snapshots
      Signed-off-by: NAlex Vesker <valex@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Reviewed-by: NMoshe Shemesh <moshe@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f6a69885
    • A
      devlink: Add support for region snapshot read command · 4e54795a
      Alex Vesker 提交于
      Add support for DEVLINK_CMD_REGION_READ_GET used for both reading
      and dumping region data. Read allows reading from a region specific
      address for given length. Dump allows reading the full region.
      If only snapshot ID is provided a snapshot dump will be done.
      If snapshot ID, Address and Length are provided a snapshot read
      will done.
      
      This is used for both snapshot access and will be used in the same
      way to access current data on the region.
      Signed-off-by: NAlex Vesker <valex@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4e54795a
    • A
      devlink: Add support for region snapshot delete command · 866319bb
      Alex Vesker 提交于
      Add support for DEVLINK_CMD_REGION_DEL used
      for deleting a snapshot from a region. The snapshot ID is required.
      Also added notification support for NEW and DEL of snapshots.
      Signed-off-by: NAlex Vesker <valex@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      866319bb
    • A
      devlink: Extend the support querying for region snapshot IDs · a006d467
      Alex Vesker 提交于
      Extend the support for DEVLINK_CMD_REGION_GET command to also
      return the IDs of the snapshot currently present on the region.
      Each reply will include a nested snapshots attribute that
      can contain multiple snapshot attributes each with an ID.
      Signed-off-by: NAlex Vesker <valex@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a006d467
    • A
      devlink: Add support for region get command · d8db7ea5
      Alex Vesker 提交于
      Add support for DEVLINK_CMD_REGION_GET command which is used for
      querying for the supported DEV/REGION values of devlink devices.
      The support is both for doit and dumpit.
      
      Reply includes:
        BUS_NAME, DEVICE_NAME, REGION_NAME, REGION_SIZE
      Signed-off-by: NAlex Vesker <valex@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d8db7ea5
    • A
      devlink: Add support for creating region snapshots · d7e52722
      Alex Vesker 提交于
      Each device address region can store multiple snapshots,
      each snapshot is identified using a different numerical ID.
      This ID is used when deleting a snapshot or showing an address
      region specific snapshot. This patch exposes a callback to add
      a new snapshot to an address region.
      The snapshot will be deleted using the destructor function
      when destroying a region or when a snapshot delete command
      from devlink user tool.
      Signed-off-by: NAlex Vesker <valex@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7e52722
    • A
      devlink: Add callback to query for snapshot id before snapshot create · ccadfa44
      Alex Vesker 提交于
      To restrict the driver with the snapshot ID selection a new callback
      is introduced for the driver to get the snapshot ID before creating
      a new snapshot. This will also allow giving the same ID for multiple
      snapshots taken of different regions on the same time.
      Signed-off-by: NAlex Vesker <valex@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ccadfa44
    • A
      devlink: Add support for creating and destroying regions · b16ebe92
      Alex Vesker 提交于
      This allows a device to register its supported address regions.
      Each address region can be accessed directly for example reading
      the snapshots taken of this address space.
      Drivers are not limited in the name selection for different regions.
      An example of a region-name can be: pci cr-space, register-space.
      Signed-off-by: NAlex Vesker <valex@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b16ebe92
    • P
      net: gro: properly remove skb from list · 68d2f84a
      Prashant Bhole 提交于
      Following crash occurs in validate_xmit_skb_list() when same skb is
      iterated multiple times in the loop and consume_skb() is called.
      
      The root cause is calling list_del_init(&skb->list) and not clearing
      skb->next in d4546c25. list_del_init(&skb->list) sets skb->next
      to point to skb itself. skb->next needs to be cleared because other
      parts of network stack uses another kind of SKB lists.
      validate_xmit_skb_list() uses such list.
      
      A similar type of bugfix was reported by Jesper Dangaard Brouer.
      https://patchwork.ozlabs.org/patch/942541/
      
      This patch clears skb->next and changes list_del_init() to list_del()
      so that list->prev will maintain the list poison.
      
      [  148.185511] ==================================================================
      [  148.187865] BUG: KASAN: use-after-free in validate_xmit_skb_list+0x4b/0xa0
      [  148.190158] Read of size 8 at addr ffff8801e52eefc0 by task swapper/1/0
      [  148.192940]
      [  148.193642] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.18.0-rc3+ #25
      [  148.195423] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
      [  148.199129] Call Trace:
      [  148.200565]  <IRQ>
      [  148.201911]  dump_stack+0xc6/0x14c
      [  148.203572]  ? dump_stack_print_info.cold.1+0x2f/0x2f
      [  148.205083]  ? kmsg_dump_rewind_nolock+0x59/0x59
      [  148.206307]  ? validate_xmit_skb+0x2c6/0x560
      [  148.207432]  ? debug_show_held_locks+0x30/0x30
      [  148.208571]  ? validate_xmit_skb_list+0x4b/0xa0
      [  148.211144]  print_address_description+0x6c/0x23c
      [  148.212601]  ? validate_xmit_skb_list+0x4b/0xa0
      [  148.213782]  kasan_report.cold.6+0x241/0x2fd
      [  148.214958]  validate_xmit_skb_list+0x4b/0xa0
      [  148.216494]  sch_direct_xmit+0x1b0/0x680
      [  148.217601]  ? dev_watchdog+0x4e0/0x4e0
      [  148.218675]  ? do_raw_spin_trylock+0x10/0x120
      [  148.219818]  ? do_raw_spin_lock+0xe0/0xe0
      [  148.221032]  __dev_queue_xmit+0x1167/0x1810
      [  148.222155]  ? sched_clock+0x5/0x10
      [...]
      
      [  148.474257] Allocated by task 0:
      [  148.475363]  kasan_kmalloc+0xbf/0xe0
      [  148.476503]  kmem_cache_alloc+0xb4/0x1b0
      [  148.477654]  __build_skb+0x91/0x250
      [  148.478677]  build_skb+0x67/0x180
      [  148.479657]  e1000_clean_rx_irq+0x542/0x8a0
      [  148.480757]  e1000_clean+0x652/0xd10
      [  148.481772]  net_rx_action+0x4ea/0xc20
      [  148.482808]  __do_softirq+0x1f9/0x574
      [  148.483831]
      [  148.484575] Freed by task 0:
      [  148.485504]  __kasan_slab_free+0x12e/0x180
      [  148.486589]  kmem_cache_free+0xb4/0x240
      [  148.487634]  kfree_skbmem+0xed/0x150
      [  148.488648]  consume_skb+0x146/0x250
      [  148.489665]  validate_xmit_skb+0x2b7/0x560
      [  148.490754]  validate_xmit_skb_list+0x70/0xa0
      [  148.491897]  sch_direct_xmit+0x1b0/0x680
      [  148.493949]  __dev_queue_xmit+0x1167/0x1810
      [  148.495103]  br_dev_queue_push_xmit+0xce/0x250
      [  148.496196]  br_forward_finish+0x276/0x280
      [  148.497234]  __br_forward+0x44f/0x520
      [  148.498260]  br_forward+0x19f/0x1b0
      [  148.499264]  br_handle_frame_finish+0x65e/0x980
      [  148.500398]  NF_HOOK.constprop.10+0x290/0x2a0
      [  148.501522]  br_handle_frame+0x417/0x640
      [  148.502582]  __netif_receive_skb_core+0xaac/0x18f0
      [  148.503753]  __netif_receive_skb_one_core+0x98/0x120
      [  148.504958]  netif_receive_skb_internal+0xe3/0x330
      [  148.506154]  napi_gro_complete+0x190/0x2a0
      [  148.507243]  dev_gro_receive+0x9f7/0x1100
      [  148.508316]  napi_gro_receive+0xcb/0x260
      [  148.509387]  e1000_clean_rx_irq+0x2fc/0x8a0
      [  148.510501]  e1000_clean+0x652/0xd10
      [  148.511523]  net_rx_action+0x4ea/0xc20
      [  148.512566]  __do_softirq+0x1f9/0x574
      [  148.513598]
      [  148.514346] The buggy address belongs to the object at ffff8801e52eefc0
      [  148.514346]  which belongs to the cache skbuff_head_cache of size 232
      [  148.517047] The buggy address is located 0 bytes inside of
      [  148.517047]  232-byte region [ffff8801e52eefc0, ffff8801e52ef0a8)
      [  148.519549] The buggy address belongs to the page:
      [  148.520726] page:ffffea000794bb00 count:1 mapcount:0 mapping:ffff880106f4dfc0 index:0xffff8801e52ee840 compound_mapcount: 0
      [  148.524325] flags: 0x17ffffc0008100(slab|head)
      [  148.525481] raw: 0017ffffc0008100 ffff880106b938d0 ffff880106b938d0 ffff880106f4dfc0
      [  148.527503] raw: ffff8801e52ee840 0000000000190011 00000001ffffffff 0000000000000000
      [  148.529547] page dumped because: kasan: bad access detected
      
      Fixes: d4546c25 ("net: Convert GRO SKB handling to list_head.")
      Signed-off-by: NPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Reported-by: NTyler Hicks <tyhicks@canonical.com>
      Tested-by: NTyler Hicks <tyhicks@canonical.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68d2f84a
    • S
      net: Don't copy pfmemalloc flag in __copy_skb_header() · 8b700862
      Stefano Brivio 提交于
      The pfmemalloc flag indicates that the skb was allocated from
      the PFMEMALLOC reserves, and the flag is currently copied on skb
      copy and clone.
      
      However, an skb copied from an skb flagged with pfmemalloc
      wasn't necessarily allocated from PFMEMALLOC reserves, and on
      the other hand an skb allocated that way might be copied from an
      skb that wasn't.
      
      So we should not copy the flag on skb copy, and rather decide
      whether to allow an skb to be associated with sockets unrelated
      to page reclaim depending only on how it was allocated.
      
      Move the pfmemalloc flag before headers_start[0] using an
      existing 1-bit hole, so that __copy_skb_header() doesn't copy
      it.
      
      When cloning, we'll now take care of this flag explicitly,
      contravening to the warning comment of __skb_clone().
      
      While at it, restore the newline usage introduced by commit
      b1937227 ("net: reorganize sk_buff for faster
      __copy_skb_header()") to visually separate bytes used in
      bitfields after headers_start[0], that was gone after commit
      a9e419dc ("netfilter: merge ctinfo into nfct pointer storage
      area"), and describe the pfmemalloc flag in the kernel-doc
      structure comment.
      
      This doesn't change the size of sk_buff or cacheline boundaries,
      but consolidates the 15 bits hole before tc_index into a 2 bytes
      hole before csum, that could now be filled more easily.
      Reported-by: NPatrick Talbert <ptalbert@redhat.com>
      Fixes: c93bdd0e ("netvm: allow skb allocation to use PFMEMALLOC reserves")
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8b700862
  13. 12 7月, 2018 1 次提交
  14. 10 7月, 2018 4 次提交