1. 08 Dec, 2017 10 commits
  2. 07 Dec, 2017 5 commits
    • drivers: net: dsa: remove duplicate includes · 30f1e595
      Pravin Shedge authored
      These duplicate includes have been found with scripts/checkincludes.pl but
      they have been removed manually to avoid removing false positives.
      Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rds: Fix NULL pointer dereference in __rds_rdma_map · f3069c6d
      Håkon Bugge authored
      This is a fix for syzkaller719569, where memory registration was
      attempted without any underlying transport being loaded.
      
      Analysis of the case reveals that it is the setsockopt() RDS_GET_MR
      (2) and RDS_GET_MR_FOR_DEST (7) that are vulnerable.
      
      Here is an example stack trace when the bug is hit:
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000c0
      IP: __rds_rdma_map+0x36/0x440 [rds]
      PGD 2f93d03067 P4D 2f93d03067 PUD 2f93d02067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: bridge stp llc tun rpcsec_gss_krb5 nfsv4
      dns_resolver nfs fscache rds binfmt_misc sb_edac intel_powerclamp
      coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
      ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd
      iTCO_wdt mei_me sg iTCO_vendor_support ipmi_si mei ipmi_devintf nfsd
      shpchp pcspkr i2c_i801 ioatdma ipmi_msghandler wmi lpc_ich mfd_core
      auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2
      mgag200 i2c_algo_bit drm_kms_helper ixgbe syscopyarea ahci sysfillrect
      sysimgblt libahci mdio fb_sys_fops ttm ptp libata sd_mod mlx4_core drm
      crc32c_intel pps_core megaraid_sas i2c_core dca dm_mirror
      dm_region_hash dm_log dm_mod
      CPU: 48 PID: 45787 Comm: repro_set2 Not tainted 4.14.2-3.el7uek.x86_64 #2
      Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
      task: ffff882f9190db00 task.stack: ffffc9002b994000
      RIP: 0010:__rds_rdma_map+0x36/0x440 [rds]
      RSP: 0018:ffffc9002b997df0 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: ffff882fa2182580 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffc9002b997e40 RDI: ffff882fa2182580
      RBP: ffffc9002b997e30 R08: 0000000000000000 R09: 0000000000000002
      R10: ffff885fb29e3838 R11: 0000000000000000 R12: ffff882fa2182580
      R13: ffff882fa2182580 R14: 0000000000000002 R15: 0000000020000ffc
      FS:  00007fbffa20b700(0000) GS:ffff882fbfb80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000000c0 CR3: 0000002f98a66006 CR4: 00000000001606e0
      Call Trace:
       rds_get_mr+0x56/0x80 [rds]
       rds_setsockopt+0x172/0x340 [rds]
       ? __fget_light+0x25/0x60
       ? __fdget+0x13/0x20
       SyS_setsockopt+0x80/0xe0
       do_syscall_64+0x67/0x1b0
       entry_SYSCALL64_slow_path+0x25/0x25
      RIP: 0033:0x7fbff9b117f9
      RSP: 002b:00007fbffa20aed8 EFLAGS: 00000293 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 00000000000c84a4 RCX: 00007fbff9b117f9
      RDX: 0000000000000002 RSI: 0000400000000114 RDI: 000000000000109b
      RBP: 00007fbffa20af10 R08: 0000000000000020 R09: 00007fbff9dd7860
      R10: 0000000020000ffc R11: 0000000000000293 R12: 0000000000000000
      R13: 00007fbffa20b9c0 R14: 00007fbffa20b700 R15: 0000000000000021
      
      Code: 41 56 41 55 49 89 fd 41 54 53 48 83 ec 18 8b 87 f0 02 00 00 48
      89 55 d0 48 89 4d c8 85 c0 0f 84 2d 03 00 00 48 8b 87 00 03 00 00 <48>
      83 b8 c0 00 00 00 00 0f 84 25 03 00 00 48 8b 06 48 8b 56 08
      
      The fix is to check the existence of an underlying transport in
      __rds_rdma_map().
      Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
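
The guard described in the rds commit above can be sketched in user-space C. This is only an illustration under stated assumptions, not the actual patch: `struct demo_sock`, `struct demo_transport`, `demo_rdma_map()` and the choice of `-ENOTCONN` as the error code are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RDS socket and its transport. */
struct demo_transport {
	int (*get_mr)(void);
};

struct demo_sock {
	struct demo_transport *transport; /* NULL until a transport is attached */
};

/* Trivial memory-registration callback used for the demonstration. */
static int demo_get_mr_ok(void)
{
	return 0;
}

/* Refuse to register memory when no transport is attached. */
static int demo_rdma_map(const struct demo_sock *sk)
{
	if (sk->transport == NULL)
		return -ENOTCONN;	/* assumed error code */
	if (sk->transport->get_mr == NULL)
		return -EOPNOTSUPP;
	return sk->transport->get_mr();
}
```

A socket with `transport == NULL` now gets an error back instead of dereferencing the missing transport.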
    • net_sched: use macvlan real dev trans_start in dev_trans_start() · 32d3e51a
      Chris Dion authored
      Macvlan devices are similar to vlans and do not update their
      own trans_start. In order for arp monitoring to work for a bond device
      when the slaves are macvlans, obtain its real device.
      Signed-off-by: Chris Dion <christopher.dion@dell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Fix logging message with spurious period after newline · cc10f871
      Joe Perches authored
      Using a period after a newline causes bad output.
      Signed-off-by: Joe Perches <joe@perches.com>
      Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: thunderx: Fix TCP/UDP checksum offload for IPv4 pkts · 134059fd
      Florian Westphal authored
      Offload IP header checksum to NIC.
      
      This fixes a previous patch which disabled checksum offloading
      for both IPv4 and IPv6 packets.  So L3 checksum offload was
      getting disabled for IPv4 pkts.  And HW is dropping these pkts
      for some reason.
      
      Without this patch, IPv4 TSO appears to be broken:
      
      Without this patch I get ~16kbyte/s, with patch close to 2mbyte/s
      when copying files via scp from test box to my home workstation.
      
      Looking at tcpdump on sender it looks like hardware drops IPv4 TSO skbs.
      This patch restores performance for me, ipv6 looks good too.
      
      Fixes: fa6d7cb5 ("net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts")
      Cc: Sunil Goutham <sgoutham@cavium.com>
      Cc: Aleksey Makarov <aleksey.makarov@auriga.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 06 Dec, 2017 21 commits
    • make sock_alloc_file() do sock_release() on failures · 8e1611e2
      Al Viro authored
      This changes calling conventions (and simplifies the hell out of
      the callers).  New rules: once struct socket has been passed
      to sock_alloc_file(), it's been consumed either by struct file
      or by sock_release() done by sock_alloc_file().  Either way
      the caller should not do sock_release() after that point.
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
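
The ownership rule stated in the sock_alloc_file() commit above (the callee consumes the socket whether it succeeds or fails) can be sketched in user-space C. Every name here (`demo_socket`, `demo_file`, `demo_alloc_file()`, `demo_open()`, the `fail` switch and the `demo_released` counter) is hypothetical and exists only to make the convention visible; it is not the kernel code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct socket and struct file. */
struct demo_socket { int id; };
struct demo_file { struct demo_socket *sock; };

static int demo_released;	/* counts releases, for illustration only */

static void demo_sock_release(struct demo_socket *sock)
{
	demo_released++;
	free(sock);
}

/*
 * Takes ownership of sock in every case: on success the returned file
 * holds it, on failure it is released here and NULL is returned.
 */
static struct demo_file *demo_alloc_file(struct demo_socket *sock, bool fail)
{
	struct demo_file *file = fail ? NULL : malloc(sizeof(*file));

	if (file == NULL) {
		demo_sock_release(sock);
		return NULL;
	}
	file->sock = sock;
	return file;
}

/* Caller: there is no release call on the error path. */
static int demo_open(bool fail)
{
	struct demo_socket *sock = malloc(sizeof(*sock));
	struct demo_file *file;

	if (sock == NULL)
		return -1;
	file = demo_alloc_file(sock, fail);
	if (file == NULL)
		return -1;	/* sock was already released by the callee */
	/* Normal teardown, done immediately to keep the demo short. */
	demo_sock_release(file->sock);
	free(file);
	return 0;
}
```

Because the callee always consumes the socket, the caller's failure exit shrinks to a plain return, which is the simplification the commit message refers to.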
    • socketpair(): allocate descriptors first · 016a266b
      Al Viro authored
      simplifies failure exits considerably...
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fix kcm_clone() · a5739435
      Al Viro authored
      1) it's fput() or sock_release(), not both
      2) don't do fd_install() until the last failure exit.
      3) not a bug per se, but... don't attach socket to struct file
         until it's set up.
      
      Take reserving descriptor into the caller, move fd_install() to the
      caller, sanitize failure exits and calling conventions.
      
      Cc: stable@vger.kernel.org # v4.6+
      Acked-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp: CVE-2017-8824: use-after-free in DCCP code · 69c64866
      Mohamed Ghannam authored
      Whenever the sock object is in DCCP_CLOSED state,
      dccp_disconnect() must free dccps_hc_tx_ccid and
      dccps_hc_rx_ccid and set them to NULL.
      Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: remove hlist_nulls_add_tail_rcu() · d7efc6c1
      Eric Dumazet authored
      Alexander Potapenko reported use of uninitialized memory [1]
      
      This happens when inserting a request socket into TCP ehash,
      in __sk_nulls_add_node_rcu(), since sk_reuseport is not initialized.
      
      Bug was added by commit d894ba18 ("soreuseport: fix ordering for
      mixed v4/v6 sockets")
      
      Note that d296ba60 ("soreuseport: Resolve merge conflict for v4/v6
      ordering fix") missed the opportunity to get rid of
      hlist_nulls_add_tail_rcu() :
      
      Both UDP sockets and TCP/DCCP listeners no longer use
      __sk_nulls_add_node_rcu() for their hash insertion.
      
      Since all other sockets have unique 4-tuple, the reuseport status
      has no special meaning, so we can always use hlist_nulls_add_head_rcu()
      for them and save a few cycles/instructions.
      
      [1]
      
      ==================================================================
      BUG: KMSAN: use of uninitialized memory in inet_ehash_insert+0xd40/0x1050
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0+ #3288
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:16
       dump_stack+0x185/0x1d0 lib/dump_stack.c:52
       kmsan_report+0x13f/0x1c0 mm/kmsan/kmsan.c:1016
       __msan_warning_32+0x69/0xb0 mm/kmsan/kmsan_instr.c:766
       __sk_nulls_add_node_rcu ./include/net/sock.h:684
       inet_ehash_insert+0xd40/0x1050 net/ipv4/inet_hashtables.c:413
       reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:754
       inet_csk_reqsk_queue_hash_add+0x1cc/0x300 net/ipv4/inet_connection_sock.c:765
       tcp_conn_request+0x31e7/0x36f0 net/ipv4/tcp_input.c:6414
       tcp_v4_conn_request+0x16d/0x220 net/ipv4/tcp_ipv4.c:1314
       tcp_rcv_state_process+0x42a/0x7210 net/ipv4/tcp_input.c:5917
       tcp_v4_do_rcv+0xa6a/0xcd0 net/ipv4/tcp_ipv4.c:1483
       tcp_v4_rcv+0x3de0/0x4ab0 net/ipv4/tcp_ipv4.c:1763
       ip_local_deliver_finish+0x6bb/0xcb0 net/ipv4/ip_input.c:216
       NF_HOOK ./include/linux/netfilter.h:248
       ip_local_deliver+0x3fa/0x480 net/ipv4/ip_input.c:257
       dst_input ./include/net/dst.h:477
       ip_rcv_finish+0x6fb/0x1540 net/ipv4/ip_input.c:397
       NF_HOOK ./include/linux/netfilter.h:248
       ip_rcv+0x10f6/0x15c0 net/ipv4/ip_input.c:488
       __netif_receive_skb_core+0x36f6/0x3f60 net/core/dev.c:4298
       __netif_receive_skb net/core/dev.c:4336
       netif_receive_skb_internal+0x63c/0x19c0 net/core/dev.c:4497
       napi_skb_finish net/core/dev.c:4858
       napi_gro_receive+0x629/0xa50 net/core/dev.c:4889
       e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4018
       e1000_clean_rx_irq+0x1492/0x1d30 drivers/net/ethernet/intel/e1000/e1000_main.c:4474
       e1000_clean+0x43aa/0x5970 drivers/net/ethernet/intel/e1000/e1000_main.c:3819
       napi_poll net/core/dev.c:5500
       net_rx_action+0x73c/0x1820 net/core/dev.c:5566
       __do_softirq+0x4b4/0x8dd kernel/softirq.c:284
       invoke_softirq kernel/softirq.c:364
       irq_exit+0x203/0x240 kernel/softirq.c:405
       exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:638
       do_IRQ+0x15e/0x1a0 arch/x86/kernel/irq.c:263
       common_interrupt+0x86/0x86
      
      Fixes: d894ba18 ("soreuseport: fix ordering for mixed v4/v6 sockets")
      Fixes: d296ba60 ("soreuseport: Resolve merge conflict for v4/v6 ordering fix")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Alexander Potapenko <glider@google.com>
      Acked-by: Craig Gallek <kraig@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rmnet-Fix-leaks-in-failure-scenarios' · a5266440
      David S. Miller authored
      Subash Abhinov Kasiviswanathan says:
      
      ====================
      net: qualcomm: rmnet: Fix leaks in failure scenarios
      
      Patch 1 fixes a leak in transmit path where a skb cannot be
      transmitted due to insufficient headroom to stamp the map header.
      Patch 2 fixes a leak in rmnet_newlink() failure because the
      rmnet endpoint was never freed
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qualcomm: rmnet: Fix leak in device creation failure · 6296928f
      Subash Abhinov Kasiviswanathan authored
      If the rmnet device creation fails in the newlink either while
      registering with the physical device or after subsequent
      operations, the rmnet endpoint information is never freed.
      
      Fixes: ceed73a2 ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
      Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qualcomm: rmnet: Fix leak on transmit failure · c20a5487
      Subash Abhinov Kasiviswanathan authored
      If a skb in transmit path does not have sufficient headroom to add
      the map header, the skb is not sent out and is never freed.
      
      Fixes: ceed73a2 ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
      Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • VSOCK: fix outdated sk_state value in hvs_release() · c9d3fe9d
      Stefan Hajnoczi authored
      Since commit 3b4477d2 ("VSOCK: use TCP
      state constants for sk_state") VSOCK has used TCP_* constants for
      sk_state.
      
      Commit b4562ca7 ("hv_sock: add locking
      in the open/close/release code paths") reintroduced the SS_DISCONNECTING
      constant.
      
      This patch replaces the old SS_DISCONNECTING with the new TCP_CLOSING
      constant.
      
      CC: Dexuan Cui <decui@microsoft.com>
      CC: Cathy Avery <cavery@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix memory leak in tipc_accept_from_sock() · a7d5f107
      Jon Maloy authored
      When the function tipc_accept_from_sock() fails to create an instance of
      struct tipc_subscriber, it fails to free the already created
      struct tipc_conn instance before it returns.
      
      We fix that with this commit.
      Reported-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix a null pointer deref on error path · 672ecbe1
      Cong Wang authored
      In tipc_topsrv_kern_subscr() when s->tipc_conn_new() fails
      we call tipc_close_conn() to clean up, but in this case
      calling conn_put() is just enough.
      
      This fixes the following crash:
      
       kasan: GPF could be caused by NULL-ptr deref or user memory access
       general protection fault: 0000 [#1] SMP KASAN
       Dumping ftrace buffer:
          (ftrace buffer empty)
       Modules linked in:
       CPU: 0 PID: 3085 Comm: syzkaller064164 Not tainted 4.15.0-rc1+ #137
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
       task: 00000000c24413a5 task.stack: 000000005e8160b5
       RIP: 0010:__lock_acquire+0xd55/0x47f0 kernel/locking/lockdep.c:3378
       RSP: 0018:ffff8801cb5474a8 EFLAGS: 00010002
       RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
       RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff85ecb400
       RBP: ffff8801cb547830 R08: 0000000000000001 R09: 0000000000000000
       R10: 0000000000000000 R11: ffffffff87489d60 R12: ffff8801cd2980c0
       R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000020
       FS:  00000000014ee880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007ffee2426e40 CR3: 00000001cb85a000 CR4: 00000000001406f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
        __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
        _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:175
        spin_lock_bh include/linux/spinlock.h:320 [inline]
        tipc_subscrb_subscrp_delete+0x8f/0x470 net/tipc/subscr.c:201
        tipc_subscrb_delete net/tipc/subscr.c:238 [inline]
        tipc_subscrb_release_cb+0x17/0x30 net/tipc/subscr.c:316
        tipc_close_conn+0x171/0x270 net/tipc/server.c:204
        tipc_topsrv_kern_subscr+0x724/0x810 net/tipc/server.c:514
        tipc_group_create+0x702/0x9c0 net/tipc/group.c:184
        tipc_sk_join net/tipc/socket.c:2747 [inline]
        tipc_setsockopt+0x249/0xc10 net/tipc/socket.c:2861
        SYSC_setsockopt net/socket.c:1851 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1830
        entry_SYSCALL_64_fastpath+0x1f/0x96
      
      Fixes: 14c04493 ("tipc: add ability to order and receive topology events in driver")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sh_eth-dma-mapping-fixes' · a6cec1f5
      David S. Miller authored
      Thomas Petazzoni says:
      
      ====================
      net: sh_eth: DMA mapping API fixes
      
      Here are two patches that fix how the sh_eth driver is using the DMA
      mapping API: a bogus struct device is used in some places, or a NULL
      struct device is used.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sh_eth: don't use NULL as "struct device" for the DMA mapping API · 573500db
      Thomas Petazzoni authored
      Using NULL as argument for the DMA mapping API is bogus, as the DMA
      mapping API may use information from the "struct device" to perform
      the DMA mapping operation. Therefore, pass the appropriate "struct
      device".
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sh_eth: use correct "struct device" when calling DMA mapping functions · 22c1aed4
      Thomas Petazzoni authored
      There are two types of "struct device": the one representing the
      physical device on its physical bus (platform, SPI, PCI, etc.), and
      the one representing the logical device in its device class (net,
      etc.).
      
      The DMA mapping API expects to receive as argument a "struct device"
      representing the physical device, as the "struct device" contains
      information about the bus that the DMA API needs.
      
      However, the sh_eth driver mistakenly uses the "struct device"
      representing the logical device (embedded in "struct net_device")
      rather than the "struct device" representing the physical device on
      its bus.
      
      This commit fixes that by adjusting all calls to the DMA mapping API.
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'RED-qdisc-fixes' · c1d69de9
      David S. Miller authored
      Nogah Frankel says:
      
      ====================
      RED qdisc fixes
      
      Add some input validation checks to RED qdisc.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: red: Avoid illegal values · 8afa10cb
      Nogah Frankel authored
      Check that the qmin and qmax values don't overflow for the given Wlog
      value.  Check that qmin <= qmax.
      
      Fixes: a7834745 ("[PKT_SCHED]: Generic RED layer")
      Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
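
The two checks named in the "Avoid illegal values" commit above can be sketched as a small validator. This is a hedged user-space illustration, not the patch itself: it assumes the thresholds are later stored scaled by 2^Wlog in a 32-bit value, and `demo_fls()` and `demo_red_params_ok()` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Position of the highest set bit, 1-based; 0 when x == 0. */
static unsigned int demo_fls(uint32_t x)
{
	unsigned int bits = 0;

	while (x != 0) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/*
 * Hypothetical validator: a threshold shifted left by wlog must still
 * fit in 32 bits (assumed storage width), and qmin must not exceed qmax.
 */
static bool demo_red_params_ok(uint32_t qmin, uint32_t qmax, unsigned int wlog)
{
	if (demo_fls(qmin) + wlog > 32)
		return false;
	if (demo_fls(qmax) + wlog > 32)
		return false;
	return qmin <= qmax;
}
```

For example, `demo_red_params_ok(10, 20, 5)` accepts the configuration, while swapping the thresholds or passing a threshold of `1u << 31` with `wlog = 4` rejects it.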
    • net_sched: red: Avoid devision by zero · 5c472203
      Nogah Frankel authored
      Do not allow delta value to be zero since it is used as a divisor.
      
      Fixes: 8af2a218 ("sch_red: Adaptative RED AQM")
      Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
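
A minimal sketch of the divisor guard from the commit above, assuming delta is the gap between the two thresholds and that a zero gap is replaced by 1 rather than rejected (the commit message does not say which). `demo_red_step()` and its parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: spread max_p over the threshold gap.
 * Assumes qmin <= qmax; a zero gap is treated as 1 so the
 * division below is always defined.
 */
static uint32_t demo_red_step(uint32_t qmin, uint32_t qmax, uint32_t max_p)
{
	uint32_t delta = qmax - qmin;

	if (delta == 0)
		delta = 1;	/* never divide by zero */
	return max_p / delta;
}
```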
    • gianfar: fix a flooded alignment reports because of padding issue. · 58117672
      Zumeng Chen authored
      According to LS1021A RM, the value of PAL can be set so that the start of the
      IP header in the receive data buffer is aligned to a 32-bit boundary. Normally,
      setting PAL = 2 provides minimal padding to ensure such alignment of the IP
      header.
      
      However every incoming packet's 8-byte time stamp will be inserted into the
      packet data buffer as padding alignment bytes when hardware time stamping is
      enabled.
      
      So we set the padding 8+2 here to avoid the flooded alignment faults:
      
      root@128:~# cat /proc/cpu/alignment
      User:           0
      System:         17539 (inet_gro_receive+0x114/0x2c0)
      Skipped:        0
      Half:           0
      Word:           0
      DWord:          0
      Multi:          17539
      User faults:    2 (fixup)
      
      The faults also show up when exception reporting is enabled:
      
      CPU: 0 PID: 161 Comm: irq/66-eth1_g0_ Not tainted 4.1.21-rt13-WR8.0.0.0_preempt-rt #16
      Hardware name: Freescale LS1021A
      [<8001b420>] (unwind_backtrace) from [<8001476c>] (show_stack+0x20/0x24)
      [<8001476c>] (show_stack) from [<807cfb48>] (dump_stack+0x94/0xac)
      [<807cfb48>] (dump_stack) from [<80025d70>] (do_alignment+0x720/0x958)
      [<80025d70>] (do_alignment) from [<80009224>] (do_DataAbort+0x40/0xbc)
      [<80009224>] (do_DataAbort) from [<80015398>] (__dabt_svc+0x38/0x60)
      Exception stack(0x86ad1cc0 to 0x86ad1d08)
      1cc0: f9b3e080 86b3d072 2d78d287 00000000 866816c0 86b3d05e 86e785d0 00000000
      1ce0: 00000011 0000000e 80840ab0 86ad1d3c 86ad1d08 86ad1d08 806d7fc0 806d806c
      1d00: 40070013 ffffffff
      [<80015398>] (__dabt_svc) from [<806d806c>] (inet_gro_receive+0x114/0x2c0)
      [<806d806c>] (inet_gro_receive) from [<80660eec>] (dev_gro_receive+0x21c/0x3c0)
      [<80660eec>] (dev_gro_receive) from [<8066133c>] (napi_gro_receive+0x44/0x17c)
      [<8066133c>] (napi_gro_receive) from [<804f0538>] (gfar_clean_rx_ring+0x39c/0x7d4)
      [<804f0538>] (gfar_clean_rx_ring) from [<804f0bf4>] (gfar_poll_rx_sq+0x58/0xe0)
      [<804f0bf4>] (gfar_poll_rx_sq) from [<80660b10>] (net_rx_action+0x27c/0x43c)
      [<80660b10>] (net_rx_action) from [<80033638>] (do_current_softirqs+0x1e0/0x3dc)
      [<80033638>] (do_current_softirqs) from [<800338c4>] (__local_bh_enable+0x90/0xa8)
      [<800338c4>] (__local_bh_enable) from [<8008025c>] (irq_forced_thread_fn+0x70/0x84)
      [<8008025c>] (irq_forced_thread_fn) from [<800805e8>] (irq_thread+0x16c/0x244)
      [<800805e8>] (irq_thread) from [<8004e490>] (kthread+0xe8/0x104)
      [<8004e490>] (kthread) from [<8000fda8>] (ret_from_fork+0x14/0x2c)
      Signed-off-by: Zumeng Chen <zumeng.chen@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "net: core: maybe return -EEXIST in __dev_alloc_name" · 029b6d14
      Johannes Berg authored
      This reverts commit d6f295e9; some userspace (in the case
      we noticed it's wpa_supplicant), is relying on the current
      error code to determine that a fixed name interface already
      exists.
      Reported-by: Jouni Malinen <j@w1.fi>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: fix port stats for mac representors · 42d779ff
      Pieter Jansen van Vuuren authored
      Previously we swapped the tx_packets, tx_bytes and tx_dropped counters
      with rx_packets, rx_bytes and rx_dropped counters, respectively. This
      behaviour is correct and expected for VF representors but it should not
      be swapped for physical port mac representors.
      
      Fixes: eadfa4c3 ("nfp: add stats and xmit helpers for representors")
      Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
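
The behaviour described in the nfp commit above (swap rx and tx counters for VF representors, leave them alone for physical port MAC representors) can be sketched as follows. `struct demo_stats`, `demo_repr_stats()` and the `is_vf_repr` flag are hypothetical names for illustration; only the counter names come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter set; field names follow the commit message. */
struct demo_stats {
	uint64_t rx_packets, rx_bytes, rx_dropped;
	uint64_t tx_packets, tx_bytes, tx_dropped;
};

/* Report hardware counters from the representor's point of view. */
static struct demo_stats demo_repr_stats(struct demo_stats hw, bool is_vf_repr)
{
	struct demo_stats out = hw;

	if (is_vf_repr) {
		/* VF representor: swap the rx and tx directions. */
		out.rx_packets = hw.tx_packets;
		out.rx_bytes = hw.tx_bytes;
		out.rx_dropped = hw.tx_dropped;
		out.tx_packets = hw.rx_packets;
		out.tx_bytes = hw.rx_bytes;
		out.tx_dropped = hw.rx_dropped;
	}
	/* Physical port MAC representor: counters are passed through. */
	return out;
}
```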
    • Revert "tcp: must block bh in __inet_twsk_hashdance()" · e599ea14
      Eric Dumazet authored
      We had to disable BH _before_ calling __inet_twsk_hashdance() in commit
      cfac7f83 ("tcp/dccp: block bh before arming time_wait timer").
      
      This means we can revert 614bdd4d ("tcp: must block bh in
      __inet_twsk_hashdance()").
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 05 Dec, 2017 4 commits
    • Merge branch 'bpf-fix-broken-uapi-for-pt-regs' · 037776e4
      Daniel Borkmann authored
      Hendrik Brueckner says:
      
      ====================
      Perf tool bpf selftests revealed a broken uapi for s390 and arm64.
      With the BPF_PROG_TYPE_PERF_EVENT program type the bpf_perf_event
      structure exports the pt_regs structure for all architectures.
      
      This fails for s390 and arm64 because pt_regs are not part of the
      user api and kept in-kernel only.  To mitigate the broken uapi,
      introduce a wrapper that exports pt_regs in an asm-generic way.
      For arm64, export the existing user_pt_regs structure.  For s390,
      introduce a user_pt_regs structure that exports the beginning of
      pt_regs.
      
      Note that user_pt_regs must export from the beginning of pt_regs
      as BPF_PROG_TYPE_PERF_EVENT program type is not the only type for
      running BPF programs.
      
      Some more background:
      
        For the bpf_perf_event, there is a uapi definition that is
        passed to the BPF program.  For other "probe" points like
        trace points, kprobes, and uprobes, there is no uapi and the
        BPF program is always passed pt_regs (which is OK as the BPF
        program runs in the kernel context).  The perf tool can attach
        BPF programs to all of these "probe" points and, optionally,
        can create a BPF prologue to access particular arguments
        (passed as registers).  For this, it uses DWARF/CFI
        information to obtain the register and calls a perf-arch
        backend function, regs_query_register_offset().  This function
        returns the index into (user_)pt_regs for a particular
        register.  Then, perf creates a BPF prologue that accesses
        this register based on the passed structure from the "probe"
        point.
      
      Also part of this series are updates to the testing and bpf selftest
      to deal with asm-specifics.  To complete the bpf support in perf, the
      regs_query_register_offset function is added for s390 to support
      BPF prologue creation.
      
      Changelog v1 -> v2:
      - Correct kbuild test bot issues by including
        asm-generic/bpf_perf_event.h for architectures that do not have
        their own asm version.
      - Added patch to clean-up whitespace and coding style issues in s390
        asm/ptrace.h (#4/6) as suggested by Alexei.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • perf s390: add regs_query_register_offset() · a81c4213
      Hendrik Brueckner authored
      The regs_query_register_offset() helper function converts a
      register name like "%r0" to the offset of that register in
      user_pt_regs.  It is required by the BPF prologue generator.
      
      The user_pt_regs structure was recently added to "asm/ptrace.h".
      Hence, update tools/perf/check-headers.sh to keep the header file
      in sync with kernel changes.
      Suggested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
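
A name-to-offset lookup of the kind described in the perf s390 commit above can be sketched with a small table. This is an illustration only: `struct demo_user_pt_regs` is a hypothetical layout (the real s390 user_pt_regs is not reproduced here), the table covers just four registers, and returning -1 for an unknown name is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical register block, for illustration only. */
struct demo_user_pt_regs {
	unsigned long args[1];
	unsigned long psw[2];
	unsigned long gprs[16];
};

struct demo_reg_entry {
	const char *name;
	int offset;
};

/* Build a table entry such as { "%r0", offset of gprs[0] }. */
#define DEMO_GPR(n) \
	{ "%r" #n, (int)offsetof(struct demo_user_pt_regs, gprs[n]) }

static const struct demo_reg_entry demo_regs[] = {
	DEMO_GPR(0), DEMO_GPR(1), DEMO_GPR(2), DEMO_GPR(3),
};

/* Return the byte offset of the named register, or -1 if it is unknown. */
static int demo_query_register_offset(const char *name)
{
	size_t i;

	if (name == NULL)
		return -1;
	for (i = 0; i < sizeof(demo_regs) / sizeof(demo_regs[0]); i++) {
		if (strcmp(demo_regs[i].name, name) == 0)
			return demo_regs[i].offset;
	}
	return -1;
}
```

A prologue generator could then turn a DWARF register name into a load at `demo_query_register_offset(name)` bytes from the start of the register block.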
    • selftests/bpf: sync kernel headers and introduce arch support in Makefile · 618e165b
      Hendrik Brueckner authored
      Synchronize the uapi kernel header files, which solves the broken
      uapi export of pt_regs.  Because of arch-specific uapi headers,
      extend the include path in the Makefile.
      
      With this change, the test_verifier program compiles and runs successfully
      on s390.
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • s390/uapi: correct whitespace & coding style in asm/ptrace.h · 62e1dfa3
      Hendrik Brueckner authored
      Correct whitespace and coding style issues in the s390 asm/ptrace.h
      uapi header file.  This is preparatory work to copy it to the tools/
      directory for inclusion by selftests and perf.
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>