1. 27 Oct, 2018 13 commits
  2. 24 Oct, 2018 2 commits
    • tcp: add tcp_reset_xmit_timer() helper · 3f80e08f
      Eric Dumazet committed
      With the EDT model, SRTT is no longer inflated by pacing delays.
      
      This means that RTO and some other xmit timers might be set up
      incorrectly. This is particularly visible with either:
      
      - Very small enforced pacing rates (SO_MAX_PACING_RATE)
      - Reduced RTO (from the default 200 ms)
      
      This can lead to TCP flow aborts in the worst case,
      or spurious retransmits in other cases.
      
      For example, this session gets far more throughput
      than the requested 80 kbit:
      
      $ netperf -H 127.0.0.2 -l 100 -- -q 10000
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.2 () port 0 AF_INET
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
      540000 262144 262144    104.00      2.66
      
      With the fix:
      
      $ netperf -H 127.0.0.2 -l 100 -- -q 10000
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.2 () port 0 AF_INET
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
      540000 262144 262144    104.00      0.12
      
      EDT allows for better control of rtx timers, since TCP has
      a better idea of the earliest departure time of each skb
      in the rtx queue. We only have to add to the timer, when
      applicable, the difference between the EDT and the current time.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
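The timer adjustment described in this commit message can be illustrated with a minimal userspace sketch. It is not the kernel helper: the function names, the nanosecond units, and the idea of passing the EDT and the current time as plain integers are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names come from the kernel. */

/* Extra wait (ns) until the skb's earliest departure time; 0 if the EDT has passed. */
static uint64_t pacing_delay_ns(uint64_t edt_ns, uint64_t now_ns)
{
    return edt_ns > now_ns ? edt_ns - now_ns : 0;
}

/* Timer duration: the usual timeout plus the remaining pacing delay. */
static uint64_t xmit_timer_ns(uint64_t timeout_ns, uint64_t edt_ns, uint64_t now_ns)
{
    uint64_t delay = pacing_delay_ns(edt_ns, now_ns);

    assert(timeout_ns <= UINT64_MAX - delay); /* guard against wrap-around */
    return timeout_ns + delay;
}
```

With a 200 ms timeout and a packet whose EDT is 50 ms in the future, this model arms the timer for 250 ms; when the EDT is already in the past, the timeout is left unchanged.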
    • Revert "net: simplify sock_poll_wait" · 89ab066d
      Karsten Graul committed
      This reverts commit dd979b4d.
      
      This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
      internal TCP socket for the initial handshake with the remote peer.
      Whenever the SMC connection cannot be established, this TCP socket is
      used as a fallback. All socket operations on the SMC socket are then
      forwarded to the TCP socket. In case of poll, the file->private_data
      pointer references the SMC socket because the TCP socket has no file
      assigned. This causes tcp_poll to wait on the wrong socket.
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 23 Oct, 2018 14 commits
  4. 22 Oct, 2018 1 commit
    • pstore/ram: Clarify resource reservation labels · 1227daa4
      Kees Cook committed
      When ramoops reserved a memory region in the kernel, it had an unhelpful
      label of "persistent_memory". When reading /proc/iomem, it would be
      repeated many times, did not hint that it was ramoops in particular,
      and didn't clarify very much about what each was used for:
      
      400000000-407ffffff : Persistent Memory (legacy)
        400000000-400000fff : persistent_memory
        400001000-400001fff : persistent_memory
      ...
        4000ff000-4000fffff : persistent_memory
      
      Instead, this adds meaningful labels for how the various regions are
      being used:
      
      400000000-407ffffff : Persistent Memory (legacy)
        400000000-400000fff : ramoops:dump(0/252)
        400001000-400001fff : ramoops:dump(1/252)
      ...
        4000fc000-4000fcfff : ramoops:dump(252/252)
        4000fd000-4000fdfff : ramoops:console
        4000fe000-4000fe3ff : ramoops:ftrace(0/3)
        4000fe400-4000fe7ff : ramoops:ftrace(1/3)
        4000fe800-4000febff : ramoops:ftrace(2/3)
        4000fec00-4000fefff : ramoops:ftrace(3/3)
        4000ff000-4000fffff : ramoops:pmsg
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
      Tested-by: Guenter Roeck <groeck@chromium.org>
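The label scheme in the /proc/iomem listing above can be reproduced with a short userspace sketch. The helper name and its parameters are hypothetical, and the rule it applies (an "(index/last)" suffix only when a record type has more than one zone) is inferred from the sample output rather than taken from the kernel source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not the kernel code): build a label in the
 * style shown in the listing. A record type with several zones gets an
 * "(index/last)" suffix; a single-zone type gets the bare name.
 * Returns 0 on success, -1 if the name is empty or the label did not fit.
 */
static int ramoops_style_label(char *buf, size_t len, const char *name,
                               int index, int count)
{
    int n;

    assert(count >= 1 && index >= 0 && index < count);
    if (strlen(name) == 0)
        return -1;
    if (count > 1)
        n = snprintf(buf, len, "ramoops:%s(%d/%d)", name, index, count - 1);
    else
        n = snprintf(buf, len, "ramoops:%s", name);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Calling it with `"dump"`, index 0 and 253 zones yields `ramoops:dump(0/252)`, while `"console"` with a single zone yields `ramoops:console`.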
  5. 21 Oct, 2018 4 commits
  6. 20 Oct, 2018 6 commits
    • net: phy: micrel: add Microchip KSZ9131 initial driver · bff5b4b3
      Yuiko Oshino committed
      Add support for Microchip Technology KSZ9131 10/100/1000 Ethernet PHY
      Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netpoll: allow cleanup to be synchronous · c9fbd71f
      Debabrata Banerjee committed
      This fixes a problem introduced by:
      commit 2cde6acd ("netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock")
      
      When using netconsole on a bond, __netpoll_cleanup can asynchronously
      recurse multiple times: each __netpoll_free_async call can result in
      more __netpoll_free_async calls. This means there is now a race between
      cleanup_work queues on multiple netpoll_info's on multiple devices and
      the configuration of a new netpoll. For example, if a netconsole is set
      to enable 0, reconfigured, and set to enable 1 immediately, this
      netconsole will likely not work.
      
      __netpoll_free_async exists because cleanup can be called when rtnl
      is not locked; if rtnl is locked, we should be able to execute
      synchronously. It appears to be locked everywhere it's called from.
      
      Generalize the design pattern from the teaming driver for current
      callers of __netpoll_free_async.
      
      CC: Neil Horman <nhorman@tuxdriver.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: skmsg, fix psock create on existing kcm/tls port · 5032d079
      John Fastabend committed
      Before using the psock returned by sk_psock_get() when adding it to a
      sockmap we need to ensure it is actually a sockmap based psock.
      Previously we were only checking this after incrementing the reference
      counter which was an error. This resulted in a slab-out-of-bounds
      error when the psock was not actually a sockmap type.
      
      This moves the check up so the reference counter is only used
      if it is a sockmap psock.
      
      Eric reported the following KASAN BUG,
      
      BUG: KASAN: slab-out-of-bounds in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
      BUG: KASAN: slab-out-of-bounds in refcount_inc_not_zero_checked+0x97/0x2f0 lib/refcount.c:120
      Read of size 4 at addr ffff88019548be58 by task syz-executor4/22387
      
      CPU: 1 PID: 22387 Comm: syz-executor4 Not tainted 4.19.0-rc7+ #264
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
       print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
       check_memory_region_inline mm/kasan/kasan.c:260 [inline]
       check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
       kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272
       atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
       refcount_inc_not_zero_checked+0x97/0x2f0 lib/refcount.c:120
       sk_psock_get include/linux/skmsg.h:379 [inline]
       sock_map_link.isra.6+0x41f/0xe30 net/core/sock_map.c:178
       sock_hash_update_common+0x19b/0x11e0 net/core/sock_map.c:669
       sock_hash_update_elem+0x306/0x470 net/core/sock_map.c:738
       map_update_elem+0x819/0xdf0 kernel/bpf/syscall.c:818
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
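The ordering fix in this commit (verify the object type first, touch the reference counter second) can be shown with a small userspace model. All types and names below are hypothetical stand-ins, not kernel structures; the `kind` tag is an assumed way of recording who owns the socket's private pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these types are illustrative, not kernel structs. */
enum owner_kind { OWNER_NONE, OWNER_SOCKMAP, OWNER_OTHER };

struct psock_model {
    int refcnt;
};

struct sock_model {
    enum owner_kind kind; /* who installed user_data */
    void *user_data;      /* only a psock_model when kind == OWNER_SOCKMAP */
};

/* Take a reference unless the count has already dropped to zero. */
static bool ref_inc_not_zero(int *refcnt)
{
    if (*refcnt == 0)
        return false;
    (*refcnt)++;
    return true;
}

/*
 * Check the owner first, and only then treat user_data as a psock and touch
 * its reference counter. Returns NULL when the socket has no sockmap psock.
 */
static struct psock_model *psock_get_checked(struct sock_model *sk)
{
    struct psock_model *psock;

    assert(sk != NULL);
    if (sk->user_data == NULL || sk->kind != OWNER_SOCKMAP)
        return NULL;
    psock = sk->user_data;
    return ref_inc_not_zero(&psock->refcnt) ? psock : NULL;
}
```

If the check came after the increment, a pointer installed by another subsystem would be read as a `psock_model`, which is the kind of out-of-bounds access the KASAN report shows.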
    • bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB · b39b5f41
      Song Liu committed
      BPF programs of BPF_PROG_TYPE_CGROUP_SKB need to access headers in the
      skb. This patch enables direct access of skb for these programs.
      
      Two helper functions, bpf_compute_and_save_data_end() and
      bpf_restore_data_end(), are introduced. They are used in
      __cgroup_bpf_run_filter_skb() to compute the proper data_end for the
      BPF program and to restore the original value afterwards.
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
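The save, compute, run, restore sequence described in this commit can be sketched in userspace as follows. The struct is a hypothetical stand-in for a packet buffer, and the assumption that data_end is set to the end of the linear data area is labeled as such in the code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet buffer; not the kernel's struct sk_buff. */
struct pkt_model {
    unsigned char *data;     /* start of the linear data */
    size_t linear_len;       /* bytes available in the linear area */
    unsigned char *data_end; /* scratch field that another user may have set */
};

/* Save the current data_end, then point it at the end of the linear data
 * (assumption: the program may only read the linear area). */
static void compute_and_save_data_end(struct pkt_model *pkt, unsigned char **saved)
{
    *saved = pkt->data_end;
    pkt->data_end = pkt->data + pkt->linear_len;
}

/* Put back whatever value data_end held before the program ran. */
static void restore_data_end(struct pkt_model *pkt, unsigned char *saved)
{
    pkt->data_end = saved;
}

/* Example "program": reads the first byte only if it lies before data_end. */
static int first_byte_or_minus_one(const struct pkt_model *pkt)
{
    return pkt->data + 1 <= pkt->data_end ? pkt->data[0] : -1;
}

/* Run the program with a freshly computed data_end, then restore the old one. */
static int run_filter(struct pkt_model *pkt)
{
    unsigned char *saved;
    int ret;

    assert(pkt != NULL);
    compute_and_save_data_end(pkt, &saved);
    ret = first_byte_or_minus_one(pkt);
    restore_data_end(pkt, saved);
    return ret;
}
```

The point of the pairing is that the field is shared scratch space, so the caller's previous value has to survive the program run.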
    • bpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall · bd513cd0
      Mauricio Vasquez B committed
      The previous patch implemented bpf queue/stack maps that
      provide the peek/pop/push functions.  There is no direct
      relationship between those functions and the current map
      syscalls, hence a new MAP_LOOKUP_AND_DELETE_ELEM syscall is added.
      It is mapped to the pop operation in the queue/stack maps
      and is still to be implemented for other kinds of maps.
      Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: add queue and stack maps · f1a2e44a
      Mauricio Vasquez B committed
      Queue/stack maps implement FIFO/LIFO data storage for eBPF programs.
      These maps support peek, pop and push operations that are exposed to eBPF
      programs through the new bpf_map[peek/pop/push] helpers.  Those operations
      are exposed to userspace applications through the already existing
      syscalls in the following way:
      
      BPF_MAP_LOOKUP_ELEM            -> peek
      BPF_MAP_LOOKUP_AND_DELETE_ELEM -> pop
      BPF_MAP_UPDATE_ELEM            -> push
      
      Queue/stack maps are implemented using a buffer, tail and head indexes,
      hence BPF_F_NO_PREALLOC is not supported.
      
      Unlike other maps, queue and stack maps do not use RCU for protecting
      map values. The bpf_map[peek/pop] helpers have an ARG_PTR_TO_UNINIT_MAP_VALUE
      argument that is a pointer to a memory zone where the value of a
      map is saved.  It is basically the same as ARG_PTR_TO_UNINIT_MEM, but the
      size does not have to be passed as an extra argument.
      
      Our main motivation for implementing queue/stack maps was to keep track
      of a pool of elements, like network ports in a SNAT; however, we foresee
      other use cases, for example saving the last N kernel events in a map
      and then analysing them from userspace.
      Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>