1. 23 Apr 2019, 9 commits
  2. 22 Apr 2019, 2 commits
  3. 20 Apr 2019, 4 commits
    • tipc: introduce new socket option TIPC_SOCK_RECVQ_USED · 42e5425a
      Authored by Tung Nguyen
      When using TIPC_SOCK_RECVQ_DEPTH with getsockopt(), it returns the
      number of buffers in the receive socket buffer, which is not very
      helpful for user space applications.
      
      This commit introduces the new option TIPC_SOCK_RECVQ_USED, which
      returns the number of bytes currently allocated in the receive socket
      buffer. This helps user space applications dimension their buffer
      usage and avoid buffer overload issues (a usage sketch follows this
      entry).
      Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      42e5425a
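
      For the new option above, a minimal user-space sketch (not part of the
      patch) of how an application might query it. The socket is assumed to
      be an already-created TIPC socket; the SOL_TIPC fallback definition is
      only there in case the libc headers do not provide it.

      #include <stdio.h>
      #include <sys/socket.h>
      #include <linux/tipc.h>

      #ifndef SOL_TIPC
      #define SOL_TIPC 271    /* value from linux/socket.h */
      #endif

      static int print_rcvq_usage(int sd)
      {
              unsigned int used = 0, depth = 0;
              socklen_t len = sizeof(used);

              /* New: bytes currently allocated in the receive socket buffer. */
              if (getsockopt(sd, SOL_TIPC, TIPC_SOCK_RECVQ_USED, &used, &len) < 0)
                      return -1;

              /* Existing option, for comparison: number of buffers queued. */
              len = sizeof(depth);
              if (getsockopt(sd, SOL_TIPC, TIPC_SOCK_RECVQ_DEPTH, &depth, &len) < 0)
                      return -1;

              printf("receive queue: %u bytes across %u buffers\n", used, depth);
              return 0;
      }
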
    • net: socket: implement 64-bit timestamps · 0768e170
      Authored by Arnd Bergmann
      The 'timeval' and 'timespec' data structures used for socket timestamps
      are going to be redefined in user space based on 64-bit time_t in future
      versions of the C library to deal with the y2038 overflow problem,
      which breaks the ABI definition.
      
      Unlike many modern ioctl commands, SIOCGSTAMP and SIOCGSTAMPNS do not
      use the _IOR() macro to encode the size of the transferred data, so it
      remains ambiguous whether the application uses the old or new layout.
      
      The best workaround I could find is rather ugly: we redefine the command
      code based on the size of the respective data structure, using a ternary
      operator. This lets it get evaluated as late as possible, hopefully after
      that structure is visible to the caller (a sketch follows this entry). We
      cannot use an #ifdef here, because linux/sockios.h might have been
      included before any libc header that could determine the size of time_t.
      
      The ioctl implementation now interprets the new command codes as always
      referring to the 64-bit structure on all architectures, while the old
      architecture specific command code still refers to the old architecture
      specific layout. The new command number is only used when they are
      actually different.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0768e170
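
      A rough sketch of the mechanism described above, simplified from how it
      could look in the uapi header. The _OLD numbers are the traditional
      SIOCGSTAMP/SIOCGSTAMPNS values; the real header additionally keeps the
      old numbers on 64-bit and x32 builds, where the existing layout is
      already correct.

      #include <linux/ioctl.h>
      #include <sys/time.h>

      #define SIOCGSTAMP_OLD   0x8906    /* layout depends on the arch's time_t */
      #define SIOCGSTAMPNS_OLD 0x8907
      #define SIOCGSTAMP_NEW   _IOR(0x89, 0x06, long long[2])    /* always 64-bit */
      #define SIOCGSTAMPNS_NEW _IOR(0x89, 0x07, long long[2])

      /* The ternary is evaluated where the macro is used, i.e. after the libc
       * headers have defined struct timeval/timespec, so it picks the command
       * code matching the layout the application was actually compiled with. */
      #define SIOCGSTAMP \
              (sizeof(struct timeval) == 8 ? SIOCGSTAMP_OLD : SIOCGSTAMP_NEW)
      #define SIOCGSTAMPNS \
              (sizeof(struct timespec) == 8 ? SIOCGSTAMPNS_OLD : SIOCGSTAMPNS_NEW)
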
    • net: rework SIOCGSTAMP ioctl handling · c7cbdbf2
      Authored by Arnd Bergmann
      The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
      socket protocol handlers, and all of those end up calling the same
      sock_get_timestamp()/sock_get_timestampns() helper functions, which
      results in a lot of duplicate code.
      
      With the introduction of 64-bit time_t on 32-bit architectures, this
      gets worse, as we then need four different ioctl commands in each
      socket protocol implementation.
      
      To simplify that, let's add a new .gettstamp() operation in struct
      proto_ops (see the sketch after this entry), and move the ioctl
      implementation into the common sock_ioctl()/compat_sock_ioctl_trans()
      functions that these all go through.
      
      We can reuse the sock_get_timestamp() implementation, but generalize
      it so it can deal with both native and compat mode, as well as
      timeval and timespec structures.
      Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c7cbdbf2
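
      A schematic sketch of the new hook: the member as it would sit in
      struct proto_ops (neighbouring members omitted) and how a protocol
      wires it up to the shared helper. The signature follows my reading of
      the change, so treat the details as assumptions.

      struct proto_ops {
              /* ... existing operations (ioctl, getsockopt, ...) ... */
              int     (*gettstamp)(struct socket *sock, void __user *userstamp,
                                   bool timeval, bool time32);
              /* ... */
      };

      /* A protocol then needs only one line in its ops table: */
      static const struct proto_ops example_ops = {
              /* ... */
              .gettstamp = sock_gettstamp,    /* the common helper */
      };
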
    • vlan: support binding link state to vlan member bridge ports · 8c8b3458
      Authored by Mike Manning
      In the case of vlan filtering on bridges, the bridge may also have the
      corresponding vlan devices as upper devices. Currently the link state
      of the vlan devices is inherited from the lower device, so it is up if
      the bridge is administratively up and at least one bridge port is up,
      regardless of which vlan that port is a member of.
      
      The link state of the vlan device may need to track only the state of
      the subset of ports that are also members of the corresponding vlan,
      rather than that of all ports.
      
      Add a flag to specify a vlan bridge binding mode, by which the link
      state is no longer automatically transferred from the lower device,
      but is instead determined by the bridge ports that are members of the
      vlan.
      Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8c8b3458
  4. 19 Apr 2019, 4 commits
  5. 18 Apr 2019, 7 commits
  6. 17 Apr 2019, 1 commit
  7. 16 Apr 2019, 3 commits
  8. 15 Apr 2019, 3 commits
    • fs: prevent page refcount overflow in pipe_buf_get · 15fab63e
      Authored by Matthew Wilcox
      Change pipe_buf_get() to return a bool indicating whether it succeeded
      in raising the refcount of the page (if the thing in the pipe is a page).
      This removes another mechanism for overflowing the page refcount. All
      callers are converted to handle a failure (a caller-side sketch follows
      this entry).
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: Matthew Wilcox <willy@infradead.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      15fab63e
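
      A hypothetical caller (not taken from the patch) showing the point of
      the change: the bool return of pipe_buf_get() must now be checked, and
      a failure means the page reference was not taken.

      #include <linux/pipe_fs_i.h>
      #include <linux/mm.h>
      #include <linux/errno.h>

      static int example_grab_pipe_page(struct pipe_inode_info *pipe,
                                        struct pipe_buffer *buf)
      {
              if (!pipe_buf_get(pipe, buf)) {
                      /* The page refcount could not be raised safely; back
                       * off instead of risking a refcount overflow. */
                      return -EAGAIN;
              }

              /* ... use buf->page ... */

              put_page(buf->page);    /* drop the reference taken above */
              return 0;
      }
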
    • mm: add 'try_get_page()' helper function · 88b1a17d
      Authored by Linus Torvalds
      This is the same as the traditional 'get_page()' function, but instead
      of unconditionally incrementing the reference count of the page, it only
      does so if the count was "safe".  It returns whether the reference count
      was incremented (and is marked __must_check, since the caller obviously
      has to be aware of it).
      
      Also like 'get_page()', you can't use this function unless you already
      had a reference to the page.  The intent is that you can use this
      exactly like get_page(), but in situations where you want to limit the
      maximum reference count.
      
      The code currently does an unconditional WARN_ON_ONCE() if we ever hit
      a reference count issue (either zero or negative), as a notification
      that the conditional non-increment actually happened (a condensed
      sketch follows this entry).
      
      NOTE! The count access for the "safety" check is inherently racy, but
      that doesn't matter since the buffer we use is basically half the range
      of the reference count (ie we look at the sign of the count).
      Acked-by: Matthew Wilcox <willy@infradead.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      88b1a17d
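
      A condensed sketch of the helper, consistent with the description above
      (paraphrased rather than copied; details such as compound-page handling
      may differ in the real version):

      #include <linux/mm.h>

      static inline __must_check bool try_get_page(struct page *page)
      {
              if (WARN_ON_ONCE(page_ref_count(page) <= 0))
                      return false;   /* count not "safe": refuse to take a ref */
              page_ref_inc(page);
              return true;
      }
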
    • mm: make page ref count overflow check tighter and more explicit · f958d7b5
      Authored by Linus Torvalds
      We have a VM_BUG_ON() to check that the page reference count doesn't
      underflow (or get close to overflow) by checking the sign of the count.
      
      That's all fine, but we actually want to allow people to use a "get page
      ref unless it's already very high" helper function, and we want that one
      to use the sign of the page ref (without triggering this VM_BUG_ON).
      
      Change the VM_BUG_ON to only check for small underflows (or _very_ close
      to overflowing), and ignore overflows which have strayed into negative
      territory.
      Acked-by: Matthew Wilcox <willy@infradead.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f958d7b5
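
      One way to express the tighter check described above as a single
      unsigned comparison; the margin of 127 is an illustrative choice, not
      necessarily the exact value used by the patch.

      /* True when the count is zero or only slightly negative (a small
       * underflow, i.e. within 127 of wrapping the unsigned range). Counts
       * that have overflowed far into negative territory - the case the
       * sign-based "refuse another reference" helper watches for - no longer
       * trip the assertion. */
      #define page_ref_zero_or_close_to_overflow(page) \
              ((unsigned int) page_ref_count(page) + 127u <= 127u)

      /* used as: VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page); */
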
  9. 13 Apr 2019, 7 commits
    • rhashtable: use BIT(0) for locking. · ca0b709d
      Authored by NeilBrown
      As reported by Guenter Roeck, the new bit-locking using BIT(1)
      doesn't work on the m68k architecture. m68k only requires 2-byte
      alignment for words and longwords, so there is only one unused bit
      in pointers to structs. We currently use two: one for the NULLS
      marker at the end of the linked list, and one for the bit-lock in
      the head of the list.
      
      The two uses don't need to conflict as we never need the head of the
      list to be a NULLS marker - the marker is only needed to check if an
      object has moved to a different table, and the bucket head cannot
      move.  The NULLS marker is only needed in a ->next pointer.
      
      As we already have different types for the bucket head pointer (struct
      rhash_lock_head) and the ->next pointers (struct rhash_head), it is
      fairly easy to treat the lsb differently in each.
      
      So: initialize bucket heads to NULL, and use the lsb for locking.
      When loading the pointer from the bucket head, if it is NULL (ignoring
      the lock bit), report it as the expected NULLS marker (a sketch of this
      follows this entry).
      When storing a value into a bucket head, if it is a NULLS marker,
      store NULL instead.
      
      And convert all places that used bit 1 for locking, to use bit 0.
      
      Fixes: 8f0db018 ("rhashtable: use bit_spin_locks to protect hash bucket.")
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca0b709d
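
      A simplified sketch of the load side described above, with the RCU and
      lockdep details omitted (the real helper folds those in as well):

      #include <linux/rhashtable.h>

      static inline struct rhash_head *sketch_rht_ptr(struct rhash_lock_head *const *bkt)
      {
              unsigned long p = (unsigned long)*bkt & ~BIT(0);    /* drop the lock bit */

              /* An empty bucket is stored as plain NULL but reported as the
               * NULLS marker that list walkers expect to find at the end. */
              return p ? (struct rhash_head *)p
                       : (struct rhash_head *)RHT_NULLS_MARKER(bkt);
      }
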
    • rhashtable: replace rht_ptr_locked() with rht_assign_locked() · f4712b46
      Authored by NeilBrown
      The only time rht_ptr_locked() is used is to store a new value in a
      bucket head, and that is the only place it makes sense to use it.
      So replace it with a function that does the whole task: set the lock
      bit and assign to a bucket head (see the sketch after this entry).
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f4712b46
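
      A hedged sketch of the combined helper: set the lock bit and publish
      the new bucket head in one step. Types and the choice of lock bit
      follow my reading of the series, so treat the details as assumptions.

      static inline void sketch_rht_assign_locked(struct rhash_lock_head __rcu **bkt,
                                                  struct rhash_head *obj)
      {
              /* The caller holds the bucket bit-lock; keep the lock bit set in
               * the stored value and publish with rcu_assign_pointer() so that
               * readers only ever see a fully initialised object. */
              rcu_assign_pointer(*(struct rhash_head __rcu **)bkt,
                                 (void *)((unsigned long)obj | BIT(0)));
      }
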
    • rhashtable: move dereference inside rht_ptr() · adc6a3ab
      Authored by NeilBrown
      Rather than dereferencing a pointer to a bucket and then passing the
      result to rht_ptr(), we now pass in the pointer and do the dereference
      in rht_ptr().
      
      This requires that we pass in the tbl and hash as well to support RCU
      checks, and means that the various rht_for_each functions can expect a
      pointer that can be dereferenced without further care.
      
      There are two places where we dereference a bucket pointer with no
      testable protection - in each case we know that we must have exclusive
      access without having taken a lock. The previous code used
      rht_dereference() to pretend that holding the mutex provided
      protection, but holding the mutex never provides protection for
      accessing buckets.
      
      So instead introduce rht_ptr_exclusive() that can be used when
      there is known to be exclusive access without holding any locks.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      adc6a3ab
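
      A schematic bucket walk using the revised interface: the bucket pointer
      itself is handed to rht_ptr() together with tbl and hash for the RCU
      checks, and the result can be followed without further sanitising. The
      loop body and locking context are hypothetical.

      #include <linux/rhashtable.h>

      static void sketch_walk_bucket(struct bucket_table *tbl, unsigned int hash)
      {
              struct rhash_lock_head __rcu *const *bkt = rht_bucket(tbl, hash);
              struct rhash_head *pos;

              for (pos = rht_ptr(bkt, tbl, hash);
                   !rht_is_a_nulls(pos);
                   pos = rht_dereference_bucket(pos->next, tbl, hash)) {
                      /* ... inspect the entry ... */
              }

              /* Where no lock is held but access is known to be exclusive
               * (e.g. while freeing the whole table), rht_ptr_exclusive(bkt)
               * is used instead of rht_ptr(). */
      }
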
    • rhashtable: reorder some inline functions and macros. · c5783311
      Authored by NeilBrown
      This patch only moves some code around; it doesn't change the code
      at all. A subsequent patch will benefit from this, as it needs to
      add calls to functions which are now defined before the call site
      but weren't before.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c5783311
    • rhashtable: fix some __rcu annotation errors · e4edbe3c
      Authored by NeilBrown
      With these annotations, the rhashtable code now produces no warnings
      when compiled with "C=1" for sparse checking.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e4edbe3c
    • rxrpc: Make rxrpc_kernel_check_life() indicate if call completed · 4611da30
      Authored by Marc Dionne
      Make rxrpc_kernel_check_life() pass back the life counter through the
      argument list and return true if the call has not yet completed.
      Suggested-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4611da30
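
      A small caller-side sketch of the revised interface: the life counter
      now comes back through an output argument and the return value says
      whether the call is still running. The parameter types are assumptions
      based on the description.

      #include <net/af_rxrpc.h>

      static bool sketch_call_progressing(struct socket *sock,
                                          struct rxrpc_call *rxcall,
                                          u32 *last_life)
      {
              u32 life;

              if (!rxrpc_kernel_check_life(sock, rxcall, &life))
                      return false;           /* completed: collect the result */

              if (life != *last_life)
                      *last_life = life;      /* still alive and making progress */

              return true;
      }
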
    • bpf: Introduce bpf_strtol and bpf_strtoul helpers · d7a4cb9b
      Authored by Andrey Ignatov
      Add bpf_strtol and bpf_strtoul to convert a string to a long or an
      unsigned long, respectively. They are similar to user space strtol(3)
      and strtoul(3), with a few changes to the API:
      
      * instead of a NUL-terminated C string, the helpers expect a buffer
        and a buffer length;
      
      * the resulting long or unsigned long is returned in a separate
        result argument;
      
      * the return value indicates success or failure; on success the number
        of consumed bytes is returned, which can be used to find the position
        to read next if the buffer is expected to contain multiple integers;
      
      * instead of a *base* argument, *flags* is used: it provides the base
        in the 5 LSBs, with the other bits reserved for future use;
      
      * the number of supported bases is limited.
      
      Documentation for the new helpers is provided in bpf.h UAPI.
      
      The helpers are made available to BPF_PROG_TYPE_CGROUP_SYSCTL programs to
      be able to convert string input to e.g. "ulongvec" output.
      
      E.g. "net/ipv4/tcp_mem" consists of three ulong integers. They can be
      parsed by calling to bpf_strtoul three times.
      
      Implementation notes:
      
      The implementation includes "../../lib/kstrtox.h" to reuse the integer
      parsing functions. This is done exactly the same way fs/proc/base.c
      already does it.
      
      Unfortunately, the existing kstrtoX functions can't be used directly,
      since they fail if any invalid character is present right after the
      integer in the string. The existing simple_strtoX functions can't be
      used either, since they're obsolete and don't handle overflow properly.
      Signed-off-by: Andrey Ignatov <rdna@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      d7a4cb9b
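
      A hedged sketch of the use case mentioned above: a cgroup sysctl
      program parsing the three unsigned longs of net/ipv4/tcp_mem with
      bpf_strtoul(). bpf_sysctl_get_new_value() is the companion helper from
      the same series; the header names, buffer size and accept/reject policy
      are illustrative assumptions, and the verifier may require further
      bounds checks than are shown here.

      #include <linux/bpf.h>
      #include <bpf/bpf_helpers.h>

      SEC("cgroup/sysctl")
      int sysctl_tcp_mem(struct bpf_sysctl *ctx)
      {
              unsigned long tcp_mem[3] = {};
              char value[64] = {};
              int len, off = 0, i;

              /* Copy the value user space is trying to write, as a string. */
              len = bpf_sysctl_get_new_value(ctx, value, sizeof(value));
              if (len < 0)
                      return 1;       /* a read, or no new value: allow */

      #pragma unroll
              for (i = 0; i < 3; i++) {
                      int consumed;

                      if (off < 0 || off >= len || off >= (int)sizeof(value))
                              return 0;       /* fewer integers than expected */

                      /* flags = 0: base auto-detection via the 5 LSBs */
                      consumed = bpf_strtoul(value + off, len - off, 0, &tcp_mem[i]);
                      if (consumed <= 0)
                              return 0;       /* malformed integer: reject */
                      off += consumed + 1;    /* step over the separating space */
              }

              /* Purely illustrative policy: min must not exceed max. */
              return tcp_mem[0] <= tcp_mem[2];
      }

      char _license[] SEC("license") = "GPL";
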