1. 22 Nov 2014 (14 commits)
  2. 20 Nov 2014 (6 commits)
    • fold verify_iovec() into copy_msghdr_from_user() · 08adb7da
      Al Viro authored
      ... and do the same on the compat side of things.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • {compat_,}verify_iovec(): switch to generic copying of iovecs · 08449320
      Al Viro authored
      Use {compat_,}rw_copy_check_uvector().  As a result, we are
      guaranteed that all iovecs seen in ->msg_iov by ->sendmsg()
      and ->recvmsg() will pass access_ok().  (A sketch of the idea
      follows this entry.)
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
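      What the switch buys, roughly: the generic helper from
      fs/read_write.c copies the whole iovec array in from userspace and
      validates every segment in one place.  A sketch under those
      assumptions, not the verbatim net/socket.c diff:

          /* Sketch: verify_iovec() expressed via the generic helper.
           * rw_copy_check_uvector() copies the iovec array from
           * userspace, checks each segment with access_ok(), and
           * returns the total byte count (or a negative error).
           */
          static ssize_t verify_iovec_sketch(struct msghdr *m,
                                             struct iovec *fast_iov, int mode)
          {
                  struct iovec *iov = fast_iov;
                  ssize_t len;

                  len = rw_copy_check_uvector(mode,
                                  (const struct iovec __user *)m->msg_iov,
                                  m->msg_iovlen, UIO_FASTIOV, fast_iov, &iov);
                  if (len >= 0)
                          m->msg_iov = iov;       /* now a kernel-side copy */
                  return len;
          }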
    • separate kernel- and userland-side msghdr · 666547ff
      Al Viro authored
      The kernel-side struct msghdr currently uses the same layout as the
      userland one, but it is not a one-to-one copy - even without
      considering 32-bit compat issues, we have msg_iov, msg_name and
      msg_control copied to the kernel[1].  It's fairly localized, so we
      get away with a few functions where that knowledge is needed (and
      we could shrink that set even more).  Pretty much everything deals
      with the kernel-side variant, and the few places that want the
      userland one just use a bunch of force-casts to paper over the
      differences.

      The thing is, the kernel-side definition of struct msghdr is *not*
      exposed in include/uapi - libc doesn't see it, etc.  So we can add
      struct user_msghdr, with proper annotations, and let the few places
      that ever deal with those beasts use it for userland pointers.
      Saner typechecking aside, that will allow changing the layout of
      the kernel-side msghdr - e.g. replacing msg_iov/msg_iovlen there
      with a struct iov_iter, getting rid of the need to modify the iovec
      as we copy data to/from it, etc.

      We could introduce kernel_msghdr instead, but that would create
      much more noise - the absolute majority of the instances would need
      their type switched to kernel_msghdr, and the definition of struct
      msghdr in include/linux/socket.h is not going to be seen by
      userland anyway.

      This commit just introduces user_msghdr and switches the few places
      that deal with the userland-side msghdr to it (sketched below).

      [1] Actually, it's even trickier than that - we copy msg_control
      for sendmsg, but keep the userland address on recvmsg.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
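      Roughly, the new userland-facing type looks like this (a sketch;
      the authoritative definition lives in include/linux/socket.h).  The
      __user annotations are the point of the exercise: sparse can now
      catch any attempt to dereference these pointers directly from
      kernel code.

          struct user_msghdr {
                  void            __user *msg_name;    /* optional address */
                  int             msg_namelen;         /* size of address */
                  struct iovec    __user *msg_iov;     /* scatter/gather array */
                  __kernel_size_t msg_iovlen;          /* # elements in msg_iov */
                  void            __user *msg_control; /* ancillary data */
                  __kernel_size_t msg_controllen;      /* ancillary data length */
                  unsigned int    msg_flags;           /* flags on received msg */
          };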
    • netlink: Deletion of an unnecessary check before the function call "__module_get" · fcd4d35e
      Markus Elfring authored
      The __module_get() function tests whether its argument is NULL and,
      if so, returns immediately.  Thus the NULL test around the call is
      not needed (see the before/after illustration below this entry).

      This issue was detected by using the Coccinelle software.
      Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
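      This class of cleanup always has the same shape; a generic
      before/after illustration (hypothetical call site, not the actual
      netlink diff):

          /* Before: redundant NULL check around a callee that already
           * handles NULL itself (hypothetical call site).
           */
          if (module)
                  __module_get(module);

          /* After: drop the guard; __module_get(NULL) is a no-op. */
          __module_get(module);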
    • net: pktgen: Deletion of an unnecessary check before the function call "proc_remove" · ef87c5d6
      Markus Elfring authored
      The proc_remove() function tests whether its argument is NULL and,
      if so, returns immediately.  Thus the NULL test around the call is
      not needed.

      This issue was detected by using the Coccinelle software.
      Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: make connect() mem charging friendly · 355a901e
      Eric Dumazet authored
      While working on sk_forward_alloc problems reported by Denys
      Fedoryshchenko, we found that tcp connect() (and fastopen) does not
      call sk_wmem_schedule() for the SYN packet (and/or the SYN/DATA
      packet), so sk_forward_alloc is negative while connect is in
      progress.

      We can fix this by calling the regular sk_stream_alloc_skb() both
      for the SYN packet (in tcp_connect()) and the syn_data packet (in
      tcp_send_syn_data()).

      Then, tcp_send_syn_data() can avoid copying syn_data, as we can
      simply manipulate syn_data->cb[] to remove the SYN flag (and
      increment the sequence number) - see the sketch below.

      Instead of open-coding memcpy_fromiovecend(), simply use this
      helper.

      This leaves clean fast-clone skbs in the socket write queue.

      This was tested against our fastopen packetdrill tests.
      Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
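      The cb[] trick relies on TCP keeping its per-skb control data in
      skb->cb via TCP_SKB_CB(); roughly (an illustration of the idea, not
      the verbatim tcp_send_syn_data() hunk):

          /* Once the cloned SYN+DATA has been sent, the original
           * syn_data skb stays in the write queue for a possible
           * retransmit.  Strip the SYN from its control block so it
           * becomes a plain data segment, instead of copying its
           * payload into a fresh skb.
           */
          TCP_SKB_CB(syn_data)->seq++;    /* the SYN consumed one seq */
          TCP_SKB_CB(syn_data)->tcp_flags = TCPHDR_ACK | TCPHDR_PSH;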
  3. 19 Nov 2014 (8 commits)
  4. 17 Nov 2014 (5 commits)
  5. 14 Nov 2014 (7 commits)
    • openvswitch: Fix build failure. · 8cd4313a
      Pravin B Shelar authored
      Add a dependency on INET to fix the following build error.  I have
      also fixed the MPLS dependency.

      ERROR: "ip_route_output_flow" [net/openvswitch/openvswitch.ko]
      undefined!
      make[1]: *** [__modpost] Error 1
      Reported-by: Jim Davis <jim.epost@gmail.com>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: limit GSO packets to half cwnd · d649a7a8
      Eric Dumazet authored
      In the datacenter world, GSO packets initially cooked by
      tcp_sendmsg() are usually big, as sk_pacing_rate is high.

      When the network is congested, cwnd can be smaller than the GSO
      packets found in the socket write queue.  tcp_write_xmit() splits
      GSO packets using the available cwnd, and we end up sending a
      single GSO packet, consuming all available cwnd.

      With GRO aggregation on the receiver, we might handle a single GRO
      packet, sending back a single ACK.

      1) This single ACK might be lost, and TLP or RTO is then forced to
         attempt a retransmit.
      2) This ACK releases a full cwnd, and the sender sends another big
         GSO packet, in a ping-pong mode.

      This behavior does not fill the pipe in the best way, because of
      scheduling artifacts.

      Make sure we always have at least two GSO packets in flight
      (sketched below).

      This allows us to safely increase GRO efficiency without risking
      spurious retransmits.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
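      The natural enforcement point is the cwnd test in
      net/ipv4/tcp_output.c; a sketch of the clamp, assuming the
      tcp_cwnd_test() shape of that era:

          /* Never let one skb use more than half the congestion window,
           * so at least two GSO packets are in flight at any time.
           */
          static unsigned int tcp_cwnd_test_sketch(u32 cwnd, u32 in_flight)
          {
                  u32 halfcwnd;

                  if (in_flight >= cwnd)
                          return 0;               /* no room at all */

                  halfcwnd = max(cwnd >> 1, 1U);
                  return min(halfcwnd, cwnd - in_flight);
          }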
    • rhashtable: Drop gfp_flags arg in insert/remove functions · 6eba8224
      Thomas Graf authored
      Reallocation is only required for shrinking and expanding, both of
      which rely on a mutex for synchronization, and callers of
      rhashtable_init() are in non-atomic context.  Therefore there is no
      reason to continue passing allocation hints through the API.

      Instead, use GFP_KERNEL and add __GFP_NOWARN | __GFP_NORETRY to
      allow for a silent fallback to vzalloc() without the OOM killer
      jumping in, as pointed out by Eric Dumazet and Eric W. Biederman
      (see the allocation sketch below).
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
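      The allocation pattern the commit describes looks roughly like this
      (a sketch of the bucket-table allocator, not the verbatim
      lib/rhashtable.c code):

          /* Try kzalloc() first, but silently (no warning, no OOM
           * killer) fall back to vzalloc() when physically contiguous
           * memory for a large table is not available.
           */
          static struct bucket_table *bucket_table_alloc_sketch(size_t nbuckets)
          {
                  struct bucket_table *tbl;
                  size_t size;

                  size = sizeof(*tbl) + nbuckets * sizeof(tbl->buckets[0]);
                  tbl = kzalloc(size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
                  if (!tbl)
                          tbl = vzalloc(size);
                  return tbl;     /* caller fills in nbuckets etc. */
          }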
    • rhashtable: Add parent argument to mutex_is_held · 7b4ce235
      Herbert Xu authored
      Currently mutex_is_held can only test locks that are global, since
      it takes no arguments.  This prevents rhashtable from being used in
      places where locks are local, e.g., per-namespace locks.

      This patch adds a parent field to mutex_is_held and
      rhashtable_params so that local locks can be used (and tested); see
      the sketch below.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
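      In rhashtable_params terms the change is small; roughly (a sketch,
      with unrelated fields elided):

          /* The lock predicate now receives an opaque context, so a
           * per-namespace (or otherwise local) mutex can be checked,
           * not just a global one.
           */
          struct rhashtable_params_sketch {
                  /* ... other parameters elided ... */
                  bool    (*mutex_is_held)(void *parent); /* was: bool (*)(void) */
                  void    *parent;                        /* passed back verbatim */
          };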
    • netfilter: Move mutex_is_held under PROVE_LOCKING · 1f501d62
      Herbert Xu authored
      The rhashtable function mutex_is_held is only used when
      PROVE_LOCKING is enabled.  This patch modifies netfilter so that
      rhashtable.h itself can later make mutex_is_held optional,
      depending on PROVE_LOCKING (the #ifdef pattern is sketched below).
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
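      On the caller side the change is plain conditional compilation; an
      illustration with a hypothetical helper and mutex name:

          #ifdef CONFIG_PROVE_LOCKING
          /* Only built when lockdep's PROVE_LOCKING is configured; the
           * callback (and its slot in rhashtable_params) can then be
           * compiled out entirely otherwise.  Names are hypothetical.
           */
          static bool example_mutex_is_held(void *parent)
          {
                  return lockdep_is_held(&example_mutex);
          }
          #endif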
    • netlink: Move mutex_is_held under PROVE_LOCKING · 97127566
      Herbert Xu authored
      The rhashtable function mutex_is_held is only used when
      PROVE_LOCKING is enabled.  This patch modifies netlink so that
      rhashtable.h itself can later make mutex_is_held optional,
      depending on PROVE_LOCKING.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: generic dev_disable_lro() stacked device handling · fbe168ba
      Michal Kubeček authored
      Large receive offloading is known to cause problems if received
      packets are passed on to another host.  Therefore the kernel
      disables it by calling dev_disable_lro() whenever a network device
      is enslaved in a bridge or forwarding is enabled for it (or
      globally).  For virtual devices, we need to disable LRO on the
      underlying physical device (which is actually receiving the
      packets).

      The current dev_disable_lro() code handles this propagation for a
      vlan (including 802.1ad nested vlan), a macvlan, or a vlan on top
      of a macvlan.  It doesn't handle other stacked devices and their
      combinations, in particular propagation from a bond to its slaves,
      which often causes problems in virtualization setups.

      As we now have generic data structures describing the upper-lower
      device relationship, dev_disable_lro() can be generalized to also
      disable LRO for all lower devices (if any) once it is disabled for
      the device itself (see the sketch below).

      For bonding and teaming devices, it is necessary to disable LRO not
      only on the current slaves at the moment dev_disable_lro() is
      called, but also on any slave (port) added later.

      v2: use lower device links for all devices (including vlan and
      macvlan)
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Acked-by: Veaceslav Falico <vfalico@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
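      With the generic adjacency lists, the propagation collapses to a
      walk over lower devices; roughly (a sketch of the net/core/dev.c
      shape, not the verbatim patch):

          /* After turning LRO off on the device itself, recurse into
           * every directly lower device using the generic adjacency
           * list, instead of special-casing vlan/macvlan.
           */
          void dev_disable_lro_sketch(struct net_device *dev)
          {
                  struct net_device *lower_dev;
                  struct list_head *iter;

                  dev->wanted_features &= ~NETIF_F_LRO;
                  netdev_update_features(dev);

                  if (unlikely(dev->features & NETIF_F_LRO))
                          netdev_WARN(dev, "failed to disable LRO!\n");

                  netdev_for_each_lower_dev(dev, lower_dev, iter)
                          dev_disable_lro_sketch(lower_dev);
          }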