1. 25 Jun, 2013 5 commits
    • tcp: remove invalid __rcu annotation · 7ae8639c
      Eric Dumazet committed
      struct tcp_fastopen_context has a field named tfm, which is a pointer
      to a crypto_cipher structure.
      
      It currently has a __rcu annotation, which is not needed at all.
      
      tcp_fastopen_ctx is the pointer fetched by rcu_dereference(), but once
      we have a pointer to current tcp_fastopen_context, we do not use/need
      rcu_dereference() to access tfm.
      
      This fixes a lot of sparse errors like the following:
      
      net/ipv4/tcp_fastopen.c:21:31: warning: incorrect type in argument 1 (different address spaces)
      net/ipv4/tcp_fastopen.c:21:31:    expected struct crypto_cipher *tfm
      net/ipv4/tcp_fastopen.c:21:31:    got struct crypto_cipher [noderef] <asn:4>*tfm
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jerry Chu <hkchu@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
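The distinction drawn in this message can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: C11 atomics stand in for rcu_dereference(), and the names `cipher_sketch`, `fastopen_ctx_sketch`, `published_ctx`, `publish_ctx` and `current_key_id` are hypothetical. Only the published context pointer needs an ordered load; `tfm` inside an already fetched context is a plain pointer and is read with a plain dereference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct crypto_cipher. */
struct cipher_sketch {
    int key_id;
};

/* Hypothetical stand-in for struct tcp_fastopen_context:
 * tfm is an ordinary pointer, with no __rcu-style annotation. */
struct fastopen_ctx_sketch {
    struct cipher_sketch *tfm;
};

/* Only this global pointer is swapped by writers, so only it needs
 * an ordered load (the userspace analogue of rcu_dereference()). */
static _Atomic(struct fastopen_ctx_sketch *) published_ctx;

/* Writer side: make a fully initialized context visible to readers. */
static void publish_ctx(struct fastopen_ctx_sketch *ctx)
{
    atomic_store_explicit(&published_ctx, ctx, memory_order_release);
}

/* Reader side: one ordered load of the context pointer, then plain
 * accesses through it. Returns -1 when no context is published. */
static int current_key_id(void)
{
    struct fastopen_ctx_sketch *ctx =
        atomic_load_explicit(&published_ctx, memory_order_acquire);

    if (ctx == NULL)
        return -1;
    assert(ctx->tfm != NULL);
    return ctx->tfm->key_id;  /* plain dereference, no second ordered load */
}
```

The sketch deliberately leaves out grace periods and reclamation of old contexts, which real RCU handles and which this analogy does not model.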
    • packet: nlmon: virtual netlink monitoring device for packet sockets · e4fc408e
      Daniel Borkmann committed
      Currently, there is no good way to debug netlink traffic that
      is being exchanged between kernel and user space. Therefore, this patch
      implements a netlink virtual device, so that netlink messages will be
      made visible to PF_PACKET sockets. Once there was an approach with a
      similar idea [1], but it got forgotten somehow.
      
      I think it makes most sense to accept the "overhead" of an extra netlink
      net device over implementing the same functionality from PF_PACKET
      sockets once again into netlink sockets. We have BPF filters that can
      already be easily applied which even have netlink extensions, we have
      RX_RING zero-copy between kernel- and user space that can be reused,
      and many more features. So instead of re-implementing all of this, we
      simply pass the skb to a given PF_PACKET socket for further analysis.
      
      Another nice benefit that comes from that is that no code needs to be
      changed in user space packet analyzers (maybe adding a dissector, but
      not more), thus out of the box, we can already capture pcap files of
      netlink traffic to debug/troubleshoot netlink problems.
      
      Thanks also go to Thomas Graf, Flavio Leitner, Jesper Dangaard Brouer.
      
       [1] http://marc.info/?l=linux-netdev&m=113813401516110

      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
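Once a frame captured through such a monitoring device reaches user space, a dissector has to walk the netlink messages it contains. The following is a minimal sketch of that walk using the standard `NLMSG_OK` and `NLMSG_NEXT` macros from `<linux/netlink.h>`. The helper name `count_netlink_messages` is hypothetical, and the sketch assumes the buffer is suitably aligned and begins at the first `struct nlmsghdr` (any capture-format header has already been stripped).

```c
#include <assert.h>
#include <stddef.h>
#include <linux/netlink.h>

/* Count the well-formed netlink messages in buf (hypothetical helper).
 * Assumes buf is suitably aligned and starts at a struct nlmsghdr. */
static int count_netlink_messages(const void *buf, size_t buf_len)
{
    const struct nlmsghdr *nlh = buf;
    int remaining = (int)buf_len;  /* NLMSG_NEXT updates this in place */
    int count = 0;

    assert(buf != NULL);
    /* NLMSG_OK checks that a whole header and its claimed length fit. */
    while (NLMSG_OK(nlh, remaining)) {
        count++;
        nlh = NLMSG_NEXT(nlh, remaining);
    }
    return count;
}
```

A real dissector would inspect `nlmsg_type` and the payload of each message inside the loop instead of only counting.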
    • net: netlink: virtual tap device management · bcbde0d4
      Daniel Borkmann committed
      Similarly to the networking receive path with ptype_all taps, we add
      the possibility to register netdevices of type ARPHRD_NETLINK with
      the netlink subsystem, so that those can be used for netlink analyzers
      or debuggers. We do not offer a direct callback function as out-of-tree
      modules could do crap with it. Instead, a netdevice must be registered
      properly and only receives a clone, managed by the netlink layer. Symbols
      are exported as GPL-only.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: if_arp: add ARPHRD_NETLINK type · 77e2af03
      Daniel Borkmann committed
      This small patch adds the definition of ARPHRD_NETLINK, which can, for
      example, be used by netlink monitoring devices as the device type, so that
      sockaddr_ll can pick it up and the correct packet dissector can be chosen
      based on it.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
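As an illustration of how an analyzer might use the new type, here is a minimal sketch that checks the hardware type reported in a `struct sockaddr_ll`, as filled in by recvfrom() on a PF_PACKET socket. The helper name `frame_is_netlink` is hypothetical, and the fallback define is only there for headers older than this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/if_arp.h>
#include <linux/if_packet.h>

#ifndef ARPHRD_NETLINK
#define ARPHRD_NETLINK 824  /* fallback for headers that predate the definition */
#endif

/* Hypothetical helper: decide from the link-layer address of a received
 * frame whether it should be handed to a netlink dissector. */
static int frame_is_netlink(const struct sockaddr_ll *sll)
{
    assert(sll != NULL);
    return sll->sll_hatype == ARPHRD_NETLINK;
}
```

An analyzer would typically switch on `sll_hatype` and fall back to its Ethernet dissector for `ARPHRD_ETHER`.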
    • net: Restore unintentional reverts. · d3c5f47e
      David S. Miller committed
      This restores commits:
      
      c573972c
      1a590434
      da2e2c21
      
      which initially accidentally went into 'net', were
      reverted there, and then properly placed into 'net-next'.
      But the next net --> net-next merge accidentally wiped them
      out again.
      Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 24 Jun, 2013 26 commits
  3. 22 Jun, 2013 1 commit
  4. 20 Jun, 2013 8 commits
    • ndisc: Convert use of typedef ctl_table to struct ctl_table · fedaf4ff
      Joe Perches committed
      This typedef is unnecessary and should just be removed.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Convert use of typedef ctl_table to struct ctl_table · 9e8cda3b
      Joe Perches committed
      This typedef is unnecessary and should just be removed.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: frag , remove an empty ifdef. · af92e542
      Rami Rosen committed
      This patch removes an empty ifdef from inet_frag_intern()
      in net/ipv4/inet_fragment.c.
      
      Commit b67bfe0d (hlist: drop the node parameter from iterators) removed
      hlist from net/ipv4/inet_fragment.c, but did not remove the enclosing ifdef
      directive, which is now empty.
      Signed-off-by: Rami Rosen <ramirose@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • htb: refactor struct htb_sched fields for performance · c9364636
      Eric Dumazet committed
      htb_sched structures are big, and a source of false sharing on SMP.
      
      Every time a packet is queued or dequeued, many cache lines must be
      touched because the structures are not laid out properly.
      
      By carefully splitting htb_sched into two parts and defining sub-structures
      to increase data locality, we can improve performance dramatically on
      SMP.
      
      New htb_prio structure can also be used in htb_class to increase data
      locality.
      
      I got a 26% performance increase on a 24-thread machine, with 200
      concurrent netperf instances in TCP_RR mode, using an HTB hierarchy of 4 classes.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
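The general idea of grouping fields by access pattern can be sketched in plain C11. This is an illustration only, not the real `struct htb_sched` layout: the struct, its field names, and the 64-byte cache line size are all assumptions made for the example. Read-mostly configuration sits at the start, and the fields written on every enqueue or dequeue begin on their own cache line.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SKETCH 64  /* assumed cache line size */

/* Hypothetical scheduler state, not the real struct htb_sched. */
struct sched_sketch {
    /* Read-mostly configuration, set up at init time. */
    unsigned int default_class;
    unsigned int rate_to_quantum;

    /* Fast-path state, written on every enqueue/dequeue. It starts on
     * its own cache line so it does not share one with the fields above. */
    _Alignas(CACHE_LINE_SKETCH) unsigned long long now;
    unsigned int row_mask;
    unsigned int direct_pkts;
};

/* Compile-time check that the hot group really starts a cache line. */
static_assert(offsetof(struct sched_sketch, now) % CACHE_LINE_SKETCH == 0,
              "hot fields must start on a cache line boundary");

/* Return nonzero when two member offsets fall in the same cache line. */
static int same_cache_line(size_t off_a, size_t off_b)
{
    return off_a / CACHE_LINE_SKETCH == off_b / CACHE_LINE_SKETCH;
}
```

The trade-off is a larger structure due to padding, which is usually acceptable for a per-qdisc object that exists in small numbers.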
    • tcp: introduce a per-route knob for quick ack · bcefe17c
      Cong Wang committed
      In previous discussions, I tried to find some reasonable heuristics
      for delayed ACK; however, this does not seem possible, according to Eric:
      
      	"ACKS might also be delayed because of bidirectional
      	traffic, and is more controlled by the application
      	response time. TCP stack can not easily estimate it."
      
      	"ACK can be incredibly useful to recover from losses in
      	a short time.
      
      	The vast majority of TCP sessions are small lived, and we
      	send one ACK per received segment anyway at beginning or
      	retransmits to let the sender smoothly increase its cwnd,
      	so an auto-tuning facility wont help them that much."
      
      and according to David:
      
      	"ACKs are the only information we have to detect loss.
      
      	And, for the same reasons that TCP VEGAS is fundamentally
      	broken, we cannot measure the pipe or some other
      	receiver-side-visible piece of information to determine
      	when it's "safe" to stretch ACK.
      
      	And even if it's "safe", we should not do it so that losses are
      	accurately detected and we don't spuriously retransmit.
      
      	The only way to know when the bandwidth increases is to
      	"test" it, by sending more and more packets until drops happen.
      	That's why all successful congestion control algorithms must
      	operate on explicited tested pieces of information.
      
      	Similarly, it's not really possible to universally know if
      	it's safe to stretch ACK or not."
      
      It still makes sense to enable or disable quick ack mode like
      what TCP_QUICKACK does.
      
      This knob is similar to the TCP_QUICKACK option, but is meant for people
      who can't modify the source code and still want to control
      TCP delayed ACK behavior. As David suggested, this should belong
      to per-path scope, since different paths may want different
      behaviors.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Rick Jones <rick.jones2@hp.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
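For contrast with the per-route knob, this is roughly what the per-socket alternative looks like for an application that can change its source code. It is a minimal sketch: the helper name `enable_quickack` is hypothetical, while `TCP_QUICKACK` is the Linux socket option from `<netinet/tcp.h>`.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: request quick ACK mode on one TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int enable_quickack(int fd)
{
    int one = 1;

    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &one, sizeof(one));
}
```

On Linux this flag is not permanent, so applications that rely on it generally set it again after receiving data. A per-route setting avoids both the source change and the repeated calls.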
    • 2c0740e4
    • bnx2: use pdev->pm_cap instead of pci_find_capability(.., PCI_CAP_ID_PM) · 85768271
      Yijing Wang committed
      The PCI core already saves the PM capability register offset in pdev->pm_cap
      during pci_pm_init() in the init path, so we can use pdev->pm_cap instead of
      pci_find_capability(pdev, PCI_CAP_ID_PM) for better performance and simpler code.
      Signed-off-by: Yijing Wang <wangyijing@huawei.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • amd8111e: use pdev->pm_cap instead of pci_find_capability(.., PCI_CAP_ID_PM) · f9c7da5e
      Yijing Wang committed
      The PCI core already saves the PM capability register offset in pdev->pm_cap
      during pci_pm_init() in the init path, so we can use pdev->pm_cap instead of
      pci_find_capability(pdev, PCI_CAP_ID_PM) for better performance and simpler code.
      Signed-off-by: Yijing Wang <wangyijing@huawei.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
      Signed-off-by: David S. Miller <davem@davemloft.net>
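The reasoning behind the two pm_cap changes can be sketched with a small userspace model. This is not the kernel implementation: `pci_dev_sketch`, `find_capability_sketch` and `init_pm_sketch` are hypothetical names, and the config space is a plain byte array. The constants 0x34 and 0x01 mirror the standard PCI capability list pointer offset and the power management capability ID. The point is that the list walk happens once at init, and later users read the cached offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_LIST_PTR_SKETCH 0x34  /* same value as PCI_CAPABILITY_LIST */
#define CAP_ID_PM_SKETCH    0x01  /* same value as PCI_CAP_ID_PM */

/* Hypothetical device model: 256 bytes of config space plus a cached offset. */
struct pci_dev_sketch {
    uint8_t config[256];
    uint8_t pm_cap;  /* filled in once at init, 0 if absent */
};

/* Walk the capability list; each entry starts with {id, next pointer}.
 * Returns the offset of the matching capability, or 0 if not found. */
static uint8_t find_capability_sketch(const struct pci_dev_sketch *dev, uint8_t id)
{
    uint8_t pos = dev->config[CAP_LIST_PTR_SKETCH];
    int ttl = 48;  /* bound the walk in case the list is corrupt */

    while (pos >= 0x40 && ttl-- > 0) {
        if (dev->config[pos] == id)
            return pos;
        pos = dev->config[(uint8_t)(pos + 1)];
    }
    return 0;
}

/* Init path: do the walk once and remember the result. */
static void init_pm_sketch(struct pci_dev_sketch *dev)
{
    assert(dev != NULL);
    dev->pm_cap = find_capability_sketch(dev, CAP_ID_PM_SKETCH);
}
```

After `init_pm_sketch()` has run, a driver in this model reads `dev->pm_cap` directly instead of repeating the walk, which is the substitution both commits make.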