1. 11 7月, 2012 1 次提交
  2. 05 7月, 2012 5 次提交
  3. 30 6月, 2012 1 次提交
    • P
      netlink: add netlink_kernel_cfg parameter to netlink_kernel_create · a31f2d17
      Pablo Neira Ayuso 提交于
      This patch adds the following structure:
      
      struct netlink_kernel_cfg {
              unsigned int    groups;
              void            (*input)(struct sk_buff *skb);
              struct mutex    *cb_mutex;
      };
      
      That can be passed to netlink_kernel_create to set optional configurations
      for netlink kernel sockets.
      
      I've populated this structure by looking for NULL and zero parameters at the
      existing code. The remaining parameters that always need to be set are still
      left in the original interface.
      
      That includes optional parameters for the netlink socket creation. This allows
      easy extensibility of this interface in the future.
      
      This patch also adapts all callers to use this new interface.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a31f2d17
  4. 29 6月, 2012 2 次提交
  5. 28 6月, 2012 3 次提交
  6. 20 6月, 2012 1 次提交
    • D
      ipv4: Early TCP socket demux. · 41063e9d
      David S. Miller 提交于
      Input packet processing for local sockets involves two major demuxes.
      One for the route and one for the socket.
      
      But we can optimize this down to one demux for certain kinds of local
      sockets.
      
      Currently we only do this for established TCP sockets, but it could
      at least in theory be expanded to other kinds of connections.
      
      If a TCP socket is established then it's identity is fully specified.
      
      This means that whatever input route was used during the three-way
      handshake must work equally well for the rest of the connection since
      the keys will not change.
      
      Once we move to established state, we cache the receive packet's input
      route to use later.
      
      Like the existing cached route in sk->sk_dst_cache used for output
      packets, we have to check for route invalidations using dst->obsolete
      and dst->ops->check().
      
      Early demux occurs outside of a socket locked section, so when a route
      invalidation occurs we defer the fixup of sk->sk_rx_dst until we are
      actually inside of established state packet processing and thus have
      the socket locked.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      41063e9d
  7. 16 6月, 2012 1 次提交
  8. 14 6月, 2012 2 次提交
    • E
      netpoll: fix netpoll_send_udp() bugs · 954fba02
      Eric Dumazet 提交于
      Bogdan Hamciuc diagnosed and fixed following bug in netpoll_send_udp() :
      
      "skb->len += len;" instead of "skb_put(skb, len);"
      
      Meaning that _if_ a network driver needs to call skb_realloc_headroom(),
      only packet headers would be copied, leaving garbage in the payload.
      
      However the skb_realloc_headroom() must be avoided as much as possible
      since it requires memory and netpoll tries hard to work even if memory
      is exhausted (using a pool of preallocated skbs)
      
      It appears netpoll_send_udp() reserved 16 bytes for the ethernet header,
      which happens to work for typicall drivers but not all.
      
      Right thing is to use LL_RESERVED_SPACE(dev)
      (And also add dev->needed_tailroom of tailroom)
      
      This patch combines both fixes.
      
      Many thanks to Bogdan for raising this issue.
      Reported-by: NBogdan Hamciuc <bogdan.hamciuc@freescale.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Tested-by: NBogdan Hamciuc <bogdan.hamciuc@freescale.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: NNeil Horman <nhorman@tuxdriver.com>
      Reviewed-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      954fba02
    • E
      splice: fix racy pipe->buffers uses · 047fe360
      Eric Dumazet 提交于
      Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
      by splice_shrink_spd() called from vmsplice_to_pipe()
      
      commit 35f3d14d (pipe: add support for shrinking and growing pipes)
      added capability to adjust pipe->buffers.
      
      Problem is some paths don't hold pipe mutex and assume pipe->buffers
      doesn't change for their duration.
      
      Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
      use it in place of pipe->buffers where appropriate.
      
      splice_shrink_spd() loses its struct pipe_inode_info argument.
      Reported-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Tom Herbert <therbert@google.com>
      Cc: stable <stable@vger.kernel.org> # 2.6.35
      Tested-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      047fe360
  9. 13 6月, 2012 2 次提交
    • B
      ethtool: Make more commands available to unprivileged processes · 2da45db2
      Ben Hutchings 提交于
      'Get' commands should generally not require CAP_NET_ADMIN, with
      the exception of those that expose internal state.
      Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2da45db2
    • M
      net-next: add dev_loopback_xmit() to avoid duplicate code · 95603e22
      Michel Machado 提交于
      Add dev_loopback_xmit() in order to deduplicate functions
      ip_dev_loopback_xmit() (in net/ipv4/ip_output.c) and
      ip6_dev_loopback_xmit() (in net/ipv6/ip6_output.c).
      
      I was about to reinvent the wheel when I noticed that
      ip_dev_loopback_xmit() and ip6_dev_loopback_xmit() do exactly what I
      need and are not IP-only functions, but they were not available to reuse
      elsewhere.
      
      ip6_dev_loopback_xmit() does not have line "skb_dst_force(skb);", but I
      understand that this is harmless, and should be in dev_loopback_xmit().
      Signed-off-by: NMichel Machado <michel@digirati.com.br>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      CC: James Morris <jmorris@namei.org>
      CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Jiri Pirko <jpirko@redhat.com>
      CC: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      95603e22
  10. 09 6月, 2012 1 次提交
  11. 08 6月, 2012 3 次提交
  12. 06 6月, 2012 1 次提交
  13. 04 6月, 2012 2 次提交
    • E
      drop_monitor: dont sleep in atomic context · bec4596b
      Eric Dumazet 提交于
      drop_monitor calls several sleeping functions while in atomic context.
      
       BUG: sleeping function called from invalid context at mm/slub.c:943
       in_atomic(): 1, irqs_disabled(): 0, pid: 2103, name: kworker/0:2
       Pid: 2103, comm: kworker/0:2 Not tainted 3.5.0-rc1+ #55
       Call Trace:
        [<ffffffff810697ca>] __might_sleep+0xca/0xf0
        [<ffffffff811345a3>] kmem_cache_alloc_node+0x1b3/0x1c0
        [<ffffffff8105578c>] ? queue_delayed_work_on+0x11c/0x130
        [<ffffffff815343fb>] __alloc_skb+0x4b/0x230
        [<ffffffffa00b0360>] ? reset_per_cpu_data+0x160/0x160 [drop_monitor]
        [<ffffffffa00b022f>] reset_per_cpu_data+0x2f/0x160 [drop_monitor]
        [<ffffffffa00b03ab>] send_dm_alert+0x4b/0xb0 [drop_monitor]
        [<ffffffff810568e0>] process_one_work+0x130/0x4c0
        [<ffffffff81058249>] worker_thread+0x159/0x360
        [<ffffffff810580f0>] ? manage_workers.isra.27+0x240/0x240
        [<ffffffff8105d403>] kthread+0x93/0xa0
        [<ffffffff816be6d4>] kernel_thread_helper+0x4/0x10
        [<ffffffff8105d370>] ? kthread_freezable_should_stop+0x80/0x80
        [<ffffffff816be6d0>] ? gs_change+0xb/0xb
      
      Rework the logic to call the sleeping functions in right context.
      
      Use standard timer/workqueue api to let system chose any cpu to perform
      the allocation and netlink send.
      
      Also avoid a loop if reset_per_cpu_data() cannot allocate memory :
      use mod_timer() to wait 1/10 second before next try.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bec4596b
    • E
      sock_diag: add SK_MEMINFO_BACKLOG · d594e987
      Eric Dumazet 提交于
      Adding socket backlog len in INET_DIAG_SKMEMINFO is really useful to
      diagnose various TCP problems.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d594e987
  14. 01 6月, 2012 1 次提交
  15. 30 5月, 2012 1 次提交
  16. 20 5月, 2012 1 次提交
    • E
      net: introduce skb_try_coalesce() · bad43ca8
      Eric Dumazet 提交于
      Move tcp_try_coalesce() protocol independent part to
      skb_try_coalesce().
      
      skb_try_coalesce() can be used in IPv4 defrag and IPv6 reassembly,
      to build optimized skbs (less sk_buff, and possibly less 'headers')
      
      skb_try_coalesce() is zero copy, unless the copy can fit in destination
      header (its a rare case)
      
      kfree_skb_partial() is also moved to net/core/skbuff.c and exported,
      because IPv6 will need it in patch (ipv6: use skb coalescing in
      reassembly).
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bad43ca8
  17. 19 5月, 2012 3 次提交
  18. 18 5月, 2012 2 次提交
    • N
      drop_monitor: convert to modular building · cad456d5
      Neil Horman 提交于
      When I first wrote drop monitor I wrote it to just build monolithically.  There
      is no reason it can't be built modularly as well, so lets give it that
      flexibiity.
      
      I've tested this by building it as both a module and monolithically, and it
      seems to work quite well
      
      Change notes:
      
      v2)
      * fixed for_each_present_cpu loops to be more correct as per Eric D.
      * Converted exit path failures to BUG_ON as per Ben H.
      
      v3)
      * Converted del_timer to del_timer_sync to close race noted by Ben H.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Reviewed-by: NBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cad456d5
    • E
      net: netdev_alloc_skb() use build_skb() · a1c7fff7
      Eric Dumazet 提交于
      netdev_alloc_skb() is used by networks driver in their RX path to
      allocate an skb to receive an incoming frame.
      
      With recent skb->head_frag infrastructure, it makes sense to change
      netdev_alloc_skb() to use build_skb() and a frag allocator.
      
      This permits a zero copy splice(socket->pipe), and better GRO or TCP
      coalescing.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a1c7fff7
  19. 17 5月, 2012 3 次提交
  20. 16 5月, 2012 2 次提交
  21. 11 5月, 2012 2 次提交
    • E
      pktgen: fix crash at module unload · c57b5468
      Eric Dumazet 提交于
      commit 7d3d43da (net: In unregister_netdevice_notifier unregister
      the netdevices.) makes pktgen crashing at module unload.
      
      [  296.820578] BUG: spinlock bad magic on CPU#6, rmmod/3267
      [  296.820719]  lock: ffff880310c38000, .magic: ffff8803, .owner: <none>/-1, .owner_cpu: -1
      [  296.820943] Pid: 3267, comm: rmmod Not tainted 3.4.0-rc5+ #254
      [  296.821079] Call Trace:
      [  296.821211]  [<ffffffff8168a715>] spin_dump+0x8a/0x8f
      [  296.821345]  [<ffffffff8168a73b>] spin_bug+0x21/0x26
      [  296.821507]  [<ffffffff812b4741>] do_raw_spin_lock+0x131/0x140
      [  296.821648]  [<ffffffff8169188e>] _raw_spin_lock+0x1e/0x20
      [  296.821786]  [<ffffffffa00cc0fd>] __pktgen_NN_threads+0x4d/0x140 [pktgen]
      [  296.821928]  [<ffffffffa00ccf8d>] pktgen_device_event+0x10d/0x1e0 [pktgen]
      [  296.822073]  [<ffffffff8154ed4f>] unregister_netdevice_notifier+0x7f/0x100
      [  296.822216]  [<ffffffffa00d2a0b>] pg_cleanup+0x48/0x73 [pktgen]
      [  296.822357]  [<ffffffff8109528e>] sys_delete_module+0x17e/0x2a0
      [  296.822502]  [<ffffffff81699652>] system_call_fastpath+0x16/0x1b
      
      Hold the pktgen_thread_lock while splicing pktgen_threads, and test
      pktgen_exiting in pktgen_device_event() to make unload faster.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c57b5468
    • D
      Revert "net: maintain namespace isolation between vlan and real device" · 59b9997b
      David S. Miller 提交于
      This reverts commit 8a83a00b.
      
      It causes regressions for S390 devices, because it does an
      unconditional DST drop on SKBs for vlans and the QETH device
      needs the neighbour entry hung off the DST for certain things
      on transmit.
      
      Arnd can't remember exactly why he even needed this change.
      
      Conflicts:
      
      	drivers/net/macvlan.c
      	net/8021q/vlan_dev.c
      	net/core/dev.c
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59b9997b