1. 20 4月, 2013 1 次提交
  2. 17 4月, 2013 1 次提交
  3. 16 4月, 2013 1 次提交
    • V
      net: add dev_uc_sync_multiple() and dev_mc_sync_multiple() api · 4cd729b0
      Vlad Yasevich 提交于
      The current implementation of dev_uc_sync/unsync() assumes that there is
      a strict 1-to-1 relationship between the source and destination of the sync.
      In other words, once an address has been synced to a destination device, it
      will not be synced to any other device through the sync API.
      However, there are some virtual devices that aggreate a number of lower
      devices and need to sync addresses to all of them.  The current
      API falls short there.
      
      This patch introduces a new dev_uc_sync_multiple() api that can be called
      in the above circumstances and allows sync to work for every invocation.
      
      CC: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4cd729b0
  4. 12 4月, 2013 1 次提交
  5. 11 4月, 2013 1 次提交
  6. 10 4月, 2013 2 次提交
  7. 08 4月, 2013 1 次提交
  8. 06 4月, 2013 1 次提交
    • P
      netfilter: don't reset nf_trace in nf_reset() · 124dff01
      Patrick McHardy 提交于
      Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code
      to reset nf_trace in nf_reset(). This is wrong and unnecessary.
      
      nf_reset() is used in the following cases:
      
      - when passing packets up the the socket layer, at which point we want to
        release all netfilter references that might keep modules pinned while
        the packet is queued. nf_trace doesn't matter anymore at this point.
      
      - when encapsulating or decapsulating IPsec packets. We want to continue
        tracing these packets after IPsec processing.
      
      - when passing packets through virtual network devices. Only devices on
        that encapsulate in IPv4/v6 matter since otherwise nf_trace is not
        used anymore. Its not entirely clear whether those packets should
        be traced after that, however we've always done that.
      
      - when passing packets through virtual network devices that make the
        packet cross network namespace boundaries. This is the only cases
        where we clearly want to reset nf_trace and is also what the
        original patch intended to fix.
      
      Add a new function nf_reset_trace() and use it in dev_forward_skb() to
      fix this properly.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      124dff01
  9. 05 4月, 2013 1 次提交
    • V
      net: count hw_addr syncs so that unsync works properly. · 4543fbef
      Vlad Yasevich 提交于
      A few drivers use dev_uc_sync/unsync to synchronize the
      address lists from master down to slave/lower devices.  In
      some cases (bond/team) a single address list is synched down
      to multiple devices.  At the time of unsync, we have a leak
      in these lower devices, because "synced" is treated as a
      boolean and the address will not be unsynced for anything after
      the first device/call.
      
      Treat "synced" as a count (same as refcount) and allow all
      unsync calls to work.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4543fbef
  10. 03 4月, 2013 1 次提交
  11. 02 4月, 2013 1 次提交
  12. 01 4月, 2013 1 次提交
    • K
      net: add option to enable error queue packets waking select · 7d4c04fc
      Keller, Jacob E 提交于
      Currently, when a socket receives something on the error queue it only wakes up
      the socket on select if it is in the "read" list, that is the socket has
      something to read. It is useful also to wake the socket if it is in the error
      list, which would enable software to wait on error queue packets without waking
      up for regular data on the socket. The main use case is for receiving
      timestamped transmit packets which return the timestamp to the socket via the
      error queue. This enables an application to select on the socket for the error
      queue only instead of for the regular traffic.
      
      -v2-
      * Added the SO_SELECT_ERR_QUEUE socket option to every architechture specific file
      * Modified every socket poll function that checks error queue
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Cc: Jeffrey Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Matthew Vick <matthew.vick@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7d4c04fc
  13. 30 3月, 2013 5 次提交
    • E
      net: add a synchronize_net() in netdev_rx_handler_unregister() · 00cfec37
      Eric Dumazet 提交于
      commit 35d48903 (bonding: fix rx_handler locking) added a race
      in bonding driver, reported by Steven Rostedt who did a very good
      diagnosis :
      
      <quoting Steven>
      
      I'm currently debugging a crash in an old 3.0-rt kernel that one of our
      customers is seeing. The bug happens with a stress test that loads and
      unloads the bonding module in a loop (I don't know all the details as
      I'm not the one that is directly interacting with the customer). But the
      bug looks to be something that may still be present and possibly present
      in mainline too. It will just be much harder to trigger it in mainline.
      
      In -rt, interrupts are threads, and can schedule in and out just like
      any other thread. Note, mainline now supports interrupt threads so this
      may be easily reproducible in mainline as well. I don't have the ability
      to tell the customer to try mainline or other kernels, so my hands are
      somewhat tied to what I can do.
      
      But according to a core dump, I tracked down that the eth irq thread
      crashed in bond_handle_frame() here:
      
              slave = bond_slave_get_rcu(skb->dev);
              bond = slave->bond; <--- BUG
      
      the slave returned was NULL and accessing slave->bond caused a NULL
      pointer dereference.
      
      Looking at the code that unregisters the handler:
      
      void netdev_rx_handler_unregister(struct net_device *dev)
      {
      
              ASSERT_RTNL();
              RCU_INIT_POINTER(dev->rx_handler, NULL);
              RCU_INIT_POINTER(dev->rx_handler_data, NULL);
      }
      
      Which is basically:
              dev->rx_handler = NULL;
              dev->rx_handler_data = NULL;
      
      And looking at __netif_receive_skb() we have:
      
              rx_handler = rcu_dereference(skb->dev->rx_handler);
              if (rx_handler) {
                      if (pt_prev) {
                              ret = deliver_skb(skb, pt_prev, orig_dev);
                              pt_prev = NULL;
                      }
                      switch (rx_handler(&skb)) {
      
      My question to all of you is, what stops this interrupt from happening
      while the bonding module is unloading?  What happens if the interrupt
      triggers and we have this:
      
              CPU0                    CPU1
              ----                    ----
        rx_handler = skb->dev->rx_handler
      
                              netdev_rx_handler_unregister() {
                                 dev->rx_handler = NULL;
                                 dev->rx_handler_data = NULL;
      
        rx_handler()
         bond_handle_frame() {
          slave = skb->dev->rx_handler;
          bond = slave->bond; <-- NULL pointer dereference!!!
      
      What protection am I missing in the bond release handler that would
      prevent the above from happening?
      
      </quoting Steven>
      
      We can fix bug this in two ways. First is adding a test in
      bond_handle_frame() and others to check if rx_handler_data is NULL.
      
      A second way is adding a synchronize_net() in
      netdev_rx_handler_unregister() to make sure that a rcu protected reader
      has the guarantee to see a non NULL rx_handler_data.
      
      The second way is better as it avoids an extra test in fast path.
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Jiri Pirko <jpirko@redhat.com>
      Cc: Paul E. McKenney <paulmck@us.ibm.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      00cfec37
    • J
      net: rtnetlink: fdb dflt dump must set idx used for cb->arg[0] · 91f3e7b1
      John Fastabend 提交于
      In rtnl_fdb_dump() when the fdb_dump ndo op is not populated
      we never set the idx value so that cb->arg[0] is always 0.
      Resulting in a endless loop of messages.
      
      Introduced with this commit,
      
      commit 090096bf
      Author: Vlad Yasevich <vyasevic@redhat.com>
      Date:   Wed Mar 6 15:39:42 2013 +0000
      
          net: generic fdb support for drivers without ndo_fdb_<op>
      
      CC: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91f3e7b1
    • S
      net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb' · a561cf7e
      Shmulik Ladkani 提交于
      'nf_reset' is called just prior calling 'netif_rx'.
      No need to call it twice.
      Reported-by: NIgor Michailov <rgohita@gmail.com>
      Signed-off-by: NShmulik Ladkani <shmulik.ladkani@gmail.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a561cf7e
    • L
      net: simplify the getting percpu of flow_cache · 27815032
      Li RongQing 提交于
      replace per_cpu with per_cpu_ptr to save conversion between address and pointer
      Signed-off-by: NLi RongQing <roy.qing.li@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      27815032
    • L
      net: fix the use of this_cpu_ptr · 50eab050
      Li RongQing 提交于
      flush_tasklet is not percpu var, and percpu is percpu var, and
      	this_cpu_ptr(&info->cache->percpu->flush_tasklet)
      is not equal to
      	&this_cpu_ptr(info->cache->percpu)->flush_tasklet
      
      1f743b07(use this_cpu_ptr per-cpu helper) introduced this bug.
      Signed-off-by: NLi RongQing <roy.qing.li@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      50eab050
  14. 29 3月, 2013 1 次提交
  15. 28 3月, 2013 3 次提交
  16. 27 3月, 2013 1 次提交
  17. 26 3月, 2013 1 次提交
  18. 25 3月, 2013 2 次提交
    • E
      net: remove a WARN_ON() in net_enable_timestamp() · 9979a55a
      Eric Dumazet 提交于
      The WARN_ON(in_interrupt()) in net_enable_timestamp() can get false
      positive, in socket clone path, run from softirq context :
      
      [ 3641.624425] WARNING: at net/core/dev.c:1532 net_enable_timestamp+0x7b/0x80()
      [ 3641.668811] Call Trace:
      [ 3641.671254]  <IRQ>  [<ffffffff80286817>] warn_slowpath_common+0x87/0xc0
      [ 3641.677871]  [<ffffffff8028686a>] warn_slowpath_null+0x1a/0x20
      [ 3641.683683]  [<ffffffff80742f8b>] net_enable_timestamp+0x7b/0x80
      [ 3641.689668]  [<ffffffff80732ce5>] sk_clone_lock+0x425/0x450
      [ 3641.695222]  [<ffffffff8078db36>] inet_csk_clone_lock+0x16/0x170
      [ 3641.701213]  [<ffffffff807ae449>] tcp_create_openreq_child+0x29/0x820
      [ 3641.707663]  [<ffffffff807d62e2>] ? ipt_do_table+0x222/0x670
      [ 3641.713354]  [<ffffffff807aaf5b>] tcp_v4_syn_recv_sock+0xab/0x3d0
      [ 3641.719425]  [<ffffffff807af63a>] tcp_check_req+0x3da/0x530
      [ 3641.724979]  [<ffffffff8078b400>] ? inet_hashinfo_init+0x60/0x80
      [ 3641.730964]  [<ffffffff807ade6f>] ? tcp_v4_rcv+0x79f/0xbe0
      [ 3641.736430]  [<ffffffff807ab9bd>] tcp_v4_do_rcv+0x38d/0x4f0
      [ 3641.741985]  [<ffffffff807ae14a>] tcp_v4_rcv+0xa7a/0xbe0
      
      Its safe at this point because the parent socket owns a reference
      on the netstamp_needed, so we cant have a 0 -> 1 transition, which
      requires to lock a mutex.
      
      Instead of refining the check, lets remove it, as all known callers
      are safe. If it ever changes in the future, static_key_slow_inc()
      will complain anyway.
      Reported-by: NLaurent Chavey <chavey@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9979a55a
    • N
      ipv4: provide addr and netconf dump consistency info · 0465277f
      Nicolas Dichtel 提交于
      This patch takes benefit of dev_addr_genid and dev_base_seq to check if a change
      occurs during a netlink dump. If a change is detected, the flag NLM_F_DUMP_INTR
      is set in the first message after the dump was interrupted.
      
      Note that seq and prev_seq must be reset between each family in rtnl_dump_all()
      because they are specific to each family.
      Reported-by: NJunwei Zhang <junwei.zhang@6wind.com>
      Reported-by: NHongjun Li <hongjun.li@6wind.com>
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0465277f
  19. 22 3月, 2013 1 次提交
  20. 21 3月, 2013 5 次提交
  21. 19 3月, 2013 1 次提交
  22. 18 3月, 2013 3 次提交
  23. 17 3月, 2013 1 次提交
  24. 12 3月, 2013 2 次提交
  25. 10 3月, 2013 1 次提交