1. 08 4月, 2013 7 次提交
  2. 06 4月, 2013 1 次提交
    • P
      netfilter: don't reset nf_trace in nf_reset() · 124dff01
      Patrick McHardy 提交于
      Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code
      to reset nf_trace in nf_reset(). This is wrong and unnecessary.
      
      nf_reset() is used in the following cases:
      
      - when passing packets up the the socket layer, at which point we want to
        release all netfilter references that might keep modules pinned while
        the packet is queued. nf_trace doesn't matter anymore at this point.
      
      - when encapsulating or decapsulating IPsec packets. We want to continue
        tracing these packets after IPsec processing.
      
      - when passing packets through virtual network devices. Only devices on
        that encapsulate in IPv4/v6 matter since otherwise nf_trace is not
        used anymore. Its not entirely clear whether those packets should
        be traced after that, however we've always done that.
      
      - when passing packets through virtual network devices that make the
        packet cross network namespace boundaries. This is the only cases
        where we clearly want to reset nf_trace and is also what the
        original patch intended to fix.
      
      Add a new function nf_reset_trace() and use it in dev_forward_skb() to
      fix this properly.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      124dff01
  3. 05 4月, 2013 4 次提交
  4. 03 4月, 2013 4 次提交
    • M
      netfilter: ip6t_NPT: Fix translation for non-multiple of 32 prefix lengths · 906b1c39
      Matthias Schiffer 提交于
      The bitmask used for the prefix mangling was being calculated
      incorrectly, leading to the wrong part of the address being replaced
      when the prefix length wasn't a multiple of 32.
      Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      906b1c39
    • R
      VSOCK: Handle changes to the VMCI context ID. · 990454b5
      Reilly Grant 提交于
      The VMCI context ID of a virtual machine may change at any time. There
      is a VMCI event which signals this but datagrams may be processed before
      this is handled. It is therefore necessary to be flexible about the
      destination context ID of any datagrams received. (It can be assumed to
      be correct because it is provided by the hypervisor.) The context ID on
      existing sockets should be updated to reflect how the hypervisor is
      currently referring to the system.
      Signed-off-by: NReilly Grant <grantr@vmware.com>
      Acked-by: NAndy King <acking@vmware.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      990454b5
    • B
      net IPv6 : Fix broken IPv6 routing table after loopback down-up · 25fb6ca4
      Balakumaran Kannan 提交于
      IPv6 Routing table becomes broken once we do ifdown, ifup of the loopback(lo)
      interface. After down-up, routes of other interface's IPv6 addresses through
      'lo' are lost.
      
      IPv6 addresses assigned to all interfaces are routed through 'lo' for internal
      communication. Once 'lo' is down, those routing entries are removed from routing
      table. But those removed entries are not being re-created properly when 'lo' is
      brought up. So IPv6 addresses of other interfaces becomes unreachable from the
      same machine. Also this breaks communication with other machines because of
      NDISC packet processing failure.
      
      This patch fixes this issue by reading all interface's IPv6 addresses and adding
      them to IPv6 routing table while bringing up 'lo'.
      
      ==Testing==
      Before applying the patch:
      $ route -A inet6
      Kernel IPv6 routing table
      Destination                    Next Hop                   Flag Met Ref Use If
      2000::20/128                   ::                         U    256 0     0 eth0
      fe80::/64                      ::                         U    256 0     0 eth0
      ::/0                           ::                         !n   -1  1     1 lo
      ::1/128                        ::                         Un   0   1     0 lo
      2000::20/128                   ::                         Un   0   1     0 lo
      fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
      ff00::/8                       ::                         U    256 0     0 eth0
      ::/0                           ::                         !n   -1  1     1 lo
      $ sudo ifdown lo
      $ sudo ifup lo
      $ route -A inet6
      Kernel IPv6 routing table
      Destination                    Next Hop                   Flag Met Ref Use If
      2000::20/128                   ::                         U    256 0     0 eth0
      fe80::/64                      ::                         U    256 0     0 eth0
      ::/0                           ::                         !n   -1  1     1 lo
      ::1/128                        ::                         Un   0   1     0 lo
      ff00::/8                       ::                         U    256 0     0 eth0
      ::/0                           ::                         !n   -1  1     1 lo
      $
      
      After applying the patch:
      $ route -A inet6
      Kernel IPv6 routing
      table
      Destination                    Next Hop                   Flag Met Ref Use If
      2000::20/128                   ::                         U    256 0     0 eth0
      fe80::/64                      ::                         U    256 0     0 eth0
      ::/0                           ::                         !n   -1  1     1 lo
      ::1/128                        ::                         Un   0   1     0 lo
      2000::20/128                   ::                         Un   0   1     0 lo
      fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
      ff00::/8                       ::                         U    256 0     0 eth0
      ::/0                           ::                         !n   -1  1     1 lo
      $ sudo ifdown lo
      $ sudo ifup lo
      $ route -A inet6
      Kernel IPv6 routing table
      Destination                    Next Hop                   Flag Met Ref Use If
      2000::20/128                   ::                         U    256 0     0 eth0
      fe80::/64                      ::                         U    256 0     0 eth0
      ::/0                           ::                         !n   -1  1     1 lo
      ::1/128                        ::                         Un   0   1     0 lo
      2000::20/128                   ::                         Un   0   1     0 lo
      fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
      ff00::/8                       ::                         U    256 0     0 eth0
      ::/0                           ::                         !n   -1  1     1 lo
      $
      Signed-off-by: NBalakumaran Kannan <Balakumaran.Kannan@ap.sony.com>
      Signed-off-by: NMaruthi Thotad <Maruthi.Thotad@ap.sony.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      25fb6ca4
    • V
      cbq: incorrect processing of high limits · f0f6ee1f
      Vasily Averin 提交于
      currently cbq works incorrectly for limits > 10% real link bandwidth,
      and practically does not work for limits > 50% real link bandwidth.
      Below are results of experiments taken on 1 Gbit link
      
       In shaper | Actual Result
      -----------+---------------
        100M     | 108 Mbps
        200M     | 244 Mbps
        300M     | 412 Mbps
        500M     | 893 Mbps
      
      This happen because of q->now changes incorrectly in cbq_dequeue():
      when it is called before real end of packet transmitting,
      L2T is greater than real time delay, q_now gets an extra boost
      but never compensate it.
      
      To fix this problem we prevent change of q->now until its synchronization
      with real time.
      Signed-off-by: NVasily Averin <vvs@openvz.org>
      Reviewed-by: NAlexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f0f6ee1f
  5. 30 3月, 2013 5 次提交
    • E
      net: add a synchronize_net() in netdev_rx_handler_unregister() · 00cfec37
      Eric Dumazet 提交于
      commit 35d48903 (bonding: fix rx_handler locking) added a race
      in bonding driver, reported by Steven Rostedt who did a very good
      diagnosis :
      
      <quoting Steven>
      
      I'm currently debugging a crash in an old 3.0-rt kernel that one of our
      customers is seeing. The bug happens with a stress test that loads and
      unloads the bonding module in a loop (I don't know all the details as
      I'm not the one that is directly interacting with the customer). But the
      bug looks to be something that may still be present and possibly present
      in mainline too. It will just be much harder to trigger it in mainline.
      
      In -rt, interrupts are threads, and can schedule in and out just like
      any other thread. Note, mainline now supports interrupt threads so this
      may be easily reproducible in mainline as well. I don't have the ability
      to tell the customer to try mainline or other kernels, so my hands are
      somewhat tied to what I can do.
      
      But according to a core dump, I tracked down that the eth irq thread
      crashed in bond_handle_frame() here:
      
              slave = bond_slave_get_rcu(skb->dev);
              bond = slave->bond; <--- BUG
      
      the slave returned was NULL and accessing slave->bond caused a NULL
      pointer dereference.
      
      Looking at the code that unregisters the handler:
      
      void netdev_rx_handler_unregister(struct net_device *dev)
      {
      
              ASSERT_RTNL();
              RCU_INIT_POINTER(dev->rx_handler, NULL);
              RCU_INIT_POINTER(dev->rx_handler_data, NULL);
      }
      
      Which is basically:
              dev->rx_handler = NULL;
              dev->rx_handler_data = NULL;
      
      And looking at __netif_receive_skb() we have:
      
              rx_handler = rcu_dereference(skb->dev->rx_handler);
              if (rx_handler) {
                      if (pt_prev) {
                              ret = deliver_skb(skb, pt_prev, orig_dev);
                              pt_prev = NULL;
                      }
                      switch (rx_handler(&skb)) {
      
      My question to all of you is, what stops this interrupt from happening
      while the bonding module is unloading?  What happens if the interrupt
      triggers and we have this:
      
              CPU0                    CPU1
              ----                    ----
        rx_handler = skb->dev->rx_handler
      
                              netdev_rx_handler_unregister() {
                                 dev->rx_handler = NULL;
                                 dev->rx_handler_data = NULL;
      
        rx_handler()
         bond_handle_frame() {
          slave = skb->dev->rx_handler;
          bond = slave->bond; <-- NULL pointer dereference!!!
      
      What protection am I missing in the bond release handler that would
      prevent the above from happening?
      
      </quoting Steven>
      
      We can fix bug this in two ways. First is adding a test in
      bond_handle_frame() and others to check if rx_handler_data is NULL.
      
      A second way is adding a synchronize_net() in
      netdev_rx_handler_unregister() to make sure that a rcu protected reader
      has the guarantee to see a non NULL rx_handler_data.
      
      The second way is better as it avoids an extra test in fast path.
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Jiri Pirko <jpirko@redhat.com>
      Cc: Paul E. McKenney <paulmck@us.ibm.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      00cfec37
    • V
      net: fq_codel: Fix off-by-one error · cd68ddd4
      Vijay Subramanian 提交于
      Currently, we hold a max of sch->limit -1 number of packets instead of
      sch->limit packets. Fix this off-by-one error.
      Signed-off-by: NVijay Subramanian <subramanian.vijay@gmail.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cd68ddd4
    • S
      net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb' · a561cf7e
      Shmulik Ladkani 提交于
      'nf_reset' is called just prior calling 'netif_rx'.
      No need to call it twice.
      Reported-by: NIgor Michailov <rgohita@gmail.com>
      Signed-off-by: NShmulik Ladkani <shmulik.ladkani@gmail.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a561cf7e
    • L
      net: fix the use of this_cpu_ptr · 50eab050
      Li RongQing 提交于
      flush_tasklet is not percpu var, and percpu is percpu var, and
      	this_cpu_ptr(&info->cache->percpu->flush_tasklet)
      is not equal to
      	&this_cpu_ptr(info->cache->percpu)->flush_tasklet
      
      1f743b07(use this_cpu_ptr per-cpu helper) introduced this bug.
      Signed-off-by: NLi RongQing <roy.qing.li@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      50eab050
    • H
      ipv6: don't accept node local multicast traffic from the wire · 1c4a154e
      Hannes Frederic Sowa 提交于
      Erik Hugne's errata proposal (Errata ID: 3480) to RFC4291 has been
      verified: http://www.rfc-editor.org/errata_search.php?eid=3480
      
      We have to check for pkt_type and loopback flag because either the
      packets are allowed to travel over the loopback interface (in which case
      pkt_type is PACKET_HOST and IFF_LOOPBACK flag is set) or they travel
      over a non-loopback interface back to us (in which case PACKET_TYPE is
      PACKET_LOOPBACK and IFF_LOOPBACK flag is not set).
      
      Cc: Erik Hugne <erik.hugne@ericsson.com>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1c4a154e
  6. 28 3月, 2013 3 次提交
  7. 27 3月, 2013 2 次提交
  8. 26 3月, 2013 3 次提交
  9. 25 3月, 2013 10 次提交
  10. 24 3月, 2013 1 次提交