1. 28 2月, 2015 2 次提交
  2. 17 11月, 2014 1 次提交
    • D
      ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs · feb91a02
      Daniel Borkmann 提交于
      It has been reported that generating an MLD listener report on
      devices with large MTUs (e.g. 9000) and a high number of IPv6
      addresses can trigger a skb_over_panic():
      
      skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20
      head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0
      dev:port1
       ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:100!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: ixgbe(O)
      CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4
      [...]
      Call Trace:
       <IRQ>
       [<ffffffff80578226>] ? skb_put+0x3a/0x3b
       [<ffffffff80612a5d>] ? add_grhead+0x45/0x8e
       [<ffffffff80612e3a>] ? add_grec+0x394/0x3d4
       [<ffffffff80613222>] ? mld_ifc_timer_expire+0x195/0x20d
       [<ffffffff8061308d>] ? mld_dad_timer_expire+0x45/0x45
       [<ffffffff80255b5d>] ? call_timer_fn.isra.29+0x12/0x68
       [<ffffffff80255d16>] ? run_timer_softirq+0x163/0x182
       [<ffffffff80250e6f>] ? __do_softirq+0xe0/0x21d
       [<ffffffff8025112b>] ? irq_exit+0x4e/0xd3
       [<ffffffff802214bb>] ? smp_apic_timer_interrupt+0x3b/0x46
       [<ffffffff8063f10a>] ? apic_timer_interrupt+0x6a/0x70
      
      mld_newpack() skb allocations are usually requested with dev->mtu
      in size, since commit 72e09ad1 ("ipv6: avoid high order allocations")
      we have changed the limit in order to be less likely to fail.
      
      However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb)
      macros, which determine if we may end up doing an skb_put() for
      adding another record. To avoid possible fragmentation, we check
      the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong
      assumption as the actual max allocation size can be much smaller.
      
      The IGMP case doesn't have this issue as commit 57e1ab6e
      ("igmp: refine skb allocations") stores the allocation size in
      the cb[].
      
      Set a reserved_tailroom to make it fit into the MTU and use
      skb_availroom() helper instead. This also allows to get rid of
      igmp_skb_size().
      Reported-by: NWei Liu <lw1a2.jing@gmail.com>
      Fixes: 72e09ad1 ("ipv6: avoid high order allocations")
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: David L Stevens <david.stevens@oracle.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      feb91a02
  3. 06 11月, 2014 2 次提交
    • D
      ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs · 4c672e4b
      Daniel Borkmann 提交于
      It has been reported that generating an MLD listener report on
      devices with large MTUs (e.g. 9000) and a high number of IPv6
      addresses can trigger a skb_over_panic():
      
      skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20
      head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0
      dev:port1
       ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:100!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: ixgbe(O)
      CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4
      [...]
      Call Trace:
       <IRQ>
       [<ffffffff80578226>] ? skb_put+0x3a/0x3b
       [<ffffffff80612a5d>] ? add_grhead+0x45/0x8e
       [<ffffffff80612e3a>] ? add_grec+0x394/0x3d4
       [<ffffffff80613222>] ? mld_ifc_timer_expire+0x195/0x20d
       [<ffffffff8061308d>] ? mld_dad_timer_expire+0x45/0x45
       [<ffffffff80255b5d>] ? call_timer_fn.isra.29+0x12/0x68
       [<ffffffff80255d16>] ? run_timer_softirq+0x163/0x182
       [<ffffffff80250e6f>] ? __do_softirq+0xe0/0x21d
       [<ffffffff8025112b>] ? irq_exit+0x4e/0xd3
       [<ffffffff802214bb>] ? smp_apic_timer_interrupt+0x3b/0x46
       [<ffffffff8063f10a>] ? apic_timer_interrupt+0x6a/0x70
      
      mld_newpack() skb allocations are usually requested with dev->mtu
      in size, since commit 72e09ad1 ("ipv6: avoid high order allocations")
      we have changed the limit in order to be less likely to fail.
      
      However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb)
      macros, which determine if we may end up doing an skb_put() for
      adding another record. To avoid possible fragmentation, we check
      the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong
      assumption as the actual max allocation size can be much smaller.
      
      The IGMP case doesn't have this issue as commit 57e1ab6e
      ("igmp: refine skb allocations") stores the allocation size in
      the cb[].
      
      Set a reserved_tailroom to make it fit into the MTU and use
      skb_availroom() helper instead. This also allows to get rid of
      igmp_skb_size().
      Reported-by: NWei Liu <lw1a2.jing@gmail.com>
      Fixes: 72e09ad1 ("ipv6: avoid high order allocations")
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: David L Stevens <david.stevens@oracle.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c672e4b
    • J
      net: Convert SEQ_START_TOKEN/seq_printf to seq_puts · 1744bea1
      Joe Perches 提交于
      Using a single fixed string is smaller code size than using
      a format and many string arguments.
      
      Reduces overall code size a little.
      
      $ size net/ipv4/igmp.o* net/ipv6/mcast.o* net/ipv6/ip6_flowlabel.o*
         text	   data	    bss	    dec	    hex	filename
        34269	   7012	  14824	  56105	   db29	net/ipv4/igmp.o.new
        34315	   7012	  14824	  56151	   db57	net/ipv4/igmp.o.old
        30078	   7869	  13200	  51147	   c7cb	net/ipv6/mcast.o.new
        30105	   7869	  13200	  51174	   c7e6	net/ipv6/mcast.o.old
        11434	   3748	   8580	  23762	   5cd2	net/ipv6/ip6_flowlabel.o.new
        11491	   3748	   8580	  23819	   5d0b	net/ipv6/ip6_flowlabel.o.old
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1744bea1
  4. 23 9月, 2014 1 次提交
    • D
      ipv6: mld: answer mldv2 queries with mldv1 reports in mldv1 fallback · 35f7aa53
      Daniel Borkmann 提交于
      RFC2710 (MLDv1), section 3.7. says:
      
        The length of a received MLD message is computed by taking the
        IPv6 Payload Length value and subtracting the length of any IPv6
        extension headers present between the IPv6 header and the MLD
        message. If that length is greater than 24 octets, that indicates
        that there are other fields present *beyond* the fields described
        above, perhaps belonging to a *future backwards-compatible* version
        of MLD. An implementation of the version of MLD specified in this
        document *MUST NOT* send an MLD message longer than 24 octets and
        MUST ignore anything past the first 24 octets of a received MLD
        message.
      
      RFC3810 (MLDv2), section 8.2.1. states for *listeners* regarding
      presence of MLDv1 routers:
      
        In order to be compatible with MLDv1 routers, MLDv2 hosts MUST
        operate in version 1 compatibility mode. [...] When Host
        Compatibility Mode is MLDv2, a host acts using the MLDv2 protocol
        on that interface. When Host Compatibility Mode is MLDv1, a host
        acts in MLDv1 compatibility mode, using *only* the MLDv1 protocol,
        on that interface. [...]
      
      While section 8.3.1. specifies *router* behaviour regarding presence
      of MLDv1 routers:
      
        MLDv2 routers may be placed on a network where there is at least
        one MLDv1 router. The following requirements apply:
      
        If an MLDv1 router is present on the link, the Querier MUST use
        the *lowest* version of MLD present on the network. This must be
        administratively assured. Routers that desire to be compatible
        with MLDv1 MUST have a configuration option to act in MLDv1 mode;
        if an MLDv1 router is present on the link, the system administrator
        must explicitly configure all MLDv2 routers to act in MLDv1 mode.
        When in MLDv1 mode, the Querier MUST send periodic General Queries
        truncated at the Multicast Address field (i.e., 24 bytes long),
        and SHOULD also warn about receiving an MLDv2 Query (such warnings
        must be rate-limited). The Querier MUST also fill in the Maximum
        Response Delay in the Maximum Response Code field, i.e., the
        exponential algorithm described in section 5.1.3. is not used. [...]
      
      That means that we should not get queries from different versions of
      MLD. When there's a MLDv1 router present, MLDv2 enforces truncation
      and MRC == MRD (both fields are overlapping within the 24 octet range).
      
      Section 8.3.2. specifies behaviour in the presence of MLDv1 multicast
      address *listeners*:
      
        MLDv2 routers may be placed on a network where there are hosts
        that have not yet been upgraded to MLDv2. In order to be compatible
        with MLDv1 hosts, MLDv2 routers MUST operate in version 1 compatibility
        mode. MLDv2 routers keep a compatibility mode per multicast address
        record. The compatibility mode of a multicast address is determined
        from the Multicast Address Compatibility Mode variable, which can be
        in one of the two following states: MLDv1 or MLDv2.
      
        The Multicast Address Compatibility Mode of a multicast address
        record is set to MLDv1 whenever an MLDv1 Multicast Listener Report is
        *received* for that multicast address. At the same time, the Older
        Version Host Present timer for the multicast address is set to Older
        Version Host Present Timeout seconds. The timer is re-set whenever a
        new MLDv1 Report is received for that multicast address. If the Older
        Version Host Present timer expires, the router switches back to
        Multicast Address Compatibility Mode of MLDv2 for that multicast
        address. [...]
      
      That means, what can happen is the following scenario, that hosts can
      act in MLDv1 compatibility mode when they previously have received an
      MLDv1 query (or, simply operate in MLDv1 mode-only); and at the same
      time, an MLDv2 router could start up and transmits MLDv2 startup query
      messages while being unaware of the current operational mode.
      
      Given RFC2710, section 3.7 we would need to answer to that with an MLDv1
      listener report, so that the router according to RFC3810, section 8.3.2.
      would receive that and internally switch to MLDv1 compatibility as well.
      
      Right now, I believe since the initial implementation of MLDv2, Linux
      hosts would just silently drop such MLDv2 queries instead of replying
      with an MLDv1 listener report, which would prevent a MLDv2 router going
      into fallback mode (until it receives other MLDv1 queries).
      
      Since the mapping of MRC to MRD in exactly such cases can make use of
      the exponential algorithm from 5.1.3, we cannot [strictly speaking] be
      aware in MLDv1 of the encoding in MRC, it seems also not mentioned by
      the RFC. Since encodings are the same up to 32767, assume in such a
      situation this value as a hard upper limit we would clamp. We have asked
      one of the RFC authors on that regard, and he mentioned that there seem
      not to be any implementations that make use of that exponential algorithm
      on startup messages. In any case, this patch fixes this MLD
      interoperability issue.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      35f7aa53
  5. 14 9月, 2014 4 次提交
  6. 10 9月, 2014 1 次提交
  7. 06 9月, 2014 1 次提交
  8. 05 9月, 2014 1 次提交
  9. 25 8月, 2014 1 次提交
    • I
      ipv6: White-space cleansing : Line Layouts · 67ba4152
      Ian Morris 提交于
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      A number of items are addressed in this patch:
      * Multiple spaces converted to tabs
      * Spaces before tabs removed.
      * Spaces in pointer typing cleansed (char *)foo etc.
      * Remove space after sizeof
      * Ensure spacing around comparators such as if statements.
      Signed-off-by: NIan Morris <ipm@chirality.org.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      67ba4152
  10. 27 6月, 2014 1 次提交
  11. 01 4月, 2014 1 次提交
  12. 18 1月, 2014 1 次提交
    • F
      ipv6: send Change Status Report after DAD is completed · 6a7cc418
      Flavio Leitner 提交于
      The RFC 3810 defines two type of messages for multicast
      listeners. The "Current State Report" message, as the name
      implies, refreshes the *current* state to the querier.
      Since the querier sends Query messages periodically, there
      is no need to retransmit the report.
      
      On the other hand, any change should be reported immediately
      using "State Change Report" messages. Since it's an event
      triggered by a change and that it can be affected by packet
      loss, the rfc states it should be retransmitted [RobVar] times
      to make sure routers will receive timely.
      
      Currently, we are sending "Current State Reports" after
      DAD is completed.  Before that, we send messages using
      unspecified address (::) which should be silently discarded
      by routers.
      
      This patch changes to send "State Change Report" messages
      after DAD is completed fixing the behavior to be RFC compliant
      and also to pass TAHI IPv6 testsuite.
      Signed-off-by: NFlavio Leitner <fbl@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6a7cc418
  13. 15 1月, 2014 1 次提交
  14. 01 10月, 2013 1 次提交
  15. 05 9月, 2013 7 次提交
  16. 21 8月, 2013 3 次提交
  17. 14 8月, 2013 1 次提交
    • H
      ipv6: make unsolicited report intervals configurable for mld · fc4eba58
      Hannes Frederic Sowa 提交于
      Commit cab70040 ("net: igmp:
      Reduce Unsolicited report interval to 1s when using IGMPv3") and
      2690048c ("net: igmp: Allow user-space
      configuration of igmp unsolicited report interval") by William Manley made
      igmp unsolicited report intervals configurable per interface and corrected
      the interval of unsolicited igmpv3 report messages resendings to 1s.
      
      Same needs to be done for IPv6:
      
      MLDv1 (RFC2710 7.10.): 10 seconds
      MLDv2 (RFC3810 9.11.): 1 second
      
      Both intervals are configurable via new procfs knobs
      mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval.
      
      (also added .force_mld_version to ipv6_devconf_dflt to bring structs in
      line without semantic changes)
      
      v2:
      a) Joined documentation update for IPv4 and IPv6 MLD/IGMP
         unsolicited_report_interval procfs knobs.
      b) incorporate stylistic feedback from William Manley
      
      v3:
      a) add new DEVCONF_* values to the end of the enum (thanks to David
         Miller)
      
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: William Manley <william.manley@youview.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc4eba58
  18. 29 7月, 2013 1 次提交
  19. 02 7月, 2013 1 次提交
    • A
      ipv6,mcast: always hold idev->lock before mca_lock · 8965779d
      Amerigo Wang 提交于
      dingtianhong reported the following deadlock detected by lockdep:
      
       ======================================================
       [ INFO: possible circular locking dependency detected ]
       3.4.24.05-0.1-default #1 Not tainted
       -------------------------------------------------------
       ksoftirqd/0/3 is trying to acquire lock:
        (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
      
       but task is already holding lock:
        (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&mc->mca_lock){+.+...}:
              [<ffffffff810a8027>] validate_chain+0x637/0x730
              [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
              [<ffffffff810a8734>] lock_acquire+0x114/0x150
              [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
              [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
              [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
              [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
              [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
              [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
              [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
              [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
              [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
              [<ffffffff813d9360>] dev_change_flags+0x40/0x70
              [<ffffffff813ea627>] do_setlink+0x237/0x8a0
              [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
              [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
              [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
              [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
              [<ffffffff81403e20>] netlink_unicast+0x140/0x180
              [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
              [<ffffffff813c4252>] sock_sendmsg+0x112/0x130
              [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
              [<ffffffff813c5544>] sys_sendmsg+0x44/0x70
              [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b
      
       -> #0 (&ndev->lock){+.+...}:
              [<ffffffff810a798e>] check_prev_add+0x3de/0x440
              [<ffffffff810a8027>] validate_chain+0x637/0x730
              [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
              [<ffffffff810a8734>] lock_acquire+0x114/0x150
              [<ffffffff814f6c82>] rt_read_lock+0x42/0x60
              [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
              [<ffffffff8149b036>] mld_newpack+0xb6/0x160
              [<ffffffff8149b18b>] add_grhead+0xab/0xc0
              [<ffffffff8149d03b>] add_grec+0x3ab/0x460
              [<ffffffff8149d14a>] mld_send_report+0x5a/0x150
              [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
              [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
              [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
              [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
              [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
              [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
              [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
              [<ffffffff8106c7de>] kthread+0xae/0xc0
              [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10
      
      actually we can just hold idev->lock before taking pmc->mca_lock,
      and avoid taking idev->lock again when iterating idev->addr_list,
      since the upper callers of mld_newpack() already take
      read_lock_bh(&idev->lock).
      Reported-by: Ndingtianhong <dingtianhong@huawei.com>
      Cc: dingtianhong <dingtianhong@huawei.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Tested-by: NDing Tianhong <dingtianhong@huawei.com>
      Tested-by: NChen Weilong <chenweilong@huawei.com>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8965779d
  20. 29 6月, 2013 1 次提交
  21. 29 5月, 2013 1 次提交
  22. 19 2月, 2013 2 次提交
  23. 11 2月, 2013 1 次提交
  24. 05 2月, 2013 1 次提交
  25. 30 1月, 2013 1 次提交
  26. 22 1月, 2013 1 次提交