1. 12 6月, 2014 19 次提交
    • F
      net_sched: drr: warn when qdisc is not work conserving · 6e765a00
      Florian Westphal 提交于
      The DRR scheduler requires that items on the active list are work
      conserving, i.e. do not hold on to skbs for throttling purposes, etc.
      Attaching e.g. tbf renders DRR useless because all other classes on the
      active list are delayed as well.
      
      So, warn users that this configuration won't work as expected; we
      already do this in couple of other qdiscs, see e.g.
      
      commit b00355db
      ('pkt_sched: sch_hfsc: sch_htb: Add non-work-conserving warning handler')
      
      The 'const' change is needed to avoid compiler warning ("discards 'const'
      qualifier from pointer target type").
      
      tested with:
      drr_hier() {
              parent=$1
              classes=$2
              for i in  $(seq 1 $classes); do
                      classid=$parent$(printf %x $i)
                      tc class add dev eth0 parent $parent classid $classid drr
      		tc qdisc add dev eth0 parent $classid tbf rate 64kbit burst 256kbit limit 64kbit
              done
      }
      tc qdisc add dev eth0 root handle 1: drr
      drr_hier 1: 32
      tc filter add dev eth0 protocol all pref 1 parent 1: handle 1 flow hash keys dst perturb 1 divisor 32
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6e765a00
    • T
      net: Add skb_gro_postpull_rcsum to udp and vxlan · 6bae1d4c
      Tom Herbert 提交于
      Need to gro_postpull_rcsum for GRO to work with checksum complete.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6bae1d4c
    • T
      net: Save software checksum complete · 7e3cead5
      Tom Herbert 提交于
      In skb_checksum complete, if we need to compute the checksum for the
      packet (via skb_checksum) save the result as CHECKSUM_COMPLETE.
      Subsequent checksum verification can use this.
      
      Also, added csum_complete_sw flag to distinguish between software and
      hardware generated checksum complete, we should always be able to trust
      the software computation.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7e3cead5
    • S
      ceph: remove bogus extern · f6479449
      stephen hemminger 提交于
      Sparse complained about this bogus extern on definition of
      a function.
      Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f6479449
    • E
      ipv4: fix a race in ip4_datagram_release_cb() · 9709674e
      Eric Dumazet 提交于
      Alexey gave a AddressSanitizer[1] report that finally gave a good hint
      at where was the origin of various problems already reported by Dormando
      in the past [2]
      
      Problem comes from the fact that UDP can have a lockless TX path, and
      concurrent threads can manipulate sk_dst_cache, while another thread,
      is holding socket lock and calls __sk_dst_set() in
      ip4_datagram_release_cb() (this was added in linux-3.8)
      
      It seems that all we need to do is to use sk_dst_check() and
      sk_dst_set() so that all the writers hold same spinlock
      (sk->sk_dst_lock) to prevent corruptions.
      
      TCP stack do not need this protection, as all sk_dst_cache writers hold
      the socket lock.
      
      [1]
      https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
      
      AddressSanitizer: heap-use-after-free in ipv4_dst_check
      Read of size 2 by thread T15453:
       [<ffffffff817daa3a>] ipv4_dst_check+0x1a/0x90 ./net/ipv4/route.c:1116
       [<ffffffff8175b789>] __sk_dst_check+0x89/0xe0 ./net/core/sock.c:531
       [<ffffffff81830a36>] ip4_datagram_release_cb+0x46/0x390 ??:0
       [<ffffffff8175eaea>] release_sock+0x17a/0x230 ./net/core/sock.c:2413
       [<ffffffff81830882>] ip4_datagram_connect+0x462/0x5d0 ??:0
       [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
       [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
       [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
       [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
      ./arch/x86/kernel/entry_64.S:629
      
      Freed by thread T15455:
       [<ffffffff8178d9b8>] dst_destroy+0xa8/0x160 ./net/core/dst.c:251
       [<ffffffff8178de25>] dst_release+0x45/0x80 ./net/core/dst.c:280
       [<ffffffff818304c1>] ip4_datagram_connect+0xa1/0x5d0 ??:0
       [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
       [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
       [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
       [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
      ./arch/x86/kernel/entry_64.S:629
      
      Allocated by thread T15453:
       [<ffffffff8178d291>] dst_alloc+0x81/0x2b0 ./net/core/dst.c:171
       [<ffffffff817db3b7>] rt_dst_alloc+0x47/0x50 ./net/ipv4/route.c:1406
       [<     inlined    >] __ip_route_output_key+0x3e8/0xf70
      __mkroute_output ./net/ipv4/route.c:1939
       [<ffffffff817dde08>] __ip_route_output_key+0x3e8/0xf70 ./net/ipv4/route.c:2161
       [<ffffffff817deb34>] ip_route_output_flow+0x14/0x30 ./net/ipv4/route.c:2249
       [<ffffffff81830737>] ip4_datagram_connect+0x317/0x5d0 ??:0
       [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
       [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
       [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
       [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
      ./arch/x86/kernel/entry_64.S:629
      
      [2]
      <4>[196727.311203] general protection fault: 0000 [#1] SMP
      <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
      <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
      <4>[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
      <4>[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000
      <4>[196727.311377] RIP: 0010:[<ffffffff815f8c7f>]  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
      <4>[196727.311399] RSP: 0018:ffff885effd23a70  EFLAGS: 00010282
      <4>[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040
      <4>[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200
      <4>[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800
      <4>[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      <4>[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: ffff880e85ee16ce
      <4>[196727.311510] FS:  0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000
      <4>[196727.311554] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0
      <4>[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      <4>[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      <4>[196727.311713] Stack:
      <4>[196727.311733]  ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42
      <4>[196727.311784]  ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0
      <4>[196727.311834]  ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0
      <4>[196727.311885] Call Trace:
      <4>[196727.311907]  <IRQ>
      <4>[196727.311912]  [<ffffffff815b7f42>] dst_destroy+0x32/0xe0
      <4>[196727.311959]  [<ffffffff815b86c6>] dst_release+0x56/0x80
      <4>[196727.311986]  [<ffffffff81620bd5>] tcp_v4_do_rcv+0x2a5/0x4a0
      <4>[196727.312013]  [<ffffffff81622b5a>] tcp_v4_rcv+0x7da/0x820
      <4>[196727.312041]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
      <4>[196727.312070]  [<ffffffff815de02d>] ? nf_hook_slow+0x7d/0x150
      <4>[196727.312097]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
      <4>[196727.312125]  [<ffffffff815fda92>] ip_local_deliver_finish+0xb2/0x230
      <4>[196727.312154]  [<ffffffff815fdd9a>] ip_local_deliver+0x4a/0x90
      <4>[196727.312183]  [<ffffffff815fd799>] ip_rcv_finish+0x119/0x360
      <4>[196727.312212]  [<ffffffff815fe00b>] ip_rcv+0x22b/0x340
      <4>[196727.312242]  [<ffffffffa0339680>] ? macvlan_broadcast+0x160/0x160 [macvlan]
      <4>[196727.312275]  [<ffffffff815b0c62>] __netif_receive_skb_core+0x512/0x640
      <4>[196727.312308]  [<ffffffff811427fb>] ? kmem_cache_alloc+0x13b/0x150
      <4>[196727.312338]  [<ffffffff815b0db1>] __netif_receive_skb+0x21/0x70
      <4>[196727.312368]  [<ffffffff815b0fa1>] netif_receive_skb+0x31/0xa0
      <4>[196727.312397]  [<ffffffff815b1ae8>] napi_gro_receive+0xe8/0x140
      <4>[196727.312433]  [<ffffffffa00274f1>] ixgbe_poll+0x551/0x11f0 [ixgbe]
      <4>[196727.312463]  [<ffffffff815fe00b>] ? ip_rcv+0x22b/0x340
      <4>[196727.312491]  [<ffffffff815b1691>] net_rx_action+0x111/0x210
      <4>[196727.312521]  [<ffffffff815b0db1>] ? __netif_receive_skb+0x21/0x70
      <4>[196727.312552]  [<ffffffff810519d0>] __do_softirq+0xd0/0x270
      <4>[196727.312583]  [<ffffffff816cef3c>] call_softirq+0x1c/0x30
      <4>[196727.312613]  [<ffffffff81004205>] do_softirq+0x55/0x90
      <4>[196727.312640]  [<ffffffff81051c85>] irq_exit+0x55/0x60
      <4>[196727.312668]  [<ffffffff816cf5c3>] do_IRQ+0x63/0xe0
      <4>[196727.312696]  [<ffffffff816c5aaa>] common_interrupt+0x6a/0x6a
      <4>[196727.312722]  <EOI>
      <1>[196727.313071] RIP  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
      <4>[196727.313100]  RSP <ffff885effd23a70>
      <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
      <0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt
      Reported-by: NAlexey Preobrazhensky <preobr@google.com>
      Reported-by: Ndormando <dormando@rydia.ne>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Fixes: 8141ed9f ("ipv4: Add a socket release callback for datagram sockets")
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9709674e
    • O
      net: add __pskb_copy_fclone and pskb_copy_for_clone · bad93e9d
      Octavian Purdila 提交于
      There are several instances where a pskb_copy or __pskb_copy is
      immediately followed by an skb_clone.
      
      Add a couple of new functions to allow the copy skb to be allocated
      from the fclone cache and thus speed up subsequent skb_clone calls.
      
      Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
      Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Cc: Marek Lindner <mareklindner@neomailbox.ch>
      Cc: Simon Wunderlich <sw@simonwunderlich.de>
      Cc: Antonio Quartulli <antonio@meshcoding.com>
      Cc: Marcel Holtmann <marcel@holtmann.org>
      Cc: Gustavo Padovan <gustavo@padovan.org>
      Cc: Johan Hedberg <johan.hedberg@gmail.com>
      Cc: Arvid Brodin <arvid.brodin@alten.se>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
      Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
      Cc: Samuel Ortiz <sameo@linux.intel.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Allan Stephens <allan.stephens@windriver.com>
      Cc: Andrew Hendry <andrew.hendry@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Reviewed-by: NChristoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bad93e9d
    • T
      bridge: Support 802.1ad vlan filtering · 204177f3
      Toshiaki Makita 提交于
      This enables us to change the vlan protocol for vlan filtering.
      We come to be able to filter frames on the basis of 802.1ad vlan tags
      through a bridge.
      
      This also changes br->group_addr if it has not been set by user.
      This is needed for an 802.1ad bridge.
      (See IEEE 802.1Q-2011 8.13.5.)
      
      Furthermore, this sets br->group_fwd_mask_required so that an 802.1ad
      bridge can forward the Nearest Customer Bridge group addresses except
      for br->group_addr, which should be passed to higher layer.
      
      To change the vlan protocol, write a protocol in sysfs:
      # echo 0x88a8 > /sys/class/net/br0/bridge/vlan_protocol
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      204177f3
    • T
      bridge: Prepare for forwarding another bridge group addresses · f2808d22
      Toshiaki Makita 提交于
      If a bridge is an 802.1ad bridge, it must forward another bridge group
      addresses (the Nearest Customer Bridge group addresses).
      (For details, see IEEE 802.1Q-2011 8.6.3.)
      
      As user might not want group_fwd_mask to be modified by enabling 802.1ad,
      introduce a new mask, group_fwd_mask_required, which indicates addresses
      the bridge wants to forward. This will be set by enabling 802.1ad.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f2808d22
    • T
      bridge: Prepare for 802.1ad vlan filtering support · 8580e211
      Toshiaki Makita 提交于
      This enables a bridge to have vlan protocol informantion and allows vlan
      tag manipulation (retrieve, insert and remove tags) according to the vlan
      protocol.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8580e211
    • T
      bridge: Add 802.1ad tx vlan acceleration · 1c5abb6c
      Toshiaki Makita 提交于
      Bridge device doesn't need to embed S-tag into skb->data.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1c5abb6c
    • A
      net: filter: fix warning on 32-bit arch · 61f83d0d
      Alexei Starovoitov 提交于
      fix compiler warning on 32-bit architectures:
      
      net/core/filter.c: In function '__sk_run_filter':
      net/core/filter.c:540:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      net/core/filter.c:550:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      net/core/filter.c:560:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      61f83d0d
    • J
      tipc: fix potential bug in function tipc_backlog_rcv · 02c00c2a
      Jon Paul Maloy 提交于
      In commit 4f4482dc ("tipc: compensate
      for double accounting in socket rcv buffer") we access 'truesize' of
      a received buffer after it might have been released by the function
      filter_rcv().
      
      In this commit we correct this by reading the value of 'truesize' to
      the stack before delivering the buffer to filter_rcv().
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02c00c2a
    • D
      net: sctp: fix incorrect type in gfp initializer · 9b87d465
      Daniel Borkmann 提交于
      This fixes the following sparse warning:
      
        net/sctp/associola.c:1556:29: warning: incorrect type in initializer (different base types)
        net/sctp/associola.c:1556:29:    expected bool [unsigned] [usertype] preload
        net/sctp/associola.c:1556:29:    got restricted gfp_t
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b87d465
    • D
      net: sctp: improve sctp_select_active_and_retran_path selection · a7288c4d
      Daniel Borkmann 提交于
      In function sctp_select_active_and_retran_path(), we walk the
      transport list in order to look for the two most recently used
      ACTIVE transports (trans_pri, trans_sec). In case we didn't find
      anything ACTIVE, we currently just camp on a possibly PF or
      INACTIVE transport that is primary path; this behavior actually
      dates back to linux-history tree of the very early days of
      lksctp, and can yield a behavior that chooses suboptimal
      transport paths.
      
      Instead, be a bit more clever by reusing and extending the
      recently introduced sctp_trans_elect_best() handler. In case
      both transports are evaluated to have the same score resulting
      from their states, break the tie by looking at: 1) transport
      patch error count 2) last_time_heard value from each transport.
      
      This is analogous to Nishida's Quick Failover draft [1],
      section 5.1, 3:
      
        The sender SHOULD avoid data transmission to PF destinations.
        When all destinations are in either PF or Inactive state,
        the sender MAY either move the destination from PF to active
        state (and transmit data to the active destination) or the
        sender MAY transmit data to a PF destination. In the former
        scenario, (i) the sender MUST NOT notify the ULP about the
        state transition, and (ii) MUST NOT clear the destination's
        error counter. It is recommended that the sender picks the
        PF destination with least error count (fewest consecutive
        timeouts) for data transmission. In case of a tie (multiple PF
        destinations with same error count), the sender MAY choose the
        last active destination.
      
      Thus for sctp_select_active_and_retran_path(), we keep track of
      the best, if any, transport that is in PF state and in case no
      ACTIVE transport has been found (hence trans_{pri,sec} is NULL),
      we select the best out of the three: current primary_path and
      retran_path as well as a possible PF transport.
      
      The secondary may still camp on the original primary_path as
      before. The change in sctp_trans_elect_best() with a more fine
      grained tie selection also improves at the same time path selection
      for sctp_assoc_update_retran_path() in case of non-ACTIVE states.
      
        [1] http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7288c4d
    • D
      net: sctp: migrate most recently used transport to ktime · e575235f
      Daniel Borkmann 提交于
      Be more precise in transport path selection and use ktime
      helpers instead of jiffies to compare and pick the better
      primary and secondary recently used transports. This also
      avoids any side-effects during a possible roll-over, and
      could lead to better path decision-making.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e575235f
    • D
      net: sctp: refactor active path selection · b82e8f31
      Daniel Borkmann 提交于
      This patch just refactors and moves the code for the active
      path selection into its own helper function outside of
      sctp_assoc_control_transport() which is already big enough.
      No functional changes here.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b82e8f31
    • D
      ktime: add ktime_after and ktime_before helper · 67cb9366
      Daniel Borkmann 提交于
      Add two minimal helper functions analogous to time_before() and
      time_after() that will later on both be needed by SCTP code.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      67cb9366
    • P
      mac802154: don't deliver packets to devices that are down · 2d3b5b0a
      Phoebe Buckheister 提交于
      Only one WPAN devices can be active at any given time, so only deliver
      packets to that one interface that is actually up. Multiple monitors may
      be up at any given time, but we don't have to deliver to monitors that
      are down either.
      Signed-off-by: NPhoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d3b5b0a
    • P
      mac802154: properly free incoming skbs on decryption failure · a374eeb5
      Phoebe Buckheister 提交于
      mac802154 RX did not free skbs on decryption failure, assuming that the
      caller would when the local rx handler returned _DROP. This was false.
      Signed-off-by: NPhoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a374eeb5
  2. 11 6月, 2014 7 次提交
  3. 07 6月, 2014 1 次提交
  4. 06 6月, 2014 4 次提交
  5. 05 6月, 2014 9 次提交