1. 03 Sep, 2014 14 commits
    • netfilter: nfnetlink: deliver netlink errors on batch completion · cbb8125e
      Committed by Pablo Neira Ayuso
      We have to wait until the full batch has been processed to deliver the
      netlink error messages to userspace. Otherwise, we may deliver
      duplicated errors to userspace in case that we need to abort and replay
      the transaction if any of the required modules needs to be autoloaded.
      
      A simple way to reproduce this (assuming nft_meta is not loaded) is with
      the following test file:
      
       add table filter
       add chain filter test
       add chain bad test                 # intentionally wrong: nonexistent table
       add rule filter test meta mark 0
      
      Then, when trying to load the batch:
      
       # nft -f test
       test:4:1-19: Error: Could not process rule: No such file or directory
       add chain bad test
       ^^^^^^^^^^^^^^^^^^^
       test:4:1-19: Error: Could not process rule: No such file or directory
       add chain bad test
       ^^^^^^^^^^^^^^^^^^^
      
      The error is reported twice, once when the batch is aborted due to
      missing nft_meta and another when it is fully processed.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
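The idea of holding errors back until the batch is complete can be sketched in plain C as follows. This is a minimal user-space illustration, not the kernel code: `err_queue`, `err_queue_add()`, `err_queue_reset()`, `err_queue_deliver()` and `run_batch_sketch()` are hypothetical names, and the four-line batch only mimics the test file from the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_ERRS 16

/* Hypothetical per-batch error queue; not the kernel's data structure. */
struct err_queue {
    int line[MAX_ERRS];
    int err[MAX_ERRS];
    size_t count;
};

/* Record an error instead of reporting it to userspace right away. */
static void err_queue_add(struct err_queue *q, int line, int err)
{
    if (q->count < MAX_ERRS) {
        q->line[q->count] = line;
        q->err[q->count] = err;
        q->count++;
    }
}

/* Drop everything queued so far; used when the batch is aborted for replay. */
static void err_queue_reset(struct err_queue *q)
{
    q->count = 0;
}

/* Report queued errors once the whole batch is done; returns how many. */
static size_t err_queue_deliver(struct err_queue *q)
{
    size_t n = q->count;

    for (size_t i = 0; i < n; i++)
        printf("line %d: error %d\n", q->line[i], q->err[i]);
    q->count = 0;
    return n;
}

/*
 * Simulate a four-line batch: line 3 always fails with ENOENT, and line 4
 * needs a module. If the module is missing, pretend it gets autoloaded,
 * abort the pass, discard the queued errors and replay from the start.
 */
static size_t run_batch_sketch(int module_loaded)
{
    struct err_queue q = { .count = 0 };

    for (;;) {
        int need_replay = 0;

        for (int line = 1; line <= 4; line++) {
            if (line == 3) {
                err_queue_add(&q, line, ENOENT);
            } else if (line == 4 && !module_loaded) {
                module_loaded = 1;   /* assumed: autoload succeeds */
                need_replay = 1;
                break;
            }
        }
        if (!need_replay)
            break;
        err_queue_reset(&q);         /* aborted pass: forget its errors */
    }
    return err_queue_deliver(&q);
}
```

Calling `run_batch_sketch(0)` walks the batch twice but reports the bad line only once, because the errors of the aborted pass are discarded before the replay.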
    • rhashtable: fix lockdep splat in rhashtable_destroy() · ae82ddcf
      Committed by Pablo Neira Ayuso
      No need for rht_dereference() from rhashtable_destroy() since the
      existing callers don't hold the mutex when invoking this function
      from:
      
      1) Netlink, this is called in case of memory allocation errors in the
         initialization path, no nl_sk_hash_lock is held.
      2) Netfilter, this is called from the rcu callback, no nfnl_lock is
         held either.
      
      I think it's reasonable to assume that the caller has to make sure
      that no hash resizing may happen before releasing the bucket array.
      Therefore, the caller should be responsible for releasing this in a
      safe way, document this to make people aware of it.
      
      This resolves a rcu lockdep splat in nft_hash:
      
      ===============================
      [ INFO: suspicious RCU usage. ]
      3.16.0+ #178 Not tainted
      -------------------------------
      lib/rhashtable.c:596 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 1
      1 lock held by ksoftirqd/2/18:
       #0:  (rcu_callback){......}, at: [<ffffffff810918fd>] rcu_process_callbacks+0x27e/0x4c7
      
      stack backtrace:
      CPU: 2 PID: 18 Comm: ksoftirqd/2 Not tainted 3.16.0+ #178
      Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
       0000000000000001 ffff88011706bb68 ffffffff8143debc 0000000000000000
       ffff880117062610 ffff88011706bb98 ffffffff81077515 ffff8800ca041a50
       0000000000000004 ffff8800ca386480 ffff8800ca041a00 ffff88011706bbb8
      Call Trace:
       [<ffffffff8143debc>] dump_stack+0x4e/0x68
       [<ffffffff81077515>] lockdep_rcu_suspicious+0xfa/0x103
       [<ffffffff81228b1b>] rhashtable_destroy+0x46/0x52
       [<ffffffffa06f21a7>] nft_hash_destroy+0x73/0x82 [nft_hash]
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Thomas Graf <tgraf@suug.ch>
    • netfilter: nft_rbtree: no need for spinlock from set destroy path · d99407f4
      Committed by Pablo Neira Ayuso
      The sets are released from the rcu callback, after the rule is removed
      from the chain list, which implies that nfnetlink cannot update the
      rbtree and no packets are walking on the set anymore. Thus, we can get
      rid of the spinlock in the set destroy path there.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Reviewed-by: Thomas Graf <tgraf@suug.ch>
    • netfilter: nft_hash: no need for rcu in the hash set destroy path · 39f39016
      Committed by Pablo Neira Ayuso
      The sets are released from the rcu callback, after the rule is removed
      from the chain list, which implies that nfnetlink cannot update the
      hashes (thus, no resizing may occur) and no packets are walking on the
      set anymore.
      
      This resolves a lockdep splat in the nft_hash_destroy() path since the
      nfnl mutex is not held there.
      
      ===============================
      [ INFO: suspicious RCU usage. ]
      3.16.0-rc2+ #168 Not tainted
      -------------------------------
      net/netfilter/nft_hash.c:362 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 1
      1 lock held by ksoftirqd/0/3:
       #0:  (rcu_callback){......}, at: [<ffffffff81096393>] rcu_process_callbacks+0x27e/0x4c7
      
      stack backtrace:
      CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.16.0-rc2+ #168
      Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
       0000000000000001 ffff88011769bb98 ffffffff8142c922 0000000000000006
       ffff880117694090 ffff88011769bbc8 ffffffff8107c3ff ffff8800cba52400
       ffff8800c476bea8 ffff8800c476bea8 ffff8800cba52400 ffff88011769bc08
      Call Trace:
       [<ffffffff8142c922>] dump_stack+0x4e/0x68
       [<ffffffff8107c3ff>] lockdep_rcu_suspicious+0xfa/0x103
       [<ffffffffa079931e>] nft_hash_destroy+0x50/0x137 [nft_hash]
       [<ffffffffa078cd57>] nft_set_destroy+0x11/0x2a [nf_tables]
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Thomas Graf <tgraf@suug.ch>
    • amd-xgbe: Fix initialization of the wrong spin lock · bec6bfb2
      Committed by Lendacky, Thomas
      During allocation and initialization of the network driver structures,
      the wrong pointer is used to initialize a spin lock. Fix the spin lock
      initialization by using the proper pointer.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: fix a memory leak · 4ee45ea0
      Committed by Li RongQing
      The user_skb may be leaked if an operation on it fails and the code
      jumps to the label "out:" without calling genlmsg_unicast.
      
      Cc: Pravin Shelar <pshelar@nicira.com>
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
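The leak pattern described here, a buffer that is only released when a consuming send call is reached, can be sketched in user-space C. Everything below is hypothetical (`buf_alloc()`, `buf_free()`, `send_and_consume()`, `queue_packet_sketch()`); it assumes the send call takes ownership of the buffer, and it is not the openvswitch code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_buffers;   /* number of buffers currently allocated */

static void *buf_alloc(void)
{
    void *p = malloc(64);

    if (p)
        live_buffers++;
    return p;
}

/* Safe to call with NULL, like free(). */
static void buf_free(void *p)
{
    if (p) {
        free(p);
        live_buffers--;
    }
}

/* Hypothetical send that takes ownership of the buffer and releases it. */
static int send_and_consume(void *buf)
{
    buf_free(buf);
    return 0;
}

static int queue_packet_sketch(bool fail_before_send)
{
    int err;
    void *user_buf = buf_alloc();

    if (!user_buf)
        return -ENOMEM;

    if (fail_before_send) {
        err = -EINVAL;     /* some step on the buffer failed */
        goto out;
    }

    err = send_and_consume(user_buf);
    user_buf = NULL;       /* ownership passed on; nothing left to free here */
out:
    buf_free(user_buf);    /* releases the buffer only on the error path */
    return err;
}
```

Clearing the pointer after the send keeps the common exit path safe: the final `buf_free()` is a no-op on success and releases the buffer on every early exit.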
    • netfilter: fix missing dependencies in NETFILTER_XT_TARGET_LOG · 41ad82f7
      Committed by Pablo Neira
      make defconfig reports:
      
      warning: (NETFILTER_XT_TARGET_LOG) selects NF_LOG_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && NETFILTER_ADVANCED)
      
      Fixes: d79a61d6 netfilter: NETFILTER_XT_TARGET_LOG selects NF_LOG_*
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · abccc587
      Committed by David S. Miller
      Pablo Neira Ayuso says:
      
      ====================
      pull request: Netfilter/IPVS fixes for net
      
      The following patchset contains seven Netfilter fixes for your net
      tree, they are:
      
      1) Make the NAT infrastructure independent of x_tables, some users are
         already starting to test nf_tables with NAT without enabling x_tables.
         Without this patch for Kconfig, there's a superfluous dependency
         between NAT and x_tables.
      
      2) Allow 0 in the cgroup match; the kernel rejects it with -EINVAL
         for no good reason. From Daniel Borkmann.
      
      3) Select CONFIG_NF_NAT from the nf_tables NAT expression, this also
         resolves another NAT dependency with x_tables.
      
      4) Use HAVE_JUMP_LABEL instead of CONFIG_JUMP_LABEL in the Netfilter hook
         code as elsewhere in the kernel to resolve toolchain problems, from
         Zhouyi Zhou.
      
      5) Use iptunnel_handle_offloads() to set up tunnel encapsulation
         depending on the offload capabilities, reported by Alex Gartrell,
         patch from Julian Anastasov.
      
      6) Fix wrong family when registering the ip_vs_local_reply6() hook,
         also from Julian.
      
      7) Select the NF_LOG_* symbols from NETFILTER_XT_TARGET_LOG. Rafał
         Miłecki reported that when jumping from 3.16 to 3.17-rc, his log
         target is not selected anymore due to changes in the previous
         development cycle to accommodate the full logging support for
         nf_tables.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Configure device endianity on driver load and reset endianity on removal. · 04860eb7
      Committed by Manish Chopra
      Some hosts can be both little and big endian.
      In certain scenarios a big endian kernel can kexec a little endian kernel.
      
      This patch fixes this case from both ends:
      1) Return endianity to original values on shutdown (in case little endian kernel boots after we shutdown).
      2) Do not rely on HW reset values when loading driver in little endian kernel
         but configure them explicitly (in case previous kernel was big endian and did not reset the HW).
      Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qeth: don't query for info if hardware not ready. · 511c2445
      Committed by Eugene Crosser
      When a qeth device is queried for ethtool data, a hardware operation
      is performed to extract the necessary information from the card.
      If the card is not online at the moment (e.g. it is undergoing
      recovery), this operation produces undesired effects such as
      temporarily freezing the system. This patch prevents execution
      of the hardware query operation when the card is not online.
      In such a case, the ioctl() operation returns an error with errno ENODEV.
      Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com>
      Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
      Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
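The guard described in this commit message amounts to an early return before any hardware access. A minimal sketch, assuming a hypothetical `card_sketch` structure and helper names that are not taken from the qeth driver:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's card state. */
struct card_sketch {
    bool online;
    int hw_queries;    /* counts how often the hardware was actually asked */
};

/* Pretend hardware query; a real driver would talk to the card here. */
static int query_hw_sketch(struct card_sketch *card)
{
    card->hw_queries++;
    return 0;
}

/* Returns 0 on success or -ENODEV when the card is not online. */
static int get_settings_sketch(struct card_sketch *card)
{
    if (!card->online)
        return -ENODEV;   /* skip the hardware operation entirely */
    return query_hw_sketch(card);
}
```

With the card offline the caller gets `-ENODEV` and `hw_queries` stays at zero, which is the property the patch is after: no hardware operation is started while the card is recovering.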
    • net: calxedaxgmac: fix driver dependencies · c24f3379
      Committed by Bartlomiej Zolnierkiewicz
      Calxeda 1G/10G XGMAC Ethernet support should be available only on
      Calxeda ECX-1000/2000 (Highbank/Midway) platforms.
      Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sh_eth: fix driver dependencies · f6ec9c33
      Committed by Bartlomiej Zolnierkiewicz
      Renesas SuperH Ethernet support should be available only on
      Renesas ARM SoCs and SuperH architecture.
      Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Acked-by: Simon Horman <horms+renesas@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: lpc_eth: Fix crash on ip link up · aff88a06
      Committed by Roland Stigge
      When a link is already up, the following sequence makes the kernel
      block completely:
      
        ip link set dev eth0 down
        ip link set dev eth0 up
      
      This is because, on a suspended PHY, the following lines
      
        __lpc_eth_reset(pldat);
        __lpc_eth_init(pldat);
      
      make the LPC ethernet core block (see LPC32x0 manual). The PHY needs to be
      (re-)activated low-level first.
      Signed-off-by: Roland Stigge <stigge@antcom.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tg3: prevent ifup/ifdown during PCI error recovery · 0486a063
      Committed by Ivan Vecera
      The patch fixes race conditions between PCI error recovery callbacks and
      potential ifup/ifdown.
      
      First, if ifup (tg3_open) is called between tg3_io_error_detected() and
      tg3_io_resume() then tp->timer is armed twice before expiry. Once during
      tg3_open() and again during tg3_io_resume(). This results in BUG
      at kernel/time/timer.c:945.
      
      Second, if ifdown (tg3_close) is called between tg3_io_error_detected()
      and tg3_io_resume() then tg3_napi_disable() is called twice without
      an intervening tg3_napi_enable(). Once during tg3_io_error_detected() and again
      during tg3_close(). The tg3_io_resume() then hangs on rtnl_lock().
      
      v2: Added logging messages per Prashant's request
      
      Cc: Prashant Sreedharan <prashant@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Acked-by: Prashant Sreedharan <prashant@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
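One way to close both races described in this commit message is a flag that is set for the duration of error recovery and checked by the open and close paths. The sketch below only illustrates that idea: the structure, the function names and the `-EAGAIN` return value are assumptions, not taken from the tg3 driver.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state; the field names are illustrative only. */
struct nic_sketch {
    bool recovery_in_progress;  /* set between error detection and resume */
    int timer_arms;             /* how many times the timer was armed */
    int napi_disables;          /* how many times NAPI was disabled */
};

/* Error detected: quiesce the device and mark recovery as running. */
static void io_error_detected_sketch(struct nic_sketch *nic)
{
    nic->recovery_in_progress = true;
    nic->napi_disables++;
}

/* Recovery finished: re-arm the timer and allow ifup/ifdown again. */
static void io_resume_sketch(struct nic_sketch *nic)
{
    nic->timer_arms++;
    nic->recovery_in_progress = false;
}

/* ifup: refused while recovery runs, so the timer is not armed twice. */
static int open_sketch(struct nic_sketch *nic)
{
    if (nic->recovery_in_progress)
        return -EAGAIN;         /* assumed error code */
    nic->timer_arms++;
    return 0;
}

/* ifdown: refused while recovery runs, so NAPI is not disabled twice. */
static int close_sketch(struct nic_sketch *nic)
{
    if (nic->recovery_in_progress)
        return -EAGAIN;         /* assumed error code */
    nic->napi_disables++;
    return 0;
}
```

In this model an open or close attempted between error detection and resume is rejected, so after resume the timer has been armed exactly once and NAPI disabled exactly once.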
  2. 02 Sep, 2014 16 commits
  3. 01 Sep, 2014 6 commits
  4. 30 Aug, 2014 4 commits
    • net: sctp: fix ABI mismatch through sctp_assoc_to_state helper · 38ab1fa9
      Committed by Daniel Borkmann
      Since SCTP day 1, that is, 19b55a2af145 ("Initial commit") from lksctp
      tree, the official <netinet/sctp.h> header carries a copy of enum
      sctp_sstat_state that looks like (compared to the current in-kernel
      enumeration):
      
        User definition:                     Kernel definition:
      
        enum sctp_sstat_state {              typedef enum {
          SCTP_EMPTY             = 0,          <removed>
          SCTP_CLOSED            = 1,          SCTP_STATE_CLOSED            = 0,
          SCTP_COOKIE_WAIT       = 2,          SCTP_STATE_COOKIE_WAIT       = 1,
          SCTP_COOKIE_ECHOED     = 3,          SCTP_STATE_COOKIE_ECHOED     = 2,
          SCTP_ESTABLISHED       = 4,          SCTP_STATE_ESTABLISHED       = 3,
          SCTP_SHUTDOWN_PENDING  = 5,          SCTP_STATE_SHUTDOWN_PENDING  = 4,
          SCTP_SHUTDOWN_SENT     = 6,          SCTP_STATE_SHUTDOWN_SENT     = 5,
          SCTP_SHUTDOWN_RECEIVED = 7,          SCTP_STATE_SHUTDOWN_RECEIVED = 6,
          SCTP_SHUTDOWN_ACK_SENT = 8,          SCTP_STATE_SHUTDOWN_ACK_SENT = 7,
        };                                   } sctp_state_t;
      
      This header was later also placed into the uapi, so that user space
      programs can compile without <netinet/sctp.h>, using the shipped
      <linux/sctp.h> instead.
      
      While RFC 6458, section 8.2.1, Association Status (SCTP_STATUS), says that
      sstat_state can range from SCTP_CLOSED to SCTP_SHUTDOWN_ACK_SENT, we
      nevertheless have what appears to be a dummy SCTP_EMPTY state from
      the very early days.
      
      While it seems to do nothing at all, commit 0b8f9e25 ("sctp: remove
      completely unsed EMPTY state") did the right thing and removed this dead
      code. That, however, causes an off-by-one when the user queries the SCTP
      stack via the SCTP_STATUS API and checks the current socket state,
      possibly yielding undefined behaviour in applications, as they expect
      the kernel to report the correct state.
      
      The enumeration had to be changed, however, because we index a function
      pointer lookup table by the current socket state. Therefore,
      I think the best way to deal with this is to add a helper function,
      sctp_assoc_to_state(), that encapsulates the off-by-one quirk.
      Reported-by: Tristan Su <sooqing@gmail.com>
      Fixes: 0b8f9e25 ("sctp: remove completely unsed EMPTY state")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
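The off-by-one between the two enumerations in the table above can be made concrete with a small stand-alone sketch. The enum values are copied from the commit message; the type and function names (`user_state_sketch`, `kernel_state_sketch`, `assoc_sketch`, `assoc_to_state_sketch()`) are hypothetical and only model what a helper like sctp_assoc_to_state() has to do.

```c
/* User-space numbering, as exposed through the SCTP_STATUS API. */
enum user_state_sketch {
    U_EMPTY             = 0,
    U_CLOSED            = 1,
    U_COOKIE_WAIT       = 2,
    U_COOKIE_ECHOED     = 3,
    U_ESTABLISHED       = 4,
    U_SHUTDOWN_PENDING  = 5,
    U_SHUTDOWN_SENT     = 6,
    U_SHUTDOWN_RECEIVED = 7,
    U_SHUTDOWN_ACK_SENT = 8,
};

/* In-kernel numbering after the EMPTY state was removed. */
enum kernel_state_sketch {
    K_CLOSED            = 0,
    K_COOKIE_WAIT       = 1,
    K_COOKIE_ECHOED     = 2,
    K_ESTABLISHED       = 3,
    K_SHUTDOWN_PENDING  = 4,
    K_SHUTDOWN_SENT     = 5,
    K_SHUTDOWN_RECEIVED = 6,
    K_SHUTDOWN_ACK_SENT = 7,
};

/* Hypothetical stand-in for the association structure. */
struct assoc_sketch {
    enum kernel_state_sketch state;
};

/*
 * Translate the in-kernel state into the value user space expects.
 * The kernel keeps its zero-based numbering for table lookups, so the
 * offset is applied only at the boundary to user space.
 */
static int assoc_to_state_sketch(const struct assoc_sketch *asoc)
{
    return (int)asoc->state + 1;
}
```

An association in `K_ESTABLISHED` (3) is reported as 4, which matches `U_ESTABLISHED`. Without the helper, user space would read 3 and interpret it as `U_COOKIE_ECHOED`.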
    • net: attempt a single high order allocation · d9b2938a
      Committed by Eric Dumazet
      In commit ed98df33 ("net: use __GFP_NORETRY for high order
      allocations") we tried to address one issue caused by order-3
      allocations.
      
      We still observe high latencies and system overhead in situations where
      compaction is not successful.
      
      Instead of trying order-3, order-2, and order-1, make a single best-effort
      order-3 attempt and immediately fall back to plain order-0.
      
      This mimics the slub strategy of falling back to the slab min order if the
      high-order allocation used for performance fails.
      
      Order-3 allocations give a performance boost only if they can be done
      without recurring and expensive memory scans.
      
      Quoting David:
      
      The page allocator relies on synchronous (sync light) memory compaction
      after direct reclaim for allocations that don't retry and deferred
      compaction doesn't work with this strategy because the allocation order
      is always decreasing from the previous failed attempt.
      
      This means sync light compaction will always be encountered if memory
      cannot be defragmented or reclaimed several times during the
      skb_page_frag_refill() iteration.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
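The allocation strategy described in this commit message, one high-order attempt followed by a direct fallback to order-0, can be sketched independently of the kernel page allocator. In the sketch below, `try_alloc_fn`, `frag_refill_sketch()` and the `fake_alloc` test allocator are hypothetical; only the order-3 then order-0 sequence comes from the text.

```c
#include <stdbool.h>

#define HIGH_ORDER 3

/* Hypothetical allocator hook: true if 2^order contiguous pages are available. */
typedef bool (*try_alloc_fn)(unsigned int order, void *ctx);

/*
 * Make one best-effort high-order attempt, then go straight to order-0.
 * Intermediate orders (2 and 1) are deliberately never tried.
 * Returns the order obtained, or -1 if even order-0 failed.
 */
static int frag_refill_sketch(try_alloc_fn try_alloc, void *ctx)
{
    if (try_alloc(HIGH_ORDER, ctx))
        return HIGH_ORDER;
    if (try_alloc(0, ctx))
        return 0;
    return -1;
}

/* Test allocator: succeeds up to max_order and counts every attempt. */
struct fake_alloc {
    unsigned int max_order;
    int attempts;
};

static bool fake_try_alloc(unsigned int order, void *ctx)
{
    struct fake_alloc *fa = ctx;

    fa->attempts++;
    return order <= fa->max_order;
}
```

With a `fake_alloc` whose `max_order` is 2, the function makes exactly two attempts and returns 0: it does not settle for order-2 even though that would succeed, which is the trade-off the commit accepts to avoid repeated compaction work.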
    • Merge branch 'mlx4-net' · bcc73547
      Committed by David S. Miller
      Or Gerlitz says:
      
      ====================
      Setup mlx4 user space Ethernet QPs to properly handle VXLAN
      
      This short series fixes the mlx4 driver setting of user space Ethernet QPs
      (e.g. those opened by DPDK applications) such that they will properly handle
      VXLAN traffic/offloads.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx4: Set user-space raw Ethernet QPs to properly handle VXLAN traffic · d2fce8a9
      Committed by Or Gerlitz
      Raw Ethernet QPs opened from user-space lack the proper setup to
      receive/handle VXLAN traffic when VXLAN offloads are enabled.
      
      Fix that by adding a tunnel steering rule on top of the normal unicast
      steering rule and set the tunnel_type field in the QP context.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>