1. 19 4月, 2014 10 次提交
    • J
      ixgbe: clean up Rx time stamping code · eda183c2
      Jakub Kicinski 提交于
      Time stamping resources are per-interface so there is no need
      to keep separate last_rx_timestamp for each Rx ring, move
      last_rx_timestamp to the adapter structure.
      
      With last_rx_timestamp inside adapter, ixgbe_ptp_rx_hwtstamp()
      inline function is reduced to a single if statement so it is
      no longer necessary. If statement is placed directly in
      ixgbe_process_skb_fields() fixing likely/unlikely marking.
      
      Checks for q_vector or adapter to be NULL are superfluous.
      
      Comment about taking I/O hit is a leftover from previous design.
      Signed-off-by: NJakub Kicinski <kubakici@wp.pl>
      Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      eda183c2
    • T
      igb: fix stats for i210 rx_fifo_errors · e66c083a
      Todd Fujinaka 提交于
      RQDPC on i210/i211 is R/W not ReadClear. Clear after reading.
      Signed-off-by: NTodd Fujinaka <todd.fujinaka@intel.com>
      Tested-by: NAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      e66c083a
    • H
      e1000e: Enclose e1000e_pm_thaw() with CONFIG_PM_SLEEP · 3e7986f6
      Hiroaki SHIMODA 提交于
      Fix following compilation warning:
      drivers/net/ethernet/intel/e1000e/netdev.c:6238:12: warning
      ‘e1000e_pm_thaw’ defined but not used [-Wunused-function]
       static int e1000e_pm_thaw(struct device *dev)
                  ^
      Signed-off-by: NHiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Tested-by: NAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      3e7986f6
    • V
      e1000e: Correctly include VLAN_HLEN when changing interface MTU · c751a3d5
      Vlad Yasevich 提交于
      When changing the interface mtu, the driver starts with a value
      that doesn't include VLAN_HLEN.  Later tests in the driver
      set the rx_buffer_len based on the mtu.  As a result, when
      the user increases the mtu to 1504 (to support 802.1AD for example),
      the driver rx_buffer_len does not change and frames longer
      the 1522 bytes are rejected as too long.
      
      Include VLAN_HLEN from the start so that an user mtu greater then
      1500 bytes is correctly reflected in the driver rx_buffer_len.
      
      CC: e1000-devel@lists.sourceforge.net
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Tested-by: NAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      c751a3d5
    • J
      i40e: fix TCP flag replication for hardware offload · 059dab69
      Jesse Brandeburg 提交于
      As reported by Eric Dumazet, the i40e driver was allowing the hardware
      to replicate the PSH flag on all segments of a TSO operation.
      
      This patch fixes the first/middle/last TCP flags settings which
      makes the TSO operations work correctly.
      
      With this change we are now configuring the CWR bit to only be set
      in the first packet of a TSO, so this patch also enables TSO_ECN,
      in order to advertise to the stack that we do the right thing
      on the wire.
      Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      059dab69
    • F
      i40e: remove open-coded skb_cow_head · dd225bc6
      Francois Romieu 提交于
      Signed-off-by: NFrancois Romieu <romieu@fr.zoreil.com>
      Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      dd225bc6
    • V
      net: sctp: cache auth_enable per endpoint · b14878cc
      Vlad Yasevich 提交于
      Currently, it is possible to create an SCTP socket, then switch
      auth_enable via sysctl setting to 1 and crash the system on connect:
      
      Oops[#1]:
      CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1
      task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000
      [...]
      Call Trace:
      [<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80
      [<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4
      [<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c
      [<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8
      [<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214
      [<ffffffff8043af68>] sctp_rcv+0x588/0x630
      [<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24
      [<ffffffff803acb50>] ip6_input+0x2c0/0x440
      [<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564
      [<ffffffff80310650>] process_backlog+0xb4/0x18c
      [<ffffffff80313cbc>] net_rx_action+0x12c/0x210
      [<ffffffff80034254>] __do_softirq+0x17c/0x2ac
      [<ffffffff800345e0>] irq_exit+0x54/0xb0
      [<ffffffff800075a4>] ret_from_irq+0x0/0x4
      [<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48
      [<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148
      [<ffffffff805a88b0>] start_kernel+0x37c/0x398
      Code: dd0900b8  000330f8  0126302d <dcc60000> 50c0fff1  0047182a  a48306a0
      03e00008  00000000
      ---[ end trace b530b0551467f2fd ]---
      Kernel panic - not syncing: Fatal exception in interrupt
      
      What happens while auth_enable=0 in that case is, that
      ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs()
      when endpoint is being created.
      
      After that point, if an admin switches over to auth_enable=1,
      the machine can crash due to NULL pointer dereference during
      reception of an INIT chunk. When we enter sctp_process_init()
      via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk,
      the INIT verification succeeds and while we walk and process
      all INIT params via sctp_process_param() we find that
      net->sctp.auth_enable is set, therefore do not fall through,
      but invoke sctp_auth_asoc_set_default_hmac() instead, and thus,
      dereference what we have set to NULL during endpoint
      initialization phase.
      
      The fix is to make auth_enable immutable by caching its value
      during endpoint initialization, so that its original value is
      being carried along until destruction. The bug seems to originate
      from the very first days.
      
      Fix in joint work with Daniel Borkmann.
      Reported-by: NJoshua Kinard <kumba@gentoo.org>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Tested-by: NJoshua Kinard <kumba@gentoo.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b14878cc
    • D
      Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · 5a292f7b
      David S. Miller 提交于
      John W. Linville says:
      
      ====================
      pull request: wireless 2014-04-17
      
      Please pull this batch of fixes intended for the 3.15 stream...
      
      For the mac80211 bits, Johannes says:
      
      "We have a fix from Chun-Yeow to not look at management frame bitrates
      that are typically really low, two fixes from Felix for AP_VLAN
      interfaces, a fix from Ido to disable SMPS settings when a monitor
      interface is enabled, a radar detection fix from Michał and a fix from
      myself for a very old remain-on-channel bug."
      
      For the iwlwifi bits, Emmanuel says:
      
      "I have new device IDs and a new firmware API. These are the trivial
      ones. The less trivial ones are Johannes's fix that delays the
      enablement of an interrupt coalescing hardware until after association
      - this fixes a few connection problems seen in the field. Eyal has a
      bunch of rate control fixes. I decided to add these for 3.15 because
      they fix some disconnection and packet loss scenarios which were
      reported by the field. I also have a fix for a memory leak that
      happens only with a very new NIC."
      
      Along with those...
      
      Amitkumar Karwar fixes a couple of problems relating to driver/firmware
      interactions in mwifiex.
      
      Christian Engelmayer avoids a couple of potential memory leaks in
      the new rsi driver.
      
      Eliad Peller provides a wl18xx mailbox alignment fix for problems
      when using new firmware.
      
      Frederic Danis adds a couple of missing debugging strings to the
      cw1200 driver.
      
      Geert Uytterhoeven adds a variable initialization inside of the
      rsi driver.
      
      Luciano Coelho patches the wlcore code to ignore dummy packet events
      in PLT mode in order to work around a firmware bug.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a292f7b
    • I
      tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled · ba67b510
      Ivan Vecera 提交于
      The patch fixes a problem with dropped jumbo frames after usage of
      'ethtool -G ... rx'.
      
      Scenario:
      1. ip link set eth0 up
      2. ethtool -G eth0 rx N # <- This zeroes rx-jumbo
      3. ip link set mtu 9000 dev eth0
      
      The ethtool command set rx_jumbo_pending to zero so any received jumbo
      packets are dropped and you need to use 'ethtool -G eth0 rx-jumbo N'
      to workaround the issue.
      The patch changes the logic so rx_jumbo_pending value is changed only if
      jumbo frames are enabled (MTU > 1500).
      Signed-off-by: NIvan Vecera <ivecera@redhat.com>
      Acked-by: NMichael Chan <mchan@broadcom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ba67b510
    • D
      vlan: Fix lockdep warning when vlan dev handle notification · dc8eaaa0
      dingtianhong 提交于
      When I open the LOCKDEP config and run these steps:
      
      modprobe 8021q
      vconfig add eth2 20
      vconfig add eth2.20 30
      ifconfig eth2 xx.xx.xx.xx
      
      then the Call Trace happened:
      
      [32524.386288] =============================================
      [32524.386293] [ INFO: possible recursive locking detected ]
      [32524.386298] 3.14.0-rc2-0.7-default+ #35 Tainted: G           O
      [32524.386302] ---------------------------------------------
      [32524.386306] ifconfig/3103 is trying to acquire lock:
      [32524.386310]  (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0
      [32524.386326]
      [32524.386326] but task is already holding lock:
      [32524.386330]  (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40
      [32524.386341]
      [32524.386341] other info that might help us debug this:
      [32524.386345]  Possible unsafe locking scenario:
      [32524.386345]
      [32524.386350]        CPU0
      [32524.386352]        ----
      [32524.386354]   lock(&vlan_netdev_addr_lock_key/1);
      [32524.386359]   lock(&vlan_netdev_addr_lock_key/1);
      [32524.386364]
      [32524.386364]  *** DEADLOCK ***
      [32524.386364]
      [32524.386368]  May be due to missing lock nesting notation
      [32524.386368]
      [32524.386373] 2 locks held by ifconfig/3103:
      [32524.386376]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81431d42>] rtnl_lock+0x12/0x20
      [32524.386387]  #1:  (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40
      [32524.386398]
      [32524.386398] stack backtrace:
      [32524.386403] CPU: 1 PID: 3103 Comm: ifconfig Tainted: G           O 3.14.0-rc2-0.7-default+ #35
      [32524.386409] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [32524.386414]  ffffffff81ffae40 ffff8800d9625ae8 ffffffff814f68a2 ffff8800d9625bc8
      [32524.386421]  ffffffff810a35fb ffff8800d8a8d9d0 00000000d9625b28 ffff8800d8a8e5d0
      [32524.386428]  000003cc00000000 0000000000000002 ffff8800d8a8e5f8 0000000000000000
      [32524.386435] Call Trace:
      [32524.386441]  [<ffffffff814f68a2>] dump_stack+0x6a/0x78
      [32524.386448]  [<ffffffff810a35fb>] __lock_acquire+0x7ab/0x1940
      [32524.386454]  [<ffffffff810a323a>] ? __lock_acquire+0x3ea/0x1940
      [32524.386459]  [<ffffffff810a4874>] lock_acquire+0xe4/0x110
      [32524.386464]  [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0
      [32524.386471]  [<ffffffff814fc07a>] _raw_spin_lock_nested+0x2a/0x40
      [32524.386476]  [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0
      [32524.386481]  [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0
      [32524.386489]  [<ffffffffa0500cab>] vlan_dev_set_rx_mode+0x2b/0x50 [8021q]
      [32524.386495]  [<ffffffff8141addf>] __dev_set_rx_mode+0x5f/0xb0
      [32524.386500]  [<ffffffff8141af8b>] dev_set_rx_mode+0x2b/0x40
      [32524.386506]  [<ffffffff8141b3cf>] __dev_open+0xef/0x150
      [32524.386511]  [<ffffffff8141b177>] __dev_change_flags+0xa7/0x190
      [32524.386516]  [<ffffffff8141b292>] dev_change_flags+0x32/0x80
      [32524.386524]  [<ffffffff8149ca56>] devinet_ioctl+0x7d6/0x830
      [32524.386532]  [<ffffffff81437b0b>] ? dev_ioctl+0x34b/0x660
      [32524.386540]  [<ffffffff814a05b0>] inet_ioctl+0x80/0xa0
      [32524.386550]  [<ffffffff8140199d>] sock_do_ioctl+0x2d/0x60
      [32524.386558]  [<ffffffff81401a52>] sock_ioctl+0x82/0x2a0
      [32524.386568]  [<ffffffff811a7123>] do_vfs_ioctl+0x93/0x590
      [32524.386578]  [<ffffffff811b2705>] ? rcu_read_lock_held+0x45/0x50
      [32524.386586]  [<ffffffff811b39e5>] ? __fget_light+0x105/0x110
      [32524.386594]  [<ffffffff811a76b1>] SyS_ioctl+0x91/0xb0
      [32524.386604]  [<ffffffff815057e2>] system_call_fastpath+0x16/0x1b
      
      ========================================================================
      
      The reason is that all of the addr_lock_key for vlan dev have the same class,
      so if we change the status for vlan dev, the vlan dev and its real dev will
      hold the same class of addr_lock_key together, so the warning happened.
      
      we should distinguish the lock depth for vlan dev and its real dev.
      
      v1->v2: Convert the vlan_netdev_addr_lock_key to an array of eight elements, which
      	could support to add 8 vlan id on a same vlan dev, I think it is enough for current
      	scene, because a netdev's name is limited to IFNAMSIZ which could not hold 8 vlan id,
      	and the vlan dev would not meet the same class key with its real dev.
      
      	The new function vlan_dev_get_lockdep_subkey() will return the subkey and make the vlan
      	dev could get a suitable class key.
      
      v2->v3: According David's suggestion, I use the subclass to distinguish the lock key for vlan dev
      	and its real dev, but it make no sense, because the difference for subclass in the
      	lock_class_key doesn't mean that the difference class for lock_key, so I use lock_depth
      	to distinguish the different depth for every vlan dev, the same depth of the vlan dev
      	could have the same lock_class_key, I import the MAX_LOCK_DEPTH from the include/linux/sched.h,
      	I think it is enough here, the lockdep should never exceed that value.
      
      v3->v4: Add a huge array of locking keys will waste static kernel memory and is not a appropriate method,
      	we could use _nested() variants to fix the problem, calculate the depth for every vlan dev,
      	and use the depth as the subclass for addr_lock_key.
      Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dc8eaaa0
  2. 17 4月, 2014 19 次提交
    • J
      Merge branch 'master' of... · 4a0c3d9f
      John W. Linville 提交于
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
      4a0c3d9f
    • K
      seccomp: fix memory leak on filter attach · 0acf07d2
      Kees Cook 提交于
      This sets the correct error code when final filter memory is unavailable,
      and frees the raw filter no matter what.
      
      unreferenced object 0xffff8800d6ea4000 (size 512):
        comm "sshd", pid 278, jiffies 4294898315 (age 46.653s)
        hex dump (first 32 bytes):
          21 00 00 00 04 00 00 00 15 00 01 00 3e 00 00 c0  !...........>...
          06 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00  ........!.......
        backtrace:
          [<ffffffff8151414e>] kmemleak_alloc+0x4e/0xb0
          [<ffffffff811a3a40>] __kmalloc+0x280/0x320
          [<ffffffff8110842e>] prctl_set_seccomp+0x11e/0x3b0
          [<ffffffff8107bb6b>] SyS_prctl+0x3bb/0x4a0
          [<ffffffff8152ef2d>] system_call_fastpath+0x1a/0x1f
          [<ffffffffffffffff>] 0xffffffffffffffff
      Reported-by: NMasami Ichikawa <masami256@gmail.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Tested-by: NMasami Ichikawa <masami256@gmail.com>
      Acked-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0acf07d2
    • D
      isdn: icn: buffer overflow in icn_command() · b7a31405
      Dan Carpenter 提交于
      This buffer over was detected using static analysis:
      
      	drivers/isdn/icn/icn.c:1325 icn_command()
      	error: format string overflow. buf_size: 60 length: 98
      
      The calculation for the length of the string is off because it assumes
      that the dial[] buffer holds a 50 character string, but actually it is
      at most 31 characters and NUL.  I have removed the dial[] buffer because
      it isn't needed.
      
      The maximum length of the string is actually 79 characters and a NUL.  I
      have made the cbuf[] array large enough to hold it and changed the
      sprintf() to an snprintf() as a further safety enhancement.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b7a31405
    • N
      ip6_tunnel: use the right netns in ioctl handler · 74462f0d
      Nicolas Dichtel 提交于
      Because the netdevice may be in another netns than the i/o netns, we should
      use the i/o netns instead of dev_net(dev).
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      74462f0d
    • N
      sit: use the right netns in ioctl handler · 9aad77c3
      Nicolas Dichtel 提交于
      Because the netdevice may be in another netns than the i/o netns, we should
      use the i/o netns instead of dev_net(dev).
      
      Note that netdev_priv(dev) cannot bu NULL, hence we can remove these useless
      checks.
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9aad77c3
    • N
      ip_tunnel: use the right netns in ioctl handler · 8c923ce2
      Nicolas Dichtel 提交于
      Because the netdevice may be in another netns than the i/o netns, we should
      use the i/o netns instead of dev_net(dev).
      
      The variable 'tunnel' was used only to get 'itn', hence to simplify code I
      remove it and use 't' instead.
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8c923ce2
    • J
      net: use SYSCALL_DEFINEx for sys_recv · b7c0ddf5
      Jan Glauber 提交于
      Make sys_recv a first class citizen by using the SYSCALL_DEFINEx
      macro. Besides being cleaner this will also generate meta data
      for the system call so tracing tools like ftrace or LTTng can
      resolve this system call.
      Signed-off-by: NJan Glauber <jan.glauber@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b7c0ddf5
    • D
      Merge branch 'mdio-gpio' · c3206e6f
      David S. Miller 提交于
      Guenter Roeck says:
      
      ====================
      net: mdio-gpio enhancements
      
      The following series of patches adds support for active-low gpio pins
      as well as for systems with separate MDI and MDO pins to the mdio-gpio
      driver.
      
      A board using those features is based on a COM Express CPU board.
      The COM Express standard supports GPIO pins on its connector,
      with one caveat: The pins on the connector have fixed direction
      and are hard configured either as input or output pins.
      The COM Express Design Guide [1] provides additional details.
      
      The hardware uses three of the GPO/GPI pins from the COM Express board
      to drive an MDIO bus. Connectivity between GPI/GPO pins and the MDIO bus
      is as follows.
      
      GPI2 --------------------+------------ MDIO
                               |
                  +--------+   |
      GPO2 ---+---G        |   |
              |   |        |   |
             4.7k | 2N7002 D---+
      	|   |        |
      	+---S        |
      	|   +--------+
             GND
      
      GPO1 --------------------------------- MDC
      
      To support this hardware, two extensions to the driver were necessary.
      
      - Due to the FET in the MDO path (GPO2), the MDO signal is inverted.
        The driver therefore has to support active-low GPIO pins.
      
      - The MDIO signal must be separated into MDI and MDO.
      
      Those changes are implemented in patch 2/3 and 3/3.
      Patch 1/3 simplifies the error path and thus the subsequent
      patches.
      
      [1] http://www.picmg.org/pdf/picmg_comdg_100.pdf
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c3206e6f
    • G
      net: mdio-gpio: Add support for separate MDI and MDO gpio pins · f1d54c47
      Guenter Roeck 提交于
      This is for a system with fixed assignments of input and output pins
      (various variants of Kontron COMe).
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f1d54c47
    • G
      net: mdio-gpio: Add support for active low gpio pins · 1d251481
      Guenter Roeck 提交于
      Some systems using mdio-gpio may use active-low gpio pins
      (eg with inverters or FETs connected to all or some of the
      gpio pins).
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1d251481
    • G
      net: mdio-gpio: Use devm_ functions where possible · 78cdb079
      Guenter Roeck 提交于
      This simplifies error path and deinit/removal functions.
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Tested-by: NChris Healy <cphealy@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      78cdb079
    • D
      Merge branch 'fib_validate_loopback' · bc383ea5
      David S. Miller 提交于
      Cong Wang says:
      
      ====================
      ipv4: fix flowi4_iif for input routing
      
      This patchset fixes ->flowi4_iif for input routing and rp filter,
      based on suggestion from Julian. See per patch for details.
      
      v1 -> v2:
      * merge the first two patches into one
      * fix fib_check_nh() too
      * add this cover letter
      ====================
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NCong Wang <cwang@twopensource.com>
      Reviewed-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bc383ea5
    • C
      ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source() · 0d5edc68
      Cong Wang 提交于
      In my special case, when a packet is redirected from veth0 to lo,
      its skb->dev->ifindex would be LOOPBACK_IFINDEX. Meanwhile we
      pass the hard-coded LOOPBACK_IFINDEX to fib_validate_source()
      in ip_route_input_slow(). This would cause the following check
      in fib_validate_source() fail:
      
                  (dev->ifindex != oif || !IN_DEV_TX_REDIRECTS(idev))
      
      when rp_filter is disabeld on loopback. As suggested by Julian,
      the caller should pass 0 here so that we will not end up by
      calling __fib_validate_source().
      
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0d5edc68
    • C
      ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif · 6a662719
      Cong Wang 提交于
      As suggested by Julian:
      
      	Simply, flowi4_iif must not contain 0, it does not
      	look logical to ignore all ip rules with specified iif.
      
      because in fib_rule_match() we do:
      
              if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
                      goto out;
      
      flowi4_iif should be LOOPBACK_IFINDEX by default.
      
      We need to move LOOPBACK_IFINDEX to include/net/flow.h:
      
      1) It is mostly used by flowi_iif
      
      2) Fix the following compile error if we use it in flow.h
      by the patches latter:
      
      In file included from include/linux/netfilter.h:277:0,
                       from include/net/netns/netfilter.h:5,
                       from include/net/net_namespace.h:21,
                       from include/linux/netdevice.h:43,
                       from include/linux/icmpv6.h:12,
                       from include/linux/ipv6.h:61,
                       from include/net/ipv6.h:16,
                       from include/linux/sunrpc/clnt.h:27,
                       from include/linux/nfs_fs.h:30,
                       from init/do_mounts.c:32:
      include/net/flow.h: In function ‘flowi4_init_output’:
      include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)
      
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6a662719
    • C
      mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll · c98235cb
      Chris Mason 提交于
      The mlx4 driver is triggering schedules while atomic inside
      mlx4_en_netpoll:
      
      	spin_lock_irqsave(&cq->lock, flags);
      	napi_synchronize(&cq->napi);
      		^^^^^ msleep here
      	mlx4_en_process_rx_cq(dev, cq, 0);
      	spin_unlock_irqrestore(&cq->lock, flags);
      
      This was part of a patch by Alexander Guller from Mellanox in 2011,
      but it still isn't upstream.
      Signed-off-by: NChris Mason <clm@fb.com>
      cc: stable@vger.kernel.org
      Acked-By: NAmir Vadai <amirv@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c98235cb
    • D
      Merge branch 'mvneta_qsgmii' · b07afe07
      David S. Miller 提交于
      Thomas Petazzoni says:
      
      ====================
      net: mvneta: fix usage as a module, and support QSGMII properly
      
      This set of patches is a new attempt at fixing the operation of the
      mvneta driver when built as a module. For the record, the previous
      attempt, merged in commit e3a8786c
      ('net: mvneta: fix usage as a module on RGMII configurations') caused
      problems for all RGMII configurations.
      
      In fact, it turned out that the MAC to PHY connection on the Armada XP
      GP, which was described as using RGMII-ID according to its Device
      Tree, is in fact a QSGMII connection. And the RGMII and QSGMII
      configurations have to be handled in a different way in the driver,
      because the SERDES configuration is different in those two cases.
      
      So, this patch series fixes that by:
      
       * Adding minimal handling of a "qsgmii" connection type in the PHY
         layer. Mainly to make sure that a "qsgmii" phy-mode in the Device
         Tree is recognized, and handed over to the driver as
         PHY_INTERFACE_QSGMII.
      
       * Changing the mvneta driver to properly configure the RGMIIEn and
         PCSEn bits in the GMAC_CTRL_2 register, and configure the SERDES
         register, in the three possible cases: RGMII, SGMII and QSGMII.
      
       * Updating the Device Tree of the Armada XP GP board to reflect the
         fact that it uses a QSGMII MAC/PHY connection.
      
      PATCH 1 and 2 would be merged by David Miller, through the net tree,
      while PATCH 3 would be merged by the mach-mvebu maintainers, through
      their tree and arm-soc.
      
      This set of patches has been tested on:
      
       * Armada XP GP (four QSGMII interfaces)
       * Armada XP DB (two RGMII interfaces and two SGMII interfaces)
       * Armada 370 Mirabox (two RGMII interfaces)
      
      I've tested both the driver built-in, and compiled as a module.
      
      Since the last attempt at fixing this was quite a fiasco, I'd like
      this new attempt to be tested more widely before being applied. I'll
      try to do some testing on other Armada boards I have, but independent
      testing from other persons would also be appreciated.
      
      Note that these patches apply after reverting the previous attempt,
      obviously.
      ====================
      Tested-by: NArnaud Ebalard <arno@natisbad.org>
      Tested-by: NWilly Tarreau <w@1wt.eu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b07afe07
    • T
      net: mvneta: properly configure the MAC <-> PHY connection in all situations · 3f1dd4bc
      Thomas Petazzoni 提交于
      Commit 5445eaf3 ('mvneta: Try to fix mvneta when compiled as
      module') fixed the mvneta driver to make it work properly when loaded
      as a module in SGMII configuration, which was tested successful by the
      author on the Armada XP OpenBlocks AX3, which uses SGMII.
      
      However, some other platforms, namely the Armada XP GP don't use
      SGMII, but a QSGMII connection between the MAC and the PHY, and this
      case was not supported by the mvneta driver, which was relying on
      configuration put in place by the bootloader. While this works when
      the mvneta driver is built-in (because clocks are not gated), it
      breaks when mvneta is built as a module, because the clock is gated
      (all configuration is lost) and then re-enabled when the mvneta driver
      is loaded.
      
      In order to support all of RGMII, SGMII and QSGMII, this commit
      reworks how the PHY interface configuration is done, and simplifies
      it: it removes the mvneta_port_sgmii_config() and
      mvneta_gmac_rgmii_set() functions, which were strange because
      mvneta_gmac_rgmii_set() was called in all cases, even for SGMII
      configurations. Also, the mvneta_gmac_rgmii_set() function was taking
      a boolean as argument, which was always true.
      
      Instead, all the PHY interface configuration logic is moved into the
      mvneta_port_power_up() function, in a much simpler 'switch' construct,
      with four cases:
      
       - QSGMII: the RGMIIEn bit, the PCSEn bit in GMAC_CTRL_2 are set, and
         the SERDES is configured in QSGMII. Technically speaking,
         configuring the SERDES of the first port would be sufficient, but
         it is simpler to do it on all ports.
      
       - SGMII: the RGMIIEn bit, the PCSEn bit in GMAC_CTRL_2 are set, and
         the SERDES is configured as SGMII.
      
       - RGMII: the RGMIIEn bit in GMAC_CTRL_2 is set. The PCSEn bit is kept
         cleared, and no SERDES configuration is done, because RGMII is not
         using SERDES lanes.
      
       - other: an error is returned. For this reason, the
         mvneta_port_power_up() now returns an int instead of nothing, and
         the return value is checked by mvneta_probe().
      
      This has been successfully tested on:
      
       * Armada XP DB, which has two RGMII and two SGMII connections
       * Armada XP GP, which uses QSGMII for its four interfaces
       * Armada 370 Mirabox, which has two RGMII connections
      Signed-off-by: NThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3f1dd4bc
    • T
      net: phy: add minimal support for QSGMII PHY · b9d12085
      Thomas Petazzoni 提交于
      This commit adds the necessary definitions for the PHY layer to
      recognize "qsgmii" as a valid PHY interface. A QSMII interface, as
      defined at
      http://en.wikipedia.org/wiki/Media_Independent_Interface#Quad_Serial_Gigabit_Media_Independent_Interface,
      is "is a method of combining four SGMII lines into a 5Gbit/s
      interface. QSGMII, like SGMII, uses LVDS signalling for the TX and RX
      data and a single LVDS clock signal. QSGMII uses significantly fewer
      signal lines than four SGMII busses."
      
      This type of MAC <-> PHY connection might require special handling on
      the MAC driver side, so it should be possible to express this type of
      MAC <-> PHY connection, for example in the Device Tree.
      Signed-off-by: NThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: devicetree@vger.kernel.org
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b9d12085
    • E
      sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast) · e283546c
      Edward Cree 提交于
      When an MCDI command times out (whether or not we find it
      completed when we poll), call efx_mcdi_abandon(), which tells
      all subsequent MCDI calls to fail-fast, and queues up an FLR.
      
      Because an FLR doesn't lead to receiving any reboot even from
      the MC (unlike most other types of reset), we have to call
      efx_ef10_reset_mc_allocations.
      In efx_start_all(), if a reset (of any kind) is pending, we
      bail out.
      Without this, attempts to reconfigure (e.g. change mtu) can
      cause driver/mc state inconsistency if the first MCDI call
      triggers an FLR.
      
      For similar reasons, on EF10, in
      efx_reset_down(method=RESET_TYPE_MCDI_TIMEOUT), set the number
      of active queues to zero before calling efx_stop_all().
      And, on farch, in efx_reset_up(method=RESET_TYPE_MCDI_TIMEOUT),
      set active_queues and flushes pending & outstanding to zero.
      
      efx_mcdi_mode_{poll,event}() should not take us out of fail-fast
       mode. Instead, this is done by efx_mcdi_reset() after the FLR
      completes.
      
      The new FLR reset_type RESET_TYPE_MCDI_TIMEOUT doesn't really
      fit into the hierarchy of reset 'scopes' whereby efx_reset()
      decides some resets subsume others.  Thus, it uses separate logic.
      
      Also, fixed up some inconsistency around RESET_TYPE_MC_BIST,
      which was in the wrong place in that hierarchy.
      Signed-off-by: NShradha Shah <sshah@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e283546c
  3. 16 4月, 2014 8 次提交
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 10ec34fc
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Fix BPF filter validation of netlink attribute accesses, from
          Mathias Kruase.
      
       2) Netfilter conntrack generation seqcount not initialized properly,
          from Andrey Vagin.
      
       3) Fix comparison mask computation on big-endian in nft_cmp_fast(),
          from Patrick McHardy.
      
       4) Properly limit MTU over ipv6, from Eric Dumazet.
      
       5) Fix seccomp system call argument population on 32-bit, from Daniel
          Borkmann.
      
       6) skb_network_protocol() should not use hard-coded ETH_HLEN, instead
          skb->mac_len needs to be used.  From Vlad Yasevich.
      
       7) We have several cases of using socket based communications to
          implement a tunnel.  For example, some tunnels are encapsulations
          over UDP so we use an internal kernel UDP socket to do the
          transmits.
      
          These tunnels should behave just like other software devices and
          pass the packets on down to the next layer.
      
          Most importantly we want the top-level socket (eg TCP) that created
          the traffic to be charged for the SKB memory.
      
          However, once you get into the IP output path, we have code that
          assumed that whatever was attached to skb->sk is an IP socket.
      
          To keep the top-level socket being charged for the SKB memory,
          whilst satisfying the needs of the IP output path, we now pass in an
          explicit 'sk' argument.
      
          From Eric Dumazet.
      
       8) ping_init_sock() leaks group info, from Xiaoming Wang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
        cxgb4: use the correct max size for firmware flash
        qlcnic: Fix MSI-X initialization code
        ip6_gre: don't allow to remove the fb_tunnel_dev
        ipv4: add a sock pointer to dst->output() path.
        ipv4: add a sock pointer to ip_queue_xmit()
        driver/net: cosa driver uses udelay incorrectly
        at86rf230: fix __at86rf230_read_subreg function
        at86rf230: remove check if AVDD settled
        net: cadence: Add architecture dependencies
        net: Start with correct mac_len in skb_network_protocol
        Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
        cxgb4: Save the correct mac addr for hw-loopback connections in the L2T
        net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W
        seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
        qlcnic: Do not disable SR-IOV when VFs are assigned to VMs
        qlcnic: Fix QLogic application/driver interface for virtual NIC configuration
        qlcnic: Fix PVID configuration on eSwitch port.
        qlcnic: Fix max ring count calculation
        qlcnic: Fix to send INIT_NIC_FUNC as first mailbox.
        qlcnic: Fix panic due to uninitialzed delayed_work struct in use.
        ...
      10ec34fc
    • S
      cxgb4: use the correct max size for firmware flash · 6f1d7210
      Steve Wise 提交于
      The wrong max fw size was being used and causing false
      "too big" errors running ethtool -f.
      Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6f1d7210
    • A
      qlcnic: Fix MSI-X initialization code · 8564ae09
      Alexander Gordeev 提交于
      Function qlcnic_setup_tss_rss_intr() might enter endless
      loop in case pci_enable_msix() contiguously returns a
      positive number of MSI-Xs that could have been allocated.
      Besides, the function contains 'err = -EIO;' assignment
      that never could be reached. This update fixes the
      aforementioned issues.
      
      Cc: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Cc: Dept-HSGLinuxNICDev@qlogic.com
      Cc: netdev@vger.kernel.org
      Cc: linux-pci@vger.kernel.org
      Signed-off-by: NAlexander Gordeev <agordeev@redhat.com>
      Acked-by: NShahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8564ae09
    • N
      ip6_gre: don't allow to remove the fb_tunnel_dev · 54d63f78
      Nicolas Dichtel 提交于
      It's possible to remove the FB tunnel with the command 'ip link del ip6gre0' but
      this is unsafe, the module always supposes that this device exists. For example,
      ip6gre_tunnel_lookup() may use it unconditionally.
      
      Let's add a rtnl handler for dellink, which will never remove the FB tunnel (we
      let ip6gre_destroy_tunnels() do the job).
      
      Introduced by commit c12b395a ("gre: Support GRE over IPv6").
      
      CC: Dmitry Kozlov <xeb@mail.ru>
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      54d63f78
    • E
      ipv4: add a sock pointer to dst->output() path. · aad88724
      Eric Dumazet 提交于
      In the dst->output() path for ipv4, the code assumes the skb it has to
      transmit is attached to an inet socket, specifically via
      ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the
      provider of the packet is an AF_PACKET socket.
      
      The dst->output() method gets an additional 'struct sock *sk'
      parameter. This needs a cascade of changes so that this parameter can
      be propagated from vxlan to final consumer.
      
      Fixes: 8f646c92 ("vxlan: keep original skb ownership")
      Reported-by: Nlucien xin <lucien.xin@gmail.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aad88724
    • A
      mwifiex: fix hung task on command timeout · f8d2b920
      Amitkumar Karwar 提交于
      Sometimes when command timeout occurs due to a firmware or
      hardware bug, there may be some synchronous commands in command
      queue. These commands are never downloaded to firmware causing
      hung task warnings. This patch replaces wait_event_interruptible
      call with wait_event_interruptible_timeout to fix the issue.
      Signed-off-by: NAmitkumar Karwar <akarwar@marvell.com>
      Signed-off-by: NAvinash Patil <patila@marvell.com>
      Signed-off-by: NBing Zhao <bzhao@marvell.com>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      f8d2b920
    • A
      mwifiex: process event before command response · 20474129
      Amitkumar Karwar 提交于
      During extended scan, SCAN report event is always followed by
      command response. Sometimes It is observed that command response
      is processed before SCAN report which leads to a crash, because
      current command node is cleared while handling the response.
      This patch makes sure that driver's main thread gives priority
      to events over command responses.
      Signed-off-by: NAmitkumar Karwar <akarwar@marvell.com>
      Signed-off-by: NMaithili Hinge <maithili@marvell.com>
      Signed-off-by: NBing Zhao <bzhao@marvell.com>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      20474129
    • E
      ipv4: add a sock pointer to ip_queue_xmit() · b0270e91
      Eric Dumazet 提交于
      ip_queue_xmit() assumes the skb it has to transmit is attached to an
      inet socket. Commit 31c70d59 ("l2tp: keep original skb ownership")
      changed l2tp to not change skb ownership and thus broke this assumption.
      
      One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(),
      so that we do not assume skb->sk points to the socket used by l2tp
      tunnel.
      
      Fixes: 31c70d59 ("l2tp: keep original skb ownership")
      Reported-by: NZhan Jianyu <nasa4836@gmail.com>
      Tested-by: NZhan Jianyu <nasa4836@gmail.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b0270e91
  4. 15 4月, 2014 3 次提交