1. 25 Jul, 2015 2 commits
    • J
      ipv4: consider TOS in fib_select_default · 2392debc
      Julian Anastasov committed
      fib_select_default considers alternative routes only when
      res->fi is for the first alias in res->fa_head. In the
      common case this can happen only when the initial lookup
      matches the first alias with the highest TOS value. This
      prevents the alternative routes from requiring a specific TOS.
      
      This patch solves the problem as follows:
      
      - routes that require specific TOS should be returned by
      fib_select_default only when TOS matches, as already done
      in fib_table_lookup. This rule implies that depending on the
      TOS we can have many different lists of alternative gateways
      and we have to keep the last used gateway (fa_default) in the first
      alias for the TOS instead of using a single tb_default value.
      
      - as the aliases are ordered by many keys (TOS desc,
      fib_priority asc), we restrict the possible results to
      routes with matching TOS and lowest metric (fib_priority)
      and routes that match any TOS, again with lowest metric.
      
      For example, a packet with TOS 8 cannot use gw3 (not lowest
      metric), gw4 (different TOS) or gw6 (not lowest metric);
      all other gateways can be used:
      
      tos 8 via gw1 metric 2 <--- res->fa_head and res->fi
      tos 8 via gw2 metric 2
      tos 8 via gw3 metric 3
      tos 4 via gw4
      tos 0 via gw5
      tos 0 via gw6 metric 1
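
The selection rule above can be illustrated with a small user-space sketch. This is not the kernel code: `struct route`, `select_candidates` and `example_routes` are hypothetical names, and routes listed without a metric are assumed to have metric 0.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a fib alias; not the kernel struct. */
struct route {
    unsigned char tos;    /* 0 means "matches any TOS" */
    unsigned int metric;  /* fib_priority: lower is preferred */
    const char *gw;
};

/* The example list from the commit message (missing metric assumed to be 0). */
static const struct route example_routes[] = {
    { 8, 2, "gw1" }, { 8, 2, "gw2" }, { 8, 3, "gw3" },
    { 4, 0, "gw4" }, { 0, 0, "gw5" }, { 0, 1, "gw6" },
};

/*
 * Routes are assumed to be sorted by TOS descending, then metric ascending.
 * Sets eligible[i] for every route a packet with the given TOS may use and
 * returns how many routes qualified.
 */
static size_t select_candidates(const struct route *r, size_t n,
                                unsigned char tos, bool *eligible)
{
    bool have_tos = false, have_any = false;
    unsigned int best_tos = 0, best_any = 0;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        eligible[i] = false;
        if (r[i].tos != 0 && r[i].tos == tos) {
            /* First match for this TOS carries the lowest metric. */
            if (!have_tos) {
                have_tos = true;
                best_tos = r[i].metric;
            }
            eligible[i] = (r[i].metric == best_tos);
        } else if (r[i].tos == 0) {
            /* Routes for any TOS, again restricted to the lowest metric. */
            if (!have_any) {
                have_any = true;
                best_any = r[i].metric;
            }
            eligible[i] = (r[i].metric == best_any);
        }
        if (eligible[i])
            count++;
    }
    return count;
}
```

Under these assumptions, a packet with TOS 8 is left with gw1, gw2 and gw5 as candidates.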
      Reported-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      ipv4: fib_select_default should match the prefix · 18a912e9
      Julian Anastasov committed
      fib_trie starting from 4.1 can link fib aliases from
      different prefixes in the same list. Make sure the alternative
      gateways are in the same table and for the same prefix (0) by
      checking tb_id and fa_slen.
      
      Fixes: 79e5ad2c ("fib_trie: Remove leaf_info")
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 22 Jul, 2015 4 commits
    • C
      openvswitch: allocate nr_node_ids flow_stats instead of num_possible_nodes · bac541e4
      Chris J Arges committed
      Some architectures like POWER can have a NUMA node_possible_map that
      contains sparse entries. This causes memory corruption with openvswitch
      since it allocates flow_cache with a multiple of num_possible_nodes() and
      assumes the node variable returned by for_each_node will index into
      flow->stats[node].
      
      Use nr_node_ids to allocate a maximal sparse array instead of
      num_possible_nodes().
      
      The crash was noticed after 3af229f2 was applied as it changed the
      node_possible_map to match node_online_map on boot.
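
The difference between counting possible nodes and sizing by the highest node ID can be sketched in user space as follows. `struct node_map`, `count_possible`, `highest_id_plus_one` and `alloc_stats` are hypothetical stand-ins for the kernel's node mask, num_possible_nodes(), nr_node_ids and the flow stats allocation; they are not kernel APIs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space model of a sparse NUMA node map. */
#define MAX_NODES 32

struct node_map {
    unsigned char possible[MAX_NODES];  /* 1 if the node ID is possible */
};

/* Models num_possible_nodes(): how many node IDs are set. */
static size_t count_possible(const struct node_map *m)
{
    size_t n = 0;

    for (size_t i = 0; i < MAX_NODES; i++)
        n += m->possible[i] ? 1 : 0;
    return n;
}

/* Models nr_node_ids: the highest possible node ID plus one. */
static size_t highest_id_plus_one(const struct node_map *m)
{
    size_t top = 0;

    for (size_t i = 0; i < MAX_NODES; i++)
        if (m->possible[i])
            top = i + 1;
    return top;
}

/* One slot per node ID, so stats[node] stays in bounds for sparse maps. */
static unsigned long *alloc_stats(const struct node_map *m)
{
    return calloc(highest_id_plus_one(m), sizeof(unsigned long));
}
```

With possible nodes 0 and 16, the count is 2 but the array needs 17 slots; sizing it by the count would make `stats[16]` an out-of-bounds access.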
      Fixes: 3af229f2
      Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • F
      netlink: don't hold mutex in rcu callback when releasing mmapd ring · 0470eb99
      Florian Westphal committed
      Kirill A. Shutemov says:
      
      This simple test case triggers a few locking asserts in the kernel:
      
      int main(int argc, char **argv)
      {
              unsigned int block_size = 16 * 4096;
              struct nl_mmap_req req = {
                      .nm_block_size          = block_size,
                      .nm_block_nr            = 64,
                      .nm_frame_size          = 16384,
                      .nm_frame_nr            = 64 * block_size / 16384,
              };
              unsigned int ring_size;
              int fd;

              fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
              if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0)
                      exit(1);
              if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0)
                      exit(1);

              ring_size = req.nm_block_nr * req.nm_block_size;
              mmap(NULL, 2 * ring_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
              return 0;
      }
      
      +++ exited with 0 +++
      BUG: sleeping function called from invalid context at /home/kas/git/public/linux-mm/kernel/locking/mutex.c:616
      in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
      3 locks held by init/1:
       #0:  (reboot_mutex){+.+...}, at: [<ffffffff81080959>] SyS_reboot+0xa9/0x220
       #1:  ((reboot_notifier_list).rwsem){.+.+..}, at: [<ffffffff8107f379>] __blocking_notifier_call_chain+0x39/0x70
       #2:  (rcu_callback){......}, at: [<ffffffff810d32e0>] rcu_do_batch.isra.49+0x160/0x10c0
      Preemption disabled at:[<ffffffff8145365f>] __delay+0xf/0x20
      
      CPU: 1 PID: 1 Comm: init Not tainted 4.1.0-00009-gbddf4c4818e0 #253
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014
       ffff88017b3d8000 ffff88027bc03c38 ffffffff81929ceb 0000000000000102
       0000000000000000 ffff88027bc03c68 ffffffff81085a9d 0000000000000002
       ffffffff81ca2a20 0000000000000268 0000000000000000 ffff88027bc03c98
      Call Trace:
       <IRQ>  [<ffffffff81929ceb>] dump_stack+0x4f/0x7b
       [<ffffffff81085a9d>] ___might_sleep+0x16d/0x270
       [<ffffffff81085bed>] __might_sleep+0x4d/0x90
       [<ffffffff8192e96f>] mutex_lock_nested+0x2f/0x430
       [<ffffffff81932fed>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
       [<ffffffff81464143>] ? __this_cpu_preempt_check+0x13/0x20
       [<ffffffff8182fc3d>] netlink_set_ring+0x1ed/0x350
       [<ffffffff8182e000>] ? netlink_undo_bind+0x70/0x70
       [<ffffffff8182fe20>] netlink_sock_destruct+0x80/0x150
       [<ffffffff817e484d>] __sk_free+0x1d/0x160
       [<ffffffff817e49a9>] sk_free+0x19/0x20
      [..]
      
      Cong Wang says:
      
      We can't hold mutex lock in a rcu callback, [..]
      
      Thomas Graf says:
      
      The socket should be dead at this point. It might be simpler to
      add a netlink_release_ring() function which doesn't require
      locking at all.
      Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
      Diagnosed-by: Cong Wang <cwang@twopensource.com>
      Suggested-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      tcp: suppress a division by zero warning · 89e478a2
      Eric Dumazet committed
      Andrew Morton reported the following warning on an ARM build
      with gcc-4.4:
      
      net/ipv4/inet_hashtables.c: In function 'inet_ehash_locks_alloc':
      net/ipv4/inet_hashtables.c:617: warning: division by zero
      
      Even when guarded by a test on sizeof(spinlock_t), the compiler does not
      like the current construct on a !CONFIG_SMP build.
      
      Remove the warning by using a temporary variable.
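
A minimal sketch of the temporary-variable idea, under stated assumptions: `LOCK_SIZE` stands in for a sizeof(spinlock_t) that evaluates to 0 on a uniprocessor build, and `CACHE_LINE_BYTES` and `ehash_lock_blocks` are hypothetical names. It is not the code from the patch.

```c
/* Assumed values for the sketch; not taken from the kernel. */
#define LOCK_SIZE        0u   /* models sizeof(spinlock_t) without SMP */
#define CACHE_LINE_BYTES 64u

/*
 * Copying the size into a variable means the divisor is no longer a
 * constant 0 in the source expression, which is what the warning was
 * about, while the runtime guard still prevents the division.
 */
static unsigned int ehash_lock_blocks(void)
{
    unsigned int locksz = LOCK_SIZE;
    unsigned int nblocks = 0;

    if (locksz != 0) {
        nblocks = 2u * CACHE_LINE_BYTES / locksz;
        if (nblocks < 1)
            nblocks = 1;
    }
    return nblocks;
}
```

With a lock size of 0 the function returns 0 blocks and the division is never executed.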
      
      Fixes: 095dc8e0 ("tcp: fix/cleanup inet_ehash_locks_alloc()")
      Reported-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      inet: frags: fix defragmented packet's IP header for af_packet · 0848f642
      Edward Hyunkoo Jee committed
      When ip_frag_queue() computes positions, it assumes that the passed
      sk_buff does not contain L2 headers.
      
      However, when PACKET_FANOUT_FLAG_DEFRAG is used, IP reassembly
      functions can be called on outgoing packets that contain L2 headers.
      
      Also, IPv4 checksum is not corrected after reassembly.
      
      Fixes: 7736d33f ("packet: Add pre-defragmentation support for ipv4 fanouts.")
      Signed-off-by: Edward Hyunkoo Jee <edjee@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Jerry Chu <hkchu@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 21 Jul, 2015 6 commits
  4. 17 Jul, 2015 8 commits
    • A
      cfg80211: use RTNL locked reg_can_beacon for IR-relaxation · 923b352f
      Arik Nemtsov committed
      The RTNL is required to check for IR-relaxation conditions that allow
      more channels to beacon. Export an RTNL locked version of reg_can_beacon
      and use it where possible in AP/STA interface type flows, where
      IR-relaxation may be applicable.
      
      Fixes: 06f207fc ("cfg80211: change GO_CONCURRENT to IR_CONCURRENT for STA")
      Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
      Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • B
      mac80211: add missing length check for confirm frames · b3e7de87
      Bob Copeland committed
      Although mesh_rx_plink_frame() already checks that frames have enough
      bytes for the action code plus another two bytes for capability/reason
      code, it doesn't take into account that confirm frames also have an
      additional two-byte aid.  As a result, a corrupt frame could cause a
      subsequent subtraction to wrap around to ill effect.  Add another
      check for this case.
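
The kind of check described above can be sketched as follows. The enum values, the size constants and `plink_ies_len` are hypothetical illustrations, not mac80211 definitions; the point is only that the length is validated before the unsigned subtraction.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action codes and field sizes for the sketch. */
enum plink_action { PLINK_OPEN = 1, PLINK_CONFIRM = 2, PLINK_CLOSE = 3 };

#define ACTION_CODE_LEN   1u
#define CAP_OR_REASON_LEN 2u
#define AID_LEN           2u  /* present only in confirm frames */

/*
 * Returns true and stores the number of bytes left for information
 * elements, or returns false if the body is too short. Checking first
 * keeps the unsigned subtraction from wrapping around.
 */
static bool plink_ies_len(enum plink_action action, size_t body_len,
                          size_t *ies_len)
{
    size_t fixed = ACTION_CODE_LEN + CAP_OR_REASON_LEN;

    if (action == PLINK_CONFIRM)
        fixed += AID_LEN;
    if (body_len < fixed)
        return false;
    *ies_len = body_len - fixed;
    return true;
}
```

In this model a 4-byte confirm body is rejected, whereas without the extra two bytes in `fixed` the later subtraction for the AID would wrap to a very large value.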
      Signed-off-by: Bob Copeland <me@bobcopeland.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • B
      mac80211: correct aid location in peering frames · 2ea752cd
      Bob Copeland committed
      According to 802.11-2012 8.5.16.3.2 AID comes directly after the
      capability bytes in mesh peering confirm frames.  The existing
      code, however, was adding a 2 byte offset to this location,
      resulting in garbage data going out over the air.  Remove the
      offset to fix it.
      Signed-off-by: Bob Copeland <me@bobcopeland.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • T
      wireless: regulatory: reduce log level of CRDA related messages · 042ab5fc
      Thomas Petazzoni committed
      With a basic Linux userspace, the message "Calling CRDA to update
      world regulatory domain" appears 10 times after boot, about once per
      second, followed by a final "Exceeded CRDA call max attempts. Not calling
      CRDA". For those of us not having the corresponding userspace parts,
      having those messages repeatedly displayed at boot time is a bit
      annoying, so this commit reduces their log level to pr_debug().
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • J
      mac80211: shut down interfaces before destroying interface list · d8d9008c
      Johannes Berg committed
      If the hardware is unregistered while interfaces are up, mac80211 will
      unregister all interfaces, which in turn causes mac80211 to be called
      again to remove them all from the driver and eventually shut down the
      hardware.
      
      During this shutdown, however, it's currently already unsafe to iterate
      the list of interfaces atomically, as the list is manipulated in an
      unsafe manner. This puts an undue burden on the driver - it must stop
      all its activities before calling ieee80211_unregister_hw(), while in
      the normal stop path it can do all cleanup in the stop method. If, for
      example, it's using the iteration during RX for some reason, it would
      have to stop RX before unregistering to avoid crashes.
      
      Fix this problem by closing all interfaces before unregistering them.
      This will cause the driver stop to have completed before we manipulate
      the interface list, and after the driver is stopped *and* has called
      ieee80211_unregister_hw() it really mustn't be iterating any more as
      the memory will be freed as well.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • C
      mac80211: wowlan: enable powersave if suspend while ps-polling · 541b6ed7
      Chaitanya T K committed
      If for any reason we're in the middle of PS-polling or awake after
      TX due to dynamic powersave while going to suspend, go back to save
      power. This might cause a response frame to get lost, but since we
      can't really wait for it while going to suspend that's still better
      than not enabling powersave which would cause higher power usage
      during (and possibly even after) suspend.
      
      Note that this really only affects the very few drivers that use
      the powersave implementation in mac80211.
      Signed-off-by: Chaitanya T K <chaitanya.mgit@gmail.com>
      [rewrite misleading commit log]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • M
      mac80211: don't clear all tx flags when requeing · e9de0190
      Michal Kazior committed
      When acting as an AP, if a PS-Poll frame is received, the
      associated station is marked as being in a Service
      Period. This state is kept until the Tx status for the
      released frame is reported. While a station is in a
      Service Period, PS-Poll frames are ignored.
      
      However, if a PS-Poll was received during A-MPDU
      teardown, it was possible to have the to-be
      released frame re-queued back to the pending queue.
      In that case the frame was stripped of 2 important
      flags:
      
       (a) IEEE80211_TX_CTL_NO_PS_BUFFER
       (b) IEEE80211_TX_STATUS_EOSP
      
      Stripping (a) caused the frame that was to be
      released to be queued back to the ps_tx_buf queue. If the
      station continued to use only PS-Poll frames, the
      re-queued frame (and new ones) was never actually
      transmitted because mac80211 would ignore
      subsequent PS-Poll frames due to the station being in a
      Service Period. There was nothing left to clear
      the Service Period bit (no xmit -> no tx status ->
      no SP end), i.e. the AP would have the station
      stuck in a Service Period. The beacon TIM would
      repeatedly prompt the station to poll for frames but
      it would get none.
      
      Once (a) is not stripped (b) becomes important
      because it's the main condition to clear the
      Service Period bit of the station when Tx status
      for the released frame is reported back.
      
      This problem was observed with ath9k acting as P2P
      GO in some testing scenarios but isn't limited to
      it. AP operation with mac80211 based Tx A-MPDU
      control combined with clients using PS-Poll frames
      is subject to this race.
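
One way to picture the flag handling is a keep-mask applied when a frame is re-queued. The bit values and the names `TX_REQUEUE_BASE`, `TX_OTHER`, `KEEP_ON_REQUEUE` and `flags_for_requeue` are hypothetical; only the two flag names in the comments come from the commit message, and this sketch does not claim to mirror the patch.

```c
#include <stdint.h>

/* Hypothetical bit assignments; the real flags are defined by mac80211. */
#define TX_REQUEUE_BASE     (1u << 0)  /* stand-in for bits already preserved */
#define TX_CTL_NO_PS_BUFFER (1u << 1)  /* models IEEE80211_TX_CTL_NO_PS_BUFFER */
#define TX_STATUS_EOSP      (1u << 2)  /* models IEEE80211_TX_STATUS_EOSP */
#define TX_OTHER            (1u << 3)  /* stand-in for a bit that is dropped */

/* Bits that survive a re-queue; the two PS-related flags are included. */
#define KEEP_ON_REQUEUE \
    (TX_REQUEUE_BASE | TX_CTL_NO_PS_BUFFER | TX_STATUS_EOSP)

/* Returns the flags a frame should carry after being re-queued. */
static uint32_t flags_for_requeue(uint32_t flags)
{
    return flags & KEEP_ON_REQUEUE;
}
```

A frame carrying both PS-related flags keeps them across the re-queue, so the Tx status path can still end the Service Period.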
      Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • T
      mac80211: clear subdir_stations when removing debugfs · 4479004e
      Tom Hughes committed
      If we don't do this, and we then fail to recreate the debugfs
      directory during a mode change, then we will fail later trying
      to add stations to this now bogus directory:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000006c
      IP: [<c0a92202>] mutex_lock+0x12/0x30
      Call Trace:
      [<c0678ab4>] start_creating+0x44/0xc0
      [<c0679203>] debugfs_create_dir+0x13/0xf0
      [<f8a938ae>] ieee80211_sta_debugfs_add+0x6e/0x490 [mac80211]
      
      Cc: stable@kernel.org
      Signed-off-by: Tom Hughes <tom@compton.nu>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  5. 16 Jul, 2015 12 commits
  6. 15 Jul, 2015 1 commit
    • W
      rds: rds_ib_device.refcount overflow · 4fabb594
      Wengang Wang committed
      Fixes: 3e0249f9 ("RDS/IB: add refcount tracking to struct rds_ib_device")
      
      rds_ib_device.refcount is not dropped when rds_ib_alloc_fmr
      fails (MR pool running out). This leads to the refcount overflow.
      
      A complaint at line 117 (see below) is seen. From the vmcore:
      s_ib_rdma_mr_pool_depleted is 2147485544 and rds_ibdev->refcount is -2147475448.
      That is evidence that the MR pool is used up, so rds_ib_alloc_fmr is very likely
      to return ERR_PTR(-EAGAIN).
      
      115 void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
      116 {
      117         BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
      118         if (atomic_dec_and_test(&rds_ibdev->refcount))
      119                 queue_work(rds_wq, &rds_ibdev->free_work);
      120 }
      
      The fix is to drop the refcount when rds_ib_alloc_fmr fails.
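
The pattern of releasing a reference on the error path can be sketched in user space with C11 atomics. `struct ib_dev_model`, `struct mr_model`, `alloc_from_pool` and `get_mr` are hypothetical names that only model the situation; they are not the RDS code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of a refcounted device with a finite MR pool. */
struct ib_dev_model {
    atomic_int refcount;
    int pool_free;  /* remaining pool entries */
};

struct mr_model {
    struct ib_dev_model *dev;
};

static void dev_get(struct ib_dev_model *d)
{
    atomic_fetch_add(&d->refcount, 1);
}

static void dev_put(struct ib_dev_model *d)
{
    atomic_fetch_sub(&d->refcount, 1);
}

/* Stand-in allocator that fails once the pool is depleted. */
static int alloc_from_pool(struct ib_dev_model *d)
{
    if (d->pool_free == 0)
        return -1;
    d->pool_free--;
    return 0;
}

/* Takes a device reference; it is kept only if the allocation succeeds. */
static int get_mr(struct ib_dev_model *d, struct mr_model *out)
{
    dev_get(d);
    if (alloc_from_pool(d) != 0) {
        dev_put(d);  /* error path releases the reference taken above */
        return -1;
    }
    out->dev = d;    /* success: the reference now belongs to *out */
    return 0;
}
```

Without the `dev_put()` on the failure branch, every failed allocation would leak one reference, and enough failures would eventually overflow the counter.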
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  7. 13 Jul, 2015 1 commit
    • O
      can: replace timestamp as unique skb attribute · d3b58c47
      Oliver Hartkopp committed
      Commit 514ac99c "can: fix multiple delivery of a single CAN frame for
      overlapping CAN filters" requires the skb->tstamp to be set to check for
      identical CAN skbs.
      
      When user space applications did not request timestamping, this timestamp
      was not generated, which led to commit 36c01245 "can: fix loss of CAN frames
      in raw_rcv", which forces the timestamp to be set in all CAN related skbuffs
      by introducing several __net_timestamp() calls.
      
      This forces e.g. out of tree drivers which are not using alloc_can{,fd}_skb()
      to add __net_timestamp() after skbuff creation to prevent the frame loss fixed
      in mainline Linux.
      
      This patch removes the timestamp dependency and uses an atomic counter to
      create a unique identifier together with the skbuff pointer.
      
      Btw: the new skbcnt element introduced in struct can_skb_priv has to be
      initialized with zero in out-of-tree drivers which are not using
      alloc_can{,fd}_skb() too.
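
A user-space sketch of identifying a buffer by pointer plus a counter value, assuming a zero counter means "not yet assigned". `struct frame`, `frame_tag`, `struct last_seen` and `already_delivered` are hypothetical names; this models the idea described above and is not the CAN code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame with a per-buffer counter; 0 means "not assigned". */
struct frame {
    unsigned int cnt;
};

/* Global source of sequence numbers (zero-initialized). */
static atomic_uint frame_counter;

/* Assigns a nonzero number on first use; retries if the counter wraps to 0. */
static void frame_tag(struct frame *f)
{
    while (f->cnt == 0)
        f->cnt = atomic_fetch_add(&frame_counter, 1) + 1;
}

/* Per-receiver record of the most recently delivered frame. */
struct last_seen {
    const struct frame *ptr;
    unsigned int cnt;
};

/* Returns true if this exact frame was just delivered; otherwise records it. */
static bool already_delivered(struct last_seen *last, struct frame *f)
{
    frame_tag(f);
    if (last->ptr == f && last->cnt == f->cnt)
        return true;
    last->ptr = f;
    last->cnt = f->cnt;
    return false;
}
```

The pointer alone is not enough because a freed buffer's address can be reused; a freshly initialized buffer at the same address starts with a zero counter and therefore gets a new number.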
      Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
  8. 12 Jul, 2015 3 commits
  9. 11 Jul, 2015 3 commits