1. 10 10月, 2008 3 次提交
    • P
      selinux: Fix missing calls to netlbl_skbuff_err() · dfaebe98
      Paul Moore 提交于
      At some point I think I messed up and dropped the calls to netlbl_skbuff_err()
      which are necessary for CIPSO to send error notifications to remote systems.
      This patch re-introduces the error handling calls into the SELinux code.
      Signed-off-by: NPaul Moore <paul.moore@hp.com>
      Acked-by: NJames Morris <jmorris@namei.org>
      dfaebe98
    • P
      netlabel: Remove unneeded in-kernel API functions · 948a7243
      Paul Moore 提交于
      After some discussions with the Smack folks, well just Casey, I now have a
      better idea of what Smack wants out of NetLabel in the future so I think it
      is now safe to do some API "pruning".  If another LSM comes along that
      needs this functionality we can always add it back in, but I don't see any
      LSMs on the horizon which might make use of these functions.
      
      Thanks to Rami Rosen who suggested removing netlbl_cfg_cipsov4_del() back
      in February 2008.
      Signed-off-by: NPaul Moore <paul.moore@hp.com>
      Reviewed-by: NJames Morris <jmorris@namei.org>
      948a7243
    • P
      netlabel: Fix some sparse warnings · 56196701
      Paul Moore 提交于
      Fix a few sparse warnings.  One dealt with a RCU lock being held on error,
      another dealt with an improper type caused by a signed/unsigned mixup while
      the rest appeared to be caused by using rcu_dereference() in a
      list_for_each_entry_rcu() call.  The latter probably isn't a big deal, but
      I derive a certain pleasure from knowing that the net/netlabel is nice and
      clean.
      
      Thanks to James Morris for pointing out the issues and demonstrating how
      to run sparse.
      Signed-off-by: NPaul Moore <paul.moore@hp.com>
      56196701
  2. 08 10月, 2008 4 次提交
    • D
      tcp: Fix tcp_hybla zero congestion window growth with small rho and large cwnd. · 9d2c27e1
      Daniele Lacamera 提交于
      Because of rounding, in certain conditions, i.e. when in congestion
      avoidance state rho is smaller than 1/128 of the current cwnd, TCP
      Hybla congestion control starves and the cwnd is kept constant
      forever.
      
      This patch forces an increment by one segment after #send_cwnd calls
      without increments(newreno behavior).
      Signed-off-by: NDaniele Lacamera <root@danielinux.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9d2c27e1
    • H
      net: Fix netdev_run_todo dead-lock · 58ec3b4d
      Herbert Xu 提交于
      Benjamin Thery tracked down a bug that explains many instances
      of the error
      
      unregister_netdevice: waiting for %s to become free. Usage count = %d
      
      It turns out that netdev_run_todo can dead-lock with itself if
      a second instance of it is run in a thread that will then free
      a reference to the device waited on by the first instance.
      
      The problem is really quite silly.  We were trying to create
      parallelism where none was required.  As netdev_run_todo always
      follows a RTNL section, and that todo tasks can only be added
      with the RTNL held, by definition you should only need to wait
      for the very ones that you've added and be done with it.
      
      There is no need for a second mutex or spinlock.
      
      This is exactly what the following patch does.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      58ec3b4d
    • A
      tcp: Fix possible double-ack w/ user dma · 53240c20
      Ali Saidi 提交于
      From: Ali Saidi <saidi@engin.umich.edu>
      
      When TCP receive copy offload is enabled it's possible that
      tcp_rcv_established() will cause two acks to be sent for a single
      packet. In the case that a tcp_dma_early_copy() is successful,
      copied_early is set to true which causes tcp_cleanup_rbuf() to be
      called early which can send an ack. Further along in
      tcp_rcv_established(), __tcp_ack_snd_check() is called and will
      schedule a delayed ACK. If no packets are processed before the delayed
      ack timer expires the packet will be acked twice.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      53240c20
    • P
      net: only invoke dev->change_rx_flags when device is UP · b6c40d68
      Patrick McHardy 提交于
      Jesper Dangaard Brouer <hawk@comx.dk> reported a bug when setting a VLAN
      device down that is in promiscous mode:
      
      When the VLAN device is set down, the promiscous count on the real
      device is decremented by one by vlan_dev_stop(). When removing the
      promiscous flag from the VLAN device afterwards, the promiscous
      count on the real device is decremented a second time by the
      vlan_change_rx_flags() callback.
      
      The root cause for this is that the ->change_rx_flags() callback is
      invoked while the device is down. The synchronization is meant to mirror
      the behaviour of the ->set_rx_mode callbacks, meaning the ->open function
      is responsible for doing a full sync on open, the ->close() function is
      responsible for doing full cleanup on ->stop() and ->change_rx_flags()
      is meant to do incremental changes while the device is UP.
      
      Only invoke ->change_rx_flags() while the device is UP to provide the
      intended behaviour.
      Tested-by: NJesper Dangaard Brouer <jdb@comx.dk>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b6c40d68
  3. 07 10月, 2008 3 次提交
  4. 01 10月, 2008 4 次提交
    • T
      af_key: Free dumping state on socket close · 05238204
      Timo Teras 提交于
      Fix a xfrm_{state,policy}_walk leak if pfkey socket is closed while
      dumping is on-going.
      Signed-off-by: NTimo Teras <timo.teras@iki.fi>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      05238204
    • A
      XFRM,IPv6: initialize ip6_dst_blackhole_ops.kmem_cachep · 5dc121e9
      Arnaud Ebalard 提交于
      ip6_dst_blackhole_ops.kmem_cachep is not expected to be NULL (i.e. to
      be initialized) when dst_alloc() is called from ip6_dst_blackhole().
      Otherwise, it results in the following (xfrm_larval_drop is now set to
      1 by default):
      
      [   78.697642] Unable to handle kernel paging request for data at address 0x0000004c
      [   78.703449] Faulting instruction address: 0xc0097f54
      [   78.786896] Oops: Kernel access of bad area, sig: 11 [#1]
      [   78.792791] PowerMac
      [   78.798383] Modules linked in: btusb usbhid bluetooth b43 mac80211 cfg80211 ehci_hcd ohci_hcd sungem sungem_phy usbcore ssb
      [   78.804263] NIP: c0097f54 LR: c0334a28 CTR: c002d430
      [   78.809997] REGS: eef19ad0 TRAP: 0300   Not tainted  (2.6.27-rc5)
      [   78.815743] MSR: 00001032 <ME,IR,DR>  CR: 22242482  XER: 20000000
      [   78.821550] DAR: 0000004c, DSISR: 40000000
      [   78.827278] TASK = eef0df40[3035] 'mip6d' THREAD: eef18000
      [   78.827408] GPR00: 00001032 eef19b80 eef0df40 00000000 00008020 eef19c30 00000001 00000000
      [   78.833249] GPR08: eee5101c c05a5c10 ef9ad500 00000000 24242422 1005787c 00000000 1004f960
      [   78.839151] GPR16: 00000000 10024e90 10050040 48030018 0fe44150 00000000 00000000 eef19c30
      [   78.845046] GPR24: eef19e44 00000000 eef19bf8 efb37c14 eef19bf8 00008020 00009032 c0596064
      [   78.856671] NIP [c0097f54] kmem_cache_alloc+0x20/0x94
      [   78.862581] LR [c0334a28] dst_alloc+0x40/0xc4
      [   78.868451] Call Trace:
      [   78.874252] [eef19b80] [c03c1810] ip6_dst_lookup_tail+0x1c8/0x1dc (unreliable)
      [   78.880222] [eef19ba0] [c0334a28] dst_alloc+0x40/0xc4
      [   78.886164] [eef19bb0] [c03cd698] ip6_dst_blackhole+0x28/0x1cc
      [   78.892090] [eef19be0] [c03d9be8] rawv6_sendmsg+0x75c/0xc88
      [   78.897999] [eef19cb0] [c038bca4] inet_sendmsg+0x4c/0x78
      [   78.903907] [eef19cd0] [c03207c8] sock_sendmsg+0xac/0xe4
      [   78.909734] [eef19db0] [c03209e4] sys_sendmsg+0x1e4/0x2a0
      [   78.915540] [eef19f00] [c03220a8] sys_socketcall+0xfc/0x210
      [   78.921406] [eef19f40] [c0014b3c] ret_from_syscall+0x0/0x38
      [   78.927295] --- Exception: c01 at 0xfe2d730
      [   78.927297]     LR = 0xfe2d71c
      [   78.939019] Instruction dump:
      [   78.944835] 91640018 9144001c 900a0000 4bffff44 9421ffe0 7c0802a6 bf810010 7c9d2378
      [   78.950694] 90010024 7fc000a6 57c0045e 7c000124 <83e3004c> 8383005c 2f9f0000 419e0050
      [   78.956464] ---[ end trace 05fa1ed7972487a1 ]---
      
      As commented by Benjamin Thery, the bug was introduced by
      f2fc6a54, while adding network
      namespaces support to ipv6 routes.
      Signed-off-by: NArnaud Ebalard <arno@natisbad.org>
      Acked-by: NBenjamin Thery <benjamin.thery@bull.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5dc121e9
    • D
      ipv6: NULL pointer dereferrence in tcp_v6_send_ack · 2a5b8275
      Denis V. Lunev 提交于
      The following actions are possible:
      tcp_v6_rcv
        skb->dev = NULL;
        tcp_v6_do_rcv
          tcp_v6_hnd_req
            tcp_check_req
              req->rsk_ops->send_ack == tcp_v6_send_ack
      
      So, skb->dev can be NULL in tcp_v6_send_ack. We must obtain namespace
      from dst entry.
      
      Thanks to Vitaliy Gusev <vgusev@openvz.org> for initial problem finding
      in IPv4 code.
      Signed-off-by: NDenis V. Lunev <den@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2a5b8275
    • V
      tcp: Fix NULL dereference in tcp_4_send_ack() · 4dd7972d
      Vitaliy Gusev 提交于
      Fix NULL dereference in tcp_4_send_ack().
      
      As skb->dev is reset to NULL in tcp_v4_rcv() thus OOPS occurs:
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000004d0
      IP: [<ffffffff80498503>] tcp_v4_send_ack+0x203/0x250
      
      Stack:  ffff810005dbb000 ffff810015c8acc0 e77b2c6e5f861600 a01610802e90cb6d
       0a08010100000000 88afffff88afffff 0000000080762be8 0000000115c872e8
       0004122000000000 0000000000000001 ffffffff80762b88 0000000000000020
      Call Trace:
       <IRQ>  [<ffffffff80499c33>] tcp_v4_reqsk_send_ack+0x20/0x22
       [<ffffffff8049bce5>] tcp_check_req+0x108/0x14c
       [<ffffffff8047aaf7>] ? rt_intern_hash+0x322/0x33c
       [<ffffffff80499846>] tcp_v4_do_rcv+0x399/0x4ec
       [<ffffffff8045ce4b>] ? skb_checksum+0x4f/0x272
       [<ffffffff80485b74>] ? __inet_lookup_listener+0x14a/0x15c
       [<ffffffff8049babc>] tcp_v4_rcv+0x6a1/0x701
       [<ffffffff8047e739>] ip_local_deliver_finish+0x157/0x24a
       [<ffffffff8047ec9a>] ip_local_deliver+0x72/0x7c
       [<ffffffff8047e5bd>] ip_rcv_finish+0x38d/0x3b2
       [<ffffffff803d3548>] ? scsi_io_completion+0x19d/0x39e
       [<ffffffff8047ebe5>] ip_rcv+0x2a2/0x2e5
       [<ffffffff80462faa>] netif_receive_skb+0x293/0x303
       [<ffffffff80465a9b>] process_backlog+0x80/0xd0
       [<ffffffff802630b4>] ? __rcu_process_callbacks+0x125/0x1b4
       [<ffffffff8046560e>] net_rx_action+0xb9/0x17f
       [<ffffffff80234cc5>] __do_softirq+0xa3/0x164
       [<ffffffff8020c52c>] call_softirq+0x1c/0x28
       <EOI>  [<ffffffff8020de1c>] do_softirq+0x34/0x72
       [<ffffffff80234b8e>] local_bh_enable_ip+0x3f/0x50
       [<ffffffff804d43ca>] _spin_unlock_bh+0x12/0x14
       [<ffffffff804599cd>] release_sock+0xb8/0xc1
       [<ffffffff804a6f9a>] inet_stream_connect+0x146/0x25c
       [<ffffffff80243078>] ? autoremove_wake_function+0x0/0x38
       [<ffffffff8045751f>] sys_connect+0x68/0x8e
       [<ffffffff80291818>] ? fd_install+0x5f/0x68
       [<ffffffff80457784>] ? sock_map_fd+0x55/0x62
       [<ffffffff8020b39b>] system_call_after_swapgs+0x7b/0x80
      
      Code: 41 10 11 d0 83 d0 00 4d 85 ed 89 45 c0 c7 45 c4 08 00 00 00 74 07 41 8b 45 04 89 45 c8 48 8b 43 20 8b 4d b8 48 8d 55 b0 48 89 de <48> 8b 80 d0 04 00 00 48 8b b8 60 01 00 00 e8 20 ae fe ff 65 48
      RIP  [<ffffffff80498503>] tcp_v4_send_ack+0x203/0x250
       RSP <ffffffff80762b78>
      CR2: 00000000000004d0
      Signed-off-by: NVitaliy Gusev <vgusev@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4dd7972d
  5. 30 9月, 2008 3 次提交
    • W
      sctp: Fix kernel panic while process protocol violation parameter · ba016670
      Wei Yongjun 提交于
      Since call to function sctp_sf_abort_violation() need paramter 'arg' with
      'struct sctp_chunk' type, it will read the chunk type and chunk length from
      the chunk_hdr member of chunk. But call to sctp_sf_violation_paramlen()
      always with 'struct sctp_paramhdr' type's parameter, it will be passed to
      sctp_sf_abort_violation(). This may cause kernel panic.
      
         sctp_sf_violation_paramlen()
           |-- sctp_sf_abort_violation()
              |-- sctp_make_abort_violation()
      
      This patch fixed this problem. This patch also fix two place which called
      sctp_sf_violation_paramlen() with wrong paramter type.
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ba016670
    • H
      iucv: Fix mismerge again. · 8b122efd
      Heiko Carstens 提交于
      fb65a7c0 ("iucv: Fix bad merging.") fixed
      a merge error, but in a wrong way. We now end up with the bug below.
      This patch corrects the mismerge like it was intended.
      
      BUG: scheduling while atomic: swapper/1/0x00000000
      Modules linked in:
      CPU: 1 Not tainted 2.6.27-rc7-00094-gc0f4d6d4 #9
      Process swapper (pid: 1, task: 000000003fe7d988, ksp: 000000003fe838c0)
      0000000000000000 000000003fe839b8 0000000000000002 0000000000000000
             000000003fe83a58 000000003fe839d0 000000003fe839d0 0000000000390de6
             000000000058acd8 00000000000000d0 000000003fe7dcd8 0000000000000000
             000000000000000c 000000000000000d 0000000000000000 000000003fe83a28
             000000000039c5b8 0000000000015e5e 000000003fe839b8 000000003fe83a00
      Call Trace:
      ([<0000000000015d6a>] show_trace+0xe6/0x134)
       [<0000000000039656>] __schedule_bug+0xa2/0xa8
       [<0000000000391744>] schedule+0x49c/0x910
       [<0000000000391f64>] schedule_timeout+0xc4/0x114
       [<00000000003910d4>] wait_for_common+0xe8/0x1b4
       [<00000000000549ae>] call_usermodehelper_exec+0xa6/0xec
       [<00000000001af7b8>] kobject_uevent_env+0x418/0x438
       [<00000000001d08fc>] bus_add_driver+0x1e4/0x298
       [<00000000001d1ee4>] driver_register+0x90/0x18c
       [<0000000000566848>] netiucv_init+0x168/0x2c8
       [<00000000000120be>] do_one_initcall+0x3e/0x17c
       [<000000000054a31a>] kernel_init+0x1ce/0x248
       [<000000000001a97a>] kernel_thread_starter+0x6/0xc
       [<000000000001a974>] kernel_thread_starter+0x0/0xc
       iucv: NETIUCV driver initialized
      initcall netiucv_init+0x0/0x2c8 returned with preemption imbalance
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8b122efd
    • H
      ipsec: Fix pskb_expand_head corruption in xfrm_state_check_space · d01dbeb6
      Herbert Xu 提交于
      We're never supposed to shrink the headroom or tailroom.  In fact,
      shrinking the headroom is a fatal action.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d01dbeb6
  6. 25 9月, 2008 8 次提交
  7. 23 9月, 2008 1 次提交
    • M
      sys_paccept: disable paccept() until API design is resolved · 2d4c8266
      Michael Kerrisk 提交于
      The reasons for disabling paccept() are as follows:
      
      * The API is more complex than needed.  There is AFAICS no demonstrated
        use case that the sigset argument of this syscall serves that couldn't
        equally be served by the use of pselect/ppoll/epoll_pwait + traditional
        accept().  Roland seems to concur with this opinion
        (http://thread.gmane.org/gmane.linux.kernel/723953/focus=732255).  I
        have (more than once) asked Ulrich to explain otherwise
        (http://thread.gmane.org/gmane.linux.kernel/723952/focus=731018), but he
        does not respond, so one is left to assume that he doesn't know of such
        a case.
      
      * The use of a sigset argument is not consistent with other I/O APIs
        that can block on a single file descriptor (e.g., read(), recv(),
        connect()).
      
      * The behavior of paccept() when interrupted by a signal is IMO strange:
        the kernel restarts the system call if SA_RESTART was set for the
        handler.  I think that it should not do this -- that it should behave
        consistently with paccept()/ppoll()/epoll_pwait(), which never restart,
        regardless of SA_RESTART.  The reasoning here is that the very purpose
        of paccept() is to wait for a connection or a signal, and that
        restarting in the latter case is probably never useful.  (Note: Roland
        disagrees on this point, believing that rather paccept() should be
        consistent with accept() in its behavior wrt EINTR
        (http://thread.gmane.org/gmane.linux.kernel/723953/focus=732255).)
      
      I believe that instead, a simpler API, consistent with Ulrich's other
      recent additions, is preferable:
      
      accept4(int fd, struct sockaddr *sa, socklen_t *salen, ind flags);
      
      (This simpler API was originally proposed by Ulrich:
      http://thread.gmane.org/gmane.linux.network/92072)
      
      If this simpler API is added, then if we later decide that the sigset
      argument really is required, then a suitable bit in 'flags' could be added
      to indicate the presence of the sigset argument.
      
      At this point, I am hoping we either will get a counter-argument from
      Ulrich about why we really do need paccept()'s sigset argument, or that he
      will resubmit the original accept4() patch.
      Signed-off-by: NMichael Kerrisk <mtk.manpages@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@redhat.com>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2d4c8266
  8. 21 9月, 2008 1 次提交
  9. 19 9月, 2008 2 次提交
    • V
      sctp: Fix oops when INIT-ACK indicates that peer doesn't support AUTH · add52379
      Vlad Yasevich 提交于
      If INIT-ACK is received with SupportedExtensions parameter which
      indicates that the peer does not support AUTH, the packet will be
      silently ignore, and sctp_process_init() do cleanup all of the
      transports in the association.
      When T1-Init timer is expires, OOPS happen while we try to choose
      a different init transport.
      
      The solution is to only clean up the non-active transports, i.e
      the ones that the peer added.  However, that introduces a problem
      with sctp_connectx(), because we don't mark the proper state for
      the transports provided by the user.  So, we'll simply mark
      user-provided transports as ACTIVE.  That will allow INIT
      retransmissions to work properly in the sctp_connectx() context
      and prevent the crash.
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      add52379
    • V
      sctp: do not enable peer features if we can't do them. · 0ef46e28
      Vlad Yasevich 提交于
      Do not enable peer features like addip and auth, if they
      are administratively disabled localy.  If the peer resports
      that he supports something that we don't, neither end can
      use it so enabling it is pointless.  This solves a problem
      when talking to a peer that has auth and addip enabled while
      we do not.  Found by Andrei Pelinescu-Onciul <andrei@iptel.org>.
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0ef46e28
  10. 18 9月, 2008 1 次提交
  11. 17 9月, 2008 1 次提交
  12. 16 9月, 2008 1 次提交
  13. 12 9月, 2008 1 次提交
    • M
      [Bluetooth] Fix regression from using default link policy · 7c6a329e
      Marcel Holtmann 提交于
      To speed up the Simple Pairing connection setup, the support for the
      default link policy has been enabled. This is in contrast to settings
      the link policy on every connection setup. Using the default link policy
      is the preferred way since there is no need to dynamically change it for
      every connection.
      
      For backward compatibility reason and to support old userspace the
      HCISETLINKPOL ioctl has been switched over to using hci_request() to
      issue the HCI command for setting the default link policy instead of
      just storing it in the HCI device structure.
      
      However the hci_request() can only be issued when the device is
      brought up. If used on a device that is registered, but still down
      it will timeout and fail. This is problematic since the command is
      put on the TX queue and the Bluetooth core tries to submit it to
      hardware that is not ready yet. The timeout for these requests is
      10 seconds and this causes a significant regression when setting up
      a new device.
      
      The userspace can perfectly handle a failure of the HCISETLINKPOL
      ioctl and will re-submit it later, but the 10 seconds delay causes
      a problem. So in case hci_request() is called on a device that is
      still down, just fail it with ENETDOWN to indicate what happens.
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      7c6a329e
  14. 10 9月, 2008 1 次提交
    • N
      ipv6: Fix OOPS in ip6_dst_lookup_tail(). · e550dfb0
      Neil Horman 提交于
      This fixes kernel bugzilla 11469: "TUN with 1024 neighbours:
      ip6_dst_lookup_tail NULL crash"
      
      dst->neighbour is not necessarily hooked up at this point
      in the processing path, so blindly dereferencing it is
      the wrong thing to do.  This NULL check exists in other
      similar paths and this case was just an oversight.
      
      Also fix the completely wrong and confusing indentation
      here while we're at it.
      
      Based upon a patch by Evgeniy Polyakov.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e550dfb0
  15. 09 9月, 2008 6 次提交
    • H
      ipsec: Restore larval states and socket policies in dump · 225f4005
      Herbert Xu 提交于
      The commit commit 4c563f76 ("[XFRM]:
      Speed up xfrm_policy and xfrm_state walking") inadvertently removed
      larval states and socket policies from netlink dumps.  This patch
      restores them.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      225f4005
    • M
      [Bluetooth] Reject L2CAP connections on an insecure ACL link · e7c29cb1
      Marcel Holtmann 提交于
      The Security Mode 4 of the Bluetooth 2.1 specification has strict
      authentication and encryption requirements. It is the initiators job
      to create a secure ACL link. However in case of malicious devices, the
      acceptor has to make sure that the ACL is encrypted before allowing
      any kind of L2CAP connection. The only exception here is the PSM 1 for
      the service discovery protocol, because that is allowed to run on an
      insecure ACL link.
      
      Previously it was enough to reject a L2CAP connection during the
      connection setup phase, but with Bluetooth 2.1 it is forbidden to
      do any L2CAP protocol exchange on an insecure link (except SDP).
      
      The new hci_conn_check_link_mode() function can be used to check the
      integrity of an ACL link. This functions also takes care of the cases
      where Security Mode 4 is disabled or one of the devices is based on
      an older specification.
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      e7c29cb1
    • M
      [Bluetooth] Enforce correct authentication requirements · 09ab6f4c
      Marcel Holtmann 提交于
      With the introduction of Security Mode 4 and Simple Pairing from the
      Bluetooth 2.1 specification it became mandatory that the initiator
      requires authentication and encryption before any L2CAP channel can
      be established. The only exception here is PSM 1 for the service
      discovery protocol (SDP). It is meant to be used without any encryption
      since it contains only public information. This is how Bluetooth 2.0
      and before handle connections on PSM 1.
      
      For Bluetooth 2.1 devices the pairing procedure differentiates between
      no bonding, general bonding and dedicated bonding. The L2CAP layer
      wrongly uses always general bonding when creating new connections, but it
      should not do this for SDP connections. In this case the authentication
      requirement should be no bonding and the just-works model should be used,
      but in case of non-SDP connection it is required to use general bonding.
      
      If the new connection requires man-in-the-middle (MITM) protection, it
      also first wrongly creates an unauthenticated link key and then later on
      requests an upgrade to an authenticated link key to provide full MITM
      protection. With Simple Pairing the link key generation is an expensive
      operation (compared to Bluetooth 2.0 and before) and doing this twice
      during a connection setup causes a noticeable delay when establishing
      a new connection. This should be avoided to not regress from the expected
      Bluetooth 2.0 connection times. The authentication requirements are known
      up-front and so enforce them.
      
      To fulfill these requirements the hci_connect() function has been extended
      with an authentication requirement parameter that will be stored inside
      the connection information and can be retrieved by userspace at any
      time. This allows the correct IO capabilities exchange and results in
      the expected behavior.
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      09ab6f4c
    • M
      [Bluetooth] Fix reference counting during ACL config stage · f1c08ca5
      Marcel Holtmann 提交于
      The ACL config stage keeps holding a reference count on incoming
      connections when requesting the extended features. This results in
      keeping an ACL link up without any users. The problem here is that
      the Bluetooth specification doesn't define an ownership of the ACL
      link and thus it can happen that the implementation on the initiator
      side doesn't care about disconnecting unused links. In this case the
      acceptor needs to take care of this.
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      f1c08ca5
    • S
      bridge: don't allow setting hello time to zero · 8d4698f7
      Stephen Hemminger 提交于
      Dushan Tcholich reports that on his system ksoftirqd can consume
      between %6 to %10 of cpu time, and cause ~200 context switches per
      second.
      
      He then correlated this with a report by bdupree@techfinesse.com:
      
      	http://marc.info/?l=linux-kernel&m=119613299024398&w=2
      
      and the culprit cause seems to be starting the bridge interface.
      In particular, when starting the bridge interface, his scripts
      are specifying a hello timer interval of "0".
      
      The bridge hello time can't be safely set to values less than 1
      second, otherwise it is possible to end up with a runaway timer.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8d4698f7
    • D
      netns : fix kernel panic in timewait socket destruction · d315492b
      Daniel Lezcano 提交于
      How to reproduce ?
       - create a network namespace
       - use tcp protocol and get timewait socket
       - exit the network namespace
       - after a moment (when the timewait socket is destroyed), the kernel
         panics.
      
      # BUG: unable to handle kernel NULL pointer dereference at
      0000000000000007
      IP: [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
      PGD 119985067 PUD 11c5c0067 PMD 0
      Oops: 0000 [1] SMP
      CPU 1
      Modules linked in: ipv6 button battery ac loop dm_mod tg3 libphy ext3 jbd
      edd fan thermal processor thermal_sys sg sata_svw libata dock serverworks
      sd_mod scsi_mod ide_disk ide_core [last unloaded: freq_table]
      Pid: 0, comm: swapper Not tainted 2.6.27-rc2 #3
      RIP: 0010:[<ffffffff821e394d>] [<ffffffff821e394d>]
      inet_twdr_do_twkill_work+0x6e/0xb8
      RSP: 0018:ffff88011ff7fed0 EFLAGS: 00010246
      RAX: ffffffffffffffff RBX: ffffffff82339420 RCX: ffff88011ff7ff30
      RDX: 0000000000000001 RSI: ffff88011a4d03c0 RDI: ffff88011ac2fc00
      RBP: ffffffff823392e0 R08: 0000000000000000 R09: ffff88002802a200
      R10: ffff8800a5c4b000 R11: ffffffff823e4080 R12: ffff88011ac2fc00
      R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
      FS: 0000000041cbd940(0000) GS:ffff8800bff839c0(0000)
      knlGS:0000000000000000
      CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000000007 CR3: 00000000bd87c000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffff8800bff9e000, task
      ffff88011ff76690)
      Stack: ffffffff823392e0 0000000000000100 ffffffff821e3a3a
      0000000000000008
      0000000000000000 ffffffff821e3a61 ffff8800bff7c000 ffffffff8203c7e7
      ffff88011ff7ff10 ffff88011ff7ff10 0000000000000021 ffffffff82351108
      Call Trace:
      <IRQ> [<ffffffff821e3a3a>] ? inet_twdr_hangman+0x0/0x9e
      [<ffffffff821e3a61>] ? inet_twdr_hangman+0x27/0x9e
      [<ffffffff8203c7e7>] ? run_timer_softirq+0x12c/0x193
      [<ffffffff820390d1>] ? __do_softirq+0x5e/0xcd
      [<ffffffff8200d08c>] ? call_softirq+0x1c/0x28
      [<ffffffff8200e611>] ? do_softirq+0x2c/0x68
      [<ffffffff8201a055>] ? smp_apic_timer_interrupt+0x8e/0xa9
      [<ffffffff8200cad6>] ? apic_timer_interrupt+0x66/0x70
      <EOI> [<ffffffff82011f4c>] ? default_idle+0x27/0x3b
      [<ffffffff8200abbd>] ? cpu_idle+0x5f/0x7d
      
      
      Code: e8 01 00 00 4c 89 e7 41 ff c5 e8 8d fd ff ff 49 8b 44 24 38 4c 89 e7
      65 8b 14 25 24 00 00 00 89 d2 48 8b 80 e8 00 00 00 48 f7 d0 <48> 8b 04 d0
      48 ff 40 58 e8 fc fc ff ff 48 89 df e8 c0 5f 04 00
      RIP [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
      RSP <ffff88011ff7fed0>
      CR2: 0000000000000007
      
      This patch provides a function to purge all timewait sockets related
      to a network namespace. The timewait sockets life cycle is not tied with
      the network namespace, that means the timewait sockets stay alive while
      the network namespace dies. The timewait sockets are for avoiding to
      receive a duplicate packet from the network, if the network namespace is
      freed, the network stack is removed, so no chance to receive any packets
      from the outside world. Furthermore, having a pending destruction timer
      on these sockets with a network namespace freed is not safe and will lead
      to an oops if the timer callback which try to access data belonging to 
      the namespace like for example in:
      	inet_twdr_do_twkill_work
      		-> NET_INC_STATS_BH(twsk_net(tw), LINUX_MIB_TIMEWAITED);
      
      Purging the timewait sockets at the network namespace destruction will:
       1) speed up memory freeing for the namespace
       2) fix kernel panic on asynchronous timewait destruction
      Signed-off-by: NDaniel Lezcano <dlezcano@fr.ibm.com>
      Acked-by: NDenis V. Lunev <den@openvz.org>
      Acked-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d315492b