1. 29 6月, 2018 1 次提交
    • L
      Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL · a11e1d43
      Linus Torvalds 提交于
      The poll() changes were not well thought out, and completely
      unexplained.  They also caused a huge performance regression, because
      "->poll()" was no longer a trivial file operation that just called down
      to the underlying file operations, but instead did at least two indirect
      calls.
      
      Indirect calls are sadly slow now with the Spectre mitigation, but the
      performance problem could at least be largely mitigated by changing the
      "->get_poll_head()" operation to just have a per-file-descriptor pointer
      to the poll head instead.  That gets rid of one of the new indirections.
      
      But that doesn't fix the new complexity that is completely unwarranted
      for the regular case.  The (undocumented) reason for the poll() changes
      was some alleged AIO poll race fixing, but we don't make the common case
      slower and more complex for some uncommon special case, so this all
      really needs way more explanations and most likely a fundamental
      redesign.
      
      [ This revert is a revert of about 30 different commits, not reverted
        individually because that would just be unnecessarily messy  - Linus ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a11e1d43
  2. 26 5月, 2018 1 次提交
  3. 16 5月, 2018 1 次提交
  4. 12 2月, 2018 1 次提交
    • L
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds 提交于
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But they keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a9a08845
  5. 13 12月, 2017 1 次提交
  6. 28 11月, 2017 1 次提交
  7. 01 7月, 2017 1 次提交
  8. 16 5月, 2017 1 次提交
  9. 13 4月, 2017 2 次提交
    • D
      Bluetooth: Avoid bt_accept_unlink() double unlinking · 27bfbc21
      Dean Jenkins 提交于
      There is a race condition between a thread calling bt_accept_dequeue()
      and a different thread calling bt_accept_unlink(). Protection against
      concurrency is implemented using sk locking. However, sk locking causes
      serialisation of the bt_accept_dequeue() and bt_accept_unlink() threads.
      This serialisation can cause bt_accept_dequeue() to obtain the sk from the
      parent list but becomes blocked waiting for the sk lock held by the
      bt_accept_unlink() thread. bt_accept_unlink() unlinks sk and this thread
      releases the sk lock unblocking bt_accept_dequeue() which potentially runs
      bt_accept_unlink() again on the same sk causing a crash. The attempt to
      double unlink the same sk from the parent list can cause a NULL pointer
      dereference crash due to bt_sk(sk)->parent becoming NULL on the first
      unlink, followed by the second unlink trying to execute
      bt_sk(sk)->parent->sk_ack_backlog-- in bt_accept_unlink() which crashes.
      
      When sk is in the parent list, bt_sk(sk)->parent will be not be NULL.
      When sk is removed from the parent list, bt_sk(sk)->parent is set to
      NULL. Therefore, add a defensive check for bt_sk(sk)->parent not being
      NULL to ensure that sk is still in the parent list after the sk lock has
      been taken in bt_accept_dequeue(). If bt_sk(sk)->parent is detected as
      being NULL then restart the loop so that the loop variables are refreshed
      to use the latest values. This is necessary as list_for_each_entry_safe()
      is not thread safe so causing a risk of an infinite loop occurring as sk
      could point to itself.
      
      In addition, in bt_accept_dequeue() increase the sk reference count to
      protect against early freeing of sk. Early freeing can be possible if the
      bt_accept_unlink() thread calls l2cap_sock_kill() or rfcomm_sock_kill()
      functions before bt_accept_dequeue() gets the sk lock.
      
      For test purposes, the probability of failure can be increased by putting
      a msleep of 1 second in bt_accept_dequeue() between getting the sk and
      waiting for the sk lock. This exposes the fact that the loop
      list_for_each_entry_safe(p, n, &bt_sk(parent)->accept_q) is not safe from
      threads that unlink sk from the list in parallel with the loop which can
      cause sk to become stale within the loop.
      Signed-off-by: NDean Jenkins <Dean_Jenkins@mentor.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      27bfbc21
    • D
      Bluetooth: Handle bt_accept_enqueue() socket atomically · e1633762
      Dean Jenkins 提交于
      There is a small risk that bt_accept_unlink() runs concurrently with
      bt_accept_enqueue() on the same socket. This scenario could potentially
      lead to a NULL pointer dereference of the socket's parent member because
      the socket can be on the list but the socket's parent member is not yet
      updated by bt_accept_enqueue().
      
      Therefore, add socket locking inside bt_accept_enqueue() so that the
      socket is added to the list AND the parent's socket address is set in the
      socket's parent member. The socket locking ensures that the socket is on
      the list with a valid non-NULL parent member.
      Signed-off-by: NDean Jenkins <Dean_Jenkins@mentor.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      e1633762
  10. 02 3月, 2017 1 次提交
  11. 17 2月, 2017 1 次提交
    • E
      Bluetooth: Fix NULL pointer dereference in bt_sock_recvmsg · 9dcbc313
      Ezequiel Garcia 提交于
      As per the comment in include/linux/net.h, the recvfrom handlers
      should expect msg_name to be NULL. However, bt_sock_recvmsg()
      is currently not checking it, which could lead to a NULL pointer
      dereference.
      
      The following NULL pointer dereference was produced while testing
      L2CAP datagram reception. Note that the kernel is tainted due to
      the r8723bs module being inserted. However, it seems the fix still
      applies.
      
      $ l2test -r -G
      l2test[326]: Receiving ...
      Unable to handle kernel NULL pointer dereference at virtual address 00000000
      pgd = ee008000
      [00000000] *pgd=7f896835
      Internal error: Oops: 817 [#1] PREEMPT SMP ARM
      Modules linked in: r8723bs(O)
      CPU: 0 PID: 326 Comm: l2test Tainted: G           O 4.8.0 #1
      Hardware name: Allwinner sun7i (A20) Family
      task: ef1c6880 task.stack: eea70000
      PC is at __memzero+0x58/0x80
      LR is at l2cap_skb_msg_name+0x1c/0x4c
      pc : [<c02c47d8>]    lr : [<c0506278>]    psr: 00070013
      sp : eea71e60  ip : 00000000  fp : 00034e1c
      r10: 00000000  r9 : 00000000  r8 : eea71ed4
      r7 : 000002a0  r6 : eea71ed8  r5 : 00000000  r4 : ee4a5d80
      r3 : 00000000  r2 : 00000000  r1 : 0000000e  r0 : 00000000
      Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM Segment none
      Control: 10c5387d  Table: 7600806a  DAC: 00000051
      Process l2test (pid: 326, stack limit = 0xeea70210)
      Stack: (0xeea71e60 to 0xeea72000)
      1e60: ee4a5d80 eeac2800 000002a0 c04d7114 173eefa0 00000000 c06ca68e 00000000
      1e80: 00000001 eeac2800 eef23500 00000000 000002a0 eea71ed4 eea70000 c0504d50
      1ea0: 00000000 00000000 eef23500 00000000 00000000 c044e8a0 eea71edc eea9f904
      1ec0: bef89aa0 fffffff7 00000000 00035008 000002a0 00000000 00000000 00000000
      1ee0: 00000000 00000000 eea71ed4 00000000 00000000 00000000 00004000 00000000
      1f00: 0000011b c01078c4 eea70000 c044e5e4 00000000 00000000 642f0001 6c2f7665
      1f20: 0000676f 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      1f40: 00000000 00000000 00000000 00000000 00000000 ffffffff 00000001 bef89ad8
      1f60: 000000a8 c01078c4 eea70000 00000000 00034e1c c01e6c74 00000000 00000000
      1f80: 00034e1c 000341f8 00000000 00000123 c01078c4 c044e90c 00000000 00000000
      1fa0: 000002a0 c0107700 00034e1c 000341f8 00000003 00035008 000002a0 00000000
      1fc0: 00034e1c 000341f8 00000000 00000123 00000000 00000000 00011ffc 00034e1c
      1fe0: 00000000 bef89aa4 0001211c b6eebb60 60070010 00000003 00000000 00000000
      [<c02c47d8>] (__memzero) from [<c0506278>] (l2cap_skb_msg_name+0x1c/0x4c)
      [<c0506278>] (l2cap_skb_msg_name) from [<c04d7114>] (bt_sock_recvmsg+0x128/0x160)
      [<c04d7114>] (bt_sock_recvmsg) from [<c0504d50>] (l2cap_sock_recvmsg+0x98/0x134)
      [<c0504d50>] (l2cap_sock_recvmsg) from [<c044e8a0>] (SyS_recvfrom+0x94/0xec)
      [<c044e8a0>] (SyS_recvfrom) from [<c044e90c>] (SyS_recv+0x14/0x1c)
      [<c044e90c>] (SyS_recv) from [<c0107700>] (ret_fast_syscall+0x0/0x3c)
      Code: e3110010 18a0500c e49de004 e3110008 (18a0000c)
      ---[ end trace 224e35e79fe06b42 ]---
      Signed-off-by: NEzequiel Garcia <ezequiel@vanguardiasur.com.ar>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      9dcbc313
  12. 20 9月, 2016 2 次提交
  13. 26 8月, 2016 1 次提交
  14. 08 7月, 2016 1 次提交
    • D
      Bluetooth: Fix bt_sock_recvmsg return value · b5f34f94
      Denis Kenzior 提交于
      If recvmsg is called with a destination buffer that is too small to
      receive the contents of skb in its entirety, the return value from
      recvmsg was inconsistent with common SOCK_SEQPACKET or SOCK_DGRAM
      semantics.
      
      If destination buffer provided by userspace is too small (e.g. len <
      copied), then MSG_TRUNC flag is set and copied is returned.  Instead, it
      should return the length of the message, which is consistent with how
      other datagram based sockets act.  Quoting 'man recv':
      
      "All  three calls return the length of the message on successful comple‐
      tion.  If a message is too long to fit in the supplied  buffer,  excess
      bytes  may  be discarded depending on the type of socket the message is
      received from."
      
      and
      
      "MSG_TRUNC (since Linux 2.2)
      
          For   raw   (AF_PACKET),   Internet   datagram   (since    Linux
          2.4.27/2.6.8),  netlink  (since Linux 2.6.22), and UNIX datagram
          (since Linux 3.4) sockets: return the real length of the packet
          or datagram, even when it was longer than the passed buffer."
      Signed-off-by: NDenis Kenzior <denkenz@gmail.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      b5f34f94
  15. 14 4月, 2016 1 次提交
  16. 20 12月, 2015 1 次提交
  17. 10 12月, 2015 1 次提交
    • Y
      Bluetooth: Fix locking in bt_accept_dequeue after disconnection · 1a11ec89
      Yichen Zhao 提交于
      Fix a crash that may happen when bt_accept_dequeue is run after a
      Bluetooth connection has been disconnected. bt_accept_unlink was called
      after release_sock, permitting bt_accept_unlink to run twice on the same
      socket and cause a NULL pointer dereference.
      
      [50510.241632] BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
      [50510.241694] IP: [<ffffffffc01243f7>] bt_accept_unlink+0x47/0xa0 [bluetooth]
      [50510.241759] PGD 0
      [50510.241776] Oops: 0002 [#1] SMP
      [50510.241802] Modules linked in: rtl8192cu rtl_usb rtlwifi rtl8192c_common 8021q garp stp mrp llc rfcomm bnep nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp arc4 ath9k ath9k_common ath9k_hw ath kvm eeepc_wmi asus_wmi mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek sparse_keymap crct10dif_pclmul snd_hda_codec_generic crc32_pclmul snd_hda_intel snd_hda_controller cfg80211 snd_hda_codec i915 snd_hwdep snd_pcm ghash_clmulni_intel snd_timer snd soundcore serio_raw cryptd drm_kms_helper drm i2c_algo_bit shpchp ath3k mei_me lpc_ich btusb bluetooth 6lowpan_iphc mei lp parport wmi video mac_hid psmouse ahci libahci r8169 mii
      [50510.242279] CPU: 0 PID: 934 Comm: krfcommd Not tainted 3.16.0-49-generic #65~14.04.1-Ubuntu
      [50510.242327] Hardware name: ASUSTeK Computer INC. VM40B/VM40B, BIOS 1501 12/09/2014
      [50510.242370] task: ffff8800d9068a30 ti: ffff8800d7a54000 task.ti: ffff8800d7a54000
      [50510.242413] RIP: 0010:[<ffffffffc01243f7>]  [<ffffffffc01243f7>] bt_accept_unlink+0x47/0xa0 [bluetooth]
      [50510.242480] RSP: 0018:ffff8800d7a57d58  EFLAGS: 00010246
      [50510.242511] RAX: 0000000000000000 RBX: ffff880119bb8c00 RCX: ffff880119bb8eb0
      [50510.242552] RDX: ffff880119bb8eb0 RSI: 00000000fffffe01 RDI: ffff880119bb8c00
      [50510.242592] RBP: ffff8800d7a57d60 R08: 0000000000000283 R09: 0000000000000001
      [50510.242633] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800d8da9eb0
      [50510.242673] R13: ffff8800d74fdb80 R14: ffff880119bb8c00 R15: ffff8800d8da9c00
      [50510.242715] FS:  0000000000000000(0000) GS:ffff88011fa00000(0000) knlGS:0000000000000000
      [50510.242761] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [50510.242794] CR2: 00000000000001a8 CR3: 0000000001c13000 CR4: 00000000001407f0
      [50510.242835] Stack:
      [50510.242849]  ffff880119bb8eb0 ffff8800d7a57da0 ffffffffc0124506 ffff8800d8da9eb0
      [50510.242899]  ffff8800d8da9c00 ffff8800d9068a30 0000000000000000 ffff8800d74fdb80
      [50510.242949]  ffff8800d6f85208 ffff8800d7a57e08 ffffffffc0159985 000000000000001f
      [50510.242999] Call Trace:
      [50510.243027]  [<ffffffffc0124506>] bt_accept_dequeue+0xb6/0x180 [bluetooth]
      [50510.243085]  [<ffffffffc0159985>] l2cap_sock_accept+0x125/0x220 [bluetooth]
      [50510.243128]  [<ffffffff810a1b30>] ? wake_up_state+0x20/0x20
      [50510.243163]  [<ffffffff8164946e>] kernel_accept+0x4e/0xa0
      [50510.243200]  [<ffffffffc05b97cd>] rfcomm_run+0x1ad/0x890 [rfcomm]
      [50510.243238]  [<ffffffffc05b9620>] ? rfcomm_process_rx+0x8a0/0x8a0 [rfcomm]
      [50510.243281]  [<ffffffff81091572>] kthread+0xd2/0xf0
      [50510.243312]  [<ffffffff810914a0>] ? kthread_create_on_node+0x1c0/0x1c0
      [50510.243353]  [<ffffffff8176e9d8>] ret_from_fork+0x58/0x90
      [50510.243387]  [<ffffffff810914a0>] ? kthread_create_on_node+0x1c0/0x1c0
      [50510.243424] Code: 00 48 8b 93 b8 02 00 00 48 8d 83 b0 02 00 00 48 89 51 08 48 89 0a 48 89 83 b0 02 00 00 48 89 83 b8 02 00 00 48 8b 83 c0 02 00 00 <66> 83 a8 a8 01 00 00 01 48 c7 83 c0 02 00 00 00 00 00 00 f0 ff
      [50510.243685] RIP  [<ffffffffc01243f7>] bt_accept_unlink+0x47/0xa0 [bluetooth]
      [50510.243737]  RSP <ffff8800d7a57d58>
      [50510.243758] CR2: 00000000000001a8
      [50510.249457] ---[ end trace bb984f932c4e3ab3 ]---
      Signed-off-by: NYichen Zhao <zhaoyichen@google.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      1a11ec89
  18. 02 12月, 2015 1 次提交
    • E
      net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA · 9cd3e072
      Eric Dumazet 提交于
      This patch is a cleanup to make following patch easier to
      review.
      
      Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
      from (struct socket)->flags to a (struct socket_wq)->flags
      to benefit from RCU protection in sock_wake_async()
      
      To ease backports, we rename both constants.
      
      Two new helpers, sk_set_bit(int nr, struct sock *sk)
      and sk_clear_bit(int net, struct sock *sk) are added so that
      following patch can change their implementation.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9cd3e072
  19. 20 11月, 2015 2 次提交
  20. 26 10月, 2015 1 次提交
  21. 22 10月, 2015 1 次提交
  22. 07 3月, 2015 1 次提交
  23. 03 3月, 2015 1 次提交
  24. 02 3月, 2015 1 次提交
  25. 30 12月, 2014 1 次提交
  26. 04 12月, 2014 1 次提交
  27. 06 11月, 2014 1 次提交
    • D
      net: Add and use skb_copy_datagram_msg() helper. · 51f3d02b
      David S. Miller 提交于
      This encapsulates all of the skb_copy_datagram_iovec() callers
      with call argument signature "skb, offset, msghdr->msg_iov, length".
      
      When we move to iov_iters in the networking, the iov_iter object will
      sit in the msghdr.
      
      Having a helper like this means there will be less places to touch
      during that transformation.
      
      Based upon descriptions and patch from Al Viro.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      51f3d02b
  28. 15 9月, 2014 1 次提交
  29. 03 7月, 2014 1 次提交
  30. 21 2月, 2014 1 次提交
  31. 08 12月, 2013 1 次提交
  32. 21 11月, 2013 1 次提交
    • H
      net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426
      Hannes Frederic Sowa 提交于
      This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
      set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
      to return msg_name to the user.
      
      This prevents numerous uninitialized memory leaks we had in the
      recvmsg handlers and makes it harder for new code to accidentally leak
      uninitialized memory.
      
      Optimize for the case recvfrom is called with NULL as address. We don't
      need to copy the address at all, so set it to NULL before invoking the
      recvmsg handler. We can do so, because all the recvmsg handlers must
      cope with the case a plain read() is called on them. read() also sets
      msg_name to NULL.
      
      Also document these changes in include/linux/net.h as suggested by David
      Miller.
      
      Changes since RFC:
      
      Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
      non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      affect sendto as it would bail out earlier while trying to copy-in the
      address. It also more naturally reflects the logic by the callers of
      verify_iovec.
      
      With this change in place I could remove "
      if (!uaddr || msg_sys->msg_namelen == 0)
      	msg->msg_name = NULL
      ".
      
      This change does not alter the user visible error logic as we ignore
      msg_namelen as long as msg_name is NULL.
      
      Also remove two unnecessary curly brackets in ___sys_recvmsg and change
      comments to netdev style.
      
      Cc: David Miller <davem@davemloft.net>
      Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f3d33426
  33. 18 10月, 2013 1 次提交
  34. 14 10月, 2013 4 次提交