1. 23 Mar, 2021 1 commit
    • Bluetooth: verify AMP hci_chan before amp_destroy · 5c4c8c95
      Committed by Archie Pusaka
      hci_chan can be created in 2 places: hci_loglink_complete_evt() if
      it is an AMP hci_chan, or l2cap_conn_add() otherwise. In theory,
      only an AMP hci_chan should be removed by a call to
      hci_disconn_loglink_complete_evt(). However, a misbehaving controller
      might send the event that triggers that function and thereby destroy
      an hci_chan that was not created by hci_loglink_complete_evt().
      
      This patch adds a verification that the destroyed hci_chan must have
      been init'd by hci_loglink_complete_evt().
      
      Example crash call trace:
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xe3/0x144 lib/dump_stack.c:118
       print_address_description+0x67/0x22a mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report mm/kasan/report.c:412 [inline]
       kasan_report+0x251/0x28f mm/kasan/report.c:396
       hci_send_acl+0x3b/0x56e net/bluetooth/hci_core.c:4072
       l2cap_send_cmd+0x5af/0x5c2 net/bluetooth/l2cap_core.c:877
       l2cap_send_move_chan_cfm_icid+0x8e/0xb1 net/bluetooth/l2cap_core.c:4661
       l2cap_move_fail net/bluetooth/l2cap_core.c:5146 [inline]
       l2cap_move_channel_rsp net/bluetooth/l2cap_core.c:5185 [inline]
       l2cap_bredr_sig_cmd net/bluetooth/l2cap_core.c:5464 [inline]
       l2cap_sig_channel net/bluetooth/l2cap_core.c:5799 [inline]
       l2cap_recv_frame+0x1d12/0x51aa net/bluetooth/l2cap_core.c:7023
       l2cap_recv_acldata+0x2ea/0x693 net/bluetooth/l2cap_core.c:7596
       hci_acldata_packet net/bluetooth/hci_core.c:4606 [inline]
       hci_rx_work+0x2bd/0x45e net/bluetooth/hci_core.c:4796
       process_one_work+0x6f8/0xb50 kernel/workqueue.c:2175
       worker_thread+0x4fc/0x670 kernel/workqueue.c:2321
       kthread+0x2f0/0x304 kernel/kthread.c:253
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415
      
      Allocated by task 38:
       set_track mm/kasan/kasan.c:460 [inline]
       kasan_kmalloc+0x8d/0x9a mm/kasan/kasan.c:553
       kmem_cache_alloc_trace+0x102/0x129 mm/slub.c:2787
       kmalloc include/linux/slab.h:515 [inline]
       kzalloc include/linux/slab.h:709 [inline]
       hci_chan_create+0x86/0x26d net/bluetooth/hci_conn.c:1674
       l2cap_conn_add.part.0+0x1c/0x814 net/bluetooth/l2cap_core.c:7062
       l2cap_conn_add net/bluetooth/l2cap_core.c:7059 [inline]
       l2cap_connect_cfm+0x134/0x852 net/bluetooth/l2cap_core.c:7381
       hci_connect_cfm+0x9d/0x122 include/net/bluetooth/hci_core.h:1404
       hci_remote_ext_features_evt net/bluetooth/hci_event.c:4161 [inline]
       hci_event_packet+0x463f/0x72fa net/bluetooth/hci_event.c:5981
       hci_rx_work+0x197/0x45e net/bluetooth/hci_core.c:4791
       process_one_work+0x6f8/0xb50 kernel/workqueue.c:2175
       worker_thread+0x4fc/0x670 kernel/workqueue.c:2321
       kthread+0x2f0/0x304 kernel/kthread.c:253
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415
      
      Freed by task 1732:
       set_track mm/kasan/kasan.c:460 [inline]
       __kasan_slab_free mm/kasan/kasan.c:521 [inline]
       __kasan_slab_free+0x106/0x128 mm/kasan/kasan.c:493
       slab_free_hook mm/slub.c:1409 [inline]
       slab_free_freelist_hook+0xaa/0xf6 mm/slub.c:1436
       slab_free mm/slub.c:3009 [inline]
       kfree+0x182/0x21e mm/slub.c:3972
       hci_disconn_loglink_complete_evt net/bluetooth/hci_event.c:4891 [inline]
       hci_event_packet+0x6a1c/0x72fa net/bluetooth/hci_event.c:6050
       hci_rx_work+0x197/0x45e net/bluetooth/hci_core.c:4791
       process_one_work+0x6f8/0xb50 kernel/workqueue.c:2175
       worker_thread+0x4fc/0x670 kernel/workqueue.c:2321
       kthread+0x2f0/0x304 kernel/kthread.c:253
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415
      
      The buggy address belongs to the object at ffff8881d7af9180
       which belongs to the cache kmalloc-128 of size 128
      The buggy address is located 24 bytes inside of
       128-byte region [ffff8881d7af9180, ffff8881d7af9200)
      The buggy address belongs to the page:
      page:ffffea00075ebe40 count:1 mapcount:0 mapping:ffff8881da403200 index:0x0
      flags: 0x8000000000000200(slab)
      raw: 8000000000000200 dead000000000100 dead000000000200 ffff8881da403200
      raw: 0000000000000000 0000000080150015 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8881d7af9080: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
       ffff8881d7af9100: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      >ffff8881d7af9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                  ^
       ffff8881d7af9200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff8881d7af9280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      Signed-off-by: Archie Pusaka <apusaka@chromium.org>
      Reported-by: syzbot+98228e7407314d2d4ba2@syzkaller.appspotmail.com
      Reviewed-by: Alain Michaud <alainm@chromium.org>
      Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
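
      The verification described in this commit can be modelled in user
      space. What follows is a minimal sketch, not the kernel patch itself:
      struct chan_model, its amp flag, chan_model_create() and
      handle_disconn_loglink() are hypothetical stand-ins, and the sketch
      assumes the creating path is recorded at creation time and checked
      before the channel is destroyed.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct hci_chan; only the fields needed here. */
struct chan_model {
    unsigned short handle;
    bool amp;   /* true only when created by the logical-link path */
};

/* Create a channel and record which path created it. */
static struct chan_model *chan_model_create(unsigned short handle,
                                            bool from_loglink)
{
    struct chan_model *c = calloc(1, sizeof(*c));

    if (!c)
        return NULL;
    c->handle = handle;
    c->amp = from_loglink;
    return c;
}

/*
 * Model of the disconnect-logical-link handler: destroy the channel only
 * if it was created by the logical-link path. Returns true if destroyed.
 */
static bool handle_disconn_loglink(struct chan_model **slot)
{
    struct chan_model *c = *slot;

    if (!c || !c->amp)
        return false;   /* not an AMP channel: leave it alone */

    free(c);
    *slot = NULL;
    return true;
}
```

      In this model a channel created with from_loglink set to false
      survives a spurious disconnect event, so later users of the channel
      never see freed memory.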
  2. 22 Mar, 2021 1 commit
    • Bluetooth: Set CONF_NOT_COMPLETE as l2cap_chan default · 3a9d54b1
      Committed by Archie Pusaka
      Currently l2cap_chan_set_defaults() resets chan->conf_state to zero.
      However, there is a flag CONF_NOT_COMPLETE which is set when
      creating the l2cap_chan. It is suggested that the flag should be
      cleared when l2cap_chan is ready, but when l2cap_chan_set_defaults()
      is called, l2cap_chan is not yet ready. Therefore, we must set this
      flag as the default.
      
      Example crash call trace:
      __dump_stack lib/dump_stack.c:15 [inline]
      dump_stack+0xc4/0x118 lib/dump_stack.c:56
      panic+0x1c6/0x38b kernel/panic.c:117
      __warn+0x170/0x1b9 kernel/panic.c:471
      warn_slowpath_fmt+0xc7/0xf8 kernel/panic.c:494
      debug_print_object+0x175/0x193 lib/debugobjects.c:260
      debug_object_assert_init+0x171/0x1bf lib/debugobjects.c:614
      debug_timer_assert_init kernel/time/timer.c:629 [inline]
      debug_assert_init kernel/time/timer.c:677 [inline]
      del_timer+0x7c/0x179 kernel/time/timer.c:1034
      try_to_grab_pending+0x81/0x2e5 kernel/workqueue.c:1230
      cancel_delayed_work+0x7c/0x1c4 kernel/workqueue.c:2929
      l2cap_clear_timer+0x1e/0x41 include/net/bluetooth/l2cap.h:834
      l2cap_chan_del+0x2d8/0x37e net/bluetooth/l2cap_core.c:640
      l2cap_chan_close+0x532/0x5d8 net/bluetooth/l2cap_core.c:756
      l2cap_sock_shutdown+0x806/0x969 net/bluetooth/l2cap_sock.c:1174
      l2cap_sock_release+0x64/0x14d net/bluetooth/l2cap_sock.c:1217
      __sock_release+0xda/0x217 net/socket.c:580
      sock_close+0x1b/0x1f net/socket.c:1039
      __fput+0x322/0x55c fs/file_table.c:208
      ____fput+0x17/0x19 fs/file_table.c:244
      task_work_run+0x19b/0x1d3 kernel/task_work.c:115
      exit_task_work include/linux/task_work.h:21 [inline]
      do_exit+0xe4c/0x204a kernel/exit.c:766
      do_group_exit+0x291/0x291 kernel/exit.c:891
      get_signal+0x749/0x1093 kernel/signal.c:2396
      do_signal+0xa5/0xcdb arch/x86/kernel/signal.c:737
      exit_to_usermode_loop arch/x86/entry/common.c:243 [inline]
      prepare_exit_to_usermode+0xed/0x235 arch/x86/entry/common.c:277
      syscall_return_slowpath+0x3a7/0x3b3 arch/x86/entry/common.c:348
      int_ret_from_sys_call+0x25/0xa3
      Signed-off-by: Archie Pusaka <apusaka@chromium.org>
      Reported-by: syzbot+338f014a98367a08a114@syzkaller.appspotmail.com
      Reviewed-by: Alain Michaud <alainm@chromium.org>
      Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
      Reviewed-by: Guenter Roeck <groeck@chromium.org>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
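
      The default described in this commit can be sketched as follows. This
      is a user-space model under stated assumptions: struct l2chan_model,
      the bit positions and both helper names are hypothetical, and plain
      bit operations stand in for whatever bit helpers the kernel uses.

```c
#include <stdbool.h>

/* Hypothetical bit positions; the kernel's enum values may differ. */
enum {
    CONF_REQ_SENT = 0,
    CONF_NOT_COMPLETE = 1,
};

/* Hypothetical stand-in for the part of l2cap_chan that holds the flags. */
struct l2chan_model {
    unsigned long conf_state;
};

static void l2chan_model_set_defaults(struct l2chan_model *c)
{
    /* Reset all configuration flags ... */
    c->conf_state = 0;
    /* ... but keep the channel marked as not yet configured. */
    c->conf_state |= 1UL << CONF_NOT_COMPLETE;
}

static bool l2chan_model_conf_not_complete(const struct l2chan_model *c)
{
    return (c->conf_state & (1UL << CONF_NOT_COMPLETE)) != 0;
}
```

      After l2chan_model_set_defaults() every other flag is clear while
      CONF_NOT_COMPLETE stays set, which matches the state the commit
      message says a not-yet-ready channel should have.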
  3. 18 Mar, 2021 2 commits
  4. 17 Mar, 2021 1 commit
  5. 16 Mar, 2021 4 commits
    • Bluetooth: avoid deadlock between hci_dev->lock and socket lock · 17486960
      Committed by Jiri Kosina
      Commit eab2404b ("Bluetooth: Add BT_PHY socket option") added a
      dependency between socket lock and hci_dev->lock that could lead to
      deadlock.
      
      It turns out that hci_conn_get_phy() does not rely in any way on hdev
      being immutable while the function runs, nor does it look at any of
      the members of hdev, so there is no need to hold that lock.
      
      This fixes the lockdep splat below:
      
       ======================================================
       WARNING: possible circular locking dependency detected
       5.12.0-rc1-00026-g73d464503354 #10 Not tainted
       ------------------------------------------------------
       bluetoothd/1118 is trying to acquire lock:
       ffff8f078383c078 (&hdev->lock){+.+.}-{3:3}, at: hci_conn_get_phy+0x1c/0x150 [bluetooth]
      
       but task is already holding lock:
       ffff8f07e831d920 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}, at: l2cap_sock_getsockopt+0x8b/0x610
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #3 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}:
              lock_sock_nested+0x72/0xa0
              l2cap_sock_ready_cb+0x18/0x70 [bluetooth]
              l2cap_config_rsp+0x27a/0x520 [bluetooth]
              l2cap_sig_channel+0x658/0x1330 [bluetooth]
              l2cap_recv_frame+0x1ba/0x310 [bluetooth]
              hci_rx_work+0x1cc/0x640 [bluetooth]
              process_one_work+0x244/0x5f0
              worker_thread+0x3c/0x380
              kthread+0x13e/0x160
              ret_from_fork+0x22/0x30
      
       -> #2 (&chan->lock#2/1){+.+.}-{3:3}:
              __mutex_lock+0xa3/0xa10
              l2cap_chan_connect+0x33a/0x940 [bluetooth]
              l2cap_sock_connect+0x141/0x2a0 [bluetooth]
              __sys_connect+0x9b/0xc0
              __x64_sys_connect+0x16/0x20
              do_syscall_64+0x33/0x80
              entry_SYSCALL_64_after_hwframe+0x44/0xae
      
       -> #1 (&conn->chan_lock){+.+.}-{3:3}:
              __mutex_lock+0xa3/0xa10
              l2cap_chan_connect+0x322/0x940 [bluetooth]
              l2cap_sock_connect+0x141/0x2a0 [bluetooth]
              __sys_connect+0x9b/0xc0
              __x64_sys_connect+0x16/0x20
              do_syscall_64+0x33/0x80
              entry_SYSCALL_64_after_hwframe+0x44/0xae
      
       -> #0 (&hdev->lock){+.+.}-{3:3}:
              __lock_acquire+0x147a/0x1a50
              lock_acquire+0x277/0x3d0
              __mutex_lock+0xa3/0xa10
              hci_conn_get_phy+0x1c/0x150 [bluetooth]
              l2cap_sock_getsockopt+0x5a9/0x610 [bluetooth]
              __sys_getsockopt+0xcc/0x200
              __x64_sys_getsockopt+0x20/0x30
              do_syscall_64+0x33/0x80
              entry_SYSCALL_64_after_hwframe+0x44/0xae
      
       other info that might help us debug this:
      
       Chain exists of:
         &hdev->lock --> &chan->lock#2/1 --> sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP);
                                      lock(&chan->lock#2/1);
                                      lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP);
         lock(&hdev->lock);
      
        *** DEADLOCK ***
      
       1 lock held by bluetoothd/1118:
        #0: ffff8f07e831d920 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}, at: l2cap_sock_getsockopt+0x8b/0x610 [bluetooth]
      
       stack backtrace:
       CPU: 3 PID: 1118 Comm: bluetoothd Not tainted 5.12.0-rc1-00026-g73d464503354 #10
       Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017
       Call Trace:
        dump_stack+0x7f/0xa1
        check_noncircular+0x105/0x120
        ? __lock_acquire+0x147a/0x1a50
        __lock_acquire+0x147a/0x1a50
        lock_acquire+0x277/0x3d0
        ? hci_conn_get_phy+0x1c/0x150 [bluetooth]
        ? __lock_acquire+0x2e1/0x1a50
        ? lock_is_held_type+0xb4/0x120
        ? hci_conn_get_phy+0x1c/0x150 [bluetooth]
        __mutex_lock+0xa3/0xa10
        ? hci_conn_get_phy+0x1c/0x150 [bluetooth]
        ? lock_acquire+0x277/0x3d0
        ? mark_held_locks+0x49/0x70
        ? mark_held_locks+0x49/0x70
        ? hci_conn_get_phy+0x1c/0x150 [bluetooth]
        hci_conn_get_phy+0x1c/0x150 [bluetooth]
        l2cap_sock_getsockopt+0x5a9/0x610 [bluetooth]
        __sys_getsockopt+0xcc/0x200
        __x64_sys_getsockopt+0x20/0x30
        do_syscall_64+0x33/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7fb73df33eee
       Code: 48 8b 0d 85 0f 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 52 0f 0c 00 f7 d8 64 89 01 48
       RSP: 002b:00007fffcfbbbf08 EFLAGS: 00000203 ORIG_RAX: 0000000000000037
       RAX: ffffffffffffffda RBX: 0000000000000019 RCX: 00007fb73df33eee
       RDX: 000000000000000e RSI: 0000000000000112 RDI: 0000000000000018
       RBP: 0000000000000000 R08: 00007fffcfbbbf44 R09: 0000000000000000
       R10: 00007fffcfbbbf3c R11: 0000000000000203 R12: 0000000000000000
       R13: 0000000000000018 R14: 0000000000000000 R15: 0000556fcefc70d0
      
      Fixes: eab2404b ("Bluetooth: Add BT_PHY socket option")
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
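
      The circular dependency in the splat above can be reproduced with a
      toy lock-order tracker. This sketch is not how lockdep is implemented;
      the enum names and both functions are hypothetical. It records "lock B
      was taken while holding lock A" as a directed edge and refuses an edge
      that would close a cycle.

```c
#include <stdbool.h>

enum lock_class { HDEV_LOCK, CONN_CHAN_LOCK, CHAN_LOCK, SK_LOCK, NR_CLASSES };

/* edge[a][b] is true when lock b has been taken while holding lock a. */
static bool edge[NR_CLASSES][NR_CLASSES];

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++) {
        if (edge[from][next] && !seen[next] &&
            reachable((enum lock_class)next, to, seen))
            return true;
    }
    return false;
}

/*
 * Record "held -> wanted". Returns false and records nothing if the new
 * edge would close a cycle, i.e. if 'held' is reachable from 'wanted'.
 */
static bool record_dependency(enum lock_class held, enum lock_class wanted)
{
    bool seen[NR_CLASSES] = { false };

    if (reachable(wanted, held, seen))
        return false;
    edge[held][wanted] = true;
    return true;
}
```

      Feeding the model the chain from the report (hdev->lock, then
      conn->chan_lock, then chan->lock, then the socket lock) and then
      asking for hdev->lock while holding the socket lock is rejected. The
      fix removes that last edge by not taking hdev->lock in
      hci_conn_get_phy() at all.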
    • Bluetooth: SMP: Convert BT_ERR/BT_DBG to bt_dev_err/bt_dev_dbg · 2e1614f7
      Committed by Luiz Augusto von Dentz
      This converts instances of BT_ERR and BT_DBG to bt_dev_err and
      bt_dev_dbg which can be enabled at runtime when BT_FEATURE_DEBUG is
      enabled.
      
      Note: Not all instances could be converted as some are exercised by
      selftest.
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
    • Bluetooth: L2CAP: Fix not checking for maximum number of DCID · 7cf3b1dd
      Committed by Luiz Augusto von Dentz
      When receiving L2CAP_CREDIT_BASED_CONNECTION_REQ the remote may request
      more channels than allowed by the spec (10 octets = 5 CIDs), so this
      checks if the number of channels is bigger than the maximum allowed and
      responds with an error.
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
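
      The bound from this commit (10 octets of 2-octet CIDs, so at most 5
      channels) can be sketched as a standalone check. The names
      ECRED_MAX_CID, enum ecred_check and ecred_check_cid_count() are
      hypothetical, and rejecting an odd-length CID list is an extra
      assumption of this sketch rather than something the commit message
      states.

```c
#include <stddef.h>
#include <stdint.h>

#define ECRED_MAX_CID 5 /* 10 octets of 2-octet CIDs, per the commit text */

/* Hypothetical result codes for this sketch. */
enum ecred_check { ECRED_OK = 0, ECRED_INVALID_PARAMS = -1 };

/*
 * cid_bytes is the length of the CID list that follows the fixed request
 * header. On success, *num_cids receives the number of requested channels;
 * on failure it is left untouched.
 */
static enum ecred_check ecred_check_cid_count(size_t cid_bytes,
                                              size_t *num_cids)
{
    size_t n;

    if (cid_bytes % sizeof(uint16_t))
        return ECRED_INVALID_PARAMS;    /* truncated CID (assumption) */

    n = cid_bytes / sizeof(uint16_t);
    if (n > ECRED_MAX_CID)
        return ECRED_INVALID_PARAMS;    /* more channels than allowed */

    *num_cids = n;
    return ECRED_OK;
}
```

      A request carrying 10 octets of CIDs passes with a count of 5, while
      one carrying 12 octets is rejected before any per-channel state is
      touched.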
    • Bluetooth: Cancel le_scan_restart work when stopping discovery · c06632a4
      Committed by Sonny Sasaka
      Not cancelling it has caused a bug where passive background scanning is
      disabled out of the blue, preventing BLE keyboards/mice from reconnecting.
      Here is how it happens:
      After hci_req_stop_discovery, there is still le_scan_restart_work
      scheduled. Invocation of le_scan_restart_work causes a harmful
      le_scan_disable_work to be scheduled. This le_scan_disable_work will
      eventually disable passive scanning when the timer fires.
      
      Sample btmon trace:
      
      < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7
              Type: Passive (0x00)
              Interval: 367.500 msec (0x024c)
              Window: 37.500 msec (0x003c)
              Own address type: Public (0x00)
              Filter policy: Accept all advertisement (0x00)
      > HCI Event: Command Complete (0x0e) plen 4
            LE Set Scan Parameters (0x08|0x000b) ncmd 1
              Status: Success (0x00)
      < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2
              Scanning: Enabled (0x01)
              Filter duplicates: Disabled (0x00)
      > HCI Event: Command Complete (0x0e) plen 4
            LE Set Scan Enable (0x08|0x000c) ncmd 2
              Status: Success (0x00)
      ...
      < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2
              Scanning: Disabled (0x00)
              Filter duplicates: Disabled (0x00)
      > HCI Event: Command Complete (0x0e) plen 4
            LE Set Scan Enable (0x08|0x000c) ncmd 2
              Status: Success (0x00)
      // Background scanning stops working from here onwards.
      Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
      Signed-off-by: Sonny Sasaka <sonnysasaka@chromium.org>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
  6. 11 Mar, 2021 2 commits
  7. 08 Mar, 2021 4 commits
  8. 04 Mar, 2021 2 commits
  9. 02 Mar, 2021 1 commit
  10. 27 Feb, 2021 2 commits
    • Bluetooth: btusb: Fix incorrect type in assignment and uninitialized symbol · 201cf397
      Committed by mark-yw.chen
      Warnings: drivers/bluetooth/btusb.c:3775 btusb_mtk_setup() error:
      uninitialized symbol 'fw_version'.
      -> add initial value for fw_version.
      
      Warnings: sparse: sparse: incorrect type in assignment (different base
      types)
      -> add le32_to_cpu to fix incorrect type in assignment.
      Signed-off-by: mark-yw.chen <mark-yw.chen@mediatek.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
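
      Both warnings concern a pattern that can be shown outside the kernel:
      decode a little-endian 32-bit value explicitly, and give the output
      variable an initial value before a call that may fail. In this sketch
      le32_decode() and read_fw_version() are hypothetical user-space
      helpers, not the btusb code; the kernel uses le32_to_cpu() on a
      __le32 value instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value independent of host byte order. */
static uint32_t le32_decode(const uint8_t b[4])
{
    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}

/*
 * Hypothetical reader: returns 0 and stores the version on success,
 * -1 when no data is available. The output is untouched on failure,
 * so callers should initialise it first.
 */
static int read_fw_version(const uint8_t *reg, uint32_t *fw_version)
{
    if (!reg)
        return -1;
    *fw_version = le32_decode(reg);
    return 0;
}
```

      A caller that declares the version as 0 before calling
      read_fw_version() never reads an indeterminate value, even on the
      failure path.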
    • Bluetooth: btusb: support 0cb5:c547 Realtek 8822CE device · 3edc5782
      Committed by Rasmus Moorats
      Some Xiaomi RedmiBook laptop models use the 0cb5:c547 USB identifier
      for their Bluetooth device, so load the appropriate firmware for
      Realtek 8822CE.
      
      -Device(0cb5:c547) from /sys/kernel/debug/usb/devices
      T:  Bus=01 Lev=01 Prnt=01 Port=03 Cnt=02 Dev#=  3 Spd=12   MxCh= 0
      D:  Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=0cb5 ProdID=c547 Rev= 0.00
      S:  Manufacturer=Realtek
      S:  Product=Bluetooth Radio
      S:  SerialNumber=00e04c000001
      C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      Signed-off-by: Rasmus Moorats <xx@nns.ee>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
  11. 22 Feb, 2021 20 commits
    • Merge tag 'perf-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d310ec03
      Committed by Linus Torvalds
      Pull performance event updates from Ingo Molnar:
      
       - Add CPU-PMU support for Intel Sapphire Rapids CPUs
      
       - Extend the perf ABI with PERF_SAMPLE_WEIGHT_STRUCT, to offer
         two-parameter sampling event feedback. Not used yet, but is intended
         for Golden Cove CPU-PMU, which can provide both the instruction
         latency and the cache latency information for memory profiling
         events.
      
       - Remove experimental, default-disabled perfmon-v4 counter_freezing
         support that could only be enabled via a boot option. The hardware is
         hopelessly broken, we'd like to make sure nobody starts relying on
         this, as it would only end in tears.
      
       - Fix energy/power events on Intel SPR platforms
      
       - Simplify the uprobes resume_execution() logic
      
       - Misc smaller fixes.
      
      * tag 'perf-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/rapl: Fix psys-energy event on Intel SPR platform
        perf/x86/rapl: Only check lower 32bits for RAPL energy counters
        perf/x86/rapl: Add msr mask support
        perf/x86/kvm: Add Cascade Lake Xeon steppings to isolation_ucodes[]
        perf/x86/intel: Support CPUID 10.ECX to disable fixed counters
        perf/x86/intel: Add perf core PMU support for Sapphire Rapids
        perf/x86/intel: Filter unsupported Topdown metrics event
        perf/x86/intel: Factor out intel_update_topdown_event()
        perf/core: Add PERF_SAMPLE_WEIGHT_STRUCT
        perf/intel: Remove Perfmon-v4 counter_freezing support
        x86/perf: Use static_call for x86_pmu.guest_get_msrs
        perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info
        perf/x86/intel/uncore: Store the logical die id instead of the physical die id.
        x86/kprobes: Do not decode opcode in resume_execution()
    • Merge tag 'sched-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 657bd90c
      Committed by Linus Torvalds
      Pull scheduler updates from Ingo Molnar:
       "Core scheduler updates:
      
         - Add CONFIG_PREEMPT_DYNAMIC: this in its current form adds the
           preempt=none/voluntary/full boot options (default: full), to allow
           distros to build a PREEMPT kernel but fall back to close to
           PREEMPT_VOLUNTARY (or PREEMPT_NONE) runtime scheduling behavior via
           a boot time selection.
      
           There's also the /debug/sched_debug switch to do this at runtime.
      
           This feature is implemented via runtime patching (a new variant of
           static calls).
      
           The scope of the runtime patching can be best reviewed by looking
           at the sched_dynamic_update() function in kernel/sched/core.c.
      
           ( Note that the dynamic none/voluntary mode isn't 100% identical,
             for example preempt-RCU is available in all cases, plus the
             preempt count is maintained in all models, which has runtime
             overhead even with the code patching. )
      
           The PREEMPT_VOLUNTARY/PREEMPT_NONE models, used by the vast
           majority of distributions, are supposed to be unaffected.
      
         - Fix ignored rescheduling after rcu_eqs_enter(). This is a bug that
           was found via rcutorture triggering a hang. The bug is that
           rcu_idle_enter() may wake up a NOCB kthread, but this happens after
           the last generic need_resched() check. Some cpuidle drivers fix it
           by chance but many others don't.
      
           In true 2020 fashion the original bug fix has grown into a 5-patch
           scheduler/RCU fix series plus another 16 RCU patches to address the
           underlying issue of missed preemption events. These are the initial
           fixes that should fix current incarnations of the bug.
      
         - Clean up rbtree usage in the scheduler, by providing & using the
           following consistent set of rbtree APIs:
      
             partial-order; less() based:
               - rb_add(): add a new entry to the rbtree
               - rb_add_cached(): like rb_add(), but for a rb_root_cached
      
             total-order; cmp() based:
               - rb_find(): find an entry in an rbtree
               - rb_find_add(): find an entry, and add if not found
      
               - rb_find_first(): find the first (leftmost) matching entry
               - rb_next_match(): continue from rb_find_first()
               - rb_for_each(): iterate a sub-tree using the previous two
      
         - Improve the SMP/NUMA load-balancer: scan for an idle sibling in a
           single pass. This is a 4-commit series where each commit improves
           one aspect of the idle sibling scan logic.
      
         - Improve the cpufreq cooling driver by getting the effective CPU
           utilization metrics from the scheduler
      
         - Improve the fair scheduler's active load-balancing logic by
           reducing the number of active LB attempts & lengthen the
           load-balancing interval. This improves stress-ng mmapfork
           performance.
      
         - Fix CFS's estimated utilization (util_est) calculation bug that can
           result in too high utilization values
      
        Misc updates & fixes:
      
         - Fix the HRTICK reprogramming & optimization feature
      
         - Fix SCHED_SOFTIRQ raising race & warning in the CPU offlining code
      
         - Reduce dl_add_task_root_domain() overhead
      
         - Fix uprobes refcount bug
      
         - Process pending softirqs in flush_smp_call_function_from_idle()
      
         - Clean up task priority related defines, remove *USER_*PRIO and
           USER_PRIO()
      
         - Simplify the sched_init_numa() deduplication sort
      
         - Documentation updates
      
         - Fix EAS bug in update_misfit_status(), which degraded the quality
           of energy-balancing
      
         - Smaller cleanups"
      
      * tag 'sched-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
        sched,x86: Allow !PREEMPT_DYNAMIC
        entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point
        entry: Explicitly flush pending rcuog wakeup before last rescheduling point
        rcu/nocb: Trigger self-IPI on late deferred wake up before user resume
        rcu/nocb: Perform deferred wake up before last idle's need_resched() check
        rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers
        sched/features: Distinguish between NORMAL and DEADLINE hrtick
        sched/features: Fix hrtick reprogramming
        sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()
        uprobes: (Re)add missing get_uprobe() in __find_uprobe()
        smp: Process pending softirqs in flush_smp_call_function_from_idle()
        sched: Harden PREEMPT_DYNAMIC
        static_call: Allow module use without exposing static_call_key
        sched: Add /debug/sched_preempt
        preempt/dynamic: Support dynamic preempt with preempt= boot option
        preempt/dynamic: Provide irqentry_exit_cond_resched() static call
        preempt/dynamic: Provide preempt_schedule[_notrace]() static calls
        preempt/dynamic: Provide cond_resched() and might_resched() static calls
        preempt: Introduce CONFIG_PREEMPT_DYNAMIC
        static_call: Provide DEFINE_STATIC_CALL_RET0()
        ...
    • Merge tag 'core-mm-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7b15c27e
      Committed by Linus Torvalds
      Pull tlb gather updates from Ingo Molnar:
       "These fix MM (soft-)dirty bit management in the procfs code & clean
        up the TLB gather API"
      
      * tag 'core-mm-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ldt: Use tlb_gather_mmu_fullmm() when freeing LDT page-tables
        tlb: arch: Remove empty __tlb_remove_tlb_entry() stubs
        tlb: mmu_gather: Remove start/end arguments from tlb_gather_mmu()
        tlb: mmu_gather: Introduce tlb_gather_mmu_fullmm()
        tlb: mmu_gather: Remove unused start/end arguments from tlb_finish_mmu()
        mm: proc: Invalidate TLB after clearing soft-dirty page state
    • Merge tag 'locking-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9eef0233
      Committed by Linus Torvalds
      Pull locking updates from Ingo Molnar:
       "Core locking primitives updates:
          - Remove mutex_trylock_recursive() from the API - no users left
          - Simplify + constify the futex code a bit
      
        Lockdep updates:
          - Teach lockdep about local_lock_t
          - Add CONFIG_DEBUG_IRQFLAGS=y debug config option to check for
            potentially unsafe IRQ mask restoration patterns. (I.e.
            calling raw_local_irq_restore() with IRQs enabled.)
          - Add wait context self-tests
          - Fix graph lock corner case corrupting internal data structures
          - Fix noinstr annotations
      
        LKMM updates:
          - Simplify the litmus tests
          - Documentation fixes
      
        KCSAN updates:
          - Re-enable KCSAN instrumentation in lib/random32.c
      
        Misc fixes:
          - Don't branch-trace static label APIs
          - DocBook fix
          - Remove stale leftover empty file"
      
      * tag 'locking-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
        checkpatch: Don't check for mutex_trylock_recursive()
        locking/mutex: Kill mutex_trylock_recursive()
        s390: Use arch_local_irq_{save,restore}() in early boot code
        lockdep: Noinstr annotate warn_bogus_irq_restore()
        locking/lockdep: Avoid unmatched unlock
        locking/rwsem: Remove empty rwsem.h
        locking/rtmutex: Add missing kernel-doc markup
        futex: Remove unneeded gotos
        futex: Change utime parameter to be 'const ... *'
        lockdep: report broken irq restoration
        jump_label: Do not profile branch annotations
        locking: Add Reviewers
        locking/selftests: Add local_lock inversion tests
        locking/lockdep: Exclude local_lock_t from IRQ inversions
        locking/lockdep: Clean up check_redundant() a bit
        locking/lockdep: Add a skip() function to __bfs()
        locking/lockdep: Mark local_lock_t
        locking/selftests: More granular debug_locks_verbose
        lockdep/selftest: Add wait context selftests
        tools/memory-model: Fix typo in klitmus7 compatibility table
        ...
    • Merge tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d089f48f
      Committed by Linus Torvalds
      Pull RCU updates from Ingo Molnar:
       "These are the latest RCU updates for v5.12:
      
         - Documentation updates.
      
         - Miscellaneous fixes.
      
         - kfree_rcu() updates: Addition of mem_dump_obj() to provide
           allocator return addresses to more easily locate bugs. This has a
           couple of RCU-related commits, but is mostly MM. Was pulled in with
           akpm's agreement.
      
         - Per-callback-batch tracking of numbers of callbacks, which enables
           better debugging information and smarter reactions to large numbers
           of callbacks.
      
         - The first round of changes to allow CPUs to be runtime switched
           from and to callback-offloaded state.
      
         - CONFIG_PREEMPT_RT-related changes.
      
         - RCU CPU stall warning updates.
      
         - Addition of polling grace-period APIs for SRCU.
      
         - Torture-test and torture-test scripting updates, including a
           "torture everything" script that runs rcutorture, locktorture,
           scftorture, rcuscale, and refscale. Plus does an allmodconfig
           build.
      
         - nolibc fixes for the torture tests"
      
      * tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits)
        percpu_ref: Dump mem_dump_obj() info upon reference-count underflow
        rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback
        mm: Make mem_obj_dump() vmalloc() dumps include start and length
        mm: Make mem_dump_obj() handle vmalloc() memory
        mm: Make mem_dump_obj() handle NULL and zero-sized pointers
        mm: Add mem_dump_obj() to print source of memory block
        tools/rcutorture: Fix position of -lgcc in mkinitrd.sh
        tools/nolibc: Fix position of -lgcc in the documented example
        tools/nolibc: Emit detailed error for missing alternate syscall number definitions
        tools/nolibc: Remove incorrect definitions of __ARCH_WANT_*
        tools/nolibc: Get timeval, timespec and timezone from linux/time.h
        tools/nolibc: Implement poll() based on ppoll()
        tools/nolibc: Implement fork() based on clone()
        tools/nolibc: Make getpgrp() fall back to getpgid(0)
        tools/nolibc: Make dup2() rely on dup3() when available
        tools/nolibc: Add the definition for dup()
        rcutorture: Add rcutree.use_softirq=0 to RUDE01 and TASKS01
        torture: Maintain torture-specific set of CPUs-online books
        torture: Clean up after torture-test CPU hotplugging
        rcutorture: Make object_debug also double call_rcu() heap object
        ...
      d089f48f
    • L
      Merge tag 'timers-core-2021-02-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3f6ec19f
      Linus Torvalds committed
      Pull timer updates from Thomas Gleixner:
       "Time and timer updates:
      
         - Instead of new drivers remove tango, sirf, u300 and atlas drivers
      
         - Add suspend/resume support for microchip pit64b
      
         - The usual fixes, improvements and cleanups here and there"
      
      * tag 'timers-core-2021-02-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timens: Delete no-op time_ns_init()
        alarmtimer: Update kerneldoc
        clocksource/drivers/timer-microchip-pit64b: Add clocksource suspend/resume
        clocksource/drivers/prima: Remove sirf prima driver
        clocksource/drivers/atlas: Remove sirf atlas driver
        clocksource/drivers/tango: Remove tango driver
        clocksource/drivers/u300: Remove the u300 driver
        dt-bindings: timer: nuvoton: Clarify that interrupt of timer 0 should be specified
        clocksource/drivers/davinci: Move pr_fmt() before the includes
        clocksource/drivers/efm32: Drop unused timer code
      3f6ec19f
    • L
      Merge tag 'irq-core-2021-02-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b5183bc9
      Linus Torvalds committed
      Pull irq updates from Thomas Gleixner:
       "Updates for the irq subsystem:
      
         - The usual new irq chip driver (Realtek RTL83xx)
      
         - Removal of sirfsoc and tango irq chip drivers
      
         - Conversion of the sun6i chip support to hierarchical irq domains
      
         - The usual fixes, improvements and cleanups all over the place"
      
      * tag 'irq-core-2021-02-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/imx: IMX_INTMUX should not default to y, unconditionally
        irqchip/loongson-pch-msi: Use bitmap_zalloc() to allocate bitmap
        irqchip/csky-mpintc: Prevent selection on unsupported platforms
        irqchip: Add support for Realtek RTL838x/RTL839x interrupt controller
        dt-bindings: interrupt-controller: Add Realtek RTL838x/RTL839x support
        irqchip/ls-extirq: add IRQCHIP_SKIP_SET_WAKE to the irqchip flags
        genirq: Use new tasklet API for resend_tasklet
        dt-bindings: qcom,pdc: Add compatible for SM8350
        dt-bindings: qcom,pdc: Add compatible for SM8250
        irqchip/sun6i-r: Add wakeup support
        irqchip/sun6i-r: Use a stacked irqchip driver
        dt-bindings: irq: sun6i-r: Add a compatible for the H3
        dt-bindings: irq: sun6i-r: Split the binding from sun7i-nmi
        irqchip/gic-v3: Fix typos in PMR/RPR SCR_EL3.FIQ handling explanation
        irqchip: Remove sirfsoc driver
        irqchip: Remove sigma tango driver
      b5183bc9
    • L
      Merge tag 'for-5.12/io_uring-2021-02-17' of git://git.kernel.dk/linux-block · 5bbb336b
      Linus Torvalds committed
      Pull io_uring updates from Jens Axboe:
       "Highlights from this cycle are things like request recycling and
        task_work optimizations, which net us anywhere from 10-20% of speedups
        on workloads that mostly are inline.
      
        This work was originally done to put io_uring under memcg, which adds
        considerable overhead. But it's a really nice win as well. Also worth
        highlighting is the LOOKUP_CACHED work in the VFS, and using it in
        io_uring. Greatly speeds up the fast path for file opens.
      
        Summary:
      
         - Put io_uring under memcg protection. We accounted just the rings
           themselves under rlimit memlock before, now we account everything.
      
         - Request cache recycling, persistent across invocations (Pavel, me)
      
         - First part of a cleanup/improvement to buffer registration (Bijan)
      
         - SQPOLL fixes (Hao)
      
         - File registration NULL pointer fixup (Dan)
      
         - LOOKUP_CACHED support for io_uring
      
         - Disable /proc/thread-self/ for io_uring, like we do for /proc/self
      
         - Add Pavel to the io_uring MAINTAINERS entry
      
         - Tons of code cleanups and optimizations (Pavel)
      
         - Support for skip entries in file registration (Noah)"
      
      * tag 'for-5.12/io_uring-2021-02-17' of git://git.kernel.dk/linux-block: (103 commits)
        io_uring: tctx->task_lock should be IRQ safe
        proc: don't allow async path resolution of /proc/thread-self components
        io_uring: kill cached requests from exiting task closing the ring
        io_uring: add helper to free all request caches
        io_uring: allow task match to be passed to io_req_cache_free()
        io-wq: clear out worker ->fs and ->files
        io_uring: optimise io_init_req() flags setting
        io_uring: clean io_req_find_next() fast check
        io_uring: don't check PF_EXITING from syscall
        io_uring: don't split out consume out of SQE get
        io_uring: save ctx put/get for task_work submit
        io_uring: don't duplicate io_req_task_queue()
        io_uring: optimise SQPOLL mm/files grabbing
        io_uring: optimise out unlikely link queue
        io_uring: take compl state from submit state
        io_uring: inline io_complete_rw_common()
        io_uring: move res check out of io_rw_reissue()
        io_uring: simplify iopoll reissuing
        io_uring: clean up io_req_free_batch_finish()
        io_uring: move submit side state closer in the ring
        ...
      5bbb336b
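The "request cache recycling" item above can be pictured as a free list: a finished request is parked for reuse instead of being handed back to the allocator. The following is only a userspace sketch of that general idea, not io_uring's code; `struct req`, `struct req_cache`, and every function name here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request object; not the kernel's request structure. */
struct req {
    struct req *next;   /* links cached (free) requests together */
    int opcode;
};

/* Hypothetical cache of released requests, kept across submissions. */
struct req_cache {
    struct req *free_list;
    size_t nr_cached;
};

/* Take a request from the cache if possible, else fall back to malloc(). */
static struct req *req_get(struct req_cache *c)
{
    struct req *r = c->free_list;

    if (r) {
        c->free_list = r->next;
        c->nr_cached--;
    } else {
        r = malloc(sizeof(*r));
        if (!r)
            return NULL;
    }
    r->next = NULL;
    r->opcode = 0;
    return r;
}

/* Park a finished request in the cache instead of freeing it. */
static void req_put(struct req_cache *c, struct req *r)
{
    assert(r != NULL);
    r->next = c->free_list;
    c->free_list = r;
    c->nr_cached++;
}

/* Release everything still cached, e.g. when the owner goes away. */
static void req_cache_free(struct req_cache *c)
{
    while (c->free_list) {
        struct req *r = c->free_list;

        c->free_list = r->next;
        free(r);
    }
    c->nr_cached = 0;
}
```

A get that follows a put returns the same object without touching the allocator, which is where the saving on the fast path would come from in a scheme like this.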
    • L
      Merge tag 'for-5.12/drivers-2021-02-17' of git://git.kernel.dk/linux-block · 9820b4dc
      Linus Torvalds committed
      Pull block driver updates from Jens Axboe:
      
       - Remove the skd driver. It's been EOL for a long time (Damien)
      
       - NVMe pull requests
            - fix multipath handling of ->queue_rq errors (Chao Leng)
            - nvmet cleanups (Chaitanya Kulkarni)
            - add a quirk for buggy Amazon controller (Filippo Sironi)
            - avoid devm allocations in nvme-hwmon that don't interact well
              with fabrics (Hannes Reinecke)
            - sysfs cleanups (Jiapeng Chong)
            - fix nr_zones for multipath (Keith Busch)
            - nvme-tcp crash fix for no-data commands (Sagi Grimberg)
            - nvmet-tcp fixes (Sagi Grimberg)
            - add a missing __rcu annotation (Christoph)
            - failed reconnect fixes (Chao Leng)
            - various tracing improvements (Michal Krakowiak, Johannes
              Thumshirn)
            - switch the nvmet-fc assoc_list to use RCU protection (Leonid
              Ravich)
            - resync the status codes with the latest spec (Max Gurtovoy)
            - minor nvme-tcp improvements (Sagi Grimberg)
            - various cleanups (Rikard Falkeborn, Minwoo Im, Chaitanya
              Kulkarni, Israel Rukshin)
      
       - Floppy O_NDELAY fix (Denis)
      
       - MD pull request
            - raid5 chunk_sectors fix (Guoqing)
      
       - Use lore links (Kees)
      
       - Use DEFINE_SHOW_ATTRIBUTE for nbd (Liao)
      
       - loop lock scaling (Pavel)
      
       - mtip32xx PCI fixes (Bjorn)
      
       - bcache fixes (Kai, Dongdong)
      
       - Misc fixes (Tian, Yang, Guoqing, Joe, Andy)
      
      * tag 'for-5.12/drivers-2021-02-17' of git://git.kernel.dk/linux-block: (64 commits)
        lightnvm: pblk: Replace guid_copy() with export_guid()/import_guid()
        lightnvm: fix unnecessary NULL check warnings
        nvme-tcp: fix crash triggered with a dataless request submission
        block: Replace lkml.org links with lore
        nbd: Convert to DEFINE_SHOW_ATTRIBUTE
        nvme: add 48-bit DMA address quirk for Amazon NVMe controllers
        nvme-hwmon: rework to avoid devm allocation
        nvmet: remove else at the end of the function
        nvmet: add nvmet_req_subsys() helper
        nvmet: use min of device_path and disk len
        nvmet: use invalid cmd opcode helper
        nvmet: use invalid cmd opcode helper
        nvmet: add helper to report invalid opcode
        nvmet: remove extra variable in id-ns handler
        nvmet: make nvmet_find_namespace() req based
        nvmet: return uniform error for invalid ns
        nvmet: set status to 0 in case for invalid nsid
        nvmet-fc: add a missing __rcu annotation to nvmet_fc_tgt_assoc.queues
        nvme-multipath: set nr_zones for zoned namespaces
        nvmet-tcp: fix potential race of tcp socket closing accept_work
        ...
      9820b4dc
    • L
      Merge tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block · 582cd91f
      Linus Torvalds committed
      Pull core block updates from Jens Axboe:
       "Another nice round of removing more code than what is added, mostly
        due to Christoph's relentless pursuit of tech debt removal/cleanups.
        This pull request contains:
      
         - Two series of BFQ improvements (Paolo, Jan, Jia)
      
         - Block iov_iter improvements (Pavel)
      
         - bsg error path fix (Pan)
      
         - blk-mq scheduler improvements (Jan)
      
         - -EBUSY discard fix (Jan)
      
         - bvec allocation improvements (Ming, Christoph)
      
         - bio allocation and init improvements (Christoph)
      
         - Store bdev pointer in bio instead of gendisk + partno (Christoph)
      
         - Block trace point cleanups (Christoph)
      
         - hard read-only vs read-only split (Christoph)
      
         - Block based swap cleanups (Christoph)
      
         - Zoned write granularity support (Damien)
      
         - Various fixes/tweaks (Chunguang, Guoqing, Lei, Lukas, Huhai)"
      
      * tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block: (104 commits)
        mm: simplify swapdev_block
        sd_zbc: clear zone resources for non-zoned case
        block: introduce blk_queue_clear_zone_settings()
        zonefs: use zone write granularity as block size
        block: introduce zone_write_granularity limit
        block: use blk_queue_set_zoned in add_partition()
        nullb: use blk_queue_set_zoned() to setup zoned devices
        nvme: cleanup zone information initialization
        block: document zone_append_max_bytes attribute
        block: use bi_max_vecs to find the bvec pool
        md/raid10: remove dead code in reshape_request
        block: mark the bio as cloned in bio_iov_bvec_set
        block: set BIO_NO_PAGE_REF in bio_iov_bvec_set
        block: remove a layer of indentation in bio_iov_iter_get_pages
        block: turn the nr_iovecs argument to bio_alloc* into an unsigned short
        block: remove the 1 and 4 vec bvec_slabs entries
        block: streamline bvec_alloc
        block: factor out a bvec_alloc_gfp helper
        block: move struct biovec_slab to bio.c
        block: reuse BIO_INLINE_VECS for integrity bvecs
        ...
      582cd91f
    • L
      Merge tag 'for-5.12/libata-2021-02-17' of git://git.kernel.dk/linux-block · bd018bba
      Linus Torvalds committed
      Pull libata updates from Jens Axboe:
       "Regulators management addition from Florian, and a trivial change to
        avoid comma separated statements from Joe"
      
      * tag 'for-5.12/libata-2021-02-17' of git://git.kernel.dk/linux-block:
        ata: Avoid comma separated statements
        ata: ahci_brcm: Add back regulators management
      bd018bba
    • L
      Merge tag 'oprofile-removal-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux · 24880bef
      Linus Torvalds committed
      Pull oprofile and dcookies removal from Viresh Kumar:
       "Remove oprofile and dcookies support
      
        The 'oprofile' user-space tools don't use the kernel OPROFILE support
        any more, and haven't in a long time. User-space has been converted to
        the perf interfaces.
      
        The dcookies stuff is only used by the oprofile code. Now that
        oprofile's support is getting removed from the kernel, there is no
        need for dcookies as well.
      
        Remove kernel's old oprofile and dcookies support"
      
      * tag 'oprofile-removal-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux:
        fs: Remove dcookies support
        drivers: Remove CONFIG_OPROFILE support
        arch: xtensa: Remove CONFIG_OPROFILE support
        arch: x86: Remove CONFIG_OPROFILE support
        arch: sparc: Remove CONFIG_OPROFILE support
        arch: sh: Remove CONFIG_OPROFILE support
        arch: s390: Remove CONFIG_OPROFILE support
        arch: powerpc: Remove oprofile
        arch: powerpc: Stop building and using oprofile
        arch: parisc: Remove CONFIG_OPROFILE support
        arch: mips: Remove CONFIG_OPROFILE support
        arch: microblaze: Remove CONFIG_OPROFILE support
        arch: ia64: Remove rest of perfmon support
        arch: ia64: Remove CONFIG_OPROFILE support
        arch: hexagon: Don't select HAVE_OPROFILE
        arch: arc: Remove CONFIG_OPROFILE support
        arch: arm: Remove CONFIG_OPROFILE support
        arch: alpha: Remove CONFIG_OPROFILE support
      24880bef
    • L
      Merge tag 'xfs-5.12-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · b52bb135
      Linus Torvalds committed
      Pull xfs updates from Darrick Wong:
       "There's a lot going on this time, which seems about right for this
        drama-filled year.
      
        Community developers added some code to speed up freezing when
        read-only workloads are still running, refactored the logging code,
        added checks to prevent file extent counter overflow, reduced iolock
        cycling to speed up fsync and gc scans, and started the slow march
        towards supporting filesystem shrinking.
      
        There's a huge refactoring of the internal speculative preallocation
        garbage collection code which fixes a bunch of bugs, makes the gc
        scheduling per-AG and hence multithreaded, and standardizes the retry
        logic when we try to reserve space or quota, can't, and want to
        trigger a gc scan. We also enable multithreaded quotacheck to reduce
        mount times further. This is also preparation for background file gc,
        which may or may not land for 5.13.
      
        We also fixed some deadlocks in the rename code, fixed a quota
        accounting leak when FSSETXATTR fails, restored the behavior that
        write faults to an mmap'd region actually cause a SIGBUS, fixed a bug
        where sgid directory inheritance wasn't quite working properly, and
        fixed a bug where symlinks weren't working properly in ecryptfs. We
        also now advertise the inode btree counters feature that was
        introduced two cycles ago.
      
        Summary:
      
         - Fix an ABBA deadlock when renaming files on overlayfs.
      
         - Make sure that we can't overflow the inode extent counters when
           adding to or removing extents from a file.
      
         - Make directory sgid inheritance work the same way as all the other
           filesystems.
      
         - Don't drain the buffer cache on freeze and ro remount, which should
           reduce the amount of time if read-only workloads are continuing
           during the freeze.
      
         - Fix a bug where symlink size isn't reported to the vfs in ecryptfs.
      
         - Disentangle log cleaning from log covering. This refactoring sets
           us up for future changes to the log, though for now it simply means
           that we can use covering for freezes, and cleaning becomes
           something we only do at unmount.
      
         - Speed up file fsyncs by reducing iolock cycling.
      
         - Fix delalloc blocks leaking when changing the project id fails
           because of input validation errors in FSSETXATTR.
      
         - Fix oversized quota reservation when converting unwritten extents
           during a DAX write.
      
         - Create a transaction allocation helper function to standardize the
           idiom of allocating a transaction, reserving blocks, locking
           inodes, and reserving quota. Replace all the open-coded logic for
           file creation, file ownership changes, and file modifications to
           use them.
      
         - Actually shut down the fs if the incore quota reservations get
           corrupted.
      
         - Fix background block garbage collection scans to not block and to
           actually clean out CoW staging extents properly.
      
         - Run block gc scans when we run low on project quota.
      
         - Use the standardized transaction allocation helpers to make it so
           that ENOSPC and EDQUOT errors during reservation will back out,
           invoke the block gc scanner, and try again. This is preparation for
           introducing background inode garbage collection in the next cycle.
      
         - Combine speculative post-EOF block garbage collection with
           speculative copy on write block garbage collection.
      
         - Enable multithreaded quotacheck.
      
         - Allow sysadmins to tweak the CPU affinities and maximum concurrency
           levels of quotacheck and background blockgc worker pools.
      
         - Expose the inode btree counter feature in the fs geometry ioctl.
      
         - Cleanups of the growfs code in preparation for starting work on
           filesystem shrinking.
      
         - Fix all the bloody gcc warnings that the maintainer knows about. :P
      
         - Fix a RST syntax error.
      
         - Don't trigger bmbt corruption assertions after the fs shuts down.
      
         - Restore behavior of forcing SIGBUS on a shut down filesystem when
           someone triggers a mmap write fault (or really, any buffered
           write)"
      
      * tag 'xfs-5.12-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (85 commits)
        xfs: consider shutdown in bmapbt cursor delete assert
        xfs: fix boolreturn.cocci warnings
        xfs: restore shutdown check in mapped write fault path
        xfs: fix rst syntax error in admin guide
        xfs: fix incorrect root dquot corruption error when switching group/project quota types
        xfs: get rid of xfs_growfs_{data,log}_t
        xfs: rename `new' to `delta' in xfs_growfs_data_private()
        libxfs: expose inobtcount in xfs geometry
        xfs: don't bounce the iolock between free_{eof,cow}blocks
        xfs: expose the blockgc workqueue knobs publicly
        xfs: parallelize block preallocation garbage collection
        xfs: rename block gc start and stop functions
        xfs: only walk the incore inode tree once per blockgc scan
        xfs: consolidate the eofblocks and cowblocks workers
        xfs: consolidate incore inode radix tree posteof/cowblocks tags
        xfs: remove trivial eof/cowblocks functions
        xfs: hide xfs_icache_free_cowblocks
        xfs: hide xfs_icache_free_eofblocks
        xfs: relocate the eofb/cowb workqueue functions
        xfs: set WQ_SYSFS on all workqueues in debug mode
        ...
      b52bb135
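The summary item about ENOSPC and EDQUOT errors describes a retry idiom: back out of the failed reservation, run the block gc scanner, and try once more. Here is a minimal userspace sketch of that control flow, assuming a toy filesystem model; `struct toy_fs` and the `toy_*` functions are hypothetical and are not the XFS helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy state: free blocks, plus blocks held by speculative
 * preallocations that a gc scan is able to give back. */
struct toy_fs {
    long free_blocks;
    long prealloc_blocks;
    int gc_scans;
};

/* Try to reserve blocks; fail with -ENOSPC when not enough are free. */
static int toy_reserve(struct toy_fs *fs, long want)
{
    assert(want >= 0);
    if (want > fs->free_blocks)
        return -ENOSPC;
    fs->free_blocks -= want;
    return 0;
}

/* Stand-in for the block gc scanner: release speculative preallocations. */
static void toy_blockgc_scan(struct toy_fs *fs)
{
    fs->free_blocks += fs->prealloc_blocks;
    fs->prealloc_blocks = 0;
    fs->gc_scans++;
}

/* Reserve with a single retry: on -ENOSPC or -EDQUOT, run gc, try again. */
static int toy_reserve_retry(struct toy_fs *fs, long want)
{
    bool retried = false;
    int err;

    for (;;) {
        err = toy_reserve(fs, want);
        if ((err != -ENOSPC && err != -EDQUOT) || retried)
            return err;
        toy_blockgc_scan(fs);
        retried = true;
    }
}
```

Keeping the retry in one helper means every caller gets the same behaviour, which appears to be the point of standardizing the idiom.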
    • L
      Merge tag 'iomap-5.12-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 4f016a31
      Linus Torvalds committed
      Pull iomap updates from Darrick Wong:
       "The big change in this cycle is some new code to make it possible for
        XFS to try unaligned directio overwrites without taking locks. If the
        block is fully written and within EOF (i.e. doesn't require any
        further fs intervention) then we can let the unlocked write proceed.
        If not, we fall back to synchronizing direct writes.
      
        Summary:
      
         - Adjust the final parameter of iomap_dio_rw.
      
         - Add a new flag to request that iomap directio writes return EAGAIN
           if the write is not a pure overwrite within EOF; this will be used
           to reduce lock contention with unaligned direct writes on XFS.
      
         - Amend XFS' directio code to eliminate exclusive locking for
           unaligned direct writes if the circumstances permit"
      
      * tag 'iomap-5.12-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: reduce exclusive locking on unaligned dio
        xfs: split the unaligned DIO write code out
        xfs: improve the reflink_bounce_dio_write tracepoint
        xfs: simplify the read/write tracepoints
        xfs: remove the buffered I/O fallback assert
        xfs: cleanup the read/write helper naming
        xfs: make xfs_file_aio_write_checks IOCB_NOWAIT-aware
        xfs: factor out a xfs_ilock_iocb helper
        iomap: add a IOMAP_DIO_OVERWRITE_ONLY flag
        iomap: pass a flags argument to iomap_dio_rw
        iomap: rename the flags variable in __iomap_dio_rw
      4f016a31
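The overwrite-only behaviour described above amounts to an optimistic attempt followed by a fallback: try the write without exclusive locking, and if it is not a pure overwrite within EOF, get EAGAIN and redo it the slow way. The sketch below models only that decision in userspace. `struct toy_file`, `TOY_DIO_OVERWRITE_ONLY`, and the `toy_*` functions are hypothetical stand-ins, not the iomap or XFS interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag, modelled on the overwrite-only request described above. */
#define TOY_DIO_OVERWRITE_ONLY 0x1

/* Hypothetical file model. */
struct toy_file {
    long size;              /* current EOF in bytes */
    bool written;           /* target block already holds written data */
    int shared_writes;      /* writes completed without exclusive locking */
    int exclusive_writes;   /* writes that needed the fallback path */
};

/* Perform a write; with the flag set, refuse anything that is not a
 * pure overwrite within EOF by returning -EAGAIN. */
static int toy_dio_write(struct toy_file *f, long pos, long len, int flags)
{
    bool pure_overwrite = f->written && pos + len <= f->size;

    if ((flags & TOY_DIO_OVERWRITE_ONLY) && !pure_overwrite)
        return -EAGAIN;
    if (pos + len > f->size)
        f->size = pos + len;
    f->written = true;
    return 0;
}

/* Optimistic path first, exclusive fallback only on -EAGAIN. */
static int toy_unaligned_write(struct toy_file *f, long pos, long len)
{
    int err;

    assert(pos >= 0 && len > 0);
    err = toy_dio_write(f, pos, len, TOY_DIO_OVERWRITE_ONLY);
    if (err == 0) {
        f->shared_writes++;
        return 0;
    }
    if (err != -EAGAIN)
        return err;

    /* A real implementation would take the exclusive lock here. */
    err = toy_dio_write(f, pos, len, 0);
    if (err == 0)
        f->exclusive_writes++;
    return err;
}
```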
    • L
      Merge tag 'pstore-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · f0236163
      Linus Torvalds committed
      Pull pstore fix from Kees Cook:
       "Fix a CONFIG typo (Jiri Bohac)"
      
      * tag 'pstore-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        pstore: Fix typo in compression option name
      f0236163
    • L
      Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt · f7b36dc5
      Linus Torvalds committed
      Pull fsverity updates from Eric Biggers:
       "Add an ioctl which allows reading fs-verity metadata from a file.
      
        This is useful when a file with fs-verity enabled needs to be served
        somewhere, and the other end wants to do its own fs-verity compatible
        verification of the file. See the commit messages for details.
      
        This new ioctl has been tested using new xfstests I've written for it"
      
      * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
        fs-verity: support reading signature with ioctl
        fs-verity: support reading descriptor with ioctl
        fs-verity: support reading Merkle tree with ioctl
        fs-verity: add FS_IOC_READ_VERITY_METADATA ioctl
        fs-verity: don't pass whole descriptor to fsverity_verify_signature()
        fs-verity: factor out fsverity_get_descriptor()
      f7b36dc5
    • L
      Merge tag 'nfsd-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · 99f1a587
      Linus Torvalds committed
      Pull nfsd updates from Chuck Lever:
      
       - Update NFSv2 and NFSv3 XDR decoding functions
      
       - Further improve support for re-exporting NFS mounts
      
       - Convert NFSD stats to per-CPU counters
      
       - Add batch Receive posting to the server's RPC/RDMA transport
      
      * tag 'nfsd-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (65 commits)
        nfsd: skip some unnecessary stats in the v4 case
        nfs: use change attribute for NFS re-exports
        NFSv4_2: SSC helper should use its own config.
        nfsd: cstate->session->se_client -> cstate->clp
        nfsd: simplify nfsd4_check_open_reclaim
        nfsd: remove unused set_client argument
        nfsd: find_cpntf_state cleanup
        nfsd: refactor set_client
        nfsd: rename lookup_clientid->set_client
        nfsd: simplify nfsd_renew
        nfsd: simplify process_lock
        nfsd4: simplify process_lookup1
        SUNRPC: Correct a comment
        svcrdma: DMA-sync the receive buffer in svc_rdma_recvfrom()
        svcrdma: Reduce Receive doorbell rate
        svcrdma: Deprecate stat variables that are no longer used
        svcrdma: Restore read and write stats
        svcrdma: Convert rdma_stat_sq_starve to a per-CPU counter
        svcrdma: Convert rdma_stat_recv to a per-CPU counter
        svcrdma: Refactor svc_rdma_init() and svc_rdma_clean_up()
        ...
      99f1a587
    • L
      Merge tag 'erofs-for-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 681e2abe
      Linus Torvalds committed
      Pull erofs updates from Gao Xiang:
       "This contains a somewhat important but rarely reproduced fix reported
        a month ago for platforms which have a weak memory model (e.g. arm64).
      
        The root cause is that test_bit/set_bit atomic operations are actually
        implemented in relaxed forms, and uninitialized fields governed by an
        atomic bit could be observed in advance due to memory reordering thus
        memory barrier pairs should be used.
      
        There is also a trivial fix of crafted blkszbits generated by
        syzkaller.
      
        Summary:
      
         - fix shift-out-of-bounds of crafted blkszbits generated by syzkaller
      
         - ensure initialized fields can only be observed after bit is set"
      
      * tag 'erofs-for-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: initialized fields can only be observed after bit is set
        erofs: fix shift-out-of-bounds of blkszbits
      681e2abe
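The root cause described above is an ordering problem: a reader that sees the "initialized" bit must also see the fields written before the bit was set. The sketch below is a userspace C11 analogue that uses release/acquire ordering in place of the kernel's bit operations and barrier primitives. It is not the erofs patch, and `struct toy_inode`, `TOY_INITED_BIT`, and the `toy_*` functions are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_INITED_BIT 0x1u

/* Hypothetical object whose plain field is governed by an atomic flag. */
struct toy_inode {
    int xattr_count;      /* must be visible before the bit is observed */
    atomic_uint flags;
};

/* Writer: fill in the field, then set the bit with release ordering so
 * the field store cannot be reordered after the bit becomes visible. */
static void toy_publish(struct toy_inode *in, int count)
{
    in->xattr_count = count;
    atomic_fetch_or_explicit(&in->flags, TOY_INITED_BIT,
                             memory_order_release);
}

/* Reader: test the bit with acquire ordering and read the field only
 * when the bit is set. Returns false if the object is not ready yet. */
static bool toy_observe(struct toy_inode *in, int *count)
{
    unsigned int f;

    assert(count != NULL);
    f = atomic_load_explicit(&in->flags, memory_order_acquire);
    if (!(f & TOY_INITED_BIT))
        return false;
    *count = in->xattr_count;
    return true;
}
```

With relaxed operations on both sides, a weakly ordered CPU would be free to make the bit visible before the field, which is the early observation the message warns about.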
    • L
      Merge tag 'f2fs-for-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · 8b42fe12
      Linus Torvalds committed
      Pull f2fs updates from Jaegeuk Kim:
       "We've added two major features: 1) compression level and 2)
        checkpoint_merge, in this round.
      
        Compression level expands the 'compress_algorithm' mount option to
        accept a parameter in the format <algorithm>:<level>. This lets the
        user configure the lz4 and zstd compression level more precisely, so
        that f2fs compression can provide a higher compression ratio.
      
        checkpoint_merge creates a kernel daemon that merges concurrent
        checkpoint requests as much as possible to eliminate redundant
        checkpoint issues. It also eliminates the sluggishness caused by a
        slow checkpoint operation when the checkpoint is done in a process
        context in a cgroup with a low i/o budget and few cpu shares.
      
        Enhancements:
         - add compress level for lz4 and zstd in mount option
         - checkpoint_merge mount option
         - deprecate f2fs_trace_io
      
        Bug fixes:
         - flush data when enabling checkpoint back
         - handle corner cases of mount options
         - missing ACL update and lock for I_LINKABLE flag
         - attach FIEMAP_EXTENT_MERGED in f2fs_fiemap
         - fix potential deadlock in compression flow
         - fix wrong submit_io condition
      
        As usual, we've cleaned up many code flows and fixed minor bugs"
      
      * tag 'f2fs-for-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits)
        Documentation: f2fs: fix typo s/automaic/automatic
        f2fs: give a warning only for readonly partition
        f2fs: don't grab superblock freeze for flush/ckpt thread
        f2fs: add ckpt_thread_ioprio sysfs node
        f2fs: introduce checkpoint_merge mount option
        f2fs: relocate inline conversion from mmap() to mkwrite()
        f2fs: fix a wrong condition in __submit_bio
        f2fs: remove unnecessary initialization in xattr.c
        f2fs: fix to avoid inconsistent quota data
        f2fs: flush data when enabling checkpoint back
        f2fs: deprecate f2fs_trace_io
        f2fs: Remove readahead collision detection
        f2fs: remove unused stat_{inc, dec}_atomic_write
        f2fs: introduce sb_status sysfs node
        f2fs: fix to use per-inode maxbytes
        f2fs: compress: fix potential deadlock
        libfs: unexport generic_ci_d_compare() and generic_ci_d_hash()
        f2fs: fix to set/clear I_LINKABLE under i_lock
        f2fs: fix null page reference in redirty_blocks
        f2fs: clean up post-read processing
        ...
      8b42fe12
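The `<algorithm>:<level>` form described above is a value with an optional numeric suffix. The sketch below shows one way to split such a string in userspace. It is not the f2fs option parser: `struct toy_compress_opt` and `toy_parse_compress()` are hypothetical, the upper bound on the level is an arbitrary assumption, and no check is made that the algorithm name is one f2fs supports.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of "<algorithm>" or "<algorithm>:<level>". */
struct toy_compress_opt {
    char algorithm[16];
    int level;            /* 0 means no level was given */
};

/* Returns 0 on success and -1 on malformed input. */
static int toy_parse_compress(const char *value, struct toy_compress_opt *out)
{
    const char *colon;
    size_t name_len;
    char *end;
    long level;

    assert(value != NULL && out != NULL);

    colon = strchr(value, ':');
    name_len = colon ? (size_t)(colon - value) : strlen(value);
    if (name_len == 0 || name_len >= sizeof(out->algorithm))
        return -1;

    memcpy(out->algorithm, value, name_len);
    out->algorithm[name_len] = '\0';
    out->level = 0;
    if (!colon)
        return 0;               /* plain "<algorithm>" */

    if (colon[1] == '\0')
        return -1;              /* "<algorithm>:" with nothing after it */
    level = strtol(colon + 1, &end, 10);
    if (*end != '\0' || level <= 0 || level > 1000)
        return -1;              /* trailing junk or implausible level */
    out->level = (int)level;
    return 0;
}
```

Under these assumptions a value such as `zstd:6` yields the algorithm name `zstd` with level 6, while a bare `lz4` leaves the level at 0 so that a default can apply.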
    • L
      Merge tag 'for-5.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 6f3952cb
      Linus Torvalds committed
      Pull btrfs updates from David Sterba:
       "This brings updates of space handling, performance improvements or bug
        fixes. The subpage block size and zoned mode features have reached
        state where they're usable but with limitations.
      
        Performance or related:
      
         - do not block on deleted block group mutex in the cleaner, avoids
           some long stalls
      
         - improved flushing: make it work better with ticket space
           reservations and avoid excessive transaction commits in some
           scenarios, slightly improves throughput for random write load
      
         - preemptive background flushing: separate the logic from ticket
           reservations, improve the accounting and decisions when to flush in
           low space conditions
      
         - less lock contention related to running delayed refs, let just one
           thread do the flushing when there are many inside transaction
           commit
      
         - dbench workload improvements: avoid unnecessary work when logging
           inodes, fewer fallbacks to transaction commit and thus less waiting
           for it (+7% throughput, -20% latency)
      
        Core:
      
         - subpage block size
            - currently read-only support
            - refactor and generalize code where sectorsize is assumed to be
              page size, add the subpage handling everywhere
            - the read-write support is on the way, page sizes are still
              limited to 4K or 64K
      
         - zoned mode, first working version but with limitations
            - SMR/ZBC/ZNS friendly allocation mode, utilizing the "no fixed
              location for structures" and chunked allocation
            - superblock as the only fixed data structure needs special
              handling, uses 2 consecutive zones as a ring buffer
            - tree-log support with a dedicated block group to avoid unordered
              writes
            - emulated zones on non-zoned devices
            - not yet working
               - all non-single block group profiles, requires more zone
                 write pointer synchronization between the multiple block
                 groups
               - fitrim due to dependency on space cache, can be implemented
      
        Fixes:
      
         - ref-verify: proper tree owner and node level tracking
      
         - fix pinned byte accounting, causing some early ENOSPC now more
           likely due to other changes in delayed refs
      
        Other:
      
         - error handling fixes and improvements
      
         - more error injection points
      
         - more function documentation
      
         - more and updated tracepoints
      
         - subset of W=1 checked by default
      
         - update comments to allow more automatic kdoc parameter checks"
      
      * tag 'for-5.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (144 commits)
        btrfs: zoned: enable to mount ZONED incompat flag
        btrfs: zoned: deal with holes writing out tree-log pages
        btrfs: zoned: reorder log node allocation on zoned filesystem
        btrfs: zoned: serialize log transaction on zoned filesystems
        btrfs: zoned: extend zoned allocator to use dedicated tree-log block group
        btrfs: split alloc_log_tree()
        btrfs: zoned: relocate block group to repair IO failure in zoned filesystems
        btrfs: zoned: enable relocation on a zoned filesystem
        btrfs: zoned: support dev-replace in zoned filesystems
        btrfs: zoned: implement copying for zoned device-replace
        btrfs: zoned: implement cloning for zoned device-replace
        btrfs: zoned: mark block groups to copy for device-replace
        btrfs: zoned: do not use async metadata checksum on zoned filesystems
        btrfs: zoned: wait for existing extents before truncating
        btrfs: zoned: serialize metadata IO
        btrfs: zoned: introduce dedicated data write path for zoned filesystems
        btrfs: zoned: enable zone append writing for direct IO
        btrfs: zoned: use ZONE_APPEND write for zoned mode
        btrfs: save irq flags when looking up an ordered extent
        btrfs: zoned: cache if block group is on a sequential zone
        ...
      6f3952cb