1. 08 Jul 2019, 5 commits
  2. 28 Jun 2019, 1 commit
  3. 06 Jun 2019, 1 commit
    • ceph: avoid iput_final() while holding mutex or in dispatch thread · 3e1d0452
      Committed by Yan, Zheng
      iput_final() may wait for readahead pages. The wait can cause a deadlock.
      For example:
      
        Workqueue: ceph-msgr ceph_con_workfn [libceph]
          Call Trace:
           schedule+0x36/0x80
           io_schedule+0x16/0x40
           __lock_page+0x101/0x140
           truncate_inode_pages_range+0x556/0x9f0
           truncate_inode_pages_final+0x4d/0x60
           evict+0x182/0x1a0
           iput+0x1d2/0x220
           iterate_session_caps+0x82/0x230 [ceph]
           dispatch+0x678/0xa80 [ceph]
           ceph_con_workfn+0x95b/0x1560 [libceph]
           process_one_work+0x14d/0x410
           worker_thread+0x4b/0x460
           kthread+0x105/0x140
           ret_from_fork+0x22/0x40
      
        Workqueue: ceph-msgr ceph_con_workfn [libceph]
          Call Trace:
           __schedule+0x3d6/0x8b0
           schedule+0x36/0x80
           schedule_preempt_disabled+0xe/0x10
           mutex_lock+0x2f/0x40
           ceph_check_caps+0x505/0xa80 [ceph]
           ceph_put_wrbuffer_cap_refs+0x1e5/0x2c0 [ceph]
           writepages_finish+0x2d3/0x410 [ceph]
           __complete_request+0x26/0x60 [libceph]
           handle_reply+0x6c8/0xa10 [libceph]
           dispatch+0x29a/0xbb0 [libceph]
           ceph_con_workfn+0x95b/0x1560 [libceph]
           process_one_work+0x14d/0x410
           worker_thread+0x4b/0x460
           kthread+0x105/0x140
           ret_from_fork+0x22/0x40
      
      In the above example, truncate_inode_pages_range() waits for readahead
      pages while holding s_mutex. ceph_check_caps() waits for s_mutex and
      blocks the OSD dispatch thread, so later OSD replies (for the readahead
      pages) can't be handled.
      
      ceph_check_caps() may also take snap_rwsem for read, so a similar
      deadlock can happen if iput_final() is called while holding snap_rwsem.
      
      In general, it's not good to call iput_final() inside MDS/OSD dispatch
      threads or while holding any mutex.
      
      The fix is to introduce ceph_async_iput(), which calls iput_final()
      from a workqueue, as sketched below.
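
      A minimal sketch of the fix, assuming a per-fs-client workqueue
      (inode_wq) and a per-inode work item (i_work) whose handler performs
      the final iput(); the names follow the commit, but the snippet is a
      sketch rather than the exact patch:

        /*
         * Drop an inode reference without risking iput_final() in the
         * caller's context.  If this would be the last reference, hand
         * the inode off to a workqueue and let the worker do the iput().
         */
        void ceph_async_iput(struct inode *inode)
        {
                if (!inode)
                        return;

                for (;;) {
                        /* not the last ref: drop it directly */
                        if (atomic_add_unless(&inode->i_count, -1, 1))
                                break;
                        /* last ref: defer the final iput() to process context */
                        if (queue_work(ceph_inode_to_client(inode)->inode_wq,
                                       &ceph_inode(inode)->i_work))
                                break;
                        /* work already queued: i_count >= 2, retry */
                }
        }
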
      Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  4. 08 May 2019, 11 commits
  5. 24 Apr 2019, 2 commits
  6. 06 Mar 2019, 9 commits
  7. 26 Dec 2018, 2 commits
  8. 09 Nov 2018, 1 commit
  9. 22 Oct 2018, 3 commits
    • libceph: preallocate message data items · 0d9c1ab3
      Committed by Ilya Dryomov
      Currently message data items are allocated with ceph_msg_data_create()
      in setup_request_data() inside send_request().  send_request() has never
      been allowed to fail, so each allocation is followed by a BUG_ON:
      
        data = ceph_msg_data_create(...);
        BUG_ON(!data);
      
      It's been this way since support for multiple message data items was
      added in commit 6644ed7b ("libceph: make message data be a pointer")
      in 3.10.
      
      There is no reason to delay the allocation of message data items until
      the last possible moment, and we certainly don't need a linked list of
      them, as they are only ever appended to the end and never erased.  Make
      ceph_msg_new2() take max_data_items and adapt the rest of the code.
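
      A condensed sketch of the resulting shape (the helper and field names
      follow the patch but are reproduced from memory, so treat them as
      approximate):

        /* ceph_msg now carries a preallocated array of data items */
        struct ceph_msg *ceph_msg_new2(int type, int front_len,
                                       int max_data_items, gfp_t flags,
                                       bool can_fail);

        /*
         * Appending a data item just claims the next preallocated slot,
         * so send_request() no longer has to allocate memory at a point
         * where failure is not allowed.
         */
        static struct ceph_msg_data *ceph_msg_data_add(struct ceph_msg *msg)
        {
                BUG_ON(msg->num_data_items >= msg->max_data_items);
                return &msg->data[msg->num_data_items++];
        }
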
      Reported-by: Jerry Lee <leisurelysw24@gmail.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist() · 89486833
      Committed by Ilya Dryomov
      Because send_mds_reconnect() wants to send a message with a pagelist
      and pass the ownership to the messenger, ceph_msg_data_add_pagelist()
      consumes a ref which is then put in ceph_msg_data_destroy().  This
      makes managing pagelists in the OSD client (where they are wrapped in
      ceph_osd_data) unnecessarily hard because the handoff only happens in
      ceph_osdc_start_request() instead of when the pagelist is passed to
      ceph_osd_data_pagelist_init().  I counted several memory leaks on
      various error paths.
      
      Fix up ceph_msg_data_add_pagelist() and carry a pagelist ref in
      ceph_osd_data.
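
      A simplified sketch of the post-patch behaviour (field and helper names
      are approximate): the message grabs its own pagelist reference, which
      ceph_msg_data_destroy() later puts, so the caller keeps, and remains
      responsible for, the reference it already holds:

        void ceph_msg_data_add_pagelist(struct ceph_msg *msg,
                                        struct ceph_pagelist *pagelist)
        {
                struct ceph_msg_data *data;

                data = ceph_msg_data_create(CEPH_MSG_DATA_PAGELIST);
                BUG_ON(!data);

                refcount_inc(&pagelist->refcnt);  /* the message's own ref */
                data->pagelist = pagelist;

                list_add_tail(&data->links, &msg->data);
                msg->data_length += pagelist->length;
        }
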
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: introduce ceph_pagelist_alloc() · 33165d47
      Committed by Ilya Dryomov
      struct ceph_pagelist cannot be embedded into anything else because it
      has its own refcount.  Merge allocation and initialization together.
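
      A minimal sketch of such a combined constructor (the real helper
      initializes a few more fields explicitly):

        struct ceph_pagelist *ceph_pagelist_alloc(gfp_t gfp_flags)
        {
                struct ceph_pagelist *pl;

                pl = kzalloc(sizeof(*pl), gfp_flags);
                if (!pl)
                        return NULL;

                /* former ceph_pagelist_init() body, folded into allocation */
                INIT_LIST_HEAD(&pl->head);
                INIT_LIST_HEAD(&pl->free_list);
                refcount_set(&pl->refcnt, 1);

                return pl;
        }
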
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  10. 13 Aug 2018, 3 commits
  11. 03 Aug 2018, 2 commits
    • ceph: add new field max_file_size in ceph_fs_client · 719784ba
      Committed by Chengguang Xu
      In order not to bother the VFS and other filesystems, we decided to do
      offset validation inside the ceph kernel client, so we simply set
      sb->s_maxbytes to MAX_LFS_FILESIZE so that it passes the VFS check.
      We add a new field, max_file_size, in ceph_fs_client to store the real
      file size limit and do the proper check based on it.
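
      An illustrative sketch of where the new field is meant to be used;
      ceph_check_pos() is a hypothetical helper, while the actual patch
      performs equivalent checks at the relevant call sites (e.g. write and
      llseek):

        static int ceph_check_pos(struct ceph_fs_client *fsc, loff_t pos)
        {
                /*
                 * sb->s_maxbytes is MAX_LFS_FILESIZE only to satisfy the
                 * VFS; the real limit is fsc->max_file_size.
                 */
                if (pos > fsc->max_file_size)
                        return -EFBIG;
                return 0;
        }
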
      Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
      Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: add authorizer challenge · 6daca13d
      Committed by Ilya Dryomov
      When a client authenticates with a service, an authorizer is sent with
      a nonce to the service (ceph_x_authorize_[ab]) and the service responds
      with a mutation of that nonce (ceph_x_authorize_reply).  This lets the
      client verify the service is who it says it is, but it doesn't protect
      against a replay: someone can trivially capture the exchange and reuse
      the same authorizer to authenticate themselves.
      
      Allow the service to reject an initial authorizer with a random
      challenge (ceph_x_authorize_challenge).  The client then has to respond
      with an updated authorizer proving they are able to decrypt the
      service's challenge and that the new authorizer was produced for this
      specific connection instance.
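
      A rough sketch of the added round trip; the structure layout shown is
      reproduced from memory and may differ slightly from the actual
      ceph_x_authorize_challenge definition:

        /*
         *   client  -> service : authorizer (ticket + encrypted nonce)
         *   service -> client  : encrypted server_challenge
         *   client  -> service : new authorizer proving the client could
         *                        decrypt the challenge for this connection
         */
        struct ceph_x_authorize_challenge {
                __u8 struct_v;
                __le64 server_challenge;
        } __packed;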
      
      The accepting side requires this challenge and response unconditionally
      if the client side advertises the CEPHX_V2 feature bit.
      
      This addresses CVE-2018-1128.
      
      Link: http://tracker.ceph.com/issues/24836
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Sage Weil <sage@redhat.com>