1. 07 Sep 2017, 8 commits
  2. 07 Jul 2017, 1 commit
  3. 25 May 2017, 1 commit
  4. 09 May 2017, 1 commit
    •
      treewide: use kv[mz]alloc* rather than opencoded variants · 752ade68
      Authored by Michal Hocko
      There are many code paths opencoding kvmalloc.  Let's use the helper
      instead.  The main difference from kvmalloc is that those users
      usually do not consider all the aspects of the memory allocator.
      E.g. allocation requests <= 32kB (with 4kB pages) basically never
      fail and instead invoke the OOM killer to satisfy the allocation.
      This sounds too disruptive for something that has a reasonable
      fallback - vmalloc.  On the other hand, those requests might fall
      back to vmalloc even when the memory allocator would have succeeded
      after several more reclaim/compaction attempts.  There is no
      guarantee something like that happens, though.
      
      This patch converts many of those places to kv[mz]alloc* helpers because
      they are more conservative.
      
      Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits
      Acked-by: Kees Cook <keescook@chromium.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390
      Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim
      Acked-by: David Sterba <dsterba@suse.com> # btrfs
      Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph
      Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4
      Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Colin Cross <ccross@android.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Santosh Raspatur <santosh@chelsio.com>
      Cc: Hariprasad S <hariprasad@chelsio.com>
      Cc: Yishai Hadas <yishaih@mellanox.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: "Yan, Zheng" <zyan@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      752ade68
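      A minimal userspace sketch of the pattern this commit consolidates; the
      allocator functions below are plain malloc stand-ins for illustration
      only, not the real kernel API:

```c
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the kernel allocators; names only mirror the shape of
 * the kernel API, these are ordinary malloc wrappers. */
static void *fake_kmalloc(size_t n) { return malloc(n); }
static void *fake_vmalloc(size_t n) { return malloc(n); }

/* The open-coded pattern the patch removes: try the slab allocator
 * first, fall back to vmalloc on failure. */
static void *opencoded_kvmalloc(size_t n)
{
    void *p = fake_kmalloc(n);
    if (!p)
        p = fake_vmalloc(n);
    return p;
}

/* kvzalloc-style helper: same fallback plus zeroing, kept in one place
 * so gfp-flag and fallback policy decisions are made consistently. */
static void *sketch_kvzalloc(size_t n)
{
    void *p = opencoded_kvmalloc(n);
    if (p)
        memset(p, 0, n);
    return p;
}
```

      The point of the conversion is exactly this centralization: one helper
      can be taught the right allocation policy once, instead of every caller
      hand-rolling its own slightly different fallback.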
  5. 04 May 2017, 4 commits
  6. 25 Feb 2017, 2 commits
  7. 20 Feb 2017, 2 commits
    •
      ceph: add a new flag to indicate whether parent is locked · 3dd69aab
      Authored by Jeff Layton
      struct ceph_mds_request has an r_locked_dir pointer, which is set to
      indicate the parent inode and that its i_rwsem is locked.  In some
      critical places, we need to be able to indicate the parent inode to the
      request handling code, even when its i_rwsem may not be locked.
      
      Most of the code that operates on r_locked_dir doesn't require that the
      i_rwsem be locked. We only really need it to handle manipulation of the
      dcache. The rest (filling of the inode, updating dentry leases, etc.)
      already has its own locking.
      
      Add a new r_req_flags bit that indicates whether the parent is locked
      when doing the request, and rename the pointer to "r_parent". For now,
      all the places that set r_parent also set this flag, but that will
      change in a later patch.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Yan, Zheng <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idydromov@gmail.com>
      3dd69aab
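      The shape of the change can be sketched like this; the struct layout and
      flag value are illustrative stand-ins, only the names r_parent and
      r_req_flags come from the commit:

```c
#include <stdbool.h>
#include <stddef.h>

/* Before: a non-NULL r_locked_dir meant both "parent is set" and "its
 * i_rwsem is held". After: the pointer (r_parent) and the locked state
 * are tracked separately, so a parent can be indicated without holding
 * the lock. */
#define R_PARENT_LOCKED (1UL << 0)   /* hypothetical bit value */

struct mini_mds_request {
    void *r_parent;              /* parent inode, may be set unlocked */
    unsigned long r_req_flags;
};

static void set_parent(struct mini_mds_request *req, void *parent, bool locked)
{
    req->r_parent = parent;
    if (locked)
        req->r_req_flags |= R_PARENT_LOCKED;
}

static bool parent_locked(const struct mini_mds_request *req)
{
    return req->r_parent && (req->r_req_flags & R_PARENT_LOCKED);
}
```

      Splitting "which inode" from "is it locked" is what later lets callers
      set r_parent without the flag, as the commit message anticipates.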
    •
      ceph: avoid calling ceph_renew_caps() infinitely · c1944fed
      Authored by Yan, Zheng
      __ceph_caps_mds_wanted() ignores caps from stale sessions, so its
      return value can stay the same across calls to ceph_renew_caps().
      This causes try_get_cap_refs() to keep calling ceph_renew_caps().
      The fix is to skip the session-validity check for the
      try_get_cap_refs() case. If the session is stale, just let the
      caps requester sleep.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      c1944fed
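      An illustrative model of the loop, not the real ceph code: with the
      validity check applied, a stale session contributes no wanted caps, so
      the retry condition never changes; skipping the check for this caller
      breaks the cycle. All values below are made up:

```c
#include <stdbool.h>

/* Miniature of __ceph_caps_mds_wanted(): when check_validity is set, a
 * stale session's caps are ignored and the result is always 0. */
static int caps_wanted(bool session_valid, bool check_validity)
{
    if (check_validity && !session_valid)
        return 0;        /* stale session's caps ignored */
    return 0x3;          /* pretend Frw is wanted */
}

/* Miniature of the retry behaviour: the real loop had no limit; the
 * cap here only exists so this sketch terminates. */
static int renew_attempts(bool session_valid, bool check_validity)
{
    int n = 0, limit = 5;
    while (caps_wanted(session_valid, check_validity) == 0 && n < limit)
        n++;             /* would call ceph_renew_caps() again */
    return n;
}
```

      With check_validity dropped for the try_get_cap_refs() path, the stale
      session's wanted caps show through, the loop condition fails, and the
      requester sleeps instead of spinning.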
  8. 15 Dec 2016, 1 commit
    •
      libceph: always signal completion when done · c297eb42
      Authored by Ilya Dryomov
      r_safe_completion is currently, and has always been, signaled only if
      on-disk ack was requested.  It's there for fsync and syncfs, which wait
      for in-flight writes to flush - all data write requests set ONDISK.
      
      However, the pool perm check code introduced in 4.2 sends a write
      request with only ACK set.  An unfortunately timed syncfs can then hang
      forever: r_safe_completion won't be signaled because only an unsafe
      reply was requested.
      
      We could patch ceph_osdc_sync() to skip !ONDISK write requests, but
      that is somewhat incomplete and yet another special case.  Instead,
      rename this completion to r_done_completion and always signal it when
      the OSD client is done with the request, whether unsafe, safe, or
      error.  This is a bit cleaner and helps with the cancellation code.
      Reported-by: Yan, Zheng <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      c297eb42
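      The invariant the commit establishes can be sketched with stand-in
      types: before, the completion fired only when an on-disk ack had been
      requested; after, it fires on every terminal path:

```c
#include <stdbool.h>

struct mini_request {
    bool ondisk_requested;
    bool done_signaled;   /* stands in for r_done_completion */
};

/* Old behaviour: r_safe_completion gated on ONDISK having been
 * requested - an ACK-only write leaves waiters hanging. */
static void complete_old(struct mini_request *req)
{
    if (req->ondisk_requested)
        req->done_signaled = true;
}

/* New behaviour: always signal when the OSD client is done with the
 * request, whether unsafe, safe, or error. */
static void complete_new(struct mini_request *req)
{
    req->done_signaled = true;
}
```

      The old gate is exactly the hole the pool-perm check's ACK-only write
      fell through; the unconditional signal closes it.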
  9. 13 Dec 2016, 3 commits
  10. 11 Nov 2016, 1 commit
    •
      ceph: use default file splice read callback · 8a8d5617
      Authored by Yan, Zheng
      The splice read/write implementation changed recently. When
      generic_file_splice_read() is used, an iov_iter with type ==
      ITER_PIPE is passed to the filesystem's read_iter callback. But
      ceph_sync_read() can't serve an ITER_PIPE iov_iter correctly (an
      ITER_PIPE iov_iter expects pages from the page cache).
      
      Fixing ceph_sync_read() would require a big patch, so use the
      default splice read callback for now.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      8a8d5617
  11. 29 Oct 2016, 1 commit
  12. 16 Oct 2016, 1 commit
  13. 03 Oct 2016, 1 commit
    •
      ceph: ignore error from invalidate_inode_pages2_range() in direct write · 5d7eb1a3
      Authored by NeilBrown
      This call can fail if there are dirty pages.  The preceding call to
      filemap_write_and_wait_range() will normally remove dirty pages, but
      as inode_lock() is not held over calls to ceph_direct_read_write(), it
      could race with non-direct writes, and pages could be dirtied
      immediately after filemap_write_and_wait_range() returns.
      
      If there are dirty pages, they will be removed by the subsequent call
      to truncate_inode_pages_range(), so having them here is not a problem.
      
      If the 'ret' value is left holding an error, then in the async IO case
      (aio_req is not NULL) the loop that would normally call
      ceph_osdc_start_request() will see the error in 'ret' and abort all
      requests.  This doesn't seem like correct behaviour.
      
      So use separate 'ret2' instead of overloading 'ret'.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Yan, Zheng <zyan@redhat.com>
      5d7eb1a3
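      The ret/ret2 separation can be sketched like this; the functions and
      values below are stand-ins for invalidate_inode_pages2_range() and the
      surrounding direct-write path, not the real code:

```c
#include <stdio.h>

/* Stand-in for invalidate_inode_pages2_range(): fails with -EBUSY-like
 * status when racing non-direct writes dirtied pages. */
static int invalidate_range(int simulate_dirty_pages)
{
    return simulate_dirty_pages ? -16 : 0;
}

/* The fix: keep the primary result in 'ret' and capture the
 * best-effort invalidation status in a separate 'ret2', so a harmless
 * failure cannot leak into 'ret' and abort the async submission loop. */
static int direct_write(int dirty_race)
{
    int ret = 100;   /* pretend: bytes scheduled so far */
    int ret2 = invalidate_range(dirty_race);
    if (ret2 < 0)
        fprintf(stderr, "invalidate failed: %d (ignored)\n", ret2);
    /* ret still holds the real result; submission continues */
    return ret;
}
```

      Overloading a single 'ret' for both purposes is what made the async
      path see a spurious error; the second variable is the whole fix.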
  14. 28 Sep 2016, 1 commit
  15. 28 Jul 2016, 5 commits
  16. 06 Jul 2016, 1 commit
  17. 01 Jun 2016, 1 commit
  18. 31 May 2016, 1 commit
  19. 26 May 2016, 4 commits
    •
      ceph: renew caps for read/write if mds session got killed. · 77310320
      Authored by Yan, Zheng
      When the MDS session gets killed, read/write operations may hang.
      The client waits for Frw caps, but the MDS does not know which caps
      the client wants. To recover from this, the client sends an open
      request to the MDS; the request tells the MDS which caps the
      client wants.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      77310320
    •
      libceph: redo callbacks and factor out MOSDOpReply decoding · fe5da05e
      Authored by Ilya Dryomov
      If you specify ACK | ONDISK and set ->r_unsafe_callback, both
      ->r_callback and ->r_unsafe_callback(true) are called on ack.  This is
      very confusing.  Redo this so that only one of them is called:
      
          ->r_unsafe_callback(true), on ack
          ->r_unsafe_callback(false), on commit
      
      or
      
          ->r_callback, on ack|commit
      
      Decode everything in decode_MOSDOpReply() to reduce clutter.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      fe5da05e
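      The redone dispatch can be sketched with stand-in types: a request has
      either an unsafe callback (invoked twice, true on ack and false on
      commit) or a plain callback (invoked once), never a confusing mix.
      Everything below is illustrative, only the field names follow the
      commit:

```c
#include <stdbool.h>
#include <stddef.h>

struct mini_req;
typedef void (*cb_t)(struct mini_req *);
typedef void (*unsafe_cb_t)(struct mini_req *, bool unsafe);

struct mini_req {
    cb_t r_callback;
    unsafe_cb_t r_unsafe_callback;
    int acks, commits, plain;    /* counters for the sketch only */
};

static void handle_ack(struct mini_req *req)
{
    if (req->r_unsafe_callback)
        req->r_unsafe_callback(req, true);   /* ->r_unsafe_callback(true), on ack */
    else if (req->r_callback)
        req->r_callback(req);                /* ->r_callback, on ack|commit */
}

static void handle_commit(struct mini_req *req)
{
    if (req->r_unsafe_callback)
        req->r_unsafe_callback(req, false);  /* ->r_unsafe_callback(false), on commit */
    /* a plain r_callback already ran at ack time in this sketch */
}

/* counting callbacks used in the usage example */
static void count_unsafe(struct mini_req *r, bool unsafe)
{
    if (unsafe) r->acks++; else r->commits++;
}
static void count_plain(struct mini_req *r) { r->plain++; }
```

      Contrast with the old behaviour the commit describes, where setting
      both callbacks caused both to fire on ack: here exactly one path runs
      per event.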
    •
      libceph: drop msg argument from ceph_osdc_callback_t · 85e084fe
      Authored by Ilya Dryomov
      finish_read(), its only user, uses it to get to hdr.data_len, which is
      what ->r_result is set to on success.  This gains us the ability to
      safely call callbacks from contexts other than reply, e.g. map check.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      85e084fe
    •
      libceph: switch to calc_target(), part 2 · bb873b53
      Authored by Ilya Dryomov
      The crux of this is getting rid of ceph_osdc_build_request(), so that
      MOSDOp can be encoded not before but after calc_target() calculates the
      actual target.  Encoding now happens within ceph_osdc_start_request().
      
      Also nuked is the accompanying bunch of pointers into the encoded
      buffer that was used to update fields on each send - instead, the
      entire front is re-encoded.  If we want to support target->name_len !=
      base->name_len in the future, there is no other way, because oid is
      surrounded by other fields in the encoded buffer.
      
      Encoding OSD ops and adding data items to the request message were
      mixed together in osd_req_encode_op().  While we want to re-encode OSD
      ops, we don't want to add duplicate data items to the message when
      resending, so all calls to ceph_osdc_msg_data_add() are factored out
      into a new setup_request_data().
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      bb873b53
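      The resend-safety concern can be sketched as follows: re-encoding the
      message front is repeatable, but adding data items must happen exactly
      once. The struct and guard field below are illustrative stand-ins;
      only setup_request_data() is named by the commit:

```c
#include <stdbool.h>

struct mini_osd_req {
    int data_items;      /* data items attached to the message */
    int front_encodes;   /* times the front has been (re-)encoded */
    bool data_added;     /* hypothetical once-only guard */
};

/* Factored-out data setup: never add duplicate data items when the
 * request is resent. */
static void setup_request_data(struct mini_osd_req *req)
{
    if (req->data_added)
        return;
    req->data_items++;
    req->data_added = true;
}

/* Each (re)send: calc_target() would run first, then the entire front
 * is re-encoded against the freshly calculated target. */
static void send_request(struct mini_osd_req *req)
{
    req->front_encodes++;
    setup_request_data(req);
}
```

      This mirrors the split the commit describes: encoding moved inside
      ceph_osdc_start_request() and made repeatable, while the data-item
      side effect was isolated so resends stay idempotent.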