1. 03 11月, 2015 2 次提交
  2. 16 10月, 2015 1 次提交
    • I
      rbd: use writefull op for object size writes · e30b7577
      Ilya Dryomov 提交于
      This covers only the simplest case - an object size sized write, but
      it's still useful in tiering setups when EC is used for the base tier
      as writefull op can be proxied, saving an object promotion.
      
      Even though updating ceph_osdc_new_request() to allow writefull should
      just be a matter of fixing an assert, I didn't do it because its only
      user is cephfs.  All other sites were updated.
      
      Reflects ceph.git commit 7bfb7f9025a8ee0d2305f49bf0336d2424da5b5b.
      Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: NAlex Elder <elder@linaro.org>
      e30b7577
  3. 09 9月, 2015 1 次提交
    • I
      libceph: check data_len in ->alloc_msg() · d15f9d69
      Ilya Dryomov 提交于
      Only ->alloc_msg() should check data_len of the incoming message
      against the preallocated ceph_msg, doing it in the messenger is not
      right.  The contract is that either ->alloc_msg() returns a ceph_msg
      which will fit all of the portions of the incoming message, or it
      returns NULL and possibly sets skip, signaling whether NULL is due to
      an -ENOMEM.  ->alloc_msg() should be the only place where we make the
      skip/no-skip decision.
      
      I stumbled upon this while looking at con/osd ref counting.  Right now,
      if we get a non-extent message with a larger data portion than we are
      prepared for, ->alloc_msg() returns a ceph_msg, and then, when we skip
      it in the messenger, we don't put the con/osd ref acquired in
      ceph_con_in_msg_alloc() (which is normally put in process_message()),
      so this also fixes a memory leak.
      
      An existing BUG_ON in ceph_msg_data_cursor_init() ensures we don't
      corrupt random memory should a buggy ->alloc_msg() return an unfit
      ceph_msg.
      
      While at it, I changed the "unknown tid" dout() to a pr_warn() to make
      sure all skips are seen and unified format strings.
      Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: NAlex Elder <elder@linaro.org>
      d15f9d69
  4. 25 6月, 2015 3 次提交
  5. 21 5月, 2015 2 次提交
    • I
      Revert "libceph: clear r_req_lru_item in __unregister_linger_request()" · 521a04d0
      Ilya Dryomov 提交于
      This reverts commit ba9d114e.
      
      .. which introduced a regression that prevented all lingering requests
      requeued in kick_requests() from ever being sent to the OSDs, resulting
      in a lot of missed notifies.  In retrospect it's pretty obvious that
      r_req_lru_item item in the case of lingering requests can be used not
      only for notarget, but also for unsent linkage due to how tightly
      actual map and enqueue operations are coupled in __map_request().
      
      The assertion that was being silenced is taken care of in the previous
      ("libceph: request a new osdmap if lingering request maps to no osd")
      commit: by always kicking homeless lingering requests we ensure that
      none of them ends up on the notarget list outside of the critical
      section guarded by request_mutex.
      
      Cc: stable@vger.kernel.org # 3.18+, needs b0494532 "libceph: request a new osdmap if lingering request maps to no osd"
      Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: NSage Weil <sage@redhat.com>
      521a04d0
    • I
      libceph: request a new osdmap if lingering request maps to no osd · b0494532
      Ilya Dryomov 提交于
      This commit does two things.  First, if there are any homeless
      lingering requests, we now request a new osdmap even if the osdmap that
      is being processed brought no changes, i.e. if a given lingering
      request turned homeless in one of the previous epochs and remained
      homeless in the current epoch.  Not doing so leaves us with a stale
      osdmap and as a result we may miss our window for reestablishing the
      watch and lose notifies.
      
      MON=1 OSD=1:
      
          # cat linger-needmap.sh
          #!/bin/bash
          rbd create --size 1 test
          DEV=$(rbd map test)
          ceph osd out 0
          rbd map dne/dne # obtain a new osdmap as a side effect (!)
          sleep 1
          ceph osd in 0
          rbd resize --size 2 test
          # rbd info test | grep size -> 2M
          # blockdev --getsize $DEV -> 1M
      
      N.B.: Not obtaining a new osdmap in between "osd out" and "osd in"
      above is enough to make it miss that resize notify, but that is a
      bug^Wlimitation of ceph watch/notify v1.
      
      Second, homeless lingering requests are now kicked just like those
      lingering requests whose mapping has changed.  This is mainly to
      recognize that a homeless lingering request makes no sense and to
      preserve the invariant that a registered lingering request is not
      sitting on any of r_req_lru_item lists.  This spares us a WARN_ON,
      which commit ba9d114e ("libceph: clear r_req_lru_item in
      __unregister_linger_request()") tried to fix the _wrong_ way.
      
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: NSage Weil <sage@redhat.com>
      b0494532
  6. 19 2月, 2015 2 次提交
    • I
      libceph: kfree() in put_osd() shouldn't depend on authorizer · b28ec2f3
      Ilya Dryomov 提交于
      a255651d ("ceph: ensure auth ops are defined before use") made
      kfree() in put_osd() conditional on the authorizer.  A mechanical
      mistake most likely - fix it.
      
      Cc: Alex Elder <elder@linaro.org>
      Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: NSage Weil <sage@redhat.com>
      Reviewed-by: NAlex Elder <elder@linaro.org>
      b28ec2f3
    • I
      libceph: fix double __remove_osd() problem · 7eb71e03
      Ilya Dryomov 提交于
      It turns out it's possible to get __remove_osd() called twice on the
      same OSD.  That doesn't sit well with rb_erase() - depending on the
      shape of the tree we can get a NULL dereference, a soft lockup or
      a random crash at some point in the future as we end up touching freed
      memory.  One scenario that I was able to reproduce is as follows:
      
                  <osd3 is idle, on the osd lru list>
      <con reset - osd3>
      con_fault_finish()
        osd_reset()
                                    <osdmap - osd3 down>
                                    ceph_osdc_handle_map()
                                      <takes map_sem>
                                      kick_requests()
                                        <takes request_mutex>
                                        reset_changed_osds()
                                          __reset_osd()
                                            __remove_osd()
                                        <releases request_mutex>
                                      <releases map_sem>
          <takes map_sem>
          <takes request_mutex>
          __kick_osd_requests()
            __reset_osd()
              __remove_osd() <-- !!!
      
      A case can be made that osd refcounting is imperfect and reworking it
      would be a proper resolution, but for now Sage and I decided to fix
      this by adding a safe guard around __remove_osd().
      
      Fixes: http://tracker.ceph.com/issues/8087
      
      Cc: Sage Weil <sage@redhat.com>
      Cc: stable@vger.kernel.org # 3.9+: 7c6e6fc5: libceph: assert both regular and lingering lists in __remove_osd()
      Cc: stable@vger.kernel.org # 3.9+: cc9f1f51: libceph: change from BUG to WARN for __remove_osd() asserts
      Cc: stable@vger.kernel.org # 3.9+
      Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: NSage Weil <sage@redhat.com>
      Reviewed-by: NAlex Elder <elder@linaro.org>
      7eb71e03
  7. 18 12月, 2014 4 次提交
  8. 14 11月, 2014 3 次提交
  9. 15 10月, 2014 5 次提交
  10. 08 7月, 2014 9 次提交
  11. 12 6月, 2014 1 次提交
  12. 05 4月, 2014 2 次提交
  13. 03 4月, 2014 2 次提交
  14. 14 2月, 2014 1 次提交
  15. 08 2月, 2014 2 次提交