1. 11 Jul 2014: 10 commits
    • block: Convert last uses of __FUNCTION__ to __func__ · 659b2e3b
      Authored by Joe Perches
      Just about all of these have been converted to __func__,
      so convert the last uses.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • drbd: implement csums-after-crash-only · aaaba345
      Authored by Lars Ellenberg
      Checksum based resync trades CPU cycles for network bandwidth,
      in situations where we expect much of the to-be-resynced blocks
      to be actually identical on both sides already.
      
      In a "network hiccup" scenario, it won't help:
      all to-be-resynced blocks will typically be different.
      
      The use case is for the resync of *potentially* different blocks
      after crash recovery -- the crash recovery had marked larger areas
      (those covered by the activity log) as need-to-be-resynced,
      just in case. Most of those blocks will be identical.
      
      This option makes it possible to configure checksum based resync,
      but only actually use it for the first resync after primary crash.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
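A minimal Python sketch of the idea (hypothetical names, not DRBD code): checksum-based resync sends only a digest per block and transfers the payload only when the digests differ, which pays off when most candidate blocks are in fact identical, as after crash recovery.

```python
# Hypothetical sketch, not DRBD code: decide per block whether the
# payload must be transferred by comparing digests first.
import hashlib

def block_needs_transfer(local_block: bytes, peer_digest: bytes) -> bool:
    """True if our copy of the block differs from the peer's digest."""
    return hashlib.sha1(local_block).digest() != peer_digest

# After crash recovery, most activity-log-covered blocks are identical,
# so for most blocks only the small digest crosses the network.
block = b"\x00" * 4096
same = block_needs_transfer(block, hashlib.sha1(block).digest())  # False
```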
    • drbd: fix bogus resync stats in /proc/drbd · a5655dac
      Authored by Lars Ellenberg
      We intentionally do not serialize /proc/drbd access with
      internal state changes or statistic updates.
      
      Because of that, cat /proc/drbd may race with a resync that has
      just finished: it still sees the sync state and reads the number
      of blocks still to go, but by the time it accesses the total
      number of blocks of this resync, that total has already been
      reset to 0.

      This produces bogus numbers in the resync speed estimates.
      
      Fix by accessing all relevant data only once,
      and fixing it up if "still to go" happens to be more than "total".
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
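The fix amounts to the following pattern, shown as a hypothetical Python sketch (not the kernel code): read each racy counter exactly once into a local, then repair the snapshot if "still to go" exceeds "total".

```python
# Hypothetical sketch of the fixup: take one snapshot of each counter,
# then repair the snapshot if a concurrent reset made it inconsistent.
def resync_done_percent(rs_left: int, rs_total: int) -> float:
    left, total = rs_left, rs_total  # single read of each value
    if left > total:                 # raced with end-of-resync reset
        left = total
    if total == 0:                   # resync already finished
        return 100.0
    return 100.0 * (total - left) / total
```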
    • drbd: Remove unnecessary/unused code · caa3db0e
      Authored by Andreas Gruenbacher
      Get rid of dump_stack() debug statements.
      
      There is no point whatsoever in registering and unregistering a reboot
      notifier that doesn't do anything.
      
      The intention was to switch to an "emergency read-only" mode,
      so we won't have to resync the full activity log just because
      we had been Primary before the reboot.
      
      Once we have that implemented, we may re-introduce the reboot notifier.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • drbd: get rid of drbd_queue_work_front · 4dd726f0
      Authored by Lars Ellenberg
      The last user was al_write_transaction, if called with "delegate",
      and the last caller with "delegate = true" was the receiver thread,
      which has no need to delegate and can do the call itself.
      
      Finally drop the delegate parameter, drop the extra
      w_al_write_transaction callback, and drop drbd_queue_work_front.
      
      Do not (yet) change dequeue_work_item to dequeue_work_batch, though.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • drbd: use drbd_device_post_work() in more places · ac0acb9e
      Authored by Lars Ellenberg
      This replaces the md_sync_work member of struct drbd_device
      by a new MD_SYNC "work bit" in device->flags.
      
      This replaces the resync_start_work member of struct drbd_device
      by a new RS_START "work bit" in device->flags.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
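The work-bit mechanism can be sketched in Python (hypothetical names, not the kernel code; the kernel uses set_bit() plus a worker wakeup): posting work sets a per-device flag bit, posting the same work twice is idempotent, and a single worker drains all set bits in one pass.

```python
# Hypothetical sketch: "work bits" in a flags word replace dedicated
# work structs; a single worker grabs and clears them in one pass.
MD_SYNC = 1 << 0
RS_START = 1 << 1

class Device:
    def __init__(self) -> None:
        self.flags = 0

    def post_work(self, bit: int) -> None:
        self.flags |= bit                  # posting twice is still one job

    def worker_pass(self) -> int:
        todo, self.flags = self.flags, 0   # grab and clear together
        return todo
```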
    • drbd: make sure disk cleanup happens in worker context · e334f550
      Authored by Lars Ellenberg
      The recent fix to put_ldev() (correct ordering of access to local_cnt
      and state.disk; memory barrier in __drbd_set_state) guarantees
      that the cleanup happens exactly once.
      
      However it does not yet guarantee that the cleanup happens from worker
      context, the last put_ldev() may still happen from atomic context,
      which must not happen: blkdev_put() may sleep.
      
      Fix this by scheduling the cleanup to the worker instead,
      using a couple more bits in device->flags and a new helper,
      drbd_device_post_work().
      
      Generalized the "resync progress" work to cover these new work bits.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • drbd: close race when detaching from disk · ba3c6fb8
      Authored by Lars Ellenberg
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
      IP: bd_release+0x21/0x70
      Process drbd_w_t7146
      Call Trace:
       close_bdev_exclusive
       drbd_free_ldev		[drbd]
       drbd_ldev_destroy	[drbd]
       w_after_state_ch	[drbd]
      
      Race probably went like this:
        state.disk = D_FAILED
      
      ... first one to hit zero during D_FAILED:
         put_ldev() /* ----------------> 0 */
           i = atomic_dec_return()
           if (i == 0)
             if (state.disk == D_FAILED)
               schedule_work(go_diskless)
                                      /* 1 <------ */ get_ldev_if_state()
         go_diskless()
            do_some_pre_cleanup()                     corresponding put_ldev():
            force_state(D_DISKLESS)   /* 0 <------ */ i = atomic_dec_return()
                                                      if (i == 0)
              atomic_inc() /* ---------> 1 */
              state.disk = D_DISKLESS
              schedule_work(after_state_ch)           /* execution pre-empted by IRQ ? */
      
         after_state_ch()
           put_ldev()
             i = atomic_dec_return()  /* 0 */
             if (i == 0)
               if (state.disk == D_DISKLESS)            if (state.disk == D_DISKLESS)
                 drbd_ldev_destroy()                      drbd_ldev_destroy();
      
      Fix this by checking the disk state *before* the atomic_dec_return()
      (which implies memory barriers), and by inserting extra memory
      barriers around the state assignment in __drbd_set_state().
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
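The essence of the fix, sketched as hypothetical Python (the real code relies on atomic_dec_return() and memory barriers, not a lock): sample the disk state before dropping the reference, so only the thread that actually takes the count to zero runs the destroy path, exactly once.

```python
# Hypothetical sketch of the fixed put_ldev() ordering; a lock stands
# in for the kernel's atomics and memory barriers.
import threading

D_DISKLESS = "diskless"

class Dev:
    def __init__(self, refs: int, disk: str) -> None:
        self.local_cnt = refs
        self.disk = disk
        self.destroys = 0
        self._lock = threading.Lock()

    def put_ldev(self) -> None:
        with self._lock:
            state = self.disk        # read state BEFORE the decrement
            self.local_cnt -= 1
            if self.local_cnt == 0 and state == D_DISKLESS:
                self.destroys += 1   # drbd_ldev_destroy(), exactly once

dev = Dev(refs=2, disk=D_DISKLESS)
for _ in range(2):
    dev.put_ldev()
```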
    • drbd: fix resync finished detection · 5ab7d2c0
      Authored by Lars Ellenberg
      This fixes one recent regression and one long-standing bug.
      
      The bug:
      drbd_try_clear_on_disk_bm() assumed that all "count" bits have to be
      accounted in the resync extent corresponding to the start sector.
      
      Since we allow application requests to cross our "extent" boundaries,
      this assumption is no longer true, resulting in possible misaccounting,
      scary messages
      ("BAD! sector=12345s enr=6 rs_left=-7 rs_failed=0 count=58 cstate=..."),
      and potentially, if the last bit to be cleared during resync resides
      in a previously misaccounted resync extent, a resync that is never
      recognized as finished but appears "stalled" forever, even though
      all blocks are in sync again and all bits have been cleared.
      
      The regression was introduced by
          drbd: get rid of atomic update on disk bitmap works
      
      For an "empty" resync (rs_total == 0), we must not "finish" the
      resync on the SyncSource before the SyncTarget knows all relevant
      information (sync uuid).  We need to wait for the full round-trip,
      the SyncTarget will then explicitly notify us.
      
      Also for normal, non-empty resyncs (rs_total > 0), the resync-finished
      condition needs to be tested before the schedule() in wait_for_work, or
      it is likely to be missed.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
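The "test the condition before the schedule()" rule guards against the classic lost-wakeup bug; a hypothetical Python sketch using a condition variable (names are illustrative, not the kernel's):

```python
# Hypothetical sketch: re-check the wake-up condition under the lock
# before sleeping, so a condition that became true just before the
# sleep cannot be missed.
import threading

class WaitForWork:
    def __init__(self) -> None:
        self._cv = threading.Condition()
        self._finished = False

    def resync_finished(self) -> None:
        with self._cv:
            self._finished = True
            self._cv.notify_all()

    def wait(self, timeout: float) -> bool:
        with self._cv:
            while not self._finished:           # test BEFORE sleeping
                if not self._cv.wait(timeout):  # timed out
                    return False
            return True

w = WaitForWork()
w.resync_finished()         # condition becomes true before anyone sleeps
seen = w.wait(timeout=0.1)  # still observed: no lost wakeup
```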
    • drbd: get rid of atomic update on disk bitmap works · c7a58db4
      Authored by Lars Ellenberg
      Just trigger the occasional lazy bitmap write-out during resync
      from the central wait_for_work() helper.
      
      Previously, during resync, bitmap pages would be written out separately,
      synchronously, one at a time, at least 8 times each (every 512 bytes
      worth of bitmap cleared).
      
      Now we trigger "merge friendly" bulk write out of all cleared pages
      every two seconds during resync, and once the resync is finished.
      Most pages will be written out only once.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
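A hypothetical Python sketch of the lazy write-out policy (the class, names, and fake clock are illustrative, not kernel code): dirty bitmap pages accumulate and are flushed in one bulk pass at most every two seconds, plus once at the end, so a page cleared many times is still written only once per pass.

```python
# Hypothetical sketch: batch dirty bitmap pages and submit them together,
# instead of one synchronous write per cleared 512-byte chunk.
FLUSH_INTERVAL = 2.0  # seconds, as in the commit message

class LazyBitmap:
    def __init__(self, clock) -> None:
        self._dirty = set()
        self._clock = clock
        self._last_flush = clock()
        self.pages_written = 0

    def clear_bits(self, page: int) -> None:
        self._dirty.add(page)          # page touched again: still one write

    def maybe_flush(self, force: bool = False) -> int:
        if not force and self._clock() - self._last_flush < FLUSH_INTERVAL:
            return 0
        n = len(self._dirty)           # one merge-friendly bulk submission
        self.pages_written += n
        self._dirty.clear()
        self._last_flush = self._clock()
        return n

now = [0.0]
bm = LazyBitmap(clock=lambda: now[0])
for _ in range(8):                     # eight clears hit the same page
    bm.clear_bits(1)
```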
  2. 10 Jul 2014: 3 commits
  3. 01 May 2014: 8 commits
  4. 17 Feb 2014: 19 commits