1. 02 Nov, 2010 1 commit
  2. 28 Oct, 2010 7 commits
  3. 27 Oct, 2010 1 commit
  4. 23 Oct, 2010 1 commit
  5. 22 Oct, 2010 1 commit
  6. 21 Oct, 2010 6 commits
  7. 19 Oct, 2010 1 commit
  8. 18 Oct, 2010 2 commits
  9. 15 Oct, 2010 20 commits
    • A
      llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann committed
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
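
The difference between the variants named above can be illustrated with a small userspace model. This is only a sketch: the `model_*` names, the struct, and the flag values are stand-ins invented for illustration, not the kernel's definitions. It assumes that `nonseekable_open` clears the seek-related mode bits, that `no_llseek` fails with `-ESPIPE`, and that `noop_llseek` succeeds without moving the offset.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for kernel types and flags; the values are arbitrary. */
typedef long long model_loff_t;
#define MODEL_FMODE_LSEEK  0x1u
#define MODEL_FMODE_PREAD  0x2u
#define MODEL_FMODE_PWRITE 0x4u

struct model_file {
    unsigned int f_mode;   /* capability bits */
    model_loff_t f_pos;    /* current file offset */
};

/* Modelled on nonseekable_open: clear the seek/pread/pwrite capability bits. */
static int model_nonseekable_open(struct model_file *f)
{
    f->f_mode &= ~(MODEL_FMODE_LSEEK | MODEL_FMODE_PREAD | MODEL_FMODE_PWRITE);
    return 0;
}

/* Modelled on no_llseek: every seek attempt fails with -ESPIPE. */
static model_loff_t model_no_llseek(struct model_file *f, model_loff_t off, int whence)
{
    (void)f; (void)off; (void)whence;
    return -ESPIPE;
}

/* Modelled on noop_llseek: report success without moving the offset. */
static model_loff_t model_noop_llseek(struct model_file *f, model_loff_t off, int whence)
{
    (void)off; (void)whence;
    return f->f_pos;
}
```

In this model, a driver that used to ignore seeks silently keeps working with the noop variant, while the no_llseek variant makes the lack of seek support visible to user space as an error.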
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // neither read nor write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
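
The order in which the rules of the semantic patch take effect can be summarized as a plain C decision function. This is a simplified sketch, not part of the patch: `struct fops_facts` and `pick_llseek` are hypothetical names, the overlap between the nonseekable and seq rules is resolved here by a fixed priority, and read/write functions that the patch cannot classify either way are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Facts about one file_operations initializer (hypothetical field names). */
struct fops_facts {
    bool has_llseek;        /* .llseek is already set */
    bool open_nonseekable;  /* .open is, or calls, nonseekable_open */
    bool read_is_seq_read;  /* .read = seq_read */
    bool has_readdir;       /* .readdir is present */
    bool has_read;
    bool has_write;
    bool read_uses_pos;     /* read function touches its loff_t * argument */
    bool write_uses_pos;    /* write function touches its loff_t * argument */
};

/* Returns the name of the llseek variant to add, or NULL to leave the struct alone. */
static const char *pick_llseek(const struct fops_facts *f)
{
    if (f->has_llseek)
        return NULL;                 /* never override an explicit choice */
    if (f->open_nonseekable)
        return "no_llseek";
    if (f->read_is_seq_read)
        return "seq_lseek";
    if (f->has_readdir)
        return "default_llseek";
    if (f->has_read && f->read_uses_pos)
        return "default_llseek";
    if (f->has_write && f->write_uses_pos)
        return "default_llseek";
    return "noop_llseek";            /* offset is provably ignored */
}
```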
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
    • L
      drbd: add race-breaker to drbd_go_diskless · 5dbfe7ae
      Lars Ellenberg committed
      This adds a necessary race breaker to these commits:
          drbd: fix for possible deadlock on IO error during resync
          drbd: drop wrong debug asserts, fix recently introduced race
      
      What we do is get a refcount, check the state, then depending on the
      state and the requested minimum disk state, either hold it (success),
      or give it back immediately (failed "try lock").
      
      Some code paths (flushing of drbd metadata) may still grab and hold a
      refcount even if we are D_FAILED (application IO won't).
      So even if we hit local_cnt == 0 once after being D_FAILED,
      we still need to wait for that again after we changed to D_DISKLESS.
      Once local_cnt reaches 0 while we are D_DISKLESS, we can be sure that
      no one will look at the protected members anymore, so only then is it
      safe to free them.
      
      We cannot easily convert to standard locking primitives here, as we want
      to be able to use it in atomic context (we always do a "try lock"),
      as well as hold references for a "long time" (from IO submission to
      completion callback).
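
The "try lock" on a reference count described above might look roughly like the following userspace sketch using C11 atomics. It is an illustration under assumptions, not the drbd code: `struct model_dev`, `try_get_ldev`, and `safe_to_free` are hypothetical names, and the disk states are reduced to three.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Reduced, illustrative set of disk states in increasing order of usability. */
enum disk_state { D_DISKLESS, D_FAILED, D_UP_TO_DATE };

struct model_dev {
    atomic_int local_cnt;  /* references held on the local disk */
    atomic_int disk;       /* current enum disk_state */
};

/* Take a reference only if the disk is at least in state `min`. Never blocks. */
static bool try_get_ldev(struct model_dev *d, int min)
{
    atomic_fetch_add(&d->local_cnt, 1);      /* get the refcount first */
    if (atomic_load(&d->disk) >= min)
        return true;                         /* hold it: success */
    atomic_fetch_sub(&d->local_cnt, 1);      /* give it back: failed "try lock" */
    return false;
}

static void put_ldev(struct model_dev *d)
{
    atomic_fetch_sub(&d->local_cnt, 1);
}

/* Protected members may be freed only once the device is diskless AND unreferenced. */
static bool safe_to_free(struct model_dev *d)
{
    return atomic_load(&d->disk) == D_DISKLESS &&
           atomic_load(&d->local_cnt) == 0;
}
```

Because the count is taken before the state is checked, a holder that slipped in while the device was still D_FAILED keeps `safe_to_free` false until it drops its reference, which is why the count has to be checked again after the switch to D_DISKLESS.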
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
    • D
      drbd: cleanup: change "<= 0" to "== 0" · 22657695
      Dan Carpenter committed
      dt is unsigned so it's never less than zero.  We are calculating the
      elapsed time, and that's never less than zero (unless there is a bug or
      we invent time travel).  The comparison here is just to guard against
      divide by zero bugs.
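
A minimal sketch of the guard being described, assuming a hypothetical helper `avg_rate` (the name and signature are not from the drbd source):

```c
#include <assert.h>

/* Average rate per second. `elapsed` is unsigned, so zero is the only value to guard. */
static unsigned long avg_rate(unsigned long total, unsigned long elapsed)
{
    if (elapsed == 0)   /* "<= 0" would test the same thing for an unsigned value */
        elapsed = 1;    /* avoid a division by zero */
    return total / elapsed;
}
```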
      Signed-off-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
    • L
      drbd: relax the grace period of the md_sync timer again · ca0e6098
      Lars Ellenberg committed
      Consolidate the ifdefs for the debug level; the code accidentally used
      both DEBUG and DRBD_DEBUG_MD_SYNC.  Default to off.
      
      For production, we can safely reduce the grace period for this timer
      again to the value we used to have.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: add some more explicit drbd_md_sync · 856c50c7
      Lars Ellenberg committed
      It sometimes may take a while for the after state change work to be
      scheduled, which does drbd_md_sync. At convenient places, we should do
      explicit drbd_md_sync to have the new state information on disk as soon
      as possible.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: drop wrong debug asserts, fix recently introduced race · 9d282875
      Lars Ellenberg committed
       commit 2372c38caadeaebc68a5ee190782c2a0df01edc3
       drbd: fix for possible deadlock on IO error during resync
      
      introduced a new ASSERT, which turns out to be wrong. Drop it.
      
      Also serialize the state change to D_DISKLESS with the after state
      change work of the -> D_FAILED transition, don't open a new race.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
    • L
      drbd: add explicit drbd_md_sync to drbd_resync_finished · 13d42685
      Lars Ellenberg committed
      As we usually update the generation UUIDs here, we should explicitly
      sync them to disk.  So far this has been done only implicitly by related
      code paths.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • P
      drbd: Do not log an ASSERT for P_OV_REQUEST packets while C_CONNECTED · b18b37be
      Philipp Reisner committed
      This might happen if the disk gets dropped on the VERIFY_S node.
      Although this is a cluster-wide state transition, the VERIFY_T node
      updates its connection state first. Then the ack packet for the
      cluster-wide state transition travels back, and the VERIFY_S node
      stops producing the P_OV_REQUEST packets.
      
      There is absolutely nothing wrong with that.
      
      Further, do not log "Can not satisfy peer's..." on the VERIFY_S
      node in this case, but pretend that they had equal checksum.
      
      [Bugz 327]
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: fix for possible deadlock on IO error during resync · e9e6f3ec
      Lars Ellenberg committed
      Scenario:
      
      Something (say, flush-147:0) is in drbd_al_begin_io,
      holding a local_cnt, waiting for the resync to make progress.
      
      Disk fails, worker in after_state_ch does drbd_rs_cancel_all,
      then waits for local_cnt to drop to zero.
      
      flush-147:0 is woken by drbd_rs_cancel_all, needs to write an AL
      transaction, and queues that on the worker.
      
      Deadlock.
      
      Fix: do not wait in the worker, have put_ldev() trigger the
      state change D_FAILED -> D_DISKLESS when necessary.
      put_ldev() cannot do the state change directly, as it may or may not
      already hold various spinlocks. We queue a short work instead.
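
The shape of this fix, where the last reference holder only queues the transition and a worker performs it later, can be sketched in a single-threaded userspace model. The `model_*` names and the boolean standing in for a queued work item are assumptions made for illustration; real code would need proper locking and a real work queue.

```c
#include <assert.h>
#include <stdbool.h>

enum disk_state { D_DISKLESS, D_FAILED, D_UP_TO_DATE };

struct model_dev {
    int local_cnt;             /* references held on the local disk */
    enum disk_state disk;
    bool go_diskless_queued;   /* stands in for a queued work item */
};

/* Drop a reference. The last one out while D_FAILED schedules the transition. */
static void model_put_ldev(struct model_dev *d)
{
    if (--d->local_cnt == 0 && d->disk == D_FAILED)
        d->go_diskless_queued = true;   /* queue only: the caller may hold spinlocks */
}

/* Run later by the worker, outside of any caller's locks. */
static void model_worker_run(struct model_dev *d)
{
    if (!d->go_diskless_queued)
        return;
    d->go_diskless_queued = false;
    if (d->disk == D_FAILED && d->local_cnt == 0)
        d->disk = D_DISKLESS;
}
```

Since nobody waits inside the worker any more, the worker stays free to process whatever the remaining reference holder queues before it drops its count.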
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: fix unlikely access after free and list corruption · 22cc37a9
      Lars Ellenberg committed
      Various cleanup paths have been incomplete, for the very unlikely case
      that we cannot allocate enough bios from process context when submitting
      on behalf of the peer or resync process.
      
      Never observed.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: fix for spurious fullsync (uuids rotated too fast) · af85e8e8
      Lars Ellenberg committed
      If it was an "empty" resync, the SyncSource may have already "finished"
      the resync and rotated the UUIDs, before noticing the connection loss
      (and generating a new uuid, if Primary, rotating again), while the
      SyncTarget did not change its uuids at all, or only got to the previous
      sync-uuid.
      This would then again lead to a full sync on next handshake
      (see also Bug #251).
      
      Fix:
      Use explicit resync finished notification even for empty resyncs,
      do not finish an empty resync implicitly on the SyncSource.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: allow for explicit resync-finished notifications · e9ef7bb6
      Lars Ellenberg committed
      Preparation patch so more drbd_send_state() usage on the peer
      will not confuse drbd in receive_state().
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: preparation commit, using full state in receive_state() · 4ac4aada
      Lars Ellenberg committed
      no functional change, just using full state instead of just the .conn
      part of it for comparisons.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: drbd_send_ack_dp must not rely on header information · 2b2bf214
      Lars Ellenberg committed
      drbd commit 17c854fea474a5eb3cfa12e4fb019e46debbc4ec
      drbd: receiving of big packets, for payloads between 64kByte and 4GByte
      introduced a new on-the-wire packet header format.  We must no longer
      assume either format, but use the result of whatever drbd_recv_header
      has decoded.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: Fix regression in recv_bm_rle_bits (compressed bitmap) · 004352fa
      Lars Ellenberg committed
      We used to be16_to_cpu the length field in our received packet header.
      drbd commit 17c854fea474a5eb3cfa12e4fb019e46debbc4ec
          drbd: receiving of big packets, for payloads between 64kByte and 4GByte
      changed this, but forgot to adjust a few places where we relied on
      h->length being in native byte order.
      
      This broke the receiving side of the RLE compressed bitmap exchange.
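
The class of bug described here goes away when the header is decoded exactly once into a native-byte-order structure and later code reads only that structure. The sketch below assumes a hypothetical 8-byte big-endian wire layout (4-byte magic, 2-byte command, 2-byte length); the layout, `struct decoded_hdr`, and `decode_hdr` are illustrative and not taken from the drbd protocol definition.

```c
#include <assert.h>
#include <stdint.h>

/* Header fields after decoding, in native byte order (hypothetical struct). */
struct decoded_hdr {
    uint16_t command;
    uint32_t length;   /* wide enough for larger header formats as well */
};

/* Read a big-endian 16-bit value independent of host byte order. */
static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Assumed layout: bytes 0-3 magic, bytes 4-5 command, bytes 6-7 length. */
static void decode_hdr(const uint8_t wire[8], struct decoded_hdr *out)
{
    out->command = get_be16(wire + 4);
    out->length  = get_be16(wire + 6);
}
```

Code that consumes `decoded_hdr.length` never needs to know which wire format or byte order the packet arrived in.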
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • P
      drbd: Fixed a stupid copy and paste error · f10f2623
      Philipp Reisner committed
      This caused rs_planed to be out of sync with the content of the fifo.
      That in turn could cause the resync to come to a complete halt.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • P
      drbd: Allow larger values for c-fill-target. · 00b42537
      Philipp Reisner committed
      Connections through a compressing proxy might have more bits
      in flight. The maximum is now 500 MByte instead of 50 MByte.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    • L
      drbd: fix possible access after free · f65363cf
      Lars Ellenberg committed
      If we release the page pointed to by md_io_tmpp, we need to zero out the
      pointer, too, as that may be used later to decide whether we need to
      allocate a new page again.
      
      Impact: a previously freed page may be used and clobbered.  Depending on
      what that particular page is being used for meanwhile, this may result
      in silent data corruption of completely unrelated things.
      
      Only of concern on devices with logical_block_size != 512 byte,
      if you re-attach after becoming diskless once.
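
The pattern behind this fix, clearing a pointer when its memory is released so that a later "do we need to allocate?" check stays truthful, can be shown in a userspace sketch. `malloc`/`free` stand in for the kernel's page allocation, and `struct model_dev`, `release_tmp_page`, and `ensure_tmp_page` are hypothetical names; only `md_io_tmpp` comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

struct model_dev {
    void *md_io_tmpp;   /* temporary buffer, allocated on demand */
};

/* Release the buffer and forget the pointer in the same step. */
static void release_tmp_page(struct model_dev *d)
{
    free(d->md_io_tmpp);
    d->md_io_tmpp = NULL;   /* otherwise the stale pointer still looks like a valid buffer */
}

/* Allocate the buffer only if none is present. Returns 0 on success, -1 on failure. */
static int ensure_tmp_page(struct model_dev *d, size_t size)
{
    if (!d->md_io_tmpp) {
        d->md_io_tmpp = malloc(size);
        if (!d->md_io_tmpp)
            return -1;
    }
    return 0;
}
```

Without the assignment to NULL, a second call to `ensure_tmp_page` after a release would skip the allocation and hand out memory that may already belong to something else.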
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>