1. 05 Jul, 2019 1 commit
  2. 03 Jul, 2019 2 commits
  3. 01 Jul, 2019 5 commits
  4. 24 Jun, 2019 8 commits
    • iotests: Fix 205 for concurrent runs · ab5d4a30
      Max Reitz committed
      Tests should place their files into the test directory.  This includes
      Unix sockets.  205 currently fails to do so, which prevents it from
      being run concurrently.
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      Message-id: 20190618210238.9524-1-mreitz@redhat.com
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • ssh: switch from libssh2 to libssh · b10d49d7
      Pino Toscano committed
      Rewrite the implementation of the ssh block driver to use libssh instead
      of libssh2.  The libssh library has various advantages over libssh2:
      - easier API for authentication (for example for using ssh-agent)
      - easier API for known_hosts handling
      - supports newer types of keys in known_hosts
      
      Use APIs/features available in libssh 0.8 conditionally, so that
      older versions (which are not recommended) remain supported.
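
      As a minimal sketch (not the driver's actual code), such a version
      guard could look like this, assuming only the public libssh API:

          #include <libssh/libssh.h>

          /* Return 0 iff the server's host key matches known_hosts. */
          static int check_host_key(ssh_session session)
          {
          #if LIBSSH_VERSION_INT >= SSH_VERSION_INT(0, 8, 0)
              /* libssh >= 0.8: ssh_session_is_known_server() supersedes
               * the deprecated ssh_is_server_known(). */
              return ssh_session_is_known_server(session) ==
                     SSH_KNOWN_HOSTS_OK ? 0 : -1;
          #else
              return ssh_is_server_known(session) ==
                     SSH_SERVER_KNOWN_OK ? 0 : -1;
          #endif
          }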
      
      Adjust iotest 207 for the different error message, and make it find
      the default key type for localhost (so that the fingerprint can be
      compared properly).
      Contributed-by: Max Reitz <mreitz@redhat.com>
      
      Adjust the various Docker/Travis scripts to use libssh when available
      instead of libssh2. The mingw/mxe testing is dropped for now, as there
      are no packages for it.
      Signed-off-by: Pino Toscano <ptoscano@redhat.com>
      Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Acked-by: Alex Bennée <alex.bennee@linaro.org>
      Message-id: 20190620200840.17655-1-ptoscano@redhat.com
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Message-id: 5873173.t2JhDm7DL7@lindworm.usersys.redhat.com
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • vmdk: Add read-only support for seSparse snapshots · 98eb9733
      Sam Eiderman committed
      Until ESXi 6.5, VMware used the vmfsSparse format for snapshots
      (VMDK3 in QEMU).
      
      This format had the following shortcomings:
      
          * Grain directory (L1) and grain table (L2) entries were 32-bit,
            allowing access to only 2TB (slightly less) of data.
          * The grain size (default) was 512 bytes - leading to data
            fragmentation and many grain tables.
          * For space reclamation purposes, it was necessary to find all
            the grains which are not pointed to by any grain table, so a
            reverse mapping of "offset of grain in vmdk" to "grain table"
            had to be constructed, which takes large amounts of CPU/RAM.
      
      The format specification can be found in VMware's documentation:
      https://www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf
      
      In ESXi 6.5, to support snapshot files larger than 2TB, a new format was
      introduced: SESparse (Space Efficient).
      
      This format fixes the above issues:
      
          * All entries are now 64-bit.
          * The grain size (default) is 4KB.
          * Grain directory and grain tables are now located at the beginning
            of the file.
            + seSparse format reserves space for all grain tables.
            + Grain tables can be addressed using an index.
            + Grains are located at the end of the file and can also be
              addressed with an index.
            - seSparse vmdks of large disks (64TB) have huge preallocated
              headers - mainly due to L2 tables, even for empty snapshots.
          * The header contains a reverse mapping ("backmap") of "offset of
            grain in vmdk" to "grain table" and a bitmap ("free bitmap") which
            specifies for each grain - whether it is allocated or not.
            Using these data structures we can implement space reclamation
            efficiently.
          * Because the header now maintains two mappings:
              * The regular one (grain directory & grain tables)
              * A reverse one (backmap and free bitmap)
            these data structures can lose consistency upon a crash and
            result in a corrupted VMDK.
            Therefore, a journal is also added to the VMDK and is replayed
            when VMware reopens the file after a crash.
      
      Since ESXi 6.7, SESparse is the only snapshot format available.
      
      Unfortunately, VMware does not provide documentation regarding the new
      seSparse format.
      
      This commit is based on black-box research of the seSparse format.
      Various in-guest block operations and their effect on the snapshot file
      were tested.
      
      The only VMware-provided source of information (regarding the
      underlying implementation) was a log file on the ESXi host:
      
          /var/log/hostd.log
      
      Whenever an seSparse snapshot is created, the log is populated with
      seSparse records.
      
      Relevant log records are of the form:
      
      [...] Const Header:
      [...]  constMagic     = 0xcafebabe
      [...]  version        = 2.1
      [...]  capacity       = 204800
      [...]  grainSize      = 8
      [...]  grainTableSize = 64
      [...]  flags          = 0
      [...] Extents:
      [...]  Header         : <1 : 1>
      [...]  JournalHdr     : <2 : 2>
      [...]  Journal        : <2048 : 2048>
      [...]  GrainDirectory : <4096 : 2048>
      [...]  GrainTables    : <6144 : 2048>
      [...]  FreeBitmap     : <8192 : 2048>
      [...]  BackMap        : <10240 : 2048>
      [...]  Grain          : <12288 : 204800>
      [...] Volatile Header:
      [...] volatileMagic     = 0xcafecafe
      [...] FreeGTNumber      = 0
      [...] nextTxnSeqNumber  = 0
      [...] replayJournal     = 0
      
      The sizes seen in the log file are in sectors.
      Extents have the format <offset : size>.
      
      This commit is a strict implementation which enforces the following
      (see the sketch after this list):
          * magics
          * version number 2.1
          * grain size of 8 sectors  (4KB)
          * grain table size of 64 sectors
          * zero flags
          * extent locations
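
      A hypothetical sketch of the constant header with those checks;
      field names, widths and ordering are assumptions derived from the
      hostd.log records above, not VMware documentation:

          #include <stdbool.h>
          #include <stdint.h>

          typedef struct SESparseConstHeader {
              uint64_t magic;             /* constMagic, 0xcafebabe */
              uint64_t version;           /* 2.1 (encoding assumed) */
              uint64_t capacity;          /* in sectors */
              uint64_t grain_size;        /* in sectors */
              uint64_t grain_table_size;  /* in sectors */
              uint64_t flags;
          } SESparseConstHeader;

          static bool sesparse_header_valid(const SESparseConstHeader *h)
          {
              return h->magic == 0xcafebabeULL &&
                     h->grain_size == 8 &&         /* 8 sectors = 4KB */
                     h->grain_table_size == 64 &&  /* 64 sectors */
                     h->flags == 0;
          }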
      
      Additionally, this commit provides only a subset of the
      functionality offered by the seSparse format:
          * Read-only
          * No journal replay
          * No space reclamation
          * No unmap support
      
      Hence, the journal header, journal, free bitmap and backmap extents
      are unused; only the "classic" (L1 -> L2 -> data) grain access is
      implemented.
      
      However, there are several differences in the grain access itself.
      Grain directory (L1):
          * Grain directory entries are indexes (not offsets) to grain
            tables.
          * Valid grain directory entries have their highest nibble set to
            0x1.
          * Since grain tables are always located at the beginning of the
            file, the index fits into 32 bits, so we can use the entry's
            low part if it is valid.
      Grain table (L2):
          * Grain table entries are indexes (not offsets) to grains.
          * If the highest nibble of the entry is:
              0x0:
                  The grain is not allocated.
                  The rest of the bytes are 0.
              0x1:
                  The grain is unmapped - guest sees a zero grain.
                  The rest of the bits point to the previously mapped grain,
                  see 0x3 case.
              0x2:
                  The grain is zero.
              0x3:
                  The grain is allocated - to get the index calculate:
                  ((entry & 0x0fff000000000000) >> 48) |
                  ((entry & 0x0000ffffffffffff) << 12)
          * The difference between 0x1 and 0x2 is that 0x1 marks an
            unallocated grain resulting from the guest unmapping it (e.g.
            via sg_unmap); the grain itself still exists in the grain
            extent, and a space reclamation procedure should delete it.
            Unmapping a zero grain has no effect (0x2 will not change to
            0x1), but unmapping an unallocated grain will (0x0 to 0x1),
            naturally. (A decoding sketch follows this list.)
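
      A decoding sketch of the scheme above (helper names are
      illustrative, not QEMU's actual code):

          #include <stdint.h>

          /* Grain directory (L1) entry -> grain table index, or -1. */
          static int64_t sesparse_l1_to_l2_index(uint64_t l1_entry)
          {
              if ((l1_entry >> 60) != 0x1) {
                  return -1;              /* no grain table allocated */
              }
              return (uint32_t)l1_entry;  /* low part is the index */
          }

          /* Grain table (L2) entry -> grain index, or -1 if the grain
           * holds no data (unallocated, unmapped or zero). */
          static int64_t sesparse_l2_to_grain_index(uint64_t l2_entry)
          {
              switch (l2_entry >> 60) {
              case 0x3:                   /* allocated: recombine index */
                  return ((l2_entry & 0x0fff000000000000ULL) >> 48) |
                         ((l2_entry & 0x0000ffffffffffffULL) << 12);
              default:                    /* 0x0, 0x1, 0x2 */
                  return -1;
              }
          }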
      
      In order to implement seSparse, some fields had to be changed to
      support both 32-bit and 64-bit entry sizes.
      Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
      Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
      Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
      Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
      Message-id: 20190620091057.47441-4-shmuel.eiderman@oracle.com
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • vmdk: Reduce the max bound for L1 table size · 59d6ee48
      Sam Eiderman committed
      512M L1 entries is a very loose bound; only 32M are required to
      cover the maximal supported VMDK file size of 2TB.
      
      This also fixes qemu-iotest 59: the failure now occurs earlier, on
      an impossible L1 table size.
      Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
      Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
      Reviewed-by: Liran Alon <liran.alon@oracle.com>
      Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
      Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
      Message-id: 20190620091057.47441-3-shmuel.eiderman@oracle.com
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • vmdk: Fix comment regarding max l1_size coverage · 940a2cd5
      Sam Eiderman committed
      Commit b0651b8c ("vmdk: Move l1_size check into vmdk_add_extent")
      extended the l1_size check from VMDK4 to VMDK3 but did not update the
      default coverage in the moved comment.
      
      The previous vmdk4 calculation:
      
          (512 * 1024 * 1024) * 512(l2 entries) * 65536(grain) = 16PB
      
      The added vmdk3 calculation:
      
          (512 * 1024 * 1024) * 4096(l2 entries) * 512(grain) = 1PB
      
      Add the vmdk3 calculation to the comment.
      
      In any case, VMware does not offer virtual disks larger than 2TB
      for vmdk4/vmdk3, or 64TB for the new undocumented seSparse format,
      which is not yet implemented in QEMU.
      Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
      Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
      Reviewed-by: Liran Alon <liran.alon@oracle.com>
      Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
      Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
      Message-id: 20190620091057.47441-2-shmuel.eiderman@oracle.com
      Reviewed-by: yuchenlin <yuchenlin@synology.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • iotest 134: test cluster-misaligned encrypted write · 6ec889eb
      Anton Nefedov committed
      COW (even empty/zero) areas require encryption too.
      Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Reviewed-by: Alberto Garcia <berto@igalia.com>
      Message-id: 20190516143028.81155-1-anton.nefedov@virtuozzo.com
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • blockdev: enable non-root nodes for transaction drive-backup source · 85c9d133
      Vladimir Sementsov-Ogievskiy committed
      We forgot to enable it for the transaction's .prepare, while it is
      already enabled in do_drive_backup since commit a2d665c1
      ("blockdev: loosen restrictions on drive-backup source node").
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-id: 20190618140804.59214-1-vsementsov@virtuozzo.com
      Reviewed-by: John Snow <jsnow@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
    • nvme: do not advertise support for unsupported arbitration mechanism · 1cc354ac
      Klaus Birkelund Jensen committed
      The device mistakenly reports that the Weighted Round Robin with Urgent
      Priority Class arbitration mechanism is supported.
      
      It is not.
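
      For reference, a sketch of the capability bit in question; per the
      NVMe spec, CAP.AMS occupies bits 18:17, with bit 17 indicating WRR
      with Urgent Priority Class support (macro names here are
      illustrative, not QEMU's actual code):

          #include <stdint.h>

          #define NVME_CAP_AMS_SHIFT  17
          #define NVME_CAP_AMS_WRRU   0x1u  /* WRR w/ Urgent Priority Class */

          static uint64_t nvme_build_cap(uint64_t cap)
          {
              /* The fix amounts to no longer setting this bit, leaving
               * AMS at 0 so only plain round robin is advertised:
               *
               *     cap |= (uint64_t)NVME_CAP_AMS_WRRU << NVME_CAP_AMS_SHIFT;
               */
              return cap;
          }
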
      Signed-off-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
      Message-id: 20190606092530.14206-1-klaus@birkelund.eu
      Acked-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
  5. 21 Jun, 2019 24 commits