1. 09 Sep, 2010 28 commits
    • dccbe6fb
    • virtio-9p: Change handling of flags in open() path for 9P2000.L · 630c2689
      Committed by Sripathi Kodi
      This patch applies on top of 9P2000.L patches that we have on the list.
      I took a look at how 9P server is handling open() flags in 9P2000.L path.
      I think we can do away with the valid_flags() function and simplify the
      code. The reasoning is as follows:
      
      O_NOCTTY: (If the file is a terminal, don't make it the controlling
      terminal of the process even though the process does not have a controlling
      terminal) By the time the control reaches 9P client it is clear that what
      we have is not a terminal device. Hence it does not matter what we do with
      this flag. In any case 9P server can filter this flag out before making the
      syscall.
      
      O_NONBLOCK: (Don't block if i) Can't read/write to the file ii) Can't get
      locks) This has an impact on FIFOs, but also on file locks. Hence we can
      pass it down to the system call.
      
      O_ASYNC: From the manpage:
      
         O_ASYNC
                Enable signal-driven I/O: generate a signal (SIGIO by default,  but
                this  can be changed via fcntl(2)) when input or output becomes pos-
                sible on this file descriptor.  This feature is only available  for
                terminals,  pseudo-terminals,  sockets,  and (since Linux 2.6) pipes
                and FIFOs.  See fcntl(2) for further details.
      
      Again, this does not make any impact on regular files handled by 9P. Also,
      we don't want 9P server to receive SIGIO. Hence I think 9P server can
      filter this flag out before making the syscall.
      
      O_CLOEXEC: This flag makes sense only on the client. If guest user space
      sets this flag the guest VFS will take care of calling close() on the fd if
      an exec() happens. Hence 9P client need not be bothered with this flag.
      Also I think QEMU will not do an exec, but if it does, it makes sense to
      close these fds. Hence we can pass this flag down to the syscall.
      
      O_CREAT: Since we are in open() path it means we have confirmed that the file
      exists. Hence there is no need to pass O_CREAT flag down to the system. In fact
      on some versions of glibc this causes problems, because we pass O_CREAT flag,
      but don't have permission bits. Hence we can just mask this flag out.
      
      So in summary:
      
      Mask out:
      O_NOCTTY
      O_ASYNC
      O_CREAT
      
      Pass-through:
      O_NONBLOCK
      O_CLOEXEC
      Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
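The summary above can be sketched as a small flag filter. This is a minimal illustration, not the code from the patch: the helper name `filter_lopen_flags` is hypothetical, and the only thing taken from the commit message is which flags are masked out and which are passed through.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes O_ASYNC and O_CLOEXEC on glibc */
#endif
#include <assert.h>
#include <fcntl.h>

/* Hypothetical helper (not from the patch): drop the flags the commit
 * message says the server should mask out before calling open(2). */
static int filter_lopen_flags(int flags)
{
    return flags & ~(O_NOCTTY | O_ASYNC | O_CREAT);
}

/* Usage: masked flags disappear, pass-through flags survive. */
static void filter_lopen_flags_example(void)
{
    assert(filter_lopen_flags(O_RDWR | O_CREAT | O_NOCTTY | O_ASYNC) == O_RDWR);
    assert(filter_lopen_flags(O_WRONLY | O_NONBLOCK | O_CLOEXEC) ==
           (O_WRONLY | O_NONBLOCK | O_CLOEXEC));
}
```

O_NONBLOCK and O_CLOEXEC are simply left untouched, which matches the "Pass-through" list.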
    •
    •
    • 9ed3ef26
    • virtio-9p: Fix the memset usage · 783f04e1
      Committed by Aneesh Kumar K.V
      The arguments are wrong. Use qemu_mallocz directly.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    • virtio-9p: Use lchown which won't follow symlink · 5c0f255d
      Committed by Aneesh Kumar K.V
      We should always use functions which don't follow
      symlinks on the server.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    • virtio-9p: Add SM_NONE security model · 12848bfc
      Committed by Aneesh Kumar K.V
      This is equivalent to the SM_PASSTHROUGH security model.
      The only exception is that failures of privileged operations like chown
      are ignored. This makes a passthrough-like security model usable
      for people who run kvm as non-root.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    • virtio-9p: Hide user.virtfs xattr in case of mapped security. · 61b6c499
      Committed by Aneesh Kumar K.V
      With the mapped security model, the "user.virtfs" namespace is used
      to store the virtFs related attributes, so hide it from the user.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    • virtio-9p: Implement TXATTRCREATE · 10b468bd
      Committed by Aneesh Kumar K.V
      TXATTRCREATE:  Prepare a fid for setting xattr value on a file system object.
      
       size[4] TXATTRCREATE tag[2] fid[4] name[s] attr_size[8] flags[4]
       size[4] RXATTRCREATE tag[2]
      
      txattrcreate gets a fid pointing to xattr. This fid can later be
      used to set the xattr value.
      
      The flags value is derived from Linux setxattr. The manpage says
      "The flags parameter can be used to refine the semantics of the operation.
      XATTR_CREATE specifies a pure create, which fails if the named attribute
      exists already. XATTR_REPLACE specifies a pure replace operation, which
      fails if the named attribute does not already exist. By default (no flags),
      the extended attribute will be created if need be, or will simply replace
      the value if the attribute exists."
      
      The actual setxattr operation happens when the fid is clunked. At that point
      the written byte count and the attr_size specified in TXATTRCREATE must be
      the same; otherwise an error is returned.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
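The clunk-time size check described above can be sketched as follows. This is an illustration under assumptions, not the patch itself: the struct and function names are hypothetical, and returning `-EINVAL` is an assumed choice, since the commit message only says that "an error" is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-fid state for a pending xattr write. */
struct xattr_fid {
    uint64_t attr_size; /* size announced in TXATTRCREATE */
    uint64_t written;   /* bytes received by later writes on the fid */
};

/* Account for one write on the xattr fid. */
static void xattr_fid_write(struct xattr_fid *f, uint64_t count)
{
    assert(f != NULL);
    f->written += count;
}

/* Run at clunk time: 0 means setxattr may proceed; a negative errno
 * means the byte count does not match (EINVAL is an assumption). */
static int xattr_fid_clunk_check(const struct xattr_fid *f)
{
    assert(f != NULL);
    return f->written == f->attr_size ? 0 : -EINVAL;
}
```

Deferring the check to clunk lets the value arrive in several writes while setxattr still sees one complete buffer.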
    • virtio-9p: Implement TXATTRWALK · fa32ef88
      Committed by Aneesh Kumar K.V
      TXATTRWALK: Descend an ATTR namespace
      
       size[4] TXATTRWALK tag[2] fid[4] newfid[4] name[s]
       size[4] RXATTRWALK tag[2] size[8]
      
      txattrwalk gets a fid pointing to xattr. This fid can later be
      used to read the xattr value. If name is NULL, the fid returned
      can be used to get the list of extended attributes associated with
      the file system object.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    • virtio-9p: Add fidtype so that we can do type specific operation · d62dbb51
      Committed by Aneesh Kumar K.V
      We want to add type-specific operations during read/write.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    • [virtio-9p] qemu: virtio-9p: Implement LOPEN · 771e9d4c
      Committed by M. Mohan Kumar
      Implement the 9P2000.L version of the open (LOPEN) interface in the QEMU 9P server.
      
      For LOPEN, there is no need to convert the flags between 9P mode and VFS mode.
      
      Synopsis:
      
          size[4] Tlopen tag[2] fid[4] mode[4]
      
          size[4] Rlopen tag[2] qid[13] iounit[4]
      
      The current QEMU 9P server does not support the following flags:
          O_NOCTTY, O_NONBLOCK, O_ASYNC & O_CLOEXEC
      
      [Fix mode format - jvrao@linux.vnet.ibm.com]
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • rename - change name of file or directory · c7b4b0b3
      Committed by M. Mohan Kumar
      size[4] Trename tag[2] fid[4] newdirfid[4] name[s]
      size[4] Rrename tag[2]
      
      Implement the 2000.L rename operation. A new function,
      v9fs_complete_rename, is introduced that acts as a common entry point
      for the 2000.L rename operation and the 2000.U rename operation (via wstat).
      As part of this change, the field 'nname' (used only for rename) is
      removed from the structure V9fsWstatState. Instead, a new structure,
      V9fsRenameState, is used for rename operations by both the 2000.U and
      2000.L code paths. Both code paths construct the V9fsRenameState
      structure and pass it to the v9fs_complete_rename function.
      
      Changes from previous version:
       Use qemu_mallocz to initialize
       Use strcpy,strcat functions instead of memcpy
       Changed the variable name to newdirfid
       Introduced post rename function
       Error checking
       Removed nname field from V9fsWstatState
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • qemu: virtio-9p: Implement TMKDIR · b67592ea
      Committed by M. Mohan Kumar
      Synopsis
      
          size[4] Tmkdir tag[2] fid[4] name[s] mode[4] gid[4]
      
          size[4] Rmkdir tag[2] qid[13]
      
      Description
      
          mkdir asks the file server to create a directory with given name,
          mode and gid. The qid for the new directory is returned with
          the mkdir reply message.
      
      Note: 72 is selected as the opcode for TMKDIR from the reserved list.
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      [jvrao@linux.vnet.ibm.com: Fix perm handling when creating directory]
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
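To make the Tmkdir synopsis concrete, here is a sketch of how such a request could be laid out on the wire. It relies on general 9P conventions (integers are little-endian, the message name in the synopsis stands for a 1-byte type, and a string `[s]` is a 2-byte length followed by the bytes). The function names are hypothetical and this is not the QEMU marshalling code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define P9_TMKDIR 72 /* opcode named in the commit message */

/* Write the low nbytes of v in little-endian order; return the next slot. */
static uint8_t *put_le(uint8_t *p, uint64_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        *p++ = (uint8_t)(v >> (8 * i));
    }
    return p;
}

/* Encode size[4] type[1] tag[2] fid[4] name[s] mode[4] gid[4].
 * Returns the message size, or 0 if it does not fit in buf. */
static size_t encode_tmkdir(uint8_t *buf, size_t cap, uint16_t tag,
                            uint32_t fid, const char *name,
                            uint32_t mode, uint32_t gid)
{
    assert(buf != NULL && name != NULL);
    size_t nlen = strlen(name);
    size_t total = 4 + 1 + 2 + 4 + 2 + nlen + 4 + 4;
    if (nlen > UINT16_MAX || total > cap) {
        return 0;
    }
    uint8_t *p = buf;
    p = put_le(p, total, 4);
    p = put_le(p, P9_TMKDIR, 1);
    p = put_le(p, tag, 2);
    p = put_le(p, fid, 4);
    p = put_le(p, nlen, 2);
    memcpy(p, name, nlen);
    p += nlen;
    p = put_le(p, mode, 4);
    put_le(p, gid, 4);
    return total;
}
```

For a three-character name the message is 24 bytes long: 21 bytes of fixed fields plus the name itself.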
    • qemu: virtio-9p: Implement TMKNOD · 5268cecc
      Committed by M. Mohan Kumar
      Implement TMKNOD as part of 2000.L Work
      
      Synopsis
      
          size[4] Tmknod tag[2] fid[4] name[s] mode[4] major[4] minor[4] gid[4]
      
          size[4] Rmknod tag[2] qid[13]
      
      Description
      
          mknod asks the file server to create a device node with given device
          type, mode and gid. The qid for the new device node is returned with
          the mknod reply message.
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • [virtio-9p] This patch implements TLCREATE for 9p2000.L protocol. · c1568af5
      Committed by Venkateswararao Jujjuri (JV)
      SYNOPSIS
      
          size[4] Tlcreate tag[2] fid[4] name[s] flags[4] mode[4] gid[4]
      
          size[4] Rlcreate tag[2] qid[13] iounit[4]
      
      DESCRIPTION
      
      The Tlcreate request asks the file server to create a new regular file with the
      name supplied, in the directory (dir) represented by fid.
      The mode argument specifies the permissions to use. The new file is created with
      the uid of the fid and with the supplied gid.
      
      The flags argument represents the Linux access mode flags with which the caller
      is requesting to open the file. The protocol allows all the Linux access
      modes, but it is up to the server to allow or disallow any of them.
      If the server doesn't support one of the access modes, it is expected to
      return an error.
      
      To start with, we will not restrict or limit any Linux flags on this server.
      If needed, we can start restricting as we move forward with various use cases.
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • [virtio-9p] Define and implement TSYMLINK for 9P2000.L · 08c60fc9
      Committed by Venkateswararao Jujjuri (JV)
      This patch implements creating a symlink for TSYMLINK request
      and responds with RSYMLINK. In the case of error, we return RERROR.
      
      SYNOPSIS
      
          size[4] Tsymlink tag[2] fid[4] name[s] symtgt[s] gid[4]
      
          size[4] Rsymlink tag[2] qid[13]
      
          DESCRIPTION
      
          Create a symbolic link named 'name' pointing to 'symtgt'.
          gid represents the effective group id of the caller.
          The permissions of a symbolic link are irrelevant, hence they are omitted
          from the protocol.
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • [virtio-9p] Implement TLINK for 9P2000.L · b2c224be
      Committed by Venkateswararao Jujjuri (JV)
      Create a hard link.
      
      SYNOPSIS
      
      size[4] Tlink tag[2] dfid[4] oldfid[4] newpath[s]
      
      size[4] Rlink tag[2]
      
      DESCRIPTION
      
      Create a link 'newpath' in the directory pointed to by dfid, linking to the path of oldfid.
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • virtio-9p: Implement server side of setattr for 9P2000.L protocol. · c79ce737
      Committed by Sripathi Kodi
      SYNOPSIS
      
            size[4] Tsetattr tag[2] attr[n]
      
            size[4] Rsetattr tag[2]
      
         DESCRIPTION
      
            The setattr command changes some of the file status information.
            attr resembles the iattr structure used in Linux kernel. It
            specifies which status parameter is to be changed and to what
            value. It is laid out as follows:
      
               valid[4]
                  specifies which status information is to be changed. Possible
                  values are:
                  ATTR_MODE       (1 << 0)
                  ATTR_UID        (1 << 1)
                  ATTR_GID        (1 << 2)
                  ATTR_SIZE       (1 << 3)
                  ATTR_ATIME      (1 << 4)
                  ATTR_MTIME      (1 << 5)
                  ATTR_CTIME      (1 << 6)
                  ATTR_ATIME_SET  (1 << 7)
                  ATTR_MTIME_SET  (1 << 8)
      
                  The last two bits indicate whether the time information
                  is being sent by the client's user space. In the absence
                  of these bits the server always uses the server's time.
      
               mode[4]
                  File permission bits
      
               uid[4]
                  Owner id of file
      
               gid[4]
                  Group id of the file
      
               size[8]
                  File size
      
               atime_sec[8]
                  Time of last file access, seconds
      
               atime_nsec[8]
                  Time of last file access, nanoseconds
      
               mtime_sec[8]
                  Time of last file modification, seconds
      
               mtime_nsec[8]
                  Time of last file modification, nanoseconds
      
      Explanation of the patches:
      --------------------------
      
      *) The kernel just copies the relevant contents of the iattr structure to the
         p9_iattr_dotl structure and passes it down to the client. The only check it
         has is calling inode_change_ok().
      *) The p9_iattr_dotl structure does not have ctime and ia_file parameters because
         I don't think these are needed in our case. The client user space can request
         updating just ctime by calling chown(fd, -1, -1). This is handled on the server
         side without a need for putting ctime on the wire.
      *) The server currently supports changing the mode, time, ownership and size of
         the file.
      *) The 9P RFC says "Either all the changes in wstat request happen, or none of them
         does: if the request succeeds, all changes were made; if it fails, none were."
         I have not done anything to implement this specifically because I don't see
         a reason.
      
      [jvrao@linux.vnet.ibm.com: Parts of code for handling chown(-1,-1)]
      Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
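The handling of the time bits in valid[4] can be sketched as below. This is an assumption-laden illustration rather than the server code: the function name is hypothetical, and mapping "server's time" to `UTIME_NOW` and "not requested" to `UTIME_OMIT` (both from utimensat(2)) is one plausible way to express the rule from the description.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h> /* UTIME_NOW, UTIME_OMIT */
#include <time.h>

/* Bits of valid[4] that concern timestamps, as listed in the message. */
#define ATTR_ATIME     (1 << 4)
#define ATTR_MTIME     (1 << 5)
#define ATTR_ATIME_SET (1 << 7)
#define ATTR_MTIME_SET (1 << 8)

/* Pick one timestamp: skip it, take the client's value, or use "now". */
static struct timespec pick_time(uint32_t valid, uint32_t want, uint32_t set,
                                 struct timespec client)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = UTIME_OMIT };
    if (valid & want) {
        if (valid & set) {
            ts = client;            /* time supplied by client user space */
        } else {
            ts.tv_nsec = UTIME_NOW; /* assumed mapping of "server's time" */
        }
    }
    return ts;
}

/* Hypothetical helper: fill the timespec pair to hand to utimensat(2). */
static void setattr_times(uint32_t valid, struct timespec client_atime,
                          struct timespec client_mtime, struct timespec out[2])
{
    assert(out != NULL);
    out[0] = pick_time(valid, ATTR_ATIME, ATTR_ATIME_SET, client_atime);
    out[1] = pick_time(valid, ATTR_MTIME, ATTR_MTIME_SET, client_mtime);
}
```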
    • [virtio-9p] Make v9fs_do_utimensat accept timespec structures instead of v9stat. · 8fc39ae4
      Committed by Sripathi Kodi
      Currently v9fs_do_utimensat takes a V9fsStat argument and builds
      timespec structures. It sets tv_nsec values to 0 by default. Instead
      of this it should take struct timespec[2] and pass it down to the
      system directly. This will make it more generic and useful
      elsewhere.
      Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • virtio-9p: Do not reset atime · 74bc02b2
      Committed by M. Mohan Kumar
      The current code resets a file's atime to 0 when there is a change in mtime.
      This results in resetting the atime to "1970-01-01 05:30:00". For
      example, truncate -s 0 filename changes the mtime to the
      truncate time, but resets the atime to "1970-01-01 05:30:00". The utime
      system call does not have any provision to set only mtime or atime, so
      change the v9fs_wstat_post_chmod function to use utimensat to change
      the atime and mtime fields. If the tv_nsec field is set to the special value
      "UTIME_OMIT", the corresponding file timestamp is not updated.
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
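The UTIME_OMIT behaviour the fix relies on can be demonstrated with a short sketch. The helper names are hypothetical and the self-check assumes a writable `/tmp`. It shows the utimensat(2) technique only, not the body of v9fs_wstat_post_chmod.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>    /* AT_FDCWD */
#include <stdlib.h>   /* mkstemp */
#include <sys/stat.h> /* utimensat, UTIME_OMIT, stat */
#include <time.h>
#include <unistd.h>   /* close, unlink */

/* Hypothetical helper: update only mtime and leave atime untouched. */
static int set_mtime_only(const char *path, time_t mtime_sec)
{
    assert(path != NULL);
    struct timespec ts[2] = {
        { .tv_sec = 0, .tv_nsec = UTIME_OMIT }, /* atime: do not update */
        { .tv_sec = mtime_sec, .tv_nsec = 0 },  /* mtime: new value */
    };
    return utimensat(AT_FDCWD, path, ts, 0);
}

/* Self-check on a temp file: returns 0 if atime survives an mtime update. */
static int demo_atime_preserved(void)
{
    char path[] = "/tmp/p9-atime-XXXXXX";
    struct timespec both[2] = { { .tv_sec = 1000 }, { .tv_sec = 2000 } };
    struct stat st;
    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    close(fd);
    int ok = utimensat(AT_FDCWD, path, both, 0) == 0
          && set_mtime_only(path, 3000) == 0
          && stat(path, &st) == 0
          && st.st_atime == 1000 && st.st_mtime == 3000;
    unlink(path);
    return ok ? 0 : -1;
}
```

With utime(2) both timestamps have to be supplied, which is how the atime ended up being reset to the epoch.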
    • virtio-9p: getattr server implementation for 9P2000.L protocol. · 00ede4c2
      Committed by Sripathi Kodi
                 SYNOPSIS
      
                    size[4] Tgetattr tag[2] fid[4] request_mask[8]
      
                    size[4] Rgetattr tag[2] lstat[n]
      
                 DESCRIPTION
      
                    The getattr transaction inquires about the file identified by fid.
                    request_mask is a bit mask that specifies which fields of the
                    stat structure the client is interested in.
      
                    The reply will contain a machine-independent directory entry,
                    laid out as follows:
      
                       st_result_mask[8]
                          Bit mask that indicates which fields in the stat structure
                          have been populated by the server
      
                       qid.type[1]
                          the type of the file (directory, etc.), represented as a bit
                          vector corresponding to the high 8 bits of the file's mode
                          word.
      
                       qid.vers[4]
                          version number for given path
      
                       qid.path[8]
                          the file server's unique identification for the file
      
                       st_mode[4]
                          Permission and flags
      
                       st_uid[4]
                          User id of owner
      
                       st_gid[4]
                          Group ID of owner
      
                       st_nlink[8]
                          Number of hard links
      
                       st_rdev[8]
                          Device ID (if special file)
      
                       st_size[8]
                          Size, in bytes
      
                       st_blksize[8]
                          Block size for file system IO
      
                       st_blocks[8]
                          Number of file system blocks allocated
      
                       st_atime_sec[8]
                          Time of last access, seconds
      
                       st_atime_nsec[8]
                          Time of last access, nanoseconds
      
                       st_mtime_sec[8]
                          Time of last modification, seconds
      
                       st_mtime_nsec[8]
                          Time of last modification, nanoseconds
      
                       st_ctime_sec[8]
                          Time of last status change, seconds
      
                       st_ctime_nsec[8]
                          Time of last status change, nanoseconds
      
                       st_btime_sec[8]
                          Time of creation (birth) of file, seconds
      
                       st_btime_nsec[8]
                          Time of creation (birth) of file, nanoseconds
      
                       st_gen[8]
                          Inode generation
      
                       st_data_version[8]
                          Data version number
      
                    request_mask and result_mask bit masks contain the following bits
                       #define P9_STATS_MODE          0x00000001ULL
                       #define P9_STATS_NLINK         0x00000002ULL
                       #define P9_STATS_UID           0x00000004ULL
                       #define P9_STATS_GID           0x00000008ULL
                       #define P9_STATS_RDEV          0x00000010ULL
                       #define P9_STATS_ATIME         0x00000020ULL
                       #define P9_STATS_MTIME         0x00000040ULL
                       #define P9_STATS_CTIME         0x00000080ULL
                       #define P9_STATS_INO           0x00000100ULL
                       #define P9_STATS_SIZE          0x00000200ULL
                       #define P9_STATS_BLOCKS        0x00000400ULL
      
                       #define P9_STATS_BTIME         0x00000800ULL
                       #define P9_STATS_GEN           0x00001000ULL
                       #define P9_STATS_DATA_VERSION  0x00002000ULL
      
                       #define P9_STATS_BASIC         0x000007ffULL
                       #define P9_STATS_ALL           0x00003fffULL
      
              This patch implements the client side of getattr implementation for 9P2000.L.
              It introduces a new structure p9_stat_dotl for getting Linux stat information
              along with QID. The data layout is similar to stat structure in Linux user
              space with the following major differences:
      
              inode (st_ino) is not part of data. Instead qid is.
      
              device (st_dev) is not part of data because this doesn't make sense on the
              client.
      
              All time variables are 64 bit wide on the wire. The kernel seems to use
              32 bit variables for these variables. However, some of the architectures
              have used 64 bit variables and glibc exposes 64 bit variables to user
              space on some architectures. Hence to be on the safer side we have made
              these 64 bit in the protocol. Refer to the comments in
              include/asm-generic/stat.h
      
              There are some additional fields: st_btime_sec, st_btime_nsec, st_gen,
              st_data_version apart from the bitmask, st_result_mask. The bit mask
              is filled by the server to indicate which stat fields have been
              populated by the server. Currently there is no clean way for the
              server to obtain these additional fields, so it sends back just the
              basic fields.
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
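The relationship between request_mask and st_result_mask can be sketched as a single mask operation. The constants are copied from the message above. The function name is hypothetical, and intersecting the request with P9_STATS_BASIC is an assumption based on the statement that the server sends back just the basic fields.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of the mask bits listed in the commit message. */
#define P9_STATS_MODE   0x00000001ULL
#define P9_STATS_SIZE   0x00000200ULL
#define P9_STATS_BTIME  0x00000800ULL
#define P9_STATS_BASIC  0x000007ffULL
#define P9_STATS_ALL    0x00003fffULL

/* Hypothetical helper: a server that can only fill the basic fields
 * reports the intersection of the request with P9_STATS_BASIC. */
static uint64_t getattr_result_mask(uint64_t request_mask)
{
    /* In this sketch, bits outside P9_STATS_ALL are treated as a caller bug. */
    assert((request_mask & ~P9_STATS_ALL) == 0);
    return request_mask & P9_STATS_BASIC;
}
```

A client that asks for P9_STATS_BTIME would then see that bit cleared in the reply and know the birth time was not populated.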
    • virtio-9p: Compute iounit based on host filesystem block size · 5e94c103
      Committed by M. Mohan Kumar
      Compute iounit based on the host filesystem block size and pass it to
      the client with the open/create response. Also return iounit as statfs's
      f_bsize for optimal block size transfers.
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Reviewed-by: Sripathi Kodi <sripathik@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • [V4] virtio-9p: readdir implementation for 9p2000.L · c18e2f94
      Committed by Sripathi Kodi
      This patch implements the server part of readdir() implementation for
      9p2000.L
      
          SYNOPSIS
      
          size[4] Treaddir tag[2] fid[4] offset[8] count[4]
          size[4] Rreaddir tag[2] count[4] data[count]
      
          DESCRIPTION
      
          The readdir request asks the server to read the directory specified by 'fid'
          at an offset specified by 'offset' and return as many dirent structures as
          possible that fit into count bytes. Each dirent structure is laid out as
          follows.
      
                  qid.type[1]
                    the type of the file (directory, etc.), represented as a bit
                    vector corresponding to the high 8 bits of the file's mode
                    word.
      
                  qid.vers[4]
                    version number for given path
      
                  qid.path[8]
                    the file server's unique identification for the file
      
                  offset[8]
                    offset into the next dirent.
      
                  type[1]
                    type of this directory entry.
      
                  name[256]
                    name of this directory entry.
      Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
      Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • virtio-9p: Return correct error from v9fs_remove · 926487b7
      Committed by Sripathi Kodi
      Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
      
      In v9fs_remove_post_remove() we currently ignore the error returned by
      the previous call to remove() and return an error only if freeing the
      fid fails. However, the client expects to see the error from remove().
      Currently the client falsely thinks that the remove call has always
      succeeded. For example, doing rmdir on a non-empty directory does
      not return ENOTEMPTY.
      
      With this patch we ignore the error from free_fid(). The client cannot
      use this error value anyway.
      Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
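The error-propagation rule described above can be sketched like this. It is a simplified stand-in for v9fs_remove_post_remove(), with hypothetical names: the error from remove() is what the caller sees, and the result of freeing the fid is deliberately dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h> /* remove */

/* Hypothetical sketch: report the remove() error, ignore free_fid's. */
static int remove_and_free(const char *path, int (*free_fid)(void))
{
    assert(path != NULL && free_fid != NULL);
    /* Capture errno right away so the later call cannot clobber it. */
    int err = remove(path) < 0 ? -errno : 0;
    (void)free_fid(); /* the client cannot use this error value anyway */
    return err;
}

/* Stand-in fid cleanup that always fails, to show its error is ignored. */
static int failing_free_fid(void)
{
    return -EINVAL;
}
```

Called on a non-empty directory, the same function would pass the failure from remove() through to the client instead of reporting success.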
    • qemu: virtio-9p: Implement statfs support in server · be940c87
      Committed by M. Mohan Kumar
      Implement statfs support in the QEMU server based on Sripathi's
      initial statfs patch.
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
    • qemu: virtio-9p: Recognize 9P2000.L protocol · 84151514
      Committed by M. Mohan Kumar
      Make the 9P server recognize the 9P2000.L protocol version.
      Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
      Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
  2. 08 Sep, 2010 7 commits
    • virtio-net: Introduce a new bottom half packet TX · a697a334
      Committed by Alex Williamson
      Based on a patch from Mark McLoughlin, this patch introduces a new
      bottom half packet transmitter that avoids the latency imposed by
      the tx_timer approach.  Rather than scheduling a timer when a TX
      packet comes in, schedule a bottom half to be run from the iothread.
      The bottom half handler first attempts to flush the queue with
      notification disabled (this is where we could race with a guest
      without txburst).  If we flush a full burst, reschedule immediately.
      If we send short of a full burst, try to re-enable notification.
      To avoid a race with TXs that may have occurred, we must then
      flush again.  If we find some packets to send, the guest is probably
      active, so we can reschedule again.
      
      tx_timer and tx_bh are mutually exclusive, so we can re-use the
      tx_waiting flag to indicate one or the other needs to be set up.
      This allows us to seamlessly migrate between timer and bh TX
      handling.
      
      The bottom half handler becomes the new default and we add a new
      tx= option to virtio-net-pci.  Usage:
      
      -device virtio-net-pci,tx=timer # select timer mitigation vs "bh"
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
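The bottom half logic described above can be sketched as a small single-threaded simulation. All names here are hypothetical, and the `late` field merely simulates packets the guest queues while notification is still disabled. The real handler works on virtqueues and QEMU bottom halves.

```c
#include <assert.h>
#include <stdbool.h>

struct txq {
    int pending;         /* packets waiting in the queue */
    int late;            /* simulated arrivals racing with the first flush */
    bool notify_enabled; /* whether the guest will notify on new packets */
};

enum bh_result { BH_IDLE, BH_RESCHEDULE };

/* Send at most `burst` packets; return how many were sent. */
static int flush_tx(struct txq *q, int burst)
{
    int sent = q->pending < burst ? q->pending : burst;
    q->pending -= sent;
    return sent;
}

static enum bh_result tx_bh(struct txq *q, int burst)
{
    assert(burst > 0);
    q->notify_enabled = false;
    if (flush_tx(q, burst) >= burst) {
        return BH_RESCHEDULE;      /* full burst: more work is likely */
    }
    q->pending += q->late;         /* simulation of the race window */
    q->late = 0;
    q->notify_enabled = true;      /* short burst: re-enable notification */
    if (flush_tx(q, burst) > 0) {  /* flush again to close the race */
        q->notify_enabled = false; /* guest is active, keep polling */
        return BH_RESCHEDULE;
    }
    return BH_IDLE;
}
```

The second flush is what closes the race: a packet queued just before notification was re-enabled would otherwise sit in the queue with nothing scheduled to send it.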
    • virtio-net: Rename tx_timer_active to tx_waiting · 4b4b8d36
      Committed by Alex Williamson
      De-couple this from the timer since we might want to use
      different backends to send the packet.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • virtio-net: Limit number of packets sent per TX flush · e3f30488
      Committed by Alex Williamson
      If virtio_net_flush_tx() is called with notification disabled, we can
      race with the guest, processing packets at the same rate as they
      get produced.  The trouble is that this means we have no guaranteed
      exit condition from the function and can spend minutes in there.
      Currently flush_tx is only called with notification on, which seems
      to limit us to one pass through the queue per call.  An upcoming
      patch changes this.
      
      Also add an option to set this value on the command line as different
      workloads may wish to use different values.  We can't necessarily
      support any random value, so this is a developer option: x-txburst=
      Usage:
      
      -device virtio-net-pci,x-txburst=64 # 64 packets per tx flush
      
      One pass through the queue (256) seems to be a good default value
      for this, balancing latency with throughput.  We use a signed int
      for x-txburst because 2^31 packets in a burst would take many, many
      minutes to process and it allows us to easily return a negative
      value from virtio_net_flush_tx() to indicate a back-off
      or error condition.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
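A bounded flush loop of the kind described above could look like the sketch below. The function name and the `busy` parameter are hypothetical, and `-EBUSY` is an assumed choice of negative return value. The message only says that a negative value signals a back-off or error condition.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch: send at most `burst` packets per call so the
 * function always has an exit condition, even if the guest keeps
 * producing packets as fast as they are consumed. */
static int flush_tx_bounded(int *queued, int burst, bool busy)
{
    assert(queued != NULL && burst > 0);
    if (busy) {
        return -EBUSY; /* assumed back-off signal; fits in a signed int */
    }
    int sent = 0;
    while (*queued > 0 && sent < burst) {
        (*queued)--;   /* stands in for transmitting one packet */
        sent++;
    }
    return sent;
}
```

A return value equal to `burst` tells the caller that packets may remain, while a smaller value means the queue was drained.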
    • virtio-net: Make tx_timer timeout configurable · f0c07c7c
      Committed by Alex Williamson
      Add an option to make the TX mitigation timer adjustable as a device
      option.  The 150us hard-coded default used currently is reasonable,
      but may not be suitable for all workloads; this gives us a way to
      adjust it using a single binary.  We can't support any random option
      though, so use the "x-" prefix to indicate this is a developer
      option.  Usage:
      
      -device virtio-net-pci,x-txtimer=500000,... # .5ms timeout
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost_net: mergeable buffers support · ca736c8e
      Committed by Michael S. Tsirkin
      Use the new tap APIs to set the header length.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • tap: add APIs for vnet header length · 445d892f
      Committed by Michael S. Tsirkin
      Add APIs to control the host header length. The first user
      will be vhost-net.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • tap: generalize code for different vnet header len · ef4252b1
      Committed by Michael S. Tsirkin
      Make the host vnet header length a structure field in
      preparation for using this support in the Linux kernel.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  3. 06 Sep, 2010 1 commit
  4. 04 Sep, 2010 4 commits