1. 22 Feb 2021 (10 commits)
  2. 21 Feb 2021 (21 commits)
  3. 20 Feb 2021 (9 commits)
    • Pass result element type to XlaBuilder for `mhlo.dot_general` and `mhlo.convolution` ops. · 37a13924
      Committed by Prakalp Srivastava
      The result element type of `mhlo.dot_general` and `mhlo.convolution` may differ from the operand element type; see the `preferred_element_type` attribute, which allows, for example, an i8 x i8 dot computation to produce an i32 result. The `mhlo`-to-HLO exporter should therefore pass the result element type to XlaBuilder to override XLA's shape inference (a hedged builder-level sketch follows below).
      
      PiperOrigin-RevId: 358580718
      Change-Id: If3ad34b6824a52498663f0a1a031a5bdc29a24ee
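      A minimal sketch of the builder-side call, assuming the `xla::DotGeneral` overload that accepts a preferred element type (signature per recent XLA headers; the exact form at this revision may differ):

      #include "tensorflow/compiler/xla/client/xla_builder.h"
      #include "tensorflow/compiler/xla/shape_util.h"

      // Build an s8 x s8 dot whose result is s32, overriding the element type
      // that XLA shape inference would otherwise pick.
      xla::XlaOp BuildInt8Dot(xla::XlaBuilder* b) {
        auto lhs = xla::Parameter(b, 0, xla::ShapeUtil::MakeShape(xla::S8, {4, 8}), "lhs");
        auto rhs = xla::Parameter(b, 1, xla::ShapeUtil::MakeShape(xla::S8, {8, 16}), "rhs");
        xla::DotDimensionNumbers dnums;
        dnums.add_lhs_contracting_dimensions(1);
        dnums.add_rhs_contracting_dimensions(0);
        return xla::DotGeneral(lhs, rhs, dnums, /*precision_config=*/nullptr,
                               /*preferred_element_type=*/xla::S32);
      }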
    • compat: Update forward compatibility horizon to 2021-02-20 · a957f4e1
      Committed by A. Unique TensorFlower
      PiperOrigin-RevId: 358555614
      Change-Id: I601a64ef0cdb53564397e9644f529a92132497c2
    • Update GraphDef version to 683. · c332db98
      Committed by A. Unique TensorFlower
      PiperOrigin-RevId: 358555610
      Change-Id: I1ef2f9fc4612d5596a5f3942817298447d9c0258
    • Use scoped diagnostic handler to capture diagnostics within the `lower_static_tensor_list` pass · edb1031d
      Committed by Haoliang Zhang
      Use scoped diagnostic handler to capture diagnostics within the `lower_static_tensor_list` pass when `allow_tensorlist_pass_through=true` (the general MLIR pattern is sketched below).
      
      PiperOrigin-RevId: 358555572
      Change-Id: I3545278631d992e4eb8640d7cbde3d7736c012a3
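      The general MLIR pattern in play here, as a minimal sketch (not the pass's exact code; what is done with the captured text is assumed):

      #include <string>

      #include "mlir/IR/Diagnostics.h"
      #include "mlir/IR/MLIRContext.h"

      // Install a scoped handler so diagnostics emitted while it is alive are
      // captured instead of surfacing as errors, e.g. when
      // allow_tensorlist_pass_through lets unsupported ops flow through.
      void RunWithCapturedDiagnostics(mlir::MLIRContext* ctx) {
        std::string captured;
        mlir::ScopedDiagnosticHandler handler(ctx, [&](mlir::Diagnostic& diag) {
          captured += diag.str();   // record the message for later reporting
          return mlir::success();   // mark the diagnostic as handled
        });
        // ... run the lower_static_tensor_list logic here ...
      }  // `handler` restores the previous diagnostic behavior on destruction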
    • Implement resource and variant tensor sharing across multi-subgraphs · 1636a7d0
      Committed by Jaesung Chung
      * Sharing TensorFlow resource and variant tensors across multiple subgraphs

      The existing Flex delegate handles multiple partitions within a single subgraph well, but sharing tensors across subgraphs requires care: resource and variant tensors are RAII objects, so memcpy does not work on them. Value-based tensor formats such as numbers and strings are plain memory data and can be moved between subgraphs simply by copying bytes; resource and variant tensors must instead be shared by pointer.

      * Sharing the TensorFlow tensor pointer across subgraphs

      To do that, the Flex buffer format for resource and variant tensors is simply a container storing the TensorFlow tensor object, which refers to the TensorFlow resource or variant tensor, as in the code snippet below.
      
      struct OpaqueBuffer {
        // Store a TensorFlow's tensor pointer. The life cycle of the pointer will be
        // managed by the reference counting in the TensorFlow world and the pointer
        // will be freed when all the buffer maps that own it are gone.
        const tensorflow::Tensor* tf_tensor;
      };
      
      * Life cycle of the shared TensorFlow tensor object

      When the TFLite runtime needs these tensors transferred to another subgraph, it invokes the delegate interface's CopyFromBufferHandle method. Flex's CopyFromBufferHandle implementation creates the OpaqueBuffer structure above and stores the corresponding TensorFlow tensor pointer in it. The receiving subgraph then finds the TensorFlow tensor pointer inside and inserts it into its own buffer map to track the tensor's life cycle within that subgraph (a hedged sketch of this receiving side follows the entry).

      As a result, the same TensorFlow tensor object can appear in multiple buffer maps in the Flex delegate; the tensors are freed when the Flex delegate is finalized.
      
      PiperOrigin-RevId: 358553465
      Change-Id: I7cb056bf8f216851c771d7b2f26e69821a7cbf6a
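      A hedged sketch of that receiving side, assuming the standard delegate CopyFromBufferHandle signature from tensorflow/lite/c/common.h; `LookupTensor` is a hypothetical helper standing in for the real buffer-map lookup:

      #include "tensorflow/core/framework/tensor.h"
      #include "tensorflow/lite/c/common.h"

      // Hypothetical: resolve the TensorFlow tensor tracked under a delegate
      // buffer handle in this subgraph's buffer map.
      const tensorflow::Tensor* LookupTensor(TfLiteContext* context,
                                             TfLiteBufferHandle handle);

      TfLiteStatus CopyFromResourceBufferHandle(TfLiteContext* context,
                                                TfLiteDelegate* delegate,
                                                TfLiteBufferHandle buffer_handle,
                                                TfLiteTensor* output) {
        const tensorflow::Tensor* tf_tensor = LookupTensor(context, buffer_handle);
        // Publish the pointer in the OpaqueBuffer instead of memcpy-ing bytes;
        // TensorFlow's reference counting keeps the tensor alive while any
        // buffer map still holds it.
        reinterpret_cast<OpaqueBuffer*>(output->data.raw)->tf_tensor = tf_tensor;
        return kTfLiteOk;
      }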
    • PR #46551: Add cudaMallocAsync as an option. · e8c467f4
      Committed by A. Unique TensorFlower
      Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/46551
      
      This PR adds cudaMallocAsync as an allocator option when CUDA 11.2 is used (see the sketch below for how the option is presumably enabled).
      
      PiperOrigin-RevId: 358553125
      Change-Id: Id7110f54838fafb4107f06ed1d68ce0245010a3a
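      If the switch works the way the PR discussion suggests, it is enabled via the TF_GPU_ALLOCATOR environment variable before TensorFlow initializes its GPU devices (the variable name is an assumption here; check the PR):

      #include <cstdlib>

      int main() {
        // Assumed switch: opt into the CUDA 11.2 stream-ordered allocator
        // before the TensorFlow runtime creates its GPU devices.
        setenv("TF_GPU_ALLOCATOR", "cuda_malloc_async", /*overwrite=*/1);
        // ... initialize TensorFlow after this point ...
        return 0;
      }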
    • Merge pull request #46551 from nouiz:upstream-cuda_malloc_async · b6a321d4
      Committed by TensorFlower Gardener
      PiperOrigin-RevId: 358545703
      Change-Id: I46aa9403023db7e1024350593d48d18bd30a2a2e
    • Add /tensorflow/tpu/xla_spmd_cores_per_replica metric · ee65d117
      Committed by Frank Chen
      PiperOrigin-RevId: 358540864
      Change-Id: Ie5055e71a7cceffe3534be769cd8d5e23f18cae2
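      A minimal sketch of how such a metric is typically registered with TensorFlow's monitoring library, assuming it is a zero-label gauge (the description string and setter are illustrative, not the commit's code):

      #include "tensorflow/core/lib/monitoring/gauge.h"

      // Register the gauge once; the metric name matches the commit, the
      // description text is assumed.
      auto* xla_spmd_cores_per_replica_gauge =
          tensorflow::monitoring::Gauge<tensorflow::int64, 0>::New(
              "/tensorflow/tpu/xla_spmd_cores_per_replica",
              "Number of XLA SPMD cores per replica.");

      void RecordXlaSpmdCoresPerReplica(tensorflow::int64 num_cores) {
        xla_spmd_cores_per_replica_gauge->GetCell()->Set(num_cores);
      }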
    • Reduce the number of private symbols Keras relies on, by switching specific usages of Trackable data structures to the generic `wrap_or_unwrap` method · 83fe4bb5
      Committed by Tomer Kaftan
      Reduce the number of private symbols Keras relies on by switching specific usages of Trackable data structures to the generic `wrap_or_unwrap` method, which decides automatically based on the input type.
      
      PiperOrigin-RevId: 358532293
      Change-Id: Id85a1082f1a9bfc9b6e025c57fb078e960e0db8b