- 20 Jul 2023: 40 commits
-
Committed by pjpratik
The nightly versions say `experimental_from_jax` is deprecated and that `Jax2TF` is the recommended way of converting JAX models to TFLite. Added an option in the example to use `Jax2TF` for TFLite conversion via concrete functions. Thanks.
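For context, a minimal sketch of that `Jax2TF` conversion path, with a hypothetical toy JAX function standing in for a real model:

```python
import jax.numpy as jnp
import tensorflow as tf
from jax.experimental import jax2tf

def jax_model(x):
    # Hypothetical stand-in for a real JAX model.
    return jnp.tanh(x @ jnp.ones((4, 2)))

# Wrap the converted function and trace a concrete function.
tf_fn = tf.function(
    jax2tf.convert(jax_model, enable_xla=False),
    input_signature=[tf.TensorSpec((1, 4), tf.float32)],
    autograph=False,
)
concrete_fn = tf_fn.get_concrete_function()

# Convert from the concrete function rather than the deprecated
# experimental_from_jax entry point.
converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [concrete_fn], tf_fn)
tflite_model = converter.convert()
```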
-
Committed by Richard Levasseur
PiperOrigin-RevId: 549538997
-
Committed by Johannes Reifferscheid
I need to emit calls to nested computations outside of the scope of an ir_emitter (and it's better to have these things be stateless whenever possible anyway). PiperOrigin-RevId: 549538403
-
Committed by TensorFlower Gardener
PiperOrigin-RevId: 549534080
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 549534066
-
Committed by Adrian Kuegel
PiperOrigin-RevId: 549533669
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 549520448
-
Committed by Son Tuan Vu
PiperOrigin-RevId: 549509169
-
Committed by Scott Zhu
PiperOrigin-RevId: 549508753
-
Committed by Haibo Huang
Many TF API users depend on the two methods above to parse TF_Status, but the implementation accesses the internals of TF_Status directly, which breaks the ABI between TF and other libraries. This change modifies `Set_TF_Status_from_Status` and `StatusFromTF_Status` to go through the API instead. There is a small overhead in this conversion, because a new absl::Status is constructed and strings are copied, but I believe we should always bias towards correctness. TF-internal users who can see the layout of `TF_Status` should use `TF_Status::status` directly to avoid these limitations. PiperOrigin-RevId: 549498469
-
Committed by Eugene Zhulenev
Add passes for lowering from LMHLO to the IREEInput dialect so that XLA:GPU executables can run on top of the IREE runtime (VM + HAL). These passes are currently not enabled in the open-source build because they require setting up an XLA->IREE dependency in Bazel; this will be done separately. For now we only build and test them using internal Google infrastructure. PiperOrigin-RevId: 549487443
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 549486441
-
Committed by Yishuang Pang
PiperOrigin-RevId: 549483740
-
Committed by A. Unique TensorFlower
[AutoSharding] Add an option to allow generating fully replicated strategies for dot and convolution HLO ops. PiperOrigin-RevId: 549483733
-
Committed by Tianrun Li
DTensor api.relayout and collectives need to recognize the new layout type. PiperOrigin-RevId: 549477658
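For context, a minimal sketch of the `dtensor.relayout` API this touches, assuming eight virtual CPU devices (the new layout type itself is not shown here):

```python
import tensorflow as tf
from tensorflow.experimental import dtensor

# Split the physical CPU into 8 logical devices for a local mesh.
phys = tf.config.list_physical_devices("CPU")[0]
tf.config.set_logical_device_configuration(
    phys, [tf.config.LogicalDeviceConfiguration()] * 8)
mesh = dtensor.create_mesh([("batch", 8)], device_type="CPU")

# Create a replicated DTensor, then relayout it to be sharded on "batch".
x = dtensor.call_with_layout(
    tf.zeros, dtensor.Layout.replicated(mesh, rank=2), shape=[8, 4])
y = dtensor.relayout(x, dtensor.Layout(["batch", dtensor.UNSHARDED], mesh))
```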
-
Committed by A. Unique TensorFlower
Add an unused parameter `enable_stub_generation` to the open-source version of pybind_extension to keep it in sync with the internal version. PiperOrigin-RevId: 549476112
-
Committed by Haibo Huang
This is useful when we need to pass a status across an API boundary. PiperOrigin-RevId: 549475209
-
Committed by Armando Ugalde Velasco
Use MultipleIterationsAutoScaler inside the data service dispatcher implementation as follows:
- UpdateOptimalNumberOfWorkersMetric() in the maintenance thread.
- ReportProcessingTime() when receiving processing times from a WorkerHeartbeat.
- ReportTargetProcessingTime() when receiving a target processing time from a ClientHeartbeat.
- RemoveWorker() when detecting missing workers or executing MaybeRemoveTask.
- RemoveConsumer() when releasing missing clients.
- RegisterIteration() when creating a new Iteration.
- UnregisterIteration() when garbage-collecting old Iterations.
PiperOrigin-RevId: 549469531
-
Committed by Scott Zhu
PiperOrigin-RevId: 549468160
-
Committed by Anlun Xu
PiperOrigin-RevId: 549464529
-
Committed by Yishuang Pang
PiperOrigin-RevId: 549462485
-
Committed by Nicolas Perez
Take relevant tests from conv_ops_test and conv_ops_3d_test to test the general conv op. Refactor the test cases to be parameterized instead of running for loops. PiperOrigin-RevId: 549458691
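A sketch of that parameterization pattern, using `absl.testing.parameterized`; the test body and parameter values here are illustrative, not the actual migrated cases:

```python
import tensorflow as tf
from absl.testing import parameterized

class ConvOpTest(tf.test.TestCase, parameterized.TestCase):

  # Each tuple becomes its own named test case instead of one
  # iteration of a for loop inside a single test method.
  @parameterized.named_parameters(
      ("strides_1x1", (1, 1)),
      ("strides_2x2", (2, 2)),
  )
  def test_conv2d_shape(self, strides):
    x = tf.ones((1, 8, 8, 3))
    filters = tf.ones((3, 3, 3, 4))
    y = tf.nn.conv2d(x, filters, strides=strides, padding="SAME")
    self.assertEqual(y.shape[-1], 4)

if __name__ == "__main__":
  tf.test.main()
```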
-
Committed by Jieying Luo
Note that the TF runtime side is already set up in xla_launch_util. PiperOrigin-RevId: 549456738
-
Committed by Rahul Joshi
- Add option `xla_gpu_enable_pipelined_reduce_scatter` to enable forward pipelining of reduce-scatter instructions.
PiperOrigin-RevId: 549452899
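`xla_gpu_*` debug options are typically toggled through the `XLA_FLAGS` environment variable before XLA initializes; assuming the new option follows that convention, enabling it would look like:

```python
import os

# Append the option to any flags already present; this must run before
# TensorFlow (and thus XLA) is imported so the runtime picks it up.
os.environ["XLA_FLAGS"] = (
    os.environ.get("XLA_FLAGS", "")
    + " --xla_gpu_enable_pipelined_reduce_scatter=true"
)

import tensorflow as tf  # noqa: E402
```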
-
Committed by Yishuang Pang
PiperOrigin-RevId: 549451568
-
Committed by Skye Wanderman-Milne
PiperOrigin-RevId: 549440289
-
Committed by A. Unique TensorFlower
Move pseudo-constants from the while-loop arguments into the while-loop body in `tfl_while_outline` to avoid a memory penalty at runtime. PiperOrigin-RevId: 549440221
-
Committed by Skye Wanderman-Milne
It's not possible for tuple buffers (yet?). The eventual goal is for ML frameworks to call only the individual getters instead of using PjRtBuffer::{logical_}on_device_shape, since passing around xla::Shapes is expensive and often includes more information than is necessary or even meaningful. We'd like to eventually remove PJRT_Buffer_OnDeviceTrimmedShape from the PJRT C API altogether ({logical_}on_device_shape will likely stay for non-ML-framework usage). PiperOrigin-RevId: 549438520
-
Committed by Yu Feng
Also tested that relayout to a ragged layout works out of the box! PiperOrigin-RevId: 549438291
-
Committed by A. Unique TensorFlower
http://github.com/tensorflow/runtime/commit/2a7a9bde82ee99f382b26c75769ece54464a210d. PiperOrigin-RevId: 549436957
-
Committed by Scott Zhu
tf.device(CPU:0) is not respected by DTensor, so we need to explicitly convert the tensor values to DTensors and make sure they are placed on the proper mesh (the CPU host mesh) for logging. Also update the test to mimic the current production behavior. PiperOrigin-RevId: 549432976
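A minimal sketch of that explicit placement, assuming a CPU host mesh created elsewhere (the helper name here is hypothetical):

```python
import tensorflow as tf
from tensorflow.experimental import dtensor

def log_on_host_mesh(host_mesh, value):
  # Explicitly place the regular tensor on the CPU host mesh as a
  # replicated DTensor before logging, instead of relying on
  # tf.device("CPU:0"), which DTensor does not respect.
  layout = dtensor.Layout.replicated(host_mesh, rank=value.shape.rank)
  tf.print(dtensor.copy_to_mesh(value, layout))
```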
-
Committed by A. Unique TensorFlower
[AutoSharding] Make sure that the 1D device mesh in cluster environment matches the assumptions made by `ReshardingCostMixedMeshShape` in auto_sharding_utils. PiperOrigin-RevId: 549428578
-
Committed by Rahul Joshi
PiperOrigin-RevId: 549422754
-
Committed by Rahul Joshi
- Specify the HLO opcode in the pipeliner config, and use it to derive a more descriptive pass name.
- Some minor code cleanup.
PiperOrigin-RevId: 549422627
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 549422464
-
Committed by Fiona Lang
PiperOrigin-RevId: 549421317
-
Committed by Yishuang Pang
PiperOrigin-RevId: 549420699
-
Committed by Kanglan Tang
The following configs are removed:
- v1
- avx2_win
- avx2_linux
- native_arch_linux
- numa
- libc++
- ios_i386
- stackdriver_support
- rbe_lite_linux
- rbe_linux_cuda_nvcc
- rbe_gpu_linux
- rbe_linux_cuda11.2_nvcc_py3.8, rbe_linux_cuda_nvcc_py38
- rbe_linux_cuda11.2_nvcc_py3.10, rbe_linux_cuda_nvcc_py310
- rbe_linux_rocm_py3.7, rbe_linux_rocm_py3.8, rbe_linux_rocm_py3.10
- rbe_linux_cuda_clang_base, rbe_linux_cuda_clang_py**
- rbe_win_py37, rbe_win_py310
If the removal of a config breaks your workflow, you can add it back as a command-line option. If you think a config was removed by mistake, please open an issue on GitHub.
PiperOrigin-RevId: 549419745
-
Committed by Justin Szaday
PiperOrigin-RevId: 549417829
-
Committed by A. Unique TensorFlower
[AutoSharding] Ensure that strategies are generated for custom call ops with user shardings. Previously, no sharding strategies were being generated for such ops. PiperOrigin-RevId: 549414995
-