- 14 Oct, 2022 40 commits
-
-
Committed by A. Unique TensorFlower
CHANGELOG
=========
3bb6a48d8 - Fix bug atan2
14c847dc0 - Refactor special values test for pow, and add a similar test for atan2
462758e8a - Don't use generic sign function for sign(complex) unless it is vectorizable
c0d6a7261 - Use pnegate(pzero(x)) as a generic way to generate -0.0. Some compilers do not handle the literal -0.0 properly in fastmath mode.
7846c7387 - Eigen/Sparse: fix warnings -Wunused-but-set-variable
316754487 - Handle NaN inputs to atan2.
72db3f0fa - Remove references to M_PI_2 and M_PI_4.
d6bc06259 - Remove reference to EIGEN_HAS_CXX11_MATH.
5ceed0d57 - Guard GCC-specific pragmas with "#ifdef EIGEN_COMP_GNUC"
528b68674 - [clang-format] Add a few macros to AttributeMacros
e95c4a837 - Simpler range reduction strategy for atan<float>().
80efbfded - Unconditionally enable CXX11 math.
e5794873c - Replace assert with eigen_assert.
7d6a9925c - Fix 4x4 inverse when compiling with -Ofast.
1414a76fa - Only vectorize atan<double> for Altivec if VSX is available.
c475228b2 - Vectorize atan() for double.
1e1848fdb - Add a vectorized implementation of atan2 to Eigen.
* Switch TensorFlow to using the new fast atan2 in Eigen.
* Get rid of local implementations since Eigen is now guaranteed to support C++11 math.
PiperOrigin-RevId: 481044760
-
Committed by A. Unique TensorFlower
http://github.com/tensorflow/runtime/commit/97d12a118984d820c14bdad0971704d95d99f411. PiperOrigin-RevId: 481043847
-
Committed by A. Unique TensorFlower
Updates LLVM usage to match [1fda6f6859aa](https://github.com/llvm/llvm-project/commit/1fda6f6859aa) PiperOrigin-RevId: 481042584
-
Committed by Ralf W. Grosse-Kunstleve
PiperOrigin-RevId: 481042256
-
Committed by Jaesung Chung
For tf.DivNoNan operators, the calibrator should ignore NaN values, since the lowered TFLite form will encounter NaN values by design. PiperOrigin-RevId: 481039479
-
Committed by Marat Dukhan
PiperOrigin-RevId: 481035658
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 481034607
-
Committed by Tongfei Guo
PiperOrigin-RevId: 481030210
-
Committed by A. Unique TensorFlower
Updating determinism check in svd op. Now the op works deterministically when enabled, except in the case that the input matrix has column size 1. PiperOrigin-RevId: 481018683
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 481016434
-
Committed by Mohammadreza Heydary
This CL introduces a new legalization pattern that rewrites the sigmoid_grad operation as `grad = dy * y * (1 - y)`, where `y = sigmoid(x)`. PiperOrigin-RevId: 481015565
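The formula in this message can be checked numerically. The following is a minimal Python sketch, not the legalization pattern itself; the function names `sigmoid`, `sigmoid_grad`, and `numeric_sigmoid_derivative` are hypothetical and chosen only for illustration. It compares `dy * y * (1 - y)` against a central-difference estimate of the sigmoid derivative.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function y = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(y: float, dy: float) -> float:
    """Gradient as written in the commit message: grad = dy * y * (1 - y).

    `y` is the forward output sigmoid(x); `dy` is the incoming gradient.
    """
    return dy * y * (1.0 - y)


def numeric_sigmoid_derivative(x: float, eps: float = 1e-6) -> float:
    """Central-difference estimate of d(sigmoid)/dx, used only as a check."""
    return (sigmoid(x + eps) - sigmoid(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 3.0):
        analytic = sigmoid_grad(sigmoid(x), dy=1.0)
        numeric = numeric_sigmoid_derivative(x)
        print(f"x={x:+.1f} analytic={analytic:.8f} numeric={numeric:.8f}")
```

With `dy = 1` the two columns should agree to several decimal places, which is the property the rewrite relies on: the gradient needs only the forward output `y`, not the input `x`.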
-
Committed by Anlun Xu
JitExecutable is not required for AOT compilation, so XlaRuntimeCpuExecutable can own a runtime::Executable directly. PiperOrigin-RevId: 481012743
-
Committed by Faizan Muhammad
PiperOrigin-RevId: 481010554
-
Committed by David Dunleavy
PiperOrigin-RevId: 481008739
-
Committed by Anlun Xu
PiperOrigin-RevId: 481007346
-
Committed by David Dunleavy
PiperOrigin-RevId: 481006537
-
Committed by Faizan Muhammad
PiperOrigin-RevId: 481006100
-
Committed by Umer Javed
PiperOrigin-RevId: 481004699
-
Committed by Rahul Joshi
PiperOrigin-RevId: 481001377
-
Committed by Pankaj Kanwar
PiperOrigin-RevId: 480996400
-
Committed by Anlun Xu
PiperOrigin-RevId: 480996180
-
Committed by A. Unique TensorFlower
http://github.com/tensorflow/runtime/commit/36b7795640a99e44a9ab246c5495be5f99736c99. PiperOrigin-RevId: 480985467
-
Committed by A. Unique TensorFlower
[lite] When constructing a subgraph, use control dependencies from the model's metadata, if present. PiperOrigin-RevId: 480984280
-
Committed by Soo Sung
PiperOrigin-RevId: 480983235
-
Committed by Faizan Muhammad
PiperOrigin-RevId: 480977676
-
Committed by Eugene Zhulenev
PiperOrigin-RevId: 480974257
-
Committed by Jeffrey A. Dean
Use std::function objects rather than creating thousands of separate copies of very large routines in shape_util.h. Refactored the code to have an internal helper ForEachState struct, and to have separate code for the parallel vs. non-parallel versions of the core ForEachInternal functionality (the compiler wasn't smart enough to track the parallel bit through the std::optional object with the ThreadPool, and so was emitting parallel code even for calls with the non-parallel variant; due to templatization, there were thousands of copies of these routines). Avoid inlining some large routines in literal.h. Avoid inlining some automatically generated constructors for ShapeIndex, ProgramShape, etc. in shape.{h,cc}. Avoid inlining large routines on non-OK paths in status_macros.h. These changes drop generated text size for a large binary by about 3.0 MB (~1.4%). PiperOrigin-RevId: 480973483
-
Committed by TensorFlower Gardener
Merge pull request #55780 from ROCmSoftwarePlatform:google_upstream_remove_rocm_build_flag_nextafter_op PiperOrigin-RevId: 480973150
-
Committed by Rahul Joshi
- Unify the code for handling operand->tuple sharding propagation when no sharding is present vs. refining the existing sharding (and reuse the code to refine existing sharding). - This also fixes an issue with handling empty tuple sub-elements, which are essentially not counted in the top-level tuple elements of the tuple sharding (since the code that refines existing sharding handles this correctly). PiperOrigin-RevId: 480971521
-
Committed by Luke Boyer
PiperOrigin-RevId: 480968915
-
Committed by TensorFlower Gardener
PiperOrigin-RevId: 480958124
-
Committed by Rahul Joshi
- Consider collective communication operations with channel_id as side-effecting only in non-SPMD mode. - Also handle all collective operations in the function. - Change the SPMD partitioning test to verify that any collective generated by partitioning does not have sharding. PiperOrigin-RevId: 480953154
-
Committed by Bruce Fontaine
PiperOrigin-RevId: 480950049
-
Committed by Haoyu Zhang
PiperOrigin-RevId: 480949877
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 480944931
-
Committed by Robert Suderman
Existing lowering missed cases where the width/height of the input or output are 1. These cases were difficult to address in the current implementation so they were cleaned up. Then power-of-2 specific code was removed as it was easier to just depend on GCD to do the right thing. PiperOrigin-RevId: 480936509
-
Committed by Zhi An Ng
These 2 test cases are long-running because they test a Cartesian product of dtypes * transpose * adjoint * shapes (2 * 4 * 4 * 3 = 96). They are the bottleneck that keeps the entire test suite from finishing sooner. By converting them into parameterized test cases, each case in the product becomes its own test case and can run on a different shard. PiperOrigin-RevId: 480935083
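The idea of turning one looping test into one test per combination can be sketched with the standard-library `unittest` module. This is an illustrative sketch only, not the code from this change: the class name `MatMulLikeTest`, the parameter values, and the placeholder assertion are all assumptions, chosen so that the product has the 2 * 4 * 4 * 3 = 96 cases mentioned above.

```python
import itertools
import unittest

# Hypothetical parameter axes; only the sizes (2, 4, 4, 3) come from the message.
DTYPES = ("float32", "float64")
FLAG_PAIRS = tuple(itertools.product((False, True), repeat=2))  # 4 (a, b) pairs
SHAPES = ((1, 1), (2, 3), (4, 4))

# Cartesian product: dtypes * transpose pairs * adjoint pairs * shapes.
CASES = list(itertools.product(DTYPES, FLAG_PAIRS, FLAG_PAIRS, SHAPES))


class MatMulLikeTest(unittest.TestCase):
    """Test methods are attached below, one per parameter combination."""


def _make_test(dtype, transpose, adjoint, shape):
    def test(self):
        # Placeholder body: a real test would build inputs from the
        # parameters and compare the op result against a reference.
        rows, cols = shape
        self.assertGreater(rows * cols, 0)
        self.assertIn(dtype, DTYPES)
        self.assertEqual(len(transpose), 2)
        self.assertEqual(len(adjoint), 2)

    return test


# Each combination becomes its own named test method, so a sharded test
# runner can distribute the methods instead of running one long loop.
for index, case in enumerate(CASES):
    setattr(MatMulLikeTest, f"test_case_{index:02d}", _make_test(*case))


if __name__ == "__main__":
    unittest.main()
```

Because every combination is a separate method, a failure names the exact combination, and runners that shard by test method can spread the 96 cases across workers.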
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 480932683
-
Committed by Jenni Kilduff
This creates a const op filled with [0, 1, 2...iotaSize] values, then tiles it to the iota result shape PiperOrigin-RevId: 480930548
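The const-then-tile construction can be illustrated in plain Python with nested lists. This is a sketch under stated assumptions, not the lowering from this change: the helper names `reshape_for_dim`, `tile`, and `iota_by_tiling` are hypothetical, and the constant is assumed to hold the values 0 through size - 1 along the iota dimension.

```python
def reshape_for_dim(values, rank, iota_dim):
    """Places a 1-D list along `iota_dim` of a rank-`rank` nested list.

    Every other dimension has size 1, e.g. [0, 1, 2] with rank=2 and
    iota_dim=1 becomes [[0, 1, 2]].
    """
    def build(dim, value):
        if dim == rank:
            return value
        if dim == iota_dim:
            return [build(dim + 1, v) for v in values]
        return [build(dim + 1, value)]

    return build(0, None)


def tile(tensor, multiples):
    """Repeats a nested list `multiples[d]` times along each dimension d."""
    if not multiples:
        return tensor  # Scalar leaf: nothing left to repeat.
    return [
        tile(sub, multiples[1:])
        for _ in range(multiples[0])
        for sub in tensor
    ]


def iota_by_tiling(shape, iota_dim):
    """Builds an iota result by creating a constant range and tiling it."""
    # Assumption: the constant holds 0 .. shape[iota_dim] - 1.
    values = list(range(shape[iota_dim]))
    base = reshape_for_dim(values, len(shape), iota_dim)
    # Tile every dimension except the iota dimension up to the result shape.
    multiples = [1 if d == iota_dim else shape[d] for d in range(len(shape))]
    return tile(base, multiples)


if __name__ == "__main__":
    print(iota_by_tiling((2, 3), iota_dim=1))
    print(iota_by_tiling((3, 2), iota_dim=0))
```

The constant is built once along the iota dimension and then repeated along every other dimension, so the result has the requested shape with values that vary only along `iota_dim`.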
-
Committed by Jake Harmon
PiperOrigin-RevId: 480928553
-