- 17 Jul 2019: 30 commits
-
-
Committed by Christopher Suter
(see, e.g., https://stackoverflow.com/questions/4825234/exception-traceback-is-hidden-if-not-re-raised-immediately) PiperOrigin-RevId: 258445844
-
Committed by Andrew Audibert
PiperOrigin-RevId: 258445832
-
Committed by Allen Lavoie
We weren't properly handling the op-type-is-function-name calling convention. PiperOrigin-RevId: 258441889
-
Committed by Pavithra Vijay
PiperOrigin-RevId: 258438818
-
Committed by Yifei Feng
PiperOrigin-RevId: 258438229
-
Committed by Yunxing Dai
Previously, we only had a computation scheduler, which runs the heap simulator once per computation. For models with a large number of computations, this created extremely slow compilation times. This CL introduces a module scheduler, which only runs the heap simulator after the whole module is scheduled. It also contains a helper function that automatically converts a computation scheduler into a module scheduler. PiperOrigin-RevId: 258436352
-
Committed by A. Unique TensorFlower
Add code that translates between the TFLite and MLIR type systems. This begins the process of building the translator by translating the types of input arguments. The tests are updated to reflect the beginning of the actual work. PiperOrigin-RevId: 258433154
-
Committed by Pavithra Vijay
PiperOrigin-RevId: 258432123
-
Committed by A. Unique TensorFlower
Add a flatbuffer_importer.cc that registers a translation from TFLite Flatbuffer to MLIR and incorporate it into the flatbuffer_translate tool. The translator does not yet perform any translation, but only validates that the input file contains a FlatBuffer and prints its version number and the names and input tensor IDs of each subgraph. The tests don't actually include the expected correct output, but instead simply make sure that the initial code, which only calls the flatbuffer parser and prints some simple information, functions correctly. PiperOrigin-RevId: 258431902
-
Committed by Yifei Feng
In the previous change, we switched from calling importlib on a list of public APIs to using direct import statements. As a result, the list of APIs was not passed to TFModuleWrapper, and attributes starting with "_" did not show up in __all__. To fix this, we manually include all eligible symbols in __all__, as was done before. Built and tested with the pip package. PiperOrigin-RevId: 258431865
-
Committed by Benoit Jacob
Simplify ruy's main loop. Most of the next_* business was unnecessary complication. This code didn't know whether it wanted to hide the latency of an atomic increment (60 cycles) or of a block-coords computation (comparable). Now it is more intentional about hiding the atomic increment latency, because that is the one instruction here that will always have high latency; for the rest, we can only hope that the compiler will exploit any opportunity to inline the block computation and distribute its instructions so as to hide some of the latency. The more important point is that while we don't really know which would run faster, and this will at most have a small impact on latency, there is on the other hand a substantial code simplification here, which matters because this is very central code. Notice in particular how the block-coords computation was written twice, once before the loop body and once at the end of the loop body, whereas now it appears only once. PiperOrigin-RevId: 258429272
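The latency-hiding pattern this commit describes can be sketched as follows. This is an illustrative, single-threaded reduction, not ruy's actual code; all names (ProcessBlocks, next_block) are hypothetical. The idea is to issue the atomic increment for the *following* block before doing the work on the *current* one, so the ~60-cycle atomic read-modify-write completes while useful work is in flight:

```cpp
#include <atomic>
#include <vector>

// Hypothetical sketch of hiding atomic-increment latency in a work loop.
// Workers pull block indices from a shared atomic counter; each iteration
// kicks off the fetch_add for the next block before processing the current
// block, overlapping the atomic's latency with real work.
std::vector<int> ProcessBlocks(std::atomic<int>& next_block, int num_blocks) {
  std::vector<int> processed;
  int block = next_block.fetch_add(1, std::memory_order_relaxed);
  while (block < num_blocks) {
    // Issue the increment for the upcoming block early...
    int upcoming = next_block.fetch_add(1, std::memory_order_relaxed);
    // ...then do the (long-running) work on the current block.
    processed.push_back(block);  // stand-in for the real block computation
    block = upcoming;
  }
  return processed;
}
```

With one thread and a counter starting at 0, ProcessBlocks visits every block index exactly once; with multiple threads each gets a disjoint subset, which is the property the real loop relies on.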
-
Committed by A. Unique TensorFlower
This is preparation for merging the internal xprof and external OSS versions of the annotation implementation; I need to make sure the benchmark is comparable or better. PiperOrigin-RevId: 258425457
-
Committed by Eugene Zhulenev
PiperOrigin-RevId: 258422018
-
Committed by Ian Langmore
This doesn't make sense for matvec (and isn't in the base-class matvec definition). PiperOrigin-RevId: 258417922
-
Committed by Andy Ly
PiperOrigin-RevId: 258413874
-
Committed by Smit Hinsu
PiperOrigin-RevId: 258412619
-
Committed by Akshay Modi
It would fail in any case on ndarrays, whose numpy.dtype doesn't have is_floating, at the check on the next line, so this gives it a nicer error message. Also use the common RegisterType functionality to check for resource variables. PiperOrigin-RevId: 258407797
-
Committed by Jiri Simsa
PiperOrigin-RevId: 258404510
-
Committed by Priya Gupta
PiperOrigin-RevId: 258399687
-
Committed by Nupur Garg
PiperOrigin-RevId: 258395114
-
Committed by Feng Liu
This patch contains various changes to make the workflow work for both the UINT8 and INT8 quantization schemes:
- The "restricted_output_params" in the "OpQuantSpec" is changed to a map, so the op definition can define restrictions for both UINT8 and INT8.
- An INT8 quantization spec is added to the TFLite op definition.
- A "quantize_sign" flag is passed into the pre-quantize pass, so the spec and propagation for different signs can be configured by this flag. Follow-up patches will read this flag from the user's command line.
PiperOrigin-RevId: 258393145
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 258393073
-
Committed by TensorFlower Gardener
PiperOrigin-RevId: 258392805
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 258391933
-
Committed by Alexandre Passos
PiperOrigin-RevId: 258389587
-
Committed by Benoit Jacob
50001. PiperOrigin-RevId: 258389320
-
Committed by Jiri Simsa
PiperOrigin-RevId: 258385606
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 258380926
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 258378187
-
Committed by Sergei Lebedev
This slightly speeds up the common case of converting shape lists to tensors (when calling an op).
Before:
>>> %timeit tf.convert_to_tensor([1, 2, 3])
100000 loops, best of 3: 10.4 µs per loop
>>> %timeit tf.convert_to_tensor(np.array([1, 2, 3]))  # For reference.
100000 loops, best of 3: 6.47 µs per loop
After:
>>> %timeit tf.convert_to_tensor([1, 2, 3])
100000 loops, best of 3: 7.23 µs per loop
The remaining 1 µs is due to the necessary nest.flatten call. It might be optimized by introducing nest.all, which does not allocate a flat list. PiperOrigin-RevId: 258375416
-
- 16 Jul 2019: 10 commits
-
-
Committed by Andrew Audibert
Dataset elements are no longer limited to nested structures of Tensors. This change updates the docs to refer to "dataset elements" instead of nested structures of Tensors. We may want to also deprecate from_tensors/from_tensor_slices and replace them with better-named from_element/from_element_slices. Since this is more controversial, we can do it as a separate change. PiperOrigin-RevId: 258375042
-
Committed by TensorFlower Gardener
PiperOrigin-RevId: 258373033
-
Committed by Benoit Jacob
#defines - they were tested directly by #ifdef, and were being defined by path.h. As tune.cc did not #include path.h, it did not enable its platform-specific tuning code, resulting in a performance regression in cases relying on tuning for maximal performance --- in-order ARM. To prevent that from happening again, this moves the platform defines to a new platform.h and forces users to use a RUY_PLATFORM(X) function macro, so that if they fail to #include platform.h, they get a compilation error. PiperOrigin-RevId: 258372624
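The forcing trick this commit describes can be sketched as follows. This is an illustrative reconstruction, not ruy's actual platform.h; the DEMO_* names are hypothetical stand-ins. A function-like macro is used so that if a file forgets to include the header, `#if DEMO_PLATFORM(NEON)` leaves the undefined identifier followed by a parenthesized token in the conditional, which the preprocessor rejects as a malformed expression instead of silently evaluating to 0 the way a plain `#ifdef` would:

```cpp
// Sketch of a platform.h-style header (hypothetical names).
// The inner detection macros are always defined, to 0 or 1:
#define DEMO_DONOTUSEDIRECTLY_NEON 1  // e.g. set when targeting ARM NEON
#define DEMO_DONOTUSEDIRECTLY_X86 0

// The only supported accessor. If this header is not included, any use of
// DEMO_PLATFORM(X) inside #if is a preprocessor error, not a silent false.
#define DEMO_PLATFORM(X) ((DEMO_DONOTUSEDIRECTLY_##X) != 0)

int NeonEnabled() {
#if DEMO_PLATFORM(NEON)
  return 1;
#else
  return 0;
#endif
}

int X86Enabled() {
#if DEMO_PLATFORM(X86)
  return 1;
#else
  return 0;
#endif
}
```

The design choice is that misuse fails loudly at compile time, which is exactly what the commit wants after the silent `#ifdef` regression on in-order ARM.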
-
Committed by TensorFlower Gardener
PiperOrigin-RevId: 258371737
-
Committed by A. Unique TensorFlower
PiperOrigin-RevId: 258361644
-
Committed by Tamara Norman
Add absl logging functions to the output_streams supported by tf.print (v2). This ensures we can print to these streams without needing to use tf.compat.v1. PiperOrigin-RevId: 258360763
-
Committed by A. Unique TensorFlower
For now, the MLIR backend will expect the same HLO as input as the GPU backend does. Hence, we need to run the same required passes. Also use the same HLO level optimizations, so that we get comparable HLO. PiperOrigin-RevId: 258349785
-
Committed by TensorFlower Gardener
PiperOrigin-RevId: 258349360
-
Committed by Alexander Belyaev
PiperOrigin-RevId: 258349333
-
Committed by TensorFlower Gardener
PiperOrigin-RevId: 258349270
-