- 26 Feb, 2016 34 commits
-
-
Committed by Xiaoqiang Zheng
Change: 115611259
-
Committed by Derek Murray
Change: 115610448
-
Committed by A. Unique TensorFlower
Change: 115608522
-
Committed by Vijay Vasudevan
Change: 115607974
-
Committed by Derek Murray
Change: 115607801
-
Committed by Vijay Vasudevan
Test fails. Change: 115602477
-
Committed by Vijay Vasudevan
The current depthwise_conv is very inefficient: it calls slice() on each input channel of the input and filters, runs a conv() on each input channel, and then applies a concat() to the results. Change: 115601904
-
Committed by Vijay Vasudevan
the cuda/cudnn version strings in build_config.bzl. Addresses issue mentioned in #1052. Change: 115599314
-
Committed by Vijay Vasudevan
Change: 115598732
-
Committed by Sherry Moore
Change: 115598592
-
Committed by Manjunath Kudlur
Change: 115594986
-
Committed by Josh Levenberg
hopefully reduce confusion since io.* is not the implementation of the ".../kernels:io" build target. Change: 115593814
-
Committed by Eugene Brevdo
Change: 115589642
-
Committed by Manjunath Kudlur
-
Committed by Vijay Vasudevan
Change: 115589382
-
Committed by Jianmin Chen
The current depthwise_conv is very inefficient: it calls slice() on each input channel of the input and filters, runs a conv() on each input channel, and then applies a concat() to the results. Change: 115583330
-
Committed by Derek Murray
Also fixed some compiler warnings. Change: 115582482
-
Committed by Vijay Vasudevan
since this breaks the build on GPU. Change: 115582331
-
Committed by Vijay Vasudevan
Change: 115580211
-
Committed by A. Unique TensorFlower
Change: 115578243
-
Committed by A. Unique TensorFlower
considerably. Shrinks text size of example_trainer binary by ~1.5%. Change: 115578002
-
Committed by Benoit Steiner
Made the corresponding unit test more robust. Change: 115575179
-
Committed by Vijay Vasudevan
return true. Add a unittest to catch this type of regression in the future. Change: 115573280
-
Committed by A. Unique TensorFlower
Change: 115568214
-
Committed by A. Unique TensorFlower
Change: 115528686
-
Committed by A. Unique TensorFlower
Fix an error message in tf.sparse_to_dense to include the possibility that indices are invalid because they are out of bounds. Change: 115522264
-
Committed by Eugene Brevdo
These tools are meant to allow recording of benchmark & unit test structured output to pbtxt files in a directory only when the environment variable TEST_REPORT_FILE_PREFIX is set. For now, only saving of C++ microbenchmark output is supported. Change: 115518303
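The opt-in behaviour described in this commit message (write structured output only when TEST_REPORT_FILE_PREFIX is set) can be sketched roughly as below. This is an illustration only, not the code from the change: the helper name `maybe_record`, the way the file name is built from the prefix, and the payload format are all assumptions.

```python
import os
import tempfile


def maybe_record(name, payload, environ=os.environ):
    """Write payload to a report file only if TEST_REPORT_FILE_PREFIX is set.

    Hypothetical helper: the real tools may name and format files differently.
    Returns the path written, or None when recording is disabled.
    """
    prefix = environ.get("TEST_REPORT_FILE_PREFIX")
    if not prefix:
        # Variable unset or empty: recording is disabled, nothing is written.
        return None
    path = prefix + name  # assumed naming scheme: prefix followed by a test name
    with open(path, "w") as report:
        report.write(payload)
    return path


# Without the variable, nothing is recorded.
disabled = maybe_record("bench", "entries {}", environ={})

# With the variable pointing into a scratch directory, one file is written.
scratch = tempfile.mkdtemp()
env = {"TEST_REPORT_FILE_PREFIX": os.path.join(scratch, "report_")}
written = maybe_record("bench", "entries {}", environ=env)
```

Passing the environment in as a parameter keeps the sketch testable without touching the real process environment.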
-
Committed by Sherry Moore
Change: 115516426
-
Committed by Kiril Gorovoy
Change: 115515678
-
Committed by Josh Levenberg
Change: 115511835
-
Committed by Josh Levenberg
Change: 115511794
-
Committed by Vincent Vanhoucke
Helps with: https://github.com/tensorflow/tensorflow/issues/917
Also fixes https://github.com/tensorflow/tensorflow/issues/1162
The main benefit is that the computation of the sufficient statistics is now decoupled from the aggregation of the moments. If you want to perform the accumulation incrementally, you don't have to keep all the inputs around, and can instead keep the much more compact sum and sum-of-squares. Accumulation could also be performed locally if you aggregate across multiple devices. Computing sum and sum-of-squares can also theoretically be performed in parallel now.
Tested running inception: same performance, same step time.
Batch normalization benchmark is a bit faster on CPU, a bit slower on GPU:
Before:
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 1.139310 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.021970 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 2.767147 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.074531 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.742835 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.013473 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 1.738806 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.052777 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.119180 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.011201 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.218297 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.048526 secs
After:
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.998944 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.025828 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 2.657428 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.086614 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.603137 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.017668 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 1.519533 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.055214 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.071344 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.016440 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.222093 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.039967 secs
Change: 115507032
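To illustrate the decoupling this commit message describes, here is a minimal pure-Python sketch for scalar inputs. It is not the TensorFlow implementation: the class `SufficientStats` and its methods are hypothetical names chosen for the example. It keeps only a count, a sum, and a sum of squares, folds batches in one at a time, merges partial results (as one might across devices), and derives the moments at the end.

```python
from dataclasses import dataclass


@dataclass
class SufficientStats:
    """Count, sum, and sum of squares for a stream of scalars (hypothetical helper)."""
    count: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def update(self, batch):
        """Fold one batch into the statistics; the batch can then be discarded."""
        for x in batch:
            self.count += 1
            self.total += x
            self.total_sq += x * x

    def merge(self, other):
        """Combine statistics that were accumulated separately."""
        return SufficientStats(self.count + other.count,
                               self.total + other.total,
                               self.total_sq + other.total_sq)

    def moments(self):
        """Return (mean, variance) using variance = E[x^2] - E[x]^2."""
        mean = self.total / self.count
        variance = self.total_sq / self.count - mean * mean
        return mean, variance


# Incremental accumulation on one worker.
stats_a = SufficientStats()
stats_a.update([1.0, 2.0])
stats_a.update([3.0, 4.0])

# Separate accumulation on another worker.
stats_b = SufficientStats()
stats_b.update([5.0, 6.0])

# Aggregate the compact statistics, then compute the moments once.
mean, variance = stats_a.merge(stats_b).moments()
```

The E[x^2] - E[x]^2 form used here can lose precision when the mean is large relative to the spread, so a production version would likely need a more careful formulation.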
-
Committed by Josh Levenberg
RemoveNewDefaultAttrsFromGraphDef(). Change: 115506523
-
Committed by A. Unique TensorFlower
Change: 115505008
-
- 25 Feb, 2016 6 commits
-
-
Committed by Benoit Steiner
X1 crashes when attempting to compile them Change: 115500414
-
Committed by Eugene Brevdo
Change: 115496194
-
Committed by Geoffrey Irving
Change: 115495726
-
Committed by Eugene Brevdo
Change: 115494526
-
Committed by A. Unique TensorFlower
Support leaving the offset (beta) parameter out in batch_normalization, in which case no offset will be added after normalization. Change: 115489328
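The optional-offset behaviour described above can be sketched for scalar inputs as follows. This is an illustrative stand-in, not TensorFlow's batch_normalization: the function name `batch_norm_sketch`, its default epsilon, and the list-based inputs are assumptions made for the example.

```python
import math


def batch_norm_sketch(xs, mean, variance, offset=None, scale=None, epsilon=1e-3):
    """Normalize xs by mean and variance; scale and offset are both optional.

    Hypothetical helper: when offset (beta) is None, nothing is added
    after normalization.
    """
    inv = 1.0 / math.sqrt(variance + epsilon)
    if scale is not None:
        inv *= scale
    normalized = [(x - mean) * inv for x in xs]
    if offset is not None:
        normalized = [y + offset for y in normalized]
    return normalized


# No offset supplied: the output is only centred and rescaled.
without_offset = batch_norm_sketch([1.0, 3.0], mean=2.0, variance=1.0, epsilon=0.0)

# Offset supplied: the same values shifted by beta.
with_offset = batch_norm_sketch([1.0, 3.0], mean=2.0, variance=1.0,
                                offset=0.5, epsilon=0.0)
```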
-
Committed by A. Unique TensorFlower
Change: 115472914
-