- 18 Mar, 2017 8 commits
-
-
Committed by Yangzihao Wang
Change: 150459431
-
Committed by A. Unique TensorFlower
Improves performance of tf.matmul(a, b, ...) for dense tensors on NVIDIA GPUs in the following cases:
a) If the inner-most dimension of b is 1, i.e. the operation is (possibly a batch of) matrix*vector multiplication(s). This is accomplished by calling cuBLAS GEMV rather than GEMM. This speeds up large matrix-vector products by about 4x.
b) If one or more dimensions are unknown at graph construction time but the operation is in fact either a single matrix*matrix or matrix*vector multiplication.
The following benchmark numbers, illustrating the improvements for matrix * vector products, were measured on an NVIDIA Titan X (Maxwell) card.
Benchmark                                       Base (ns)  New (ns)  Improvement
--------------------------------------------------------------------------------
BM_Matmul_50_50_1_false_false_DT_FLOAT_gpu          18102     17056        +5.8%
BM_Matmul_50_50_1_true_false_DT_FLOAT_gpu           18108     16374        +9.6%
BM_Matmul_50_50_1_false_true_DT_FLOAT_gpu           18153     17173        +5.4%
BM_Matmul_50_50_1_true_true_DT_FLOAT_gpu            18150     15950       +12.1%
BM_Matmul_500_500_1_false_false_DT_FLOAT_gpu        64605     16874       +73.9%
BM_Matmul_500_500_1_true_false_DT_FLOAT_gpu         62810     17298       +72.5%
BM_Matmul_500_500_1_false_true_DT_FLOAT_gpu         60447     17014       +71.9%
BM_Matmul_500_500_1_true_true_DT_FLOAT_gpu          58443     16934       +71.0%
BM_Matmul_2000_2000_1_false_false_DT_FLOAT_gpu     343298     81898       +76.1%
BM_Matmul_2000_2000_1_true_false_DT_FLOAT_gpu      294738     63723       +78.4%
BM_Matmul_2000_2000_1_false_true_DT_FLOAT_gpu      300671     83650       +72.2%
BM_Matmul_2000_2000_1_true_true_DT_FLOAT_gpu       284540     63742       +77.6%
Change: 150456725
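The Improvement figures in this entry appear to be the relative reduction in run time, (base - new) / base. A minimal sketch that recomputes that figure and the corresponding speedup factor for three of the benchmark rows; the helper names `improvement_pct` and `speedup` are illustrative only and are not part of TensorFlow:

```python
def improvement_pct(base_ns, new_ns):
    """Relative reduction in run time, as a percentage of the base time."""
    return 100.0 * (base_ns - new_ns) / base_ns


def speedup(base_ns, new_ns):
    """How many times faster the new run is than the base run."""
    return base_ns / float(new_ns)


# (benchmark name, base ns, new ns), copied from the numbers in this entry.
rows = [
    ("BM_Matmul_50_50_1_false_false_DT_FLOAT_gpu", 18102, 17056),
    ("BM_Matmul_500_500_1_false_false_DT_FLOAT_gpu", 64605, 16874),
    ("BM_Matmul_2000_2000_1_false_false_DT_FLOAT_gpu", 343298, 81898),
]

for name, base_ns, new_ns in rows:
    print("%s: +%.1f%% (%.2fx)" % (
        name, improvement_pct(base_ns, new_ns), speedup(base_ns, new_ns)))
```

For the 500x500 and 2000x2000 rows the speedup comes out at roughly 3.8x and 4.2x, which is consistent with the "about 4x" figure quoted for large matrix-vector products.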
-
Committed by Zakaria Haque
Change: 150452316
-
Committed by A. Unique TensorFlower
folding. Change: 150450219
-
Committed by A. Unique TensorFlower
Change: 150450082
-
Committed by A. Unique TensorFlower
Change: 150449788
-
Committed by Mustafa Ispir
Change: 150447439
-
Committed by Mustafa Ispir
Change: 150443246
-
- 17 Mar, 2017 32 commits
-
-
Committed by A. Unique TensorFlower
FileExistsError doesn't exist in Python 2.7. Change: 150436736
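The commit message does not say how the code was changed. One common pattern that works on both Python 2.7 and Python 3 is to catch `OSError` and check `errno.EEXIST`, since `FileExistsError` is a Python 3 subclass of `OSError`. A minimal sketch under that assumption; the helper name `ensure_dir` is hypothetical:

```python
import errno
import os


def ensure_dir(path):
    """Create `path` if needed; succeed quietly if it is already a directory."""
    try:
        os.makedirs(path)
    except OSError as e:
        # On Python 3 this also catches FileExistsError, an OSError subclass.
        # Re-raise anything that is not "already exists as a directory".
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

Calling `ensure_dir` twice on the same path is safe, while other failures such as permission errors still propagate.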
-
Committed by A. Unique TensorFlower
- Removes getArgumentList() in favor of args() etc in XLA (LLVM r298010) Change: 150427660
-
Committed by Eugene Brevdo
The main change is that RNNCells that wrap other RNNCells now override self.zero_state to call the wrapped cell's zero_state and then (maybe) perform some post-processing... instead of relying on the state_size property to provide all information about the state. Also made zero_state calls create ops inside their own name scope. Change: 150413265
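The delegation pattern described in this entry can be sketched in plain Python. The classes `ToyCell` and `ToyWrapper` below are hypothetical stand-ins, not the TensorFlow `RNNCell` API, and the extra `step` field is an invented example of post-processing; the name-scope behaviour is not modelled:

```python
class ToyCell(object):
    """Hypothetical stand-in for an RNN cell (not the TensorFlow API)."""

    def __init__(self, state_size):
        self.state_size = state_size

    def zero_state(self, batch_size):
        # One row of zeros per batch element.
        return [[0.0] * self.state_size for _ in range(batch_size)]


class ToyWrapper(object):
    """Hypothetical wrapper whose state carries an extra step counter."""

    def __init__(self, cell):
        self._cell = cell

    @property
    def state_size(self):
        # state_size alone cannot describe the extra counter below.
        return self._cell.state_size

    def zero_state(self, batch_size):
        # Delegate to the wrapped cell, then post-process the result,
        # instead of rebuilding the state from state_size.
        inner = self._cell.zero_state(batch_size)
        return {"cell_state": inner, "step": 0}


state = ToyWrapper(ToyCell(state_size=3)).zero_state(batch_size=2)
print(state)
```

The point of the design is that the wrapper's initial state can hold information that `state_size` does not express, because the wrapped cell is asked directly for its own zero state.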
-
Committed by Eugene Brevdo
Change: 150400474
-
Committed by Benoit Steiner
Change: 150397471
-
Committed by A. Unique TensorFlower
Change: 150396376
-
Committed by Gunhan Gulsoy
Change: 150389857
-
Committed by Benoit Steiner
Change: 150388263
-
Committed by A. Unique TensorFlower
Change: 150385615
-
Committed by A. Unique TensorFlower
Change: 150384503
-
Committed by Skye Wanderman-Milne
This is a first step towards supporting functions in C++ graph construction, e.g. being able to import GraphDefs with functions. Change: 150382046
-
Committed by A. Unique TensorFlower
This update contains refinements to the charts in the scalars dashboard. Change: 150380169
-
Committed by A. Unique TensorFlower
This involves adding a toggle to the health pills info box in the graph visualizer. When that toggle is enabled, TensorBoard makes a request for health pills at step X when the user moves the slider. This feature can be very slow because it requires reading from disk. Viewing health pills at, say, step 100,000 could take minutes to an hour. We must design ways to make this faster (for instance, have the debugger write events at a much greater frequency only after it encounters a bad value). Change: 150379929
-
Committed by A. Unique TensorFlower
- Feature in 0.4.5 is required for cuda_configure to stay fix #7575 Change: 150376995
-
Committed by Neal Wu
Change: 150374176
-
Committed by Benoit Steiner
Change: 150373647
-
Committed by Martin Wicke
Change: 150372607
-
Committed by Alexey Surkov
Change: 150369708
-
Committed by A. Unique TensorFlower
Change: 150364996
-
Committed by Dandelion Mané
Change: 150364914
-
Committed by Benoit Steiner
Change: 150363523
-
Committed by Dandelion Mané
- Add an "ignore y-outliers" option (default true). When true, the domain is calculated based on the middle 8 deciles of the y range, i.e. the lowest 10% and the highest 10% of the data are ignored for domain calculation purposes. Also, this is done with the smoothed data rather than the raw data. This means that brief spikes or an initially high loss value will not distort the chart. This can be disabled to view the full data domain.
- If the y values are all in the range [0, 1], then the domain is automatically set to be approximately [0, 1]. That makes it easy to compare proportional values.
- Added bold gridlines at the origin (0,0) to make the charts easier to understand at a glance.
Change: 150362432
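The domain rules in this entry can be sketched as follows. This is an illustration only, not the TensorBoard implementation: the function name `compute_y_domain`, the percentile indexing method, and the order in which the two rules are checked are all assumptions, and the input is assumed to be already smoothed.

```python
import math


def compute_y_domain(smoothed_ys, ignore_outliers=True):
    """Return a (low, high) y-axis domain for already-smoothed values."""
    ys = sorted(smoothed_ys)
    if not ys:
        raise ValueError("need at least one y value")

    # Rule: values that all lie in [0, 1] get the fixed domain [0, 1].
    # (The commit says "approximately"; any padding is not modelled here.)
    if ys[0] >= 0.0 and ys[-1] <= 1.0:
        return (0.0, 1.0)

    if not ignore_outliers:
        return (ys[0], ys[-1])

    # Rule: keep the middle 8 deciles, i.e. drop the lowest and highest 10%.
    # The indexing below is one simple percentile convention, chosen for
    # illustration.
    last = len(ys) - 1
    low = ys[int(math.floor(0.1 * last))]
    high = ys[int(math.ceil(0.9 * last))]
    return (low, high)


# A steady ramp from 0 to 100 with a single large spike.
values = list(range(101)) + [10000]
print(compute_y_domain(values))
print(compute_y_domain(values, ignore_outliers=False))
```

With outliers ignored, the single spike no longer stretches the domain; with the option disabled, the full range including the spike is returned.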
-
Committed by Shanqing Cai
Change: 150361253
-
Committed by A. Unique TensorFlower
Since there are no multi-machine tests, this is likely preferable from a security standpoint to binding to all interfaces, and sidesteps the question of cross-platform compatibility (some platforms may have IPV6_V6ONLY enabled by default). Change: 150358613
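The subject line of this commit is not preserved here, so the exact change is unknown. Assuming it makes a test server listen on the loopback address instead of on all interfaces, the idea can be sketched with the standard `socket` module; the helper name `bind_localhost` is hypothetical:

```python
import socket


def bind_localhost(port=0):
    """Listen on the IPv4 loopback address only (port 0 = any free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # "127.0.0.1" is reachable only from this machine. Binding to "" or
    # "0.0.0.0" would instead accept connections on every interface.
    sock.bind(("127.0.0.1", port))
    sock.listen(1)
    return sock


server = bind_localhost()
host, port = server.getsockname()
print("listening on %s:%d" % (host, port))
server.close()
```

Using an explicit IPv4 loopback address also avoids depending on whether a dual-stack IPv6 socket accepts IPv4 connections, which is what the `IPV6_V6ONLY` remark refers to.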
-
Committed by Jonathan Hseu
This change allows importing from tensorflow.python.estimator.* for usage in tests. Change: 150356830
-
Committed by A. Unique TensorFlower
Change: 150356236
-
Committed by A. Unique TensorFlower
Change: 150351473
-
Committed by A. Unique TensorFlower
Change: 150348366
-
Committed by Benoit Steiner
Change: 150345708
-
Committed by Shanqing Cai
I realized that CL/149810103 forgot to include the tfdbg binary and examples in the pip package. This CL fixes that and adds pip test for the correct inclusion of these files. Change: 150345489
-
Committed by Gunhan Gulsoy
Change: 150344647
-
Committed by A. Unique TensorFlower
Change: 150343323
-