Commits · f4b237f8cdd25a45dc26adc61c3086c2575f5396 · xxadev / tensorflow

Mar 18, 2017 · 8 commits

Added data type info to conv autotune parameters. · f4b237f8
Committed by Yangzihao Wang on Mar 17, 2017
```
Change: 150459431
```

Improves performance of tf.matmul(a, b, ...) for dense tensors on NVIDIA GPUs... · 49f14738

Committed by A. Unique TensorFlower on Mar 17, 2017

Improves performance of tf.matmul(a, b, ...) for dense tensors on NVIDIA GPUs in the following cases:

a) If the inner-most dimension of b is 1, i.e. the operation is (possibly a batch of) matrix*vector multiplication(s). This is accomplished by calling cuBLAS GEMV rather than GEMM. This speeds up large matrix-vector products by about 4x.

b) If one or more dimensions are unknown at graph construction time but the operation is in fact either a single matrix*matrix or matrix*vector multiplication.

The following benchmark numbers, which illustrate the improvements for matrix * vector products, were measured on an NVIDIA Titan X (Maxwell) card.

| Benchmark | Base (ns) | New (ns) | Improvement |
|---|---:|---:|---:|
| `BM_Matmul_50_50_1_false_false_DT_FLOAT_gpu` | 18102 | 17056 | +5.8% |
| `BM_Matmul_50_50_1_true_false_DT_FLOAT_gpu` | 18108 | 16374 | +9.6% |
| `BM_Matmul_50_50_1_false_true_DT_FLOAT_gpu` | 18153 | 17173 | +5.4% |
| `BM_Matmul_50_50_1_true_true_DT_FLOAT_gpu` | 18150 | 15950 | +12.1% |
| `BM_Matmul_500_500_1_false_false_DT_FLOAT_gpu` | 64605 | 16874 | +73.9% |
| `BM_Matmul_500_500_1_true_false_DT_FLOAT_gpu` | 62810 | 17298 | +72.5% |
| `BM_Matmul_500_500_1_false_true_DT_FLOAT_gpu` | 60447 | 17014 | +71.9% |
| `BM_Matmul_500_500_1_true_true_DT_FLOAT_gpu` | 58443 | 16934 | +71.0% |
| `BM_Matmul_2000_2000_1_false_false_DT_FLOAT_gpu` | 343298 | 81898 | +76.1% |
| `BM_Matmul_2000_2000_1_true_false_DT_FLOAT_gpu` | 294738 | 63723 | +78.4% |
| `BM_Matmul_2000_2000_1_false_true_DT_FLOAT_gpu` | 300671 | 83650 | +72.2% |
| `BM_Matmul_2000_2000_1_true_true_DT_FLOAT_gpu` | 284540 | 63742 | +77.6% |

Change: 150456725
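
The dispatch in case (a) can be sketched in plain Python on nested lists. This is only an illustration of the idea, not the TensorFlow kernel: `matmat`, `matvec`, and `matmul_dispatch` are hypothetical names, with `matvec` standing in for a GEMV call and `matmat` for a GEMM call.

```python
def matmat(a, b):
    """General matrix * matrix product (stand-in for a GEMM call)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def matvec(a, x):
    """Matrix * vector product (stand-in for a GEMV call)."""
    return [sum(a_ik * x_k for a_ik, x_k in zip(row, x)) for row in a]


def matmul_dispatch(a, b):
    """Pick the matrix*vector path when the inner-most dimension of b is 1."""
    if len(b[0]) == 1:
        x = [row[0] for row in b]            # flatten the n x 1 column
        return [[y] for y in matvec(a, x)]   # restore the m x 1 shape
    return matmat(a, b)


a = [[1, 2], [3, 4]]
print(matmul_dispatch(a, [[5], [6]]))        # takes the matvec path
print(matmul_dispatch(a, [[1, 0], [0, 1]]))  # takes the matmat path
```

Both paths return the same values for an n x 1 right-hand side; the commit's point is that the specialized routine is faster for that shape.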

Removes an unnecessary check that blocks using multihead with custom heads. · bb7cadbc
Committed by Zakaria Haque on Mar 17, 2017
```
Change: 150452316
```
Set producer version in the graph used by shape refiner to run constant folding. · 5a95c76c
Committed by A. Unique TensorFlower on Mar 17, 2017
```
Change: 150450219
```
Fix mis-spelling. · 8e138e72
Committed by A. Unique TensorFlower on Mar 17, 2017
```
Change: 150450082
```
[XLA] Add a test for the remainder of two scalar U32s. · 0151bae9
Committed by A. Unique TensorFlower on Mar 17, 2017
```
Change: 150449788
```
Copied global step tests from contrib to core. · b26720ca
Committed by Mustafa Ispir on Mar 17, 2017
```
Change: 150447439
```
Tested features, labels, and mode in Estimator.export · 55565d3d
Committed by Mustafa Ispir on Mar 17, 2017
```
Change: 150443246
```

Mar 17, 2017 · 32 commits
- Use e.errno instead of trying to except FileExistsError. · cf1a2324
  Committed by A. Unique TensorFlower on Mar 17, 2017
```
FileExistsError doesn't exist in Python 2.7.
Change: 150436736
```
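
A minimal sketch of the pattern this commit describes, assuming the goal is to tolerate an already-existing directory. `ensure_dir` is a hypothetical helper, not the code touched by the commit.

```python
import errno
import os
import tempfile


def ensure_dir(path):
    """Create path, tolerating the case where it already exists."""
    try:
        os.makedirs(path)
    except OSError as e:
        # FileExistsError is Python 3 only; e.errno works on 2.7 and 3.x.
        if e.errno != errno.EEXIST:
            raise


base = tempfile.mkdtemp()
target = os.path.join(base, "logs")
ensure_dir(target)
ensure_dir(target)  # second call hits EEXIST and is ignored
print(os.path.isdir(target))
```

On Python 3, `FileExistsError` is a subclass of `OSError`, so the same `except` clause covers both interpreter versions.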
- Update XLA for removal of TargetOptions::LessPreciseFPMADOption (LLVM r298023) · f63e985e
  Committed by A. Unique TensorFlower on Mar 17, 2017
```
Also removes getArgumentList() in favor of args() etc. in XLA (LLVM r298010)
Change: 150427660
```
- Updates to RNNCells to allow easy storage of attention TensorArray in the state. · 03abac7f
  Committed by Eugene Brevdo on Mar 16, 2017
```
The main change is that RNNCells that wrap other RNNCells now override self.zero_state to call the wrapped cell's zero_state and then (maybe) perform some post-processing... instead of relying on the state_size property to provide all information about the state.

Also made zero_state calls create ops inside their own name scope.
Change: 150413265
```
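
The delegation described in this message can be sketched without TensorFlow. `BasicCell` and `HistoryWrapper` below are hypothetical toy classes, and the plain Python list only stands in for a TensorArray.

```python
class BasicCell(object):
    """Toy cell whose state is a list of zeros (hypothetical, not a TF class)."""

    def __init__(self, num_units):
        self.num_units = num_units

    def zero_state(self, batch_size):
        return [[0.0] * self.num_units for _ in range(batch_size)]


class HistoryWrapper(object):
    """Toy wrapper that carries extra per-example state next to the wrapped cell's."""

    def __init__(self, cell):
        self._cell = cell

    def zero_state(self, batch_size):
        # Delegate to the wrapped cell, then post-process the result.
        inner = self._cell.zero_state(batch_size)
        history = [[] for _ in range(batch_size)]  # stands in for a TensorArray
        return {"cell_state": inner, "attention_history": history}


state = HistoryWrapper(BasicCell(num_units=3)).zero_state(batch_size=2)
print(state["cell_state"])
print(state["attention_history"])
```

Because the wrapper builds its state by calling the wrapped cell, it can add state that a fixed `state_size` description would not capture.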
- Initial cut of documentation for tf.contrib.seq2seq · 9cc50983
  Committed by Eugene Brevdo on Mar 16, 2017
```
Change: 150400474
```
- Added an option to disable the collection of detailed statistics in grappler · 5135546b
  Committed by Benoit Steiner on Mar 16, 2017
```
Change: 150397471
```
- tfdbg: Created a GRPC-based hook that streams debugger-related events. · 9c7e4964
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
Change: 150396376
```
- A simple script to test TF contrib. · 89792d70
  Committed by Gunhan Gulsoy on Mar 16, 2017
```
Change: 150389857
```
- Sleep forever to trigger the timeout consistently · a93d70c5
  Committed by Benoit Steiner on Mar 16, 2017
```
Change: 150388263
```
- Fix separable convolution bias check · a0ca4bcb
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
Change: 150385615
```
- [Tensorflow] Expose API to lookup TensorSlice. · c092f31c
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
Change: 150384503
```
- Replace OpRegistryInterface* with FunctionLibraryDefinition in Graph. · 433c8c89
  Committed by Skye Wanderman-Milne on Mar 16, 2017
```
This is a first step towards supporting functions in C++ graph construction, e.g. being able to import GraphDefs with functions.
Change: 150382046
```
- Update opensource vulcanized HTML file for Tensorboard. · 57737d10
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
This update contains refinements to the charts in the scalars dashboard.
Change: 150380169
```
- Let the user view health pills at any step. · 1316eeb6
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
This involves adding a toggle to the health pills info box in the graph visualizer. When that toggle is enabled, TensorBoard makes a request for health pills at step X when the user moves the slider.

This feature can be very slow because it requires reading from disk. Viewing health pills at, say, step 100,000 could take minutes to an hour. We must design ways to make this faster (for instance, have the debugger write events at a much greater frequency only after it encounters a bad value).
Change: 150379929
```
- Upgrade bazel to 0.4.5. · e05cfa65
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
 - A feature in 0.4.5 is required for cuda_configure to stay fixed (#7575)
Change: 150376995
```
- Move the note for gcc version 5 to just below the bazel build command where it actually applies · c33551ea
  Committed by Neal Wu on Mar 16, 2017
```
Change: 150374176
```
- Allow soft placement on all the types of clusters. · c7a466b4
  Committed by Benoit Steiner on Mar 16, 2017
```
Change: 150373647
```
- Remove old doc generator. · 5393b576
  Committed by Martin Wicke on Mar 16, 2017
```
Change: 150372607
```
- Smarter retry logic for non-idempotent file operations such as RenameFile, DeleteFile or DeleteDir. · a8cd6ff8
  Committed by Alexey Surkov on Mar 16, 2017
```
Change: 150369708
```
- Change build visibility of tensorflow/compiler/tf2xla:xla_compiler · 791c3ebc
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
Change: 150364996
```
- Autogenerated Change: Change TensorBoard TAG to 47 · 2ab46468
  Committed by Dandelion Mané on Mar 16, 2017
```
Change: 150364914
```
- Automated rollback of change 150344647 · cc5e9c0e
  Committed by Benoit Steiner on Mar 16, 2017
```
Change: 150363523
```
- Improve TensorBoard scalar dashboard domain calculation. · f73a90a3
  Committed by Dandelion Mané on Mar 16, 2017
```
- Add an "ignore y-outliers" option (default true). When true, the domain is calculated based on the middle 8 deciles of the y range, i.e. the lowest 10% and the highest 10% of the data are ignored for domain calculation purposes. Also, this is done with the smoothed data rather than the raw data. This means that brief spikes or an initially high loss value will not distort the chart. This can be disabled to view the full data domain.
- If the y values are all in the range [0, 1], then the domain is automatically set to be approximately [0, 1]. That makes it easy to compare proportional values.
- Added bold gridlines at the origin (0,0) to make the charts easier to understand at a glance.
Change: 150362432
```
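
A rough sketch of the two domain rules above. Several parts are assumptions: `y_domain` is a hypothetical name, the input is a non-empty list, the decile cut uses simple index arithmetic, the [0, 1] rule is checked first and returns exactly (0.0, 1.0), and smoothing is left out. The actual TensorBoard code may differ on all of these points.

```python
def y_domain(values, ignore_outliers=True):
    """Return a (low, high) chart domain for a non-empty list of y values."""
    ordered = sorted(values)
    if ordered[0] >= 0.0 and ordered[-1] <= 1.0:
        return (0.0, 1.0)  # proportional values share one scale
    if ignore_outliers:
        n = len(ordered)
        # Keep the middle 8 deciles: skip the lowest and highest 10%.
        ordered = ordered[n // 10 : n - n // 10]
    return (ordered[0], ordered[-1])


losses = [100.0] + [float(v) for v in range(1, 20)]  # one large initial spike
print(y_domain(losses))
print(y_domain(losses, ignore_outliers=False))
print(y_domain([0.2, 0.9, 0.5]))
```

With the outlier option on, the single spike of 100.0 no longer stretches the domain.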
- tfdbg: make debug_gateway_test a CPU-only test for now · bc1763c4
  Committed by Shanqing Cai on Mar 16, 2017
```
Change: 150361253
```
- Switch debug_grpc_testlib to bind on localhost · d115da70
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
Since there are no multi-machine tests, this is likely preferable from a
security standpoint to binding to all interfaces, and sidesteps the question of
cross-platform compatibility (some platforms may have IPV6_V6ONLY enabled by
default).
Change: 150358613
```
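
The difference between the two bind choices can be illustrated with a plain socket. This is a generic sketch, not the debug_grpc_testlib code; `bind_loopback` is a hypothetical helper, and it assumes IPv4 loopback is available.

```python
import socket


def bind_loopback():
    """Return a listening TCP socket reachable only from the local machine."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # "127.0.0.1" restricts the listener to loopback; "0.0.0.0" would
    # accept connections on every interface. Port 0 lets the OS pick a port.
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    return server


server = bind_loopback()
host, port = server.getsockname()
print(host, port)
server.close()
```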
- Fix estimator tests when running against the installed pip package. · 424662da
  Committed by Jonathan Hseu on Mar 16, 2017
```
This change allows importing from tensorflow.python.estimator.* for use in tests.
Change: 150356830
```
- Add Dev Summit video. · e97d8947
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
Change: 150356236
```
- Forwards parallel_iterations arg in stack_bidirectional_dynamic_rnn(). · a7d6015d
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
Change: 150351473
```
- Embedded video in docs. · 91d34b20
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
Change: 150348366
```
- Fixed a race condition in the single_cluster code · 73394f4c
  Committed by Benoit Steiner on Mar 16, 2017
```
Change: 150345708
```
- tfdbg: test examples and binaries against pip install · e8aefd21
  Committed by Shanqing Cai on Mar 16, 2017
```
I realized that CL/149810103 forgot to include the tfdbg binary and examples in the pip package.

This CL fixes that and adds a pip test for the correct inclusion of these files.
Change: 150345489
```
- Disable flaky grappler:utils_test · a02f6037
  Committed by Gunhan Gulsoy on Mar 16, 2017
```
Change: 150344647
```
- Optimize LiteralUtil::Replicate and add a unit test · b1a02e97
  Committed by A. Unique TensorFlower on Mar 16, 2017
```
Change: 150343323
```

xxadev / tensorflow is in sync with the fork source project
