- 18 Mar, 2017 31 commits
-
-
Committed by Mark Daoust
Change: 150493886
-
Committed by Shanqing Cai
Other minor changes:
* Tweaks in the doc about source line annotation
* Tweak in the run-start message
Change: 150493562
-
Committed by Sukriti Ramesh
Change: 150493560
-
Committed by A. Unique TensorFlower
Change: 150490700
-
Committed by Yao Zhang
Change: 150489919
-
Committed by Brennan Saeta
In order to clarify ownership, this change moves the remote devices from unmanaged raw pointers to std::unique_ptr. In the process, I have removed some confusing comments regarding ownership that are now unnecessary, as the types are both correct and enforced by the compiler. Change: 150489013
-
Committed by A. Unique TensorFlower
Change: 150488705
-
Committed by Benoit Steiner
Change: 150488108
-
Committed by Neal Wu
Change: 150487597
-
Committed by Alexey Surkov
Otherwise these failures aren't currently covered by any retry logic. Change: 150486764
-
Committed by Mustafa Ispir
Change: 150479545
-
Committed by A. Unique TensorFlower
The docstring incorrectly claimed that momentum was ignored when a variable slice wasn't used in the sparse version of the algorithm. Change: 150477769
-
Committed by A. Unique TensorFlower
Change: 150477638
-
Committed by Yuefeng Zhou
Make the queue runner own the metadata and mutex. Change: 150475730
-
Committed by A. Unique TensorFlower
Change: 150474503
-
Committed by A. Unique TensorFlower
Change: 150471440
-
Committed by Mark Heffernan
Make mapping from calling instruction to CallSite one-to-one by extending CallSite to handle more than one called computation. This enables instructions like kWhile, which call two computations, to be represented as a single CallSite. Also add a mapping from instruction to CallSite in CallGraphNode to enable fast call site lookup.
Also, include a few other opportunistic improvements:
* Change CallGraph::Build factory to return a std::unique_ptr. This enables, for example, more convenient use of CallGraph as a data member of a class.
* Change a few uses of unordered_set/map to FlatSet/Map.
Change: 150469958
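The one-to-one mapping described in this commit can be illustrated with a small Python sketch. This is not the XLA C++ code: the class and method names below (`CallSite`, `CallGraphNode`, `add_callsite`, `get_callsite`) are hypothetical stand-ins, and instructions and computations are represented as plain strings for illustration only.

```python
from collections import namedtuple

# Hypothetical stand-in: one call site per calling instruction, holding
# every computation that instruction calls (a while loop calls two).
CallSite = namedtuple("CallSite", ["instruction", "called_computations"])


class CallGraphNode(object):
    """Toy node that keeps call sites plus an index for fast lookup."""

    def __init__(self):
        self._callsites = []
        self._index = {}  # instruction -> position in self._callsites

    def add_callsite(self, instruction, called_computations):
        # Enforce the one-to-one property: an instruction gets one call site.
        if instruction in self._index:
            raise ValueError("instruction already has a call site")
        self._index[instruction] = len(self._callsites)
        self._callsites.append(
            CallSite(instruction, tuple(called_computations)))

    def get_callsite(self, instruction):
        # Dictionary lookup instead of scanning the list of call sites.
        position = self._index.get(instruction)
        return None if position is None else self._callsites[position]


node = CallGraphNode()
node.add_callsite("while.1", ["condition", "body"])
site = node.get_callsite("while.1")
```

Here a single call site carries both the condition and the body of the loop, so a lookup by instruction returns at most one entry.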
-
Committed by Mark Daoust
Change: 150466902
-
Committed by Gunhan Gulsoy
Change: 150463151
-
Committed by A. Unique TensorFlower
Change: 150462131
-
Committed by A. Unique TensorFlower
Change: 150460833
-
Committed by Asim Shankar
And also convenience functions for feeding and fetching using handles to outputs/operations instead of their names. Change: 150460697
-
Committed by Jonathan Hseu
Change: 150460215
-
Committed by Yangzihao Wang
Change: 150459431
-
Committed by A. Unique TensorFlower
Improves performance of tf.matmul(a, b, ...) for dense tensors on NVIDIA GPUs in the following cases:
a) If the inner-most dimension of b is 1, i.e. the operation is (possibly a batch of) matrix*vector multiplication(s). This is accomplished by calling cuBLAS GEMV rather than GEMM. This speeds up large matrix-vector products by about 4x.
b) If one or more dimensions are unknown at graph construction time but the operation is in fact either a single matrix*matrix or matrix*vector multiplication.
The following benchmark numbers illustrating the improvements for matrix * vector products were measured on an NVIDIA Titan X (Maxwell) card.

Benchmark                                        Base (ns)  New (ns)  Improvement
----------------------------------------------------------------------------------
BM_Matmul_50_50_1_false_false_DT_FLOAT_gpu           18102     17056        +5.8%
BM_Matmul_50_50_1_true_false_DT_FLOAT_gpu            18108     16374        +9.6%
BM_Matmul_50_50_1_false_true_DT_FLOAT_gpu            18153     17173        +5.4%
BM_Matmul_50_50_1_true_true_DT_FLOAT_gpu             18150     15950       +12.1%
BM_Matmul_500_500_1_false_false_DT_FLOAT_gpu         64605     16874       +73.9%
BM_Matmul_500_500_1_true_false_DT_FLOAT_gpu          62810     17298       +72.5%
BM_Matmul_500_500_1_false_true_DT_FLOAT_gpu          60447     17014       +71.9%
BM_Matmul_500_500_1_true_true_DT_FLOAT_gpu           58443     16934       +71.0%
BM_Matmul_2000_2000_1_false_false_DT_FLOAT_gpu      343298     81898       +76.1%
BM_Matmul_2000_2000_1_true_false_DT_FLOAT_gpu       294738     63723       +78.4%
BM_Matmul_2000_2000_1_false_true_DT_FLOAT_gpu       300671     83650       +72.2%
BM_Matmul_2000_2000_1_true_true_DT_FLOAT_gpu        284540     63742       +77.6%

Change: 150456725
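The "Improvement" column and the "about 4x" figure measure the same change in two different ways. The short Python sketch below shows the arithmetic, assuming that Improvement is the fraction of time saved, (base - new) / base, which matches the reported rows. The helper names `improvement` and `speedup` are illustrative and not part of TensorFlow.

```python
def improvement(base_ns, new_ns):
    """Fraction of the baseline time that was saved."""
    return (base_ns - new_ns) / float(base_ns)


def speedup(base_ns, new_ns):
    """How many times faster the new timing is than the baseline."""
    return base_ns / float(new_ns)


# Timings taken from the 500x500 and 2000x2000 false/false rows above.
imp_500 = improvement(64605, 16874)
speed_500 = speedup(64605, 16874)
speed_2000 = speedup(343298, 81898)
```

A time saving in the 71% to 78% range corresponds to a speedup of roughly 3.5x to 4.6x, which is where the "about 4x" summary comes from.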
-
Committed by Zakaria Haque
Change: 150452316
-
Committed by A. Unique TensorFlower
folding. Change: 150450219
-
Committed by A. Unique TensorFlower
Change: 150450082
-
Committed by A. Unique TensorFlower
Change: 150449788
-
Committed by Mustafa Ispir
Change: 150447439
-
Committed by Mustafa Ispir
Change: 150443246
-
- 17 Mar, 2017 9 commits
-
-
Committed by A. Unique TensorFlower
FileExistsError doesn't exist in Python 2.7. Change: 150436736
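The log does not say how the code was changed. One common way to handle an "already exists" failure on both Python 2.7 and Python 3 is to catch OSError and inspect its errno, as in the sketch below. The helper name `make_dir_if_missing` is hypothetical and is not taken from the commit.

```python
import errno
import os
import tempfile


def make_dir_if_missing(path):
    """Create `path`; return True if created, False if it already existed."""
    try:
        os.mkdir(path)
    except OSError as e:
        # On Python 3, FileExistsError is a subclass of OSError with
        # errno EEXIST; on Python 2.7 only OSError is available.
        if e.errno != errno.EEXIST:
            raise
        return False
    return True


scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "logs")
first = make_dir_if_missing(target)
second = make_dir_if_missing(target)
```

Checking `errno.EEXIST` keeps other failures, such as permission errors, from being silently swallowed.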
-
Committed by A. Unique TensorFlower
- Removes getArgumentList() in favor of args() etc in XLA (LLVM r298010) Change: 150427660
-
Committed by Eugene Brevdo
The main change is that RNNCells that wrap other RNNCells now override self.zero_state to call the wrapped cell's zero_state and then (maybe) perform some post-processing... instead of relying on the state_size property to provide all information about the state. Also made zero_state calls create ops inside their own name scope. Change: 150413265
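The delegation pattern this commit describes can be sketched in plain Python without TensorFlow. The classes `ToyCell` and `ToyWrapper` below are hypothetical stand-ins rather than the real RNNCell API, and the "post-processing" step (attaching a step counter) is an invented example of extra state that `state_size` alone would not describe.

```python
class ToyCell(object):
    """Stand-in for a basic cell whose state is a vector of zeros."""

    def __init__(self, num_units):
        self.state_size = num_units

    def zero_state(self, batch_size, dtype=float):
        return [[dtype(0)] * self.state_size for _ in range(batch_size)]


class ToyWrapper(object):
    """Stand-in for a cell that wraps another cell."""

    def __init__(self, cell):
        self._cell = cell

    @property
    def state_size(self):
        return self._cell.state_size

    def zero_state(self, batch_size, dtype=float):
        # Ask the wrapped cell for its own initial state instead of
        # rebuilding it from state_size.
        inner = self._cell.zero_state(batch_size, dtype)
        # Hypothetical post-processing: add wrapper-specific state.
        return {"cell_state": inner, "step": 0}


wrapped = ToyWrapper(ToyCell(num_units=3))
state = wrapped.zero_state(batch_size=2)
```

Because the wrapper calls the inner `zero_state`, any custom initial state produced by the wrapped cell is preserved through the wrapper.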
-
Committed by Eugene Brevdo
Change: 150400474
-
Committed by Benoit Steiner
Change: 150397471
-
Committed by A. Unique TensorFlower
Change: 150396376
-
Committed by Gunhan Gulsoy
Change: 150389857
-
Committed by Benoit Steiner
Change: 150388263
-
Committed by A. Unique TensorFlower
Change: 150385615
-