- 18 Mar, 2017 27 commits
-
-
Committed by Yao Zhang
Change: 150489919
-
Committed by Brennan Saeta
In order to clarify ownership, this change moves the remote devices from unmanaged raw pointers to std::unique_ptr. In the process, I have removed some confusing comments regarding ownership that are now unnecessary, as the types are both correct and enforced by the compiler. Change: 150489013
-
Committed by A. Unique TensorFlower
Change: 150488705
-
Committed by Benoit Steiner
Change: 150488108
-
Committed by Neal Wu
Change: 150487597
-
Committed by Alexey Surkov
Otherwise these failures aren't currently covered by any retry logic. Change: 150486764
-
Committed by Mustafa Ispir
Change: 150479545
-
Committed by A. Unique TensorFlower
The docstring incorrectly claimed that momentum was ignored when a variable slice wasn't used in the sparse version of the algorithm. Change: 150477769
-
Committed by A. Unique TensorFlower
Change: 150477638
-
Committed by Yuefeng Zhou
Make the queue runner own the metadata and mutex. Change: 150475730
-
Committed by A. Unique TensorFlower
Change: 150474503
-
Committed by A. Unique TensorFlower
Change: 150471440
-
Committed by Mark Heffernan
Make mapping from calling instruction to CallSite one-to-one by extending CallSite to handle more than one called computation. This enables instructions like kWhile, which call two computations, to be represented as a single CallSite. Also add a mapping from instruction to CallSite in CallGraphNode to enable fast call site lookup. Also, include a few other opportunistic improvements:
* Change CallGraph::Build factory to return a std::unique_ptr. This enables, for example, more convenient use of CallGraph as a data member of a class.
* Change a few uses of unordered_set/map to FlatSet/Map.
Change: 150469958
-
Committed by Mark Daoust
Change: 150466902
-
Committed by Gunhan Gulsoy
Change: 150463151
-
Committed by A. Unique TensorFlower
Change: 150462131
-
Committed by A. Unique TensorFlower
Change: 150460833
-
Committed by Asim Shankar
And also convenience functions for feeding and fetching using handles to outputs/operations instead of their names. Change: 150460697
-
Committed by Jonathan Hseu
Change: 150460215
-
Committed by Yangzihao Wang
Change: 150459431
-
Committed by A. Unique TensorFlower
Improves performance of tf.matmul(a, b, ...) for dense tensors on NVIDIA GPUs in the following cases:
a) If the inner-most dimension of b is 1, i.e. the operation is (possibly a batch of) matrix*vector multiplication(s). This is accomplished by calling cuBLAS GEMV rather than GEMM. This speeds up large matrix-vector products by about 4x.
b) If one or more dimensions are unknown at graph construction time but the operation is in fact either a single matrix*matrix or matrix*vector multiplication.

The following benchmark numbers, illustrating the improvements for matrix * vector products, were measured on an NVIDIA Titan X (Maxwell) card.

Benchmark                                       Base (ns)  New (ns)  Improvement
--------------------------------------------------------------------------------
BM_Matmul_50_50_1_false_false_DT_FLOAT_gpu          18102     17056        +5.8%
BM_Matmul_50_50_1_true_false_DT_FLOAT_gpu           18108     16374        +9.6%
BM_Matmul_50_50_1_false_true_DT_FLOAT_gpu           18153     17173        +5.4%
BM_Matmul_50_50_1_true_true_DT_FLOAT_gpu            18150     15950       +12.1%
BM_Matmul_500_500_1_false_false_DT_FLOAT_gpu        64605     16874       +73.9%
BM_Matmul_500_500_1_true_false_DT_FLOAT_gpu         62810     17298       +72.5%
BM_Matmul_500_500_1_false_true_DT_FLOAT_gpu         60447     17014       +71.9%
BM_Matmul_500_500_1_true_true_DT_FLOAT_gpu          58443     16934       +71.0%
BM_Matmul_2000_2000_1_false_false_DT_FLOAT_gpu     343298     81898       +76.1%
BM_Matmul_2000_2000_1_true_false_DT_FLOAT_gpu      294738     63723       +78.4%
BM_Matmul_2000_2000_1_false_true_DT_FLOAT_gpu      300671     83650       +72.2%
BM_Matmul_2000_2000_1_true_true_DT_FLOAT_gpu       284540     63742       +77.6%

Change: 150456725
-
Committed by Zakaria Haque
Change: 150452316
-
Committed by A. Unique TensorFlower
folding. Change: 150450219
-
Committed by A. Unique TensorFlower
Change: 150450082
-
Committed by A. Unique TensorFlower
Change: 150449788
-
Committed by Mustafa Ispir
Change: 150447439
-
Committed by Mustafa Ispir
Change: 150443246
-
- 17 Mar, 2017 13 commits
-
-
Committed by A. Unique TensorFlower
FileExistsError doesn't exist in Python 2.7. Change: 150436736
-
Committed by A. Unique TensorFlower
- Removes getArgumentList() in favor of args() etc in XLA (LLVM r298010) Change: 150427660
-
Committed by Eugene Brevdo
The main change is that RNNCells that wrap other RNNCells now override self.zero_state to call the wrapped cell's zero_state and then (maybe) perform some post-processing... instead of relying on the state_size property to provide all information about the state. Also made zero_state calls create ops inside their own name scope. Change: 150413265
-
Committed by Eugene Brevdo
Change: 150400474
-
Committed by Benoit Steiner
Change: 150397471
-
Committed by A. Unique TensorFlower
Change: 150396376
-
Committed by Gunhan Gulsoy
Change: 150389857
-
Committed by Benoit Steiner
Change: 150388263
-
Committed by A. Unique TensorFlower
Change: 150385615
-
Committed by A. Unique TensorFlower
Change: 150384503
-
Committed by Skye Wanderman-Milne
This is a first step towards supporting functions in C++ graph construction, e.g. being able to import GraphDefs with functions. Change: 150382046
-
Committed by A. Unique TensorFlower
This update contains refinements to the charts in the scalars dashboard. Change: 150380169
-
Committed by A. Unique TensorFlower
This involves adding a toggle to the health pills info box in the graph visualizer. When that toggle is enabled, TensorBoard makes a request for health pills at step X when the user moves the slider. This feature can be very slow because it requires reading from disk. Viewing health pills at, say, step 100,000 could take anywhere from minutes to an hour. We must design ways to make this faster (for instance, have the debugger write events at a much greater frequency only after it encounters a bad value). Change: 150379929
-