- 10 10月, 2017 17 次提交
-
-
由 Zhijiang 提交于
This closes #4735.
-
由 Zhijiang 提交于
This closes #4499.
-
由 Zhijiang 提交于
This closes #4485.
-
由 Aljoscha Krettek 提交于
Before, we were printing the exception (and stack trace) when the Async I/O Emitter was receving an interrupted exception. The interrupt, however is part of the normal shutdown process of the Emitter and the log message was causing concern because a stack trace usually indicates something went wrong.
-
由 Michael Fong 提交于
Refactor the current usage of on Cassandra Sink w/ more in-depth information. Provides examples for Pojo and Java Tuple data types This closes #4696.
-
由 Chris Ward 提交于
This closes #4760.
-
由 sirko bretschneider 提交于
This closes #4756.
-
由 zentol 提交于
[FLINK-7072] [REST] Extend Dispatcher [FLINK-7072] [REST] Add handlers for job submit/cancel/stop [FLINK-7072] [REST] CLI integration use ExecutorThradFactory + rebase(blobKey fix) add "Flink" prefix to RestCC threads shutdown client for cancel/shutdown Rework CliFrontEnd Stop/Cancel tests These tests verified that the CLI was sending the correct messages and parameters to the JM actor. This is now handled by the ClusterClient, so the tests were adjusted to verify that the correct methods on the ClusterClient are being called. Additional tests were added to the ClusterClientTest class to verify that the correct messages and parameters are being sent. This closes #4742.
-
由 zentol 提交于
-
由 Piotr Nowojski 提交于
-
由 Piotr Nowojski 提交于
-
由 Piotr Nowojski 提交于
-
由 Piotr Nowojski 提交于
-
由 Piotr Nowojski 提交于
-
由 Piotr Nowojski 提交于
-
由 Aljoscha Krettek 提交于
-
由 Aljoscha Krettek 提交于
-
- 09 10月, 2017 2 次提交
-
-
由 zentol 提交于
This closes #4773.
-
由 James Lafa 提交于
This closes #4647.
-
- 07 10月, 2017 9 次提交
-
-
由 Nico Kruber 提交于
Since we'd like to use our own off-heap buffers for network communication, we cannot use HeapMemorySegment anymore and need to rely on HybridMemorySegment. We thus drop any code that loads the HeapMemorySegment (it is still available if needed) in favour of the HybridMemorySegment which is able to work on both heap and off-heap memory. For the performance penalty of this change compared to using HeapMemorySegment alone, see this interesting blob article (from 2015): https://flink.apache.org/news/2015/09/16/off-heap-memory.html This closes #4445
-
由 Nico Kruber 提交于
We deliberately ignore redundant modifiers for now since we want `final` modifiers on `final` classes for increased future-proofness.
-
由 Gary Yao 提交于
This closes #4755
-
由 yew1eb 提交于
This closes #4754
-
由 Piotr Nowojski 提交于
- Set shorter heartbeats intervals. Default pause value of 60seconds is too large (tests would timeout before akka react) - Exclude netty dependency from zookeeper. Zookeeper was pulling in conflicting Netty version. Conflict was extremly subtle - TaskManager in Kafka tests was deadlocking in some rare corner cases. This closes #4775
-
由 Nico Kruber 提交于
BlobCacheDeleteTest did not account for the server executing the delete call of a transient BLOB after acknowledging the request. This resulted in the testDeleteTransientLocalFails* tests failing. This closes #4782
-
由 Stephan Ewen 提交于
Now that Hadoop is an optinal dependency, all the explicit exclusions during shading are no longer needed.
-
由 Stephan Ewen 提交于
The examples are not part of dist's dependencies, hence no need to exclude them from the fat jar. The exclusion did not work anyways, because it used wrong artifact names (not using scala suffixes).
-
由 Stephan Ewen 提交于
-
- 06 10月, 2017 6 次提交
-
-
由 Stephan Ewen 提交于
This changes the discovery mechanism of file from static class name configurations to a service mechanism (META-INF/services). As part of that, it factors HDFS and MapR FS implementations into separate modules. With this change, users can add new filesystem implementations and make them available by simply adding them to the class path.
-
由 Stephan Ewen 提交于
-
由 Stephan Ewen 提交于
This was done reflectively before for Hadoop 1 compatibility. Since Hadoop 1 is no longer supported, this is obsolete now.
-
由 Stephan Ewen 提交于
This makes sure that configurations are loaded once and file system instances are properly reused by scheme and authority. This also factors out a lot of the special treatment of Hadoop file systems and simply makes the Hadoop File System factory the default fallback factory.
-
由 Stephan Ewen 提交于
Some places validate if the file URIs are resolvable on the client. This leads to problems when file systems are not accessible from the client, when the full libraries for the file systems are not present on the client (for example often the case in cloud setups), or when the configuration on the client is different from the nodes/containers that will execute the application.
-
由 Stephan Ewen 提交于
- Simplify access to local file system - Use a fair lock for all FileSystem.get() operations - Robust falback to local fs for default scheme (avoids URI parsing error on Windows) - Deprecate 'getDefaultBlockSize()' - Deprecate create(...) with block sizes and replication factor, which is not applicable to many FS
-
- 05 10月, 2017 6 次提交
-
-
由 Nico Kruber 提交于
[FLINK-7068][blob] address PR review comments, part 1 [FLINK-7068][blob] create a common base class for the BLOB caches [FLINK-7068][blob] update some comments [FLINK-7068][blob] integrate the BLOB type into the BlobKey [FLINK-7068][blob] rename a few methods for better consistency [FLINK-7068][blob] fix Blob*DeleteTest not working as documented in one test [FLINK-7068][blob] add checks for jobId being null in PermanentBlobCache [FLINK-7068][blob] implement get-and-delete logic for transient BLOBs Transient BLOB files are deleted on the BlobServer upon first access from a cache. Therefore, we do not need the DELETE operations anymore, aside from deleting the file from the local cache (for now). [FLINK-7068][blob] address PR comments, part 2 [FLINK-7068][blob] separate permanent and transient BLOB keys * create PermanentBlobKey and TransientBlobKey (inheriting from BlobKey) and forbid using transient BLOBs with permanent caches and vice versa * make BlobKey package-private, similarly for the BlobType which is now reflected by the two BlobKey sub-classes -> this gives a cleaner interface for the user This closes #4358.
-
由 Nico Kruber 提交于
This way, using code can distinguish non-HA cases, i.e. VoidBlobStore, from HA cases, i.e. FileSystemBlobStore, in a general way and have better error reporting.
-
由 Nico Kruber 提交于
[FLINK-7068][blob] start introducing a new BLOB storage abstraction This is incomplete and may not compile and/or run tests successfully yet. [FLINK-7068][blob] remove BlobView from TransientBlobCache The transient BLOB cache is not supposed to work with the HA store since it only serves non-HA files. [FLINK-7068][blob] remove unnecessary use of BlobClient [FLINK-7068][blob] implement TransientBlobCache#put methods [FLINK-7068][blob] remove further unnecessary use of BlobClient and adapt to HA get/put methods [FLINK-7068][blob] fix BlobServer#getFileInternal not being guarded by locks [FLINK-7068][blob] add incoming file cleanup at BlobServer in cases of errors [FLINK-7068] fix missing BlobServer#putHA() jobId propagation [FLINK-7068][blob] remove BlobClient use from BlobServer{Get|Put}Test [FLINK-7068][blob] make helper methods work with any BlobService [FLINK-7068][blob] start adding a BlobCacheGetTest [FLINK-7068][blob] verify get contents in separate threads This allows (at a slight chance) that we may see an intermediate file. [FLINK-7068][blob] better locking granularity during file retrieval This allows multiple parallel downloads from the HA store to the BlobServer's local store although only one of these downloaded staging files will actually be used. In practice, this happens only during recovery and not in parallel anyways. [FLINK-7068][blob] share more code among BlobServer and BlobServerConnection This also applies the better locking granularity of the previous commit to BlobServerConnection. [FLINK-7068][blob] properly cleanup temporary staging files in all cases [FLINK-7068][blob] make PermanentBlobCache and TransientBlobCache thread-safe [FLINK-7068][tests] improve various tests [FLINK-7068][blob] change the signature of the delete calls to return success We will not throw exceptions in case of failures anymore and return whether the operation was successful instead. Failure details will still be accessible in the written logs. [FLINK-7068][tests] extend and adapt BlobServerDeleteTest [FLINK-7068][tests] adapt further BlobCache tests [FLINK-7068][tests] adapt BlobClientTest [FLINK-7068][blob] cleanup BlobClient methods BlobClient is not supposed to be used by anyone else than the BlobServer/BlobCache classes. Most accessors were already package-private, now remove the ones that just blow up the code. [FLINK-7068] add a TODO to fix the currently failing tests [FLINK-7068][tests] add a BlobCacheRecoveryTest This currently fails due to TransientBlobCache#put also storing files in HA store which it should not! [FLINK-7068][tests] improve failure message [FLINK-7068][blob] add permanent/transient BLOB modes to BlobClient This allows a better control of which should end up in HA store and which should not. Also, during GET methods, we do not check the HA store unnecessarily. [FLINK-7068][tests] extend the Blob{Server|Cache}GetTest This adds some failing GET operations and verifies that the files are cleaned up accordingly. [FLINK-7068][blob] remove "final" flag from BlobCache class This re-enables mocking in various unit tests. [FLINK-7068][tests] fix test relying on order of folder contents [FLINK-7068][blob] some BlobServer cleanup [FLINK-7068][hotfix] fix checkstyle errors [FLINK-7068][tests] fix tests now requiring a more complete BlobCache mock A suitable BlobCache mock should at least return a mock for a permanent and a transient BLOB store, so mock(BlobCache.class) is not sufficient anymore. [FLINK-7068] final wrap up * remove a left-over TODO * remove useless tests for the concurrency of the GET operations (we cannot test that the file write is guarded by a lock directly - rely on the concurrent checks in the individual threads instead) * fix some log messages [FLINK-7068][blob] remove Thread#start() call from BlobServer constructor This is bad design and limits extensibility, e.g. in tests like the BlobCacheRetriesTest where this caused a race condition with the sub-class. Instead, the user must now call BlobServer#start() explicitely. [FLINK-7068][tests] remove unused imports [FLINK-7068][tests] fix a typo [FLINK-7068][tests] add some tests that verify behaviour with corrupted files Also add corruption checks for HA-store downloads which was not implemented yet. [FLINK-7068][blob] ensure consistency in PermanentBlobCache even in cases of invalid use During cleanup, no write lock was taken but the storage directory of an (unused!) job was deleted. Normally, there should be no process left accessing its data and no new process can jump in since the registration is locked. In case of invalid use cases, i.e. using a job's data outside a register() and release() block, this could lead to strange effects. By guarding the cleanup with the write lock as well, we circumvent that. [FLINK-7068][hotfix] remove an unused import
-
由 Nico Kruber 提交于
[FLINK-7057][tests][hotfix] fix test instability of JobManagerCleanupITCase#testBlobServerCleanupCancelledJob This test expected two messages to arrice (job cancellation and job state change notification) but did not take different receive orders into account. The fix: - removes state change listening for this test case so that only one message arrives, and - adds message comparison by object, not just class (to improve debugging)
-
由 Nico Kruber 提交于
When a job is registered, it may have been released before and we thus need to reset the cleanup timeout again.
-
由 Till Rohrmann 提交于
This commit waits not only until the Actor has called postStop but also until the actor has been completely shut down by the ActorSystem before completing the termination future. This closes #4770.
-