doujutun3207 / flink
Commit ebae500f

[streaming] StreamComponentHelper refactor

Authored on Jul 14, 2014 by ghermann
Committed by Stephan Ewen on Aug 18, 2014
Parent: f08d55d0
Showing 3 changed files with 91 additions and 111 deletions:

- flink-addons/flink-streaming/pom.xml (+1, -1)
- flink-addons/flink-streaming/src/main/java/eu/stratosphere/streaming/api/SinkFunction.java (+2, -1)
- flink-addons/flink-streaming/src/main/java/eu/stratosphere/streaming/api/streamcomponent/StreamComponentHelper.java (+88, -109)
flink-addons/flink-streaming/pom.xml

```diff
@@ -12,7 +12,7 @@
 	<packaging>jar</packaging>

 	<properties>
-		<stratosphere.version>0.5.1</stratosphere.version>
+		<stratosphere.version>0.5</stratosphere.version>
 		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
 		<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
 	</properties>
```
flink-addons/flink-streaming/src/main/java/eu/stratosphere/streaming/api/SinkFunction.java

```diff
@@ -17,9 +17,10 @@ package eu.stratosphere.streaming.api;

 import java.io.Serializable;

+import eu.stratosphere.api.common.functions.AbstractFunction;
 import eu.stratosphere.api.java.tuple.Tuple;

-public abstract class SinkFunction<IN extends Tuple> implements Serializable {
+public abstract class SinkFunction<IN extends Tuple> extends AbstractFunction implements
+		Serializable {

 	private static final long serialVersionUID = 1L;
```
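SinkFunction stays `Serializable` (with an explicit `serialVersionUID`) because user-defined functions are shipped to the runtime as serialized bytes and rebuilt with `ObjectInputStream`, as the StreamComponentHelper diff shows. A minimal, self-contained sketch of that round trip; every class name here is an illustrative stand-in, not the real API:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class UdfRoundTrip {

    // Hypothetical stand-in for SinkFunction<IN extends Tuple>.
    public static abstract class DemoSinkFunction<IN> implements Serializable {
        private static final long serialVersionUID = 1L;
        public abstract void invoke(IN record);
    }

    // A concrete user function; it must be serializable too.
    public static class PrintSink extends DemoSinkFunction<String> {
        private static final long serialVersionUID = 1L;
        public String last;
        @Override
        public void invoke(String record) { last = record; }
    }

    // Serialize a function to bytes, as the client would before shipping it.
    public static byte[] serialize(Serializable udf) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(udf);
            }
            return bos.toByteArray();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Deserialize on the task side, mirroring the helper's readObject() call.
    @SuppressWarnings("unchecked")
    public static <T> T deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) in.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new PrintSink());
        DemoSinkFunction<String> restored = deserialize(bytes);
        restored.invoke("hello");
        System.out.println(((PrintSink) restored).last); // prints "hello"
    }
}
```

Any non-serializable field in a subclass would make the `writeObject` call fail, which is why the abstract base advertises `Serializable` up front.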
flink-addons/flink-streaming/src/main/java/eu/stratosphere/streaming/api/streamcomponent/StreamComponentHelper.java

```diff
@@ -49,6 +49,7 @@ import eu.stratosphere.streaming.api.StreamCollectorManager;
 import eu.stratosphere.streaming.api.invokable.DefaultSinkInvokable;
 import eu.stratosphere.streaming.api.invokable.DefaultSourceInvokable;
 import eu.stratosphere.streaming.api.invokable.DefaultTaskInvokable;
+import eu.stratosphere.streaming.api.invokable.StreamComponent;
 import eu.stratosphere.streaming.api.invokable.StreamRecordInvokable;
 import eu.stratosphere.streaming.api.invokable.UserSinkInvokable;
 import eu.stratosphere.streaming.api.invokable.UserSourceInvokable;
```
```diff
@@ -77,13 +78,13 @@ public final class StreamComponentHelper<T extends AbstractInvokable> {
 	private SerializationDelegate<Tuple> outSerializationDelegate = null;

 	public Collector<Tuple> collector;
-	private List<Integer> batchsizes_s = new ArrayList<Integer>();
-	private List<Integer> batchsizes_f = new ArrayList<Integer>();
-	private List<Integer> numOfOutputs_f = new ArrayList<Integer>();
+	private List<Integer> batchSizesNotPartitioned = new ArrayList<Integer>();
+	private List<Integer> batchSizesPartitioned = new ArrayList<Integer>();
+	private List<Integer> numOfOutputsPartitioned = new ArrayList<Integer>();
 	private int keyPosition = 0;

-	private List<RecordWriter<StreamRecord>> outputs_s = new ArrayList<RecordWriter<StreamRecord>>();
-	private List<RecordWriter<StreamRecord>> outputs_f = new ArrayList<RecordWriter<StreamRecord>>();
+	private List<RecordWriter<StreamRecord>> outputsNotPartitioned = new ArrayList<RecordWriter<StreamRecord>>();
+	private List<RecordWriter<StreamRecord>> outputsPartitioned = new ArrayList<RecordWriter<StreamRecord>>();

 	public static int newComponent() {
 		numComponents++;
```
```diff
@@ -117,47 +118,35 @@ public final class StreamComponentHelper<T extends AbstractInvokable> {
 	public Collector<Tuple> setCollector(Configuration taskConfiguration, int id,
 			List<RecordWriter<StreamRecord>> outputs) {
 		int batchSize = taskConfiguration.getInteger("batchSize", 1);
 		long batchTimeout = taskConfiguration.getLong("batchTimeout", 1000);

 		// collector = new StreamCollector<Tuple>(batchSize, batchTimeout, id,
 		// outSerializationDelegate, outputs);
-		collector = new StreamCollectorManager<Tuple>(batchsizes_s, batchsizes_f, numOfOutputs_f,
-				keyPosition, batchTimeout, id, outSerializationDelegate, outputs_f, outputs_s);
+		collector = new StreamCollectorManager<Tuple>(batchSizesNotPartitioned,
+				batchSizesPartitioned, numOfOutputsPartitioned, keyPosition, batchTimeout, id,
+				outSerializationDelegate, outputsPartitioned, outputsNotPartitioned);
 		return collector;
 	}

 	// TODO add type parameters to avoid redundant code
 	@SuppressWarnings({ "rawtypes", "unchecked" })
 	public void setSerializers(Configuration taskConfiguration) {
 		byte[] operatorBytes = taskConfiguration.getBytes("operator", null);
 		String operatorName = taskConfiguration.getString("operatorName", "");

+		Object function = null;
 		try {
 			ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(operatorBytes));
-			Object function = in.readObject();
+			function = in.readObject();

 			if (operatorName.equals("flatMap")) {
-				setSerializer(function, FlatMapFunction.class);
+				setSerializerDeserializer(function, FlatMapFunction.class);
 			} else if (operatorName.equals("map")) {
-				setSerializer(function, MapFunction.class);
+				setSerializerDeserializer(function, MapFunction.class);
 			} else if (operatorName.equals("batchReduce")) {
-				setSerializer(function, GroupReduceFunction.class);
+				setSerializerDeserializer(function, GroupReduceFunction.class);
 			} else if (operatorName.equals("filter")) {
-				setSerializer(function, FilterFunction.class);
+				setSerializerDeserializer(function, FilterFunction.class);
 			} else if (operatorName.equals("sink")) {
-				inTupleTypeInfo = (TupleTypeInfo) TypeExtractor.createTypeInfo(SinkFunction.class,
-						function.getClass(), 0, null, null);
-				inTupleSerializer = inTupleTypeInfo.createSerializer();
-				inDeserializationDelegate = new DeserializationDelegate<Tuple>(inTupleSerializer);
+				setDeserializer(function, SinkFunction.class);
 			} else if (operatorName.equals("source")) {
-				outTupleTypeInfo = (TupleTypeInfo) TypeExtractor.createTypeInfo(
-						UserSourceInvokable.class, function.getClass(), 0, null, null);
-				outTupleSerializer = outTupleTypeInfo.createSerializer();
-				outSerializationDelegate = new SerializationDelegate<Tuple>(outTupleSerializer);
+				setSerializer(function, UserSourceInvokable.class, 0);
 			} else if (operatorName.equals("elements")) {
 				outTupleTypeInfo = new TupleTypeInfo<Tuple>(TypeExtractor.getForObject(function));
```
```diff
@@ -168,25 +157,43 @@ public final class StreamComponentHelper<T extends AbstractInvokable> {
 			}
 		} catch (Exception e) {
-			throw new StreamComponentException("Nonsupported object passed as operator");
+			throw new StreamComponentException("Nonsupported object (named " + operatorName
+					+ ") passed as operator");
 		}
 	}

-	private void setSerializer(Object function, Class<? extends AbstractFunction> clazz) {
+	private void setSerializerDeserializer(Object function,
+			Class<? extends AbstractFunction> clazz) {
+		setDeserializer(function, clazz);
+		setSerializer(function, clazz, 1);
+	}
+
+	@SuppressWarnings({ "unchecked", "rawtypes" })
+	private void setDeserializer(Object function, Class<? extends AbstractFunction> clazz) {
 		inTupleTypeInfo = (TupleTypeInfo) TypeExtractor.createTypeInfo(clazz, function.getClass(),
 				0, null, null);
 		inTupleSerializer = inTupleTypeInfo.createSerializer();
 		inDeserializationDelegate = new DeserializationDelegate<Tuple>(inTupleSerializer);
+	}

+	@SuppressWarnings({ "rawtypes", "unchecked" })
+	private void setSerializer(Object function, Class<?> clazz, int typeParameter) {
 		outTupleTypeInfo = (TupleTypeInfo) TypeExtractor.createTypeInfo(clazz, function.getClass(),
-				1, null, null);
+				typeParameter, null, null);
 		outTupleSerializer = outTupleTypeInfo.createSerializer();
 		outSerializationDelegate = new SerializationDelegate<Tuple>(outTupleSerializer);
 	}

+	public void setSinkSerializer() {
+		if (outSerializationDelegate != null) {
+			inTupleTypeInfo = outTupleTypeInfo;
+
+			inTupleSerializer = inTupleTypeInfo.createSerializer();
+			inDeserializationDelegate = new DeserializationDelegate<Tuple>(inTupleSerializer);
+		}
+	}
+
 	public AbstractRecordReader getConfigInputs(T taskBase, Configuration taskConfiguration)
 			throws StreamComponentException {
 		int numberOfInputs = taskConfiguration.getInteger("numberOfInputs", 0);
```
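The new `setSerializer` takes the type-parameter index explicitly (0 for the input side, 1 for the output side) instead of hard-coding it. The underlying idea, resolving a generic type argument of a user function by index, can be sketched with plain reflection; the real `TypeExtractor` is far more elaborate, and all names below are illustrative:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class TypeArgByIndex {

    // A two-parameter generic base, standing in for MapFunction<IN, OUT>.
    public static abstract class DemoMapFunction<IN, OUT> {
        public abstract OUT map(IN value);
    }

    public static class LengthMapper extends DemoMapFunction<String, Integer> {
        @Override
        public Integer map(String value) { return value.length(); }
    }

    // Resolve the type argument at the given index (0 = IN, 1 = OUT),
    // assuming the function directly subclasses the generic base.
    public static Class<?> typeArgument(Object function, int index) {
        Type superType = function.getClass().getGenericSuperclass();
        ParameterizedType pt = (ParameterizedType) superType;
        return (Class<?>) pt.getActualTypeArguments()[index];
    }

    public static void main(String[] args) {
        LengthMapper f = new LengthMapper();
        System.out.println(typeArgument(f, 0)); // prints "class java.lang.String"
        System.out.println(typeArgument(f, 1)); // prints "class java.lang.Integer"
    }
}
```

Passing the index as a parameter is what lets one helper serve both the deserialization (input) and serialization (output) paths.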
```diff
@@ -240,26 +247,33 @@ public final class StreamComponentHelper<T extends AbstractInvokable> {
 			} else {
 				throw new StreamComponentException("Nonsupported object passed to setConfigOutputs");
 			}
-			if (outputs_f.size() < batchsizes_f.size()) {
-				outputs_f.add(outputs.get(i));
+			if (outputsPartitioned.size() < batchSizesPartitioned.size()) {
+				outputsPartitioned.add(outputs.get(i));
 			} else {
-				outputs_s.add(outputs.get(i));
+				outputsNotPartitioned.add(outputs.get(i));
 			}
 		}
 	}

-	public UserSinkInvokable getSinkInvokable(Configuration taskConfiguration) {
-		Class<? extends UserSinkInvokable> userFunctionClass = taskConfiguration.getClass(
-				"userfunction", DefaultSinkInvokable.class, UserSinkInvokable.class);
-		UserSinkInvokable userFunction = null;
-
-		byte[] userFunctionSerialized = taskConfiguration.getBytes("serializedudf", null);
+	/**
+	 * Reads and creates a StreamComponent from the config.
+	 * 
+	 * @param userFunctionClass
+	 *            Class of the invokable function
+	 * @param config
+	 *            Configuration object
+	 * @return The StreamComponent object
+	 */
+	private StreamComponent getInvokable(Class<? extends StreamComponent> userFunctionClass,
+			Configuration config) {
+		StreamComponent userFunction = null;
+
+		byte[] userFunctionSerialized = config.getBytes("serializedudf", null);

 		try {
 			ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(
 					userFunctionSerialized));
-			userFunction = (UserSinkInvokable) ois.readObject();
+			userFunction = (StreamComponent) ois.readObject();
 		} catch (Exception e) {
 			if (log.isErrorEnabled()) {
 				log.error("Cannot instanciate user function: "
 						+ userFunctionClass.getSimpleName());
```
```diff
@@ -269,58 +283,30 @@ public final class StreamComponentHelper<T extends AbstractInvokable> {
 		return userFunction;
 	}

+	@SuppressWarnings("rawtypes")
+	public UserSinkInvokable getSinkInvokable(Configuration config) {
+		Class<? extends UserSinkInvokable> userFunctionClass = config.getClass("userfunction",
+				DefaultSinkInvokable.class, UserSinkInvokable.class);
+		return (UserSinkInvokable) getInvokable(userFunctionClass, config);
+	}
+
 	// TODO consider logging stack trace!
-	@SuppressWarnings("unchecked")
-	public UserTaskInvokable getTaskInvokable(Configuration taskConfiguration) {
+	@SuppressWarnings("rawtypes")
+	public UserTaskInvokable getTaskInvokable(Configuration config) {
 		// Default value is a TaskInvokable even if it was called from a source
-		Class<? extends UserTaskInvokable> userFunctionClass = taskConfiguration.getClass(
-				"userfunction", DefaultTaskInvokable.class, UserTaskInvokable.class);
-		UserTaskInvokable userFunction = null;
-
-		byte[] userFunctionSerialized = taskConfiguration.getBytes("serializedudf", null);
-
-		try {
-			ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(
-					userFunctionSerialized));
-			userFunction = (UserTaskInvokable) ois.readObject();
-			// userFunction.declareOutputs(outputs, instanceID, name,
-			// recordBuffer,
-			// faultToleranceType);
-		} catch (Exception e) {
-			if (log.isErrorEnabled()) {
-				log.error("Cannot instanciate user function: "
-						+ userFunctionClass.getSimpleName());
-			}
-		}
-		return userFunction;
+		Class<? extends UserTaskInvokable> userFunctionClass = config.getClass("userfunction",
+				DefaultTaskInvokable.class, UserTaskInvokable.class);
+		return (UserTaskInvokable) getInvokable(userFunctionClass, config);
 	}

-	public UserSourceInvokable getSourceInvokable(Configuration taskConfiguration) {
+	@SuppressWarnings("rawtypes")
+	public UserSourceInvokable getSourceInvokable(Configuration config) {
 		// Default value is a TaskInvokable even if it was called from a source
-		Class<? extends UserSourceInvokable> userFunctionClass = taskConfiguration.getClass(
-				"userfunction", DefaultSourceInvokable.class, UserSourceInvokable.class);
-		UserSourceInvokable userFunction = null;
-
-		byte[] userFunctionSerialized = taskConfiguration.getBytes("serializedudf", null);
-
-		try {
-			ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(
-					userFunctionSerialized));
-			userFunction = (UserSourceInvokable) ois.readObject();
-			// userFunction.declareOutputs(outputs, instanceID, name,
-			// recordBuffer,
-			// faultToleranceType);
-		} catch (Exception e) {
-			if (log.isErrorEnabled()) {
-				log.error("Cannot instanciate user function: "
-						+ userFunctionClass.getSimpleName());
-			}
-		}
-		return userFunction;
+		Class<? extends UserSourceInvokable> userFunctionClass = config.getClass("userfunction",
+				DefaultSourceInvokable.class, UserSourceInvokable.class);
+		return (UserSourceInvokable) getInvokable(userFunctionClass, config);
 	}

 	// TODO find a better solution for this
```
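The heart of the commit is collapsing three near-identical bodies (getSinkInvokable, getTaskInvokable, getSourceInvokable) into thin typed wrappers around one private `getInvokable`. The shape of that deduplication can be sketched as follows; the classes are hypothetical stand-ins, and where the real helper deserializes `"serializedudf"` bytes this sketch just instantiates the class reflectively:

```java
public class InvokableFactory {

    // Hypothetical stand-ins for the StreamComponent hierarchy.
    public interface DemoComponent {}
    public static class DemoSink implements DemoComponent {}
    public static class DemoSource implements DemoComponent {}

    // One generic helper carries the shared construction and error handling
    // (in the real code: reading and deserializing the configured UDF bytes).
    private static <T extends DemoComponent> T getInvokable(Class<T> clazz) {
        try {
            return clazz.getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            throw new RuntimeException("Cannot instantiate user function: "
                    + clazz.getSimpleName(), e);
        }
    }

    // The public accessors shrink to typed one-liners.
    public static DemoSink getSinkInvokable() {
        return getInvokable(DemoSink.class);
    }

    public static DemoSource getSourceInvokable() {
        return getInvokable(DemoSource.class);
    }

    public static void main(String[] args) {
        System.out.println(getSinkInvokable().getClass().getSimpleName()); // prints "DemoSink"
    }
}
```

With the generic parameter bound by the `Class<T>` argument, the three public methods keep their precise return types while the try/catch logic lives in exactly one place.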
```diff
@@ -340,47 +326,40 @@ public final class StreamComponentHelper<T extends AbstractInvokable> {
 		}
 	}

-	private void setPartitioner(Configuration taskConfiguration, int nrOutput,
+	private void setPartitioner(Configuration config, int numberOfOutputs,
 			List<ChannelSelector<StreamRecord>> partitioners) {
-		Class<? extends ChannelSelector<StreamRecord>> partitioner = taskConfiguration.getClass(
-				"partitionerClass_" + nrOutput, DefaultPartitioner.class, ChannelSelector.class);
+		Class<? extends ChannelSelector<StreamRecord>> partitioner = config.getClass(
+				"partitionerClass_" + numberOfOutputs, DefaultPartitioner.class,
+				ChannelSelector.class);

-		Integer batchSize = taskConfiguration.getInteger("batchSize_" + nrOutput, 1);
+		Integer batchSize = config.getInteger("batchSize_" + numberOfOutputs, 1);

 		try {
 			if (partitioner.equals(FieldsPartitioner.class)) {
-				batchsizes_f.add(batchSize);
-				numOfOutputs_f.add(taskConfiguration.getInteger("numOfOutputs_" + nrOutput, -1));
+				batchSizesPartitioned.add(batchSize);
+				numOfOutputsPartitioned.add(config.getInteger("numOfOutputs_" + numberOfOutputs,
+						-1));
 				// TODO:force one partitioning field
-				keyPosition = taskConfiguration.getInteger("partitionerIntParam_" + nrOutput, 1);
+				keyPosition = config.getInteger("partitionerIntParam_" + numberOfOutputs, 1);

 				partitioners.add(partitioner.getConstructor(int.class).newInstance(keyPosition));
 			} else {
-				batchsizes_s.add(batchSize);
+				batchSizesNotPartitioned.add(batchSize);
 				partitioners.add(partitioner.newInstance());
 			}
 			if (log.isTraceEnabled()) {
-				log.trace("Partitioner set: " + partitioner.getSimpleName() + " with " + nrOutput
-						+ " outputs");
+				log.trace("Partitioner set: " + partitioner.getSimpleName() + " with "
+						+ numberOfOutputs + " outputs");
 			}
 		} catch (Exception e) {
 			if (log.isErrorEnabled()) {
-				log.error("Error while setting partitioner: " + partitioner.getSimpleName()
-						+ " with " + nrOutput + " outputs", e);
+				log.error("Error while setting partitioner: " + partitioner.getSimpleName()
+						+ " with " + numberOfOutputs + " outputs", e);
 			}
 		}
 	}

-	public void setSinkSerializer() {
-		if (outSerializationDelegate != null) {
-			inTupleTypeInfo = outTupleTypeInfo;
-
-			inTupleSerializer = inTupleTypeInfo.createSerializer();
-			inDeserializationDelegate = new DeserializationDelegate<Tuple>(inTupleSerializer);
-		}
-	}
-
 	public void invokeRecords(StreamRecordInvokable userFunction, AbstractRecordReader inputs)
 			throws Exception {
 		if (inputs instanceof UnionStreamRecordReader) {
```
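setPartitioner instantiates the configured partitioner class reflectively: a FieldsPartitioner receives the key position through its `int` constructor, while any other partitioner is built through its no-argument constructor. A minimal sketch of that branch, using hypothetical partitioner classes (the channel-selection arithmetic here is purely illustrative):

```java
public class PartitionerFactory {

    public interface DemoPartitioner {
        int selectChannel(int key, int numChannels);
    }

    // Partitions by a key field; needs the key position at construction time.
    public static class DemoFieldsPartitioner implements DemoPartitioner {
        private final int keyPosition;
        public DemoFieldsPartitioner(int keyPosition) { this.keyPosition = keyPosition; }
        @Override
        public int selectChannel(int key, int numChannels) {
            // Illustrative only: mix the key position into the channel choice.
            return (key + keyPosition) % numChannels;
        }
    }

    // Default partitioner; no constructor arguments.
    public static class DemoDefaultPartitioner implements DemoPartitioner {
        @Override
        public int selectChannel(int key, int numChannels) { return 0; }
    }

    // Mirrors the helper's branch: int-arg constructor for the fields
    // partitioner, no-arg constructor for everything else.
    public static DemoPartitioner create(Class<? extends DemoPartitioner> clazz,
            int keyPosition) {
        try {
            if (clazz.equals(DemoFieldsPartitioner.class)) {
                return clazz.getConstructor(int.class).newInstance(keyPosition);
            }
            return clazz.getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            throw new RuntimeException("Error while setting partitioner: "
                    + clazz.getSimpleName(), e);
        }
    }

    public static void main(String[] args) {
        DemoPartitioner p = create(DemoFieldsPartitioner.class, 1);
        System.out.println(p.getClass().getSimpleName()); // prints "DemoFieldsPartitioner"
    }
}
```

Looking up `getConstructor(int.class)` fails fast with a NoSuchMethodException if a fields-style partitioner forgets to declare the `int` constructor, which is exactly the failure the helper's catch block logs.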