Commit 185b5f6c
Authored Jul 28, 2016 by twalthr
[FLINK-4179] [table] Additional TPCHQuery3Table example improvements
Parent: ec4c9bef
Showing 1 changed file with 65 additions and 61 deletions
flink-libraries/flink-table/src/main/scala/org/apache/flink/examples/scala/TPCHQuery3Table.scala (+65 -61)
@@ -17,57 +17,60 @@
  */
 package org.apache.flink.examples.scala

-import org.apache.flink.api.table.TableEnvironment
-import org.apache.flink.api.table.expressions.Literal
 import org.apache.flink.api.scala._
 import org.apache.flink.api.scala.table._
+import org.apache.flink.api.table.TableEnvironment

 /**
- * This program implements a modified version of the TPC-H query 3. The
- * example demonstrates how to assign names to fields by extending the Tuple class.
- * The original query can be found at
- * [http://www.tpc.org/tpch/spec/tpch2.16.0.pdf](http://www.tpc.org/tpch/spec/tpch2.16.0.pdf)
- * (page 29).
- *
- * This program implements the following SQL equivalent:
- *
- * {{{
- * SELECT
- *      l_orderkey,
- *      SUM(l_extendedprice*(1-l_discount)) AS revenue,
- *      o_orderdate,
- *      o_shippriority
- * FROM customer,
- *      orders,
- *      lineitem
- * WHERE
- *      c_mktsegment = '[SEGMENT]'
- *      AND c_custkey = o_custkey
- *      AND l_orderkey = o_orderkey
- *      AND o_orderdate < date '[DATE]'
- *      AND l_shipdate > date '[DATE]'
- * GROUP BY
- *      l_orderkey,
- *      o_orderdate,
- *      o_shippriority;
- * }}}
- *
- * Compared to the original TPC-H query this version does not sort the result by revenue
- * and orderdate.
- *
- * Input files are plain text CSV files using the pipe character ('|') as field separator
- * as generated by the TPC-H data generator which is available at
- * [http://www.tpc.org/tpch/](a href="http://www.tpc.org/tpch/).
- *
- * Usage:
- * {{{
- * TPCHQuery3Expression <lineitem-csv path> <customer-csv path> <orders-csv path> <result path>
- * }}}
- *
- * This example shows how to use:
- * - Table API expressions
- *
- */
+ * This program implements a modified version of the TPC-H query 3. The
+ * example demonstrates how to assign names to fields by extending the Tuple class.
+ * The original query can be found at
+ * [http://www.tpc.org/tpch/spec/tpch2.16.0.pdf](http://www.tpc.org/tpch/spec/tpch2.16.0.pdf)
+ * (page 29).
+ *
+ * This program implements the following SQL equivalent:
+ *
+ * {{{
+ * SELECT
+ *      l_orderkey,
+ *      SUM(l_extendedprice*(1-l_discount)) AS revenue,
+ *      o_orderdate,
+ *      o_shippriority
+ * FROM customer,
+ *      orders,
+ *      lineitem
+ * WHERE
+ *      c_mktsegment = '[SEGMENT]'
+ *      AND c_custkey = o_custkey
+ *      AND l_orderkey = o_orderkey
+ *      AND o_orderdate < date '[DATE]'
+ *      AND l_shipdate > date '[DATE]'
+ * GROUP BY
+ *      l_orderkey,
+ *      o_orderdate,
+ *      o_shippriority
+ * ORDER BY
+ *      revenue desc,
+ *      o_orderdate;
+ * }}}
+ *
+ * Compared to the original TPC-H query this version does not sort the result by revenue
+ * and orderdate.
+ *
+ * Input files are plain text CSV files using the pipe character ('|') as field separator
+ * as generated by the TPC-H data generator which is available at
+ * [http://www.tpc.org/tpch/](a href="http://www.tpc.org/tpch/).
+ *
+ * Usage:
+ * {{{
+ * TPCHQuery3Expression <lineitem-csv path> <customer-csv path> <orders-csv path> <result path>
+ * }}}
+ *
+ * This example shows how to:
+ * - Convert DataSets to Tables
+ * - Use Table API expressions
+ *
+ */
 object TPCHQuery3Table {

   def main(args: Array[String]) {
...
@@ -76,23 +79,23 @@ object TPCHQuery3Table {
     }

     // set filter date
-    val date = java.sql.Date.valueOf("1995-03-12")
+    val date = "1995-03-12".toDate

     // get execution environment
     val env = ExecutionEnvironment.getExecutionEnvironment
     val tEnv = TableEnvironment.getTableEnvironment(env)

     val lineitems = getLineitemDataSet(env)
-      .filter(l => java.sql.Date.valueOf(l.shipDate).after(date))
-      .toTable(tEnv).as('id, 'extdPrice, 'discount, 'shipDate)
+      .toTable(tEnv, 'id, 'extdPrice, 'discount, 'shipDate)
+      .filter('shipDate.toDate > date)

-    val customers = getCustomerDataSet(env)
-      .toTable(tEnv).as('id, 'mktSegment)
-      .filter('mktSegment === "AUTOMOBILE")
+    val customers = getCustomerDataSet(env)
+      .toTable(tEnv, 'id, 'mktSegment)
+      .filter('mktSegment === "AUTOMOBILE")

     val orders = getOrdersDataSet(env)
-      .filter(o => java.sql.Date.valueOf(o.orderDate).before(date))
-      .toTable(tEnv).as('orderId, 'custId, 'orderDate, 'shipPrio)
+      .toTable(tEnv, 'orderId, 'custId, 'orderDate, 'shipPrio)
+      .filter('orderDate.toDate < date)

     val items =
       orders.join(customers)
...
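Reviewer note: the hunk above moves both date predicates out of DataSet lambdas and into Table API expressions. As a rough illustration of the predicate semantics only, the sketch below applies the same strict comparisons to plain Scala collections. The `DateFilterSketch` object, its `Lineitem` and `Order` case classes, the helper names, and the sample rows are assumptions made for this illustration and are not taken from the commit; no Flink classes are involved.

```scala
import java.sql.Date

object DateFilterSketch {
  // Hypothetical row types for illustration; not the classes used in the commit.
  final case class Lineitem(id: Long, extdPrice: Double, discount: Double, shipDate: String)
  final case class Order(orderId: Long, custId: Long, orderDate: String, shipPrio: Long)

  // Same cutoff as the example; Date.valueOf expects the yyyy-[m]m-[d]d format.
  val cutoff: Date = Date.valueOf("1995-03-12")

  // Keep line items shipped strictly after the given date.
  def shippedAfter(items: Seq[Lineitem], date: Date): Seq[Lineitem] =
    items.filter(l => Date.valueOf(l.shipDate).after(date))

  // Keep orders placed strictly before the given date.
  def orderedBefore(orders: Seq[Order], date: Date): Seq[Order] =
    orders.filter(o => Date.valueOf(o.orderDate).before(date))

  def main(args: Array[String]): Unit = {
    // Made-up sample rows around the cutoff date.
    val items = Seq(
      Lineitem(1L, 100.0, 0.1, "1995-03-11"),
      Lineitem(2L, 100.0, 0.1, "1995-03-12"),
      Lineitem(3L, 100.0, 0.1, "1995-03-13"))
    val orders = Seq(
      Order(1L, 7L, "1995-03-11", 0L),
      Order(2L, 7L, "1995-03-12", 0L),
      Order(3L, 7L, "1995-03-13", 0L))

    println(shippedAfter(items, cutoff).map(_.id))
    println(orderedBefore(orders, cutoff).map(_.orderId))
  }
}
```

Rows dated exactly 1995-03-12 fail both predicates because `after` and `before` are strict, which matches the `>` and `<` used in the new expressions.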
@@ -102,19 +105,20 @@ object TPCHQuery3Table {
         .where('orderId === 'id)
         .select(
           'orderId,
-          'extdPrice * (Literal(1.0f) - 'discount) as 'revenue,
+          'extdPrice * (1.0f.toExpr - 'discount) as 'revenue,
           'orderDate,
           'shipPrio)

     val result = items
       .groupBy('orderId, 'orderDate, 'shipPrio)
-      .select('orderId, 'revenue.sum, 'orderDate, 'shipPrio)
+      .select('orderId, 'revenue.sum as 'revenue, 'orderDate, 'shipPrio)
+      .orderBy('revenue.desc, 'orderDate.asc)

     // emit result
     result.writeAsCsv(outputPath, "\n", "|")

     // execute program
-    env.execute("Scala TPCH Query 3 (Expression) Example")
+    env.execute("Scala TPCH Query 3 (Table API Expression) Example")
   }

   // *************************************************************************
...
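Reviewer note: this hunk names the aggregated column `'revenue` and adds the `orderBy` step. To make the intended result shape concrete, here is a minimal sketch of the same join, per-order revenue sum, and ordering over plain Scala collections. Everything in it (the `RevenueSketch` object, its case classes, `revenuePerOrder`, and the sample rows) is hypothetical and written for this note. It uses `Double` instead of the example's `Float` literal and compares order dates as zero-padded ISO strings, so it is not a substitute for running the Flink job.

```scala
object RevenueSketch {
  // Hypothetical row types; field names follow the aliases used in the diff.
  final case class Lineitem(id: Long, extdPrice: Double, discount: Double)
  final case class Order(orderId: Long, orderDate: String, shipPrio: Long)
  final case class Result(orderId: Long, revenue: Double, orderDate: String, shipPrio: Long)

  def revenuePerOrder(orders: Seq[Order], lineitems: Seq[Lineitem]): Seq[Result] = {
    // Join orders with their line items and compute the discounted price per item.
    val joined = for {
      o <- orders
      l <- lineitems
      if l.id == o.orderId
    } yield (o, l.extdPrice * (1.0 - l.discount))

    joined
      // Group by the same three keys as the groupBy call in the diff.
      .groupBy { case (o, _) => (o.orderId, o.orderDate, o.shipPrio) }
      .map { case ((id, date, prio), rows) => Result(id, rows.map(_._2).sum, date, prio) }
      .toSeq
      // Revenue descending, then order date ascending. Assumes yyyy-MM-dd strings,
      // for which lexicographic order equals chronological order.
      .sortWith { (a, b) =>
        if (a.revenue != b.revenue) a.revenue > b.revenue
        else a.orderDate < b.orderDate
      }
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample rows.
    val orders = Seq(
      Order(10L, "1995-03-01", 0L),
      Order(11L, "1995-02-01", 0L),
      Order(12L, "1995-01-01", 0L))
    val lineitems = Seq(
      Lineitem(10L, 100.0, 0.5),
      Lineitem(10L, 200.0, 0.25),
      Lineitem(11L, 300.0, 0.0),
      Lineitem(12L, 400.0, 0.5))

    revenuePerOrder(orders, lineitems).foreach(println)
  }
}
```

With the sample rows, orders 10 and 12 tie on revenue, so the secondary key decides their relative order, which is the case the second `orderBy` argument exists for.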
@@ -145,12 +149,12 @@ object TPCHQuery3Table {
       System.err.println("This program expects data from the TPC-H benchmark as input data.\n" +
           " Due to legal restrictions, we can not ship generated data.\n" +
           " You can find the TPC-H data generator at http://www.tpc.org/tpch/.\n" +
-          " Usage: TPCHQuery3 <lineitem-csv path> <customer-csv path> " +
+          " Usage: TPCHQuery3 <lineitem-csv path> <customer-csv path> " +
           "<orders-csv path> <result path>")
       false
     }
   }

   private def getLineitemDataSet(env: ExecutionEnvironment): DataSet[Lineitem] = {
     env.readCsvFile[Lineitem](
         lineitemPath,
...
@@ -164,7 +168,7 @@ object TPCHQuery3Table {
         fieldDelimiter = "|",
         includedFields = Array(0, 6) )
   }

   private def getOrdersDataSet(env: ExecutionEnvironment): DataSet[Order] = {
     env.readCsvFile[Order](
         ordersPath,
...