Flink is able to process streaming data based on different notions of _time_.
* _Processing time_ refers to the system time of the machine (also known as “wall-clock time”) that is executing the respective operation.
* _Event time_ refers to the processing of streaming data based on timestamps which are attached to each row. The timestamps can encode when an event happened.
* _Ingestion time_ is the time that events enter Flink; internally, it is treated similarly to event time.
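The difference between the first two notions can be illustrated with a small, Flink-independent sketch in plain Java. It assigns the same four events to 10-second tumbling windows, once by the timestamp attached to each event and once by the wall-clock time at which each event is assumed to arrive. The class name, method names, and sample values are hypothetical and are not part of any Flink API.

```java
import java.util.Map;
import java.util.TreeMap;

public class TimeNotionsSketch {

    /** Start of the tumbling window of length sizeMs that contains timestampMs (assumed non-negative). */
    public static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    /** Counts how many of the given timestamps fall into each tumbling window. */
    public static Map<Long, Integer> countPerWindow(long[] timestampsMs, long sizeMs) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long ts : timestampsMs) {
            counts.merge(windowStart(ts, sizeMs), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical data: when each event happened vs. when the operator saw it.
        long[] eventTimes   = {1_000L, 4_000L, 9_000L, 12_000L};
        long[] arrivalTimes = {11_000L, 12_000L, 13_000L, 21_000L};

        // Event time: the result depends only on the data itself.
        System.out.println("event time:      " + countPerWindow(eventTimes, 10_000L));
        // Processing time: the result depends on when the events happened to arrive.
        System.out.println("processing time: " + countPerWindow(arrivalTimes, 10_000L));
    }
}
```

With these sample values the event-time counts are `{0=3, 10000=1}` and the processing-time counts are `{10000=3, 20000=1}`. Replaying the same events with different arrival times would change only the second result.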
For more information about time handling in Flink, see the introduction about [Event Time and Watermarks](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/event_time.html).
This page explains how time attributes can be defined for time-based operations in Flink’s Table API & SQL.
## Introduction to Time Attributes
Time-based operations such as windows in both the [Table API](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/table/tableApi.html#group-windows) and [SQL](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/table/sql.html#group-windows) require information about the notion of time and its origin. Therefore, tables can offer _logical time attributes_ for indicating time and accessing corresponding timestamps in table programs.
Time attributes can be part of every table schema. They are defined when creating a table from a `DataStream` or are pre-defined when using a `TableSource`. Once a time attribute has been defined at the beginning, it can be referenced as a field and can be used in time-based operations.
As long as a time attribute is not modified and is simply forwarded from one part of the query to another, it remains a valid time attribute. Time attributes behave like regular timestamps and can be accessed for calculations. If a time attribute is used in a calculation, it is materialized and becomes a regular timestamp. Regular timestamps do not cooperate with Flink’s time and watermarking system and thus cannot be used for time-based operations anymore.
## Processing time
Processing time allows a table program to produce results based on the time of the local machine. It is the simplest notion of time but does not provide determinism. It requires neither timestamp extraction nor watermark generation.
There are two ways to define a processing time attribute.
### During DataStream-to-Table Conversion
The processing time attribute is defined with the `.proctime` property during schema definition. The time attribute may only extend the physical schema by an additional logical field, so it can only be defined at the end of the schema definition.
### Using a TableSource
The processing time attribute is defined by a `TableSource` that implements the `DefinedProctimeAttribute` interface. The logical time attribute is appended to the physical schema defined by the return type of the `TableSource`.
## Event time
Event time allows a table program to produce results based on the time that is contained in every record. This allows for consistent results even in the case of out-of-order or late events. It also ensures replayable results of the table program when reading records from persistent storage.
Additionally, event time allows for unified syntax for table programs in both batch and streaming environments. A time attribute in a streaming environment can be a regular field of a record in a batch environment.
In order to handle out-of-order events and distinguish between on-time and late events in streaming, Flink needs to extract timestamps from events and track the progress of event time (using so-called [watermarks](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/event_time.html)).
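How a watermark separates on-time from late events can be sketched in plain Java, independent of Flink. The sketch assumes one common strategy: the watermark trails the largest timestamp seen so far by a fixed allowed delay, and an event counts as late when its timestamp is not after the current watermark. The class and method names are hypothetical, and Flink's actual lateness handling for windows is more involved than this simplification.

```java
import java.util.ArrayList;
import java.util.List;

public class WatermarkSketch {

    /** Tracks a watermark as (largest timestamp seen so far) minus (allowed out-of-orderness). */
    public static final class BoundedOutOfOrderness {
        private final long maxDelayMs;
        private long maxTimestampMs = Long.MIN_VALUE;

        public BoundedOutOfOrderness(long maxDelayMs) {
            this.maxDelayMs = maxDelayMs;
        }

        /** Current watermark; Long.MIN_VALUE until the first event has been observed. */
        public long currentWatermark() {
            return maxTimestampMs == Long.MIN_VALUE ? Long.MIN_VALUE : maxTimestampMs - maxDelayMs;
        }

        /** Returns true if the event is late with respect to the watermark, then advances the watermark. */
        public boolean observe(long timestampMs) {
            boolean late = timestampMs <= currentWatermark();
            maxTimestampMs = Math.max(maxTimestampMs, timestampMs);
            return late;
        }
    }

    /** Classifies each event, in arrival order, as late (true) or on time (false). */
    public static List<Boolean> lateFlags(long[] timestampsMs, long maxDelayMs) {
        BoundedOutOfOrderness tracker = new BoundedOutOfOrderness(maxDelayMs);
        List<Boolean> flags = new ArrayList<>();
        for (long ts : timestampsMs) {
            flags.add(tracker.observe(ts));
        }
        return flags;
    }

    public static void main(String[] args) {
        // Hypothetical arrival order: 3000 is out of order but within the delay; 2000 is late.
        long[] arrivals = {1_000L, 5_000L, 3_000L, 9_000L, 2_000L};
        System.out.println(lateFlags(arrivals, 3_000L));
    }
}
```

In this run the event with timestamp 3000 arrives after 5000 but is still on time, because the watermark is only at 2000. The event with timestamp 2000 arrives when the watermark has already reached 6000 and is therefore flagged as late.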
### During DataStream-to-Table Conversion
The event time attribute is defined with the `.rowtime` property during schema definition. [Timestamps and watermarks](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/event_time.html) must have been assigned in the `DataStream` that is converted.
There are two ways of defining the time attribute when converting a `DataStream` into a `Table`. Depending on whether the specified `.rowtime` field name exists in the schema of the `DataStream` or not, the timestamp field is either appended as a new field to the schema or replaces an existing field.
In either case the event time timestamp field will hold the value of the `DataStream` event time timestamp.
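The two cases can be modeled with a toy schema resolver in plain Java. It assumes that a `.rowtime` name which is absent from the physical fields is appended as a new logical field, and that a name which is present replaces the existing field in place. The class and method are hypothetical illustrations of that rule, not Flink code, and the field names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowtimeSchemaSketch {

    /**
     * Returns the logical schema for the given physical fields, marking the
     * rowtime attribute with a ".rowtime" suffix. An existing field of that
     * name is replaced in place; otherwise the attribute is appended at the end.
     */
    public static List<String> resolve(List<String> physicalFields, String rowtimeField) {
        List<String> logical = new ArrayList<>();
        boolean replaced = false;
        for (String field : physicalFields) {
            if (field.equals(rowtimeField)) {
                logical.add(field + ".rowtime"); // case 2: replace the existing field
                replaced = true;
            } else {
                logical.add(field);
            }
        }
        if (!replaced) {
            logical.add(rowtimeField + ".rowtime"); // case 1: append a new logical field
        }
        return logical;
    }

    public static void main(String[] args) {
        // Hypothetical field names.
        System.out.println(resolve(Arrays.asList("Username", "Data"), "UserActionTime"));
        System.out.println(resolve(Arrays.asList("UserActionTime", "Username", "Data"), "UserActionTime"));
    }
}
```

The first call yields `[Username, Data, UserActionTime.rowtime]` and the second yields `[UserActionTime.rowtime, Username, Data]`. In both cases the marked field stands for the event time timestamp of the stream record.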
...
...
### Using a TableSource
The event time attribute is defined by a `TableSource` that implements the `DefinedRowtimeAttributes` interface. The `getRowtimeAttributeDescriptors()` method returns a list of `RowtimeAttributeDescriptor` for describing the final name of a time attribute, a timestamp extractor to derive the values of the attribute, and the watermark strategy associated with the attribute.
Please make sure that the `DataStream` returned by the `getDataStream()` method is aligned with the defined time attribute. The timestamps of the `DataStream` (the ones which are assigned by a `TimestampAssigner`) are only considered if a `StreamRecordTimestamp` timestamp extractor is defined. Watermarks of a `DataStream` are only preserved if a `PreserveWatermarks` watermark strategy is defined. Otherwise, only the values of the `TableSource`’s rowtime attribute are relevant.