
Anatomy of a Flink Program

2023-09-14 09:14:46

Each Flink program consists of the same basic parts:

1. Obtain an execution environment
2. Load/create the initial data
3. Specify transformations on this data
4. Specify where to put the results of your computations
5. Trigger the program execution

1. Obtain an execution environment
You can obtain one using these static methods on StreamExecutionEnvironment:

getExecutionEnvironment()

createLocalEnvironment()

createRemoteEnvironment(host: String, port: Int, jarFiles: String*)
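A brief sketch of the three factory methods, assuming the flink-streaming-scala dependency is on the classpath; the host, port, and jar path in the remote example are placeholders, not real values:

```scala
import org.apache.flink.streaming.api.scala._

// getExecutionEnvironment does the right thing in both cases: it creates
// a local environment when run from an IDE, and the cluster's environment
// when the program is submitted as a job.
val env = StreamExecutionEnvironment.getExecutionEnvironment

// Explicit alternatives:
val localEnv = StreamExecutionEnvironment.createLocalEnvironment(parallelism = 2)
val remoteEnv = StreamExecutionEnvironment.createRemoteEnvironment(
  "jobmanager-host", 8081, "/path/to/program.jar") // placeholder host/port/jar
```

In practice getExecutionEnvironment is what you want almost always, since the same program then runs unchanged locally and on a cluster.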

2. Load/create the initial data
For specifying data sources, the execution environment has several methods to read from files: you can read them line by line, as CSV files, or using completely custom data input formats. To read a text file as a sequence of lines, you can use:

val env = StreamExecutionEnvironment.getExecutionEnvironment()

val text: DataStream[String] = env.readTextFile("file:///path/to/file")

3. Specify transformations on this data
You apply transformations by calling methods on DataStream with a transformation function. For example, a map transformation looks like this:

val input: DataStream[String] = ...

val mapped = input.map { x => x.toInt }
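The function passed to map is an ordinary Scala function with element-wise semantics: one output record per input record. As a quick illustration (this snippet is plain Scala and runs without any Flink dependency), the same function applied to a plain collection behaves the same way:

```scala
// The same transformation function as above, applied element-wise
// to a plain Scala collection: one output element per input element.
val toInt: String => Int = _.toInt

val lines = List("1", "2", "3")
val mapped = lines.map(toInt)
// mapped == List(1, 2, 3)
```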

4. Specify where to put the results of your computations
Once you have a DataStream containing your final results, you can write it to an outside system by creating a sink. These are just some example methods for creating a sink:

writeAsText(path: String)

print()

5. Trigger the program execution
Once you have specified the complete program, you need to trigger its execution by calling execute() on the StreamExecutionEnvironment.

env.execute("AppName")

The execute() method returns a JobExecutionResult, which contains execution times and accumulator results.
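Putting the five parts together, a minimal end-to-end sketch might look like this. It assumes the flink-streaming-scala dependency is available; the object name, the sample strings, and the job name are illustrative:

```scala
import org.apache.flink.streaming.api.scala._

object AnatomyExample {
  def main(args: Array[String]): Unit = {
    // 1. Obtain an execution environment
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // 2. Load/create the initial data (fromElements is a simple built-in source)
    val text: DataStream[String] = env.fromElements("to be", "or not to be")

    // 3. Specify transformations on this data
    val lengths: DataStream[Int] = text.map(_.length)

    // 4. Specify where to put the results (print is a sink to stdout)
    lengths.print()

    // 5. Trigger the program execution
    env.execute("AnatomyExample")
  }
}
```

Note that nothing runs until step 5: the earlier calls only build up the dataflow graph, and execute() submits it.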