Anatomy of a Flink Program
Every Flink program consists of the same basic parts:
1. Obtain an execution environment,
2. Load/create the initial data,
3. Specify transformations on this data,
4. Specify where to put the results of your computations,
5. Trigger the program execution.
1. Obtain an execution environment
You can obtain one using these static methods on StreamExecutionEnvironment:
getExecutionEnvironment()
createLocalEnvironment()
createRemoteEnvironment(host: String, port: Int, jarFiles: String*)
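In most programs, getExecutionEnvironment() is all you need: it returns a local environment when you run inside your IDE and the cluster environment when the program is submitted as a job. A minimal sketch (the hostname, port, and JAR path below are illustrative placeholders, not values from this article):

```scala
import org.apache.flink.streaming.api.scala._

// Flink picks the appropriate environment based on how the program is started.
val env = StreamExecutionEnvironment.getExecutionEnvironment

// For explicit control you can also create an environment directly
// (hostname, port, and JAR path are placeholders):
val localEnv = StreamExecutionEnvironment.createLocalEnvironment()
val remoteEnv = StreamExecutionEnvironment.createRemoteEnvironment(
  "jobmanager-host", 6123, "/path/to/job.jar")
```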
2. Load/create the initial data
For specifying data sources, the execution environment has several methods to read from files: you can read them line by line, as CSV files, or using completely custom data input formats. To read a text file as a sequence of lines, you can use:
val env = StreamExecutionEnvironment.getExecutionEnvironment()
val text: DataStream[String] = env.readTextFile("file:///path/to/file")
3. Specify transformations on this data
You apply transformations by calling methods on DataStream with a transformation function. For example, a map transformation looks like this:
val input: DataStream[String] = ...
val mapped = input.map { x => x.toInt }
4. Specify where to put the results of your computations
Once you have a DataStream containing your final results, you can write it to an outside system by creating a sink. These are just some example methods for creating a sink:
writeAsText(path: String)
print()
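Applied to the mapped stream from step 3, these sinks look like this (the output path is a placeholder):

```scala
// Write each element's toString() to a text file (path is a placeholder)
mapped.writeAsText("file:///path/to/output")

// Or print each element to standard output, which is handy for debugging
mapped.print()
```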
5. Trigger the program execution
Once you have specified the complete program, you need to trigger execution by calling execute() on the StreamExecutionEnvironment.
env.execute("AppName")
The execute() method returns a JobExecutionResult, which contains execution times and accumulator results.
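Putting the five steps together, a complete program might look like the following sketch (the input path and the line-length transformation are illustrative; "AppName" is the job name passed to execute()):

```scala
import org.apache.flink.streaming.api.scala._

object Anatomy {
  def main(args: Array[String]): Unit = {
    // 1. Obtain an execution environment
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // 2. Load the initial data (path is a placeholder)
    val text: DataStream[String] = env.readTextFile("file:///path/to/file")

    // 3. Specify transformations: map each line to its length
    val lengths: DataStream[Int] = text.map { line => line.length }

    // 4. Specify where to put the results
    lengths.print()

    // 5. Trigger the program execution
    env.execute("AppName")
  }
}
```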