Syncing Elasticsearch to Hive with FlinkX (es2hive)
Time: 2023-03-15 22:02:01
1. Elasticsearch environment setup
- elasticsearch 6.4.3
- es-head
Elasticsearch is deployed with Docker.
First, create the directory structure:
cd docker
mkdir -p es
cd es
mkdir -p conf
mkdir -p data
mkdir -p logs
Next, the docker-compose.yml:
version: '3'
services:
  elasticsearch:
    image: elasticsearch:6.4.3
    container_name: elasticsearch
    volumes:
      - $PWD/data:/usr/share/elasticsearch/data
      - $PWD/logs:/usr/share/elasticsearch/logs
    ports:
      - '9200:9200'
      - '9300:9300'
    environment:
      - discovery.type=single-node
      - http.port=9200
      - http.cors.enabled=true
      - http.cors.allow-origin=*
      - http.cors.allow-headers=X-Requested-With,X-Auth-Token,Content-Type,Content-Length,Authorization
      - http.cors.allow-credentials=false
      - bootstrap.memory_lock=true
      - 'ES_JAVA_OPTS=-Xms512m -Xmx512m'
  es-head:
    image: tobias74/elasticsearch-head:6
    container_name: es-head
    ports:
      - '9100:9100'
    links:
      - elasticsearch
Verification: with the containers running, a request to http://localhost:9200 returns:
{
  "name" : "Yy55wm7",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "_3n-nj84QhKKpHZ71t0OKQ",
  "version" : {
    "number" : "6.4.3",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "fe40335",
    "build_date" : "2018-10-30T23:17:19.084789Z",
    "build_snapshot" : false,
    "lucene_version" : "7.4.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
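The same check can be scripted. The following is a minimal sketch, assuming Elasticsearch is reachable at http://localhost:9200 as mapped in the compose file; the function names `fetch_es_info` and `check_es_info` are made up for this illustration. It only compares the fields shown in the response above.

```python
import json
import urllib.request


def fetch_es_info(base_url="http://localhost:9200", timeout=5.0):
    """GET the Elasticsearch root endpoint and return the decoded JSON body."""
    with urllib.request.urlopen(base_url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def check_es_info(info, expected_version="6.4.3"):
    """Return a list of problems found in the root response (empty if none)."""
    problems = []
    # The image tag in docker-compose.yml is 6.4.3, so that is what we expect.
    version = info.get("version", {}).get("number")
    if version != expected_version:
        problems.append(f"expected version {expected_version}, got {version}")
    if "cluster_name" not in info:
        problems.append("response has no cluster_name field")
    return problems
```

With the container up, `check_es_info(fetch_es_info())` should return an empty list; any entries describe what differs from the expected response.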
2. FlinkX job configuration for es2hive
es2hive.json
{
  "job" : {
    "content" : [ {
      "reader": {
        "name": "esreader",
        "parameter": {
          "address": "localhost:9200",
          "index": "manage",
          "type": "user",
          "column": [
            {
              "name": "user_id",
              "type": "bigint"
            },{
              "name": "user_name",
              "type": "varchar"
            },{
              "name": "user_phone",
              "type": "varchar"
            }
          ]
        }
      },
      "writer": {
        "name" : "hivewriter",
        "parameter" : {
          "jdbcUrl" : "jdbc:hive2://localhost:10000/es",
          "username" : "wangkai",
          "password" : "wangkai",
          "fileType" : "text",
          "writeMode" : "overwrite",
          "compress" : "",
          "schema" : "es",
          "charsetName" : "UTF-8",
          "maxFileSize" : 1073741824,
          "tablesColumn" : "{\"demonstrate_users\": [{\"key\": \"user_id\",\"type\": \"BIGINT\"}, {\"key\": \"user_name\",\"type\": \"string\"}, {\"key\": \"user_phone\",\"type\": \"string\"}]}",
          "defaultFS" : "hdfs://localhost:9000"
        }
      }
    }
    ],
    "setting" : {
      "restore" : {
        "maxRowNumForCheckpoint" : 0,
        "isRestore" : false,
        "restoreColumnName" : "",
        "restoreColumnIndex" : 0
      },
      "errorLimit" : {
        "record" : 0,
        "percentage" : 0
      },
      "speed" : {
        "bytes" : 1048576,
        "channel" : 1
      }
    }
  }
}
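The `tablesColumn` value is a JSON document stored inside a JSON string, so every inner double quote has to be escaped, which is easy to get wrong by hand. As a hedged sketch, the job file can be generated instead so that `json.dumps` handles the escaping. The helper names and the `ES_TO_HIVE` type mapping are assumptions made for this example and cover only the types used in this job; all parameter values are copied from the job file above.

```python
import json

# Assumed mapping from the reader's column types to Hive column types.
ES_TO_HIVE = {"bigint": "BIGINT", "varchar": "string"}

READER_COLUMNS = [
    {"name": "user_id", "type": "bigint"},
    {"name": "user_name", "type": "varchar"},
    {"name": "user_phone", "type": "varchar"},
]


def build_tables_column(table, reader_columns):
    """Serialise the Hive table layout as a JSON string (not a nested object)."""
    cols = [{"key": c["name"], "type": ES_TO_HIVE[c["type"]]} for c in reader_columns]
    return json.dumps({table: cols})


def build_job(reader_columns, table="demonstrate_users"):
    """Assemble the es2hive job as a dict mirroring the job file shown above."""
    reader = {"name": "esreader", "parameter": {
        "address": "localhost:9200", "index": "manage", "type": "user",
        "column": reader_columns,
    }}
    writer = {"name": "hivewriter", "parameter": {
        "jdbcUrl": "jdbc:hive2://localhost:10000/es",
        "username": "wangkai", "password": "wangkai",
        "fileType": "text", "writeMode": "overwrite", "compress": "",
        "schema": "es", "charsetName": "UTF-8", "maxFileSize": 1073741824,
        # A string value: json.dumps escapes its inner quotes when the job is written.
        "tablesColumn": build_tables_column(table, reader_columns),
        "defaultFS": "hdfs://localhost:9000",
    }}
    setting = {
        "restore": {"maxRowNumForCheckpoint": 0, "isRestore": False,
                    "restoreColumnName": "", "restoreColumnIndex": 0},
        "errorLimit": {"record": 0, "percentage": 0},
        "speed": {"bytes": 1048576, "channel": 1},
    }
    return {"job": {"content": [{"reader": reader, "writer": writer}],
                    "setting": setting}}


def write_job(path, job):
    """Write the job dict to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(job, fh, indent=2)
```

Calling `write_job("es2hive.json", build_job(READER_COLUMNS))` produces a file whose `tablesColumn` is correctly escaped, and the Hive columns stay in step with the reader's column list.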
Run command
bin/flinkx \
  -mode local \
  -job /Users/wangkai/apps/install/flinkx/es2hive.json \
  -pluginRoot syncplugins \
  -flinkconf /Users/wangkai/apps/install/flink-1.13.1/conf
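If the job is launched from a script, the command can be assembled as an argument list, which avoids shell quoting and line-continuation mistakes. This is a sketch only: `flinkx_command` and `run_job` are hypothetical helpers, the flags are exactly the ones used in the command above, and the FlinkX install directory is assumed to be the folder containing `bin/flinkx` and `syncplugins`.

```python
import subprocess


def flinkx_command(job_file, flink_conf, mode="local", plugin_root="syncplugins"):
    """Build the argument list for bin/flinkx using only the flags shown above."""
    return [
        "bin/flinkx",
        "-mode", mode,
        "-job", str(job_file),
        "-pluginRoot", plugin_root,
        "-flinkconf", str(flink_conf),
    ]


def run_job(cmd, flinkx_home):
    """Run from the FlinkX directory so the relative paths in cmd resolve."""
    # check=True raises CalledProcessError if flinkx exits with a non-zero status.
    return subprocess.run(cmd, cwd=flinkx_home, check=True)
```

For the setup in this post, that would be `run_job(flinkx_command("/Users/wangkai/apps/install/flinkx/es2hive.json", "/Users/wangkai/apps/install/flink-1.13.1/conf"), "/Users/wangkai/apps/install/flinkx")`, where the last argument is the assumed install directory.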