es-hadoop saveToEsWithMeta
Time: 2023-09-14 09:11:57
    @Test
    def testEsRDDWriteWithDynamicMapping() {
      val doc1 = Map("one" -> null, "two" -> Set("2"), "three" -> (".", "..", "..."), "number" -> 1)
      val doc2 = Map("OTP" -> "Otopeni", "SFO" -> "San Fran", "number" -> 2)

      val target = wrapIndex("spark-test/scala-dyn-id-write")

      val pairRDD = sc.makeRDD(Seq((3, doc1), (4, doc2))).saveToEsWithMeta(target, cfg)

      assertEquals(2, EsSpark.esRDD(sc, target).count());
      assertTrue(RestUtils.exists(target + "/3"))
      assertTrue(RestUtils.exists(target + "/4"))
      assertThat(RestUtils.get(target + "/_search?"), containsString("SFO"))
    }

    @Test
    def testEsRDDWriteWithDynamicMapMapping() {
      val doc1 = Map("one" -> null, "two" -> Set("2"), "three" -> (".", "..", "..."), "number" -> 1)
      val doc2 = Map("OTP" -> "Otopeni", "SFO" -> "San Fran", "number" -> 2)

      val target = wrapIndex("spark-test/scala-dyn-id-write")

      val metadata1 = Map(ID -> 5, TTL -> "1d")
      val metadata2 = Map(ID -> 6, TTL -> "2d", VERSION -> "23")

      assertEquals(5, metadata1.getOrElse(ID, null))
      assertEquals(6, metadata2.getOrElse(ID, null))

      val pairRDD = sc.makeRDD(Seq((metadata1, doc1), (metadata2, doc2)))
      pairRDD.saveToEsWithMeta(target, cfg)

      assertTrue(RestUtils.exists(target + "/5"))
      assertTrue(RestUtils.exists(target + "/6"))
      assertThat(RestUtils.get(target + "/_search?"), containsString("SFO"))
    }
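The two tests differ only in the key of each pair: the first uses the document id directly, the second uses a Map of metadata keys. Both rely on helpers from the es-hadoop test suite (`cfg`, `wrapIndex`, `RestUtils`). Outside that suite, a minimal sketch of the same two variants might look like the following. It assumes a live `SparkContext` named `sc` (as in spark-shell) and an Elasticsearch node on 127.0.0.1; the index name `airports/by-id`, the document contents, and the `esCfg` name are made up for illustration.

```scala
import org.elasticsearch.spark._                 // adds saveToEsWithMeta to pair RDDs
import org.elasticsearch.spark.rdd.Metadata._    // metadata keys such as ID, TTL, VERSION

// Assumption: Elasticsearch is reachable on 127.0.0.1 (hypothetical settings map).
val esCfg = Map("es.nodes" -> "127.0.0.1", "es.index.auto.create" -> "true")

// Hypothetical documents, used only for illustration.
val otp = Map("iata" -> "OTP", "name" -> "Otopeni")
val sfo = Map("iata" -> "SFO", "name" -> "San Fran")

// Variant 1: the pair key is the document id itself.
sc.makeRDD(Seq((1, otp), (2, sfo))).saveToEsWithMeta("airports/by-id", esCfg)

// Variant 2: the pair key is a Map from Metadata constants to values.
// Only ID is set here; the test above also sets TTL and VERSION.
val otpMeta = Map(ID -> 3)
val sfoMeta = Map(ID -> 4)
sc.makeRDD(Seq((otpMeta, otp), (sfoMeta, sfo))).saveToEsWithMeta("airports/by-id", esCfg)
```

The Map form is the one to reach for when more than the id has to be controlled per document; for ids alone, the plain key is shorter.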
spark-2.0.0-bin-hadoop2.6/bin/spark-shell --jars elasticsearch-hadoop-5.0.1/dist/elasticsearch-spark-20_2.11-5.0.1.jar
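Make sure the versions match! The jar above, elasticsearch-spark-20_2.11-5.0.1, is the build for Spark 2.0 and Scala 2.11, which is what spark-2.0.0-bin-hadoop2.6 ships with.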
    import org.elasticsearch.spark._

    // sc.getConf returns a copy of the SparkConf, so setting es.* options on it
    // does not affect the running context; pass them to each save call instead.
    val cfg = Map("es.nodes" -> "127.0.0.1", "es.index.auto.create" -> "true")

    // Plain write: Elasticsearch generates the document ids.
    val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
    val airports = Map("OTP" -> "Otopeni", "SFO" -> "San Fran")
    val r = sc.makeRDD(Seq(numbers, airports))
    r.saveToEs("spark/data", cfg)

    // Write with metadata: the key of each pair (3 and 4) becomes the document id.
    val doc1 = Map("one" -> null, "two" -> Set("2"), "three" -> (".", "..", "..."), "number" -> 1)
    val doc2 = Map("OTP" -> "Otopeni", "SFO" -> "San Fran", "number" -> 2)
    val pairRDD = sc.makeRDD(Seq((3, doc1), (4, doc2)))
    pairRDD.saveToEsWithMeta("data/test", cfg)
Querying ES then shows the document with id 3 at data/test/3 and the document with id 4 at data/test/4!
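The same check can also be done from the spark-shell session instead of over HTTP. The sketch below reads the index back with `esRDD`, which yields (document id, document) pairs, and prints the ids. It assumes the write to `data/test` has completed and that Elasticsearch is reachable on 127.0.0.1; the `ids` name is only illustrative.

```scala
import org.elasticsearch.spark._   // adds esRDD to SparkContext

// esRDD returns an RDD of (documentId, document) pairs; keep only the ids.
val ids = sc.esRDD("data/test", Map("es.nodes" -> "127.0.0.1")).keys.collect().sorted

// If the write above succeeded, the printed ids should include 3 and 4.
ids.foreach(println)
```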