Setting Up a Standalone-Mode Spark Environment on Windows
Time: 2023-09-11 14:17:43
Java
Install Java 8, set JAVA_HOME, and add %JAVA_HOME%\bin to the PATH environment variable.
E:\>java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

Scala
Download and extract Scala 2.11, set SCALA_HOME, and add %SCALA_HOME%\bin to PATH.
E:\>scala -version
Scala code runner version 2.11.7 -- Copyright 2002-2013, LAMP/EPFL

Spark
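Download and extract Spark 2.1, set SPARK_HOME, and add %SPARK_HOME%\bin to PATH. Running spark-shell from the console at this point fails with the following error, which reports that winutils.exe cannot be located.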
E:\>spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/06/05 21:34:43 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:379)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:394)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:387)
    at org.apache.hadoop.hive.conf.HiveConf$ConfVars.findHadoopBinary(HiveConf.java:2327)
    at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:365)
    at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
    at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:991)
    at org.apache.spark.repl.Main$.createSparkSession(Main.scala:92)
    at $line3.$read$$iw.<init>(<console>:15)
    at $line3.$read$$iw.<init>(<console>:42)
    at $line3.$read.<init>(<console>:44)
    at $line3.$read$.<init>(<console>:48)
    at $line3.$read$.<clinit>(<console>)
    at $line3.$eval$.$print$lzycompute(<console>:7)
    at $line3.$eval$.$print(<console>:6)
    at $line3.$eval.$print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
    at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
    at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
    at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
    at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
    at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
    at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
    at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:105)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
    at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
    at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
    at org.apache.spark.repl.Main$.doMain(Main.scala:69)
    at org.apache.spark.repl.Main$.main(Main.scala:52)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
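The error message shows that Spark needs some Hadoop libraries, which it locates through the HADOOP_HOME environment variable. Because we never set that variable, the file path null\bin\winutils.exe contains null. This does not mean we have to install Hadoop, though. We can simply download the required winutils.exe to any location on disk, for example C:\winutils\bin\winutils.exe, and set HADOOP_HOME=C:\winutils.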
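To make the null in the path concrete, here is a minimal Python sketch of how such a path could be assembled from HADOOP_HOME. It is an illustration only, not Hadoop's actual lookup code; the helper name winutils_path is hypothetical, and rendering an unset variable as the text "null" is an assumption based on the error message above.

```python
import ntpath
import os


def winutils_path(env):
    """Return the winutils.exe path implied by HADOOP_HOME in the given mapping."""
    # Assumption: an unset (or empty) HADOOP_HOME shows up as the literal
    # text "null", matching the path printed in the error message.
    home = env.get("HADOOP_HOME") or "null"
    # ntpath always joins with backslashes, so this runs on any platform.
    return ntpath.join(home, "bin", "winutils.exe")


if __name__ == "__main__":
    path = winutils_path(os.environ)
    # os.path.isfile only gives a meaningful answer when run on Windows itself.
    print(path, "found" if os.path.isfile(path) else "missing")
```

Running it before and after setting HADOOP_HOME shows the path changing from the null form to the real location of winutils.exe.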
Now run spark-shell again. This time a new error appears:
java.lang.IllegalArgumentException: Error while instantiating org.apache.spark.sql.hive.HiveSessionState:
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
  at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
  at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
  at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
  at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:96)
  ... 47 elided
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating org.apache.spark.sql.hive.HiveExternalCatalog:
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
  ... 58 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating org.apache.spark.sql.hive.HiveExternalCatalog:
  at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
  at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
  at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
  at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
  ... 63 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
  ... 71 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
  at org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:66)
  ... 76 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
  at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)
  ... 84 more
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
  at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
  at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
  ... 85 more
<console>:14: error: not found: value spark
       import spark.implicits._
<console>:14: error: not found: value spark
       import spark.sql
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
The error message says that the temporary directory /tmp/hive is not writable:
The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
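So we need to update the permissions on E:/tmp/hive. (I ran spark-shell from drive E. If you run it from another drive, use that drive letter followed by /tmp/hive.) Run the following command: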
E:\>C:\winutils\bin\winutils.exe chmod 777 E:\tmp\hive
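Since the scratch directory depends on the drive spark-shell is started from, the command above can be derived from the working directory. The following Python sketch shows one way to do that; the helper names scratch_dir_for and chmod_command are hypothetical, and the assumption that the directory always sits at \tmp\hive on the current drive comes from the note above rather than from the Spark documentation.

```python
import ntpath


def scratch_dir_for(cwd):
    """Return <drive>\\tmp\\hive for the drive that cwd lives on."""
    drive, _ = ntpath.splitdrive(cwd)
    if not drive:
        raise ValueError("no drive letter in %r" % cwd)
    return drive + "\\tmp\\hive"


def chmod_command(hadoop_home, cwd):
    """Build the winutils chmod invocation as an argument list."""
    winutils = ntpath.join(hadoop_home, "bin", "winutils.exe")
    return [winutils, "chmod", "777", scratch_dir_for(cwd)]


if __name__ == "__main__":
    # Prints the argument list for the setup described in this article.
    print(chmod_command("C:\\winutils", "E:\\"))
```

On Windows the resulting list could be passed to subprocess.run(..., check=True), with os.getcwd() as the working directory and the HADOOP_HOME value as the first argument.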
Run spark-shell once more and Spark starts successfully. The Spark UI is now available at http://localhost:4040.
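If spark-shell still fails to start, the environment variables from the earlier steps are worth rechecking. The Python sketch below reports, for JAVA_HOME, SCALA_HOME and SPARK_HOME, whether the variable is set and whether its bin directory is on PATH. The function name check_tool_homes and the status strings are hypothetical, chosen only for this illustration.

```python
import ntpath
import os


def check_tool_homes(env, names=("JAVA_HOME", "SCALA_HOME", "SPARK_HOME")):
    """Map each variable name to 'ok', 'not set', or 'bin not on PATH'."""

    def norm(path):
        # Windows paths are case-insensitive; also drop trailing separators.
        return ntpath.normcase(ntpath.normpath(path))

    # Windows separates PATH entries with semicolons.
    path_entries = {norm(p) for p in env.get("PATH", "").split(";") if p}
    report = {}
    for name in names:
        home = env.get(name)
        if not home:
            report[name] = "not set"
        elif norm(ntpath.join(home, "bin")) in path_entries:
            report[name] = "ok"
        else:
            report[name] = "bin not on PATH"
    return report


if __name__ == "__main__":
    for name, status in check_tool_homes(os.environ).items():
        print(name, status)
```

HADOOP_HOME is left out of the default list because the steps above only require it to be set, not to have its bin directory on PATH.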