Setting up Scala sbt, Spark, Hadoop, and IntelliJ IDEA on Windows


We briefly describe the steps to set up Scala sbt, Spark, Hadoop, and IntelliJ IDEA on a Windows computer.

First, please follow the tools setup instructions to install the necessary JDK and Scala sbt. Then, follow this IntelliJ IDEA tutorial to set up the IDE.

Next, we focus on Spark and Hadoop.

  1. Download the latest version of Spark from the Spark website.
    • Choose a Spark release: choose your desired version (I chose spark-2.2.1)
    • Choose a package type: select Pre-built for Apache Hadoop 2.7 and later
    • Then, click the download link for spark-x.x.x-bin-hadoop2.7.tgz
  2. Extract the Spark archive to your C drive, for example to: C:\spark-2.2.1-bin-hadoop2.7\

  3. Download the entire winutils distribution from this repository. From the downloaded repo, copy the folder that corresponds to the Hadoop version your Spark package was built for to a location on your C drive. In my case, I copied the hadoop-2.7.1 folder to C:\. Then, rename the hadoop-x.x.x folder to winutil.

  4. Open your Environment Variables settings and set the following System Variables (for Path, add each value as a new entry instead of replacing the existing ones):

    Variable       Value
    HADOOP_HOME    C:\winutil\
    SPARK_HOME     C:\spark-2.2.1-bin-hadoop2.7\
    Path           C:\spark-2.2.1-bin-hadoop2.7\bin
    Path           C:\winutil\bin
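
If you want to double-check the variables above before starting Spark, a small Scala program can do it. This is only a sketch: SetupCheck and problems are hypothetical names made up for this post, not part of Spark or Hadoop, and it assumes the folder layout described in the steps above (winutils.exe inside %HADOOP_HOME%\bin and spark-shell.cmd inside %SPARK_HOME%\bin).

```scala
import java.nio.file.{Files, Paths}

// Hypothetical helper for this post; not part of Spark or Hadoop.
object SetupCheck {

  // Returns a list of human-readable problems; an empty list means the
  // variables look consistent with the steps described above.
  def problems(env: Map[String, String]): List[String] = {
    // Both variables should be defined as System Variables.
    val missing = List("HADOOP_HOME", "SPARK_HOME")
      .filterNot(env.contains)
      .map(name => s"$name is not set")

    // Assumption: winutils.exe sits in the bin folder under HADOOP_HOME.
    val winutils = env.get("HADOOP_HOME").toList.flatMap { home =>
      val exe = Paths.get(home, "bin", "winutils.exe")
      if (Files.isRegularFile(exe)) Nil
      else List(s"winutils.exe not found at $exe")
    }

    // Assumption: the Spark launcher script sits in the bin folder under SPARK_HOME.
    val launcher = env.get("SPARK_HOME").toList.flatMap { home =>
      val cmd = Paths.get(home, "bin", "spark-shell.cmd")
      if (Files.isRegularFile(cmd)) Nil
      else List(s"spark-shell.cmd not found at $cmd")
    }

    missing ++ winutils ++ launcher
  }

  def main(args: Array[String]): Unit = {
    val found = problems(sys.env)
    if (found.isEmpty) println("Environment variables look consistent.")
    else found.foreach(println)
  }
}
```

Run it from a command prompt that was opened after you saved the variables, because a prompt that was already open keeps the old values.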

To verify the Spark installation, open a command prompt and navigate to your SPARK_HOME directory, for example C:\spark-2.2.1-bin-hadoop2.7\. Run bin\spark-shell, which should start a Scala console with a Spark context available as sc and a Spark session available as spark. Exit the shell with the :quit command.
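
Once the shell is up, a tiny job is a quick way to confirm that Spark actually runs and does not just start. The lines below are a sketch meant to be pasted into the running spark-shell; they rely on the sc and spark values that the shell predefines, so they will not compile as a standalone file.

```scala
// Paste into a running spark-shell; `sc` and `spark` are predefined there.

// Distribute the numbers 1 to 100 as an RDD and add them up.
val total = sc.parallelize(1 to 100).sum()
println(total)          // prints 5050.0 (sum() on an RDD returns a Double)

// Print the Spark version the shell is running.
println(spark.version)
```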


That is it, we are done. Yay!
