Scala is a general-purpose programming language. One may use it as a statistical tool, for pattern matching and so on, just as one might use Java, C++ or Fortran. On top of that, Scala is used to drive Big Data processing on a Hadoop platform. For me, being interested in Big Data, that makes Scala a worthwhile investment.
Let me first show a screenshot of a small Scala programme:

The programme is quite straightforward. A function is defined with the name codeer. It takes one argument. When this argument is an ‘a’ or a ‘b’, it is translated into a ‘1’ or a ‘2’ respectively. In all other cases, it is left untranslated. The variable ‘kijk’, which holds the resulting character, is returned. Subsequently, one may call this function like any other.
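For readers who cannot see the screenshot, the function described above could look roughly like the sketch below. This is a reconstruction from the description, not the original code: the parameter name letter, the Char types and the use of a match expression are assumptions. Only the names codeer and kijk come from the text. It is meant to be pasted into the Scala shell.

```scala
// Hypothetical reconstruction of codeer, based only on the description in the text.
// Assumption: the function takes a single Char and returns a Char.
def codeer(letter: Char): Char = {
  // kijk holds the translated character
  val kijk = letter match {
    case 'a'   => '1'   // 'a' is translated into '1'
    case 'b'   => '2'   // 'b' is translated into '2'
    case other => other // every other character is left as it is
  }
  kijk
}

// Using the function on single characters and on a whole string
val een = codeer('a')             // '1'
val onvertaald = codeer('z')      // 'z'
val vertaald = "abc".map(codeer)  // "12c"
```

The match expression is only one way to write this; an if/else chain would behave the same.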

In the Big Data environment, one may start Scala with the command “spark-shell”. If everything is installed correctly, one sees something like:


The integration between the Big Data environment and Scala is clearly strong here. When the shell starts, one sees that a context variable, sc, is created that gives access to the Spark environment:

This makes it possible to load a Big Data dataset as an object in Scala:

val rawblocks = sc.textFile("linkage")

The nice thing is that we can grab all the files in the directory “linkage” and access them as one object, called rawblocks.
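To check that the files really arrive as one object, one can inspect rawblocks directly in the shell. The lines below are a minimal sketch, assuming they are typed inside spark-shell (where sc already exists) and that a directory “linkage” with text files is present. The names eersteRegel and aantalRegels are only illustrative.

```scala
// Assumption: run inside spark-shell, where the context variable sc is predefined,
// and a directory "linkage" with text files exists.
val rawblocks = sc.textFile("linkage")

// first() returns the first line of the combined dataset
val eersteRegel = rawblocks.first()

// take(5) brings the first five lines to the shell, where they can be printed
rawblocks.take(5).foreach(println)

// count() gives the total number of lines over all files in the directory
val aantalRegels = rawblocks.count()
```

Note that textFile itself only describes where the data is; the files are actually read when a call such as first, take or count asks for a result.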