Analysing JSON and database tables in Spark

In a previous note, I showed how CSV files can be analysed. The same technique can be used to analyse JSON files or tables in a database. First, JSON files can be analysed with code that looks like:

// Read each JSON file as a whole; wholeTextFiles returns (path, content) pairs, so keep only the content.
val jsonRDD = sc.wholeTextFiles("/user/tom/baby_names.json").map(x => x._2)
// Parse the JSON strings into a DataFrame and register it as a temporary table.
val namesJson = sqlContext.read.json(jsonRDD)
namesJson.registerTempTable("names")
sqlContext.sql("select * from names").collect.foreach(println)
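The snippet above uses the Spark 1.x sqlContext API. As a rough sketch, the same analysis could also be written against the newer SparkSession API (Spark 2.x and later); the file path and table name are simply carried over from the example above:

// Minimal sketch for Spark 2.x+, reusing the path and table name from the example above.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("json-analysis").getOrCreate()

// multiLine lets Spark parse a JSON document that spans several lines.
val namesJson = spark.read.option("multiLine", "true").json("/user/tom/baby_names.json")
namesJson.createOrReplaceTempView("names")
spark.sql("select * from names").collect.foreach(println)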

Reading a table from a database requires an additional jar that contains the JDBC driver for that database. In the case of MySQL, we may use mysql-connector-java-5.0.8-bin.jar, which must be stored in a directory next to the other jars. We can then start Spark with spark-shell --jars lib/mysql-connector-java-5.0.8-bin.jar.
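Alternatively, if the machine has internet access, the connector can be pulled in from Maven Central instead of shipping the jar by hand; the exact coordinate below is an assumption and should be checked against the MySQL version in use:

spark-shell --packages mysql:mysql-connector-java:5.1.49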
The code looks like:

// Load the wp_comments table from MySQL over JDBC into a DataFrame.
val df1 = sqlContext.read.format("jdbc").
  option("url", "jdbc:mysql://van-maanen.com/wordpress").option("driver", "com.mysql.jdbc.Driver").
  option("dbtable", "wp_comments").option("user", "tom").option("password", "filpso").load()
df1.registerTempTable("names")
sqlContext.sql("select * from names").collect.foreach(println)
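If the table is large, it can pay off to let MySQL do part of the work. As a sketch, the dbtable option also accepts a subquery, so only the needed columns and rows are pulled into Spark; the column names below are assumptions based on the standard WordPress schema:

// Sketch: push a selection down to MySQL via a subquery in the dbtable option.
// Column names are assumed from the standard WordPress wp_comments schema.
val approved = sqlContext.read.format("jdbc").
  option("url", "jdbc:mysql://van-maanen.com/wordpress").option("driver", "com.mysql.jdbc.Driver").
  option("dbtable", "(select comment_author, comment_date, comment_content from wp_comments where comment_approved = '1') as c").
  option("user", "tom").option("password", "filpso").load()
approved.show()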

By tom