Another example shows a similar idea. Here a stream of three objects is created, each containing a name and a number (an age). Once the stream is created, it is serialised. In other words, it is prepared for storage. It is then written to a file called “test.avro”.
Before continuing, one remark on serialisation.
The idea of serialisation is to create a format that is understandable outside the original language. An object that is created in Java can only be handled inside Java. To communicate its content, one needs a format that other languages understand, built from simple values such as strings and integers. The translation from an object into such values is called serialisation. In this case, everything is translated into strings and integers, and these are written to a file. That file can be read by, say, PHP or Oracle, because they only encounter strings and integers, which can be passed from Java to Oracle or PHP without trouble.
In that file, we can detect both the schema along which the data are stored and the actual data. It takes a bit of courage to look, as it is a binary file. Subsequently, the data are read back from that file and the contents are shown.
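To make the remark concrete, here is a minimal sketch of serialisation by hand, using only the Java standard library. It is not how Avro does it. The class name PlainSerialisationSketch and the byte layout (a 4-byte length, the name in UTF-8, then a 4-byte age) are my own assumptions for illustration. The point is only that the object is reduced to a string and an integer that any other language could read back.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Avro or of the programme in this post.
class PlainSerialisationSketch {

    // Turn a (name, age) pair into bytes: 4-byte length, UTF-8 name, 4-byte age.
    static byte[] toBytes(String name, int age) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + nameBytes.length + 4);
        buf.putInt(nameBytes.length);
        buf.put(nameBytes);
        buf.putInt(age);
        return buf.array();
    }

    // Read the same layout back and render it as text.
    static String describe(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        byte[] nameBytes = new byte[buf.getInt()];
        buf.get(nameBytes);
        int age = buf.getInt();
        return "Name " + new String(nameBytes, StandardCharsets.UTF_8) + " Age " + age;
    }

    public static void main(String[] args) {
        byte[] data = toBytes("Joe", 31);
        System.out.println(data.length + " bytes");  // 4 + 3 + 4 = 11 bytes
        System.out.println(describe(data));          // Name Joe Age 31
    }
}
```

A reader in another language only has to know the layout, not the Java class. Avro goes one step further and stores a description of the layout (the schema) in the file itself.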
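Looking inside the binary file is less daunting with a little help. The sketch below is my own illustration, and the class AvroPeekSketch and its method names are hypothetical. It relies on one fact about the format: an Avro container file starts with the four bytes 'O', 'b', 'j', 1, followed by metadata in which the schema is stored as readable JSON. Replacing every non-printable byte with a dot makes the schema and the names stand out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; it does not use the Avro library.
class AvroPeekSketch {

    // True if the data starts with the Avro container magic: 'O', 'b', 'j', 1.
    static boolean hasAvroMagic(byte[] data) {
        return data.length >= 4
                && data[0] == 'O' && data[1] == 'b' && data[2] == 'j' && data[3] == 1;
    }

    // Replace every non-printable byte with '.', so embedded text stands out.
    static String printable(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length);
        for (byte b : data) {
            sb.append(b >= 32 && b < 127 ? (char) b : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("test.avro");
        if (!Files.exists(file)) {
            System.out.println("test.avro not found; run the programme that writes it first");
            return;
        }
        byte[] data = Files.readAllBytes(file);
        System.out.println("Avro container file: " + hasAvroMagic(data));
        System.out.println(printable(data));
    }
}
```

Run against the file from this example, the printable view should show the schema text near the start and the three names further on, with the ages hidden among the dots and punctuation because they are stored as single bytes.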
The programme is written in Java. It reads:
package avro;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class EmployeeTom {
    // The schema is parsed once, when the class is loaded.
    public static final Schema SCHEMA = loadSchema();

    private final String name;
    private final int age;

    public EmployeeTom(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Reads EmployeeTom.avsc from the directory that holds the class files.
    private static Schema loadSchema() {
        try (InputStream in = EmployeeTom.class.getResourceAsStream("EmployeeTom.avsc")) {
            if (in == null) {
                throw new IllegalStateException("EmployeeTom.avsc not found next to the class files");
            }
            // Schema.Parser replaces the deprecated static Schema.parse method.
            return new Schema.Parser().parse(in);
        } catch (IOException e) {
            throw new UncheckedIOException("Couldn't load a schema", e);
        }
    }

    // Copies the fields of this object into a generic Avro record.
    public GenericData.Record serialize() {
        GenericData.Record record = new GenericData.Record(SCHEMA);
        record.put("name", this.name);
        record.put("age", this.age);
        return record;
    }

    // Writes the schema and then one record per employee to the file.
    public static void testWrite(File file, EmployeeTom[] people) throws IOException {
        GenericDatumWriter<GenericRecord> datum = new GenericDatumWriter<>(SCHEMA);
        try (DataFileWriter<GenericRecord> writer = new DataFileWriter<>(datum)) {
            writer.create(SCHEMA, file);
            for (EmployeeTom p : people) {
                writer.append(p.serialize());
            }
        }
    }

    // Reads the records back; the schema is taken from the file itself.
    public static void testRead(File file) throws IOException {
        GenericDatumReader<GenericRecord> datum = new GenericDatumReader<>();
        try (DataFileReader<GenericRecord> reader = new DataFileReader<>(file, datum)) {
            GenericRecord record = null;
            while (reader.hasNext()) {
                record = reader.next(record); // reuse the record object between reads
                System.out.println("Name " + record.get("name") +
                        " Age " + record.get("age"));
            }
        }
    }

    public static void main(String[] args) {
        EmployeeTom e1 = new EmployeeTom("Joe", 31);
        EmployeeTom e2 = new EmployeeTom("Jane", 30);
        EmployeeTom e3 = new EmployeeTom("Zoe", 21);
        EmployeeTom[] all = new EmployeeTom[] {e1, e2, e3};
        File bf = new File("test.avro");
        try {
            testWrite(bf, all);
            testRead(bf);
        } catch (IOException e) {
            System.out.println("Main: " + e.getMessage());
        }
    }
}
A final remark. I stored the schema, in a file called EmployeeTom.avsc, in the same directory as the class files. This allowed the class EmployeeTom to find the schema file. The schema looked like this:
{
  "type": "record",
  "name": "Employee",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"}
  ]
}
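It may help to see what this schema means for the bytes of a single record. The sketch below is my own and does not call the Avro library. The class AvroRecordBytesSketch and its methods are hypothetical names. It follows the binary encoding as I understand it from the Avro specification: an int is written as a zig-zag variable-length number, a string as its length followed by its UTF-8 bytes, and a record as its fields one after another in schema order. The length of a string is formally a long, but for small values it produces the same bytes as an int.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; the real work is done by Avro's own encoders.
class AvroRecordBytesSketch {

    // Zig-zag maps small positive and negative numbers to small unsigned values,
    // which are then written 7 bits at a time, lowest bits first.
    static void writeInt(ByteArrayOutputStream out, int value) {
        int n = (value << 1) ^ (value >> 31);
        while ((n & ~0x7F) != 0) {
            out.write((n & 0x7F) | 0x80);
            n >>>= 7;
        }
        out.write(n);
    }

    // A string is its byte length followed by the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeInt(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Fields in schema order: "name" (string), then "age" (int).
    static byte[] encodeEmployee(String name, int age) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, name);
        writeInt(out, age);
        return out.toByteArray();
    }

    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(encodeEmployee("Joe", 31)));  // 06 4A 6F 65 3E
    }
}
```

The record for Joe takes only five bytes: 06 is the length 3, 4A 6F 65 spells "Joe", and 3E is the age 31. The field names do not appear in the record at all, which is why the schema has to be stored once at the start of the file.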