Another example shows a similar idea. In this example a stream is created that consists of three objects, each containing a name and a number. Once the stream is created, it is serialised; in other words, it is prepared to be stored. It is then stored in a file called “test.avro”.
Before continuing, one remark on serialisation.
The idea behind serialisation is to create a format that can be understood outside the original language. An object that is created in Java can only be handled inside Java. To communicate its content, one needs a format that is understandable by other languages, such as a string or an integer. The translation of an object into strings and integers is called serialisation. In this case, everything is translated into strings and integers, which are written to a file. That file can be read by, say, PHP or Oracle, as they will only encounter strings and integers, which can be passed from Java to PHP or Oracle.
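To make the remark concrete, here is a minimal sketch that uses only the Java standard library. It is not how Avro lays out its data; it merely shows an object's content being reduced to a string and an integer, written as bytes, and read back. The class name SerialisationSketch and its two helper methods are hypothetical names chosen for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class SerialisationSketch {
    // Hypothetical helper: reduce a (name, age) pair to plain bytes.
    public static byte[] toBytes(String name, int age) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(name); // 2-byte length, then the characters
            out.writeInt(age);  // 4 bytes, big-endian
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Hypothetical helper: read the string and the integer back.
    public static String fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String name = in.readUTF();
            int age = in.readInt();
            return "Name " + name + " Age " + age;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes("Joe", 31);
        System.out.println(bytes.length + " bytes"); // prints "9 bytes"
        System.out.println(fromBytes(bytes));        // prints "Name Joe Age 31"
    }
}
```

Any other language that knows this simple layout could read the same nine bytes. Avro goes one step further by storing a description of the layout (the schema) together with the data.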
In the file “test.avro”, we may detect both the schema along which the data are stored and the actual data. It takes a bit of courage to look at, as it is a binary file. Subsequently, the programme reads the records back from that file and shows their contents.
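For those who want to look inside the binary file, the sketch below may help. It assumes the Avro object container format, in which a file starts with the four bytes 'O', 'b', 'j' and 1, followed by metadata that holds the schema as JSON text. The class AvroFilePeek and its methods are hypothetical names for this illustration; the sketch uses only the Java standard library and prints every non-printable byte as a dot, so that the schema text stands out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class AvroFilePeek {
    // True if the bytes start with the Avro container magic: 'O', 'b', 'j', 1.
    public static boolean hasAvroMagic(byte[] bytes) {
        return bytes.length >= 4
                && bytes[0] == 'O' && bytes[1] == 'b'
                && bytes[2] == 'j' && bytes[3] == 1;
    }

    // Replace every non-printable byte with '.', so embedded text stands out.
    public static String printable(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(b >= 32 && b < 127 ? (char) b : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file written by the example programme.
        Path path = Paths.get(args.length > 0 ? args[0] : "test.avro");
        if (!Files.exists(path)) {
            System.out.println("No such file: " + path);
            return;
        }
        byte[] bytes = Files.readAllBytes(path);
        System.out.println("Avro magic found: " + hasAvroMagic(bytes));
        System.out.println(printable(bytes));
    }
}
```

Run against the file “test.avro”, the output should show the schema as readable JSON near the start, followed by the names between the dots.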
The programme is written in Java. It reads as follows:
package avro;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.util.Utf8;

@SuppressWarnings("deprecation")
class EmployeeTom {
    public static Schema SCHEMA;

    static {
        try {
            SCHEMA = Schema.parse(EmployeeTom.class.getResourceAsStream("EmployeeTom.avsc"));
        } catch (IOException e) {
            System.out.println("Couldn't load a schema: " + e.getMessage());
        }
    }

    private String name;
    private int age;

    public EmployeeTom(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public GenericData.Record serialize() {
        GenericData.Record record = new GenericData.Record(SCHEMA);
        record.put("name", this.name);
        record.put("age", this.age);
        return record;
    }

    public static void testWrite(File file, EmployeeTom[] people) throws IOException {
        GenericDatumWriter datum = new GenericDatumWriter(EmployeeTom.SCHEMA);
        DataFileWriter writer = new DataFileWriter(datum);
        writer.create(EmployeeTom.SCHEMA, file);
        for (EmployeeTom p : people)
            writer.append(p.serialize());
        writer.close();
    }

    public static void testRead(File file) throws IOException {
        GenericDatumReader datum = new GenericDatumReader();
        DataFileReader reader = new DataFileReader(file, datum);
        GenericData.Record record = new GenericData.Record(reader.getSchema());
        while (reader.hasNext()) {
            reader.next(record);
            System.out.println("Name " + record.get("name") + " Age " + record.get("age"));
        }
        reader.close();
    }

    public static void main(String[] args) {
        EmployeeTom e1 = new EmployeeTom("Joe", 31);
        EmployeeTom e2 = new EmployeeTom("Jane", 30);
        EmployeeTom e3 = new EmployeeTom("Zoe", 21);
        EmployeeTom[] all = new EmployeeTom[] {e1, e2, e3};
        File bf = new File("test.avro");
        try {
            testWrite(bf, all);
            testRead(bf);
        } catch (IOException e) {
            System.out.println("Main: " + e.getMessage());
        }
    }
}
A final remark. I stored the schema in a file called EmployeeTom.avsc, in the same directory as the class files. This allowed the class EmployeeTom to find the schema file. The schema looked like:
{
  "type": "record",
  "name": "Employee",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"}
  ]
}
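To give an impression of what one record looks like inside the binary file, the sketch below encodes one employee by hand. It assumes the binary encoding described in the Avro specification: a record is the concatenation of its fields in schema order, a string is a length followed by UTF-8 bytes, and integers use a zig-zag, variable-length encoding. The class AvroEncodingSketch and its methods are hypothetical names for this illustration; it uses only the Java standard library and is not a replacement for the Avro library itself.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class AvroEncodingSketch {
    // Zig-zag plus variable-length encoding of an integer value.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);            // zig-zag: small magnitudes stay small
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // 7 bits of data, high bit = "more follows"
            z >>>= 7;
        }
        out.write((int) z);
    }

    // A string is its UTF-8 length followed by the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Fields are written in the order of the schema: name first, then age.
    public static byte[] encodeEmployee(String name, int age) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, name);
        writeLong(out, age);
        return out.toByteArray();
    }

    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // prints "06 4A 6F 65 3E"
        System.out.println(toHex(encodeEmployee("Joe", 31)));
    }
}
```

The record for Joe thus takes only five bytes: 06 for the length 3, the three letters, and 3E for the age 31. The field names are not repeated per record, as they are already known from the schema.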