This note describes how we can show the content of an AVRO file with Python.
We use Python 3, running from an Anaconda installation. I checked whether this installation already contained an AVRO package, but it did not. I therefore downloaded AVRO (as avro-python3-1.8.2.tar.gz) and installed it with the following command:
C:\ProgramData\Anaconda3\python.exe -m pip install C:\ProgramData\Anaconda3\avro-python3-1.8.2.tar.gz
[Screenshot: output of the pip install command]
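The check for an existing AVRO package can also be scripted. The following is a minimal sketch, assuming the package's import name is avro (as in the import statements used in this note); the helper name is_installed is hypothetical and not part of any library.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


# Assumption: the AVRO package is imported as "avro".
print("avro installed:", is_installed("avro"))
```

If this prints False, the package still has to be installed with pip as described above.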
Once AVRO is installed, an AVRO file can be created with:
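import avro.schema
from avro.datafile import DataFileWriter
from avro.io import DatumWriter

# Load the schema; avro-python3 spells this function Parse (capital P).
with open("file.avsc", "r") as schema_file:
    schema = avro.schema.Parse(schema_file.read())

# Write two records; the first one leaves out favorite_color.
writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
writer.append({"name": "Alyssa", "favorite_number": 256})
writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
writer.close()  # flushes the records and closes users.avro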
The schema is read from the file file.avsc, which contains:
{"namespace": "example.avro",
"type": "record",
"name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "favorite_number", "type": ["int", "null"]},
{"name": "favorite_color", "type": ["string", "null"]}
]
}
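The schema is plain JSON, so it can be inspected with the standard library before AVRO is involved at all. The sketch below inlines the schema from this note (so that it runs without file.avsc) and lists the fields whose type is a union containing "null". The helper name optional_fields is hypothetical. That favorite_color is such a field is presumably why the first record above can be written without it.

```python
import json

# The schema from this note, inlined so that the sketch runs without file.avsc.
SCHEMA_JSON = """
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
"""


def optional_fields(schema):
    """Return the names of fields whose type is a union that includes "null"."""
    return [
        field["name"]
        for field in schema["fields"]
        if isinstance(field["type"], list) and "null" in field["type"]
    ]


schema = json.loads(SCHEMA_JSON)
print(optional_fields(schema))  # ['favorite_number', 'favorite_color']
```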
Once the file has been created, it can be read with:
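from avro.datafile import DataFileReader
from avro.io import DatumReader

# No schema is passed here: the reader takes it from the file itself.
reader = DataFileReader(open("users.avro", "rb"), DatumReader())
for user in reader:
    print(user)  # each record is returned as a dict
reader.close()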
This reader is generic: an AVRO file carries its own schema, so the same program can read any other AVRO file. I downloaded another AVRO file (called twitter.avro), and the output looked like:
runfile('C:/Users/tmaanen/.spyder-py3/avroTomLees.py', wdir='C:/Users/tmaanen/.spyder-py3')
{'username': 'miguno', 'tweet': 'Rock: Nerf paper, scissors is fine.', 'timestamp': 1366150681}
{'username': 'BlizzardCS', 'tweet': 'Works as intended. Terran is IMBA.', 'timestamp': 1366154481}
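Because each record arrives as an ordinary dict, it can be post-processed with plain Python. The sketch below copies the two records from the output above and prints them with a readable date. It assumes that the timestamp field holds seconds since 1970-01-01 UTC, which the file itself does not state. The helper name format_record is hypothetical.

```python
from datetime import datetime, timezone

# The two records copied from the output above.
records = [
    {'username': 'miguno', 'tweet': 'Rock: Nerf paper, scissors is fine.', 'timestamp': 1366150681},
    {'username': 'BlizzardCS', 'tweet': 'Works as intended. Terran is IMBA.', 'timestamp': 1366154481},
]


def format_record(record):
    """Format one record as '<ISO date> <username>: <tweet>'."""
    # Assumption: 'timestamp' is in seconds since the Unix epoch (UTC).
    when = datetime.fromtimestamp(record['timestamp'], tz=timezone.utc)
    return "{} {}: {}".format(when.isoformat(), record['username'], record['tweet'])


for record in records:
    print(format_record(record))
```

In the reading program, the same function could be called on each record inside the for loop instead of print.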