This note describes how to display the contents of an Avro file with Python.
We use Python 3 here, running from an Anaconda installation. I checked whether this installation already contained an Avro package, but it did not. I therefore downloaded Avro (as avro-python3-1.8.2.tar.gz) and installed it with the following command:
C:\ProgramData\Anaconda3\python.exe -m pip install C:\ProgramData\Anaconda3\avro-python3-1.8.2.tar.gz
[Screenshot: output of the pip install command]
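The check for an existing Avro package can also be done from Python itself. The following is a minimal sketch that uses only the standard library; the helper name is_installed is my own and is not part of Avro or Anaconda.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    # find_spec returns None when no such top-level module or package exists
    return importlib.util.find_spec(package_name) is not None


# "avro" is the import name used by the avro-python3 distribution
print("avro installed:", is_installed("avro"))
```

If this prints False, the pip command above (or an equivalent install) is still needed.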
Once Avro is installed, an Avro file can be created with:
import avro.schema
from avro.datafile import DataFileWriter
from avro.io import DatumWriter

# Parse the schema definition stored in file.avsc
schema = avro.schema.Parse(open('file.avsc', "r").read())

# Write two records to users.avro
writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
writer.append({"name": "Alyssa", "favorite_number": 256})
writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
writer.close()
The script reads the schema from the file file.avsc, which contains:
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
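To see which fields of this schema may be empty, the schema can be inspected with plain Python, since an .avsc file is ordinary JSON. The sketch below embeds the schema as a string so that it runs on its own; the helper describe_fields is a name I made up for illustration. A field whose type is a union that includes "null" accepts a missing value, which appears to be why a record without favorite_color can still be written.

```python
import json

# The same schema as in file.avsc, embedded here so the example is self-contained
SCHEMA_JSON = """
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
"""


def describe_fields(schema):
    """Return a list of (field name, list of types, nullable) tuples."""
    rows = []
    for field in schema["fields"]:
        field_type = field["type"]
        # A union is written as a JSON list; a single type is a plain string
        types = field_type if isinstance(field_type, list) else [field_type]
        rows.append((field["name"], types, "null" in types))
    return rows


schema = json.loads(SCHEMA_JSON)
for name, types, nullable in describe_fields(schema):
    print(name, types, "optional" if nullable else "required")
```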
Once the file has been created, it can be read back with:
from avro.datafile import DataFileReader
from avro.io import DatumReader

# No schema is passed in: the reader takes it from the file itself
reader = DataFileReader(open("users.avro", "rb"), DatumReader())
for user in reader:
    print(user)
reader.close()
This reader script is convenient because it is not tied to one file: since an Avro file carries its own schema, any other Avro file can be read the same way. I downloaded another Avro file (called twitter.avro) and the corresponding output looked like:
runfile('C:/Users/tmaanen/.spyder-py3/avroTomLees.py', wdir='C:/Users/tmaanen/.spyder-py3')
{'username': 'miguno', 'tweet': 'Rock: Nerf paper, scissors is fine.', 'timestamp': 1366150681}
{'username': 'BlizzardCS', 'tweet': 'Works as intended. Terran is IMBA.', 'timestamp': 1366154481}
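The reader script works on a downloaded file without any .avsc file because an Avro container file stores the writer's schema in its header. The sketch below pulls that schema out using only the standard library, following the object container file layout as I understand it from the Avro specification (magic bytes, a metadata map, then a sync marker). The function names read_long, read_bytes and read_embedded_schema are my own and are not part of the avro package; for everyday use, DataFileReader remains the simpler route.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one Avro long (zigzag-encoded, variable-length integer)."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of stream while reading a long")
        accum |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear: last byte of this number
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)  # undo the zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string."""
    return stream.read(read_long(stream))


def read_embedded_schema(stream):
    """Return the schema stored in the header of an Avro container file."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count ends the metadata map
            break
        if count < 0:  # negative count: a byte size follows, unused here
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    return json.loads(metadata["avro.schema"].decode("utf-8"))


# Demo on a tiny hand-built header, so the example runs without any file
demo_schema = b'{"type": "string"}'
demo_header = (
    MAGIC
    + bytes([2])                                   # one map entry (zigzag of 1)
    + bytes([22]) + b"avro.schema"                 # key length 11, then the key
    + bytes([2 * len(demo_schema)]) + demo_schema  # value length, then the value
    + bytes([0])                                   # end of the metadata map
    + bytes(16)                                    # 16-byte sync marker
)
print(read_embedded_schema(io.BytesIO(demo_header)))
```

For a real file, the same function can be called on a file opened in binary mode, for example open("twitter.avro", "rb"), assuming that file is in the working directory.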