Design of this package
This package is designed to deserialize binary data stored in ROOT files. It is expected to be used alongside another package, such as uproot, which fetches data buffers (either locally or over the network) and provides them to this package. Interpreting deserialized data as numpy or awkward arrays is also out of scope, with some minor exceptions where performance is critical.
All deserialized data is stored as dataclass objects, inheriting from the base
ROOTSerializable type. This type has two main methods:
    Members = dict[str, Any]

    @dataclasses.dataclass
    class ROOTSerializable:
        @classmethod
        def read(cls: type[T], buffer: ReadBuffer) -> tuple[T, ReadBuffer]: ...

        @classmethod
        def update_members(
            cls, members: Members, buffer: ReadBuffer
        ) -> tuple[Members, ReadBuffer]: ...
The entry point for deserialization is the read method, which calls
update_members on each class in the inheritance tree to build up the
dictionary of class members (Members). Some classes may override read to
implement header parsing, or to handle layouts that are not simply in base class
order. The update_members method is only responsible for reading the members
of its own class, not any base class members.
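As a rough illustration of this dispatch, the sketch below shows one way read could collect members from each class in turn. ReadBuffer here is a hypothetical stand-in with an unpack helper, and the loop over the reversed MRO is an assumption about how read might be written, not the package's actual implementation.

```python
import dataclasses
import struct
from typing import Any

Members = dict[str, Any]


@dataclasses.dataclass
class ReadBuffer:
    """Hypothetical stand-in for the package's buffer type."""

    data: bytes

    def unpack(self, fmt: str) -> "tuple[tuple[Any, ...], ReadBuffer]":
        # Consume struct.calcsize(fmt) bytes and return the remaining buffer.
        size = struct.calcsize(fmt)
        return struct.unpack(fmt, self.data[:size]), ReadBuffer(self.data[size:])


@dataclasses.dataclass
class ROOTSerializable:
    @classmethod
    def read(cls, buffer: ReadBuffer):
        members: Members = {}
        # Visit base classes before derived ones (reverse MRO).
        for klass in reversed(cls.__mro__):
            # Call update_members only where a class defines it itself,
            # so each class reads just its own members.
            if klass is not ROOTSerializable and "update_members" in vars(klass):
                members, buffer = klass.update_members(members, buffer)
        return cls(**members), buffer

    @classmethod
    def update_members(cls, members: Members, buffer: ReadBuffer):
        return members, buffer


@dataclasses.dataclass
class Base(ROOTSerializable):
    a: int

    @classmethod
    def update_members(cls, members: Members, buffer: ReadBuffer):
        (members["a"],), buffer = buffer.unpack(">i")
        return members, buffer


@dataclasses.dataclass
class Derived(Base):
    b: float

    @classmethod
    def update_members(cls, members: Members, buffer: ReadBuffer):
        (members["b"],), buffer = buffer.unpack(">d")
        return members, buffer


# Base reads its int first, then Derived reads its double.
obj, rest = Derived.read(ReadBuffer(struct.pack(">id", 7, 1.5) + b"tail"))
print(obj, rest.data)
```

Because every update_members returns the remaining buffer, bytes that do not belong to the object (b"tail" here) are handed back to the caller untouched.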
The update_members signature has a type alias in serializable.py:
    ReadMembersMethod = Callable[[Members, ReadBuffer], tuple[Members, ReadBuffer]]

This alias is used to define more advanced types, and to set up the
@serializable decorator.
Note that Python dataclasses collect fields from multiple bases in reverse MRO order, so
    @dataclass
    class Base1:
        a: int
        b: int

    @dataclass
    class Base2:
        c: int
        d: int

    @dataclass
    class Derived(Base1, Base2):
        e: int
will have the following order of members: c, d, a, b, e.
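This ordering can be checked directly with the standard library. The snippet below repeats the three classes and inspects them with dataclasses.fields. It uses nothing from this package.

```python
from dataclasses import dataclass, fields


@dataclass
class Base1:
    a: int
    b: int


@dataclass
class Base2:
    c: int
    d: int


@dataclass
class Derived(Base1, Base2):
    e: int


# Fields are gathered from the last base in the MRO towards the class itself.
field_names = [f.name for f in fields(Derived)]
mro_names = [k.__name__ for k in Derived.__mro__]
print(field_names)  # ['c', 'd', 'a', 'b', 'e']
print(mro_names)    # ['Derived', 'Base1', 'Base2', 'object']

# Positional construction follows the same order.
obj = Derived(1, 2, 3, 4, 5)
print(obj.c, obj.a, obj.e)  # 1 3 5
```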
Deserialization of numeric types
We use the Python struct module to read numeric types from the buffer. Classes
that contain numeric members can be declared as:
    @serializable
    class TMyClass(ROOTSerializable):
        fInt: Annotated[int, Fmt(">i")]
        fFloat: Annotated[float, Fmt(">f")]
where the Fmt type is a descriptor that will be used by the @serializable
decorator to auto-generate the appropriate update_members implementation.
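To make the idea concrete, here is a minimal sketch of how struct formats attached through Annotated could drive the reading. The Fmt class and the read_numeric_members helper below are hypothetical stand-ins written for this example, and they work on plain bytes rather than the package's ReadBuffer. The real decorator may work differently.

```python
import dataclasses
import struct
from typing import Annotated, get_type_hints


@dataclasses.dataclass(frozen=True)
class Fmt:
    """Hypothetical stand-in: carries the struct format of one member."""

    fmt: str


@dataclasses.dataclass
class TMyClass:
    fInt: Annotated[int, Fmt(">i")]
    fFloat: Annotated[float, Fmt(">f")]


def read_numeric_members(cls, data: bytes):
    """Unpack every Fmt-annotated member of cls from the front of data."""
    members = {}
    offset = 0
    # include_extras=True keeps the Annotated metadata on each hint.
    for name, hint in get_type_hints(cls, include_extras=True).items():
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, Fmt):
                (members[name],) = struct.unpack_from(meta.fmt, data, offset)
                offset += struct.calcsize(meta.fmt)
    return members, data[offset:]


# A big-endian int32 followed by a big-endian float32.
payload = struct.pack(">if", 42, 0.5)
members, rest = read_numeric_members(TMyClass, payload)
obj = TMyClass(**members)
print(obj, rest)
```

A generated implementation could also concatenate adjacent formats into a single struct call. The per-member loop is kept here for readability.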
Annotated builtin types vs. objects
In several places, we have the option to “pythonize” the data structure, by
using builtin Python types where they fully capture the semantics of a given
ROOT type. For example, TString is a variable length bytestring, and can be
represented as a bytes object in Python. We could have class members, such as
the name and title of a TNamed, either be represented by TString:
    @serializable
    class TString(ROOTSerializable):
        fString: bytes

    @serializable
    class TNamed(TObject):
        fName: TString
        fTitle: TString
or by bytes:
    @serializable
    class TNamed(TObject):
        fName: Annotated[bytes, TString]
        fTitle: Annotated[bytes, TString]
where the @serializable decorator takes care of the conversion between bytes
and TString. In this library, we will prefer to use the second approach when
feasible.
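A minimal sketch of the second approach follows, assuming the usual TString wire layout (a one-byte length, or the marker 255 followed by a four-byte big-endian length). TNamedLike and read_pythonized are hypothetical names for this example, TObject is left out, and plain bytes stand in for the package's ReadBuffer.

```python
import dataclasses
from typing import Annotated, get_type_hints


@dataclasses.dataclass
class TString:
    fString: bytes

    @classmethod
    def read(cls, data: bytes):
        # Assumed layout: 1-byte length, or 255 then a 4-byte big-endian length.
        length, pos = data[0], 1
        if length == 255:
            length, pos = int.from_bytes(data[1:5], "big"), 5
        return cls(data[pos:pos + length]), data[pos + length:]


@dataclasses.dataclass
class TNamedLike:
    """Hypothetical TNamed without its TObject base."""

    fName: Annotated[bytes, TString]
    fTitle: Annotated[bytes, TString]


def read_pythonized(cls, data: bytes):
    """Read each member with its annotated reader, then unwrap to bytes."""
    members = {}
    for name, hint in get_type_hints(cls, include_extras=True).items():
        (reader,) = hint.__metadata__  # the TString class in this example
        obj, data = reader.read(data)
        members[name] = obj.fString  # unwrapping is TString-specific here
    return cls(**members), data


payload = b"\x04hist" + b"\x08my title"
named, rest = read_pythonized(TNamedLike, payload)
print(named, rest)
```

The user-facing object then holds plain bytes, while TString exists only as the reader named in the annotation.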