Examples of org.lilyproject.repository.api.RecordId.toBytes()

Returns the byte representation of this record id.

The byte representation of record ids is designed to give a meaningful row sort order in HBase and to be usable for scan operations. When one ID in unencoded form is a prefix of another ID, it remains a prefix after encoding as bytes, which allows prefix-scanning a range of records. (This only applies to user-specified IDs, not to UUIDs.)

The format for a master record id is as follows:

{identifier byte}{basic byte representation}

Where the identifier byte is (byte)0 for a USER record id, and (byte)1 for a UUID record id.

The {identifier byte} comes first because otherwise UUID and USER ids would be intermingled, preventing meaningful scan operations on USER ids.
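
The master record id layout can be sketched as follows. This is an illustration, not the Lily implementation: the class and method names are hypothetical, and it assumes that the basic byte representation of a USER id is its UTF-8 encoding and that a UUID is written as its two 64-bit halves, most significant first.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Illustrative sketch only; names and encodings marked "assumption" are not from the Lily source.
class MasterIdSketch {
    static final byte USER_TYPE = 0; // identifier byte for USER record ids
    static final byte UUID_TYPE = 1; // identifier byte for UUID record ids

    // Assumption: the basic byte representation of a USER id is its UTF-8 encoding.
    static byte[] userId(String id) {
        byte[] basic = id.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + basic.length).put(USER_TYPE).put(basic).array();
    }

    // Assumption: the 16 UUID bytes are the two 64-bit halves, most significant first.
    static byte[] uuidId(UUID uuid) {
        return ByteBuffer.allocate(17)
                .put(UUID_TYPE)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Unsigned lexicographic comparison, the order HBase uses for row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True when key begins with all bytes of prefix.
    static boolean startsWith(byte[] key, byte[] prefix) {
        if (prefix.length > key.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (key[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] shortId = userId("doc");
        byte[] longId = userId("doc-42");
        byte[] uuid = uuidId(UUID.randomUUID());
        // The prefix relation between the two USER ids survives encoding.
        System.out.println(startsWith(longId, shortId));
        // Every USER id sorts before every UUID id because of the leading identifier byte.
        System.out.println(compare(longId, uuid) < 0);
    }
}
```

Because the identifier byte is compared first, a prefix scan starting at the bytes of userId("doc") never has to step over UUID rows.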

In case there are variant properties:

For USER record ids, a zero byte (NULL character) is appended to mark the end of the master record id. Consequently, the (non-printable) zero byte is forbidden within the master record id. The zero byte was chosen because it sorts before any other byte: the record ids of variants are therefore sorted together with their master, with no other master record in between, since any other master record id would have a byte larger than zero at that position.
For UUID record ids, there is no separator byte between the master and the properties, since the UUID has a fixed length of 16 bytes. As a result, the variant properties do not influence the sort order among the master record ids.
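
The effect of the zero-byte separator on sort order for USER ids can be checked with a small sketch. The names are hypothetical, the master id is assumed to be UTF-8 encoded, and the variant property bytes are passed in as an opaque placeholder rather than the real property encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; not the Lily implementation.
class VariantSortSketch {
    // Builds {0}{master utf8} for a master id, or {0}{master utf8}{0}{variant bytes} for a variant.
    // Assumption: the master id is UTF-8 encoded; variantProps is an opaque placeholder.
    static byte[] userKey(String master, byte[] variantProps) {
        byte[] m = master.getBytes(StandardCharsets.UTF_8);
        boolean isVariant = variantProps.length > 0;
        int size = 1 + m.length + (isVariant ? 1 + variantProps.length : 0);
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.put((byte) 0).put(m);
        if (isVariant) {
            buf.put((byte) 0).put(variantProps); // zero byte marks the end of the master id
        }
        return buf.array();
    }

    // Unsigned lexicographic comparison, the order HBase uses for row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Returns the keys in row-key order.
    static List<byte[]> sorted(byte[]... keys) {
        List<byte[]> list = new ArrayList<>(Arrays.asList(keys));
        list.sort(VariantSortSketch::compare);
        return list;
    }

    public static void main(String[] args) {
        byte[] placeholder = {1, 2, 3}; // stands in for encoded variant properties
        List<byte[]> rows = sorted(
                userKey("ab", new byte[0]),
                userKey("a", placeholder),
                userKey("a-1", new byte[0]),
                userKey("a", new byte[0]));
        for (byte[] row : rows) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In this sketch the variant of "a" lands directly after the master "a" and before "a-1" and "ab", because those ids carry a non-zero byte at the position where the variant has the separator.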

The variant properties themselves are written as:

({key string utf8 length}{key string in utf8}{value string utf8 length}{value string in utf8})*

There is no separator between the key-value pairs; none is needed, because every string is preceded by its length. The key-value pairs are always sorted by key.
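
A minimal sketch of this length-prefixed layout follows. The text above does not say how the length is encoded, so the 4-byte big-endian integer used here is an assumption, as are the class and method names and the use of Java's natural String ordering for sorting the keys.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; not the Lily implementation.
class VariantPropsSketch {
    // Writes ({key length}{key utf8}{value length}{value utf8})* with pairs sorted by key.
    static byte[] encode(Map<String, String> props) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // TreeMap iterates in natural String order (assumed key ordering).
        for (Map.Entry<String, String> e : new TreeMap<>(props).entrySet()) {
            writeString(out, e.getKey());
            writeString(out, e.getValue());
        }
        return out.toByteArray();
    }

    // Assumption: the length is a 4-byte big-endian int counting UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        byte[] length = ByteBuffer.allocate(4).putInt(utf8.length).array();
        out.write(length, 0, length.length);
        out.write(utf8, 0, utf8.length);
    }

    // Reads the pairs back; the length prefixes make separators unnecessary.
    static Map<String, String> decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        Map<String, String> props = new LinkedHashMap<>();
        while (buf.hasRemaining()) {
            String key = readString(buf);
            String value = readString(buf);
            props.put(key, value);
        }
        return props;
    }

    static String readString(ByteBuffer buf) {
        byte[] utf8 = new byte[buf.getInt()];
        buf.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("lang", "en");
        props.put("branch", "dev");
        byte[] encoded = encode(props);
        // Decoding yields the pairs in key order: branch first, then lang.
        System.out.println(decode(encoded));
    }
}
```

Sorting by key before writing means that two ids with the same properties always produce identical bytes, regardless of the order in which the properties were supplied.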