Schema:
qstruct Account { id @0 uint64; balance @1 double; } qstruct User { id @0 uint64; username @1 string; email @3 string; accounts @2 Account[]; } |
Perl usage:
my $msg = User->encode({ id => 389, username => 'jimmy', email => 'jimmy@example.com', accounts => [ { id => 811, balance => 23.3 }, { id => 131, balance => -900 }, ] }); my $user = User->decode($msg); say "User id is " . $user->id; foreach (@{ $user->accounts }) { $balance += $_->balance; } |
my $msg = User->encode({ name => "hello world!", id => 100, is_admin => 1, is_locked => 1, });
00000000 00 00 00 00 00 00 00 00 20 00 00 00 01 00 00 00 |........ .......| |---------------------header-------------------| 00000010 64 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 |d...............| |--------id (@0)------| || |----free space----| || |->is_admin|is_locked (@1|@3) 00000020 0c 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 00 00 |.hello world!...| || |--------inline string data--------| |--pad-| || |-> name (@2) tag byte indicating length of inline string
HDR: 00000000 00 00 00 00 00 00 00 00 20 00 00 00 01 00 00 00 |........ .......| CONT: 00000010 64 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 |d...............| CONT: 00000020 00 18 00 00 00 00 00 00 30 00 00 00 00 00 00 00 |........0.......| |-------size << 8-----| |----start offset-----| HEAP: 00000030 74 6f 6f 20 6c 6f 6e 67 20 66 6f 72 20 74 61 67 |too long for tag| HEAP: 00000040 67 65 64 20 73 69 7a 65 |ged size|
sha1_checksum @0 uint8[32];
emails @1 string[];
addrs @2 Address[];
/* Here is the format we are parsing: */ qstruct User { id @0 uint64; active @4 bool; name @2 string; email_addrs @3 string[]; # dynamic array (pointer-based) sha256_checksum @1 uint8[32]; # fixed array (inlined in body) accounts @5 Account[]; # array of nested Account qstructs }
my $jpg = $user->profile_image;With "decoded" qstruct objects there is an alternative zero-copy accessor:
$user->profile_image(my $jpg);Especially with larger data, copying is bad because it takes time and it puts often needless pressure on your memory system especially if you are only interested in a portion of the data. Example: the profile image is interlaced/progressive.
Perl 5.20 adds copy-on-write string copies/slices which can accomplish roughly the same thing
say "First acct id: " . $user->accounts->[0]->id;And even though this loop accesses each element, it avoids copying unnecessary data out of each one:
foreach my $p (@{ $user->profiles }) { push @profiles, $p->id; ## Not touching $p->image_data }
$user->emails->get(0, my $first_email);And zero-copy iteration:
$user->emails->foreach(sub { say "Email: ", $_[0]; });
static QSTRUCT_INLINE int qstruct_get_uint64(char *buf, size_t buf_size, uint32_t body_index, uint32_t byte_offset, uint64_t *output) { QSTRUCT_GETTER_PREAMBLE(8) if (exceeds_bounds) { *output = 0; // default value } else { QSTRUCT_LOAD_8BYTE_LE(buf + actual_offset, output); } return 0; }And used like so (64-bit unsigned int, 0th body, 24 byte offset):
int ret = qstruct_get_uint64(buf, buf_size, 0, 24, &my_uint64);
0x0000000000400590 <+0>: cmp $0xf,%rsi 0x0000000000400594 <+4>: jbe 0x4005b1 0x0000000000400596 <+6>: mov 0xc(%rdi),%ecx 0x0000000000400599 <+9>: mov 0x8(%rdi),%edx 0x000000000040059c <+12>: test %ecx,%ecx 0x000000000040059e <+14>: je 0x4005b1 0x00000000004005a0 <+16>: xor %eax,%eax 0x00000000004005a2 <+18>: cmp $0x1f,%edx 0x00000000004005a5 <+21>: jbe 0x4005b1 0x00000000004005a7 <+23>: cmp $0x17,%rsi 0x00000000004005ab <+27>: jbe 0x4005b1 0x00000000004005ad <+29>: mov 0x28(%rdi),%rax 0x00000000004005b1 <+33>: repz retqQSTRUCT_GETTER_PREAMBLE: Bounds checking and default values — I'm planning on optimising this. Also, if you access many values, you will only need to do preamble once.