Tutorial: buildvm Gotchas
Introduction
This document sheds some light on how functions from the Lua standard library (builtins) are “registered” so that they become available to the platform’s end users. The following platform peculiarities should be taken into account:
The implementation of some builtins may be scattered across the code base. For example, a “fast path” may be implemented as a fast function inside the VM code, while the “slow path”, which requires more processing or throws an error, may be implemented in C. Other builtins are implemented only as fast functions, and a third group is implemented only in C.
Some builtins may share a common “slow path”.
Implementation of some builtins may require access to certain upvalues.
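The fast-path/slow-path split from the list above can be pictured with a small standalone C sketch. It is not platform code: the `Value` type and the `toy_tonumber` and `toy_tonumber_slow` functions are hypothetical names invented for illustration, and the platform's real fast functions live inside the VM code rather than in plain C.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tagged value; the real platform uses its own value layout. */
typedef enum { T_NIL, T_NUMBER, T_STRING } Tag;

typedef struct {
    Tag tag;
    double num;       /* valid when tag == T_NUMBER */
    const char *str;  /* valid when tag == T_STRING */
} Value;

/* Slow path: extra processing (string parsing) or a failure result.
 * Returns 1 on success and stores the number in *out, 0 otherwise. */
static int toy_tonumber_slow(const Value *v, double *out)
{
    char *end;
    if (v->tag != T_STRING || v->str == NULL)
        return 0;                 /* a real builtin might throw an error here */
    *out = strtod(v->str, &end);
    return end != v->str && *end == '\0';
}

/* Fast path: handles the common case inline and falls back otherwise. */
static int toy_tonumber(const Value *v, double *out)
{
    assert(v != NULL && out != NULL);
    if (v->tag == T_NUMBER) {     /* common case, no extra work needed */
        *out = v->num;
        return 1;
    }
    return toy_tonumber_slow(v, out);
}
```

Calling `toy_tonumber` with a number value never reaches the slow path, while a string value is forwarded to `toy_tonumber_slow`. Several fast paths could forward to the same slow function, which is one way to picture builtins sharing a common “slow path”.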
You may be interested in this talk about LuaJIT’s original build chain.
buildvm Utility
Prior to building the platform’s core, a special utility called buildvm is built and run. This utility scans the core code base and, based on special macros, generates headers (*def.h) that are used inside uj_lib.c for building the standard library.
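To make the scanning step more concrete, here is a deliberately simplified C sketch of the idea. It is not the actual buildvm code: `toy_scan` and `TOY_MAX_NAME` are hypothetical names, and the sketch only collects the names found in `LJLIB_ASM(name)` markers (the macro that appears in the listing later in this document), whereas the real utility emits generated headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NAME 32

/* Hypothetical helper: collects the names found in LJLIB_ASM(name) markers,
 * in source order. Returns the number of names stored in `names`. */
static size_t toy_scan(const char *src, char names[][TOY_MAX_NAME], size_t max)
{
    static const char marker[] = "LJLIB_ASM(";
    const size_t mlen = sizeof(marker) - 1;
    size_t count = 0;
    const char *p = src;

    assert(src != NULL && names != NULL);
    while (count < max && (p = strstr(p, marker)) != NULL) {
        const char *start = p + mlen;
        const char *end = strchr(start, ')');
        size_t len;
        if (end == NULL)
            break;                      /* unterminated marker: stop scanning */
        len = (size_t)(end - start);
        if (len > 0 && len < TOY_MAX_NAME) {
            memcpy(names[count], start, len);
            names[count][len] = '\0';
            count++;
        }
        p = end + 1;                    /* continue after the closing paren */
    }
    return count;
}
```

Because the names are collected in source order, a generator built on this idea would naturally preserve the order in which builtins are defined in the scanned file.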
Builtins Accessing Upvalues
As you probably know, pairs and ipairs are implemented via a call to next. Because of late binding, it is crucial to know the exact location of the original next at run time, and _G.next is not an option because user code may overwrite it. To solve this, the “native” implementation of next is set as an upvalue for both pairs and ipairs. In other words, the platform executes something like:
local next = _G.next -- here, _G.next is our builtin
_G.pairs = function(...)
  -- next is used somewhere here
end
But how can we emulate this behaviour? By using the following magic macros:
LJLIB_ASM(next)
{
  lj_lib_checktab(L, 1);
  return FFH_UNREACHABLE;
}
...
LJLIB_PUSH(lastcl)
LJLIB_ASM(pairs)
{
  return ffh_pairs(L, MM_pairs);
}
buildvm scans src/lib/base.c and, upon parsing LJLIB_PUSH(lastcl), stores the byte 253 (0xfd, also known as LIBINIT_LASTCL) into the array lj_lib_init_base (auto-generated in lj_libdef.h). Later, uj_lib_register runs inside luaopen_base. At that point, the LIBINIT_LASTCL byte is read from the lj_lib_init_base array, and the last registered builtin (next in our case; the order of definitions matters here) is pushed onto the coroutine’s stack. That value is then accessed as an upvalue while the pairs builtin is being registered.
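The registration step described above can be sketched as a toy interpreter over an init byte stream. This is an illustration under stated assumptions, not the real uj_lib_register: only the value of `LIBINIT_LASTCL` (0xfd) comes from the text, while `ToyClosure`, `toy_register`, `TOY_INIT_FUNC` and `TOY_INIT_END` are hypothetical, and a single pointer stands in for the value pushed onto the coroutine’s stack.

```c
#include <assert.h>
#include <stddef.h>

#define LIBINIT_LASTCL 0xfd   /* value taken from the text above */
#define TOY_INIT_FUNC  0x01   /* hypothetical tag: register the next builtin */
#define TOY_INIT_END   0xff   /* hypothetical tag: end of the init stream */

typedef struct ToyClosure {
    const char *name;
    const struct ToyClosure *upvalue;  /* NULL when the builtin has none */
} ToyClosure;

/* Walks the init stream and fills `out` with registered closures.
 * Returns the number of closures registered. */
static size_t toy_register(const unsigned char *init, const char *const *names,
                           ToyClosure *out, size_t max)
{
    const ToyClosure *pending = NULL;  /* stands in for the pushed stack value */
    size_t n = 0;

    assert(init != NULL && names != NULL && out != NULL);
    for (; *init != TOY_INIT_END; init++) {
        if (*init == LIBINIT_LASTCL) {
            /* Push the most recently registered builtin, if any. */
            pending = n > 0 ? &out[n - 1] : NULL;
        } else if (*init == TOY_INIT_FUNC && n < max) {
            out[n].name = names[n];
            out[n].upvalue = pending;  /* consume the pushed value as an upvalue */
            pending = NULL;
            n++;
        }
    }
    return n;
}
```

Feeding it the stream `{ TOY_INIT_FUNC, LIBINIT_LASTCL, TOY_INIT_FUNC, TOY_INIT_END }` with the names `next` and `pairs` registers `next` without an upvalue and `pairs` with `next` as its upvalue. Swapping the order of the two definitions would change which closure gets captured, which is why the order of definitions matters.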