Tutorial: for-iterators¶
Generic for-iterator¶
Consider the following piece of lua code:
-- test.lua
local tbl = {'a', 'b'}
for k,v in ipairs(tbl) do
print(v)
end
This code is translated in the following bytecode by the LuaVela frontend:
$ ./luajit -bl ./test.lua
-- BYTECODE -- test.lua:0-6
0001 TDUP 0 0
0002 GGET 1 1 ; "ipairs"
0003 MOV 2 0
0004 CALL 1 4 2
0005 JMP 4 => 0009
0006 GGET 6 2 ; "print"
0007 MOV 7 5
0008 CALL 6 1 2
0009 ITERC 4 3 3
0010 ITERL 4 => 0006
0011 RET0 0 1
Instruction 0001 fetches table constant (described by 'tbl'
in the original code) and places it into stack slot 0 (referred to as #0). Instructions 0002 through 0004 perform the call to ipairs
described by 'ipairs(tbl)'
in the original code. After this call, #1 contains ipairs
iterator function, #2 contains 'tbl'
(it was moved there by 0003 and did not change since then) and #3 contains numeric zero. This behaviour of ipairs
is described in the Lua reference
manual.
The return values of ipairs
is everything that is need to iterate through the array part of the table: it contains the table itself, the induction variable being the array index of the current element (already defaulted to zero) and the function that is able to deduce the next element.
Once the loop context is prepared, 0005 jumps to the end of iteration instruction 0009 to trigger the first iteration.
Iterator function is designed to perform the induction by returning either nil
(which designates end of the loop), or new value of the induction variable and corresponding array value (these are actually the 'k,v'
pair in the initial Lua code). Two output values are stored in #4 and #5 (ipairs_aux
iterator is merely a standard Lua function and ITERC
sets its base to (4+1) in order to get the return values right behind the loop context). After that, ITERL
checks that #4 is not nil
, copies it to the induction variable in loop context and jumps to the iteration body. Within the iteration body, lua might freely operate with k
and v
as stored in #4 and #5 by the iterator function implicitly called by ITERC
.
As a result, the following stack layout is maintained during the loop:
- #1, #2 and #3 contains the loop context. #3 is updated by
ITERL
before the new iteration starts.- #4 and #5 contains the values of
k
andv
as returned by the iterator function (called byITERC
) and as seen by the loop body.- #6+ are at the disposal of the loop body (local variables, parameters and return values of the callee functions etc.).
Note
In order to get better understanding of for-iterators, let’s write a code that uses custom for-iterator to only query elements from the array until some particular element is encountered:
-- test-hates-seven.lua
function ipairs_hates_seven_aux(table, index)
new_index = index + 1
if (new_index > #table or table[new_index] == 7) then
return nil
else
return new_index, table[new_index]
end
end
function ipairs_hates_seven(table)
return ipairs_hates_seven_aux, table, 0
end
local tbl = {1, 2, 3, 4, 7, 12, 13}
for index,value in ipairs_hates_seven(tbl) do
print('tbl[' .. index .. '] = ' .. value)
end
$ ./luajit ./test-hates-seven.lua
tbl[1] = 1
tbl[2] = 2
tbl[3] = 3
tbl[4] = 4
There are no restrictions on how the induction variable is incremented:
-- test-even-index.lua
function ipairs_even_index_aux(table, index)
new_index = index + 2
if (new_index > #table) then
return nil
else
return new_index, table[new_index]
end
end
function ipairs_even_index(table)
return ipairs_even_index_aux, table, 0
end
local tbl = {1, 2, 3, 4, 7, 12, 13}
for index,value in ipairs_even_index(tbl) do
print('tbl[' .. index .. '] = ' .. value)
end
$ ./luajit ./test-even-index.lua
tbl[2] = 2
tbl[4] = 4
tbl[6] = 12
In fact, there are no restrictions on induction variable at all, except that the iterator-generating function must properly setup the initial value of the induction variable, the iterator must know how to use it and that the nil
value of the induction variable designates the end of the iteration.
There are no restrictions on the loop state (table object in pairs
and ipairs
), including its type.
There are no restrictions on the iterator, except it must be callable either directly or via metamethods.
Pairs for-iterator¶
Language perspective¶
Lua reference manual starts to get really nasty when it comes to pairs.
It claims that the iterator function for pairs
is next and introduces ambiguous “index” parameter of next
and decides that that’s enough.
In fact, that’s how next
works:
- If the second argument is nil, return the “first” key-value pair of the table. The definition of “first” is implementation- and situation- dependent and will be discussed later.
- If the second argument is not nil, it is treated as a key in the table. In this case,
next
searches for this key in the table and returns “next” key-value pair. Again, “next” is implementation- and situation- dependent in the same way.- If current key-value pair (defined by the second argument as a key) is the “last”, return
nil
.
As long as there is a common mechanism that allows to convert the table contents to the pseudo-array of key-value pairs, “first”, “next” and “last” key-value pairs of the table in terms of next are first, next and last key-value pairs of this array.
With this being said, pairs(tbl)
might be written in Lua as follows:
function pairs(table)
return next, table, nil
-- here 'next' is the reference to the library function
end
Also, one might change the semantics of pairs
by placing custom iterator function on top of next
(or pairs
). However, changing the semantics of next
from within the loop will not make a difference, since the reference to next
is resolved within iterator-generating pairs
only once per loop.