OPT-9 Inline list iterators into assembly instructions#
Background#
Python objects can implement the iterator protocol.
This protocol is used in a few places:
A for loop
The
iter()
builtinWhile loops
The iterator protocol specifies that iterator objects should be generated for a container object, like a list, tuple, or dict.
You can implement custom iterators for classes by implementing the __iter__
method.
A Python for loop will get the iterator of the object using the GET_ITER
opcode and then keep an iterator object on the value stack for each
cycle in a loop.
The next value in the iterator object is yielded by the FOR_ITER
opcode. Most iterator objects just store an index and a pointer to the container. When iter_next()
is called
for any iterator object, the index is incremented and a pointer to the next object in the container is returned.
Solution#
A new source type has been introduced, IteratorSource
, which is generated at the process stage for compilation. This is generated for
the GET_ITER
opcode to mark the type of iterator object that will be pushed the stack on execution of the frame. This will be determined for known types.
At the compile stage, if the abstract value kind at the top of the stack for FOR_ITER
is an IterableValue
, then the abstract kind for the IteratorSource
is inspected.
If it is an optimized type (only List for this optimization), then the IL is compiled into the function to get the next item in a listiterator, increment the index and increment the reference count
of the next item.
These inline IL codes are modelled on the listiter_next
function in CPython.
Gains#
Loops for lists are faster, IF the container type can be determined as a list at compile-time
Edge-cases#
None.
Potential Improvements#
The branch logic could be improved. .NET doesn’t optimize the compiled assembly
Instead of copying
iterxx_next
, a more efficient model could be used, e.g. stacking items through onto a queue and just popping them for each iteration
Configuration#
This optimization is enabled at level 1 by default. See Optimizations for help on changing runtime optimization settings.