The optimizer removes many POP_TOPs, but quite a few remain.
The code for POP_TOP looks like this:
    inst(POP_TOP, (value --)) {
        PyStackRef_XCLOSE(value);
    }
which is fine for the interpreter, but in the JIT, because POP_TOP is marked as escaping, we need to set the IP before it and check for invalidation afterwards.
What we could do is have a version of POP_TOP that sets the IP and checks for invalidation only if Py_Dealloc is called, so we don't need to add separate _SET_IP and _CHECK_VALIDITY uops.
The new form of POP_TOP would look like this:
    op(_POP_TOP_SET_IP_CHECK_INVALID, (value --)) {
        if (do_decref_and_refcnt_is_zero(value)) {
            SET_IP();
            Py_Dealloc();
            CHECK_VALID();
        }
    }
_POP_TOP_SET_IP_CHECK_INVALID should be inserted in the same pass that inserts _SET_IP and _CHECK_VALIDITY, since that pass knows whether those uops are needed; where they are not needed, POP_TOP can be left alone.
Overall, this should speed things up: it doesn't increase the overall code size, and in many cases it executes less code.
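To make the intended control flow concrete, here is a minimal standalone sketch in plain C. It is only a model of the idea, not CPython code: `MockObject`, `MockExecutor`, `decref_and_is_zero`, `mock_dealloc` and `pop_top_set_ip_check_invalid` are all hypothetical names invented for illustration, and the "finalizer invalidates the executor" flag is an assumption standing in for arbitrary code run from a deallocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are CPython APIs. */
typedef struct {
    int refcnt;
    bool finalizer_invalidates;  /* models a finalizer that invalidates the trace */
} MockObject;

typedef struct {
    int ip;            /* last instruction pointer written back */
    int set_ip_count;  /* how many times the IP was stored */
    int check_count;   /* how many validity checks ran */
    bool valid;        /* whether the executor is still valid */
} MockExecutor;

/* Drop one reference; report whether that was the last one. */
bool decref_and_is_zero(MockObject *obj) {
    assert(obj->refcnt > 0);
    obj->refcnt--;
    return obj->refcnt == 0;
}

/* Models deallocation, which may run arbitrary code (the "escaping" part). */
void mock_dealloc(MockObject *obj, MockExecutor *ex) {
    if (obj->finalizer_invalidates) {
        ex->valid = false;
    }
}

/* Fused pop: the IP store and the validity check happen only on the
 * deallocation path. Returns false if the caller must leave the trace. */
bool pop_top_set_ip_check_invalid(MockObject *value, MockExecutor *ex,
                                  int current_ip) {
    if (value == NULL) {
        return true;  /* nothing to release */
    }
    if (decref_and_is_zero(value)) {
        ex->ip = current_ip;   /* SET_IP, only when needed */
        ex->set_ip_count++;
        mock_dealloc(value, ex);
        ex->check_count++;     /* CHECK_VALID, only after a dealloc */
        if (!ex->valid) {
            return false;
        }
    }
    return true;
}
```

In this model, popping a value that still has other references touches neither the IP nor the validity flag, which is the common case the proposal is trying to make cheaper. Only the last reference pays for the IP store and the check.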