Programming, Software and Code

The Bugs

Here's an interesting backtrace that I've been looking at for a while:

#2  0x00007f3cbab0bd7c in Parrot_confess (
cond=0x7f3cbad4a0d8 "PObj_is_PMC_TEST(sig_pmc)",
file=0x7f3cbad4a040 "src/call/pcc.c", line=613) at src/exceptions.c:607
#3 0x00007f3cbab2541a in Parrot_init_arg_op (interp=0x946080, ctx=0x22f2040,
pc=0x7f3cbb1e2eb8, sti=0x7fffc31f5a80) at src/call/pcc.c:613
#4 0x00007f3cbab2a150 in set_retval_util (interp=0x946080,
sig=0x7f3cbad4a7fd "P", ctx=0x22f2040, st=0x7fffc31f5a80)
at src/call/pcc.c:1948
#5 0x00007f3cbab2a64e in set_retval (interp=0x946080, sig_ret=80,
ctx=0x22f2040) at src/call/pcc.c:1994
#6 0x00007f3cbab2f0e3 in Parrot_runops_fromc_args (interp=0x946080,
sub=0x7f3cb58636b0, sig=0x7f3cbad4d7a7 "P") at src/call/ops.c:340
#7 0x00007f3cbab7d093 in run_sub (interp=0x946080, sub_pmc=0x7f3cb58636b0)
at src/packfile.c:686
#8 0x00007f3cbab7d3e4 in do_1_sub_pragma (interp=0x946080,
sub_pmc=0x7f3cb58636b0, action=PBC_MAIN) at src/packfile.c:778
#9 0x00007f3cbab7d618 in do_sub_pragmas (interp=0x946080, self=0x268b370,
action=PBC_MAIN, eval_pmc=0x7f3cb5862780) at src/packfile.c:940
#10 0x00007f3cbab7d705 in PackFile_fixup_subs (interp=0x946080, what=PBC_MAIN,
eval=0x7f3cb5862780) at src/packfile.c:4912
#11 0x00007f3cbad3398a in imcc_compile (interp=0x946080,
s=0x3b6fab0 "\n.HLL \"perl6\"\n\n.namespace [\"Test\"]\n.sub \"_block1830\" :subid(\"264_1247316154\")\n.annotate \"line\", 0\n .const 'Sub' $P1833 = \"265_1247316154\" \n capture_lex $P1833\n.annotate 'file', 't/spec/S14-role"..., pasm_file=0, error_message=0x7fffc31f5f60) at compilers/imcc/parser_util.c:738
#12 0x00007f3cbad33a60 in imcc_compile_pir_ex (interp=0x946080,
s=0x3b6fab0 "\n.HLL \"perl6\"\n\n.namespace [\"Test\"]\n.sub \"_block1830\" :subid(\"264_1247316154\")\n.annotate \"line\", 0\n .const 'Sub' $P1833 = \"265_1247316154\" \n capture_lex $P1833\n.annotate 'file', 't/spec/S14-role"...)
at compilers/imcc/parser_util.c:876
#13 0x00007f3cbab51e78 in pcf_P_Jt (interp=0x946080, self=0xa32dd0)
at src/nci.c:237
#14 0x00007f3cbac4de98 in Parrot_NCI_invoke (interp=0x946080, pmc=0xa32dd0,
next=0x7f3cbb1e2eb0) at src/pmc/nci.c:244
#15 0x00007f3cbaa9627c in Parrot_invokecc_p (cur_opcode=0x7f3cbb1e2ea0,
interp=0x946080) at src/ops/core_ops.c:18185
#16 0x00007f3cbab882a8 in runops_slow_core (interp=0x946080, pc=0x7f3cbb1e2ea0)
at src/runcore/cores.c:462
#17 0x00007f3cbab86f7c in runops_int (interp=0x946080, offset=287157)
at src/runcore/main.c:987
#18 0x00007f3cbab2d8cc in runops (interp=0x946080, offs=287157)
at src/call/ops.c:119
#19 0x00007f3cbab2dce9 in runops_args (interp=0x946080, sub=0x7f3cb69c4de0,
obj=0x9c7770, meth_unused=0x0, sig=0x7f3cbad48ba3 "vP", ap=0x7fffc31f62f0)
at src/call/ops.c:269
#20 0x00007f3cbab2f0bc in Parrot_runops_fromc_args (interp=0x946080,
sub=0x7f3cb69c4de0, sig=0x7f3cbad48ba3 "vP") at src/call/ops.c:338
#21 0x00007f3cbab08e31 in Parrot_runcode (interp=0x946080, argc=2,
argv=0x7fffc31f65f0) at src/embed.c:1021
#22 0x00007f3cbad1cb2d in imcc_run_pbc (interp=0x946080, obj_file=0,
output_file=0x0, argc=2, argv=0x7fffc31f65f0) at compilers/imcc/main.c:801
#23 0x00007f3cbad1d791 in imcc_run (interp=0x946080,
sourcefile=0x7fffc31f76ba "perl6.pbc", argc=2, argv=0x7fffc31f65f0)
at compilers/imcc/main.c:1092
#24 0x0000000000400bc4 in main (argc=2, argv=0x7fffc31f65f0) at src/main.c:60

What this is, is a back trace from a test failure in Rakudo. Actually, this same back trace occurs for two separate test failures, which makes it particularly interesting. The two are almost identical except for the values of most function arguments.

So what is going on here? We see the function Parrot_invokecc_p which we can recognize as the function that implements the invokecc opcode. This in turn calls Parrot_NCI_invoke, which is the name of the generated VTABLE "invoke" function for the NCI PMC type. This calls the NCI thunk pcf_P_Jt, which is a generated call frame for the "P_Jt" call signature. It's a function that takes a reference to the interpreter (J) and a C string (t) and returns a PMC (P). This thunk calls the function imcc_compile_pir_ex with a long string of PIR code. In short, a runtime eval of PIR code.

So I was banging my head against this issue for a while, when pmichaud was able to narrow this down to a pure-PIR snippet. Here's his code and backtrace:

$ cat x.pir
.sub main
$S0 = <<'END'
.sub 'abc' :load :init
say 'run abc'

.sub 'def' :load :init
die 'died in def'

.sub 'ghi' :load :init
say 'run ghi'

.sub 'main' :main
say 'run main'

$P0 = compreg 'PIR'

push_eh trap1

debug 0

push_eh trap2

$ gdb ./parrot
(gdb) b Parrot_debug_ic
(gdb) run x.pir

Breakpoint 1, Parrot_debug_ic (cur_opcode=0x916b9a8, interp=0x90c2040) at src/ops/core.ops:1001
1001 if ($1 != 0) { Interp_debug_SET(interp, $1); }
(gdb) bt
#0 Parrot_debug_ic (cur_opcode=0x916b9a8, interp=0x90c2040) at src/ops/core.ops:1001
#1 0xb7dc3410 in runops_slow_core (interp=0x90c2040, pc=0x916b9a8) at src/runcore/cores.c:462
#2 0xb7dc200e in runops_int (interp=0x90c2040, offset=5) at src/runcore/main.c:987
#3 0xb7d9b9e5 in runops (interp=0x90c2040, offs=5) at src/call/ops.c:119
#4 0xb7d9be23 in runops_args (interp=0x90c2040, sub=0x9136388, obj=0x9124f70, meth_unused=0x0, sig=0xb7fe53b3 "P", ap=0xbfe3d52c "����(\031\002�\201�η")
at src/call/ops.c:269
#5 0xb7d9cce6 in Parrot_runops_fromc_args (interp=0x90c2040, sub=0x9136388, sig=0xb7fe53b3 "P") at src/call/ops.c:338
#6 0xb7db76d9 in run_sub (interp=0x90c2040, sub_pmc=0x9136388) at src/packfile.c:686
#7 0xb7db79cb in do_1_sub_pragma (interp=0x90c2040, sub_pmc=0x9136388, action=PBC_MAIN) at src/packfile.c:778
#8 0xb7db7bd7 in do_sub_pragmas (interp=0x90c2040, self=0x916a098, action=PBC_MAIN, eval_pmc=0x9136340) at src/packfile.c:940
#9 0xb7db7cc3 in PackFile_fixup_subs (interp=0x90c2040, what=PBC_MAIN, eval=0x9136340) at src/packfile.c:4912
#10 0xb7fcbc65 in imcc_compile (interp=0x90c2040,
s=0x916a2b8 ..., pasm_file=0, error_message=0xbfe3d6fc)
at compilers/imcc/parser_util.c:738
#11 0xb7fcbd4e in imcc_compile_pir_ex (interp=0x90c2040,
s=0x916a2b8 ...) at compilers/imcc/parser_util.c:876
#12 0x0915be25 in ?? ()
#13 0xb7ef328c in Parrot_NCI_invoke (interp=0x90c2040, pmc=0x91364d8, next=0x916b9a4) at ./src/pmc/nci.pmc:326
#14 0xb7d14f76 in Parrot_invokecc_p (cur_opcode=0x916b99c, interp=0x90c2040) at src/ops/core.ops:504
#15 0xb7dc3410 in runops_slow_core (interp=0x90c2040, pc=0x916b99c) at src/runcore/cores.c:462
#16 0xb7dc200e in runops_int (interp=0x90c2040, offset=0) at src/runcore/main.c:987
#17 0xb7d9b9e5 in runops (interp=0x90c2040, offs=0) at src/call/ops.c:119
#18 0xb7d9be23 in runops_args (interp=0x90c2040, sub=0x9136418, obj=0x9124f70, meth_unused=0x0, sig=0xb7fe0d17 "vP", ap=0xbfe3d97c "") at src/call/ops.c:269
#19 0xb7d9cce6 in Parrot_runops_fromc_args (interp=0x90c2040, sub=0x9136418, sig=0xb7fe0d17 "vP") at src/call/ops.c:338
#20 0xb7d79627 in Parrot_runcode (interp=0x90c2040, argc=1, argv=0xbfe3daf8) at src/embed.c:1021
#21 0xb7fb548f in imcc_run_pbc (interp=0x90c2040, obj_file=0, output_file=0x0, argc=1, argv=0xbfe3daf8) at compilers/imcc/main.c:801
#22 0xb7fb608c in imcc_run (interp=0x90c2040, sourcefile=0xbfe3f63d "x.pir", argc=1, argv=0xbfe3daf8) at compilers/imcc/main.c:1092
#23 0x08048978 in main (argc=1, argv=0xbfe3daf8) at src/main.c:60

See the problem now? What's happening in this little snippet is that we call IMCC to compile a string of code for us into a new Sub PMC. During compilation we have an :init sub, which gets executed immediately in a new runloop. That's why you see two instances of runops_slow_core on the backtrace, the second is the new runcore created inside IMCC to execute the :init sub. That executing sub throws an exception, so Parrot's exception system looks for a handler and finds one.

The problem comes because the handler that we find exists in a different execution context then where control flow is currently. The inner runloop jumps control flow to the handler, and continues executing like normal until the end of the program (or, in this last case, until the breakpoint). Once that runloop gets to the end of the program it exits, and continues control flow from the point inside IMCC where the :init was first executed. This jumps back up to the outer runloop, which attempts to continue it's operations but now with a corrupted interpreter.

This mimics several other problems we've seen in other tickets, where throwing an exception from inside a child runloop creates strange problems in the parent runloop. For a long time we didn't really understand why it happened. Now, thanks to pmichaud's small test case, I think we know exactly why it's happening and are even starting to formulate a few ideas for fixes. Look to the list for more discussion on this issue.

This entry was originally posted on Blogger and was automatically converted. There may be some broken links and other errors due to the conversion. Please let me know about any serious problems.