I might not be too bright. Either that or I might not have a great memory, or maybe I’m just a glutton for punishment. Remember the big IO system rewrite I completed only a few weeks ago? Remember how much of a huge hassle that turned into and how burnt-out I got because of it? Apparently I don’t because I’m back at it again.
Parrot hacker brrt came to me with a problem: After the io_cleanup merge he noticed that his mod_parrot project doesn’t build and pass tests anymore. This was sort of expected, he was relying on lots of specialized IO functionality and I broke a lot of specialized IO functionality. Mea culpa. I had a few potential fixes in mind, so I tossed around a few ideas with brrt, put together a few small branches and think I’ve got the solution.
The problem, in a nutshell is this: In mod_parrot brrt was using a custom Winxed object as an IO handle. By hijacking the standard input and output handles he could convert requests on those handles into NCI calls to Apache and all would just work as expected. However with the IO system rewrite, IO API calls no longer redirect to method calls. Instead, they are dispatched to new IO VTABLE function calls which handle the logic for individual types.
First question: How do we recreate brrt’s custom functionality, by allowing custom bytecode-level methods to implement core IO functionality for custom user types?
My Answer: We add a new IO VTABLE, for “User” objects, which can redirect low-level requests to PMC method calls.
Second Question: Okay, so how do we associate thisnew User IO VTABLE with
custom objects? Currently the get_pointer_keyed_int
VTABLE is used to get
access to the handle’s IO_VTABLE*
structure, but bytecode-level objects cannot
use get_pointer_keyed_int
.
My Answer: For most IO-related PMC types, the kind of IO_VTABLE*
to use is
staticly associated with that type. Socket PMCs always use the Socket IO VTABLE.
StringHandle PMCs always use the StringHandle IO VTABLE, etc. So, we can use a
simple map to associate PMC types with specific IO VTABLEs. Any PMC type not in
this map can default to the User IO VTABLE, making everything “just work”.
Third Question: Hold your horses, what do you mean “most” IO-related PMC types have a static IO VTABLE? Which ones don’t and how do we fix it?
My Answer: The big problem is the FileHandle PMC. Due to some legacy issues
the FileHandle PMC has two modes of operation: normal File IO and Pipe IO. I
guess these two ideas were conflated together long ago because internally the
details are kind of similar: Both files and pipes use file descriptors at the
OS level, and many of the library calls to use them are the same, so it makes
sense not to duplicate a lot of code. However, there are some nonsensical issues
that arise because Pipes and files are not the same: Files don’t have a notion
of a “process ID” or an “exit status”. Pipes don’t have a notion of a “file
position” and cannot do methods like seek
or tell
. Parrot uses the "p"
mode specifier to tell a FileHandle to be in Pipe mode, which causes the IO
system to select a between either the File or the Pipe IO VTABLE for each call.
Instead of this terrible system, I suggest we separate out this logic into two
PMC types: FileHandle (which, as it’s name suggests, operates on Files) and
Pipe. By breaking up this one type into two, we can statically map individual
IO VTABLEs to individual PMC types, and the system just works.
Fourth Question: Once we have these maps in place, how do we do IO with user-defined objects?
My Answer: The User IO VTABLE will redirect low-level IO requests into
method calls on these PMCs. I’ll break IO_BUFFER*
pointers out into a new PMC
type of their own (IOBuffer) and users will be able to access and manipulate
these things from any level. We’ll attach buffers to arbitrary PMCs using named
properties, which means we can attach buffers to any PMC that needs them.
So that’s my chain of thought on how to solve this problem. I’ve put together three branches to start working on this issue, but I don’t want to get too involved in this code until I get some buy-in from other developers. The FileHandle/Pipe change is going to break some existing code, so I want to make sure we’re cool with this idea before we make breaking changes and need to patch things like NQP and Rakudo. Here are the three branches I’ve started for this:
whiteknight/pipe_pmc
: This branch creates the new Pipe PMC type, separate from FileHandle. This is the breaking change that we need to make up front.whiteknight/io_vtable_lookup
: This branch adds the new IOBuffer PMC type, implements the new IO VTABLE map, and implements the new properties-based logic for attaching buffers to PMCs.whiteknight/io_userhandle
: This branch implements the new User IO VTABLE, which redirects IO requests to methods on PMC objects.
Like I said, these are all very rough drafts so far. All these three branches build, but they don’t necessarily pass all tests or look very pretty. If people like what I’m doing and agree it’s a good direction to go in, I’ll continue work in earnest and see where it takes us.