I was going to call this post “IO Cleanup Status”, but let’s face facts: This is a complete rewrite of the entire subsystem. I haven’t hardly left a single line of code untouched. It is a full rewrite of the system hiding behind a mostly-similar (though not quite the same) API. I didn’t intend to completely rewrite the whole subsystem when I started the branch, hence the benign-sounding branch name. Following along with our cultural norms, I could have called it whiteknight/io_massacre or something similarly upbeat. Whatever. I’ve known people stuck with un-liked names for their entire lives, so this branch can be misnamed for a few weeks.

So what is the status of this branch, exactly?

At the time of this writing the branch is mostly complete. The major architectural work has all been done, with per-type logic separated out into new IO_VTABLE structures, and buffering logic divorced from FileHandle into a new IO_BUFFER structure. Now you can do things that have never been possible before, like buffering socket input and output, or doing readline with custom line-end characters on all handle types, and a whole bunch of other, increasingly-obscure operations. A lot of the new capabilities are things you didn’t even know we didn’t support before. Now, we do.

We aren’t quite there yet, but the stage is set for some other awesome changes in the future too, which I’ll talk about in more depth when we get there.

The current status of the branch is good. Parrot builds without any huge amount of new warnings and with no errors on my platform. Some platform-specific code needs to be updated for Windows, I’m sure. The one big thing standing in the way is keeping track of file positions through operations like seek and tell. These things are made a little bit more difficult when you have read buffers reading ahead, because the position of the next character to read according to the user may be far different than the position of the file descriptor according to the operating system. Then consider the case when you have a file opened for read and write, with buffers in both directions. The old system had a single buffer per FileHandle which needed to be flushed if you tried to read when the buffer was in write mode, or you tried to write when it was in read mode. If you’re switching back and forth between reading and writing often enough, buffering actually decreases performance when it’s supposed to be a performance enhancer.

The FileHandle has an attribute to keep a pointer to the current cursor location, but I’m not always updating it as often as I should and not always reading it when I should. If you have a file opened for read and write, when you write 5 characters at the current file position you need to increment the read buffer by 5 characters also. When you go to read in 5 characters from the current position, you either need to flush the write buffer first or you can try to read those characters right out of the write buffer. There’s nothing complicated about it, just a lot of bookkeeping to get right and lots of little interactions that need to be tested. It’s helpful that we don’t do seek or tell on some things like Sockets, and we don’t really buffer StringHandles.

The branch is moving along well and if I can find the time to actually sit down and work on it for a dedicate period of time I might be able to get it closer to being done. I’m shooting for being mergable sometime after the coming release.