Programming, Software and Code

Proper Matrices in Matrixy

I started a branch in Matrixy (it's own little adventure!) to start working on improved integration with parrot-linear-algebra. Matrixy currently uses nested ResizablePMCArray PMCs to implement matrices. This carries a few problems with it and is a huge performance bottleneck.

When I started work on Matrixy I had in mind to create classes for the fundamental things like Matrices and Vectors, but also the other types like cell arrays and structs. My work at the time was very naive, and I expected to be implementing all the various numerical algorithms myself. It wasn't until Blair mentioned it to me that I really thought about using an optimized library like BLAS, although that brings with it it's own hardships. Libraries like BLAS saves us the effort of implementing all the operations from scratch, but requires that we be able to write glue code for interfacing with the library functions. Tradeoffs.

Nested RPAs were easy because they were a pure-PIR solution that could be made to work quickly. The PMC manages it's own memory, so our only job was to manage the individual arrays. There is a certain amount of effort involved in keeping the matrix square when we resize (very expensive) and a certain amount of effort involved in marshalling data into BLAS API functions (very expensive), but we were able to get a lot of development done using the quick prototype.

in Parrot-Linear-Algebra we're providing PMC types that store data in a low-level C data buffer, in a format that is amenable for use directly in BLAS. Also, since it's a square buffer by default, we don't have to do anything special to keep it square on resize: We create a new buffer of the necessary size and block-copy items from the smaller old buffer to the larger new buffer. This might not be the most efficient way to handle it, but if we consider that matrices can be preallocated quickly, it's not going to be a huge bottleneck and a good developer can avoid it almost entirely. The new PMC type also provides all the necessary VTABLEs, so we can interact with it in a very natural way from PIR. Code that we were executing in large PIR functions before can now be performed using single opcodes, or single method calls.

So the general plan for the branch is to change Matrixy to have a strict dependency on parrot-linear-algebra. I will cut out all the code that manages custom RPA types and replace it with (hopefully smaller and better) code to interact with NumMatrix2D PMCs instead. Along the way, I'll be adding METHODs to the PMC type to perform the interactions with BLAS that we were previously doing in PIR. This should provide us a significant speedup in most cases, clean up the code significantly, and get us on the right path for future developments. I've already added several METHODs and VTABLEs for various purposes: clone, transpose, attribute access, fill, etc. These all need testing and improving of course, but I've been on too much of a roll with Matrixy to do these things.

One cool method that I added last night was an "iterate function" method. It takes a Sub object, which it invokes on every element in the matrix and replaces the current value with the result of the Sub. Since Matrixy has a lot of functions that act independently on every item in the array, this was a pretty natural thing to add. Also, since we're doing this in C now instead of PIR, we open up all sorts of possibilities like automatically threading large requests for improved performance. I think Parrot's threads system has some work to do yet before this becomes a viable optimization, but it's still interesting to think about.

Here's a PIR example of iterating a function:

.sub main :main
$P0 = new ['NumMatrix2D']
.const Sub helper = "add_to_each"
$P0[1;1] = 4
$P0[0;1] = 3
$P0[1;0] = 2
$P0[0;0] = 1
# 1 2
# 3 4
$P0.'iterate_function_inplace'(helper, 2)
# 3 4
# 5 6
$P0.'iterate_function_inplace'(helper, 5)
# 8 9
# 10 11

.sub 'add_to_each'
.param pmc matrix
.param num value
.param num to_add
$N0 = value + to_add

I haven't done any testing on this yet, but this is what it will look like when it's all working properly. There are a lot of functions that act item-wise on each element in a matrix, and this little helper routine is going to make those all much easier and cleaner to implement.

This entry was originally posted on Blogger and was automatically converted. There may be some broken links and other errors due to the conversion. Please let me know about any serious problems.