Whiteknight: Rosella Path

Long ago when I first prototyped the Query library, I included some additional functionality with it called “Path”. I decided that the two things were different enough to warrant being set up as two separate libraries, so I pulled out the Path code and eventually made the Query library stable. Recently I’ve gotten back to Path because it’s an essential building block of a bigger project I’ve been working on.

The Query library performs higher-order operations on aggregates like arrays and hashes. The Path library, instead, provides functionality to search through nested aggregates using search strings. It’s similar in concept to the way XPath can be used to search through XML documents, although not nearly as powerful as XPath is. At least, not yet.

As of today, the Path library is a stable part of Rosella. As always, “stable” does not mean it is magically perfect and guaranteed to be bug-free. That label only means that the interface has been reviewed and I’m probably not itching to change it without good reason. It also means that I consider the library to be usable by many people in their projects. Bug reports, feedback, feature requests and other stuff like that are always appreciated.

The Path library is able to traverse hash keys and object attributes by name using search strings. Here’s a lengthy example of hash key traversal in NQP:

my sub new_hash(*%h) { %h; }

my $q := Rosella::build(Rosella::Path);
my %a := new_hash(
    :d(new_hash(
        :e(new_hash(
            :h("i")
        ))
    ))
);
%a{"d.e"} := new_hash(:f("g"));

pir::say($q.get(%a, 'd.e.f'));      # "g"
pir::say($q.get(%a, 'd.e.h'));      # "i"

And here’s a similar example in Winxed:

load_bytecode("rosella/path.pbc");
var q = new Rosella.Path();
var a = {
    "d" : {
        "e" : {
            "h" : "i"
        }
    },
    "d.e" : {
        "f" : "g"
    }
};
say(q.get(a, "d.e.f"));     # "g"
say(q.get(a, "d.e.h"));     # "i"

The majority of the code in this example is used to construct a nested hash %a. The really interesting bit is at the very bottom with the .get() method. This method takes a path string and searches through the aggregate to satisfy it. In the case of hash keys the search is longest-key first. With the given hash, the following two lines are equivalent accesses:

my $result := $q.get(%a, 'd.e.f');
my $result := %a{"d.e"}{"f"};

And these two:

my $result := $q.get(%a, 'd.e.h');
my $result := %a{"d"}{"e"}{"h"};
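
To make the longest-key-first search concrete, here is a small Python model of the lookup for plain nested hashes. It is a sketch of the semantics described above, not Rosella's actual code. The names `path_get` and `_search` are made up for illustration, and the backtracking is an assumption: the 'd.e.h' lookup only succeeds if a failed match on the "d.e" key falls back to the shorter key "d".

```python
_MISSING = object()  # sentinel, so that None can still be a legitimate stored value


def _search(node, parts, sep):
    if not parts:
        return node  # the whole path has been consumed: this is the result
    if not isinstance(node, dict):
        return _MISSING
    # Try the longest candidate key first, then whittle it down one segment at a time.
    for end in range(len(parts), 0, -1):
        key = sep.join(parts[:end])
        if key in node:
            found = _search(node[key], parts[end:], sep)
            if found is not _MISSING:
                return found
            # Assumed behavior: a dead end under a long key falls back to shorter keys.
    return _MISSING


def path_get(data, path, sep="."):
    """Resolve a dotted path against nested dicts, trying the longest key first."""
    result = _search(data, path.split(sep), sep)
    if result is _MISSING:
        raise KeyError(path)
    return result


a = {"d": {"e": {"h": "i"}}, "d.e": {"f": "g"}}
print(path_get(a, "d.e.f"))  # g  (found via a["d.e"]["f"])
print(path_get(a, "d.e.h"))  # i  (found via a["d"]["e"]["h"])
```

The worst case tries every way of grouping the path segments into keys, which is one reason a path lookup costs more than a direct hash access.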

You’re not necessarily saving yourself any keystrokes with this particular example, but you are abstracting away the storage structure and allowing a simple search path to return the correct value. What the example above shows is just nested hash access, but Path can also search through named attributes in objects. Consider this next example to really understand the power of the new library:

my $result := $q.get($obj, 'foo.$!payload.bar.%!props.baz');

That example works right now in the library, if you take the time to put together such a large nested set of objects. Keep in mind that the items separated by periods can transparently be hash keys or attribute names. The library starts searching from longest identifiers first, then slowly starts whittling down until it finds a match. You could, for instance, have a hash object with a single long key:

my %data := {};
%data{'foo.$!payload.bar.%!props.baz'} := "hello!";

…or you could have any other combination nested however you want it. The biggest benefit, as I mentioned earlier, is that we gain the ability to separate the data being consumed from the actual structure of the model which provides that data.

As you might expect, because of the search semantics a lookup through Path is not nearly as efficient as a direct lookup in a hash. The Path library is good for a few things:

For prototyping a system which uses deeply-nested objects. It’s easier to set up a quick search string than to write out all the code to find values that might be changing location as the system matures.
For dealing with data structures of unknown or untrusted shape. For instance, data generated by a user may have a weird structure, and the Path library will automatically search it no matter what it looks like.
For working with data where the exact shape of the structure is less important than knowing the names of particular data bits you want.

To get an idea of where I am envisioning this functionality to go, consider the idea of a text templating engine for Parrot similar to Liquid. Given a Template object which combines both a raw text string and a data context object, the parser would read through the text string until it found templating instructions, and could then pass off data requests to the Path object for resolution. Text templates can be written without regard for the actual structure of the data objects that are used to render the template.
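
As a rough illustration of that idea, the following Python sketch renders a text template by handing every placeholder to a path resolver. Everything here is hypothetical: the `{{ ... }}` placeholder syntax is borrowed from Liquid, `render` and `simple_resolve` are invented names, and the stand-in resolver only walks nested dicts one segment at a time rather than doing a real Path-style search.

```python
import re

# Matches {{ some.path }} and captures the path without surrounding whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*(.+?)\s*\}\}")


def simple_resolve(context, path):
    """Stand-in resolver: walks nested dicts one dotted segment at a time."""
    node = context
    for part in path.split("."):
        node = node[part]
    return node


def render(text, context, resolve=simple_resolve):
    """Replace every {{ path }} in text with the value resolve() finds for it."""
    return _PLACEHOLDER.sub(lambda m: str(resolve(context, m.group(1))), text)


template = "Hello {{ user.name }}, you have {{ inbox.unread }} new messages."
context = {"user": {"name": "Ada"}, "inbox": {"unread": 3}}
print(render(template, context))
# Hello Ada, you have 3 new messages.
```

Because the template only ever names paths, the resolver can be swapped for a smarter one, or the context restructured, without touching the template text.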

For anybody with ASP.NET experience, this should probably also remind you of the <%= %> syntax, or even the <%# Eval(...) %> syntax. For the few (and dwindling) WPF programmers out there, this kind of functionality could work very similarly to XAML bindings with the Path= attribute. In fact, that’s part of the motivation for me naming this library “Path”. I’m sure there are plenty of other examples of functionality like this too; I just can’t name them.

This Path library is small, but useful in certain situations. I don’t pretend that it’s the most amazing thing ever, or usable by all software ever, or anything like that. It does a small number of things and does them reasonably well. I have a few upgrades in mind for the future, but this functionality doesn’t seem to lend itself well to too many feature additions. I don’t currently support anything like regular expressions or wildcard matches or anything, and may not try to add anything like that for a long time.

The library can be extended by subclassing the Path object or by adding new Searcher objects to perform searches differently. At the moment the library only has two searchers: Hash and Attribute. Others can be added pretty easily, and I have plans to extend the default set in the future.
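
The searcher idea can be modeled in a few lines of Python. This is a sketch under stated assumptions, not Rosella's interface: the class names, the `lookup` and `add_searcher` methods, and the list-index searcher are all invented for illustration, and Python attribute names stand in for Parrot attribute names such as `$!payload`.

```python
_MISSING = object()  # sentinel for "this searcher found nothing"


class HashSearcher:
    """Looks a name up as a key of a dict."""
    def lookup(self, node, name):
        if isinstance(node, dict) and name in node:
            return node[name]
        return _MISSING


class AttributeSearcher:
    """Looks a name up among an object's instance attributes."""
    def lookup(self, node, name):
        attrs = getattr(node, "__dict__", None)
        if attrs is not None and name in attrs:
            return attrs[name]
        return _MISSING


class IndexSearcher:
    """Hypothetical extra searcher: treats an all-digit name as a list index."""
    def lookup(self, node, name):
        if (isinstance(node, (list, tuple)) and name.isascii()
                and name.isdigit() and int(name) < len(node)):
            return node[int(name)]
        return _MISSING


class Path:
    def __init__(self, searchers=None, sep="."):
        # Assumed default set, mirroring the two searchers named in the text.
        self.searchers = list(searchers) if searchers is not None else [
            HashSearcher(), AttributeSearcher()]
        self.sep = sep

    def add_searcher(self, searcher):
        self.searchers.append(searcher)

    def get(self, node, path):
        found = self._search(node, path.split(self.sep))
        if found is _MISSING:
            raise KeyError(path)
        return found

    def _search(self, node, parts):
        if not parts:
            return node
        # Longest identifier first, asking every searcher before shortening it.
        for end in range(len(parts), 0, -1):
            name = self.sep.join(parts[:end])
            for searcher in self.searchers:
                value = searcher.lookup(node, name)
                if value is _MISSING:
                    continue
                found = self._search(value, parts[end:])
                if found is not _MISSING:
                    return found
        return _MISSING


class Message:
    def __init__(self, payload):
        self.payload = payload


data = {"foo": Message({"bar": ["x", "y"]})}
q = Path()
q.add_searcher(IndexSearcher())
print(q.get(data, "foo.payload.bar.1"))  # y
```

Keeping each lookup strategy in its own small object means a new kind of container only needs one new class, and the search loop itself stays unchanged.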

The Path library is both something I’ve been playing with for a long time and an essential building block for some things I want to do in the future. I’ve been working on a templating library inspired by things like ASP.NET and Liquid, which I’ve referenced above. I’m nowhere near ready to present that yet, but I’m sure I’ll be talking about it in the future.
