Parrot4Newbies: Documentation
24 Jun 2009In response to some requests I received and some signs of interest that I saw at YAPC, I decided to start a little series of posts that I am going to call Parrot4Newbies. The goal of these posts are to take a look at some particular subsystems and projects in Parrot-land, describe what kinds of jobs and work need to be done, and break down these tasks into small steps that new users can follow in order to become acclimatized and familiarized to the Parrot development process. I will be publishing these posts pretty frequently, in addition to my "normal" posts which I try to put out at least 5x per week, so the volume here may be getting pretty heavy. I will be using the "Parrot4Newbies" tag on these posts, so you can filter them if you want.
The first area where Parrot needs help, which is actually a common need of most open source software projects, is documentation. Writing and improving documentation is a great way to get familiarized with the codebase for people new to Parrot internals. There are actually several components to our documentation, all of which need help.
There are two good ways to help: (1), open a bug report (requires a username) when you find a problem that you would like somebody else to fix. (2) Even better is to fix the problem yourself and create a bug report with an attached patch.
Project Documentation
We have a lot of documentation in the repository in the folders docs/. However, our documentation is far from comprehensive. Find files that look incomplete, out-of-date, or just plain confusing and misleading, and fix them up.
Our design documents in docs/pdds/ are the documentation that describes what Parrot is and how it is implemented. Read through these to get a good understanding of Parrot, and look for places that are incomplete, not true to reality (which may mean the code is wrong or the specs are wrong), or misleading.
Parrot Book
We have a book at docs/book/ that needs work. We have a first edition being published soon, but it needs a lot of work so we can put out a second edition after Parrot 2.0 is released. Read over the book and find things like core topics that aren't covered (or aren't covered well), typos, out-of-date errors, etc.
When you see code examples, try them and make sure they work too! Don't just trust our word for it make sure it does what it says it does!
Examples and Tutorials
We have a large amount of examples in examples/, and a PIR tutorial in examples/tutorial/ that need lots of help. Read through the examples, try them out locally to see if they work and how, and clean them up a little. Our tutorials especially are incomplete and don't cover all the topics that a new user needs to know. Feel free to write new tutorial pages entirely if a particular topic isn't well covered already.
File- and Function-Level Documentation
All our .c files contain POD-formatted documentation. This is broken into two distinct types: File-level which is the stuff at the top of the file that describes the particular system in a high-level overview; and Function-level which provides documentation for each individual function. Both of these things are required and are verified by our test suite. To run these tests, you type "make codetest" from the parrot repo, after configuring and building Parrot. This will run through a series of tests in addition to the documentation ones. Specifically, the test t/codingstd/c_function_docs.t covers the function-level tests, and there are several files failing this test:
compilers/imcc/instructions.c
compilers/imcc/optimizer.c
compilers/imcc/parser_util.c
compilers/imcc/pbc.c
compilers/imcc/pcc.c
compilers/imcc/reg_alloc.c
compilers/imcc/symreg.c
compilers/pirc/src/bcgen.c
compilers/pirc/src/pircapi.c
compilers/pirc/src/pircompiler.c
compilers/pirc/src/piremit.c
compilers/pirc/src/pirmacro.c
compilers/pirc/src/pirpcc.c
compilers/pirc/src/pirregalloc.c
compilers/pirc/src/pirsymbol.c
config/gen/platform/ansi/dl.c
config/gen/platform/ansi/exec.c
config/gen/platform/ansi/time.c
config/gen/platform/darwin/dl.c
config/gen/platform/darwin/memalign.c
config/gen/platform/generic/dl.c
config/gen/platform/generic/env.c
config/gen/platform/generic/exec.c
config/gen/platform/generic/math.c
config/gen/platform/generic/memalign.c
config/gen/platform/generic/memexec.c
config/gen/platform/generic/stat.c
config/gen/platform/generic/time.c
config/gen/platform/netbsd/math.c
config/gen/platform/openbsd/math.c
config/gen/platform/openbsd/memexec.c
config/gen/platform/solaris/math.c
config/gen/platform/solaris/time.c
examples/c/nanoparrot.c
examples/compilers/japhc.c
examples/embed/lorito.c
src/atomic/gcc_x86.c
src/debug.c
src/gc/gc_malloc.c
src/gc/generational_ms.c
src/gc/res_lea.c
src/io/io_string.c
src/jit/amd64/jit_defs.c
src/jit/arm/exec_dep.c
src/jit/i386/exec_dep.c
src/jit/ppc/exec_dep.c
src/nci_test.c
src/pbc_dump.c
src/pbc_info.c
src/pic.c
src/pic_jit.c
src/string/charset/ascii.c
src/string/charset/binary.c
src/string/charset/iso-8859-1.c
src/string/charset/unicode.c
src/tsq.c
src/jit/alpha/jit_emit.h
src/jit/arm/jit_emit.h
src/jit/hppa/jit_emit.h
include/parrot/atomic/gcc_pcc.h
src/jit/ia64/jit_emit.h
src/jit/mips/jit_emit.h
src/jit/ppc/jit_emit.h
src/jit/skeleton/jit_emit.h
src/jit/sun4/jit_emit.h
Some of these files are complicated, which is why nobody has been brave enough to document them. However, while documenting things you should feel free to clean the code up a little bit so it becomes easier to document and understand. Also, most of these are not core files, they tend to be files on the edge of the codebase in some of our complicated systems: JIT, the IMCC and PIRC compilers, etc, so you're going to need to read up on some core systems in order to what is going on in all of these. More reading and looking at code means you learn more, more quickly!
Most of the rest of the code files in the repository have some documentation, but it might not be high quality. Feel free to look into other files that interest you as well and improve existing documentation to be easier to read and more accurate. This is a great way to get started as well!
Conclusion
So that's a quick list of things that a new user should feel free to take a look at as a way of getting familarized with Parrot and it's documentation. Documentation is a great way to learn the code because you have to understand it and describe it in your own words in order to write the docs. It's a great way to get involved and will have an immediate impact on the quality of Parrot and our ability to attract even more new developers. Fixing some documentation
Please do not hesitate to send in patches, however small. Also, if you have questions and need clarifications, please feel free to ask on #parrot or on the parrot mailing list. We would love to help you help us.
This entry was originally posted on Blogger and was automatically converted. There may be some broken links and other errors due to the conversion. Please let me know about any serious problems.
Comments
w
6/25/2009 5:12:58 AM
Whiteknight
6/25/2009 7:56:40 AM
Whiteknight
6/25/2009 10:56:00 AM
skids
6/25/2009 5:14:45 PM
Whiteknight
6/25/2009 9:49:07 PM
Whiteknight
6/26/2009 3:37:43 PM
Whiteknight
6/27/2009 9:45:58 AM
so what state is Perl6 in now ? do I need to learn both Parrot and Perl6 in order to write something for it ?
Great question! Parrot and Rakudo Perl6 are separate projects in separate repositories, so you don't need to monkey with one in order to monkey with the other.
That said, Rakudo Perl6 is implemented on Parrot, so some knowledge of Parrot is usually required for some tasks (hacking the core Parser and low-level builtins, for instance).
Recently the Rakudo people have added what's called the "Setting". The Setting are all the builtins for Perl6 that are written in Perl6 itself. If you don't want to learn Parrot, you can probably start hacking on the Setting in Perl6 instead. There are also tons of tests that need writing, approving, and updating, and all sorts of other projects written in Perl6 that need help too.
The short answer is you need to know Parrot if you want to hack on the parts that are written in Parrot's syntax, and you don't need to know Parrot to hack on the things written in Perl6.
If you want to know what the current state is (and it's moving very quickly), you should check out rakudo.org, or perl6-projects.org.
A good specific project here to work on is Ticket #592, which deals with documenting the Parrot Debugger. Anybody interested in that is welcome to jump in immediately!
For what code I can understand, I can certainly upgrade a bit of documentation... though I've been hesitant to do so knowing most of the stuff I have been looking at is on the soon-to-be-refactored list, so it just seems a bit futile at times.
Maybe I'll lurk on the svn logs and document the next major subsytem changeset checked in.
Anyway, there's places in there that only the core team understands and only they can document -- you can't rely on cage cleaners for that. Sorta like asking for directions at the vostor's center when you're the one working the desk :-). Though, if the core team would prefer to write more accessible blog posts documenting those parts, I'd be happy to chop them up and fit them into the code.
Things that are still a mystery to me despite reading the source:
1) "Lexpads" and the "linked list" (tree? has to be right?) that serves stackish purposes.
2) What the criteria are for making something a method versus a VTABLE.
3) What's going to be (green thread) reentrant, eventually, and what's going to be one-thread-at-a-time? Where are the locks and/or scheduler hints going to be to make that happen?
4) Who's working on what? Do you have to be a ml and IRC fiend to know what balls are in the air? Maybe parrotteers should adopt the (pains me to say this) Twitter mentality.
You're right that it can be difficult to figure out what is going on from a top-level view, but with good procedural programming best practices, it should be easy to figure out what each individual function does. And if you can't figure it out, start cleaning it up and refactoring it until you can (and that makes it better for everybody).
To answer some of your questions:
1) LexPads are like hashes that store lexical variables by name. Each executing subroutine context gets a LexPad to store it's lexical variables. These do indeed form a tree because they are attached to Contexts which form a tree. You can search upwards to find lexicals in containing scopes quite easily by looking up the tree.
2) VTABLES are fixed in number and type. There are a limited number of VTABLES (I think about 200) of precise types. In exchange for this rigid definition, VTABLES tend to be much faster then METHODs, which can have arbitrary names and function signatures.
3) I have no idea about that, cleanups to the scheduler system are further down on my task list. You could ask Allison about that.
4) You are right, it would be good to keep track of who is doing what at each moment. I'm obviously very vocal about my own work, but that doesn't translate to everybody in the project. Best bet is to check out the weekly reports at the #parrotsketch meeting.
Also, there is a Parrot programming book at Wikibooks, for people who are inclined to work on that platform: PVM Wikibook.
That book could use lots of help and doesn't require you to get a commit bit to the parrot repo.
Another good ticket for this is TT # 766, which involves tagging POD keywords so POD::Index can index them.