UCS-4, NFG and how the grapheme table makes it awesome.

After laying down the foundations of what NFG does in previous blog posts, I've started implementing a new Unicode encoding for Parrot, UCS-4, as part of my work in this Summer of Code. In this post I'll try to explain what it is and how it makes NFG easier to achieve.
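To make the fixed-width idea concrete, here is a minimal sketch in Python rather than Parrot's own C. It assumes only that UCS-4 stores every code point in one 32-bit unit, which is what Python's `utf-32-be` codec produces; the helper name `ucs4_units` is made up for this illustration and is not part of Parrot.

```python
def ucs4_units(text):
    """Return one integer per code point, read from a big-endian 4-byte encoding."""
    raw = text.encode("utf-32-be")  # explicit endianness, so no BOM is prepended
    return [int.from_bytes(raw[i:i + 4], "big") for i in range(0, len(raw), 4)]

# 'e' followed by COMBINING ACUTE ACCENT: two code points, one visible character.
sample = "e\u0301"
units = ucs4_units(sample)

# UTF-8 is variable width (1 + 2 bytes here); the 4-byte form is always 4 per code point.
print(len(sample.encode("utf-8")), len(sample.encode("utf-32-be")), units)
```

Because every unit has the same width, indexing the n-th code point is a simple offset calculation. The sample still takes two units for one visible character, which is the gap NFG is meant to close.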

PAST Pattern Matching

Last Wednesday, I discussed a little bit of the rationale behind my GSoC project and summarized the most low-level portion of my project: PAST::Walker. Today, I want to describe another portion of my project: PAST::Pattern. PAST::Walker provides a very powerful and complete interface. Any possible transformation or other traversal of a PAST should be implementable using it. However, it will not be very convenient if you only want to turn return nodes containing only a call node into tail-call nodes.
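The tail-call example can be sketched as a small tree rewrite. This is Python, not PIR or NQP, and it does not use the PAST::Walker or PAST::Pattern interfaces; the `Node` class, the node kind names and the `to_tail_calls` function are all hypothetical stand-ins chosen only to show the shape of the transformation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                                      # e.g. "return", "call", "tailcall", "val"
    children: list = field(default_factory=list)

def to_tail_calls(node):
    """Rewrite return(call(...)) into tailcall(...), visiting children first."""
    kids = [to_tail_calls(child) for child in node.children]
    # The "pattern": a return node whose only child is a call node.
    if node.kind == "return" and len(kids) == 1 and kids[0].kind == "call":
        return Node("tailcall", kids[0].children)
    return Node(node.kind, kids)

tree = Node("block", [
    Node("return", [Node("call", [Node("val")])]),   # matches the pattern
    Node("return", [Node("val")]),                   # does not match
])
result = to_tail_calls(tree)
print([child.kind for child in result.children])
```

Even in this toy version the traversal and the match condition are tangled together in one function. Separating "what to look for" from "how to walk the tree" is the convenience a pattern layer is meant to provide.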

Instrumenting Parrot

Having an instrumentation framework opens the door to many different tools that can help diagnose problems within a piece of code. One prominent example of this is Valgrind. Valgrind provides an interface for building tools that diagnose and identify specific problems, ranging from memory leaks to data races between threads. Furthermore, the framework is also used to provide profiling tools, such as Callgrind and Cachegrind, which report useful information such as call graphs and execution times of functions.

Encodings, charsets and how NFG fits in there.

The post from last week talked about what NFG was and tried to explain how it was a good feature for Parrot to have. Today I'll be slightly more concrete and talk a bit about how NFG fits inside the Parrot string structure. There are other parts of Parrot that will need hacking on, but this time I want to limit myself to the two bottommost pointers in the STRING structure definition and the concepts behind them.

Hybrid Threads

Threading systems let multiple code paths run at the same time. Why would anyone want that? Simple: impatience. It's no fun waiting for the computer to finish one thing when you want it to be doing something else.

So what are "hybrid" threads and why does Parrot need them? Well, there are two common schools of thought in building threading systems for high-level language runtimes. The Java people call them "green" threads and "native" threads. As with any design tradeoff, the right answer is to cheat and just take all the good properties of both options.

PAST Optimization

One of the advantages of a common virtual machine for various languages is the ability to apply the same optimizations to all of those languages. For example, LLVM includes optimization passes to propagate constants, eliminate dead arguments, code, and globals, inline functions, and eliminate recursive tail calls, among others. Any language with a compiler to LLVM can easily take advantage of these optimizations without any additional work by the compiler writer.

What is NFG and why you want Parrot to have it.

The Grapheme Normal Form for Unicode (or NFG as we like to call it) has been specified as a feature Parrot wants for a long time. It's been in the Parrot Design Document for strings since before I had a commit bit, or any involvement in the project, come to that. Something that has gone that long unimplemented can't be that important, right? I mean, we clearly have survived without it. Turns out it is important, but it takes some background to realize why.

Perl 5.10.1 RC2

Tested a build with this release candidate against parrot and partcl with no problems.

The last week in decnum-dynpmcs

Hit the GSoC "suggested pencils down" date yesterday. That means there's one week left on the project, so I figured it was time for a wider overview.

This week on decnum-dynpmcs

Another week, another #parrotsketch report:
* Made some small cleanups here and there. Added missing checks to a few DecNum METHODS.
* Added a method to DecNum to retrieve the actual exponent of a number. More useful than it sounds, really.
* DecInt is starting to work. Trying it out as a subclass of DecNum. Had to tweak the set_*_native() VTABLEs to properly round their input.
* It looks like SUPER() is misbehaving for dynpmcs. I'll look a bit more into that today and file a TT if it's actually a bug.
* The decTest parser is now functioning: it parses and runs most decTest files.
