Chapter 6: The Lexer, Compiler, Resolver, and Interpreter Objects

Now that you're familiar with Mason's basic syntax and some of its more advanced features, it's time to explore the details of how the various pieces of the Mason architecture work together to process components. By knowing the framework well, you can use its pieces to your advantage, processing components in ways that match your intentions.

In this chapter we'll discuss four of the persistent objects in the Mason framework: the Interpreter, Resolver, Lexer, and Compiler. These objects are created once (in a mod_perl setting, they're typically created when the server is starting up) and then serve many Mason requests, each of which may involve processing many Mason components.

Each of these four objects has a distinct purpose. The Resolver is responsible for all interaction with the underlying component source storage mechanism, which is typically a set of directories on a filesystem. The main job of the Resolver is to accept a component path as input and return various properties of the component such as its source, time of last modification, unique identifier, and so on.

The Lexer is responsible for actually processing the component source code and finding the Mason directives within it. It interacts quite closely with the Compiler, which takes the Lexer's output and generates a Mason component object suitable for interpretation at runtime.

The Interpreter ties the other three objects together. It is responsible for taking a component path and arguments and generating the resultant output. This involves getting the component from the Resolver, compiling it, then caching the compiled version so that the next time the Interpreter encounters the same component it can skip the resolving and compiling phases.
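As a rough mental model of that flow, here is a toy sketch in plain Perl. It does not use any Mason classes: the names resolve_source, compile_source, and run_component are hypothetical, the "compiler" merely wraps a format string in a closure, and the cache never checks whether the source has changed.

```perl
use strict;
use warnings;

# Toy stand-ins for the real objects; none of these names come from Mason.
my %SOURCE = ('/hello' => 'Hello, %s!');   # pretend component storage
my %CACHE;                                  # compiled components, keyed by path
my $COMPILE_COUNT = 0;                      # how often the "compiler" ran

# "Resolver": map a component path to its source text.
sub resolve_source {
    my ($path) = @_;
    die "no such component: $path\n" unless exists $SOURCE{$path};
    return $SOURCE{$path};
}

# "Compiler": turn source text into something executable (here, a closure).
sub compile_source {
    my ($source) = @_;
    $COMPILE_COUNT++;
    return sub {
        my (%args) = @_;
        return sprintf($source, $args{name});
    };
}

# "Interpreter": resolve and compile on first use, then reuse the cached code.
sub run_component {
    my ($path, %args) = @_;
    $CACHE{$path} ||= compile_source(resolve_source($path));
    return $CACHE{$path}->(%args);
}

print run_component('/hello', name => 'World'), "\n";
print run_component('/hello', name => 'Mason'), "\n";
print "compiled $COMPILE_COUNT time(s)\n";
```

The second call finds the closure already in %CACHE, so neither the resolving nor the compiling step runs again.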

Figure 6-1 illustrates the relationship between these four objects. The Interpreter has a Compiler and a Resolver, and the Compiler has a Lexer.



Figure 6-1. The Interpreter and its cronies

Passing Parameters to Mason Classes

An interesting feature of the Mason code is that, if a particular object contains another object, the containing object will accept constructor parameters intended for the contained object. For example, the Interpreter object will accept parameters intended for the Compiler or Resolver and do the right thing with them. This means that you often don't need to know exactly where a parameter goes. You just pass it to the object at the top of the chain.

Even better, if you decide to create your own Resolver for use with Mason, the Interpreter will take any parameters that your Resolver accepts -- not the parameters defined by Mason's default Resolver class.

Also, if an object creates multiple delayed instances of another class, as the Interpreter does with Request objects, it will accept the created class's parameters in the same way, passing them to the created class at the appropriate time. So if you pass the autoflush parameter to the Interpreter's constructor, it will store this value and pass it to any Request objects it creates later.
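To make the routing idea concrete, here is a minimal sketch with three made-up classes. This is not how Mason implements it (the real mechanism is in Class::Container and is far more general); Toy::Interp, Toy::Compiler, Toy::Request, and make_request are all hypothetical names, and only the parameter names in_package and autoflush are borrowed from Mason.

```perl
use strict;
use warnings;

# A "delayed" object: created many times, long after the container is built.
package Toy::Request;
sub new {
    my ($class, %p) = @_;
    my $self = { autoflush => $p{autoflush} // 0 };
    return bless $self, $class;
}
sub autoflush { $_[0]{autoflush} }

# A contained object: created once, together with the container.
package Toy::Compiler;
sub new {
    my ($class, %p) = @_;
    my $self = { in_package => $p{in_package} // 'Toy::Commands' };
    return bless $self, $class;
}
sub in_package { $_[0]{in_package} }

# The container: sorts incoming parameters by who actually wants them.
package Toy::Interp;
my %COMPILER_PARAMS = map { $_ => 1 } qw(in_package);
my %REQUEST_PARAMS  = map { $_ => 1 } qw(autoflush);

sub new {
    my ($class, %p) = @_;
    my (%for_compiler, %for_request, %own);
    for my $name (keys %p) {
        my $bucket = $COMPILER_PARAMS{$name} ? \%for_compiler
                   : $REQUEST_PARAMS{$name}  ? \%for_request
                   :                           \%own;
        $bucket->{$name} = $p{$name};
    }
    my $self = {
        %own,
        compiler         => Toy::Compiler->new(%for_compiler),  # built now
        request_defaults => \%for_request,                       # saved for later
    };
    return bless $self, $class;
}
sub compiler { $_[0]{compiler} }

# Each new request starts from the saved defaults; explicit arguments win.
sub make_request {
    my ($self, %overrides) = @_;
    return Toy::Request->new(%{ $self->{request_defaults} }, %overrides);
}

package main;

my $interp = Toy::Interp->new(in_package => 'MyApp::Components', autoflush => 1);
print $interp->compiler->in_package, "\n";
print $interp->make_request->autoflush, "\n";
```

The caller never mentions the Compiler or Request classes; it hands everything to the object at the top of the chain, which is the convenience the text describes.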

This system was motivated in part by the fact that many users want to be able to configure Mason from an Apache config file. Under this system, the user just sets a certain configuration directive (such as MasonAutoflush[1] to set the autoflush parameter) in her httpd.conf file, and it gets directed automatically to the Request objects when they are created.
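The naming rule given in footnote 1 is mechanical enough to express as a few lines of Perl. The helper below is purely illustrative (apache_directive_name is a hypothetical name, not something Mason provides); it simply applies the lower_case_with_underscores to StudlyCaps conversion and prepends "Mason".

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Mason: derive the Apache directive name
# from a Perl-side parameter name, following the rule in footnote 1.
sub apache_directive_name {
    my ($param) = @_;
    my @words = map { ucfirst lc $_ } split /_/, $param;
    return 'Mason' . join('', @words);
}

print apache_directive_name($_), "\n" for qw(autoflush comp_root error_mode);
```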

The details of how this system works are fairly magical and the code involved is so funky its creators don't know whether to rejoice or weep, but it works, and you can take advantage of this if you ever need to create your own custom Mason classes. Chapter 12 covers this in its discussion of the Class::Container class, where all the funkiness is located.

The Lexer

Mason's built-in Lexer class is, appropriately enough, HTML::Mason::Lexer. All it does is parse the text of Mason components and pass off the sections it finds to the Compiler. As of Version 1.10, the Lexer doesn't actually accept any parameters that alter its behavior, so there's not much for us to say in this section.

Future versions of Mason may include other Lexer classes to handle alternate source formats. Some people -- crazy people, we assure you -- have expressed a desire to write Mason components in XML, and it would be fairly simple to plug in a new Lexer class to handle this. If you're one of these crazy people, you may be interested in Chapter 12 to see how to use objects of your own design as pieces of the Mason framework.

By the way, you may be wondering why the Lexer isn't called a Parser, since its main job seems to be to parse the source of a component. The answer is that previous implementations of Mason had a Parser class with a different interface and role, and a different name was necessary to maintain forward (though not backward) compatibility.

The Compiler

By default, Mason will use the HTML::Mason::Compiler::ToObject class to do its compilation. It is a subclass of the generic HTML::Mason::Compiler class, so we describe here all parameters that the ToObject variety will accept, including parameters inherited from its parent:

Altering Every Component's Content

Several access points let you step in to the compilation process and alter the text of each component as it gets processed. The preprocess, postprocess_perl, postprocess_text, preamble, and postamble parameters let you exert a bit of ad hoc control over Mason's processing of your components.

Figure 6-2 illustrates the role of each of these five parameters.



Figure 6-2. Component processing hooks
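Here is a sketch of what values for these five parameters might look like. It assumes, as we understand Mason's documentation, that the three subroutine hooks each receive a single scalar reference and are expected to modify the string in place, and that preamble and postamble are plain strings of Perl code. The [[YEAR]] token and the "# DEBUG:" convention are invented for the example, and the hooks are exercised on ordinary strings here rather than inside Mason.

```perl
use strict;
use warnings;

my %hooks = (
    # Sees the whole component source before it is lexed.
    preprocess => sub {
        my ($source_ref) = @_;
        $$source_ref =~ s/\[\[YEAR\]\]/2002/g;          # expand a made-up token
        return;
    },
    # Sees the plain-text portion of a component.
    postprocess_text => sub {
        my ($text_ref) = @_;
        $$text_ref =~ s/[ \t]+$//mg;                    # trim trailing whitespace
        return;
    },
    # Sees the Perl portion of a component.
    postprocess_perl => sub {
        my ($perl_ref) = @_;
        $$perl_ref =~ s/^[ \t]*#[ \t]*DEBUG:.*\n//mg;   # drop "# DEBUG:" lines
        return;
    },
    # Perl code placed before and after every component's own code.
    preamble  => "warn 'entering component';\n",
    postamble => "warn 'leaving component';\n",
);

# Try the hooks on sample strings to see their effect.
my $source = "Copyright [[YEAR]]   \n";
$hooks{preprocess}->(\$source);
$hooks{postprocess_text}->(\$source);
print $source;

my $code = "# DEBUG: remove me\nmy \$x = 1;\n";
$hooks{postprocess_perl}->(\$code);
print $code;
```

In a real setup the contents of %hooks would be passed to the Interpreter's constructor alongside its other parameters, which then routes them to the Compiler.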

Compiler Methods

Once an HTML::Mason::Compiler::ToObject object is created, the following methods may be invoked. Many of them simply return the value of a parameter that was passed (or set by default) when the Compiler was created. Some methods may be used by developers when building a site, while other methods should be called only by the various other pieces in the Mason framework. Though you may need to know how the latter methods work if you start plugging your own modules into the framework, you'll need to read the Mason documentation to find out more about those methods, as we don't discuss them here.

The compiler methods are comp_class(), in_package(), preamble(), postamble(), use_strict(), allow_globals(), default_escape_flags(), preprocess(), postprocess_perl(), postprocess_text(), and lexer().

Each of these methods returns the given property of the Compiler, which was typically set when the Compiler was created. If you pass an argument to these methods, you may also change the given property. One typically doesn't need to change any of the Compiler's properties after creation, but interesting effects could be achieved by doing so:

  % my $save_pkg = $m->interp->compiler->in_package;
  % $m->interp->compiler->in_package('MyApp::OtherPackage');
  <& /some/other/component &>
  % $m->interp->compiler->in_package($save_pkg);

The preceding example will compile the component /some/other/component -- and any components it calls -- in the package MyApp::OtherPackage rather than the default HTML::Mason::Commands package or whatever other package you specified using in_package.

Of course, this technique will work only if /some/other/component actually needs to be compiled at this point in the code; it may already be compiled and cached in memory or on disk, in which case changing the in_package property (or any other Compiler property) will have no effect. Because of this, changing Compiler properties after the Compiler is created is neither a great idea nor officially supported, but if you know what you're doing, you can use it for whatever diabolical purposes you have in mind.

The Resolver

The default Resolver, HTML::Mason::Resolver::File, finds components and their meta-information (for example, modification date and file length) on disk. The Resolver is a pretty simple thing, but it's useful to give it its own place in the pluggable Mason framework because it allows a developer to use whatever storage mechanism she wants for her components.

The HTML::Mason::Resolver::File class accepts only one parameter, comp_root.

If you don't provide a comp_root parameter, it defaults to something reasonably sensible. In a web context it defaults to the server's DocumentRoot; otherwise, it defaults to the current working directory.
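To illustrate the kind of work a file-based resolver does with its component root, here is a small standalone sketch. It is not the HTML::Mason::Resolver::File API: resolve_component and the keys of the hash it returns are hypothetical, chosen to mirror the properties mentioned earlier (source, time of last modification, unique identifier).

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper, not Mason's API: look a component path up under a
# component root and report the kind of properties a Resolver returns.
sub resolve_component {
    my ($comp_root, $comp_path) = @_;
    my @parts = grep { length } split m{/}, $comp_path;
    my $file  = File::Spec->catfile($comp_root, @parts);
    return unless -f $file;                      # no such component

    open my $fh, '<', $file or die "cannot read $file: $!";
    my $source = do { local $/; <$fh> };         # slurp the whole file
    close $fh;

    my $info = {
        source        => $source,
        last_modified => (stat $file)[9],        # mtime in epoch seconds
        comp_id       => $comp_path,             # the path doubles as an ID
    };
    return $info;
}

# Demonstrate with a throwaway component root.
my $root = tempdir(CLEANUP => 1);
open my $out, '>', File::Spec->catfile($root, 'hello.html')
    or die "cannot create component: $!";
print {$out} "Hello, world!\n";
close $out;

my $info = resolve_component($root, '/hello.html');
print "id: $info->{comp_id}, modified: $info->{last_modified}\n";
print $info->{source};
```

Because component paths are always resolved relative to the root, the same path can point at different files simply by changing comp_root, or at something other than files by swapping in a different resolver.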

The Interpreter

The Interpreter is the center of Mason's universe. It is responsible for coordinating the activities of the Compiler and Resolver, as well as creating Request objects. Its main task involves receiving requests for components and generating the resultant output of those requests. It is also responsible for several tasks behind the scenes, such as caching components in memory or on disk. It exposes only a small part of its object API for public use; its primary interface is via its constructor, the new() method.

The new() method accepts lots of parameters. It accepts any parameter that its Resolver or Compiler (and through the Compiler, the Lexer) classes accept in their new() methods; these parameters will be transparently passed along to the correct constructor. It also accepts the following parameters of its own:

Request Parameters Passed to the Interpreter

Besides the Interpreter's own parameters, you can pass the Interpreter any parameter that the Request object accepts. These parameters will be saved internally and used as defaults when making a new Request object.

The parameters that can be set are: autoflush, data_cache_defaults, dhandler, error_mode, error_format, and out_method.

Besides accepting these as constructor parameters, the Interpreter also provides get/set accessors for these attributes. Setting these attributes in the Interpreter will change the attribute for all future Requests, though it will not change the current Request.

Footnotes

1. All initialization parameters have corresponding Apache configuration names, found by switching from lower_case_with_underscores to StudlyCaps and prepending "Mason."

2. The default package name, HTML::Mason::Commands, is purely historical; it may be changed in the future.