Cooking with Maypole, part II

We began our gastronomic adventures with Maypole last month, where we constructed a recipe collection and a way to index and investigate the current state of food stocks in the house. Now we’re going to combine the two concepts, and search for recipes that fit what we’ve got available to eat.

First, though, we’ll take a brief look at how Maypole works and what it actually does.

How Maypole Works

Last month we saw Maypole primarily in terms of putting an interface onto an existing data structure, and applying templates to this. However, to think of it like this is to miss the flexibility and extensibility of Maypole as a web application framework.

Perhaps the best way to think of Maypole is as a tool for mapping a URL onto an “action”, where an action is specified as a method call and a template. So the URL /recipe/view/12 is asking for the view action to be performed on a recipe class, with argument “12”. Practically, this means that the view method will be called on the Larder::Recipe class on the object representing the 12th row in the table, and then the view template used to display the results.

This process is carried out by the gradual fleshing out of a “Maypole request object”; this is analogous to an Apache request object but at a much higher level. As well as containing the means to communicate with the web server (such as an Apache::Request object), it begins with the configuration and some idea of the path being requested: /recipe/view/12.

Next, it decomposes the path down to its components:

    { table => "recipe",
      action => "view",
      args => [ 12 ]
    }

Then it associates the table with the Larder::Recipe class, and then calls the view method. Finally, the output from the templating stage gets added to the request, and the request is sent back to the front-end Maypole class (usually Apache::MVC or CGI::Maypole) for eventual output to the browser.
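To make the decomposition step concrete, here is a minimal sketch of how a path might be split into its three parts. The parse_path helper is hypothetical and written purely for illustration; it is not Maypole's own parser, which does rather more work.

```perl
use strict;
use warnings;

# Hypothetical helper: split a request path into table, action and arguments.
# This only mirrors the structure shown above; it is not Maypole's own code.
sub parse_path {
    my ($path) = @_;
    # Drop the empty string that split produces before the leading slash
    my ($table, $action, @args) = grep { length } split m{/}, $path;
    return {
        table  => $table,
        action => $action,
        args   => \@args,
    };
}

my $req = parse_path("/recipe/view/12");
print "$req->{table} $req->{action} @{ $req->{args} }\n";   # prints "recipe view 12"
```

A path with no trailing arguments, such as /contents/must_eat, simply ends up with an empty args list.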

Avoiding Rotten Tomatoes

We’ll begin our extension of the Larder application by adding an action to list the contents of our larder which are in danger of going off imminently. To help us do this, we’ll write a utility method in the Larder::Contents class called ripe_food which returns all the objects which need to be eaten. This is pure Class::DBI for the time being.

First, SQL dates are a bit of a pain to do anything sensible with, so we have Class::DBI automatically inflate them to Time::Piece objects:

    Larder::Contents->has_a(use_by => 'Time::Piece',
        inflate => sub { Time::Piece->strptime(shift, "%Y-%m-%d") },
        deflate => 'ymd',
    );

Now every time we call use_by, we get a Time::Piece object. This doesn’t really affect our display, since Time::Piece stringifies nicely.
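The inflate and deflate steps can be tried on their own, outside Class::DBI. This is only a sketch of the round trip; the date used here is made up for the example.

```perl
use strict;
use warnings;
use Time::Piece;

# Inflate: turn an SQL date string into a Time::Piece object
# (the date itself is an invented example value)
my $use_by = Time::Piece->strptime("2004-06-09", "%Y-%m-%d");

# Deflate: turn the object back into an SQL date string
print $use_by->ymd, "\n";    # prints "2004-06-09"
```

Because deflate names the ymd method, whatever goes back into the database has the same form as what came out.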

We can now look for those Contents objects which have a use_by date under five days away.

    use Time::Seconds;
    use Time::Piece;

    sub ripe_food {
        my $class = shift;
        # The parentheses stop Perl from parsing "+ 5 * ONE_DAY" as an argument to localtime
        my $deadline = localtime() + 5 * ONE_DAY;
        grep { $_->use_by <= $deadline } $class->retrieve_all;
    }

(If more efficiency is required, we could do the searching in the SQL by using Class::DBI::AbstractSearch, but my larders aren’t big enough to warrant it.)
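The date comparison itself can be sketched without a database. Here plain hashes stand in for the Larder::Contents rows, the ripe function and the sample data are hypothetical, and the current time is passed in so that the example gives the same answer whenever it is run.

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds;

# Return the items whose use_by date falls on or before a deadline
# five days after $now. Plain hashes stand in for database rows.
sub ripe {
    my ($now, @items) = @_;
    my $deadline = $now + 5 * ONE_DAY;
    return grep { $_->{use_by} <= $deadline } @items;
}

my $now = Time::Piece->strptime("2004-06-01", "%Y-%m-%d");
my @larder = (
    { name => "Cheese", use_by => Time::Piece->strptime("2004-06-02", "%Y-%m-%d") },
    { name => "Flour",  use_by => Time::Piece->strptime("2004-09-30", "%Y-%m-%d") },
);

print join(", ", map { $_->{name} } ripe($now, @larder)), "\n";   # prints "Cheese"
```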

To turn this method into an action that can be called from the web, we need to create an “Exported” method which places these objects into the Maypole request object:

    sub must_eat :Exported {
        my ($self, $r) = @_;
        $r->{objects} = [ $self->ripe_food ];
    }

We create a template in contents/must_eat, and now we can view these items from the URL /contents/must_eat.

In order to suggest recipes which use these up, we need to link ingredients to recipes.

Categories and Ingredients

Our data loader last month looked like this:

    use Larder;
    use XML::Simple;
    use File::Slurp;
    for my $recipe (<xml/*>) {
        my $xml = read_file($recipe);
        my $structure = XMLin($xml, ForceArray => 1)->{recipe}->[0];
        my $name = $structure->{head}->[0]->{title}->[0];
        my @ingredients = @{$structure->{ingredients}[0]{ing}};
        my @categories = @{$structure->{head}[0]{categories}[0]{cat}};
        Larder::Recipe->find_or_create({
            name => $name,
            xml => $xml
        });
    }

Now we’re going to extend this, and our Larder class, to support the linkages between recipes and categories, and recipes and their ingredients. Class::DBI makes it easy for us to do this: we tell it the name of the accessor we want, the name of the mapping class, and the name of the accessor in that class which returns what we want.

In our case, we first have to tell Class::DBI about the relationship between the ingredient table and the food table.

    Larder::Ingredient->has_a(food => "Larder::Food");

and vice versa:

    Larder::Food->has_many(ingredients => "Larder::Ingredient");

So, in our case, we want the Larder::Recipe class to have an accessor called ingredients, which uses Larder::Ingredient to get a list of ingredients for a recipe and calls food on each one to return a Larder::Food object. The code for that looks like this:

    Larder::Recipe->has_many(ingredients => [ Larder::Ingredient => "food" ]);

Similarly, we can get from a Larder::Food object to recipes containing it:

    Larder::Ingredient->has_a(recipe => "Larder::Recipe");
    Larder::Food->has_many(recipes => [ Larder::Ingredient => "recipe" ]);

And all the same for categories. Of course, we could do this a little more easily, using my module Class::DBI::Loader::Relationship. This uses Class::DBI::Loader (which Maypole also uses; this is no coincidence) to express the relationships between classes in more natural terms. It’s not as powerful, but it is easier to remember:

    Larder->config->{loader}->relationship(
        "A recipe has categories via categorizations"
    );

With our relationships set up, we can tell the loader to associate a recipe with its categories and ingredients. Here $recipe is the Larder::Recipe object returned by find_or_create, rather than the file name:

    for (@categories) {
        my $cat = Larder::Category->find_or_create({ name => $_ });
        $recipe->add_to_categories({ category => $cat });
    }

    for (@ingredients) {
        my ($ing, $amt) = ($_->{item}[0], $_->{amt}[0]);
        my $ingredient = Larder::Food->find_or_create({ name => $ing });
        my $quantity = $amt->{qty}[0] . " " . $amt->{unit}[0];
        $recipe->add_to_ingredients({ food => $ingredient,
                                      quantity => $quantity });
    }

Now we can create a template which suggests some recipes to go with our moribund ingredients:

    <h1> You need to use up some food! </h1>

    [% FOR content = contents %]
        <h2> [% content.food.name %] </h2>

        <P> Needs to be used up by [% content.use_by %] </P>

        <P>
        Some recipes you could make with this:
        </P>
        <UL>
            [% FOR recipe = content.food.recipes %]
            <LI> <A HREF="/recipe/view/[% recipe.id %]"> [% recipe %]</A>
            [% END %]
        </UL>
    [% END %]

Of course, these aren’t necessarily the best recipes for the job; ideally we’re going to find the recipes which use up as many of the ingredients as possible. We can’t do this with a plain database search. My initial plan was to gather up all the potential recipes, and then score them based on the ingredients that they use and the immediacy of getting rid of the ingredient.

But then an even more interesting technology came along.

Plucene

Plucene is a Perl port of the Java Lucene search engine, a Jakarta project (http://jakarta.apache.org/lucene/). Rather than being a standalone search tool, it is a library with which you can construct your own indexing and searching tools. The easiest interface to it is through the Perl module Plucene::Simple, which we’ll use for indexing the recipes.

Plucene works in terms of “documents”, which are a little like pages in a book. When you’re building an index to a book, the index will relate a word or phrase from the page (the index term) to a page number - it doesn’t directly relate the word to the entire contents of the page, or the index would be exponentially longer than the book itself! Instead, the reader is responsible for turning the page number into the original page contents. If we’re indexing a book with Plucene, we might create documents like this:

    my @documents = (
        1 => {
            chapter_title => "Preface",
            text => "We have many emotions as we …"
        },
        3 => {
            chapter_title => "Ethos",
            text => "We have made fundamental assumptions  …"
        },
        …
    );

We create a Plucene::Simple object which represents the index:

    use Plucene::Simple;
    my $index = Plucene::Simple->open("/home/simon/ow_book/index");

And we can add the documents:

    $index->add(@documents);
    $index->optimize;

The optimization step “defragments” the index once we’ve finished adding a lot of data at once.

To run a search, we open the index again and call its search method:

    use Plucene::Simple;
    my $index = Plucene::Simple->open("/home/simon/ow_book/index");
    my @results = $index->search("we");

Now a search for “we” would return 1 and 3, and we could narrow down our search by looking for “we chapter_title:Ethos” to return only page 3. It’s assumed that we have an easy way of turning the ID, “3”, into the full text of the book’s page.
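If the idea of an index relating words to page numbers seems abstract, a toy version fits in a few lines of plain Perl. This is only an illustration of the concept and has nothing to do with how Plucene stores its data; build_index and lookup are hypothetical names, and the page text is shortened from the example above.

```perl
use strict;
use warnings;

# Toy inverted index: maps each lower-cased word to the IDs of the
# documents that contain it. The index stores IDs, never the page text.
my %documents = (
    1 => "We have many emotions",
    3 => "We have made fundamental assumptions",
);

sub build_index {
    my (%docs) = @_;
    my %index;
    for my $id (keys %docs) {
        $index{ lc $_ }{$id} = 1 for $docs{$id} =~ /\w+/g;
    }
    return \%index;
}

# Return the sorted IDs of the documents containing a word
sub lookup {
    my ($index, $word) = @_;
    return sort { $a <=> $b } keys %{ $index->{ lc $word } || {} };
}

my $index = build_index(%documents);
print join(", ", lookup($index, "we")), "\n";            # prints "1, 3"
print join(", ", lookup($index, "assumptions")), "\n";   # prints "3"
```

As with the real thing, all we get back is a list of IDs; turning an ID into the page is somebody else's job.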

However, we’re not indexing a book, but a set of recipes. In this case, our documents are going to look like this:

    52 => {
        title => "Aioli",
        categories => "Salads Condiment Classic",
        ingredients => "Garlic Mayonnaise",
    }

Where “52” is the ID of the recipe in the recipe table - we can, of course, look this up again by doing Larder::Recipe->retrieve(52).

We’re going to skip the process of indexing the directions part of the recipe, since we’re only interested in searching for particular ingredients at the moment.

So, once again we edit our recipe loader script, which currently has this at the end of the loop:

        Larder::Recipe->find_or_create({
            name => $name,
            xml => $xml
        });

We need to keep hold of that recipe’s ID, and build up our hash of things to index:

    my $recipe = Larder::Recipe->find_or_create({
        name => $name,
        xml => $xml
    });
    my $hash = {
        title => $name,
        categories => (join " ", @categories),
        ingredients => (join " ", map { $_->{item}[0] } @ingredients)
    };
    $index->add( $recipe->id => $hash );

Finally, we optimize the index outside of the loop, once everything has been added:

    $index->optimize;

Now we have, in addition to our database of recipes, an index by which we can look for entries in that database. How does this help us find good recipes for food that’s going off?

Locating the best recipe

Once our index has been built up, we can now start searching for recipes by their ingredients, and Plucene automatically makes sure that those recipes which match “better” - that is, more of the ingredients - are returned first. So, for instance, if we have some carrots, bacon and mushrooms to use up, we can create a simple test search script like this:

    use Plucene::Simple;
    use Larder;
    my $index = Plucene::Simple->open("pl_index");
    for ($index->search("carrots bacon mushrooms")) {
        my $r = Larder::Recipe->retrieve($_);
        print $r->name, "\n", join ", ", map { $_->name } $r->ingredients;
        print "\n\n";
    }

And we’ll be given a list of recipes which contain any of those ingredients, but starting with the best matches:

    24 Hour Vegetable Salad
    Iceberg lettuce, Mushrooms, Peas, Carrots, Egg whites, Bacon,
    Cheese, Fat-free mayonnaise, Lemon Juice

    Beef Burgundy
    Mushrooms, Onions, Butter, Bacon, Sirloin Steaks, Flour,
    Burgundy, Beef Broth, Bay leaf, Garlic, Ground Thyme, Carrots, 
    Salt And Pepper, Noodles, Chopped Parsley

    Bacon Supper Snack
    Gammon, Tomatoes, Stuffing, Butter, Mushrooms, Soft White
    Bread Crumbs, Salt and pepper, Mixed Herbs, Egg

    All-In-One-Breakfast
    Whole Wheat Bread, Butter, Mushrooms, Tomatoes, Cheese, Bacon

    …

The last two recipes shown here (and there were many more) don’t contain all three ingredients, but do contain two; as we carry on down the list, we get less and less specific.

What we’re doing, then, is using the built-in scoring techniques of a standard search engine to find the best recipes for us - web search engines are all about finding the most appropriate pages relating to the user’s terms; we’re using the same mechanism to find the most appropriate lunch relating to what’s in the cupboard.
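To see why "more matching ingredients" floats a recipe to the top, here is a deliberately naive ranking in plain Perl. It simply counts matches, which is an assumption made for the sake of the example; Plucene's real scoring is more sophisticated. The rank function is hypothetical and the ingredient lists are abbreviated.

```perl
use strict;
use warnings;

# Toy ranking: count how many wanted ingredients each recipe contains,
# then list the recipes with the most matches first.
my %recipes = (
    "24 Hour Vegetable Salad" => [qw(Lettuce Mushrooms Peas Carrots Bacon Cheese)],
    "All-In-One-Breakfast"    => [qw(Bread Butter Mushrooms Tomatoes Cheese Bacon)],
    "Aioli"                   => [qw(Garlic Mayonnaise)],
);

sub rank {
    my ($recipes, @wanted) = @_;
    my %want = map { (lc($_) => 1) } @wanted;
    my %score;
    for my $name (keys %$recipes) {
        # grep in scalar context gives the number of matching ingredients
        $score{$name} = grep { $want{ lc $_ } } @{ $recipes->{$name} };
    }
    # Best matches first; recipes with no matches at all are dropped
    return sort { $score{$b} <=> $score{$a} || $a cmp $b }
           grep { $score{$_} } keys %score;
}

print "$_\n" for rank(\%recipes, qw(Carrots Bacon Mushrooms));
# prints "24 Hour Vegetable Salad" and then "All-In-One-Breakfast"
```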

Now we can turn our “must eat” page into a page which helps us search for the best recipes to eat:

    sub must_eat :Exported {
        my ($self, $r) = @_;
        my $index = Plucene::Simple->open("recipe_index");
        my @ripe = $self->ripe_food;
        $r->{objects} = \@ripe;
        # Quote each food name so that it is searched for as a phrase
        my @terms = map { '"' . $_->food->name . '"' } @ripe;
        my @results = map { Larder::Recipe->retrieve($_) }
                $index->search(join " ", @terms);
        $r->{template_args}{recipes} = \@results;
        $r->{template_args}{highlight} = { map { $_->food->name => 1 } @ripe };
    }

Notice that we surround our ingredient names in double quotes - Plucene understands the concept of phrase matches, as one would expect from a search engine: “fish fingers” searches for recipes containing fish fingers, whereas fish fingers will search for recipes which make use of both fish and long pig. When we’ve retrieved the results from our search engine and turned them into recipe objects, we add them to our set of arguments to the template. We also add a hash of the ingredient names we’re looking for - this will help us highlight the ingredients when we’re producing a summary of the recipe. So, for instance, we want our page to look like this:

    [Figure: suggestions.png]

The associated template would go as follows:

    <h2> You need to use up some food! </h2>
    <P>
    The following food is getting a bit ripe:
    </P>
    <UL>
    [% FOR content = contents %]
        <LI> [% content.food.name %]
    [% END %]
    </UL>

    <H2> Suggested recipes </H2>
    <P>
    These recipes will help you use up those ingredients:
    </P>

Now we look at each recipe in our search results:

    [% FOR recipe = recipes %]
        <h3> <A HREF="/recipe/view/[%recipe.id%]"> [% recipe.name %] </a> </h3>
        Requires:
        <p>
        [% FOR ingredient = recipe.ingredients;
           SET name = ingredient.name;
        

And now for each ingredient, we can show their names, and check whether or not to highlight them:

           '<span class="searchresult">' IF highlight.$name;
           name;
           '</span>' IF highlight.$name;
           END;
        %]
        </p>
    [% END %]

Hooray - now we not only know which recipes will use up the dying ingredients, but also which ones will include the most of them at once. There’s one final touch we can add to our application before we head off to the kitchen - a sense of urgency.

If something needs to be used by today, we want a recipe which uses it today. Let’s change ripe_food so that it returns us a hash of ingredients and a score representing the need to eat them:

    sub ripe_food {
        my $class = shift;
        my $deadline = localtime() + 5 * ONE_DAY;
        # Pair each object with a score: the sooner it expires, the higher the score
        map { ($_ => int(($deadline - $_->use_by) / ONE_DAY)) }
        grep { $_->use_by <= $deadline } $class->retrieve_all;
    }

Now if we have some cheese that really needs to be eaten today and some ham that has two days to go, we get

    ( Cheese => 5, Ham => 3 )

We want Plucene to score up recipes which contain cheese, relative to those which contain ham. We can do this using a “boost” factor in the search term. Plucene allows us to search for "Cheese"^5 "Ham"^3 - now it tries to find recipes which have both ham and cheese in, then those which contain cheese, then those which contain ham. With a list of ten or twenty ingredients to get rid of, this is more or less guaranteed to give us recipes which use up the widest range of the most desperate ingredients first, giving us the most economical ways to clean out our cupboards. We’ll need to modify the must_eat action to understand the list returned:

    sub must_eat :Exported {
        my ($self, $r) = @_;
        my $index = Plucene::Simple->open("recipe_index");
        my @ripe = $self->ripe_food;
        $r->{objects} = [];
        my @terms;
        while (my ($obj, $score) = splice(@ripe, 0, 2)) {
            my $name = $obj->food->name;
            push @{$r->{objects}}, $obj;
            push @terms, '"' . $name . '"^' . $score;
            $r->{template_args}{highlight}{$name}++;
        }

        my @results = map { Larder::Recipe->retrieve($_) }
                $index->search(join " ", @terms);
        $r->{template_args}{recipes} = \@results;
    }

We’re using a list of pairs instead of a real hash, because the “keys” are objects, and Perl doesn’t let us use objects as hash keys - they store just fine, but as they are stored, they get stringified and we can’t use them as objects again when we retrieve from the hash. The technique of using

    while (my ($key, $value) = splice(@list, 0, 2)) {

where you’d normally expect

    while (my ($key, $value) = each %hash) {

is quite a common one you can use where you’d like to use objects as hash keys.
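Both halves of that argument are easy to check for yourself. In this sketch the Food class is a hypothetical stand-in for a database row object, and boosted_query is an invented helper that builds the kind of search string described earlier from a flat list of pairs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a row object with a name accessor
package Food;
sub new  { my ($class, $name) = @_; return bless { name => $name }, $class }
sub name { $_[0]{name} }

package main;

# Used as a hash key, the object is stringified and cannot be called again
my $cheese = Food->new("Cheese");
my %scores = ($cheese => 5);
my ($key) = keys %scores;
print ref($key) ? "object\n" : "plain string\n";   # prints "plain string"

# A flat list of (object, score) pairs keeps the objects intact
sub boosted_query {
    my @pairs = @_;
    my @terms;
    while (my ($obj, $score) = splice(@pairs, 0, 2)) {
        push @terms, '"' . $obj->name . '"^' . $score;
    }
    return join " ", @terms;
}

print boosted_query($cheese, 5, Food->new("Ham"), 3), "\n";
# prints: "Cheese"^5 "Ham"^3
```

The loop ends naturally because splice returns an empty list once the pairs run out, and a list assignment from an empty list is false.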

With this in place, Plucene scores each ingredient according to its freshness, in a nice simple way which frees us from having to think up a complicated algorithm to do the job. That’s code re-use! And now, what’s in the fridge?

Happy cooking!

I hope you’ve enjoyed our two-part foray into cooking with Perl; we’ve covered quite a lot of ground on the way. This time we’ve particularly focussed on Maypole, and showed how to turn it from a simple front-end to databases into a web application framework on which to base more complex applications. We’ve also taken a look at Plucene, a pure-Perl search engine which allows us to index and search through all kinds of data - including recipes!

If you want to find out more about Maypole, there’s a growing set of documentation at http://maypole.simon-cozens.org/docs/, including a large manual with several examples of real-life applications. Plucene can be downloaded from CPAN, and there’s a longer introduction to it at http://www.perl.com/pub/a/2004/02/19/plucene.html. Finally, for a load of great recipes in RecipeML, try http://dsquirrel.tripod.com/recipeml/indexrecipes2.html.

Bon appetit!

