Rails caching and my projects

After posting about my projects the other day, someone pointed out that my site was “broken”. So I spent quite a while investigating the performance issues. The site is hosted on Dreamhost, and set up through FastCGI – but it seems that Dreamhost’s FastCGI configuration is not really optimal for Rails applications.

I made a minor fix by configuring the Rails FastCGI dispatcher to garbage-collect at regular intervals. I also found that some incorrect file permissions seemed to be preventing Typo (this blog engine) from caching fragments, so it was going to the database for every request. This is bad on Dreamhost because the databases are on separate machines elsewhere on their network.
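
For reference, the change is a one-liner in public/dispatch.fcgi. If I’ve remembered the Rails 1.x handler’s signature right, the second argument is the number of requests to serve between garbage-collection runs (50 is an arbitrary choice):

    #!/usr/bin/env ruby
    require File.dirname(__FILE__) + "/../config/environment"
    require 'fcgi_handler'

    # Run GC every 50 requests rather than leaving it entirely to Ruby.
    RailsFCGIHandler.process!(nil, 50)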

Given these problems, I’ve realised I need to implement caching in Giggle 2 before I make it live. So I’ve been doing a lot of thinking about caching strategies. There are three basic types of caching in Rails (there’s a short sketch of all three after the descriptions):

Page caching – This is the best for performance, but it comes at a heavy price. It means writing out generated pages as static .html files, which are then served directly by Apache without hitting Rails at all. The downside is that it is difficult to clean up the cache (i.e. delete the files).

Action caching – Similar to Page caching in that it caches a whole page, but it still runs the controller’s filters (authentication checks and the like) before serving the cached copy. I don’t think it’s useful in the cases I’ve been thinking about, and it is built on Fragment caching anyway.

Fragment caching – This means taking parts of pages – ‘fragments’ – and storing them. This doesn’t give such a performance boost, since Rails still has to do some evaluation and rendering, but it is the most controllable. By caching the right parts of pages you can avoid repeating database accesses and rendering, while retaining dynamic elements elsewhere in the page.
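
To make those concrete, here’s a minimal sketch of the first two in a controller (the controller, action, and filter names are made up for illustration):

    class ArticlesController < ApplicationController
      # Page caching: the rendered page is written to public/ as a
      # static .html file and served by Apache from then on.
      caches_page :show

      # Action caching: the whole page is cached, but before filters
      # (here an authentication check) still run on every request.
      before_filter :authenticate, :only => :drafts
      caches_action :drafts
    end

Fragment caching is done in the view, wrapping the expensive part of the template (I believe a plain string works as the cache key):

    <% cache('recent_articles') do %>
      <%= render :partial => 'recent_articles' %>
    <% end %>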

This blog runs on Typo, which has its own extended action caching behaviour based on Fragment caching. The extension stores some extra information about the page, including a ‘lifetime’, which is how long the cached page will be kept. Since a Typo page is the combination of articles and sidebar components, the page’s lifetime is the shortest lifetime requested by any component on it. (For this site I edited the ‘del.icio.us’ and ‘upcoming’ components to have lifetimes of 1 day, so pages can be cached for longer.)

Ordo Acerbus uses Page caching for nearly all the public pages – in fact I designed it to do so right from the start. It works because there is nothing dynamic other than the page content, and no logged-in users other than the admin (just me). The only downside is that it deletes the whole cache whenever an article is added, removed, or moved, since every page has the menu on it. (Note to self: I could improve on this by either having a small time delay before deleting the cache, or by requiring a manual delete of the cache, since I often work on several articles at once.)
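
The wipe-everything expiry is simple enough to express as a Rails sweeper. This is just a sketch of the idea – the class name and the cache path are hypothetical stand-ins for what Ordo Acerbus actually does:

    require 'fileutils'

    # Observes Article and blows away all the page-cached files whenever
    # one changes, since the menu appears on every page.
    class ArticleSweeper < ActionController::Caching::Sweeper
      observe Article

      def after_save(article)
        expire_all_pages
      end
      alias_method :after_destroy, :after_save

      private

      def expire_all_pages
        # Page-cached files live under public/ by default; deleting the
        # cached subtree forces regeneration on the next request.
        FileUtils.rm_rf("#{RAILS_ROOT}/public/articles")
      end
    end

(Hooked up in the controller with cache_sweeper :article_sweeper.)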

Giggle 2 is going to be a bit more difficult to come up with a caching strategy for, but if I break it down I think I can do something useful.

The first problem is that it can be viewed by users who aren’t logged in, as well as those who are. It would be nice to use Page caching for the public view, but also use the same URLs for the logged-in view. Fortunately a task I did at work a while back involved configuring Apache rewrite rules to check a cookie – so I know the Apache configuration that deals with Page caching can be altered to check whether the user is logged in by looking at their cookies. (It might need an extra cookie, but it is doable.)
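
Roughly, the rewrite rules would look like this, based on the standard Rails page-caching pattern – the cookie name is an assumption, and whatever Giggle’s login actually sets would go there:

    # Serve the page-cached .html file only if there is no login cookie
    # and the cached file actually exists; otherwise fall through to Rails.
    RewriteCond %{HTTP_COOKIE} !logged_in [NC]
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
    RewriteRule ^(.*)$ $1.html [QSA,L]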

The second problem is the ‘logged-in’ views themselves. But I can break each page into two fragments – the footer is constant for any given user, while most of the page bodies are the same regardless of who is viewing them. So I can use Fragment caching for these parts.
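
In the templates that would look something like this – current_user, @gig, and the partial names are all placeholders:

    <%# The body is the same for every viewer, so one shared fragment. %>
    <% cache('gig/' + @gig.id.to_s) do %>
      <%= render :partial => 'gig_details' %>
    <% end %>

    <%# The footer only varies per user, so key the fragment on the user. %>
    <% cache('footer/' + current_user.id.to_s) do %>
      <%= render :partial => 'footer' %>
    <% end %>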

The worst problem is the ‘search’ page. Since Giggle has a fairly general search, there isn’t a good way of telling whether a data update would invalidate some cached search result. On the other hand it would be a useful performance gain if I could stop repeated searches from hitting the database every time, since searching is a fairly intensive operation. Since the data is not that crucial, I think I can accept out-of-date search results persisting for a few minutes if that helps.

I want to cache searches that are popular, i.e. happen repeatedly, which suggests an LRU cache would be appropriate. I’m not sure if there is one suitable for Rails, so I may have to implement one. When a data update occurs that would invalidate the search cache, cached results will be allowed to ‘live’ for a few more minutes, since the way the application is used suggests that several data updates are likely to occur together.
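
If I do end up writing it myself, the core is small enough. A minimal sketch of what I have in mind – plain Ruby, nothing Rails-specific, and all the names are mine:

    # An LRU cache for search results with a grace period on invalidation:
    # instead of dropping entries immediately, stale results are allowed
    # to live on for a few more minutes.
    class SearchCache
      def initialize(max_size = 100, grace_period = 5 * 60)
        @max_size = max_size
        @grace_period = grace_period   # seconds a stale entry may live on
        @entries = {}                  # query => [results, expires_at]
        @order = []                    # queries, least recently used first
      end

      # Return cached results for the query, or run the block and cache it.
      def fetch(query)
        entry = @entries[query]
        if entry && (entry[1].nil? || entry[1] > Time.now)
          touch(query)
          entry[0]
        else
          results = yield
          store(query, results)
          results
        end
      end

      # On a data update, mark every entry to expire after the grace
      # period rather than deleting it, since updates come in batches.
      def invalidate
        now = Time.now
        @entries.each_value { |entry| entry[1] ||= now + @grace_period }
      end

      private

      def store(query, results)
        @entries[query] = [results, nil]   # nil expiry = valid indefinitely
        touch(query)
        evict while @order.size > @max_size
      end

      def touch(query)
        @order.delete(query)
        @order << query
      end

      def evict
        @entries.delete(@order.shift)
      end
    end

Usage would be along the lines of results = SEARCH_CACHE.fetch(params[:q]) { Gig.search(params[:q]) }, with invalidate called from the data-update actions.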
