Caching in CMS Made Simple

What is a Cache?

Wikipedia has a very good explanation of caching and how specifically it relates to computing at
http://en.wikipedia.org/wiki/Cache_(computing)

In computer science, a cache is a component that transparently stores data so that future requests for that data can be served faster. The data that is stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere. If requested data is contained in the cache (cache hit), this request can be served by simply reading the cache, which is comparatively faster. Otherwise (cache miss), the data has to be recomputed or fetched from its original storage location, which is comparatively slower. Hence, the greater the number of requests that can be served from the cache, the faster the overall system performance becomes.
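
The hit and miss behaviour described above can be sketched in a few lines of PHP. This is only an illustration: the function names (expensive_lookup, cached_lookup) are made up for this example and are not part of CMSMS.

```php
<?php
// Hypothetical stand-in for a slow computation or database query.
function expensive_lookup(int $id): string
{
    return 'value-' . ($id * 2);
}

// Serve from the cache on a hit; compute and store the value on a miss.
function cached_lookup(int $id, array &$cache, int &$misses): string
{
    if (array_key_exists($id, $cache)) {
        return $cache[$id];              // cache hit: no recomputation
    }
    $misses++;                           // cache miss: do the slow work once
    $cache[$id] = expensive_lookup($id);
    return $cache[$id];
}

$cache = [];
$misses = 0;
$first  = cached_lookup(21, $cache, $misses);  // miss, value is computed
$second = cached_lookup(21, $cache, $misses);  // hit, value is read back
```

After both calls the slow function has run only once; the second request was served entirely from the cache.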

Any data can be cached for later use, but there are numerous factors that need to be evaluated and weighed when making the decision as to whether data should be cached, and if so, what type of caching, and how it should be used. Some of those factors include:

  • The type of data, and its dependencies

    Is there a large amount of data to cache, or can it grow large? Is the data specific to a user, to a group of users, or to all users?

  • The performance of extracting the data from its original source

    When evaluating whether to cache some data, the developer must look at where the bottleneck exists. Is it in extracting the data from the database? Is it in a calculation stage? Or is the performance problem you are attempting to solve due to the transmission of data from the server to the client (in the case of online applications)? This knowledge will assist in determining what type of cache can be used.

  • Frequency of change

    When and how often does the data that will be cached change? If it changes monthly, weekly, or even hourly, it can often be worthwhile to cache the information. However, if the data changes on each request, then it may not be appropriate to cache the information at all.

  • Frequency of use

    Related to the above point, caching data is only useful if the cached data will be used more than once before it has to be refreshed. If the cached data is rarely requested, there may be no point in caching it. For example, some very expensive data calculations can be cached, but if the results are only going to be used once a month by one specific user, is there really a need?

  • Security, laws, privacy, and ethics

    When determining what data to cache, where, and for how long, it is important to analyze the data that would be stored with respect to security, laws, privacy, and ethics, particularly when the data is of a personal nature.

    For example, when building an e-commerce website you may think it useful to remember the user's address and credit card information for repeated purchases. However, due to privacy laws and ethical concerns you have to decide whether it is legal and ethical to store that data, and if so, how it can be stored securely. This decision requires an intimate knowledge of the computer technologies in play (databases, sessions, cookies, encryption, etc.), along with the laws that apply to your application's user base.

  • Storage limitations

    When deciding what caching mechanism to use, it is important to consider the limitations of the storage mechanism involved. For example, with a memory cache, the data already cached in the system plus the data you wish to store may exceed the size limit allocated to the cache. This is also a consideration for database and file storage mechanisms, but is usually less of a concern.

  • Availability (for web software: is the cache mechanism available on all hosts?)

    Often when programmers are writing re-usable code they are tempted to use a caching technology that is cool and interesting, and that will solve many of their concerns. However, it may rely on a PHP extension or other software that is not available on many shared hosts.

  • Complexity of implementation

Types of Cache Implementation

Caching in Memory

Numerous technologies exist for allowing software to cache data in system memory. Caching in system memory is useful because it can be extremely fast compared to other methods. Memcached and APC are two utilities that allow memory caching for PHP-based web applications. Memcached has the added feature that it allows cached data to be shared across numerous physical hosts. CMSMS does not use these for a few reasons:

  • They are not universally available on shared hosts.
  • They operate using "shared" memory and have a fixed capacity amongst all applications that are using them.
  • The implementation of some technologies requires opening a socket and sending and receiving data, which may be slower than other technologies in certain circumstances.

Caching in Files

Perhaps the most common way of caching is to create a file in which the calculated or extracted data is kept. CMSMS does this for a number of its caches. The tmp/cache and tmp/templates_c directories (explained below) are the most frequently used, but modules may use other directories.
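
As a rough illustration of file caching in general, the sketch below stores serialized data in a file and reuses it until it is older than a given number of seconds. The helper name file_cache_get, the file naming scheme, and the demo directory are assumptions made for this example; this is not the code CMSMS itself uses.

```php
<?php
// Return cached data if the cache file exists and is younger than $ttl
// seconds; otherwise call $producer, store its result, and return it.
// Note: a stored value of boolean false is treated as a miss in this sketch.
function file_cache_get(string $dir, string $key, int $ttl, callable $producer)
{
    $file = $dir . '/' . md5($key) . '.cache';
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        $data = @unserialize(file_get_contents($file), ['allowed_classes' => false]);
        if ($data !== false) {
            return $data;                      // cache hit
        }
    }
    $data = $producer();                       // cache miss: rebuild the data
    file_put_contents($file, serialize($data), LOCK_EX);
    return $data;
}

// Usage with a throwaway directory (hypothetical data).
$dir = sys_get_temp_dir() . '/cache_demo_' . uniqid();
mkdir($dir);
$calls = 0;
$producer = function () use (&$calls) {
    $calls++;
    return ['pages' => 3];
};
$a = file_cache_get($dir, 'page_tree', 3600, $producer);
$b = file_cache_get($dir, 'page_tree', 3600, $producer);
```

The second call reads the file written by the first, so the producer runs only once.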

Caching in the Database

It is possible to cache data in the database as well. This may be useful if the cached data must persist between requests, or for an extended period of time. However, storing data in the database is usually considered slower than memory or file caching.

Caching in the Session

The "session" is a mechanism whereby applications can store user-related data between HTTP requests. This allows storing state information (such as the contents of the user's cart). Sessions usually expire automatically after a set period of inactivity or when the user's browser is closed. PHP allows the actual storage mechanism for sessions to be implemented in a variety of ways, including in memory (see Memcached or APC), in the database, or, by default, in files.

Frontend requests for the CMSMS core do not use the session to cache any information. However, the admin interface and numerous third-party modules cache information in the session.

Caching in Cookies

"Cookies" are user-specific pieces of data that are stored on the client's local machine and are transmitted to the server on each page request. They are specific to a web domain, and can be cleared automatically at a set time or when the user closes the browser. Applications can cache any user-specific information in them, but typically only small pieces of data are stored, to reduce transmission times and limit security exposure. By default, CMSMS uses a cookie to transmit a unique session identifier. CMSMS also uses cookies to store some information for admin actions. Modules may use cookies for storing personal information such as addresses.

Cached data in CMSMS

CMS Made Simple caches several kinds of data in order to improve performance and the user experience. Some of these caches are explained below. Third-party modules also cache different types of data.

Content Page Structure Cache

CMSMS gathers the page tree structure (but not the page content) and caches it in a file. This information is used for rapid menu generation on frontend requests. It is updated each time a content page is changed.

Module Metadata

Module dependencies, tags, and other information are cached so that when a Smarty tag is encountered, the module and the modules it depends on can be loaded into memory on demand. This conserves memory and increases performance because a module does not need to be loaded if it is not needed on a particular page request. This metadata cache is re-generated whenever a module is installed, removed, or upgraded.

Menu Manager Cache

If a Menu Manager template is marked as cachable, and the page that it is called from is cachable, then the HTML output from the Menu Manager calls can be cached. This significantly decreases the memory requirements and the number of database calls required to build a page, and therefore significantly improves performance. This cache is re-generated whenever a page is changed or the Menu Manager template is changed.

Stylesheet Caches

CMS Made Simple allows attaching numerous stylesheets to a page template, and thereby to one or more content pages. The {cms_stylesheet} tag is responsible for generating HTML code that will result in the stylesheets being downloaded to the user's browser. In order to reduce the number of requests from a browser to the server to render a page, and to take advantage of browser caching, this tag also does some advanced processing of the stylesheets. It generates uniquely named CSS files which are then returned to the browser and may be cached by the browser. New CSS files are generated each time a stylesheet or stylesheet association is changed.
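
One common way to get "a new file name whenever the content changes" is to derive the name from a hash of the stylesheet contents. The sketch below shows that general idea only. The function name and the naming scheme are assumptions made for illustration; the source does not state how {cms_stylesheet} actually names its files.

```php
<?php
// Hypothetical naming scheme: the file name depends only on the contents,
// so unchanged stylesheets keep their name (and stay cached in the browser),
// while any change produces a new name that the browser must fetch.
function stylesheet_cache_name(array $stylesheets): string
{
    return 'stylesheet_combined_' . md5(implode("\n", $stylesheets)) . '.css';
}

$before = stylesheet_cache_name(['body { color: black; }']);
$same   = stylesheet_cache_name(['body { color: black; }']);
$after  = stylesheet_cache_name(['body { color: navy; }']);
```

Here $before and $same are identical, while $after differs because the stylesheet text changed.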

Smarty Compiler Cache

The Smarty template engine takes template code and "compiles" it into PHP code for processing on the server. This compiled PHP code is cached on the server so that if a template has not changed, it does not need to be recompiled.

Smarty HTML Cache

Smarty is capable of caching the output of a specific compiled template to improve performance even further. The generated HTML code is cached in a unique file for each page template, module template, or global content block (GCB). These caches are re-generated after an hour.

Influencing Smarty Caching in CMSMS

Do a Compilation Check

Though there is no ability to control whether or not Smarty template compilation is cached, there is an option in the "Global Settings" page to indicate whether Smarty should check for changed templates. If the "compile check" option is disabled, Smarty (and therefore CMSMS) will not check for changes in templates.

Disabling the "compile check" option can be useful to improve performance for websites that do not change frequently. However, after each template change the administrator will either have to manually clear the entire CMSMS cache, or wait for the compiled files to be automatically removed.
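
The trade-off can be illustrated with a small sketch of a timestamp-based compile check. This is a simplified model written for this document, not Smarty's actual implementation; the function name needs_recompile and the file names are hypothetical.

```php
<?php
// Decide whether a template must be recompiled.
// With the compile check disabled, an existing compiled file is always
// trusted, even if the template source has changed since.
function needs_recompile(string $source, string $compiled, bool $compileCheck): bool
{
    clearstatcache();                // make sure timestamps are fresh
    if (!is_file($compiled)) {
        return true;                 // nothing compiled yet
    }
    if (!$compileCheck) {
        return false;                // skip the timestamp comparison entirely
    }
    return filemtime($source) > filemtime($compiled);
}

// Demo with throwaway files: the template is newer than its compiled copy.
$dir = sys_get_temp_dir() . '/compile_demo_' . uniqid();
mkdir($dir);
$source   = $dir . '/page.tpl';
$compiled = $dir . '/page.tpl.php';
touch($compiled, time() - 60);       // compiled a minute ago
touch($source, time());              // template edited just now
$withCheck    = needs_recompile($source, $compiled, true);
$withoutCheck = needs_recompile($source, $compiled, false);
```

With the check enabled the edited template is detected; with it disabled the stale compiled file keeps being used until it is removed.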

Cachable Content Pages

CMS Made Simple has the option to control whether the HTML output from the compiled Smarty templates can be cached and re-used on multiple requests. This option can be controlled by the "Cacheable" option on the edit form for each page. If enabled, the output from the various templates on that page may be cached.

It may be possible to allow caching on pages that generate dynamic data. Portions of the page can be cached while other portions are not, and Smarty will execute only the portions that are not cachable. For example, if a page is marked as cachable but calls a non-cachable tag in its page template, then that tag will be executed on each request while the other output may be cached. See below regarding Cachable and Non Cachable Plugins, and Module Call Caching.

Browser Cache Settings

CMS Made Simple has the option to allow the web browser to cache the contents of an entire web page. This is useful on static websites or where the delivery of content is not necessarily time-dependent. This option, along with the browser cache expiry period (how long the browser can cache the page for), is controlled on the advanced tab of the "Global Settings" page in the admin console. Browsers can only cache content pages that are marked as "cachable".


Server Cache Age

CMSMS has the ability to automatically remove files from its cache directory when they reach a certain age (in days). This can be useful for sites that are in flux (so that unnecessary files are automatically cleaned up), or in conjunction with the compile check option as mentioned above, provided there is no time-specific information in the compiled output.
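
An age-based cleanup of this kind can be sketched as follows. The function name purge_old_files and the demo file names are hypothetical, and the sketch is not the routine CMSMS uses internally.

```php
<?php
// Delete regular files in $dir whose modification time is more than
// $maxAgeDays days in the past. Returns the number of files removed.
function purge_old_files(string $dir, int $maxAgeDays): int
{
    $cutoff  = time() - ($maxAgeDays * 86400);
    $removed = 0;
    foreach (glob($dir . '/*') ?: [] as $file) {
        if (is_file($file) && filemtime($file) < $cutoff && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}

// Demo with a throwaway directory: one stale file, one fresh file.
$dir = sys_get_temp_dir() . '/purge_demo_' . uniqid();
mkdir($dir);
touch($dir . '/old.cache', time() - 10 * 86400);   // ten days old
touch($dir . '/new.cache');                        // created just now
$removed = purge_old_files($dir, 7);
```

Only the ten-day-old file is removed when the maximum age is seven days.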

Menu Manager Cache

Menu Manager, the module that builds navigations for CMSMS, has the ability to cache the output from specific Menu Manager calls. This only occurs if:

  • The content page currently being requested is cachable.
  • The menu manager template has been marked as cachable.
  • The nocache parameter has not been specified.

Enable Smarty Caching:

This option, located on the Advanced tab of the "Global Settings" page, controls whether Smarty caching is used on the site at all. Additionally, there are further options which can assist with controlling the caching of various plugins:


Cachable and Non Cachable Plugins

Smarty plugins can be either cachable or not cachable, depending upon the plugin's implementation. The rules are as follows:

  • All CMSMS supplied or third-party plugins in which the implementation function name begins with smarty_cms_ are not cachable.
  • All other CMSMS supplied or third-party plugins in which the implementation function name begins with smarty_ are cachable.
  • {content}, {content_image}, and {content_module} blocks in the page template are not cachable.

The page at Extensions >> Tags indicates which tags are cachable and which are not.
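
The two naming rules above can be expressed as a small function. This is a sketch for illustration; plugin_is_cachable is a hypothetical helper, and the example plugin names are made up.

```php
<?php
// Apply the naming rules to a plugin implementation function name.
// Returns true or false, or null if the name matches neither prefix.
function plugin_is_cachable(string $functionName): ?bool
{
    // Check the longer prefix first: smarty_cms_ also starts with smarty_.
    if (strpos($functionName, 'smarty_cms_') === 0) {
        return false;
    }
    if (strpos($functionName, 'smarty_') === 0) {
        return true;
    }
    return null;
}

$notCachable = plugin_is_cachable('smarty_cms_function_example');
$cachable    = plugin_is_cachable('smarty_function_example');
$unknown     = plugin_is_cachable('some_other_function');
```

The order of the checks matters, because every name that begins with smarty_cms_ also begins with smarty_.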

Global Content Blocks

Global content blocks (GCBs) are always cachable; however, you may call other plugins from within a block which are not cachable.

User Defined Tags

A single global option in the "Global Settings" page of the CMSMS admin console controls whether User Defined Tags (UDTs) can be cached.

Module Call Caching

Module caching can be influenced by an option in the "Global Settings" page. The options there allow specifying that all module calls can be cached, that none can, or that the module decides. Note that this does not influence the caching options for the Menu Manager as described above, as it does not use Smarty caching but its own mechanism (subject to change).

If set to "let the module decide", each module can decide whether its output is to be cached. A column in the module list under Extensions >> Modules indicates which module calls support caching.

Explicitly Disabling Caching

It is possible to explicitly disable caching in sections of your page (or other templates) by using the nocache attribute on the tag, or the {nocache} tag. For example, {News nocache} will disable caching (if otherwise allowed) for that call to the News module. Additionally, a template portion such as {nocache}{News}{/nocache} would have the same effect but allows some Smarty logic between the opening and closing tags.

This is particularly useful when logic is used in the template. For example:

{nocache}
{content block='Sidebar' assign='sidebar'}
{if $sidebar ne ''}<div id="sidebar">{$sidebar}</div>{/if}
{/nocache}

Special notes for Site developers

Due to limitations in Smarty 3, when caching for a page is enabled, special care must be taken to capture the output of a non-cached plugin. In order to capture the output of a plugin that does not cache, you need to use either the "capture" Smarty compiler tag or the "nocache" tag attribute. For example:

{capture assign='mycontent'}{content}{/capture}

or

{content assign='mycontent'}{$mycontent nocache}

For Module Programmers:

Module programmers can control caching in their actions by overriding the "AllowSmartyCaching" method in their module class. For example:

public function AllowSmartyCaching() { return TRUE; }

Additionally, the actions themselves can determine whether Smarty should use the information from its cache, with code similar to the following:

// Build a cache id that is specific to this module and this set of parameters.
$cache_id = '|'.$this->GetName().md5(serialize($params));
$compile_id = '';
$tpl = $this->GetDatabaseResource($template);
if( !$smarty->isCached($tpl,$cache_id,$compile_id) )
{
  // Nothing usable in the cache: do database work, and assign variables to smarty.
}
echo $smarty->fetch($tpl,$cache_id,$compile_id);
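
The cache id in the code above combines the module name with a hash of the action parameters, so each distinct set of parameters gets its own cache entry. The standalone sketch below shows that behaviour outside of CMSMS. The helper build_cache_id is hypothetical, and the parameter names are made up for illustration.

```php
<?php
// Same construction as in the action code, as a plain function.
function build_cache_id(string $moduleName, array $params): string
{
    return '|' . $moduleName . md5(serialize($params));
}

$five      = build_cache_id('News', ['number' => 5]);
$fiveAgain = build_cache_id('News', ['number' => 5]);
$ten       = build_cache_id('News', ['number' => 10]);

// serialize() preserves array order, so the same parameters given in a
// different order produce a different cache id.
$ab = build_cache_id('News', ['a' => 1, 'b' => 2]);
$ba = build_cache_id('News', ['b' => 2, 'a' => 1]);
```

If parameter order should not matter for a particular action, sorting the parameters by key (for example with ksort) before building the id is one option.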