Accessibility
 
 
Query Caching in ColdFusion
By: Daryl Banttari
Senior Consultant
Allaire Consulting

Scalability means never having to say, "We can't handle any more traffic."

Many projects start small: one ColdFusion Web server, one database server. Then the development team encounters one of two motivating factors--the need for more uptime (availability) or the need for more capacity (scalability). In this article, we discuss the latter, enhancing scalability through the use of query caching.

Improving scalability is about finding and removing bottlenecks that restrict the growth of a system. The most common bottlenecks for Web systems include:

  1. Insufficient network bandwidth.
  2. Insufficient CPU resources.
  3. Inability to get data to/from the database.

Solving each problem seems simple:

  1. Call the ISP.
  2. Add servers.
  3. Add more database server(s).

Unfortunately, the cost of adding database server(s), both in terms of money and administrative overhead, is very high. So, it seems that maximizing your existing investment in database hardware and software is warranted. Enter query caching.

Query caching is designed to accomplish two goals:

  1. Decrease the time between a page request and the page view.
  2. Reduce the amount of work generated for the database server for each page view.

Implementing query caching is very simple. For example, examine the query below, which might be used to retrieve a list of states for a <SELECT> list.

Before:

 

<CFQUERY Name="qStates" DataSource="#Request.DSN#">
SELECT StateCode
FROM States
ORDER BY StateCode
</CFQUERY>

After:

 

<CFQUERY Name="qStates" DataSource="#Request.DSN#"
    CachedWithin="#CreateTimeSpan(0,1,0,0)#">
SELECT StateCode
FROM States
ORDER BY StateCode
</CFQUERY>

What Just Happened?

You just empowered ColdFusion to hold on to the results of that query for up to an hour. ColdFusion will now stop repeatedly asking the database for the results of this query. In fact, ColdFusion will reuse the results of that query for up to an hour before asking the database for that result set again. The database is now relieved of the duty of fetching these rows and sorting them (which usually involves creating and dropping a temporary table) for each request to that page.

Before you get too excited, there are a few details to consider:

  • There is a limit to the number of queries you can cache. This number is configured in the ColdFusion Administrator under "Caching." In ColdFusion 4.01, you cannot set this number higher than 100. This limitation was removed in ColdFusion 4.5, but that is not an invitation to set the value to 30,000. Caching too many result sets will cause memory starvation and heavy virtual memory paging, negating the benefits.

  • If you have a dynamic query, such as "SELECT * FROM Catalog WHERE CatalogNumber = #val(FORM.CatalogNumber)#", each permutation of that query counts as one cached query. Therefore, query caching should only be used for commonly accessed result sets. The CFML Language Reference defines a distinct result set by stating "…the current query must use the same SQL statement, data source, query name, user name, password, and DBTYPE. Additionally, for native drivers it must have the same DBSERVER and DBNAME (Sybase only)."

  • There is no easy way to invalidate a result set, if you detect that a result set should be invalidated. Therefore, the time span used for the result set cache should be chosen carefully. However, if you have a result set that is accessed four times per second, setting a timeout as low as a minute reduces the load on the database (for that query) by a factor of 240.

  • You cannot use query caching for parameterized queries (queries using <CFQUERYPARAM>.) Parameterized queries should be used for common non-cached queries, since they allow the query plan to be reused on some database systems (such as Oracle), and they are virtually immune to malicious query editing as documented in Allaire Security Bulletin ASB99-04 (http://www.allaire.com/handlers/index.cfm?ID=8728&Method=Full)
Keep in mind that queries are also global to the server, so if you use the same query with the same name in multiple pages, that cached result set is shared between the pages.

Let's look at the "Tack2 Intranet" case study from within the example applications optionally installed with ColdFusion. The home page was altered to allow access from machines other than localhost. Then a test was performed with one computer simulating 20 users pressing "refresh" 100 times each in rapid succession against a Windows 2000-based ColdFusion server with a Windows NT 4.0 SQL Server database server. Each server is a Pentium 3, 450 MHz PC with at least 128MB of RAM. The queries are pretty light, but there are four on that page. The following is a graph of CPU utilization of both servers during the initial test:

Figure 1: CPU Utilization during initial test.

Now, the same test, after implementing query caching:

 

Figure 2: CPU utilization while using query caching.

Notice how the utilization of the SQL server dropped to idle. Now let's look at the results of the change:

Figure 3: Comparison between tests, the second using query caching.

Even on the heavily CPU-bound application server, we see a 16% improvement in the ability of the ColdFusion server to produce pages with a 14% improvement in page response time. An even more compelling story is told in the standard deviation--the "usual variation" in response time. While the non-cached server produced pages where the response time "usually" varied almost as much as the page time, the cached version was very consistent in its response time. For pages with more difficult queries, the difference becomes particularly noticeable.

The real story, however, lies in the difference between the CPU loads on the SQL server. With the average CPU utilization during the test dropping from 23.4% to 4.2%, the site's ability to scale by adding Web servers (for this particular page, anyway) has increased dramatically.

Conclusion

The repetitive nature of Web pages often causes database servers to work very hard at producing the same results sets over and over again. However, by using the Query Caching capability included with ColdFusion, a significant amount of work can easily be moved from the database server to the ColdFusion servers. This allows for a much higher ratio of ColdFusion servers to database servers, enhances the performance and scalability of your Web system, and maximizes your investment in your database servers.