Web Developer's Journal internet.com

Web Developer's Journal Archive Section

Data Caching: A Means to Higher Performance




Seagate Technology
920 Disc Drive
Scotts Valley, CA 95066-4544
408-438-6550



If an article is here in the Archives, it is because we feel it may be out of date in one way or another. Some of the information may still be useful to some readers, but keep in mind that some information may no longer be accurate. Product capabilities change, prices change, links may no longer work, and in fact, the company that makes this product may have gone down the tubes long ago. To get the latest, we recommend that you go to The Web Developer's Journal Home Page, and see if there is a more recent article about this subject.



The computer industry is engaged in a never-ending quest to improve performance. Disciples of the "faster is always better" ethic search for techniques that can enhance performance without significantly increasing the cost and complexity of a design. Caching is one such technique, implemented at many levels of computer design. Although the implementation itself can be complicated, the idea of caching is very simple. Within a computer, a cache is a place to store program information, memory addresses or data. Because of the repetitive nature of computing, caching provides a very effective method of increasing performance, since it enables systems to store oft-needed data in a nearby "cache" where it is easily accessible.

Read Caching on Disc Drives

Seagate has found this technique especially useful for disc drives, since a drive can be many thousands, or even millions, of times slower than the solid-state elements on the computer system's motherboard and elsewhere.

Since computers tend to access the disc drive in a sequential or predictably ordered manner, performance can be greatly increased by reading extra data into a memory cache before the computer asks for it. This way, when the computer asks for the next piece of data, it is already in memory, and data retrieval from memory can be several thousand times faster than having to get it from the disc. Tests show that if a computer requests data from a certain location, there is an 80% to 90% chance that the next request will be for data in the following location. Even so, using a cache to improve performance is a gamble: we make the effort to read the extra data based on that 80% chance it will be used next, and the effort is wasted whenever it is not.
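The gamble can be put into numbers with a simple weighted average. The sketch below is only an illustration: the function name and both timing values are assumptions chosen for the example, not measurements from any Seagate drive.

```python
def expected_access_time(hit_probability, cache_time_ms, disc_time_ms):
    """Average time per request when a fraction of requests is served from cache."""
    return hit_probability * cache_time_ms + (1.0 - hit_probability) * disc_time_ms


# Hypothetical timings, chosen only for illustration.
CACHE_MS = 0.01   # assumed time to return data already in the cache
DISC_MS = 10.0    # assumed time to fetch data from the platters

for p in (0.0, 0.8, 0.9):
    avg = expected_access_time(p, CACHE_MS, DISC_MS)
    print(f"hit probability {p:.0%}: {avg:.3f} ms per request")
```

With these assumed figures, an 80% chance of guessing right cuts the average request from 10 ms to roughly 2 ms, which is why the gamble usually pays off.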

Caching in effect anticipates requests for certain data (usually the next in line) and draws that data into memory before the system needs it. Because the caching program reads information into memory before it is needed, it is often also called a Read Look-Ahead Buffer. To increase system efficiency further, it is not necessary for the cache to reside within the host computer's main memory. A disc drive can handle the caching for itself, leaving the host computer free from the burden of managing the cache.
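A read look-ahead buffer can be sketched in a few lines. The class below is a toy model, not drive firmware: the name LookAheadDrive, the list standing in for the platters, and the lookahead parameter are all assumptions made for illustration.

```python
class LookAheadDrive:
    """Toy drive that reads extra consecutive blocks on every cache miss."""

    def __init__(self, blocks, lookahead):
        self.blocks = blocks        # simulated platter contents
        self.lookahead = lookahead  # extra blocks fetched after the requested one
        self.buffer = {}            # block number -> data currently cached
        self.disc_reads = 0         # how many times the platters were accessed

    def read(self, n):
        if n in self.buffer:
            return self.buffer[n]   # served from the look-ahead buffer
        # Miss: discard the old buffer, then read the requested block
        # plus the blocks that follow it.
        self.disc_reads += 1
        end = min(n + 1 + self.lookahead, len(self.blocks))
        self.buffer = {i: self.blocks[i] for i in range(n, end)}
        return self.buffer[n]


drive = LookAheadDrive(blocks=[f"data{i}" for i in range(16)], lookahead=3)
for n in range(8):                  # eight sequential requests
    drive.read(n)
print(drive.disc_reads)             # prints 2
```

Eight sequential requests cost only two trips to the simulated platters, because each miss also brings in the next three blocks.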

Seagate Technology's mobile, desktop, and high-performance hard disc drives support read caching for these reasons, and Seagate has built on the concept. By placing more memory chips on a drive, for example, the drive gets a larger cache, which in turn allows it to hold more data at the ready without having to wait to read it from the discs. For instance, with 64 kilobytes of memory placed directly on the drive's circuit board, the drive can read ahead and store about 64 kilobytes of data. So long as the computer keeps requesting data from consecutive locations, the drive can provide it from its cache memory without waiting on the discs.

Seagate's Barracuda family of hard disc drives is available with a full megabyte (1,024 kilobytes) of cache memory directly on the drive board. This is necessary for the type of environments the Barracuda is designed to work in: high-end workstations and servers, minicomputers and supercomputers. Each of these types of applications needs large amounts of data at sustained transfer rates, delivered as quickly as possible.

Adaptive Caching

Every time the computer asks for data that is currently in the cache, the request is referred to as a cache hit. If the computer asks for data which is not in the cache, the drive must go and fetch the data from the discs, which is called a cache miss. One measure of how well the caching program (or algorithm) is working is how adept it is at caching the required data. This is found by comparing the number of cache hits with the total number of requests (hits plus misses). This ratio is called the cache hit rate for a given caching scheme.
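As a small worked example, the hit rate can be computed from two counters. This sketch takes the hit rate as hits divided by all requests, which is the usual convention; the function name and the sample counts are assumptions for illustration.

```python
def hit_rate(hits, misses):
    """Fraction of requests served from the cache (0.0 when there were no requests)."""
    total = hits + misses
    return hits / total if total else 0.0


# Hypothetical counters: 85 requests found in the cache, 15 fetched from the discs.
print(hit_rate(85, 15))   # prints 0.85
```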

The use of what are known as "adaptive" and "segmenting" strategies helps minimize the occurrence of cache misses. Here is an example of segmenting the cache: suppose a drive has 800 Kbytes of cache memory. A simple cache strategy would have to fill all 800 Kbytes of cache with look-ahead data. If none of that data is required for the next request, all of it must be purged. Now imagine that the 800K buffer is partitioned into two 400K buffers. This allows the drive to behave as if, in effect, it had two caches. Data for one application can be stored in the first 400K segment, and data for a second program can be stored in the second 400K segment. Most Seagate drives have either two or four fixed cache segments. More than four segments are usually unnecessary.
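The effect of segmenting can be seen in a small simulation. This is a sketch under stated assumptions: the function count_misses, the round-robin choice of which segment to refill, and the block numbers are all invented for the example, and a cache of 8 blocks stands in for the 800K buffer.

```python
def count_misses(requests, num_segments, segment_blocks):
    """Replay block requests against a cache split into equal-sized segments.

    Each segment holds one run of consecutive blocks. On a miss, segments are
    refilled in round-robin order (an assumption made for this sketch).
    """
    segments = [range(0) for _ in range(num_segments)]  # all segments start empty
    next_victim = 0
    misses = 0
    for block in requests:
        if any(block in seg for seg in segments):
            continue                                    # cache hit
        misses += 1
        segments[next_victim] = range(block, block + segment_blocks)
        next_victim = (next_victim + 1) % num_segments
    return misses


# Two programs read sequentially from different areas, with requests interleaved.
stream_a = range(0, 16)
stream_b = range(1000, 1016)
requests = [block for pair in zip(stream_a, stream_b) for block in pair]

print(count_misses(requests, num_segments=1, segment_blocks=8))  # prints 32
print(count_misses(requests, num_segments=2, segment_blocks=4))  # prints 8
```

With a single segment, each program keeps purging the other's look-ahead data, so every request misses. Splitting the same amount of memory in two lets each program keep its own run of blocks.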

The second technique to improve the hit rate is called adaptive caching. In fact, the term adaptive caching can refer to two distinct techniques. First, some drives implement adaptive segmentation, which allows the drive to control the number of segments in the cache. The second type of adaptive caching involves an adaptive algorithm. Assume again that we have the same drive with an 800K read look-ahead buffer, and that the drive has four 200K segments. Now suppose that an application asks for data that is not in any of the four segments. The drive must go off and read this data from the platters, and decide where to place it in the cache. To do this, it must decide which segment of the cache gets purged. Ideally we would not want to purge the data in a segment that is about to be used, since we would have to reload that data again. The optimum segment to bump is one which contains data that is no longer needed. Adaptive algorithms decide what to bump based on their analysis of which data is least utilized.
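One simple way to approximate "least utilized" is to purge the segment that has gone longest without a hit, a policy usually called least recently used (LRU). The sketch below uses LRU as a stand-in; the article does not say which analysis Seagate drives actually perform, and the class name, segment sizes, and block numbers are assumptions for illustration.

```python
class SegmentedCache:
    """Toy segmented look-ahead cache that purges the least recently used segment."""

    def __init__(self, num_segments, segment_blocks):
        self.segment_blocks = segment_blocks
        self.segments = [range(0)] * num_segments  # run of blocks held by each segment
        self.last_used = [0] * num_segments        # request number of each segment's last use
        self.clock = 0
        self.hits = 0
        self.misses = 0

    def read(self, block):
        self.clock += 1
        for i, seg in enumerate(self.segments):
            if block in seg:
                self.hits += 1
                self.last_used[i] = self.clock
                return
        # Miss: bump the segment that was used longest ago and refill it
        # with the requested block plus the blocks that follow.
        self.misses += 1
        victim = min(range(len(self.segments)), key=lambda i: self.last_used[i])
        self.segments[victim] = range(block, block + self.segment_blocks)
        self.last_used[victim] = self.clock


cache = SegmentedCache(num_segments=2, segment_blocks=4)
# One program reads blocks 0..3 in order while two others make one-off requests.
for block in [0, 1, 100, 2, 200, 3]:
    cache.read(block)
print(cache.hits, cache.misses)   # prints 3 3
```

In this run, the request for block 200 purges the idle segment holding block 100 rather than the segment the sequential reader is still using, so the read of block 3 remains a hit.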

Caching has proven itself to be invaluable in the design of computers and peripherals. Well-implemented cache designs can increase performance by several orders of magnitude. As Seagate caching techniques and algorithms continue to become more sophisticated, they will contribute to spectacular performance from commonly available, inexpensive systems.