RAID in a NetWare Environment
Walter Boyd, ECNE, CNI
The last ten years have witnessed a massive increase in the performance level of PC network workstations and servers. Electronic advances have enabled CPU and video performance to improve 50 to 100 times. However, mechanically bound disk I/O performance has become an ever-increasing bottleneck. Disk I/O performance has only improved by about five times over the same period. This performance mismatch will only worsen as we implement ever more demanding applications such as image processing, CAD, database, and data acquisition. Even more importantly, data reliability and availability must be assured as more and more mission-critical applications migrate to LANs.
Drive manufacturers have made great advances in disk reliability. The MTBF (Mean Time Between Failures) for these drives has also increased (although perhaps not as much as the marketing arms of the drive makers might like to claim). Even though the likelihood of a drive failure has decreased, many critical applications cannot afford any system downtime. Many corporations have found that implementing some form of data redundancy can help to guard against the risk of a drive failure.
Network administrators are continually seeking ways to improve system performance and reliability. One method of achieving these dual and often opposing goals is to use RAID (Redundant Array of Independent Disks) technology.
The term RAID was originally introduced in a paper published in 1988 by David A. Patterson, Garth Gibson and Randy H. Katz of the University of California at Berkeley entitled “A Case for Redundant Arrays of Inexpensive Disks.” They defined five disk array models that would yield improved data reliability, data transfer rate, and data I/O rate. Each level has its own tradeoffs and benefits, explained below. Various vendors have developed extensions to these defined levels and have labeled them according to the whims of their respective marketing departments.
I will use the terms and definitions developed by the RAID Advisory Board. The RAID Advisory Board is a group of over 40 leading companies in the industry as well as other organizations with an interest in RAID technology. One of the RAID Advisory Board’s major goals is to standardize RAID-related terminology throughout the industry. The original RAID paper referred to inexpensive disks as compared to SLEDs (Single Large Expensive Disks). With the general decrease in disk drive costs, SLEDs have essentially disappeared. The cost of a RAID array today is typically compared to the independent disks from which it is built. So the acronym RAID is now commonly accepted as referring to a Redundant Array of Independent Disks.
Why RAID?
Disk I/O is relatively slow because it is mechanical. A disk read or write involves two processes: first, positioning the R/W head, and second, transferring the information to or from the disk. Head positioning is limited by seek time and disk rotation to the start of the data (rotational latency). Data transfer occurs one bit at a time and is limited by the speed of rotation and the density of the data.
One method that can be used to speed the transfer of data is to use many disks in parallel. For example, if a disk is capable of a given number of I/O transactions per second, then two disks are theoretically capable of twice that many. Similarly, if one disk can transfer a certain amount of data, then two disks acting together can transfer twice as much data. Additional disks would continue this increase in I/O capability until some other component of the computer system became the limiting factor.
System managers often try to accomplish this by manually spreading data among several disks so that each disk receives a somewhat equal share of the system load. This I/O tuning process can yield good results but is limited by two factors: (1) it doesn’t speed the transfer of data for a single file; it only increases the number of files being accessed concurrently; and (2) system I/O demands change minute-to-minute, so the load is rarely perfectly balanced and certainly won’t stay that way over time.
A more effective way to balance the I/O load is to use a disk array. The RAID Advisory Board definition of a disk array is “a collection of disks from one or more commonly accessible disk subsystems, combined with a body of Array Management Software. Array Management Software controls disk operation and presents the disks as one or more virtual disks to the host operating environments.” Array Management Software may reside either in the host computer or in the disk subsystem. (I will refer to these as software-based or hardware-based disk arrays.)
Both array types can be implemented as either internal or external drive systems. NetWare has supported software-based disk arrays for some time. Disk duplexing and mirroring are examples of software-based disk arrays. Recently, third-party software manufacturers have implemented additional levels of software-based disk arrays that can be implemented as NetWare Loadable Modules (NLMs). Hardware-based disk arrays are typically implemented using a specialized SCSI host adapter. These SCSI adapters use dedicated onboard co-processors which relieve the main system CPU from most tasks related to managing the disk array. They also frequently use their own cache to further improve subsystem performance.
Either software-based or hardware-based disk arrays will appear to NetWare as one or more large virtual disks: virtual disks that (1) are more reliable and (2) usually provide better performance than any single disk. A disk array can provide better performance because it automatically divides a read/write request among its member disks. For example, if a read/write request involved four 4KB blocks of data, then four disks operating in an array could theoretically provide four times the transfer rate of a single drive, because each drive would only need to handle one block of data and they could all operate at the same time. In practice the total throughput would not be quite that high due to the overhead of managing four drives.
Additionally, data redundancy can be built into a disk array, which protects against drive failure. It is this redundancy which creates a RAID array, and the different ways in which this redundancy can be achieved result in different levels of RAID. Each RAID level has benefits and tradeoffs, explained below.
Sidebar: The Language of RAID
Parallel vs. Independent Access Arrays
Parallel Access Arrays. Parallel access arrays are arrays in which every member disk participates in every I/O operation.
They provide a high data transfer rate because the data for any given read/write operation is spread among the member disks and is accessed simultaneously. The data transfer rate will be approximately equal (within 5%) to the sum of the transfer rates of the member disks. The disk I/O rate (number of individual read/writes per second) is similar to the I/O rate for a single disk, because the disks in a parallel access array work in unison to access only one block of data at a time. In other words, a parallel access array will access only one file at a time, but will do so at a very high transfer rate. Some implementations of parallel access arrays also require synchronized rotation of member disks. Disk synchronization requires disks that support this feature. RAID levels 2 and 3 are parallel access arrays.
Independent Access Arrays. Independent access arrays are arrays in which the member disks operate independently, even to the extent of satisfying multiple I/O requests concurrently. They provide a high disk I/O rate because each disk can be servicing a separate I/O request. The disk I/O rate will be approximately equal (within 5%) to the sum of the I/O rates of the member disks. The data transfer rate for a given block is the same as the transfer rate for a single disk because each block is stored on only one member disk. The total system transfer rate for an independent access array is similar to a parallel access array but may be split between several unrelated I/O requests. RAID levels 4 and 5 are independent access arrays. RAID levels 0 and 1 may be either parallel or independent but are usually implemented as independent arrays.
RAID Defined
The example for the following definitions is an I/O request of a 16KB file that will be written to the disks in four 4KB chunks.
Striping and Mirroring
RAID levels 0, 1, and 0 & 1 can be implemented as either independent or parallel access arrays. NetWare implements them as independent access arrays as a standard feature of the operating system, which does not require additional hardware or software. They are also commonly implemented in hardware-based disk arrays.
Level 0. (Diagram 1) While this level is not technically a RAID implementation because no redundancy is involved, it is widely recognized and is in common use. It is frequently referred to as disk striping or spanning. Disk striping occurs when the data is distributed in alternating fashion among the members of the array.
The example 16KB I/O request will be evenly distributed across the four member disks. The algorithm necessary to convert the 16KB I/O request for the virtual disk into the individual I/O requests for the member disks is simple enough that this solution can be implemented at the host level without much overhead. To achieve RAID Advisory Board compliance, the combined data transfer rate of a level 0 parallel access array must be within 5% of the sum of the data transfer rates of its member disks. A level 0 independent access array must execute I/O requests at a rate within 5% of the sum of the I/O rates of its member disks. A test suite is currently under development by the RAID Advisory Board to test compliance. Vendors who meet the specifications for each level will be allowed to display a trademark symbol indicating that compliance. Level 0 is implemented in a NetWare environment by assigning segments from each member disk, usually of equal size, to a given volume. For example, to create four 1 GB volumes from four 1 GB disks, the normal non-RAID implementation would assign the entire contents of each 1 GB disk partition to a separate volume. A RAID level 0 implementation would assign a 250MB segment from each disk partition to each volume (Diagram 2).
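As a rough illustration of how simple the level 0 conversion from a virtual-disk request to member-disk requests can be, the following Python sketch maps a byte range onto four member disks in 4KB chunks. The function name `stripe_map` and the simple round-robin chunk layout are assumptions made for illustration; they are not taken from NetWare or from any vendor's Array Management Software.

```python
def stripe_map(offset, length, n_disks=4, chunk=4096):
    """Split a virtual-disk request into (disk, disk_offset, size) pieces.

    Assumes plain round-robin striping: chunk 0 goes to disk 0, chunk 1 to
    disk 1, and so on, wrapping back to disk 0 after the last member.
    """
    pieces = []
    end = offset + length
    while offset < end:
        chunk_index = offset // chunk      # which chunk of the virtual disk
        within = offset % chunk            # position inside that chunk
        disk = chunk_index % n_disks       # chunks alternate across members
        row = chunk_index // n_disks       # how many chunks precede it on that disk
        size = min(chunk - within, end - offset)
        pieces.append((disk, row * chunk + within, size))
        offset += size
    return pieces


# The article's example: a 16KB request starting at the beginning of the volume.
for disk, disk_offset, size in stripe_map(0, 16 * 1024):
    print(f"disk {disk}: offset {disk_offset}, {size} bytes")
```

Under these assumptions the 16KB example request becomes one 4KB request per member disk, and a request that is not chunk-aligned is simply split at the chunk boundaries.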
NetWare implements RAID level 0 as an independent access array and, therefore, provides a high level of I/O request execution. The disadvantage of a RAID level 0 array is that loss of any drive in the array will result in the loss of all data within the array. In our example we would lose the data from every volume. It would then need to be restored from backup (you were making regular backups, right?). If your application can withstand the interruption of service that might occur with RAID level 0, then it will provide a very high level of I/O performance. If you need the performance gains associated with disk striping but cannot afford the risk of data loss, then consider RAID level 0 & 1 described below.
Level 1. (Diagram 3) Commonly known as disk mirroring, RAID level 1 provides a high level of data reliability by duplicating the data on one member disk to another member disk. Mirroring is implemented at the partition level. The example 16KB I/O request would be written as four 4KB requests to each disk. Because both writes can occur concurrently, the actual elapsed time to make both writes will not be substantially longer than for a single disk. However, this does chew into the data transfer capacity of the system because twice the amount of data is written, producing a bottleneck for write-intensive applications. On the other hand, because NetWare implements RAID level 1 as an independent access array, reads will be serviced by the first available disk. Consequently, they occur at approximately twice the rate of reads on a single disk. Whether mirroring yields a net gain or loss in system performance is a function of the ratio of reads to writes for your system. A typical NetWare environment might expect performance to be approximately the same with mirroring as without. But, as they say, your mileage may vary.
Mirroring provides the highest level of data availability of the five standard RAID levels at a “cost” of 50% of your disk space. Applications which cannot afford system downtime may want to consider disk mirroring. Mirroring only provides redundancy for the disk drives. Redundancy of the entire disk drive channel can be achieved by attaching each side of the mirrored pair to its own host bus adapter in the server. This is known as disk duplexing.
Level 0 & 1. (Diagram 4) Sometimes referred to as RAID 10, this level combines levels 0 and 1 by first mirroring disks together to form multiple mirrored virtual disks, then striping these virtual disks together to form a very high performance and high reliability storage system.
For the example 16KB I/O request, mirroring and striping across four mirrored disks would require eight disks. Since eight devices is too many for a single SCSI channel the disks must be duplexed. Each host bus adapter will have four attached 1 GB disks with SCSI IDs 0-3 on each channel. At the partition level, mirror channel 0, disk 0 (0-0) to channel 1, disk 0 (1-0); 0-1 to 1-1; 0-2 to 1-2; and 0-3 to 1-3. In practice, any pairing between the channels will work, but it is a cleaner model on paper and a cleaner install in real life to place the four disks from each channel in a separate cabinet and to set up the cabinets identically. Our 16KB write request now causes a single 4KB write to each disk. Each disk on channel 0 will receive one of the four 4KB write requests and a second copy of each write request will also be written to the mirrored disks on channel 1. This solution provides very high I/O performance as well as full data redundancy. It is the most desirable software-based implementation, but as a mirrored solution, incurs a cost of 50%.
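The duplexed layout described above (channel 0, disk n mirrored to channel 1, disk n, with the mirrored pairs striped together) can be sketched in Python as follows. The function name `raid10_writes`, the tuple layout, and the assumption that the request starts at virtual offset 0 are illustrative only and do not describe NetWare's internal implementation.

```python
CHUNK = 4096            # 4KB chunk, as in the article's example
DISKS_PER_CHANNEL = 4   # SCSI IDs 0-3 on each of the two channels


def raid10_writes(length, chunk=CHUNK, pairs=DISKS_PER_CHANNEL):
    """Return the physical writes for a request starting at virtual offset 0.

    Each write is (channel, scsi_id, disk_offset, size). Chunks are striped
    round-robin across the mirrored pairs, and every chunk is written once
    on channel 0 and once on its mirror on channel 1.
    """
    writes = []
    n_chunks = -(-length // chunk)              # ceiling division
    for i in range(n_chunks):
        scsi_id = i % pairs                     # which mirrored pair gets the chunk
        disk_offset = (i // pairs) * chunk      # where on that pair it lands
        size = min(chunk, length - i * chunk)
        for channel in (0, 1):                  # both sides of the mirror
            writes.append((channel, scsi_id, disk_offset, size))
    return writes


for channel, scsi_id, disk_offset, size in raid10_writes(16 * 1024):
    print(f"disk {channel}-{scsi_id}: offset {disk_offset}, {size} bytes")
```

For the 16KB example this yields eight 4KB writes, one per physical disk, which matches the single 4KB write to each disk described above.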
Parallel Access Arrays
Parallel access arrays are capable of very high data transfer rates because every I/O request involves every member disk. They are not well suited to applications which require a high I/O rate. The drives may be fully synchronized with the heads of each member disk positioned over the same sector and track, thus minimizing rotational latency. Or they may be semi-synchronized with only the track positioning synchronized.
Level 2. RAID level 2 implements data redundancy by using an error detection and correction scheme commonly used for RAM chips known as a Hamming code. This level is of academic interest only since the benefits of a parallel access array apply equally well to RAID level 3 without the complexity involved in trying to implement a solid state error detection and correction scheme using mechanical devices.
Level 3. (Diagram 5) RAID level 3 achieves data redundancy by storing parity information on a dedicated parity drive. Parity information is calculated by executing an exclusive OR (XOR) on all member data disks. There is disagreement within the vendor community over the details of how RAID level 3 should be implemented. To provide the host computer with an apparent sector size of 512 bytes (required by NetWare) it is necessary to format the physical drives into 128-byte sectors (not all drives support this) for a five-drive level 3 array (4 drives @ 128 bytes plus 1 parity drive). Smaller sector sizes also negatively affect the formatting efficiency of the drives. These complications have caused vendors to use varying techniques to maintain 512-byte sectors in their implementations of RAID level 3. This article describes the “pure” RAID level 3 implementation.
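The parity arithmetic for the five-drive example can be sketched in Python. One 512-byte host sector is split into four 128-byte segments, and the parity segment is the XOR of the four. The rebuild step relies on the standard property of XOR that any one missing segment equals the XOR of the parity with the surviving segments. The function names `split_sector` and `rebuild` are hypothetical and are not drawn from any vendor's implementation.

```python
from functools import reduce

SEGMENT = 128  # bytes per data drive; 4 x 128 = one 512-byte host sector


def xor_bytes(a, b):
    """Byte-wise exclusive OR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_sector(sector, n_data=4):
    """Split one host sector into per-drive segments plus a parity segment."""
    if len(sector) != n_data * SEGMENT:
        raise ValueError("sector must be exactly n_data * SEGMENT bytes")
    segments = [sector[i * SEGMENT:(i + 1) * SEGMENT] for i in range(n_data)]
    parity = reduce(xor_bytes, segments)   # XOR across all data drives
    return segments, parity


def rebuild(segments, parity, failed):
    """Recompute the segment of one failed data drive from the survivors."""
    survivors = [s for i, s in enumerate(segments) if i != failed]
    return reduce(xor_bytes, survivors, parity)


sector = bytes(range(256)) * 2              # arbitrary 512 bytes of test data
segments, parity = split_sector(sector)
print(rebuild(segments, parity, failed=2) == segments[2])
```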
RAID level 3 differs from levels 0, 1, 4 and 5 because it stores data in much smaller chunks on the member disks. The other levels use blocks of at least 4KB which are accessed independently, producing a somewhat consistent stream of data that can service multiple I/O requests concurrently. A five-disk RAID level 3 array stores data in 128-byte segments and reads four disks simultaneously. All member disks in a level 3 array service the same I/O request at any given time. While they are actually servicing the I/O request, the data transfer rate will be approximately equal to the number of data drives in the array multiplied by the maximum transfer rate of the drive (the parity drive is not counted). Very fast indeed. However, when the heads are involved in a track-to-track seek or are waiting for the disk to rotate to the appropriate sector of the disk (rotational latency), the data transfer rate is zero. For this reason a parallel access array is best suited for applications which tend to have long reads or writes. Applications that tend toward many little reads and writes and that spend more of their time in the process of positioning R/W heads would be better served by an independently accessed array. NetWare 2.x and 3.x volumes usually use 4KB disk allocation blocks and scatter them throughout the volume. These factors tend to favor independent access arrays over a parallel access array such as RAID level 3.
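The effect of request size on a parallel access array can be illustrated with a very simple model: the time for one request is the head positioning time plus the transfer time at the combined rate of the data drives. The function name `effective_rate` and the sample figures (2 MB/s per drive, 15 ms of positioning) are assumptions chosen only for illustration; they are not measurements from any particular drive.

```python
def effective_rate(request_bytes, n_data_drives, drive_rate, positioning_s):
    """Effective bytes/second for one request on a parallel access array.

    Assumes every request pays the full positioning time (seek plus
    rotational latency) once, during which no data moves, and then
    transfers at n_data_drives * drive_rate.
    """
    transfer_s = request_bytes / (n_data_drives * drive_rate)
    return request_bytes / (positioning_s + transfer_s)


# Hypothetical figures: 4 data drives at 2 MB/s each, 15 ms positioning time.
for size in (4 * 1024, 64 * 1024, 1024 * 1024):
    rate = effective_rate(size, 4, 2_000_000, 0.015)
    print(f"{size:>8} byte request: {rate / 1e6:.2f} MB/s effective")
```

In this model, small requests such as scattered 4KB allocation blocks are dominated by positioning time, so only long reads or writes approach the combined transfer rate of the data drives.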
Independent Access Arrays
Independent access arrays do not require that every member of the array participate in every I/O request.
Level 4. (Diagram 6) RAID level 4 uses the same data mapping model as RAID level 3: 4 data disks and 1 dedicated parity disk. Data is stored in disk allocation block-size chunks similar to levels 0, 1, and 5, and each disk is accessed independently. Level 4 differs from level 3 in that it is accessed independently and its block size is larger. These two differences yield performance characteristics that are very different from a parallel access array such as RAID level 3. These characteristics illustrate the inherent differences between parallel and independent access arrays. Parallel access arrays generally provide superior data transfer rates, while independent access arrays generally provide superior I/O rates.
The example 16KB I/O request is distributed as four 4KB blocks, one for each data disk in the array just like RAID level 0. Additionally, the 4KB blocks will be XORed; the resulting parity information will be stored on the parity disk. Calculating parity information for independent access arrays (levels 4 and 5) necessitates a read/modify/write cycle, which causes the write process for these arrays to be slower than all other configurations. RAID level 4 is particularly affected by this phenomenon for reasons which will be clear shortly. The read process and performance of a RAID level 4 or 5 array are similar to that of a striped array (RAID level 0). The write process is slower because writing a single block of data to a member disk requires reading existing data from both the target block and its associated parity block, performing two XOR functions, and writing the data back to the respective blocks. The exclusive OR function has the property that a given chunk’s contribution to the parity information can be nullified by executing the function again using the same data. The six necessary steps follow:
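The read/modify/write cycle described above can be sketched in Python. The six numbered comments are one plausible reading of the cycle, reconstructed from the description in this paragraph (two reads, two XOR operations, two writes) rather than copied from an original list. The function name `small_write` and the in-memory list-of-blocks "disks" are hypothetical stand-ins for real drives.

```python
def xor_bytes(a, b):
    """Byte-wise exclusive OR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(disks, parity_disk, target_disk, block_no, new_data):
    """Update one data block and its parity block (read/modify/write).

    disks is a list of member disks, each a list of equal-sized blocks.
    """
    old_data = disks[target_disk][block_no]        # 1. read the old data block
    old_parity = disks[parity_disk][block_no]      # 2. read the old parity block
    stripped = xor_bytes(old_parity, old_data)     # 3. XOR out the old data's contribution
    new_parity = xor_bytes(stripped, new_data)     # 4. XOR in the new data
    disks[target_disk][block_no] = new_data        # 5. write the new data block
    disks[parity_disk][block_no] = new_parity      # 6. write the new parity block


# Four data disks and one parity disk, one 4KB block each.
BLOCK = 4096
disks = [[bytes([d + 1]) * BLOCK] for d in range(4)]
parity = disks[0][0]
for d in disks[1:]:
    parity = xor_bytes(parity, d[0])
disks.append([parity])

small_write(disks, parity_disk=4, target_disk=2, block_no=0, new_data=b"\xaa" * BLOCK)
```

Note that only the target disk and the parity disk are touched; the other data disks are never read, which is what lets an independent access array service other requests at the same time. With a dedicated parity disk, however, every write goes through steps 2 and 6 on the same drive.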
Level 5. (Diagram 7) RAID level 5 improves the parity drive bottleneck by distributing the parity information evenly across all of the member disks.
In all other respects RAID level 5 is functionally equivalent to RAID level 4. Write performance is still lower because of the read-modify-write cycle, but its impact is not as extreme as with RAID level 4 because the parity block is evenly distributed across the member disks, and each disk bears an equal share of the load. RAID level 5 is the most common hardware-based RAID implementation. Its performance characteristics—very fast read, high I/O rate, write similar to that of a single stand-alone disk—are well-suited to the typical NetWare environment. Hardware-based implementations also typically use large onboard memory caches to help overcome the write performance bottleneck.
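A small Python sketch shows how parity can be spread across the member disks. The particular rotation used here (parity on the last disk for the first stripe, then moving one disk to the left for each following stripe) is only one of several possible layouts and is an assumption for illustration; actual controllers may rotate parity differently. The names `parity_disk` and `layout` are hypothetical.

```python
def parity_disk(stripe, n_disks=5):
    """Disk holding the parity block for a given stripe (assumed rotation)."""
    return (n_disks - 1 - stripe) % n_disks


def layout(n_stripes, n_disks=5):
    """Return rows of labels: 'P' for parity, 'Dn' for the nth data block."""
    rows = []
    for stripe in range(n_stripes):
        p = parity_disk(stripe, n_disks)
        row, d = [], 0
        for disk in range(n_disks):
            if disk == p:
                row.append("P")
            else:
                # Data blocks are numbered left to right, skipping the parity slot.
                row.append(f"D{stripe * (n_disks - 1) + d}")
                d += 1
        rows.append(row)
    return rows


for row in layout(5):
    print("  ".join(f"{label:>3}" for label in row))
```

Over any five consecutive stripes each disk holds the parity block exactly once, so the parity reads and writes of the read-modify-write cycle are shared by all members instead of falling on one dedicated drive.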
Which level is best
The best implementation depends on many factors, including cost, I/O transfer load, I/O transaction rate, and the ratio of reads to writes for your system. The following chart summarizes the characteristics of each level:
Keep in mind that a redundant storage system does not lessen the necessity for adequate system backups. The various RAID levels described in this article greatly improve performance of disk storage systems. At the same time, greater data reliability and availability can be achieved. All at a reasonable cost.
For Further Reading
Thanks in particular to CNEPA Hands-on Labs and Doug Bui at Hewlett-Packard for their help with this article.
Sidebar: The Language of RAID
The following terms are frequently used when discussing RAID arrays. They are provided by the RAID Advisory Board:
This article originally appeared in the April 1994 issue of Network News, the technical journal of the Network Professional Association.