HOW TO SIZE A SAN
Determining the number of drives in a RAID group
You know you need storage, and
you have a pretty good idea of how to size the SAN layer. But how do you go
about determining the correct number and proper type of disks? How many drives
in a RAID group will it take to give you the performance you need for a database
or email or file server, or for that new VMware implementation? Can you have too
many drives in a RAID group, or even in the actual storage RAID array?
When should you use SAS, SATA, FC or SCSI? What
are the other metrics for sizing storage arrays, such as optimal max spindles
per controller? What about the other vendor performance specifications?
Vendor data sheets tell you just about all you
need to know about storage arrays and disk drives: raw capacity, throughput
(usually in MB/sec), IOPS, reliability (MTBF), and drive type (SAS, SATA, FC,
etc). Most storage arrays are going to have Fibre Channel interfaces on the
front end (some offer InfiniBand), and can be FC, SATA or SAS on the disk end.
They support RAID unless the array is JBOD (Just a Bunch Of Disks). DAS (direct
attached storage) is what your environment typically contains if you don't have
a SAN, and can refer to internal drives in your server or a SCSI RAID array
attached to one or two servers.
RAID is basically a group of disks, usually with one or both of two characteristics: parity and striping. Parity provides redundancy for your blocks of data across the disks; striping lets the individual drives' speeds and feeds add up, giving you more performance than a single disk could
provide. Each RAID type has tradeoffs in reliability, performance and cost. (The
level of redundancy you choose can cost you a lot of usable capacity.)
Some aspects of RAID can affect throughput or
IOPS. (Remember, you'll want high IOPS performance for latency-sensitive applications such as databases or email.) Performance under RAID can take a hit
during a drive failure because the RAID controller will be working hard to
rebuild the RAID group using your global hot spare.
Usable capacity vs. raw capacity. The capacity
you have to play with won't be exactly what's stated on the spec sheet. That
750GB SATA drive is likely to be effectively about 690 GB due to disk geometry
or "overhead." A 1TB drive might format out to about 900GB.
The type of RAID also affects your usable
capacity. Most RAID arrays will allow you to have one (or more) global hot
spare, a disk waiting to step up for a failed drive in a RAID group, which would
begin rebuilding the redundancy on that drive after a disk failure. When
calculating usable capacity, don't forget to take away the drive capacity of
your hot spares.
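If you want to sanity-check that math before talking to a vendor, a quick back-of-the-envelope calculation goes a long way. Here's a rough sketch in Python; the ~8% formatting overhead and the simple per-RAID-level layouts are illustrative assumptions based on the figures above, not vendor specs.

def usable_capacity_gb(raw_drive_gb, total_drives, hot_spares, raid_level,
                       format_overhead=0.08):
    # Hot spares hold no usable data until they step in for a failed drive.
    data_drives = total_drives - hot_spares
    # A 750GB raw drive formats out to roughly 690GB with ~8% overhead.
    formatted_gb = raw_drive_gb * (1 - format_overhead)
    if raid_level == "10":        # mirroring: half the drives hold copies
        effective = data_drives / 2
    elif raid_level == "5":       # one drive's worth of parity per group
        effective = data_drives - 1
    elif raid_level == "6":       # two drives' worth of parity per group
        effective = data_drives - 2
    else:                         # RAID 0 / JBOD: no redundancy
        effective = data_drives
    return effective * formatted_gb

# Fourteen 750GB SATA drives, one global hot spare, RAID 6:
print(round(usable_capacity_gb(750, 14, 1, "6")))   # ~7590GB usable vs. 10500GB raw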
The performance delivered by your SAN storage
will vary significantly, depending upon how much of your data access can be
defined as sustained sequential reads, random reads, sustained sequential
writes, random writes or some combination of the above. Don't worry -- you won't
need to go through benchmark reports on each LUN unless you want to target a
specific application problem. Odds are, unless you're sizing a specific
application, you'll have a mix of the above.
Keep one thing in mind. For a RAID array, the
worst-case performance scenario is random write, and the best-case performance
scenario is sustained sequential reads.
The maximum number of drives behind a pair of
controllers is usually specified, and that number is more than likely going to
be "too many." You need to find out how many drives are supported where maximum
performance is gained, and how many you can add before performance is noticeably
affected.
Drive types
How do you identify which kind of drives you
need for your environment? By categorizing your capacity requirements in this
fashion:
IOPS-sensitive applications will be your first
category. Throughput-sensitive applications (video streaming, video editing,
backup to D2D or VTL, etc) will be another. Your basic file servers, web
servers, print, home drive space and archive space will go into an archive
category.
For transactional applications such as database or email, you'd lump those servers' capacity requirements into the IOPS-sensitive category. Recommendation: 15K SAS or FC disks, and lots of them. Should you determine the RAID type, work out your usable capacity, and find that five 300GB 15K RPM SAS or FC drives would get you there, you might be better off going with ten 146GB 15K SAS or FC drives. With twice as many spindles, you'd get roughly twice the IOPS, and ten 146GB 15K drives would cost less than five 300GB 15K drives.
If you have a large storage environment and
need a lot of drives, you'd be smarter to get the 300GB 15K drives, since you'll
have enough spindles to gain IOPS benefit, and will still be able to use the
capacity. If you have a target for the amount of IOPS you need, the performance
consideration below will give you a rule of thumb on meeting this requirement.
You might find that you need 20 73GB 15K drives instead. Your RAID groups probably won't be that big, but you can create more than one RAID group, and whoever integrates the storage can optimize how the server uses this set of storage resources. (The more drives you have in a RAID group, the longer a rebuild takes after a failure.)
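To put numbers on the spindle-count argument, here's a rule-of-thumb sketch. The per-drive IOPS figures are rough, commonly quoted assumptions rather than vendor specs, so swap in your own.

# Rough random IOPS per spindle by rotational speed (assumed rule-of-thumb values).
PER_DRIVE_IOPS = {"15K": 180, "10K": 140, "7.2K": 80}

def raw_group_iops(drive_count, rpm_class):
    # Striping lets the individual spindles' IOPS add up (before any RAID write penalty).
    return drive_count * PER_DRIVE_IOPS[rpm_class]

# Five 300GB 15K drives vs. ten 146GB 15K drives: similar usable capacity,
# but twice the spindles buys roughly twice the IOPS.
print(raw_group_iops(5, "15K"))    # ~900 IOPS
print(raw_group_iops(10, "15K"))   # ~1800 IOPS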
Video editing, video streaming, backup and
certain types of file servers go into the throughput-sensitive category. (A workload that puts simultaneous demands on your storage resources might require 10K or 15K SAS or FC disks, but 7.2K RPM SATA disks behind the right kind of controllers can also deliver enough performance.) I've seen 800MB/sec sustained sequential write performance on storage servers using newer SATA RAID controllers. Note: if your web, file and print servers don't have demanding speed or IOPS requirements, put them into your deep and cheap archive category. This is where your 750GB or 1TB 7.2K RPM SATA drives come in. I also recommend RAID 6 for SATA storage solutions, since manufacturers' MTBF stats show roughly twice the failure rate for SATA drives compared to SAS or FC.
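To see why that failure-rate difference matters in a large SATA pool, here's a back-of-the-envelope estimate of expected drive failures per year. The MTBF figures below are illustrative placeholders, not any manufacturer's published numbers.

HOURS_PER_YEAR = 24 * 365

def expected_failures_per_year(drive_count, mtbf_hours):
    # With many drives, expected failures scale roughly linearly with drive count.
    return drive_count * HOURS_PER_YEAR / mtbf_hours

# A 48-drive shelf: enterprise SAS/FC vs. desktop-class SATA (assumed MTBFs).
print(round(expected_failures_per_year(48, 1_400_000), 2))  # SAS/FC: ~0.3 failures/year
print(round(expected_failures_per_year(48, 700_000), 2))    # SATA:   ~0.6 failures/year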
RAID selection
If you want high availability in your storage solution, you'll need some level
of redundancy or parity to protect your data in the event of one or more drive
failures. You'll also probably need to leverage striping (RAID 0) to aggregate the individual disk drives' performance; in practice you'll combine parity and striping. This is where you end up with RAID 50 (5+0 or 0+5) or RAID
10 (1+0 or 1/0). Let's compare the more common RAID offerings (RAID-1/0, RAID-5,
RAID 50 and RAID-6 solutions) based on speed, space utilization and performance
during rebuilds and failures.
Comparison of RAID types
RAID-1/0 is where data is striped (RAID-0) across mirrored (RAID-1) sets.
(RAID-0-1 is not the same as RAID-1/0; I don't recommend RAID-0-1 for Microsoft
Exchange data.) Transactional performance with RAID-1/0 is good because either
disk in the mirror can respond to read requests. No parity information needs to
be calculated, so disk writes are handled efficiently, though each disk in the mirrored set must perform the same write.
If a disk fails in a RAID-1/0 array, write performance is not affected because the surviving member of the mirror can still accept writes. Reads are moderately affected because only one physical disk can now respond to read requests. When the failed disk is replaced, the mirror is re-established and the data must be copied or rebuilt. The tradeoff is that your disk capacity is cut in half, because you are creating 1-for-1 redundancy on the disks.
RAID-5 involves calculating parity that
can be used with surviving member data to recreate the data on a failed disk.
Writing to a RAID-5 array causes up to four I/Os for each I/O to be written, and
the parity calculation can consume controller or server resources. Transactional
performance with RAID-5 can still be good, particularly when using a storage
controller to calculate the parity.
When a disk fails in a RAID-5 array, the array runs in a degraded state: performance drops and latencies rise. This
situation occurs because most arrays spread the parity information equally
across all disks in the array, and it can be combined with surviving data blocks
to reconstruct data in real time. Both reads and writes must access multiple
physical disks to reconstruct data on a lost disk, thereby increasing latency
and reducing performance on a RAID-5 array during a failure.
When the failed disk is replaced, the parity
and surviving blocks are used to reconstruct the lost data, a lengthy process
that can take days. If a second member of the RAID-5 array fails during the
Interim Data Recovery Mode or rebuild, the array is lost. RAID-6 was created to
address this vulnerability.
RAID Levels 0+5 (05) and 5+0 (50) form large arrays by combining the block striping with distributed parity of RAID 5 with the straight block striping of RAID 0. RAID 05 is a RAID 5 array composed of a number of striped RAID 0 arrays; it is less common than RAID 50, which is a RAID 0 array striped across RAID 5 elements. RAID 50 and 05 improve the performance of RAID 5 through the addition of RAID 0, particularly during writes. They also provide better fault tolerance than the single RAID level does, especially when configured as RAID 50. Most
characteristics of RAID 05 and 50 are similar to those of RAID 03 and 30. RAID
50 and 05 are preferable for transactional environments with smaller files than
03 and 30. If you're doing video editing, I suggest investigating RAID 03 and
30.
RAID-6 adds a second parity block and provides roughly double the data protection of RAID-5, but at the cost of even
lower write performance. As physical disks grow larger, and consequently RAID
rebuild times grow longer, RAID-6 may be necessary to prevent LUN failure if an
uncorrectable error occurs during the rebuild, or if a second disk in the array
group fails during rebuild. Because of today's large disk capacities, some vendors support RAID-6 instead of RAID-5.
To achieve the IOPS goal of the Exchange 2007
requirements for a given capacity, RAID 5 may actually require more spindles
than RAID 10.
Ultimately, performance depends on the performance characteristics of the drives, the configuration of RAID groups and the type of RAID. When choosing RAID 5 (or RAID 6), it's important to consider that each host write I/O has four or more disk operations associated with it, because it is a partial-stripe RAID 5 write (or a RAID 6 write with its double parity). The controller must read the data drive and the parity drive, recompute the parity, then write the data drive and the parity drive, which reduces the effective write I/O rate of the drives to roughly one quarter.
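Here's a short sketch of how that write penalty eats into host-visible IOPS, and why RAID 5 can need more spindles than RAID 1/0 to hit the same goal. The per-drive IOPS figure, the penalty factors and the 50/50 read/write mix are assumptions for illustration only.

# Disk I/Os generated per host write for common RAID levels (assumed factors).
WRITE_PENALTY = {"10": 2, "5": 4, "6": 6}

def host_iops(drive_count, per_drive_iops, raid_level, write_fraction):
    raw = drive_count * per_drive_iops
    # Each host read costs one disk I/O; each host write costs `penalty` disk I/Os.
    penalty = WRITE_PENALTY[raid_level]
    return raw / ((1 - write_fraction) + write_fraction * penalty)

# Twelve 15K drives (~180 IOPS each) at a 50/50 read/write mix:
print(round(host_iops(12, 180, "10", 0.5)))   # RAID 1/0: ~1440 host IOPS
print(round(host_iops(12, 180, "5", 0.5)))    # RAID 5:    ~864 host IOPS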
Selecting a RAID type
To select a RAID type, you'll need to balance
your requirements for capacity, throughput, transactional I/O and
failure/rebuild performance. RAID-1/0 is the ideal configuration for databases
and email, and it works well with large capacity disks. Having more writes as a
percentage of total I/O in your environment? Use RAID 1/0. RAID 1/0 will give
you performance consistency even during a drive failure.
For RAID-5 and RAID-6, rebuild performance can
have a significant effect on storage throughput, cutting it by as much as half,
depending on the storage array and configuration. Scheduling rebuilds outside of
production hours can offset this performance drop, but you'll sacrifice
reliability. In a cluster continuous replication (CCR) environment, you can
prevent the throughput reduction from affecting users by moving the Mailbox server to
the passive node, thereby making it the active node. If neither option is
available, additional I/O throughput should be designed into the architecture to
accommodate RAID-5 or RAID-6 rebuild conditions during production hours. This
additional I/O throughput can be up to twice the non-failed state I/O
requirements.
If your backup solution (VTL or D2D pool)
needed to sustain a certain amount of data throughput, you'd have to consider
how many drive resources would be needed to handle that level of 'sustained,
sequential write' performance. Simply put, if your RAID array can do 350 MB/sec
sustained sequential writes according to the specs, odds are those numbers are
based on load balancing across all the controllers to the disk resources. You'd
need to make sure there are enough drives to get you to maximum performance of
the array. Usually you can do this with at least two or three 'trays' of disks.
Plan on creating RAID groups for each 'channel' (data path going to the RAID
controllers on the RAID array). So if you have a dual-controller RAID array, you'd want your RAID groups evenly divided between the two controllers.
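As a rough way to turn a throughput target into a drive count split across controllers, here's a final sketch. The per-drive streaming write rate and the 12-drive tray size are assumptions; use your array's measured numbers instead.

import math

def drives_for_throughput(target_mb_s, per_drive_mb_s, controllers=2, tray_size=12):
    drives = math.ceil(target_mb_s / per_drive_mb_s)
    # Round up so the RAID groups divide evenly across the controllers/channels.
    drives = math.ceil(drives / controllers) * controllers
    trays = math.ceil(drives / tray_size)
    return drives, trays

# 800 MB/sec of sustained sequential writes with SATA drives streaming ~30 MB/sec each:
print(drives_for_throughput(800, 30))   # (28, 3): 28 drives across 3 trays, 14 per controller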