Key considerations in developing a storage area network design
Storage area networks (SANs) let several
servers share storage resources and are often used in situations that require
high performance or shared storage with block-level access, like virtualized
servers and clustered databases. Although SANs started out as a high-end
technology used only in large enterprises, cheaper SANs are now affordable even
for small and medium-sized businesses (SMBs). In earlier installments of this
Hot Spot Tutorial, we examined what benefits SANs offer over other storage
architectural choices, as well as the two main storage networking protocols,
Fibre Channel and
iSCSI. In this installment, we'll look at the
main considerations you should keep in mind when putting together a storage area
network design.
Uptime and availability
Because several servers will rely on a
SAN for all of their data, it's important to
make the system very reliable and eliminate any single points of failure. Most
SAN hardware vendors offer redundancy within
each unit -- like dual power supplies, internal controllers and emergency
batteries -- but you should make sure that redundancy extends all the way to the
server.
In a typical storage area network design, each
storage device connects to a switch that then connects to the servers that need
to access the data. To make sure this path isn't a point of failure, your client
should buy two switches for the SAN. Each storage unit should connect to
both switches, as should each server. If either path fails, software can fail
over to the other. Some programs will handle that failover automatically, but
cheaper software may require you to enable the failover manually. You can also
configure the program to use both paths if they're available, for load
balancing.
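The failover and load-balancing behavior described above can be sketched in a few lines. Note that `MultipathClient` and its path callables are hypothetical stand-ins for whatever multipath driver your SAN software provides, not a real API:

```python
import itertools

class MultipathClient:
    """Sketch of dual-path I/O with failover and round-robin load balancing.

    `paths` are hypothetical callables that perform a block read and raise
    IOError when the switch or link behind them is down.
    """

    def __init__(self, paths):
        self.paths = list(paths)
        self._rotation = itertools.cycle(range(len(self.paths)))

    def read_block(self, lun, block):
        # Load balancing: start from a different path on each call.
        start = next(self._rotation)
        order = self.paths[start:] + self.paths[:start]
        for path in order:
            try:
                return path(lun, block)   # first healthy path wins
            except IOError:
                continue                  # fail over to the next path
        raise IOError("all SAN paths are down")
```

In practice this logic lives in the multipath layer of the OS or the array vendor's software; the sketch only illustrates why two switches eliminate the single point of failure while also doubling available bandwidth.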
But you should also consider
how the drives themselves are configured, Franco said. RAID technology spreads
data among several disks -- a technique called striping -- and can add parity
checks so that if any one disk fails, its content can be rebuilt from the
others. There are several types of RAID, but the most common in SAN designs are
levels 5, 6 and 1+0.
RAID 5 stripes data and parity information across all of the disks in the
unit, giving up one disk's worth of capacity to parity that can be used to
rebuild any drive that needs to be replaced. RAID 6 adds a second disk's worth
of redundant parity. This protects your client's data in case a
second drive breaks during the first disk's rebuild, which can take up to 24
hours for a terabyte, Franco said. RAID 1+0 stripes data across a series of
disks without any parity checks, which is very fast, but mirrors each of those
disks to a second set of striped disks for redundancy.
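The capacity cost of each level follows directly from those descriptions. As a rough sketch (real arrays lose a little extra to metadata and hot spares):

```python
def usable_capacity(disks, disk_tb, level):
    """Approximate usable capacity in TB for the RAID levels above."""
    if level == "5":
        return (disks - 1) * disk_tb   # one disk's worth of parity
    if level == "6":
        return (disks - 2) * disk_tb   # two disks' worth of parity
    if level == "1+0":
        return disks // 2 * disk_tb    # every disk is mirrored
    raise ValueError(f"unsupported RAID level: {level}")
```

For an eight-disk shelf of 1 TB drives, that works out to 7 TB at RAID 5, 6 TB at RAID 6 and 4 TB at RAID 1+0, which is the usual trade: 1+0 buys speed and rebuild simplicity at the price of half the raw capacity.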
Capacity and scalability
A good storage area network
design should not only accommodate your client's current storage needs, but it
should also be scalable so that your client can upgrade the
SAN as needed throughout the expected lifespan
of the system. You should consider how scalable the SAN is in terms of storage
capacity, number of devices it supports and speed.
Because a SAN's switch
connects storage devices on one side and servers on the other, its number of
ports can affect both storage capacity and speed, Schulz said. By allowing
enough ports to support multiple, simultaneous connections to each server,
switches can multiply the bandwidth to servers. On the storage device side, you
should make sure you have enough ports for redundant connections to existing
storage units, as well as units your client may want to add later.
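A rough sizing sketch of that port math follows; the function names and the two-spare-port headroom are illustrative assumptions, not vendor rules:

```python
def ports_per_switch(servers, storage_units, spare_ports=2):
    """Ports needed on EACH of the two redundant switches when every server
    and every storage unit has one connection to each switch. The spare
    ports are headroom for units your client may add later."""
    return servers + storage_units + spare_ports

def server_bandwidth_gbps(active_paths, link_gbps):
    """Aggregate bandwidth to one server when software load-balances across
    several active paths; assumes ideal striping across the paths."""
    return active_paths * link_gbps
```

So a design with four servers and two storage units needs at least eight ports on each switch, and a server with two active 8 Gbps paths can see up to 16 Gbps of aggregate bandwidth under ideal conditions.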
One feature of storage area
network design that you should consider is thin provisioning of storage. Thin
provisioning tricks servers into thinking a given volume within a SAN, known as
a logical unit number (LUN), has more space than it physically does. For
instance, an operating system (OS) that connects to a given LUN may think the
LUN is 2 TB, even though you have only allocated 250 GB of physical storage for
it.
Thin provisioning allows you
to plan for future growth without your client having to buy all of its expected
storage hardware up front. In a typical "fat provisioning" model, each LUN's
capacity corresponds to physical storage. That means that your client will have
to buy as much space as it anticipates needing for the next few years. While
it's possible to allocate a smaller volume now and migrate the data to a
larger one later, that migration is slow and could result in downtime for your
client.
Thin provisioning allows you
to essentially overbook a
SAN's storage, promising a total capacity to
the LUNs that is greater than the SAN physically has. As those LUNs fill up and
start to reach the system's physical capacity, you can add more units to the SAN
-- often in a hot-swappable way. But because this approach to storage area
network design requires more maintenance down the road, it's best for stable
environments where a client can fairly accurately predict how each LUN's storage
needs will grow.
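The bookkeeping behind thin provisioning can be sketched as follows. `ThinPool` and its method names are illustrative, not a real storage API; the point is that a LUN can advertise far more capacity than the pool physically holds, and the pool must be monitored so physical space is added before it runs out:

```python
class ThinPool:
    """Sketch of thin-provisioning bookkeeping for a pool of physical storage."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.luns = {}  # name -> [advertised_gb, used_gb]

    def create_lun(self, name, advertised_gb):
        # The OS sees `advertised_gb`; no physical space is consumed yet.
        self.luns[name] = [advertised_gb, 0]

    def write(self, name, gb):
        advertised, used = self.luns[name]
        if used + gb > advertised:
            raise ValueError("LUN is full from the host's point of view")
        if self.used_gb() + gb > self.physical_gb:
            raise RuntimeError("pool exhausted: add physical storage")
        self.luns[name][1] += gb

    def used_gb(self):
        return sum(used for _, used in self.luns.values())

    def overcommit_ratio(self):
        # How heavily the pool is "overbooked" relative to real capacity.
        advertised = sum(a for a, _ in self.luns.values())
        return advertised / self.physical_gb
```

Using the article's numbers, a 2 TB LUN backed by 250 GB of physical storage is an 8:1 overcommit; writes succeed until the pool itself fills, at which point more units must be added to the SAN.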
Security
With several servers able to
share the same physical hardware, it should be no surprise that security plays
an important role in a storage area network design. Your client will want to
know that servers can only access data if they're specifically allowed to. If
your client is using iSCSI, which runs on a standard Ethernet network, it's also
crucial to make sure outside parties won't be able to hack into the network and
have raw access to the SAN.
Most of this security work is
done at the SAN's switch level. Zoning allows you to give only specific servers
access to certain LUNs, much as a firewall allows communication on specific
ports for a given IP address. If an outward-facing application, such as a web
server, needs to access the SAN, configure the switch so that only that
server's IP address can reach the LUNs it uses.
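The default-deny behavior that zoning should enforce can be expressed as a simple lookup. The zone table below is hypothetical; real zoning is configured on the switch (by port or worldwide name for Fibre Channel, or via ACLs and CHAP for iSCSI):

```python
# Hypothetical zoning table: which server initiators may see which LUNs.
ZONES = {
    "web-server": {"lun-web"},
    "db-server":  {"lun-db", "lun-logs"},
}

def may_access(initiator, lun):
    """Grant access only when the zoning table explicitly allows it;
    anything not listed -- including unknown servers -- is denied."""
    return lun in ZONES.get(initiator, set())
```

The important property is the default: a server, or an outside party who reaches the network, sees nothing unless a zone explicitly grants it.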
If your client is using
virtual servers, the storage area network design will also need to make sure
that each virtual machine (VM) has access only to its LUNs. Virtualization
complicates
SAN security because you cannot limit access to
LUNs by physical controllers anymore -- a given controller on a physical server
may now be working for several VMs, each with its own permissions. To restrict
each server to only its LUNs, set up a virtual adapter for each virtual server.
This will let your physical adapter present itself as a different adapter for
each VM, with access to only those LUNs that the virtualized server should see.
Replication and disaster recovery
With so much data stored on a
SAN, your client will likely want you to build disaster recovery into the
system. SANs can be set up to automatically mirror data to
another site, which could be a failsafe SAN a
few meters away or a
disaster recovery (DR) site hundreds or
thousands of miles away.
If your client wants to build
mirroring into the storage area network design, one of the first considerations
is whether to replicate synchronously or asynchronously. Synchronous mirroring
means that as data is written to the primary SAN, each change is sent to the
secondary and must be acknowledged before the next write can happen.
While this ensures that both
SANs are true mirrors, synchronization
introduces a bottleneck. If the round-trip latency to the secondary site
reaches even 100 to 200 milliseconds (ms), writes will slow down because the
primary SAN has to wait for each confirmation. Although other factors
contribute, latency is largely a function of distance; synchronous replication
is generally practical only up to about 6 miles.
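The impact of that wait is easy to bound. Assuming the worst case, in which every write is fully serialized behind one round trip to the secondary (real arrays pipeline writes, so treat this as a floor, not a prediction):

```python
def sync_write_ceiling(rtt_ms):
    """Upper bound on serialized synchronous writes per second when each
    write must be acknowledged by the secondary site before the next one
    can begin. A worst-case sketch; pipelining raises the real number."""
    return 1000.0 / rtt_ms
```

At 100 ms of round-trip latency the serialized ceiling is 10 writes per second, and at 200 ms it is 5, which is why synchronous mirroring over long distances is rarely workable.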
The alternative is to
asynchronously mirror changes to the secondary site. You can configure this
replication to happen as quickly as every second, or every few minutes or hours.
This means your client could permanently lose some data if the primary SAN
goes down before it has copied recent changes to the secondary. Your client
should therefore work backward from its recovery point objective (RPO) to
determine how often it needs to mirror.
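That calculation can be sketched in two small helpers. The safety factor of 2 is an assumption to leave room for transfer time and a missed cycle, not a standard figure:

```python
def max_replication_interval(rpo_minutes, safety_factor=2.0):
    """Longest asynchronous replication interval (in minutes) that keeps
    worst-case data loss comfortably within the RPO."""
    return rpo_minutes / safety_factor

def worst_case_loss_gb(interval_minutes, change_rate_gb_per_min):
    """If the primary fails just before a scheduled copy, everything
    written since the last successful replication is lost."""
    return interval_minutes * change_rate_gb_per_min
```

For example, a client with a one-hour RPO would replicate at least every 30 minutes under this rule, and at a change rate of 0.5 GB per minute its worst-case loss per 30-minute cycle is 15 GB.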