SearchStorage.com

Storage Magazine October 2011 by SearchStorage.com
In the October 2011 cover story of Storage magazine, Phil Goodwin provides a status update on solid-state storage technology. Learn about the various use cases...
Storage Magazine June 2011 by SearchStorage.com
Storage Magazine Online June 2011 – Managing Storage for Virtual Servers

Storage Magazine May 2011 by SearchStorage.com
Named as one of Storage magazine’s “Hot Technologies for 2011”, automated storage tiering is an effective way to make efficient use of installed...
BUYING GUIDE
The following cross-section of midrange NAS arrays was selected based on input from industry analysts and...
STORAGE ARTICLES
SANTA CLARA, Calif. -- Amplidata introduced its first product today, the AmpliStor object storage system consisting of a controller, storage node and monitoring software for cloud storage and archiving of large digital data and online media applications.

Amplidata's product launch came during the second day of the Storage Networking World (SNW) Spring 2011 show.

The AmpliStor AS20 storage node and AmpliStor controller use Amplidata's BitSpread core intellectual property for unstructured data. BitSpread takes objects from the application, runs them through a controller and splits them into sub...

SANTA CLARA, Calif. -- The cloud played a prominent role in product releases at Storage Networking World (SNW) Spring 2011 today, with Mezeo Software Corp. releasing a new version of its cloud storage software platform and StorSimple Inc. and Zetta Inc. enhancing their products.

Mezeo launched MezeoCloud 4.0, which lets enterprises and service providers implement a REST API-based public or private storage cloud. The new version runs on distributed cluster servers to increase scalability. Previously, the Mezeo storage platform, which operates as middleware that exposes API cloud storage in front of the...

Intel Corp. launched the Solid-State Drive (SSD) 320 Series with increased performance and capacity over its previous SSD family.

The 320 Series replaces the Intel X25-M SATA SSD product. The 320 uses 25 nm NAND flash memory and comes in 40 GB, 80 GB, 120 GB, 160 GB, 300 GB and 600 GB versions. The X25-M had a maximum capacity of 160 GB. Intel claims the 320 produces up to 39,500 IOPS random reads and 23,000 IOPS random writes with a sequential write performance of 220 MBps and sequential read performance of 270 MBps. Intel SSD 320 Series prices, based on 1,000-unit quantities, range from $89 for 40 GB to $1...

Cisco Systems Inc. took steps to fill in the gaps in Fibre Channel over Ethernet (FCoE) support today as part of its data center portfolio expansion.

Cisco's launch of servers, switches and management tools included a handful of FCoE enhancements. It added FCoE support for the MDS 9500 storage switch and the Nexus 7000 data center director switch platform, as well as multihop FCoE support in its NX-OS operating system. Cisco is also moving to a common management tool -- Data Center Network Manager -- for storage-area network (SAN) and local-area network (LAN) devices.

Director-class, multihop support for Fibre...

STORAGE BEST PRACTICES
Each storage vendor in the auto-tiering space implements the technology differently, and potential users need to assess a number of distinguishing characteristics. Below are five questions to consider before forging ahead with sub-LUN tiering.

Table of Contents: Sub-LUN tiering

>> 1: How many tiers does your auto-tiering software support?
>> 2: Does the size of the data chunks being moved matter?
>> 3: How long does it take auto-tiering software to migrate data to another tier?
>> 4: How can you track the effectiveness of the auto tiering?
>> 5: How much control do you have over auto tiering?

How many tiers does your auto-tiering software support?

One of the first considerations when embarking on the block-based auto-tiering path is the number of tiers the software supports, because even if an IT shop has three or four tiers of storage in an array, the auto tiering might not be able to span all of them.

Compellent Technologies Inc. Data Progression (Compellent has been acquired by Dell Inc.), EMC Corp. Fully Automated Storage Tiering for Virtual Pools (FAST VP), Hewlett-Packard (HP) Co. StorageWorks P9500, HP/3PAR Inc. Adaptive Optimization and Hitachi Data Systems Dynamic Tiering can automatically migrate data across three tiers. Meanwhile, IBM Easy Tier supports two tiers, one of which must be solid-state drives (SSDs). But each of those vendors offers three or more tiers among the mix of solid-state, Fibre Channel (FC), SAS and SATA drives, as well as different rpm rates.

"Three tiers is probably the maximum you need," said David Floyer, chief technology officer (CTO) at The Wikibon Project. "I think it's a law of diminishing returns when you go from 10K Fibre Channel disk to 15K and have a tier there. The relative performance differences and the relative cost differences get a little fine, and the overhead of doing it all gets a bit high."

Compellent's three-tier system treats the 15,000 rpm and 10,000 rpm drives as if they were the same, and places blocks of data on disk space wherever it's open, according to Bob Fine, director of product marketing at the company.

Large IT organizations with a diverse set of applications might see a need to spend more money for additional data tiers, but others might be content with two. "There are some very good designs that only have two tiers: SSD and slow disk," said Valdis Filks, a research director for storage technologies and strategies at Stamford, Conn.-based Gartner Inc. "Until you know what your requirements are, you can't sit down and decide which number of tiers is best."

Does the size of the data chunks being moved matter?

Storage administrators need to confirm that their auto-tiering software operates at the sub-LUN level to ensure they're reserving their most expensive storage for their most critical performance-sensitive data. But it's less clear if they should concern themselves about the size of the data chunks the system moves, even if their storage vendors are.

Users will find no dearth of block size options for auto tiering their data, from Compellent's choice of 512 KB, 2 MB (default) or 4 MB with its Data Progression; to IBM's 1 GB with its Easy Tier; and EMC's 1 GB for Clariion and VNX, and 768 KB to 370 MB for Symmetrix.

Brian Garrett, vice president of Enterprise Strategy Group (ESG) Lab, created the following hypothetical scenario to illustrate how block size factors into automated storage tiering. Suppose a storage system moves hot chunks of data in 8 KB increments from a hard disk drive to a flash drive to improve the performance of a database application. If the hot chunk is 512 KB in length and the auto-tiering system moves chunks in increments of 512 KB, the system is 100% efficient. But if the system shifts a 1 MB chunk, it would be approximately 50% efficient because it would move not only the 512 KB of hot data but also cooler data that might be next to the hot chunk, thereby wasting half of the expensive flash capacity.
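
Garrett's arithmetic can be reproduced with a short sketch. The function names are hypothetical, 1 KB is taken as 1,024 bytes, and the hot region is assumed to be contiguous and aligned to the start of a chunk; real arrays track and place data in vendor-specific ways. The second function simply counts how many chunks a volume divides into at a given chunk size.

```python
import math

KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def migration_efficiency(hot_bytes: int, chunk_bytes: int) -> float:
    """Fraction of the promoted capacity that actually holds hot data.

    Assumes the hot region is contiguous and chunk-aligned, so the system
    promotes ceil(hot / chunk) whole chunks.
    """
    chunks_moved = math.ceil(hot_bytes / chunk_bytes)
    return hot_bytes / (chunks_moved * chunk_bytes)


def chunk_count(volume_bytes: int, chunk_bytes: int) -> int:
    """Number of chunks a volume of the given size divides into."""
    return math.ceil(volume_bytes / chunk_bytes)


if __name__ == "__main__":
    hot = 512 * KB
    for chunk in (512 * KB, 1 * MB, 1 * GB):
        eff = migration_efficiency(hot, chunk)
        chunks = chunk_count(1024 * GB, chunk)
        print(f"chunk={chunk // KB:>8} KB  efficiency={eff:8.3%}  chunks per 1,024 GB={chunks}")
```

With a 512 KB hot region, a 512 KB chunk gives 100% efficiency and a 1 MB chunk gives 50%, matching the scenario above; the chunk count grows as the chunk size shrinks.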

"A smaller chunk size increases the efficiency and cost effectiveness of each sub-LUN migration," Garrett said, via an email. "But smaller chunk sizes increase the amount of metadata needed to monitor and track sub-LUN migrations. Metadata is typically stored in high-speed memory, which adds cost. Doing more metadata updates and lookups could impact performance."

According to Gartner's Filks, the chunk size won't matter much to small organizations that pay little attention to the inner workings of their systems, but it might catch the attention of sophisticated high-end users that want to tune systems for optimal performance based on their application needs.

"It's one of those arguments that will run forever in the storage market: What block size is the most important?" Filks said. "We have argued about block sizes when we've been tuning databases for the last 20 years."

Floyer at The Wikibon Project said vendors constantly raise the issue of block size with him, but he views it as "hogwash." He advises users to "focus on how much money you're going to save; focus on the business case. Can I run this report to predict how much I'm going to save? That's 20 times more important than worrying about the size of the block."

How long does the auto-tiering software collect and analyze workloads before migrating data to another tier?

Some block-based storage systems equipped with sub-LUN tiering collect and analyze data in minutes before migrating it to another tier. Others might take 24 hours to complete the assessment. Several offer the customer a degree of choice.

Dell EqualLogic XVS arrays, for instance, have a learning cycle of approximately 10 minutes before tiering data between SSD and SAS drives. HP-3PAR Adaptive Optimization and HP StorageWorks P9500 have a minimum sampling period of one hour, although customers also have the option to customize the time frame.

EMC claims its Symmetrix arrays can move data based on real-time workload analysis, while its Clariion systems shift blocks based on an analysis window of 24 hours. An EMC spokesperson said the analysis time is optimized based on typical workloads, but users also have the ability to create custom policies. For instance, you can select a window of Monday through Friday from 6 a.m. to 6 p.m. for analysis and ultimately control when the system moves the data.
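
As an illustration of such a custom policy, the sketch below filters I/O samples by the Monday-through-Friday, 6 a.m. to 6 p.m. window from the example. The constant and function names are hypothetical and do not correspond to any EMC interface; the 6 p.m. boundary is treated as exclusive by assumption.

```python
from datetime import datetime, time

# Hypothetical policy values taken from the example in the text.
WINDOW_DAYS = {0, 1, 2, 3, 4}   # datetime.weekday(): Monday is 0, Friday is 4
WINDOW_START = time(6, 0)       # 6 a.m.
WINDOW_END = time(18, 0)        # 6 p.m. (exclusive, by assumption)


def in_analysis_window(ts: datetime) -> bool:
    """Return True if an I/O sample taken at ts should count toward the analysis."""
    return ts.weekday() in WINDOW_DAYS and WINDOW_START <= ts.time() < WINDOW_END


if __name__ == "__main__":
    print(in_analysis_window(datetime(2011, 4, 6, 9, 0)))   # a Wednesday morning
    print(in_analysis_window(datetime(2011, 4, 9, 9, 0)))   # a Saturday morning
```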

IBM Easy Tier monitors activity on 1 GB data chunks to determine the data "temperature" and to create a "heat map"; algorithms then generate a data relocation plan once every 24 hours to place the data on the most appropriate tier.
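
The heat-map idea can be sketched in a few lines. This is not IBM's algorithm, which the article does not describe; it is a minimal illustration that assumes "temperature" is a plain I/O count per 1 GB extent and that the plan simply promotes the hottest extents that fit on the SSD tier. All names are hypothetical.

```python
from collections import Counter

GB = 1024 ** 3
EXTENT_BYTES = 1 * GB  # chunk size cited for Easy Tier in the text


def build_heat_map(io_offsets):
    """Count I/Os per extent; io_offsets is an iterable of byte offsets."""
    return Counter(offset // EXTENT_BYTES for offset in io_offsets)


def relocation_plan(heat_map, ssd_extents):
    """Pick the hottest extents for the SSD tier, up to the space available.

    Ties are broken by extent number so the plan is deterministic.
    """
    ranked = sorted(heat_map.items(), key=lambda kv: (-kv[1], kv[0]))
    return [extent for extent, _count in ranked[:ssd_extents]]


if __name__ == "__main__":
    offsets = [0, 10, 5 * GB, 5 * GB + 1, 5 * GB + 2, 2 * GB]
    heat = build_heat_map(offsets)
    print(relocation_plan(heat, ssd_extents=2))
```

In a product, a plan like this would be regenerated at each analysis interval (every 24 hours in the Easy Tier description) rather than computed once.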

A cyclical application operation that tends to run the same for eight-hour stretches every day might benefit from a 24-hour analysis, whereas dynamic workloads that change quickly might be better suited to speedier assessment periods, noted Randy Kerns, a senior strategist at Evaluator Group Inc. in Broomfield, Colo.

How can you track the effectiveness of the auto tiering?

IT shops can seek out professional services to help pre-determine the potential benefits of automated tiering. Or they can try vendor-supplied tools to predict the effectiveness of the auto tiering, and to later monitor the data movement and system performance.

Some tools show not only what's going on but also answer potential "what if" questions, such as the performance impact of a change in the amount of flash drives or the cost benefit of an increase in the number of SATA disks, often with the savings calculated.

Determining which applications stand to receive the optimal performance benefit from flash drives or the greatest cost benefit with SATA, and how much SSD and SATA an IT shop might need, are complex problems that require calculus to solve, ESG Lab's Garrett said.

"We're just beginning to see tools that can not only model the performance impact but the price impact," Garrett said. "Over time, I think we'll see more friendly tools, more easy ways to do this modeling. But, right now, they're generally sharp and pointy tools in the hands of experts."

EMC, for its part, makes available professional services to plan and implement FAST VP, but it also offers up a free Tier Advisor tool to plan and model FAST configurations based on application workloads.

Compellent Enterprise Manager generates reports that display capacity usage and power and carbon savings with respect to tiering configurations.

Users of Hitachi Data Systems Dynamic Tiering can employ the Hitachi Command Suite or Storage Navigator 2 to monitor the auto tiering, and graphical reports show where the tiers are set and I/O load against each tier. Alerts notify administrators when service levels fall below desired levels.

IBM Easy Tier includes a Storage Tier Advisor Tool (STAT) to report on the system workload of each volume in the pool or to predict how effective Easy Tier will be with SSDs.

To what degree does the auto-tiering software let you set policies or give you a measure of control over auto tiering?

Auto-tiering software can minimize the time-consuming and burdensome tasks associated with moving data to the right tier at the right time, but it doesn't take away the ability to exert a level of control over the storage tiering process. Tiering products, to varying degrees, let users define policies based on their individual needs.

Compellent Data Progression, for example, offers both default policies for customers who have little or no storage management experience, and customizable policies for IT shops that want to tier by application, RAID level or other configuration options. Users can lock a volume that serves a critical application, such as an ERP database, to a specific tier for a set time period or, more typically, set an expiration time for data to stay on fast disk, according to Compellent's Fine.

Hitachi Data Systems users also can lock Dynamic Tiering volumes in place and control the monitoring cycle duration, excluding selected time periods from "heat analysis," according to John Harker, a senior product marketing manager at the company.

HP-3PAR storage systems let users define the optimization mode based on performance, cost or a combination of the two, as well as tinker with the schedule to measure performance or migrate data.

EMC FAST VP allows users to assign policies not only to individual storage devices but to storage groups of one or more related LUNs. The policies define the pools that form the three tiers and the maximum amount of space for each tier.
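
A policy of this kind, with a cap on how much of a storage group may sit on each tier, could be modeled roughly as follows. The class, field and tier names are hypothetical and are not EMC's interface; caps are expressed as whole percentages of the group's capacity by assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TieringPolicy:
    """Hypothetical per-group policy: maximum percent of capacity allowed on each tier."""
    name: str
    max_percent: dict = field(default_factory=dict)

    def allows(self, tier: str, used_gb: float, add_gb: float, group_gb: float) -> bool:
        """Return True if moving add_gb onto tier keeps the group within its cap."""
        limit_gb = self.max_percent.get(tier, 0) * group_gb / 100
        return used_gb + add_gb <= limit_gb


if __name__ == "__main__":
    # A low-priority group that is never allowed onto SSD.
    low_priority = TieringPolicy("low_priority", {"ssd": 0, "fc": 20, "sata": 100})
    print(low_priority.allows("ssd", used_gb=0, add_gb=1, group_gb=1000))
    print(low_priority.allows("fc", used_gb=150, add_gb=50, group_gb=1000))
```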

"You can specify which applications can move into the various tiers or which users can move into the various tiers," Gartner's Filks said. "For example, you may not want to have YouTube applications use SSD because that's just a waste of your resources."

Administrators also need to take care that important financial applications that become crucial at the end of the month haven't been pushed to a lower tier of storage.

"There's common sense involved in this," The Wikibon Project's Floyer said. "Don't do anything that rocks the boat too violently. Just because you can do it doesn't mean you should do it."

IT shops must consider five areas before adopting sub-LUN tiering, as offerings vary in number of tiers, block size, analysis period, policies and monitoring.
These are still early days for cloud storage, but we've already gleaned valuable best practices from administrators and other experts for getting the most from a move to the cloud, whether you're looking to do it today or down the road:

Best practice #1: Scrutinize service-level agreements

Proceed with caution when it comes to getting a service-level agreement (SLA) from a cloud provider. That means read the SLA closely before committing.

"There are a few major providers offering SLAs that are very vague about things like guaranteed recovery and assured destruction of data," Beth Israel Deaconess Medical Center (BIDMC) storage architect Michael Passe said at a Storage Decisions session last year. "You want to look behind the wizard's curtain to see what is really there."

Lauren Whitehouse, a senior analyst at Milford, Mass.-based Enterprise Strategy Group (ESG), said data access is one area that bears close examination in an SLA.

"Generally, SLAs have to do with access to the service, not to data," she said. "Generally, the service has to be down more than 10 minutes before it's considered an outage, so two nine-minute outages in an hour don't count as an outage. If there's an outage of the service, they just adjust the bill -- that's the kind of game that gets played. You have to ask, 'What about access to data?'"

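The gap Whitehouse describes between real downtime and SLA-counted downtime is easy to quantify. The sketch below assumes the rule as stated in the quote (an interruption counts only if it lasts more than 10 minutes); the function name is hypothetical and actual SLA terms vary by provider.

```python
def sla_downtime(outage_minutes, threshold=10):
    """Return (actual, counted) downtime in minutes.

    An interruption counts toward the SLA only if it lasts longer than
    threshold minutes, mirroring the rule described in the quote above.
    """
    actual = sum(outage_minutes)
    counted = sum(m for m in outage_minutes if m > threshold)
    return actual, counted


if __name__ == "__main__":
    # Two nine-minute interruptions in one hour: 18 minutes lost, none counted.
    print(sla_downtime([9, 9]))
```
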
Best practice #2: Follow your business needs

Lantmännen, a collective owned by 40,000 Swedish farmers, saved more than $6 million in the first year after building an internal private cloud with EMC Corp. storage and Riverbed Technology WAFS devices, said Dennis Jansson, Lantmännen's chief security officer.

Jansson said users choose what type of application they need through a web interface, and each service has a fee, SLA and integrated enterprise security management application.

"We're able to actually follow business needs," Jansson said of the cloud. "It doesn't make decisions on applications the users need."

He called the cloud "an easier way to say consolidation, virtualization and standardization."

Best practice #3: Repurpose your own resources

Online advertising sales rep firm Gorilla Nation Media LLC built an external customer-facing cloud and an internal cloud for employees by using servers it already owned along with cloud vendor ParaScale Inc.'s Hyper-scale Storage Cloud software to build an object-based clustered NAS system for unstructured data. Alex Godelman, vice president of technology at Gorilla Nation, said the cloud replaced a more expensive NAS setup.

"To grow the internal cloud, we just add more nodes," he said. "The design of the system is also very simple -- we just kind of use it. And it allows us to breathe some life into a huge existing investment, which means we created the system virtually for free."

Best practice #4: Prepare for the future

Even if you're not ready for the cloud now -- or the cloud's not ready for you -- start thinking about how it may help you down the road.

Charles Shepard, director of systems architecture at the MGM Mirage in Las Vegas, said he will consider an external private cloud when technology advances make it feasible.

"When Fibre Channel over Ethernet [FCoE] becomes completely adaptable and adopted over the next five years, and when it is completely standardized, that is the pathway to develop a full cloud outside our data center," he said. "If you have a big enough pipe, like 10 Gigabit Ethernet [10 GbE] or even 100 [Gigabit] Ethernet, you might be able to take a database and write from it to the cloud."

He said FCoE would be well suited to multitenancy, which is a crucial component of the cloud.

"It inherently subsegments networks for internal and external multitenant environments," he said.

Best practice #5: Beware of hidden costs

Cloud storage providers will tell you the basic cost per gigabyte of cloud storage up front to help you figure out how much it will cost you per month depending on the amount of data you need to store. But these basic costs are only part of the picture, and providers may also charge extra for data transfers, metadata functions, or copying and deleting files. And don't forget the costs of connecting to the cloud, perhaps with a T1 line.
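
A rough cost model makes the difference between the headline rate and the full bill visible. Every rate and volume below is a made-up placeholder, not any provider's pricing; the function and key names are hypothetical.

```python
# Placeholder rates in dollars; substitute the figures from a real quote.
HYPOTHETICAL_RATES = {
    "per_gb_stored": 0.15,
    "per_gb_transferred": 0.10,
    "per_1k_requests": 0.01,
}


def monthly_cloud_cost(stored_gb, transferred_gb, requests, rates, link_cost):
    """Break a monthly bill into its parts (all amounts in dollars)."""
    parts = {
        "storage": stored_gb * rates["per_gb_stored"],
        "transfer": transferred_gb * rates["per_gb_transferred"],
        "requests": requests / 1000 * rates["per_1k_requests"],
        "connectivity": link_cost,
    }
    parts["total"] = sum(parts.values())
    return parts


if __name__ == "__main__":
    bill = monthly_cloud_cost(5000, 1000, 2_000_000, HYPOTHETICAL_RATES, link_cost=400)
    for item, dollars in bill.items():
        print(f"{item:>12}: ${dollars:,.2f}")
```

With these placeholder numbers, the per-gigabyte storage charge accounts for well under two-thirds of the total, which is the kind of gap worth checking before signing.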

For more on cloud storage:

1. Find out why the evolving cloud storage market has users weighing their enterprise data storage options

2. Discover why external cloud storage appeals to smaller firms, but large enterprises remain cautious

3. We explain how internal private cloud storage makes its way into larger enterprises

4. Read why not everyone thinks the future is bright for clouds

Learn best practices for getting the most from cloud storage, including what to look for in a service-level agreement (SLA), using existing resources to set up a cloud and how to avoid hidden pricing.

The iSCSI storage-area network (iSCSI SAN) has been discussed and debated for much of the last decade, but iSCSI has finally come into its own as a networked storage underpinning for virtual server environments, analysts say.

Enterprise data storage vendors such as Hewlett-Packard (HP) Co.'s LeftHand Networks and Dell EqualLogic position their iSCSI SAN products for server virtualization by citing the additional cost of networked storage as a barrier to virtualization adoption for some customers, and by touting the relatively low cost of iSCSI SANs compared with Fibre Channel (FC). But according to Jeff Boles, senior analyst and director, validation services at Hopkinton, Mass.-based Taneja Group, there are some technical considerations that make iSCSI more appealing for virtual servers as well.

"A lot of engineering went into Fibre Channel based on the assumption of one host per port," Boles said. "iSCSI has virtualized access anyway, over an IP connection, and has had more engineering around multiple-host contention and various queuing patterns."

While Ethernet networks and the basic best practices for iSCSI SANs are generally well understood by now, if you're looking to deploy an iSCSI SAN to support server virtualization, experts say there are some different factors to keep in mind than when connecting physical servers via iSCSI. Here are five best practices for using iSCSI in a virtual server environment.

Best practice #1: Look beyond basic iSCSI

In the years since iSCSI first came on the scene, products have had time to mature and develop, adding specialized features along the way. In the meantime, iSCSI-related products have proliferated to the point where software-based iSCSI initiators and targets can be had completely free of charge. iSCSI SANs can be built using commodity server hardware and open source software as well.

But Boles said iSCSI specialists, like HP's LeftHand or Dell EqualLogic, are charging a premium for advanced features such as integrated VMware snapshots. Other iSCSI SAN vendors, such as EMC Corp. and NetApp Inc., offer unified storage arrays with various options for connecting servers, including iSCSI. Disk arrays from storage specialist vendors also often have features like quality of service and virtual machine-aware management consoles.

The iSCSI network these arrays are attached to can also make a difference, Boles said. "If you have the right infrastructural underpinnings, for example a well-built, fully managed Cisco environment, you can apply more sophisticated and granular policies to virtual servers."

On the other hand, some of the most advanced iSCSI deployment methods aren't really necessary for a virtual server environment where cost and consolidation are primary factors in purchasing decisions, countered Greg Schulz, founder and analyst at Stillwater, Minn.-based StorageIO Group. As data grows and 10 Gigabit Ethernet (10 GbE) looms on the horizon, some industry experts see technologies like TCP/IP offload engines (TOE cards) coming into play.

But users should balance the availability of these performance enhancers with their original rationale for deployment, Schulz said. "If low cost is the reason I'm deploying iSCSI, I'm probably not going to invest in hardware adapters. Instead, I might want to enable jumbo frames and quality-of-service features through software."

Best practice #2: Consider where iSCSI targets should live in the virtual environment on an application-by-application basis

For VMware environments specifically, "It used to be users had to make a tough choice," Schulz said, between VMware's clustered file system (VMware vStorage VMFS) or raw device mapping (RDM). Before Version 3.5, VMFS offered features like VMotion, but RDM was sometimes the only way to continue to use value-added features of storage arrays like snapshots and virtual provisioning.

While this is no longer the case today, Brian Garrett, vice president of ESG Lab at Milford, Mass.-based Enterprise Strategy Group (ESG), said users should still evaluate where to place the iSCSI target in the infrastructure for performance and manageability reasons. They have a choice of deploying the target either as a virtual disk at the hypervisor level, allowing the server virtualization software to handle calls to the back-end storage through a virtual hard disk layer, or at the disk array, providing somewhat speedier block-based access to the back-end storage.

"The decision will depend in part on what you're already used to," Garrett said. "But block-based apps like SQL databases, for example, work well with raw disks, and would probably be suited to the pass-through or raw mode."

Best practice #3: Rethink network and cabling designs

"One thing users often don't think about is the way iSCSI can give you freedom from past paradigms," Taneja Group's Boles said. Storage pros are used to the Fibre Channel world, where a monolithic disk array is attached via a complex series of switches and cables to servers in a separate aisle of the data center.

With an increase in scale-out and commodity-hardware-based iSCSI SAN architectures, Boles said a new networked storage deployment might also be a good opportunity to rethink the data center layout. "With some of these iSCSI systems, you can interleave the storage with the server farm, and get the storage closer to the server environment without as many long cables."

Rethinking the physical placement of resources in the data center can help resolve issues with overloading parts of the network. "You don't have to shove I/O down a big trunk and then fan out to the entire infrastructure; interleaving can avoid these bottlenecks," he added.

Best practice #4: Be mindful of monitoring

Boles and Garrett both emphasized that the new virtual world requires new virtualization-aware monitoring tools throughout the data center infrastructure, particularly as highly portable virtual machines (VMs) move around the network. "When you get into a virtual environment, performance monitoring and tuning become a lot more important," ESG Lab's Garrett said. "In the physical world it was easier to make sure you had the right number of actuators to avoid overconsolidating and violating basic storage guidelines."

Added Taneja Group's Boles: "It's easier to implement monitoring from Day 1 than to go back and retrofit a network fabric with monitoring tools; make purchasing decisions with this in mind."

Best practice #5: 10 Gigabit Ethernet remains a ways off

The next boost in Ethernet bandwidth will probably improve iSCSI performance and offer more network consolidation opportunities within data centers, and the transition to 10 Gigabit Ethernet will begin imminently, according to Rick Villars, vice president, storage systems and executive strategies at IDC in Framingham, Mass. "This will be the year server vendors tell people to go to 10 Gigabit Ethernet," he said.

But Villars urged caution when it comes to porting iSCSI SANs to 10 GbE networks too soon, particularly if you're dealing with implementing a virtual server environment already. "You have to decide whether iSCSI is the first or the last thing you want to bring on [to a new 10 GbE network]," Villars said. "Since it's in the early stages, I wouldn't want to go out and start with an iSCSI SAN on [10 GbE] yet."

Storage experts offer best practices on how to maximize iSCSI SAN performance and efficiency in virtual server environments, where the protocol has found its biggest audience.

SECURITY TIPS
What you will learn: We'll help you determine whether or not network-attached storage (NAS) devices are right for your virtualized server environment. In most cases, NAS performance won't equal that of a Fibre Channel storage-area network (SAN), but a properly architected NFS solution can meet the performance needs of most workloads. Learn more about NAS storage devices and their advantages.

For the most part, NAS storage devices in a virtualized server environment function similarly to block storage devices, but there may be some limitations due to their architecture.

If you don't use local storage on your virtual host and want to boot directly from a shared storage device, you'll need a storage resource other than a NAS system. With Fibre Channel and iSCSI adapters you can boot the hypervisor directly from a shared storage device without using any local storage.

NFS uses a software client built into the hypervisor instead of a hardware I/O adapter. Because of that, there's CPU overhead as the hypervisor must use a software client to communicate with the NFS server. On a very busy host this can cause degradation in performance as the CPUs are also being shared by the virtual machines.

In vSphere environments, while you can create virtual machine (VM) datastores on NFS devices, they don't use the high-performance VMFS file system. While this doesn't affect the use of most of vSphere's features, you can't use raw device mappings (RDMs) to attach a physical disk directly to a VM.

Some vendors don't recommend NFS storage for certain sensitive transactional apps (e.g., Exchange and Domino) due to latency that can occur. But there are many factors that figure into this, such as host resources/configuration and the performance of the NFS device you're using. This shouldn't be a problem for a properly sized NFS system.

NFS doesn't support using multipathing from a host to an NFS server. Only a single TCP session will be opened to an NFS datastore, which can limit its performance. This can be alleviated by using multiple smaller datastores instead of a few larger datastores, or by using 10 Gigabit Ethernet (10 GbE) where the available throughput from a single session will be much greater. The multipathing constraint doesn't affect high availability, which can still be achieved using multiple NICs in a virtual switch.

Despite the limitations, there are some good reasons why you might prefer a NAS system over block storage devices.

Many NFS storage devices use thin provisioning by default, which can help conserve disk space because virtual disks don't consume the full amount of space they've been allocated.

File locking and queuing are handled by the NFS device, which can result in better performance vs. iSCSI/FC where locking and queuing are handled by the host server.

NFS doesn't have a single disk I/O queue like a block storage device has, so you may get better performance. The performance of NFS is based on the size of the network connection and the capabilities of the disk array.

Implementing NAS costs a lot less than traditional FC storage. NAS devices require only common NICs instead of expensive HBAs, and use traditional network components rather than expensive FC switches and cables.

Because NAS takes away a lot of the complexity of managing shared storage, specialized storage administrators aren't necessary in most cases. Managing files on an NFS server is much easier than managing LUNs on a SAN.

Virtual datastores can be expanded easily by simply increasing the disk on the NFS server; there's no need to increase the size of datastores as they'll automatically increase accordingly.

Operations like snapshots and cloning are done at the file system level instead of at the LUN level, which can offer greater flexibility and more granular support.

The advantages to using NAS are many and you shouldn't be discouraged by the disadvantages that mainly apply to specific circumstances or with lower quality NAS products. With a properly sized and designed system that will handle the VM workloads on your hosts, NAS can be as good a choice as any block storage device.

Is NAS performance enough for your virtual server workload?

Many IT shops considering NAS as an alternative to block storage for their virtual servers are concerned about performance, and with good reason. In most cases, NAS performance won't equal that of an FC SAN, but a properly architected NFS solution can easily meet the performance needs of most workloads.

Some users end up comparing iSCSI to NAS as they're both low-cost alternatives to FC storage and they can each use existing Ethernet infrastructure. VMware Inc. has published test results comparing the performance of virtual machines on NAS, iSCSI and FC storage devices. The results show that NAS performance is nearly identical to that of both hardware and software iSCSI. As long as the CPU doesn't become a bottleneck, the maximum throughput of both iSCSI and NFS is limited by the available network bandwidth. Software iSCSI and NFS are both more efficient than Fibre Channel and hardware iSCSI at writing smaller block sizes (smaller than 16 KB), but with larger blocks more CPU cycles are used, which makes software iSCSI and NFS less efficient than hardware iSCSI and Fibre Channel. The CPU cost per I/O is greatest with NFS: it's only slightly higher than software iSCSI but much higher than hardware iSCSI and FC. On a host with enough spare CPU capacity, this shouldn't be an issue.

Achieving the best performance with NAS comes down to several factors; the first is having enough CPU resources available so the CPU never becomes a bottleneck to NFS protocol processing. It's easy enough to achieve by simply making sure you don't completely overload your virtual host's CPU with too many virtual machines. Unfortunately, there's no way to prioritize or reserve CPU resources for NFS protocol processing, so you need to make sure you adjust your workloads on your hosts accordingly and monitor CPU usage. Using a technology like VMware's Distributed Resource Scheduler will help balance CPU workloads evenly across hosts.

The second factor is network architecture; the performance of NAS storage is highly dependent on network health and utilization. You should isolate your NAS traffic on dedicated physical NICs that aren't shared with virtual machines. You should also ensure that you use a physically isolated storage network that's dedicated to your hosts and NFS servers, and isn't shared with any other network traffic. Your NICs are your speed limit; 1 Gbps NICs are adequate for most purposes, but to take NFS to the next level and experience the best possible performance, 10 Gbps is the ticket. There are a number of network configuration tweaks you can use to boost performance, as well as technology like jumbo frames.
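To put the "NICs are your speed limit" point into numbers, the sketch below converts a link's line rate into an approximate ceiling on storage throughput. The function name is hypothetical, and the 10% protocol overhead is an assumed figure for illustration rather than a measured value; real overhead depends on frame size, protocol and workload.

```python
def max_throughput_mb_per_s(link_gbps: float, overhead: float = 0.10) -> float:
    """Approximate usable storage throughput (MB/s) for an Ethernet link.

    overhead is an ASSUMED fraction of bandwidth lost to Ethernet/IP/TCP/NFS
    framing; it is a placeholder, not a benchmark result.
    """
    if not 0 <= overhead < 1:
        raise ValueError("overhead must be in [0, 1)")
    bytes_per_s = link_gbps * 1e9 / 8          # bits/s -> bytes/s
    return bytes_per_s * (1 - overhead) / 1e6  # bytes/s -> MB/s


if __name__ == "__main__":
    for gbps in (1, 10):
        print(f"{gbps} Gbps link: roughly {max_throughput_mb_per_s(gbps):.1f} MB/s")
```

Comparing this ceiling against the combined peak I/O of the virtual machines on a host gives a quick indication of whether 1 Gbps NICs are likely to be enough.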

The final factor in NFS performance is the type of NAS storage devices you're connected to. Just like any storage device, you must size your NAS systems to meet the storage I/O demands of your virtual machines. Don't use an old physical server running a Windows NFS server and expect to meet the workload demands of many busy virtual machines. Generally, the more money you put into a NAS product the better performance you'll get. There are many high-end NAS systems available that will meet the demands of most workloads.

BIO: Eric Siebert is an IT industry veteran with more than 25 years of experience who now focuses on server administration and virtualization. He's the author of VMware VI3 Implementation and Administration (Prentice Hall, 2009) and Maximum vSphere (Prentice Hall, 2010).

This article originally appeared in Storage magazine. Are NAS storage devices a fit for a virtual server environment? We list factors in choosing these devices, and explain their impact on your storage systems.

What you'll learn in this tip: Cloud storage options for your company include public, private and hybrid cloud storage. Find out how each one compares in terms of scalability, security, performance, reliability and cost—and learn how to determine which one is right for your data storage environment.

The primary use of cloud storage today is for unstructured data, which is the fastest growing and most voluminous content, causing the most administrative pains. Cloud storage is less suitable for structured data, which continues to live on traditional enterprise data storage.

The benefits of cloud storage technology

The benefits of using cloud storage technology for unstructured data are compelling, starting with lower overall storage costs. Being service based, there's no storage hardware to buy, manage and maintain, and depending on the service, it can greatly reduce, if not eliminate, data center and storage administrator costs. Cloud storage eliminates the expensive technology refreshes that usually kick in three to five years after the initial purchase, whether they're undertaken to get state-of-the-art technology or simply to avoid purchasing expensive support contracts for older arrays.

The technology can provide close to 100% storage utilization by eliminating the massive amounts of unused storage that are needed with traditional data storage for anticipated growth and peak loads. Besides the overall cost savings, scalability of cloud storage and its ability to transparently support base and peak loads are its most appealing characteristics.

Public cloud storage

Public cloud storage services are a cloud storage option offered by a fast growing list of service providers: AT&T, Amazon, Iron Mountain Inc., Microsoft Corp., Nirvanix Inc., Rackspace Hosting Inc. and many others. Their storage infrastructure usually consists of low-cost storage nodes with directly attached commodity drives with an object-based storage stack that manages the distribution of content across nodes. Data in the cloud is typically accessed via Internet protocols, mostly Representational State Transfer (REST) and to a lesser degree Simple Object Access Protocol (SOAP). Resilience and redundancy are achieved by storing each object on at least two nodes. Usage is charged on a dollar-per-gigabyte-per-month basis and, depending on the service provider, there may be additional fees for the amount of data transferred and access charges.
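The pricing model described above can be sketched as a small calculation: a per-gigabyte storage charge plus transfer and access fees. Every price in this example is a hypothetical placeholder, not any provider's actual rate, and the function name is made up for illustration.

```python
def monthly_cost(stored_gb, transferred_gb, requests,
                 price_per_gb=0.15,
                 price_per_gb_transfer=0.10,
                 price_per_10k_requests=0.01):
    """Estimate one month's public cloud storage bill in dollars.

    All default prices are HYPOTHETICAL; substitute the provider's rates.
    """
    storage = stored_gb * price_per_gb                  # capacity charge
    transfer = transferred_gb * price_per_gb_transfer   # data transfer fee
    access = requests / 10_000 * price_per_10k_requests # access charges
    return round(storage + transfer + access, 2)


if __name__ == "__main__":
    # 1 TB stored, 200 GB transferred, one million requests in the month.
    print(monthly_cost(1000, 200, 1_000_000))
```

Running the numbers for a few usage profiles shows how quickly transfer and access fees can matter for content that's read often.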

Public cloud storage is designed for massive multi-tenancy that enables isolation of data, access and security for each client. The type of content stored on public clouds ranges from static non-core application data and archived content that needs to be available, to backup and disaster recovery data. Public cloud storage isn't suited for active content that changes all the time. The primary concern of using public cloud storage in the enterprise is security and, to some extent, performance.

Internal or private cloud storage

Internal or private cloud storage runs on dedicated infrastructure in the data center and, as a result, addresses the two main concerns of security and performance, but otherwise offers the same benefits as public cloud storage. Internal storage clouds are usually for a single tenant, even though larger enterprises may use multi-tenancy features to segregate access by departments or office locations. Unlike their public cloud storage counterparts, scalability requirements are more modest, so internal cloud storage offerings are more likely to have traditional storage hardware under the hood. A case in point is Hewlett-Packard (HP) Co.'s CloudStart, which combines HP BladeSystem Matrix, an HP StorageWorks Enterprise Virtual Array (EVA) Family array and Cloud Service Automation (CSA) software into an internal cloud storage infrastructure. HP CloudStart by itself isn't a private cloud storage offering because it lacks the key element of being service based; instead, it's the enabling infrastructure that could be used by HP, one of its partners or even enterprises to offer it as a fully managed, pay-as-you-go cloud storage offering.

An example of a private cloud storage offering is the Hitachi Data Systems Cloud Service for Private File Tiering. Based on the Hitachi Content Platform (HCP), it resides in the customer's data center but is owned and managed by Hitachi. Besides an initial setup fee, the customer pays for it by usage. Similarly, Nirvanix hNode provides a fully managed, pay-as-you-go, internal cloud offering within the data center, based on the same technology that powers the Nirvanix Storage Delivery Network (SDN).

Hybrid cloud storage

Users who have a hybrid cloud storage environment manage resources both externally and in-house. Because hybrid cloud scenarios often provide an on-site appliance, they can provide local cache and memory, data deduplication and encryption for an IT shop's data.

However, a hybrid cloud solution must meet certain key requirements to work. It must behave like homogeneous storage, be virtually transparent and have mechanisms in place that keep active and frequently used data on-site while moving inactive data to the cloud. These types of clouds also depend on policy engines to define when specific data gets moved into, or pulled out of, the cloud. For more on hybrid storage clouds, check out our tip on hybrid cloud implementation.

Public cloud vs. private cloud vs. hybrid cloud storage

The following chart provides a quick overview of available cloud storage options.

Characteristic | Public cloud storage | Private cloud storage | Hybrid cloud storage
Scalability | Very high | Limited | Very high
Security | Good, but depends on the security measures of the service provider | Most secure, as all storage is on-premises | Very secure; integration options add an additional layer of security
Performance | Low to medium | Very good | Good, as active content is cached on-premises
Reliability | Medium; depends on Internet connectivity and service provider availability | High, as all equipment is on-premises | Medium to high, as cached content is kept on-premises, but also depends on connectivity and service provider availability
Cost | Very good; pay-as-you-go model and no need for on-premises storage infrastructure | Good, but requires on-premises resources, such as data center space, electricity and cooling | Improved, since it allows moving some of storage resources to a pay-as-you-go model

Each cloud storage option discussed here has its pros and cons. Public clouds have high scalability, but often lag in performance. Private clouds generally have high reliability, but limited scalability. And hybrid clouds might offer the in-house control that some companies are looking for, but also tend to cost more. Depending on your specific needs, the size of your environment, and your budget, one of these cloud storage options is bound to be a good fit for your organization.

BIO: Jacob Gsoedl is a freelance writer and a corporate director for business systems. He can be reached at jgsoedl@yahoo.com.

This article originally appeared in Storage magazine. Cloud storage options include public cloud, private cloud and hybrid cloud storage. We weigh their scalability, security, performance, reliability and cost.

What you'll learn in this tip: Implementing hybrid clouds in your data storage environment can be done in three different ways. We provide the details on the various cloud software options you can choose and help you learn which one is best for your organization.

Hybrid clouds come into play when traditional storage systems or internal cloud storage are supplemented with public cloud storage. To make it work, however, certain key requirements must be met. First and foremost, the hybrid storage cloud must behave like homogeneous storage. Except for maybe a small delay when accessing data on the public cloud, it should otherwise be transparent. Mechanisms have to be in place that keep active and frequently accessed data on-premises and push inactive data into the cloud. Hybrid clouds usually depend on nimble policy engines to define the circumstances when data gets moved into or pulled back from the cloud.
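A policy engine of the kind described above can be pictured as a rule that compares each item's last access time against a threshold. The sketch below is a minimal illustration only: the FileInfo type, the plan_moves function and the 90-day threshold are all hypothetical, and commercial products apply far richer policies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileInfo:
    path: str
    last_access: datetime
    in_cloud: bool = False


def plan_moves(files, now, inactive_after=timedelta(days=90)):
    """Return (to_cloud, to_local) path lists under a simple age policy.

    Hypothetical rule: data untouched for `inactive_after` is pushed to the
    cloud; cloud-resident data accessed more recently is pulled back.
    """
    to_cloud, to_local = [], []
    for f in files:
        inactive = now - f.last_access >= inactive_after
        if inactive and not f.in_cloud:
            to_cloud.append(f.path)      # cold data still on-premises
        elif not inactive and f.in_cloud:
            to_local.append(f.path)      # recently used data in the cloud
    return to_cloud, to_local


if __name__ == "__main__":
    now = datetime(2011, 6, 1)
    files = [
        FileInfo("/archive/q1-report.pdf", datetime(2011, 1, 1)),
        FileInfo("/projects/plan.doc", datetime(2011, 5, 25)),
        FileInfo("/projects/budget.xls", datetime(2011, 5, 30), in_cloud=True),
    ]
    print(plan_moves(files, now))
```

Keeping the decision logic separate from the data movement itself makes it easier to adjust thresholds without touching the transfer mechanism.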

There are currently three routes you can take to implement hybrid clouds:
Via cloud storage software that straddles on-premises and public cloud storage
Via cloud storage gateways
Through application integration

Cloud storage software implementation

Combining private cloud storage (on-premises) and public cloud storage into a single heterogeneous storage cloud without custom integration or gateways is only possible today if the internal and external storage clouds run the same cloud storage software. While there are standardization initiatives in progress, such as the Storage Networking Industry Association (SNIA) Cloud Data Management Interface (CDMI), a lack of standards has prohibited out-of-the-box integration between heterogeneous storage clouds. So what we're seeing is cloud software vendors selling their offerings to corporations and service providers to create the prerequisite for hybrid clouds. And some cloud storage providers are offering their storage stacks as internal storage clouds that provide easy integration with their public storage cloud services.

An example of the latter is Nirvanix Inc. Until recently, Nirvanix was only available as a public cloud service, but with the Nirvanix hNode internal cloud storage introduction users are now able to run Nirvanix cloud storage internally and complement it with Nirvanix Storage Delivery Network cloud storage as needed.

Rackspace has been offering its Cloud Files as a public cloud storage service, but it has now open sourced Cloud Files and formed OpenStack.org to drive standardization. The intent is to enable hybrid clouds between service providers and corporate customers, as well as Rackspace Inc.'s public cloud storage service.

Until recently, cloud storage service providers had to either use one of the open source cloud storage products, such as Lustre and MogileFS, with their idiosyncrasies and limitations, or develop their own solutions. In the past couple of years, however, cloud storage software has become available as a commercial product from several vendors who sell it to both enterprises and service providers.

Among the commercially available products, EMC Corp.'s Atmos is the most prominent. It's a software-based, hardware-agnostic, object-based storage stack that consists of three loosely coupled services: a presentation layer that handles interfacing to clients via REST, SOAP and traditional file-system protocols; a metadata management layer that manages where data objects are stored and how they're protected and distributed on storage nodes; and a storage target layer that interfaces with storage nodes. It can run on dedicated hardware or on VMware virtual machines. Architected as a scale-out system, it's able to scale to petabytes of storage by simply adding nodes. EMC sells Atmos to enterprises and providers, so on-premises Atmos deployments can federate with Atmos services in the cloud.

EMC's most prominent customer is AT&T. The AT&T Synaptic Storage virtual private cloud, however, is a hybrid storage cloud offering that's quite different from others. It runs in AT&T data centers, but is accessed by customers through AT&T's MPLS network. As a result, it combines security and performance of private clouds with the economics and scalability of public cloud offerings.

Besides EMC Atmos, there are several other cloud storage software products. Caringo Inc. brought CAStor Content Storage Software into this market by repositioning its content addressable storage (CAS) product as a cloud storage solution. Cleversafe Inc. offers a cloud storage platform that leverages information dispersal algorithms (IDAs) that slice data across nodes in the cloud, eliminating the need for replication; Cleversafe claims it has achieved substantially higher storage utilization than products that have to store multiple copies of data on storage nodes for redundancy.
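The utilization argument for dispersal over replication comes down to simple arithmetic. In the sketch below, data is split into n slices of which any k are enough to rebuild it, so raw capacity consumed per usable byte is n/k, compared with one full copy per replica. The 16-slice, 10-threshold figures are illustrative assumptions, not Cleversafe's actual configuration, and the function names are hypothetical.

```python
def raw_per_usable_replication(copies: int) -> float:
    """Raw capacity consumed per usable byte when storing full copies."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return float(copies)


def raw_per_usable_dispersal(total_slices: int, threshold: int) -> float:
    """Raw capacity per usable byte when any `threshold` of
    `total_slices` slices can rebuild the data."""
    if not 1 <= threshold <= total_slices:
        raise ValueError("threshold must be between 1 and total_slices")
    return total_slices / threshold


def utilization(raw_per_usable: float) -> float:
    """Fraction of raw capacity that holds unique user data."""
    return 1.0 / raw_per_usable


if __name__ == "__main__":
    three_copies = raw_per_usable_replication(3)
    dispersed = raw_per_usable_dispersal(16, 10)   # assumed parameters
    print(f"3 copies:    {utilization(three_copies):.1%} utilization")
    print(f"16/10 slices: {utilization(dispersed):.1%} utilization")
```

With these assumed parameters, the dispersed layout tolerates the loss of six slices while consuming far less raw capacity than three full copies.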

Cloud storage gateways implementation

Cloud storage gateways sit between on-premises storage and public cloud storage. They translate between traditional storage protocols and the more esoteric cloud storage protocols and APIs. Historically, public cloud storage could only be accessed via custom integration. Furthermore, cloud gateways perform data migration of information from on-premises (private) storage into public cloud storage and vice versa, usually via policy engines.

Cloud storage gateways differ in several key areas. They're either block or file based; and they present themselves within the data center as block-based storage or NAS devices. Data deduplication and compression are critical cloud gateway features, as both features significantly impact cloud storage cost. Encryption of data in-transit and while stored in the storage cloud is a must. Some gateways are designed and optimized for backup and archival, some are closely integrated with applications like Microsoft Exchange and SharePoint, and others are targeted as a transactional cloud storage tier to supplement internal storage tiers.
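To see why deduplication and compression affect cloud storage cost, the sketch below estimates how many bytes would actually be uploaded after both are applied. It assumes fixed-size chunks, SHA-256 fingerprints and zlib compression purely for illustration; real gateways may use variable-size chunking and other algorithms, and the function name is hypothetical.

```python
import hashlib
import zlib


def bytes_to_upload(data: bytes, chunk_size: int = 4096, seen=None) -> int:
    """Return the bytes that would be sent after dedup and compression.

    `seen` is a set of chunk digests already stored in the cloud; it is
    updated in place so repeated calls share the same dedup index.
    """
    seen = set() if seen is None else seen
    total = 0
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        if digest in seen:
            continue                      # duplicate chunk: nothing to send
        seen.add(digest)
        total += len(zlib.compress(chunk))  # only unique chunks are sent
    return total


if __name__ == "__main__":
    payload = b"A" * 4096 * 10            # ten identical 4 KB chunks
    print(len(payload), "bytes before,", bytes_to_upload(payload), "after")
```

Because cloud storage is billed by capacity and often by transfer, every byte removed before upload reduces both charges.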

Application integration implementation for hybrid clouds

All public cloud storage services offer APIs to interact with internal cloud storage software and cloud gateways, but these APIs can also be used to directly integrate applications with public cloud storage. Cloud storage APIs enable custom in-house and commercial applications to tap into public cloud storage via REST interfaces.

For instance, backup application vendors have started to add public cloud storage support to their backup suites. Symantec Corp. offers cloud storage support for NetBackup and Backup Exec. Similarly, CommVault's Simpana backup software integrates with public storage clouds.

Whether you choose to implement hybrid clouds via cloud storage software, cloud storage gateways or through application integration, all are viable options with several providers and products to choose from. Be sure to weigh your options and choose the hybrid cloud approach that best suits your storage environment.

BIO: Jacob Gsoedl is a freelance writer and a corporate director for business systems. He can be reached at jgsoedl@yahoo.com.

This article originally appeared in Storage magazine. Are you implementing hybrid clouds in your firm? Find out whether cloud storage software, cloud storage gateways or application integration is the best bet.
What you'll learn in this tip: There are several ways you can fine-tune and improve your storage-area network (SAN). This tip covers topics such as using ISLs and understanding HBA queue depth to help you avoid storage bottlenecks.

In this tip, we take a closer look at how SAN performance and SAN efficiency improve with transparency, testing and a better understanding of the impact your data storage has on the rest of your system. Check out our earlier tip on how to improve your storage networks to find out how storage performance issues are often linked to data storage networks with outdated information or that don't undergo regular testing.

Tip 1. Understand how you're using ISLs

Inter-switch links (ISLs) are critical areas for tuning and, as a SAN grows, they become increasingly important to performance. Vendors often offer conflicting rules of thumb for switch fan-in configurations and the number of hops between switches. In reality, the latency between switch connections is dramatically lower than the latency of mechanical hard drives, even negligible; however, in high fan-in situations or where there are a lot of hops (servers crossing multiple switches to access data), ISLs play an important role.

The top concern is to ensure that ISLs are configured at the correct bandwidth between the switches; getting this wrong seems to be a surprisingly common mistake. Beyond that, it's important to measure the traffic flow between hosts and switches, and the ISL traffic between the switches themselves. Switch reporting tools will provide much of this information, but a visual tool that measures switch intercommunication may be preferable.

Based on the traffic measurements, a determination can be made to rebalance traffic flow by adjusting which primary switch the server connects with, which will involve physical rewiring and potential server downtime. Another option is to add ISLs, which increases bandwidth but consumes ports and, to some extent, further adds to the complexity of the storage architecture.
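One simple way to reason about the trade-off of adding ISLs is the oversubscription ratio: total host-facing bandwidth on an edge switch divided by the total ISL bandwidth leaving it. The sketch below is illustrative only; the port counts and speeds are made-up inputs, and the function name is hypothetical.

```python
def isl_oversubscription(host_port_gbps, isl_gbps):
    """Ratio of total host-port bandwidth to total ISL bandwidth.

    Both arguments are lists of link speeds in Gbps. A higher ratio means
    more hosts contend for the same inter-switch capacity.
    """
    total_isl = sum(isl_gbps)
    if total_isl == 0:
        raise ValueError("no ISL bandwidth configured")
    return sum(host_port_gbps) / total_isl


if __name__ == "__main__":
    hosts = [8] * 24                       # 24 host ports at 8 Gbps (example)
    print(isl_oversubscription(hosts, [8] * 2))  # two ISLs
    print(isl_oversubscription(hosts, [8] * 4))  # after adding two more
```

The ratio is only a starting point; measured traffic, as described above, should decide whether rebalancing hosts or adding ISLs is the better fix.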

Tip 2. Use NPIV for virtual machines

Server virtualization has changed just about everything when configuring SANs and one of the biggest challenges is to identify which virtual machines are demanding the most from the infrastructure. Before server virtualization, a single server had a single application and communicated to the SAN through a single host bus adapter (HBA); now virtual hosts may have many servers trying to communicate with the storage infrastructure all through the same HBA. It's critical to be able to identify the virtual machines that need storage I/O performance the most so that they can be balanced across the hosts, instead of consuming all the resources of a single host. N_Port ID Virtualization (NPIV) is a feature supported by some HBAs that lets you assign each individual virtual machine a virtual World Wide Name (WWN) that will stay associated with it, even through virtual machine migrations from host to host. With NPIV, you can use your switches' statistics to identify the most active virtual machines from the point of view of storage and allocate them appropriately across the hosts in the environment.

Tip 3. Know thy HBA queue depth

HBA queue depth is the number of pending storage I/Os that are sent to the data storage infrastructure. When installing an HBA, most storage administrators simply use the default settings for the card, but the default HBA queue depth setting is typically too high. This can cause storage ports to become congested, leading to application performance issues. If queue depth is set too low, the ports and the SAN infrastructure itself aren't used efficiently. When a storage system isn't loaded with enough pending I/Os, it doesn't get the opportunity to use its cache; if essentially everything expires out of cache before it can be accessed, the majority of accesses will then be coming from disk. Most HBAs set the default queue depth between 32 and 256, but the optimal range is actually closer to 2 to 8. Most initiators can report on the number of pending requests in their queues at any given time, which allows you to strike a balance between too much and not enough queue depth.
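One commonly cited sizing rule, offered here as an assumption rather than as the author's method, is that the pending I/Os from all initiators sharing a storage port shouldn't exceed what that port can queue. The sketch below derives a per-LUN queue depth from that rule. The target port queue size of 2,048 is a hypothetical example; the real figure comes from the array vendor's documentation.

```python
def max_lun_queue_depth(target_port_queue, hosts, luns_per_host):
    """Largest per-LUN queue depth that keeps a shared storage port from
    being oversubscribed, under the rule:

        hosts * luns_per_host * queue_depth <= target_port_queue
    """
    slots = hosts * luns_per_host
    if slots <= 0:
        raise ValueError("hosts and luns_per_host must be positive")
    return max(1, target_port_queue // slots)


if __name__ == "__main__":
    # Hypothetical example: 32 hosts, 8 LUNs each, port queues 2,048 I/Os.
    print(max_lun_queue_depth(2048, hosts=32, luns_per_host=8))
```

The result is a ceiling to start from; the pending-request counts reported by the initiators should guide the final setting.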

Tip 4. Multipath verification

Multipath verification involves ensuring that I/O traffic has been distributed across redundant paths. In many environments, our experts said they found multipathing isn't working at all or that the load isn't balanced across the available paths. For example, if you have one path carrying 80% of its capacity and the other path only 3%, it can affect availability if an HBA or its connection fails, or it can impact application performance. The goal should be to ensure that traffic is balanced fairly evenly across all available HBA ports and ISLs.

You can use switch reports for multipath verification. To do this, run a report with the port WWNs, the port name and the MBps sorted by the port name combined with a filter for an attached device type equal to "server." This is a quick way to identify which links have balanced multipaths, which ones are currently acting as active/passive and which ones don't have an active redundant HBA.
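Once the switch report is exported, the classification described above can be automated. The sketch below takes the MBps figures for one server's HBA ports and labels the result. The thresholds (1 MBps to count a port as active, a 0.5 minimum ratio between the slowest and fastest active port) and the function name are assumptions chosen for illustration.

```python
def classify_paths(mbps_by_port, idle_threshold=1.0, balance_ratio=0.5):
    """Classify one server's HBA ports from switch-reported MBps.

    mbps_by_port maps a port name (or WWN) to its measured throughput.
    Thresholds are illustrative assumptions, not vendor guidance.
    """
    if len(mbps_by_port) < 2:
        return "no redundant path"
    active = [mbps for mbps in mbps_by_port.values() if mbps > idle_threshold]
    if not active:
        return "idle"
    if len(active) == 1:
        return "active/passive"           # only one path carries traffic
    if min(active) / max(active) >= balance_ratio:
        return "balanced"
    return "unbalanced"


if __name__ == "__main__":
    print(classify_paths({"hba0": 80.0, "hba1": 3.0}))
    print(classify_paths({"hba0": 50.0, "hba1": 45.0}))
```

Running this across every server in the report gives a short list of hosts whose multipathing configuration deserves a closer look.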

Tip 5. Improve replication and backup performance

While some environments have critical concerns over the performance of a database application, almost all of them need to decrease the amount of time it takes to perform backups or replication functions. Both of these processes are challenged by rapidly growing data sets that need to be replicated across relatively narrow bandwidth connections and ever-shrinking backup windows. They're also the most likely processes to put a continuous load across multiple segments within the SAN infrastructure. The backup server is the most likely candidate to receive data that has to hop across switches or zones to get to it.

All of the above tips apply doubly to backup performance. Also consider adding extra HBAs to the backup server and have ports routed to specific switches within the environment to minimize ISL traffic.

BIO: George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments.

This article originally appeared in Storage magazine. There are several ways you can fine-tune and improve your SAN performance. Follow our five tips to avoid data storage bottlenecks and improve SAN efficiency.