ACMQueue
Wed, Oct 10, 2007

Standardizing Storage Clusters

by Garth Goodson, Sai Susarla, and Rahul Iyer, Network Appliance

The pNFS protocol could pave the way for commoditized parallel file systems. Implementing a pNFS client adds some amount of complexity to a regular NFSv4 client.

Data Distribution


Data-intensive applications such as data mining, movie animation, oil and gas exploration, and weather modeling generate and process huge amounts of data. File-data access throughput is critical for good performance. To scale well, these HPC (high-performance computing) applications distribute their computation among numerous client machines. HPC clusters can range from hundreds to thousands of clients with aggregate I/O demands ranging into the tens of gigabytes per second.

To simplify management, data is typically hosted on a networked storage service and accessed via network protocols such as NFS (Network File System) and CIFS (Common Internet File System). For scalability, the storage service is often distributed among multiple nodes to leverage their aggregate compute, network, and I/O capacity. Traditional network file protocols, however, restrict clients to accessing all files in a file system through a single server node. This prevents a storage service from delivering its aggregate capacity to clients on a per-file basis and limits scalability. To circumvent this single-server bottleneck, designers of clustered file services have three choices (illustrated in figure 1):

  • To force administrators to partition the file namespace manually among servers via multiple mount points.
  • To hide the data distribution from clients by employing a transparent request routing layer.
  • To devise a new protocol that allows clients to directly access multiple servers in parallel.

The first approach imposes the burden and expense of manual data distribution on system administrators. It is error-prone, reduces availability, and quickly becomes unmanageable as data grows in size. Moreover, it cannot spread large files over multiple servers without application-level changes.
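The manual partitioning described above can be sketched as a static mount table. This is a minimal illustration, not any real system's configuration: the path prefixes, server names, and the `server_for` helper are all hypothetical.

```python
# Hypothetical mount table maintained by hand by an administrator:
# each path prefix is exported by exactly one server.
MOUNT_TABLE = {
    "/data/seismic": "server-a",
    "/data/render": "server-b",
    "/data/weather": "server-c",
}


def server_for(path: str) -> str:
    """Return the single server that owns `path` (longest matching mount prefix)."""
    matches = [p for p in MOUNT_TABLE if path == p or path.startswith(p + "/")]
    if not matches:
        raise LookupError(f"no mount point covers {path}")
    return MOUNT_TABLE[max(matches, key=len)]


# Every byte of this file is served by one node, however large the file is.
print(server_for("/data/render/shot42/frame0001.exr"))
```

Because the mapping is keyed on path prefixes, rebalancing means editing the table and moving data by hand, and a single large file can never be spread across more than one server.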

The second approach allows existing unmodified clients to access distributed storage and hence is simple to deploy and maintain on large client farms. It limits end-to-end scalability, however, by forcing all of a client’s data to flow through a single entry point.

The third approach eliminates this bottleneck and enables true data parallelism. As such, it has been adopted by several clustered storage solutions. Because of the lack of a standard protocol for parallel data access, however, the protocols and interfaces remain proprietary.
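The difference between the second and third approaches can be made concrete with a rough back-of-the-envelope model. The function names and all bandwidth figures below are illustrative assumptions, not measurements from the article; the model ignores protocol overhead and contention.

```python
def routed_throughput(n_servers, server_bw, entry_point_bw, client_bw):
    """Approach 2: all of a client's data flows through one entry point,
    so that node's bandwidth caps what the client can see."""
    return min(client_bw, entry_point_bw, n_servers * server_bw)


def parallel_throughput(n_servers, server_bw, client_bw):
    """Approach 3: the client talks to every server directly,
    so only its own link and the servers' aggregate bandwidth matter."""
    return min(client_bw, n_servers * server_bw)


# Illustrative numbers only (arbitrary bandwidth units):
# 8 servers at 1 unit each, a client link of 10 units.
routed = routed_throughput(n_servers=8, server_bw=1, entry_point_bw=1, client_bw=10)
direct = parallel_throughput(n_servers=8, server_bw=1, client_bw=10)
print(routed, direct)
```

Under these assumptions, adding servers raises the parallel figure until the client's own link saturates, while the routed figure stays pinned at the entry point's bandwidth.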

Although custom client access protocols provide the best performance and scalability, they have limitations. They inhibit interoperability across diverse client platforms and storage architectures. They make it difficult to develop and maintain client software for the heterogeneous platforms that must operate in large compute farms over extended periods of time. Finally, clients built on inflexible interfaces require constant maintenance and cannot evolve rapidly to benefit from advances in distributed storage architectures. The lack of a standard parallel data access protocol remains a key hurdle to the widespread adoption of clustered storage for mission-critical HPC applications.

The pNFS (parallel NFS) protocol is being standardized as part of the NFSv4.1 specification to bridge the gap between current NFS protocols (versions 2, 3, and 4) and parallel cluster file system interfaces. Current NFS protocols force clients to access all files on a given file-system volume from a single server node, which can become a bottleneck for scalable performance. As a standardized extension to NFSv4, however, pNFS provides clients with scalable end-to-end performance and the flexibility to interoperate with a variety of clustered storage service architectures.

The pNFS protocol enables clients to directly access file data spread over multiple storage servers in parallel. As a result, each client can leverage the full aggregate bandwidth of a clustered storage service at the granularity of an individual file. A standard protocol also improves manageability of storage client software and allows for interoperability across heterogeneous storage nodes. Finally, the pNFS protocol is backward-compatible with the base NFSv4 protocol. This allows interoperability between old and new clients and servers.

Using the pNFS protocol, clients gather metadata, called layouts, that describes how files are distributed across data servers. Layouts are maintained internally by the pNFS server. Once the client understands a file’s layout, it can access the data servers directly and in parallel. Unlike NFSv4, in which a client accesses data via the NFS protocol from a single NFS server, a pNFS client can communicate with the data servers using one of several storage access protocols, including NFSv4 and iSCSI/Fibre Channel with either the SCSI block command set or the new SCSI object command set. The pNFS specification allows for the addition of new layout distributions and storage access protocols. It also provides significant flexibility in the implementation of the back-end storage system.
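To illustrate how a layout lets a client go straight to the data servers, here is a minimal sketch that assumes a simple round-robin striped layout. The `StripedLayout` class, the in-memory "data servers," and the helper functions are hypothetical stand-ins; they do not reproduce the pNFS wire format or any real client's API, and a real client would obtain the layout from the pNFS server rather than construct it locally.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class StripedLayout:
    """Hypothetical layout: fixed-size stripe units placed round-robin."""
    stripe_unit: int
    data_servers: tuple

    def locate(self, offset):
        """Map a file offset to (server, offset within that server's store)."""
        stripe_index = offset // self.stripe_unit
        n = len(self.data_servers)
        server = self.data_servers[stripe_index % n]
        local = (stripe_index // n) * self.stripe_unit + offset % self.stripe_unit
        return server, local


def stripe(data, layout):
    """Simulate the back end: spread `data` over in-memory 'data servers'."""
    stores = {s: bytearray() for s in layout.data_servers}
    for start in range(0, len(data), layout.stripe_unit):
        server, _ = layout.locate(start)
        stores[server] += data[start:start + layout.stripe_unit]
    return stores


def read_chunk(stores, layout, offset, length):
    """One request to one data server; must not cross a stripe-unit boundary."""
    server, local = layout.locate(offset)
    return bytes(stores[server][local:local + length])


def parallel_read(stores, layout, offset, length):
    """Read [offset, offset + length) with one request per stripe unit touched."""
    requests, pos, end = [], offset, offset + length
    while pos < end:
        unit_end = (pos // layout.stripe_unit + 1) * layout.stripe_unit
        n = min(end, unit_end) - pos
        requests.append((pos, n))
        pos += n
    with ThreadPoolExecutor(max_workers=len(layout.data_servers)) as pool:
        # map() preserves request order, so the parts join back in file order.
        parts = list(pool.map(lambda r: read_chunk(stores, layout, *r), requests))
    return b"".join(parts)


layout = StripedLayout(stripe_unit=64, data_servers=("ds0", "ds1", "ds2"))
data = bytes(range(256)) * 4
stores = stripe(data, layout)
assert parallel_read(stores, layout, 100, 500) == data[100:600]
```

The key design point is that the layout is pure metadata: once the client holds it, locating any byte is local arithmetic, and no server sits in the data path between the client and the data servers.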

ACM Queue vol. 5, no. 6 - September / October 2007

© ACM, Inc. All rights reserved.