Wed, Oct 10, 2007
ACM Queue: File Systems and Storage issue
Data-intensive applications such as data mining, movie animation, oil and gas exploration, and weather modeling generate and process huge amounts of data. File-data access throughput is critical for good performance. To scale well, these HPC (high-performance computing) applications distribute their computation among numerous client machines. HPC clusters can range from hundreds to thousands of clients with aggregate I/O demands ranging into the tens of gigabytes per second. To simplify management, data is typically hosted on a networked storage service and accessed via network protocols such as NFS (Network File System) and CIFS (Common Internet File System). For scalability, the storage service is often distributed among multiple nodes to leverage their aggregate compute, network, and I/O capacity. Traditional network file protocols, however, restrict clients to accessing all files in a file system through a single server node. This prevents a storage service from delivering its aggregate capacity to clients on a per-file basis and limits scalability. To circumvent the single-server bottleneck of traditional network file system protocols, designers of clustered file services are faced with three choices (these are illustrated in figure 1): manually partition the data among multiple independent file servers; interpose an aggregation layer so that unmodified clients reach the distributed storage through a single entry point; or give clients a custom protocol for direct, parallel access to the storage nodes.
The first approach imposes the burden and expense of manual data distribution on system administrators. It is error-prone, reduces availability, and quickly becomes unmanageable as data grows in size. Moreover, it cannot spread large files over multiple servers without application-level changes. The second approach allows existing unmodified clients to access distributed storage and hence is simple to deploy and maintain on large client farms. It limits end-to-end scalability, however, by forcing a client's data always to flow through a single entry point. The third approach eliminates this bottleneck and enables true data parallelism. As such, it has been adopted by several clustered storage solutions. Because there is no standard protocol for parallel data access, however, the protocols and interfaces remain proprietary. Although custom client access protocols provide the best performance and scalability, they have limitations: they inhibit interoperability across diverse client platforms and storage architectures; they make it difficult to develop and maintain client software for the heterogeneous platforms that must operate in large compute farms over extended periods of time; and clients locked into inflexible interfaces cannot evolve quickly to exploit advances in distributed storage architectures, requiring constant maintenance instead. The lack of a standard parallel data access protocol remains a key hurdle to the widespread adoption of clustered storage for mission-critical HPC applications.

The pNFS (parallel NFS) protocol is being standardized as part of the NFSv4.1 specification to bridge the gap between current NFS protocols (versions 2, 3, and 4) and parallel cluster file system interfaces. Current NFS protocols force clients to access all files on a given file-system volume through a single server node, which becomes a bottleneck to scalable performance.
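The single-server bottleneck can be made concrete with some back-of-the-envelope arithmetic. The toy model below (my illustration, not from the article; the node count and per-node bandwidth are assumed figures) contrasts the peak throughput a client farm can achieve when all file data funnels through one server node versus when clients access every storage node directly in parallel:

```python
# Toy model of clustered-storage throughput: with a traditional
# single-server file protocol, every byte of a file crosses one node,
# so delivered bandwidth is capped at that node's capacity.  With
# direct parallel access, clients can drive all nodes at once.

def aggregate_throughput_gbps(num_storage_nodes, per_node_gbps, parallel):
    """Peak file-data throughput (GB/s) the service can deliver for one file."""
    if parallel:
        # Every storage node serves its share of the file concurrently.
        return num_storage_nodes * per_node_gbps
    # All traffic for the file flows through a single server node.
    return per_node_gbps

# Assumed example: 32 storage nodes, each with ~1 GB/s of usable bandwidth.
single = aggregate_throughput_gbps(32, 1.0, parallel=False)
parallel = aggregate_throughput_gbps(32, 1.0, parallel=True)
print(f"single-server: {single} GB/s, parallel: {parallel} GB/s")
```

Under these assumptions, adding storage nodes does nothing for a single large file until the access path itself is parallelized, which is exactly the gap pNFS targets.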
As a standardized extension to NFSv4, however, pNFS provides clients with scalable end-to-end performance and the flexibility to interoperate with a variety of clustered storage service architectures. The pNFS protocol enables clients to access file data spread over multiple storage servers directly and in parallel. As a result, each client can leverage the full aggregate bandwidth of a clustered storage service at the granularity of an individual file. A standard protocol also improves manageability of storage client software and allows for interoperability across heterogeneous storage nodes. Finally, the pNFS protocol is backward-compatible with the base NFSv4 protocol, allowing old and new clients and servers to interoperate.

Using the pNFS protocol, clients gather metadata, called layouts, that describes how files are distributed across data servers. Layouts are maintained internally by the pNFS server. Once the client understands a file's layout, it can access the data servers directly and in parallel. Unlike NFSv4, in which a client accesses data via the NFS protocol from a single NFS server, a pNFS client communicates with the data servers using a variety of storage access protocols, including NFSv4 and iSCSI/Fibre Channel using the SCSI block command set or the new SCSI object command set. The pNFS specification allows for the addition of new layout distributions and storage access protocols. It also provides significant flexibility in the implementation of the back-end storage system.
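The layout mechanism described above can be sketched with a toy model. The snippet below (my illustration, not the protocol's wire format; the 64 KB stripe unit and round-robin placement are assumptions) shows how a client, having fetched a striped layout from the metadata server, would fan a byte-range read out across the data servers:

```python
# Hypothetical sketch of a pNFS-style striped file layout: a byte range
# of one file is split into per-server reads that the client issues in
# parallel.  The real spec defines several layout types (file, block,
# object); this round-robin scheme is only an illustration.

STRIPE_UNIT = 64 * 1024  # bytes per stripe unit (assumed)

def layout_reads(offset, length, servers):
    """Split a file byte range into (server, offset, length) reads
    under round-robin striping across the given data servers."""
    reads = []
    end = offset + length
    while offset < end:
        unit = offset // STRIPE_UNIT            # which stripe unit this is
        server = servers[unit % len(servers)]   # round-robin placement
        unit_end = (unit + 1) * STRIPE_UNIT     # byte where the unit ends
        n = min(end, unit_end) - offset         # bytes to read from it
        reads.append((server, offset, n))
        offset += n
    return reads

# A 256 KB read at offset 0 of a file striped over 4 data servers:
for srv, off, n in layout_reads(0, 256 * 1024, ["ds1", "ds2", "ds3", "ds4"]):
    print(srv, off, n)  # each 64 KB stripe unit lands on a different server
```

The key design point this illustrates is that once the client holds the layout, the metadata server drops out of the data path entirely: each of the four reads goes straight to its data server.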
by Garth Goodson, Sai Susarla, and Rahul Iyer, Network Appliance
© ACM, Inc. All rights reserved.