Goal of xFS
Central file servers are often performance, reliability, and
availability bottlenecks in today's networks of workstations.
The xFS file system is designed to distribute the work of the central
file server over the network of workstations (NOW), avoiding central
bottlenecks and letting service scale with the NOW.
Eliminating the Central Server
How does xFS accomplish its goal of eliminating the central file server?
A typical file server has four main duties:
- Caching data blocks
- Maintaining cache coherence
- Storing data on disk
- Recording where the data is stored on disk
xFS uses a hash function to divide the name space of the file system.
Each part of the name space is assigned to a different manager.
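As a rough illustration of hashing the name space onto managers, here is a minimal sketch. The function name `manager_for`, the use of a string file identifier as the key, and the choice of SHA-256 are all assumptions made for the example; the text above does not say which hash function or key xFS actually uses.

```python
import hashlib


def manager_for(file_id: str, num_managers: int) -> int:
    """Map a file identifier to a manager index (hypothetical scheme).

    A stable hash is used so that every client computes the same
    manager for the same identifier without consulting a central server.
    """
    if num_managers <= 0:
        raise ValueError("num_managers must be positive")
    digest = hashlib.sha256(file_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a manager index.
    return int.from_bytes(digest[:8], "big") % num_managers


if __name__ == "__main__":
    for name in ["/home/a/notes.txt", "/home/b/paper.tex", "/tmp/build.o"]:
        print(name, "-> manager", manager_for(name, 4))
```

Because the mapping depends only on the identifier and the number of managers, any client can locate the responsible manager locally.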
xFS accomplishes distributed caching and cache coherence of data
blocks through cooperative caching and distributed management.
Clients may obtain copies of data blocks from other clients. The
manager of a data block keeps track of which clients currently cache the block
and enforces cache coherence.
This feature is shown in the demo. For example, if Client 5 has cached a data block and
Client 3 wants to read it, Client 3 asks the block's manager where the block is.
The manager forwards the request to Client 5, which holds the block in memory and
sends it to Client 3.
However, if Client 3 wants to write to the block, the manager must also
revoke Client 5's copy of the data so that cache consistency is maintained.
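The read-forwarding and write-revocation behavior described above can be sketched as a toy, single-process model. The `Manager` class, its method names, and the `caches` dictionary (client id to that client's block cache) are hypothetical names for illustration; the real system exchanges network messages and handles many cases this sketch ignores.

```python
class Manager:
    """Toy manager that tracks which clients cache each block (illustrative only)."""

    def __init__(self):
        # block id -> set of client ids currently caching that block
        self.holders = {}

    def read(self, block_id, reader, caches):
        """Serve a read from the reader's own cache or from a peer's cache."""
        holders = self.holders.setdefault(block_id, set())
        if reader in holders:
            return caches[reader][block_id]
        source = next(iter(holders), None)
        if source is None:
            # No client caches the block; a real system would go to the storage servers.
            return None
        data = caches[source][block_id]   # the peer forwards the block
        caches[reader][block_id] = data   # the reader now caches it too
        holders.add(reader)
        return data

    def write(self, block_id, writer, data, caches):
        """Revoke every other cached copy, then let the writer update the block."""
        holders = self.holders.setdefault(block_id, set())
        for client in holders - {writer}:
            caches[client].pop(block_id, None)  # revoke the stale copy
        caches[writer][block_id] = data
        self.holders[block_id] = {writer}


if __name__ == "__main__":
    caches = {3: {}, 5: {}}
    manager = Manager()
    manager.write(1, 5, b"old", caches)        # Client 5 caches block 1
    print(manager.read(1, 3, caches))          # Client 3 gets it from Client 5
    manager.write(1, 3, b"new", caches)        # Client 5's copy is revoked
    print(sorted(manager.holders[1]), caches[5])
```

Keeping the holder set at the manager is what lets a write invalidate exactly the clients that hold a copy, without broadcasting to everyone.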
Disk storage of data is distributed across several machines called storage servers.
Each client keeps an in-memory log of its dirty data. When the log is full, or when
the data is synced to disk for reliability reasons, the data is divided into
fragments and sent over the network to the storage servers. A parity fragment is
also constructed and sent to another storage server to increase reliability and availability.
You can see this feature in the demo by having one client create many blocks and then pressing the flush button.
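To make the fragment-plus-parity idea concrete, here is a small sketch. It assumes byte-wise XOR parity (as in RAID-style striping) and zero padding of the last fragment; the text above only says that a parity fragment is constructed, so both choices, along with the function names, are assumptions.

```python
def split_with_parity(segment: bytes, num_data: int):
    """Split a log segment into equal-size data fragments plus one XOR parity fragment."""
    if num_data <= 0:
        raise ValueError("num_data must be positive")
    frag_len = -(-len(segment) // num_data)  # ceiling division
    padded = segment.ljust(frag_len * num_data, b"\0")  # pad the tail with zeros
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(num_data)]
    parity = bytearray(frag_len)
    for frag in fragments:
        for i, byte in enumerate(frag):
            parity[i] ^= byte
    return fragments, bytes(parity)


def rebuild(fragments, parity: bytes, lost: int) -> bytes:
    """Recover the fragment at index `lost` from the parity and the surviving fragments."""
    out = bytearray(parity)
    for idx, frag in enumerate(fragments):
        if idx == lost:
            continue  # this fragment is treated as unavailable
        for i, byte in enumerate(frag):
            out[i] ^= byte
    return bytes(out)


if __name__ == "__main__":
    frags, parity = split_with_parity(b"hello world!", 3)
    # Pretend the storage server holding fragment 1 has failed.
    print(rebuild(frags, parity, lost=1))
```

With one parity fragment per group, the data survives the loss of any single storage server in that group.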
Performance
Small write performance
Figure 1 shows how the file system scales with the number of clients
actively writing small (1KB) files. xFS's performance scales with many active
clients because small writes are batched into large writes, which are then
distributed to disks across the network.
In contrast, central file server systems, such as the Network File System (NFS)
and the Andrew File System (AFS), can
only keep up with a few active clients before they become saturated with requests.
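The batching of small writes into large ones can be sketched as a buffer that flushes only when it fills up or when a sync is requested. The `WriteLog` class, its `capacity` parameter, and the `flush` callback are hypothetical names for illustration, and the sizes used are arbitrary rather than values taken from xFS.

```python
class WriteLog:
    """Toy in-memory log that turns many small writes into a few large ones."""

    def __init__(self, capacity: int, flush):
        self.capacity = capacity  # flush threshold in bytes (arbitrary in this sketch)
        self.flush = flush        # callback that receives one large segment
        self.buffer = []
        self.size = 0

    def write(self, data: bytes) -> None:
        """Append a small write; flush automatically once the log is full."""
        self.buffer.append(data)
        self.size += len(data)
        if self.size >= self.capacity:
            self.sync()

    def sync(self) -> None:
        """Send everything buffered so far as a single large write."""
        if self.buffer:
            self.flush(b"".join(self.buffer))
            self.buffer.clear()
            self.size = 0


if __name__ == "__main__":
    segments = []
    log = WriteLog(capacity=4096, flush=segments.append)
    for _ in range(8):
        log.write(b"x" * 1024)  # eight 1KB writes
    print(len(segments), "large writes issued")
```

The point of the design is that the storage servers see a few large transfers instead of one request per small file.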
Figure 1: Small write performance
Large write performance
Figure 2 shows that xFS is able to scale well with large numbers of clients
actively making large writes. Because xFS distributes its writes across
several storage servers and different clients write to different storage
server groups, it is able to provide the aggregate bandwidth of
all the disks in the network of workstations rather than only the disks
attached to the file server.
The graph for large read performance is identical.
Figure 2: Large write performance
Modified Andrew Benchmark
Figure 3 shows the results of running the Modified Andrew Benchmark on xFS,
NFS, and AFS. The benchmark has several phases, three of which are shown in
the graph. The write phase of the benchmark shows that xFS scales well with many
clients writing, which is consistent with the results of the previous two benchmarks.
All of the file systems scale well on the read phase of the benchmark. However,
because the read phase reads only 4MB and each client has a much larger cache, this phase
measures only how fast each file system can read from its local cache.
The compile phase of the benchmark is CPU-bound rather than I/O-bound.
However, enough reading and writing of binaries and temporary files occurs
to affect the performance of AFS and NFS. xFS also shows a decrease in
performance in this phase as more clients are added, but for a different
reason. Because the xFS clients also serve as storage servers and
managers, part of their CPU time is spent in that capacity rather
than on compiling.
Figure 3: Modified Andrew Benchmark