Read/write consistency means that data written by a process becomes ``visible'' immediately after the write returns; it is explicitly guaranteed by IEEE 1003.1-1990. In this section, the term ``Unix IO'' refers to file access which does not use stdio. Figure 3.1 illustrates read/write consistency for two processes, Process A and Process B, which are running on the same node, use Unix IO, and are accessing local data. When Process A or Process B executes a write(), the data goes to a cache for the device. When either process executes a read(), the data is read from that cache. Read/write consistency is maintained because both processes read from and write to the same cache. When the write() of Process A or Process B returns, the data is immediately ``visible'' to any other process on the system using Unix IO. If caching is not used, data goes directly to the device. There is a demonstration of read/write consistency with Unix IO at the end of this section (see fig. 3.4).
Figure 3.1: Local case for read() and write().
There are several situations which involve the concept of read/write consistency when accessing local files. These are:
A process opens a file, writes some data at the beginning of the file, uses lseek() to position the file pointer to the beginning of the file, and reads. This case is read/write consistent. The process reads what it just wrote.
A process opens a file, writes some data at the beginning of the file, and forks a child that uses lseek() to position the file pointer to the beginning of the file and reads. This case is read/write consistent. The child process reads what the parent process just wrote.
Process A opens a file and writes some data at the beginning of the file. Some time after the write of Process A returns, Process B opens the file, positioning the file pointer to the beginning of the file, and reads. This case is read/write consistent. Process B reads what Process A just wrote.
Figure 3.1 shows two processes, Process C and Process D, which use stdio to buffer data for accessing local files. In order to improve performance, data written using stdio usually goes to a buffer associated with a process. The data is not visible to any other process until the buffer is flushed. When an fwrite() returns, the data may have been written to disk or the data may still be in the buffer. An initial fread() will read data from the disk into a buffer associated with the process. Subsequent fread()s will read data from the buffer. Once all the data in the buffer has been read, the buffer is refilled with data from disk. It is possible for data in the buffer to be inconsistent with data on disk. Therefore, read/write consistency fails when using stdio. There is a demonstration of this case at the end of this section (see fig. 3.5).
Figure 3.2: read() and write() for two processes on the same client.
Read/write consistency is also an issue in a network environment. In order to improve performance, most client implementations use some caching mechanism when accessing remote files. Processes on the same node are usually read/write consistent because they read from and write to the same cache (see fig. 3.2). However, processes on different nodes using client caching may not be read/write consistent because the processes read from and write to different caches.
Figure 3.3: read() and write() for two processes on different clients using client caching.
The lack of read/write consistency when using stdio on the
same node is analogous to the
lack of read/write consistency when processes are running on different
nodes. Both stdio and client caching are used to improve
performance. With stdio, each process has a set of read and
write buffers. Thus, read/write consistency is usually only maintained
at the level of a single process. With client caching in a network
environment, read/write consistency may only be maintained for
the set of processes on a single client.
Processes on different clients may not be read/write consistent when
client caching is used. One way to ensure read/write consistency among
processes on different nodes is to forgo the use of client caching.
This means that when Unix IO is used, all writes are sent
to the server before the write() returns, and all reads are
obtained directly from the server before the read()
returns. This usually entails a performance penalty. There
are techniques which permit the use of client caching and still maintain
read/write consistency. An appendix
contains references for some of these
techniques.
Figure 3.3 depicts two processes, Process A and Process B, which are running on different clients and use client caching. Assume that both processes are accessing the same file on the server. An initial read() by a process will read data from the disk on the server into the cache associated with each client. Subsequent read()s by a process will read data from client cache. If Process A executes a write(), the data may not be written immediately to the server. All writes on the client are cached and are written to the server at some later time. Therefore, it is possible for data in cache to be inconsistent with the server's data. In order for Process B to see the data that Process A just wrote, Client A would have to write the data in its cache to the server, and Client B would have to fill its cache with new data from the server. Processes on different nodes may not be read/write consistent when using client caching because the data may not be ``visible'' to all processes after a write() returns.
Demonstrations were developed using an NFS implementation
to illustrate read/write consistency. The following
demonstrations use the programs WriteUnixIO and WriteStdIO.
Source code for these programs is located in an appendix.
WriteUnixIO is a program which reads input from the terminal
and writes output to outfile. The file name outfile, which
in the demonstrations refers to either a local file or a remote file
mounted under the directory mnt,
is a parameter to WriteUnixIO.
Data is read and written a
byte at a time using Unix IO. WriteStdIO is similar to
WriteUnixIO
except that the program uses stdio to buffer data for writing.
Periodically, the command ``cat outfile; echo " "; ls -l outfile''
is run to display the contents and the size of outfile.
This command shows whether or not data is ``visible'' immediately
after a write() returns. For all the
demonstrations, Process A is either the program WriteUnixIO
or WriteStdIO and Process B is a process which displays the
contents and size of outfile.
Figure 3.4 and figure 3.5 illustrate how the
demonstration proceeds.
Commands are displayed in italics
and the output of those commands is displayed in bold.
Process A and Process B are both on the same system and outfile is a local file. Process A is the program WriteUnixIO. After data is entered, Process B displays the contents of outfile. Data is ``visible'' to Process B immediately after a write() from Process A returns. Unix IO on a single system is read/write consistent.
Figure 3.4: A demonstration of read/write consistency.
Process A and Process B are both on the same system and outfile is a local file. Process A is the program WriteStdIO. After data is entered, Process B displays the contents of outfile. Because stdio is used to buffer output, data is not ``visible'' to Process B immediately after a stdio write call in Process A returns. Stdio on a single system is not read/write consistent. The data becomes ``visible'' when a Ctrl-D is received, which causes the buffer to be flushed to disk.
Figure 3.5: A demonstration of the lack of read/write consistency.
Process A and Process B are on the same client and outfile is a remote file. Process A is the program WriteUnixIO. After data is entered, Process B displays the contents of outfile. Data is ``visible'' to Process B immediately after a write() from Process A returns. Two processes on the same node using NFS are read/write consistent.
Process A is on Client A, Process B is on Client B, and outfile is remote to both processes. Process A is the program WriteUnixIO. After data is entered, Process B displays the contents of outfile. Because client caching is used, data is not ``visible'' to Process B for several seconds after a write() from Process A returns. Processes on different clients which use client caching are not guaranteed to be read/write consistent. The data becomes ``visible'' to Process B when Client A's cache is written to the server and Client B's cache is refilled from the server. This is similar to the read/write consistency issue when using stdio.
Process A is on Client A, Process B is on Client B, and outfile is remote to both Process A and Process B. Process A is the program WriteUnixIO. Before Process A is started, another process on Client A runs setlock outfile to lock outfile in the same manner as was used in the demonstration on the effect of client caching on performance. Advisory record locking is one method of turning off client caching for individual files. Although Client B uses client caching, each time the command line of Process B is run, outfile is opened, read in its entirety, and closed. The important thing to note from this demonstration is that a write() from Process A is ``visible'' immediately to Process B after the write() returns because the data is written directly to the server instead of to Client A's cache. Also note that even though all bytes in outfile have been exclusively locked, Process A is able to write to outfile because the locking mechanism is advisory and not mandatory. Read/write consistency is guaranteed, using NFS, for processes on different nodes that do not use client caching.
Processes on different nodes that do not use client caching are read/write consistent, but processes on different nodes that do use client caching may not be. Applications that need to guarantee read/write consistency should use record locking. In some implementations where client caching is used, record locking is the preferred method of guaranteeing read/write consistency. Because of possible performance degradation from providing read/write consistency all of the time, some implementations only guarantee read/write consistency among processes that use record locking for simultaneous file access.
Table 3.1: Read/write consistency summary
Table 3.1 summarizes the results of the read/write consistency demonstrations. The cases listed as not read/write consistent may be consistent some of the time, but consistency is not guaranteed. Conversely, the demonstration of processes on different nodes using NFS without client caching is read/write consistent even though Client B uses client caching, because each run of Process B opens outfile, reads it in its entirety, and closes it. Processes on different clients using client caching, and processes using stdio, may be read/write consistent some of the time, but not all of the time. The only way to guarantee read/write consistency for all cases of processes simultaneously accessing files and for all implementations is to use record locking and forgo the use of stdio.