It was taking over 17 hours to back up a 3.2GB filesystem to a tape drive on a remote host using dump(1M). This led me to conduct a systematic study of dump(1M) performance over networks, using various parameters.
Acting on the results shown here, I modified my backup script to use a blocksize of 80KB, rather than the default blocksize of 10KB. The time required dropped to under 4 hours -- a speedup factor of almost 5X. This was less than the 8X improvement predicted by my controlled tests, but still significant. I compare the test results with our actual experience in the real-life section.
Although these tests were performed using dump(1M), the results probably translate to bru(1), tar(1) or other backup programs.
The tests were run on glutamine, an SGI Indigo2 Extreme with a 200 MHz R4400. The filesystem was located on a Seagate ST15230N 4GB Hawk disk drive (5400 RPM, 3.5"). We also used methionine as a remote host. Methionine is an SGI Challenge-M with a 150 MHz R4400. Both hosts are on the same ethernet segment; both are attached to the same FDDI hub; both run IRIX 5.3.
The tape drive used is an Exabyte 8505 8mm from Transitional Technology. It was the only device on the external SCSI bus of glutamine. The tape drive was used in high density mode with data compression on.
A simple shell script was executed on glutamine from the root account (because dump(1M) requires root). There was no other activity on glutamine during the tests; activity on methionine and on the ethernet and FDDI networks is presumed to have been light. This assumption is borne out by the results.
Initially, a directory in a filesystem on glutamine was filled with 27.8MB of mixed files, as shown by the df(1) command. Most of the space was occupied by a few large files. These files are compressible by 30% to 50% using compress(1) -- we therefore assume that a similar compression could be obtained by the tape drive.
First, the entire filesystem was dumped to /dev/null. This loads the filesystem cache with as much data as it will hold. Otherwise, the cache would be filled during the first test pass, artificially lengthening its run time in comparison with subsequent passes. A standard command was repeated in the script:
timex dump 0bCf $blocksize $tape /filesystem
Blocksize ranged from 10 to 128 (in KB); tape specified the output device: local or remote (over ethernet or FDDI), and either /dev/null or the tape drive, as shown in the column headings of Tables 1 and 2.
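The test driver amounted to two nested loops over blocksize and output device. The sketch below echoes each command instead of running it, since dump(1M) needs root and a real device; the device names here are placeholders, not the actual IRIX device paths used in the tests.

```shell
#!/bin/sh
# Dry-run sketch of the benchmark loop. The real script ran each
# command via timex(1) and logged the output; /dev/null and the
# remote device name below are illustrative placeholders.
for blocksize in 10 20 32 64 80 128; do
    for tape in /dev/null 'methionine:/dev/null'; do
        echo timex dump 0bCf $blocksize $tape /filesystem
    done
done
```

Removing the echo (and substituting the real device list) turns the sketch back into the test driver.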
Following these tests, the contents of the directory were copied to each of three other directories, increasing the contents of the filesystem to 111.2MB. The same series of tests was then repeated.
The log file from this script provided the real run time (via the timex(1) command) for each of these tests. Comparing the results from the two filesystem sizes, we can calculate the differential data rate (size increase divided by time increase). We can then calculate the fixed overhead involved in setup and cleanup, by taking the total time for the larger filesystem, minus the size of that filesystem divided by the differential data rate.
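As a worked example, the calculation can be reproduced for the local /dev/null column at a 10KB blocksize. The two run times below are illustrative values reconstructed to be consistent with Tables 1 and 2; the actual per-pass times are in the raw results.

```shell
# Differential data rate and fixed overhead, from two runs of
# different sizes. Times here are illustrative, not measured.
awk 'BEGIN {
    s1 = 27.8 * 1024;  t1 = 35.5    # small filesystem: size (KB), real time (s)
    s2 = 111.2 * 1024; t2 = 58.0    # large filesystem: size (KB), real time (s)
    rate = (s2 - s1) / (t2 - t1)    # differential data rate, KB/s
    overhead = t2 - s2 / rate       # fixed setup/cleanup time, s
    printf "rate = %.0f KB/s, overhead = %.1f s\n", rate, overhead
}'
```

With these inputs the result is a rate of about 3796 KB/s and 28 seconds of overhead, matching the corresponding entries in Tables 1 and 2.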
Table 1 shows the calculated startup time (in seconds) for different devices and blocksizes. In all cases, this includes the time dump(1M) spends scanning the filesystem (or that portion that is independent of filesystem size). If output is to a tape device, it includes tape positioning, startup and rewinding time.
Blocksize (KB) | local null | local tpnsv | local tpns | ether null | FDDI null | ether tpnsv | FDDI tpnsv |
---|---|---|---|---|---|---|---|
10 | 28 | 111 | 102 | 8 | 29 | 342 | 104 |
20 | 27 | 94 | 101 | 29 | 30 | 103 | 98 |
32 | 28 | 98 | 97 | 33 | 30 | 139 | 98 |
64 | 28 | 98 | 97 | 25 | 29 | 101 | 98 |
80 | 28 | 98 | 98 | 35 | 28 | 107 | 98 |
128 | 28 | 97 | 97 | 27 | 29 | 105 | 98 |
Table 2 shows the differential data rate (in KB/s) for various output devices, for a range of blocksizes. The differential data rate is the average rate after overhead (startup, rewind, etc.) is eliminated.
Blocksize (KB) | local null | local tpnsv | local tpns | ether null | FDDI null | ether tpnsv | FDDI tpnsv |
---|---|---|---|---|---|---|---|
10 | 3791 | 491 | 463 | 51 | 1463 | 84 | 390 |
20 | 4170 | 662 | 695 | 560 | 2317 | 336 | 600 |
32 | 4389 | 772 | 758 | 583 | 2690 | 428 | 689 |
64 | 4633 | 772 | 758 | 549 | 2979 | 353 | 772 |
80 | 4633 | 772 | 765 | 627 | 3089 | 399 | 772 |
128 | 4633 | 765 | 758 | 604 | 3336 | 428 | 772 |
The raw results for all the individual test passes are available separately.
Filesystem | Size (MB) | Blocksize (KB) | Transport | Speed (KB/s) |
---|---|---|---|---|
A1 | 3211 | 10 | ether | 52 |
A1 | 3403 | 80 | ether | 249 |
A2 | 1528 | 80 | fddi | 682 |
A3 | 818 | 80 | fddi | 690 |
B1 | 1324 | 80 | fddi | 690 |
B2 | 3664 | 80 | fddi | 694 |
B3 | 3759 | 80 | ether | 267 |
C1 | 3970 | 10 | local | 453 |
C1 | 3971 | 10 | local | 542 |
C1 | 4098 | 10 | local | 584 |
Table 3 shows actual dump speeds on several of our filesystems. Output is to a Cipher C860 6GB DLT tape drive, with a nominal peak data rate of 800 KB/s. Filesystems A1, A2 and A3 are located on 2GB 3600 RPM 5.25" disk drives. B1, B2 and B3 are on 4GB 5400 RPM 3.5" disk drives. C1 is on a RAID-5 system, with a CMD controller and five 2GB 3600 RPM 5.25" disk drives.
For A1, increasing the blocksize from 10KB to 80KB raised the data rate from 52 KB/s to 249 KB/s. This still falls short of the roughly 400 KB/s measured in the controlled tests; several factors may account for the difference.
Real-life FDDI dumps were about 10% slower than predicted by the test results. About half of this is explained by assuming that DLT overhead is the same as 8mm overhead (about 100 seconds).
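The "about half" claim can be checked against the B1 row of Table 3 (1324MB over FDDI at an actual 690 KB/s) and the 772 KB/s differential rate from Table 2:

```shell
# How much of the FDDI shortfall does 100 s of overhead explain?
# Inputs are the B1 row of Table 3 and the FDDI tpnsv rate of Table 2.
awk 'BEGIN {
    size = 1324 * 1024              # filesystem size, KB
    predicted = 772; actual = 690   # data rates, KB/s
    t_pred = size / predicted       # dump time with zero overhead, s
    eff = size / (t_pred + 100)     # effective rate with 100 s overhead
    printf "effective rate with overhead = %.0f KB/s\n", eff
    printf "overhead explains %.0f of the %.0f KB/s gap\n", \
           predicted - eff, predicted - actual
}'
```

Assuming 100 seconds of overhead lowers the predicted effective rate to about 730 KB/s, accounting for roughly 42 of the 82 KB/s gap -- about half, as stated above.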
The several dumps of the local filesystem C1 vary considerably. All fall below not only the theoretical maximum for this DLT tape drive (800 KB/s) but also the FDDI results. The probable explanation is that C1 is one of several filesystems on a RAID-5 system; that the RAID system is itself slow; and that C1 and another RAID filesystem are the most actively used, resulting in poor performance due to disk bottlenecks.
This work was performed by Art Perlo at the Center for Structural Biology at Yale University in February, 1996. Please direct questions and comments to perlo@csb.yale.edu.
Document revised on Friday, 15-Mar-1996 16:08:49 EST