High-speed PC clusters surpassing Unix in processing price/performance
The offshore seismic industry has always required heavy computational effort to image the huge volumes of seismic data acquired by modern multi-streamer, multi-array, air-gun vessels. The industry is still not far removed from the acres of tape devices and big-iron computers of the 1960s and 1970s, and the 1980s and 1990s likewise demanded large investments in high-capacity storage devices and Unix-based RISC processing platforms.
Now the advent of clusters of high-speed multi-processor PCs (personal computers) running under Windows NT (NT) is apparently producing a price/performance advantage over the Unix/RISC platforms. Typical prices for 16-64 processor HP, Sun, SGI, and IBM RS/6000 systems are in the $500,000 to $3 million range, depending on the size, type, and speed of the computer.
By comparison, a generic 68-CPU PC cluster running NT, using 500 MHz PIII CPUs, 17 GB of memory, standard dual-CPU motherboards, and including a half terabyte of disk capacity, can be assembled for about $60,000. An additional terabyte of PC disk in RAID configuration is now less than $15,000.
Desktop super computing
When Exploration Design Software, Inc. (EDS) began developing NT-based software in 1994, there was concern over selecting a Microsoft product as the operating system rather than Unix or even IBM's OS/2. EDS originally targeted in-field processors, who were more familiar with PC platforms and valued their reliability and low cost compared with Unix/RISC platforms. Even in 1994, NT had some advantages over other systems. The user interface also had a high comfort factor: to the in-field user, the operating system looked like the familiar DOS-based software and, later, like Windows 95 or Windows 98.
The PC/NT combination also supported high-speed SCSI tape devices and huge disk arrays without the need for partitioning the disks into 2 GB segments, as with the older-generation 32-bit-indexed operating systems, because NT used a 64-bit addressing scheme for disk access.
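As a simple illustration of where that 2 GB figure comes from (the short Python listing and its variable names are ours, not part of EDS' software): a system that indexes byte offsets with a signed 32-bit integer cannot address past 2^31 bytes, while a 64-bit index removes the limit for any practical disk size.

# Illustrative arithmetic only: the addressable range of 32-bit vs. 64-bit
# signed byte offsets, which is what sets the old 2 GB partition ceiling.
max_offset_32 = 2**31 - 1                               # largest signed 32-bit offset
max_offset_64 = 2**63 - 1                               # largest signed 64-bit offset
print(f"32-bit limit: {max_offset_32 / 2**30:.0f} GB")  # ~2 GB
print(f"64-bit limit: {max_offset_64 / 2**40:.0f} TB")  # roughly 8.4 million TB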
Cluster PCs
EDS' processing system has now evolved into a distributed package using clustered PCs (sometimes referred to as "desktop super computing"). There has been a huge increase in the computing power of modern PCs: inexpensive motherboards hold one, two, four, or even eight CPUs on the same board and support several gigabytes of memory.
The individual CPUs have become faster and cheaper, as have memory chips, disk drives, CD-ROMs, video cards, and network cards. This revolution in computing power means that, by joining multiple PCs into a cluster, the combined system can rival the computational capability of Unix "super-computers" costing much more.
The technique of clustering PCs leads to some interesting price/performance issues. For example, the fastest PC available will usually not be the best buy. Manufacturers like Intel significantly discount the pricing on their older-technology CPUs when new, faster processors are released. A cluster containing ten 500 MHz CPUs may cost the same as one containing twenty 450 MHz CPUs, yet the latter has 80% more cumulative compute power.
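Taking the premise that the two clusters cost the same, and using clock speed as a crude proxy for per-CPU throughput, the arithmetic behind that 80% figure works out as follows (a back-of-the-envelope Python check, not part of any EDS tool):

# Cumulative clock speed as a rough stand-in for cluster throughput.
fast_cluster = 10 * 500                                  # ten 500 MHz CPUs   -> 5,000 MHz
slow_cluster = 20 * 450                                  # twenty 450 MHz CPUs -> 9,000 MHz
gain = (slow_cluster - fast_cluster) / fast_cluster
print(f"extra compute for the same price: {gain:.0%}")   # 80%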
Distributed programs written for a computing cluster usually consist of a master program and multiple slave programs. The master monitors the data flow and distribution. The slave programs run on the individual computing nodes and perform the real work. The cluster may contain CPUs with different clock speeds or processing power, and the master program can reallocate data flows depending upon the actual performance of the individual CPUs. The master program also monitors the hardware performance of the cluster in order to detect hardware failures in individual nodes and again adjusts the data distribution to compensate automatically.
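The listing below is a minimal sketch of one way such a master/slave scheme can balance itself; it illustrates the idea only and is not EDS' implementation, and the node names, relative speeds, and chunk counts are invented. Each slave pulls the next data chunk as soon as it finishes the previous one, so faster nodes automatically end up doing more of the work.

import queue
import threading
import time

# Hypothetical master/slave load balancing: the master fills a work queue,
# each slave thread stands in for a compute node and pulls chunks at its own
# pace, and the tally at the end shows faster nodes processing more chunks.
NODE_SPEEDS = {"node-a": 1.0, "node-b": 0.9, "node-c": 0.5}   # invented relative speeds
work = queue.Queue()
completed = {name: 0 for name in NODE_SPEEDS}

def slave(name, speed):
    while True:
        try:
            chunk = work.get_nowait()
        except queue.Empty:
            return                          # no work left; slave shuts down
        time.sleep(0.005 / speed)           # stand-in for processing one chunk of traces
        completed[name] += 1                # a real slave would return the result to the master

def master(num_chunks=200):
    for chunk_id in range(num_chunks):      # master queues all the data chunks
        work.put(chunk_id)
    slaves = [threading.Thread(target=slave, args=item) for item in NODE_SPEEDS.items()]
    for t in slaves:
        t.start()
    for t in slaves:
        t.join()
    print("chunks per node:", completed)    # faster nodes receive proportionally more

if __name__ == "__main__":
    master()

In this pull-based arrangement a failed node simply stops asking for chunks, so the rest of the cluster keeps running; a production master, as described above, would also time out the chunk the dead node was holding and reissue it to a healthy node.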
Prestack time migration
In the early 1980s, several seismic contractors began to apply prestack time migration to their predominantly 2D offshore spec data libraries. Today, these contractors are looking to apply the same procedure to their 3D libraries. There are several other advantages to be gained besides the obvious potential for improvement of the final image.
The recent resurgence of interest in amplitude versus offset (AVO) analysis has increased the need for prestack processing, as has the trend toward depth processing. The finer-detailed time image and velocity analysis derived from prestack-migrated data are often a prerequisite starting point for the depth-processing procedure.
The 3D computational effort is enormous, and in the past contractors have sought to improve the throughput of their imaging efforts by shortcutting and compromising on their processing algorithms in the quest for speed. The PC cluster revolution means not only that these immense imaging projects can now be completed at a fraction of the cost in both dollars and time, but also that the quality of the output may be noticeably improved by no-compromise algorithms.
Standard configuration
There are basically three ways to build the cluster. All three methods use mostly the same components for each node, which currently include two 500 MHz CPUs, 512 MB of memory, a 6 GB system disk, a video card, and a network card. The master node can have a SCSI or IDE RAID array to store the input data.
The least expensive method of building a cluster is to put the components into standard PC cabinets using standard dual-CPU motherboards. For only a small increase in price, the second configuration mounts dual-CPU motherboards in standard 4U rack-mount enclosures. Ten of these rack enclosures fit into a standard 6-foot tall, 19-inch rack cabinet.
The most expensive method of the three uses passive backplanes that allow room for eight CPUs in a single 4U rack enclosure. The significant difference in CPU density per enclosure yields a configuration that packs 80 of the 500 MHz CPUs into one 6-foot tall, 19-inch rack cabinet. The nodes are all connected with standard 100 Mb/s Ethernet hubs.
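For comparison of the second and third options, the rack-level CPU counts follow directly from the enclosure figures quoted above (the 20-CPU number for the dual-CPU rack-mount option is our inference, not a figure stated here):

# CPU density per 6-foot, 19-inch rack, from the enclosure counts above.
enclosures_per_rack = 10                  # ten 4U enclosures per rack cabinet
dual_board_cpus = 2                       # standard dual-CPU motherboard per enclosure
backplane_cpus = 8                        # passive-backplane enclosure
print("dual-CPU rack-mount:", enclosures_per_rack * dual_board_cpus, "CPUs per rack")   # 20
print("passive backplane:  ", enclosures_per_rack * backplane_cpus, "CPUs per rack")    # 80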
The tough economic environment of today's offshore industry makes cost a primary concern. The need to stay abreast of new technological standards while remaining competitive has cost-conscious managers looking for new ways to save money.
At the enterprise level, NT/Intel platforms are coming of age and many companies are now recognizing and accepting the robustness and price advantages of this combination.