The CCP4 project (http://www.ccp4.ac.uk/) for protein crystallography provides a good model:
Unlike many other packages, particularly for small molecule crystallography, the CCP4 suite is a set of separate programs which communicate via standard data files, rather than all operations being integrated into one huge program. This has some disadvantages in that it is less easy for programs to make decisions about what operation to do next (though this is seldom a problem in practice), and that the programs are less consistent with each other (although much work has been done to improve this). However, the great advantage arising from such loose organisation is that it is very easy to add new programs or to modify existing ones without upsetting other parts of the suite. This reflects the approach successfully taken by Unix. Converting a program to use the standard CCP4 file formats is generally straightforward, and the philosophy of the collection has been to be inclusive, so that several programs may be available to do the same task. The components of the whole system are thus a collection of programs using a standard software library to access standard format files (and a set of example files and documentation) available for most Unix operating systems (including Linux), as well as Windows and Mac OS X. Programs are mostly written in C/C++ and Fortran 77.
Another source of inspiration is the Macro-EM project (http://www.macro-em.org/):
Our goal is to develop computational technology that will make it possible to get high-resolution density maps, and to do so from EM images of large, isolated macromolecular particles at a high rate of throughput. We want to develop versions of single-particle software that will take full advantage of modern, affordable, and most importantly highly parallel machine architectures. The parallel machines are able to process large amounts of data simultaneously, thus completing the computing tasks required for cryo-EM in a realistic amount of time. Our strategy in developing optimized software for processing large amounts of data in a relatively short time, giving high resolution, is to first implement pilot versions of desired code on multiprocessor clusters that are based on commodity PC hardware. The ultimate goal is to develop the computational technology that will improve both resolution and throughput when calculations are run on machines that are affordable (a) for individual laboratories, (b) as shared instrumentation, or (c) as dedicated machines, run for the community as multi-user facilities.
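The two ideas quoted above, separate programs that share only a standard file format and independent data sets processed side by side, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: JSON stands in for a real standard format, the stage names (scale_stage, sum_stage) are hypothetical, and a thread pool stands in for a real parallel machine. It is not how CCP4 or Macro-EM is implemented.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_table(path):
    """Shared 'library' routine: every stage reads the same file format."""
    return json.loads(Path(path).read_text())


def write_table(path, table):
    """Shared 'library' routine: every stage writes the same file format."""
    Path(path).write_text(json.dumps(table))


def scale_stage(src, dst, factor=2.0):
    """Hypothetical stage 1: multiply every value by a constant."""
    table = read_table(src)
    table["values"] = [v * factor for v in table["values"]]
    write_table(dst, table)


def sum_stage(src, dst):
    """Hypothetical stage 2: knows nothing about stage 1, only the format."""
    table = read_table(src)
    table["total"] = sum(table["values"])
    write_table(dst, table)


def run_pipeline(workdir, name, values):
    """Chain the stages for one data set through intermediate files."""
    raw = Path(workdir) / f"{name}.raw.json"
    scaled = Path(workdir) / f"{name}.scaled.json"
    final = Path(workdir) / f"{name}.final.json"
    write_table(raw, {"values": values})
    scale_stage(raw, scaled)
    sum_stage(scaled, final)
    return read_table(final)["total"]


def run_many(datasets):
    """Process independent data sets concurrently; each uses its own files."""
    with tempfile.TemporaryDirectory() as workdir:
        with ThreadPoolExecutor() as pool:
            futures = {name: pool.submit(run_pipeline, workdir, name, values)
                       for name, values in datasets.items()}
            return {name: f.result() for name, f in futures.items()}
```

Because the stages agree only on the file format, a new stage can be added or an existing one replaced without touching the others, which is the property the CCP4 description emphasises. For example, run_many({"a": [1, 2, 3], "b": [10]}) runs both data sets through the same two stages independently.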
Microscope user 2007-09-20