Project Development

Project-specific components are executables that are not provided by SLINC, but instead must be implemented separately by each project. Most public resource computing projects have a large set of data that can be processed in smaller subsets. We will use the term science data to refer to the entire set of data that needs to be processed by a public resource computing project. In order to process all of the science data in a distributed way, the data must be partitioned into many subsets, and there must be an algorithm that accepts a subset as input and computes some result as output. The results produced by this algorithm can later be recombined to draw meaningful conclusions about the original data set.

Each project must develop two components in order to have a complete public resource computing project: a work unit generator and a science application. The work unit generator partitions your data set into smaller subsets called work units, which can be of any size. The science application takes a work unit as input, executes your project-specific algorithm on the data, and produces a result as output. Like a work unit, a result has no size limit.
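The division of labor between the two required components can be sketched as follows. This is only an illustration in Python and does not use SLINC's actual interfaces: the names generate_work_units and science_application, the unit size of 4, and the sum-of-squares computation are all hypothetical stand-ins for your own project code.

```python
from typing import Iterator, List


def generate_work_units(science_data: List[int], unit_size: int) -> Iterator[List[int]]:
    """Hypothetical work unit generator: split the science data into subsets."""
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    for start in range(0, len(science_data), unit_size):
        # The last work unit may be smaller than unit_size.
        yield science_data[start:start + unit_size]


def science_application(work_unit: List[int]) -> int:
    """Hypothetical science application: one work unit in, one result out."""
    # Placeholder algorithm (an assumption for this sketch): sum of squares.
    return sum(x * x for x in work_unit)


# Toy science data; a real project would read this from its own data store.
science_data = list(range(10))

work_units = list(generate_work_units(science_data, unit_size=4))
results = [science_application(unit) for unit in work_units]

# Recombine the per-unit results to draw a conclusion about the whole data set.
total = sum(results)
```

In a real project the work units would be handed to volunteer clients rather than processed in a local loop, but the contract is the same: the generator only partitions, and the science application only sees one work unit at a time.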

In addition to the required components, there is one optional component that you may wish to develop: the result validator. The result validator has three responsibilities. The first is to determine whether each individual result for a work unit is valid. The second is to decide whether a client has passed or failed a spot-check. The third is to choose a single result from the set of all results for a work unit and to mark that result as canonical. When a canonical result is chosen, it is saved permanently in the database, and all other results for that work unit are deleted. Although the result validator must perform all three tasks, any or all of them can be stubbed. For example, instead of writing an algorithm to determine which result should be selected as canonical, you can always designate the first result received as the canonical result, which is what the default validator does. The validator then fulfills its responsibilities with only a few lines of code. This shortcut is appropriate only if it does not matter to your project which result is selected as canonical. In general, you only need to implement the validator functionality that your project requires to be successful.
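A fully stubbed validator might look like the sketch below. It is a hypothetical illustration in Python, not the SLINC validator interface: the Result class, the StubValidator class, and all method names and signatures are assumptions made for this example. The only behavior taken from the description above is that the first result received is designated as canonical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    """Hypothetical result record; the fields are assumptions for this sketch."""
    client_id: str
    payload: bytes


class StubValidator:
    """Hypothetical validator with all three tasks stubbed."""

    def is_valid(self, result: Result) -> bool:
        # Task 1 (stubbed): accept every individual result.
        return True

    def passed_spot_check(self, result: Result) -> bool:
        # Task 2 (stubbed): every client passes the spot-check.
        return True

    def choose_canonical(self, results: List[Result]) -> Optional[Result]:
        # Task 3 (stubbed): designate the first result received as canonical.
        return results[0] if results else None


# Results listed in the order they were received.
received = [Result("client-a", b"42"), Result("client-b", b"41")]

validator = StubValidator()
canonical = validator.choose_canonical(received)
```

To implement one task for real, replace only that method and leave the others stubbed. A project that cares which result becomes canonical could, for instance, have choose_canonical pick the payload reported by the majority of clients.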

Information about how to implement each of these components can be found in the Project Programming Guide. Once you have implemented the necessary project components, please continue to Preparing Project Distribution Files.