ORGANIZATION:

First, the basic structure of what has to be run where. The different parts can be run at different times, but the overall order below has to be preserved for every exposure:

Code                                          CPU time          Output size         Owner
For each Opsim field:
  1. Full-field instance catalog gen.         hours/field?      ~1 Gbyte/field      Andy/Rob/Jim/Simon
  2. Atmospheric parameter generator          seconds/field     small               Mallory
  3. Atmospheric screen generator             minutes/field     ~1 Gbyte/field      Garrett
  4. Cloud screen generator                   minutes/field     ~30 Mbytes/field    Garrett
  5. Optics parameters                        seconds/field     small               Nathan
  Loop over the two exposures:
    Loop over either chips or amplifiers for the following:
      6. Trim program                         seconds/chip      ~30 Mbytes          Justin
      7. Raytrace                             <1 hr/chip        16 Mbyte/chip       John
      8. Cosmic ray adder                     few seconds/chip  16 Mbyte/chip       Mallory
      9. Background adder                     30 seconds/chip   16 Mbyte/chip       Justin
      Loop over amplifiers (if you are looping over chips above):
        10. Electron -> ADC code              few seconds/amp   1 Mbyte/amp         John

So the current master script runs these in the order described above. It leaves open the question of whether the instance catalogs are generated ahead of time or on the fly.
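
The loop ordering above can be sketched in Python as follows. This is only an illustration of the nesting, assuming the variant where the inner loop runs over chips and step 10 then runs per amplifier. The step names, the function plan_field, and its parameters are hypothetical and are not taken from the actual master script.

```python
# Hypothetical step names mirroring codes 1-10 in the table above.
FIELD_STEPS = [
    "instance_catalog",       # 1
    "atmosphere_parameters",  # 2
    "atmosphere_screens",     # 3
    "cloud_screens",          # 4
    "optics_parameters",      # 5
]
CHIP_STEPS = ["trim", "raytrace", "cosmic_rays", "background"]  # 6-9
AMP_STEPS = ["electron_to_adc"]                                 # 10


def plan_field(field_id, chips, amps_per_chip, n_exposures=2):
    """Return the ordered list of (step, field, exposure, chip, amp) tuples.

    Entries that do not apply at a given level are None.
    """
    # Per-field steps run once, before any exposure is simulated.
    plan = [(step, field_id, None, None, None) for step in FIELD_STEPS]
    for exposure in range(n_exposures):
        for chip in chips:
            # Chip-level steps run in order for each chip.
            for step in CHIP_STEPS:
                plan.append((step, field_id, exposure, chip, None))
            # The chip is then split into amplifiers for the last step.
            for amp in range(amps_per_chip):
                for step in AMP_STEPS:
                    plan.append((step, field_id, exposure, chip, amp))
    return plan
```

Calling plan_field(0, ["chip0"], amps_per_chip=2) lists the five per-field steps first, followed by the chip and amplifier steps for each of the two exposures.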

GRANULARITY:

Next is the question of how the work above gets divided among processors. Since the raytrace still dominates the CPU time, it makes sense to run it on different processors in parallel. But at what granularity? With the way everything runs now, the natural granularity is the chip level. You could consider granularities at the following levels:

1. PHOTON: every photon is separately generated and then collected

2. OBJECT: every object is separately generated and then images are put together

3. AMPLIFIER: every amplifier is separately generated

4. CHIP: every chip is separately generated and then split into amplifiers

5. FOCAL-PLANE: the whole focal plane is generated together and then split into amplifiers

Options 1-3 have large overheads, whereas option 5 is still in the hundreds of CPU hours. Consequently, option 4 is ideal for now.

IMPLEMENTATION:

So we will have a master controlling script that does the loops described in the ORGANIZATION section, and an individual script, runnable in parallel, that runs codes 6-10 at the chip level.
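
A minimal sketch of this split, under stated assumptions: run_chip stands in for the individual chip-level script and run_field for the master script. Both names, the chip labels, and the use of a local thread pool are hypothetical. A real setup might instead submit each chip job to a batch queue, and the placeholder below only records which codes would run rather than running them.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical labels for codes 6-10, in the order they must run on a chip.
CHIP_CODES = ["trim", "raytrace", "cosmic_rays", "background", "electron_to_adc"]


def run_chip(field_id, exposure, chip):
    """Stand-in for the individual chip-level script (codes 6-10)."""
    # A real version would invoke each code in turn (for example with
    # subprocess.run); this placeholder only reports the intended order.
    return (field_id, exposure, chip, list(CHIP_CODES))


def run_field(field_id, chips, n_exposures=2, max_workers=4):
    """Stand-in for the master script: one independent job per (exposure, chip)."""
    # Codes 1-5 (the per-field steps) would run here, before any chip job.
    jobs = [
        (field_id, exposure, chip)
        for exposure in range(n_exposures)
        for chip in chips
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the submitted jobs.
        return list(pool.map(lambda job: run_chip(*job), jobs))
```

For example, run_field(7, ["chip0", "chip1", "chip2"]) creates six chip jobs, three for each of the two exposures. Threads are enough in this sketch because each job would mostly wait on external programs; the chip jobs share no state, so they could equally be sent to separate machines.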