1. Introduction
SciExp²-ExpDef (aka Scientific Experiment Exploration - Experiment Definition) provides a framework for defining experiments, creating all the files needed for them and, finally, executing the experiments.
SciExp²-ExpDef puts a special emphasis on simplifying experiment design space exploration, using a declarative interface for defining permutations of the different parameters of your experiments, and templated files for the scripts and configuration files for your experiments. SciExp²-ExpDef supports various execution platforms, such as regular local scripts and cluster jobs. It takes care of tracking their correct execution, and allows selecting which experiments to run (e.g., those with specific parameter values, or those that have not yet run successfully).
1.1. Quick example
As a quick example, we’ll see how to execute our program my-program with different values of the --size argument, generate one script to run each of the experiments, and then execute all the generated scripts.
First, we will define our experiment set using an Experiments object, so that all its files are generated into a self-contained experiments directory that we can move as a unit to any other machine as needed for the evaluation:
#!/usr/bin/env python
# -*- python -*-
from sciexp2.expdef.env import *
l = Experiments(out="experiments")
Then we will copy our my-program executable into the experiments directory, since we want the directory to be self-contained and each experiment will need that program. The pack method copies files into our output directory by default:
# copy program into experiments directory
l.pack("/path/to/my-program", "bin/my-program")
Now we can go on to define our experiment parameter size, which will hold the values 1, 2, 4, and 8. The params method will create one experiment for each of the specified values:
# create one experiment for each value of size
l.params(size=[1, 2, 4, 8])
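To make "one experiment for each value" concrete, the following is a minimal stand-alone sketch of the idea. It is not the library's implementation: the helper expand_params and the extra mode parameter are hypothetical, and the sketch assumes that several parameters would combine as a cartesian product.

```python
from itertools import product


def expand_params(**params):
    """Return one dict per combination of the given parameter values.

    Hypothetical helper for illustration only; not part of SciExp²-ExpDef.
    """
    names = list(params)
    # Wrap scalar values so that every parameter contributes a list of values.
    value_lists = [
        value if isinstance(value, (list, tuple)) else [value]
        for value in params.values()
    ]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# One experiment per value of size.
experiments = expand_params(size=[1, 2, 4, 8])
for experiment in experiments:
    print(experiment)

# Assumption: a second parameter would multiply the number of experiments.
combined = expand_params(size=[1, 2], mode=["fast", "slow"])
print(len(combined))
```

With a single parameter the result is simply one entry per value; the second call shows how the design space grows when parameters are combined.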
Finally, we will generate the per-experiment scripts, which will execute my-program with the --size argument set to the size parameter we defined above and the --out argument pointing to a different output file for each experiment. The generate_jobs method will generate the per-experiment scripts into scripts/{{size}}.sh inside the experiments directory, in this case using the shell template, which requires the CMD variable to be defined as the command to run:
# generate per-experiment scripts ("scripts/{{size}}.sh") with the specified command (CMD)
l.params(CMD="bin/my-program --size={{size}} --out=results/{{size}}.csv")
l.generate_jobs("shell", "scripts/{{size}}.sh")
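The {{size}} placeholders in the script path and in CMD are filled in once per experiment. A rough sketch of that substitution, written in plain Python, might look as follows; the expand function and its regular expression are assumptions made for illustration and do not reflect how SciExp²-ExpDef implements its templates.

```python
import re

# Matches placeholders of the form {{name}}.
_PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def expand(template, values):
    """Replace each {{name}} in template with str(values[name]).

    Hypothetical helper; raises KeyError if a placeholder has no value.
    """
    return _PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), template)


cmd = "bin/my-program --size={{size}} --out=results/{{size}}.csv"
script = "scripts/{{size}}.sh"

# Show the script path and command each experiment would end up with.
for size in [1, 2, 4, 8]:
    values = {"size": size}
    print(expand(script, values), "->", expand(cmd, values))
```

Each experiment thus gets its own script path and its own output file, which is why the results of different experiments do not overwrite each other.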
The final script would thus be:
#!/usr/bin/env python
# -*- python -*-
from sciexp2.expdef.env import *
l = Experiments(out="experiments")
# copy program into experiments directory
l.pack("/path/to/my-program", "bin/my-program")
# create one experiment for each value of size
l.params(size=[1, 2, 4, 8])
# generate per-experiment scripts ("scripts/{{size}}.sh") with the specified command (CMD)
l.params(CMD="bin/my-program --size={{size}} --out=results/{{size}}.csv")
l.generate_jobs("shell", "scripts/{{size}}.sh")
The resulting experiments directory now contains all the files we need:
experiments
|- jobs.jd          # auto-generated by l.generate_jobs()
|- bin
|  `- my-program
`- scripts
   |- 1.sh
   |- 2.sh
   |- 4.sh
   `- 8.sh
The experiments/jobs.jd file defines all the generated experiments and their parameters. We can now execute this auto-generated file to run all our experiments. It will take care of running each of the per-experiment scripts and checking that they finish correctly:
./experiments/jobs.jd submit
After successfully executing all the scripts, the experiments directory will also contain the output files we specified in the per-experiment command (--out=results/{{size}}.csv):
experiments
|- bin
|  `- my-program
|- scripts
|  |- 1.sh
|  |- 2.sh
|  |- 4.sh
|  `- 8.sh
`- results
   |- 1.csv
   |- 2.csv
   |- 4.csv
   `- 8.csv
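As an independent sanity check, one could confirm that every expected output file is present after a run. The sketch below uses only the Python standard library; the missing_results helper is hypothetical and is not part of SciExp²-ExpDef, which already tracks job execution on its own.

```python
from pathlib import Path


def missing_results(root, sizes):
    """Return the expected result files that do not exist under root.

    Hypothetical helper; assumes outputs are named results/<size>.csv.
    """
    expected = [Path(root) / "results" / f"{size}.csv" for size in sizes]
    return [path for path in expected if not path.is_file()]


# Report any output file that the experiments did not produce.
for path in missing_results("experiments", [1, 2, 4, 8]):
    print("missing:", path)
```

An empty report means all four experiments wrote their output files; it says nothing about whether the contents are valid.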