# Data Management Workflows Test Guide
## Running the Test Suite

Make sure the services are running. The service init scripts live in
`data-management/etc/init.d`.

To run the test cases:

```sh
source etc/dm.setup.sh
cd data-management/src/python/dm/aps_beamline_tools/gui #TODO
python dmWorkflowsTest.py
```

## Coverage

To run with coverage:
```sh
python -m coverage run dmWorkflowsTest.py
python -m coverage report
python -m coverage html  # writes detailed HTML reports to htmlcov/
```

Last run coverage: 0%

## Test Scenarios
### add workflow
- [ ] add a workflow using pyspec
- [ ] add a workflow using json
- [ ] error message if adding a workflow without a file specified

### delete
- [ ] error message if deleting without an owner
- [ ] error message if deleting without a workflow name
- [ ] removes workflow
- [ ] error if workflow doesn't exist

### get processing job
- [ ] error message if no owner
- [ ] error message if no job id
- [ ] if stage given, gives details for that stage
- [ ] error if stage given but job does not have that stage
- [ ] error if job doesn't exist

### get workflow
- [ ] error if no owner
- [ ] error if no workflow name
- [ ] gives details for workflow
- [ ] error if workflow doesn't exist

### list jobs
- [ ] lists jobs
- [ ] if no owner, uses session login name
- [ ] error if no owner or session login
- [ ] if skip option given, that many jobs are skipped
- [ ] if limit is given, only that many jobs listed
- [ ] if no limit given, only default limit of jobs listed
- [ ] if display spec given, formatted according to that spec
- [ ] if metadata key/val pair given, only list jobs that match

### list workflows
- [ ] lists all workflows for owner
- [ ] if no owner, uses session login name
- [ ] error if no owner or session login
- [ ] if metadata key/val pair given, only list workflows that match

### process files
- [ ] error if no workflow owner or session login
- [ ] if no owner, uses session login name
- [ ] error if no name
- [ ] error if no directory
- [ ] error if no file path pattern
- [ ] if max active jobs option, don't run more than that number of jobs
- [ ] if no max active jobs option, don't run more than the default number of jobs
- [ ] processes files according to the workflow
- [ ] given key/value pairs used as workflow inputs

### start processing job
- [ ] starts processing job according to the workflow
- [ ] error if no workflow owner or session login
- [ ] if no owner, uses session login name
- [ ] error if no workflow name
- [ ] given key/value pairs used as workflow inputs

### stop processing job
- [ ] stops processing job
- [ ] error if no owner or session login
- [ ] if no owner, uses session login name
- [ ] error if no job id

### update workflow
- [ ] update a workflow using pyspec
- [ ] update a workflow using json
- [ ] error message if updating a workflow without a file specified

*Open questions: how does the update know which workflow to modify? What if the workflow doesn't already exist?*

### upsert workflow
- [ ] upsert a workflow using pyspec
- [ ] upsert a workflow using json
- [ ] error message if upserting a workflow without a file specified
- [ ] updates workflow if already existing
- [ ] adds workflow if new
*Open question: how does it know which workflow to update?*

### workflow options
- [ ] error if not dictionary format
- [ ] error if missing required keys name, owner, stages
*Open question: what if an `id` is specified that wasn't assigned by the database?*
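
The dictionary-format and required-key checks listed above can be sketched as a small validator. The error messages and `REQUIRED_KEYS` constant are illustrative, not the actual DM wording.

```python
# Sketch of the workflow-definition checks described above: the definition
# must be a dictionary and must contain the required keys.
# Error messages are illustrative, not the actual DM wording.
REQUIRED_KEYS = {'name', 'owner', 'stages'}

def validate_workflow(workflow):
    if not isinstance(workflow, dict):
        raise TypeError('Workflow definition must be a dictionary')
    missing = REQUIRED_KEYS - workflow.keys()
    if missing:
        raise ValueError('Missing required keys: %s' % ', '.join(sorted(missing)))
```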

#### workflow stages
- [ ] stages executed in sorted order
- [ ] commands can use variables
- [ ] `workingDir` sets the working directory of a command
- [ ] default parallel execution if iterating over files via the `$filePath` variable
- [ ] setting `parallelExec` to false means no parallel execution
- [ ] setting `parallelExec` does nothing if not iterating over files via the `$filePath` variable
- [ ] if `outputVariableRegexList` then output variables match the regex patterns listed
*Open question: what if they don't match?*
- [ ] if `runIf` used, command only runs when `runIf` condition is met
- [ ] if `repeatPeriod`, `repeatUntil`, `maxRepeats` used, all 3 must be specified
    - [ ] stage command repeats after `repeatPeriod` seconds
    - [ ] stage command repeats until `repeatUntil` condition is met
    - [ ] stage fails if command is repeated `maxRepeats` times
*Open question: what happens if a stage fails?*
- [ ] cannot use `workflow` reserved key in stage definition
- [ ] can use reserved keys: `id`, `stage`, `status`, `owner`, `startTime`, `startTimestamp`, `endTime`, `endTimeStamp`, `runTime`, `errorMessage`, `maxActiveJobs`, `nActiveJobs`, `nFiles`, `nProcessedFiles`, `nFailedFiles`, `nSkippedFiles`, `nAbortedFiles`, `nCompletedFiles`, `processedFiles`, `failedFiles`, `skippedFiles`, `abortedFiles`, `filePath`, `filePathList`, `filePathPattern`, `fileQueryDict` (not yet implemented), `dataDir`
- [ ] any user defined variables can be used
- [ ] output variables can be used as input variables 
*Open question: what happens if you define a reserved key with a value in your input keys?*
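
Two of the stage behaviors above, sorted-order execution and `$`-variable substitution in commands, can be sketched together. The stage/command dictionary shape is an assumption for illustration.

```python
# Sketch of the stage semantics above: stages run in sorted order by key,
# and commands may reference variables like $filePath via $-substitution.
# The stage/command dict shape is an assumption, not the DM schema.
from string import Template

def render_stage_commands(stages, variables):
    """Return (stageName, renderedCommand) pairs in sorted stage order."""
    rendered = []
    for stage_name in sorted(stages):
        command = Template(stages[stage_name]['command'])
        # safe_substitute leaves unknown $variables untouched
        rendered.append((stage_name, command.safe_substitute(variables)))
    return rendered
```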