diff --git a/getting_started/getting-started.md b/getting_started/getting-started.md new file mode 100644 index 0000000000000000000000000000000000000000..fa3443ebcb09c0f118358c973f1bb6f6542255e5 --- /dev/null +++ b/getting_started/getting-started.md @@ -0,0 +1,251 @@ +# Getting Started + +## Introduction + +The APS Data Management System is a system for gathering together experimental data, +metadata about the experiment and providing users access to the data based on a users +role. This guide is intended to provide beamline users an introduction to the basic +use of the Data Management System at the beamline. This process will involve creating an +experiment, associating users with this experiment and then adding data, in the form of +files, to the experiment. + +### Setting up the Environment + +On beamline Linux computers, users can set up the environment for using the Data +Management System by executing a setup script which is created as the Data Management +software is installed. This script is executed as + +> /home/DM\_INSTALL\_DIR/etc/dm.setup.sh + +where DM\_INSTALL\_DIR is the deployment directory for this beamline. This script +will set up a number of environment variables which define items such +as URLs for the various data services, the station name in the DM, the location +of a file which defines the login to be used when running commands and adding +the path to DM system commands to the PATH. + +##DM System Commands +----- +After execution of the setup script, the PATH will include the path to commands +for interacting with the DM System. A list of these commands, at the time of +this writing is shown below. + + + +These commands follow some conventions, like adding a --help for conveniece. When +the environment variable DM\_LOGIN\_FILE is defined and points to a file that +contains a username/pasword pair (in the form 'username|password') this information +is used for authentication when executing the commands. In practice the account +defined here should have the role of station manager for the beamline. + +### DM Command usage examples + +#### Getting some data into the system + +As stated early, the experiment is the central object for creating entries in +the Data Management System. For conveniece there are two commands which can +provide examples of creating experiments, assigning users & other attributes and +loading data into the system. These commands are + +> dm-STATION\_NAME-upload +> +> dm-STATION\_NAME-daq + +Where STATION\_NAME is the station name, in lower case with '-' removed. The two +commands here are differentiated by the two methods for moving data into the +system _upload_ and _daq_. moving files using _upload_ will move all files present +in a directory at the time the command is used while _daq_ moves will monitor +files while active and move new files as well. Both of these commands require +two parameters --experiment and --data-directory. With only these required parameters, +these commands will create an experiment named by --experiment and will move files +from --data-directory. Other parameters here will allow for specifying items such +as a list of users, specifying the destination directory (relative to the storage +directory), process a workflow using the data, etc. + +Without the optional parameters, use of these commands would look like + +> dm\-STATION_NAME\-upload --experiment=exp1 --data-directory=/home/bluser/data/2010-03 + +This command will create an experiment with no users and move files from +/home/bluser/data/2010-03 to STORAGE\_DIR/STATION\_NAME/EXP\_NAME on the data storage +server where + - STORAGE\_DIR is the base storage location on the storage sever + - STATION\_NAME is the station name defined on the experiment server + - EXP\_NAME is the experiment name defined by --experiment, _exp1_ in this case + +Adding other parameters to this command can add more information to the experiment +such as --users to add a list of users to the system. Other parameters such as +--dest-directory and --root-path allow customization of the path to the file on +the data storage server. Some beamlines for instance have relied on particular +directory structure for legacy analysis code. + +Sector 8-ID-I for instance has a file structure ROOT\_DIR/RUN/usernameYYYYMM with + - ROOT\_DIR base level for experiment data. In DM this is STORAGE\_ROOT/STATION\_NAME + - RUN is the APS Run cycle such as 2019-1 (year and cycle 1, 2 or 3) + - username is some form to identify the experimenter + - YYYY is the four digit year. + - MM is a 2 digit month, i.e. 02 for February +an example of this would be STORAGE\_ROOT/8idi/2017-2/dufresne201707. Here we +would use a command like + +> dm-8idi-upload --experiment=dufresne201707 +> --data-directory=/net/s8iddata/export/8-id-i/2017-2/dufresne201707 +> --dest-directory=2017-2 + +Sectors 33 & 34 use a file structure under ROOT\_DIR/username/YYYYMMDD/EXP_NAME/. +Here + - ROOT\_DIR base level for experiment data. In DM storage server this is + STORAGE\_ROOT/STATION_NAME + - YYYY four digit year + - MM 2 digit month + - DD 2 digit day + an example of this is ROOT\_DIR/jenia/20140820/FeCrbilayers. Here we would + use a command like + +> dm-33bm-upload --experiment=FeCrbilayers +> --data-directory=/net/s33data/export/33bm/data/jenia/20140820/FeCrbilayers +> --dest-directory=jenia/20140820 + +#### Creating experiments based on the ESAF/Proposal + +For convenience, at APS, it is possible to make use of the Proposal and ESAF systems +to create experiments based off the existing entries in these systems to populate +the list of users in the system. the --proposal-id and --esaf-id options on commands +like dm-STATION\_NAME-upload and dm-create-experiment will add a list of users which +are defined in the esaf or proposal to the created experiment. + +####Displaying information in the system + +Other DM commands show useful information, and can adjust what and how the +information is displayed. `dm-list-experiments` lists the experiments showing +info on the experiemnt name, experiment type, station, and start date. adding the +option --display-keys can allow the user to display more or less information. +--display-keys=name will display only the experiment name, and --display-keys=ALL +will display all keys associated with the experiment (one experiment line is show +as example). + +``` +startDate=2019-12-09 18:35:26.738217-06:00 name=yuyin201912 description=Study of dynamics and aging in solvent segregation driven colloidal gel in a binary solvent (Proposal id: 64326) rootPath=2019-3 experimentStationId=5 id=3579 experimentTypeId=9 experimentStation={u'description': u'Sector 8 ID I', u'id': 5, u'name': u'8IDI'} experimentType={u'id': 9, u'name': u'XPCS8', u'description': u'XPCS Group (Sector 8)'} +``` + +Adding the --display-format will change how the data is displayed. By default +the output is a simple dictionary, --display-format=html will give HTML output. + +``` + <tr> <td>2019-12-09 18:35:26.738217-06:00</td> <td>yuyin201912</td> <td>Study of dynamics and aging in solvent segregation driven colloidal gel in a binary solvent (Proposal id: 64326)</td> <td>2019-3</td> <td>5</td> <td>3579</td> <td>9</td> <td>{u'description': u'Sector 8 ID I', u'id': 5, u'name': u'8IDI'}</td> <td>{u'id': 9, u'name': u'XPCS8', u'description': u'XPCS Group (Sector 8)'}</td> </tr> +``` + +Specifying --display-format=pprint will give a nicer (prettier) output, +breaking nicely on different lines, indenting properly where some items are objects. + +``` +{ u'description': u'Study of dynamics and aging in solvent segregation driven colloidal gel in a binary solvent (Proposal id: 64326)', + u'experimentStation': { u'description': u'Sector 8 ID I', + u'id': 5, + u'name': u'8IDI'}, + u'experimentStationId': 5, + u'experimentType': { u'description': u'XPCS Group (Sector 8)', + u'id': 9, + u'name': u'XPCS8'}, + u'experimentTypeId': 9, + u'id': 3579, + u'name': u'yuyin201912', + u'rootPath': u'2019-3', + u'startDate': u'2019-12-09 18:35:26.738217-06:00'} +``` + +#### Creating experiments based on the ESAF/Proposal + +For convenience, at APS, it is possible to make use of the Proposal and ESAF systems +to create experiments based off the existing entries in these systems to populate +the list of users in the system. the --proposal-id and --esaf-id options on commands +like dm-STATION\_NAME-upload and dm-create-experiment will add a list of users which +are defined in the esaf or proposal to the created experiment. + + +### dm-station-gui + + +#### Overview +One of the methods to manage processes in the APS Data Management System is the +application `dm-station-gui`. This application is a PyQt application which gives +access to add/modify/control items such as experiments, file transfer to storage +location ('daq' and 'uploads'), workflows and processing jobs. An example of +this application is shown in the figure below. + + + +#### Experiments + +`dm-station-gui` opens showing a tab that lists experiments which have been added +to the DM System. At the beamline, these experiments generally will correspond +to an accepted proposal in the APS Proposal system. Experiments in the DM +System define an entity that ties sets of managed data together. When an +experiment is selected by double clicking or by clicking and then clicking the +*Use Selected* button, the contents of the Experiment tab changes to give +details about that experiment. Much of this information is pulled from the +proposal/ESAF databases. This is shown in the image below. + + + +This view of the data enable a couple of key features of the DM System. The +ability to associate files with the Management system, and the ability to set +which users have access to the system and therefore which data they can access. +Selecting users that will have access to experiment data is done by clicking the +*Modify Users* button below the user list and then selecting users in the right +list & pressing the arrow button between the lists to add users or selecting +users from the left list and clicking the arrow button to delete users from the +list. Click the _Save_ button at the bottom to accept the changes and go back +to the Experiment detail or click *Back* button in the upper left to exit back +to Experiment detail without saving. This view is shown below. + + + +#### Getting Files into Data Management + +The overall purpose of this system is to get data into the system and providing +access control to that data. This means linking the data files to the users in +an experiment. For each beam station there is storage location defined for that +station. On the `Experiments` tab there are a couple of items relevant to +getting the data onto that storage location. These items are shown in the image +below. Files transferred onto the storage location will go into: +*STORAGE_LOCATION/(storage_root_path)/(experiment_name)*. Files to go into this +directory are specified in the entry "Data Directory or Single File Path". Note +that if the data directory is specified, only the files will go into the +storage location, not a new subdirectory with the transferred files. Any +subdirectories will be copied as will the contents. + +Once the the storage location and source are defined the transfer can be started +in one of two ways: +- A monitor can be placed on the directory and any new files/subdirectories in the directory will be transferred if you select `Start DAQ` +- A one time copy will happen and all files/subdirectories currently in the directry will be copied if you select `Start Upload` + +When a tranfer is started, the display will switch to either the `DAQs` or +`Uploads` tab. The tab will show the status of all transfers that have been +started since the last restart of the system. Status of a particular transfer is +shown by color +- green: Done success +- yellow: running +- red: Done with an errors + +An example of this is shown below. + + + +Clicking on a particular transfer, on `DAQs` or `Uploads` will switch the +view to show detail of that transfer including errors. + + + +#### Workflows and processing data +The tab 'Workflows' allows defining a workflows which will provide a template for processing data. A workflow is simply a set of defined steps that will process data. Workflows each step in the workflow executes a command on the host computer. Commands can + * be as simple as a single command such as listing a directory + * can tranfer files between computers + * can launch & monitor jobs on a remote cluster + * can do just about anything that can be scripted on a computer. +Workflow steps allow defining inputs that are defined at runtime and can also create outputs that can be used in following steps. + +The `Workflows` tab is shown below and contains a list of workflows that have been defined. + + + +Clicking on a workflow and then selecting the 'Inspect' button you will be able to examine and possibly modify the steps in the workflow. diff --git a/getting_started/images/dm-station-gui-experiments-detail.png b/getting_started/images/dm-station-gui-experiments-detail.png new file mode 100644 index 0000000000000000000000000000000000000000..5d0fcc12c4860814c677cb7613627a7a425c4d2c Binary files /dev/null and b/getting_started/images/dm-station-gui-experiments-detail.png differ diff --git a/getting_started/images/dm-station-gui-experiments-file-location-items.png b/getting_started/images/dm-station-gui-experiments-file-location-items.png new file mode 100644 index 0000000000000000000000000000000000000000..b6ebcfa08b10cbbf7c3b99452866d7ae151bba51 Binary files /dev/null and b/getting_started/images/dm-station-gui-experiments-file-location-items.png differ diff --git a/getting_started/images/dm-station-gui-experiments-user-management.png b/getting_started/images/dm-station-gui-experiments-user-management.png new file mode 100644 index 0000000000000000000000000000000000000000..4a2abd48697e610db44a315661ac57efcdf05b46 Binary files /dev/null and b/getting_started/images/dm-station-gui-experiments-user-management.png differ diff --git a/getting_started/images/dm-station-gui-experiments.png b/getting_started/images/dm-station-gui-experiments.png new file mode 100644 index 0000000000000000000000000000000000000000..69a1225441cfcad031ca7ac0a8b70b640e540283 Binary files /dev/null and b/getting_started/images/dm-station-gui-experiments.png differ diff --git a/getting_started/images/dm-station-gui-uploads-detail.png b/getting_started/images/dm-station-gui-uploads-detail.png new file mode 100644 index 0000000000000000000000000000000000000000..d23641dd36b41eb600b668e97e12b242c914c7c2 Binary files /dev/null and b/getting_started/images/dm-station-gui-uploads-detail.png differ diff --git a/getting_started/images/dm-station-gui-uploads.png b/getting_started/images/dm-station-gui-uploads.png new file mode 100644 index 0000000000000000000000000000000000000000..3c8015984bdfa8d2ea871dc7ebf1bab254ca0c96 Binary files /dev/null and b/getting_started/images/dm-station-gui-uploads.png differ diff --git a/getting_started/images/dm-station-gui-workflows.png b/getting_started/images/dm-station-gui-workflows.png new file mode 100644 index 0000000000000000000000000000000000000000..436956bacbac6b4a439f976d0fe5e4a542f0f8ea Binary files /dev/null and b/getting_started/images/dm-station-gui-workflows.png differ diff --git a/getting_started/images/dm-system-commands.png b/getting_started/images/dm-system-commands.png new file mode 100644 index 0000000000000000000000000000000000000000..999195a55af47d7d2f38bfba0acaa7a5733e4674 Binary files /dev/null and b/getting_started/images/dm-system-commands.png differ