Commit 793bb88c authored by dmadmin

Bring post-install test setup info from single-node version to multi-node version. Still need to review for changes needed for multi-node testing.
parent 05c9b928
......@@ -69,11 +69,11 @@ For initial test purposes, it is necessary to shortcut some parts of the service
* dm.proc-web-service.conf
- Comment out the entry for principalAuthenticator2 which uses the LDAP authenticator
* dm.ds-web-service.conf
- Comment out the entry for principalAuthenticator2 which uses the LDAP authenticator
- Comment out the two lines for platformUtility which use LinuxUtility and LdapLinuxPlatformUtility
- Add a new platformUtility line in place of the other two
* platformUtility=dm.common.utility.noopPlatformUtility.NoopPlatformUtility()
- Change value for `manageStoragePermissions` in ExperimentManager section to False
- Comment out the entry for principalAuthenticator2 which uses the LDAP authenticator
- Comment out the two lines for platformUtility which use LinuxUtility and LdapLinuxPlatformUtility
- Add a new platformUtility line in place of the other two
- platformUtility=dm.common.utility.noopPlatformUtility.NoopPlatformUtility()
- Change value for `manageStoragePermissions` in ExperimentManager section to False
### Removing Test system
Often in the development of Data Management system components it will be necessary to remove/reload components of the system. The script _dm/_remove/_test/_test/_system.sh_ in the sbin directory of the 'dm' repository (/local/DataManagement/dev/sbin from the directory described above) issues commands to clear out the database and configurations to allow creating a clean installation of the system.
......
......@@ -8,7 +8,7 @@ In production at APS we are using RedHat Enterprise Linux 7 on all machines. Fo
The typical multiple node VM setup uses two network interfaces. These interfaces are configured in the VirtualBox setup. The first network interface is configured as a generic NAT connection which will allow the VM to access the public network in order to facilitate support tool downloads during installation. This would also allow access to facility resources if required. This could be used to extend the __DM__ system to connect to facility resources such as the aps\_db\_web\_service which provides access to systems such as the APS Experiment Safety Assessment Form (ESAF) System and Beamline Scheduling System (BSS). The second network interface is configured as a 'Host-only Adapter' on the 'vboxnet0' network. This interface will be used to set up the systems to communicate with each other.
The __DM__ System installation process will use the 'hostname -f' command to get the system name. The host name is used by the __DM__ system when configuring services to make them available 'publicly' on the 'Host-only Adapter' network. This makes services available to the other VMs running on the 'vboxnet0' network. For the installation process to obtain a name for each system, the hostname must be set on each system during network setup. The system hostname on a CentOS system can be set with the hostnamectl command. In a multiple-node environment, VMs will also need some form of name resolution for the VM nodes in the system. This can be achieved by adding node entries to the /etc/hosts file.
The __DM__ System installation process will use the 'hostname -f' command to get the system name. The host name is used by the __DM__ system when configuring services to make them available 'publicly' on the 'Host-only Adapter' network. This makes services available to the other VMs running on the 'vboxnet0' network. For the installation process to obtain a name for each system, the hostname must be set on each system during network setup. The system hostname on a CentOS system can be set with the hostnamectl command. In a multiple-node environment, VMs will also need some form of name resolution for the VM nodes in the system. This can be achieved by adding node entries to the /etc/hosts file. __Once the node names are changed, reboot the system.__
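As a rough sketch of the name-resolution step, the snippet below prints the /etc/hosts lines for a two-node setup. The node names and the 192.168.56.x addresses are assumptions chosen for illustration (use whatever your 'vboxnet0' network actually assigns), and `hosts_entries` is a hypothetical helper, not part of the __DM__ distribution.

```shell
#!/bin/sh
# Hypothetical node names and host-only addresses; substitute your own.
DS_NODE="data-storage"
DS_ADDR="192.168.56.11"
EXP_NODE="exp-station"
EXP_ADDR="192.168.56.12"

# Print the /etc/hosts lines that let the nodes resolve each other.
hosts_entries() {
    printf '%s\t%s\n' "$DS_ADDR" "$DS_NODE"
    printf '%s\t%s\n' "$EXP_ADDR" "$EXP_NODE"
}

hosts_entries

# To apply on each VM (as root), something like:
#   hostnamectl set-hostname data-storage   # on the data-storage VM
#   hosts_entries >> /etc/hosts
#   hostname -f                             # confirm the name DM will see
```

The commands that need root are left as comments so the snippet can be run safely to preview the entries first.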
The DM installation process uses scp to transfer some files (such as Certificate Authority files) from one node to another during the setup process. To facilitate this process, ssh-keys should be generated for the different nodes and be copied into the authorized key files on the data-storage node. On both of these systems the following command will generate a set of RSA key files.
......@@ -39,6 +39,7 @@ exp-station ports
| 18182 | Mongo Express Application, localhost |
| 8182 | Nginx Server |
__After these ports are added select__ `Reload Firewall` __from the Options menu.__
### Support Tools Installation
Before installation of the APS Data Management System a number of tools need to be installed on the server nodes. The __DM__ system depends on tools such as Java, Python, Postgresql, MongoDB, ZeroMQ, etc. A set of scripts has been established which will download, build (when necessary) and install these tools for use with the __DM__ system. While it is possible to install most of these tools using more conventional means (e.g. RPM on Linux), the install scripts provided here build and install these tools specifically for use with the __DM__ system.
......@@ -110,8 +111,6 @@ To install _dm_ compnents for the data-storage node
- data storage directory - this directory will serve as the root directory for storage of data in the system. During transfers initiated by the daq web service, files will be moved into subdirectories of this system. The subdirectory paths will be constructed from beamline name, experiment name and a path specified by the user in the transfer setup.
- __dm__ system account - This is user __dm__ in the Data Management system. This user has administrative privileges in the Data Management system. This is a user in the 'dm' user table. Each developer can set this to a unique value.
- __dmadmin__ LDAP password - This password provides the Data Management software access to the APS/ANL LDAP system to look up information in that database. This is a password to an external system and is therefore a pre-existing password that developers will need to get from the Data Management system administrator.
#### exp-station Node Installation
......@@ -125,4 +124,173 @@ To install _dm_ components on the exp-station:
- This will start the installation process which will prompt for
\ No newline at end of file
### Post-Install configuration
For initial test/development purposes, a few changes are necessary to short-circuit a few features of the system. These changes disable the use of LDAP and Linux services to manage file permissions and access control based on users in an experiment. To do this, edit the following files:
##### On the data-storage Node
* dm-aps-db-web-service.conf (_if included_)
- Comment out the entry for the principalAuthenticator2 which uses the LDAP authenticator
* dm-ds-web-service.conf
- Comment out the entry for the principalAuthenticator2 which uses the LDAP authenticator
- Comment out the two lines for platformUtility which use LinuxUtility and LdapLinuxPlatformUtility
- Add a new platformUtility line in place of the other two
- platformUtility=dm.common.utility.noopPlatformUtility.NoopPlatformUtility()
- Change value for `manageStoragePermissions` in ExperimentManager section to False
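The edits listed above for dm-ds-web-service.conf could be scripted roughly as follows. This is only a sketch run against a throwaway sample file: the sample contents, the `placeholder.*` class paths, and the `apply_test_overrides` helper are all made up for illustration, and it assumes the settings are written as `key=value` with no surrounding spaces. Check the real file before adapting it.

```shell
#!/bin/sh
# Throwaway sample standing in for dm-ds-web-service.conf (contents illustrative).
CONF="$(mktemp)"
cat > "$CONF" <<'EOF'
principalAuthenticator2=placeholder.LdapAuthenticator()
platformUtility=placeholder.LinuxUtility()
platformUtility=placeholder.LdapLinuxPlatformUtility()
[ExperimentManager]
manageStoragePermissions=True
EOF

# Comment out the LDAP authenticator and the Linux/LDAP platform utilities,
# add the no-op platform utility once, and turn off storage permission management.
apply_test_overrides() {
    awk '
        /^principalAuthenticator2=/ { print "#" $0; next }
        /^platformUtility=/ && !/NoopPlatformUtility/ {
            if (!added) {
                print "platformUtility=dm.common.utility.noopPlatformUtility.NoopPlatformUtility()"
                added = 1
            }
            print "#" $0
            next
        }
        /^manageStoragePermissions=/ { print "manageStoragePermissions=False"; next }
        { print }
    ' "$1"
}

# Write the result next to the original so it can be reviewed before replacing it.
apply_test_overrides "$CONF" > "$CONF.new"
cat "$CONF.new"
```

Writing to a separate `.new` file rather than editing in place keeps the original available for comparison with `diff`.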
##### On the exp-station Node
* dm-cat-web-service.conf
- Comment out the entry for the principalAuthenticator2 which uses the LDAP authenticator
* dm-daq-web-service.conf
- Comment out the entry for the principalAuthenticator2 which uses the LDAP authenticator
* dm-proc-web-service.conf
- Comment out the entry for the principalAuthenticator2 which uses the LDAP authenticator
* dm-ds-web-service.conf
- Comment out the entry for the principalAuthenticator2 which uses the LDAP authenticator
After these modifications the services should be restarted:
* data-storage
- `DM_INSTALL_DIR/production/etc/init.d/dm-ds-services restart` (if installed)
* exp-station
- `DM_INSTALL_DIR/production/etc/init.d/dm-daq-services restart`
### Overview of the system & tools
The installed development system has a few tools for managing the system. This section describes some of the available tools and process ideas for the system. The next section will describe some steps to walk through final setup and use.
- A web portal which should now be up and running at the URL https://localhost:8181/dm. This portal is powered by a Payara application server which has its own setup page at https://localhost:4848 (once configured above, you may not need to do much with the Payara config page).
- A PyQt app, dm-station-gui, which can be used to set up and monitor experiment definitions, file transfers and data workflows.
- A set of command-line scripts for manipulating the system. These commands are made accessible by sourcing the file DM_INSTALL_DIR/etc/dm.setup.sh (note that some definitions are blank in the default version of this file).
- There are also a couple of underlying databases holding the data.
- A postgresql database which holds standard data such as user info, beamline/station definitions, experiments, access info linking users to experiments and data.
- A mongo database, which allows a bit more flexibility. This stores info on workflows and file information.
To start with, the Data Management (DM) System is configured with one user, __dm__, which is a management account (the third account listed above). One of the first items to handle is to create accounts that will be associated with managing the beamline setup and some (possibly the same accounts) that will be associated with experiments. In practice, the DM system is loosely linked to the list of users in the APS Proposal/ESAF system. Accounts on the ESAF system are coordinated with a list of users on the DM system. This is done using the dm-update-users-from-aps-db command, which requires a configuration file (find a good place to put the file). Another possibility is to create users manually from the supplied web portal. Note that, in the ESAF system, the user name is the badge number of the individual, while in the DM system a 'd' is prepended to the badge number for the user name.
Once users have been added to the system, the DM web portal can be used to associate users with a beamline or with experiments that are created. The __dm__ user can be used to log into the web portal and from the _Experiment Stations_ tab new stations can be added or existing stations, such as the test station, can be edited and station managers can be added. To create experiments, station managers can log into the system and add/manage experiments for that station. From the test installation the user can manually create experiments & add users to the experiment. In practice, at the APS, when a user adds an experiment they are provided with a list of experiments from the proposal system and the list of users is populated from the (Proposal/ESAF ??) info. Note that it is also possible to add/modify experiments either through the dm-station-gui or through the command line interface with commands such as dm-add-experiment or dm-update-experiment.
After defining an experiment, it is possible to then manage tasks such as file transfers (daq or upload) or workflows & processing jobs. These tasks can be done using either the dm-station-gui or by the command line interface.
'daq' transfers monitor selected directories for files from a live data acquisition process and move them from the collected location to a 'storage' location. 'upload' transfers copy any existing files from the collected location to the 'storage' location. As files are transferred, they are placed into a storage directory with subdirectories for the _(station name)/(storage root path)/(experiment name)_.
DM workflows define a sequence of commands that would operate on data sets to:
- Stage data
- Move the data to a particular location such as a transfer between globus endpoints
- Process data using reduction/analysis algorithms
- Add results to files that are tracked by Data Management
Each step in a workflow can define inputs and outputs which can then be used in subsequent steps.
### Restarting the test system
If needed, the test system can be restarted by running a couple of startup commands. Change directory to the DM install directory and then run:
* data-storage
* dm/etc/init.d/dm-ds-services restart
* exp-station
* dm/etc/init.d/dm-daq-services restart
This may be necessary if, for instance, the system has been rebooted. These commands restart several services in the install directory. If you have modified something in only one of these services, you may be able to restart just that service. For instance, if only the data storage web service needs to be restarted, you can run
* dm/etc/init.d/dm-ds-webservice restart
### Testing the system
As mentioned earlier, after the initial install we have one user, __dm__, which is intended for the overall system. We now need to set up a user for administration of a beamline and start some steps to use the system.
You should at this point have a directory which has both the _Data Management_ and _support_ software installed. After doing the installs described above there should be a number of other directories as well, such as etc, log and var. We are now going to walk through changes needed in the etc directory which will allow us to interact with the system.
1. source the file _etc/dm.setup.sh_. This defines a number of environment variables and modifies the path to include, in particular, a number of commands beginning with __dm-__ which interact with the underlying system to add/modify users, experiments, upload and daq (both to move files) and workflows and processes (to define & monitor processing of the collected data).
- source etc/dm.setup.sh
2. add a user __dmtest__ to the system which will assume the role of managing what is going on in the system.
- dm-add-user --username dmtest --first-name DM --last-name Test --password dmtest
3. add a system role to the created user __dmtest__ to make this a manager of the station TEST which is already defined in the system. You will be asked to provide username & password. Use username __dm__ system account and the password given during setup above.
- dm-add-user-system-role --role Manager --station TEST --username dmtest
4. create a file, _etc/.dmtest.system.login_, in the same directory as dm.setup.sh. This will contain the username & password.
- dmtest|dmtest (example contents)
5. Edit the file _etc/dm.setup.sh_, the one from step 1, to modify the line DM\_LOGIN\_FILE to point at the file created in step 4.
- DM\_LOGIN\_FILE=/home/dmadmin/etc/.dmtest.system.login (modified in file)
6. Re-source the setup file from step 1.
- source etc/dm.setup.sh
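Steps 4 through 6 can be sketched in shell as shown below. To keep it safe to run, the sketch works in a temporary directory with a one-line stand-in for _etc/dm.setup.sh_. Whether the real file declares the variable with a leading `export` is an assumption, so the `sed` expression accepts either form. Restricting the login file to its owner is a suggested precaution, not something the steps above require.

```shell
#!/bin/sh
# Throwaway directory standing in for the install directory's etc/.
ETC_DIR="$(mktemp -d)"
# One-line stand-in for etc/dm.setup.sh; the real file has many more lines.
printf 'export DM_LOGIN_FILE=\n' > "$ETC_DIR/dm.setup.sh"

# Step 4: login file in username|password form, readable only by its owner.
LOGIN_FILE="$ETC_DIR/.dmtest.system.login"
( umask 077; printf 'dmtest|dmtest\n' > "$LOGIN_FILE" )

# Step 5: point DM_LOGIN_FILE at the new file (optional leading "export " is kept).
sed "s|^\(export \)\{0,1\}DM_LOGIN_FILE=.*|\1DM_LOGIN_FILE=$LOGIN_FILE|" \
    "$ETC_DIR/dm.setup.sh" > "$ETC_DIR/dm.setup.sh.new"
mv "$ETC_DIR/dm.setup.sh.new" "$ETC_DIR/dm.setup.sh"

# Step 6: re-source the setup file and confirm the value.
. "$ETC_DIR/dm.setup.sh"
echo "DM_LOGIN_FILE=$DM_LOGIN_FILE"
```

On a real installation, replace `ETC_DIR` with the actual etc directory and skip the line that creates the stand-in setup file.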
At this point we are in a position to start using the system. As a first test we will add a few test users to the system and then run the command dm-test-upload which will
* create a new experiment
* attach a list of users to the experiment
* define a location where data exists
* define a path to store the data in the storage system
* start an upload which copies data from the original location to the specified directory on the storage system
To accomplish this we use the following commands.
To add 3 users
```
dm-add-user --username jprofessor --last-name Professor --first-name John
dm-add-user --username gpostdoc --last-name Postdoc --first-name George
dm-add-user --username jgradstudent --last-name Gradstudent --first-name Jane
```
To add an experiment, define the users, and kick off an upload:
```
dm-test-upload --experiment=e1 --data-directory=/home/dmadmin/testData --dest-directory=MyFirstExperiment --users=jprofessor,gpostdoc,jgradstudent
```
This should provide output like the following:
```
EXPERIMENT INFO
id=23 name=e1 experimentTypeId=1 experimentStationId=1 startDate=2019-11-07 16:04:30.919828-05:00
UPLOAD INFO
id=ec513c1d-45a3-414f-8c56-50a9d4d6dbdd experimentName=e1 dataDirectory=/home/dmadmin/testData status=pending nProcessedFiles=0 nProcessingErrors=0 nFiles=0 startTime=1573160671.17 startTimestamp=2019/11/07 16:04:31 EST
```
This command will
* Create an experiment named `e1` with
- The three experimenters `jprofessor`, `gpostdoc` & `jgradstudent`
- The data that is being collected will be found at `/home/dmadmin/testData`
- Any data/files found in `/home/dmadmin/testData` will be found in a directory `TEST/e1/MyFirstExperiment` of the storage location defined for the Data Storage service.
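To check where the uploaded files should land, the destination path can be composed from its parts. The sketch below assumes the layout is storage root, then station, then experiment, then the `--dest-directory` value. That layout is inferred from the `TEST/e1/MyFirstExperiment` path above and from the `storageDirectory` value reported by `dm-get-experiment` in this walkthrough. `expected_storage_path` is a hypothetical helper, not a DM command.

```shell
#!/bin/sh
# Compose the expected destination directory for an upload.
# Arguments: storage root, station name, experiment name, --dest-directory value.
expected_storage_path() {
    printf '%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

expected_storage_path /home/dmadmin/storage TEST e1 MyFirstExperiment

# On the data-storage node, the uploaded files could then be listed with:
#   ls -R "$(expected_storage_path /home/dmadmin/storage TEST e1 MyFirstExperiment)"
```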
Output like the following
```
We trust you have received the usual lecture from the local System
```
likely means that one of the config files did not disable the principalAuthenticator2, LinuxUtility or LdapLinuxPlatformUtility as described at the end of the installation section of this document.
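One way to track down the offending file is to search the service configuration files for settings that are still active (not commented out). The sketch below demonstrates the idea on a made-up sample file. `find_active_ldap_settings` is a hypothetical helper, the `placeholder.*` values are illustrative, and the location of the real `.conf` files depends on your installation.

```shell
#!/bin/sh
# List uncommented principalAuthenticator2 / platformUtility lines in the given
# files, ignoring the no-op platform utility that the test setup is meant to use.
find_active_ldap_settings() {
    # /dev/null forces grep to print file names even for a single file.
    grep -nE '^[[:space:]]*(principalAuthenticator2|platformUtility)[[:space:]]*=' "$@" /dev/null \
        | grep -v 'NoopPlatformUtility'
}

# Sample file with one setting that was missed.
SAMPLE="$(mktemp)"
cat > "$SAMPLE" <<'EOF'
#principalAuthenticator2=placeholder.LdapAuthenticator()
platformUtility=placeholder.LinuxUtility()
platformUtility=dm.common.utility.noopPlatformUtility.NoopPlatformUtility()
EOF

find_active_ldap_settings "$SAMPLE"
```

Any line it prints names a file and line number that still needs to be commented out before the services are restarted.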
We can now look at the results of what we have done in a number of ways:
The commands `dm-list-users` and `dm-get-experiment --experiment=e1 --display-keys=ALL --display-format=pprint` will give
```
id=1 username=dm firstName=System lastName=Account
id=2 username=dmtest firstName=DM lastName=Test
id=3 username=jprofessor firstName=John lastName=Professor
id=4 username=gpostdoc firstName=George lastName=Postdoc
id=5 username=jgradstudent firstName=Jane lastName=Gradstudent
```
and
```
{ u'experimentStation': { u'description': u'Test Station',
u'id': 1,
u'name': u'TEST'},
u'experimentStationId': 1,
u'experimentType': { u'description': u'Experiment type used for testing',
u'id': 1,
u'name': u'TEST'},
u'experimentTypeId': 1,
u'experimentUsernameList': [u'gpostdoc', u'jgradstudent', u'jprofessor'],
u'id': 23,
u'name': u'e1',
u'startDate': u'2019-11-07 16:04:30.919828-05:00',
u'storageDirectory': u'/home/dmadmin/storage/TEST/e1',
u'storageHost': u'localhost',
u'storageUrl': u'extrepid://localhost/home/dmadmin/storage/TEST/e1'}
```
\ No newline at end of file