Commit 3d2be484 authored by dmadmin

Add the beginning of a new set of instructions to install software on multiple nodes.
parent 87a2d962
## Setup of Development/Test Data Management System on Multiple Nodes
In a typical setup, it is necessary to install the Data Management System on multiple nodes. Centralizing overall long-term data storage, for instance, would argue for running the Data Storage Service on one server, or possibly a small set of servers. On a given experiment, it may be necessary to have more than one DAQ node to deal with different detectors. This document describes a two-node setup. These nodes will be
* The data-storage node. This will provide the data storage service and Web Portal.
* The exp-station node. This will provide the _daq_, _proc_ and _cat_ web services which will manage moving data from the collection system to the storage system, processing the data as needed and cataloging steps in storage and processing.
### Computer setup.
In production at APS we are using RedHat Enterprise Linux 7 on all machines. For development we are using either RHEL 7 machines or CentOS 7 machines. In the case of CentOS, this work is typically done on a virtual machine using VirtualBox. When installing, we typically select a development workstation configuration as a starting point for work. In addition to this, a number of requirements have been put together and can be found [here](https://confluence.aps.anl.gov/display/DMGT/DM+Station+System+Requirements). When using VirtualBox, once the OS installation has completed, the system can be cloned to make additional machines with the same configuration. It is therefore recommended to keep a copy of the VM to use as a starting point for repeating the work.
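As a rough sketch of the cloning step, the base VM could be cloned from the command line with `VBoxManage clonevm`. The base VM name `dm-base` is a hypothetical placeholder, and using the node names from this document as VM names is an assumption. The `run` helper only prints the commands unless `DRY_RUN=0` is set, and the base VM is assumed to be powered off when cloning.

```shell
#!/bin/sh
# Sketch: clone a prepared base VM into the two tutorial nodes.
# "dm-base" is a hypothetical name for the VM kept as a starting point.
BASE_VM="${BASE_VM:-dm-base}"

# Print commands by default; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

clone_node() {
    # $1 = name of the new VM
    run VBoxManage clonevm "$BASE_VM" --name "$1" --register
}

clone_node data-storage
clone_node exp-station
```

The same result can be reached through the VirtualBox GUI clone dialog; the command form is only convenient when the setup is repeated often.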
The typical multiple-node VM setup uses two network interfaces, both configured in the VirtualBox setup. The first network interface is configured as a generic NAT connection. This gives the VM access to the public network, which is needed to download support tools during installation and to reach facility resources if the __DM__ system is extended to connect to them. The aps\_db\_web\_service does this, for example, to provide access to systems such as the APS Experiment Safety Assessment Form (ESAF) System and the Beamline Scheduling System (BSS). The second network interface is configured as a 'Host-only Adapter' on the 'vboxnet0' network. This interface is used to set up the systems to communicate with each other.
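The two-interface layout described above could be applied from the command line roughly as follows. This is a sketch, not the procedure used at APS: the VM names are assumptions taken from the node names in this document, the 'vboxnet0' host-only network is assumed to exist already, and each VM must be powered off while `VBoxManage modifyvm` runs. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: give a VM one NAT interface and one host-only interface.

# Print commands by default; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

configure_nics() {
    # $1 = VM name (assumed to be powered off)
    # Interface 1: NAT, for downloads and access to facility resources.
    run VBoxManage modifyvm "$1" --nic1 nat
    # Interface 2: host-only on vboxnet0, for traffic between the nodes.
    run VBoxManage modifyvm "$1" --nic2 hostonly --hostonlyadapter2 vboxnet0
}

configure_nics data-storage
configure_nics exp-station
```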
The __DM__ System installation process uses the `hostname -f` command to get the system name. The __DM__ system uses this host name when launching services so that they are available 'publicly' on the 'Host-only Adapter' network, and therefore to the other VMs running on the 'vboxnet0' network. For each system to receive a name during network setup, the hostname must be set on each system. On a CentOS system the hostname can be set with the `hostnamectl` command.
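A minimal sketch of setting and checking the hostname on one node is shown here. Using the node names `data-storage` and `exp-station` as hostnames is an assumption made for illustration; substitute whatever names your network uses. The commands are only printed unless `DRY_RUN=0` is set, since `hostnamectl set-hostname` changes system configuration and needs root privileges.

```shell
#!/bin/sh
# Sketch: set the hostname of a node and show what the DM installer will see.

# Print commands by default; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

set_node_hostname() {
    # $1 = hostname to assign (hypothetical example names are used below)
    run sudo hostnamectl set-hostname "$1"
    # The DM installation reads the name with this same command.
    run hostname -f
}

# Run on the data-storage VM; use "exp-station" on the other VM.
set_node_hostname data-storage
```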
The system installed fro
### Support Tools Installation
Before installation of the APS Data Management System, a number of tools need to be installed on the server nodes. The __DM__ system depends on tools such as Java, Python, PostgreSQL, MongoDB, ZeroMQ, etc. A set of scripts has been established which will download, build (when necessary) and install these tools for use with the __DM__ system. While it is possible to install most of these tools using more conventional means (e.g. RPM on Linux), the install scripts provided here build and install these tools specifically for use with the __DM__ system.
For the purposes of this tutorial, we are creating two nodes which will contain different pieces of the __DM__. One node will be referred to as the data-storage node. This will contain the data storage web service and the PostgreSQL database which contains the user database. The second node will be referred to as the exp-station node. This node will provide the cat web service (a catalog of the stored data), the daq web service (provides a way to move collected data) and the proc web service (provides a means to process data).
These scripts can be found in the APS git repository at:
[https://git.aps.anl.gov/DM/dm-support.git](https://git.aps.anl.gov/DM/dm-support.git)
* Select an account (such as dmadmin) which will build, install and manage the __DM__ system.
* Select a parent location to install the system and create a subdirectory __DM__ to contain the __DM__ system and the support tools. We will refer to this directory in future sections as DM\_INSTALL\_DIR.
* Install a copy of the code from the _support_ git repository in DM\_INSTALL\_DIR. This can be done in a variety of ways (the clone-based options should be the most common)
- Grab a zip file from the APS Gitlab website (from URLs above) and unzip the file.
- Clone the repositories directly into DM\_INSTALL\_DIR (basically like cloning a forked repo shown below)
- Fork the repository following the fork link in the top right of the project page and then clone the repository as shown below. The example shown clones the dm-support repository into a directory __support__ and the __DM__ repository into a directory __dev__. In each case the clone is pulled from the user _USERNAME_'s fork of the repository.
> git clone https://git.aps.anl.gov/_USERNAME_/dm-support.git __support__ (Assumes forking repository)
* Change directory to the _support_ directory
> cd support
* First we will install support tools needed by the data-storage node. To do this run the command `./sbin/install_support_ds.sh`. This will install postgresql, openjdk, ant, payara, python and a number of needed python modules. As this script runs, you will be prompted to provide passwords for the master and admin accounts for the Payara web server. These will be used to manage the Payara server, which will provide a portal for managing some parts of the DM.
* Next we will install support tools needed by the exp-station. To do this run the command `./sbin/install_support_daq.sh`. This will install Python and a number of needed Python modules, as well as Python 3 and the same set of modules for it. Note that in the near future, this should become just Python 3 and Python 3 modules.
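The steps in the list above can be sketched as a single script. The default location `$HOME/DM` for DM\_INSTALL\_DIR is an assumption, and the clone here is taken directly from the main repository rather than from a personal fork. The script names come from the steps above. Commands are only printed unless `DRY_RUN=0` is set, because the real scripts download and build software and prompt for Payara passwords.

```shell
#!/bin/sh
# Sketch: fetch the support repository and run the install script for a node.
DM_INSTALL_DIR="${DM_INSTALL_DIR:-$HOME/DM}"   # assumed location
SUPPORT_REPO="https://git.aps.anl.gov/DM/dm-support.git"

# Print commands by default; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_support() {
    # $1 = "ds" on the data-storage node, "daq" on the exp-station node
    run mkdir -p "$DM_INSTALL_DIR"
    run git clone "$SUPPORT_REPO" "$DM_INSTALL_DIR/support"
    run cd "$DM_INSTALL_DIR/support"
    run "./sbin/install_support_$1.sh"
}

# On the data-storage node; use "install_support daq" on the exp-station node.
install_support ds
```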
### Data Management component installation
Once again, we are installing two different systems, each hosting a different part of the __DM__ system and therefore providing different features. As with the support tools, scripts have been developed to install and configure the components of the system. These scripts can be found at:
[https://git.aps.anl.gov/DM/dm.git](https://git.aps.anl.gov/DM/dm.git)
To work with the installation scripts in this package, the code should be installed with a given structure. The contents of this repository should be cloned in DM\_INSTALL\_DIR into a directory corresponding to a version tag. This allows a system in operation to be updated by adding a new versioned directory. Initially, and each time the system is updated, a symbolic link called _production_ in DM\_INSTALL\_DIR should be pointed at the new version tag directory. Similarly, if it is discovered that fallback is necessary, the link can be moved back to an older version.
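The versioned-directory layout can be illustrated with a small sketch. The version names `v1.0` and `v1.1` are hypothetical, and empty directories in a scratch location stand in for real clones of the repository at each tag. The `switch_production` helper is not part of the __DM__ scripts; it only shows how the _production_ link can be repointed with `ln -sfn`.

```shell
#!/bin/sh
# Sketch: keep each DM release in its own version directory and point the
# "production" symlink at the active one. Version names are hypothetical.

switch_production() {
    # $1 = DM_INSTALL_DIR, $2 = version directory name inside it
    if [ ! -d "$1/$2" ]; then
        echo "no such version directory: $1/$2" >&2
        return 1
    fi
    # -n replaces the existing link itself instead of descending into it
    ln -sfn "$2" "$1/production"
}

# Demonstration in a scratch directory standing in for DM_INSTALL_DIR.
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/v1.0" "$demo_dir/v1.1"   # in practice: clones of dm.git at each tag
switch_production "$demo_dir" v1.0        # initial deployment
switch_production "$demo_dir" v1.1        # upgrade
switch_production "$demo_dir" v1.0        # fallback to the older version
readlink "$demo_dir/production"           # shows which version is active
```

Because the link target is relative, the whole DM\_INSTALL\_DIR tree can be moved without breaking the link.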
#### data-storage node
This node will be responsible for providing the data storage web service, the PostgreSQL database (which stores information on users, experiments, and beamline deployments), and the Payara web server (which provides a portal for management).
To install the data-storage node
#### exp-station node