The Duke Shared Cluster Resource (DSCR) is a shared set of machines, some of which are provided by the University, and some of which are owned by other research groups. These are not true "Beowulf" clusters but are simply sets of Intel x86-based machines running Linux. They have no keyboards and no monitors, so you must log in to the system using ssh. See the Figure below for a better description of the network arrangement.
There are currently two "front-end" machines that users must log in to first. The names of these head nodes are dscr-login-01.oit.duke.edu and dscr-login-02.oit.duke.edu. There is also a dedicated file-transfer node, dscr-xfer-01.oit.duke.edu, that has higher-speed networking to help get data in and out of the cluster faster (but it will only accept sftp; ssh logins will not work on dscr-xfer-01).
Once you are logged in to a front-end, you will be able to log in from there to any node in the cluster. Most of your work will be done on the front-ends: compilation, job submission, debugging. The only time you may need to log in directly to a node is for parallel debugging.
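The login step described above can be sketched as a small shell helper. This is only an illustration: it assumes an OpenSSH client on your own machine, and the account name "netid" and the function name dscr_login are placeholders, not part of the DSCR setup.

```shell
# Placeholder account name (an assumption; use your own DSCR username).
DSCR_USER="netid"
# Front-end hostnames as given above.
DSCR_LOGIN_1="dscr-login-01.oit.duke.edu"
DSCR_LOGIN_2="dscr-login-02.oit.duke.edu"

# Open an interactive session on a front-end (default: the first one).
# Usage: dscr_login      -> dscr-login-01
#        dscr_login 2    -> dscr-login-02
dscr_login() {
    case "${1:-1}" in
        1) host="$DSCR_LOGIN_1" ;;
        2) host="$DSCR_LOGIN_2" ;;
        *) echo "usage: dscr_login [1|2]" >&2; return 2 ;;
    esac
    ssh "${DSCR_USER}@${host}"
}
```

After sourcing this in your shell, running dscr_login (or dscr_login 2) starts an ssh session on the chosen front-end; from there you can reach the other nodes.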
The basic layout, or topology, of the cluster is that of a tree. Groups of 14 to 20 machines are connected, via 1Gbps Ethernet, to a high-speed "edge" switch. Those switches are then connected to a higher-level "core" switch, which carries the network connections between groups of machines. The "edge" to "core" connections are at 1Gbps (for older machine groups) or 10Gbps (for newer groups). Note that this means some parallel jobs may see unexpected delays if they happen to span multiple switches, because all traffic leaving a group shares that group's single edge-to-core connection.
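To put rough numbers on the delay mentioned above, the following sketch computes the worst-case share of the edge-to-core connection per machine. It assumes, purely for illustration, that every machine in a group sends traffic off its switch at the same time; the function name is hypothetical and the results are back-of-the-envelope figures, not measurements.

```shell
# Worst-case share of the edge-to-core uplink per machine, in Mbps,
# if every machine in a group sends off-switch traffic at once.
# Arguments: number of machines in the group, uplink speed in Gbps.
uplink_share_mbps() {
    machines=$1
    uplink_gbps=$2
    # Integer division: the result is rounded down to whole Mbps.
    echo $(( uplink_gbps * 1000 / machines ))
}

uplink_share_mbps 20 1    # older group of 20 machines, 1Gbps uplink
uplink_share_mbps 20 10   # newer group of 20 machines, 10Gbps uplink
```

Under these assumptions a full group of 20 gets about 50 Mbps per machine across a 1Gbps uplink and about 500 Mbps across a 10Gbps uplink, compared with the 1Gbps each machine has to its own edge switch.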
To learn more about gaining access to the DSCR, please see Gaining Access.
If you are a member of a group that already participates in the DSCR, please direct your new account request through your designated Point Of Contact.
The machine labeled "dscr-monitor-02" is for monitoring of the system. You cannot log in to this machine, but you can point a web browser at it and find out the status of machines in the cluster: http://dscr-monitor-02.oit.duke.edu.
The last machine to mention is the file server for the cluster. Again, you cannot log in to this machine directly. To transfer files in and out of the cluster, use sftp and connect through dscr-xfer-01, which accepts only sftp connections.
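A minimal sketch of a transfer session follows, using sftp because the transfer node is described above as accepting only sftp. The account name "netid", the function name dscr_sftp, and the file names in the comments are placeholders chosen for illustration.

```shell
# Placeholder account name (an assumption; use your own DSCR username).
DSCR_USER="netid"
# File-transfer node as named above.
DSCR_XFER="dscr-xfer-01.oit.duke.edu"

# Start an interactive sftp session on the transfer node.
dscr_sftp() {
    sftp "${DSCR_USER}@${DSCR_XFER}"
}

# Typical commands at the sftp> prompt (file names are illustrative):
#   put results.tar.gz    upload a local file to the cluster
#   get output.log        download a file from the cluster
#   bye                   end the session
```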
It is worth noting that there are two separate networks in the Figure, one facing campus and one internal to the cluster. From the "outside" world, only the front-ends are visible, under the names given above (for example, dscr-login-01.oit.duke.edu). On the internal network, dscr-login-01 is named head1. You cannot log in directly to a compute node from the campus network.
- Installed Applications
- Using the Cluster
- SGE Queueing System
- Compilers and Libraries
- MPICH Parallel Library
- Machine Info
- Status Monitoring
- Storage Usage
- CPU Speed Issues
- CSH Scripting Basics
- Other References
- Scripted Remote Access to the cluster
- Searching the DSCR Documentation
For more information, you can email us at hpc-support at duke.edu