Connecting to remote Linux servers#

We have access to various Linux clusters:

  • BioHPC

  • Occasionally, NBER Linux servers

  • Others, as provided by authors

Request an account

  1. Go to the BioHPC account request page, and create an account on the BioHPC cluster.

  2. Then contact BioHPC support, requesting to join the ECCO group and lv39 (Lars’) “lab” (ecco_lv39).

Reserve a node

  • Go to “User”, then the Reservations page, choose “Restricted”, and reserve a node:

    • cbsuecco02: up to 7 days

    • all others: up to 3 days

    • in both cases, renewable

  • Then go to ‘My Reservations’ and share the reservation with Lars (lv39) and others, if necessary.

Note

Skip this step if somebody else added you to their reservation!

Note

If you do not see an open server, ask on the mailing list (ldi-lab-l@cornell.edu) if somebody has an active reservation that they can add you to!

Access a node

See Getting Started Guide and Remote Access. SSH is the best path (if you don’t need graphical applications). See Access via VSCode for a more user-friendly way to use SSH to access the server.

Note that, for off-campus access, you will need to use Cornell VPN. Instructions can be found here.

For VNC:

  • Once you have a reserved node, click “My Reservations” to manage all your active reservations.

  • Choose your reservation.

  • Click “Connect VNC” under “Action” and you will have your machine name and port number. To disconnect, click “Cancel VNC” under “Action”.

  • Open VNC Viewer and enter the session address in the form “machine name:port number”, using the values given by BioHPC.

Tip

For first-time access, set up your bash environment.

Request an account (NBER)

You need an account with NBER to access their servers. Contact Lars to get an account.

Access a node (NBER)

Access for us is primarily via SSH. See Access via VSCode for a more user-friendly way to use SSH to access the server.

Accessing Linux nodes with VSCode#

  • Check that you have installed the Remote-SSH extension on VSCode.

  • Open VSCode and select the Remote-SSH extension from the Command Palette.

  • Enter the host name when prompted. The host name should follow this naming convention:

    • BioHPC: “netid@cbsuecco##.biohpc.cornell.edu”.

    • NBER: “loginid@nber##.nber.org”.

  • You may be prompted to “Select the platform of the remote host”. If so, select the “Linux” option in the drop down menu.

Tip

For this to work on BioHPC, verify that you have a valid reservation and an active VPN!

  • Enter your account password when prompted.

  • Once connected,

    • Open Folder and navigate to your working directory.

    • Open a new terminal using the “Terminal” option in the top menu of VSCode (or Ctrl+`).

  • You should now be able to work on the Linux server via command line.

Some benefits of connecting to BioHPC with VSCode: You can view/edit programs, check log files, and run jobs simultaneously in a given instance of VSCode. Note that you should still use tmux within the VSCode terminal, in case of a disconnect.

In particular, you can navigate to your working directory and git clone the Bitbucket repository (using either the command line or the VSCode prompt to Clone Repository). VSCode recognizes Git, so you can visually navigate through tracked and untracked files via the left-hand side menu.
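As an optional convenience, the host names above can be stored in your SSH configuration file, which the Remote-SSH extension reads when it lists hosts. The following is a minimal sketch, not part of the official instructions: the helper name `ssh_config_entry` is hypothetical, and `lv39` and `cbsuecco02` are placeholders for your own NetID and reserved node.

```shell
# Hypothetical helper: print an SSH config stanza for one BioHPC node.
# Arguments: $1 = your NetID, $2 = node name (e.g. cbsuecco02).
ssh_config_entry() {
    printf 'Host %s\n' "$2"
    printf '    HostName %s.biohpc.cornell.edu\n' "$2"
    printf '    User %s\n' "$1"
}

# Example (run on your laptop): append an entry for your reserved node.
# ssh_config_entry lv39 cbsuecco02 >> "$HOME/.ssh/config"
```

With such an entry in place, `ssh cbsuecco02` (or choosing `cbsuecco02` in Remote-SSH) should be enough. The reservation and VPN requirements still apply.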

Where to run code#

You should use /home2/ecco_lv39/Workspace to clone your Bitbucket repository.

cd /home2/ecco_lv39/Workspace

Consult the Jira issue to figure out where you should run the code.

Additional setup and tips-and-tricks#

First-time setup#

Run this ONCE the first time you ever access Linux servers:

echo "umask 007" >> $HOME/.bash_profile

Then do the usual Bash setup. That should work on nearly any Linux server.
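For reference, umask 007 makes newly created files readable and writable by you and your group, and inaccessible to everyone else, which is useful in a shared group workspace. Because the command above appends a line every time it is run, a guarded variant can be re-run safely. A minimal sketch; the function name `ensure_umask` is hypothetical:

```shell
# Hypothetical helper: add "umask 007" to a profile file only if it is
# not already there. Argument: $1 = profile file (default: ~/.bash_profile).
ensure_umask() {
    profile="${1:-$HOME/.bash_profile}"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF 'umask 007' "$profile" 2>/dev/null || echo 'umask 007' >> "$profile"
}

# Example:
# ensure_umask
```

The new setting takes effect at your next login.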

Utilize tmux#

If the Linux server has tmux installed, use it to make a persistent session that survives disconnects.

  1. Login via SSH

  2. Launch tmux with a session name that makes sense, e.g. tmux new -s AEAREP-xxxx

  3. Launch your MATLAB, Stata, etc. job

  4. Detach from tmux: Ctrl-b d. You do not press both keys at the same time: first press “Ctrl+b”, release, and then press “d”.

  5. Log out of SSH

Next time:

  1. Login via SSH

  2. Reconnect to your tmux session: tmux a -t AEAREP-xxxx

  3. If you forgot the session name, list the active sessions with tmux ls

To save the output of a tmux session to a file, see https://unix.stackexchange.com/questions/26548/write-all-tmux-scrollback-to-a-file.
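The steps above can be condensed into two small helpers. This is a sketch under assumptions: the function names are hypothetical, the session name is a placeholder, and `new-session -A` (attach if the session already exists, otherwise create it) and `capture-pane -S -` (start from the beginning of the scrollback) require a reasonably recent tmux.

```shell
# Attach to the named session if it exists, otherwise create it.
# Argument: $1 = session name.
tmux_session() {
    tmux new-session -A -s "$1"
}

# Write the scrollback of the session's active pane to a file.
# Arguments: $1 = session name, $2 = output file.
tmux_save_log() {
    tmux capture-pane -p -S - -t "$1" > "$2"
}

# Example:
# tmux_session AEAREP-xxxx
# tmux_save_log AEAREP-xxxx "$HOME/AEAREP-xxxx.log"
```

Only as much output as tmux keeps in its history buffer can be saved this way, so for very long jobs writing a log file from the job itself remains the safer option.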

Configuring automatic reservation cancellation (BioHPC only)#

If you use the BioHPC reservation system, it helps others if at the end of a long-running job, your reservation is cancelled as soon as possible. One way to do this is to add the following to the scripts you are running:

// Stata: use the code below at the bottom of the "main" or
// "master" script to automatically cancel the reservation

shell /programs/bin/labutils/endres.pl

# R: add to the end of the main or last script.

system("/programs/bin/labutils/endres.pl")

% MATLAB: use the code below at the end of the main script, or last script,
% to automatically cancel the reservation

system("/programs/bin/labutils/endres.pl")

# Python/Anaconda: use the code below at the bottom of the script
# to automatically cancel the reservation

import os

os.system("/programs/bin/labutils/endres.pl")

# Bash: use the code below at the bottom of the "main" or
# "master" script to automatically cancel the reservation

/programs/bin/labutils/endres.pl
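In a bash wrapper, the same idea can be written so that the reservation is released whether or not the job succeeds, while the job's exit status is preserved. This is a sketch only: `run_then_release` and the `ENDRES` variable are hypothetical names, and the Stata command in the example is a placeholder for your own job.

```shell
# Path of the BioHPC script that ends the reservation (from the text above).
ENDRES="${ENDRES:-/programs/bin/labutils/endres.pl}"

# Hypothetical helper: run a command, then end the reservation,
# and return the exit status of the command itself.
run_then_release() {
    "$@"
    status=$?
    "$ENDRES"
    return "$status"
}

# Example:
# run_then_release stata-mp -b do main.do
```

If you would rather keep the node for debugging when a job fails, call the end-of-reservation script only when the status is zero.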

Unzipping large ZIP files fails#

In some cases, unzipping large ZIP files (larger than 2GB) may fail:

error: invalid zip file with overlapped components (possible zip bomb)

If this is detected, you need to use a 64-bit version of a decompression program. This may vary by Linux host.

Instead of unzip, use 7z as follows:

ICPSRNUM=123456
/programs/bin/util/7z x -O${ICPSRNUM} ${ICPSRNUM}.zip

(which is the equivalent of the unzip command unzip -n ${ICPSRNUM} -d ${ICPSRNUM}). The first option (-O) is the upper-case letter O, not zero.

On other servers, if 7z is available, use it there as well (the path may differ). Other options include using jar. Contact the administrator of the system for guidance.
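Since the location of 7z varies by host, a small lookup can make scripts portable across servers. A minimal sketch: `find_7z` is a hypothetical helper, `/programs/bin/util/7z` is the BioHPC path given above, and `7za` is an alternative binary name that some installations provide (an assumption, not something this guide confirms for any particular server).

```shell
# Hypothetical helper: print the path of the first usable 7z binary,
# or return non-zero if none is found.
find_7z() {
    for candidate in /programs/bin/util/7z \
                     "$(command -v 7z 2>/dev/null)" \
                     "$(command -v 7za 2>/dev/null)"; do
        if [ -n "$candidate" ] && [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Example:
# ICPSRNUM=123456
# SEVENZ=$(find_7z) && "$SEVENZ" x -O${ICPSRNUM} ${ICPSRNUM}.zip
```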

Setting up SSH Agent for password-less login#

The SSH protocol allows you to create a public-private key pair. You deposit the public key on the Linux server, and keep the (passphrase-protected) private key on your laptop. You can then configure an “ssh-agent” running on your laptop to hold (temporarily) your unlocked key in memory and use it every time you log in to the Linux server, so that you do not have to retype the passphrase. This works very easily on Unix-like systems (macOS, Linux laptops), but is a bit trickier on Windows laptops.

To be completed.

To be completed.

First, you need to ensure that the SSH subsystem is installed.

To be completed.

Finally, you will want to start the SSH-agent. The following website explains how to do so. You will likely need admin privileges on your laptop.

The next steps are common to all operating systems.

You will want to create a key, then transfer it to the Linux server. The following commands should do this (should work on both Powershell and Bash):

ssh-keygen -t ed25519 -C "For BioHPC"

ed25519 is the key type; “For BioHPC” is an arbitrary comment for your own tracking of keys. When prompted, you should add a passphrase.

This should have created two files in your .ssh directory:

ls $HOME/.ssh

Add the new key to your SSH-agent (optional, but recommended):

# ssh-add takes the private key (the file without the .pub extension)
ssh-add $HOME/.ssh/id_ed25519
# you should be prompted for your passphrase.
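On Unix-like laptops (macOS, Linux), an agent is often already running in your login session. If it is not, it can usually be started from the shell. A minimal sketch, assuming a standard OpenSSH installation; the function names `agent_running` and `load_key` are hypothetical:

```shell
# Succeeds if the current shell knows about a running agent,
# i.e. SSH_AUTH_SOCK is set and points to an existing socket.
agent_running() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

# Start an agent if needed, then add the private key to it.
# Argument: $1 = private key file (default: ~/.ssh/id_ed25519).
load_key() {
    key="${1:-$HOME/.ssh/id_ed25519}"
    agent_running || eval "$(ssh-agent -s)"
    ssh-add "$key"
}

# Example:
# load_key
# ssh-add -l    # lists the keys currently held by the agent
```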

You will want to transfer the .pub file to the Linux server. You can do this with the command ssh-copy-id (if it exists), or manually. We show you how to do this manually:

Note

You should open two terminals: one locally on your laptop, one remotely on the Linux server!

This creates a directory used by SSH:

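# this is on the remote server
# -p avoids an error if the directory already exists
mkdir -p $HOME/.ssh
# SSH requires that only you can write to this directory
chmod 700 $HOME/.ssh
# stay logged in!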

This transfers the public key to the remote Linux server:

# this is on your laptop
netid=lv39  # adjust to be your own netid!
scp $HOME/.ssh/*.pub ${netid}@cbsulogin.biohpc.cornell.edu:.ssh/

You should now see the .pub key in your .ssh directory on the remote Linux server:

# this is on the remote server
# this should show the *.pub file
ls $HOME/.ssh
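# this authorizes you to use the SSH key to log in:
cat $HOME/.ssh/*.pub >> $HOME/.ssh/authorized_keys
# the SSH server may ignore the file if group or others can write to it:
chmod 600 $HOME/.ssh/authorized_keys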
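The remote-side steps can also be wrapped in a helper that is safe to run more than once. This matters on servers where umask 007 is set, because files are then created group-writable by default and the SSH server may refuse to use them. A sketch under assumptions: `authorize_key` is a hypothetical name, and the key file name in the example is a placeholder.

```shell
# Hypothetical helper: authorize a public key for login, once.
# Arguments: $1 = public key file, $2 = SSH directory (default: ~/.ssh).
authorize_key() {
    pubkey_file="$1"
    ssh_dir="${2:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
    # Append the key only if this exact line is not already present.
    grep -qxF "$(cat "$pubkey_file")" "$ssh_dir/authorized_keys" \
        || cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
}

# Example (on the remote server, after the scp step):
# authorize_key "$HOME/.ssh/id_ed25519.pub"
```

Where ssh-copy-id is available on your laptop, `ssh-copy-id -i $HOME/.ssh/id_ed25519.pub ${netid}@cbsulogin.biohpc.cornell.edu` performs the transfer and authorization in one step.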

Now test it:

# This is run from your laptop
ssh ${netid}@cbsulogin.biohpc.cornell.edu

You should now be prompted for your SSH passphrase:

$ ssh netid@cbsulogin.biohpc.cornell.edu
Enter passphrase for key `C:\Users\netid\.ssh\id_ed25519`:

Tip

Additional tips-and-tricks can be found on the LDIlab wiki. These are focused on the BioHPC cluster, but may work on other servers as well.