Running on TACC

To run a set of simulations on Stampede you need to load the following modules:

$ module load git
$ module load python
$ module load intel
$ module load mvapich
$ module load mkl

To make these your default modules, either add these commands to your ~/.bashrc or use module save.
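
A quick way to confirm that the modules are loaded before submitting a job is sketched below. It assumes the module system exports the loaded modules as a colon-separated list in the LOADEDMODULES environment variable; the helper name missing_modules and the prefix matching rule are illustrative choices, not part of PolyADCIRC.

```python
import os

# Module names taken from the list above; a loaded module usually appears
# as "name/version" in LOADEDMODULES, so only the name part is compared.
REQUIRED_MODULES = ("git", "python", "intel", "mvapich", "mkl")


def missing_modules(environ=None, required=REQUIRED_MODULES):
    """Return the required module names that do not appear to be loaded."""
    environ = os.environ if environ is None else environ
    loaded = environ.get("LOADEDMODULES", "")
    # Keep only the part before the version, e.g. "intel/13.0" -> "intel".
    loaded_names = [entry.split("/")[0] for entry in loaded.split(":") if entry]
    # Prefix match so that e.g. "mvapich2" satisfies "mvapich" (an assumption).
    return [name for name in required
            if not any(found.startswith(name) for found in loaded_names)]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Modules not loaded: " + ", ".join(missing))
    else:
        print("All required modules appear to be loaded.")
```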

See also

Stampede User Guide Modules

File System Setup

The major file systems available on Stampede are $HOME, $WORK, $SCRATCH, /tmp, and $ARCHIVE.
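
To see where these file systems live for your account, the following minimal sketch reads the corresponding environment variables. It assumes each one is exported under the name given above (HOME, WORK, SCRATCH, ARCHIVE); the function name file_system_paths is hypothetical.

```python
import os

# Environment variables named in the text above; /tmp is a fixed path.
FILE_SYSTEM_VARS = ("HOME", "WORK", "SCRATCH", "ARCHIVE")


def file_system_paths(environ=None):
    """Map each file system variable to its path, or None if it is unset."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) for name in FILE_SYSTEM_VARS}


if __name__ == "__main__":
    for name, path in file_system_paths().items():
        print("${0:<8} {1}".format(name, path if path else "(not set)"))
```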

Installation

You can install PolyADCIRC in the $HOME directory as described in the Overview. If you want to use the cutting edge version from the git repo, I would suggest putting the git repo containing the Polysim directory in $HOME. Files are not backed up in the $WORK or $SCRATCH directories, so if you need to put the landuse.git repo in either of these locations use the --separate-git-dir=$HOME/someplace option.

To clone the git repo containing the PolyADCIRC directory:

$ git clone username@ices-workstation:/org/groups/chg/lgraham/PolyADCIRC.git

or:

$ git clone git@github.com:lcgraham/PolyADCIRC.git

Since this code is currently in development it is not in a public repository. If you would like a copy of the code let me know.

Input/Output Directory Structure

This code currently assumes use of v50 of ADCIRC. The top directory containing the work, src, etc. folders for ADCIRC should be kept in the $WORK directory due to storage quota constraints.

Within the work/ folder of your ADCIRC directory you have 3 options with regard to file structure:

  1. (RECOMMENDED) Copy /h1/lgraham/group_mts_012914/Inlet_test to a convenient location.

  2. (NOT RECOMMENDED) Clone the git repo ADCIRC_landuse here and add/alter data/input files

    $ git clone --separate-git-dir=$HOME/ADCIRC_landuse ices-workstation:/org/groups/chg/lgraham/ADCIRC_landuse.git
    $ cd ADCIRC_landuse
    $ git checkout --track origin/stampede
    
  3. Create your own directory and add any missing data/input files. The recommended directory structure is as follows

    adcirc_dir/
        src/
        swan/
        util/
        ...
        work/
            ADCIRC_landuse/ (THIS YOU CLONE OR CREATE)
                grid_name1/
                ...
                grid_namen/
                    inputs/
                        grid_dir1/
                        grid_dir2/
                        ...
                    runs/
                        fort.13
                        README
                        save_dir1/
                        ...
                        save_dirn/
                    landuse_basis/
                        basis_dir1/
                        ...
                        basis_dirn
    

    The ADCIRC_landuse/ directory MUST be in the work/ directory (the base_dir used by runSet) of your ADCIRC build. The ADCIRC_landuse/ directory can be renamed, but it MUST contain any grid_dir, save_dir, or basis_dir used by runSet. There must be a fort.13 file specific to grid_dir stored in the directory containing the save_dir.
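
The layout rules above can be checked before a job is submitted. The following is a minimal sketch, not part of PolyADCIRC: the function name check_layout is hypothetical, and it treats a save_dir that does not yet exist as a problem, which may be stricter than runSet actually requires.

```python
import os


def check_layout(base_dir, grid_dir, save_dir, basis_dir):
    """Return a list of problems found with the directory layout (empty if none)."""
    problems = []
    base = os.path.abspath(base_dir)
    for label, path in (("grid_dir", grid_dir), ("save_dir", save_dir),
                        ("basis_dir", basis_dir)):
        full = os.path.abspath(path)
        # Each directory should sit somewhere below base_dir (the work/ folder).
        if not full.startswith(base + os.sep):
            problems.append("{0} is not inside base_dir: {1}".format(label, full))
        elif not os.path.isdir(full):
            problems.append("{0} does not exist: {1}".format(label, full))
    # A fort.13 file is expected in the directory that contains save_dir.
    fort13 = os.path.join(os.path.dirname(os.path.abspath(save_dir)), "fort.13")
    if not os.path.isfile(fort13):
        problems.append("missing fort.13: {0}".format(fort13))
    return problems
```

Calling check_layout with the same four paths you plan to pass to runSet and printing the returned list gives a quick report; an empty list means none of the checked rules were violated.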

See also

Stampede User Guide File Systems

runSet class documentation

Git Documentation Remote Branches

Python Scripts

The following submission and Python scripts should be located in the directory PolyADCIRC/examples/.

Currently my workflow has been something like...

On Stampede:

$ cd $WORK/landuse_bin
$ qsub submission_script.sub
$ qstat

Once the job has finished running, check the $JOB_NAME.$JOB_ID file for errors.
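
Scanning the job output by hand gets tedious for long runs. The sketch below lists lines that look like errors; the patterns it searches for are assumptions about what a failed run prints and should be adjusted to the messages your jobs actually produce.

```python
import re

# Patterns that commonly indicate a failed run; this list is an assumption.
ERROR_PATTERN = re.compile(r"error|traceback|segmentation fault", re.IGNORECASE)


def find_error_lines(path):
    """Return (line_number, text) pairs for lines that look like errors."""
    hits = []
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            if ERROR_PATTERN.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits
```

Pass the path of the $JOB_NAME.$JOB_ID file to find_error_lines; an empty list means none of the patterns matched, which does not by itself prove that the run succeeded.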

$ scp $WORK/v50_ADCIRC/work/ADCIRC_landuse/Inlet/runs/my_run/py_save_file.mat \
    ices-workstation.ices.utexas.edu:~/workspace/ADCIRC_landuse/Inlet/runs/my_run.

On my ICES workstation, in bin/, start an interactive Python session using python or ipython (the run and whos commands below are IPython commands):

>>> run load_test.py
>>> whos

Now the data collected from the PADCIRC runs are accessible in Python for plotting and analysis. The py_save_file.mat file is also readable by MATLAB and Octave.

run_stampede_test

This is the script I’ve been using for my current setup. This is the script that should be modified for future runs. There are other scripts in the examples/ directory which may be helpful, although minor changes may be required.

Allow running from the command line using ./run_stampede_test.py:

#! /usr/bin/env python

Import necessary modules:

import polyadcirc.run_framework.domain as dom
import polyadcirc.run_framework.random_manningsn as rmn
import numpy as np
import os, glob

Store string references to important directories:

adcirc_dir = '/work/01837/lcgraham/v50release_130626/work'
grid_dir = adcirc_dir + '/ADCIRC_landuse/Inlet/inputs/tides'
save_dir = adcirc_dir + '/ADCIRC_landuse/Inlet/runs/vel_test'
basis_dir = adcirc_dir + '/ADCIRC_landuse/Inlet/landuse_basis/gap/bands'
# assume that in.prep* files are one directory up from basis_dir

grid_dir
    directory where the fort.15, fort.14, and fort.22 files are stored
save_dir
    directory where the RF_directory_*/ are created and job specific data is saved
basis_dir
    directory containing the landuse_## folders which each contain a fort.13 file specific to the landuse classification
adcirc_dir
    directory containing compiled ADCIRC executables

Set run specific names of script and save_file:

script = "runRUNrun.sh"
save_file = 'py_save_file'

Setting different script names allows for simultaneous runs of PolyADCIRC with differing grid_dir, save_dir, and basis_dir.

Designate which ADCIRC specific output files to collect data from:

timeseries_files = ["fort.61", "fort.63", "fort.62", "fort.64"]
nontimeseries_files = ["tinun.63", "maxvel.63"]

Set nprocs to be the number of processors per PADCIRC run. Set ppnode to be TpN (tasks per node), i.e. the number of processors per node. On Stampede, 12 is the number of processors per node. Set NoN to be the number of nodes requested by submission_script.sub. See the -pe line in submission_script.sub, which has the form <TpN>way<NoN x 12>:

nprocs = 2
ppnode = 12
NoN = 2
num_of_parallel_runs = (ppnode*NoN)//nprocs # procs_pnode * NoN / nproc, kept as an integer
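
The arithmetic behind num_of_parallel_runs is the total number of requested cores divided by the cores used per PADCIRC run. The sketch below wraps it in a hypothetical helper, parallel_run_count, which uses integer division and rejects values that do not divide evenly. Rejecting uneven splits is a design choice made here to catch likely mistakes, not something runSet is known to require.

```python
def parallel_run_count(nprocs, ppnode, non):
    """Number of PADCIRC runs that fit on the requested cores at one time."""
    if nprocs < 1 or ppnode < 1 or non < 1:
        raise ValueError("nprocs, ppnode, and non must all be positive")
    total_cores = ppnode * non
    if total_cores % nprocs != 0:
        # Leftover cores would sit idle; flagged here as a likely mistake.
        raise ValueError("{0} cores cannot be split evenly into runs of {1}"
                         .format(total_cores, nprocs))
    return total_cores // nprocs


print(parallel_run_count(nprocs=2, ppnode=12, non=2))  # prints 12
```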

Store directory references and set up random field directories:

main_run = rmn.runSet(grid_dir, save_dir, basis_dir, num_of_parallel_runs,
        base_dir = adcirc_dir, script_name = script)
main_run.initialize_random_field_directories(num_procs = nprocs)

Store fort.14 and fort.15 data in domain:

domain = dom.domain(grid_dir)
domain.update()

Set samples:

lam_domain = np.array([[.02, .2], [.02, .2], [.02, .2]])
lam1 = np.linspace(lam_domain[0, 0], lam_domain[0, 1], 3)
lam2 = np.linspace(lam_domain[1, 0], lam_domain[1, 1], 3)
lam3 = np.linspace(lam_domain[2, 0], lam_domain[2, 1], 3)
lam4 = 0.02
lam1, lam2, lam3, lam4 = np.meshgrid(lam1, lam2, lam3, lam4)
lam_samples = np.column_stack((lam1.ravel(), lam2.ravel(), lam3.ravel(),
        lam4.ravel()))

mann_pts = lam_samples.transpose()

Note

mann_pts must have the shape (number_of_landclasses, number_of_samples); in this case that is (4, 27).
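
To see where (4, 27) comes from, the sample grid can be rebuilt with the standard library alone: three values for each of the first three land classes and one fixed value for the fourth give 3 x 3 x 3 x 1 = 27 samples. This is only a cross-check of the shape. The small linspace helper is a stand-in for np.linspace, and the ordering of the samples may differ from what np.meshgrid produces.

```python
import itertools


def linspace(start, stop, count):
    """Evenly spaced values from start to stop inclusive (count >= 2)."""
    step = (stop - start) / float(count - 1)
    return [start + i * step for i in range(count)]


# Same ranges as lam_domain above: three values for each of the first three
# land classes, and a single fixed value for the fourth.
axes = [linspace(0.02, 0.2, 3), linspace(0.02, 0.2, 3),
        linspace(0.02, 0.2, 3), [0.02]]

samples = list(itertools.product(*axes))   # one tuple per sample
mann_rows = list(zip(*samples))            # one row per land class

print((len(mann_rows), len(mann_rows[0])))  # prints (4, 27)
```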

Run samples:

main_run.run_points(domain, mann_pts, save_file, num_procs = nprocs,
    procs_pnode = ppnode, ts_names = timeseries_files,
    nts_names = nontimeseries_files, screenout=True)

Job Submission Script

An example submission script is included in examples/submission_script.sub. To run on Stampede you will need to rewrite the qsub script as an sbatch submission script and modify the requested nodes, as Stampede has a different number of cores per node than the system the qsub script was written for. These types of Python scripts create a lot of hostfiles in your $HOME/.sge ($HOME/.slurm) directory, so you should schedule a cron job that periodically wipes old files. You should copy any scripts you wish to modify and run into a separate folder in your $WORK directory. In these examples I am working from $WORK/landuse_bin. You will need to modify the lines that designate the adcirc_dir, grid_dir, save_dir, and basis_dir to match your directory structure.
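
One way to wipe old hostfiles is a small Python script run from cron. The following is a minimal sketch under stated assumptions: the function name remove_old_files and the seven day threshold are illustrative, and the script only looks at regular files directly inside the given directory. It defaults to a dry run so that nothing is deleted until you pass dry_run=False.

```python
import os
import time


def remove_old_files(directory, max_age_days=7, dry_run=True):
    """Delete regular files in directory older than max_age_days.

    Returns the list of paths that were (or, in a dry run, would be) removed.
    """
    cutoff = time.time() - max_age_days * 24 * 60 * 60
    removed = []
    if not os.path.isdir(directory):
        return removed
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip subdirectories and anything modified after the cutoff.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            if not dry_run:
                os.remove(path)
            removed.append(path)
    return removed


if __name__ == "__main__":
    # Dry run by default: only reports what would be deleted.
    for path in remove_old_files(os.path.expanduser("~/.slurm")):
        print("would remove " + path)
```

Check the dry run output first, then switch to dry_run=False once the list contains only files you are happy to lose.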

To run run_stampede_test you need to modify submission_script.sub so that the line #$ -M youremail@someplace.com has your e-mail. Then you can submit it to the queue using:

$ qsub submission_script.sub

To check on your job you can use the command:

$ qstat

Currently the output is saved to a python formatted binary file called py_save_file.mat in save_dir.

load_test

To run this code on my workstation I generally have the sl6 and python (or epd) modules loaded (module load module_name). I would also recommend installing the latest versions of numpy, scipy, and matplotlib.

Import necessary modules:

import polyadcirc.pyADCIRC.plotADCIRC as pa
import polyadcirc.run_framework.random_manningsn as rmn

Set up local directory and file references:

save_file = 'py_save_file.mat'
base_dir = '/h1/lgraham/workspace'

grid_dir = base_dir+'/ADCIRC_landuse/Inlet/inputs/tides'
save_dir = base_dir+'/ADCIRC_landuse/Inlet/runs/vel_test'
basis_dir = base_dir+'/ADCIRC_landuse/Inlet/landuse_basis/gap/bands'

Load the run set up and data:

main_run, domain, mann_pts = rmn.loadmat(save_file, base_dir, grid_dir,
        save_dir, basis_dir)

Now the data is available for the plotting methods in plotADCIRC.