Category Archives: Climate Modeling

FOAM Output Variables

Since I get many questions about what's contained in FOAM output, here's a list of all the variables in the standard atmosphere, coupler, and ocean output files.

Atmosphere Output
ALB Albedo
ALBCLR Clear sky albedo
CLDHGH Vertically-integrated, random overlap, high cloud amount
CLDLOW Vertically-integrated, random overlap, low cloud amount
CLDMED Vertically-integrated, random overlap, mid-level cloud amount
CLDTOT Vertically-integrated, random overlap, total cloud amount
CLOUD Cloud fraction
CMFDQ Q tendency – moist convection
CMFDT T tendency – moist convection
CMFMC Moist convection mass flux
CNVCLD Random overlap total convective cloud amount
DC01 convective adjustment tendency
DTCOND T tendency – convective adjustment
DTH T horizontal diffusive heating
DTV T vertical diffusion
FLNS Net longwave flux at surface
FLNSC Clearsky net longwave flux at surface
FLNT Net longwave flux at top
FLNTC Clearsky net longwave flux at top
FLUT Top of Atmosphere Outgoing Longwave Flux
FLUTC Clearsky Top of Atmosphere Outgoing Longwave Flux
FSDS Flux Shortwave Downwelling at Surface
FSDSC Clearsky Flux Shortwave Downwelling at Surface
FSNS Net solar flux at surface
FSNSC Clearsky net solar flux at surface
FSNT Net solar flux at top
FSNTC Clearsky net solar flux at top
FSNTOA Net Solar Flux at the Top of the Atmosphere
FSNTOAC Clearsky Net Solar Flux at the Top of the Atmosphere
LHFLX Surface latent heat flux
LWCF Longwave Cloud Forcing
OMEGA Vertical pressure velocity
ORO ocean(0), land(1), sea ice(2)
PBLH Planetary Boundary Layer Height
PHIS surface geopotential
PRECC Convective precipitation rate
PRECL Large-scale (stable) precipitation rate
PRECSC Convective snow rate (water equivalent)
PRECSL Large-scale (stable) snow rate (water equivalent)
PS surface pressure
PSL Sea level pressure
Q specific humidity
QFLX Surface water flux
QPERT Perturbation specific humidity (eddies in PBL)
QRL Longwave heating rate
QRS Solar heating rate
RELHUM Relative humidity
SHFLX Surface sensible heat flux
SNOWH Water equivalent snow depth
SOLIN Solar insolation
SRFRAD Net radiative flux at surface
SWCF Shortwave Cloud Forcing
T temperature
TAUGWX East-west gravity wave drag surface stress
TAUGWY North-south gravity wave drag surface stress
TAUX X-component (east-west) of surface stress
TAUY Y-component (north-south) of surface stress
TMQ Water Vapor
TPERT Perturbation temperature (eddies in PBL)
TS1 Surface temperature (level 1)
TS2 Subsurface temperature (level 2)
TS3 Subsurface temperature (level 3)
TS4 Subsurface temperature (level 4)
U zonal wind component
USTAR Surface friction velocity
UTGW U tendency – gravity wave drag
V meridional wind component
VD01 vertical diffusion tendency of water vapor
VQ Meridional water transport
VT Meridional heat transport
VTGW V tendency – gravity wave drag
VVPUU Kinetic Energy
VZ Meridional transport
WET Soil moisture
Z3 Geopotential Height

Coupler

EVP moisture flux
FRAC ice fraction
ICET1 ice layer 1 temp
ICET2 ice layer 2 temp
INTERT interface temp
LHF latent heat flux
LWV longwave out
MELTP melt potential
OHEAT ocean heat forcing field
OPREC precip from atm
OQFLX ocean freshwater forcing field
ORNF runoff into the ocean
ORO land mask
RAD surface radiation
RNF land runoff
SHF sensible heat flux
SNDPTH snow depth
SNM snow melt
SNOWT snow temp
TAUX ocean taux forcing field
TAUY ocean tauy forcing field
THCK seaice thickness flag
TSSUB1 top soil layer temp
TSSUB2 soil temp layer 2
TSSUB3 soil temp layer 3
TSSUB4 bottom soil temp layer
VOLR river volume
WS soil moisture

Ocean

CONVEC Upper Ocean Convective adjustment frequency
CONVEC2 Deep Ocean Convective adjustment frequency
Currents Currents
HEATF Ocean heat forcing
P Normalized perturbation pressure
S Salinity
SALTF Sfc salinity tendency due to freshwater forcing
Sconv Surface layer S convective adjustment
Sconvm1 Near surface layer S convective adjustment
Szz Surface layer S vertical mixing
Szzm1 Near surface layer S vertical mixing
T Temperature
TAUX X-component (east-west) of surface stress
TAUY Y-component (north-south) of surface stress
Tconv Surface layer T convective adjustment
Tconvm1 Near surface layer T convective adjustment
Tzz Surface layer T vertical mixing
Tzzm1 Near surface layer T vertical mixing
U Zonal current component
V Meridional current component
W Vertical velocity

Using SQLite and Python to Store Model Metadata

As I continue to run a range of climate models, I've learned through painful lessons that I need to record as much information about each model run as possible. When I first started this process, I simply kept the files used to make the run (the geography and configuration files for the model) and the model output. At first, this seemed sufficient because, in the end, these were the most important data. As it turns out, however, having a history of everything you did during the model run, such as adjustments to the settings or geography, is also important, both as a record of the run and for sorting out problems later.

My initial solution to this problem was to create a log file. Every time I ran the model, the important settings were written to a simple flat-file log. This log turned out to be very useful for debugging a model-run issue because it kept a record of how the model was initially run. I also started keeping hardware information in this log: along with the model information, I began storing hardware temperature data from before and after the run, just in case I needed to debug hardware issues. However, these data turned out to be virtually useless in a flat log file. Other information I wanted in the log but hadn't been keeping was geography version control information. I use version control to track all my geography work, so I can both see how I change the geography and get an idea of how much time I spend on it, and the exact geography used in a run is important to know. Cramming even more information into a flat log file, though, only makes it harder to review.

My new solution is to dump the flat-file approach and go with SQLite. SQLite is a lightweight, public-domain SQL database engine with a single-file format that works well with a variety of languages. SQLite has become one of my preferred file formats over the years (nudging out XML for custom data structures). The Python scripting language is a natural fit for working with SQLite as well.

So, how does this solution work? First, I have a simple bash run script for the model (for some reason, I could never get the model to run using Python). This script calls my Python script before the model starts and after the model ends. It passes two pieces of information, a uuid and the model directory path; the Python script assembles everything else it needs on its own.
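For reference, here's a minimal sketch of that kind of wrapper; the script names and arguments are placeholders rather than my exact setup:

#!/bin/bash
# Hypothetical wrapper: log_run.py and run_model.sh stand in for the real scripts.

RUN_UUID=$(uuidgen)     # unique id for this run
MODEL_DIR=$(pwd)        # model directory path handed to the logger

python log_run.py "$RUN_UUID" "$MODEL_DIR"    # log settings, temps, svn info before the run
./run_model.sh                                # launch the model
python log_run.py "$RUN_UUID" "$MODEL_DIR"    # log end time and temps after the run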

Why a uuid? Each time I run the model, I need to identify the run in the database with a unique id that can be used to link across a number of SQL tables. A uuid ensures that the id is unique. I've considered using a uuid for the overall simulation but I haven't implemented that.

To pull in settings data and temperature data, I've written parsers for each format. For the model I've been running, FOAM, I have parsers that read the atmos_params and run_params files, in addition to parsing the output of the temperature monitoring software and the Subversion “svn info” command. The script then inserts these data into their own tables keyed by the uuid. While most of these tables have fields for each value I pull out of the files, the temperature data are stored in a key->value style table, since the number of temperature sensors depends on the hardware and thus may change from machine to machine (the temperature monitoring is also Mac only).

Here is the schema for the main table, “runs”:

CREATE TABLE runs (
  uuid text,          -- unique id linking this run across all tables
  starttime text,
  endtime text,
  runduration text,   -- wall-clock run time in seconds
  std_out blob,
  completed text,
  comments text,
  runcmd text,
  yearsPerDay text    -- model years per wall-clock day
);

Some of these fields are not yet used: std_out and runcmd are not yet implemented in the script. Right now, I fill in the comments field manually. My currently running simulation looks like this at the moment:

uuid = **deleted
starttime = 2010-01-16 23:33:45
endtime = 2010-01-17 10:29:57
runduration = 39372.0
std_out =
completed = NO
comments = manual shutdown because of memory problem
runcmd =
yearsPerDay = 14.2
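One nice side effect is that checking on a run from the command line is trivial with the sqlite3 shell. A quick sketch, assuming the database file is called runs.db (the actual filename and uuid are placeholders):

# runs.db and the uuid below are placeholders
sqlite3 runs.db "SELECT starttime, endtime, completed, yearsPerDay FROM runs WHERE uuid = 'the-run-uuid';"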

For the geography source location, here are the results for the same run:

uuid = **deleted
url = file:///Volumes/**deleted
type = svn
date = 2009-08-28 09:42:01 -0500 (Fri, 28 Aug 2009)
author = tlmoore
rev = 191
branch =

The branch field is empty in anticipation of moving from Subversion to Git.

For temperatures, I can now look at before and after values for specific sensors for a run:

uuid = *deleted
sensor = SMC CPU A HEAT SINK
temperature = 64.4

uuid = *deleted
sensor = SMC CPU A HEAT SINK
temperature = 98.6

One thing I'd change here is specifying pre- versus post-run measurements.
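That would be a small schema change; a sketch, assuming the key->value table is named temperatures (my actual table name may differ):

# 'temperatures' is a hypothetical name for the key->value table
# Add a column to tag each reading as 'pre' or 'post'
sqlite3 runs.db "ALTER TABLE temperatures ADD COLUMN phase text;"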

So far, I'm happy with most of this new solution. It just needs refinement.

Climate Model on a Mac: Snow Leopard

Snow Leopard, the latest OS offering from Apple, promised to be both 64-bit and faster. The question is whether Apple delivered on those promises and whether the improvements make a difference for modeling.

First, I got Snow Leopard booting to the 64-bit kernel. There are instructions on how to do this out on the web (note that you don't have to bother if your machine is 64-bit and you're running the server version).
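For reference, the approach I'm aware of is to add the 64-bit flag to the boot plist (holding the 6 and 4 keys at startup does the same for a single boot); double-check against current instructions before changing your own boot settings:

# Tell Snow Leopard to load the 64-bit kernel on every boot
sudo defaults write /Library/Preferences/SystemConfiguration/com.apple.Boot "Kernel Flags" "arch=x86_64"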

More info below the fold…

For all of the compiling below, the target was 64-bit.

Next, I upgraded my compiler. I was using PGI 7, now I'm using PGI 9. Changing the compiler required updating all the supporting code.

NetCDF was next on the list. Here, NetCDF was somehow incompatible with PGI 9 or Snow Leopard in one respect: fstat, the system call used to get information about files, was always returning a file size of 0, causing NetCDF to conclude the file was not a valid netCDF file (note: this is netCDF 4 with no HDF support). I forced the compiler to use fstat64 instead, and all was well…
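For anyone hitting the same wall, one blunt way to force that kind of substitution is at the preprocessor level when building NetCDF. This is only a sketch of the idea, not my exact flags; the install prefix is a placeholder, and it assumes the 64-bit stat structures are compatible with how NetCDF uses them:

# Remap the 32-bit file-status calls to their 64-bit variants, then build as usual
export CPPFLAGS="$CPPFLAGS -Dstat=stat64 -Dfstat=fstat64"
./configure --prefix=/opt/netcdf/pgi/9.0   # placeholder install prefix
make && make install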

Next came MPICH. I decided to drop MPICH 1 in favor of trying MPICH2. This worked fairly well with no major complications.

Finally, I recompiled the climate model, which also appeared to have little trouble.

So, was it faster??

Here are the preliminary results:

Before Snow Leopard:
32-bit compile, MPICH 1, PGI 7
12-13 model years per wall-clock day

After Snow Leopard:
64-bit compile, MPICH2, PGI 9
18-19 model years per wall-clock day

This is a significant speed increase (somewhere around 50%), if it holds. Running constantly, that's a potential gain of over 2500 extra model years in a calendar year!

It's unclear, however, whether the improvement is due to the 64-bit compile, Snow Leopard, PGI, or MPICH. Regardless of what's causing it, I'll certainly take it!

Adding Climate Model Content To Site

As part of the ongoing efforts to make simulations run by PaleoTerra more useful, we are updating the site to list all climate models run by PaleoTerra and to start making the climate model reports, images, and animations available online to clients.

Updating the full listing of models is expected to take a week or two, so consider all the information contained in the listings to be temporary. Adding reports and imagery of the simulations is expected to take much longer, however.

Models with restricted access will require passwords if/when they are made available online. Clients may request passwords via the Contact page or by emailing Thomas Moore directly. Note, however, that the passwords for model content are NOT the same as those used for the site's commenting system. You are not required to have a password to visit this site (only to comment on blog posts, for example).

If you have any questions, please use the contact page.

Climate Model on a Mac #15: Watch those dynamic libraries!

I recently upgraded my PGI compiler to version 8, and I had tons of trouble getting the climate model to compile and run. At the same time, I decided to switch from MPICH 1.2.7 to OpenMPI, in the hope that it would be better for the system and easier to set up.

However, nothing linked properly; when the software did compile, it produced tons of MPI-related errors. As a rule, I install all the libraries needed to run the model under the /opt directory, since that makes it easier to keep more than one version of the various libraries around (different versions, different compilers, 32- vs 64-bit, etc.), so I didn't think there could be a conflict between libraries.

As it turned out, however, one of the MPI installations was still ending up in the /usr directory. On a Mac, this can be a big problem because Mac OS X looks for dynamic libraries in the usual places before loading anything in /opt. Hence, I was compiling against OpenMPI, but the OS was loading either MPICH, an OpenMPI build without Fortran support, or a 32-bit build instead of a 64-bit one. The results are not pleasant.
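Two stock tools make this kind of mixup easy to spot (the executable name below is just a placeholder):

# Show which dynamic libraries the binary was linked against
otool -L ./model_executable

# Show every library the dynamic loader actually pulls in at launch
DYLD_PRINT_LIBRARIES=1 ./model_executable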

One upshot of sorting out this problem is that I might be able to get a GCC/g95 setup to run the model instead of the Portland compilers. I'm happy with the Portland compilers, but they're expensive to keep current. So, it's time to give it another go!

Climate model on a mac project: #14 Knowing when you quit…

When not using a scheduler like Torque/PBS, it can be complicated to find out whether the model has quit. If the run goes well, you have a reasonable idea of when it SHOULD quit, but it might crash long before that time, and as a result you can lose hours, if not days, of computing time.

One solution to this problem is a handy application that comes with Mac OS X: Automator. Automator is a simple way of getting the computer to do repetitive tasks. In this case, the tasks are 1) run the model, 2) open Apple Mail, and 3) send an email. Automator makes these steps easy. The prerequisite, however, is that you already have a mail account set up in Apple Mail.

For step 1, you need the “Run Shell Script” action. Simply write a shell script that cds to the working directory and executes the run script.
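The script itself can be a one-liner; the path and script name below are placeholders for your own:

# Change to the run directory and launch the model; Automator waits for it to finish
cd /path/to/model/run && ./run_model.sh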

For step 2, you need the “New Mail Message” action. Set the “To:” address, the subject, and message body to something meaningful. Also be sure you've selected the proper sending account at the bottom.

For step 3, all you need to do is actually have Mail send the message. To do this, you simply use the “Send Outgoing Messages” action.

To get this all to run, press play.

That's it!

Climate model on a mac project: #13 Post production!

I've managed to complete a 100-year simulation for the Carboniferous. Over this span, the computer's performance averaged 13.78 model years per day, with a standard deviation of 0.13 model years per day, using all 8 cores of the machine. I can't fully explain the run-to-run variation. It is likely a combination of the computing activity of the software, hard drive interaction, the machine needing to do other work during the run, and heat. There is, however, no correlation between run speed and run duration, so a direct connection between heat (using duration as a proxy, i.e. the longer the run, the hotter the machine) and run speed doesn't seem to exist. On the other hand, the longest run, over 6000 days, produced the slowest speed, 13.5 model years per day. That said, there is only one sample with such a long duration.

Now comes the post processing. I use a wide variety of “in-house” software to process the results and turn them into something manageable. I also use some free and open source software. One such package is NCO (NetCDF Operators). Since I'm on a Mac, there are at least three ways to get Unix/Linux apps onto the machine: 1) Fink, a package manager; 2) MacPorts, another package manager; and 3) compiling, i.e. the hard way.

I've had no luck installing NCO using Fink or MacPorts. However, I did use MacPorts to install some of NCO's dependencies, such as ANTLR. So, compiling I go!

One thing I've found extremely handy when plowing through a difficult compile is to first build a script in which you set all your variables, switches, etc., so that you keep a record of what you used.

Here's what my script ended up looking like:

#!/bin/sh

# Start from a clean tree
make distclean

# Compilers and library locations for this build
export CC=cc
export ANTLR_ROOT=/opt/local
export CPPFLAGS=-I/usr/include/malloc
export NETCDF_INC=/opt/netcdf/gcc/4.0.1/include
export NETCDF_LIB=/opt/netcdf/gcc/4.0.1/lib

./configure --disable-regex --disable-antlr --disable-shared

make

The malloc directory in CPPFLAGS is there because many files could not be built without knowing exactly where malloc.h is located on the Mac.

In any case, I only needed about one or two of the actual NCO commands.

Now, on to NCL (NCAR Command Language). NCL comes prebuilt for the Mac: just install and go. However, it can be fickle. Today, it's just hanging on some files. We'll see what I can do to shake things loose.

Climate model on a mac project: #12 In Production!

At last, the new machine is doing production work, and it has been running for almost 24 hours so far. It's running a bit slower than anticipated, about 13.5 model years per day. I'm not sure what's causing the slower speed, but it's still acceptable considering my original estimate was only 7 model years per day.

Temperature remains a worry for me. The CPU temps seemed reasonable, but the RAM temps seemed rather high for long-term processing. So, I ordered some fans to cool the RAM boards.

Tomorrow, I'll take a look at the output and start planning a possible Cocoa front-end for running the model.

Climate model on a mac project: #11 sysctl.conf and mpich

A key component for running the climate model I'm using is MPICH. MPICH is an implementation of the MPI (Message Passing Interface) library for supercomputing, maintained at Argonne National Laboratory (disclosure: I am a former employee of the lab). The climate model uses MPICH to break the world down into smaller parts and distribute those parts among the CPUs. However, these parts must also talk to each other, because climate information has to move from processor to processor so that each processor knows what's going on around it.

MPICH comes installed with Leopard (I'm not sure if it comes with the developer installation or the standard client installation) and is meant for use with GCC only. It is also likely intended for Xgrid. So, for this project, the pre-installed MPICH does not have the Fortran support required to run my simulations.

As it turns out, the default Leopard configuration lacks a critical setting for running MPICH on a single Mac: it does not allow for enough shared memory. Using shared memory can speed up computation on multi-processor or multi-core machines because it helps limit how much Ethernet bandwidth the processes use. Ethernet bandwidth is a major bottleneck for processing speed, which is why many high-performance clusters use high-speed interconnects such as Myrinet.

To allow shared memory on Leopard, /etc/sysctl.conf must be created. The contents of the file include something like:

kern.sysv.shmmax=536870912
kern.sysv.shmmin=1
kern.sysv.shmmni=128
kern.sysv.shmseg=32
kern.sysv.shmall=131072

Unfortunately, I don't know the optimal settings for these parameters; I just found these values on the net. However, they do work.
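After a reboot, you can confirm the kernel actually picked up the values with sysctl (on this vintage of Mac OS X, the kern.sysv limits generally can't be changed once shared memory is in use, which is why the file-plus-reboot route is the safe one):

# Verify the shared memory limits the kernel is using
sysctl kern.sysv.shmmax kern.sysv.shmall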

And now MPICH can be compiled and installed on the system. Here are the contents of my script to build MPICH:

make distclean
export CC=pgcc
export CXX=pgcpp
export FC=pgf90
export F90=pgf90
export F77=pgf77

export CFLAGS='-Msignextend '
export FC=pgf77
export FFLAGS=
export CCFLAGS=

./configure --prefix=/opt/mpich/pgi/7.2/ --with-device=ch_shmem --enable-g --enable-sharedlib
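Once MPICH is built and installed, launching the model across all the cores is a one-liner; a sketch, with the executable name and process count as placeholders:

# Run the model as 8 local processes using the shared-memory device
/opt/mpich/pgi/7.2/bin/mpirun -np 8 ./foam_executable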