3/31/2012

A first glimpse of CMAQ

In the last three weeks, I did three things:
1. Modified the Python script I wrote earlier to interpolate higher-resolution DEM data into WRF.
2. Learned NCL.
3. Successfully ran the CMAQ model.

DEM Interpolation

As mentioned in the last post, I wrote a small Python script to interpolate our 30-meter-resolution DEM data into WRF and make the terrain more accurate. Writing the script was one thing; validating it turned out to be another challenge. First we simply fed the output to geogrid.exe and compared the result with terrain generated from coarser data, namely 30 s (arc-seconds) and 2 m (arc-minutes), to check the rough trend. The result looked OK, but it was not persuasive enough to claim correctness: the terrain is too complex to check every detail by eye!

Then we did a more rigorous validation. First, we made up some test data: a block of DEM whose four quarters each hold a different constant value. With such simplified data it is easy to check whether the output is right. Then we fed the real data into WPS, overlaid the result on the coarser data, and subtracted one topography height from the other. In this way, the correctness of the script could be validated with much more confidence.
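
The four-quarter test can be sketched roughly as below. This is a minimal illustration rather than my actual script: the block-averaging function is only a stand-in for the real interpolation step, and the names make_quadrant_dem and block_mean and the constant heights are made up for the example.

```python
import numpy as np

def make_quadrant_dem(n, values=(100.0, 200.0, 300.0, 400.0)):
    """Return an n x n DEM whose four quarters hold different constants."""
    half = n // 2
    dem = np.empty((n, n), dtype=float)
    dem[:half, :half] = values[0]
    dem[:half, half:] = values[1]
    dem[half:, :half] = values[2]
    dem[half:, half:] = values[3]
    return dem

def block_mean(dem, factor):
    """Stand-in for the interpolation step: average factor x factor blocks."""
    rows, cols = dem.shape
    blocks = dem.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

fine = make_quadrant_dem(8)          # synthetic high-resolution DEM
coarse = block_mean(fine, 2)         # what the script under test produces
expected = make_quadrant_dem(4)      # what we know the answer must be

# The "minus" computation: any non-zero cell points at a bug.
diff = coarse - expected
max_abs_diff = np.abs(diff).max()
```

Because every quarter is constant, the expected output is known in advance, so a single subtraction is enough to spot an indexing or orientation mistake.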

The next step is to check how much the more precise topography influences WRF. From the geogrid result, you can hardly tell the difference between the 1 s and 30 s terrain. However, according to reports by other researchers, the topography does make a difference. The same method applies: overlay and subtract.
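
When the eye cannot tell two terrain fields apart, a few summary numbers of the difference help. The sketch below assumes both height fields are already on the same grid (for example read from two geogrid outputs); the tiny arrays and the function name terrain_difference are invented for illustration.

```python
import numpy as np

def terrain_difference(hgt_fine, hgt_coarse):
    """Subtract two terrain-height fields on the same grid and summarize."""
    diff = np.asarray(hgt_fine, dtype=float) - np.asarray(hgt_coarse, dtype=float)
    return {
        "mean": float(diff.mean()),                   # overall bias
        "rmse": float(np.sqrt((diff ** 2).mean())),   # typical size of the difference
        "max_abs": float(np.abs(diff).max()),         # worst single cell
    }

# Made-up heights in meters, standing in for the 30 s and 1 s terrain.
hgt_30s = np.array([[100.0, 120.0], [140.0, 160.0]])
hgt_1s = np.array([[103.0, 120.0], [136.0, 160.0]])

stats = terrain_difference(hgt_1s, hgt_30s)
```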

NCL

NCL is an interpreted language (like Python) designed specifically for scientific data analysis and visualization. As far as I can see, the tool has the following advantages:
1. Interpreted language. It's easy to use, without compiling and linking problems. Any change takes effect immediately. It is much like Python, and can also be run in both interactive mode and batch mode.
2. Powerful I/O ability. It can read almost all scientific data, from ASCII to binary, NetCDF, GRIB1 and GRIB2, shapefiles, etc. That's why we abandoned GrADS, which can only deal with specific formats and may not work for WPS, CMAQ and SMOKE.
3. Programming freedom. The language is rather flexible and gives users enough freedom to do what they want. It is also easy to extend by wrapping C and Fortran code.
4. Numerous built-in functions. Though I haven't used these mathematical functions yet, it is good to know there are many tools at hand, somewhat like Matlab.
5. Easy map overlay. NCL is designed especially for atmospheric and oceanic models, where overlaying data on a map is essential for studying a specific issue.

More features:
There are several different kinds of data in NCL. A variable can carry attributes, coordinate data and missing values. Attributes hold additional information apart from the "real data", such as a description of the variable. Coordinate data makes mapping easier. Missing-value support is another highlight. Many models assign a special value to cells that lack sufficient information (e.g. initial conditions) to be simulated. NCL recognizes these values and treats them specially, for example by leaving them out of calculations.
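
To show why missing-value support matters, here is a rough Python analogue using NumPy masked arrays. It is not NCL code, and the flag value -9999.0 is just an assumed example of the kind of special value a model might write.

```python
import numpy as np

FILL_VALUE = -9999.0  # assumed missing-value flag, for illustration only

raw = np.array([1.0, FILL_VALUE, 3.0, 5.0])

# Without missing-value handling, the flag pollutes the statistics.
naive_mean = raw.mean()

# With the flag masked out, only the three valid cells are averaged.
masked = np.ma.masked_values(raw, FILL_VALUE)
masked_mean = masked.mean()
```

The naive mean is dragged far below zero by the flag, while the masked mean reflects only the real data.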

Also, unlike C, NCL can operate on a whole array at once. Looping through every element of an array, on the other hand, is inefficient and not recommended.
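
The same idea exists in Python with NumPy, so here is a small comparison there (again an analogue, not NCL code). The temperature values are made up.

```python
import numpy as np

temps_k = np.array([[273.15, 283.15], [293.15, 303.15]])  # made-up values in kelvin

# Element-by-element loop: works, but slow in an interpreted language.
looped = np.empty_like(temps_k)
for i in range(temps_k.shape[0]):
    for j in range(temps_k.shape[1]):
        looped[i, j] = temps_k[i, j] - 273.15

# Whole-array expression: one line, and the loop runs in compiled code.
vectorized = temps_k - 273.15
```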

CMAQ

Finally, CMAQ. Thanks to Prof. Zheng Junyu, we now have SMOKE output and can run CMAQ. CMAQ consists of many sub-programs, and it is tedious to set up their configurations one by one, so I wrote a top-level script to control the sub-scripts.
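
My real top script is a shell script, but the idea can be sketched in Python as below. The step names and the placeholder commands are assumptions for the demo; in practice each command would be the run script of one sub-program.

```python
import subprocess
import sys

def run_pipeline(steps):
    """Run each (name, command) step in order and stop at the first failure."""
    completed = []
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            raise RuntimeError(
                f"step {name!r} failed with exit code {result.returncode}"
            )
        completed.append(name)
    return completed

# Placeholder commands that only print a message; replace them with the
# real run scripts of each sub-program.
demo_steps = [
    ("mcip", [sys.executable, "-c", "print('mcip step done')"]),
    ("cctm", [sys.executable, "-c", "print('cctm step done')"]),
]

finished = run_pipeline(demo_steps)
```

Stopping at the first non-zero exit code matters here, because a later step would otherwise run for hours on the output of a failed earlier step.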

It's interesting to learn shell programming. The Linux shell is much more powerful than the Windows DOS prompt. It has not only basic commands, but also variables, statements, functions and procedures, and logical expressions. One thing that still confuses me is that Linux has various shells, such as sh, bash and csh, and their syntax differs. Besides, the concept of a "process" is important: it affects how variables are passed between functions and scripts.
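
The "process" point can be demonstrated with a small experiment, shown here in Python rather than shell. A child process receives a copy of the parent's environment, and whatever it changes there never travels back. The variable name DEMO_GRID is made up for the demo.

```python
import os
import subprocess
import sys

os.environ["DEMO_GRID"] = "d01"  # hypothetical variable set by the parent

# The child prints what it inherited, then changes its own copy.
child_code = (
    "import os;"
    "print(os.environ['DEMO_GRID']);"
    "os.environ['DEMO_GRID'] = 'd02'"
)
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)

child_saw = result.stdout.strip()       # value the child inherited
parent_after = os.environ["DEMO_GRID"]  # parent's value after the child exits
```

The child sees the parent's value, but the parent still holds its original value afterwards. This is why a sub-script that is run as a separate process cannot hand variables back to the script that called it.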

Running CMAQ is really time-consuming. With the basic configuration there are six output files, including pollutant concentrations, average concentrations, and wet and dry deposition. One problem I recently solved is that the variables in the CCTM output have no coordinates, only rows and columns, which makes it difficult to overlay them on a map. The coordinate information does exist: it resides in the MCIP output, in the GRDCRO2D file, as the variables LON and LAT. They are four-dimensional arrays, with time steps, layers, rows and columns. For coordinates, the values are in fact duplicated across the array. One dimension is enough, and that is also what NCL requires (in NCL, coordinate variables must be one-dimensional: for longitude, pick the columns; for latitude, pick the rows).
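
The slicing can be sketched in Python as follows, with small made-up arrays standing in for LON and LAT. The sketch assumes longitude depends only on the column and latitude only on the row, which holds exactly on a regular lat-lon grid; on a projected grid the 1-D slices are only an approximation.

```python
import numpy as np

# Made-up coordinates standing in for the real grid.
lon_true = np.array([110.0, 110.5, 111.0])  # one value per column
lat_true = np.array([22.0, 22.5])           # one value per row

# Build 4-D arrays shaped (time step, layer, row, column), like LON and LAT.
lat_2d, lon_2d = np.meshgrid(lat_true, lon_true, indexing="ij")
lon_4d = lon_2d[np.newaxis, np.newaxis, :, :]
lat_4d = lat_2d[np.newaxis, np.newaxis, :, :]

# Longitude: first time step, first layer, first row, all columns.
lon_1d = lon_4d[0, 0, 0, :]
# Latitude: first time step, first layer, all rows, first column.
lat_1d = lat_4d[0, 0, :, 0]
```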

1 comment:

  1. I understood a tiny little bit~~~~ Read! Keep up the good work, young man~~~
