3/31/2012

A first glimpse of CMAQ

In the last three weeks, I did three things:
1. Modified the Python script I wrote earlier to interpolate a more precise DEM into WRF.
2. Learned NCL.
3. Successfully ran the CMAQ model.

DEM Interpolation

As mentioned in the last post, I wrote a small Python script to interpolate our 30-meter-resolution DEM data into WRF, hoping to make the results more accurate. Once the script was written, however, validating it turned out to be another challenge. At first we simply fed the output into geogrid.exe and compared the result with that from coarser data, say 30 s(econd) and 2 m(inute), to see the rough trend. The result looked OK, but that was not persuasive enough to claim correctness: the terrain is too complex to check every detail by eye!

Then we did a more rigorous validation. First, we made up some test data: a block of DEM whose four quarters held different constant values. With such simplified data it was easy to check whether the output was right. Then real data were passed through WPS, overlaid on the coarser data, and the two topography heights were subtracted (a "minus" computation). In this way, the correctness of the script could be validated thoroughly.
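The two checks above can be sketched in a few lines of Python with NumPy. This is only an illustration, not the script we actually used: the function names, the grid size and the four constant heights are all made up for the example.

```python
import numpy as np

def make_quadrant_dem(rows, cols, values=(100.0, 200.0, 300.0, 400.0)):
    """Build a test DEM whose four quarters hold different constant heights.

    values are (north-west, north-east, south-west, south-east); they are
    arbitrary illustration numbers, not real elevations.
    """
    dem = np.empty((rows, cols), dtype=np.float32)
    half_r, half_c = rows // 2, cols // 2
    dem[:half_r, :half_c] = values[0]
    dem[:half_r, half_c:] = values[1]
    dem[half_r:, :half_c] = values[2]
    dem[half_r:, half_c:] = values[3]
    return dem

def height_difference(fine, coarse):
    """The "minus" step: subtract two terrain grids that share one mesh."""
    if fine.shape != coarse.shape:
        raise ValueError("grids must be on the same mesh before subtracting")
    return fine - coarse

# Synthetic check: the quadrant pattern should survive any correct processing.
dem = make_quadrant_dem(4, 6)
# Stand-in for a coarser terrain field that has been put on the same grid.
diff = height_difference(dem, np.full_like(dem, 100.0))
```

With constant quarters, any row or column mix-up in the processing chain shows up immediately as a quadrant in the wrong place.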

The next step is to check how much the more precise topography influences WRF. From the geogrid output alone, you can hardly tell the difference between the 1 s and 30 s terrain. However, other researchers report that topography does make a difference. The same method applies: overlay and subtract.

NCL

NCL is an interpreted language (like Python) designed specifically for scientific data analysis and visualization. As far as I can see, the tool has the following advantages:
1. Interpreted language. It's easy to use, with no compiling or linking problems. Any change takes effect immediately. It is much like Python, and can be run in both interactive mode and batch mode.
2. Powerful I/O ability. It can read almost all scientific data, from ASCII to binary, netCDF, GRIB1 and GRIB2, shapefiles, etc. That's why we abandoned GrADS, which handles only specific formats and may not work with WPS, CMAQ and SMOKE data.
3. Programming freedom. The language is rather flexible and gives users enough freedom to do what they want. It is also easy to extend by wrapping C and Fortran code.
4. Numerous built-in functions. Though I haven't used these mathematical functions yet, it is good to know there are many tools at hand, somewhat like Matlab.
5. Easy map overlay. NCL is designed especially for atmospheric and oceanic models, where overlaying data on a map is essential for studying a specific issue.

More features:
A variable in NCL can carry several kinds of data: attributes, coordinates and missing values. Attributes hold additional information apart from the "real data", like variable descriptions. Coordinates facilitate the mapping procedure. Missing value support is another feature that stands out. Many models assign a special value to cells that lack sufficient information (e.g. initial conditions) to simulate. NCL can recognize these values and treat them specially.

Also, unlike C, NCL can operate on a whole array at once. Looping through every element of an array, on the other hand, is inefficient and not recommended.
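For readers who know Python better than NCL, NumPy's masked arrays give a rough analogue of both ideas: missing values that are skipped automatically, and whole-array arithmetic without an explicit loop. This is not NCL code, and the flag value -9999 and the numbers are made up for the sketch.

```python
import numpy as np

FILL = -9999.0  # hypothetical missing-value flag, chosen for this example

# A small field in which two cells carry the missing-value flag.
conc = np.array([[1.0, 2.0, FILL],
                 [4.0, FILL, 6.0]])

# Mask the flagged cells so later operations ignore them.
masked = np.ma.masked_values(conc, FILL)

# Whole-array arithmetic: no loop over elements, masked cells stay masked.
doubled = masked * 2.0

# Statistics use only the valid cells (1, 2, 4 and 6 here).
mean_valid = masked.mean()
```

Calling `doubled.filled(FILL)` writes the flag back into the masked cells when a plain array is needed again.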

CMAQ

Finally we come to CMAQ. Thanks to Prof. Zheng Junyu, we now have SMOKE output and can run CMAQ. CMAQ consists of many sub-programs, and it is annoying to set up their configurations one by one, so I wrote a top-level script to control these sub-scripts.

It's interesting to learn shell programming. The Linux shell is much more powerful than Windows DOS. It has not only basic commands, but also variables, statements, functions and procedures, and logical expressions. One thing that still confuses me is that Linux has various shells, like sh, bash and csh, and different shells have somewhat different syntax. Besides, the concept of a "process" is important, because it affects how variables are passed between functions and scripts.
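The point about processes can be shown from Python as well as from a shell. In this sketch (the variable name GRID_NAME and its value are hypothetical) a child process sees a variable only if it is placed in the environment handed to it, which is roughly what `export` does in sh and bash.

```python
import os
import subprocess
import sys

# The child just reports what it sees; GRID_NAME is a hypothetical variable.
CHILD = "import os; print(os.environ.get('GRID_NAME', 'unset'))"

def run_child(env):
    """Start a separate process with the given environment, return its output."""
    result = subprocess.run([sys.executable, "-c", CHILD], env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Start from the current environment, making sure GRID_NAME is not in it.
base_env = {k: v for k, v in os.environ.items() if k != "GRID_NAME"}

# An ordinary variable in the parent: the child process cannot see it.
grid_name = "PRD_3KM"
not_passed = run_child(base_env)

# Put it into the child's environment (like `export`) and the child sees it.
passed = run_child(dict(base_env, GRID_NAME=grid_name))
```

The reverse direction never works: whatever a child process changes in its own environment is gone when it exits, which is why a sub-script cannot hand variables back to the top script this way.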

Running CMAQ is really time-consuming. With a basic configuration there are six output files, including pollutant concentration, average concentration, and wet and dry deposition. One problem I recently solved is that variables in the CCTM output have no coordinates, only rows and columns, which makes it difficult to overlay them on a map. The information does exist: it resides in the MCIP output, in the GRDCRO2D file, as the variables LON and LAT. They are 4-dimensional arrays, with time steps, layers, rows and columns. For coordinates, the values in the array are duplicated, so one dimension is enough. This is also what NCL requires: in NCL, coordinate variables must be one-dimensional. For longitude, pick the columns; for latitude, pick the rows.
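The slicing can be sketched with NumPy on a synthetic array. The grid values below are made up, and the sketch assumes, as described above, that longitude repeats down every column and latitude repeats along every row; on a grid where that does not hold, a single row or column would only be an approximation.

```python
import numpy as np

def coords_1d(lon4d, lat4d):
    """Reduce (time, layer, row, col) LON/LAT arrays to 1-D coordinates."""
    lon = lon4d[0, 0, 0, :]  # first row: longitude varies along the columns
    lat = lat4d[0, 0, :, 0]  # first column: latitude varies along the rows
    return lon, lat

# Synthetic stand-in for the LON and LAT variables: 1 time step, 1 layer,
# 2 rows and 3 columns. The coordinate values are invented for the example.
lon_axis = np.array([113.0, 113.5, 114.0])
lat_axis = np.array([22.0, 22.5])
lon2d, lat2d = np.meshgrid(lon_axis, lat_axis)
lon4d = lon2d[np.newaxis, np.newaxis, :, :]
lat4d = lat2d[np.newaxis, np.newaxis, :, :]

lon, lat = coords_1d(lon4d, lat4d)
```

The resulting `lon` has one value per column and `lat` one value per row, which is the shape a one-dimensional coordinate variable needs.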

3/16/2012

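Reading "激荡三十年" ("Thirty Turbulent Years")

I heard about this book a long time ago. It was written by the well-known Wu Xiaobo, and it stuck in my mind at first because he shares a name with the advisor of our honors class. Naturally, like other economics books, it was hugely popular among my classmates there. Many said they liked it or wanted to read it, and those who had read it came out with a pile of grand opinions. That sort of thing puts me off, so I kept putting the book aside.

Still, curiosity about economics books led me to pick it up again. It seems to be the first time I have read such a "down-to-earth" book. The content surprised me: it is written plainly and truthfully, and it gave me, someone who knows little of the world, a first understanding of these "most tangible things right around us". Especially when I read about Li Shufu from Taizhou, and even about events in Yuhuan, I felt deeply how real the stories in the book are.

So economics is a really interesting discipline. Unlike the fields I usually work in, such as computer science, it is not so abstract or so detached from reality. It studies how society operates, and it sits uneasily with the lofty, aloof pride of the scholars of old.

Perhaps I should read more practical books like this in the future. Sometimes I wonder: why cling so stubbornly to so-called technology?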

3/12/2012

Topography in WPS

Two weeks ago, I tried to run CMAQ. However, it was really beyond my expectation that the pollution emission source data were so difficult to deal with, and finally we had to give up.

SMOKE is a set of programs responsible for the emission data. It takes in a pollution inventory and turns it into a format CMAQ accepts. It also divides the inventory spatially into grid cells and temporally into hours and days, since most of the inventory data we get are annual. The problem is that SMOKE was developed in the USA for use in the USA, where the standards and administrative divisions are totally different from ours. To take advantage of SMOKE, a specific methodology is needed to adapt it to local conditions. Spanish scholars (R. Borge, et al) have done that. In China, Prof. Zheng Junyu from South China University of Technology spent two years on it.

We were lucky that on 2nd March, Prof. Zheng came here and gave us a talk on exactly this topic, pollution inventory preparation. He promised to share his results so that we can go further. This Wednesday we are going there to learn the techniques.

Last week, I tried to replace the topography data for WPS with my own data. WPS is the pre-processing software of WRF; it prepares geographical and meteorological data for real-case weather simulation. It consists of three sub-programs, among which geogrid is responsible for interpolating static geographical data onto the model grid. WPS ships with global geographical data from USGS at 10 m, 5 m, 2 m (arc-minute) and 30 s (arc-second) resolutions. For better precision, we have to add customized data to geogrid. We have 1 s (about 30 meters) DEM data for the PRD, and we can replace the topography data with it. Later, we are going to collect land use data and do the same replacement.

The work is not as difficult as it seems. Both the DEM data and the geogrid-formatted data are gridded. All I have to do is read the DEM into memory and then write it out as static data. GDAL makes dealing with the DEM much easier, and WPS provides the output routine. We have to take care to follow the WPS program interface. Also, the row order differs between the two formats: GDAL reads data starting from the first row (north to south), but the geogrid routine writes data from south to north.
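The row-order fix amounts to flipping the array vertically before handing it to the output routine. Here is a minimal sketch with NumPy; the small array stands in for a DEM band as read by GDAL, and the function name is made up for the example.

```python
import numpy as np

def north_first_to_south_first(dem):
    """Reverse the row order so the southernmost row comes first.

    The result is copied into contiguous memory, which is usually what a
    routine that writes raw binary data expects.
    """
    return np.ascontiguousarray(dem[::-1, :])

# Stand-in for a DEM read north to south: row 0 is the northernmost row.
dem = np.array([[30, 31],
                [20, 21],
                [10, 11]], dtype=np.int16)

flipped = north_first_to_south_first(dem)
```

Only the rows are reversed; the west-to-east order within each row stays the same.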

After the static data are ready, some configuration has to be modified to instruct geogrid to use the data we specified. After the geo_em.d* files have been written, we can use NCL to view the result. A sample ncl script can be downloaded here

In this way, terrain data with better resolution can be applied to WRF, which should affect the simulation results, such as wind direction. Of course, the simulation grid largely determines how much this work matters. Simulations with large domains and low resolution benefit less from finer terrain data.

Fig1. Terrain with USGS 10m data

Fig2. Terrain with 1s data

3/05/2012

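A turning point in life?

Right now I am sitting in 303, listening to Xiao Luoli's PhD qualifying exam and watching page after page of gorgeous slides, with mixed feelings: is this what I will be doing from now on?

Yesterday I read a blog post. An ambitious young man, unwilling to live the dull life of going to work, humoring his wife and looking after the kids every day, resolved to go abroad; only the wide world outside could hold his great ambition. Four years later, busy every day reading papers, doing projects and writing proposals, his greatest wish was to graduate early, live the well-paid life a highly educated person should have, and then have a happy family. He could not help sighing: in the end, I have come back to where I started.

This may be the question that troubles everyone: why are we alive? A few days ago I received an offer from PSU. Of course I was thrilled, but I know I am not ready. I was thrilled only because I had won the battle of [applying]; [being recognized] gave me a great sense of achievement. But this is really only the first small step; five, six or even more years lie ahead. Every day I will be reading papers, writing papers, applying for grants and giving presentations. Clearly this is not the life I want; it is just the price I pay for a better life later. I wonder whether I will someday regret today's decision to make this "sacrifice".

America, such an alluring country. Since I was little, my mother hoped I would make something of myself, and one benchmark of that was going abroad to study. I too was eager from childhood to go out and see the world that my mother, my grandmother and my great-grandmother never saw. Yet two years ago, when I really did go to America, those three weeks gave me nothing but helplessness and loneliness, and it was the grey impression of those three weeks that made me give up the idea of going abroad when I finished my undergraduate degree. So why has the idea come back now? Honestly, I cannot say. Perhaps Xiao Chizi is right: there is still a little flame in my heart that wants to go out and see. If I go, I may regret it one day; but if I do not, I will certainly regret it. Follow your heart.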

Young and to be young, follow your heart.

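The title is "A turning point in life?". The decision to go abroad may affect my whole life, even change my future career and the environment I develop in. I added the question mark because I doubt whether this decision is really big enough to "change" the course of my life. Might it be like other decisions: when I face it, it is magnified without limit, but when I look back a few years later it is only a light stroke in my life? Perhaps nothing can truly change a life; in fact there is no such thing as changing it, since nobody knows what "life" would otherwise have been. That is life; nothing is such a big deal. Right and wrong, good and bad, all become part of it.

Finally, I remember something you said:

I will walk on bravely, as long as I am holding your hand.

The road here is no longer just the road of our love, but the long road of our lives.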