Data-flow manual open for discussion

The lead centre has prepared a collection scheme for meta-data related to GRUAN measurements. All information describing sites, measuring systems, and measurements will be collected in well-defined XML files and stored in a meta-data database at the lead centre. Convenient tools, written in Java, are available for generating and submitting the GRUAN meta-data (GMD) files.
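To give a feel for what "meta-data in a well-defined XML file" can look like, here is a minimal sketch. It is an illustration only and not the GMD schema: the element names (`measurement`, `site`, `system`, `launchTime`, `operator`), the function name, and all values are hypothetical placeholders, and the sketch uses Python for brevity although the actual tools are written in Java. The authoritative file layout is the one described in the manual.

```python
import xml.etree.ElementTree as ET


def build_measurement_metadata(site_id, system, launch_time_utc, operator_id):
    """Return an XML string describing one measurement.

    All element names are hypothetical; they do not reproduce the GMD schema.
    """
    root = ET.Element("measurement")  # hypothetical root element
    ET.SubElement(root, "site").text = site_id
    ET.SubElement(root, "system").text = system
    ET.SubElement(root, "launchTime").text = launch_time_utc
    ET.SubElement(root, "operator").text = operator_id
    return ET.tostring(root, encoding="unicode")


# Placeholder values, chosen only to make the example runnable.
xml_text = build_measurement_metadata(
    site_id="EXAMPLE-SITE",
    system="EXAMPLE-RADIOSONDE-SYSTEM",
    launch_time_utc="2010-07-20T12:00:00Z",
    operator_id="OP-001",
)
print(xml_text)
```

A file like this can be parsed back with `ET.fromstring(xml_text)`, which is roughly what a receiving database would do before storing the record.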

The lead centre has prepared a manual describing in detail which data and meta-data will be collected, and how. This manual is attached to this blog post and is open for discussion.

Additionally, a short guide was drafted which explains the major steps for generating and submitting meta-data. This guide is also available here. Feedback of any kind is welcome.

The data-flow schemes and tools are currently being tested in Lindenberg and during the CIMO radiosonde inter-comparison campaign in China. After this testing and discussion phase, we will start asking stations in September to submit data according to that scheme.


3 Responses to Data-flow manual open for discussion

  1. Masatomo Fujiwara says:

    Dear Michael,
    Thank you very much for preparing the Manual for the Data Management.
    I have one comment.
    In Section 2.4.1 (Original Raw Data (Level 0)),
    please put stronger emphasis on the fact that these tables
    are incomplete and will evolve further.
    I am afraid that currently, this section gives
    the readers an impression that only these
    instruments and software are for GRUAN.

    With my best wishes,

    P.S. Holger and Franz showed me your software
    here in Yangjiang, China. We have made so far
    3 Scientific radiosonde flights (each having
    4 different potential reference radiosondes)
    and more than 10 operational radiosonde flights
    (each having 5 or 6 different-type radiosondes)
    under the 8th WMO Intercomparison of Radiosonde

  2. peterthorne says:

    My apologies for the delay in reading this and making the time to comment upon it. It doesn’t appear that I can attach a file in making a comment and I don’t want to start a new post, so I will send the marked up pdf that I have under separate cover to the Lead Centre.

    Overall, I think this reflects a lot of hard work and effort on the part of Michael and colleagues and I am very grateful for that. They have clearly thought long and hard about a number of issues to ensure that a scientifically robust set of guidance results. I would have two major comments that are not primarily scientific in nature and then a generic comment.

    Firstly, I think that it would simplify things and save effort for the Lead Centre in the long run if they were to make the top-level manual independent of data type. They would then produce specific annexes for each instrument type as the capability to process them is developed. Having a generic high-level document allows sites to ascertain requirements / overheads for instrument types not yet covered and gives a much cleaner picture; mixing generic with instrument-specific material, in my opinion, makes this harder than necessary. So, I would personally advocate splitting this into a generic main manual and a radiosonde-specific first annex to it.

    Secondly, although I agree that it is important that the operator be identified, I think we need to think long and hard about how much personal information we can post without falling foul of national / international data protection rules. I would urge us to go for a lowest common denominator, which in my mind means ensuring that each operator at each station has a unique and anonymised id that is always associated with the measurements made by them, and that this id is logged and maintained at the station. I do not think it realistic to append any information beyond this id to the data stream. Maintaining the additional information locally at the site would be the most legally astute option, since the site will be most conversant with the legislation that applies in its locale if further details are requested.

    My generic comment is that for GRUAN to work the sites need minimal additional overhead. Therefore we should consider whether generic GUIs / scripts etc. can be set up to do much of the work, requiring minimal operator training and taking up little additional valuable time. The Lead Centre have clearly gone a long way down the line, but in redrafting this it would be worth considering what additional user applications can be developed either immediately or in the longer term. I’d also urge the redraft to be more explicit about the likely overhead this places on a trained operator, to make clear to the sites that the additional resource requirement is affordable.

  3. Franz Immler says:

    A PDF version of the manual with Peter’s comments is available here:
