Ajay Krishnan - NOAA Affiliate
2016-11-01 17:21:06 UTC
Thanks a lot for the suggestions, Dave, Ed and Seth.
Seth, here's the response from the team that's working on the data:
The first solution is not realistic. There are too many
missing times - separating out into another data variable
would give the user a bifurcated data set.
The second solution is doable, but still doesn't make much
sense. The power of using a convention is that the data
can be dumped, used, graphed, in software which follows
the conventions quickly and easily. What good is it to
graph against a sequential index? Date/time needs to
be a coordinate to interact seamlessly with existing
software.
Ed,
I don't know how to record the data as a time_offset in hours or seconds when
there is no information on the number of days that have passed since the
reference time.
-Ajay
Send CF-metadata mailing list submissions to
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
or, via email, send a message with subject or body 'help' to
You can reach the person managing the list at
When replying, please edit your Subject line so it is more specific
than "Re: Contents of CF-metadata digest..."
1. Re: Handling time when date is "missing" (Seth McGinnis)
2. Re: Handling time when date is "missing"
(Dave Allured - NOAA Affiliate)
3. Re: Feedback requested on proposed CF Simple Geometries
(Jonathan Gregory)
4. Re: Handling time when date is "missing"
(Armstrong, Edward M (398G))
----------------------------------------------------------------------
Message: 1
Date: Tue, 25 Oct 2016 14:26:18 -0600
Subject: Re: [CF-metadata] Handling time when date is "missing"
Content-Type: text/plain; charset=windows-1252
But then the data is non-compliant, and it sounds like a valid CF
solution is needed.
Two possible solutions come to my mind. The first way would be to store
the undated measurements separately. Record the normal measurements in
the normal way, and then record the undated measurements in a separate
data variable with an index coordinate instead of a time coordinate.
The other way would be not to use time as a coordinate variable at all,
but only as a data variable. Record all the measurements with an index
coordinate instead of a time coordinate. Then define data variables for
year, month, day, and time of measurement, and just fill in what's known
for each one. (It sounds like the month and year are still known even
if the day is not.) This is very similar to the approach taken for
trajectories; see example H.12 in the spec.
Cheers,
--Seth
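As a rough illustration of the second approach above (an index coordinate plus separate date-part variables), a minimal Python sketch follows. The fill value -9999, the column names, and the sample observations are assumptions made up for this example; they do not come from the IMMA files or from the CF spec.

```python
# Hypothetical fill value for any date/time component that was not recorded.
FILL = -9999

def to_indexed_columns(records):
    """Lay observations out along an 'obs' index with one column per date part."""
    columns = {"obs": [], "year": [], "month": [], "day": [], "hour": [], "value": []}
    for i, rec in enumerate(records):
        columns["obs"].append(i)
        for part in ("year", "month", "day", "hour"):
            component = rec.get(part)
            # Fill in what is known; mark the rest with the fill value.
            columns[part].append(FILL if component is None else component)
        columns["value"].append(rec["value"])
    return columns

# Made-up sample: the second observation has an hour but no day.
records = [
    {"year": 1880, "month": 3, "day": 14, "hour": 6, "value": 12.5},
    {"year": 1880, "month": 3, "day": None, "hour": 12, "value": 13.1},
    {"year": 1880, "month": 3, "day": 15, "hour": 6, "value": 11.8},
]
cols = to_indexed_columns(records)
```

Each column would map to a data variable on the index dimension, so the fill value only ever appears in a data variable and never in a coordinate variable.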
Ajay,
I think this is an exception to CF. I recommend using _FillValue or
missing_value on the time coordinate. Document this in a comment
attribute on the time coordinate variable.
Also document this somehow in another global attribute that explains you
made this exception to the CF conventions. Follow CF conventions in all
other regards.
Then, try to remember to warn people about this when you distribute the
data. CF compliant time coordinates are fundamental to many application
programs, and I expect they will choke or introduce subtle errors if
missing values are in there. So users will need to provide special
handling for such files. HTH.
--Dave
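To show the kind of special handling a reader of such a file would need, here is a small Python sketch. The reference time, the units ("hours since 1880-01-01"), and the fill value are assumptions chosen for the example, not values from the actual data set.

```python
from datetime import datetime, timedelta

REFERENCE = datetime(1880, 1, 1)  # assumed epoch for "hours since 1880-01-01"
TIME_FILL = -9999.0               # assumed _FillValue on the time variable

def decode_times(hours_since_ref):
    """Convert encoded times to datetimes, leaving None where the time is missing."""
    decoded = []
    for h in hours_since_ref:
        if h == TIME_FILL:
            decoded.append(None)  # undated measurement: keep it, but flag it
        else:
            decoded.append(REFERENCE + timedelta(hours=h))
    return decoded

times = decode_times([0.0, 6.0, TIME_FILL, 30.0])

# Software that does not check for the fill value decodes it as an ordinary
# offset and gets a plausible-looking but meaningless date.
naive = REFERENCE + timedelta(hours=TIME_FILL)
```

The last line is the subtle-error case: the fill value silently turns into a date before the reference time rather than raising an error.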
(Please reply to list only)
On Tue, Oct 25, 2016 at 12:07 PM, Ajay Krishnan - NOAA Affiliate
Hi All,
I have a user that's converting some IMMA format files to CF
compliant NetCDF files.
The problem is that we've run into several measurements where just
the hour of measurement has been recorded without the corresponding
"date". We would prefer not to omit these data in the conversion,
because they are considered valid measurements (and play a role in
monthly summary statistics).
How do we represent this in a valid CF NetCDF format since we can't
use _FillValues for 'time'? Any suggestions for handling such
special cases?
Thanks,
Ajay
_______________________________________________
CF-metadata mailing list
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
------------------------------
Message: 2
Date: Tue, 25 Oct 2016 14:34:52 -0600
Subject: Re: [CF-metadata] Handling time when date is "missing"
Content-Type: text/plain; charset="utf-8"
Thank you for the constructive alternatives.
--Dave
An HTML attachment was scrubbed...
URL: <http://mailman.cgd.ucar.edu/pipermail/cf-metadata/
attachments/20161025/803102dd/attachment-0001.html>
------------------------------
Message: 3
Date: Wed, 26 Oct 2016 14:33:37 +0100
Subject: Re: [CF-metadata] Feedback requested on proposed CF Simple
Geometries
Content-Type: text/plain; charset=utf-8
Dear Ben and Bert
Thanks for your emails, which help me to understand the simple geometry
proposals better. Just to be clear, I'd like to repeat my first question.
You explain that the need is to specify spatial coordinates with a simple
geometry for a timeSeries variable. For example, this could be for the
discharge as a function of time across some line in a river (your example),
or I suppose it could be an average temperature as a function of time for
the Atlantic Ocean, where you wanted to supply the polygon which drew the
outline of the basin. Have I got the idea?
to which you replied
Yes, you have this mostly right. It's common to have a collection of
points (weather stations), lines (stream reaches), or polygons (hydrologic
catchments) with an associated time series
I was asking whether this means that for each *collection* (of points,
lines or polygons) there is a *single* timeseries. For instance, in your
example of a single geometry composed of several polygons, there is a
single number for each time. But that is not the case for weather stations;
for each weather station there is a timeseries, and at each time there is a
different number (value of temperature, precipitation or whatever) for each
weather station. You also write, "The US National Weather Service's
National Water Model (NWM) ... forecasts streamflow rates in about 2.7
million stream segments averaging 2km."
The stream network is a MultiLineString geometry, but I don't think there
is just one value of streamflow applying to the entire network at any given
time; I guess there is a different timeseries for each stream segment. But
in my example above, the Atlantic Ocean is a single polygon with a single
timeseries for its average temperature, not a different timeseries for each
node. Thus I am unclear about the dimensions of the data. In terms of your
original example, does the data have dimensions (time,geometry, where
geometry=1) or (time,node)?
This seems to me to be a crucial difference. In the former case the simple
geometry can be regarded as a more complex alternative to cell bounds - the
cell has a complicated geometry of nodes and lines, but it's still a single
cell. In the latter case you're providing many timeseries in an
unstructured geometry, which is what ugrid describes. Which do you have in
mind?
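The two candidate layouts can be pictured with plain nested lists. This is only a sketch of the shapes in question; the sizes (4 time steps, 36 nodes) and the placeholder values are assumptions for illustration.

```python
n_time = 4   # assumed number of time steps
n_node = 36  # assumed number of nodes in the geometry

# Case 1: the whole geometry (e.g. a basin outline) carries one timeseries.
# Dimensions are (time, geometry) with geometry = 1.
per_geometry = [[0.0] for _ in range(n_time)]

# Case 2: every node carries its own timeseries, as in an unstructured grid.
# Dimensions are (time, node).
per_node = [[0.0] * n_node for _ in range(n_time)]

def shape(array_2d):
    """Return (rows, columns) of a rectangular nested list."""
    return (len(array_2d), len(array_2d[0]))

shape_per_geometry = shape(per_geometry)
shape_per_node = shape(per_node)
```

In the first case the node coordinates only describe the outline of one cell; in the second they index the data itself.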
Nonetheless in both cases the geometries have to be described. I think the
difference is how we attach this description to the data or coordinates,
rather than how the description is constructed.
You propose the index variable in order for the convention to be like
ugrid. However this still seems to me to be an unnecessary complexity and
use of space if you aren't going to have many shared nodes. I think the
case for having another convention, distinct from ugrid, is stronger if it
is *unlike* ugrid in this respect, and therefore simpler as well.
I agree that repeating the inside/outside flag many times is wasteful.
That, coupled with your clarification that you may have several geometries,
each consisting of several elements (points, lines, polygons), means that
you need, in effect, a ragged array of ragged arrays
(geometry,element,node). This is more complicated than DSGs, but it seems
to me it would be reasonably easy to understand if your multi-geometry
example
https://github.com/bekozi/netCDF-CF-simple-geometry/wiki/VLEN-Arrays-in-NetCDF-3#multipolygon-example
geom=3;
part=11;
node=36;
int number_of_parts(geom);
number_of_parts:parts="number_of_nodes";
int number_of_nodes(part);
number_of_nodes:inout="inout";
char inout(part);
float x(node);
float y(node);
number_of_parts=6, 3, 2;
number_of_nodes=4, 3, 3, 3, 3, 3, 3, 5, 3, 3, 3;
inout="OIIIOOOIO";
x=0, 20, 20, 0, 1, 10, 19, 5, 7, 9, 11, 13, 15, 5, 9, 7, 11, 15, 13, -40,
-20, -45, -20, -10, -10, -30, -45, -30, -20, -20, 30, 45, 10, 25, 50, 30;
y=0, 0, 20, 20, 1, 5, 1, 15, 19, 15, 15, 19, 15, 25, 25, 29, 25, 25, 29,
-40, -45, -30, -35, -30, -10, -5, -20, -20, -15, -25, 20, 40, 40, 5, 10, 15;
where I assume that all polygons are closed.
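A reader could unpack this ragged array of ragged arrays by cutting the flat node list twice, first into parts and then into geometries. The Python sketch below uses the values exactly as listed above; the helper name split_by_counts is made up for the example. Note that the inout string as posted holds 9 flags while there are 11 parts, so the sketch reports that mismatch rather than guessing the missing flags.

```python
from itertools import accumulate

# Values copied from the listing above.
number_of_parts = [6, 3, 2]
number_of_nodes = [4, 3, 3, 3, 3, 3, 3, 5, 3, 3, 3]
inout = "OIIIOOOIO"
x = [0, 20, 20, 0, 1, 10, 19, 5, 7, 9, 11, 13, 15, 5, 9, 7, 11, 15, 13,
     -40, -20, -45, -20, -10, -10, -30, -45, -30, -20, -20,
     30, 45, 10, 25, 50, 30]
y = [0, 0, 20, 20, 1, 5, 1, 15, 19, 15, 15, 19, 15, 25, 25, 29, 25, 25, 29,
     -40, -45, -30, -35, -30, -10, -5, -20, -20, -15, -25,
     20, 40, 40, 5, 10, 15]

def split_by_counts(items, counts):
    """Cut a flat sequence into consecutive chunks of the given sizes."""
    ends = list(accumulate(counts))
    starts = [0] + ends[:-1]
    return [items[s:e] for s, e in zip(starts, ends)]

# The counts must be mutually consistent before anything is unpacked.
assert sum(number_of_parts) == len(number_of_nodes)
assert sum(number_of_nodes) == len(x) == len(y)

nodes = list(zip(x, y))
parts = split_by_counts(nodes, number_of_nodes)  # one node list per part
geoms = split_by_counts(parts, number_of_parts)  # one part list per geometry

# One inside/outside flag is expected per part; check before relying on it.
flags_match = len(inout) == len(number_of_nodes)
```

Since the polygons are assumed closed, each part stores its ring without repeating the first node at the end.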
What do you think?
Best wishes
Jonathan
------------------------------
Message: 4
Date: Wed, 26 Oct 2016 18:54:18 +0000
Subject: Re: [CF-metadata] Handling time when date is "missing"
Content-Type: text/plain; charset="us-ascii"
Jay,
You could use the variable time as a single value to establish the time
(in complete CF date format) of the first observation.
Another multidimensional array can then be used to store the time offset
(in hours or seconds etc.) of each measurement from variable time.
Or else convert the hourly measurements into a proper CF date format to
store in variable time.
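A small Python sketch of the offset idea, and of where it runs into trouble: a fully dated observation has exactly one offset from the first observation, but a record with only year, month and hour is compatible with one offset per day of that month. The dates and the function names are assumptions invented for the example.

```python
from calendar import monthrange
from datetime import datetime

first_obs = datetime(1880, 3, 14, 6)  # made-up time of the first observation

def offset_hours(when):
    """Offset of an observation from the first observation, in hours."""
    return (when - first_obs).total_seconds() / 3600.0

# A fully dated observation gives a single, well-defined offset.
dated = offset_hours(datetime(1880, 3, 15, 6))

def candidate_offsets(year, month, hour):
    """All offsets consistent with a record whose day of month is unknown."""
    days_in_month = monthrange(year, month)[1]
    return [offset_hours(datetime(year, month, day, hour))
            for day in range(1, days_in_month + 1)]

# An hour-only record (12:00, some day in March 1880) has many candidates.
candidates = candidate_offsets(1880, 3, 12)
```

Picking any one of the candidates would amount to inventing a date for the measurement.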
Date: Tuesday, October 25, 2016 at 11:07 AM
Subject: [CF-metadata] Handling time when date is "missing"
Hi All,
I have a user that's converting some IMMA format files to CF compliant
NetCDF files.
The problem is that we've run into several measurements where just the
hour of measurement has been recorded without the corresponding "date". We
would prefer not to omit these data in the conversion, because they are
considered valid measurements (and play a role in monthly summary
statistics).
How do we represent this in a valid CF NetCDF format since we can't use
_FillValues for 'time'? Any suggestions for handling such special cases?
Thanks,
Ajay
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.cgd.ucar.edu/pipermail/cf-metadata/
attachments/20161026/e7b7ecfa/attachment.html>
------------------------------
Subject: Digest Footer
_______________________________________________
CF-metadata mailing list
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
------------------------------
End of CF-metadata Digest, Vol 162, Issue 13
********************************************