acgc.netcdf

High-level tools for writing and reading netCDF files

Comparison to xarray:

  • For reading netCDF files, xarray is generally better than acgc.netcdf.
  • For creating new netCDF files, using write_geo_nc is often more concise than xarray.
  • For editing netCDF files (e.g. modifying a few attributes or variables), put_nc_att and put_nc_var can be convenient because the rest of the file remains unchanged, unlike xarray. (As of 2023-12, using xarray to read and write a netCDF file does not produce identical files. The command-line netCDF Operators (NCO) are another option for editing netCDF.)
# -*- coding: utf-8 -*-
"""High-level tools for writing and reading netCDF files

Comparison to xarray:
- For *reading* netCDF files, xarray is generally better than `acgc.netcdf`.
- For *creating* new netCDF files, using `write_geo_nc` is often more concise than xarray.
- For *editing* netCDF files (e.g. modifying a few attributes or variables),
`put_nc_att` and `put_nc_var` can be convenient because the rest of the file remains unchanged,
unlike xarray. (As of 2023-12, using xarray to read and write a netCDF file does *not* produce
identical files. The command-line netCDF Operators (NCO) are another option for editing netCDF.)
"""

from datetime import datetime
import warnings
import netCDF4 as nc
import numpy   as np
def get_nc_var(filename,varname):
    """Read a variable from a netCDF file

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable that will be retrieved

    Returns
    -------
    data : N-D array
        value of variable
    """

    # Open file for reading
    ncfile = nc.Dataset(filename,'r')

    # Get the desired variable
    data = ncfile.variables[varname][:]

    # Close the file
    ncfile.close()

    return data

def get_nc_att(filename,varname,attname,glob=False):
    """ Read an attribute from a netCDF file

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable
    attname : str
        name of attribute that will be retrieved
    glob : bool, default=False
        Set glob=True to access global file attributes (varname will be ignored)
        and glob=False for variable attributes

    Returns
    -------
    data : float or str
        attribute value
    """

    # Open file for reading
    ncfile = nc.Dataset(filename,'r')

    # Get the desired attribute
    if glob:
        data = ncfile.getncattr(attname)
    else:
        data = ncfile.variables[varname].getncattr(attname)

    # Close the file
    ncfile.close()

    return data

def get_nc_varnames(filename):
    """ Read variable names from a netCDF file

    Parameters
    ----------
    filename : str
        name/path of netCDF file

    Returns
    -------
    list of strings containing variable names within filename
    """

    # Open file for reading
    ncfile = nc.Dataset(filename,'r')

    # Get the variable names
    data = list(ncfile.variables.keys())

    # Close the file
    ncfile.close()

    return data

def get_nc_attnames(filename,varname,glob=False):
    """ Read attribute names from a netCDF file

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable
    glob : bool, default=False
        Set glob=True to access global file attributes (varname will be ignored)
        and glob=False for variable attributes

    Returns
    -------
    list of strings containing attribute names
    """

    # Open file for reading
    ncfile = nc.Dataset(filename,'r')

    # Get the attribute names
    if glob:
        data = ncfile.ncattrs()
    else:
        data = ncfile.variables[varname].ncattrs()

    # Close the file
    ncfile.close()

    return data

def put_nc_var(filename,varname,value):
    """ Assign a new value to an existing variable in an existing file

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable that will be assigned
    value : N-D array
        data values that will be assigned to variable;
        must have same shape as the current variable values
    """

    # Open file for reading and writing
    ncfile = nc.Dataset(filename,'r+')

    # Set value
    ncfile.variables[varname][:] = value

    # Close the file
    ncfile.close()

def put_nc_att(filename,varname,attname,value,glob=False):
    """ Assign a new value to an existing attribute

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable
    attname : str
        name of attribute that will be assigned
    value : str, float, list
        data values that will be assigned to the attribute
    glob : bool, default=False
        Set glob=True to assign a global file attribute (varname will be ignored)
        and glob=False for variable attributes
    """

    # Open file for reading and writing
    ncfile = nc.Dataset(filename,'r+')

    # Set attribute
    if glob:
        ncfile.setncattr(attname,value)
    else:
        ncfile.variables[varname].setncattr(attname,value)

    # Close the file
    ncfile.close()

def write_geo_nc(filename, variables,
                xDim=None, yDim=None,
                zDim=None, zUnits=None,
                tDim=None, tUnits=None,
                globalAtt=None,
                classic=True, nc4=True, compress=True,
                clobber=False, verbose=False ):
    '''Create a netCDF file with geospatial data. Output file is COARDS/CF compliant

    This function allows specification of netCDF files more concisely than many alternative
    python packages (e.g. xarray, netCDF4) by making assumptions about the dimensions and
    inferring the dimensions for each variable from the variable shape. This is well suited
    for many lat-lon-lev-time and xyzt datasets.

    Each variable is defined as a dict.

    Required keys:
    -    'name'  (str)       variable name
    -    'value' (array)     N-D array of variable data

    Special keys (all optional):
    -    'dim_names' (list of str)  names of the dimension variables corresponding to dimensions of variable
        -    If dim_names is not provided, the dimension variables will be inferred from the data shape.
        -    If all dimensions have unique lengths, the inferred dimensions are unambiguous.
        -    If two or more dimensions have equal lengths, then the dim_names key should be used.
    -    'fill_value' (numeric) value that should replace NaNs
    -    'unlimited' (bool)  specifies if dimension is unlimited
    -    'pack'      (bool)  specifies that variable should be compressed with integer packing
    -    'packtype'  (str, default='i2')   numeric type for packed data, commonly i1 or i2
    -    'calendar'  (str)   string for COARDS/CF calendar convention. Only used for time variable

    All other keys are assigned to variable attributes. CF conventions expect the following:
    -    'long_name' (str)   long name for variable
    -    'units'     (str)   units of variable

    Example: ```{'name': 'O3',
          'value': data,
          'long_name': 'ozone mole fraction',
          'units': 'mol/mol'}```


    Parameters
    ----------
    filename : str
        name/path for file that will be created
    variables : list of dict-like
        Each variable is specified as a dict, as described above.
    xDim : array or dict-like, optional
        x dimension of data. If dict-like, then it should contain same keys as variables.
        If xDim is an array, then it is assumed to be longitude in degrees east and named 'lon'
    yDim : array or dict-like, optional
        y dimension of data. If dict-like, then it should contain same keys as variables.
        If yDim is an array, then it is assumed to be latitude in degrees north and named 'lat'
    zDim : array or dict-like, optional
        z dimension of data. If dict-like, then it should contain same keys as variables.
        If zDim is an array, then it is named 'lev'.
        zUnits is used to infer the variable long name:
        -    m, km   -> zDim is "altitude"
        -    Pa, hPa -> zDim is "pressure"
        -    None    -> zDim is "level"
    zUnits : {'m','km','Pa','hPa','level','',None}, optional
        Units for zDim. Ignored if zDim is dict-like.
    tDim : array or dict-like, optional
        time dimension of data. If dict-like, then it should contain the same keys as variables.
        If tDim is an array, then tUnits is used and the dimension is set as unlimited and
        named 'time'. datetime-like values are supported, as are floats and other numeric types.
    tUnits : str, optional
        Units for tDim. Special treatment will be used for ``"<time units> since <date>"``
    globalAtt : dict-like, optional
        dict of global file attributes
    classic : bool, default=True
        specify whether file should use netCDF classic data model (includes netCDF4 classic)
    nc4 : bool, default=True
        specify whether file should be netCDF4. Required for compression.
    compress : bool, default=True
        specify whether all variables should be compressed (lossless).
        In addition to lossless compression, setting pack=True for individual variables enables
        lossy integer packing compression.
    clobber : bool, default=False
        specify whether a pre-existing file with the same name should be overwritten
    verbose : bool, default=False
        specify whether extra output should be written while creating the netCDF file
    '''

    # NetCDF file type
    if nc4:
        if classic:
            ncfmt = 'NETCDF4_CLASSIC'
        else:
            ncfmt = 'NETCDF4'
    else:
        ncfmt = 'NETCDF3_64BIT_OFFSET'

    ### Open file for output

    f = nc.Dataset( filename, 'w', format=ncfmt, clobber=clobber )
    f.Conventions = "COARDS/CF"
    f.History = datetime.now().strftime('%Y-%m-%d %H:%M:%S') + \
        ' : Created by write_geo_nc (python)'

    # Write global attributes, if any
    if globalAtt is not None:
        f.setncatts(globalAtt)

    ### Define dimensions

    dimName = []
    dimList = []
    dimSize = []
    varList = []

    if xDim is not None:
        if not isinstance(xDim, dict):
            xDim = {'name': 'lon',
                    'value': xDim,
                    'long_name': 'longitude',
                    'units': 'degrees_east'}

        ncDim, ncVar = _create_geo_dim( xDim, f,
                                      compress=compress,
                                      classic=classic,
                                      verbose=verbose )

        dimName.append( ncDim.name )
        dimList.append( ncDim )
        varList.append( ncVar )
        dimSize.append( len(ncVar[:]) )

    if yDim is not None:
        if not isinstance(yDim, dict):
            yDim = {'name': 'lat',
                    'value': yDim,
                    'long_name': 'latitude',
                    'units': 'degrees_north'}

        ncDim, ncVar = _create_geo_dim( yDim, f,
                                      compress=compress,
                                      classic=classic,
                                      verbose=verbose )

        dimName.append( ncDim.name )
        dimList.append( ncDim )
        varList.append( ncVar )
        dimSize.append( len(ncVar[:]) )

    if zDim is not None:
        if not isinstance(zDim, dict):
            # Infer name from units
            if zUnits in ['m','km']:
                lname = 'altitude'
            elif zUnits in ['Pa','hPa']:
                lname = 'pressure'
            elif zUnits in ['level','',None]:
                lname = 'level'
                zUnits = ''
            else:
                raise ValueError( f'Units of {zUnits} for zDim have not been implemented' )

            zDim = {'name': 'lev',
                    'value': zDim,
                    'long_name': lname,
                    'units': zUnits}

        ncDim, ncVar = _create_geo_dim( zDim, f,
                                      compress=compress,
                                      classic=classic,
                                      verbose=verbose )

        dimName.append( ncDim.name )
        dimList.append( ncDim )
        varList.append( ncVar )
        dimSize.append( len(ncVar[:]) )

    if tDim is not None:
        if not isinstance(tDim, dict):
            if tUnits is None:
                tUnits = ''
            tDim = {'name': 'time',
                    'value': tDim,
                    'long_name': 'time',
                    'units': tUnits,
                    'unlimited': True }

        ncDim, ncVar = _create_geo_dim( tDim, f,
                                      compress=compress,
                                      classic=classic,
                                      time=True,
                                      verbose=verbose )

        dimName.append( ncDim.name )
        dimList.append( ncDim )
        varList.append( ncVar )
        dimSize.append( len(ncVar[:]) )

    # Dimension sizes that are not unique, i.e. shared by two or more dimensions
    duplicate_dim_sizes = {s for s in dimSize if dimSize.count(s) > 1}

    ### Define variables

    for var in variables:

        if not isinstance(var, dict):
            raise TypeError( 'All variables must be passed as dicts' )

        # Shape of the variable
        vShape = var['value'].shape

        # Use dim_names if provided; otherwise infer dimensions from shape
        if 'dim_names' in var.keys():

            # Set dimensions based on provided names
            dID = var['dim_names']

            # Confirm that correct number of dimensions are provided
            if len(dID) != len(vShape):
                raise ValueError( 'Variable {:s} has dimension {:d} and {:d} dim_names'.\
                                 format( var['name'], len(vShape), len(dID) ) )
            # Shape of each named dimension
            dShape = tuple( dimSize[dimName.index(d)] for d in dID )

            # Confirm that the named dimensions match the shape of the variable
            if dShape != vShape:
                raise ValueError( 'Shape of the dimensions [{:s}]=[{:s}] must match '.\
                                 format( ','.join(dID),','.join([str(i) for i in dShape]) )
                                 + 'the shape of variable {:s} which is [{:s}]'.\
                                 format( var['name'],','.join([str(i) for i in vShape]) ) )

        else:

            # If this variable uses any of the duplicate dims, give a warning
            # (np.isin does not accept a set, so convert to a list first)
            if np.any(np.isin(vShape, list(duplicate_dim_sizes))):
                warnings.warn('Dimensions of variable {:s} cannot be uniquely identified.\n'.\
                              format(var['name'])
                              +'The "dim_names" key is strongly recommended.')

            # Match dimensions of variable with defined dimensions based on shape
            dID = []
            for s in vShape:

                # Find the dimension that matches the variable dimension
                try:
                    i = dimSize.index(s)
                except ValueError as exc:
                    # No dimensions match
                    raise ValueError( 'Cannot match dimensions for variable {:s}'.\
                                     format( var['name'] ) ) from exc

                # List of the dimension names
                dID.append(dimName[i])

        # Create the variable
        ncVar = _create_geo_var( var, f, dID, compress=compress, classic=classic, verbose=verbose )

        varList.append( ncVar )

    # Close the file
    f.close()

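The shape-based matching above can be illustrated in isolation. A simplified sketch of the inference step (dimension names and sizes here are hypothetical):

```python
# Simplified sketch of the shape-based dimension inference used by
# write_geo_nc; dimension names and sizes here are hypothetical.
dimName = ['lon', 'lat', 'lev', 'time']
dimSize = [72, 46, 47, 12]

def infer_dims(shape):
    """Match each axis length to the first defined dimension of that size."""
    return tuple(dimName[dimSize.index(s)] for s in shape)

# Unambiguous: every dimension has a unique length
print(infer_dims((12, 47, 46, 72)))  # ('time', 'lev', 'lat', 'lon')

# If two dimensions had equal lengths, index() would always return the
# first match, which is why the 'dim_names' key is recommended then.
```

This is why the docstring insists on `dim_names` whenever two dimensions share a length: `list.index` cannot distinguish them.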
def _create_geo_dim( var, fid, **kwargs ):
    '''Create a netCDF dimension and associated variable in an open file

    Parameters
    ----------
    var : dict-like
        Must contain keys for 'name', 'value'. Other keys will be assigned to attributes,
        e.g. 'unlimited' for unlimited dimension.
    fid : netCDF file ID
        reference to an open file

    Returns
    -------
    ncDim, ncVar :
        IDs for the dimension and variable that were created
    '''

    try:
        assert ('name' in var), \
            'Var structure must contain "name" key: create_geo_dim'
        assert ('value' in var), \
            'Var structure must contain "value" key: create_geo_dim'
    except AssertionError as exc:
        # Close file and exit
        fid.close()
        raise IOError() from exc

    size = len(var['value'])

    # If the unlimited tag is True, then set size=None to create unlimited dimension
    if var.get('unlimited', False):
        size = None

    ncDim = fid.createDimension( var['name'], size )

    ncVar = _create_geo_var( var, fid, (var['name'],), isDim=True, **kwargs )

    return ncDim, ncVar

def _get_nc_type( value, name='', classic=True ):
    '''Get netCDF variable type for a python variable

    Parameters
    ----------
    value : any data type
    name : str (optional)
        name of the variable being examined. Only used for error messages
    classic : bool
        If classic=True, then error will be raised if variable type is not
        allowed in netCDF classic files

    Returns
    -------
    vartype : str

    '''

    # Variable type for numpy arrays or native python
    try:
        vartype = value.dtype.str
    except AttributeError:
        vartype = type(value).__name__

    # Remove byte-order characters
    vartype = vartype.replace('<','').replace('>','')

    # Raise error if the type isn't allowed
    if (classic and (vartype not in ['i1','i2','i4','f4','f8','U1','str','bytes'])):
        raise TypeError( 'Variable {:s} cannot have type {:s} in netCDF classic files'.format(
            name, vartype ) )

    return vartype

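As a quick illustration of the type-string handling above: numpy exposes a dtype string with a leading byte-order character, which is stripped to leave the bare netCDF type code.

```python
# Numpy dtype strings carry a byte-order prefix ('<' little-endian,
# '>' big-endian) that is stripped to get the netCDF type code.
import numpy as np

x = np.zeros(3, dtype=np.float32)
raw = x.dtype.str             # '<f4' on little-endian machines
vartype = raw.replace('<', '').replace('>', '')
print(vartype)  # f4
```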
def _infer_pack_scale_offset( data, nbits ):
    '''Compute scale and offset for packing data to nbit integers
    Follow Unidata recommendations:
    https://www.unidata.ucar.edu/software/netcdf/documentation/NUG/_best_practices.html

    Parameters
    ----------
    data : N-D array
        data to be compressed
    nbits : int
        number of bits in the packed data type

    Returns
    -------
    scale_factor, add_offset : numeric
        recommended values for scale and offset
    pack_fill_value : numeric
        recommended fill value within the integer range
    '''
    dmin = np.nanmin( data )
    dmax = np.nanmax( data )
    scale_factor = ( dmax - dmin ) / (2**nbits - 2)
    add_offset   = ( dmax + dmin ) / 2

    # Use a fill value within the packed range
    pack_fill_value   = -2**(nbits-1)

    return scale_factor, add_offset, pack_fill_value

def _integer_pack_data( data, scale_factor, add_offset, fill_value, pack_fill_value ):
    '''Pack data using the provided scale, offset, and fill value

    Parameters
    ----------
    data : N-D array
        data values to be packed
    scale_factor, add_offset : float
        scale and offset for packing
    fill_value : numeric
        fill value for the unpacked data
    pack_fill_value : numeric
        fill value for the packed data, must be in the allowed range of the packed integer type

    Returns
    -------
    pack_value : N-D array
        data values converted to the integer range
    '''

    # Packed values
    pack_value = ( data[:] - add_offset ) / scale_factor

    # Set missing value
    if np.ma.is_masked( data ):
        pack_value[ data.mask ] = pack_fill_value
    if fill_value is not None:
        pack_value[ data==fill_value ] = pack_fill_value
    if np.any( np.isnan(data) ):
        pack_value[ np.isnan( data ) ] = pack_fill_value
    if np.any( np.isinf(data) ):
        pack_value[ np.isinf( data ) ] = pack_fill_value

    return np.round( pack_value )

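The two packing helpers above can be exercised together on a small array. A sketch of the 8-bit round trip (values chosen for illustration, following the same scale/offset formulas):

```python
# Sketch of the integer-packing round trip implemented by
# _infer_pack_scale_offset and _integer_pack_data (8-bit example).
import numpy as np

data = np.array([0.0, 25.0, np.nan, 100.0])
nbits = 8

# Scale/offset per the Unidata recommendation
dmin, dmax = np.nanmin(data), np.nanmax(data)
scale_factor = (dmax - dmin) / (2**nbits - 2)   # ~0.394
add_offset = (dmax + dmin) / 2                  # 50.0
pack_fill_value = -2**(nbits - 1)               # -128

# Pack: scale into the integer range, replace NaN with the fill value
packed = np.round((data - add_offset) / scale_factor)
packed[np.isnan(data)] = pack_fill_value

# Unpack (what netCDF readers do automatically): packed*scale + offset
unpacked = packed * scale_factor + add_offset
print(packed[0], packed[3])  # -127.0 127.0
```

The data range maps onto [-127, 127], leaving -128 free as the fill value, and unpacking recovers the endpoints to within half a scale step.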
def _create_geo_var( var, fid, dimIDs, compress=True, classic=True, time=False,
                    fill_value=None, verbose=False, isDim=False ):
    '''Create a netCDF variable, assuming common geospatial conventions

    Parameters
    ----------
    var : dict-like
        See write_geo_nc for required and optional dict keys.
    fid : netCDF file ID
        reference to an open file
    dimIDs : list
        list of dimension ID numbers corresponding to the dimensions of var.value
    compress : bool, default=True
        compress=True indicates that variable should be deflated with lossless compression
    classic : bool, default=True
        specify if file is netCDF Classic or netCDF4 Classic
    time : bool, default=False
        specify if variable has time units that require handling with calendar
    isDim : bool, default=False
        indicates dimension variables

    Returns
    -------
    ncVar : netCDF variable ID that was created
    '''
    try:
        assert ('name' in var), \
            'Var structure must contain "name" key: create_geo_var'
        assert ('value' in var), \
            'Var structure must contain "value" key: create_geo_var'
    except AssertionError as exc:
        # Close file and exit
        fid.close()
        raise IOError from exc

    # If this is a time variable and units are "<time units> since <date>", then convert
    if (time and (' since ' in var.get('units','')) ):
        calendar = var.get('calendar', 'standard')
        var['value'] = nc.date2num( var['value'], units=var['units'], calendar=calendar)

    # Variable type
    vartype = _get_nc_type( var['value'], var['name'], classic )

    # Fill value, if any
    if 'fill_value' in var.keys():
        fill_value = var['fill_value']
    else:
        fill_value = None

    ### Progress towards packing variables as integers;
    ### Appears to be working, but only minimal testing so far

    # Check if packing keywords are set
    if isDim:
        # Dimension variables should not be packed
        pack = False
    else:
        # Use pack key if provided; default to pack=False
        pack = var.get('pack', False)

    # Pack to integer data, if requested
    if pack:

        # Integer type for packed data
        packtype = var.get('packtype', 'i2')
        vartype = packtype

        # Number of bits in the packed data type
        if packtype=='i2':
            n = 16
        elif packtype=='i1':
            n = 8
        else:
            raise ValueError(f'Packing to type {packtype:s} has not been implemented' )

        # Get scale factor, offset and fill value for packing
        scale_factor, add_offset, pack_fill_value = _infer_pack_scale_offset( var['value'][:], n )

        var['scale_factor'] = scale_factor
        var['add_offset']   = add_offset

        # Scale data into the integer range
        pack_value = _integer_pack_data( var['value'],
                                        scale_factor,
                                        add_offset,
                                        fill_value,
                                        pack_fill_value )

        # Rename
        fill_value = pack_fill_value
        var['value'] = pack_value

    else:
        scale_factor = None
        add_offset   = None
    var.pop('pack',None)
    var.pop('packtype',None)
    ###

    # Create the variable
    ncVar = fid.createVariable( var['name'], vartype, dimIDs,
                             zlib=compress, complevel=2,
                             fill_value=fill_value )

    # Write variable values
    ncVar[:] = var['value'][:]

    # These keys are not attributes, so remove them
    var.pop('name',None)
    var.pop('value',None)
    var.pop('unlimited',None)
    var.pop('dim_names',None)
    var.pop('fill_value',None)

    # Save the remaining attributes
    ncVar.setncatts(var)

    return ncVar
def get_nc_var(filename, varname):
19def get_nc_var(filename,varname):
20    """Read a variable from a netCDF file
21    
22    Parameters
23    ----------
24    filename : str
25        name/path of netCDF file
26    varname : str
27        name of variable that will be retrieved
28        
29    Returns
30    -------
31    data : N-D array
32        value of variable
33    """
34
35    # Open file for reading
36    ncfile = nc.Dataset(filename,'r')
37
38    # Get the desired variable
39    data = ncfile.variables[varname][:]
40
41    # Close the file
42    ncfile.close()
43
44    return data

Read a variable from a netCDF file

Parameters
  • filename (str): name/path of netCDF file
  • varname (str): name of variable that will be retrieved
Returns
  • data (N-D array): value of variable
def get_nc_att(filename, varname, attname, glob=False):
46def get_nc_att(filename,varname,attname,glob=False):
47    """ Read an attribute from a netCDF file
48          
49    Parameters
50    ----------
51    filename : str
52        name/path of netCDF file
53    varname : str
54        name of variable 
55    attname : str
56        name of attribute that will be retrieved
57    glob : bool, default=False
58        Set glob=True to access global file attribues (varname will be ignored) 
59        and glob=False for variable attributes
60        
61    Returns
62    -------
63    data : float or str
64        attribute value
65    """
66
67    # Open file for reading
68    ncfile = nc.Dataset(filename,'r')
69
70    # Get the desired attribute
71    if glob:
72        data = ncfile.getncattr(attname)
73    else:
74        data = ncfile.variables[varname].getncattr(attname)
75
76    # Close the file
77    ncfile.close()
78
79    return data

Read an attribute from a netCDF file

Parameters
  • filename (str): name/path of netCDF file
  • varname (str): name of variable
  • attname (str): name of attribute that will be retrieved
  • glob (bool, default=False): Set glob=True to access global file attribues (varname will be ignored) and glob=False for variable attributes
Returns
  • data (float or str): attribute value
def get_nc_varnames(filename):
 81def get_nc_varnames(filename):
 82    """ Read variable names from a netCDF file
 83    
 84    Parameters
 85    ----------
 86    filename : str
 87        name/path of netCDF file
 88    
 89    Returns
 90    -------
 91    list of strings containing variable names within filename   
 92    """
 93
 94    # Open file for reading
 95    ncfile = nc.Dataset(filename,'r')
 96
 97    # Get the desired variable
 98    data = list(ncfile.variables.keys())
 99
100    # Close the file
101    ncfile.close()
102
103    return data

Read variable names from a netCDF file

Parameters
  • filename (str): name/path of netCDF file
Returns
  • list of strings containing variable names within filename
def get_nc_attnames(filename, varname, glob=False):
105def get_nc_attnames(filename,varname,glob=False):
106    """ Read attributes from a netCDF file
107    
108    Parameters
109    ----------
110    filename : str
111        name/path of netCDF file
112    varname : str
113        name of variable
114    glob : bool, default=False
115        Set glob=True to access global file attribues (varname will be ignored) 
116        and glob=False for variable attributes        
117    
118    Returns
119    -------
120    list of strings containing attribute names   
121    """
122
123    # Open file for reading
124    ncfile = nc.Dataset(filename,'r')
125
126    # Get the attribute names
127    if glob:
128        data = ncfile.ncattrs()
129    else:  
130        data = ncfile.variables[varname].ncattrs()
131
132    # Close the file
133    ncfile.close()
134
135    return data

Read attributes from a netCDF file

Parameters
  • filename (str): name/path of netCDF file
  • varname (str): name of variable
  • glob (bool, default=False): Set glob=True to access global file attribues (varname will be ignored) and glob=False for variable attributes
Returns
  • list of strings containing attribute names
def put_nc_var(filename, varname, value):
137def put_nc_var(filename,varname,value):
138    """ Assign a new value to an existing variable and existing file
139    
140    Parameters
141    ----------
142    filename : str
143        name/path of netCDF file
144    varname : str
145        name of variable that will be assigned
146    value : N-D array
147        data values that will be assigned to variable
148        must have same shape as the current variable values
149    """
150
151    # Open file for reading
152    ncfile = nc.Dataset(filename,'r+')
153
154    # Set value
155    ncfile.variables[varname][:] = value
156
157    # Close the file
158    ncfile.close()

Assign a new value to an existing variable and existing file

Parameters
  • filename (str): name/path of netCDF file
  • varname (str): name of variable that will be assigned
  • value (N-D array): data values that will be assigned to variable must have same shape as the current variable values
def put_nc_att(filename, varname, attname, value, glob=False):
160def put_nc_att(filename,varname,attname,value,glob=False):
161    """ Assign a new value to an existing attribute
162    
163    Parameters
164    ----------
165    filename : str
166        name/path of netCDF file
167    varname : str
168        name of variable
169    attname : str
170        name of attribute that will be assigned
171    value : str, float, list
172        data values that will be assigned to the attribute
173    """
174
175    # Open file for reading
176    ncfile = nc.Dataset(filename,'r+')
177
178    # Set attribute
179    if glob:
180        ncfile.setncattr(attname,value)
181    else:
182        ncfile.variables[varname].setncattr(attname,value)
183
184    # Close the file
185    ncfile.close()

Assign a new value to an existing attribute

Parameters
  • filename (str): name/path of netCDF file
  • varname (str): name of variable
  • attname (str): name of attribute that will be assigned
  • value (str, float, list): data values that will be assigned to the attribute
  • glob (bool, default=False): Set glob=True to assign a global file attribute (varname will be ignored) and glob=False for variable attributes
def write_geo_nc( filename, variables, xDim=None, yDim=None, zDim=None, zUnits=None, tDim=None, tUnits=None, globalAtt=None, classic=True, nc4=True, compress=True, clobber=False, verbose=False):
187def write_geo_nc(filename, variables,
188                xDim=None, yDim=None,
189                zDim=None, zUnits=None,
190                tDim=None, tUnits=None,
191                globalAtt=None,
192                classic=True, nc4=True, compress=True,
193                clobber=False, verbose=False ):
194    '''Create a NetCDF file with geospatial data. Output file is COARDS/CF compliant
195    
196    This function allows specification of netCDF files more concisely than many alternative
197    python packages (e.g. xarray, netCDF4) by making assumptions about the dimensions and 
198    inferring the dimensions for each variable from the variable shape. This is well suited 
199    for many lat-lon-lev-time and xyzt datasets. 
200
201    Each variable is defined as a dict. 
202
203    Required keys: 
204    -    'name'  (str)       variable name 
205    -    'value' (array)     N-D array of variable data 
206    
207    Special keys (all optional):
208    -    'dim_names' (list of str)  names of the dimension variables corresponding to dimensions of variable
209        -    If dim_names is not provided, the dimension variables will be inferred from the data shape.
210        -    If all dimensions have unique lengths, the inferred dimensions are unambiguous. 
211        -    If two or more dimensions have equal lengths, then the dim_names key should be used.
212    -    'fill_value' (numeric) value that should replace NaNs
213    -    'unlimited' (bool)  specifies if dimension is unlimited
214    -    'pack'      (bool)  specifies that variable should be compressed with integer packing
215    -    'packtype'  (str, default='i2')   numeric type for packed data, commonly i1 or i2
216    -    'calendar'  (str)   string for COARDS/CF calendar convention. Only used for time variable
217    
218    All other keys are assigned to variable attributes. CF conventions expect the following:
219    -    'long_name' (str)   long name for variable
220    -    'units'     (str)   units of variable
221
222    Example: ```{'name': 'O3',
223          'value': data,
224          'long_name': 'ozone mole fraction',
225          'units': 'mol/mol'}```
226    
227
228    Parameters
229    ----------
230    filename : str
231        name/path for file that will be created
232    variables : list of dict-like
233        Each variable is specified as a dict, as described above.
234    xDim : array or dict-like, optional
235        x dimension of data. If dict-like, then it should contain same keys as variables.
236        If xDim is an array, then it is assumed to be longitude in degrees east and named 'lon'
237    yDim : array or dict-like, optional
238        y dimension of data. If dict-like, then it should contain same keys as variables.
239        If yDim is an array, then it is assumed to be latitude in degrees north and named 'lat'
240    zDim : array or dict-like, optional
241        z dimension of data. If dict-like, then it should contain same keys as variables.
242        If zDim is an array, then it is named 'lev'
243        zUnits is used to infer the variable long name:
244        -    m, km   -> zDim is "altitude"
245        -    Pa, hPa -> zDim is "pressure"
246        -    None    -> zDim is "level"
247    zUnits : {'m', 'km', 'Pa', 'hPa', 'level', '', None}
248        Units for zDim. Ignored if zDim is dict-like.
249    tDim : array or dict-like, optional
250        time dimension of data. If dict-like, then it should contain the same keys as variables.
251        If tDim is an array, then tUnits is used and the dimension is set as unlimited and 
252        named 'time'. datetime-like values are supported, as are floats and other numeric types.
253    tUnits : str, optional
254        Units for tDim. Special treatment will be used for ``"<time units> since <date>"``
255    globalAtt : dict-like, optional
256        dict of global file attributes
257    classic : bool, default=True
258        specify whether file should use netCDF classic data model (includes netCDF4 classic)
259    nc4 : bool, default=True
260        specify whether file should be netCDF4. Required for compression.
261    compress : bool, default=True
262        specify whether all variables should be compressed (lossless). 
263        In addition to lossless compression, setting pack=True for individual variables enables 
264        lossy integer packing compression.
265    clobber : bool, default=False
266        specify whether a pre-existing file with the same name should be overwritten
267    verbose : bool, default=False
268        specify whether extra output should be written while creating the netCDF file
269    '''
270
271    # NetCDF file type
272    if nc4:
273        if classic:
274            ncfmt = 'NETCDF4_CLASSIC'
275        else:
276            ncfmt = 'NETCDF4'
277    else:
278        ncfmt = 'NETCDF3_64BIT_OFFSET'
279
280    ### Open file for output
281
282    f = nc.Dataset( filename, 'w', format=ncfmt, clobber=clobber )
283    f.Conventions = "COARDS/CF"
284    f.History = datetime.now().strftime('%Y-%m-%d %H:%M:%S') + \
285        ' : Created by write_geo_nc (python)'
286
287    # Write global attributes, if any
288    if globalAtt is not None:
289        f.setncatts(globalAtt)
290
291    ### Define dimensions
292
293    dimName = []
294    dimList = []
295    dimSize = []
296    varList = []
297
298    if xDim is not None:
299        if not isinstance(xDim, dict):
300            xDim = {'name': 'lon',
301                    'value': xDim,
302                    'long_name': 'longitude',
303                    'units': 'degrees_east'}    
304
305        ncDim, ncVar = _create_geo_dim( xDim, f,
306                                      compress=compress,
307                                      classic=classic,
308                                      verbose=verbose )
309
310        dimName.append( ncDim.name )
311        dimList.append( ncDim )
312        varList.append( ncVar )
313        dimSize.append( len(ncVar[:]) )
314
315    if yDim is not None:
316        if not isinstance(yDim, dict):
317            yDim = {'name': 'lat',
318                    'value': yDim,
319                    'long_name': 'latitude',
320                    'units': 'degrees_north'}
321
322        ncDim, ncVar = _create_geo_dim( yDim, f,
323                                      compress=compress,
324                                      classic=classic,
325                                      verbose=verbose )
326
327        dimName.append( ncDim.name )
328        dimList.append( ncDim )
329        varList.append( ncVar )
330        dimSize.append( len(ncVar[:]) )
331
332    if zDim is not None:
333        if not isinstance(zDim, dict):
334            # Infer name from units
335            if zUnits in ['m','km']:
336                lname = 'altitude'
337            elif zUnits in ['Pa','hPa']:
338                lname = 'pressure'
339            elif zUnits in ['level','',None]:
340                lname = 'level'
341                zUnits = ''
342            else:
343                raise ValueError( f'Units of {zUnits:s} for zDim have not been implemented' )
344
345            zDim = {'name': 'lev',
346                    'value': zDim,
347                    'long_name': lname,
348                    'units': zUnits}
349
350        ncDim, ncVar = _create_geo_dim( zDim, f,
351                                      compress=compress,
352                                      classic=classic,
353                                      verbose=verbose )
354
355        dimName.append( ncDim.name )
356        dimList.append( ncDim )
357        varList.append( ncVar )
358        dimSize.append( len(ncVar[:]) )
359
360    if tDim is not None:
361        if not isinstance(tDim, dict):
362            if tUnits is None:
363                tUnits = ''
364            tDim = {'name': 'time',
365                    'value': tDim,
366                    'long_name': 'time',
367                    'units': tUnits,
368                    'unlimited': True }
369
370        ncDim, ncVar = _create_geo_dim( tDim, f,
371                                      compress=compress,
372                                      classic=classic,
373                                      time=True,
374                                      verbose=verbose )
375
376        dimName.append( ncDim.name )
377        dimList.append( ncDim )
378        varList.append( ncVar )
379        dimSize.append( len(ncVar[:]) )
380
381    # Dimension sizes that are not unique i.e. shared by two or more dimensions
382    duplicate_dim_sizes = {s for s in dimSize if dimSize.count(s) > 1}
383
384    ### Define variables
385
386    for var in variables:
387
388        if type(var) is not dict:
389            raise TypeError( 'All variables must be passed as dicts' )
390
391        # Shape of the variable
392        vShape = var['value'].shape
393
394        # If dim_names is provided, otherwise use inference
395        if 'dim_names' in var.keys():
396
397            # Set dimensions based on provided names
398            dID = var['dim_names']
399
400            # Confirm that correct number of dimensions are provided
401            if len(dID) != len(vShape):
402                raise ValueError( 'Variable {:s} has dimension {:d} and {:d} dim_names'.\
403                                 format( var['name'], len(vShape), len(dID) ) )
404            # Shape of each named dimension
405            dShape = tuple( dimSize[dimName.index(d)] for d in dID )
406
407            # Confirm that the named dimensions match the shape of the variable
408            if dShape != vShape:
409                raise ValueError( 'Shape of the dimensions [{:s}]=[{:s}] must match '.\
410                                 format( ','.join(dID),','.join([str(i) for i in dShape]) )
411                                 + 'the shape of variable {:s} which is [{:s}]'.\
412                                 format( var['name'],','.join([str(i) for i in vShape]) ) )
413
414        else:
415
416            # If this variable uses any of the duplicate dims, give a warning 
417            if np.any(np.isin(vShape, list(duplicate_dim_sizes))):
418                warnings.warn('Dimensions of variable {:s} cannot be uniquely identified.\n'.\
419                              format(var['name'])
420                              +'The "dim_names" key is strongly recommended.')
421
422            # Match dimensions of variable with defined dimensions based on shape
423            dID = []
424            for s in vShape:
425
426                # Find the dimensions that match the variable dimension
427                try:
428                    i = dimSize.index(s)
429                except ValueError as exc:
430                    # No dimensions match
431                    raise ValueError( 'Cannot match dimensions for variable {:s}'.\
432                                     format( var['name'] ) ) from exc
433
434                # List of the dimension names
435                dID.append(dimName[i])
436
437        # Create the variable
438        ncVar = _create_geo_var( var, f, dID, compress=compress, classic=classic, verbose=verbose )
439
440        varList.append( ncVar )
441
442    # Close the file
443    f.close()

Create a NetCDF file with geospatial data. Output file is COARDS/CF compliant

This function allows specification of netCDF files more concisely than many alternative python packages (e.g. xarray, netCDF4) by making assumptions about the dimensions and inferring the dimensions for each variable from the variable shape. This is well suited for many lat-lon-lev-time and xyzt datasets.

Each variable is defined as a dict.

Required keys:

  • 'name' (str) variable name
  • 'value' (array) N-D array of variable data

Special keys (all optional):

  • 'dim_names' (list of str) names of the dimension variables corresponding to dimensions of variable
    • If dim_names is not provided, the dimension variables will be inferred from the data shape.
    • If all dimensions have unique lengths, the inferred dimensions are unambiguous.
    • If two or more dimensions have equal lengths, then the dim_names key should be used.
  • 'fill_value' (numeric) value that should replace NaNs
  • 'unlimited' (bool) specifies if dimension is unlimited
  • 'pack' (bool) specifies that variable should be compressed with integer packing
  • 'packtype' (str, default='i2') numeric type for packed data, commonly i1 or i2
  • 'calendar' (str) string for COARDS/CF calendar convention. Only used for time variable

All other keys are assigned to variable attributes. CF conventions expect the following:

  • 'long_name' (str) long name for variable
  • 'units' (str) units of variable

Example: {'name': 'O3', 'value': data, 'long_name': 'ozone mole fraction', 'units': 'mol/mol'}
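The 'pack' and 'packtype' keys enable lossy integer packing. The exact scheme `write_geo_nc` applies is not shown in this excerpt, but CF-convention packing is conventionally a linear mapping onto small integers via `scale_factor` and `add_offset` attributes; the sketch below illustrates that conventional approach for a 16-bit 'i2' packtype (function names are illustrative, not part of `acgc.netcdf`).

```python
# Sketch of CF-style integer packing for packtype 'i2' (int16).
# Illustrative only: the exact scheme used by write_geo_nc is not shown here.
import numpy as np

def pack_values(data, nbits=16):
    """Linearly map floats onto signed integers of the given width."""
    dmin, dmax = float(np.min(data)), float(np.max(data))
    nvals = 2**nbits - 2                        # reserve one value for the fill
    scale_factor = (dmax - dmin) / nvals or 1.0  # avoid 0 for constant data
    add_offset = (dmax + dmin) / 2               # center the range on zero
    packed = np.round((data - add_offset) / scale_factor).astype(np.int16)
    return packed, scale_factor, add_offset

def unpack_values(packed, scale_factor, add_offset):
    """Reader-side reconstruction, applied automatically by netCDF tools."""
    return packed * scale_factor + add_offset

data = np.linspace(0.0, 1.0, 5)
packed, sf, off = pack_values(data)
restored = unpack_values(packed, sf, off)
# Packing is lossy: restored agrees with data only to ~scale_factor precision
print(np.max(np.abs(restored - data)) <= sf)   # True
```

This halves (for 'i2') or quarters (for 'i1') the storage of a float32 variable before lossless compression is applied, at the cost of quantizing the data to `scale_factor` resolution.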

Parameters
  • filename (str): name/path for file that will be created
  • variables (list of dict-like): Each variable is specified as a dict, as described above.
  • xDim (array or dict-like, optional): x dimension of data. If dict-like, then it should contain the same keys as variables. If xDim is an array, then it is assumed to be longitude in degrees east and named 'lon'
  • yDim (array or dict-like, optional): y dimension of data. If dict-like, then it should contain the same keys as variables. If yDim is an array, then it is assumed to be latitude in degrees north and named 'lat'
  • zDim (array or dict-like, optional): z dimension of data. If dict-like, then it should contain the same keys as variables. If zDim is an array, then it is named 'lev'. zUnits is used to infer the variable long name:
    • m, km -> zDim is "altitude"
    • Pa, hPa -> zDim is "pressure"
    • None -> zDim is "level"
  • zUnits ({'m', 'km', 'Pa', 'hPa', 'level', '', None}): Units for zDim. Ignored if zDim is dict-like.
  • tDim (array or dict-like, optional): time dimension of data. If dict-like, then it should contain the same keys as variables. If tDim is an array, then tUnits is used and the dimension is set as unlimited and named 'time'. datetime-like values are supported, as are floats and other numeric types.
  • tUnits (str, optional): Units for tDim. Special treatment will be used for "<time units> since <date>"
  • globalAtt (dict-like, optional): dict of global file attributes
  • classic (bool, default=True): specify whether file should use netCDF classic data model (includes netCDF4 classic)
  • nc4 (bool, default=True): specify whether file should be netCDF4. Required for compression.
  • compress (bool, default=True): specify whether all variables should be compressed (lossless). In addition to lossless compression, setting pack=True for individual variables enables lossy integer packing compression.
  • clobber (bool, default=False): specify whether a pre-existing file with the same name should be overwritten
  • verbose (bool, default=False): specify whether extra output should be written while creating the netCDF file
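The shape-based dimension inference described above can be illustrated standalone. The sketch below mirrors the matching logic in the listing (the warning for ambiguous sizes and the shape lookup); the dimension names and sizes are hypothetical.

```python
# Standalone sketch of write_geo_nc's shape-based dimension inference,
# mirroring the matching logic in the listing above. Names are illustrative.
import warnings
import numpy as np

dimName = ['lon', 'lat', 'time']
dimSize = [72, 46, 12]

# Sizes shared by two or more dimensions cannot be matched unambiguously
duplicate_dim_sizes = {s for s in dimSize if dimSize.count(s) > 1}

def infer_dims(var_name, shape):
    """Match each axis of a variable to a defined dimension by its length."""
    if any(s in duplicate_dim_sizes for s in shape):
        warnings.warn(f'Dimensions of variable {var_name} cannot be uniquely '
                      'identified; the "dim_names" key is strongly recommended.')
    dims = []
    for s in shape:
        try:
            dims.append(dimName[dimSize.index(s)])
        except ValueError as exc:
            raise ValueError(f'Cannot match dimensions for variable {var_name}') from exc
    return dims

data = np.zeros((12, 46, 72))
print(infer_dims('O3', data.shape))   # ['time', 'lat', 'lon']
```

Because each axis length here is unique, the (time, lat, lon) ordering is recovered from the shape alone; with equal-length dimensions (e.g. a square lat-lon grid) the 'dim_names' key is needed to resolve the ambiguity.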