acgc.netcdf

High-level tools for writing and reading netCDF files

Comparison to xarray:
- For *reading* netCDF files, xarray is generally better than `acgc.netcdf`.
- For *creating* new netCDF files, using `write_geo_nc` is often more concise than xarray.
- For *editing* netCDF files (e.g. modifying a few attributes or variables), `put_nc_att` and `put_nc_var` can be convenient because the rest of the file remains unchanged, unlike with xarray. (As of 2023-12, using xarray to read and write a netCDF file does *not* produce identical files. The command-line netCDF Operators (NCO) are another option for editing netCDF files.)
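The variable-as-dict convention used by `write_geo_nc` can be illustrated with a minimal sketch. The grid sizes and data values here are invented for illustration, and the final call is shown only as a comment so the sketch does not require an open file:

```python
import numpy as np

# Each variable passed to write_geo_nc is a plain dict: 'name' and 'value'
# are required, and most other keys become netCDF attributes of the variable.
lon = np.arange(-180.0, 180.0, 5.0)   # 72 longitudes
lat = np.arange(-90.0, 92.0, 2.0)     # 91 latitudes
o3 = {'name': 'O3',
      'value': np.zeros((len(lat), len(lon)), dtype='f4'),  # placeholder data
      'long_name': 'ozone mole fraction',
      'units': 'mol/mol'}

# The file would then be written with:
# write_geo_nc('ozone.nc', [o3], xDim=lon, yDim=lat, clobber=True)
```

Because the two dimension lengths (91 and 72) differ, `write_geo_nc` can infer the dimensions of `O3` from its shape without a `dim_names` key.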
```python
# -*- coding: utf-8 -*-
"""High-level tools for writing and reading netCDF files

Comparison to xarray:
- For *reading* netCDF files, xarray is generally better than `acgc.netcdf`.
- For *creating* new netCDF files, using `write_geo_nc` is often more concise than xarray.
- For *editing* netCDF files (e.g. modifying a few attributes or variables),
  `put_nc_att` and `put_nc_var` can be convenient because the rest of the file remains unchanged,
  unlike with xarray. (As of 2023-12, using xarray to read and write a netCDF file does *not*
  produce identical files. The command-line netCDF Operators (NCO) are another option for
  editing netCDF files.)
"""

from datetime import datetime
import warnings
import netCDF4 as nc
import numpy as np

def get_nc_var(filename, varname):
    """Read a variable from a netCDF file

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable that will be retrieved

    Returns
    -------
    data : N-D array
        value of variable
    """

    # Open file for reading
    ncfile = nc.Dataset(filename, 'r')

    # Get the desired variable
    data = ncfile.variables[varname][:]

    # Close the file
    ncfile.close()

    return data

def get_nc_att(filename, varname, attname, glob=False):
    """Read an attribute from a netCDF file

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable
    attname : str
        name of attribute that will be retrieved
    glob : bool, default=False
        Set glob=True to access global file attributes (varname will be ignored)
        and glob=False for variable attributes

    Returns
    -------
    data : float or str
        attribute value
    """

    # Open file for reading
    ncfile = nc.Dataset(filename, 'r')

    # Get the desired attribute
    if glob:
        data = ncfile.getncattr(attname)
    else:
        data = ncfile.variables[varname].getncattr(attname)

    # Close the file
    ncfile.close()

    return data

def get_nc_varnames(filename):
    """Read variable names from a netCDF file

    Parameters
    ----------
    filename : str
        name/path of netCDF file

    Returns
    -------
    list of strings containing variable names within filename
    """

    # Open file for reading
    ncfile = nc.Dataset(filename, 'r')

    # Get the variable names
    data = list(ncfile.variables.keys())

    # Close the file
    ncfile.close()

    return data

def get_nc_attnames(filename, varname, glob=False):
    """Read attribute names from a netCDF file

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable
    glob : bool, default=False
        Set glob=True to access global file attributes (varname will be ignored)
        and glob=False for variable attributes

    Returns
    -------
    list of strings containing attribute names
    """

    # Open file for reading
    ncfile = nc.Dataset(filename, 'r')

    # Get the attribute names
    if glob:
        data = ncfile.ncattrs()
    else:
        data = ncfile.variables[varname].ncattrs()

    # Close the file
    ncfile.close()

    return data

def put_nc_var(filename, varname, value):
    """Assign a new value to an existing variable in an existing file

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable that will be assigned
    value : N-D array
        data values that will be assigned to the variable;
        must have the same shape as the current variable values
    """

    # Open file for reading and writing
    ncfile = nc.Dataset(filename, 'r+')

    # Set value
    ncfile.variables[varname][:] = value

    # Close the file
    ncfile.close()

def put_nc_att(filename, varname, attname, value, glob=False):
    """Assign a new value to an existing attribute

    Parameters
    ----------
    filename : str
        name/path of netCDF file
    varname : str
        name of variable
    attname : str
        name of attribute that will be assigned
    value : str, float, list
        data values that will be assigned to the attribute
    glob : bool, default=False
        Set glob=True to assign global file attributes (varname will be ignored)
        and glob=False for variable attributes
    """

    # Open file for reading and writing
    ncfile = nc.Dataset(filename, 'r+')

    # Set attribute
    if glob:
        ncfile.setncattr(attname, value)
    else:
        ncfile.variables[varname].setncattr(attname, value)

    # Close the file
    ncfile.close()

def write_geo_nc(filename, variables,
                 xDim=None, yDim=None,
                 zDim=None, zUnits=None,
                 tDim=None, tUnits=None,
                 globalAtt=None,
                 classic=True, nc4=True, compress=True,
                 clobber=False, verbose=False):
    '''Create a netCDF file with geospatial data. Output file is COARDS/CF compliant.

    This function allows specification of netCDF files more concisely than many alternative
    python packages (e.g. xarray, netCDF4) by making assumptions about the dimensions and
    inferring the dimensions for each variable from the variable shape. This is well suited
    for many lat-lon-lev-time and xyzt datasets.

    Each variable is defined as a dict.

    Required keys:
    - 'name' (str) variable name
    - 'value' (array) N-D array of variable data

    Special keys (all optional):
    - 'dim_names' (list of str) names of the dimension variables corresponding to the
      dimensions of the variable
        - If dim_names is not provided, the dimension variables will be inferred from the data shape.
        - If all dimensions have unique lengths, the inferred dimensions are unambiguous.
        - If two or more dimensions have equal lengths, then the dim_names key should be used.
    - 'fill_value' (numeric) value that should replace NaNs
    - 'unlimited' (bool) specifies if dimension is unlimited
    - 'pack' (bool) specifies that variable should be compressed with integer packing
    - 'packtype' (str, default='i2') numeric type for packed data, commonly i1 or i2
    - 'calendar' (str) string for COARDS/CF calendar convention. Only used for time variable

    All other keys are assigned to variable attributes. CF conventions expect the following:
    - 'long_name' (str) long name for variable
    - 'units' (str) units of variable

    Example: ```{'name': 'O3',
                 'value': data,
                 'long_name': 'ozone mole fraction',
                 'units': 'mol/mol'}```

    Parameters
    ----------
    filename : str
        name/path for file that will be created
    variables : list of dict-like
        Each variable is specified as a dict, as described above.
    xDim : array or dict-like, optional
        x dimension of data. If dict-like, then it should contain the same keys as variables.
        If xDim is an array, then it is assumed to be longitude in degrees east and named 'lon'
    yDim : array or dict-like, optional
        y dimension of data. If dict-like, then it should contain the same keys as variables.
        If yDim is an array, then it is assumed to be latitude in degrees north and named 'lat'
    zDim : array or dict-like, optional
        z dimension of data. If dict-like, then it should contain the same keys as variables.
        If zDim is an array, then it is named 'lev'.
        zUnits is used to infer the variable long name:
        - m, km -> zDim is "altitude"
        - Pa, hPa -> zDim is "pressure"
        - None -> zDim is "level"
    zUnits : {'m','km','Pa','hPa','level','',None}
        Units for zDim. Ignored if zDim is dict-like.
    tDim : array or dict-like, optional
        time dimension of data. If dict-like, then it should contain the same keys as variables.
        If tDim is an array, then tUnits is used and the dimension is set as unlimited and
        named 'time'. datetime-like variables are supported, as are floats and numeric.
    tUnits : str, optional
        Units for tDim. Special treatment will be used for ``"<time units> since <date>"``
    globalAtt : dict-like, optional
        dict of global file attributes
    classic : bool, default=True
        specify whether file should use the netCDF classic data model (includes netCDF4 classic)
    nc4 : bool, default=True
        specify whether file should be netCDF4. Required for compression.
    compress : bool, default=True
        specify whether all variables should be compressed (lossless).
        In addition to lossless compression, setting pack=True for individual variables enables
        lossy integer packing compression.
    clobber : bool, default=False
        specify whether a pre-existing file with the same name should be overwritten
    verbose : bool, default=False
        specify whether extra output should be written while creating the netCDF file
    '''

    # NetCDF file type
    if nc4:
        if classic:
            ncfmt = 'NETCDF4_CLASSIC'
        else:
            ncfmt = 'NETCDF4'
    else:
        ncfmt = 'NETCDF3_64BIT_OFFSET'

    ### Open file for output

    f = nc.Dataset(filename, 'w', format=ncfmt, clobber=clobber)
    f.Conventions = "COARDS/CF"
    f.History = datetime.now().strftime('%Y-%m-%d %H:%M:%S') + \
        ' : Created by write_geo_nc (python)'

    # Write global attributes, if any
    if globalAtt is not None:
        f.setncatts(globalAtt)

    ### Define dimensions

    dimName = []
    dimList = []
    dimSize = []
    varList = []

    if xDim is not None:
        if not isinstance(xDim, dict):
            xDim = {'name': 'lon',
                    'value': xDim,
                    'long_name': 'longitude',
                    'units': 'degrees_east'}

        ncDim, ncVar = _create_geo_dim(xDim, f,
                                       compress=compress,
                                       classic=classic,
                                       verbose=verbose)

        dimName.append(ncDim.name)
        dimList.append(ncDim)
        varList.append(ncVar)
        dimSize.append(len(ncVar[:]))

    if yDim is not None:
        if not isinstance(yDim, dict):
            yDim = {'name': 'lat',
                    'value': yDim,
                    'long_name': 'latitude',
                    'units': 'degrees_north'}

        ncDim, ncVar = _create_geo_dim(yDim, f,
                                       compress=compress,
                                       classic=classic,
                                       verbose=verbose)

        dimName.append(ncDim.name)
        dimList.append(ncDim)
        varList.append(ncVar)
        dimSize.append(len(ncVar[:]))

    if zDim is not None:
        if not isinstance(zDim, dict):
            # Infer name from units
            if zUnits in ['m', 'km']:
                lname = 'altitude'
            elif zUnits in ['Pa', 'hPa']:
                lname = 'pressure'
            elif zUnits in ['level', '', None]:
                lname = 'level'
                zUnits = ''
            else:
                raise ValueError(f'Units of {zUnits:s} for zDim have not been implemented')

            zDim = {'name': 'lev',
                    'value': zDim,
                    'long_name': lname,
                    'units': zUnits}

        ncDim, ncVar = _create_geo_dim(zDim, f,
                                       compress=compress,
                                       classic=classic,
                                       verbose=verbose)

        dimName.append(ncDim.name)
        dimList.append(ncDim)
        varList.append(ncVar)
        dimSize.append(len(ncVar[:]))

    if tDim is not None:
        if not isinstance(tDim, dict):
            if tUnits is None:
                tUnits = ''
            tDim = {'name': 'time',
                    'value': tDim,
                    'long_name': 'time',
                    'units': tUnits,
                    'unlimited': True}

        ncDim, ncVar = _create_geo_dim(tDim, f,
                                       compress=compress,
                                       classic=classic,
                                       time=True,
                                       verbose=verbose)

        dimName.append(ncDim.name)
        dimList.append(ncDim)
        varList.append(ncVar)
        dimSize.append(len(ncVar[:]))

    # Dimension sizes that are not unique, i.e. shared by two or more dimensions
    duplicate_dim_sizes = {s for s in dimSize if dimSize.count(s) > 1}

    ### Define variables

    for var in variables:

        if not isinstance(var, dict):
            raise TypeError('All variables must be passed as dicts')

        # Shape of the variable
        vShape = var['value'].shape

        # Use dim_names if provided; otherwise infer dimensions from shape
        if 'dim_names' in var.keys():

            # Set dimensions based on provided names
            dID = var['dim_names']

            # Confirm that the correct number of dimensions is provided
            if len(dID) != len(vShape):
                raise ValueError('Variable {:s} has {:d} dimensions and {:d} dim_names'.
                                 format(var['name'], len(vShape), len(dID)))

            # Shape of each named dimension
            dShape = tuple(dimSize[dimName.index(d)] for d in dID)

            # Confirm that the named dimensions match the shape of the variable
            if dShape != vShape:
                raise ValueError('Shape of the dimensions [{:s}]=[{:s}] must match '.
                                 format(','.join(dID), ','.join([str(i) for i in dShape]))
                                 + 'the shape of variable {:s} which is [{:s}]'.
                                 format(var['name'], ','.join([str(i) for i in vShape])))

        else:

            # If this variable uses any of the duplicate dims, give a warning
            # (note: np.isin requires a sequence, not a set)
            if np.any(np.isin(vShape, list(duplicate_dim_sizes))):
                warnings.warn('Dimensions of variable {:s} cannot be uniquely identified.\n'.
                              format(var['name'])
                              + 'The "dim_names" key is strongly recommended.')

            # Match dimensions of variable with defined dimensions based on shape
            dID = []
            for s in vShape:

                # Find the dimension that matches the variable dimension
                try:
                    i = dimSize.index(s)
                except ValueError as exc:
                    # No dimensions match
                    raise ValueError('Cannot match dimensions for variable {:s}'.
                                     format(var['name'])) from exc

                # List of the dimension names
                dID.append(dimName[i])

        # Create the variable
        ncVar = _create_geo_var(var, f, dID, compress=compress,
                                classic=classic, verbose=verbose)

        varList.append(ncVar)

    # Close the file
    f.close()

def _create_geo_dim(var, fid, **kwargs):
    '''Create a netCDF dimension and associated variable in an open file

    Parameters
    ----------
    var : dict-like
        Must contain keys for 'name' and 'value'. Other keys will be assigned to attributes,
        e.g. 'unlimited' for an unlimited dimension.
    fid : netCDF file ID
        reference to an open file

    Returns
    -------
    ncDim, ncVar :
        IDs for the dimension and variable that were created
    '''

    try:
        assert ('name' in var), \
            'Var structure must contain "name" key: create_geo_dim'
        assert ('value' in var), \
            'Var structure must contain "value" key: create_geo_dim'
    except AssertionError as exc:
        # Close file and exit
        fid.close()
        raise IOError() from exc

    size = len(var['value'])

    # If the unlimited tag is True, then set size=None to create an unlimited dimension
    if 'unlimited' in var.keys():
        if var['unlimited']:
            size = None

    ncDim = fid.createDimension(var['name'], size)

    ncVar = _create_geo_var(var, fid, (var['name'],), isDim=True, **kwargs)

    return ncDim, ncVar

def _get_nc_type(value, name='', classic=True):
    '''Get netCDF variable type for a python variable

    Parameters
    ----------
    value : any data type
    name : str, optional
        name of the variable being examined. Only used for error messages
    classic : bool
        If classic=True, then an error will be raised if the variable type is not
        allowed in netCDF classic files

    Returns
    -------
    vartype : str
    '''

    # Variable type for numpy arrays or native python
    try:
        vartype = value.dtype.str
    except AttributeError:
        vartype = type(value).__name__

    # Remove byte-order characters
    vartype = vartype.replace('<', '').replace('>', '')

    # Raise an error if the type isn't allowed
    if classic and (vartype not in ['i1', 'i2', 'i4', 'f4', 'f8', 'U1', 'str', 'bytes']):
        raise TypeError('Variable {:s} cannot have type {:s} in netCDF classic files'.format(
            name, vartype))

    return vartype

def _infer_pack_scale_offset(data, nbits):
    '''Compute scale and offset for packing data to nbit integers

    Follows Unidata recommendations:
    https://www.unidata.ucar.edu/software/netcdf/documentation/NUG/_best_practices.html

    Parameters
    ----------
    data : N-D array
        data to be compressed
    nbits : int
        number of bits in the packed data type

    Returns
    -------
    scale_factor, add_offset : numeric
        recommended values for scale and offset
    pack_fill_value : numeric
        recommended fill value within the integer range
    '''
    vmin = np.nanmin(data)
    vmax = np.nanmax(data)
    scale_factor = (vmax - vmin) / (2**nbits - 2)
    add_offset = (vmax + vmin) / 2

    # Use a fill value within the packed range
    pack_fill_value = -2**(nbits - 1)

    return scale_factor, add_offset, pack_fill_value

def _integer_pack_data(data, scale_factor, add_offset, fill_value, pack_fill_value):
    '''Pack data using the provided scale, offset, and fill value

    Parameters
    ----------
    data : N-D array
        data values to be packed
    scale_factor, add_offset : float
        scale and offset for packing
    fill_value : numeric
        fill value for the unpacked data
    pack_fill_value : numeric
        fill value for the packed data, must be in the allowed range of the packed integer type

    Returns
    -------
    pack_value : N-D array
        data values converted to the integer range
    '''

    # Packed values
    pack_value = (data[:] - add_offset) / scale_factor

    # Set missing values
    if np.ma.is_masked(data):
        pack_value[data.mask] = pack_fill_value
    if fill_value is not None:
        pack_value[data == fill_value] = pack_fill_value
    if np.any(np.isnan(data)):
        pack_value[np.isnan(data)] = pack_fill_value
    if np.any(np.isinf(data)):
        pack_value[np.isinf(data)] = pack_fill_value

    return np.round(pack_value)

def _create_geo_var(var, fid, dimIDs, compress=True, classic=True, time=False,
                    fill_value=None, verbose=False, isDim=False):
    '''Create a netCDF variable, assuming common geospatial conventions

    Parameters
    ----------
    var : dict-like
        See write_geo_nc for required and optional dict keys.
    fid : netCDF file ID
        reference to an open file
    dimIDs : list
        list of dimension ID numbers corresponding to the dimensions of var['value']
    compress : bool, default=True
        compress=True indicates that variable should be deflated with lossless compression
    classic : bool, default=True
        specify if file is netCDF Classic or netCDF4 Classic
    time : bool, default=False
        specify if variable has time units that require handling with calendar
    isDim : bool, default=False
        indicates dimension variables

    Returns
    -------
    ncVar : netCDF variable ID that was created
    '''
    try:
        assert ('name' in var), \
            'Var structure must contain "name" key: create_geo_var'
        assert ('value' in var), \
            'Var structure must contain "value" key: create_geo_var'
    except AssertionError as exc:
        # Close file and exit
        fid.close()
        raise IOError from exc

    # If this is a time variable and units are "<time units> since <date>", then convert
    if time and (' since ' in var['units']):
        if 'calendar' in var.keys():
            calendar = var['calendar']
        else:
            calendar = 'standard'
        var['value'] = nc.date2num(var['value'], units=var['units'], calendar=calendar)

    # Variable type
    vartype = _get_nc_type(var['value'], var['name'], classic)

    # Fill value, if any
    if 'fill_value' in var.keys():
        fill_value = var['fill_value']
    else:
        fill_value = None

    ### Progress towards packing variables as integers;
    ### Appears to be working, but only minimal testing so far

    # Check if packing keywords are set
    if isDim:
        # Dimension variables should not be packed
        pack = False
    elif 'pack' in var.keys():
        # Use pack key if provided
        pack = var['pack']
    else:
        # Default to pack = False
        pack = False

    # Pack to integer data, if requested
    if pack:

        # Integer type for packed data
        if 'packtype' in var.keys():
            packtype = var['packtype']
        else:
            packtype = 'i2'
        vartype = packtype

        # Number of bits in the packed data type
        if packtype == 'i2':
            n = 16
        elif packtype == 'i1':
            n = 8
        else:
            raise ValueError(f'Packing to type {packtype:s} has not been implemented')

        # Get scale factor, offset and fill value for packing
        scale_factor, add_offset, pack_fill_value = _infer_pack_scale_offset(var['value'][:], n)

        var['scale_factor'] = scale_factor
        var['add_offset'] = add_offset

        # Replace the data and fill value with their packed equivalents
        fill_value = pack_fill_value
        var['value'] = _integer_pack_data(var['value'],
                                          scale_factor,
                                          add_offset,
                                          fill_value,
                                          pack_fill_value)

    #*** Check whether the data type is allowed in classic data type

    # Create the variable
    ncVar = fid.createVariable(var['name'], vartype, dimIDs,
                               zlib=compress, complevel=2,
                               fill_value=fill_value)

    # Write variable values
    ncVar[:] = var['value'][:]

    # These keys are instructions for this function, not attributes, so remove them
    var.pop('name', None)
    var.pop('value', None)
    var.pop('unlimited', None)
    var.pop('dim_names', None)
    var.pop('fill_value', None)
    var.pop('pack', None)
    var.pop('packtype', None)

    # Save the remaining attributes
    ncVar.setncatts(var)

    return ncVar
```
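The integer packing in `_infer_pack_scale_offset` and `_integer_pack_data` follows the Unidata scale/offset recommendation. A standalone numpy sketch of the same arithmetic (the function and variable names here are illustrative, not part of the module):

```python
import numpy as np

def pack_roundtrip(data, nbits=16):
    """Pack data to nbits-bit integers and unpack again, Unidata style."""
    vmin, vmax = np.nanmin(data), np.nanmax(data)
    scale = (vmax - vmin) / (2**nbits - 2)   # one value reserved for fill
    offset = (vmax + vmin) / 2
    packed = np.round((data - offset) / scale).astype(np.int16)
    unpacked = packed * scale + offset       # what a netCDF reader reconstructs
    return packed, unpacked, scale

data = np.array([210.0, 250.0, 290.0])
packed, unpacked, scale = pack_roundtrip(data)
# roundtrip error is bounded by scale/2
assert np.all(np.abs(unpacked - data) <= scale / 2)
```

This shows why packing is lossy: a `pack=True` variable is stored to a precision of `scale/2`, which for `i2` is about 1/65000 of the data range.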
get_nc_var(filename, varname)

Read a variable from a netCDF file

Parameters
- filename (str): name/path of netCDF file
- varname (str): name of variable that will be retrieved

Returns
- data (N-D array): value of variable
get_nc_att(filename, varname, attname, glob=False)

Read an attribute from a netCDF file

Parameters
- filename (str): name/path of netCDF file
- varname (str): name of variable
- attname (str): name of attribute that will be retrieved
- glob (bool, default=False): Set glob=True to access global file attributes (varname will be ignored) and glob=False for variable attributes

Returns
- data (float or str): attribute value
get_nc_varnames(filename)

Read variable names from a netCDF file

Parameters
- filename (str): name/path of netCDF file

Returns
- list of strings containing variable names within filename
get_nc_attnames(filename, varname, glob=False)

Read attribute names from a netCDF file

Parameters
- filename (str): name/path of netCDF file
- varname (str): name of variable
- glob (bool, default=False): Set glob=True to access global file attributes (varname will be ignored) and glob=False for variable attributes

Returns
- list of strings containing attribute names
put_nc_var(filename, varname, value)

Assign a new value to an existing variable in an existing file

Parameters
- filename (str): name/path of netCDF file
- varname (str): name of variable that will be assigned
- value (N-D array): data values that will be assigned to the variable; must have the same shape as the current variable values
put_nc_att(filename, varname, attname, value, glob=False)

Assign a new value to an existing attribute

Parameters
- filename (str): name/path of netCDF file
- varname (str): name of variable
- attname (str): name of attribute that will be assigned
- value (str, float, list): data values that will be assigned to the attribute
- glob (bool, default=False): Set glob=True to assign global file attributes (varname will be ignored) and glob=False for variable attributes
classic=classic, verbose=verbose ) 439 440 varList.append( ncVar ) 441 442 # Close the file 443 f.close()
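The shape-based dimension matching used above can be illustrated with a small standalone sketch (the dimension names and sizes here are hypothetical, not from the source):

```python
# Simplified sketch of write_geo_nc's dimension inference: each axis of a
# variable is matched to the first defined dimension with the same length.
dimName = ['lon', 'lat', 'time']
dimSize = [72, 46, 12]

def infer_dims(shape):
    """Return the dimension name matching each axis length."""
    return [dimName[dimSize.index(s)] for s in shape]

print(infer_dims((12, 46, 72)))  # ['time', 'lat', 'lon']
```

Because matching takes the *first* dimension of a given size, the result is only unambiguous when all dimension lengths are unique, which is why the function warns and recommends `dim_names` otherwise.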
Create a netCDF file with geospatial data. The output file is COARDS/CF compliant.
This function allows netCDF files to be specified more concisely than many alternative Python packages (e.g. xarray, netCDF4) by making assumptions about the dimensions and inferring the dimensions of each variable from its shape. This is well suited to many lat-lon-lev-time and xyzt datasets.
Each variable is defined as a dict.
Required keys:
- 'name' (str) variable name
- 'value' (array) N-D array of variable data
Special keys (all optional):
- 'dim_names' (list of str) names of the dimension variables corresponding to dimensions of variable
- If dim_names is not provided, the dimension variables will be inferred from the data shape.
- If all dimensions have unique lengths, the inferred dimensions are unambiguous.
- If two or more dimensions have equal lengths, then the dim_names key should be used.
- 'fill_value' (numeric) value that should replace NaNs
- 'unlimited' (bool) specifies if dimension is unlimited
- 'pack' (bool) specifies that variable should be compressed with integer packing
- 'packtype' (str, default='i2') numeric type for packed data, commonly i1 or i2
- 'calendar' (str) string for COARDS/CF calendar convention. Only used for time variable
All other keys are assigned to variable attributes. CF conventions expect the following:
- 'long_name' (str) long name for variable
- 'units' (str) units of variable
Example: {'name': 'O3',
'value': data,
'long_name': 'ozone mole fraction',
'units': 'mol/mol'}
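When two dimensions share a length, 'dim_names' states the axis order explicitly. A hypothetical sketch (the sizes and zero-valued data are placeholders, not from the source):

```python
import numpy as np

# Hypothetical case: lev and lat both have length 46, so the shape (46, 46)
# alone is ambiguous; 'dim_names' resolves which axis is which.
o3 = {'name': 'O3',
      'value': np.zeros((46, 46)),
      'dim_names': ['lev', 'lat'],   # first axis is lev, second is lat
      'long_name': 'ozone mole fraction',
      'units': 'mol/mol'}
```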
Parameters
- filename (str): name/path for file that will be created
- variables (list of dict-like): Each variable is specified as a dict, as described above.
- xDim (array or dict-like, optional): x dimension of data. If dict-like, then it should contain the same keys as variables. If xDim is an array, then it is assumed to be longitude in degrees east and named 'lon'
- yDim (array or dict-like, optional): y dimension of data. If dict-like, then it should contain the same keys as variables. If yDim is an array, then it is assumed to be latitude in degrees north and named 'lat'
- zDim (array or dict-like, optional):
z dimension of data. If dict-like, then it should contain same keys as variables.
If zDim is an array, then it is named 'lev'
zUnits is used to infer the variable long name:
- m, km -> zDim is "altitude"
- Pa, hPa -> zDim is "pressure"
- level, '', None -> zDim is "level"
- zUnits ({'m','km','Pa','hPa','level','',None}, optional): Units for zDim. Ignored if zDim is dict-like.
- tDim (array or dict-like, optional): time dimension of data. If dict-like, then it should contain the same keys as variables. If tDim is an array, then tUnits is used and the dimension is set as unlimited and named 'time'. datetime-like values are supported, as are floats and other numeric types.
- tUnits (str, optional): Units for tDim. Special treatment is applied to units of the form "<time units> since <date>"
- globalAtt (dict-like, optional): dict of global file attributes
- classic (bool, default=True): specify whether file should use netCDF classic data model (includes netCDF4 classic)
- nc4 (bool, default=True): specify whether file should be netCDF4. Required for compression.
- compress (bool, default=True): specify whether all variables should be compressed (lossless). In addition to lossless compression, setting pack=True for individual variables enables lossy integer packing compression.
- clobber (bool, default=False): specify whether a pre-existing file with the same name should be overwritten
- verbose (bool, default=False): specify whether extra output should be written while creating the netCDF file
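Putting the pieces together, a minimal call might look like the sketch below. The grid, file name, and attribute values are hypothetical; the `write_geo_nc` call itself is shown commented out because it writes a file to disk and requires the acgc package:

```python
import numpy as np
# from acgc.netcdf import write_geo_nc

# Hypothetical monthly field on a 5 x 4 degree grid
lon  = np.arange(-180, 180, 5.0)   # 72 longitudes, degrees east
lat  = np.arange(-90, 91, 4.0)     # 46 latitudes, degrees north
time = np.arange(12)               # 12 monthly time steps
data = np.random.rand(len(time), len(lat), len(lon))

variables = [{'name': 'O3',
              'value': data,
              'long_name': 'ozone mole fraction',
              'units': 'mol/mol'}]

# Dimensions are inferred from data.shape = (12, 46, 72)
# write_geo_nc('o3_example.nc', variables,
#              xDim=lon, yDim=lat, tDim=time,
#              tUnits='days since 2000-01-01',
#              globalAtt={'title': 'Example file'},
#              clobber=True)
```

Since all three dimension lengths (12, 46, 72) are unique here, no `dim_names` key is needed for the variable.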