Accessor package

The accessor package is used to extend the pandas and xarray data structures.

Base module

class pymepps.accessor.base.MetData(data)[source]

MetData is the base class for meteorological data, such as station data or NWP forecast data.

data

Access the parent object.

static load(load_path)[source]

Load a new instance.

save(save_path)[source]

Save this instance.

Pandas module

class pymepps.accessor.pandas.PandasAccessor(data)[source]

Bases: pymepps.accessor.base.MetData

An accessor to extend the pandas data structure. This could be used more actively in the future to add more post-processing specific features to pandas.

static load(load_path)[source]

Load the given JSON file and return a TSData instance with the loaded data. The loader tries to locate the lonlat and the data keys within the JSON file. If these keys are missing, the loader tries to load the whole JSON file into pandas.

Parameters:load_path (str) – Path to the json file which should be loaded. It is recommended to load only previously saved TSData instances.
Returns:load_data – The loaded pandas object.
Return type:pandas object
save(save_path)[source]

The data is saved as a JSON file. The pandas to_json method is used to convert the data to JSON. If lonlat was given, it is saved under a lonlat key. JSON is used instead of HDF5 due to possible corruption problems.

Parameters:save_path (str) – Path where the json file should be saved.
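
The file layout described for save and load can be sketched with the standard library alone. This is only an illustration: the lonlat and data keys come from the descriptions above, while the helper names save_ts and load_ts, the exact nesting and the fallback condition are assumptions and not pymepps code.

```python
import json
import os
import tempfile


def save_ts(data, save_path, lonlat=None):
    """Write `data` as JSON; store `lonlat` under its own key if given."""
    payload = {"data": data}
    if lonlat is not None:
        payload["lonlat"] = list(lonlat)
    with open(save_path, "w") as file_handle:
        json.dump(payload, file_handle)


def load_ts(load_path):
    """Return (data, lonlat); fall back to the whole file without a data key."""
    with open(load_path) as file_handle:
        payload = json.load(file_handle)
    if isinstance(payload, dict) and "data" in payload:
        lonlat = payload.get("lonlat")
        return payload["data"], tuple(lonlat) if lonlat is not None else None
    # No recognised keys (assumed fallback): treat the whole file as the data.
    return payload, None


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "station.json")
    series = {"2017-01-01T00:00": 1.5, "2017-01-01T01:00": 2.0}
    save_ts(series, path, lonlat=(10.0, 53.5))
    loaded_data, loaded_lonlat = load_ts(path)
```

Keeping the coordinates under a separate key means a file without it can still be read as plain data, which matches the fallback behaviour described for load.
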
update(*items)[source]

Update the data.

Spatial module

class pymepps.accessor.spatial.SpatialAccessor(data, grid=None)[source]

Bases: pymepps.accessor.base.MetData

The SpatialAccessor extends an xarray.DataArray for post-processing of meteorological numerical weather model data. The SpatialAccessor works mostly with grids and gridded data.

check_data_coordinates(item)[source]

Check whether the shape of the item's grid coordinates is the same as that of the grid.

Parameters:

item (xarray.DataArray) – Instance to test for type and grid dimension length.

Returns:

item – The checked item.

Return type:

xarray.DataArray

Raises:
  • TypeError – Raised if the grid is not set.
  • ValueError – Raised if the given item does not have the same last coordinates as the grid.
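
The check described above can be sketched on plain shape tuples. This is a simplified illustration under assumptions: the function name check_grid_shape is hypothetical, and only the lengths of the trailing dimensions are compared, whereas the real method works on xarray.DataArray coordinates.

```python
def check_grid_shape(item_shape, grid_shape):
    """Return item_shape if its trailing dimensions match grid_shape."""
    if grid_shape is None:
        raise TypeError("The grid is not set.")
    n_grid_dims = len(grid_shape)
    # The grid dimensions are assumed to be the last n dimensions.
    if tuple(item_shape[-n_grid_dims:]) != tuple(grid_shape):
        raise ValueError(
            "The item does not have the same last dimensions as the grid.")
    return item_shape


# Leading dimensions may differ as long as the trailing ones match the grid.
checked = check_grid_shape((3, 5, 4, 6), (4, 6))
```
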
grid

The corresponding grid of this xarray.DataArray instance. This grid is used to interpolate/remap the data and to select the nearest grid point to a given longitude/latitude pair.

grid_to_attrs()[source]
static load(load_path)[source]

Load a previously saved, NetCDF-based xarray.DataArray instance. If the NetCDF file has grid attributes, they are decoded into a new grid.

Parameters:load_path (str) – The path to the saved xarray.DataArray instance.
Returns:loaded_array – The loaded DataArray instance. If a grid could be created it will be set to the DataArray instance.
Return type:xarray.DataArray
merge(*items)[source]

The merge routine can be used to merge this SpatialData instance with other instances. The merge creates a new merge dimension, named after the variable names. The grid of this instance is used as the merged grid.

Parameters:items (xarray.DataArray) – The items are merged with this xarray.DataArray instance. Their grid dimensions must match those of the grid.
Returns:merged_array – The DataArray instance with the merged data.
Return type:xarray.DataArray
merge_analysis_timedelta(analysis_axis='runtime', timedelta_axis='validtime')[source]

The analysis time axis is merged with the valid time axis, which should be given as timedelta. The merged time coordinate is called time and becomes the first coordinate.

Parameters:
  • analysis_axis (str, optional) – The analysis time axis name. This axis will be used as basis for the valid time. Default is runtime.
  • timedelta_axis (str, optional) – The time delta axis name. This axis should contain the difference to the analysis time.
Returns:

merged_array – The DataArray with the merged analysis and timedelta coordinate.

Return type:

xarray.DataArray
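
The arithmetic behind the merged time coordinate can be sketched with the standard library: each analysis time is combined with each time delta to give an absolute time. The variable names are illustrative, and how the actual method orders or deduplicates the resulting values is not covered here.

```python
from datetime import datetime, timedelta

# Analysis times (runtime axis) and lead times (validtime axis as timedelta).
runtimes = [datetime(2017, 1, 1, 0), datetime(2017, 1, 1, 12)]
validtimes = [timedelta(hours=0), timedelta(hours=6)]

# Every (runtime, validtime) pair collapses into one absolute time stamp.
merged_times = [run + lead for run in runtimes for lead in validtimes]
```
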

normalize_coords(runtime=None, ensemble='det', validtime=None, height=None)[source]

Normalize the coordinates of the DataArray. The number, order and names of the coordinates are normalized. The number of coordinates will be four to six, depending on whether the DataArray is a merged multi-variable DataArray and on the number of grid coordinates. The values of the added coordinates are set to the given values, or to None as fill value. The order and names of the coordinates will be:

  • (variable) (Only if the DataArray is a multi-variable DataArray. This is the variable name)
  • runtime (The analysis time of the model. The model is started at this time. The runtime has the type np.datetime64)
  • ensemble (The ensemble member of the model.)
  • validtime (The lead time of the model. The model is valid for these times. The validtime is a timedelta relative to the runtime.)
  • height (The height information of the model.)
  • first grid coordinate
  • (second grid coordinate) (Only if the grid is not an unstructured grid)
Parameters:
  • runtime (datetime.datetime, np.datetime64 or None, optional) – The runtime of the model. The runtime will be converted to np.datetime64[ns] if it is not already this type. Default is None.
  • ensemble (int or str, optional) – The ensemble member of the model. An integer indicates the member number, with zero as control run. Default is ‘det’.
  • validtime (datetime.datetime, np.datetime64, np.timedelta64 or None, optional) – The validtime of the model. The validtime is converted to np.timedelta64. Default is None.
  • height (int, str or None, optional) – The height of the model. Default is None.
Returns:

normalized_array – The DataArray with normalized coordinates.

Return type:

xr.DataArray

remapbil(new_grid)[source]

Remap the horizontal grid with a bilinear approach to a given new grid.

Parameters:new_grid (Child instance of Grid) – The data is remapped to this grid.
Returns:remapped_array – The xarray.DataArray with the replaced grid.
Return type:xarray.DataArray
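
As a reminder of what a bilinear approach computes, the following sketch interpolates a single target point from the four surrounding source points of one grid cell. It is a generic textbook formulation with a hypothetical function name, not the pymepps implementation, whose internals are not shown in this reference.

```python
def bilinear(x, y, x0, x1, y0, y1, q00, q10, q01, q11):
    """Interpolate at (x, y) inside the cell [x0, x1] x [y0, y1].

    q00 is the value at (x0, y0), q10 at (x1, y0),
    q01 at (x0, y1) and q11 at (x1, y1).
    """
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    # Interpolate along x on both cell edges, then along y between them.
    lower = q00 * (1 - tx) + q10 * tx
    upper = q01 * (1 - tx) + q11 * tx
    return lower * (1 - ty) + upper * ty


# Value at the centre of a unit cell with corner values 1, 2, 3 and 4.
centre = bilinear(0.5, 0.5, 0.0, 1.0, 0.0, 1.0, 1.0, 2.0, 3.0, 4.0)
```

At the cell centre all four corners contribute equally, so the result is their mean.
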
remapnn(new_grid)[source]

Remap the horizontal grid with a nearest neighbour approach to a given new grid.

Parameters:new_grid (Child instance of Grid) – The data is remapped to this grid.
Returns:remapped_array – The xarray.DataArray with the replaced grid.
Return type:xarray.DataArray
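
A nearest neighbour remapping copies, for every point of the new grid, the value of the closest source point. The sketch below shows the idea on plain lists of (longitude, latitude) pairs. The function names and the use of the haversine formula as distance measure are assumptions made for illustration, not a description of the pymepps internals.

```python
import math


def haversine(lon1, lat1, lon2, lat2):
    """Central angle in radians between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(math.sqrt(a))


def remap_nearest(src_points, src_values, dst_points):
    """For each target (lon, lat), copy the value of the closest source point."""
    remapped = []
    for dst_lon, dst_lat in dst_points:
        nearest = min(
            range(len(src_points)),
            key=lambda i: haversine(dst_lon, dst_lat, *src_points[i]),
        )
        remapped.append(src_values[nearest])
    return remapped


src_points = [(9.0, 53.0), (10.0, 53.0), (9.0, 54.0), (10.0, 54.0)]
src_values = [1.0, 2.0, 3.0, 4.0]
dst_points = [(9.1, 53.1), (9.9, 53.9)]
remapped = remap_nearest(src_points, src_values, dst_points)
```
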
save(save_path)[source]

Save the DataArray together with the grid, which is stored as attributes. The grid attributes are used by the load method to recreate the grid, but it is also possible to load the data with the normal xarray load functions.

Parameters:save_path (str) – The path where the netcdf file should be saved.
sellonlatbox(lonlatbox)[source]

This DataArray instance is sliced by the given lonlatbox. A new grid is created and set based on the sliced coordinates.

Parameters:lonlatbox (tuple(float)) –

The longitude and latitude box with four entries in degrees. The entries are handled in the following way:

(left/west, top/north, right/east, bottom/south)
Returns:sliced_array – The sliced data array with the new grid.
Return type:xarray.DataArray

Notes

For some grids the new grid is based on an UnstructuredGrid, due to technical limitations.
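
The entry order of lonlatbox is easy to mix up, so the following sketch shows how a (west, north, east, south) tuple selects points. The helper name in_lonlatbox is hypothetical, and the inclusive bounds are an assumption. The sketch also ignores boxes that cross the date line.

```python
def in_lonlatbox(lon, lat, lonlatbox):
    """True if (lon, lat) lies inside (west, north, east, south), in degrees."""
    west, north, east, south = lonlatbox
    return west <= lon <= east and south <= lat <= north


points = [(8.0, 53.5), (9.5, 53.5), (10.5, 54.5), (9.5, 55.5)]
lonlatbox = (9.0, 55.0, 11.0, 53.0)  # (west, north, east, south)
selected = [point for point in points if in_lonlatbox(point[0], point[1], lonlatbox)]
```
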

selpoint(lonlat)[source]

Select a longitude, latitude point within this DataArray. A new lonlat grid with a single point is created.

Parameters:lonlat (tuple(float)) – The longitude and latitude point in degrees. The nearest neighbour point to this coordinate pair is used.
Returns:sliced_array – The sliced data array with the data for the nearest neighbour point to the given coordinates.
Return type:xarray.DataArray
set_grid(grid=None)[source]

Set the grid to the given grid and set the grid coordinates. It is assumed that the last n dimensions (n=1 for an unstructured grid, n=2 for all other grids) are the grid coordinates. Please make sure that this assumption is fulfilled!

Parameters:grid (Grid or None, optional) – This grid is used to set the grid and the grid coordinates of the returned array. If this is None, the grid of this DataArray instance is used. Default is None.
Returns:gridded_array – The DataArray with the grid coordinates and the grid.
Return type:xarray.DataArray
Raises:ValueError – A ValueError is raised if the grid of this instance should be used but no grid is set.
to_pandas(lonlat=None)[source]

Transform the DataArray to pandas based on the given coordinates. If coordinates are given, this method selects the nearest neighbour grid point to these coordinates. The data is flattened to a 2d-DataFrame with the time as row axis.

Parameters:lonlat (tuple(float, float) or None) – The nearest grid point to these coordinates (longitude, latitude) is used to generate the pandas data. If lonlat is None, no coordinates are selected and the data is flattened. If the horizontal grid coordinates are not a single point, it is recommended to set lonlat.
Returns:extracted_data – The extracted pandas data. The data is either a Series (one column) or a DataFrame (multiple columns), depending on the dimensions.
Return type:pandas.Series or pandas.DataFrame
update(*items)[source]

The update routine can be used to update the DataArray based on other DataArrays. Two assumptions are made:

  1. The data used to update this DataArray instance has the same grid coordinates as this instance.
  2. The given items are applied from left to right, so that intersection problems are resolved in favor of the newest data.
Parameters:items (xarray.DataArray) – The items are merged with this xarray.DataArray instance. Their grid dimensions must match those of the grid.
Returns:merged_array – The DataArray instance with the updated data.
Return type:xarray.DataArray

Utilities module

pymepps.accessor.utilities.register_dataframe_accessor(name)[source]

Register a custom accessor on pandas.DataFrame objects.

Parameters:name (str) – Name under which the accessor should be registered. A warning is issued if this name conflicts with a preexisting attribute.
pymepps.accessor.utilities.register_series_accessor(name)[source]

Register a custom accessor on pandas.Series objects.

Parameters:name (str) – Name under which the accessor should be registered. A warning is issued if this name conflicts with a preexisting attribute.
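
The registration mechanism behind these functions can be sketched in plain Python. A decorator attaches a descriptor to the target class, and the descriptor builds the accessor with the parent object on attribute access. Everything here is illustrative: Table is a hypothetical stand-in for a pandas object, register_accessor and DemoAccessor are made-up names, and the real pandas machinery additionally caches the accessor instance.

```python
import warnings


class _AccessorDescriptor:
    """Build the accessor on attribute access, passing in the parent object."""

    def __init__(self, accessor_cls):
        self._accessor_cls = accessor_cls

    def __get__(self, obj, owner):
        if obj is None:
            # Accessed on the class itself: hand back the accessor class.
            return self._accessor_cls
        return self._accessor_cls(obj)


def register_accessor(name, target_cls):
    """Return a decorator that attaches an accessor class to `target_cls`."""
    def decorator(accessor_cls):
        if hasattr(target_cls, name):
            warnings.warn(
                "{0!r} overrides an existing attribute of {1}".format(
                    name, target_cls.__name__))
        setattr(target_cls, name, _AccessorDescriptor(accessor_cls))
        return accessor_cls
    return decorator


class Table:
    """Hypothetical stand-in for pandas.DataFrame."""

    def __init__(self, rows):
        self.rows = rows


@register_accessor("pp", Table)
class DemoAccessor:
    def __init__(self, data):
        self.data = data  # the parent object, comparable to MetData.data

    def n_rows(self):
        return len(self.data.rows)


table = Table([[1, 2], [3, 4], [5, 6]])
row_count = table.pp.n_rows()
```

After registration, every Table instance exposes the accessor under the chosen name, here table.pp, without Table itself being subclassed.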