GroupBy Operations
xarray supports “group by” operations with the same API as pandas to implement the split-apply-combine strategy:
- Split your data into multiple independent groups.
- Apply some function to each group.
- Combine your groups back into a single data object.
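As a rough illustration of the three steps above, the sketch below performs the split, apply and combine stages by hand on a small synthetic series and compares the result with a single groupby call. The array `toy` and the variable names are made up for this example and are not part of the tutorial dataset.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series covering two years (illustrative data only).
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
toy = xr.DataArray(np.arange(time.size, dtype=float), coords={"time": time}, dims="time")

# Split: select the time steps belonging to each calendar month.
# Apply: take the mean of each sub-array.
manual = {
    month: toy.where(toy["time"].dt.month == month, drop=True).mean("time").item()
    for month in range(1, 13)
}

# Combine: stack the per-month results into a single DataArray.
combined = xr.DataArray(
    list(manual.values()), coords={"month": list(manual)}, dims="month"
)

# The same computation expressed with groupby.
grouped = toy.groupby("time.month").mean("time")
```

Both `combined` and `grouped` hold one value per month; the groupby version simply hides the loop.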
Group by operations work on both Dataset and DataArray objects. Most of the examples focus on grouping by a single one-dimensional variable, although grouping over a multi-dimensional variable is also supported.
- Using groupby to calculate a monthly climatology:
import xarray as xr
da = xr.open_dataarray("../data/air_temperature.nc")
da_climatology = da.groupby('time.month').mean('time')
da_climatology
In this case, we provide what we refer to as a virtual variable (time.month). Other virtual variables include: year, month, day, hour, minute, second, dayofyear, week, dayofweek, weekday and quarter. It is also possible to use another DataArray or pandas object as the grouper.
da.groupby('time.season').median('time')
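Grouping by another DataArray can be sketched as follows. The data are synthetic, and the grouper name `day_type` and its "weekday"/"weekend" labels are hypothetical choices made for this example; the only requirement assumed here is that the grouper is named and shares the `time` dimension with the data.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Eight consecutive days of synthetic data (illustrative only).
time = pd.date_range("2000-01-01", periods=8, freq="D")
toy = xr.DataArray(np.arange(8.0), coords={"time": time}, dims="time", name="toy")

# Hypothetical grouper: label each time step as a weekday or a weekend day.
# dayofweek runs from 0 (Monday) to 6 (Sunday).
is_weekday = toy["time"].dt.dayofweek.values < 5
labels = xr.DataArray(
    np.where(is_weekday, "weekday", "weekend"),
    coords={"time": time},
    dims="time",
    name="day_type",
)

# Group by the label array instead of a virtual variable.
by_type = toy.groupby(labels).mean("time")
```

The result is indexed by a new `day_type` coordinate with one entry per distinct label.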
Resampling Operations
To resample time-series data, xarray provides a resample convenience method for frequency conversion.
da
- Downsample our 6-hourly time-series data to quarterly data:
da1 = da.resample(time='QS').mean(dim='time')
da1
- Upsample our quarterly time-series data to daily data:
da1.resample(time='1D').interpolate('linear')
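To make the two directions of resampling easier to check by eye, here is a small sketch on synthetic 6-hourly data. The frequencies and variable names are chosen for this example only. For upsampling it uses asfreq and ffill rather than interpolation, purely to keep the example free of extra dependencies.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two days of synthetic 6-hourly data: values 0..7 (illustrative only).
time = pd.date_range("2000-01-01", periods=8, freq="6h")
six_hourly = xr.DataArray(np.arange(8.0), coords={"time": time}, dims="time")

# Downsample: average the four 6-hourly values that fall in each day.
daily = six_hourly.resample(time="1D").mean()

# Upsample: move the daily values onto a 12-hourly index.
# asfreq leaves the newly created time steps empty (NaN) ...
sparse = daily.resample(time="12h").asfreq()
# ... while ffill carries the last known value forward.
filled = daily.resample(time="12h").ffill()
```

Here `daily` has two values (the means of 0..3 and 4..7), and both upsampled arrays have three time steps spanning the same two days.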
Rolling Window Operations
Xarray objects include a rolling method to support rolling window aggregations:
roller = da.rolling(time=3)
roller
roller.mean()
- We can also provide a custom function:
def sum_minus_2(da, axis):
    # Sum over the rolling-window axis, then subtract a constant offset of 273
    return da.sum(axis=axis) - 273
roller.reduce(sum_minus_2)
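The sketch below shows the same reduce pattern on a tiny synthetic array, where the result is easy to verify by hand. The function `window_range` is a hypothetical example, not part of xarray. It assumes, as in the example above, that reduce passes the windowed data together with the axis to reduce over.

```python
import numpy as np
import xarray as xr

# Six synthetic values along a "time" dimension (illustrative only).
toy = xr.DataArray(np.arange(6.0), dims="time")
roller = toy.rolling(time=3)

def window_range(window, axis):
    # Spread (max minus min) of each window; the nan-aware functions
    # ignore the padding used for incomplete windows at the start.
    return np.nanmax(window, axis=axis) - np.nanmin(window, axis=axis)

result = roller.reduce(window_range)
```

With the default settings, the first two positions do not have a full window of three values and come back as NaN; every complete window of three consecutive integers has a spread of 2.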
%load_ext watermark
%watermark --iversion -g -m -v -u -d