Dataset Transforms
Also known as derived datasets.
(experimental)
Intake allows you to define data sources that take another source in the same catalog as their input, so that you can present processed data to the users of the catalog.
API
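intake.source.derived.DerivedSource | Base source deriving from another source in the same catalog
intake.source.derived.DataFrameTransform | Transform where the input and output are both Dask-compatible dataframes
intake.source.derived.Columns | Simple dataframe transform to pick columns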
class intake.source.derived.DerivedSource(*args, **kwargs)
Base source deriving from another source in the same catalog
Target picking and parameter validation are performed here, but you probably want to subclass from one of the more specific classes like DataFrameTransform.
class intake.source.derived.GenericTransform(*args, **kwargs)
optional_params = {'allow_dask': True}
Perform an arbitrary function to transform an input
- transform: function or str
  A function(container_object) -> output that performs the transform, or a fully-qualified dotted string pointing to such a function.
- transform_params: dict
  The keys are the names of keyword arguments to pass to the transform function. The values are either concrete values to pass, param objects that can be made into widgets (these must have a default value), or a spec from which such objects can be built.
- allow_dask: bool (optional, default True)
  Whether to_dask() is expected to work; calling it will in turn call the target's to_dask().
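The transform parameter accepts either a callable or a dotted string. The sketch below shows one way such a string could be resolved and the keyword arguments from transform_params applied. It uses only the standard library; resolve_transform and apply_transform are hypothetical helper names, not part of Intake's API, and Intake's own resolution logic may differ.

```python
import importlib


def resolve_transform(transform):
    """Return a callable, given a callable or a fully-qualified dotted string.

    Hypothetical helper for illustration; not Intake's implementation.
    """
    if callable(transform):
        return transform
    module_name, _, attr = transform.rpartition(".")
    if not module_name:
        raise ValueError(f"not a fully-qualified dotted name: {transform!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def apply_transform(transform, container, transform_params=None):
    """Call the resolved transform on the container with keyword arguments."""
    func = resolve_transform(transform)
    return func(container, **(transform_params or {}))


# "builtins.sorted" is used only as a convenient stand-in transform.
result = apply_transform("builtins.sorted", [3, 1, 2], {"reverse": True})
print(result)
```

Accepting a dotted string is what lets the transform be named in a text catalog, where a Python function object cannot be written directly.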
read()
Load entire dataset into a container and return it
to_dask()
Return a dask container for this data source
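As a rough illustration of how read(), to_dask() and allow_dask relate, the following sketch applies the same transform on an eager path and a lazy path, and refuses the lazy path when allow_dask is False. FakeTarget, SketchTransform and scale are hypothetical names, plain lists stand in for real containers, and raising ValueError is an assumption rather than Intake's documented behaviour.

```python
class FakeTarget:
    """Hypothetical stand-in for the source that a transform derives from."""

    def read(self):
        return [1, 2, 3]

    def to_dask(self):
        # A real source would return a lazy Dask collection here.
        return [1, 2, 3]


class SketchTransform:
    """Hypothetical sketch; not Intake's GenericTransform."""

    def __init__(self, target, transform, transform_params=None, allow_dask=True):
        self.target = target
        self.transform = transform
        self.transform_params = transform_params or {}
        self.allow_dask = allow_dask

    def read(self):
        # Eager path: load the target fully, then transform it.
        return self.transform(self.target.read(), **self.transform_params)

    def to_dask(self):
        # Lazy path: only offered when the transform is declared Dask-safe.
        if not self.allow_dask:
            raise ValueError("this transform was declared with allow_dask=False")
        return self.transform(self.target.to_dask(), **self.transform_params)


def scale(values, factor=1):
    """Example transform: multiply every value by a factor."""
    return [v * factor for v in values]


source = SketchTransform(FakeTarget(), scale, {"factor": 10})
print(source.read())
```

Setting allow_dask to False is a way to signal that a transform only makes sense on fully loaded data, so callers fall back to read().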
class
intake.source.derived.
Columns
(*args, **kwargs)¶ Simple dataframe transform to pick columns
Provided as an example of how to write a specific dataframe transform. Note that you could instead use DataFrameTransform directly, writing a function that chooses the columns rather than a method as is done here.
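The two routes mentioned above, a dedicated class with its own method versus a generic class that is handed a function, can be contrasted in a small sketch. Everything here is hypothetical: TableTransform, PickColumns, FunctionTransform and pick_columns are illustrative names, and a plain dict of column lists stands in for a dataframe so that the sketch runs without pandas or Dask.

```python
class TableTransform:
    """Hypothetical base class: subclasses override _transform."""

    def __init__(self, table, **params):
        self.table = table
        self.params = params

    def _transform(self, table):
        raise NotImplementedError

    def read(self):
        return self._transform(self.table)


class PickColumns(TableTransform):
    """Method route: the column choice lives in an overridden method."""

    def _transform(self, table):
        return {name: table[name] for name in self.params["columns"]}


class FunctionTransform(TableTransform):
    """Function route: the behaviour is supplied as a plain function."""

    def _transform(self, table):
        func = self.params["transform"]
        return func(table, **self.params.get("transform_params", {}))


def pick_columns(table, columns):
    """Standalone function doing the same job as PickColumns._transform."""
    return {name: table[name] for name in columns}


table = {"a": [1, 2], "b": [3, 4], "c": [5, 6]}
by_method = PickColumns(table, columns=["a", "c"]).read()
by_function = FunctionTransform(
    table, transform=pick_columns, transform_params={"columns": ["a", "c"]}
).read()
print(by_method == by_function)
```

The method route gives the transform a short, named entry point with its own parameters, while the function route avoids writing a new class for each one-off operation.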