schema

@schema is a function modifier that lets you specify a schema for a function's inputs and outputs. The schema can then be used for purposes such as validating data at runtime or annotating DAG visualizations.

Reference Documentation

class hamilton.function_modifiers.schema

Container class for the schema decorators. It exists purely to provide a clean API, e.g. schema.output.

static output(*fields: Tuple[str, str], target_: str | None = None) → SchemaOutput

Initializes a @schema.output decorator. This takes in a list of fields, which are tuples of the form (field_name, field_type). The field type must be one of the function_modifiers.SchemaTypes types.

Parameters:

target_ – Target node to decorate. If None, all final nodes are decorated (e.g. sinks in the subdag); otherwise only the specified node is decorated.
fields – Fields to add to the schema. Each field is a tuple of the form (field_name, field_type).
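To make the (field_name, field_type) shape concrete, here is a minimal sketch of a runtime check against such fields. It is independent of Hamilton and is not how the decorator is implemented: check_columns and ASSUMED_TYPES are hypothetical names introduced for illustration, and the type names Hamilton actually accepts are whatever function_modifiers.SchemaTypes defines.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from type-name strings to Python types. Illustrative
# only: the names Hamilton actually accepts are defined by
# function_modifiers.SchemaTypes, not by this table.
ASSUMED_TYPES: Dict[str, type] = {"int": int, "float": float, "str": str, "bool": bool}


def check_columns(columns: Dict[str, List[Any]], *fields: Tuple[str, str]) -> List[str]:
    """Return a list of mismatches between column data and the declared fields."""
    problems: List[str] = []
    for name, type_name in fields:
        if type_name not in ASSUMED_TYPES:
            problems.append(f"{name}: unknown type {type_name!r}")
        elif name not in columns:
            problems.append(f"{name}: missing column")
        # Strict type comparison, so a bool is not accepted where an int is declared.
        elif not all(type(v) is ASSUMED_TYPES[type_name] for v in columns[name]):
            problems.append(f"{name}: expected {type_name}")
    return problems


# Same variadic (field_name, field_type) tuples that schema.output takes.
fields = (("a", "int"), ("b", "float"), ("c", "str"))
ok = check_columns({"a": [1], "b": [2.0], "c": ["3"]}, *fields)
bad = check_columns({"a": [1], "b": ["2.0"]}, *fields)
print(ok)   # []
print(bad)  # ['b: expected float', 'c: missing column']
```

The fields are passed as separate positional arguments rather than as a single list, mirroring the *fields signature above.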

This is implemented using tags, but that might change. Thus you should not rely on the tags created by this decorator (which is why they are prefixed with internal).

To use this, decorate the function that defines the node with @schema.output.

Example usage:

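import pandas as pd

from hamilton.function_modifiers import schema

@schema.output(
    ("a", "int"),
    ("b", "float"),
    ("c", "str"),
)
def example_schema() -> pd.DataFrame:
    # Columns correspond to the fields declared in the decorator above.
    return pd.DataFrame.from_records({"a": [1], "b": [2.0], "c": ["3"]})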

Then, when drawing the DAG, the schema will be displayed as sub-elements in the node for the DAG (if display_schema is selected).