numpydantic/docs/syntax.md
sneakers-the-rat f482f07be9
1.2.0 - shape ranges
update docs, bump version
2024-06-14 22:56:28 -07:00

3.5 KiB

Syntax

General form:

field: NDArray[Shape["{shape_expression}"], dtype]

Dtype

Dtype checking is for the most part as simple as an isinstance check - the dtype attribute of the array is checked against the dtype provided in the NDArray annotation. Both numpy and builtin python types can be used.

A tuple of types can also be passed:

field: NDArray[Shape["2, 3"], (np.int8, np.uint8)]

Like nptyping, the {mod}~numpydantic.dtype module provides convenient access and aliases to the common dtypes, but also provides "generic" dtypes like {class}~numpydantic.dtype.Float that is a tuple of all subclasses of {class}numpy.floating. Numpy interprets float as being equivalent to {class}numpy.float64, and {class}numpy.floating is an abstract parent class, so "generic" tuple dtypes fill that narrow gap.

Future versions will support interfaces providing type maps for declaring
equality between dtypes that may be specific to that library but should be 
considered equivalent to numpy or other library's dtypes.
Future versions will also support declaring minimum or maximum precisions, 
so one might say "at least a 16-bit float" and also accept a 32-bit float.

Shape

Full documentation of nptyping's shape syntax is available in the nptyping docs, but for the sake of self-contained docs, the high points are:

Numerical Shape

A comma-separated list of integers.

For a 2-dimensional, 3 x 4-shaped array:

Shape["3, 4"]

Wildcards

Wildcards indicate a dimension can be any size

For a 2-dimensional, 3 x any-shaped array:

Shape["3, *"]

(shape-ranges)=

Ranges

Dimension sizes can also be specified as ranges1. Ranges must have no whitespace, and may use integers or wildcards. Range specifications are inclusive on both ends.

For an array whose...

  • First dimension can be of length 2, 3, or 4
  • Second dimension is 2 or greater
  • Third dimension is 4 or less
Shape["2-4, 2-*, *-4"]

Labels

Dimensions can be given labels, and in future versions these labels will be propagated to the generated JSON Schema

Shape["3 x, 4 y, 5 z"]

Arbitrary dimensions

After some specified dimensions, one can express that there can be any number of additional dimensions with an ... like

Shape["3, 4, ..."]

Any-Shaped

If dtype is also Any, one can just use

field: NDArray

If a dtype is being passed, use the '*' wildcard along with the '...'

field: NDArray[Shape['*, ...'], int]

Caveats

numpydantic currently does not support structured dtypes or {class}`numpy.recarray`
specifications like nptyping does. It will in future versions.
numpydantic also does not support the variable shape definition form like

```python
Shape['Dim, Dim']
```

where there are two dimensions of any shape as long as they are equal
because at the moment it appears impossible to express dynamic constraints
(ie. `minItems`/`maxItems` that depend on the shape of another array)
in JSON Schema. A future minor version will allow them by generating a JSON
schema with a warning that the equal shape constraint will not be represented.

See: https://github.com/orgs/json-schema-org/discussions/730


  1. This is an extension to nptyping's syntax, and so using nptyping.Shape is unsupported - use {class}numpydantic.Shape ↩︎