mirror of https://github.com/p2p-ld/numpydantic.git
synced 2024-11-14 10:44:28 +00:00

changelog, bump version, remove pdb

parent 16b0eb0542
commit 0a175d17c0

7 changed files with 95 additions and 16 deletions
@@ -2,6 +2,73 @@

## 1.*

### 1.6.*

#### 1.6.0 - 24-09-23 - Roundtrip JSON Serialization

Roundtrip JSON serialization is here - with serialization to list of lists,
as well as file references that don't require copying the whole array if
used in data modeling, control over path relativization, and stamping of
interface version for the extra provenance conscious.

Please see [serialization](./serialization.md) for narrative documentation :)

**Potentially Breaking Changes**
- See [development](./development.md) for a statement about API stability
- An additional {meth}`.Interface.deserialize` method has been added to
  {meth}`.Interface.validate` - downstream users are not intended to override the
  `validate` method, but if they have, then JSON deserialization will not work for them.
- `Interface` subclasses now require a `name` attribute, a short string identifier for that interface,
  and a `json_model` that inherits from {class}`.interface.JsonDict`. Interfaces without
  these attributes will not be able to be instantiated.
- {meth}`.Interface.to_json` is now an abstract method that all interfaces must define.

**Features**
- Roundtrip JSON serialization - by default dump to a list of list arrays, but
  support the `round_trip` keyword in `model_dump_json` for provenance-preserving dumps
- JSON Schema generation has been separated from `core_schema` generation in {class}`.NDArray`.
  Downstream interfaces can customize json schema generation without compromising ability to validate.
- All proxy classes must have an `__eq__` dunder method to compare equality -
  in proxy classes, these compare equality of arguments, since the arrays that
  are referenced on disk should be equal by definition. Direct array comparison
  should use {func}`numpy.array_equal`
- Interfaces previously couldn't be instantiated without explicit shape and dtype arguments,
  these have been given `Any` defaults.
- New {mod}`numpydantic.serialization` module to contain serialization logic.

**New Classes**
See the docstrings for descriptions of each class
- `MarkMismatchError` for when an array serialized with `mark_interface` doesn't match
  the interface that's deserializing it
- {class}`.interface.InterfaceMark`
- {class}`.interface.MarkedJson`
- {class}`.interface.JsonDict`
- {class}`.dask.DaskJsonDict`
- {class}`.hdf5.H5JsonDict`
- {class}`.numpy.NumpyJsonDict`
- {class}`.video.VideoJsonDict`
- {class}`.zarr.ZarrJsonDict`

**Bugfix**
- [`#17`](https://github.com/p2p-ld/numpydantic/issues/17) - Arrays are re-validated as lists, rather than arrays
- Some proxy classes would fail to be serialized because they lacked an `__array__` method.
  `__array__` methods have been added, and tests for coercing to an array to prevent regression.
- Some proxy classes lacked a `__name__` attribute, which caused failures to serialize
  when the `__getattr__` methods attempted to pass it through. These have been added where needed.

**Docs**
- Add statement about versioning and API stability to [development](./development.md)
- Add docs for serialization!
- Remove stranded docs from hooks and monkeypatch
- Added `myst_nb` to docs dependencies for direct rendering of code and output

**Tests**
- Marks have been added for running subsets of the tests for a given interface,
  package feature, etc.
- Tests for all the above functionality



### 1.5.*

#### 1.5.3 - 24-09-03 - Bugfix, type checking for empty HDF5 datasets

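The changelog entry above says that proxy classes compare equality of their arguments rather than of the arrays they reference. A minimal sketch of that idea, using only the standard library, might look like the following. `ArrayProxy` and its `file`, `path`, and `field` arguments are hypothetical names chosen for illustration and are not taken from numpydantic itself.

```python
from pathlib import Path
from typing import Optional


class ArrayProxy:
    """Hypothetical proxy for an array stored in a file (not a numpydantic class)."""

    def __init__(self, file: Path, path: str, field: Optional[str] = None) -> None:
        self.file = Path(file)
        self.path = path
        self.field = field

    def __eq__(self, other: object) -> bool:
        # Compare the arguments that locate the array, not the array contents:
        # two proxies with the same arguments refer to the same data on disk.
        if not isinstance(other, ArrayProxy):
            return NotImplemented
        return (self.file, self.path, self.field) == (
            other.file,
            other.path,
            other.field,
        )

    def __hash__(self) -> int:
        # Defining __eq__ disables the default hash, so hash the same key.
        return hash((self.file, self.path, self.field))


a = ArrayProxy(Path("data.h5"), "/dataset")
b = ArrayProxy(Path("data.h5"), "/dataset")
c = ArrayProxy(Path("data.h5"), "/other")

same_location = a == b
different_location = a == c
```

Comparing arguments avoids reading the array from disk just to test equality. As the changelog notes, comparing the array values themselves is a separate operation and should go through `numpy.array_equal`.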
@@ -110,14 +110,17 @@ as `int` ({class}`numpy.int64`) or `float` ({class}`numpy.float64`)
## Roundtripping

To make arrays round-trippable, use the `round_trip` argument
to {func}`~pydantic.BaseModel.model_dump_json`
to {func}`~pydantic.BaseModel.model_dump_json`.

All the following should return an equivalent array from the same
file/etc. as the source array when using
{func}`~pydantic.BaseModel.model_validate_json`.

```{code-cell}
print_json(model.model_dump_json(round_trip=True))
```

Each interface should[^notenforced] implement a dataclass that describes a
Each interface must implement a dataclass that describes a
json-able roundtrip form (see {class}`.interface.JsonDict`).

That dataclass then has a {meth}`JsonDict.is_valid` method that checks
@@ -220,12 +223,34 @@ print_json(
))
```

When an array marked with the interface is deserialized,
it short-circuits the {meth}`.Interface.match` method,
attempting to directly return the indicated interface as long as the
array dumped in `value` still satisfies that interface's {meth}`.Interface.check`
method. Arrays dumped *without* `round_trip=True` might *not* validate with
the originating model, even when marked -- e.g. an array dumped without `round_trip`
will be revalidated as a numpy array for the same reasons it is everywhere else,
since all connection to the source file is lost.

```{todo}
Currently, the version of the package the interface is from (usually `numpydantic`)
will be stored, but there is no means of resolving it on the fly.
If there is a mismatch between the marked interface description and the interface
that was matched on revalidation, a warning is emitted, but validation
attempts to proceed as normal.

This feature is for extra-verbose provenance, rather than airtight serialization
and deserialization, but PRs welcome if you would like to make it be that way.
```

```{todo}
We will also add a separate `mark_version` parameter for marking
the specific version of the relevant interface package, like `zarr`, or `numpy`,
patience.
```



## Context parameters

A reference listing of all the things that can be passed to
@@ -305,9 +330,3 @@ print_json(data)


[^normalstyle]: o ya we're posting JSON [normal style](https://normal.style)
[^notenforced]: This is only *functionally* enforced at the moment, where
a roundtrip test confirms that dtype and type are preserved,
but there is no formal test for each interface having its own serialization class

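The serialization docs above describe how a marked array short-circuits interface matching on deserialization, with a warning when the marked interface and the matched interface differ. The control flow can be sketched roughly as below. Everything here is an assumption made for illustration: the `CHECKS` registry, the interface names, the flat `{"interface": ..., "value": ...}` layout, and the `match` and `resolve` helpers are hypothetical and do not reproduce numpydantic's actual classes or JSON format.

```python
import warnings
from typing import Any, Callable, Dict

# Hypothetical registry mapping an interface name to its check function.
CHECKS: Dict[str, Callable[[Any], bool]] = {
    "reference": lambda v: isinstance(v, dict) and "file" in v,
    "list": lambda v: isinstance(v, list),
}


def match(value: Any) -> str:
    """Ordinary matching: return the first interface whose check passes."""
    for name, check in CHECKS.items():
        if check(value):
            return name
    raise ValueError("no interface matches this value")


def resolve(marked: Dict[str, Any]) -> str:
    """Pick an interface for a marked dump, preferring the marked one."""
    name = marked["interface"]
    value = marked["value"]
    check = CHECKS.get(name)
    if check is not None and check(value):
        # Short circuit: the value still satisfies the marked interface.
        return name
    # Otherwise fall back to ordinary matching and warn about the mismatch.
    matched = match(value)
    warnings.warn(f"marked interface {name!r} does not match {matched!r}")
    return matched


# A dump that kept its file reference resolves to the marked interface.
kept = resolve({"interface": "reference", "value": {"file": "data.h5"}})

# A dump that lost its file reference falls back to plain matching.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    lost = resolve({"interface": "reference", "value": [[1, 2], [3, 4]]})
```

The second call mirrors the case described in the docs where an array dumped without `round_trip` loses its connection to the source file: the mark alone is not enough to recover the original interface.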
@@ -1,6 +1,6 @@
[project]
name = "numpydantic"
version = "1.5.3"
version = "1.6.0"
description = "Type and shape validation and serialization for arbitrary array types in pydantic models"
authors = [
{name = "sneakers-the-rat", email = "sneakers-the-rat@protonmail.com"},

@@ -1,4 +1,3 @@
import pdb
import sys

import pytest

@@ -1,5 +1,3 @@
import pdb

import pytest

from typing import Union, Optional, Any

@@ -3,8 +3,6 @@ Test serialization-specific functionality that doesn't need to be
applied across every interface (use test_interface/test_interfaces for that)
"""

import pdb

import h5py
import pytest
from pathlib import Path

@@ -1,5 +1,3 @@
import pdb

import pytest

from typing import Any