Merge pull request #40 from p2p-ld/bugfix-numpy-str

Support np.str_ dtype annotations, properly check tuple dtypes
Jonny Saunders 2024-12-13 18:17:31 -08:00 committed by GitHub
commit 62f307f655
6 changed files with 40 additions and 9 deletions

@@ -1,6 +1,10 @@
 # Changelog
-## Upcoming
+## 1.*
+### 1.6.*
+#### 1.6.6 - 24-12-13
 **Bugfix**
 - [#38](https://github.com/p2p-ld/numpydantic/issues/38), [#39](https://github.com/p2p-ld/numpydantic/pull/39) -
@@ -8,6 +12,11 @@
     An additional check was added for presence of `__name__` when embedding.
   - `NDArray` types were incorrectly cached s.t. pipe-union dtypes were considered equivalent to `Union[]`
     dtypes. An additional tuple with the type of the args was added to the cache key to disambiguate them.
+- [#38](https://github.com/p2p-ld/numpydantic/issues/38), [#40](https://github.com/p2p-ld/numpydantic/pull/40) -
+  - Tuple dtypes were naively checked by just testing for whether the given dtype was contained by the tuple,
+    ignoring special cases like string type checking. Tuple dtypes are now checked recursively with the same
+    logic as all other type checking.
+  - Zarr treats `dtype=str` as numpy type `O` - added special case when validating from JSON to cast to `np.str_`
 **Testing**
 - [#39](https://github.com/p2p-ld/numpydantic/pull/39) - Test that all combinations of shapes, dtypes, and interfaces
@@ -15,10 +24,8 @@
 - [#39](https://github.com/p2p-ld/numpydantic/pull/39) - Add python 3.13 to the testing matrix.
 - [#39](https://github.com/p2p-ld/numpydantic/pull/39) - Add an additional `marks` field to ValidationCase
   for finer-grained control over running tests.
-## 1.*
-### 1.6.*
+- [#40](https://github.com/p2p-ld/numpydantic/pull/40) - Explicitly test for `np.str_` annotation dtypes alone and
+  in tuples.
 #### 1.6.5 - 24-12-04 - Bump Pydantic Minimum
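
The tuple-dtype fix described in the 1.6.6 changelog entry can be illustrated with a minimal sketch. `check_dtype` is a hypothetical, simplified stand-in written for this example, not numpydantic's `validate_dtype`, and the `np.str_` branch is an assumption about how string dtypes might be special-cased. The point is only that plain tuple membership never reaches such a branch, while recursing over the tuple members does.

```python
from typing import Any

import numpy as np


def check_dtype(dtype: Any, target: Any) -> bool:
    """Hypothetical, simplified dtype check (not the library's implementation)."""
    if target is Any:
        return True
    if isinstance(target, tuple):
        # Recurse so every member is checked with the same rules as a bare target.
        return any(check_dtype(dtype, member) for member in target)
    if target is np.str_:
        # Assumed special case: builtin str and np.str_ are interchangeable.
        return dtype in (str, np.str_)
    try:
        return issubclass(dtype, target)
    except TypeError:
        return dtype == target


target = (int, np.str_)

# Naive membership compares str against int and np.str_ directly, so it misses.
naive = str in target
# The recursive check routes str through the np.str_ special case.
recursive = check_dtype(str, target)

print(naive)      # False
print(recursive)  # True
```

This mirrors the `tuple_np_str-str` test case added in this commit, where an annotation of `(int, np.str_)` is expected to accept a `str` array.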

@@ -1,6 +1,6 @@
 [project]
 name = "numpydantic"
-version = "1.6.5"
+version = "1.6.6"
 description = "Type and shape validation and serialization for arbitrary array types in pydantic models"
 authors = [
     {name = "sneakers-the-rat", email = "sneakers-the-rat@protonmail.com"},

@@ -74,7 +74,8 @@ class ZarrJsonDict(JsonDict):
         if self.file:
             array = ZarrArrayPath(file=self.file, path=self.path)
         else:
-            array = zarr.array(self.value, dtype=self.dtype)
+            dtype = np.str_ if self.dtype == "str" else self.dtype
+            array = zarr.array(self.value, dtype=dtype)
         return array
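
A rough illustration of the dtype substitution in this hunk, using plain NumPy arrays instead of Zarr so it runs without a store. `resolve_json_dtype` is a hypothetical helper named for this sketch only; it assumes the serialized dtype arrives as the string `"str"`, as the comparison in the diff suggests.

```python
import numpy as np


def resolve_json_dtype(dtype):
    """Map the serialized dtype name "str" to np.str_; pass anything else through."""
    return np.str_ if dtype == "str" else dtype


values = ["a", "bc"]

# An object array is what the changelog says Zarr produces for dtype=str.
as_object = np.array(values, dtype=object)
# Substituting np.str_ yields a fixed-width unicode array instead.
as_str = np.array(values, dtype=resolve_json_dtype("str"))

print(as_object.dtype.kind)  # O
print(as_str.dtype.kind)     # U
```

Other dtype names pass through unchanged, so only the string case is affected.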

@@ -126,6 +126,27 @@ DTYPE_CASES = [
     ValidationCase(annotation_dtype=str, dtype=str, passes=True, id="str-str"),
     ValidationCase(annotation_dtype=str, dtype=int, passes=False, id="str-int"),
     ValidationCase(annotation_dtype=str, dtype=float, passes=False, id="str-float"),
+    ValidationCase(
+        annotation_dtype=np.str_,
+        dtype=str,
+        passes=True,
+        id="np_str-str",
+        marks={"np_str", "str"},
+    ),
+    ValidationCase(
+        annotation_dtype=np.str_,
+        dtype=np.str_,
+        passes=True,
+        id="np_str-np_str",
+        marks={"np_str", "str"},
+    ),
+    ValidationCase(
+        annotation_dtype=(int, np.str_),
+        dtype=str,
+        passes=True,
+        id="tuple_np_str-str",
+        marks={"np_str", "str", "tuple"},
+    ),
     ValidationCase(
         annotation_dtype=BasicModel, dtype=BasicModel, passes=True, id="model-model"
     ),

@@ -75,6 +75,8 @@ class HDF5Case(_HDF5MetaCase):
             data = np.array(array, dtype=dtype)
         elif dtype is str:
             data = generator.random(shape).astype(bytes)
+        elif dtype is np.str_:
+            data = generator.random(shape).astype("S32")
         elif dtype is datetime:
             data = np.empty(shape, dtype="S32")
             data.fill(datetime.now(timezone.utc).isoformat().encode("utf-8"))
@@ -106,7 +108,7 @@ class HDF5CompoundCase(_HDF5MetaCase):
         array_path = "/" + "_".join([str(s) for s in shape]) + "__" + dtype.__name__
         if array is not None:
             data = np.array(array, dtype=dtype)
-        elif dtype is str:
+        elif dtype in (str, np.str_):
             dt = np.dtype([("data", np.dtype("S10")), ("extra", "i8")])
             data = np.array([("hey", 0)] * np.prod(shape), dtype=dt).reshape(shape)
         elif dtype is datetime:

@@ -32,7 +32,7 @@ def validate_dtype(dtype: Any, target: DtypeType) -> bool:
         return True
     if isinstance(target, tuple):
-        valid = dtype in target
+        valid = any(validate_dtype(dtype, target_dt) for target_dt in target)
     elif is_union(target):
         valid = any(
             [validate_dtype(dtype, target_dt) for target_dt in get_args(target)]