Merge pull request #1 from yarikoptic/enh-codespell

Add codespell support (config, workflow to detect/not fix) and make it fix a few typos
Jonny Saunders 2024-04-17 17:05:10 -06:00 committed by GitHub
commit 661194e0a5
16 changed files with 49 additions and 19 deletions

.github/workflows/codespell.yml

@@ -0,0 +1,23 @@
# Codespell configuration is within pyproject.toml
---
name: Codespell
on:
push:
branches: [main]
pull_request:
branches: [main]
permissions:
contents: read
jobs:
codespell:
name: Check for spelling errors
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Codespell
uses: codespell-project/actions-codespell@v2


@@ -10,8 +10,8 @@
* Unclear how the schema is used if the containers contain the same information
* the [register_container_type](https://github.com/hdmf-dev/hdmf/blob/dd39b3878523c4b03f5286fc740752befd192d8b/src/hdmf/build/manager.py#L727-L736) method in hdmf's TypeMap class seems to overwrite the loaded schema???
* `__NS_CATALOG` seems to actually hold references to the schema but it doesn't seem to be used anywhere except within `__TYPE_MAP` ?
* [NWBHDF5IO](https://github.com/NeurodataWithoutBorders/pynwb/blob/dev/src/pynwb/__init__.py#L237-L238) uses `TypeMap` to greate a `BuildManager`
* Parent class [HDF5IO](https://github.com/hdmf-dev/hdmf/blob/dd39b3878523c4b03f5286fc740752befd192d8b/src/hdmf/backends/hdf5/h5tools.py#L37) then reimplements a lot of basic functionality from elsehwere
* [NWBHDF5IO](https://github.com/NeurodataWithoutBorders/pynwb/blob/dev/src/pynwb/__init__.py#L237-L238) uses `TypeMap` to create a `BuildManager`
* Parent class [HDF5IO](https://github.com/hdmf-dev/hdmf/blob/dd39b3878523c4b03f5286fc740752befd192d8b/src/hdmf/backends/hdf5/h5tools.py#L37) then reimplements a lot of basic functionality from elsewhere
* Parent-parent metaclass [HDMFIO](https://github.com/hdmf-dev/hdmf/blob/dev/src/hdmf/backends/io.py) appears to be the final writing class?
* `BuildManager.build` then [calls `TypeMap.build`](https://github.com/hdmf-dev/hdmf/blob/dd39b3878523c4b03f5286fc740752befd192d8b/src/hdmf/build/manager.py#L171) ???
* `TypeMap.build` ...
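For orientation, a rough sketch of how the chain in the notes above fits together when reading a file, using pynwb's public helpers rather than the private globals mentioned; the file name is a placeholder and exact signatures may differ between pynwb/hdmf versions:

```python
from pynwb import NWBHDF5IO, get_manager, get_type_map

type_map = get_type_map()  # TypeMap: wraps the namespace catalog / loaded schema
manager = get_manager()    # BuildManager constructed on top of that TypeMap
# NWBHDF5IO (via HDF5IO and HDMFIO) does the actual file I/O; BuildManager.build /
# TypeMap.build translate between Builders and container classes
with NWBHDF5IO("example.nwb", "r", manager=manager) as io:  # placeholder path
    nwbfile = io.read()
```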


@@ -1,6 +1,6 @@
# TODO
Stuff to keep track of that might have been manually overrided that needs to be fixed pre-release
Stuff to keep track of that might have been manually overridden that needs to be fixed pre-release
- Coerce all listlike things into lists if they are passed as single elements!
- Use [fsspec](https://filesystem-spec.readthedocs.io/en/latest/index.html) to interface with DANDI!


@@ -72,7 +72,7 @@ is relatively complex, and so to use a schema extension one must also
program the python classes or mappings to python class attributes
needed to use them, configuration for getter and setter methods,
i/o routines, etc. Since schema extensions are relatively hard to make,
to accomodate heterogeneous data NWB uses `DynamicTable`s, which can be
to accommodate heterogeneous data NWB uses `DynamicTable`s, which can be
given arbitrary new columns.
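As a concrete illustration of that escape hatch, a minimal sketch using hdmf's `DynamicTable` API (table and column names here are made up):

```python
from hdmf.common import DynamicTable

# a table that grows an ad-hoc column at runtime, with no schema extension involved
table = DynamicTable(name="trials", description="demo table")
table.add_column(name="response_time", description="an arbitrary new column")
table.add_row(response_time=0.5)
```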
The loose coupling between schema and code has a few impacts:


@@ -2,4 +2,4 @@
## v0.1.0 - Package exists
thats about as much as can be said
that's about as much as can be said


@@ -243,7 +243,7 @@ class NamespacesAdapter(Adapter):
ns = ns[0]
break
else:
raise NameError(f"Couldnt find namespace {name}")
raise NameError(f"Couldn't find namespace {name}")
else:
ns = ns[0]


@@ -460,7 +460,7 @@ class NWBPydanticGenerator(PydanticGenerator):
if all([s.required for s in cls.attributes.values()]): # pragma: no cover
return self._make_npytyping_range(cls.attributes)
# otherwise we need to make permutations
# but not all permutations, because we typically just want to be able to exlude the last possible dimensions
# but not all permutations, because we typically just want to be able to exclude the last possible dimensions
# the array classes should always be well-defined where the optional dimensions are at the end, so
requireds = {k:v for k,v in cls.attributes.items() if v.required}
optionals = [(k,v) for k, v in cls.attributes.items() if not v.required]
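Illustrative sketch (not the generator's actual code) of the "drop only trailing optional dimensions" rule that comment describes: with required dims `x, y` and optional dims `z, t`, the allowed shapes are `(x, y)`, `(x, y, z)`, and `(x, y, z, t)`, but never `(x, y, t)`:

```python
def trailing_dim_variants(required: list, optional: list) -> list:
    """Dimension tuples allowed when only trailing optional dims may be omitted."""
    variants = [tuple(required)]
    for i in range(1, len(optional) + 1):
        variants.append(tuple(required + optional[:i]))
    return variants

print(trailing_dim_variants(["x", "y"], ["z", "t"]))
# [('x', 'y'), ('x', 'y', 'z'), ('x', 'y', 'z', 't')]
```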


@@ -311,7 +311,7 @@ def truncate_file(source: Path, target: Optional[Path] = None, n:int=10) -> Path
try:
obj.resize(n, axis=0)
except TypeError:
# contiguous arrays cant be trivially resized, so we have to copy and create a new dataset
# contiguous arrays can't be trivially resized, so we have to copy and create a new dataset
tmp_name = obj.name + '__tmp'
original_name = obj.name
obj.parent.move(obj.name, tmp_name)
@@ -326,7 +326,7 @@ def truncate_file(source: Path, target: Optional[Path] = None, n:int=10) -> Path
# use h5repack to actually remove the items from the dataset
if shutil.which('h5repack') is None:
warnings.warn('Truncated file made, but since h5repack not found in path, file wont be any smaller')
warnings.warn('Truncated file made, but since h5repack not found in path, file won't be any smaller')
return target
print('Repacking hdf5...')
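A self-contained sketch of the copy-and-recreate workaround that comment describes (file and dataset names are placeholders): contiguous HDF5 datasets can't be resized in place, so the data is moved aside and rewritten as a chunked, resizable dataset; deleting the temporary copy alone doesn't shrink the file, which is why `h5repack` runs afterwards:

```python
import h5py

n = 10
with h5py.File("truncated.h5", "a") as h5f:  # placeholder file
    obj = h5f["data"]                        # placeholder contiguous dataset
    original_name = obj.name
    tmp_name = original_name + "__tmp"
    h5f.move(original_name, tmp_name)
    tmp = h5f[tmp_name]
    # recreate at the original path as a chunked (and therefore resizable)
    # dataset that keeps only the first n rows
    h5f.create_dataset(original_name, data=tmp[:n],
                       maxshape=(None,) + tmp.shape[1:], chunks=True)
    del h5f[tmp_name]
```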


@@ -22,7 +22,7 @@ FlatDType = EnumDefinition(
DTypeTypes = []
for nwbtype, linkmltype in flat_to_linkml.items():
# skip the dtypes that are the same as the builtin linkml types (which should alredy exist)
# skip the dtypes that are the same as the builtin linkml types (which should already exist)
# to avoid a recursion error
if linkmltype == nwbtype:
continue


@@ -123,7 +123,7 @@ class HDF5Map(Map):
priority: int = 0
"""
Within a phase, sort mapping operations from low to high priority
(maybe this should be renamed because highest priority last doesnt make a lot of sense)
(maybe this should be renamed because highest priority last doesn't make a lot of sense)
"""
@classmethod
@@ -815,7 +815,7 @@ def resolve_references(src: dict, completed: Dict[str, H5ReadResult]) -> Tuple[d
if isinstance(item, HDF5_Path):
other_item = completed.get(item, None)
if other_item is None:
errors.append(f"Couldnt find: {item}")
errors.append(f"Couldn't find: {item}")
res[path] = other_item.result
completes.append(item)


@@ -23,7 +23,7 @@ def model_from_dynamictable(group:h5py.Group, base:Optional[BaseModel] = None) -
nptype = group[col].dtype.type
if nptype == np.void:
warnings.warn(f"Cant handle numpy void type for column {col} in {group.name}")
warnings.warn(f"Can't handle numpy void type for column {col} in {group.name}")
continue
type_ = Optional[NDArray[Any, nptype]]
@@ -64,7 +64,7 @@ def dynamictable_to_model(
# # dask can't handle this, we just arrayproxy it
items[col] = NDArrayProxy(h5f_file=group.file.filename, path=group[col].name)
#else:
# warnings.warn(f"Dask cant handle object type arrays like {col} in {group.name}. Skipping")
# warnings.warn(f"Dask can't handle object type arrays like {col} in {group.name}. Skipping")
# pdb.set_trace()
# # can't auto-chunk with "object" type
# items[col] = da.from_array(group[col], chunks=-1)


@@ -189,7 +189,7 @@ class NDArrayProxy():
obj = h5f.get(self.path)
return obj[slice]
def __setitem__(self, slice, value):
raise NotImplementedError(f"Cant write into an arrayproxy yet!")
raise NotImplementedError(f"Can't write into an arrayproxy yet!")
@classmethod
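Hedged usage sketch of that proxy: the keyword arguments mirror the `NDArrayProxy(h5f_file=..., path=...)` call earlier in this diff, the file and dataset paths are placeholders, and `NDArrayProxy` is assumed to already be importable from wherever it lives in this package:

```python
proxy = NDArrayProxy(h5f_file="data.h5", path="/my_table/my_column")  # placeholder paths
chunk = proxy[0:10]  # __getitem__ opens the HDF5 file and reads only this slice
try:
    proxy[0] = 1     # __setitem__ raises: writes aren't supported yet
except NotImplementedError:
    pass
```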


@@ -16,7 +16,7 @@ def test_build_base(nwb_schema):
assert len(base.classes) == 1
img = base.classes[0]
assert img.name == "Image"
# no parent class, tree_root shoudl be true
# no parent class, tree_root should be true
assert img.tree_root
assert len(img.attributes) == 3


@@ -24,7 +24,7 @@ def test_make_dynamictable(data_dir, dataset):
model = model_from_dynamictable(group)
data = dynamictable_to_model(group, model)
ser = data.model_dump_json()
_ = data.model_dump_json()
end_time = time.time()
total_time = end_time - start_time


@@ -340,12 +340,12 @@ groups:
each point in time is assumed to be 2-D (has only x & y dimensions).'
groups:
- neurodata_type_inc: CorrectedImageStack
doc: Reuslts from motion correction of an image stack.
doc: Results from motion correction of an image stack.
quantity: '+'
- neurodata_type_def: CorrectedImageStack
neurodata_type_inc: NWBDataInterface
doc: Reuslts from motion correction of an image stack.
doc: Results from motion correction of an image stack.
groups:
- name: corrected
neurodata_type_inc: ImageSeries


@@ -30,3 +30,10 @@ ipywidgets = "^8.1.1"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
[tool.codespell]
# Ref: https://github.com/codespell-project/codespell#using-a-config-file
skip = '.git*,*.lock,*.css,./nwb_linkml/src/nwb_linkml/models,./nwb_linkml/src/nwb_linkml/schema'
check-hidden = true
# ignore-regex = ''
# ignore-words-list = ''