Store function metadata in a machine readable format #49
Comments
That makes sense to me. I wanted to highlight one of the existing JSON formats I am using for python-record-api. Minimal example generated from this file: https://github.com/data-apis/python-record-api/blob/master/data/api/sample-usage.json It is specified/documented as pydantic models, which make it easy to serialize/deserialize between Python and JSON: https://github.com/data-apis/python-record-api/blob/006faf0bba9cd4cb55fbacc13d2bbda365f5bf0b/record_api/apis.py#L69 For the "leaf nodes" of actual types I also built some pydantic models for different kinds of types: https://github.com/data-apis/python-record-api/blob/006faf0bba9cd4cb55fbacc13d2bbda365f5bf0b/record_api/type_analysis.py#L74. Normal Python instances can just be saved with their type names, and it has special handling for different generic types (like lists, tuples, etc.) and literal types (strings).
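As a rough illustration of the serialize/deserialize round trip described above, here is a minimal sketch. It uses the standard library's `dataclasses` and `json` instead of pydantic so that it runs without extra dependencies, and the `Parameter` and `FunctionMetadata` classes and their fields are hypothetical. They are not the actual `record_api` models.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    # Hypothetical fields; types are stored as plain type-name strings.
    name: str
    type: str
    kind: str = "positional_or_keyword"
    default: Optional[str] = None


@dataclass
class FunctionMetadata:
    # Hypothetical record for one function.
    name: str
    parameters: List[Parameter] = field(default_factory=list)
    returns: str = "array"
    notes: str = ""  # room for plain-text notes

    def to_json(self) -> str:
        # asdict() recurses into the nested Parameter dataclasses.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "FunctionMetadata":
        raw = json.loads(text)
        # Rebuild the nested dataclasses from plain dicts.
        raw["parameters"] = [Parameter(**p) for p in raw["parameters"]]
        return cls(**raw)


add = FunctionMetadata(
    name="add",
    parameters=[
        Parameter("x1", "array", kind="positional_only"),
        Parameter("x2", "array", kind="positional_only"),
    ],
)
round_tripped = FunctionMetadata.from_json(add.to_json())
```

With pydantic the hand-written `to_json`/`from_json` pair would not be needed, since the models handle validation and nested deserialization themselves.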
I don't want to get bogged down in a metaconversation on the "right" way to specify types for array functions. Any specification is fine, as long as it is machine readable. We could consider the JSON as an internal document and not part of the actual spec (i.e., the schema could change between minor spec versions). Some sorts of things that I could imagine wanting to parse here for the tests are:
If you already have some thoughts on the right way to specify these sorts of things, that's great, and we should use it. But I don't want to wait on a meta decision on how to specify types. My main motivation here is to make it so I can generate as much of the test suite automatically from the spec as possible, so that it's easier to keep them in sync.
It's also fine if we can't represent some corner cases, at least to begin with. For example, we might not be able to represent valid shapes for something like
…lementwise functions (#52)
* Use a "special values" header before the list of special values for elementwise functions
  This will make these easier to parse from the test suite, barring something like #49.
* Fix some inconsistencies in the wording of the special values listings
* Fix more wording inconsistencies in the special value listings
I am revisiting this issue because I have run into a similar need. In parallel with the need to update docstrings (#180), we also need this metadata to populate, say, the TOC of a doc page. Currently in CuPy I am using
I should mention that in the test suite I am parsing parts of the spec and populating some function stubs: https://github.com/data-apis/array-api-tests/tree/master/array_api_tests/function_stubs. Feel free to reuse these for your implementation, or use them to extract a manual list of functions. The dictionaries at the top of test_type_promotion.py may also be useful if you plan to restrict input dtypes like the NumPy implementation does (although it should be clear that implementations do not need to be minimal like this; we did so for the NumPy one because it is a reference implementation, but dtype restrictions are not required by the spec).
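For readers who have not looked at the stubs, the idea of populating stub modules from extracted signatures can be sketched roughly as follows. This is not the generator used in array-api-tests; `render_stub`, `render_module`, and the exact stub layout are assumptions for illustration.

```python
def render_stub(name, signature):
    """Render one empty function definition from a name and a signature string."""
    return f"def {name}{signature}:\n    pass\n"


def render_module(functions):
    """Render a stub module (as source text) from (name, signature) pairs."""
    names = [name for name, _ in functions]
    body = "\n".join(render_stub(name, sig) for name, sig in functions)
    return f"{body}\n__all__ = {names!r}\n"


source = render_module(
    [
        ("add", "(x1, x2, /)"),
        ("sum", "(x, /, *, axis=None, keepdims=False)"),
    ]
)

# Executing the generated source confirms it is valid Python and gives
# objects that tools such as inspect.signature() can examine.
namespace = {}
exec(source, namespace)
```

A list of function names then falls out of `namespace["__all__"]`, which is the kind of information a doc TOC or a manual function list needs.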
Things that it would be useful to have structured data for:
We already have effectively structured data for the signatures and type annotations. Like I said, there should also be room for plain-text notes, as there will always be things that don't fit into the existing schemes, and we also want the ability to add things like motivations and implementation notes.
It would be useful for the test suite to have the function metadata stored in a machine readable format. Currently I am parsing the function signatures from the spec files using some regular expressions, and I will probably end up parsing some other information such as types as well. This works fine for now, but it would be cleaner if this data were stored in a machine readable format, say in JSON, and the relevant parts of the spec documents generated from that automatically.
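For context, the regular-expression approach mentioned above might look something like the sketch below. It assumes each function is introduced by a Markdown heading of the form `### name(params)`; that layout, the `HEADING` pattern, and `extract_signatures` are assumptions for this example rather than the test suite's actual code.

```python
import json
import re

# Assumed excerpt layout: one Markdown heading per function signature.
SPEC_EXCERPT = """\
### add(x1, x2, /)

Calculates the sum for each element of x1 with the respective element of x2.

### sum(x, /, *, axis=None, keepdims=False)

Calculates the sum of the input array x.
"""

# One or more '#', a function name, then a parenthesized parameter list.
HEADING = re.compile(
    r"^#+\s*(?P<name>[A-Za-z_]\w*)\((?P<params>[^)]*)\)\s*$",
    re.MULTILINE,
)


def extract_signatures(markdown):
    """Return one JSON-serializable record per signature heading."""
    records = []
    for match in HEADING.finditer(markdown):
        # Naive split: breaks on defaults that contain commas or parentheses.
        params = [p.strip() for p in match.group("params").split(",") if p.strip()]
        records.append({"name": match.group("name"), "parameters": params})
    return records


records = extract_signatures(SPEC_EXCERPT)
print(json.dumps(records, indent=2))
```

The fragility noted in the comment (nested parentheses, commas inside defaults) is one reason to prefer storing the data as JSON and generating the Markdown from it, rather than the reverse.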
To be sure, not everything in the spec needs to be in JSON, just the parts that will need to be extracted for other things as well, such as the test suite. There should still be a lot of plain English descriptions of behavior.
This is likely too much work for version 1 given that we already have things inline in the Markdown, but it's something to consider for future iterations.