PR: Add finfo and iinfo in the spec #129

Merged on Feb 21, 2021 (4 commits)

steff456
Member

This PR

  • Adds specification for finfo, which defines the machine limits for floating point types
  • Adds specification for iinfo, which defines the machine limits for integer types
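
For illustration, the kind of limits these objects report can be sketched in plain Python. This is only a sketch under assumptions: `int_limits` is a hypothetical helper (not part of the spec or of any library) and assumes two's-complement integers. The float values come from the standard library's `sys.float_info`, which describes Python's own `float` (IEEE 754 binary64 on common platforms).

```python
import sys


def int_limits(bits, signed=True):
    """Return (min, max) for an integer type of the given width.

    Hypothetical helper for illustration; assumes two's-complement storage.
    """
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


# Integer limits, the kind of values iinfo exposes as min and max.
print(int_limits(8))                 # (-128, 127)
print(int_limits(16, signed=False))  # (0, 65535)

# Limits of Python's float, analogous to the max and eps attributes of NumPy's finfo.
print(sys.float_info.max)
print(sys.float_info.epsilon)
```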

Notes

  • For the finfo class, only the minimal common set of attributes is specified, which is fewer than the attributes currently supported by NumPy, CuPy, and JAX.

steff456 requested review from rgommers and kgryte February 17, 2021 05:09
steff456 self-assigned this Feb 17, 2021
Member

rgommers left a comment

Thanks @steff456. Overall looks good, I propose a few simplifications.

Also noticed:

>>> type(tf.experimental.numpy.finfo(tf.float64).dtype)
<class 'numpy.dtype'>
>>> tf.experimental.numpy.finfo?
...
Note that currently it just forwards to the numpy namesake, while
  tensorflow and numpy dtypes may have different properties.

hmm ....

- The largest representable number.
- **min**: _float_
- The smallest representable number.
- **tiny**: _float_
Member

@kgryte do you happen to know if this is guaranteed to be the same across implementations, i.e., whether it's part of IEEE 754 somehow?

Member Author

This is what the NumPy docs have in their notes:

Note that tiny is not actually the smallest positive representable value in a NumPy floating point type. As in the IEEE-754 standard [1], NumPy floating point types make use of subnormal numbers to fill the gap between 0 and tiny. However, subnormal numbers may have significantly reduced precision [2].
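
To make that note concrete, here is a small sketch using only the Python standard library. It relies on the assumption that Python's `float` is IEEE 754 binary64, in which case `sys.float_info.min` is the same value NumPy reports as `finfo(float64).tiny`; `math.ulp` requires Python 3.9+.

```python
import math
import sys

# Smallest positive *normal* float64 (the value NumPy calls "tiny").
tiny = sys.float_info.min

# Smallest positive subnormal float64.
smallest_subnormal = math.ulp(0.0)

# On platforms that do not flush subnormals to zero, values below tiny
# are still representable.
below_tiny = tiny / 4

# Relative spacing (ulp / value) shows the precision loss: 2**-52 at tiny,
# but far coarser deep in the subnormal range.
rel_precision_at_tiny = math.ulp(tiny) / tiny
x = 3 * smallest_subnormal
rel_precision_subnormal = math.ulp(x) / x

print(tiny, smallest_subnormal, below_tiny)
print(rel_precision_at_tiny, rel_precision_subnormal)
```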

Contributor

No. While subnormal numbers are part of IEEE 754, support for them is less common, especially among older-generation GPUs. Subnormal numbers often trigger slow paths and, if not implemented in hardware, software emulation. More recent NVIDIA GPUs do support subnormal numbers, and on ARM processors one can flush subnormal numbers to zero.

Based on the NumPy docs (via @steff456), tiny refers to the smallest normal number. We should probably be explicit about what we mean by "representable number".

Member

Interesting. So the description is a bit off here. I'm not sure it should be part of the standard. I can imagine it may be useful in a few corner cases, but I've never seen it used in the wild.

Let's see if anyone else has an opinion.

Member

Okay, so there are use cases at the compiled code level at least.

Contributor

I suppose having this info is useful for meta-inspection of platform capabilities, for example if I want to know whether a platform supports subnormals. Without such metadata, a user would need to compute a quantity that should resolve to a subnormal number and manually check whether the value is zero.
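
That manual check might look roughly like the following. It is a sketch only: `supports_subnormals` is a hypothetical helper name, and it probes Python's own `float` (via `sys.float_info.min`, the smallest normal float64) rather than any array library's dtype.

```python
import sys


def supports_subnormals(smallest_normal):
    """Return True if halving the smallest normal number stays nonzero.

    Hypothetical helper: the exact result is representable only as a
    subnormal, so a platform that flushes subnormals to zero yields 0.0.
    """
    return smallest_normal / 2 != 0.0


print(supports_subnormals(sys.float_info.min))
```

An array library would have to run the same computation on its own device and dtype, since the host CPU and an accelerator may behave differently.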

Contributor

Correct. It can also come up in user code, e.g., in the computation of transcendental functions: if you know that a platform does not support subnormals, you can avoid various branches, which are often slower.

Contributor

I think if we wanted to be complete, instead of "tiny", we'd have smallest_normal and smallest_subnormal. If the latter is 0, then a platform does not support subnormal numbers.
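
A rough sketch of how that pair of attributes could be consumed. `FloatLimits` and `has_subnormals` are hypothetical names made up for this example; the only names taken from the proposal are `smallest_normal` and `smallest_subnormal`. The float64 values come from the standard library (`math.ulp` needs Python 3.9+).

```python
import math
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatLimits:
    """Hypothetical container; not part of the spec or any library."""

    smallest_normal: float
    smallest_subnormal: float  # 0.0 signals "no subnormal support"

    @property
    def has_subnormals(self):
        return self.smallest_subnormal != 0.0


# IEEE 754 binary64 on a platform that supports subnormals.
with_subnormals = FloatLimits(sys.float_info.min, math.ulp(0.0))

# Hypothetical platform that flushes subnormals to zero.
flush_to_zero = FloatLimits(sys.float_info.min, 0.0)

print(with_subnormals.has_subnormals)
print(flush_to_zero.has_subnormals)
```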

Member

Those do seem like better names indeed. TF and CuPy just alias numpy.finfo, which may be wrong. And JAX implements it, but its docs say "The smallest positive usable number", which is quite uninformative.

I think we can add the new names, but then we should make a PR to NumPy to see if that's accepted. And otherwise we should probably drop it completely.

Contributor

kgryte left a comment

Thanks @steff456! Added some minor style and punctuation touch-ups to maintain consistency with the rest of the spec. Otherwise, looks good!

Member

rgommers left a comment

LGTM modulo my one small comment. I'll push a fix for that and merge.

Let's open a separate issue to keep track of the subnormal number attributes.

## Objects in API

(finfo)=
### finfo(type)
Member

type should be a positional-only argument here, so this should be finfo(type, /).

And same for iinfo.
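
For reference, this is what the `/` marker does in Python 3.8+. The function body below is a stub (an assumption for illustration, not the specified behavior); only the signature matters.

```python
def finfo(type, /):
    """Stub with a positional-only parameter; the return value is made up."""
    return f"finfo({type.__name__})"


# A positional call is accepted.
print(finfo(float))

# A keyword call is rejected because `type` is positional-only.
try:
    finfo(type=float)
except TypeError as exc:
    print("TypeError:", exc)
```

Making the parameter positional-only means its name is not part of the public interface, so implementations stay free to name it differently.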

rgommers merged commit c6d5918 into data-apis:main Feb 21, 2021
@rgommers
Member

Merged, thanks @steff456! And thanks @kgryte for reviewing.

Member

honno commented Jan 21, 2022

Currently dask.array returns 0d arrays instead of Python scalars for the attributes of these info objects. I don't imagine that's a deliberate choice, just a vestige of copying NumPy proper, and subject to change once more effort has been made on adoption. But it's interesting to consider whether this could be a problem when these libraries start seriously adopting the spec, as 0d arrays have conveniences such as representing >64-bit elements. And the spec might want to support such elements in the future too (though that seems to be limited by CUDA?).

Currently numpy.array_api, cupy.array_api and PyTorch return Python scalars. Not sure about the rest.
