Default acc_dtype to floatX #655

Closed · wants to merge 6 commits

Conversation

@tvwenger (Contributor) commented Mar 2, 2024

Description

This PR changes the default dtype of the internal accumulator (acc_dtype) to config.floatX when the input dtype is a float type. Previously, the default always upcast the accumulator (e.g., float32 inputs accumulated in float64), so the graph contained higher-precision floats even when the user requested floatX = 'float32'.
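
For reference, a sketch of the default accumulator mapping this PR targets, mirroring the _acc_dtype logic quoted in the traceback later in this thread. The "before" float mappings in the comments are my reading of the previous defaults, not copied from the diff:

    from pytensor import config

    def default_acc_dtype(idtype: str) -> str:
        # Sketch of the default accumulator dtype chosen when acc_dtype is None.
        # Before this PR (assumed): float16 -> "float32", float32 -> "float64".
        # After this PR: float16/float32 -> config.floatX, so a float32 graph
        # stays float32 when floatX == "float32".
        return dict(
            bool="int64",
            int8="int64",
            int16="int64",
            int32="int64",
            uint8="uint64",
            uint16="uint64",
            uint32="uint64",
            float16=config.floatX,
            float32=config.floatX,
            complex64="complex128",
        ).get(idtype, idtype)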

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

@tvwenger (Contributor, Author) commented Mar 2, 2024

Some of the tests are failing: TestRepeat and TestSubtensor. I can see where the logic is failing based on the test error:


self = CAReduce(scalar_op=Composite{(i0 + Cast{float32}(i1))},axis=(0,),dtype=float32,acc_dtype=float32,upcast_discrete_output=True)
idtype = 'float64'

    def _acc_dtype(self, idtype):
        acc_dtype = self.acc_dtype
        if acc_dtype is None:
            return dict(
                bool="int64",
                int8="int64",
                int16="int64",
                int32="int64",
                uint8="uint64",
                uint16="uint64",
                uint32="uint64",
                float16=config.floatX,
                float32=config.floatX,
                complex64="complex128",
            ).get(idtype, idtype)
        elif acc_dtype in continuous_dtypes and idtype in discrete_dtypes:
            # Specifying a continuous accumulator for discrete input is OK
            return acc_dtype
        else:
            # The conversion has to be considered an upcast.
            upcasted_dtype = upcast(idtype, acc_dtype)
            if acc_dtype != upcasted_dtype:
>               raise TypeError(
                    f"Cannot build {self} node with input dtype {idtype} "
                    f"and acc_dtype {acc_dtype}, as precision would be lost. "
                    "To correct this error, you can:\n"
                    "  - not specify acc_dtype, or\n"
                    f"  - use an acc_dtype at least as precise as {upcasted_dtype}.\n"
                    '  - specify "dtype" instead of "acc_dtype", so '
                    "the reduction will be precise, but the result will "
                    'be casted into "dtype" at the end.\n'
                    "If you are expecting the precision loss, you can "
                    f'use tensor.cast(..., dtype="{acc_dtype}"), on your input.'
                )
E               TypeError: Cannot build CAReduce{Composite{(i0 + Cast{float32}(i1))}, axis=0} node with input dtype float64 and acc_dtype float32, as precision would be lost. To correct this error, you can:
E                 - not specify acc_dtype, or
E                 - use an acc_dtype at least as precise as float64.
E                 - specify "dtype" instead of "acc_dtype", so the reduction will be precise, but the result will be casted into "dtype" at the end.
E               If you are expecting the precision loss, you can use tensor.cast(..., dtype="float32"), on your input.

Although CAReduce is given the correct dtype and acc_dtype, the _acc_dtype method that raises the exception isn't checking against the passed dtype but against idtype, the dtype of the input to the reduction: (i0 + Cast{float32}(i1)). That input must be resolving to float64.
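
For illustration, a minimal sketch of why the check fires here, assuming upcast can be imported from pytensor.scalar as in the module above:

    from pytensor.scalar import upcast  # assumed import path

    idtype, acc_dtype = "float64", "float32"
    upcasted = upcast(idtype, acc_dtype)
    print(upcasted)               # "float64"
    assert upcasted != acc_dtype  # this mismatch is what raises the TypeError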

@ricardoV94 (Member) commented Mar 2, 2024

Yes, I think the function can't and shouldn't downcast the accumulation phase. If the input is float64, the same dtype should be used for accumulation by default. The float32 dtype you're talking about is the output type.

Why did it choose float32 for the accumulator?

@tvwenger (Contributor, Author) commented Mar 2, 2024

@ricardoV94

Yes I think the function can't and shouldn't downcast the accumulation phase.

I agree, but I think acc_dtype should default to at most floatX in order to resolve the MWE failure you posted.

If the input is float64 the same should be used for accumulation by default.

I also agree, but the failing tests run with config.floatX = "float32", so I think the problem here is that the input dtype to CAReduce is float64 for some reason.

The float32 dtype you're talking is the output type.

I see. For CAReduce, dtype is the output type and acc_dtype is the accumulator dtype; the failure happens because the input dtype is float64, so _acc_dtype tries to upcast acc_dtype to float64.
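
As a sketch of that distinction (assuming the dtype/acc_dtype keywords on pt.sum, which builds a CAReduce under the hood):

    import pytensor.tensor as pt

    x = pt.vector("x", dtype="float32")
    # Accumulate in float64 for precision, but cast the result to float32:
    s = pt.sum(x, acc_dtype="float64", dtype="float32")
    print(s.dtype)  # "float32" -- the output dtype, independent of acc_dtype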

Why did it choose float32 for the accumulator?

Presumably because we're using config.floatX = "float32" for these tests.

@tvwenger (Contributor, Author) commented Mar 2, 2024

@ricardoV94 I found out why CAReduce had a float64 input despite floatX="float32": verify_grad in pytensor/gradient.py casts the random projection to float64 regardless of floatX, unless cast_to_output_type=True. I've updated the offending tests, so we'll see what happens.
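
For context, a sketch of the call pattern involved (hypothetical values; this is not the PR's change, just an illustration of the flag mentioned above):

    import numpy as np
    from pytensor.gradient import verify_grad

    x_val = np.asarray([0.1, 0.2, 0.3], dtype="float32")
    verify_grad(
        lambda x: x.sum(),
        [x_val],
        rng=np.random.default_rng(42),  # assumption: a Generator is accepted; older versions may want np.random.RandomState
        cast_to_output_type=True,       # keep the random projection in the output dtype
    )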

I'd like to add a regression test using the MWE you provided. If you could point me to a good place to include such a test, I would be grateful!

@tvwenger (Contributor, Author) commented Mar 2, 2024

There were more offending tests, so I changed the default in verify_grad instead.

@ricardoV94 (Member) left a comment

Looks good, just some nitpicks.

@ricardoV94 (Member) commented:

For the regression test, probably in the same file where the rewrite is being tested

@tvwenger (Contributor, Author) commented Mar 2, 2024

Well, I think we've reached an impasse. For some tests, unless we upcast the accumulator to float64 (as is currently implemented), we lose precision and some of the gradient and precision tests fail. Consider the following test:

    def test_mean_precision(self):
        # Check that the default accumulator precision is sufficient
        x = shared(np.asarray([1e8, 1, -1e8], dtype="float32"))
        m = x.mean()
        f = function([], m)
        m_val = f()
        assert np.allclose(m_val, 1.0 / 3)

The only way this test will succeed is if the internal accumulator uses float64 despite the input and output types being float32.
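
A quick NumPy illustration of the precision point (not part of the test suite):

    import numpy as np

    vals = np.asarray([1e8, 1, -1e8], dtype="float32")

    # Accumulating in float32: 1e8 + 1 rounds back to 1e8, so the sum is 0.0.
    print(vals.sum(dtype="float32") / 3)  # 0.0

    # Accumulating in float64 keeps the intermediate +1, recovering 1/3.
    print(vals.sum(dtype="float64") / 3)  # 0.3333...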

@tvwenger (Contributor, Author) commented Mar 3, 2024

Closed in favor of #656
