Default acc_dtype to floatX #655

Closed · wants to merge 6 commits

Conversation

@tvwenger (Contributor) commented Mar 2, 2024

Description

This PR changes the default dtype of the internal accumulator (acc_dtype) to config.floatX when the input dtype is a float type. Previously, the default always upcast the accumulator (e.g., float32 inputs accumulated in float64), so the graph contained higher-precision floats even when the user requested floatX = 'float32'.
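
For reference, a sketch of the default accumulator mapping this PR targets, mirroring the _acc_dtype logic quoted in the traceback later in this thread. The "before" float mappings in the comments are my reading of the previous defaults, not copied from the diff:

    from pytensor import config

    def default_acc_dtype(idtype: str) -> str:
        # Sketch of the default accumulator dtype chosen when acc_dtype is None.
        # Before this PR (assumed): float16 -> "float32", float32 -> "float64".
        # After this PR: float16/float32 -> config.floatX, so a float32 graph
        # stays float32 when floatX == "float32".
        return dict(
            bool="int64",
            int8="int64",
            int16="int64",
            int32="int64",
            uint8="uint64",
            uint16="uint64",
            uint32="uint64",
            float16=config.floatX,
            float32=config.floatX,
            complex64="complex128",
        ).get(idtype, idtype)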

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

@tvwenger (Contributor, Author) commented Mar 2, 2024

Some of the tests are failing: TestRepeat and TestSubtensor. I can see where the logic is failing based on the test error:


self = CAReduce(scalar_op=Composite{(i0 + Cast{float32}(i1))},axis=(0,),dtype=float32,acc_dtype=float32,upcast_discrete_output=True)
idtype = 'float64'

    def _acc_dtype(self, idtype):
        acc_dtype = self.acc_dtype
        if acc_dtype is None:
            return dict(
                bool="int64",
                int8="int64",
                int16="int64",
                int32="int64",
                uint8="uint64",
                uint16="uint64",
                uint32="uint64",
                float16=config.floatX,
                float32=config.floatX,
                complex64="complex128",
            ).get(idtype, idtype)
        elif acc_dtype in continuous_dtypes and idtype in discrete_dtypes:
            # Specifying a continuous accumulator for discrete input is OK
            return acc_dtype
        else:
            # The conversion has to be considered an upcast.
            upcasted_dtype = upcast(idtype, acc_dtype)
            if acc_dtype != upcasted_dtype:
>               raise TypeError(
                    f"Cannot build {self} node with input dtype {idtype} "
                    f"and acc_dtype {acc_dtype}, as precision would be lost. "
                    "To correct this error, you can:\n"
                    "  - not specify acc_dtype, or\n"
                    f"  - use an acc_dtype at least as precise as {upcasted_dtype}.\n"
                    '  - specify "dtype" instead of "acc_dtype", so '
                    "the reduction will be precise, but the result will "
                    'be casted into "dtype" at the end.\n'
                    "If you are expecting the precision loss, you can "
                    f'use tensor.cast(..., dtype="{acc_dtype}"), on your input.'
                )
E               TypeError: Cannot build CAReduce{Composite{(i0 + Cast{float32}(i1))}, axis=0} node with input dtype float64 and acc_dtype float32, as precision would be lost. To correct this error, you can:
E                 - not specify acc_dtype, or
E                 - use an acc_dtype at least as precise as float64.
E                 - specify "dtype" instead of "acc_dtype", so the reduction will be precise, but the result will be casted into "dtype" at the end.
E               If you are expecting the precision loss, you can use tensor.cast(..., dtype="float32"), on your input.

Although CAReduce is given the correct dtype and acc_dtype, the _acc_dtype method that raises the exception isn't checking against the passed dtype but against idtype, the dtype of the input to the reduction: (i0 + Cast{float32}(i1)). That input must be resolving to float64.
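
For illustration, a minimal sketch of why the check fires here, assuming upcast can be imported from pytensor.scalar as in the module above:

    from pytensor.scalar import upcast  # assumed import path

    idtype, acc_dtype = "float64", "float32"
    upcasted = upcast(idtype, acc_dtype)
    print(upcasted)               # "float64"
    assert upcasted != acc_dtype  # this mismatch is what raises the TypeError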

@ricardoV94 (Member) commented Mar 2, 2024

Yes, I think the function can't and shouldn't downcast the accumulation phase. If the input is float64, the same dtype should be used for accumulation by default. The float32 dtype you're talking about is the output type.

Why did it choose float32 for the accumulator?

@tvwenger (Contributor, Author) commented Mar 2, 2024

@ricardoV94

Yes I think the function can't and shouldn't downcast the accumulation phase.

I agree, but I think acc_dtype should default to at most floatX in order to resolve the MWE failure you posted.

If the input is float64 the same should be used for accumulation by default.

I also agree, but the failing tests run with config.floatX = "float32", so I think the problem here is that the input dtype to CAReduce is float64 for some reason.

The float32 dtype you're talking is the output type.

I see. For CAReduce, dtype is the output type and acc_dtype is the accumulator dtype; the failure happens because the input dtype is float64, so _acc_dtype tries to upcast acc_dtype to float64.
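
As a sketch of that distinction (assuming the dtype/acc_dtype keywords on pt.sum, which builds a CAReduce under the hood):

    import pytensor.tensor as pt

    x = pt.vector("x", dtype="float32")
    # Accumulate in float64 for precision, but cast the result to float32:
    s = pt.sum(x, acc_dtype="float64", dtype="float32")
    print(s.dtype)  # "float32" -- the output dtype, independent of acc_dtype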

Why did it choose float32 for the accumulator?

Presumably because we're using config.floatX = "float32" for these tests.

@tvwenger (Contributor, Author) commented Mar 2, 2024

@ricardoV94 I found out why CAReduce had a float64 input despite floatX="float32": verify_grad in pytensor/gradient.py casts the random projection to float64 regardless of floatX, unless cast_to_output_type=True. I've updated the offending tests, so we'll see what happens.
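
For context, a sketch of the call pattern involved (hypothetical values; this is not the PR's change, just an illustration of the flag mentioned above):

    import numpy as np
    from pytensor.gradient import verify_grad

    x_val = np.asarray([0.1, 0.2, 0.3], dtype="float32")
    verify_grad(
        lambda x: x.sum(),
        [x_val],
        rng=np.random.default_rng(42),  # assumption: a Generator is accepted; older versions may want np.random.RandomState
        cast_to_output_type=True,       # keep the random projection in the output dtype
    )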

I'd like to add a regression test using the MWE you provided. If you could point me to a good place to include such a test, I would be grateful!

@tvwenger (Contributor, Author) commented Mar 2, 2024

There were more offending tests, so I changed the default in verify_grad instead.

@ricardoV94 (Member) left a comment

Looks good, just some nitpicks.

@ricardoV94 (Member) commented:

For the regression test, probably in the same file where the rewrite is being tested

@tvwenger (Contributor, Author) commented Mar 2, 2024

Well, I think we've reached an impasse. For some tests, unless we upcast the accumulator to float64 (as is currently implemented), we lose precision and some of the gradient and precision tests fail. Consider the following test:

    def test_mean_precision(self):
        # Check that the default accumulator precision is sufficient
        x = shared(np.asarray([1e8, 1, -1e8], dtype="float32"))
        m = x.mean()
        f = function([], m)
        m_val = f()
        assert np.allclose(m_val, 1.0 / 3)

The only way this test will succeed is if the internal accumulator uses float64 despite the input and output types being float32.
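
A quick NumPy illustration of the precision point (not part of the test suite):

    import numpy as np

    vals = np.asarray([1e8, 1, -1e8], dtype="float32")

    # Accumulating in float32: 1e8 + 1 rounds back to 1e8, so the sum is 0.0.
    print(vals.sum(dtype="float32") / 3)  # 0.0

    # Accumulating in float64 keeps the intermediate +1, recovering 1/3.
    print(vals.sum(dtype="float64") / 3)  # 0.3333...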

@tvwenger (Contributor, Author) commented Mar 3, 2024

Closed in favor of #656
