Published RecordIO (-protobuf?) deserializer & serializer #1994
Labels
component: pysdk-team
Related to SageMaker Python SDK Core Issues
contributions welcome
type: documentation
type: feature request
Describe the feature you'd like
Please can we promote the existing `RecordDeserializer` and `RecordSerializer` into the proper documented serializers and deserializers packages?
MXNet RecordIO is not an easy format to read in a Python environment that you don't want to / can't install MXNet on.
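For context, the record framing itself is small enough to handle without MXNet. Below is a minimal sketch, assuming the usual RecordIO layout: a 4-byte magic number `0xCED7230A`, a 4-byte word whose low 29 bits hold the payload length, then the payload padded to a 4-byte boundary. It also assumes little-endian byte order and single-part records only. The names `write_record` and `read_records` are made up for illustration and are not the SDK's API, and decoding the payloads (for example, protobuf messages) is out of scope here.

```python
import io
import struct

# Magic number that prefixes every RecordIO frame.
RECORDIO_MAGIC = 0xCED7230A
# The low 29 bits of the second header word hold the payload length;
# the top 3 bits are a continuation flag (0 for a complete record).
LENGTH_MASK = (1 << 29) - 1


def write_record(stream, payload):
    """Append one single-part RecordIO frame to a binary stream (hypothetical helper)."""
    if len(payload) > LENGTH_MASK:
        raise ValueError("payload too large for a single frame")
    # Assumption: little-endian header words.
    stream.write(struct.pack("<II", RECORDIO_MAGIC, len(payload)))
    stream.write(payload)
    # Pad with zero bytes up to the next 4-byte boundary.
    stream.write(b"\x00" * (-len(payload) % 4))


def read_records(stream):
    """Yield the raw payload of each single-part frame (hypothetical helper)."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of stream
        if len(header) < 8:
            raise ValueError("truncated RecordIO header")
        magic, word = struct.unpack("<II", header)
        if magic != RECORDIO_MAGIC:
            raise ValueError("bad RecordIO magic number")
        if word >> 29:
            # Assumption: multi-part records are not needed for this sketch.
            raise NotImplementedError("multi-part records are not handled")
        length = word & LENGTH_MASK
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated RecordIO payload")
        stream.read(-length % 4)  # skip the padding bytes
        yield payload


# Round trip three payloads through an in-memory buffer.
buffer = io.BytesIO()
for item in (b"abc", b"", b"12345678"):
    write_record(buffer, item)
buffer.seek(0)
records = list(read_records(buffer))
```

Each payload comes back as raw bytes, so turning it into arrays or labels still needs the protobuf (or image header) handling that the existing classes provide.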
Across both training and inference use cases, I have in the past wasted a lot of time:
Imagine my surprise today, when I found a `RecordDeserializer` buried away in the not-really-documented src/amazon/common.py!
How would this feature be used? Please describe.
Moving these classes to the standard `sagemaker.serializers` and `sagemaker.deserializers` modules will greatly increase their discoverability, making it much easier for users to interact with SageMaker-provided algorithms like Semantic Segmentation, and to consider using MXNet RecordIO serializations for custom models too.
Describe alternatives you've considered
I'm not sure why they haven't been made visible in this way already?
We would also need to carefully consider what this means for the actual RecordIO and protobuf functionality these classes are based on: that'd be too much to copy over, so it would mean promoting many of these utility functions from the undocumented `amazon` area to, I guess, some top-level API?
Additional context
I discovered these classes while fixing tests for PR #1993 (adding accept/content_type constructor argument overrides for all serializers & deserializers in the SDK).
Happy to try and help with implementation if needed, but would need some guidance on where you'd like the utility functions moved to when lifted out of `.amazon`, because it seems like too much stuff to drag in to `serializers.py` and `deserializers.py`.