ENH: Add support for reading Stata 7 (non-SE) format dta files #47176
Comments
cc @bashtage
Is the 110 format documented? I think we haven't supported 110 because there are no docs.
The 110 format is documented in the book "Stata programming manual : release 7." (ISBN-10: 1881228525, ISBN-13: 9781881228523) on pages 88-94.
Anyone want to share the relevant pages, which I think should fall under fair use? I also have it available through a local library, so we'll see what can be done.
Here's an OCRed version of the text. There may be transcription errors but on the whole the text is very similar to that of other versions of the format, e.g. Stata 8.
After some discussion with Stata technical support I now have documentation for all of the missing format versions. The 103, 105, 108 and 110 documents are all based on the descriptions from the corresponding Stata manuals. The 102, 104 and 111 documents are my best guesses based on available information. I have however tested that I am able to use these format descriptions to read and write files that Stata 18 is happy to load. dta_102.txt
Is your feature request related to a problem?
If I attempt to read in a dta file saved in Stata 7 format using the read_stata() function I get the following error message:
This is unfortunate as this is the version of the format used by default by R's write.dta() function in the foreign package.
Describe the solution you'd like
It would be nice if this version of the data format was supported, at least for reading, in the same manner as the other variants.
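For anyone hitting the same error, it can help to confirm which format version a file actually uses before passing it to read_stata(). The following is a minimal sketch, assuming the older binary dta layout in which the first byte of the file holds the format version and the second byte the byte order (1 for big-endian, 2 for little-endian). The helper name peek_dta_version is hypothetical and is not part of pandas.

```python
def peek_dta_version(path):
    """Return (format_version, byte_order) for an old-style binary dta file.

    Assumption: the file uses the pre-117 layout, where byte 0 is the
    format version (e.g. 110 for Stata 7) and byte 1 is the byte order.
    """
    with open(path, "rb") as fh:
        header = fh.read(2)

    if len(header) < 2:
        raise ValueError("file is too short to be a dta file")

    # Newer dta files start with an XML-like "<stata_dta>" tag instead of
    # a raw version byte; this sketch does not try to parse those.
    if header[:1] == b"<":
        raise ValueError("tagged header (format 117 or later) is not handled here")

    version = header[0]
    # 1 means big-endian (HILO); anything else is treated as little-endian.
    byte_order = ">" if header[1] == 1 else "<"
    return version, byte_order
```

A file written by write.dta() with its default settings would be expected to report version 110 here, which matches the version named in the error.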
API breaking implications
None that I am aware of.
Describe alternatives you've considered
write.dta also supports saving to versions 6, 8 and 10 of the dta format, which are supported, so I could manually specify a different version instead. Alternatively, I could save my data using the readstata13 or haven packages, which both support more recent versions of the Stata dta format.
Another option would be to use a different data format entirely that is also supported by both R and pandas.
Additional context
It appears that to implement support for this additional variant the following two changes are required:
pandas/pandas/io/stata.py
Line 1414 in 8e94afb
To:
pandas/pandas/io/stata.py
Line 1397 in 8e94afb
To include 110 in the list of supported variants, i.e.
After making these changes locally my version 7 files appear to load correctly.
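The second change amounts to adding 110 to a list of accepted version numbers. A minimal sketch of that kind of check follows. It is not the actual pandas source: the constant and function names are hypothetical, and the other version numbers in the tuple are an assumption about which older formats the reader accepted at the time.

```python
# Hypothetical stand-in for the reader's version check, for illustration only.
# Assumption: these are the old-style format versions accepted, with 110
# (Stata 7, non-SE) added alongside 111 (Stata 7 SE).
SUPPORTED_OLD_FORMATS = (104, 105, 108, 110, 111, 113, 114, 115)


def check_format_version(version):
    """Return the version unchanged if supported, otherwise raise ValueError."""
    if version not in SUPPORTED_OLD_FORMATS:
        supported = ", ".join(str(v) for v in SUPPORTED_OLD_FORMATS)
        raise ValueError(
            f"dta format version {version} is not supported; "
            f"supported versions are {supported}"
        )
    return version
```

With 110 present in the tuple, a version 7 file passes the check. Whether the rest of the header parses correctly depends on the first change described above.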