ENH: Add a Parser to read fixed-width ASCII data using a data description file or dictionary file #7030
Comments
does |
Yes, once I have parsed the dictionary, it looks like I will be able to use But for the data files I am working with I don't think it would be possible Thanks! On Mon, May 5, 2014 at 9:21 AM, jreback [email protected] wrote:
@AllenDowney I am thinking of working on this project. Did you ever get anywhere with it? Or do you have any sample data that you would like to see treated by a function?
Yes, I have a dataset and a hacky solution that might be a good example. https://github.com/AllenDowney/MarriageNSFG In thinkstats2.py, you'll see a function called ReadStataDct that returns a I am using it to read data from the NSFG; the data is also in the repo. Please let me know if I can help. On Wed, Jun 24, 2015 at 9:00 AM, tyler-abbot [email protected]
Closing and adding to a tracker issue #30407 for IO format requests, can re-open if interest is expressed.
As an example, I would like to be able to read a dataset like this one:
http://www.cdc.gov/nchs/nsfg/nsfg_cycle6.htm
The data files themselves are ASCII with fixed-width fields. The variable names, types, and indices are in a separate data description file, available for SAS, SPSS, and Stata. I would like to add a parser that reads at least one of these description files and then parses the data file. Since two files are used, it might require changes in the Parser API.
I am happy to write a parser that reads the dictionary file and then the data file. I could use help with either setting up the new parser ahead of time or (after the fact) integrating my code with the existing structure.
Also, is there a preference for the SAS, SPSS, or Stata format?
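To make the two-step idea concrete (read the dictionary file first, then use it to slice the data file), here is a minimal standard-library sketch. It assumes a Stata-style dictionary in which each variable line looks like `_column(N) type name %WIDTHfmt "label"`; that layout, the sample contents, and the names `Field`, `parse_dct`, `convert`, and `read_fixed_width` are all illustrative assumptions, not part of pandas or of the NSFG files.

```python
import re
from collections import namedtuple
from io import StringIO

# Hypothetical record describing one fixed-width variable.
# start/end are 0-based, half-open character offsets within a data line.
Field = namedtuple("Field", "name vtype start end label")

# Assumed dictionary line layout: _column(N) type name %WIDTHfmt "label"
_DCT_LINE = re.compile(
    r'_column\((\d+)\)\s+(\S+)\s+(\S+)\s+%(\d+)\S*(?:\s+"([^"]*)")?'
)


def parse_dct(text):
    """Collect a Field for every line that matches the assumed layout."""
    fields = []
    for line in text.splitlines():
        match = _DCT_LINE.search(line)
        if match is None:
            continue  # skip the header, closing brace, and blank lines
        col, vtype, name, width, label = match.groups()
        start = int(col) - 1  # dictionary columns are 1-based
        fields.append(Field(name, vtype, start, start + int(width), label or ""))
    return fields


def convert(raw, vtype):
    """Strip padding; keep str* types as text, parse the rest as numbers."""
    value = raw.strip()
    if vtype.startswith("str"):
        return value
    if value == "":
        return None  # blank numeric field treated as missing
    try:
        return int(value)
    except ValueError:
        return float(value)


def read_fixed_width(lines, fields):
    """Slice each non-empty line by the field offsets and convert values."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        rows.append({f.name: convert(line[f.start:f.end], f.vtype) for f in fields})
    return rows


# Made-up sample inputs, for illustration only.
SAMPLE_DCT = '''infile dictionary {
    _column(1)   str12  caseid    %12s  "RESPONDENT ID NUMBER"
    _column(13)  byte   pregordr  %2f   "PREGNANCY ORDER"
    _column(15)  int    agepreg   %4f   "AGE AT PREGNANCY OUTCOME"
}
'''
SAMPLE_DATA = "000000000001 12575\n000000000002 23012\n"

fields = parse_dct(SAMPLE_DCT)
rows = read_fixed_width(StringIO(SAMPLE_DATA), fields)
print(rows[0])
```

The same offsets could instead be handed to `pandas.read_fwf`, which accepts `colspecs` as a list of half-open `(start, end)` pairs together with `names`, for example `colspecs=[(f.start, f.end) for f in fields]` and `names=[f.name for f in fields]`. In that case only the dictionary parsing would be new code.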