incorporate dataportal.json guidance? #554
Comments
I'd love to learn more about this. Regrettably, that blog entry doesn't explain what, specifically, data.json doesn't do that they want it to do. As I recall, data.json will validate if it includes metadata about a repository without including an inventory, so that core aspect of dataportal.json seems to be unnecessary. (And, even if I'm wrong about that, it seems like it'd be collectively better to talk about modifying the data.json spec to accommodate that use case.) Also, dataportal.json appears to exist only in the sense that there's a Gist with an example; there's no schema, validator, generator, or documentation yet. We put together data.json over the course of months, in consultation with dozens of people at a dozen federal agencies, and it has continued to mature in the years since. Although we kept simplicity as our guide star, the reality turned out to be that there are more edge cases than "normal" cases, because government agencies' needs vary so much. Surely there are specific shortcomings of data.json that spurred the creation of dataportal.json, and it'd be great to know about those, but they're not listed there. Kudos to OpenDataSoft for seeing a problem and working to correct it, anyway! |
I think one possible contribution that it makes is distinguishing "portal" from "catalog", but I'm not sure how important a contribution that is. It adds some new fields that aren't in data.json, but as @waldoj points out, these could easily be added or extended from data.json instead of creating a conflicting format. I think they are somewhat naively approaching the problem as, "Why is data.json so complicated? It should be simple! Let's make it simple." without understanding why things are complicated. |
Hi everyone, and thanks for your interest in dataportal.json :) data.json is obviously the closest thing to what we imagine. dataportal.json is a suggestion, and we feel like there is something missing in terms of metadata at the portal level right now, but if extending data.json is a better solution, that's totally fine for us! The first concern we have about data.json is that it's often quite heavy. Firefox almost crashes when I open NYC's data.json. We like the idea of having separate light files with links between them. But the main issue is that there are a lot of cases where there isn't any information about the portal itself. The datasets are really perfectly described, and it's easy to work with their metadata. However, the portal level is often forgotten. So it isn't that it's not possible or that it's too complicated, but it's often not done. Hence why we propose a separate file. It might be naive; we haven't been in discussions with the agencies that you're talking about, and we have never created a standard before. But we just feel there is something missing in the portals we encounter. If it's better to have it integrated into data.json, sure: we'll be glad to contribute and help. If a fresh start with a dataportal.json (plus compliant names suggested by @jpmckinney on the gist) is necessary, we'll keep pushing for it. And if dataportal.json can be an experiment before integration into data.json, that can be done too: it's always nice to test something in a very agile way before suggesting that everybody implement it. Anyway, I'm very glad to have your comments, and I'm going to comment on the more technical suggestions on the gist right now. |
I see value in separating out the catalog/portal-level information into a smaller file. Some aggregators may only want to know the top-level information, without aggregating all the datasets. I haven't compared the catalog fields in data.json against dataportal.json. @NTerpo Can you have a look at comparing these two to see what's already in data.json and what is unique to dataportal.json? |
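As a rough illustration of the "smaller file" idea above, here is a minimal Python sketch that derives a catalog-level summary from a data.json document by dropping the `dataset` inventory. The catalog keys shown (`conformsTo`, `describedBy`, `dataset`) come from the Project Open Data v1.1 schema; the function name `catalog_summary`, the `datasetCount` field, and the sample catalog are assumptions made for illustration and are not part of data.json or dataportal.json.

```python
import json


def catalog_summary(catalog: dict) -> dict:
    """Return the catalog-level metadata of a data.json document.

    Everything except the "dataset" inventory is copied as-is, so the
    result stays small no matter how many datasets the portal lists.
    """
    summary = {key: value for key, value in catalog.items() if key != "dataset"}
    # "datasetCount" is a hypothetical convenience field, not part of data.json.
    summary["datasetCount"] = len(catalog.get("dataset", []))
    return summary


# A tiny, made-up catalog in the shape of a data.json v1.1 file.
example_catalog = {
    "@type": "dcat:Catalog",
    "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
    "describedBy": "https://project-open-data.cio.gov/v1.1/schema/catalog.json",
    "dataset": [
        {"title": "Example dataset A", "identifier": "a"},
        {"title": "Example dataset B", "identifier": "b"},
    ],
}

if __name__ == "__main__":
    print(json.dumps(catalog_summary(example_catalog), indent=2))
```

An aggregator that only needs the top-level information could fetch or cache a file like this without ever loading the full inventory.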
Having read @philipashlock's comment, here's a possible way forward:
see: https://www.opendatasoft.com/2016/03/22/metadata-for-open-data-portals/ and https://gist.github.com/NTerpo/b81a0b195ceb99a7e53a
cc @philipashlock @JJediny