Add reasonForNonRelease to schema #93
Comments
Marina, what do you envision going in this field? Free-form descriptive text? I don't know anything about the processes within government that will track this documentation—is the most logical way to relate a dataset to its private-only rationale to do so within a field like this? |
Hi Marina, Begin forwarded message:
|
Ideally, there would be something like a FOIA-type system, where if data doesn't meet one of a number of criteria for nonrelease it would be required to be released, and thus this field would need to be one of the predefined criteria. Logistically, however, this may be too ambitious. It may be good for us to create a set of "acceptable" criteria that we could give to agencies as suggested guidance for why a dataset may not be releasable (and the reverse). What about NonReleaseJustification or RestrictionJustification? |
@waldoj Yes I envision it as being a free-text field. The agencies already have to collect this information for each new dataset created/collected that's not going to be released, going forward. So isn't it logical to store this reason in the Enterprise data inventory (which, remember, is private -- not the public inventory)? They're storing it anyway -- but without a field they will, if I were to guess, store them separately and in a harder-to-find-internally spot. @seanherron I think the list of options here is way too broad and will be defined by agencies' general counsels. I would suggest leaving this as a free text field and not providing criteria. |
P.S. I have no problem with changing the name of this suggested field. |
@BernHyland We made great efforts to match DCAT in this schema wherever possible -- the only two existing fields that do not match DCAT are accessLevel and systemOfRecord. This issue is specifically about giving agencies a place to document the reason for NOT releasing a particular dataset, in their internal-only enterprise data asset inventories. I'm not so sure that is widely applicable enough to warrant inclusion in a standard like DCAT, but I appreciate the reminder to stay involved in those conversations! |
Does the benefit of encouraging better behavior outweigh the complexity that adding this brings? I don't think it's an overly large addition to the agency workload but it is an added lift. In general, I always worry about the Christmas tree effect when it comes to adding further to what each agency is required to do. |
@gbinal I think if some sort of rationale isn't included people will either a) assume that the intent is nefarious and that we are hiding the data for no good reason, b) email the POC and ask for clarification/release, or c) forget about it entirely. For high-volume and frequently desired datasets (maybe some of the HHS data that has potential for PII, etc) putting a reasonable statement out there as to why it's private is good for transparency and will reduce the number of queries to the POC and angry tweets if they assume it's private for a questionable reason. My concern would be that agencies wouldn't provide this information for legal reasons or would provide obtuse legalese that is difficult to parse and understand. If it's not going to be used, then there's not a lot of value in adding it. |
I understand the Christmas tree argument, but in this case it seems merited. It'll only add work for non-released datasets (which are already saving a lot of work by not being released!), and agencies should have an on-the-record reason for not releasing a dataset anyway. I also support keeping this a free text field, rather than selecting a preset exemption, to encourage a descriptive rationale. The field won't mean much if it doesn't communicate more than a category. |
For datasets with accessLevel = private, an agency has to document with its Office of General Counsel (or other designated entity) why it can't be released.
While this field would not be surfaced in the public data inventory, it should be captured in an additional metadata field for the Enterprise Data Inventory, and required for datasets where accessLevel = private.
The rationale is that simply documenting a reason for not releasing a dataset with OGC does not guarantee that any identifier or other information is collected alongside that reason. This information rightly belongs in the (private) Enterprise Data Inventory.
Agencies could selectively use the recordAccessLevel metadata field proposed earlier to surface private datasets and their reason for not being released.
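To make the proposed rule concrete, here is a minimal sketch of how an inventory tool might enforce it. It assumes entries are plain dictionaries keyed by the schema field names; `accessLevel` and `reasonForNonRelease` come from this issue, while `validate_entry` and `public_view` are hypothetical helper names, not part of any existing tooling.

```python
from typing import Any, Dict, List

# Field names are taken from this issue; everything else is illustrative.
ACCESS_LEVEL = "accessLevel"
REASON = "reasonForNonRelease"


def validate_entry(entry: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the entry passes."""
    problems = []
    if entry.get(ACCESS_LEVEL) == "private":
        reason = entry.get(REASON)
        # Free text: only require that something non-blank was written.
        if not isinstance(reason, str) or not reason.strip():
            problems.append(
                f"{REASON} is required when {ACCESS_LEVEL} is private"
            )
    return problems


def public_view(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the entry without the internal-only reason field."""
    return {key: value for key, value in entry.items() if key != REASON}
```

The check deliberately accepts any non-blank string, since the proposal is for free text. `public_view` reflects the point that the reason belongs in the Enterprise Data Inventory and would not be surfaced in the public inventory.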
Not thrilled with my suggested name, feel free to propose a better one!