Skip to content

Make ordering keys when compaction optional #8

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
gkellogg opened this issue Jul 8, 2018 · 9 comments
Closed

Make ordering keys when compaction optional #8

gkellogg opened this issue Jul 8, 2018 · 9 comments

Comments

@gkellogg
Copy link
Member

gkellogg commented Jul 8, 2018

Add an ordered option to Compaction, and other algorithms, defaulting to true, that would make the "ordered lexicographically" language depend on this option. (Note that, when compacting, expansion will already work properly is ordering is skipped).

Original issue: Feature request: Could framing return properties in the order listed by the frame? #542.

@azaroth42
Copy link
Contributor

I'm 👎 on this issue for the following reasons:

  • JSON is not ordered. Yes it's possible to order the serialization, but as soon as it is read and re-serialized, the order will disappear. The value of the order is only for humans to read it, and they can use tools that order the keys as desired.
  • Lexicographic ordering is complicated when unicode is introduced.
  • Ordering the serialization is expensive for processing
  • Lexicographic ordering is almost certainly not the easiest order for humans to read the data in. For example, following the best practice for id and type, one would imagine that those two would be the first keys (<uri> a <class> being the first triple in almost all turtle data, for example)
  • Specifying a custom, non-lexicographic ordering is even more expensive!

So I would propose close, won't fix.

@dlongley
Copy link
Contributor

dlongley commented Aug 28, 2018

@azaroth42,

I think the problem is that some of the algorithms already require sorting ... with no clear benefit. I'd like to see the requirement to sort relaxed if we can't find a reason to do it. I don't think we need to declare a flag that says to do it or not (that's an implementation specific feature) ... I'd just like a processor to be compliant without having to sort.

Actually ... maybe the algorithms don't really require sorting at all given that they were non-normative. The output doesn't have any requirement that the keys be sorted (and if it does, we should remove that). This would solve the problem, IMO -- where implementations are sorting things that they don't need to sort. Anyone who wants a sort feature in an implementation can take it up with that particular implementation.

@gkellogg
Copy link
Member Author

@azaroth42 part of the issue is that the algorithm requires that keys be processed in order, so that ordered option would allow us to not impose this ordering during processing; ordering keys in large objects can actually be a performance hit, so there's something to be said for not imposing such a requirement.

Alternatively, given that JSON objects are unordered by nature, then there is really no need for the algorithms to do such an ordering, even for testing the results. We might simply decide to remove the verbiage on "ordered lexicographically by key" from the various algorithms. This may cause some difference in the expanded output of maps, such as language maps, if processing is not done in lexicographical order. (Implementations may choose to do this when testing to simplify the result checking, however).

The original issue was about not re-ordering keys when framing IIRC, and this may be a byproduct for some implementations, but the really issue is to reduce the processing overhead required when ordering.

@azaroth42
Copy link
Contributor

Oh, that seems like a very different request compared to json-ld/json-ld.org#542 ?

... it seems natural to me that if an explicit ordering is given in the frame, that the algorithm would respect that order rather than alphabetize ...

👍 to not requiring sorting!

@azaroth42
Copy link
Contributor

Proposal: Remove the "ordered lexicographically by key" from the algorithm and allow implementations to handle this as they see fit.

@azaroth42
Copy link
Contributor

WG resolved on 8/31 call to remove the ordered lexicographically by key from the algorithm.

@gkellogg
Copy link
Member Author

gkellogg commented Sep 6, 2018

I have implemented this in my library, and as I expected, it creates a challenge for testing. I think it best to add an ordered property to perform the existing ordering, and update the language to be based on that option. The test README can then suggest using this to simplify comparing results.

The issue is that array order is different, not the ordering of members within an object.

Testing unordered results is feasible for the expanded results, but basically requires that both expected and result files have their array values reordered (other than values of @list) as part of the validation.

Testing compacted results is more challenging, as it requires using the context, along with all the various scoped and embedded contexts, to know what to reorder, which is tantamount to performing the expansion. I think the only feasible way to do this is to reorder all arrays, even if they might form a list, and compare them as for expanded results, but then follow by expanding both results and expected and comparing as above.

@azaroth42
Copy link
Contributor

Does this also affect the linked data signatures computation? @cwebber?

@gkellogg
Copy link
Member Author

gkellogg commented Sep 6, 2018

Linked Data Signatures, IIRC, are based on RDF Dataset Normalization, so are immune to the specifics of the JSON representation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants