JSON-LD grammar #114

Closed
lanthaler opened this issue May 5, 2012 · 21 comments

Comments

@lanthaler
Member

This was "Value space of keywords" before. Please read the comment below for an updated description.

In #91 we decided not to change the value space of @type but to allow the use of rdf:type for use cases that require different forms. Nevertheless, in 67a0909, the algorithms were changed to also allow other forms of @type. expand-0026-in.jsonld basically contains:

   "@type": [
        "http://example.com/d",
        {
          "@id": "http://example.com/e"
        }
   ]

So, what do we wanna do with this? Do we wanna allow it or not?

A similar issue exists for the @graph keyword. Do we wanna allow @value objects there? So, would the following snippet be valid?

   "@graph": {
        "@value": "My named graph 91"
   }

Just to make sure we all agree on the value space of all of our keywords, here is a list of what I think the value space is (perhaps we should include this in some form in the syntax spec):

  • @context: (array of) object | string
  • @graph: (array of) object (not @value, @list, @set objects) | string
  • @id: string
  • @value: string | number | boolean | null
  • @language: string
  • @type: (array of) string
  • @container: string
  • @list: (array of) object | string | number | boolean | null
  • @set: (array of) object | string | number | boolean | null
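
To make the list above easier to check against examples, here is a minimal sketch in Python. The names `VALUE_SPACE` and `in_value_space` are hypothetical and not part of any JSON-LD library, and the sketch assumes that "(array of)" applies to every alternative listed for a keyword, which the list itself does not state.

```python
# Sketch only: VALUE_SPACE and in_value_space are hypothetical names,
# not part of any JSON-LD library.

def is_str(v):
    return isinstance(v, str)

def is_scalar(v):
    # string | number | boolean | null
    return v is None or isinstance(v, (str, bool, int, float))

def is_plain_object(v):
    # An object that is not a @value, @list, or @set object.
    return isinstance(v, dict) and not ({"@value", "@list", "@set"} & v.keys())

def one_or_many(check):
    # Assumption: "(array of) X" means a single X or a list of X.
    def wrapped(v):
        if isinstance(v, list):
            return all(check(item) for item in v)
        return check(v)
    return wrapped

VALUE_SPACE = {
    "@context": one_or_many(lambda v: isinstance(v, (dict, str))),
    "@graph": one_or_many(lambda v: is_plain_object(v) or is_str(v)),
    "@id": is_str,
    "@value": is_scalar,
    "@language": is_str,
    "@type": one_or_many(is_str),
    "@container": is_str,
    "@list": one_or_many(lambda v: isinstance(v, dict) or is_scalar(v)),
    "@set": one_or_many(lambda v: isinstance(v, dict) or is_scalar(v)),
}

def in_value_space(keyword, value):
    """Return True if value is in the proposed value space of keyword."""
    return VALUE_SPACE[keyword](value)
```

Under this reading, both snippets quoted earlier in this description (the @type array containing an @id object, and the @graph holding a @value object) fall outside the proposed value space.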
@lanthaler
Member Author

RESOLVED: In general, if the author's intent is clear, we should transform the input into proper JSON-LD (keeping the processor mode, if any, in mind - in strict mode, throw exceptions, in lax mode, attempt to interpret the value).
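
As an illustration of this resolution, the following is a rough Python sketch of how a processor might treat the @type example from the issue description. `normalize_type` and `GrammarError` are hypothetical names, and the rule that an object holding only a string @id counts as "clear intent" is an assumption made for this example, not something the resolution spells out.

```python
# Sketch only: normalize_type and GrammarError are hypothetical names.

class GrammarError(ValueError):
    """Raised when the input cannot be accepted in the current mode."""

def normalize_type(value, strict=False):
    """Return the value of @type as a list of IRI strings.

    Lax mode rewrites {"@id": iri} entries to the plain string iri,
    assuming that this is what the author meant. Strict mode rejects
    them. Entries whose intent is unclear are rejected in both modes.
    """
    entries = value if isinstance(value, list) else [value]
    result = []
    for entry in entries:
        if isinstance(entry, str):
            result.append(entry)
        elif (not strict and isinstance(entry, dict)
              and set(entry) == {"@id"} and isinstance(entry["@id"], str)):
            result.append(entry["@id"])
        else:
            raise GrammarError(f"invalid @type entry: {entry!r}")
    return result
```

Called in lax mode on the expand-0026 input, this would return both IRIs as plain strings; with `strict=True` the same input would raise `GrammarError`.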

@lanthaler
Member Author

I think the initial description of the issue was not clear enough, so here's another attempt.

While the syntax of JSON-LD is already specified by JSON, the current JSON-LD Syntax spec lacks a formal specification of the allowed grammar, i.e., what constructs are allowed where. The outcome of this issue should be exactly that. This grammar will then form the base for what JSON-LD processors MUST be able to process. Maybe we will also come up with some constructs that are either ignored or automatically corrected.

Let me first introduce a number of definitions that I'll use later:

  • value objects: a JSON object that has either a @value, a @set, or a @list property
  • node: an object that is not a value object
  • scalar: a JSON string, number, true, false, or null // what about array
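
These three definitions can be sketched as a small classifier over values decoded from JSON. `classify` is a hypothetical helper, and returning "array" as its own category is only a placeholder for the open question noted in the scalar definition.

```python
def classify(value):
    """Classify a JSON-decoded value using the definitions above."""
    if isinstance(value, dict):
        # A value object has at least one of @value, @set, or @list.
        if {"@value", "@set", "@list"} & value.keys():
            return "value object"
        return "node"
    if value is None or isinstance(value, (str, bool, int, float)):
        return "scalar"
    if isinstance(value, list):
        # Placeholder: the definitions leave the status of arrays open.
        return "array"
    raise TypeError(f"not a JSON value: {value!r}")
```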

The following description of what I think should be the valid grammar for JSON-LD is a bit of a mixture of ABNF, pseudo-code, and prose. I hope it's understandable. I numbered the statements so that we can easily reference them.

  1. document = node / node[] (i.e., an array of node objects)
  2. a node MUST NOT have an @language property
  3. a node or value object MUST NOT have an @container property
  4. a node MAY have a @context property
  5. the value of an @context MUST be a string (that will be used as relative or absolute IRI), an object, null, or an array of a combination thereof
  6. The value of properties used in @context MUST be a null, a string, or an object.
  7. @context.[..].@id and @context.[..].@type MUST be a string (equal @id or it will be interpreted as term, absolute, or compact IRI; not as a relative IRI) or null.
  8. @context.[..].@container values have a defined meaning only for @set and @list
  9. @context.[..].@language MUST be a string or null
  10. Any other property in @context.[..].___ MUST be ignored, but MUST be preserved in compaction and framing
  11. A node MAY have an @graph property.
  12. The value of an @graph property MUST be null, a string (that will be interpreted as an IRI), or a node.
  13. The value of @id in a document (i.e., outside a context), MUST be null or a string (that will be interpreted as term, absolute, relative, or compact IRI).
  14. a value object with an @set property MUST NOT have any other properties
  15. a value object with an @list property MUST NOT have any other properties
  16. the value of an @set or @list property can be a scalar, a node, a value object, or an array of a combination thereof
  17. a value object with a @value property MAY HAVE a @language or @type property and MUST NOT have any other properties
  18. in a value object @language and @type MUST NOT be present at the same time
  19. the value of the @value property MUST be a scalar
  20. the value of the @language property MUST be null or a string. If the value of @value is not a string, @language will be ignored and will thus not be checked
  21. in a value object @type MUST be null, a string that will be interpreted as term, absolute, relative or compact IRI or a node (or an array thereof with just one entry)
  22. in a node @type MUST be null, a string (that will be interpreted as term, absolute, relative or compact IRI), a node, or an array of a combination thereof (with an arbitrary number of entries)
  23. In the document the value of @type MUST NOT be @id; in contrast to the context where this is allowed.
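
As a rough illustration of how some of these statements could be checked mechanically, here is a Python sketch covering only statements 14, 15, 17, 18, and 19 for value objects. `value_object_errors` is a hypothetical helper; it reports violations rather than deciding whether a processor should stop or continue.

```python
def is_scalar(v):
    # Statement 19: a JSON string, number, true, false, or null.
    return v is None or isinstance(v, (str, bool, int, float))

def value_object_errors(obj):
    """Return the numbers of the statements that obj violates.

    Only statements 14, 15, 17, 18, and 19 are checked.
    """
    errors = []
    keys = set(obj)
    if "@set" in keys and keys != {"@set"}:
        errors.append(14)
    if "@list" in keys and keys != {"@list"}:
        errors.append(15)
    if "@value" in keys:
        if keys - {"@value", "@language", "@type"}:
            errors.append(17)
        if {"@language", "@type"} <= keys:
            errors.append(18)
        if not is_scalar(obj["@value"]):
            errors.append(19)
    return errors
```

For example, `{"@value": "hi", "@language": "en"}` passes all five checks, while adding an @type to the same object would violate statement 18.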

The question is first of all whether we agree on this grammar and secondly what should happen if an input document violates this grammar. Should a processor throw an error and stop processing? Should it ignore that statement (and possibly the whole subtree) and try to continue?

@gkellogg
Member

I think if we're going to define a grammar for JSON-LD, we should probably do it in W3C EBNF. It so happens, I'm working on a parser for it, but there are a number of other existing resources.

I'll start work on such a grammar myself. It might start something like this:

[1] document ::= node | array
[2] array ::= '[' valueList ']'
[3] node ::= '{' propList* '}'
[4] propList ::= propValue ( ',' propValue )*
[5] propValue ::= key ':' value
[6] key ::= keyword | STRING
[7] keyword ::= '@' WORD
[8] value ::= array | node | STRING | BOOLEAN | NULL

Establishing specific semantics of possible property/value limitations will be more challenging, but is doable.

@lanthaler
Member Author

Agree that it should be a bit more formal, but before that we should agree on the grammar itself. That's what this issue is all about. I'm not so sure that we all agree yet. E.g., 21) and 22) might need some discussion, as well as where exactly we allow relative IRIs and where not.

@niklasl
Member

niklasl commented May 22, 2012

Do these added rules in the specification currently allow:

"@context": {"@language": "en"}

It seems that item 8 there doesn't permit it?

@lanthaler
Member Author

It does:

8. The value associated with the keys used in a @context must be a null, an IRI, or a JSON object.
9. For each value that is a JSON object that is associated with a key in a @context:
...
9.3. @language must be a string expressed in [BCP47] or null.

@niklasl
Member

niklasl commented May 22, 2012

That seems to take care of using it within a JSON object (i.e. in a term definition). But in the case above the value associated with the @language key is a string, not a JSON object. (I.e. it's about setting a global default language.)
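
To make the distinction concrete, here is a small Python sketch that reports where @language appears in a context object. The term "title" and its IRI are made up for illustration, and `language_positions` is a hypothetical helper.

```python
# Both contexts are illustrative; the term "title" and its IRI are made up.
default_language_ctx = {"@language": "en"}
term_language_ctx = {
    "title": {"@id": "http://example.com/title", "@language": "en"}
}

def language_positions(context):
    """Report where @language occurs in a context object."""
    found = []
    if "@language" in context:
        # A key of the context itself, e.g. a global default language.
        found.append("default")
    for term, definition in context.items():
        # A key inside a term definition (a JSON object value).
        if isinstance(definition, dict) and "@language" in definition:
            found.append(f"term:{term}")
    return found
```

The first context is the case asked about here: @language is a key of the context whose value is a string. The second is the term-definition case that the quoted rule 9.3 covers.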

@lanthaler
Member Author

Oh I see, I read it too fast. I think that's included in 7) in the spec. But as I said on the telecon there are things that got lost. Let's discuss based on the list in this issue.

It goes a bit around this specific issue, since in point 9 it is not very well defined what the [..] means.

@lanthaler
Member Author

RESOLVED: Express the JSON-LD Grammar in prose with supporting tables/examples. Clarify that violating the grammar does not necessarily mean that a JSON-LD processor will not process the document.

@msporny
Member

msporny commented Aug 18, 2012

@gkellogg wrote the section, I just proof-read it and made a few minor improvements in 2565c51. I think that the section addresses Andy's concerns. If it doesn't, he will have to provide text that does.

@msporny msporny closed this as completed Aug 18, 2012
@lanthaler lanthaler reopened this Aug 20, 2012
@lanthaler
Member Author

I think I was the one who initially raised this issue and while I really appreciate Gregg’s efforts, this is not really what I had in mind. Gregg did a terrific job in summarizing the spec from about 35 to roughly 6 pages. Maybe we should look at the spec and try to apply this style in more places.

For the grammar section however, I think six pages are still way too much. It’s not really the length that worries me. It’s more the fact that it is still written completely in prose. I agree that EBNF goes too far and isn’t really suitable for such a set of conventions but a short list or some other schema-like description would be more suitable in my opinion. Definitions of what something is and examples shouldn’t be part of a grammar IMO.

The grammar should allow an implementer to see all possible values it should expect and thus handle. Everything else should trigger an error.

Thoughts?

@gkellogg
Member

The grammar section is not so much an attempt to summarize the spec, as to show the legal productions for each type of syntactic element using something that is not EBNF.

The examples are gratuitous, but I was responding to feedback from Andy. If the group feels that the grammar doesn't benefit from repeating usage examples, and that some of the motivation included in the grammar repeats normative text in the main body, then I can go along with trimming that out, but I think it makes the section easier to read. I think we should solicit feedback from @afs and others in the RDF WG who commented on the grammar.

@msporny
Member

msporny commented Sep 30, 2012

I went into the spec with the intent to clean up the JSON-LD Grammar section. After reading through it with fresh eyes, I think it's fine.

I agree with the principle of what Markus is saying, which is why I went in trying to clean up the section. However, I don't see how we could apply the principle to the Grammar section and not lose something.

I think we should close this issue unless Markus has specific spec text to replace what we have right now. Markus, if you could take what is there right now and do what you did here:

#114 (comment)

That may allow us to resolve this issue once and for all.

@lanthaler
Member Author

I will do that once the spec is complete, i.e., all other issues have been resolved and the spec has been updated accordingly.

I marked this issue as on-hold for the time being.

@lanthaler
Member Author

RESOLVED: Move all examples in the JSON-LD Grammar section of the spec to the main body and point to them with links.

tidoust pushed a commit to tidoust/json-ld.org that referenced this issue Nov 13, 2012
As agreed in json-ld#114, I moved all the examples that appeared in the
grammar section to the "Basic Concepts" and "Advanced Concepts"
sections when appropriate.

I dropped examples that were already duplicates of examples in these
sections as well as examples that were very close to other examples.

The grammar section does not directly link to the examples in the
other sections but links to relevant sections each time. That seems
enough to me (and that also simplifies things since ReSpec does not
naturally create IDs for examples, making it a pain to link to them
directly).
lanthaler pushed a commit that referenced this issue Nov 17, 2012

(cherry picked from commit 4710e9a)
(cherry picked from commit 9249686)
(cherry picked from commit dc1f63f)

This addresses #114
@msporny
Member

msporny commented Dec 3, 2012

@lanthaler @tidoust - where are we on this issue? Do more edits need to be made to the document?

@lanthaler
Member Author

I think there are still some errors in the grammar section. I will review it as soon as language maps, property generators, and annotations have been added to the spec.

@msporny
Member

msporny commented Dec 22, 2012

Language maps are covered in appendix B.3.
Property generators are covered in appendix B.6.
I added data annotations to the grammar in commit 51703b8.

I believe that addresses all of the remaining issues for the JSON-LD grammar. @tidoust and @lanthaler do you concur? If so, let's send notice out to the mailing list and close this issue.

@lanthaler
Member Author

I will review it in the coming days. By the way, json-ld.org seems to be down.

@msporny
Member

msporny commented Dec 26, 2012

json-ld.org is back up - root cause was a power supply failure.

ACK on all of your other comments - I'll wait for you to close those issues (although, please make sure to either action me to change stuff that needs to be changed, change the language yourself, or make it clear that something is blocked and needs immediate attention).

lanthaler added a commit that referenced this issue Dec 28, 2012
lanthaler added a commit that referenced this issue Jan 4, 2013
lanthaler added a commit that referenced this issue Jan 4, 2013
@lanthaler
Member Author

I've reviewed the grammar and fixed the last remaining issues. Unless I hear objections I will therefore close this issue in 24 hours.
