JSound-C 2.0
The complete reference
Edition specification version 2.0.8 for JSound 2.0
Copyright © 2018 Cezar Andrei, Dana Florescu, Ghislain Fourny, Jonathan Robie, Pavel Velikhov
Abstract
This document is a description of JSound-C, the compact version of JSound 2.0, the JSON schema definition language. It describes how to declare constraints on the structure of JSON documents, and how to process instances of the TYSON syntax for typed values.
Over the past decade, the need for more flexible and scalable databases has greatly increased. The NoSQL universe brings many new ideas on how to build both scalable data storage and scalable computing infrastructures.
XML and JSON are probably the two most popular data formats that emerged. While XML has reached a level of maturity that gives it an enterprise-ready status, JSON databases are still in their early stages. Scalable data stores, such as MongoDB, ElasticSearch, Cosmos DB (DocumentDB), CouchDB and Couchbase, are already available. One fundamental piece for a full-fledged JSON database is a way to make sure that the data stored is consistent and sound. This is where schemas come into play.
Many lessons can be learned from 40 years of relational database history and 15 years of XML. The goal of this document is to introduce a schema language, JSound, which is much simpler than both XML Schema and JSON Schema, just like JSON syntax is much simpler than XML syntax. At the same time, JSound draws many lessons from the design of XML Schema, standing on the shoulders of a giant. A comparison between JSound and JSON Schema is available separately.
JSound 2.0 provides a framework and syntax to validate and annotate JSON documents. JSound-C provides a simple, compact syntactic sugar covering 80% of the JSound 2.0 use cases. A JSound-C schema is a syntactic sugar for a JSound 2.0 schema.
JSound as such can be very verbose. For this reason, an alternate, very simple syntax is introduced. It cannot be mixed with the verbose syntax, meaning that these are two different file formats.
The design intent behind JSound-C is that a schema expressed in this simple syntax mirrors the instance layout.
Note that the simple syntax only covers a subset of the features of JSound: it was designed to cover 80% of the cases. If features are needed that are not covered by the simple syntax, then the regular JSound 2.0 syntax MUST be used.
1.2. Forbidden characters in JSound-C type names
Special characters (!, ?, =, @ and |) are forbidden in type names in the JSound-C syntax. These characters may only be used in the way specified. If a schema defines type names or values that contain these characters, the regular JSound 2.0 syntax must be used.
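As a rough illustration, the check implied by this rule can be sketched in Python. The function name `needs_verbose_syntax` is hypothetical and not part of JSound; only the set of characters is taken from the list above.

```python
# Characters reserved by the JSound-C syntax, as listed in this section.
SPECIAL_CHARACTERS = frozenset("!?=@|")


def needs_verbose_syntax(name: str) -> bool:
    """Return True if `name` contains a character reserved by JSound-C.

    Hypothetical helper: a type name for which this returns True cannot be
    written in the compact syntax and calls for the regular JSound 2.0 syntax.
    """
    return any(ch in SPECIAL_CHARACTERS for ch in name)
```

A schema author could run such a check over all type names before choosing between the two file formats.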
This specification describes a mapping M from any type T expressed in the simple syntax to its verbose syntax M(T), and it does so recursively.
A JSound-C schema is an object mapping type names to types expressed in the simple syntax. These types cannot be atomic, as JSound-C cannot create new atomic types: it can only reference them by name.
A JSound-C 2.0 schema is mapped to a JSound 2.0 schema as follows.
{
"<type1>" : <T>,
"<type2>" : <U>,
...
}
is recursively mapped to:
{
"types" : [
{ "name" : "<type1>", <M(T)> },
{ "name" : "<type2>", <M(U)> },
...
]
}
Each type is thus mapped to the JSound 2.0 syntax, and extended with a name field containing the name of the type.
The remainder of this specification specifies the mapping M for each kind of type.
Chapter 2. Atomic Types, and mapping type names
Atomic types are left unchanged by the mapping M. In JSound-C, atomic types can only be referred to by name. No new atomic types can be created.
"<t>"
is recursively mapped (untouched) to:
"<t>"
The same mapping applies to object, array or union types referenced by name. They are left untouched.
The names of the referenced types may not contain any special characters. Otherwise, the JSound 2.0 syntax must be used.
In JSound-C syntax, an object type is expressed as an object layout that matches that of a valid instance, in that the names of the data fields are also fields in the simple syntax, recursively mapped to the simple syntax of the field descriptors.
If fields do not contain any special characters, and if none of the values is a string containing a special character either, then the mapping is as follows.
{
"<field1>" : <T1>,
"<field2>" : <T2>,
...
}
is recursively mapped to:
{
"kind" : "object",
"content" : [
{ "name" : "<field1>", "type" : <M(T1)> },
{ "name" : "<field2>", "type" : <M(T2)> },
...
]
}
An exclamation mark can be used at the beginning of a key to mark this key as required.
If fields do not contain any special characters, and if none of the values is a string containing a special character either, then the mapping is as follows.
{
"!<field1>" : <T>
}
is recursively mapped to:
{
"kind" : "object",
"content" : [
{ "name" : "<field1>", "type" : <M(T)>, "required" : true }
...
]
}
3.3. Setting a default value
An equal sign can be used to give a default value if the type is referenced by name.
{
"<field>" : "<t>=<v>"
}
is recursively mapped to:
{
"kind" : "object",
"content" : [
{ "name" : "<field>", "type" : "<t>", "default" : "<v>" }
]
}
3.4. Marking a field as primary key
For arrays of objects, @ is used as a prefix to a field to set its field descriptor unique to true.
{
"@<field>" : <T>
}
is recursively mapped to:
{
"kind" : "object",
"content" : [
{ "name" : "<field>", "type" : <M(T)>, "unique" : true }
]
}
A question mark (?) can be used as a suffix of the field name to allow null, implicitly creating a union type.
{
"<field>?" : <T>
}
is recursively mapped to:
{
"kind" : "object",
"content" : [
{
"name" : "<field>",
"type" : {
"kind" : "union",
"content" : [ <M(T)>, "null" ]
}
}
]
}
3.6. Combining markers arbitrarily
The markers !, @, ? and = can be combined.
{
"@!<field>?" : "<t>=<v>"
}
is recursively mapped to:
{
"kind" : "object",
"content" : [
{
"name" : "<field>",
"type" : {
"kind" : "union",
"content" : [ "<t>", "null" ]
},
"default" : "<v>",
"required" : true,
"unique" : true
}
]
}
Below is a concrete example.
{
"my-object" : {
"foo" : "string=foobar",
"bar" : {
"!foobar" : "boolean"
}
}
}
is a shortcut for the verbose:
{
"types" : [
{
"name" : "my-object",
"kind" : "object",
"content" : [
{
"name" : "foo",
"type" : "string",
"default" : "foobar"
},
{
"name" : "bar",
"type" : {
"kind" : "object",
"content" : [
{
"name" : "foobar",
"type" : "boolean",
"required" : true
}
]
}
}
]
}
]
}
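The marker rules of this chapter can be sketched in Python for the case where the type of the field is referenced by name. This is only an illustration, not a normative algorithm. The function name `map_named_field` is hypothetical. Accepting the prefixes ! and @ in either order, and splitting the value at the first equal sign, are assumptions: the specification only shows the form "@!<field>?" : "<t>=<v>".

```python
def map_named_field(key: str, value: str) -> dict:
    """Map one compact field (key plus named type) to a verbose field descriptor."""
    required = unique = False
    # Leading markers: "!" marks the field as required, "@" as unique.
    # Assumption: both orders ("@!" and "!@") are accepted.
    while key[:1] in ("!", "@"):
        if key[0] == "!":
            required = True
        else:
            unique = True
        key = key[1:]
    # Trailing "?" allows null, which implicitly creates a union type.
    nullable = key.endswith("?")
    if nullable:
        key = key[:-1]
    # "<t>=<v>": assumption, the text after the first "=" is the default value.
    type_name, has_default, default = value.partition("=")

    descriptor = {"name": key}
    if nullable:
        descriptor["type"] = {"kind": "union", "content": [type_name, "null"]}
    else:
        descriptor["type"] = type_name
    if has_default:
        descriptor["default"] = default
    if required:
        descriptor["required"] = True
    if unique:
        descriptor["unique"] = True
    return descriptor
```

For instance, `map_named_field("foo", "string=foobar")` yields a descriptor with the name foo, the type string and the default foobar, matching the first field of the concrete example above.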
An array type can be expressed as an array recursively containing a type in simple syntax.
[ <T> ]
is recursively mapped to:
{
"kind" : "array",
"content" : <M(T)>
}
Below is a concrete example.
{
"my-array" : [ "date" ],
"my-array-of-objects" : [ { "@my-key" : "string", "foo" : "integer" } ]
}
is a shortcut for
{
"types" : [
{
"name" : "my-array",
"kind" : "array",
"content" : "date"
},
{
"name" : "my-array-of-objects",
"kind" : "array",
"content" : {
"kind" : "object",
"content" : [
{
"name" : "my-key",
"type" : "string",
"unique" : true
},
{
"name" : "foo",
"type" : "integer"
}
]
}
}
]
}
Union types can be expressed as strings using the | symbol. The question mark can also be used in the field name as a shortcut for "|null", as explained in the shortcut object type syntax.
The union simple syntax only works with type names. Type descriptors in simple syntax cannot be nested. More complex unions must use the verbose syntax.
"<t>|<u>|<v>"
is recursively mapped to:
{
"kind" : "union",
"content" : [ "<t>", "<u>", "<v>" ]
}
Below is a concrete example showcasing both | and ?.
{
"my-union" : "string|integer",
"my-object" : {
"string-or-null?" : "string"
}
}
is a shortcut for
{
"types" : [
{
"name" : "my-union",
"kind" : "union",
"content" : [ "string", "integer" ]
},
{
"name" : "my-object",
"kind" : "object",
"content" : [
{
"name" : "string-or-null",
"type" : {
"kind" : "union",
"content" : [ "string", "null" ]
}
}
]
}
]
}
Below is a complete example combining object types, array types and field markers.
{
"mytype" : {
"foo" : "string",
"bar" : [ "boolean" ],
"foobar" : {
"!foo" : "date",
"@bar?" : "hexBinary"
}
}
}
is a shortcut for
{
"types" : [
{
"name" : "mytype",
"kind" : "object",
"content" : [
{
"name" : "foo",
"type" : "string"
},
{
"name" : "bar",
"type" : {
"kind" : "array",
"content" : "boolean"
}
},
{
"name" : "foobar",
"type" : {
"kind" : "object",
"content" : [
{
"name" : "foo",
"type" : "date",
"required" : true
},
{
"name" : "bar",
"type" : {
"kind" : "union",
"content" : [ "hexBinary", "null" ]
},
"unique" : true
}
]
}
}
]
}
]
}
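To tie the chapters together, the whole mapping M can be sketched as a small recursive Python program. This is a non-normative illustration under stated assumptions: the names `map_type`, `map_field` and `map_schema` are hypothetical, the prefixes ! and @ are accepted in either order, a default is split off at the first equal sign, and a top-level type that maps to a bare name is rejected because the specification does not describe that case.

```python
def map_type(t):
    """Sketch of M for one type written in the simple syntax."""
    if isinstance(t, str):
        # A union of named types is written "<t>|<u>"; a single name is left untouched.
        if "|" in t:
            return {"kind": "union", "content": t.split("|")}
        return t
    if isinstance(t, list):
        # [ <T> ] is an array type whose content is M(T).
        if len(t) != 1:
            raise ValueError("an array type must contain exactly one type")
        return {"kind": "array", "content": map_type(t[0])}
    if isinstance(t, dict):
        return {"kind": "object",
                "content": [map_field(k, v) for k, v in t.items()]}
    raise TypeError(f"unsupported type description: {t!r}")


def map_field(key, value):
    """Sketch of the field-level mapping, including the !, @, ? and = markers."""
    required = unique = False
    while key[:1] in ("!", "@"):  # assumption: either order is accepted
        required = required or key[0] == "!"
        unique = unique or key[0] == "@"
        key = key[1:]
    nullable = key.endswith("?")
    if nullable:
        key = key[:-1]
    has_default = isinstance(value, str) and "=" in value
    default = None
    if has_default:
        value, _, default = value.partition("=")
    mapped = map_type(value)
    if nullable:
        mapped = {"kind": "union", "content": [mapped, "null"]}
    field = {"name": key, "type": mapped}
    if has_default:
        field["default"] = default
    if required:
        field["required"] = True
    if unique:
        field["unique"] = True
    return field


def map_schema(schema):
    """Map a whole JSound-C schema object to the verbose "types" array."""
    types = []
    for name, t in schema.items():
        mapped = map_type(t)
        if not isinstance(mapped, dict):
            raise ValueError(f"top-level type {name!r} must not be a bare name")
        types.append({"name": name, **mapped})
    return {"types": types}
```

Calling `map_schema` on the compact schema parsed with `json.loads` returns a dictionary that can be serialized back with `json.dumps`. The sketch does not check type names for forbidden characters and does not flatten a nullable field whose type is already a union; both are left to a real implementation.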