JSON Schema

JSON Schema is a vocabulary for making assertions about JSON documents. You can use a JSON Schema document to annotate and validate other JSON documents.

The official website for JSON Schema is at json-schema.org.

Uses for JSON Schema

Writing tests for JSON documents: Use JSON Schema in your application tests to ensure that data follows basic constraints, using declarative code instead of procedural code.
Validation of JSON-like data structures: Most JSON Schema implementations can validate data structures in memory, not just JSON documents in a file. Use JSON Schema to write declarative assertions and compile them into your application.
Validating user input: Use JSON Schema to verify that the range and types of user input are correct and to report problems to users, often in fewer lines of code than can be done by hand.
Describing expected input to users: Share a JSON Schema to tell other parties what input you're expecting, so they can verify correctness before submission.
Add links to plain JSON APIs: Use JSON Schema to add link relationships to JSON documents in HTTP APIs, allowing hypermedia user agents to browse the API and improving flexibility.
Describe the structure of JSON documents: Use JSON Schema to describe what each property in a document means, especially for use in IDE hints or autogenerated documentation.
API documentation: Use JSON Schema to describe what each property in a document means, especially for use in IDE hints or autogenerated documentation.

When not to use JSON Schema

JSON Schema does not by itself verify the consistency of data: there is no standard mechanism to verify that, for example, a given string is a key in a database, or matches a value elsewhere in the same document. Some extensions provide this functionality, but it is currently out of scope for the standard vocabulary.

Fundamentals of JSON Schema Usage

The JSON document being validated or described is called the instance, and the document containing the description is called the schema.

Fundamentally, a schema is a list of rules, collected into a JSON object. The empty object accepts every well-formed JSON document and prohibits nothing:

{
}

By adding properties to this object, called keywords, you can add constraints or annotations to a JSON document. The name of the keyword is used as the property key, and any arguments to the keyword are put in the property value. JSON Schema lists keywords in an object because most keywords only need to be used once, and most keywords take some sort of argument.

One of the most widely used keywords is "type", which checks that the type of the JSON value is one of a few specified types (object, array, string, number, integer, boolean, or null). This schema verifies that the document is a string:

{
	"type": "string"
}

This means that objects, arrays, numbers, booleans, and null will cause a validation error when checked against this schema; only strings will pass validation.

The type keyword also accepts an array, if multiple types are acceptable:

{
	"type": ["string", "number"]
}
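
As a rough illustration of how the two forms of "type" behave, here is a minimal Python sketch. It is not a conforming validator: json_type and type_ok are hypothetical helper names, only the "type" keyword is checked, and the "integer" type name is not handled.

```python
import json

def json_type(value):
    """Name the JSON type of a value produced by json.loads (hypothetical helper)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool first: in Python, bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array" if isinstance(value, list) else "object"

def type_ok(schema, instance):
    """Check only the "type" keyword, in either its string or its array form."""
    if "type" not in schema:
        return True  # an absent keyword imposes no constraint
    allowed = schema["type"]
    if isinstance(allowed, str):
        allowed = [allowed]
    return json_type(instance) in allowed

single = json.loads('{"type": "string"}')
multi = json.loads('{"type": ["string", "number"]}')

# Print each instance with its result under the single-type and multi-type schema.
for text in ['"hello"', "42", "null", "[1, 2]"]:
    instance = json.loads(text)
    print(text, type_ok(single, instance), type_ok(multi, instance))
```

In this sketch only "hello" passes the single-type schema, while both "hello" and 42 pass the schema that lists two types.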

Some keywords are only checked when the instance is a specific type. For example, the "minLength" keyword only applies to strings, and the "minItems" keyword only applies to arrays. This is by design, so that keywords for multiple types may be mixed without interfering with each other. Consider this schema:

{
	"type": ["string", "number"],
	"exclusiveMinimum": 0
}

This schema allows instances to be a string or a number; if the instance is a number, it must be positive (greater than zero). If it’s a string, there’s no limit on its contents: it can even be the empty string "", or the one-character string "0".
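
The type-specific behavior can be sketched in a few lines of Python. This is a toy check under stated assumptions, not a real implementation: accepts and is_number are hypothetical names, and only "type" (for strings and numbers) and "exclusiveMinimum" are understood.

```python
def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def accepts(schema, instance):
    """Toy check covering only "type" (string/number) and "exclusiveMinimum"."""
    allowed = schema.get("type")
    if allowed is not None:
        if isinstance(allowed, str):
            allowed = [allowed]
        matches = ("string" in allowed and isinstance(instance, str)) or (
            "number" in allowed and is_number(instance)
        )
        if not matches:
            return False
    # "exclusiveMinimum" is consulted only when the instance is a number
    if "exclusiveMinimum" in schema and is_number(instance):
        return instance > schema["exclusiveMinimum"]
    return True

schema = {"type": ["string", "number"], "exclusiveMinimum": 0}

for instance in [5, 0.5, 0, -3, "", "0", None]:
    print(repr(instance), accepts(schema, instance))
```

The strings "" and "0" pass because "exclusiveMinimum" never looks at strings; the numbers 0 and -3 fail because they are not greater than zero.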

Some keywords take schemas as an argument, and apply those schemas to values within the document. The "properties" keyword maps schemas onto specific properties in an instance, if those properties exist:

{
	"type": "object",
	"properties": {
		"id": {"type":"string", "maxLength":20},
		"label": {"type":"string", "maxLength":250}
	}
}
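
To make the "if those properties exist" behavior concrete, here is a small Python sketch. The helper names string_ok and properties_ok are hypothetical, and the sketch assumes every subschema uses only "type": "string" and "maxLength".

```python
def string_ok(schema, value):
    """Toy check for the "type": "string" and "maxLength" keywords."""
    if schema.get("type") == "string" and not isinstance(value, str):
        return False
    if isinstance(value, str) and "maxLength" in schema:
        return len(value) <= schema["maxLength"]  # len() counts code points
    return True

def properties_ok(schema, instance):
    """Apply each subschema in "properties" to the matching property, if present."""
    if not isinstance(instance, dict):
        return schema.get("type") != "object"
    for name, subschema in schema.get("properties", {}).items():
        if name in instance and not string_ok(subschema, instance[name]):
            return False
    return True

schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string", "maxLength": 20},
        "label": {"type": "string", "maxLength": 250},
    },
}

print(properties_ok(schema, {}))                # no listed property is present
print(properties_ok(schema, {"id": "abc"}))     # "id" is a short string
print(properties_ok(schema, {"id": 5}))         # "id" is not a string
print(properties_ok(schema, {"unlisted": 42}))  # unlisted properties are ignored
```

An empty object passes in this sketch, because "properties" constrains only the properties that are actually present.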

The "additionalProperties" keyword supplies a schema that is applied to any properties not listed in "properties" or "patternProperties":

{
	"type": "object",
	"additionalProperties": {"type":"number"}
}

This schema verifies that each value in the object is a number.
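
A minimal Python sketch of this behavior follows. It is illustrative only: additional_ok is a hypothetical name, "patternProperties" and the boolean form of "additionalProperties" are ignored, and the only subschema it understands is {"type": "number"}.

```python
def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def additional_ok(schema, instance):
    """Toy check: apply "additionalProperties" to properties not named in "properties"."""
    if not isinstance(instance, dict):
        return schema.get("type") != "object"
    declared = schema.get("properties", {})  # consulted first
    extra = schema.get("additionalProperties", {})
    for name, value in instance.items():
        if name in declared:
            continue  # governed by "properties", not by "additionalProperties"
        if extra.get("type") == "number" and not is_number(value):
            return False
    return True

numbers_only = {"type": "object", "additionalProperties": {"type": "number"}}
mixed = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "additionalProperties": {"type": "number"},
}

print(additional_ok(numbers_only, {"a": 1, "b": 2.5}))
print(additional_ok(numbers_only, {"a": 1, "b": "2"}))
print(additional_ok(mixed, {"name": "Ada", "age": 36}))
```

In the mixed schema, "name" is skipped by the additional-properties check because "properties" already lists it.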

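Unless "additionalProperties" is used, any properties not listed in "properties" are not checked, and listed properties are not required to be present. The "required" keyword lists properties that must exist: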

{
	"type": "object",
	"required": ["id"],
	"properties": {
		"id": {"type":"string", "maxLength":20},
		"label": {"type":"string", "maxLength":250}
	}
}
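
The effect of "required" can be sketched as a simple membership test. The function name missing_required is hypothetical, and the sketch checks only this one keyword.

```python
def missing_required(schema, instance):
    """Return the names listed in "required" that are absent from the instance."""
    if not isinstance(instance, dict):
        return []  # "required" applies only to objects
    return [name for name in schema.get("required", []) if name not in instance]

schema = {
    "type": "object",
    "required": ["id"],
    "properties": {
        "id": {"type": "string", "maxLength": 20},
        "label": {"type": "string", "maxLength": 250},
    },
}

print(missing_required(schema, {"id": "a1", "label": "First"}))
print(missing_required(schema, {"label": "No identifier"}))
```

Returning the list of missing names, rather than a bare true or false, is a design choice that makes it easier to report the problem to a user.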

Likewise for arrays, the "items" keyword accepts an array of schemas, which are applied to the items of the instance by position:

{
	"type": "array",
	"items": [
		{"type":"number"},
		{"type":"number", "minimum":0}
	],
	"additionalItems": {"type":"string"}
}

This schema allows an array in which the first item must be a number, the second item must be a non-negative number, and any items after that must be strings.
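
The positional behavior can be sketched in Python as follows. This is a toy under stated assumptions: array_ok and item_ok are hypothetical names, and item subschemas may use only "type" (number or string) and "minimum".

```python
def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def item_ok(schema, value):
    """Toy check for "type" (number/string) and "minimum" on a single item."""
    kind = schema.get("type")
    if kind == "number" and not is_number(value):
        return False
    if kind == "string" and not isinstance(value, str):
        return False
    if "minimum" in schema and is_number(value):
        return value >= schema["minimum"]
    return True

def array_ok(schema, instance):
    """Apply "items" by position, then "additionalItems" to whatever is left."""
    if not isinstance(instance, list):
        return schema.get("type") != "array"
    positional = schema.get("items", [])
    rest = schema.get("additionalItems", {})
    for index, value in enumerate(instance):
        subschema = positional[index] if index < len(positional) else rest
        if not item_ok(subschema, value):
            return False
    return True

schema = {
    "type": "array",
    "items": [{"type": "number"}, {"type": "number", "minimum": 0}],
    "additionalItems": {"type": "string"},
}

print(array_ok(schema, [-5, 0, "a", "b"]))  # every position satisfies its schema
print(array_ok(schema, [1, -1]))            # second item is below the minimum
print(array_ok(schema, [1, 2, 3]))          # third item is not a string
```

Note that a short array such as [1] or even [] also passes in this sketch, since positional schemas apply only to items that exist.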

Typically, keywords do not interfere with each other's behavior. However, "additionalProperties" and "additionalItems" are two exceptions: their behavior depends on other keywords, which must be checked first.

Sometimes the same schema is re-used in multiple places. To uniquely identify a schema, it may be given a URI with the $id keyword:

{
	"$id": "http://example.com/schema/address",
	"type": "object",
	"properties": {
		"id": {"type":"string", "maxLength":20},
		"label": {"type":"string", "maxLength":250}
	}
}

Then from other schemas, you can import the schema by using the $ref keyword:

{
	"$ref": "http://example.com/schema/address"
}

Using $ref is roughly equivalent to embedding the referenced schema as an item inside allOf, with the URI base changed to the document’s URI.
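
One way to picture the lookup is a table of schemas keyed by their $id. The sketch below assumes a hypothetical in-memory registry and the hypothetical helpers register and dereference; it matches full URIs exactly and does not fetch anything over the network or resolve relative references.

```python
registry = {}  # hypothetical in-memory store: URI -> schema

def register(schema):
    """File a schema under the URI given by its "$id"."""
    registry[schema["$id"]] = schema

def dereference(schema):
    """Return the schema that a {"$ref": ...} object points to, or the schema itself."""
    if "$ref" in schema:
        return registry[schema["$ref"]]  # KeyError if the URI was never registered
    return schema

address = {
    "$id": "http://example.com/schema/address",
    "type": "object",
    "properties": {
        "id": {"type": "string", "maxLength": 20},
        "label": {"type": "string", "maxLength": 250},
    },
}
register(address)

reference = {"$ref": "http://example.com/schema/address"}
print(dereference(reference) is address)
```

Because the lookup is an exact string match, a misspelled URI in $ref raises KeyError in this sketch instead of silently matching nothing.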

You may also use $ref to perform recursion:

{
	"$id": "http://example.com/schema/root#",
	"type": "object",
	"required": ["id"],
	"properties": {
		"id": { "type": "string", "maxLength": 32 },
		"children": {
			"type": "array",
			"items": { "$ref": "#" }
		}
	}
}

Here, the schema describes an object where the "children" property is an array of objects, with the same structure, of unlimited depth. For example:

{
	"id": "0",
	"children": [
		{
			"id": "1",
			"children": [
				{"id":"2"},
				{"id":"3"}
			]
		},
		{
			"id": "4",
			"children": [
				{"id":"5"},
				{"id":"6"}
			]
		}
	]
}
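
The recursion can be followed with a short Python sketch. It is a toy, not a conforming validator: valid and PY_TYPES are hypothetical names, the only reference it resolves is "#" (taken to mean the root schema), and it understands just the keywords used in the schema above.

```python
import json

SCHEMA = json.loads("""
{
    "$id": "http://example.com/schema/root#",
    "type": "object",
    "required": ["id"],
    "properties": {
        "id": { "type": "string", "maxLength": 32 },
        "children": { "type": "array", "items": { "$ref": "#" } }
    }
}
""")

PY_TYPES = {"object": dict, "array": list, "string": str}

def valid(schema, instance, root):
    """Recursively check an instance; a "$ref" of "#" restarts at the root schema."""
    if "$ref" in schema:
        if schema["$ref"] != "#":
            raise NotImplementedError("this sketch only resolves the reference '#'")
        return valid(root, instance, root)
    expected = PY_TYPES.get(schema.get("type"))
    if expected is not None and not isinstance(instance, expected):
        return False
    if isinstance(instance, str) and "maxLength" in schema:
        if len(instance) > schema["maxLength"]:
            return False
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False
        for name, subschema in schema.get("properties", {}).items():
            if name in instance and not valid(subschema, instance[name], root):
                return False
    if isinstance(instance, list) and "items" in schema:
        return all(valid(schema["items"], item, root) for item in instance)
    return True

tree = {"id": "0", "children": [{"id": "1", "children": [{"id": "2"}, {"id": "3"}]}]}
broken = {"id": "0", "children": [{"id": "1", "children": [{"label": "no id"}]}]}

print(valid(SCHEMA, tree, SCHEMA))
print(valid(SCHEMA, broken, SCHEMA))
```

The second instance fails two levels down, where a child object lacks the required "id" property; the same rules apply at every depth.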

Note how the value for $ref may be a URI Reference: it permits relative references, in addition to a complete, fully qualified URI. Resolving a reference into a full URI is done the same way that Web browsers do, against the URI of the document it’s inside. The $id keyword may itself be a URI Reference, though you should always use a full URI.

If $id contains a relative reference, it works the same way as a base tag in HTML: it is resolved against what the base URI would otherwise have been (either the URL used to download the document, or the URI of the parent schema); any other URI References in the document are then resolved against the result.
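
Reference resolution of this kind can be tried out with urljoin from Python's standard urllib.parse module, which follows the same general rules that browsers apply to relative links. The URLs below are hypothetical and chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical URL from which the schema document was downloaded.
retrieval_url = "http://example.com/files/doc.json"

# A relative "$id" is resolved against the retrieval URL to give the new base URI.
base = urljoin(retrieval_url, "schemas/address")
print(base)

# Every "$ref" in that schema is then resolved against the new base.
sibling = urljoin(base, "name")                      # relative path
rooted = urljoin(base, "/common/name")               # path starting at the host root
absolute = urljoin(base, "http://other.example/x")   # already a full URI
print(sibling)
print(rooted)
print(absolute)
```

A full URI passes through unchanged, which is one reason a full URI in $id is the least surprising choice.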