Channel: Web Science and Digital Libraries Research Group

2023-12-20: JSON Schemas Help You Write Better JSON

JSON schemas are a powerful tool for validating the structure and data types of JSON documents. They can help you ensure that your JSON data conforms to a specific format, avoid errors, and improve the quality and interoperability of your applications. In this blog post, I will explain what JSON schemas are, how they work, and why you might need them.

Quick Recap - JSON

JSON allows us to represent different types of elements. Compared to XML-style markup (e.g., HTML), JSON is lightweight and based on the native data structures of JavaScript. For this reason, web apps widely use JSON to communicate over networks. Here are some key differences between XML and JSON [Source].
  • JSON objects have a type, whereas XML data is type-less.
  • JSON does not provide namespace support while XML provides namespaces support.
  • JSON has no display capabilities whereas XML offers the capability to display data.
  • JSON is considered less secure than XML.
  • JSON supports only UTF-8 encoding whereas XML supports various encoding formats.

JSON Primitives

JSON has 4 types of primitives:
  • string - represented within double quotes. e.g., "John Doe", "Norfolk"
  • number - represented as the number itself. e.g., 3.14
  • boolean - represented as true or false (lower case, unlike in Python)
  • null - represented as null.
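
As a quick illustration of the four primitives, the sketch below parses a JSON array containing one or more of each using Python's standard json module and reports the Python type each value maps to. The sample values are illustrative.

```python
import json

# One JSON array holding every primitive type (sample values are illustrative).
text = '["John Doe", 3.14, 42, true, false, null]'
values = json.loads(text)

# JSON numbers become int or float; true/false/null become True/False/None.
types = [type(v).__name__ for v in values]

print(values)  # ['John Doe', 3.14, 42, True, False, None]
print(types)   # ['str', 'float', 'int', 'bool', 'bool', 'NoneType']
```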

JSON Objects

A JSON object is a collection of comma-separated key: value pairs wrapped in curly braces {}. It's generally used to describe some entity by its defining properties.

Examples of JSON Objects
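
A minimal sketch of a JSON object, parsed with Python's standard json module. The age and student properties are made up for illustration.

```python
import json

# A JSON object describing a person; "age" and "student" are hypothetical fields.
person_text = '{"name": "John Doe", "city": "Norfolk", "age": 30, "student": false}'

# json.loads turns a JSON object into a Python dict.
person = json.loads(person_text)

print(person["name"])     # John Doe
print(person["student"])  # False
```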

JSON Arrays

A JSON array is an ordered collection of comma-separated values (primitives, objects, or other arrays) wrapped in square brackets []. It's generally used to describe a list of related elements.

Examples of JSON Arrays
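
A minimal sketch of two JSON arrays, one holding primitives and one holding objects. The values are made up for illustration.

```python
import json

# An array of strings and an array of objects (sample data is illustrative).
cities = json.loads('["Norfolk", "Richmond", "Roanoke"]')
people = json.loads('[{"name": "John Doe"}, {"name": "Jane Doe"}]')

# json.loads turns a JSON array into a Python list, preserving order.
print(cities[0])          # Norfolk
print(people[1]["name"])  # Jane Doe
```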

JSON Documents

To represent richer information in JSON format, we typically use a mix of primitives, objects, and arrays, and even nest objects/arrays within other objects/arrays.
Examples of a JSON Document

Reading JSON Data

When reading a JSON object, we access its values by key, whereas when reading a JSON array, we access its values by position (i.e., array index). e.g.,

An Example of JSON Data (Left) and Python Code to Lookup its Data (Right)
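
A minimal sketch of such lookups in Python, using made-up data. The address and phones properties are illustrative assumptions, not part of any standard.

```python
import json

# A small nested document: an object containing an object and an array of objects.
document = json.loads("""
{
  "name": "John Doe",
  "address": {"city": "Norfolk", "state": "VA"},
  "phones": [
    {"type": "home", "number": "555-0100"},
    {"type": "work", "number": "555-0199"}
  ]
}
""")

# Objects are read by key; arrays are read by index.
city = document["address"]["city"]
first_phone = document["phones"][0]["number"]

# Arrays can also be iterated, e.g., to pick out entries by a property.
work_numbers = [p["number"] for p in document["phones"] if p["type"] == "work"]

print(city, first_phone, work_numbers)
```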

At the top level, JSON data typically takes the {...} form or the [...] form (current JSON standards also allow a bare primitive at the top level). Intuitively, this means JSON data can represent a single object or a collection of smaller objects. Note that real-world JSON data structures can be much more complex than the example shown above.

Limitations of JSON

The JSON syntax, being general-purpose, has downsides when representing information. Consider trying to represent the name of a person "John Doe". This piece of information could be represented in many ways, such as {"name": "John Doe"} and {"name": {"first": "John", "last": "Doe"}}. While both provide the same information, the way that information is organized (i.e., its structure) differs. As such, developers have the option to either (a) write JSON readers for every possible case, or (b) agree upfront on the structure of the JSON data being used. For case (b), JSON schemas can be quite useful. Since you define the structure of the data ahead of time, parsers can first validate the JSON against its schema, and then parse the content according to what’s defined in the schema.
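
To make option (a) concrete, here is a rough sketch of a reader that has to handle both representations of a name. The function name read_full_name is hypothetical; the point is that every extra representation adds another branch.

```python
def read_full_name(record):
    """Return the full name from either representation of "name"."""
    name = record["name"]
    if isinstance(name, str):
        # Form 1: {"name": "John Doe"}
        return name
    if isinstance(name, dict):
        # Form 2: {"name": {"first": "John", "last": "Doe"}}
        return f'{name["first"]} {name["last"]}'
    # Any other shape would need yet another branch.
    raise ValueError("unsupported representation of 'name'")


flat = {"name": "John Doe"}
nested = {"name": {"first": "John", "last": "Doe"}}

print(read_full_name(flat))    # John Doe
print(read_full_name(nested))  # John Doe
```

With an agreed-upon schema, only one of these branches would be needed.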

JSON Schema

In layman's terms, a JSON schema is a JSON document defining the shape and structure of another JSON document. It defines the properties, types, formats, constraints, and relationships of the JSON data. For example, a JSON schema can specify that a JSON document must have a certain number of properties, that each property must be a string or a number, that some properties are required, and others are optional, that some properties must match a regular expression or a predefined list of values, etc.

JSON schema brings several benefits to JSON data, such as:
  • Documenting JSON data and making them easier to understand and maintain.
  • Avoiding common errors and bugs, such as missing or incorrect properties, invalid data types or formats, etc.
  • Enforcing data quality and consistency across your applications and systems that use JSON data.
  • Improving the interoperability and compatibility of your JSON data with other applications and systems that use the same or similar schemas.
  • Automating tasks such as data generation, testing, conversion, transformation, etc.
A JSON schema can be used to validate a JSON document against the schema rules. This can be done by using a JSON schema validator, which is a software library or tool that can check if a JSON document conforms to a given schema. If the JSON document is valid, the validator returns true; otherwise, it returns false or an error message with the details of the validation failure.
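
To show roughly what a validator does, here is a toy sketch that checks only three keywords (type, required, and properties) and returns a list of error messages. It is not a real JSON Schema validator: the function names are hypothetical, most keywords are ignored, and type names outside the table below are not supported. In practice you would use an established validator library instead.

```python
# Maps a subset of JSON Schema type names to Python types (as produced by json.loads).
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
    "object": dict,
    "array": list,
}


def find_errors(instance, schema, path="$"):
    """Return a list of error messages; an empty list means the instance is valid."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(instance, JSON_TYPES[expected])
        # In Python, bool is a subclass of int, so exclude it from "number".
        if expected == "number" and isinstance(instance, bool):
            ok = False
        if not ok:
            errors.append(f"{path}: expected {expected}")
            return errors
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(find_errors(instance[key], subschema, f"{path}.{key}"))
    return errors


def is_valid(instance, schema):
    return not find_errors(instance, schema)


# A hypothetical schema: "name" is required, "age" is optional.
person_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "number"}},
    "required": ["name"],
}

print(is_valid({"name": "John Doe"}, person_schema))   # True
print(find_errors({"age": "thirty"}, person_schema))
```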

Creating a JSON Schema is straightforward; it starts with the following keywords:
  • $schema: specifies which JSON Schema standard (i.e., draft) we will use to define our schema. Several drafts are available, such as Draft 7, 2019-09, and 2020-12.
  • $id: sets a URI for the schema. You can use this unique URI to refer to elements of the schema from inside the same document or from external JSON documents.
  • title and description: state the intent of the schema. These keywords are merely descriptive and do not constrain the structure of the JSON data being validated against the schema.
  • type: defines the first constraint on JSON data. It specifies the type of data allowed within a part or the entirety of the JSON data. For instance, if you define the type as string, JSON data of any other type will be invalid.
Depending on need, many more constraints can be added to a JSON schema, such as required (for objects) and pattern (for strings). Moreover, if you are defining a complex JSON schema with many data types (e.g., arrays of values, references, etc.), those types can be defined independently under a $defs keyword and reused, either within the schema or externally.
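
Putting these keywords together, here is a minimal sketch of a schema written as a Python dictionary. The $id URI, the property names, and the resolve_local_ref helper are hypothetical and only meant to show how $defs, $ref, required, and pattern fit together; this is not the schema from my research.

```python
import json
import re

schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/schemas/person.json",  # hypothetical URI
    "title": "Person",
    "description": "A person with a structured name.",
    "type": "object",
    "properties": {
        "name": {"$ref": "#/$defs/name"},  # reuses the definition below
        "zip": {"type": "string", "pattern": "^[0-9]{5}$"},
    },
    "required": ["name"],
    "$defs": {
        "name": {
            "type": "object",
            "properties": {
                "first": {"type": "string"},
                "last": {"type": "string"},
            },
            "required": ["first", "last"],
        }
    },
}


def resolve_local_ref(root, ref):
    """Follow a local reference such as '#/$defs/name' inside the same schema."""
    node = root
    for part in ref.lstrip("#/").split("/"):
        node = node[part]
    return node


name_schema = resolve_local_ref(schema, schema["properties"]["name"]["$ref"])
zip_pattern = schema["properties"]["zip"]["pattern"]

print(json.dumps(name_schema["required"]))          # ["first", "last"]
print(re.search(zip_pattern, "23529") is not None)  # True
```

Because a schema is itself a JSON document, the dictionary above can be written to a file with json.dumps and shared with other tools.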

Here's a snippet from a JSON schema that I created for my research, and a JSON object that validates against it.

Schema definition of the Data Flow Description Schema


A JSON document created using Data Flow Description Schema

If you haven't used JSON schemas before, I hope this post will motivate you to give them a try. If you do, here are a few resources on how to write, validate, and apply JSON Schema. Happy schema-writing!
-- Yasith Jayawardana
