JSON Schema by Example: Validate Any Data Shape

Quick answer

💡JSON Schema uses keywords like type, properties, required, and additionalProperties to define the allowed shape of a JSON document. Validate using ajv in Node.js: const ajv = new Ajv(); const validate = ajv.compile(schema); const valid = validate(data). If valid is false, ajv.errors contains structured error objects explaining each violation.

Error symptoms

  • API receives data with missing required fields and proceeds silently, causing downstream null reference errors
  • A field expected to be a number arrives as a string because the client sent it in quotes (for example "42"), and JSON parsing preserves the string type rather than coercing it
  • Validation passes for data that has extra unexpected fields, which then break a strict backend deserializer
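  • Validation returns false but ajv.errors is null or empty, because a compiled validate function reports its errors on validate.errors and each later validation call overwrites the previous errors
  • format validation for email or date-time never rejects obviously wrong values, or the schema fails to compile with an unknown format error, because the ajv-formats plugin is missing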
  • A $ref circular reference causes the ajv schema compiler to hang or throw a maximum call stack size error

Common causes

  • Omitting the required array from an object schema, so all properties become optional by default
  • Not setting additionalProperties: false, allowing extra undocumented fields to pass validation
  • Confusing oneOf and anyOf semantics, causing schemas that never validate or that validate when they should not
  • Using format keywords without installing and configuring the ajv-formats plugin in ajv v7 and later
  • Referencing a $ref path like #/definitions/Address when the schema actually uses $defs for Draft 2019-09
  • Not accounting for strict mode, which ajv v8 enables by default and which rejects unknown keywords and unknown formats when the schema is compiled

When it happens

  • When building a REST API that needs to validate incoming request bodies before processing them
  • When migrating from custom validation logic to a schema-based approach and the schema does not cover all the original rules
  • When consuming a third-party API and needing to validate that its responses match the documented contract before using the data
  • When using OpenAPI 3.x tooling that generates schemas from definitions, and the generated schemas conflict with ajv's strict mode
  • When sharing data schemas between a Node.js backend and a TypeScript frontend to ensure both validate the same constraints

Examples and fixes

A complete JSON Schema for a user registration payload validates required fields, string formats, and numeric ranges. The ajv validator reports structured errors when validation fails.

Defining and validating a user registration schema with ajv

❌ Wrong

const Ajv = require('ajv');
const ajv = new Ajv();

// Schema is missing required array and additionalProperties constraint
const registrationSchema = {
  type: 'object',
  properties: {
    username: { type: 'string' },
    email: { type: 'string' },
    age: { type: 'number' }
  }
};

const incomingRequest = {
  username: 'jordanwebb',
  injectedField: 'malicious data',
  age: 'not a number'
};

const valid = ajv.validate(registrationSchema, incomingRequest);
console.log('Valid:', valid);
// false, but only because age is a string: the missing email and the
// injected field are never reported

✅ Fixed

const Ajv = require('ajv');
const addFormats = require('ajv-formats');

const ajv = new Ajv({ allErrors: true });
addFormats(ajv);

const registrationSchema = {
  $schema: 'http://json-schema.org/draft-07/schema#',
  type: 'object',
  required: ['username', 'email', 'age'],
  additionalProperties: false,
  properties: {
    username: { type: 'string', minLength: 3, maxLength: 30 },
    email: { type: 'string', format: 'email' },
    age: { type: 'integer', minimum: 13, maximum: 120 }
  }
};

const validate = ajv.compile(registrationSchema);
const incomingRequest = {
  username: 'jordanwebb',
  injectedField: 'malicious data',
  age: 'not a number'
};

if (!validate(incomingRequest)) {
  console.log(validate.errors);
  // Reports: missing required property 'email', unexpected property
  // 'injectedField', and age must be integer
}

The broken schema has no required array, so every field is optional and the missing email is not caught. It has no additionalProperties: false, so the injected extra field passes unchecked. The only problem it detects is the string in age, so validation fails for one reason while two others go unreported, and a payload such as { username: 'jordanwebb', injectedField: 'malicious data' } would pass outright. The fixed schema adds required for all mandatory fields, sets additionalProperties: false to reject extra fields, uses format: 'email' with the ajv-formats plugin for email validation, constrains age to an integer with a minimum and maximum, and compiles the schema once before reusing the validate function. The allErrors: true option makes ajv report all validation failures rather than stopping at the first one.

A $ref allows defining a schema once in the definitions section and referencing it from multiple places, keeping the schema DRY and consistent across nested types.

Using $ref for schema reuse across nested objects

❌ Wrong

const orderSchema = {
  type: 'object',
  required: ['orderId', 'billingAddress', 'shippingAddress'],
  properties: {
    orderId: { type: 'string' },
    billingAddress: {
      type: 'object',
      required: ['street', 'city', 'postalCode'],
      properties: {
        street: { type: 'string' },
        city: { type: 'string' },
        postalCode: { type: 'string', pattern: '^[0-9]{5}$' }
      }
    },
    shippingAddress: {
      type: 'object',
      required: ['street', 'city', 'postalCode'],
      properties: {
        street: { type: 'string' },
        city: { type: 'string' },
        postalCode: { type: 'string', pattern: '^[0-9]{5}$' }
      }
    }
  }
};

✅ Fixed

const Ajv = require('ajv');
const ajv = new Ajv();

const orderSchema = {
  $schema: 'http://json-schema.org/draft-07/schema#',
  definitions: {
    Address: {
      type: 'object',
      required: ['street', 'city', 'postalCode'],
      additionalProperties: false,
      properties: {
        street: { type: 'string', minLength: 1 },
        city: { type: 'string', minLength: 1 },
        postalCode: { type: 'string', pattern: '^[0-9]{5}$' }
      }
    }
  },
  type: 'object',
  required: ['orderId', 'billingAddress', 'shippingAddress'],
  additionalProperties: false,
  properties: {
    orderId: { type: 'string', pattern: '^ORD-[0-9]+$' },
    billingAddress: { '$ref': '#/definitions/Address' },
    shippingAddress: { '$ref': '#/definitions/Address' }
  }
};

const validate = ajv.compile(orderSchema);
console.log('Schema compiled successfully');

The broken schema duplicates the address definition for both billingAddress and shippingAddress. If the address format needs to change, both copies must be updated, and it is easy to introduce inconsistencies. The fixed schema defines Address once in the definitions section and references it with $ref in both property definitions. The $ref value '#/definitions/Address' is a JSON Pointer that the ajv compiler resolves within the same schema document. If the schema were split across files, the $ref could instead name another schema by its $id, provided that schema has been registered with ajv (for example with ajv.addSchema). Using $ref also means ajv validates both addresses with exactly the same rules, guaranteeing consistency.

How JSON Schema constrains the shape of data

JSON Schema is a vocabulary for annotating and validating JSON documents. Unlike a database schema or a type system, JSON Schema validates data at runtime by checking each value against a set of keyword constraints. The schema itself is a JSON document, which means it can be read, written, and transmitted using the same tools as any other JSON. This self-describing quality makes JSON Schema portable across languages and platforms.

The core of JSON Schema is the type keyword, which restricts a value to one or more of seven types: the six JSON types (string, number, boolean, array, object, and null) plus integer. A schema with only type: 'object' accepts any JSON object regardless of its properties. To constrain the properties, the properties keyword maps property names to subschemas, each of which defines the rules for that property's value. Properties defined under the properties keyword are optional by default; to require specific properties, list them in the required array.

The additionalProperties keyword controls whether a JSON object may contain properties beyond those defined in the properties keyword. By default, additionalProperties is effectively true, meaning any extra property is allowed. Setting additionalProperties to false turns the schema into a closed type: only the explicitly listed properties are accepted, and any extra property causes validation to fail. This is particularly useful for API request validation, where accepting unexpected extra fields could indicate a client programming error or a security concern.

Beyond type-level constraints, JSON Schema provides a rich set of value-level keywords. For strings, minLength, maxLength, and pattern (a regular expression) validate the content and length of string values. For numbers, minimum, maximum, exclusiveMinimum, and exclusiveMaximum set numeric bounds. For arrays, minItems, maxItems, and uniqueItems control the array's length and element uniqueness. For objects, minProperties and maxProperties limit the number of allowed keys.

The $schema keyword at the top of a schema document declares which draft of the JSON Schema specification the schema uses. This matters because different drafts have slightly different keyword semantics. The most widely supported draft for tooling is draft-07, which introduced if/then/else conditional validation. Draft 2019-09 and Draft 2020-12 are the newer drafts; they change $ref handling, replace definitions with $defs, and refine the semantics of the composition keywords.

Writing JSON schemas that validate nested objects precisely

Validating nested objects requires composing multiple subschemas. Each property in an object schema can have its own subschema, which itself may be an object schema with further nested subschemas. This recursive composition is one of JSON Schema's most powerful features, but it also means that the total schema can become large and complex for deeply nested data structures.

When writing a schema for a nested object, start from the outermost type and work inward. Define the top-level object's required array and additionalProperties setting first, then define each property's type. For properties that are themselves objects, write a nested object schema in the properties definition, again specifying type, required, and additionalProperties for the nested object. For properties that are arrays of objects, use items to define the schema that each element must match.

A common mistake when writing nested schemas is forgetting to set required and additionalProperties on inner objects. If the outer object has additionalProperties: false but the inner objects do not, extra fields on the inner objects will be silently accepted. Each object subschema in a composition must independently declare its own constraints; they are not inherited from the parent schema.

Testing nested schemas incrementally is more efficient than writing the complete schema first and debugging it all at once. Start by validating known-good examples against the schema; if a valid example fails, the schema is too strict. Then test invalid examples and verify that each violation is caught; if an invalid example passes, the schema is too lenient. The error objects that ajv returns make this iterative process faster by reporting which specific keyword and which data path caused each failure.

For schemas that represent deeply nested API responses with many optional fields, using TypeScript's type system as a guide is helpful. Each TypeScript interface property that is required and non-null maps to a required, non-nullable JSON Schema property. An optional TypeScript property (marked with ?) maps to a property that is not in the required array. A union type like string | null maps to a JSON Schema type array: { type: ['string', 'null'] }.

Validating JSON payloads with ajv in Node.js

The ajv library is the de-facto standard JSON Schema validator for JavaScript and TypeScript. It compiles a schema to a validation function using code generation, making repeated validation extremely fast compared to interpreted validators. The basic pattern is to create an Ajv instance, compile the schema once, and reuse the compiled validate function for every request.

Install ajv and the format validation plugin with npm install ajv ajv-formats. The ajv package provides the core validator, and ajv-formats adds support for the format keyword with built-in recognizers for email, uri, date, date-time, time, uuid, and several other common formats. Without ajv-formats, ajv does not know these formats: in ajv v8's default strict mode, compiling a schema that uses one throws an unknown format error, and with strict mode disabled the keyword is ignored, meaning any string passes regardless of content.

Choosing the options for the ajv instance matters. The allErrors: true option makes ajv collect all validation errors before returning, rather than stopping at the first failure. This is useful for API validation because it lets you report all the problems in a request at once rather than requiring the client to fix one error and resubmit. The strict: false option relaxes several default behaviors in ajv v8 that break compatibility with schemas written for earlier versions, such as schemas that use unknown keywords or unknown formats.

The compiled validate function returns true if the data passes all constraints and false if any constraint fails. When it returns false, the errors are available on the validate.errors array rather than throwing an exception. Each error object has an instancePath property pointing to the data location that failed, a keyword property naming the schema keyword that was violated, and a message property with a human-readable description. For API error responses, these error objects can be serialized directly into a 400 Bad Request response body.
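One possible way to shape such a response body is sketched below. The helper name toErrorResponse and the output field names are hypothetical, and the sample errors are hand-written objects carrying only the three properties described above, not output captured from ajv.

```javascript
// Hypothetical helper: converts ajv-style error objects into a 400 response body.
function toErrorResponse(errors) {
  return {
    status: 400,
    errors: (errors || []).map((err) => ({
      // An empty instancePath refers to the root of the payload.
      path: err.instancePath === '' ? '/' : err.instancePath,
      rule: err.keyword,
      message: err.message
    }))
  };
}

// Hand-written sample with the three properties described above.
const sampleErrors = [
  { instancePath: '', keyword: 'required', message: "must have required property 'email'" },
  { instancePath: '/age', keyword: 'type', message: 'must be integer' }
];

const responseBody = toErrorResponse(sampleErrors);
console.log(JSON.stringify(responseBody, null, 2));
```

Mapping to a small, fixed set of fields rather than serializing the raw objects keeps internal schema paths out of the public response, which may or may not matter for a given API.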

For production applications, compile schemas at startup rather than on each request. Schema compilation involves code generation and is expensive relative to the actual validation. Caching compiled validators in a module-level variable or a map keyed by schema identifier is a standard pattern. If schemas are loaded from files at runtime, compile each schema once when the application starts and reuse the compiled validators throughout the application's lifetime.

oneOf versus anyOf in JSON Schema composition

JSON Schema provides three composition keywords for combining multiple subschemas: oneOf, anyOf, and allOf. Understanding the difference between them is essential for writing schemas that correctly validate discriminated union types and polymorphic data structures.

oneOf requires the data to be valid against exactly one of the listed subschemas. If the data matches zero subschemas, validation fails. If the data matches two or more subschemas, validation also fails. This makes oneOf suitable for strict discriminated unions where the data must be one specific type and no two types can be simultaneously satisfied. For example, a payment method schema might use oneOf to enforce that a payment is either a credit card (with cardNumber and expiryDate fields) or a bank transfer (with accountNumber and routingNumber fields), but not both.

anyOf requires the data to be valid against at least one of the listed subschemas. Multiple matches are allowed. This makes anyOf suitable for union types where overlap between alternatives is acceptable. A field that accepts either a string or an array of strings can use anyOf with two subschemas: one that matches strings and one that matches arrays of strings. Since no value is both a string and an array, the overlap concern does not arise for these types.

The subtle gotcha with oneOf is that it can produce confusing error messages when validation fails. If the data fails all subschemas, ajv reports errors from each subschema, and the user must understand that all of them are alternatives rather than cumulative requirements. For discriminated unions, a better pattern is to use if/then/else with a discriminator field rather than oneOf, because the discriminator makes it clear which subschema should apply and the error messages point to the relevant subschema only.

allOf requires the data to be valid against every listed subschema simultaneously. This is JSON Schema's mechanism for schema inheritance and extension. A base schema can be defined once, and an extended schema can use allOf to include all the base schema's constraints and add more. This pattern is commonly used in OpenAPI to express that one data type extends another. The risk with allOf is that conflicting constraints from different subschemas make the combined schema unsatisfiable, meaning no value can pass validation.

When $ref creates circular schema dependencies

Circular schema references occur when schema A references schema B, and schema B references schema A, either directly or through a chain of intermediate references. JSON Schema allows circular references as a mechanism for validating recursive data structures like trees or linked lists. However, some validators and tools have difficulty handling circular schemas, and certain patterns create infinite loops during schema compilation.

Ajv supports circular references, and validation terminates because the data being validated is finite. A schema for a tree node that references itself through the children property works correctly because each recursive validation call processes a child node that is smaller than its parent. Validation terminates when it reaches leaf nodes with no children array, or with an empty children array.

The problems arise when tools other than the validator encounter the schema. JSON Schema documentation generators, code generation tools that produce TypeScript types from schemas, and schema editors may not handle circular references gracefully. Some tools resolve $ref references by inlining the referenced schema, which turns a circular reference into an infinite inlining process that never terminates. Others detect the cycle and either stop with an error or emit a sentinel type like any to break the cycle.

The practical approach for schemas that represent recursive data is to use $ref thoughtfully and test the schema with every tool in the pipeline, not just the validator. If documentation generation fails on a circular schema, consider breaking the cycle by defining the recursive type differently: for example, representing a tree node's children as an array of any type and handling the recursive validation separately, or using a maximum depth convention where the schema only validates a fixed number of nesting levels.

In Draft 2019-09 and Draft 2020-12, the $ref keyword's semantics changed. In draft-07, a schema object containing $ref was treated purely as a reference, and any sibling keywords were ignored. In the newer drafts, $ref is treated as a regular keyword and can sit directly alongside other keywords, without wrapping them in allOf. This change means that schemas written for draft-07 that use $ref alongside other keywords may behave differently under newer validators. Always test schemas with the validator version your production code uses.

Schema versioning with the $schema declaration

The $schema keyword at the top of a JSON Schema document is both documentation and a machine-readable instruction. It tells schema validators and tooling which draft specification to use when interpreting the schema's keywords. Without a $schema declaration, tools must make assumptions about the draft, which can lead to subtle validation differences between tools or between versions of the same tool.

The three most important $schema values in current use are: http://json-schema.org/draft-07/schema# for Draft 7, https://json-schema.org/draft/2019-09/schema for Draft 2019-09, and https://json-schema.org/draft/2020-12/schema for Draft 2020-12. The change from http to https in the URI matters for some tools that perform strict URI matching. Always use the exact URI string that the specification document defines.

When writing new schemas for projects starting today, Draft 2020-12 is the latest published draft and provides the most consistent semantics. ajv v8 supports it through a separate class exported from ajv/dist/2020, rather than through the default Ajv class, which targets draft-07. For projects that use OpenAPI 3.1, which adopts JSON Schema 2020-12 as its schema vocabulary, using the same draft version for standalone schemas avoids having to maintain two different schema dialects in the same project.

Versioning schemas across API releases requires deciding how to handle backward-incompatible changes. Adding a new optional property to a schema is backward compatible because existing data still validates. Making an optional property required is backward incompatible because existing data that omits the field will fail. Adding a stricter constraint, like reducing maxLength, is backward incompatible. A common strategy is to version schemas alongside the API version, keeping the old schema for validation of data produced by or for old clients while introducing the new schema for the current version.

Publishing schemas to a stable URL enables $ref references from other schemas and allows tools to fetch the schema dynamically. If your API publishes a JSON Schema at a consistent URL like https://api.example.com/schemas/v2/user.json, other teams can reference it directly in their $ref declarations and always get the current version. Adding versioning to the URL path allows breaking changes to be introduced at a new URL while old references continue to point to the old schema.

Quick fix checklist

  • Add the required array to every object schema to specify which properties must be present
  • Set additionalProperties: false on object schemas to reject unexpected extra fields
  • Install ajv-formats and call addFormats(ajv) before using any format keyword for email, uri, or date validation
  • Compile schemas once at application startup and reuse the compiled validate function for each request
  • Enable allErrors: true on the ajv instance to collect all validation errors instead of stopping at the first one
  • Use $ref with the definitions or $defs section to avoid duplicating nested object schemas
  • Understand whether you need oneOf (exactly one match) or anyOf (at least one match) before choosing composition keywords
  • Always include the $schema declaration to explicitly specify the JSON Schema draft version

Frequently asked questions

What is the difference between Draft-07 and Draft 2020-12 JSON Schema?

Draft 2020-12 is the latest published draft of the specification. Compared with draft-07, it brings several improvements: $defs replaces definitions for schema reuse, $ref can appear alongside other keywords in the same schema object (sibling keywords were ignored in draft-07), the unevaluatedProperties keyword provides more precise control than additionalProperties, and the prefixItems keyword handles tuple validation more clearly. For most practical validation scenarios, the differences are small, but new projects should use Draft 2020-12.

Why does ajv's format validation always pass even for invalid emails?

In ajv v7 and later, the format validators are no longer bundled with the core package. You must install the ajv-formats package and call addFormats(ajv) after creating the Ajv instance. Without this step the formats are unknown to ajv: with strict mode disabled, the format keyword is ignored and any string passes format constraints, while in ajv v8's default strict mode compiling the schema throws an unknown format error instead. This behavior changed from earlier versions of ajv where format validation was built in.

How do I validate an array where each element must match a schema?

Use the items keyword in an array schema: { type: 'array', items: { type: 'string' } } validates that every element is a string. For arrays of objects, set items to an object schema with its own type, required, and properties. The minItems and maxItems keywords constrain the array length. The uniqueItems: true keyword rejects arrays with duplicate values, comparing elements by deep equality.

What is the difference between oneOf and anyOf in JSON Schema?

oneOf requires the data to match exactly one subschema from the list — matching zero or two or more subschemas all cause validation failure. anyOf requires the data to match at least one subschema from the list — matching multiple subschemas is allowed. Use oneOf for strict discriminated unions where alternatives are mutually exclusive. Use anyOf for type unions like string-or-array where partial overlap is acceptable.

Can JSON Schema validate the relationship between two fields?

Yes, using the if/then/else keywords introduced in draft-07. For example, if the paymentMethod property equals 'credit_card', then the cardNumber property should be required. Write the if clause as a schema that the data must satisfy, the then clause as the schema applied when it does, and the else clause as the schema applied when it does not. This allows cross-field conditional validation without custom validation code.

How do I share a schema between a Node.js backend and a TypeScript frontend?

Store the schema in a shared package or a common file that both the backend and frontend import. The schema is a plain JavaScript object or a JSON file that any JSON Schema validator can consume. On the backend, use ajv to validate incoming requests. On the frontend, use the same schema for form validation. Tools like json-schema-to-typescript generate TypeScript types from JSON Schema definitions, keeping types and validation in sync.

What does ajv's instancePath mean in validation error objects?

The instancePath property in an ajv error object is a JSON Pointer string that identifies the location in the validated data where the error occurred. An empty string means the error is at the root of the data. The path /address/postalCode means the error is at data.address.postalCode. Array elements are referenced with their index like /items/2/name, meaning the name property of the third item in the items array failed validation.
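To look up the offending value from an instancePath, a small helper can walk the pointer through the data. The function below is a hypothetical sketch, not part of ajv; it handles the two JSON Pointer escape sequences and returns undefined when the path does not exist.

```javascript
// Hypothetical helper: follows a JSON Pointer (such as an ajv instancePath)
// into the data and returns the value found there.
function resolveInstancePath(data, instancePath) {
  if (instancePath === '') return data; // empty pointer = root of the data
  return instancePath
    .split('/')
    .slice(1) // drop the empty segment before the leading slash
    // JSON Pointer escapes: ~1 stands for '/', ~0 stands for '~'
    .map((segment) => segment.replace(/~1/g, '/').replace(/~0/g, '~'))
    .reduce((value, key) => (value == null ? undefined : value[key]), data);
}

const order = {
  address: { postalCode: '1234' },
  items: [{ name: 'pen' }, { name: 'ink' }, { name: 42 }]
};

console.log(resolveInstancePath(order, '/address/postalCode')); // '1234'
console.log(resolveInstancePath(order, '/items/2/name'));       // 42
```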

Does JSON Schema support inheritance or extension of schemas?

JSON Schema does not have explicit inheritance, but the allOf composition keyword achieves the same effect. Define a base schema in definitions, then create an extended schema that uses allOf: [{ '$ref': '#/definitions/BaseType' }, { properties: { extraField: ... }, required: ['extraField'] }]. The extended schema validates all constraints from the base type plus the additional constraints. OpenAPI uses this pattern extensively for polymorphic data models.

Last updated: 2026-05-05.