Developer · 6 min read

JSON Formatting and Validation: A Developer's Practical Guide

JSON looks simple but hides real traps: float precision loss, BOM characters, circular references, and schema gaps that let malformed data propagate deep into a system. This guide covers JSON Schema, ajv, streaming parsers, JSON5/JSONC, and the jq command-line tool.

Tanvrit Team
22 March 2026 · Engineering

JSON is the lingua franca of modern software. REST APIs, configuration files, message queues, log aggregators — virtually every data exchange layer you interact with daily speaks JSON. Yet despite its apparent simplicity, JSON is a surprisingly common source of subtle bugs: malformed payloads that pass TypeScript's type checker, number precision loss that rounds financial figures, and validation gaps that let malformed data propagate deep into a system before causing a failure. This guide covers everything you need to work with JSON correctly and defensively.

JSON Syntax: The Strict Rules

JSON looks like a JavaScript object literal but has stricter syntax rules that trip up developers who conflate the two. The complete JSON specification (ECMA-404 / RFC 8259) is remarkably short — the entire spec fits in a few pages — but it leaves no ambiguity.

JSON supports exactly six data types:

  • string: a sequence of Unicode characters in double quotes. Single quotes are not valid JSON.
  • number: integer or floating-point, no quotes. No special values like Infinity or NaN — these are JavaScript-specific and not part of JSON.
  • boolean: true or false (lowercase only).
  • null: null (lowercase only).
  • array: an ordered list of values in square brackets.
  • object: an unordered collection of key-value pairs in curly braces, where keys must be strings in double quotes.
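To make the rules concrete, here is a small document (names invented for illustration) that exercises all six types:

```javascript
// One JSON document using every JSON type: string, number,
// boolean, null, array, and object
const text = `{
  "name": "Alice",
  "age": 30,
  "active": true,
  "manager": null,
  "tags": ["admin", "editor"],
  "address": { "city": "Oslo" }
}`;

const doc = JSON.parse(text);
console.log(typeof doc.name);         // "string"
console.log(doc.manager === null);    // true
console.log(Array.isArray(doc.tags)); // true
```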

JSON vs JavaScript Object Literals

The differences between JSON and JavaScript object literals are small but significant:

  • No trailing commas: A trailing comma after the last array element or object property is a syntax error in JSON, even though it has been valid in JavaScript since ES5. Hand-written JSON files frequently contain a trailing comma that silently makes them invalid.
  • No comments: JSON has no comment syntax. Not // ... and not /* ... */. Configuration files that people want to annotate with comments require a superset like JSONC or JSON5.
  • Keys must be strings in double quotes: JavaScript allows unquoted keys ({name: "Alice"}) and single-quoted keys. JSON requires {"name": "Alice"} exclusively.
  • No undefined: undefined is a JavaScript value with no JSON equivalent. When you call JSON.stringify() on an object that contains undefined values, those keys are silently omitted from the output.
  • No functions, no Date objects: JSON.stringify(new Date()) serializes to an ISO string. Functions are silently dropped.
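A quick way to internalize the last two rules is to watch JSON.stringify handle them; this short sketch shows the silent drops described above:

```javascript
// undefined and functions vanish; Dates become ISO strings via toJSON()
const obj = {
  name: "Alice",
  nickname: undefined,                        // key silently omitted
  greet: () => "hi",                          // function silently dropped
  created: new Date("2024-03-31T08:00:00Z"),  // serialized as an ISO string
};

console.log(JSON.stringify(obj));
// {"name":"Alice","created":"2024-03-31T08:00:00.000Z"}
```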

JSON Schema: Validating Structure

JSON Schema (jsonschema.org) is a vocabulary for describing the structure of JSON documents. A schema is itself a JSON object that declares what valid instances must look like. It is the standard approach for API contract validation, configuration file validation, and form validation against a server-side model.

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["id", "email", "role"],
  "additionalProperties": false,
  "properties": {
    "id": {
      "type": "string",
      "format": "uuid"
    },
    "email": {
      "type": "string",
      "format": "email",
      "maxLength": 255
    },
    "role": {
      "type": "string",
      "enum": ["admin", "editor", "viewer"]
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "maximum": 150
    },
    "tags": {
      "type": "array",
      "items": { "type": "string" },
      "uniqueItems": true
    }
  }
}

Key schema keywords to know: type (string, number, integer, boolean, array, object, null), required (array of required property names), properties (per-property schemas), additionalProperties: false (reject unknown keys — critical for security), and enum (value allowlist).

Validating JSON in Code

JavaScript: ajv

ajv (Another JSON Schema Validator) is the fastest and most widely used JSON Schema validator for JavaScript. It compiles schemas to optimized validator functions:

import Ajv from "ajv";
import addFormats from "ajv-formats";

const ajv = new Ajv();
addFormats(ajv);  // adds email, uuid, date-time, etc.

const schema = {
  type: "object",
  required: ["email", "role"],
  properties: {
    email: { type: "string", format: "email" },
    role: { type: "string", enum: ["admin", "viewer"] },
  },
  additionalProperties: false,
};

const validate = ajv.compile(schema);

const data = { email: "user@example.com", role: "owner" };
if (!validate(data)) {
  console.error(validate.errors);
  // [{ instancePath: "/role", message: "must be equal to one of the allowed values", ... }]
}

Python: jsonschema

from jsonschema import FormatChecker, ValidationError, validate

schema = {
    "type": "object",
    "required": ["email", "role"],
    "properties": {
        "email": {"type": "string", "format": "email"},
        "role": {"type": "string", "enum": ["admin", "viewer"]},
    },
    "additionalProperties": False,
}

data = {"email": "user@example.com", "role": "admin"}

try:
    # Note: format keywords are ignored unless a FormatChecker is supplied
    validate(instance=data, schema=schema, format_checker=FormatChecker())
    print("Valid")
except ValidationError as e:
    print(f"Invalid: {e.message}")

Formatting: Pretty-Print vs Minification

JSON.stringify() accepts two optional arguments after the value: a replacer and a space argument that controls indentation. Passing a number (typically 2) as space produces a human-readable indented format; omitting it or passing 0 produces a compact single-line string:

const data = { name: "Alice", scores: [98, 87, 94] };

// Pretty-print (2-space indent)
JSON.stringify(data, null, 2);
// {
//   "name": "Alice",
//   "scores": [
//     98,
//     87,
//     94
//   ]
// }

// Minified (no whitespace)
JSON.stringify(data);
// {"name":"Alice","scores":[98,87,94]}

Use pretty-print for: configuration files, developer-facing API responses, debugging output, files committed to version control. Use minified JSON for: production API responses over the wire (then rely on gzip/brotli for compression, which handles repetitive JSON keys extremely well), serialized storage in databases, and message queue payloads where byte count matters.

Deep Nesting: When JSON Structure Becomes Unmaintainable

Deeply nested JSON is harder to query, harder to update, and harder to validate. It also costs more at runtime: parsers and serializers must walk every level, so the work grows with both the depth and the breadth of the tree. When you find yourself accessing values like response.data.user.profile.address.city, it is a signal that the data model needs normalization.

JSON:API and GraphQL both address this with explicit normalization strategies. JSON:API flattens relationships into top-level resources linked by ID rather than nesting. GraphQL lets clients specify exactly the fields they need, avoiding both over-fetching and deeply nested response structures. For internal data models, the same normalization principles that apply to relational databases apply to JSON: represent entities once and reference them by ID.
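As a sketch (field names invented), the same city can live four levels deep or in a normalized, JSON:API-style shape where each entity appears once and is referenced by ID:

```javascript
// Nested: the city is buried four levels deep
const nested = {
  data: { user: { profile: { address: { city: "Oslo" } } } },
};

// Normalized: flat entity maps keyed by ID, linked by reference
const normalized = {
  users:     { u1: { id: "u1", profileId: "p1" } },
  profiles:  { p1: { id: "p1", addressId: "a1" } },
  addresses: { a1: { id: "a1", city: "Oslo" } },
};

// Lookups become single-level, and each entity can be updated in one place
console.log(normalized.addresses.a1.city); // "Oslo"
```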

Large JSON Performance: Streaming Parsers

JSON.parse() in JavaScript is a synchronous, blocking operation. Parsing a 10 MB JSON file blocks the main thread for tens of milliseconds — enough to cause a perceptible UI freeze. For large JSON payloads, use a streaming parser that processes the JSON incrementally:

// JSONStream — streaming JSON parser for Node.js
import JSONStream from "JSONStream";
import { createReadStream } from "fs";

// Parse an array of records from a large file, one record at a time
const stream = createReadStream("large-data.json")
  .pipe(JSONStream.parse("records.*"));

stream.on("data", (record) => {
  // Process one record at a time — never holds the whole array in memory
  processRecord(record);
});

stream.on("end", () => console.log("Done"));

For server-side high-throughput JSON parsing, consider simdjson (a C++ library, with Node.js bindings published on npm as simdjson), which uses SIMD CPU instructions to parse JSON several times faster than V8's built-in parser. For most applications, JSON.parse() is sufficient; reach for alternatives only when profiling shows parsing as a genuine bottleneck.

JSON in APIs: Conventions

The most common JSON API design debates:

  • camelCase vs snake_case: JavaScript conventionally uses camelCase; Python and most backend languages use snake_case. Pick one and be consistent. Most large public APIs (GitHub, Stripe, Twilio) use snake_case for JSON fields. If you control both ends, pick what your primary consumer language prefers and transform at the boundary.
  • Date formats: Use ISO 8601 strings in UTC (2024-03-31T08:00:00Z) for human-readable APIs. Use Unix integer timestamps for high-throughput or storage-optimized scenarios.
  • null vs absent field: The meaning of a missing key versus a key set to null should be explicitly defined in your API contract. A common convention: null means "known to be empty"; absent means "not provided in this response" (e.g., a partial update payload).
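Transforming at the boundary can be a small recursive walk; this is a minimal sketch (helper names invented) that converts snake_case keys from a backend payload into camelCase for a JavaScript client:

```javascript
// Convert snake_case keys to camelCase, recursing into arrays and objects
const toCamel = (s) => s.replace(/_([a-z])/g, (_, c) => c.toUpperCase());

function camelizeKeys(value) {
  if (Array.isArray(value)) return value.map(camelizeKeys);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [toCamel(k), camelizeKeys(v)])
    );
  }
  return value; // primitives pass through unchanged
}

const out = camelizeKeys({ user_id: 7, created_at: "2024-03-31T08:00:00Z" });
console.log(out); // { userId: 7, createdAt: '2024-03-31T08:00:00Z' }
```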

JSON5 and JSONC: Supersets for Config Files

Configuration files are frequently written by humans who want to leave comments explaining why a value is set, or who want trailing commas so adding a new entry at the end does not require modifying the previous line. Standard JSON supports neither.

JSONC (JSON with Comments) adds single-line // comments and block /* comments */ to JSON. It is the format VS Code uses for .vscode/settings.json, and TypeScript's tsconfig.json is parsed the same way.

JSON5 (json5.org) is a more extensive superset: it adds comments, trailing commas, unquoted keys, single-quoted strings, hexadecimal numbers, and multi-line strings. It is used by some build tools and configuration systems. Neither JSONC nor JSON5 is appropriate for API responses — they are configuration file formats only.
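As an illustration, here is a hypothetical JSON5 config file; every annotation below would be a syntax error in strict JSON but is legal JSON5:

```json5
{
  // Comments are allowed, both line and block style
  retries: 3,                          // unquoted key
  endpoint: 'https://api.example.com', // single-quoted string
  maxPayload: 0x100000,                // hexadecimal number: 1 MiB
  features: [
    'search',
    'export',                          // trailing comma is legal
  ],
}
```

The json5 npm package exposes JSON5.parse and JSON5.stringify with the same shape as the built-in JSON object, so adopting it for config files is a drop-in change on the reading side.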

Common JSON Mistakes

  • BOM characters: A UTF-8 BOM (byte order mark, U+FEFF) at the start of a JSON file causes parse errors in most parsers. Some Windows text editors add BOMs by default. The JSON spec (RFC 8259) explicitly prohibits BOMs.
  • Floating-point precision: JSON numbers are IEEE 754 double-precision floats in most parsers. Numbers with more than 15-16 significant digits lose precision. A monetary value like 1234567890123456.78 will be silently rounded to 1234567890123456.8. Always represent monetary amounts as integers (e.g., cents) or strings.
  • Large integers: JavaScript's Number type cannot safely represent integers larger than 2^53 − 1 (9,007,199,254,740,991). Twitter's API famously had to return tweet IDs as both a number and a string because JavaScript clients were losing precision on large integer IDs. Use BigInt or string representation for IDs and counts that may exceed this limit.
  • Circular references: JSON.stringify() throws a TypeError: Converting circular structure to JSON if the object contains a cycle. Detect and break cycles before serializing, or use a library like flatted that supports circular references.
  • UTF-8 encoding: JSON exchanged between systems must be encoded in UTF-8 (RFC 8259 §8.1); earlier specs also permitted UTF-16 and UTF-32, but interoperable APIs should emit UTF-8 only. Invalid byte sequences that slip in from a database with incorrect collation settings will cause parse errors or data corruption.
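Two of these failure modes are easy to demonstrate in a few lines; the payload below is invented for illustration:

```javascript
// BOM: U+FEFF is not JSON whitespace, so strip it before parsing
const raw = "\uFEFF" + '{"id": 9007199254740993}';
const clean = raw.replace(/^\uFEFF/, "");

// Large integers: past 2^53 - 1, Number silently loses precision
const parsed = JSON.parse(clean);
console.log(parsed.id === 9007199254740992);  // true: the odd ID rounded to 2^53
console.log(Number.isSafeInteger(parsed.id)); // false

// Safe alternative: carry the ID as a string, or parse it into a BigInt
const safeId = BigInt("9007199254740993");
console.log(safeId.toString()); // "9007199254740993"
```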

Command-Line JSON with jq

jq is the indispensable command-line tool for slicing, filtering, and transforming JSON. Every developer who works with APIs regularly should have it installed:

# Pretty-print JSON
curl https://api.example.com/users | jq .

# Extract a field
jq '.name' data.json

# Filter array
jq '.users[] | select(.role == "admin")' data.json

# Construct new objects
jq '.users[] | {id: .id, email: .email}' data.json

# Count items
jq '.users | length' data.json

# Sort and deduplicate
jq '[.users[].role] | unique | sort' data.json

Format, validate, and explore JSON right in your browser with Tanvrit's JSON formatter — paste any JSON and get instant pretty-printing, syntax highlighting, and validation feedback. Open the JSON Formatter →

Tags: JSON · JSON Schema · ajv · JSON validation · jq · JSON formatting · API design