Keyboard shortcuts

Press ← or β†’ to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

path-binder

path-binder

Label your columns with paths.
path-binder builds the JSON.

A TypeScript library that parses flat data into nested JSON.

From Flat Data, To Nested JSON

Flat Data
user.nameuser.ageuser.role
Taro25admin
Jiro30editor
↓
JSON Output
{
  "user": [
    { "name": "Taro", "age": 25, "role": "admin" },
    { "name": "Jiro", "age": 30, "role": "editor" }
  ]
}

Write JSON paths as column headers. Fill in the data rows. Call generate(). That’s it.

Get Started in 3 Steps

1. Install

npm install path-binder

2. Prepare your data

Convert each spreadsheet column into { path, value } pairs.

import { generate } from 'path-binder'

const input = {
  Sheet1: [
    [{ path: 'user.name', value: 'Taro' }, { path: 'user.age', value: 25 }],
    [{ path: 'user.name', value: 'Jiro' }, { path: 'user.age', value: 30 }],
  ],
}

3. Transform

const { result } = generate(input)
// β†’ {
//   user: [
//     { name: 'Taro', age: 25 },
//     { name: 'Jiro', age: 30 }
//   ]
// }

Why path-binder?

Ever needed to transform flat data β€” from CSVs, spreadsheets, or databases β€” into structured JSON in a B2B SaaS product?

With path-binder, just add JSON path labels to your data columns. No complex transformation logic needed. Once paths are defined, the mapping works permanently β€” no matter how the data layout changes.

"Wouldn't Excel formulas be easier?"

For a one-off conversion, maybe. But maintaining complex formula chains across evolving business requirements creates ongoing support overhead. path-binder’s approach β€” simple labels that map directly to your data model β€” eliminates that cost entirely. Introduce it with your support team, and the reduction in maintenance effort speaks for itself.

Features

0

Zero Dependencies

No external dependencies. Keeps your node_modules clean β€” lightweight and fast.

{}

Type Safe

Strict type inference based on your schema definition. Get the most out of TypeScript.

βš™

Schema Support

Declaratively define casting and filtering. Unwanted columns are automatically excluded.

πŸ“Š

Multi Sheet

Automatically join data across sheets using reference keys ($). Build relational structures in a single call.

Advanced Usage

Combine schemas for type casting with $ reference keys for multi-sheet joining.

import { generate, defineSchema, asNumber, asString, arrayOf } from 'path-binder'

const input = {
  sheetA: [
    [{ path: 'user.id', value: 1 }, { path: 'user.name', value: 'Taro' }],
    [{ path: 'user.id', value: 2 }, { path: 'user.name', value: 'Jiro' }],
  ],
  sheetB: [
    [{ path: 'user.$id', value: 1 }, { path: 'user.info[].type', value: 'google' }],
  ],
}

const schema = defineSchema({
  user: {
    id: asNumber(),
    name: asString(),
    info: arrayOf({ type: asString() }),
  },
})

const { result } = generate(input, { schema })
// β†’ {
//   user: [
//     { id: 1, name: 'Taro', info: [{ type: 'google' }] },
//     { id: 2, name: 'Jiro' },
//   ]
// }

Next steps: Learn Path Syntax for nesting and arrays β†’ Set up Schemas for type casting and filtering β†’ Try the Playground

Path Syntax

path-binder interprets the path in { path, value } pairs to transform flat data into nested JSON structures. This page explains how to write paths and the transformation rules.

Overview: How Paths Become JSON

Let’s start with the big picture of the transformation.

Spreadsheet (row 2 = paths)
user.nameuser.ageuser.tags[]
Taro25admin
Jiro30editor
↓
Generated JSON
{
  "user": [
    { "name": "Taro", "age": 25, "tags": ["admin"] },
    { "name": "Jiro", "age": 30, "tags": ["editor"] }
  ]
}

Dots (.) in the path correspond to object nesting, and [] corresponds to arrays. Let’s explore each syntax element in detail.


Property (Dot Notation)

When to use: When you want to express nested object structures

Use dots (.) to define nested properties.

PathValueGenerated JSON
name"Taro"{ "name": "Taro" }
user.name"Taro"{ "user": { "name": "Taro" } }
user.address.city"Tokyo"{ "user": { "address": { "city": "Tokyo" } } }
import { generate } from 'path-binder'

const input = {
  Sheet1: [
    [{ path: 'user.name', value: 'Taro' }, { path: 'user.address.city', value: 'Tokyo' }],
  ],
}

const { result } = generate(input)
// β†’ { user: [{ name: 'Taro', address: { city: 'Tokyo' } }] }

There is no limit to dot depth. You can nest freely like a.b.c.d.e.


Array Append []

When to use: When you want to collect values from multiple rows into an array

Appending [] to a path adds each row’s value as a new array element.

PathRow 1 ValueRow 2 ValueGenerated JSON
tags[]"admin""editor"{ "tags": ["admin", "editor"] }
user.skills[]"TypeScript""React"{ "user": [{ "skills": ["TypeScript", "React"] }] }
const input = {
  Sheet1: [
    [{ path: 'user.name', value: 'Taro' }, { path: 'user.tags[]', value: 'admin' }],
    [{ path: 'user.name', value: 'Taro' }, { path: 'user.tags[]', value: 'editor' }],
  ],
}

const { result } = generate(input)
// β†’ { user: [{ name: 'Taro', tags: ['admin', 'editor'] }] }

Index Access [n]

When to use: When you want to set a value at a specific position in an array

Use [n] to set a value directly at the nth position. Unspecified positions become undefined.

PathValueGenerated JSON
items[0]"first"{ "items": ["first"] }
items[2]"third"{ "items": [undefined, undefined, "third"] }
const input = {
  Sheet1: [
    [{ path: 'scores[0]', value: 90 }, { path: 'scores[1]', value: 85 }],
  ],
}

const { result } = generate(input)
// β†’ { scores: [[90, 85]] }

Reference Key $ β€” Joining Data Across Sheets

When to use: When you want to merge data from multiple sheets into a single object using a common key

This is path-binder’s most powerful feature. It performs JOIN-like operations β€” similar to relational databases β€” directly on spreadsheet data.

How It Works

  1. Primary rows: Rows with normal paths (e.g., user.id). These form the base data
  2. Reference rows: Rows with $-prefixed paths (e.g., user.$id). They are matched against primary rows by removing the $ (matching id), and their data is merged

Example: Joining User Master with Email Data

Imagine you have two sheets.

Sheet1 (Primary) β€” Basic user information:

user.iduser.name
1Taro
2Jiro

Sheet2 (Reference) β€” Adding email via $:

user.$iduser.email
1taro@example.com

Removing $ from user.$id gives user.id. Sheet2’s $id = 1 matches Sheet1’s id = 1 (Taro), and the email data is merged.

import { generate } from 'path-binder'

const input = {
  Sheet1: [
    [{ path: 'user.id', value: 1 }, { path: 'user.name', value: 'Taro' }],
    [{ path: 'user.id', value: 2 }, { path: 'user.name', value: 'Jiro' }],
  ],
  Sheet2: [
    [{ path: 'user.$id', value: 1 }, { path: 'user.email', value: 'taro@example.com' }],
  ],
}

const { result } = generate(input)
// β†’ {
//   user: [
//     { id: 1, name: 'Taro', email: 'taro@example.com' },
//     { id: 2, name: 'Jiro' }
//   ]
// }

Reference Key Constraints

ConstraintDescription
Primary rows requiredIf all rows are reference rows (all have $), they are skipped
Primitive key values$key values must be string, number, or boolean (no objects)
Same root path$keys in the same row must belong to the same root path
No nesting$key inside array paths (e.g., info[].$type) is not allowed

If you encounter reference key errors, see the Skip Handling page for detailed causes and solutions.


Escape $$

When to use: When you want to include a literal $ character in the path

$ is normally interpreted as a reference key prefix. To use a literal $, escape it with $$.

PathGenerated JSON
config.$$ref{ "config": { "$ref": "..." } }
data.$$type{ "data": { "$type": "..." } }
const input = {
  Sheet1: [
    [{ path: 'schema.$$ref', value: '#/definitions/User' }],
  ],
}

const { result } = generate(input)
// β†’ { schema: [{ $ref: '#/definitions/User' }] }

Combination Patterns

Common path combination patterns used in practice.

Nesting + Array: User Address List

// Paths: user.addresses[].city, user.addresses[].zip
// β†’ { user: [{ addresses: [{ city: 'Tokyo', zip: '100-0001' }, ...] }] }

Objects in Arrays: Order Line Items

// Paths: order.items[].name, order.items[].price, order.items[].qty
// β†’ { order: [{ items: [{ name: '...', price: 100, qty: 2 }, ...] }] }

Reference Key + Array: Joining Master Data

// Sheet1: product.id, product.name
// Sheet2: product.$id, product.reviews[].comment
// β†’ Primary product data is merged with review arrays from reference rows

Next step: Set up type casting and filtering with Schemas

Schema

Why Use a Schema?

generate() works without a schema. So why define one?

1. Type Casting β€” Get the Right Types from Input Data

Input values are often a mix of strings and numbers. With a schema, you can guarantee the types in your output JSON.

// Without schema: age remains the string "25"
{ name: 'Taro', age: '25' }

// With asNumber(): age is cast to number 25
{ name: 'Taro', age: 25 }

2. Filtering β€” Automatically Exclude Unwanted Columns

Schemas act as an allowlist. Only paths defined in the schema are included in the output; everything else is silently excluded. No need to worry about extra columns in the input data polluting your API responses.

// Even if input has "Notes" and "Manager" columns, they're excluded if not in the schema
const schema = defineSchema({
  customer: {
    name: asString(),
    email: asString(),
    // ← "Notes" and "Manager" are not defined here, so they're excluded
  },
})

Cast Functions

Built-in functions that define type conversion for each field.

FunctionConversionInput ExampleOutput Example
asString()String(value)123"123"
asNumber()Number(value)"42"42
asBoolean()Boolean(value)"true"true
asDate()new Date(value)"2024-01-15"Date object
asCustom(fn)Custom functionanyany
import { defineSchema, asString, asNumber, asBoolean, asDate, asCustom } from 'path-binder'

const schema = defineSchema({
  user: {
    name: asString(),
    age: asNumber(),
    active: asBoolean(),
    joinedAt: asDate(),
    score: asCustom((v) => Math.round(Number(v))),
  },
})

Array Schema

Use arrayOf() to define schemas for array elements.

import { defineSchema, asString, asNumber, arrayOf } from 'path-binder'

const schema = defineSchema({
  user: {
    name: asString(),
    // Primitive array
    tags: arrayOf(asString()),
    // Object array
    contacts: arrayOf({
      type: asString(),
      value: asString(),
    }),
  },
})

Loose Schema (asAny)

When to use: During prototyping when you want all columns to pass through, or when you only need to cast specific fields

import { defineSchema, asAny, asNumber } from 'path-binder'

// Pass all paths through without casting
const looseSchema = defineSchema({
  user: asAny(),
})

// Pass all paths through, but cast age to number
const partialSchema = defineSchema({
  user: asAny({
    age: asNumber(),
  }),
})

asAny() disables filtering. Paths not defined in the schema will also be included in the output.


Custom Cast

Use asCustom() to define arbitrary transformation logic.

Splitting Comma-Separated Tags

const schema = defineSchema({
  product: {
    tags: asCustom((v) => String(v).split(',').map((t) => t.trim())),
    // "food,frozen,sale" β†’ ["food", "frozen", "sale"]
  },
})

Behavior on Exception

When a cast function throws an exception inside asCustom(), that entry is skipped.

const schema = defineSchema({
  config: {
    priority: asCustom((v) => {
      const n = Number(v)
      if (Number.isNaN(n)) {
        throw new Error('Invalid number')
      }
      return n
    }),
  },
})

Schema vs No Schema

AspectWithout SchemaWith Schema
Type castingNone (raw values)Controlled by cast functions
Column exclusionNot applied (all paths output)Acts as allowlist
Skip on exceptionNoneEntries are skipped when a cast function throws
Development speedFast (no definition needed)Slightly slower (definition required)
Production useNot recommendedRecommended

Next step: Set up error handling with Skip Handling

API Reference

generate(input, options?)

The main function that transforms flat spreadsheet data into nested JSON.

import { generate } from 'path-binder'

const { result, skipped } = generate(input, {
  schema,              // optional: SchemaObject
  skipScope: 'cell',   // optional: 'cell' | 'row' (default: 'cell')
})

Parameters

ParameterTypeRequiredDescription
inputInputDataYes{ path, value } pair arrays per sheet
options.schemaSchemaObjectNoType casting and filtering definition
options.skipScope'cell' | 'row'NoSkip granularity (default: 'cell')

Returns: GenerateResult

type GenerateResult = {
  readonly result: Record<string, unknown>
  readonly skipped: readonly ParseSkipped[]
}
  • result: Keyed by top-level property, values are always arrays. This is because each spreadsheet row corresponds to one entity.
// Input: 2 rows with user.name = "Taro", user.name = "Jiro"
// Result: { user: [{ name: 'Taro' }, { name: 'Jiro' }] }
//           ↑ user is always an array
  • skipped: Array of entries that could not be processed. See Skip Handling for details.

InputData Format

3-layer structure: Sheet β†’ Rows β†’ Cells

type InputData = {
  readonly [sheetName: string]: readonly (readonly PathValuePair[])[]
}
//                               ↑ array of rows    ↑ array of cells (path + value pairs)

type PathValuePair = {
  readonly path: string
  readonly value: unknown
}

Visualized:

InputData = {
  "Sheet1": [          // ← sheet name
    [                  // ← row 1
      { path, value }, // ← cell 1
      { path, value }, // ← cell 2
    ],
    [                  // ← row 2
      { path, value },
      { path, value },
    ],
  ],
  "Sheet2": [ ... ],   // ← another sheet
}

defineSchema(definition)

Defines a schema object.

import { defineSchema, asString, asNumber } from 'path-binder'

const schema = defineSchema({
  user: {
    name: asString(),
    age: asNumber(),
  },
})

Cast Functions

FunctionSignatureConversion
asString()() => SchemaNodeString(value)
asNumber()() => SchemaNodeNumber(value)
asBoolean()() => SchemaNodeBoolean(value)
asDate()() => SchemaNodenew Date(value)
asCustom(fn)(fn: CastFn) => SchemaNodeCustom function
asAny()(partial?: object) => SchemaNodeNo casting (pass all paths)
arrayOf(schema)(schema: SchemaNode | object) => ArraySchemaArray element schema
type CastFn = (value: unknown) => unknown

Type Exports

import type {
  InputData,         // Input data type
  PathValuePair,     // { path: string, value: unknown }
  GenerateOptions,   // generate() options
  GenerateResult,    // generate() return type
  ParseSkipped,      // Skipped entry
  ParseSkipReason,   // Skip reason code
  SchemaObject,      // defineSchema() argument type
  SchemaNode,        // Each schema node
  CastFn,            // Cast function type
  ArraySchema,       // arrayOf() return type
  AnySchema,         // asAny() return type
} from 'path-binder'

Skip Handling

path-binder does not throw errors when it encounters invalid data. Instead, it collects unprocessable entries in the skipped array and returns them alongside the result. This prevents a single bad row from halting the entire batch.

Inspecting Skipped Entries

import { generate } from 'path-binder'

const { result, skipped } = generate(input, { schema })

skipped.forEach((entry) => {
  console.log(entry.name)    // sheet name
  console.log(entry.index)   // row index
  console.log(entry.path)    // the problematic path string
  console.log(entry.value)   // cell value (stringified)
  console.log(entry.reason)  // skip reason code
})

skipScope Option

Controls the granularity of skipping.

skipScopeBehaviorRecommended For
'cell' (default)Only the invalid cell is skipped; other cells in the row are processedUser input data (tolerating minor gaps)
'row'If any cell in a row is invalid, the entire row is skippedStrict data conversion (preventing incomplete entities)
// Cell level: only invalid cells are skipped
const cellResult = generate(input, { skipScope: 'cell' })

// Row level: if any cell is invalid, the entire row is skipped
const rowResult = generate(input, { skipScope: 'row' })

Common Skips and How to Fix Them

empty β€” Empty Path Segment

Triggering input: "user..name" (double dots), "user." (trailing dot)

Fix: Check the path row and remove extra dots.

reference_not_found β€” No Matching Reference

Triggering input: Sheet2 has user.$id = 3, but Sheet1 has no user with id = 3

Fix: Verify that matching primary data exists in the referenced sheet.

cast β€” Cast Function Threw an Exception

Triggering input: asNumber() field receives "abc"

Fix: Check the input data, or add error handling in asCustom().


Skip Reasons Reference

Path Syntax Errors

CodeDescriptionExample Input
emptyPath segment is empty"user..name"
bracketUnclosed bracket ]"items[0"
indexNon-numeric array index"items[abc]"
unnamedNo property name before []"[]"
escapeNo name after $$ escape prefix"config.$$"
keyNo name after $ key prefix"user.$"

Reference Key Errors

CodeDescriptionExample Input
reference_not_foundNo matching primary row$id=3 but no primary id=3
no_primary_dataAll rows are reference rowsAll paths have $ prefix
conflicting_key_propSame row has $key and non-$key with same nameuser.$id and user.id in same row
nested_key$key inside array pathinfo[].$type
invalid_key_valueKey value is not a primitive$id value is an object
mixed_key_root$keys in same row belong to different root pathsuser.$id and order.$id in same row

Data Conflict & Cast Errors

CodeDescriptionExample Input
property_conflictReference data conflicts with primary dataBoth have different values for same property
castCast function threw an exceptionasNumber() receives "abc"

Skip Logging Patterns

Example of monitoring skips in production.

const { result, skipped } = generate(input, { schema })

// Log when skips occur
if (skipped.length > 0) {
  console.warn(`${skipped.length} entries were skipped`)
  skipped.forEach((s) => {
    console.warn(`  Sheet: ${s.name}, Row: ${s.index + 1}, Path: ${s.path}, Reason: ${s.reason}`)
  })
}

// Generate error report for user feedback
const errorReport = skipped.map((s) =>
  `Row ${s.index + 1}: "${s.path}" β€” ${s.reason}`
)

Next step: See the API Reference for all types and options

Playground

Drop an Excel file (.xlsx) to convert it into nested JSON using path-binder. The second row of each sheet defines the JSON paths, and rows 3+ are data.

Excel format example:

UserIdUserNameLoginMethod
user.iduser.nameuser.loginInfo[].type
1TaroID/Password
2JiroemailLink

Download sample Excel (single sheet) | Download sample Excel (multi-sheet with $ reference)

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>path-binder Playground</title>
  <style>
    body { font-family: sans-serif; margin: 20px; }
    #dropzone {
      border: 2px dashed #1a73e8;
      border-radius: 8px;
      padding: 40px;
      text-align: center;
      color: #5f6368;
      cursor: pointer;
      transition: background-color 0.2s;
    }
    #dropzone.dragover { background-color: rgba(26,115,232,0.08); }
    #output { white-space: pre-wrap; font-family: monospace; margin-top: 16px; }
  </style>
</head>
<body>
  <div id="dropzone">
    Drop .xlsx file here or click to select
    <!-- Not using D&D alone: cross-origin iframe nesting blocks drop events. -->
    <input type="file" id="file-input" accept=".xlsx" style="display:none">
  </div>
  <h3>Result</h3>
  <pre id="output">Waiting for file...</pre>
  <script type="module" src="./index.ts"></script>
</body>
</html>
import ExcelJS from 'exceljs'
import { generate, defineSchema, asNumber, asString, arrayOf } from 'path-binder'
import type { InputData, PathValuePair, GenerateResult } from 'path-binder'

// =====================================================================
// ✏️ Edit this function to try different schemas and options!
//
// Try:
//   - Add asNumber() cast:  id: asNumber()
//   - Add skipScope:        generate(inputData, { schema, skipScope: 'row' })
//   - Remove schema:        return generate(inputData)
// =====================================================================
function example(inputData: InputData): GenerateResult {
  const schema = defineSchema({
    user: {
      id: asNumber(),
      name: asString(),
      loginInfo: arrayOf({ type: asString() }),
    },
  })

  return generate(inputData, { schema })
}

// Row 1: header labels (ignored by path-binder)
// Row 2: JSON paths
// Row 3+: data values
const PATH_ROW_INDEX = 2

/**
 * Reads an Excel workbook buffer and converts it to path-binder InputData.
 *
 * Uses sheet name as the key and row 2 as paths.
 * Not reading paths from row 1: row 1 is a human-readable header,
 * keeping paths on row 2 makes it easy to swap path definitions
 * without touching the header.
 */
async function excelToInputData(buffer: ArrayBuffer): Promise<InputData> {
  const workbook = new ExcelJS.Workbook()
  await workbook.xlsx.load(buffer)

  const inputData: Record<string, PathValuePair[][]> = {}

  workbook.eachSheet((sheet) => {
    const pathRow = sheet.getRow(PATH_ROW_INDEX)
    const paths: string[] = []
    pathRow.eachCell((cell, colNumber) => {
      paths[colNumber] = String(cell.value ?? '')
    })

    const rows: PathValuePair[][] = []
    sheet.eachRow((row, rowNumber) => {
      if (rowNumber <= PATH_ROW_INDEX) {
        return
      }

      const pairs: PathValuePair[] = []
      row.eachCell((cell, colNumber) => {
        const path = paths[colNumber]
        if (!path) {
          return
        }
        pairs.push({ path, value: cell.value })
      })

      if (pairs.length > 0) {
        rows.push(pairs)
      }
    })

    inputData[sheet.name] = rows
  })

  return inputData
}

async function processFile(file: File, dropzone: HTMLElement, output: HTMLElement): Promise<void> {
  dropzone.textContent = `Loaded: ${file.name}`

  try {
    const buffer = await file.arrayBuffer()
    const inputData = await excelToInputData(buffer)
    const { result, skipped } = example(inputData)

    output.textContent = JSON.stringify(result, null, 2)
    if (skipped.length > 0) {
      output.textContent += '\n\n--- Skipped ---\n' + JSON.stringify(skipped, null, 2)
    }
  } catch (err) {
    output.textContent = 'Error: ' + String(err)
  }
}

function setupDropzone(): void {
  const dropzone = document.getElementById('dropzone')
  const output = document.getElementById('output')
  const fileInput = document.getElementById('file-input') as HTMLInputElement | null
  if (!dropzone || !output || !fileInput) {
    return
  }

  // Click to open file picker.
  // Not relying on D&D alone: cross-origin iframe nesting blocks drop events.
  dropzone.addEventListener('click', () => {
    fileInput.click()
  })

  fileInput.addEventListener('change', async () => {
    const file = fileInput.files?.[0]
    if (!file) {
      return
    }
    await processFile(file, dropzone, output)
  })

  dropzone.addEventListener('dragover', (e) => {
    e.preventDefault()
    dropzone.classList.add('dragover')
  })

  dropzone.addEventListener('dragleave', () => {
    dropzone.classList.remove('dragover')
  })

  dropzone.addEventListener('drop', async (e) => {
    e.preventDefault()
    dropzone.classList.remove('dragover')

    const file = e.dataTransfer?.files[0]
    if (!file) {
      return
    }
    await processFile(file, dropzone, output)
  })
}

setupDropzone()

Excel format rules

  • Row 1 β€” Human-readable header (ignored)
  • Row 2 β€” JSON paths for path-binder (e.g. user.id, user.name, user.loginInfo[].type)
  • Row 3+ β€” Data values
  • Sheet name β€” Used as the top-level key in InputData

What you can try

  • Edit the example() function to add/change the schema
  • Change the paths in row 2 of your Excel
  • Add a second sheet with $ reference keys