path-binder
path-binder
Label your columns with paths.
path-binder builds the JSON.
A TypeScript library that parses flat data into nested JSON.
From Flat Data, To Nested JSON
| user.name | user.age | user.role |
|---|---|---|
| Taro | 25 | admin |
| Jiro | 30 | editor |
{
"user": [
{ "name": "Taro", "age": 25, "role": "admin" },
{ "name": "Jiro", "age": 30, "role": "editor" }
]
}
Write JSON paths as column headers. Fill in the data rows. Call generate(). Thatβs it.
Get Started in 3 Steps
1. Install
npm install path-binder
2. Prepare your data
Convert each spreadsheet column into { path, value } pairs.
import { generate } from 'path-binder'
const input = {
Sheet1: [
[{ path: 'user.name', value: 'Taro' }, { path: 'user.age', value: 25 }],
[{ path: 'user.name', value: 'Jiro' }, { path: 'user.age', value: 30 }],
],
}
3. Transform
const { result } = generate(input)
// β {
// user: [
// { name: 'Taro', age: 25 },
// { name: 'Jiro', age: 30 }
// ]
// }
Why path-binder?
Ever needed to transform flat data β from CSVs, spreadsheets, or databases β into structured JSON in a B2B SaaS product?
With path-binder, just add JSON path labels to your data columns. No complex transformation logic needed. Once paths are defined, the mapping works permanently β no matter how the data layout changes.
"Wouldn't Excel formulas be easier?"
For a one-off conversion, maybe. But maintaining complex formula chains across evolving business requirements creates ongoing support overhead. path-binderβs approach β simple labels that map directly to your data model β eliminates that cost entirely. Introduce it with your support team, and the reduction in maintenance effort speaks for itself.
Features
Zero Dependencies
No external dependencies. Keeps your node_modules clean β lightweight and fast.
Type Safe
Strict type inference based on your schema definition. Get the most out of TypeScript.
Schema Support
Declaratively define casting and filtering. Unwanted columns are automatically excluded.
Multi Sheet
Automatically join data across sheets using reference keys ($). Build relational structures in a single call.
Advanced Usage
Combine schemas for type casting with $ reference keys for multi-sheet joining.
import { generate, defineSchema, asNumber, asString, arrayOf } from 'path-binder'
const input = {
sheetA: [
[{ path: 'user.id', value: 1 }, { path: 'user.name', value: 'Taro' }],
[{ path: 'user.id', value: 2 }, { path: 'user.name', value: 'Jiro' }],
],
sheetB: [
[{ path: 'user.$id', value: 1 }, { path: 'user.info[].type', value: 'google' }],
],
}
const schema = defineSchema({
user: {
id: asNumber(),
name: asString(),
info: arrayOf({ type: asString() }),
},
})
const { result } = generate(input, { schema })
// β {
// user: [
// { id: 1, name: 'Taro', info: [{ type: 'google' }] },
// { id: 2, name: 'Jiro' },
// ]
// }
Next steps: Learn Path Syntax for nesting and arrays β Set up Schemas for type casting and filtering β Try the Playground
Path Syntax
path-binder interprets the path in { path, value } pairs to transform flat data into nested JSON structures. This page explains how to write paths and the transformation rules.
Overview: How Paths Become JSON
Letβs start with the big picture of the transformation.
| user.name | user.age | user.tags[] |
|---|---|---|
| Taro | 25 | admin |
| Jiro | 30 | editor |
{
"user": [
{ "name": "Taro", "age": 25, "tags": ["admin"] },
{ "name": "Jiro", "age": 30, "tags": ["editor"] }
]
}
Dots (.) in the path correspond to object nesting, and [] corresponds to arrays. Letβs explore each syntax element in detail.
Property (Dot Notation)
When to use: When you want to express nested object structures
Use dots (.) to define nested properties.
| Path | Value | Generated JSON |
|---|---|---|
name | "Taro" | { "name": "Taro" } |
user.name | "Taro" | { "user": { "name": "Taro" } } |
user.address.city | "Tokyo" | { "user": { "address": { "city": "Tokyo" } } } |
import { generate } from 'path-binder'
const input = {
Sheet1: [
[{ path: 'user.name', value: 'Taro' }, { path: 'user.address.city', value: 'Tokyo' }],
],
}
const { result } = generate(input)
// β { user: [{ name: 'Taro', address: { city: 'Tokyo' } }] }
There is no limit to dot depth. You can nest freely like
a.b.c.d.e.
Array Append []
When to use: When you want to collect values from multiple rows into an array
Appending [] to a path adds each rowβs value as a new array element.
| Path | Row 1 Value | Row 2 Value | Generated JSON |
|---|---|---|---|
tags[] | "admin" | "editor" | { "tags": ["admin", "editor"] } |
user.skills[] | "TypeScript" | "React" | { "user": [{ "skills": ["TypeScript", "React"] }] } |
const input = {
Sheet1: [
[{ path: 'user.name', value: 'Taro' }, { path: 'user.tags[]', value: 'admin' }],
[{ path: 'user.name', value: 'Taro' }, { path: 'user.tags[]', value: 'editor' }],
],
}
const { result } = generate(input)
// β { user: [{ name: 'Taro', tags: ['admin', 'editor'] }] }
Index Access [n]
When to use: When you want to set a value at a specific position in an array
Use [n] to set a value directly at the nth position. Unspecified positions become undefined.
| Path | Value | Generated JSON |
|---|---|---|
items[0] | "first" | { "items": ["first"] } |
items[2] | "third" | { "items": [undefined, undefined, "third"] } |
const input = {
Sheet1: [
[{ path: 'scores[0]', value: 90 }, { path: 'scores[1]', value: 85 }],
],
}
const { result } = generate(input)
// β { scores: [[90, 85]] }
Reference Key $ β Joining Data Across Sheets
When to use: When you want to merge data from multiple sheets into a single object using a common key
This is path-binderβs most powerful feature. It performs JOIN-like operations β similar to relational databases β directly on spreadsheet data.
How It Works
- Primary rows: Rows with normal paths (e.g.,
user.id). These form the base data - Reference rows: Rows with
$-prefixed paths (e.g.,user.$id). They are matched against primary rows by removing the$(matchingid), and their data is merged
Example: Joining User Master with Email Data
Imagine you have two sheets.
Sheet1 (Primary) β Basic user information:
| user.id | user.name |
|---|---|
| 1 | Taro |
| 2 | Jiro |
Sheet2 (Reference) β Adding email via $:
| user.$id | user.email |
|---|---|
| 1 | taro@example.com |
Removing $ from user.$id gives user.id. Sheet2βs $id = 1 matches Sheet1βs id = 1 (Taro), and the email data is merged.
import { generate } from 'path-binder'
const input = {
Sheet1: [
[{ path: 'user.id', value: 1 }, { path: 'user.name', value: 'Taro' }],
[{ path: 'user.id', value: 2 }, { path: 'user.name', value: 'Jiro' }],
],
Sheet2: [
[{ path: 'user.$id', value: 1 }, { path: 'user.email', value: 'taro@example.com' }],
],
}
const { result } = generate(input)
// β {
// user: [
// { id: 1, name: 'Taro', email: 'taro@example.com' },
// { id: 2, name: 'Jiro' }
// ]
// }
Reference Key Constraints
| Constraint | Description |
|---|---|
| Primary rows required | If all rows are reference rows (all have $), they are skipped |
| Primitive key values | $key values must be string, number, or boolean (no objects) |
| Same root path | $keys in the same row must belong to the same root path |
| No nesting | $key inside array paths (e.g., info[].$type) is not allowed |
If you encounter reference key errors, see the Skip Handling page for detailed causes and solutions.
Escape $$
When to use: When you want to include a literal $ character in the path
$ is normally interpreted as a reference key prefix. To use a literal $, escape it with $$.
| Path | Generated JSON |
|---|---|
config.$$ref | { "config": { "$ref": "..." } } |
data.$$type | { "data": { "$type": "..." } } |
const input = {
Sheet1: [
[{ path: 'schema.$$ref', value: '#/definitions/User' }],
],
}
const { result } = generate(input)
// β { schema: [{ $ref: '#/definitions/User' }] }
Combination Patterns
Common path combination patterns used in practice.
Nesting + Array: User Address List
// Paths: user.addresses[].city, user.addresses[].zip
// β { user: [{ addresses: [{ city: 'Tokyo', zip: '100-0001' }, ...] }] }
Objects in Arrays: Order Line Items
// Paths: order.items[].name, order.items[].price, order.items[].qty
// β { order: [{ items: [{ name: '...', price: 100, qty: 2 }, ...] }] }
Reference Key + Array: Joining Master Data
// Sheet1: product.id, product.name
// Sheet2: product.$id, product.reviews[].comment
// β Primary product data is merged with review arrays from reference rows
Next step: Set up type casting and filtering with Schemas
Schema
Why Use a Schema?
generate() works without a schema. So why define one?
1. Type Casting β Get the Right Types from Input Data
Input values are often a mix of strings and numbers. With a schema, you can guarantee the types in your output JSON.
// Without schema: age remains the string "25"
{ name: 'Taro', age: '25' }
// With asNumber(): age is cast to number 25
{ name: 'Taro', age: 25 }
2. Filtering β Automatically Exclude Unwanted Columns
Schemas act as an allowlist. Only paths defined in the schema are included in the output; everything else is silently excluded. No need to worry about extra columns in the input data polluting your API responses.
// Even if input has "Notes" and "Manager" columns, they're excluded if not in the schema
const schema = defineSchema({
customer: {
name: asString(),
email: asString(),
// β "Notes" and "Manager" are not defined here, so they're excluded
},
})
Cast Functions
Built-in functions that define type conversion for each field.
| Function | Conversion | Input Example | Output Example |
|---|---|---|---|
asString() | String(value) | 123 | "123" |
asNumber() | Number(value) | "42" | 42 |
asBoolean() | Boolean(value) | "true" | true |
asDate() | new Date(value) | "2024-01-15" | Date object |
asCustom(fn) | Custom function | any | any |
import { defineSchema, asString, asNumber, asBoolean, asDate, asCustom } from 'path-binder'
const schema = defineSchema({
user: {
name: asString(),
age: asNumber(),
active: asBoolean(),
joinedAt: asDate(),
score: asCustom((v) => Math.round(Number(v))),
},
})
Array Schema
Use arrayOf() to define schemas for array elements.
import { defineSchema, asString, asNumber, arrayOf } from 'path-binder'
const schema = defineSchema({
user: {
name: asString(),
// Primitive array
tags: arrayOf(asString()),
// Object array
contacts: arrayOf({
type: asString(),
value: asString(),
}),
},
})
Loose Schema (asAny)
When to use: During prototyping when you want all columns to pass through, or when you only need to cast specific fields
import { defineSchema, asAny, asNumber } from 'path-binder'
// Pass all paths through without casting
const looseSchema = defineSchema({
user: asAny(),
})
// Pass all paths through, but cast age to number
const partialSchema = defineSchema({
user: asAny({
age: asNumber(),
}),
})
asAny()disables filtering. Paths not defined in the schema will also be included in the output.
Custom Cast
Use asCustom() to define arbitrary transformation logic.
Splitting Comma-Separated Tags
const schema = defineSchema({
product: {
tags: asCustom((v) => String(v).split(',').map((t) => t.trim())),
// "food,frozen,sale" β ["food", "frozen", "sale"]
},
})
Behavior on Exception
When a cast function throws an exception inside asCustom(), that entry is skipped.
const schema = defineSchema({
config: {
priority: asCustom((v) => {
const n = Number(v)
if (Number.isNaN(n)) {
throw new Error('Invalid number')
}
return n
}),
},
})
Schema vs No Schema
| Aspect | Without Schema | With Schema |
|---|---|---|
| Type casting | None (raw values) | Controlled by cast functions |
| Column exclusion | Not applied (all paths output) | Acts as allowlist |
| Skip on exception | None | Entries are skipped when a cast function throws |
| Development speed | Fast (no definition needed) | Slightly slower (definition required) |
| Production use | Not recommended | Recommended |
Next step: Set up error handling with Skip Handling
API Reference
generate(input, options?)
The main function that transforms flat spreadsheet data into nested JSON.
import { generate } from 'path-binder'
const { result, skipped } = generate(input, {
schema, // optional: SchemaObject
skipScope: 'cell', // optional: 'cell' | 'row' (default: 'cell')
})
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
input | InputData | Yes | { path, value } pair arrays per sheet |
options.schema | SchemaObject | No | Type casting and filtering definition |
options.skipScope | 'cell' | 'row' | No | Skip granularity (default: 'cell') |
Returns: GenerateResult
type GenerateResult = {
readonly result: Record<string, unknown>
readonly skipped: readonly ParseSkipped[]
}
result: Keyed by top-level property, values are always arrays. This is because each spreadsheet row corresponds to one entity.
// Input: 2 rows with user.name = "Taro", user.name = "Jiro"
// Result: { user: [{ name: 'Taro' }, { name: 'Jiro' }] }
// β user is always an array
skipped: Array of entries that could not be processed. See Skip Handling for details.
InputData Format
3-layer structure: Sheet β Rows β Cells
type InputData = {
readonly [sheetName: string]: readonly (readonly PathValuePair[])[]
}
// β array of rows β array of cells (path + value pairs)
type PathValuePair = {
readonly path: string
readonly value: unknown
}
Visualized:
InputData = {
"Sheet1": [ // β sheet name
[ // β row 1
{ path, value }, // β cell 1
{ path, value }, // β cell 2
],
[ // β row 2
{ path, value },
{ path, value },
],
],
"Sheet2": [ ... ], // β another sheet
}
defineSchema(definition)
Defines a schema object.
import { defineSchema, asString, asNumber } from 'path-binder'
const schema = defineSchema({
user: {
name: asString(),
age: asNumber(),
},
})
Cast Functions
| Function | Signature | Conversion |
|---|---|---|
asString() | () => SchemaNode | String(value) |
asNumber() | () => SchemaNode | Number(value) |
asBoolean() | () => SchemaNode | Boolean(value) |
asDate() | () => SchemaNode | new Date(value) |
asCustom(fn) | (fn: CastFn) => SchemaNode | Custom function |
asAny() | (partial?: object) => SchemaNode | No casting (pass all paths) |
arrayOf(schema) | (schema: SchemaNode | object) => ArraySchema | Array element schema |
type CastFn = (value: unknown) => unknown
Type Exports
import type {
InputData, // Input data type
PathValuePair, // { path: string, value: unknown }
GenerateOptions, // generate() options
GenerateResult, // generate() return type
ParseSkipped, // Skipped entry
ParseSkipReason, // Skip reason code
SchemaObject, // defineSchema() argument type
SchemaNode, // Each schema node
CastFn, // Cast function type
ArraySchema, // arrayOf() return type
AnySchema, // asAny() return type
} from 'path-binder'
Skip Handling
path-binder does not throw errors when it encounters invalid data. Instead, it collects unprocessable entries in the skipped array and returns them alongside the result. This prevents a single bad row from halting the entire batch.
Inspecting Skipped Entries
import { generate } from 'path-binder'
const { result, skipped } = generate(input, { schema })
skipped.forEach((entry) => {
console.log(entry.name) // sheet name
console.log(entry.index) // row index
console.log(entry.path) // the problematic path string
console.log(entry.value) // cell value (stringified)
console.log(entry.reason) // skip reason code
})
skipScope Option
Controls the granularity of skipping.
| skipScope | Behavior | Recommended For |
|---|---|---|
'cell' (default) | Only the invalid cell is skipped; other cells in the row are processed | User input data (tolerating minor gaps) |
'row' | If any cell in a row is invalid, the entire row is skipped | Strict data conversion (preventing incomplete entities) |
// Cell level: only invalid cells are skipped
const cellResult = generate(input, { skipScope: 'cell' })
// Row level: if any cell is invalid, the entire row is skipped
const rowResult = generate(input, { skipScope: 'row' })
Common Skips and How to Fix Them
empty β Empty Path Segment
Triggering input: "user..name" (double dots), "user." (trailing dot)
Fix: Check the path row and remove extra dots.
reference_not_found β No Matching Reference
Triggering input: Sheet2 has user.$id = 3, but Sheet1 has no user with id = 3
Fix: Verify that matching primary data exists in the referenced sheet.
cast β Cast Function Threw an Exception
Triggering input: asNumber() field receives "abc"
Fix: Check the input data, or add error handling in asCustom().
Skip Reasons Reference
Path Syntax Errors
| Code | Description | Example Input |
|---|---|---|
empty | Path segment is empty | "user..name" |
bracket | Unclosed bracket ] | "items[0" |
index | Non-numeric array index | "items[abc]" |
unnamed | No property name before [] | "[]" |
escape | No name after $$ escape prefix | "config.$$" |
key | No name after $ key prefix | "user.$" |
Reference Key Errors
| Code | Description | Example Input |
|---|---|---|
reference_not_found | No matching primary row | $id=3 but no primary id=3 |
no_primary_data | All rows are reference rows | All paths have $ prefix |
conflicting_key_prop | Same row has $key and non-$key with same name | user.$id and user.id in same row |
nested_key | $key inside array path | info[].$type |
invalid_key_value | Key value is not a primitive | $id value is an object |
mixed_key_root | $keys in same row belong to different root paths | user.$id and order.$id in same row |
Data Conflict & Cast Errors
| Code | Description | Example Input |
|---|---|---|
property_conflict | Reference data conflicts with primary data | Both have different values for same property |
cast | Cast function threw an exception | asNumber() receives "abc" |
Skip Logging Patterns
Example of monitoring skips in production.
const { result, skipped } = generate(input, { schema })
// Log when skips occur
if (skipped.length > 0) {
console.warn(`${skipped.length} entries were skipped`)
skipped.forEach((s) => {
console.warn(` Sheet: ${s.name}, Row: ${s.index + 1}, Path: ${s.path}, Reason: ${s.reason}`)
})
}
// Generate error report for user feedback
const errorReport = skipped.map((s) =>
`Row ${s.index + 1}: "${s.path}" β ${s.reason}`
)
Next step: See the API Reference for all types and options
Playground
Drop an Excel file (.xlsx) to convert it into nested JSON using path-binder. The second row of each sheet defines the JSON paths, and rows 3+ are data.
Excel format example:
| UserId | UserName | LoginMethod |
|---|---|---|
| user.id | user.name | user.loginInfo[].type |
| 1 | Taro | ID/Password |
| 2 | Jiro | emailLink |
Download sample Excel (single sheet) | Download sample Excel (multi-sheet with $ reference)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>path-binder Playground</title>
<style>
body { font-family: sans-serif; margin: 20px; }
#dropzone {
border: 2px dashed #1a73e8;
border-radius: 8px;
padding: 40px;
text-align: center;
color: #5f6368;
cursor: pointer;
transition: background-color 0.2s;
}
#dropzone.dragover { background-color: rgba(26,115,232,0.08); }
#output { white-space: pre-wrap; font-family: monospace; margin-top: 16px; }
</style>
</head>
<body>
<div id="dropzone">
Drop .xlsx file here or click to select
<!-- Not using D&D alone: cross-origin iframe nesting blocks drop events. -->
<input type="file" id="file-input" accept=".xlsx" style="display:none">
</div>
<h3>Result</h3>
<pre id="output">Waiting for file...</pre>
<script type="module" src="./index.ts"></script>
</body>
</html>
import ExcelJS from 'exceljs'
import { generate, defineSchema, asNumber, asString, arrayOf } from 'path-binder'
import type { InputData, PathValuePair, GenerateResult } from 'path-binder'
// =====================================================================
// βοΈ Edit this function to try different schemas and options!
//
// Try:
// - Add asNumber() cast: id: asNumber()
// - Add skipScope: generate(inputData, { schema, skipScope: 'row' })
// - Remove schema: return generate(inputData)
// =====================================================================
function example(inputData: InputData): GenerateResult {
const schema = defineSchema({
user: {
id: asNumber(),
name: asString(),
loginInfo: arrayOf({ type: asString() }),
},
})
return generate(inputData, { schema })
}
// Row 1: header labels (ignored by path-binder)
// Row 2: JSON paths
// Row 3+: data values
const PATH_ROW_INDEX = 2
/**
* Reads an Excel workbook buffer and converts it to path-binder InputData.
*
* Uses sheet name as the key and row 2 as paths.
* Not reading paths from row 1: row 1 is a human-readable header,
* keeping paths on row 2 makes it easy to swap path definitions
* without touching the header.
*/
async function excelToInputData(buffer: ArrayBuffer): Promise<InputData> {
const workbook = new ExcelJS.Workbook()
await workbook.xlsx.load(buffer)
const inputData: Record<string, PathValuePair[][]> = {}
workbook.eachSheet((sheet) => {
const pathRow = sheet.getRow(PATH_ROW_INDEX)
const paths: string[] = []
pathRow.eachCell((cell, colNumber) => {
paths[colNumber] = String(cell.value ?? '')
})
const rows: PathValuePair[][] = []
sheet.eachRow((row, rowNumber) => {
if (rowNumber <= PATH_ROW_INDEX) {
return
}
const pairs: PathValuePair[] = []
row.eachCell((cell, colNumber) => {
const path = paths[colNumber]
if (!path) {
return
}
pairs.push({ path, value: cell.value })
})
if (pairs.length > 0) {
rows.push(pairs)
}
})
inputData[sheet.name] = rows
})
return inputData
}
async function processFile(file: File, dropzone: HTMLElement, output: HTMLElement): Promise<void> {
dropzone.textContent = `Loaded: ${file.name}`
try {
const buffer = await file.arrayBuffer()
const inputData = await excelToInputData(buffer)
const { result, skipped } = example(inputData)
output.textContent = JSON.stringify(result, null, 2)
if (skipped.length > 0) {
output.textContent += '\n\n--- Skipped ---\n' + JSON.stringify(skipped, null, 2)
}
} catch (err) {
output.textContent = 'Error: ' + String(err)
}
}
function setupDropzone(): void {
const dropzone = document.getElementById('dropzone')
const output = document.getElementById('output')
const fileInput = document.getElementById('file-input') as HTMLInputElement | null
if (!dropzone || !output || !fileInput) {
return
}
// Click to open file picker.
// Not relying on D&D alone: cross-origin iframe nesting blocks drop events.
dropzone.addEventListener('click', () => {
fileInput.click()
})
fileInput.addEventListener('change', async () => {
const file = fileInput.files?.[0]
if (!file) {
return
}
await processFile(file, dropzone, output)
})
dropzone.addEventListener('dragover', (e) => {
e.preventDefault()
dropzone.classList.add('dragover')
})
dropzone.addEventListener('dragleave', () => {
dropzone.classList.remove('dragover')
})
dropzone.addEventListener('drop', async (e) => {
e.preventDefault()
dropzone.classList.remove('dragover')
const file = e.dataTransfer?.files[0]
if (!file) {
return
}
await processFile(file, dropzone, output)
})
}
setupDropzone()
Excel format rules
- Row 1 β Human-readable header (ignored)
- Row 2 β JSON paths for path-binder (e.g.
user.id,user.name,user.loginInfo[].type) - Row 3+ β Data values
- Sheet name β Used as the top-level key in InputData
What you can try
- Edit the
example()function to add/change the schema - Change the paths in row 2 of your Excel
- Add a second sheet with
$reference keys