The mapping configuration is a JSON file that defines how CSV columns are mapped to properties in the output. The configuration has the following structure:
{
  "mapping": {
    "key1": {
      "property": "propertyName",
      "type": "dataType"
    },
    "key2": {
      "property": "nested.property",
      "type": "dataType"
    }
  },
  "calculated": [
    {
      "property": "calculatedProperty",
      "kind": "kindOfCalculation",
      "format": "formatString",
      "type": "dataType",
      "location": "record"
    }
  ],
  "extra_variables": {
    "variable-name": {
      "value": "variable-value"
    }
  }
}
Where:

- key1, key2, etc. are either:
  - a column index (when not using the -named flag)
  - a column name (when using the -named flag)
- propertyName is the name of the property in the output
- nested.property demonstrates how to create nested objects using dot notation
- dataType is one of:
  - int - converts the value to an integer
  - float - converts the value to a floating-point number
  - bool - converts the value to a boolean
  - string (default) - keeps the value as a string

In addition to mapping a column to a single property, you can map a single column to multiple properties using the properties array. This is useful when a column contains data that needs to be split into multiple fields, or when you want to duplicate a value across multiple properties.
{
  "mapping": {
    "columnKey": {
      "properties": [
        {
          "property": "firstProperty",
          "type": "dataType"
        },
        {
          "property": "nested.secondProperty",
          "type": "dataType"
        }
      ]
    }
  }
}
Where:

- columnKey is either a column index or name (depending on whether you're using the -named flag)
- each entry in the properties array defines a separate output property that will receive the value from the same input column
- each entry uses the same options as a single mapping (the property and type fields)
- dot notation can be used in the property field to create nested structures

Given a CSV with an address column that contains full addresses:
id,name,address
1,"John Doe","123 Main St, Springfield, IL 62701"
You can map the address column to multiple properties:
{
  "mapping": {
    "id": {
      "property": "userId",
      "type": "int"
    },
    "name": {
      "property": "fullName",
      "type": "string"
    },
    "address": {
      "properties": [
        {
          "property": "originalAddress",
          "type": "string"
        },
        {
          "property": "contact.address",
          "type": "string"
        },
        {
          "property": "shipping.address",
          "type": "string"
        }
      ]
    }
  }
}
This will produce:
{
  "userId": 1,
  "fullName": "John Doe",
  "originalAddress": "123 Main St, Springfield, IL 62701",
  "contact": {
    "address": "123 Main St, Springfield, IL 62701"
  },
  "shipping": {
    "address": "123 Main St, Springfield, IL 62701"
  }
}
Calculated fields allow you to add dynamic values to your output that are not directly derived from the CSV input. These fields are defined in the calculated
array of the mapping configuration.
Each calculated field has the following properties:
- property: The name of the property in the output (supports dot notation for nested objects)
- kind: The type of calculation to perform (see below)
- format: Additional information for the calculation; varies by kind
- type: The data type of the calculated value (int, float, bool, or string)
- location: Where the calculated field should be applied - either record (default) or document

The format field is interpreted according to the kind:

- datetime - format: A Go time format string (e.g., "2006-01-02" for date, "15:04:05" for time)
- application - format: Currently only supports "record", which adds the record index (0-based)
- environment - format: The name of the environment variable to read
- extra - format: The name of the extra variable to use; the variable must be defined in the extra_variables section of the configuration
- mapping - format: Specified as "field:mapping_list" where:
  - field is the source field name (when using -named) or index
  - mapping_list is a comma-separated list of "from=to" pairs
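As an illustration of the mapping kind's format string, the sketch below converts a status column into a numeric code; the column name and values are illustrative, and a full worked example of value mapping appears later in this document:

{
  "calculated": [
    {
      "property": "statusCode",
      "kind": "mapping",
      "format": "status:active=1,inactive=0,default=-1",
      "type": "int",
      "location": "record"
    }
  ]
}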
location: "record"
):
location: "document"
):
-array
flag) or when using TOML/YAML output formats_meta
Document-level calculated fields are useful for adding metadata about the entire dataset, such as:
Note: Document-level calculated fields are only applied when the output is a single document containing all records (array mode). They are not applied when outputting individual records as separate JSON objects.
{
  "mapping": {
    "id": {
      "property": "productId",
      "type": "int"
    }
  },
  "calculated": [
    {
      "property": "metadata.recordNumber",
      "kind": "application",
      "format": "record",
      "type": "int",
      "location": "record"
    },
    {
      "property": "metadata.processedDate",
      "kind": "datetime",
      "format": "2006-01-02",
      "type": "string",
      "location": "record"
    },
    {
      "property": "metadata.processedTime",
      "kind": "datetime",
      "format": "15:04:05",
      "type": "string",
      "location": "record"
    },
    {
      "property": "metadata.userHome",
      "kind": "environment",
      "format": "HOME",
      "type": "string",
      "location": "record"
    },
    {
      "property": "metadata.version",
      "kind": "extra",
      "format": "app-version",
      "type": "string",
      "location": "record"
    },
    {
      "property": "_meta.totalRecords",
      "kind": "application",
      "format": "records",
      "type": "int",
      "location": "document"
    },
    {
      "property": "_meta.processedAt",
      "kind": "datetime",
      "format": "2006-01-02 15:04:05",
      "type": "string",
      "location": "document"
    }
  ],
  "extra_variables": {
    "app-version": {
      "value": "1.0.0"
    }
  }
}
This configuration would add the following calculated fields:

Record-level fields (added to each record):

- metadata.recordNumber: The 0-based index of the record
- metadata.processedDate: The current date in YYYY-MM-DD format
- metadata.processedTime: The current time in HH:MM:SS format
- metadata.userHome: The value of the HOME environment variable
- metadata.version: The string "1.0.0" from the extra variable "app-version"

Document-level fields (added to the top-level document when using array output):

- _meta.totalRecords: The total number of records processed
- _meta.processedAt: The date and time when the document was processed

Conditional properties allow you to include or exclude properties in the output based on specific conditions. This feature is useful when you want to selectively include fields only when certain criteria are met.
A condition is defined as an object with the following properties:
- operator: The comparison operator to use (=, !=, >, <, >=, <=)
- operand1: The first operand in the comparison
- operand2: The second operand in the comparison
- type: The data type to use for comparison (string, int, float, or bool)

Each operand is an object with:

- type: Either value for a fixed value or column for a value from a CSV column
- value: Either a literal value (when type is value) or a column index/name (when type is column)

{
  "mapping": {
    "0": {
      "properties": [
        {
          "property": "id",
          "type": "int"
        },
        {
          "property": "premiumId",
          "type": "int",
          "condition": {
            "operator": "=",
            "operand1": {
              "type": "column",
              "value": "3"
            },
            "operand2": {
              "type": "value",
              "value": "premium"
            },
            "type": "string"
          }
        }
      ]
    }
  }
}
In this example:

- The id property is always included
- The premiumId property is only included when the value in column 3 equals "premium"

The supported comparison operators are:

- =: Equal to
- !=: Not equal to
- >: Greater than
- <: Less than
- >=: Greater than or equal to
- <=: Less than or equal to

The type field in the condition determines how the values are compared:

- string: Values are compared as strings (lexicographically)
- int: Values are converted to integers before comparison
- float: Values are converted to floating-point numbers before comparison
- bool: Values are converted to booleans before comparison

Conditional properties are useful for including fields only when the input data meets specific criteria, as with the premiumId property above.
The csv2json package provides two callback functions that can be used for advanced use cases.

NewRecordFunc is a callback function that is called when a new record is being processed. It receives the current record and header as parameters:

NewRecordFunc func([]string, []string)

The function is called before any mapping or transformation is applied to the record, giving you access to the raw CSV data.

AskForValueFunc is a callback function that dynamically provides values for calculated fields with the "ask" kind. It receives the current record, header, and the calculated field definition as parameters:

AskForValueFunc func(record, header []string, field CalculatedField) (string, error)

To use the "ask" kind in calculated fields, define a calculated field with:

- kind: "ask"
- format: A string that can be used to identify what value to retrieve

Example configuration:
{
  "calculated": [
    {
      "property": "dynamicValue",
      "kind": "ask",
      "format": "some-identifier",
      "type": "string",
      "location": "record"
    }
  ]
}
When this calculated field is processed, the AskForValueFunc will be called with the current record, header, and the field definition. The function should return the value to use for the field, or an error if the value cannot be determined.
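A minimal Go sketch of both callbacks is shown below. The function signatures come from the definitions above; the CalculatedField field names and the way the callbacks are registered are assumptions, since the exact wiring depends on the csv2json API.

package main

import (
	"fmt"
	"os"
)

// CalculatedField is a local stand-in for the csv2json type of the same name;
// the field names used here are assumptions for illustration only.
type CalculatedField struct {
	Property string
	Kind     string
	Format   string
	Type     string
	Location string
}

// newRecord matches NewRecordFunc func([]string, []string): it receives the
// raw record and header before any mapping or transformation is applied.
func newRecord(record, header []string) {
	fmt.Fprintf(os.Stderr, "record with %d columns (header has %d)\n", len(record), len(header))
}

// askForValue matches AskForValueFunc: it resolves values for calculated
// fields of kind "ask", using the field's Format string as an identifier.
func askForValue(record, header []string, field CalculatedField) (string, error) {
	if field.Format == "some-identifier" {
		return "value-resolved-at-runtime", nil
	}
	return "", fmt.Errorf("no value available for %q", field.Format)
}

func main() {
	// Registration of the callbacks is package-specific and not shown here.
	_ = newRecord
	_ = askForValue
}

With the example configuration above, askForValue would be invoked once per record for the dynamicValue field, and its return value stored as a string.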
Without the -array flag: when processing multiple CSV rows, each row is converted to a separate JSON document and written to the output with newlines between them. This produces a newline-delimited JSON format (NDJSON/JSON Lines), where each line is a valid JSON object, but the file as a whole is not a standard JSON array.

With the -array flag: all rows are collected into a single array and output as one document.
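As a sketch of the expected shape, the two records used in the example below would be collected into a single document in array mode; without the -nested-property flag, the records presumably form the top-level JSON array itself:

[
  {
    "property1": 1,
    "property2": {
      "property3": "hello"
    }
  },
  {
    "property1": 2,
    "property2": {
      "property3": "world"
    }
  }
]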
When using the -nested-property flag with the -array flag (or when using TOML output, which implicitly enables array mode), the output data is nested under the specified property name. For example, with the nested property set to "items", the JSON output will be:
{
  "items": [
    {
      "property1": 1,
      "property2": {
        "property3": "hello"
      }
    },
    {
      "property1": 2,
      "property2": {
        "property3": "world"
      }
    }
  ]
}
With YAML output, the same data will produce:
items:
  - property1: 1
    property2:
      property3: hello
  - property1: 2
    property2:
      property3: world
When using TOML as the output format, the array data is always wrapped in a property. By default, this property is named "data", but you can customize it using the -nested-property flag. For example, with the nested property set to "items", the TOML output will be:
[_meta]
processedAt = "2023-05-12 15:30:45"
totalRecords = 2
[[items]]
property1 = 1
property2 = { property3 = "hello" }
[[items]]
property1 = 2
property2 = { property3 = "world" }
Note how the document-level calculated fields appear in the _meta
section at the top of the document, while record-level calculated fields would appear within each record.
Given the following CSV:
1,"hello",2.3
And this mapping.json:
{
  "mapping": {
    "0": {
      "property": "property1",
      "type": "int"
    },
    "1": {
      "property": "property2.property3",
      "type": "string"
    },
    "2": {
      "property": "property4",
      "type": "float"
    }
  }
}
The default output will be:
{
  "property1": 1,
  "property2": {
    "property3": "hello"
  },
  "property4": 2.3
}
Given the following CSV:
id,status,value
1,"active",10.5
2,"inactive",20.3
3,"pending",15.7
And this mapping.json with value mapping:
{
  "mapping": {
    "id": {
      "property": "id",
      "type": "int"
    },
    "status": {
      "property": "originalStatus",
      "type": "string"
    },
    "value": {
      "property": "amount",
      "type": "float"
    }
  },
  "calculated": [
    {
      "property": "statusCode",
      "kind": "mapping",
      "format": "status:active=1,inactive=0,pending=2,default=-1",
      "type": "int",
      "location": "record"
    }
  ]
}
Running with the -named flag will produce:
{"id":1,"originalStatus":"active","amount":10.5,"statusCode":1}
{"id":2,"originalStatus":"inactive","amount":20.3,"statusCode":0}
{"id":3,"originalStatus":"pending","amount":15.7,"statusCode":2}
This example demonstrates how to map string status values to numeric codes using the value mapping feature.
Given the following CSV:
id,name,price
1,"Product A",19.99
2,"Product B",29.99
And this mapping.json:
{
  "mapping": {
    "id": {
      "property": "productId",
      "type": "int"
    },
    "name": {
      "property": "productName",
      "type": "string"
    },
    "price": {
      "property": "pricing.retail",
      "type": "float"
    }
  }
}
Running with the -named flag will produce the following (in NDJSON format, with each line being a separate JSON document):
{"productId":1,"productName":"Product A","pricing":{"retail":19.99}}
{"productId":2,"productName":"Product B","pricing":{"retail":29.99}}
Note: The actual output will not be pretty-printed but shown as compact JSON objects, one per line.
With YAML output, the same data will produce:
- productId: 1
  productName: Product A
  pricing:
    retail: 19.99
- productId: 2
  productName: Product B
  pricing:
    retail: 29.99