The mapping configuration is a JSON file that defines how CSV columns are mapped to properties in the output. The configuration has the following structure:
{
  "mapping": {
    "key1": {
      "property": "propertyName",
      "type": "dataType"
    },
    "key2": {
      "property": "nested.property",
      "type": "dataType"
    }
  },
  "calculated": [
    {
      "property": "calculatedProperty",
      "kind": "kindOfCalculation",
      "format": "formatString",
      "type": "dataType",
      "location": "record"
    }
  ],
  "extra_variables": {
    "variable-name": {
      "value": "variable-value"
    }
  }
}
Where:
- key1, key2, etc. are either column indexes (without the -named flag) or column names (with the -named flag)
- propertyName is the name of the property in the output
- nested.property demonstrates how to create nested objects using dot notation
- dataType is one of:
  - int - converts the value to an integer
  - float - converts the value to a floating-point number
  - bool - converts the value to a boolean
  - string (default) - keeps the value as a string
  - date / time - parses date or time values using Go time layouts. You can specify:
    - date or time with no arguments: uses the default layouts (2006-01-02 for date, 15:04:05 for time)
    - date:INPUT_LAYOUT or time:INPUT_LAYOUT: parses using the provided layout
    - date:INPUT_LAYOUT:OUTPUT_LAYOUT or time:INPUT_LAYOUT:OUTPUT_LAYOUT: parses using the input layout and outputs a formatted string using the output layout

Date and time parsing uses Go's time layouts (e.g., 2006-01-02 for dates, 15:04:05 for times). When a date or time value is parsed without an OUTPUT_LAYOUT, it is stored as a time value; when serialized to JSON it appears in RFC3339 format (e.g., 2025-10-23T00:00:00Z). For time values without a date, the serialized JSON will use a zero date component (e.g., 0000-01-01T15:00:00Z).
Examples:
{
  "mapping": {
    "3": { "property": "date_no_format", "type": "date" },
    "4": { "property": "date", "type": "date:20060102" },
    "5": { "property": "datetime", "type": "time:20060102150405" },
    "6": { "property": "time", "type": "time:150405" },
    "7": { "property": "date_custom", "type": "time:20060102:02.01.2006" }
  }
}
- date uses the default layout 2006-01-02.
- date:20060102 parses a compact date (YYYYMMDD).
- time:20060102150405 parses a full datetime (YYYYMMDDhhmmss).
- time:150405 parses only a time (hhmmss).
- time:20060102:02.01.2006 parses a date using the input layout and outputs it formatted as DD.MM.YYYY.

In addition to mapping a column to a single property, you can map a single column to multiple properties using the properties array. This is useful when a column contains data that needs to be split into multiple fields or when you want to duplicate a value across multiple properties.
{
  "mapping": {
    "columnKey": {
      "properties": [
        {
          "property": "firstProperty",
          "type": "dataType"
        },
        {
          "property": "nested.secondProperty",
          "type": "dataType"
        }
      ]
    }
  }
}
Where:

- columnKey is either a column index or name (depending on whether you're using the -named flag)
- each entry in the properties array defines a separate output property that will receive the value from the same input column
- each entry has the same shape as a single mapping (property and type fields)
- dot notation can be used in the property field to create nested structures

Given a CSV with an address column that contains full addresses:
id,name,address
1,"John Doe","123 Main St, Springfield, IL 62701"
You can map the address column to multiple properties:
{
  "mapping": {
    "id": {
      "property": "userId",
      "type": "int"
    },
    "name": {
      "property": "fullName",
      "type": "string"
    },
    "address": {
      "properties": [
        {
          "property": "originalAddress",
          "type": "string"
        },
        {
          "property": "contact.address",
          "type": "string"
        },
        {
          "property": "shipping.address",
          "type": "string"
        }
      ]
    }
  }
}
This will produce:
{
  "userId": 1,
  "fullName": "John Doe",
  "originalAddress": "123 Main St, Springfield, IL 62701",
  "contact": {
    "address": "123 Main St, Springfield, IL 62701"
  },
  "shipping": {
    "address": "123 Main St, Springfield, IL 62701"
  }
}
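The nested contact and shipping objects come from the dot notation in the property names. A minimal sketch of how such paths expand into nested maps (setNested is illustrative, not part of the csv2json API):

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// setNested places value into m, creating an intermediate map for each
// dot-separated segment of path (e.g. "contact.address").
func setNested(m map[string]any, path string, value any) {
	parts := strings.Split(path, ".")
	for _, p := range parts[:len(parts)-1] {
		child, ok := m[p].(map[string]any)
		if !ok {
			child = map[string]any{}
			m[p] = child
		}
		m = child
	}
	m[parts[len(parts)-1]] = value
}

func main() {
	out := map[string]any{}
	addr := "123 Main St, Springfield, IL 62701"
	setNested(out, "originalAddress", addr)
	setNested(out, "contact.address", addr)
	setNested(out, "shipping.address", addr)
	b, _ := json.Marshal(out)
	fmt.Println(string(b))
}
```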
Calculated fields allow you to add dynamic values to your output that are not directly derived from the CSV input. These fields are defined in the calculated array of the mapping configuration.
Each calculated field has the following properties:
- property: The name of the property in the output (supports dot notation for nested objects)
- kind: The type of calculation to perform (see below)
- format: Additional information for the calculation; varies by kind
- type: The data type of the calculated value (int, float, bool, or string)
- location: Where the calculated field should be applied - either record (default) or document

The available kinds and their format values are:

- datetime - format: A Go time format string (e.g., "2006-01-02" for a date, "15:04:05" for a time)
- application - format: "record" adds the record index (0-based) to each record; "records" adds the total number of records at the document level
- environment - format: The name of the environment variable to read
- extra - format: The name of the extra variable to use, as defined in the extra_variables section of the configuration
- mapping - format: Specified as "field:mapping_list", where:
  - field is the source field name (when using -named) or index
  - mapping_list is a comma-separated list of "from=to" pairs

Calculated fields can be applied at two different levels:

- Record level (location: "record"): the field is added to each individual record.
- Document level (location: "document"): the field is added once to the top-level document, but only in array mode (the -array flag) or when using TOML/YAML output formats.

Document-level calculated fields are useful for adding metadata about the entire dataset, such as the total record count or a processing timestamp (stored under _meta in the example below).
Note: Document-level calculated fields are only applied when the output is a single document containing all records (array mode). They are not applied when outputting individual records as separate JSON objects.
{
  "mapping": {
    "id": {
      "property": "productId",
      "type": "int"
    }
  },
  "calculated": [
    {
      "property": "metadata.recordNumber",
      "kind": "application",
      "format": "record",
      "type": "int",
      "location": "record"
    },
    {
      "property": "metadata.processedDate",
      "kind": "datetime",
      "format": "2006-01-02",
      "type": "string",
      "location": "record"
    },
    {
      "property": "metadata.processedTime",
      "kind": "datetime",
      "format": "15:04:05",
      "type": "string",
      "location": "record"
    },
    {
      "property": "metadata.userHome",
      "kind": "environment",
      "format": "HOME",
      "type": "string",
      "location": "record"
    },
    {
      "property": "metadata.version",
      "kind": "extra",
      "format": "app-version",
      "type": "string",
      "location": "record"
    },
    {
      "property": "_meta.totalRecords",
      "kind": "application",
      "format": "records",
      "type": "int",
      "location": "document"
    },
    {
      "property": "_meta.processedAt",
      "kind": "datetime",
      "format": "2006-01-02 15:04:05",
      "type": "string",
      "location": "document"
    }
  ],
  "extra_variables": {
    "app-version": {
      "value": "1.0.0"
    }
  }
}
This configuration would add the following calculated fields:
Record-level fields (added to each record):
- metadata.recordNumber: The 0-based index of the record
- metadata.processedDate: The current date in YYYY-MM-DD format
- metadata.processedTime: The current time in HH:MM:SS format
- metadata.userHome: The value of the HOME environment variable
- metadata.version: The string "1.0.0" from the extra variable "app-version"

Document-level fields (added to the top-level document when using array output):

- _meta.totalRecords: The total number of records processed
- _meta.processedAt: The date and time when the document was processed

Conditional properties allow you to include or exclude properties in the output based on specific conditions. This feature is useful when you want to selectively include fields only when certain criteria are met.
A condition is defined as an object with the following properties:
- operator: The comparison operator to use (e.g., =, !=, >, <, >=, <=)
- operand1: The first operand in the comparison
- operand2: The second operand in the comparison
- type: The data type to use for comparison (string, int, float, or bool)

Each operand is an object with:

- type: Either value for a fixed value or column for a value from a CSV column
- value: Either a literal value (when type is value) or a column index/name (when type is column)

Example:

{
  "mapping": {
    "0": {
      "properties": [
        {
          "property": "id",
          "type": "int"
        },
        {
          "property": "premiumId",
          "type": "int",
          "condition": {
            "operator": "=",
            "operand1": {
              "type": "column",
              "value": "3"
            },
            "operand2": {
              "type": "value",
              "value": "premium"
            },
            "type": "string"
          }
        }
      ]
    }
  }
}
In this example:

- The id property is always included
- The premiumId property is only included when the value in column 3 equals "premium"

The supported operators are:

- =: Equal to
- !=: Not equal to
- >: Greater than
- <: Less than
- >=: Greater than or equal to
- <=: Less than or equal to

The type field in the condition determines how the values are compared:

- string: Values are compared as strings (lexicographically)
- int: Values are converted to integers before comparison
- float: Values are converted to floating-point numbers before comparison
- bool: Values are converted to booleans before comparison

Conditional properties are useful for selectively including fields only when the underlying data meets your criteria.
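The choice of comparison type matters because the same raw values can order differently as strings and as numbers. An illustrative sketch (greaterThan is hypothetical, not the library's code):

```go
package main

import (
	"fmt"
	"strconv"
)

// greaterThan compares two raw CSV values under the given condition type.
func greaterThan(a, b, typ string) bool {
	switch typ {
	case "int":
		ai, _ := strconv.Atoi(a)
		bi, _ := strconv.Atoi(b)
		return ai > bi
	case "float":
		af, _ := strconv.ParseFloat(a, 64)
		bf, _ := strconv.ParseFloat(b, 64)
		return af > bf
	default: // "string": lexicographic comparison
		return a > b
	}
}

func main() {
	fmt.Println(greaterThan("10", "9", "string")) // false: "1" sorts before "9"
	fmt.Println(greaterThan("10", "9", "int"))    // true
}
```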
You can filter out (skip) input CSV rows before mapping them by defining a filter section in the configuration. A row is skipped if it matches any of the defined filter groups.
Configuration schema:
{
  "filter": {
    "groupName": [
      {
        "operator": "=|!=|>|<",
        "operand1": { "type": "value|column", "value": "..." },
        "operand2": { "type": "value|column", "value": "..." },
        "type": "string|int|float|bool"
      }
    ]
  }
}
Filter rows where id equals 1:
{
  "mapping": {
    "0": { "property": "id", "type": "int" },
    "1": { "property": "name", "type": "string" },
    "2": { "property": "email", "type": "string" }
  },
  "filter": {
    "id_is_1": [
      {
        "operator": "=",
        "operand1": { "type": "value", "value": "1" },
        "operand2": { "type": "column", "value": "0" },
        "type": "string"
      }
    ]
  }
}
Filter rows where name is “John Doe” AND email is “john2@example.com”:
{
  "filter": {
    "john_with_second_email": [
      {
        "operator": "=",
        "operand1": { "type": "value", "value": "John Doe" },
        "operand2": { "type": "column", "value": "1" },
        "type": "string"
      },
      {
        "operator": "=",
        "operand1": { "type": "value", "value": "john2@example.com" },
        "operand2": { "type": "column", "value": "2" },
        "type": "string"
      }
    ]
  }
}
Filter rows where id is 1 OR id is 3 (two groups, OR logic between them):
{
  "filter": {
    "id_1": [
      {
        "operator": "=",
        "operand1": { "type": "value", "value": "1" },
        "operand2": { "type": "column", "value": "0" },
        "type": "string"
      }
    ],
    "id_3": [
      {
        "operator": "=",
        "operand1": { "type": "value", "value": "3" },
        "operand2": { "type": "column", "value": "0" },
        "type": "string"
      }
    ]
  }
}
Notes:

- Conditions within a single group are combined with AND; separate groups are combined with OR.
- A row is skipped when it matches any one of the filter groups.
The csv2json package provides two callback functions that can be used for advanced use cases:
NewRecordFunc is a callback function that is called when a new record is being processed. It receives the current record and header as parameters:
NewRecordFunc func([]string, []string)
This function can be used, for example, to log progress, collect statistics, or inspect rows as they are read. The function is called before any mapping or transformation is applied to the record, giving you access to the raw CSV data.
AskForValueFunc is a callback function that dynamically provides values for calculated fields with the “ask” kind. It receives the current record, header, and the calculated field definition as parameters:
AskForValueFunc func(record, header []string, field CalculatedField) (string, error)
This function can be used to supply values that are not present in the CSV itself - for example, by prompting the user or looking them up in an external source.
To use the “ask” kind in calculated fields, define a calculated field with:
- kind: "ask"
- format: A string that can be used to identify what value to retrieve

Example configuration:
{
  "calculated": [
    {
      "property": "dynamicValue",
      "kind": "ask",
      "format": "some-identifier",
      "type": "string",
      "location": "record"
    }
  ]
}
When this calculated field is processed, the AskForValueFunc will be called with the current record, header, and the field definition. The function should return the value to use for the field, or an error if the value cannot be determined.
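A minimal sketch of such a callback; the CalculatedField struct below is a simplified stand-in for the library's type, assuming the field definition exposes its format string:

```go
package main

import "fmt"

// CalculatedField is a simplified stand-in for the csv2json type;
// the real definition lives in the package itself.
type CalculatedField struct {
	Property string
	Kind     string
	Format   string
}

// askForValue resolves "ask" fields from a lookup table keyed by the
// field's format string; a real implementation might prompt the user
// or query an external service instead.
func askForValue(record, header []string, field CalculatedField) (string, error) {
	values := map[string]string{
		"some-identifier": "resolved-value",
	}
	v, ok := values[field.Format]
	if !ok {
		return "", fmt.Errorf("no value for %q", field.Format)
	}
	return v, nil
}

func main() {
	v, err := askForValue(nil, nil, CalculatedField{Kind: "ask", Format: "some-identifier"})
	fmt.Println(v, err)
}
```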
PreProcess lets you transform each raw CSV row before any filtering, mapping, or calculated fields are applied. This is useful for data normalization and complex transformations that are easier to do on the row as a whole.
Signature:
PreProcess func(record, header []string) ([]string, error)
When it runs: once per row, before any filtering, mapping, or calculated fields. The header slice will be populated when running in named/header mode; otherwise it may be nil.

What to return: the transformed record (which will be used in place of the original), or an error if the row cannot be processed.

Typical use cases include normalizing values, trimming whitespace, or rewriting fields before they are mapped.
Example (replace “com” with “org” in all fields):
mapper, _ := csv2json.NewMapper(csv2json.WithOptions(cfgBytes))
mapper.SetPreProcessFunc(func(record, header []string) ([]string, error) {
	out := make([]string, len(record))
	for i := range record {
		out[i] = strings.ReplaceAll(record[i], "com", "org")
	}
	return out, nil
})
Notes:

- Use the header slice to look up column positions if needed.

FilteredNotification lets you observe when a row is filtered out by the filter rules described in the Row Filtering section. This is useful for metrics, audits, or debug logging.
Signature:
FilteredNotification func(record, header []string)
How it works: the callback is invoked once for each row that is skipped by the filter rules, receiving the raw record and the header (when available). Filtered rows do not appear in the output.
Usage example (counting filtered rows):
mapper, err := csv2json.NewMapper(
	csv2json.WithOptions(cfgBytes),
	csv2json.WithNamed(true), // optional; provides header slice
)
if err != nil { /* handle */ }

var filtered int
mapper.SetFilteredNotification(func(record, header []string) {
	filtered++
})

out, err := mapper.Map(inputCSV)
// use out; 'filtered' contains the number of rows skipped by filters
When processing multiple CSV rows without the -array flag, each row is converted to a separate JSON document and written to the output with newlines between them. This produces a newline-delimited JSON format (NDJSON/JSON Lines), where each line is a valid JSON object, but the file as a whole is not a standard JSON array.
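Each line of this NDJSON output can be decoded independently. A small, self-contained sketch of consuming it (decodeNDJSON is illustrative, not part of the package):

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// decodeNDJSON decodes newline-delimited JSON, one object per line.
func decodeNDJSON(s string) ([]map[string]any, error) {
	var out []map[string]any
	sc := bufio.NewScanner(strings.NewReader(s))
	for sc.Scan() {
		if len(sc.Bytes()) == 0 {
			continue // skip blank lines
		}
		var rec map[string]any
		if err := json.Unmarshal(sc.Bytes(), &rec); err != nil {
			return nil, err
		}
		out = append(out, rec)
	}
	return out, sc.Err()
}

func main() {
	recs, err := decodeNDJSON("{\"id\":1}\n{\"id\":2}")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(recs)) // 2
}
```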
When using the -array flag, all rows are collected into a single array and output as one document.
When using the -nested-property flag with the -array flag (or when using TOML output which implicitly enables array mode), the output data is nested under the specified property name:
For example, with -nested-property items, this will produce:
{
  "items": [
    {
      "property1": 1,
      "property2": {
        "property3": "hello"
      }
    },
    {
      "property1": 2,
      "property2": {
        "property3": "world"
      }
    }
  ]
}
The same data as YAML will produce:
items:
  - property1: 1
    property2:
      property3: hello
  - property1: 2
    property2:
      property3: world
When using TOML as the output format, the array data is always wrapped in a property. By default, this property is named “data”, but you can customize it using the -nested-property flag:
For example, with -nested-property items, the TOML output will be:
[_meta]
processedAt = "2023-05-12 15:30:45"
totalRecords = 2
[[items]]
property1 = 1
property2 = { property3 = "hello" }
[[items]]
property1 = 2
property2 = { property3 = "world" }
Note how the document-level calculated fields appear in the _meta section at the top of the document, while record-level calculated fields would appear within each record.
Given the following CSV:
1,"hello",2.3
And this mapping.json:
{
  "mapping": {
    "0": {
      "property": "property1",
      "type": "int"
    },
    "1": {
      "property": "property2.property3",
      "type": "string"
    },
    "2": {
      "property": "property4",
      "type": "float"
    }
  }
}
The default output will be:
{
  "property1": 1,
  "property2": {
    "property3": "hello"
  },
  "property4": 2.3
}
Given the following CSV:
id,status,value
1,"active",10.5
2,"inactive",20.3
3,"pending",15.7
And this mapping.json with value mapping:
{
  "mapping": {
    "id": {
      "property": "id",
      "type": "int"
    },
    "status": {
      "property": "originalStatus",
      "type": "string"
    },
    "value": {
      "property": "amount",
      "type": "float"
    }
  },
  "calculated": [
    {
      "property": "statusCode",
      "kind": "mapping",
      "format": "status:active=1,inactive=0,pending=2,default=-1",
      "type": "int",
      "location": "record"
    }
  ]
}
Running with the -named flag will produce:
{"id":1,"originalStatus":"active","amount":10.5,"statusCode":1}
{"id":2,"originalStatus":"inactive","amount":20.3,"statusCode":0}
{"id":3,"originalStatus":"pending","amount":15.7,"statusCode":2}
This example demonstrates how to map string status values to numeric codes using the value mapping feature.
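Conceptually, the format string is a lookup table with a fallback. An illustrative sketch of that lookup (applyMapping is hypothetical, not the library's implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// applyMapping resolves a value through a "from=to" list such as
// "active=1,inactive=0,pending=2,default=-1", falling back to the
// "default" entry when the value has no explicit mapping.
func applyMapping(mappingList, value string) string {
	table := map[string]string{}
	for _, pair := range strings.Split(mappingList, ",") {
		from, to, ok := strings.Cut(pair, "=")
		if ok {
			table[from] = to
		}
	}
	if to, ok := table[value]; ok {
		return to
	}
	return table["default"]
}

func main() {
	list := "active=1,inactive=0,pending=2,default=-1"
	fmt.Println(applyMapping(list, "active"))  // 1
	fmt.Println(applyMapping(list, "unknown")) // -1
}
```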
Given the following CSV:
id,name,price
1,"Product A",19.99
2,"Product B",29.99
And this mapping.json:
{
  "mapping": {
    "id": {
      "property": "productId",
      "type": "int"
    },
    "name": {
      "property": "productName",
      "type": "string"
    },
    "price": {
      "property": "pricing.retail",
      "type": "float"
    }
  }
}
Running with the -named flag will produce (in NDJSON format, with each line being a separate JSON document):
{"productId":1,"productName":"Product A","pricing":{"retail":19.99}}
{"productId":2,"productName":"Product B","pricing":{"retail":29.99}}
Note: The actual output will not be pretty-printed but shown as compact JSON objects, one per line.
The YAML output will be:
- productId: 1
  productName: Product A
  pricing:
    retail: 19.99
- productId: 2
  productName: Product B
  pricing:
    retail: 29.99
retail: 29.99