CSV to JSON Converter

Convert CSV data to JSON array format.

How to Use the CSV to JSON Converter

  1. Paste CSV into the input box. The first row must be the header - column names that will become JSON object keys. A spreadsheet copy-paste, a .csv export from a BI tool, or raw output from psql -c "COPY ... TO STDOUT CSV HEADER" all work.
  2. Pick a delimiter. Auto-detect recognises comma, tab, semicolon, and pipe by counting occurrences on the first line. Manual selection forces the chosen character.
  3. Toggle trim to strip leading and trailing spaces from each field. Useful when the source was aligned by column for human reading.
  4. Click Convert. You get a JSON array of objects, one per data row, with keys matching the header and values as strings by default.
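
The auto-detect rule in step 2 can be sketched as below. This is an illustration rather than the tool's source: `detectDelimiter` is a hypothetical name, and the comma fallback for ties or a delimiter-free first line is an assumption.

```javascript
// Hypothetical helper: pick the candidate delimiter that occurs most often
// on the first line. Falls back to comma when nothing is found (assumption).
function detectDelimiter(text) {
  const candidates = [",", "\t", ";", "|"];
  const firstLine = text.split(/\r?\n/, 1)[0];
  let best = ",";
  let bestCount = 0;
  for (const ch of candidates) {
    // Naive count: delimiters inside quoted header cells are counted too.
    const count = firstLine.split(ch).length - 1;
    if (count > bestCount) {
      best = ch;
      bestCount = count;
    }
  }
  return best;
}

console.log(JSON.stringify(detectDelimiter("id\tname\tcity\n1\tAda\tLondon"))); // "\t"
console.log(JSON.stringify(detectDelimiter("id;price\n1;3,50"))); // ";"
```

Only the first line is counted, so the decimal comma in the second row of the last example does not affect the result.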

What the Parser Does

The parser implements a state machine that mirrors RFC 4180. It tracks whether it is currently inside quoted text, escaping a quote (a doubled "" inside a quoted field represents a literal quote per the spec), or reading plain content. Delimiters inside quoted fields are ignored; newlines inside quoted fields are preserved so multi-line cells survive intact. The header row is captured first, then each subsequent row becomes an object whose keys are the headers and whose values are the parsed cells in matching order.
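
A minimal sketch of such a state machine follows. It is not the tool's actual code: `parseCsv` and its variable names are hypothetical, and it returns rows as arrays of strings rather than building the final objects.

```javascript
// Minimal quote-aware CSV state machine (illustrative sketch only).
function parseCsv(text, delimiter = ",") {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // doubled quote inside a quoted field: literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch; // delimiters and newlines are plain content here
      }
    } else if (ch === '"' && field === "") {
      inQuotes = true; // opening quote at the start of a field
    } else if (ch === delimiter) {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
  }
  // Flush the final record when the input does not end with a line break.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

const sample = 'name,quote\r\n"Smith, Jane","She said ""hi""\nand left"\r\n';
console.log(parseCsv(sample));
```

In the sample, the comma in "Smith, Jane" and the newline in the second cell stay inside their fields, and each doubled quote collapses to a single literal quote.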

The converter emits every cell as a string rather than trying to guess numeric types. That is deliberate: CSV has no type system, and what looks like a number (0123, phone numbers, leading-zero ZIP codes, ISBNs) is very often a string whose leading zero would be lost by a greedy parser. If you need numeric casting, post-process the JSON with your own mapping step where you know which columns are actually numbers. Extra columns beyond the header length are dropped; missing columns in short rows produce empty-string values.
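
The handling of long and short rows can be illustrated with a small mapping step. `rowsToObjects` is a hypothetical helper, and it assumes the header and rows have already been parsed into arrays of strings.

```javascript
// Hypothetical helper: turn parsed rows into objects keyed by the header.
function rowsToObjects(header, rows) {
  return rows.map((cells) => {
    const record = {};
    header.forEach((key, i) => {
      // Missing cells become ""; cells past header.length are never read.
      record[key] = i < cells.length ? cells[i] : "";
    });
    return record;
  });
}

const header = ["zip", "city"];
const rows = [
  ["01234", "Springfield", "extra"], // one cell too many
  ["90210"],                         // one cell too few
];
console.log(JSON.stringify(rowsToObjects(header, rows)));
// [{"zip":"01234","city":"Springfield"},{"zip":"90210","city":""}]
```

The ZIP code keeps its leading zero because every value stays a string.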

Tasks This Tool Fits

  • Ingesting a spreadsheet export into a Node or Deno script that expects JSON.parse-able data.
  • Converting a survey CSV into records for a NoSQL store like MongoDB or DynamoDB.
  • Preparing test fixtures for integration tests where JSON is easier to check into git than CSV.
  • Feeding report data into a chart library that consumes arrays of objects (Chart.js, D3, ECharts).
  • Normalising an e-commerce product export into a structure you can POST to a REST API.
  • Pulling a SELECT ... INTO OUTFILE CSV back into a JSON document for a code review or bug report.

CSV Quirks and Pitfalls

  • Byte-order mark. Files saved by Excel on Windows often start with a UTF-8 BOM (0xEF 0xBB 0xBF). The first header cell will then carry an invisible stray character at the front; remove it by retyping the first header name after pasting, or by saving the file without a BOM.
  • Embedded newlines. A quoted field can contain a newline, and the parser handles it correctly. However, if a closing quote is missing, the open "field" will swallow the rest of the file. Check that the total number of double quotes is even.
  • Quote style. RFC 4180 mandates double quotes. Single-quoted fields are non-standard and the parser does not recognise them. If your source uses single quotes, replace them first.
  • Trailing empty cells. A row ending in ,,, produces three empty-string values. Some CSV producers trim trailing empty cells, creating short rows; the missing columns become empty strings in the JSON.
  • Mixed line endings. Files produced on Windows use \r\n; Unix uses \n. The parser tolerates both, but a mix within a single file can throw off row counts in edge cases.
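
A quick pre-flight check for the unbalanced-quote pitfall could look like the sketch below. `hasBalancedQuotes` is a hypothetical name. It relies on the fact that well-formed RFC 4180 quoting always yields an even number of double quotes, so an odd count signals a problem, while an even count does not prove the file is well-formed.

```javascript
// Hypothetical pre-flight check: an odd number of double quotes means some
// quoted field is never closed (or a literal quote was not doubled).
function hasBalancedQuotes(text) {
  let count = 0;
  for (const ch of text) {
    if (ch === '"') count++;
  }
  return count % 2 === 0;
}

console.log(hasBalancedQuotes('id,note\n1,"said ""hi"""\n')); // true
console.log(hasBalancedQuotes('id,note\n1,"never closed\n2,x\n')); // false
```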

RFC 4180 in Practice

CSV is specified in RFC 4180 (October 2005), which is an informational RFC documenting the format rather than mandating it - which is why "CSV" in the wild is actually a family of related formats. RFC 4180 describes: one record per line (although a quoted field may itself contain line breaks), fields separated by commas, optional quoting with double quotes, doubled quotes to escape a literal quote inside a quoted field, and CRLF line endings. Real CSVs violate all of these daily. Tab-separated values are sometimes called CSV and are slightly safer because tabs rarely appear in data. Semicolon-separated CSV is common in European locales where the comma is the decimal separator. Excel's CSV export is especially quirky - it uses the system list separator, which is locale-dependent, and will sometimes quote fields unnecessarily.

Alternative Approaches

Command-line conversion is easiest with csvkit: csvjson file.csv produces typed JSON output. jq has no built-in CSV reader; its @csv filter only formats output, so CSV input has to be split by hand or converted first. Python's csv module in the standard library is reliable; csv.Sniffer can guess the dialect (delimiter and quoting) and whether a header row is present, but it does not infer column types. The xsv tool, written in Rust, is a fast option for files in the gigabyte range. Excel and Google Sheets can export to JSON via add-ons. This in-browser tool is best for small to medium files where you want the result immediately; for multi-gigabyte logs or enterprise CSVs with non-standard quirks, use one of the command-line tools above.

Frequently Asked Questions

Does the parser follow RFC 4180?

Yes, with sensible extensions for the real world. Quotes, doubled quotes inside quoted fields, and embedded newlines inside quoted fields all behave per the spec. The extensions include auto-detecting tab, semicolon, and pipe delimiters (the RFC mandates only comma), tolerating both \n and \r\n line endings (the RFC mandates CRLF), and treating short rows as having trailing empty columns rather than raising an error.

Why are numeric-looking cells still strings in the JSON?

Because CSV has no types. A cell like 01234 could be a leading-zero integer, a ZIP code, or a phone number prefix - and casting it to a JavaScript number would silently drop the leading zero. Rather than guess, the converter leaves every cell as a string; you can cast specific columns yourself with a one-line map in JavaScript when you know which are actually numeric.
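
One way to do that casting is sketched below. `castNumeric` and the column names are hypothetical, and mapping empty or non-numeric cells to null is an assumption you may want to change.

```javascript
// Sample converter output: every value is a string.
const records = [
  { sku: "00123", qty: "4", price: "9.50" },
  { sku: "00456", qty: "10", price: "" },
];

// Hypothetical helper: cast only the columns you know are numeric.
function castNumeric(rows, columns) {
  return rows.map((row) => {
    const out = { ...row };
    for (const col of columns) {
      const raw = row[col];
      // Empty, missing, or non-numeric cells become null (assumption).
      const n = raw === undefined || raw.trim() === "" ? NaN : Number(raw);
      out[col] = Number.isFinite(n) ? n : null;
    }
    return out;
  });
}

console.log(JSON.stringify(castNumeric(records, ["qty", "price"])));
// [{"sku":"00123","qty":4,"price":9.5},{"sku":"00456","qty":10,"price":null}]
```

The sku column is left out of the list on purpose, so its leading zeros survive.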

How does it handle quoted fields with commas inside?

Correctly. The state machine switches into "inside quoted field" mode when it encounters an opening quote and only leaves that mode when it sees a closing quote not followed by another quote (doubled quotes mean literal quote). While in that mode, delimiters and newlines are treated as content rather than structural markers. Unclosed quotes will propagate the mode until the end of input, which is usually visible as one gigantic field in the output.

Can I convert tab-separated values (TSV)?

Yes. Auto-detect recognises tabs by counting them on the first line - if tabs outnumber commas, the file is parsed as TSV. You can also pick Tab explicitly in the delimiter dropdown to force it. TSV is generally a safer format than comma-separated because tab characters almost never appear inside data fields; if you can choose the format at the source, prefer TSV.

Is my spreadsheet data sent to a server?

No. The converter is Preact code shipped with the page bundle and runs inside your browser tab. Clicking Convert calls a local parser function; there is no fetch request, no storage in localStorage, and no background service worker processing your data. That matters because CSV exports frequently contain PII - customer names, email addresses, account numbers - and with this tool your data never reaches a server.

What happens to empty rows?

Fully empty rows (either a blank line or a line with only delimiters) are skipped rather than producing an object with empty-string values for every field. This avoids polluting the output with ghost records that real CSVs sometimes accumulate between actual data rows. If you need those as explicit records, you can split on your own line boundaries before pasting.
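
The skipping rule can be expressed as a filter over parsed rows. This is a sketch under assumptions: `dropEmptyRows` is a hypothetical name, and it treats only exact empty strings as empty, so a cell containing spaces keeps its row unless trim is applied first.

```javascript
// Hypothetical filter: keep a row only if at least one cell is non-empty.
function dropEmptyRows(rows) {
  return rows.filter((cells) => cells.some((cell) => cell !== ""));
}

const parsedRows = [
  ["1", "Ada"],
  [""],          // blank line
  ["", ""],      // line containing only a delimiter
  ["2", ""],     // partly filled row is kept
];
console.log(JSON.stringify(dropEmptyRows(parsedRows)));
// [["1","Ada"],["2",""]]
```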

How large a file can I paste?

The browser's clipboard can handle tens of megabytes, and the parser is linear in input size. In practice files up to about five megabytes convert in a few seconds on a typical laptop. Larger files will freeze the tab temporarily because parsing runs on the main thread. For multi-gigabyte files use xsv or csvkit on the command line, where streaming parsers avoid loading the entire document into memory.

Will duplicate column names confuse the output?

Yes. A JavaScript object holds only one value per key, so if your header has two columns named email, only the last one wins per row. The parser does not rename them. If your CSV really has duplicates, rewrite the header to disambiguate (email_home, email_work) before converting. Tools like csvkit report duplicates as warnings when loading.
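
A short sketch of both the problem and a check for it follows. `findDuplicateHeaders` is a hypothetical helper you could run on the header row before converting.

```javascript
// Hypothetical check: list header names that appear more than once.
function findDuplicateHeaders(header) {
  const seen = new Set();
  const dupes = new Set();
  for (const name of header) {
    if (seen.has(name)) dupes.add(name);
    seen.add(name);
  }
  return [...dupes];
}

const header = ["id", "email", "email"];
const cells = ["7", "home@example.com", "work@example.com"];

// Building the object key by key shows why the last column wins.
const record = {};
header.forEach((key, i) => {
  record[key] = cells[i];
});

console.log(JSON.stringify(record)); // {"id":"7","email":"work@example.com"}
console.log(JSON.stringify(findDuplicateHeaders(header))); // ["email"]
```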

Does it support CSV files with a BOM?

Partially. A UTF-8 BOM at the start of the file will appear as an invisible character in the first header cell. If downstream code keys off a specific header name it may fail to match. The cleanest fix is to save without BOM from Excel or, with GNU sed, run sed -i '1s/^\xef\xbb\xbf//' file.csv before pasting. Some browsers strip the BOM on clipboard paste, so the issue is intermittent.
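
If you post-process the text or the JSON in JavaScript, the BOM can also be removed there. Once the bytes are decoded into a string, the UTF-8 BOM shows up as the single character U+FEFF. `stripBom` is a hypothetical helper name.

```javascript
// Hypothetical helper: drop a leading U+FEFF (the decoded BOM) if present.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

const pasted = "\uFEFFid,name\n1,Ada";
console.log(pasted.startsWith("id")); // false
console.log(stripBom(pasted).startsWith("id")); // true
```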

How are nested structures represented?

CSV is flat. The converter produces an array of flat objects with string values - there is no way to express "this cell is actually a JSON object" within CSV itself. If your data has nested fields, encode them as JSON strings inside a cell (the converter will leave them as strings), or pick a richer interchange format like newline-delimited JSON.
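
If you do encode nested data as a JSON string inside a cell, a post-processing step can expand it after conversion. The sketch below assumes a hypothetical helper `expandJsonColumn` and a column called meta. Note that JSON.parse also turns cells such as 123 or true into non-string values, so apply it only to columns you know hold JSON.

```javascript
// Hypothetical helper: parse one column as JSON, leaving invalid cells as-is.
function expandJsonColumn(records, column) {
  return records.map((record) => {
    try {
      return { ...record, [column]: JSON.parse(record[column]) };
    } catch {
      return record; // not valid JSON: keep the original string
    }
  });
}

// The CSV cell "{""tags"":[""a"",""b""]}" arrives as the string below.
const converted = [
  { id: "1", meta: '{"tags":["a","b"]}' },
  { id: "2", meta: "not json" },
];
console.log(JSON.stringify(expandJsonColumn(converted, "meta")));
// [{"id":"1","meta":{"tags":["a","b"]}},{"id":"2","meta":"not json"}]
```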
