Skip to main content

SQL Formatter

Format and beautify SQL queries with keyword uppercasing.

Reviewed by · Last reviewed

How to Use the SQL Formatter

  1. Paste your query into the input pane. SELECT statements, data-modifying INSERT/UPDATE/DELETE, DDL like CREATE TABLE, and multi-statement scripts separated by semicolons are all accepted.
  2. Pick indent depth: 2 spaces (common in PostgreSQL and MySQL team styles) or 4 spaces (Oracle SQL Developer default).
  3. Click Format. Keywords are uppercased, clause boundaries gain newlines, and subqueries inside parentheses indent one level deeper so nesting is visible at a glance.
  4. Copy the result back into your query runner, BI tool, or pull request. The formatter preserves string literals and identifiers so your query still returns the same rows.

What It Does Under the Hood

The formatter operates on tokens rather than a SQL parse tree. It walks the query character by character, tracking string literals, identifier quotes (", \`, [ ]), and parenthesis depth. Whenever it hits a top-level clause keyword - SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT, UNION, WITH, and the join variants - it emits a newline and aligns the keyword at the current indent. Join conditions starting with ON stay on the following line to keep the table name visible on the join line.

Case conversion uppercases every reserved word recognised in the token list while leaving identifiers and string contents alone. Case inside quoted identifiers is preserved because in PostgreSQL and standards-compliant databases "Users" and "users" are different tables. The tool does not attempt to detect dialect-specific features like MySQL USE INDEX hints, PostgreSQL LATERAL joins, or SQL Server MERGE statements as specially indented constructs - they are preserved verbatim and indented by parenthesis depth.

When This Tool Helps

  • Making a generated query from an ORM (Prisma, SQLAlchemy, ActiveRecord) readable for manual inspection during a slow query investigation.
  • Preparing a query for a pull request or migration file where indentation shows the logical structure.
  • Reading long analytics queries pulled out of Looker, Tableau, or Metabase logs.
  • Formatting queries inside comments or docstrings where your IDE does not know how to handle SQL.
  • Pasting a stringified query out of application logs back into a readable shape before explaining it.
  • Teaching SQL: uppercase keywords make the grammar jump out for learners scanning an example.

Edge Cases and Limits

  • Window functions. OVER (PARTITION BY x ORDER BY y) formats correctly because the partition and order clauses live inside the parentheses and are indented by depth. Very long window definitions may not wrap as nicely as a dialect-aware formatter would.
  • CTEs. WITH cte AS (...) receives a newline at the opening parenthesis and indents the inner select. Multiple CTEs separated by commas each get their own block.
  • Strings with keywords. A literal like 'select your language' is preserved untouched because the tokenizer tracks string state. Case inside strings is never changed.
  • Comments. Single-line -- comments and block /* ... */ comments are preserved, but may not land on the exact line they were on originally if they sit in the middle of a clause.
  • Dynamic SQL. Queries built by string concatenation in application code often contain placeholders like ? or $1. Those are treated as plain identifiers and pass through unchanged.

SQL Standards Context

SQL is standardised as ISO/IEC 9075. The most-cited editions are SQL:1999 (common table expressions, recursive queries, triggers), SQL:2003 (window functions, XML features), SQL:2008 (TRUNCATE, INSTEAD OF triggers), and SQL:2016 (JSON functions, row pattern matching). In practice each database ships its own dialect with extensions, so a query that formats correctly here is not automatically portable between PostgreSQL, MySQL, and SQL Server. Keywords like LIMIT (PostgreSQL/MySQL/SQLite), TOP (SQL Server), and FETCH FIRST (standard) all do similar things but are not interchangeable. Uppercasing keywords is a stylistic convention, not a requirement - ANSI SQL is case-insensitive for keywords - but it makes queries easier to scan and matches the conventions used in every major SQL textbook.

How It Stacks Up Against Other Formatters

Dedicated tools like sql-formatter and pgFormatter do build parse trees and support dialect-specific keywords, alignment of SELECT columns, and commenting conventions that a regex-based formatter cannot match. If you format SQL daily, install one of those as an editor extension. IDE-integrated formatters like those inside DataGrip or DBeaver are also excellent because they know the schema of your connection and can indent multi-CTE queries beautifully. This in-browser tool is for the one-off case: you have a query from a log or a colleague, you want to read it, and you do not want to open a full IDE. For production style enforcement, check the query into a repo and run a real formatter as part of CI.

Frequently Asked Questions

Does this formatter parse SQL semantically?

No. It is a token-level formatter that recognises about 80 reserved keywords and tracks parentheses, strings, and identifier quotes. It does not build an abstract syntax tree, so it cannot validate that the columns you select actually exist in the tables you are joining. That makes it extremely tolerant of dialect differences - PostgreSQL, MySQL, SQL Server, Oracle, SQLite, Redshift, Snowflake, BigQuery queries all pass through - but also means it cannot do deep reformatting like aligning <code>SELECT</code> columns or breaking long <code>WHERE</code> clauses.

Will it change the result of my query?

No. Formatting only changes whitespace and keyword case. SQL reserved words are case-insensitive per the ANSI standard, so uppercasing <code>select</code> to <code>SELECT</code> makes no difference to any database engine. String literals and quoted identifiers are left byte-for-byte identical. Running the formatted query returns exactly the same rows as running the original source.

Is my SQL ever uploaded?

No. The formatter is a Preact component that ships with the page bundle and runs entirely in your browser tab. The Format button invokes a local JavaScript function; there is no fetch or websocket to a remote service. Because people often paste queries that include schema names, table names, and literal example data, that local-only guarantee matters - nothing you paste reaches any server.

How are subqueries indented?

Every opening parenthesis pushes the indent by one level, so a subquery in the <code>FROM</code> clause lines up under its parent <code>SELECT</code>. Deeply nested subqueries receive proportionally deeper indentation, which makes the nesting visible even for queries that span many lines. This approach does not know the difference between a scalar subquery and a correlated subquery, but it reads cleanly in both cases.

What about dialect-specific features like PIVOT or MATCH_RECOGNIZE?

Those are preserved verbatim but are not specially indented. The formatter will still produce readable output because it uppercases the recognised keywords around them and indents by parenthesis depth. If you frequently work with a specific dialect, a dialect-aware formatter like <code>pgFormatter</code> for PostgreSQL or a BigQuery-aware formatter will produce noticeably better results for vendor-specific syntax.

Does it normalise identifier quoting?

No. If your source uses backticks (MySQL style), double quotes (ANSI and PostgreSQL), or square brackets (SQL Server style), the formatter leaves them exactly as written. Converting between quoting styles has real semantic consequences - MySQL backticks are case-insensitive on Linux and case-sensitive on macOS depending on filesystem settings - so conservatism is the safer default.

Can I format a script with multiple statements separated by semicolons?

Yes. The formatter treats a top-level semicolon as a statement terminator and inserts a blank line after it, so multi-statement scripts produce readable blocks. Transactional statements like <code>BEGIN</code>, <code>COMMIT</code>, and <code>ROLLBACK</code> are uppercased as keywords but are not specially grouped with the statements they enclose.

How does it handle SQL comments?

Both <code>-- line comments</code> and <code>/* block comments */</code> are preserved in their original positions. Comments are useful markers for schema migrations and for documenting the purpose of complex CTEs, so the formatter never strips them. A line comment will push any keyword following it onto a new line, which matches how SQL parsers treat the token.

Should I commit formatted SQL to git?

Yes for migrations, no for generated queries. Hand-written migrations and stored procedures benefit from consistent formatting because they are reviewed by humans. Queries generated by an ORM will be regenerated at runtime and checking them in as formatted files just creates merge conflicts. A good rule: format source-of-truth SQL files, leave tool output unformatted or uncommitted.

Why does my <code>JOIN</code> keyword sometimes appear on the same line as <code>FROM</code>?

The formatter places a newline before major clause keywords, and <code>JOIN</code> counts as a major clause. If your source already had the JOIN on a new line it will stay on one; if the JOIN was part of a longer expression wrapped in parentheses, the formatter keeps parenthesis depth as the controlling indent. If the result does not match your team style, consider adopting a parse-tree-based formatter that always breaks before every JOIN.

More Developer Tools