jq: Filter and Transform JSON From the Command Line

jq is the C-based JSON processor (1.8.1, 2025) that filters, reshapes, and pipes JSON inside shell scripts. Real-world recipes for kubectl, GitHub API, and AWS CLI.

jq is a small C binary that filters and transforms JSON inside a shell pipeline. Stephen Dolan released it in 2012, and the current 1.8.1 release (July 2025) — 34.6k GitHub stars and zero runtime dependencies — is still the de-facto JSON awk for kubectl, curl, the GitHub REST API, and AWS CLI workflows. Without it, finding the names of every Pending pod across a cluster means writing a throwaway Python script; with it, the same job is one line: kubectl get pods -A -o json | jq -r '.items[] | select(.status.phase=="Pending") | .metadata.name'. The language has barely changed since 1.6, but 1.7 cut cold-start latency from ~50 ms to single digits, 1.8 added --raw-output0 for safe xargs -0 piping, and a Rust rewrite called jaq now wins 20 out of 31 published benchmarks if you push enough JSON through it.

TL;DR

  • jq 1.8.1 is the current stable release; it patches CVE-2025-49014 (heap use-after-free in strftime) — upgrade if you handle untrusted input.
  • Use -r for raw strings, -c for compact output, -e to make missing keys exit non-zero in scripts.
  • select(.foo == "bar"), map(.x), group_by(.k), and @csv/@tsv handle 90% of real-world filter, reshape, and export jobs.
  • Reach for jaq (Rust) when piping > 100 MB; gojq (Go) when you need a library inside a Go binary.
  • jq is for shell glue. If a query starts spanning multiple lines and uses reduce, write a Python script.

What Is jq?

jq is a command-line JSON processor written in portable C. You feed it JSON on stdin (or a file path), give it a filter expression in jq's small functional language, and it writes JSON to stdout. Every shell idiom you know from grep, awk, and sed has a JSON-aware equivalent in jq, with one critical difference: jq understands the structure. Indenting, quoting, and escaping are handled for you.

Anatomy of a jq invocation
# Read JSON from stdin, filter, write JSON to stdout
echo '{"name":"alice","age":30}' | jq '.name'
# "alice"

# Read from a file, output raw strings (no JSON quoting)
jq -r '.users[].email' users.json

# Read from URL via curl, compact output (one object per line)
curl -s https://api.github.com/repos/jqlang/jq | jq -c '{stars: .stargazers_count, lang: .language}'

The binary is ~700 KB on Linux x86_64. Install on macOS with brew install jq, on Debian/Ubuntu with apt-get install jq, on Windows with winget install jqlang.jq, or grab a static binary from GitHub Releases if you're scripting inside a Docker image. For copy-paste filters, the jq cheatsheet has every operator on one page.

How Does jq Filter Syntax Work?

Every jq program is a pipeline of filters separated by |. Each filter takes a JSON value as input and produces zero, one, or many JSON values as output. The identity filter . returns the input unchanged — that's why jq . works as a pretty-printer.

Filter      Reads             Example output
.           identity          returns the input unchanged
.foo        object access     .foo of {"foo": 1} → 1
.foo?       optional access   no error if .foo is missing or the input is not an object
.[]         array iteration   [1,2,3] → 1, then 2, then 3 (three outputs)
.[2]        array index       [10,20,30] → 30
.[2:4]      slice             [10,20,30,40,50] → [30,40]
|           pipe              .users[] | .email — emit each user, then take .email
,           comma             .a, .b — emit .a, then .b for each input
select(f)   filter            select(.age > 30) — drop inputs where the filter is false
map(f)      array map         map(.id) on [{"id":1},{"id":2}] → [1,2]

The mental model that unlocks jq: filters are streams. .users[] doesn't return an array — it emits each user as a separate value, one at a time. | select(...) on the next stage either passes that value through or drops it. Wrap the whole thing in [ ... ] to collect the stream back into an array.

Streams vs. arrays
echo '[{"n":1},{"n":2},{"n":3}]' | jq '.[] | select(.n > 1)'
# {"n": 2}
# {"n": 3}     ← two separate JSON outputs

echo '[{"n":1},{"n":2},{"n":3}]' | jq '[.[] | select(.n > 1)]'
# [{"n": 2}, {"n": 3}]   ← one array, because of the surrounding [ ]

Which jq Flags Should You Memorize?

Eight flags handle most scripting. Skip the rest until you hit a specific need.

  • -r: raw output, strips JSON quotes from string results. Use it every time you pipe a string into another shell command.
  • -c: compact output, one JSON value per line with no pretty indent. Use it when streaming to grep or awk, or appending to a JSONL file.
  • -e: exit non-zero on null, false, or no output. Use it inside if jq -e ...; then checks.
  • -s: slurp, read all inputs into one big array. Use it when combining multiple JSON files into a single value.
  • --arg name value: pass a shell variable as a string. Avoids embedding "$VAR" and the shell-quoting hell that follows.
  • --argjson name value: pass a shell variable as parsed JSON. Use it for numbers, booleans, or pre-built objects.
  • -n: null input, start with null instead of reading stdin. Use it when building JSON from --arg variables alone.
  • --raw-output0: NUL-separated raw output (jq 1.8+). Use it for safe xargs -0 piping when names contain spaces.
--arg and --argjson save you from quote hell
# WRONG — breaks if $USER contains a quote, dollar sign, or newline
jq ".users[] | select(.name == \"$USER\")" data.json

# RIGHT — let jq handle the quoting
jq --arg u "$USER" '.users[] | select(.name == $u)' data.json

# Pass numbers or objects with --argjson (not --arg, which forces string)
jq --argjson min 30 '.users[] | select(.age >= $min)' data.json
jq --argjson tags '["api","prod"]' '.items[] | select(.tags | inside($tags))' data.json
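Two of the flags above are easier to grasp with a concrete sketch. Assume a.json, b.json, and config.json are small hypothetical files:

```shell
# -s (slurp): read every input file into one array, then operate on it
jq -s 'add' a.json b.json            # merge two top-level objects; later keys win
jq -s 'map(.count) | add' *.json     # sum a field across every file

# -e: exit non-zero when the result is null or false, so shell `if` works
if jq -e '.features.dark_mode' config.json > /dev/null; then
  echo "dark mode is enabled"
fi
```

The -e pattern is the idiomatic way to branch on a JSON value in a script without parsing jq's output.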

How Do You Filter and Reshape JSON Arrays?

Three filters cover almost every transformation: select for filtering, map for projecting, and group_by + map for aggregations.

Filter, project, aggregate
# Filter: keep only users older than 30
jq '.users | map(select(.age > 30))' data.json

# Project: pick a subset of fields, rename one
jq '.users | map({id, name, primary_email: .email})' data.json

# Aggregate: count users per country, sorted by count desc
jq '.users | group_by(.country) | map({country: .[0].country, n: length}) | sort_by(-.n)' data.json

# Top 5 by score
jq '.scores | sort_by(.points) | reverse | .[:5]' data.json

# Flatten a nested array of arrays
jq '.regions | map(.cities) | add' data.json

# Sum a numeric field
jq '[.orders[].total] | add' data.json

Two non-obvious facts: add on an array of strings concatenates them (["a","b"] | add → "ab"), and add on an array of objects merges them (later keys win). That single function replaces three different helpers in most languages.
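Both behaviors are quick to verify at the prompt with -n:

```shell
# Strings: add concatenates
jq -n '["a","b"] | add'
# "ab"

# Objects: add merges, and later keys win on conflict (x ends up 9)
jq -nc '[{"a":1,"x":1},{"b":2,"x":9}] | add'
# {"a":1,"x":9,"b":2}
```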

How Do You Use jq With kubectl, GitHub, and AWS?

These three CLIs return JSON the moment you add a flag, and jq is the standard glue. The recipes below are the ones that pay off the most often in production debugging.

kubectl

Pod and node debugging recipes
# Names of every Pending pod in the cluster
kubectl get pods -A -o json \
  | jq -r '.items[] | select(.status.phase=="Pending") | "\(.metadata.namespace)/\(.metadata.name)"'

# Pods with at least one container restart in the last hour
kubectl get pods -A -o json \
  | jq -r '.items[]
      | select(.status.containerStatuses // [] | any(.restartCount > 0))
      | "\(.metadata.namespace)/\(.metadata.name) restarts=\(.status.containerStatuses[].restartCount)"'

# Containers missing CPU or memory requests (a common cluster smell)
kubectl get pods -A -o json \
  | jq -r '.items[].spec.containers[]
      | select(.resources.requests.cpu == null or .resources.requests.memory == null)
      | .name'

# Group pods by node, count them, sort descending
kubectl get pods -A -o json \
  | jq '.items | group_by(.spec.nodeName) | map({node: .[0].spec.nodeName, n: length}) | sort_by(-.n)'

# Decode every value in a Secret to plain text
kubectl get secret my-secret -o json \
  | jq '.data | with_entries(.value |= @base64d)'

GitHub REST API

Open PRs, stale branches, repo stats
# Open PRs not yet reviewed, sorted by age
curl -s -H "Authorization: Bearer $GH_TOKEN" \
  "https://api.github.com/repos/jqlang/jq/pulls?state=open&per_page=100" \
  | jq -r 'map(select(.requested_reviewers == [] and .review_comments == 0))
            | sort_by(.created_at)
            | .[] | "\(.created_at[:10])  #\(.number)  \(.title)"'

# Repos in an org sorted by stars, top 10
curl -s -H "Authorization: Bearer $GH_TOKEN" \
  "https://api.github.com/orgs/jqlang/repos?per_page=100" \
  | jq -r 'sort_by(-.stargazers_count) | .[:10] | .[] | [.name, .stargazers_count] | @tsv'

# Branches that haven't been pushed to in > 90 days
curl -s -H "Authorization: Bearer $GH_TOKEN" \
  "https://api.github.com/repos/$ORG/$REPO/branches?per_page=100" \
  | jq -r --arg now "$(date -u +%s)" \
      '.[] | select((.commit.commit.committer.date | fromdateiso8601) < ($now|tonumber - 7776000))
           | .name'

Two GitHub API gotchas worth flagging. Pagination is silent — without ?per_page=100 and a follow-the-Link-header loop, you get the first 30 items and a misleading result. And timestamps are ISO 8601 strings, not epoch numbers; pipe through fromdateiso8601 before you compare dates.
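A minimal pagination loop, as a sketch: it walks ?page=N until an empty page comes back, which is simpler than parsing the Link header at the cost of one extra request. $GH_TOKEN, $ORG, and $REPO are the same placeholder variables used above.

```shell
# Fetch every open PR across all pages, printing number and title.
page=1
while :; do
  chunk=$(curl -s -H "Authorization: Bearer $GH_TOKEN" \
    "https://api.github.com/repos/$ORG/$REPO/pulls?state=open&per_page=100&page=$page")
  [ "$(jq 'length' <<<"$chunk")" -eq 0 ] && break   # empty page = no more results
  jq -r '.[] | "#\(.number)  \(.title)"' <<<"$chunk"
  page=$((page + 1))
done
```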

AWS CLI

EC2, S3, IAM debugging
# All running EC2 instance IDs in a region, with their private IPs
aws ec2 describe-instances --region us-east-1 \
  | jq -r '.Reservations[].Instances[]
      | select(.State.Name == "running")
      | [.InstanceId, .PrivateIpAddress] | @tsv'

# S3 buckets without server-side encryption (security audit)
aws s3api list-buckets \
  | jq -r '.Buckets[].Name' \
  | while read -r b; do
      enc=$(aws s3api get-bucket-encryption --bucket "$b" 2>/dev/null | jq -e .ServerSideEncryptionConfiguration > /dev/null && echo on || echo OFF)
      echo "$enc  $b"
    done

# IAM users with console access but no MFA
aws iam list-users \
  | jq -r '.Users[].UserName' \
  | while read -r u; do
      mfa=$(aws iam list-mfa-devices --user-name "$u" | jq '.MFADevices | length')
      [ "$mfa" = "0" ] && echo "$u  no MFA"
    done

AWS responses nest deeply (.Reservations[].Instances[], .SecurityGroups[].IpPermissions[]) and the field names are PascalCase, not camelCase — running aws ec2 describe-instances | jq keys on a sample is the fastest way to find the path to whatever you're looking for. AWS's own --query flag uses JMESPath instead of jq, which is fine for one-off lookups but breaks the moment you need group_by or reduce.
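When you don't know the path at all, let jq enumerate it: paths(scalars) emits the path to every leaf value, which you can flatten into grep-able dotted strings. (The privateip search term below is just an example.)

```shell
# List every leaf path in a response as a dotted string, then search it
aws ec2 describe-instances --region us-east-1 \
  | jq -r '[paths(scalars) | map(tostring) | join(".")] | .[]' \
  | grep -i privateip | sort -u
# e.g. Reservations.0.Instances.0.PrivateIpAddress
```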

How Do You Build JSON From Scratch With jq?

Use -n (null input) plus --arg and --argjson to assemble JSON without a heredoc. This is the right way to build a webhook payload or a kubectl apply -f - manifest from shell variables.

Construct JSON from shell variables
COMMIT="$(git rev-parse HEAD)"
BRANCH="$(git branch --show-current)"
COUNT=$(git rev-list --count HEAD)

jq -n \
  --arg commit "$COMMIT" \
  --arg branch "$BRANCH" \
  --argjson count "$COUNT" \
  '{commit: $commit, branch: $branch, commit_count: $count, ts: now}' \
  | curl -s -X POST -H "Content-Type: application/json" -d @- https://hooks.example.com/build

now evaluates to the current Unix timestamp. The whole pipeline produces fully quoted, correctly escaped JSON without ever calling printf with a format string — hand-assembled payloads are exactly the class of bug that ships broken (or secret-leaking) webhook requests the moment a value contains a quote.
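The same -n/--arg pattern drives the kubectl apply -f - case mentioned above. A minimal sketch, with $NAME and $NS as placeholder shell variables:

```shell
# Build a ConfigMap manifest in jq and pipe it straight to kubectl
# (kubectl accepts JSON manifests as well as YAML).
jq -n --arg name "$NAME" --arg ns "$NS" '{
  apiVersion: "v1",
  kind: "ConfigMap",
  metadata: {name: $name, namespace: $ns},
  data: {built_at: (now | todate)}
}' | kubectl apply -f -
```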

What Are the Most Common jq Pitfalls?

  • Forgetting -r: strings come out wrapped in "double quotes". Fix: add -r whenever you pipe into another command.
  • Embedding $VAR in the filter: breaks on quotes, dollar signs, and newlines. Fix: pass values via --arg or --argjson and reference them as $name.
  • Using .foo on missing keys: jq prints null and the script proceeds silently. Fix: use .foo? for optional access, or jq -e to fail loudly.
  • Forgetting to re-collect after select: select emits a stream, not an array. Fix: wrap in [ ... ], e.g. [.users[] | select(.active)], to keep the array shape.
  • Chained .[]: Cartesian product, output explodes. Fix: wrap in [ ] at each level, or pipe through map().
  • Comparing numbers to strings: "30" != 30, because JSON strings and numbers never compare equal. Fix: pipe through tonumber or tostring first.
  • Doing date math on strings: ISO 8601 sorts correctly, but arithmetic on it doesn't work. Fix: pipe through fromdateiso8601, then do the arithmetic.
  • Multiple inputs without -s: each JSON document is evaluated independently. Fix: add -s to wrap the whole input stream into one array.
  • Piping huge files through jq: memory blows up, OOM kill. Fix: use --stream for a streaming parse, or jaq for speed.
  • Trusting null != absent: jq treats null and missing keys the same in == checks. Fix: use has("foo") to distinguish "not set" from "set to null".
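The null-versus-absent pitfall is the one that bites silently; four one-liners show the difference:

```shell
# == null cannot tell "explicitly null" from "absent"...
jq -n '{"foo": null} | .foo == null'   # true
jq -n '{} | .foo == null'              # also true

# ...but has() can
jq -n '{"foo": null} | has("foo")'     # true
jq -n '{} | has("foo")'                # false
```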

jq vs. jaq vs. gojq — Which Should You Use?

Three drop-in compatible implementations, three different reasons to pick one. The maintainer of jaq publishes a benchmark suite that runs the latest releases of all three; the 2025 round had jaq 3.0 winning 20 of 31 benchmarks, jq 1.8.1 winning 5, and gojq 0.12.18 winning 6.

  • jq (C): the reference implementation, pre-installed everywhere, smallest binary. Trade-off: slowest on large inputs, and some Unicode edge cases lag behind jaq.
  • jaq (Rust): 2–10× faster on big files; reads YAML, TOML, CBOR, and XML natively. Trade-off: no resource limits, so a malicious filter can OOM the process.
  • gojq (Go): importable as a Go library; native flatten implementation. Trade-off: slowest startup, and some semantic differences from jq.

Practical recommendation: stick with jq for shell scripts, CI pipelines, and Dockerfiles — it's the one your collaborators have installed, and the speed difference on a 100 KB kubectl response is ~5 ms vs. ~3 ms. Reach for jaq when you're processing multi-megabyte files in a tight loop, or when you want to filter YAML/TOML without a separate yq step. Reach for gojq only if you're embedding it as a library.

When Should You Not Use jq?

  • Multi-line filters with reduce or recursion. Once a jq program spans more than three lines, debugging it is harder than rewriting in Python with json.load. The break-even is roughly when you reach for a named variable.
  • JSON Lines (NDJSON) at scale. jq can read a JSONL stream, but for a 10 GB log file Miller (mlr) or DuckDB's read_json_auto finishes in a fraction of the time and lets you SQL the data.
  • YAML, TOML, or CSV input. Use yq for YAML, jaq for native multi-format, or convert with yq -o=json . file.yaml | jq ....
  • Mutating files in place. jq has no -i flag — write to a temp file and mv it, or accept that jq '.x = 1' f.json > f.json truncates the file before reading it. (This bites every jq user once.)
  • Untrusted JSON in a library. jq is a CLI; calling out to it from a long-running service spawns a subprocess per query. Use a real JSON path library (Jackson, jsonpath-ng) inside services.

Frequently Asked Questions

What is the latest version of jq?

jq 1.8.1, released 2025-07-01. It is a security and stability patch over 1.8.0, fixing CVE-2025-49014 (heap use-after-free in strftime) and reverting a 1.8.0 change to reduce/foreach state that caused a serious performance regression.

Is jq still maintained?

Yes. After a multi-year quiet period, the project moved to the jqlang GitHub organisation in 2023 and has shipped 1.7, 1.7.1, 1.8.0, and 1.8.1 since. New releases land roughly every 6–12 months and patch security issues within weeks of disclosure.

How do I pass a shell variable to a jq filter safely?

Use --arg name value for strings and --argjson name value for numbers, booleans, or pre-built JSON. Reference the value inside the filter as $name. Never embed "$VAR" directly in the filter — it breaks the moment the value contains a quote, dollar sign, or newline.

Why does jq print my strings with double quotes?

jq emits valid JSON by default, and JSON strings are quoted. Add -r (raw output) when piping into another shell command. Use --raw-output0 (jq 1.8+) for NUL-separated output that pipes safely into xargs -0 when names contain spaces or newlines.

How do I edit a JSON file in place with jq?

jq has no in-place flag. The pattern is: jq '<filter>' f.json > f.json.tmp && mv f.json.tmp f.json. Writing back to the same file with > truncates it before jq reads it, leaving you with an empty file.

Is jq faster than jaq or gojq?

Generally no. The jaq benchmark suite shows jaq 3.0 fastest on 20 of 31 tests, jq 1.8.1 fastest on 5, gojq 0.12.18 fastest on 6. jq remains plenty fast for shell-pipeline use (sub-100 ms on most kubectl responses) and is the most widely installed.


