jq trickery
Here are some jq tricks that I learned over the years to parse JSON. By
“learned” I mean that I know my way around, not that I fully understand them
all. This stuff is magical.
I’ll use the JSON below in all the examples, but won’t type it out every time:
{
  "foo": "bar",
  "numbers": [
    {
      "one": 1
    },
    {
      "two": 2
    },
    {
      "tree": "NaN"
    },
    {
      "foo": "NaN"
    }
  ]
}
Export the JSON if you want to copy-pasta the examples:
$ export my_json_above='{"foo": "bar", "numbers": [{"one": 1}, {"two": 2}, {"tree": "NaN"}, {"foo": "NaN"}]}'
Now the trickeryfoo.
Pretty Print
Maybe the most used feature of jq is pretty-printing JSON. By pretty-print,
jq means: split the input into multiple lines and align them vertically in a
meaningful and colorful way. For example:
$ echo '{"foo": "bar", "numbers": [{"one": 1}, {"two": 2}, {"tree": "NaN"}, {"foo": "NaN"}]}' | jq
{
  "foo": "bar",
  "numbers": [
    {
      "one": 1
    },
    {
      "two": 2
    },
    {
      "tree": "NaN"
    },
    {
      "foo": "NaN"
    }
  ]
}
In this example, jq operates on the entire JSON. jq can also operate on
“parts” of the input JSON, such as internal/nested objects. jq uses “filters”
to parse/modify its input.
In this example, the filter used is the identity filter, written as a single
dot (.). This filter doesn’t modify the JSON; instead, it returns the input
unchanged. The identity filter is like multiplying a number by 1. Running
jq . is equivalent to running jq with no filter at all.
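For comparison, here is a minimal Python sketch of the same pretty-printing idea: parse the text and serialize it back with indentation. The helper name pretty is made up for this illustration, and the sketch only reproduces the layout, not jq's colors.

```python
import json

# Same sample document used throughout the post.
raw = '{"foo": "bar", "numbers": [{"one": 1}, {"two": 2}, {"tree": "NaN"}, {"foo": "NaN"}]}'


def pretty(text: str) -> str:
    """Parse a JSON string and serialize it again, indented by two spaces.

    `pretty` is a hypothetical helper name, not part of jq or the json module.
    """
    return json.dumps(json.loads(text), indent=2)


print(pretty(raw))
```

Like the identity filter, this round trip leaves the data untouched and only changes how it is laid out.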
Individual object operations
To work with an internal object or value:
- to get the value of the key foo, the filter is .foo:
    $ echo "$my_json_above" | jq .foo
    "bar"
  In this case, the . in .foo is not the identity operator; .foo is the syntax to select the value stored under the foo key. jq prints strings as JSON, quotes included; pass -r (--raw-output) to get bar without the quotes.
- to get all keys, the filter is keys[]:
    $ echo "$my_json_above" | jq 'keys[]'
    "foo"
    "numbers"
- to get all elements of an array, the filter is []. To get all elements of the numbers array, we first select numbers and then apply [] to it:
    $ echo "$my_json_above" | jq '.numbers[]'
    {
      "one": 1
    }
    {
      "two": 2
    }
    {
      "tree": "NaN"
    }
    {
      "foo": "NaN"
    }
  Note: without the square brackets (.numbers instead of .numbers[]) you get the numbers sub-object itself: the array. With the square brackets, you get the elements of the array, one by one.
- to get all keys of the objects inside an internal array, you combine the filters with a pipe (|):
    $ echo "$my_json_above" | jq '.numbers[] | keys[]'
    "one"
    "two"
    "tree"
    "foo"
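- to get all values, the filter is values[]:
    $ echo "$my_json_above" | jq '.numbers[] | values[]'
    1
    2
    "NaN"
    "NaN"
  Note: values is a jq builtin that passes its input through unless it is null, so the [] is what actually iterates. The filter .numbers[] | .[] prints the same output here.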
- to get all keys and their values, the magic is 'keys[] as $k | "\($k), \(.[$k])"':
    $ echo "$my_json_above" | jq 'keys[] as $k | "key: _\($k)_ value: \(.[$k])"'
    "key: _foo_ value: bar"
    "key: _numbers_ value: [{\"one\":1},{\"two\":2},{\"tree\":\"NaN\"},{\"foo\":\"NaN\"}]"
  In this case, the result is similar to what the identity operator returns, but with some extra formatting text. It is more interesting to get the keys and values of internal objects:
    $ echo "$my_json_above" | jq '.numbers[] | keys[] as $k | "\($k): \(.[$k])"'
    "one: 1"
    "two: 2"
    "tree: NaN"
    "foo: NaN"
  Note: this creates a variable named k. We can reference its value with $k. The older spelling .numbers[] is used here because .numbers.[] is only accepted by newer jq releases.
- to get the value of a key, or a default one in case the key does not exist, the filter is .key // "value":
    $ echo "$my_json_above" | jq '.foos // "bars"'
    "bars"
  The default is also used when the key exists but holds null or false.
- to add/update a key/value, the filter is . * {"key": "value"}. The * operator merges two objects recursively, and keys from the right-hand side win:
    $ echo "$my_json_above" | jq '. * {"foo": "FOO", "bar": "baz"}'
    {
      "foo": "FOO",
      "numbers": [
        {
          "one": 1
        },
        {
          "two": 2
        },
        {
          "tree": "NaN"
        },
        {
          "foo": "NaN"
        }
      ],
      "bar": "baz"
    }
- to delete a key/value, the filter is del(.key):
    $ echo "$my_json_above" | jq 'del(.numbers)'
    {
      "foo": "bar"
    }
- to delete all keys that hold a specific value, the filter is del(.[] | select(. == "value")):
    $ echo "$my_json_above" | jq 'del(.[] | select(. == "bar"))'
    {
      "numbers": [
        {
          "one": 1
        },
        {
          "two": 2
        },
        {
          "tree": "NaN"
        },
        {
          "foo": "NaN"
        }
      ]
    }
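Two of the operators in the list above hide more behavior than their spelling suggests: * merges objects recursively, and // falls back not only on missing keys but on any null or false value. The Python sketch below is one way to picture that behavior. The names deep_merge and alternative are invented for this illustration and are not part of jq or of Python's standard library, so treat it as an approximation rather than a description of how jq is implemented.

```python
def deep_merge(left: dict, right: dict) -> dict:
    """Hypothetical helper approximating jq's `left * right` for objects.

    Keys from `right` win, except that when both sides hold a dict under
    the same key, those two dicts are merged with the same strategy.
    """
    merged = dict(left)  # shallow copy, so `left` itself is not modified
    for key, value in right.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def alternative(value, default):
    """Hypothetical helper approximating jq's `value // default`.

    The default is used when the value is null (None) or false;
    other values such as 0 or "" are kept.
    """
    return default if value is None or value is False else value


doc = {"foo": "bar", "numbers": [{"one": 1}, {"two": 2}, {"tree": "NaN"}, {"foo": "NaN"}]}

# Like: jq '. * {"foo": "FOO", "bar": "baz"}'
updated = deep_merge(doc, {"foo": "FOO", "bar": "baz"})
print(list(updated))  # key order: foo, numbers, bar

# Like: jq '.foos // "bars"'
print(alternative(doc.get("foos"), "bars"))

# Like: jq 'del(.[] | select(. == "bar"))'
without_bar = {key: value for key, value in doc.items() if value != "bar"}
print(list(without_bar))
```

Each helper returns a new value and leaves doc alone, which matches how jq prints a modified copy without touching its input.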
Array operations
To work with arrays:
- the array operator is square brackets. To “extract” all objects from an array, the filter is []:
    $ echo "$my_json_above" | jq '.numbers[]'
    {
      "one": 1
    }
    {
      "two": 2
    }
    {
      "tree": "NaN"
    }
    {
      "foo": "NaN"
    }
  This first “selects” the numbers sub-object and then all its elements.
- array indices start at zero, so the filter for the first element is .[0]:
    $ echo "$my_json_above" | jq '.numbers[0]'
    {
      "one": 1
    }
    $ echo "$my_json_above" | jq '.numbers[2]'
    {
      "tree": "NaN"
    }
- to get the last element of the array, check the -1 position:
    $ echo "$my_json_above" | jq '.numbers[-1]'
    {
      "foo": "NaN"
    }
    # To get the second to last:
    $ echo "$my_json_above" | jq '.numbers[-2]'
    {
      "tree": "NaN"
    }
    # You get the idea
- the filter to get the length of an object is length. In this case, we want the input for the length filter to be the array. Here comes the pipe (|), which feeds the output of one filter into the input of the next:
    $ echo "$my_json_above" | jq '.numbers | length'
    4
- the operator to append to an array is +=:
    $ echo "$my_json_above" | jq '.numbers += [{"three": 3}]'
    {
      "foo": "bar",
      "numbers": [
        {
          "one": 1
        },
        {
          "two": 2
        },
        {
          "tree": "NaN"
        },
        {
          "foo": "NaN"
        },
        {
          "three": 3
        }
      ]
    }
  Another approach is with the update operator, |=:
    $ echo "$my_json_above" | jq '.numbers |= . + [{"three": 3}]'
    {
      "foo": "bar",
      "numbers": [
        {
          "one": 1
        },
        {
          "two": 2
        },
        {
          "tree": "NaN"
        },
        {
          "foo": "NaN"
        },
        {
          "three": 3
        }
      ]
    }
- to add an element at the first position of the array, put the new element in its own array and “sum” it with the existing one. This is pretty much the same as the previous example, but with the position of the “summed” operands swapped:
    $ echo "$my_json_above" | jq '.numbers |= [{"zero": 0}] + .'
    {
      "foo": "bar",
      "numbers": [
        {
          "zero": 0
        },
        {
          "one": 1
        },
        {
          "two": 2
        },
        {
          "tree": "NaN"
        },
        {
          "foo": "NaN"
        }
      ]
    }
  The order matters.
- deleting the second element of an array is similar to deleting any other element:
    $ echo "$my_json_above" | jq '. |= del(.numbers[1])'
    {
      "foo": "bar",
      "numbers": [
        {
          "one": 1
        },
        {
          "tree": "NaN"
        },
        {
          "foo": "NaN"
        }
      ]
    }
  The leading . |= is optional here: del(.numbers[1]) on its own prints the same result.
- to delete all elements matching a criterion, you pass the selected elements to the del() filter:
    $ echo "$my_json_above" | jq 'del(.numbers[] | select(.[] == "NaN"))'
    {
      "foo": "bar",
      "numbers": [
        {
          "one": 1
        },
        {
          "two": 2
        }
      ]
    }
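As a rough cross-check of the array filters above, the same operations can be sketched with plain Python lists. The variable names are invented for this illustration, and each line builds a new list instead of changing the original, which is close to how jq prints a modified copy of its input.

```python
doc = {"foo": "bar", "numbers": [{"one": 1}, {"two": 2}, {"tree": "NaN"}, {"foo": "NaN"}]}
numbers = doc["numbers"]

first = numbers[0]        # like .numbers[0]
last = numbers[-1]        # like .numbers[-1]
count = len(numbers)      # like .numbers | length

appended = numbers + [{"three": 3}]       # like .numbers += [{"three": 3}]
prepended = [{"zero": 0}] + numbers       # like .numbers |= [{"zero": 0}] + .
without_second = numbers[:1] + numbers[2:]  # like del(.numbers[1])

# Like del(.numbers[] | select(.[] == "NaN")): drop every object that
# holds "NaN" as one of its values.
without_nans = [item for item in numbers if "NaN" not in item.values()]

print(first, last, count)
print(without_nans)
```

Negative indices and list concatenation behave the same way in both tools, so the jq filters above map almost one to one.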
Conclusion
jq is handy to get some values in an interactive way, piping from other
commands or operating on files (like jq . data.json instead of
cat data.json | jq .). jq is handy, but not straightforward. At least to me.
When you need to parse complex JSONs, modify them, use them as --data for
curl, etc., use Python instead 🐍
Really. Python is more understandable for “complex” operations. Don’t get me
wrong, jq is great to visualize JSONs in the terminal. But the moment that
you need to do some nasty actions to nested JSON thingies, Bash will hit you
with a big stick.
Look at this Python code:
import json

json_str = '''
{
  "foo": "bar",
  "numbers": [{"one": 1}, {"two": 2}, {"tree": "NaN"}, {"foo": "NaN"}]
}
'''

# Turn the string into a "JSON" object
my_json = json.loads(json_str)  # The JSON object is actually a Python dict

# Get the value of a key
foo = my_json['foo']
numbers = my_json['numbers']

# Get value of a key or a default one if the key doesn't exist
non_existent_key = my_json.get('blarg', 'default_value')

# Add new key
my_json['bla'] = "bla"

# Update key
my_json['bla'] = "bleb"

# Delete a key
my_json.pop('foo')

# Append to `numbers`
my_json["numbers"].append({"xablau": "NaN"})

# Get all non "NaN"s from `numbers`
non_nans = []
for number_dict in my_json["numbers"]:
    for value in number_dict.values():
        if value != "NaN":
            non_nans.append(number_dict)

# Convert JSON dict to a valid JSON string
json_str_final = json.dumps(my_json)
Is it more readable? As a bonus, you get all Pythonic power at your fingertips. No magic needed.