JQ: Another Tool We Thought We Knew

Published: 2023-07-24
Last Updated: 2023-07-24 14:15:25 UTC
by Rob VandenBrink (Version: 1)
2 comment(s)

So often you'll see folks (me included) use "jq" to take an unformatted JSON mess and turn it into readable output.  For instance, last Thursday we used the Shodan API to dump about 650k of host info like this:
curl -s -k "https://api.shodan.io/shodan/host/%1?key=%shodan-api-key%" | jq

In other words, up to today, I've just used jq as a JSON "prettifier".

At some point (OK, TIL), I finally clued into the fact that the "q" in "jq" stands for "query".

First, let's simplify things by making a file that we can play with (see Thursday's diary - https://isc.sans.edu/diary/Shodans+API+For+The+Recon+Win/30050/ - for details on the API call below):

curl -s -k "https://api.shodan.io/shodan/host/45.60.31.34?key=%shodan-api-key%" > isc.txt

Let's use jq to query / extract the "ports" array in the file:

type isc.txt | jq ".ports"
[
  1024,
  8200,
  25,
  8112,
  2082,
  2083,
  2087,
  554,
  14344,
  53,
  12345,
  60001,
  9800,
  587,
  80,
  5201,
.. and so on (123 open ports)

Printing these without the line breaks gets everything on one page, which is sometimes important:

type isc.txt | jq ".ports" --compact-output
[1024,8200,25,8112,2082,2083,2087,554,14344,53,12345,60001,9800,587,80,5201,82,83,14265,9306,8800,7777,7779,31337,631,8834,16010,5269,1177,5800,2222,8880,8888,8889,3268,3269,10443,3790,9080,1234,10134,3299,4848,9001,8443,13579,5900,5901,9998,9999,10000,10001,7443,9000,2345,9002,6443,4911,9009,7474,1337,9530,3389,8001,8009,8010,50000,9443,7001,4443,4444,5985,5986,5007,5009,6000,6001,1400,8060,9600,9090,9091,389,9095,5000,5001,9100,5005,5006,1935,8081,5010,8083,4500,8085,8086,8089,8090,7071,4000,8098,25001,2480,4022,5560,3001,8123,444,8126,6080,4040,8139,465,4567,4064,9191,3050,9200,1521,8181,443]


Let's extract both the domains and the hostnames (these often overlap):
type isc.txt | jq ".domains,.hostnames"
[
  "cio.org",
  "ranges.io",
  "cyberaces.org",
  "sans.co",
  "imperva.com",
  "cyberfoundations.org",
  "securingthehuman.org",
  "sans.org",
  "giac.net",
  "sans.edu",
  "giac.org",
  "cybercenters.org"
]
[
  "cio.org",
  "ranges.io",
  "cyberaces.org",
  "sans.co",
  "giac.net",
  "imperva.com",
  "cyberfoundations.org",
  "qms.sans.org",
  "content.sans.org",
  "sans.org",
  "sso.securingthehuman.org",
  "isc.sans.edu",
  "sans.edu",
  "giac.org",
  "cybercenters.org"
]

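Since the two lists overlap so much, one tidy-up is to merge them into a single de-duplicated list. A minimal sketch, assuming a Unix-style shell (the quoting differs in the Windows cmd prompt used above) and a hypothetical, trimmed-down sample.json standing in for isc.txt:

```shell
# Hypothetical cut-down stand-in for the Shodan response saved as isc.txt
cat > sample.json <<'EOF'
{
  "domains": ["sans.org", "sans.edu", "giac.org"],
  "hostnames": ["isc.sans.edu", "sans.org", "sans.edu", "giac.org"]
}
EOF

# Concatenate both arrays, de-duplicate (unique also sorts the result),
# then print one bare name per line (-r drops the JSON quotes)
jq -r '(.domains + .hostnames) | unique | .[]' sample.json
```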
At some point, you'll find that the IP addresses returned by Shodan are typically stored as a single decimal integer.  No problem, convert the decimal value to hex, then convert each octet back to decimal and stuff the dots in!  Or you can just ask for both the ip and ip_str values:

type isc.txt | jq ".ip,.ip_str"
758914850
"45.60.31.34"

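If you only have the integer, the "stuff the dots in" arithmetic can also be done inside jq. A rough sketch, assuming an IPv4 address and a Unix-style shell; the ip_filter variable name is just for illustration:

```shell
# jq filter: split a 32-bit integer into its four octets, then join with dots
ip_filter='.ip
  | [ (. / 16777216 | floor),
      ((. / 65536 | floor) % 256),
      ((. / 256 | floor) % 256),
      (. % 256) ]
  | map(tostring) | join(".")'

# 758914850 is the "ip" value from the output above
echo '{"ip": 758914850}' | jq -r "$ip_filter"
# prints 45.60.31.34
```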

How about just dumping out the keys that you can mess with?

type isc.txt |  jq "keys"
[
  "area_code",
  "asn",
  "city",
  "country_code",
  "country_name",
  "data",
  "domains",
  "hostnames",
  "ip",
  "ip_str",
  "isp",
  "last_update",
  "latitude",
  "longitude",
  "org",
  "os",
  "ports",
  "region_code",
  "tags"
]

There's way more to jq - you can execute scripts, add and delete keys, sort output or do math on the various values.  From the "query" perspective, you can treat your JSON input very much like a SQL database - the select filter should look very familiar, and the manual also documents SQL-style operators such as INDEX and JOIN.
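
To give a feel for the "query" side, here is a small sketch of a WHERE / ORDER BY style query plus a bit of math. It assumes a Unix-style shell and a hypothetical services.json; the "port" and "transport" field names inside "data" are assumptions for illustration, so check them against your own Shodan output:

```shell
# Hypothetical sample, loosely shaped like the "ports" and "data" keys above
cat > services.json <<'EOF'
{
  "ports": [8443, 53, 443, 80],
  "data": [
    {"port": 8443, "transport": "tcp"},
    {"port": 53,   "transport": "udp"},
    {"port": 443,  "transport": "tcp"},
    {"port": 80,   "transport": "tcp"}
  ]
}
EOF

# "WHERE transport = tcp ORDER BY port": filter with select, then sort_by
jq -c '[.data[] | select(.transport == "tcp")] | sort_by(.port) | map(.port)' services.json
# prints [80,443,8443]

# A little math: count the ports and find the highest one
jq -c '{count: (.ports | length), highest: (.ports | max)}' services.json
# prints {"count":4,"highest":8443}
```
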
You can also write scripts for jq to execute.  The scripts have all the scripty things you'd expect: if/then/else, try/catch, boolean operators, regex support, text manipulation operators and so on.
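
A couple of those scripty bits in action, as a minimal sketch on made-up input (Unix-style shell quoting assumed, and the regex functions need a jq build with regex support, which the common packages have):

```shell
# if/then/else combined with a regex test: label each hostname by its TLD
echo '["isc.sans.edu", "sans.org", "giac.org"]' | jq -c '
  map(if test("\\.edu$") then "edu: " + . else "other: " + . end)'
# prints ["edu: isc.sans.edu","other: sans.org","other: giac.org"]

# try/catch: convert strings to numbers, substituting a message on failure
echo '["443", "https"]' | jq -c 'map(try tonumber catch "not a number")'
# prints [443,"not a number"]
```
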
If you have jq installed, typing "man jq" will give you several pages of possibilities, and even "jq --help" will get you started.  Googling "man jq" will give you the same if you don't have it installed yet.

For me, basic queries do the job most days (which is what was discussed above) - if I need more I tend to use other scripting solutions, most days Bash, Python or PowerShell.  But (just like most of us do with AWK), I'm just scratching the surface of what jq can do.

If you've done something cool with jq, please share in our comment form!  
 

===============
Rob VandenBrink
rob@coherentsecurity.com

Keywords: jq linux tools

Comments

jq is one of my favorite CLI utilities. We use it extensively in SANS SEC510: Public Cloud Security: AWS, Azure, and GCP. This query is probably the most advanced one in the course and the gnarliest one I have ever written. It counts the number of Google Cloud VPC Flow Logs for outbound traffic that was sent over the same protocol to the same IP and port, ordering the results based on the number of instances descending:

jq -r 'group_by(.jsonPayload.connection.dest_ip, .jsonPayload.connection.dest_port, .jsonPayload.connection.protocol)[] | {"dest_ip": .[0].jsonPayload.connection.dest_ip, "dest_port": .[0].jsonPayload.connection.dest_port, "protocol": .[0].jsonPayload.connection.protocol, "count": length}' | jq -s '. | sort_by(-.count)'

I see that jq has been ported to Cygwin. That can be useful!
