Score:-1

How to extract a string from a JSON file and store it in a variable (Linux)

us flag

I have the following in one of my JSON files, file1.json:

{
  "$quer": {
    "args": [{
      "args": [
        "select\n      db1.table1 as tab1,\n      db1.table2 as tab2,\n      db1.table3 as tab3\n      from db1.table4 as tab4"
      ],
      "fn": "from-sql",
      "ns": "op"
    }],
    "fn": "operators",
    "ns": "op"
  }
}

I want to extract the string db1.table4 from this JSON file and store it in a variable.

I don't know much about sed and awk. Can someone help here?

terdon avatar
cn flag
How can we know what to extract? Should it always be the second word from the first element in the `args` array? Should it be whatever is after `select\n`? How can we identify the part of the file you want to extract?
Score:2
us flag
  1. Assuming the string you want to extract is at the same position in every file, you can combine the head, tail and cut commands with pipes.

  2. For example:

    $ head -6 file.json | tail -1 | cut -b 121-129
    db1.table
    
  3. And here is an example of a script setting the output into a variable:

    #!/bin/bash
    v1=$(head -6 file.json | tail -1 | cut -b 121-130)
    echo "$v1"
    

The output of the script will be db1.table4, which is the value of the v1 variable.

You can read more about those commands here:

Of course you can use those commands to extract any other string from a file.
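If the line number or byte offsets are not identical in every file, a pattern-based variant is less brittle. The following is only a sketch: it assumes the table you want is the first word after the last `from ` on the line holding the SELECT string, and it recreates the sample file in a temporary location so it can be run as-is. The names `tmp`, `line` and `rest` are illustrative only.

```shell
# Recreate the sample file from the question in a temporary location
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{
  "$quer": {
    "args": [{
      "args": [
        "select\n      db1.table1 as tab1,\n      db1.table2 as tab2,\n      db1.table3 as tab3\n      from db1.table4 as tab4"
      ],
      "fn": "from-sql",
      "ns": "op"
    }],
    "fn": "operators",
    "ns": "op"
  }
}
EOF

# Pick the line that holds the SELECT string (assumption: there is only one)
line=$(grep -m1 '"select' "$tmp")

# Drop everything up to and including the last "from "
rest=${line##*from }

# Keep only the first word of what is left
v1=${rest%% *}
echo "$v1"

rm -f "$tmp"
```

This still treats the JSON as plain text, so it shares the limits of the head/tail/cut approach if the layout of the file changes.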

Aviator avatar
us flag
thanks a lot. it helped me.
Score:2
ru flag

Take a look at jq, the command-line JSON processor. Install it, for example, with:

sudo apt install jq

The string you want is not a JSON value, it's part of a JSON value. So I suggest you use jq to get the string you need to manipulate into a variable, for example:

my_var=$(jq -r '.["$quer"].args[0].args[0]' file1.json)

This gets you a variable containing the SELECT statement (shown here with its line breaks and indentation collapsed):

select db1.table1 as tab1, db1.table2 as tab2, db1.table3 as tab3 from db1.table4 as tab4

Then you will need to use other tools like sed, awk, cut etc. to get the substring you want from that variable. For your specific case this would work, but of course it may not work for a different SELECT statement. Cutting by space delimiter and returning the 12th value (leaving $my_var unquoted lets the shell collapse the newlines and repeated spaces first):

my_table=$(echo $my_var | cut -d' ' -f12)
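The field number 12 depends on how many columns the statement selects. As a hedged alternative, the sketch below takes the word that follows `from` instead of a fixed field. It assumes `my_var` holds the statement exactly as `jq -r` would return it (with real newlines); here that value is simulated with printf so the snippet runs without the JSON file.

```shell
# Simulate the value jq -r would return: the SELECT statement with real newlines
my_var=$(printf 'select\n      db1.table1 as tab1,\n      db1.table2 as tab2,\n      db1.table3 as tab3\n      from db1.table4 as tab4')

# Scan each line for the word "from" and print the word right after it
my_table=$(printf '%s\n' "$my_var" |
  awk '{ for (i = 1; i < NF; i++) if (tolower($i) == "from") { print $(i + 1); exit } }')

echo "$my_table"
```

This only handles a single `from` clause; a statement with subqueries or joins would need a real SQL parser.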
Aviator avatar
us flag
It's already there, but my question is not JSON related; it's about using Unix commands.
terdon avatar
cn flag
@Aviator `jq` _is_ a command, just like any other. It is just the right tool to use when parsing json files.
codlord avatar
ru flag
Examples added above.
terdon avatar
cn flag
Thanks! That's much clearer :)
Score:1
cn flag

You should generally avoid using generic text parsing tools for structured data. Since you have a JSON file, it is safer and simpler to use a dedicated JSON parser. In your case, you would want to extract the array args inside the first element of the top-level array args, which is the child of the top-level hash $quer:

$ jq '."$quer"."args"[0]["args"]' file.json
[
  "select\n      db1.table1 as tab1,\n      db1.table2 as tab2,\n      db1.table3 as tab3\n      from db1.table4 as tab4"
]

From here, you no longer have structured data and you need to resort to cruder methods. I don't know how you want to identify your target string, you didn't explain that. So, depending on what you actually want, you could do:

  1. Skip lines starting with [ or ] and then print the second word of the remaining lines:

    $ jq '."$quer"."args"[0]["args"]' file.json | awk '/^[^][]/{print $2}'
    db1.table1
    
  2. Print the second word of the second line

    $ jq '."$quer"."args"[0]["args"]' file.json | awk 'NR==2{print $2}'
    db1.table1
    
  3. Print the longest stretch of non-whitespace after the string `"select\n`:

    $ jq '."$quer"."args"[0]["args"]' file.json | grep -oP '"select\\n\s*\K\S*'
    db1.table1
    

If you explain exactly how we are supposed to know what string to extract, I could give you a more targeted answer.


For the sake of completeness, in your specific example, and I stress that this will not be portable and is almost certain to fail if your input data changes in any way, you can use simple text tools directly:

$ grep -oP '"select\\n\s*\K\S*' file.json 
db1.table1

$ awk '$1=="\"select\\n"{print $2}' file.json 
db1.table1

$ sed -nE 's/.*"select\\n\s*(\S+).*/\1/p' file.json 
db1.table1
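
If the target is the table named after `from` (the db1.table4 mentioned in the question), one possible sketch is to let jq decode the string and capture the first word after `from`, storing the result in a variable. This is an assumption about what should be extracted, not something the question confirms. The variable names and the capture group name `t` are illustrative; the sample file is recreated in a temporary location so the snippet is self-contained, and it does nothing useful if jq is not installed.

```shell
# Recreate the sample file from the question in a temporary location
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{
  "$quer": {
    "args": [{
      "args": [
        "select\n      db1.table1 as tab1,\n      db1.table2 as tab2,\n      db1.table3 as tab3\n      from db1.table4 as tab4"
      ],
      "fn": "from-sql",
      "ns": "op"
    }],
    "fn": "operators",
    "ns": "op"
  }
}
EOF

my_table=""
if command -v jq >/dev/null 2>&1; then
  # jq decodes the JSON string; capture() then grabs the first word after "from"
  my_table=$(jq -r '."$quer".args[0].args[0] | capture("from\\s+(?<t>\\S+)") | .t' "$tmp")
  echo "$my_table"
else
  echo "jq is not installed" >&2
fi

rm -f "$tmp"
```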
hr flag
You could possibly do some of the awk-like slicing'n'dicing in jq, for example `jq -r '."$quer".args[0].args[] | split("\n")[-1] | split(" ")[-3]' file.json` or maybe something regex based like `jq -r '."$quer".args[0].args[] | capture("from (?<a>[^ ]+)") | .a'`
terdon avatar
cn flag
@steeldriver yeah, but I figured that since I don't know what the OP actually wants to extract, I may as well give some simple choices.
hr flag
indeed ... not clear