Score:4

Ubuntu

Convert text file to CSV on shell

jotrocken

11/29/23, 10:11 AM

I would like to convert a text file into CSV format.

Here is an excerpt from the file:

{"Outdated":false,"Watt":233,"Timestamp":1669647142,"A_Plus":6523.896,"A_Plus_HT":4494.82,"A_Plus_NT":2029.076,"A_Minus":8841.63}
{"Outdated":false,"Watt":235,"Timestamp":1669647152,"A_Plus":6523.896,"A_Plus_HT":4494.82,"A_Plus_NT":2029.076,"A_Minus":8841.63}
{"Outdated":false,"Watt":235,"Timestamp":1669647596,"A_Plus":6523.896,"A_Plus_HT":4494.82,"A_Plus_NT":2029.076,"A_Minus":8841.63}
{"Outdated":false,"Watt":233,"Timestamp":1669651191,"A_Plus":6524.496,"A_Plus_HT":4495.42,"A_Plus_NT":2029.076,"A_Minus":8841.63}
{"Outdated":false,"Watt":276,"Timestamp":1669654797,"A_Plus":6524.816,"A_Plus_HT":4495.74,"A_Plus_NT":2029.076,"A_Minus":8841.63}
{"Outdated":false,"Watt":437,"Timestamp":1669658393,"A_Plus":6525.901,"A_Plus_HT":4496.825,"A_Plus_NT":2029.076,"A_Minus":8841.63}
{"Outdated":false,"Watt":362,"Timestamp":1669661992,"A_Plus":6526.732,"A_Plus_HT":4497.656,"A_Plus_NT":2029.076,"A_Minus":8841.63}
{"Outdated":false,"Watt":471,"Timestamp":1669665603,"A_Plus":6527.062,"A_Plus_HT":4497.986,"A_Plus_NT":2029.076,"A_Minus":8841.63}

The structure resembles a JSON file, but unfortunately the structure does not quite match. I have already used this tool (https://github.com/alingse/jsoncsv), but unfortunately I get error messages because of the wrong structure.

How can I turn it into a CSV file?

Thanks for your help!

2823

4 + 2

command-line

csv

text-processing

json

Raffa

11/29/23, 10:17 AM

Please [edit] your question to provide an example of the expected result.

aborruso

12/1/23, 3:42 PM

You have had a lot of good answer. Has none been good for you? If yes, please indicate which

Score:9

Ubuntu

aborruso

11/30/23, 8:28 AM

This is a very used data format, Json lines

You can use Miller 6 and run

mlrgo --ijsonl --ocsv cat input.jsonl

to get

Outdated,Watt,Timestamp,A_Plus,A_Plus_HT,A_Plus_NT,A_Minus
false,233,1669647142,6523.896,4494.82,2029.076,8841.63
false,235,1669647152,6523.896,4494.82,2029.076,8841.63
false,235,1669647596,6523.896,4494.82,2029.076,8841.63
false,233,1669651191,6524.496,4495.42,2029.076,8841.63
false,276,1669654797,6524.816,4495.74,2029.076,8841.63
false,437,1669658393,6525.901,4496.825,2029.076,8841.63
false,362,1669661992,6526.732,4497.656,2029.076,8841.63
false,471,1669665603,6527.062,4497.986,2029.076,8841.63

In the useful comment below, there is a note about the way to do it in Miller 5:

mlr --ijson --ocsv cat input >output

+ 2

steeldriver

11/30/23, 4:57 PM

Nice! even Miller 5.6.2 seems to work, with the generic `--ijson` format: `mlr --ijson --ocsv cat file`

aborruso

11/30/23, 5:10 PM

thank you @steeldriver , I add it on my answer

Score:9

Ubuntu

steeldriver

11/29/23, 2:40 PM

Each row of your input is a valid JSON object - so if you don't care about CSV headers, you could simply deconstruct them back into arrays and pass them through the @csv filter:

$ jq -r '[.[]] | @csv' file
false,233,1669647142,6523.896,4494.82,2029.076,8841.63
false,235,1669647152,6523.896,4494.82,2029.076,8841.63
false,235,1669647596,6523.896,4494.82,2029.076,8841.63
false,233,1669651191,6524.496,4495.42,2029.076,8841.63
false,276,1669654797,6524.816,4495.74,2029.076,8841.63
false,437,1669658393,6525.901,4496.825,2029.076,8841.63
false,362,1669661992,6526.732,4497.656,2029.076,8841.63
false,471,1669665603,6527.062,4497.986,2029.076,8841.63

If you do care about headers, it's more complicated. The best I could come up with is:

slurp the whole file into an indexed array of rows
perform a reduction of the array of rows, initializing it with the keys extracted from the 0^th element

You now have an array-of-arrays, with a header array at the top, that may be mapped back to an array of CSV (and finally to individual CSV rows):

$ jq -r --slurp 'to_entries |
    reduce . as $row ([.[0] | .value | keys_unsorted]; . + [$row[] | .value | to_entries | map(.value)]) |
    map(@csv) | .[]
  ' file
"Outdated","Watt","Timestamp","A_Plus","A_Plus_HT","A_Plus_NT","A_Minus"
false,233,1669647142,6523.896,4494.82,2029.076,8841.63
false,235,1669647152,6523.896,4494.82,2029.076,8841.63
false,235,1669647596,6523.896,4494.82,2029.076,8841.63
false,233,1669651191,6524.496,4495.42,2029.076,8841.63
false,276,1669654797,6524.816,4495.74,2029.076,8841.63
false,437,1669658393,6525.901,4496.825,2029.076,8841.63
false,362,1669661992,6526.732,4497.656,2029.076,8841.63
false,471,1669665603,6527.062,4497.986,2029.076,8841.63

+ 0

Score:8

Ubuntu

pLumo

11/29/23, 10:33 AM

For me it seems easier and safer to first convert to proper json, then use a parser (e.g. jq or jsoncsv that you have tried) instead of writing your own parser.

Convert to json by simply adding [ in then first row and append ] after the last row and a comma to every except the last row.

sed '1s/^/[/;$!s/$/,/;$a]' file

Then convert to csv using jq

jq -r '(.[0] | keys_unsorted) as $keys | $keys, map([.[ $keys[] ]])[] | @csv'

And put it together:

$ sed '1s/^/[/;$!s/$/,/;$a]' file | jq -r '(.[0] | keys_unsorted) as $keys | $keys, map([.[ $keys[] ]])[] | @csv'
"Outdated","Watt","Timestamp","A_Plus","A_Plus_HT","A_Plus_NT","A_Minus"
false,233,1669647142,6523.896,4494.82,2029.076,8841.63
false,235,1669647152,6523.896,4494.82,2029.076,8841.63
false,235,1669647596,6523.896,4494.82,2029.076,8841.63
false,233,1669651191,6524.496,4495.42,2029.076,8841.63
false,276,1669654797,6524.816,4495.74,2029.076,8841.63
false,437,1669658393,6525.901,4496.825,2029.076,8841.63
false,362,1669661992,6526.732,4497.656,2029.076,8841.63
false,471,1669665603,6527.062,4497.986,2029.076,8841.63

+ 1

steeldriver

11/29/23, 3:40 PM

You could avoid the sed by using --slurp (or -s) I think

Score:3

Ubuntu

Raffa

11/30/23, 6:06 PM

The type of data you provide in your example is called Newline delimited JSON(AKA NDJSON) ... There are tools that support it natively like in2csv which is provided by the csvkit package ... so install it first like so:

sudo apt install csvkit

Then, specify the format with the option -f ndjson and simply use it on your file like so:

in2csv -f ndjson file

That should result in:

Outdated,Watt,Timestamp,A_Plus,A_Plus_HT,A_Plus_NT,A_Minus
False,233,1669647142,6523.896,4494.82,2029.076,8841.63
False,235,1669647152,6523.896,4494.82,2029.076,8841.63
False,235,1669647596,6523.896,4494.82,2029.076,8841.63
False,233,1669651191,6524.496,4495.42,2029.076,8841.63
False,276,1669654797,6524.816,4495.74,2029.076,8841.63
False,437,1669658393,6525.901,4496.825,2029.076,8841.63
False,362,1669661992,6526.732,4497.656,2029.076,8841.63
False,471,1669665603,6527.062,4497.986,2029.076,8841.63

+ 0

Elon Musk

I sit in a Tesla and translated this thread with Ai:

EN: Convert text file to CSV on shell

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.