I have the following sample:
"name": "The title of website",
"sync_transaction_version": "1",
"type": "url",
"url": "https://url_of_website"
I want to get the following output:
"The title of website" url_of_website
I need to remove the protocol prefix from the URL, so that only url_of_website is left (with no http in front).
The problem is that I'm not very familiar with making sed read multiple lines. Some research led me to https://unix.stackexchange.com/a/337399/256195, but I still can't produce the result.
The valid JSON object that I'm trying to parse is the bookmarks file of Google Chrome. Sample:
{
  "checksum": "9e44bb7b76d8c39c45420dd2158a4521",
  "roots": {
    "bookmark_bar": {
      "children": [ {
        "children": [ {
          "date_added": "13161269379464568",
          "id": "2046",
          "name": "The title is here",
          "sync_transaction_version": "1",
          "type": "url",
          "url": "https://the_url_is_here"
        }, {
          "date_added": "13161324436994183",
          "id": "2047",
          "meta_info": {
            "last_visited_desktop": "13176472235950821"
          },
          "name": "The title here",
          "sync_transaction_version": "1",
          "type": "url",
          "url": "https://url_here"
        } ]
      } ]
    }
  }
}
sed. JSON is a structured document format unsuitable for parsing by anything other than a JSON parser. Doing it with sed would require you to implement a JSON parser in sed that could handle the different entity encoding etc. that could be present in the data (especially in URLs). – Kusalananda Nov 29 '18 at 14:51
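
Following the comment's advice, here is a minimal sketch in Python that uses a real JSON parser instead of sed. It assumes the structure shown in the sample above: nested "children" arrays whose leaf entries carry "type": "url" together with "name" and "url" keys. The function names iter_bookmarks, strip_scheme and format_bookmarks are made up for illustration, and SAMPLE is a trimmed copy of the data in the question.

```python
import json
import re

# Trimmed copy of the sample from the question (assumption: the real
# file follows the same shape, only with more entries).
SAMPLE = """
{
  "roots": {
    "bookmark_bar": {
      "children": [ {
        "children": [ {
          "name": "The title is here",
          "type": "url",
          "url": "https://the_url_is_here"
        }, {
          "meta_info": { "last_visited_desktop": "13176472235950821" },
          "name": "The title here",
          "type": "url",
          "url": "https://url_here"
        } ]
      } ]
    }
  }
}
"""


def iter_bookmarks(node):
    """Recursively yield (name, url) for every object with "type": "url"."""
    if isinstance(node, dict):
        if node.get("type") == "url" and "url" in node:
            yield node.get("name", ""), node["url"]
        for value in node.values():
            yield from iter_bookmarks(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_bookmarks(item)


def strip_scheme(url):
    """Remove a leading scheme such as "https://" if one is present."""
    return re.sub(r"^[A-Za-z][A-Za-z0-9+.-]*://", "", url)


def format_bookmarks(text):
    """Parse JSON text and return lines of the form: "name" url_without_scheme."""
    data = json.loads(text)
    return [f'"{name}" {strip_scheme(url)}' for name, url in iter_bookmarks(data)]


for line in format_bookmarks(SAMPLE):
    print(line)
```

To run this against an actual file, the text passed to format_bookmarks could be read with open(path, encoding="utf-8").read(), where path is wherever the bookmarks file lives on the system in question. Because json.loads decodes string escapes, titles and URLs containing escaped characters come out correctly, which is the part that is hard to get right with sed.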