I am having trouble getting my sed syntax down to add a varying number of leading zeros to a numeric organizational scheme. The strings I am operating on appear like
1.1.1.1,Some Text Here
leveraging the sed syntax
sed -r ":r;s/\b[0-9]{1,$((1))}\b/0&/g;tr"
I am able to elicit the response
01.01.01.01,Some Text Here
However, what I am looking for is a way to zero-fill to 2 digits in fields 2 and 3 and to 3 digits in field 4, so that every key has the standard form [0-9].[0-9]{2}.[0-9]{2}.[0-9]{3}:
1.01.01.001,Some Text Here
For the life of me I cannot figure even how to modify the boundary to include the parameters necessary to snap to only numerals following a period. I think it has something to do with the use of the \b which I understand matches zero characters at a word boundary, but I do not understand why my attempts to add a period to the match fail as follows:
sed -r ":r;s/\.\b[0-9]{1,$((1))}\b/0&/g;tr"
sed -r ":r;s/\b\.[0-9]{1,$((1))}\b/0&/g;tr"
Both cause the statement to hang
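One possible reading of the hang, sketched with Python's re module rather than sed itself: once the period is part of the match, `&` reinserts the period too, so the zero lands in front of the period instead of in front of the digit. The digit after the last period therefore stays a single digit, the substitution keeps succeeding, and `tr` keeps looping. The helper name `simulate_sed_loop` and the `max_passes` cap are assumptions added for illustration; sed itself has no such cap.

```python
import re

def simulate_sed_loop(line, pattern, max_passes=3):
    """Mimic sed's ':r;s/PATTERN/0&/g;tr' loop, but stop after max_passes.

    Hypothetical helper for illustration only. The cap stands in for
    interrupting sed by hand.
    """
    history = []
    for _ in range(max_passes):
        # "0&" in sed means: a literal zero followed by the whole match.
        new_line, count = re.subn(pattern, lambda m: "0" + m.group(0), line)
        if count == 0:
            break  # sed's "t" would fall through here and the loop would end
        line = new_line
        history.append(line)
    return history

passes = simulate_sed_loop("1.1.1.1,Some Text Here", r"\.\b[0-9]{1,1}\b")
for result in passes:
    print(result)
# 10.10.10.1,Some Text Here
# 10.10.100.1,Some Text Here
# 10.10.1000.1,Some Text Here
```

Each pass still finds `.1` just before the comma, so under this reading the loop never reaches a pass with zero substitutions.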
sed -r ":r;s/\b[0-9]\.{1,$((1))}\b/0&/g;tr"
sed -r ":r;s/\b[0-9]{1,$((1))}\.\b/0&/g;tr"
sed -r ":r;s/\b[0-9]{1,$((1))}\b\./0&/g;tr"
cause the statement to output:
1.01.01.1,Some Text Here
Additionally, I expect that I will have additional problems if the statement contains text like:
1.1.1.1,Some Number 1 Here
It is a foregone conclusion that I need to really learn sed and all of its complexities. I am working on that, but expect that this particular statement will continue to cause me trouble for a while. Any help would be greatly appreciated.
EDIT: I've figured out a way... This statement seems to do what I am looking for, but there has got to be a more elegant way to do this.
sed -r ':r;s/\b[0-9]{1,1}\.\b/0&/;tr;:i;s/\b[0-9]{1,2},\b/0&/;ti;s/.//'
Also, syntactically this will cause problems if a similar number format appears in the text... similar to:
1.1.1.1,Some Text Referring to Document XXX Heading 1.2.3
In which case it will result in:
1.01.01.001,Some Text Referring to Document XXX Heading 01.02.03
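One way to avoid touching numbers in the trailing text is to anchor the match to the start of the line and rewrite only the four-part key before the first comma. The following is a minimal sketch in Python rather than sed, assuming every key has exactly four numeric fields. The names `KEY_RE` and `pad_key` are hypothetical, and the widths 1, 2, 2 and 3 are taken from the target format described in the question.

```python
import re

# Four dot-separated numeric fields at the very start of the line,
# followed by the comma that ends the key (assumed input shape).
KEY_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)\.(\d+),")

def pad_key(line):
    """Zero-fill the leading key to the form N.NN.NN.NNN and leave the rest alone."""
    def fmt(match):
        a, b, c, d = (int(group) for group in match.groups())
        # printf-style formatting: %02d and %03d zero-fill to 2 and 3 digits.
        return "%d.%02d.%02d.%03d," % (a, b, c, d)
    # count=1 plus the ^ anchor means only the key at the line start is rewritten.
    return KEY_RE.sub(fmt, line, count=1)

print(pad_key("1.1.1.1,Some Text Referring to Document XXX Heading 1.2.3"))
# 1.01.01.001,Some Text Referring to Document XXX Heading 1.2.3
print(pad_key("1.1.1.1,Some Number 1 Here"))
# 1.01.01.001,Some Number 1 Here
```

Lines that do not start with a four-field key pass through unchanged under this sketch.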
Solved: Thank you all for your help here. I initially solved the problem with the answer I accepted below. I've since moved the solution into Python as part of a larger solution, using the sort below:
def getPaddedKey(line):
    keyparts = line[0].split(".")
    keyparts = map(lambda x: x.rjust(5, '0'), keyparts)
    return '.'.join(keyparts)

s = sorted(reader, key=getPaddedKey)
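To show why the padded key matters for sorting, here is a small self-contained sketch. It assumes the rows come from `csv.reader`, which the post does not state. The sample data and the name `padded_key` are made up for illustration.

```python
import csv
import io

def padded_key(row):
    # row[0] is the dotted outline number, e.g. "1.10.1.1".
    # Each field is right-justified to 5 characters with zeros for comparison only.
    return ".".join(part.rjust(5, "0") for part in row[0].split("."))

# Hypothetical sample input standing in for the real file.
sample = io.StringIO("1.10.1.1,Ten\n1.2.1.1,Two\n1.1.1.1,One\n")
rows = list(csv.reader(sample))

plain_order = [row[0] for row in sorted(rows)]
padded_order = [row[0] for row in sorted(rows, key=padded_key)]

print(plain_order)   # ['1.1.1.1', '1.10.1.1', '1.2.1.1']
print(padded_order)  # ['1.1.1.1', '1.2.1.1', '1.10.1.1']
```

A plain string sort puts 1.10 before 1.2, while the padded key restores numeric order without changing the stored values.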
sed -r ':r;s/\b[0-9]{1,1}\.\b/0&/;tr;:i;s/\b[0-9]{1,2},\b/0&/;ti;s/.//'
However, I'd love to know if there is a more elegant approach.
– daijizai Jul 18 '17 at 18:39
printf (or a printf call within Awk) may be more straightforward. – Wildcard Jul 18 '17 at 21:51