I want to sort some lines of text in Emacs by a field that contains verse numbers written in Unicode (Devanagari) digits. The text looks like this:
Verse text bla १०.३ #10.3
Verse text blah This is १.१९ #1.19
Verse text ble १०.१३ #10.13
Verse text bleh ६.२७ #6.27
Verse text blu १९.२ #19.2
Verse text bluh ४.७ #4.7
I've added the corresponding Arabic numerals after # at the end of each line (these will not appear in the original text). I've been able to do this with Python. First, I wrote a function get_num() that converts the Devanagari verse number into the equivalent Arabic-numeral string (for example, १०.३ becomes 10.3). Then I used sorted() with a custom key function that splits that string into integers.
Is it possible to achieve this level of customized sorting with an Elisp function? I looked at sort-regexp-fields and sort-fields, but I haven't understood whether they are as customizable as Python's sorted().
Below is the Python code for reference:
In [87]: inp
Out[87]:
['Verse text bla १०.३ #10.3 ',
'Verse text blah This is १.१९ #1.19 ',
'Verse text ble १०.१३ #10.13 ',
'Verse text bleh ६.२७ #6.27 ',
'Verse text blu १९.२ #19.2 ',
'Verse text bluh ४.७ #4.7 ']
In [88]: myre = re.compile(r'([०१२३४५६७८९]+\.[०१२३४५६७८९]+)')
In [90]: def get_num(inp):
...:     # 2406 is ord('०'), so ord(x) - 2406 is the value of a Devanagari digit
...:     parts = inp.split('.')
...:     p1 = ''.join([str(ord(x) - 2406) for x in parts[0]])
...:     p2 = ''.join([str(ord(x) - 2406) for x in parts[1]])
...:     return '{}.{}'.format(p1, p2)
In [91]: sorted(inp, key=lambda x: [int(i) for i in get_num(myre.search(x).group()).rstrip(".").split('.')])
Out[91]:
['Verse text blah This is १.१९ #1.19 ',
'Verse text bluh ४.७ #4.7 ',
'Verse text bleh ६.२७ #6.27 ',
'Verse text bla १०.३ #10.3 ',
'Verse text ble १०.१३ #10.13 ',
'Verse text blu १९.२ #19.2 ']
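For what it's worth, the same key can probably be written more compactly. This is only a sketch, assuming Python 3, where int() accepts Unicode decimal digits such as Devanagari directly. The names VERSE_RE and verse_key are made up for illustration, and sending lines without a verse number to the front is my own assumption, not something the text requires.

```python
import re

# Matches a verse reference such as १०.३ (chapter.verse in Devanagari digits).
VERSE_RE = re.compile(r'([०-९]+)\.([०-९]+)')


def verse_key(line):
    """Return (chapter, verse) as integers for the first verse reference in line."""
    match = VERSE_RE.search(line)
    if match is None:
        # Assumption: lines without a verse number sort before everything else.
        return (0, 0)
    # int() understands Unicode decimal digits, so no ord() arithmetic is needed.
    return (int(match.group(1)), int(match.group(2)))


lines = [
    'Verse text bla १०.३',
    'Verse text blah This is १.१९',
    'Verse text ble १०.१३',
    'Verse text bleh ६.२७',
    'Verse text blu १९.२',
    'Verse text bluh ४.७',
]

# Tuples compare element by element, so 10.3 sorts before 10.13.
sorted_lines = sorted(lines, key=verse_key)
```

Returning a tuple of integers rather than a float matters here: as a decimal, 10.13 would sort before 10.3, which is the wrong order for chapter.verse numbering.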