I want to modify a file to remove all punctuation and digits, convert uppercase letters to lowercase, and leave only one word per line. For example, the first line below should become the four lines that follow it:
Hello, how are you!
hello
how
are
you
With some help I came up with this:
tr -d '[:punct:]' < file | tr -s '[:space:]' '\n' | tr -d '[0-9]' | tr '[A-Z]' '[a-z]' > cleanfile.txt
The issue, however, is that when the file contains an address, I end up with httpadresscom instead of
http
adress
com
I also DON'T want words like "don't" or "readme.txt" to be split up like this:
don
t
readme
txt
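To make the two requirements concrete, here is a minimal sketch of one possible pipeline. It rests on assumptions that are not stated above: an address is any whitespace-separated token containing "://", and the wanted result for the other words is "dont" and "readmetxt". The function name clean_words is made up for illustration. Address tokens have their punctuation turned into spaces so they split later, while every other token simply has its punctuation deleted.

```shell
# Hypothetical helper: reads text on stdin, writes one cleaned word per line.
clean_words() {
  tr -s '[:space:]' '\n' |             # one whitespace-separated token per line
    sed '\|://|s/[[:punct:]]/ /g' |    # assumed address marker "://": punctuation -> space
    tr -d '[:punct:][:digit:]' |       # everywhere else, drop punctuation and digits
    tr -s '[:space:]' '\n' |           # split the address parts onto their own lines
    tr '[:upper:]' '[:lower:]' |       # lowercase everything
    sed '/^$/d'                        # drop any empty line left behind
}
```

It could then be run as clean_words < file > cleanfile.txt. An address written without a scheme (for example www.adress.com) would not be recognized by this check and would come out joined together.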