11

The question "What options are there for doing spell-checking in emacs" discusses various spell-checking solutions for Emacs. The Emacs ispell interface uses external tools such as Aspell or Hunspell for spell checking. Since many free dictionary files are available (for example from OpenOffice; see the dictionary file en_US.zip), I am wondering whether it would be possible to write a native spell-checking function in Emacs using such free dictionary files.

Added: More precisely, I am wondering whether there are existing packages that can be used for spell-checking within Emacs without using external tools such as Aspell or Hunspell: a tool that checks whether a word is correct and, if not, suggests some corrections.

If the answer is negative, any hints on how to do this would be helpful.
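
To make the idea concrete, here is a minimal sketch of reading the word list from such a dictionary file. It is written in Python only to illustrate the logic (an Elisp port would follow the same steps), and it assumes the archive contains a Hunspell-style `.dic` file: an entry count on the first line, then one `word/FLAGS` entry per line. The function name `read_dic_words` and the sample entries are made up for illustration. Affix flags are simply dropped, so inflected forms that the accompanying `.aff` rules would generate are not included.

```python
def read_dic_words(lines):
    """Return the set of base words found in Hunspell-style .dic lines.

    Assumption: each entry looks like "word" or "word/FLAGS", and the
    first line may hold an (approximate) entry count.
    """
    words = set()
    for index, raw in enumerate(lines):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if index == 0 and line.isdigit():
            continue  # skip the leading entry count
        # Keep only the part before the first "/" (the affix flags follow it).
        word = line.split("/", 1)[0].strip()
        if word:
            words.add(word)
    return words


# Hypothetical sample standing in for the contents of a .dic file.
sample = ["4", "hello", "work/DGS", "spell/DGS", "Emacs/M"]
known = read_dic_words(sample)
print(sorted(known))
```

With a real file, the same function could be called on an open file object instead of the sample list. A full solution would also have to apply the affix rules, which is the larger part of the work.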

Name
  • 2
    Of course it would be possible. Can you clarify whether you are looking for existing packages that do this or for pointers on how you could implement it yourself? – verdammelt Feb 10 '15 at 17:49
  • 2
    @Name, I've seen you ask a few Windows-related questions, and I suspect that Windows is what prompted you to ask this one. I highly recommend using 32-bit Cygwin on Windows. I use it on 2/3 of my computers (work and gaming PC). With the `emacs-w32` package, Emacs uses the native Windows GUI. You get access to a ton of prebuilt Unix, Linux, and GNU packages (including aspell), and the ability to easily compile others from source (e.g., aspell compiles fine with no extra effort in Cygwin). Granted, there are hiccups, but overall I think it's better than native Windows. – nanny Feb 10 '15 at 18:10
  • @verdammelt Thanks, I edited the question. – Name Feb 10 '15 at 19:10
  • @nanny Yes, this is a Windows-related question. Many thanks for suggesting Cygwin; I will give it a try. But I am still looking for a solution that does not use external tools. – Name Feb 10 '15 at 19:17
  • 6
    @Name the problem with on-the-fly spellchecking, of course, will be performance. Emacs Lisp is not fast, and this is not really the sort of thing that it does well. Because you _really_ don't want to block the main thread, you'll need to spawn an [async](http://melpa.org/#/async) child process to do the spellchecking. This would probably be more work than it's worth, unless it's worth an awful lot. – PythonNut Feb 10 '15 at 19:56
  • 1
    Emacs is simply better with the common external tools available. Spend the time getting Cygwin working and integrated. Or use the Cygwin-native Emacs (it's a bit slower, but is well-integrated by default). Or run Emacs in a Linux VM (using a local (Windows) display). – phils Feb 10 '15 at 20:51
  • 3
    Peter Norvig says that his Python spell checker is *very* fast. I guess reimplementing it in Elisp might be a nice exercise. – mbork Mar 23 '15 at 20:48
  • 1
    @mbork 10 words/second is hardly fast enough for large buffers, right? That's 0.1s lag that blocks the thread, even when using lazy checking. – PythonNut Apr 24 '15 at 04:19
  • 1
    @PythonNut: fair enough. OTOH, for on-the-fly checking it *might* work fast enough on fast machines. I don't type 10 words/second, for instance; more importantly, I don't type *all the time*, and wouldn't mind a 0.1 lag when using idle timers. – mbork Apr 24 '15 at 07:57
  • 2
    I have ported an existing CL version of the Norvig checker to Elisp and written a simple on-demand spell-checking function for it. As others have said, it's not very fast; using it for on-the-fly checking would not be pleasant. But here is the code: https://gist.github.com/jordonbiondo/111af9c304725391e378 and here is a gif of the on-demand checker running: http://i.imgur.com/guuVT2O.gif. This is the text file I used to train the checker: https://ia600502.us.archive.org/21/items/encyclopediabrit26ed11arch/encyclopediabrit26ed11arch.txt – Jordon Biondo Apr 30 '15 at 15:23
  • 2
    I take back what I said about on-the-fly checking. Because the known words are in a hash, going through a buffer and marking words that aren't in the hash is quite fast. Here is a full buffer check being toggled on and off: http://i.imgur.com/MbqJG9i.gif I think this could very well be turned into a nice package and work well. You'll notice that things like `delete` are marked in the buffer, not because the spell checker is broken, but because it wasn't given `delete` as a known word. With a good training text that would be taken care of. – Jordon Biondo Apr 30 '15 at 15:52
  • 3
    Here it is as a decently working minor mode: https://gist.github.com/jordonbiondo/7a729b652360a528f117 You'll need to provide your own dictionary file, but there is a link to one in the docs. – Jordon Biondo May 01 '15 at 13:45
  • @JordonBiondo in `se-spell.el` it would be great if it were possible to add a new word to the dictionary file when the list of suggested words does not include the desired word. – Name Jul 31 '15 at 18:57
  • 1
    @JordonBiondo: any chance you could write that up as an answer? We seem to have the solution to the question in the comments which could now be marked as answered. – stsquad Mar 10 '16 at 11:43

2 Answers

2

From the comments, Jordon Biondo has some proof-of-concept code at

https://gist.github.com/jordonbiondo

See in particular `se-spell.el` and `elisp-checker.el`.
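
The approach described in the comments (known words kept in a hash, plus Norvig-style candidate generation for suggestions) can be sketched roughly as follows. This is a simplified Python illustration of the technique, not the code from the gists: the names `edits1`, `is_known`, `suggest`, and the tiny `KNOWN` set are hypothetical, and unlike Norvig's original it does not rank suggestions by word frequency.

```python
import string


def edits1(word):
    """Return all strings one edit away from WORD.

    An edit is a deletion, a transposition of adjacent letters,
    a replacement, or an insertion of a lowercase ASCII letter.
    """
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def is_known(word, known):
    """Check a word with a single set (hash) lookup, ignoring case."""
    return word.lower() in known


def suggest(word, known):
    """Return known words at most one edit away from WORD, sorted."""
    lowered = word.lower()
    if lowered in known:
        return [lowered]
    return sorted(edits1(lowered) & known)


# Hypothetical word list; a real one would come from a dictionary file
# or a training text.
KNOWN = {"delete", "spell", "spelling", "checker", "emacs"}

print(is_known("Emacs", KNOWN))
print(suggest("delte", KNOWN))
print(suggest("spel", KNOWN))
```

Checking whether a word is known is one hash lookup, which matches the observation that marking unknown words in a buffer is fast. Generating suggestions is the expensive part, so it makes sense to do that only on demand.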

Andrew Swann
0

See spell-fu, which highlights misspelled words without calling external processes.

Currently, however, aspell is still used for the initial dictionary dump.

ideasman42
  • Thank you. I get the error `spell-fu--word-list-ensure: Wrong type argument: stringp, nil` after enabling spell-fu mode. – Name May 19 '21 at 14:29
  • Could you run `M-x toggle-debug-on-error`, then enable spell-fu mode? (This is worth a bug report.) – ideasman42 May 19 '21 at 15:09
  • Apparently, the problem was that the file `\.emacs.d\spell-fu\words_default.txt` didn't exist. I created that file and the error disappeared. Could you please explain what the content of this file should be? Any large file of correctly spelled words? Perhaps it would be helpful to point to a freely available file that can be used. – Name May 19 '21 at 15:40