Remove ispell pipe mode (`enchant-2 -a`) #393

rrthomas · 2024-08-07T09:30:29Z

Emacs is the only user, and with the jinx package, no longer needs it.

astoff · 2024-08-11T08:19:22Z

I'm afraid this is a bit of a misunderstanding. Jinx is just one of the Emacs spell-checking packages, with its own advantages and disadvantages. Note also that any spellchecking package that doesn't rely on the -a interface necessarily has the disadvantage of requiring to link Emacs against some library or compile an external module.

rrthomas · 2024-08-11T08:23:52Z

@astoff Thanks for your feedback. It's not a misunderstanding: I wrote the Enchant support for ispell.el and use it daily.

I agree that there are advantages to not needing to compile against an external library, but I also believe these are largely outweighed in the case of Enchant: the pipe method is slow and limits access to Enchant functionality. It also causes me a maintenance burden.

rrthomas · 2024-08-11T19:52:03Z

I have filed minad/jinx#199 and started using Jinx myself to explore how far we can go on the Emacs side. Certainly Jinx was easy to set up and use.

rrthomas · 2024-10-09T09:34:27Z

It is worth considering the interface of nuspell(1) to copy for a simplified enchant(1).

astoff · 2024-10-09T11:19:29Z

What use do you have in mind? IMHO ispell -a is the only that that anyone might care about, and the nuspell command (kinda like but not quite the same) is pretty useless.

rrthomas · 2024-10-09T11:22:57Z

Mostly diagnostics, also simple batch-mode checking. As you say, not that useful. But I don't think there's much use for pipe-based checking any more.

astoff · 2024-10-09T12:45:28Z

Just for the record, I wrote an Emacs package last year that does pipe-based checking and in my book it works very well (in fact, it served as inspiration for Jinx).

Also, apart from figuring out the async communication (which as always was tricky), it is a simpler package than Jinx, and doesn't require compiling a module.

Also, as a design decision, my package doesn't dabble into tokenization questions, which is language dependent, and leaves it entirely to the spellchecker. Because of #316, enchant doesn't work well with my package, so I've been using hunspell instead. (Was that issue solved? I can't reproduce it right now). If I was using enchant, though, I'd protest vehemently :-).

rrthomas · 2024-10-09T12:49:33Z

Also, as a design decision, my package doesn't dabble into tokenization questions

This is interesting, because I'm not clear that Enchant should be doing this either. While it is reasonable to think that Enchant could do it when fed plain text, I don't think that makes sense for most applications (which aren't editing plain texts), which already have to know where the words are.

astoff · 2024-10-09T13:21:27Z

I'm not sure either, but the rules for what constitutes a word are language-dependent (see e.g. /usr/share/hunspell/*.aff) and I would argue that belongs to the spell checker, not to the application using it. There are lots of tricky details, such as when surrounding punctuation is part of the word.

Certainly there is a part of the story that belongs to the application. For example, restrict to comments in source code or filter out markup commands.

rrthomas · 2024-10-09T13:37:23Z

The problem is that the application will typically also be involved in deciding what is the text that has to be checked, according to non-textual markup. It has to work out all this stuff anyway, e.g. for layout purposes.

astoff · 2024-10-09T14:21:03Z

Sure, and that's true even when there's no visual markup, like I said in the second paragraph of my previous message.

But deciding what constitutes a word (or in general any information encoded in the dictionary aff file) is a different story.

rrthomas · 2024-10-09T14:26:23Z

By "all this stuff", I meant that applications requiring spell-checking will already have to work out where word boundaries are, for layout purposes. So it's not a different story, it's exactly the same story, I'm claiming.

astoff · 2024-10-09T14:51:49Z

How is the application supposed to know if a hypen is a word boundary (well-known) or part of a fixed expression (vis-à-vis)? Splitting on whitespace or something like that isn't exactly the same story as splitting words.

minad · 2024-10-16T19:45:41Z

I think what is needed is some API where the application can ask Enchant about language-dependent tokenization rules. There is already enchant_dict_get_extra_word_characters but this is probably too limited. The bulk of the tokenization work is still better done on the side of the application, since it knows about the relevant context sensitive information, the markup format, comment syntax, etc.

Regarding the pipe mode, I think using it for a just in time Emacs spell checker was the wrong decision, @astoff. The performance is poor, otherwise I wouldn't have made Jinx. In contrast, interacting with the Enchant API is completely painless, except for the module compilation process. But then I think Elisp should provide an FFI at some point which would completely alleviate this pain.

astoff · 2024-10-16T20:28:59Z

The performance is poor

Well, I still can't reproduce the poor performance you got, but also I think it would be very sad if there were any fundamental reason for it to be that way. It would mean LSP cannot work smoothly either.

In contrast, interacting with the Enchant API is completely painless, except for the module compilation process.

Sure, I have no doubt about that ;-)

I think what is needed is some API where the application can ask Enchant about language-dependent tokenization rules.

I think that's how most APIs (e.g. Hunspell) work, actually. It seems to me it's not general enough (for example in English you need to expresses that ' is part of a word when inside, but not when around a word). To me it would seem to make more sense if the API allowed the application just split on whitespace. So you can ask if "well-knownx" is correct and the answer would not be yes or no, but rather that there's a typo form characters 5 to 11.

minad · 2024-10-17T07:22:47Z

Well, I still can't reproduce the poor performance you got, but also I think it would be very sad if there were any fundamental reason for it to be that way. It would mean LSP cannot work smoothly either.

You probably didn't sufficiently profile your package, since jit-spell appeared prominently in the profiles in my test, which is already completely unacceptable for an always-on minor mode, even if not noticeable. Given all these complaints about slow Emacs, the recipe for that are expensive mode lines and slow minor modes, which are always on. I run EXWM and everything runs inside Emacs, so I may be more affected. Also I assume that the effect becomes more noticeable on slower systems. Now contrast this with Jinx, which doesn't add much cost at all to the profile. IPC is simply more expensive than a single C function call plus some FFI marshaling. Why are we even arguing this?

Regarding LSP - LSP in Emacs had massive problems due to the IPC and JSON parsing, this is for sure. See the lsp/eglot-booster projects and also the new JSON parser in Emacs 30. In any case, for LSP we get some benefit by using the IPC protocol due to the various servers and simply because the protocol already exists in this form and cannot be replaced easily. If LSP servers could run inside the host process without the expensive communication one could save some latency, but the downsides would be more significant given the complexity of LSP. In the case of spell checking, the API essentially boils down to a single function to check a word plus a little bit of setup. There the trade-offs are different and Enchant already provides a nice abstraction to embed the spell checker inside the host program with very little hassle. So this is just an apples to oranges comparison with different answers.

Just to be clear regarding this discussion - I am neutral. I have need for the pipe mode but I had assumed that the pipe mode would also be useful for other software, which would speak in favor of keeping it. Also there is jit-spell. Then I've learned that @rrthomas only added the pipe mode for ispell.el integration, so it is up to him to consider removing it again if the maintenance effort is not justifiable.

astoff · 2024-10-17T10:01:18Z

This is not a matter of profiling. The issue you observe I can't reproduce, and that's the problem. I did profile quite a bit and say on an Org mode buffer I always I see something like this, which says all is fine:

        1328  71% + redisplay_internal (C function)
         194  10%   Automatic GC
         135   7% + command-execute
         112   6% + org-appear--post-cmd
          40   2% + jit-spell--process-filter
          33   1% + timer-event-handler
           9   0% + winner-save-old-configurations
           2   0% + internal-echo-keystrokes-prefix
           2   0%   internal-timer-start-idle
           2   0%   undo-auto--add-boundary
           1   0%   org-appear--pre-cmd
           1   0%   tooltip-hide
           0   0%   ...

Also I assume that the effect becomes more noticeable on slower systems.

In the absence of further information I would assume a slow system is slower overall. I would be very interested in getting to the bottom of this, because it must be due to some setting or build configuration. But obviously I have no access to your computer to further investigate, neither did I get any similar bug report.

[I also have no complaints regarding LSP — or rather, I do notice some latency but it doesn't noticeably slows down my Emacs, which is in line with my jit-spell experience.]

Concerning the ispell -a interface, I just came here to say that Emacs still uses it. Not with the intent to advocate for anything, but just to let @rrthomas take an informed decision (it turn out it was unnecessary to mention anything, since he was aware of the situation already.)

minad · 2024-10-17T10:12:57Z

@astoff

This is not a matter of profiling. The issue you observe I can't reproduce, and that's the problem. I did profile quite a bit and say on an Org mode buffer I always I see something like this, which says all is fine:

The profile you showed already demonstrates the problem. It is clearly not fine. If you do the same with Jinx it won't appear with 2% in the profile. 2% is a lot for the process filter only to send a few words back and forth, and it will even get up when scrolling, and the backend process cost is not even included yet. Anyway, I am not interested in discussing this further given your ignorant position. There is simply nothing wrong on my system, nothing to diagnose, nothing to get to the bottom of, except that I don't have the most powerful machine - still no latency or performance issues in my current setup without jit-spell. Look, you know that I am experienced enough given my other packages, long time programming and Emacs development experience, so you could also trust me on this, instead of repeatedly claiming that there is no problem, despite the profile on your machine already showing some impact. I gave your package a fair and long try. In the beginning it worked well, but after short time of usage I found that latency went up unacceptably. I tried to make suggestions regarding how the efficiency could be improved via some rate limiting and queuing to reduce the pressure on the inefficient Emacs process filters, which you were not interested in. Why should I trade this given that Jinx can just do without any increase in latency? Jinx started as an experiment and it immediately solved the problem, otherwise I wouldn't have pursued it further.

[I also have no complaints regarding LSP — or rather, I do notice some latency but it doesn't noticeably slows down my Emacs, which is in line with my jit-spell experience.]

When reading around one will see reports about LSP performance and latency issues. It depends on the server you are using, and I assume that you only use good ones which keep the amount of data over IPC to a minimum. Of course you also have the experience to diagnose given that you've even authored your own server.

astoff · 2024-10-17T15:03:12Z

Man, you really went ballistic in that message. I don't quite see what you think you need to defend. It's obvious that a module is way faster than IPC, and Jinx is probably a great package.

We should indeed stop discussing this, but I do want to correct your point that I am

repeatedly claiming that there is no problem

To quote myself, “the issue you observe I can't reproduce, and that's the problem”. Rephrasing more straightforwardly, the problem is that I can't reproduce the issue you observe. Please don't read in that assertion more than is written. I'm just saying that unfortunately I couldn't work with the info you provided me, despite the effort of both parts.

rrthomas mentioned this issue Aug 11, 2024

Add tests for enchant binary #214

Open

rrthomas added this to the v3 milestone Aug 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove ispell pipe mode (`enchant-2 -a`) #393

Remove ispell pipe mode (`enchant-2 -a`) #393

rrthomas commented Aug 7, 2024

astoff commented Aug 11, 2024

rrthomas commented Aug 11, 2024

rrthomas commented Aug 11, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

minad commented Oct 16, 2024

astoff commented Oct 16, 2024

minad commented Oct 17, 2024 •

edited

Loading

astoff commented Oct 17, 2024

minad commented Oct 17, 2024 •

edited

Loading

astoff commented Oct 17, 2024

Remove ispell pipe mode (enchant-2 -a) #393

Remove ispell pipe mode (enchant-2 -a) #393

Comments

rrthomas commented Aug 7, 2024

astoff commented Aug 11, 2024

rrthomas commented Aug 11, 2024

rrthomas commented Aug 11, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

rrthomas commented Oct 9, 2024

astoff commented Oct 9, 2024

minad commented Oct 16, 2024

astoff commented Oct 16, 2024

minad commented Oct 17, 2024 • edited Loading

astoff commented Oct 17, 2024

minad commented Oct 17, 2024 • edited Loading

astoff commented Oct 17, 2024

Remove ispell pipe mode (`enchant-2 -a`) #393

Remove ispell pipe mode (`enchant-2 -a`) #393

minad commented Oct 17, 2024 •

edited

Loading

minad commented Oct 17, 2024 •

edited

Loading