First of all, I’m quite new here, so I fear the following problem is really trivial.
I’m trying to set up an offline bot for the French language. I chose v12.1.1, installed Duckling, and installed two language sets (fr + en) with 100 dimensions. I run everything in Kubernetes (I plan to share the related YAML files if people are interested).
I created some intents, most of them containing accents (it is hard to avoid accents in French). I also set up a flow to catch all unidentified intents. When I ask a question exactly as written in the intent, with accents, the bot falls back to that flow. When I replace the accented letters with unaccented ones, the bot identifies the intent and selects the right flow.
- “mon compte est désactivé” fails
- “mon compte est desactive” succeeds
While experimenting, I noticed some strange behavior: for a given sentence, the debugger showed the word “créer” split into two tokens, “cr” and “er”.
Note: I created a pattern entity like “[a-z0-9]+” for the login.
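For what it’s worth, the “cr” / “er” split is exactly what an ASCII-only character class produces when it meets “é”. The sketch below is plain Python, not the bot’s own tokenizer, so whether the pattern entity is really involved is only an assumption on my part. The helper names `ascii_runs` and `strip_accents` are made up for this illustration.

```python
import re
import unicodedata

# Same character class as the login pattern entity: ASCII letters and digits only.
LOGIN_PATTERN = re.compile(r"[a-z0-9]+")


def ascii_runs(text):
    """Return every run of characters matched by the ASCII-only pattern."""
    return LOGIN_PATTERN.findall(text)


def strip_accents(text):
    """Remove diacritics by decomposing (NFD) and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "\u00e9" is the precomposed "é"; it is outside [a-z0-9], so the match breaks there.
print(ascii_runs("cr\u00e9er"))  # ['cr', 'er']

# The manual workaround described above, done programmatically.
print(strip_accents("mon compte est d\u00e9sactiv\u00e9"))  # mon compte est desactive
```

If the pattern is the culprit, widening it or removing the entity temporarily should make the split disappear; if the split remains, the cause is elsewhere.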
Since, as I understand it, internationalization is heavily used, I’m fairly sure I misconfigured something, but I can’t work out what.