Message ID | 20250703102301.3566750-1-ejo@pengutronix.de |
---|---|
State | Accepted |
Series | conf.py: improve SearchEnglish to handle terms with dots |
On Thu, 03 Jul 2025 12:23:01 +0200, Enrico Jörns wrote:
> While search queries already handled words with hyphens correctly, they
> did not do so for words with dots.
>
> To fix this, we
>
> - enhance the word tokenizer to treat both dots ('.') and hyphens ('-')
>   as valid characters within words.
>   (For robustness, explicitly exclude dots/hyphens at the start or end
>   of a word from indexing.)
> - adjust query processing to avoid splitting on dots in search input
>
> [...]

Applied, thanks!

[1/1] conf.py: improve SearchEnglish to handle terms with dots
      commit: 80084a4cabdf7f61c7e93eda8ddbd5bc7d54e041

Best regards,
diff --git a/documentation/conf.py b/documentation/conf.py
index 1eca8756a..c07b6c419 100644
--- a/documentation/conf.py
+++ b/documentation/conf.py
@@ -179,13 +179,13 @@ from sphinx.search import SearchEnglish
 from sphinx.search import languages
 class DashFriendlySearchEnglish(SearchEnglish):
 
-    # Accept words that can include hyphens
-    _word_re = re.compile(r'[\w\-]+')
+    # Accept words that can include 'inner' hyphens or dots
+    _word_re = re.compile(r'[\w]+(?:[\.\-][\w]+)*')
 
     js_splitter_code = r"""
 function splitQuery(query) {
     return query
-        .split(/[^\p{Letter}\p{Number}_\p{Emoji_Presentation}-]+/gu)
+        .split(/[^\p{Letter}\p{Number}_\p{Emoji_Presentation}\-\.]+/gu)
         .filter(term => term.length > 0);
 }
 """
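To see what the `_word_re` change does on the indexing side, the two patterns from the diff can be compared directly. This is a minimal sketch, assuming the indexer collects words by taking every match of `_word_re`; the sample sentence and the `tokenize` helper are hypothetical and not part of the patch.

```python
import re

# Patterns copied verbatim from the removed and added lines of the diff.
OLD_WORD_RE = re.compile(r'[\w\-]+')
NEW_WORD_RE = re.compile(r'[\w]+(?:[\.\-][\w]+)*')

# Hypothetical sample text, not taken from the documentation.
SAMPLE = "Edit local.conf and meta-poky, then see bblayers.conf. -v ..."


def tokenize(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return pattern.findall(text)


old_tokens = tokenize(OLD_WORD_RE, SAMPLE)
new_tokens = tokenize(NEW_WORD_RE, SAMPLE)

# Old pattern: dots split words, and a leading hyphen is kept ('-v').
print(old_tokens)
# New pattern: inner dots and hyphens are kept, leading/trailing ones dropped.
print(new_tokens)
```

With the old pattern, `local.conf` is indexed as the two words `local` and `conf`. With the new one it stays a single term, while the sentence-ending dot after `bblayers.conf` and the leading hyphen of `-v` are left out.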
While search queries already handled words with hyphens correctly, they
did not do so for words with dots.

To fix this, we

- enhance the word tokenizer to treat both dots ('.') and hyphens ('-')
  as valid characters within words.
  (For robustness, explicitly exclude dots/hyphens at the start or end
  of a word from indexing.)
- adjust query processing to avoid splitting on dots in search input

This allows search queries to correctly match terms such as
'local.conf', 'site.conf', and similar ones now.

Fixes: [YOCTO #14534]
Signed-off-by: Enrico Jörns <ejo@pengutronix.de>
---
 documentation/conf.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
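The query-side half of the change (not splitting search input on dots) can be approximated in Python. This is only a sketch: the patch itself changes the JavaScript `splitQuery` function, and Python's built-in `re` module has no `\p{...}` classes, so `\w` stands in here for `\p{Letter}\p{Number}_` and emoji are not covered. The names `split_query`, `OLD_SPLIT_RE` and `NEW_SPLIT_RE` are hypothetical.

```python
import re

# Approximations of the old and new separator classes from the JS splitter.
# Assumption: \w replaces \p{Letter}\p{Number}_ ; \p{Emoji_Presentation} is omitted.
OLD_SPLIT_RE = re.compile(r'[^\w\-]+')
NEW_SPLIT_RE = re.compile(r'[^\w\-.]+')


def split_query(pattern, query):
    """Split query on runs of separator characters and drop empty terms."""
    return [term for term in pattern.split(query) if term]


query = "local.conf meta-poky"
old_terms = split_query(OLD_SPLIT_RE, query)
new_terms = split_query(NEW_SPLIT_RE, query)

print(old_terms)  # the dot acts as a separator
print(new_terms)  # the dot is kept inside the term
```

Under this approximation the old splitter turns `local.conf` into two query terms, while the new one passes it through whole, so it can match the single `local.conf` term the tokenizer now produces.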