Follow up of #2269
The script to update the descriptions of the engines does no longer work since
PR #2269 has been merged.
searx/engines/wikipedia.py
==========================
1. There was a misusage of zh-classical.wikipedia.org:
- `zh-classical` is dedicate to classical Chinese [1] which is not
traditional Chinese [2].
- zh.wikipedia.org has LanguageConverter enabled [3] and is going to
dynamically show simplified or traditional Chinese according to the
HTTP Accept-Language header.
2. The update_engine_descriptions.py needs a list of all wikipedias. The
implementation from #2269 included only a reduced list:
- https://meta.wikimedia.org/wiki/Wikipedia_article_depth
- https://meta.wikimedia.org/wiki/List_of_Wikipedias
searxng_extra/update/update_engine_descriptions.py
==================================================
Before PR #2269 there was a match_language() function that did an approximation
using various methods. With PR #2269 there are only the types in the data model
of the languages, which can be recognized by babel. The approximation methods,
which are needed (only here) in the determination of the descriptions, must be
replaced by other methods.
[1] https://en.wikipedia.org/wiki/Classical_Chinese
[2] https://en.wikipedia.org/wiki/Traditional_Chinese_characters
[3] https://www.mediawiki.org/wiki/Writing_systems#LanguageConverter
Closes: https://github.com/searxng/searxng/issues/2330
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Since [bb3a01f8] has been merged to the Farside project, Farside instances do no
longer need to send requests to SearXNG instances [1].
There are some old unmaintained Farside instances on the web that continue to
query SearXNG instances --> we can safely block their requests.
[1] https://github.com/benbusby/farside/issues/95
[bb3a01f8] https://github.com/benbusby/farside/commit/bb3a01f8
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
When the user press [TAB] the input form should be filled with the highlighted
item from the autocomplete list, but not release a search / with other words:
what we now have by pressing once on [ENTER] should be mapped to the [TAB] key
and pressing [ENTER] once should release a search query. [1]
[1] https://github.com/searxng/searxng/issues/778#issuecomment-1016593816
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
- Update input when selecting autocomplete prediction with keyboard
- Search immediately by pressing enter key
- Search immediately by clicking on an autocomplete suggestion
Related:
- https://github.com/searxng/searxng/issues/778
On some result items from Bing-WEB the `<span class='algoSlug_icon'>` tag is the
only tag that contains a description. The issue can be reproduced by [1]::
!bi vmware
[1] https://github.com/searxng/searxng/issues/1764#issuecomment-1417990531
Reported-by: @AlyoshaVasilieva
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
This PR does no functional change it is just an attempt to make more clear in
the code, what a default category is and what a subcategory is. The previous
name 'others' leads to confusion with the **category 'other'**.
If a engine is not assigned to a category, the default is assigned::
DEFAULT_CATEGORY = 'other'
If an engine has only one category and this category is shown as tab in the user
interface, this engine has no further subgrouping::
NO_SUBGROUPING = 'without further subgrouping'
Related:
- https://github.com/searxng/searxng/issues/1604
- https://github.com/searxng/searxng/pull/1545
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
When using ``use_default_settings: true``, removing default categories from
settings.yml will not remove them from the UI.
The value ``categories_as_tabs`` is a dictionary type (a4c2cfb) and dictionary
types are merged additive by ``settings_loader.update_settings()``.
This patch replaces the default ``categories_as_tabs`` by the one from the
``user_settings``.
Related: https://github.com/searxng/searxng/issues/1019#issuecomment-1193145654
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
- requests without HTTP header 'Connection' or missing 'User-Agent' will be
blocked by the limiter
- re_bot is related to 'User-Agent' and has been renamed to block_user_agent
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Google-News returns internal links where the origin URL is encoded in a
base64 (RFC 2045 aka URL-safe) string.
Closes: https://github.com/searxng/searxng/issues/1959
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In debug mode more detailed logging is needed to evaluate if an access should
have been blocked by the limiter.
BTW: remove duplicate code checking bot signature ``re_bot.match(user_agent)``
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Since 28. March google has changed its response, this patch fixes the google
engine to scrap out the results & images from the new designed response.
closes: https://github.com/searxng/searxng/issues/2287
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
This patch replaces the *full of magic* ``utils.match_language`` function by a
``locales.match_locale``. The ``locales.match_locale`` function is based on the
``locales.build_engine_locales`` introduced in 9ae409a0 [1].
In the past SearXNG did only support a search by a language but not in a region.
This has been changed a long time ago and regions have been added to SearXNG
core but not to the engines. The ``utils.match_language`` was the function to
handle the different aspects of language/regions in SearXNG core and the
supported *languages* in the engine. The ``utils.match_language`` did it with
some magic and works good for most use cases but fails in some edge case.
To replace the concurrence of languages and regions in the SearXNG core the
``locales.build_engine_locales`` was introduced in 9ae409a0 [1]. With the last
patches all engines has been migrated to a ``fetch_traits`` and a
language/region concept that is based on ``locales.build_engine_locales``.
To summarize: there is no longer a need for the ``locales.match_language``.
[1] https://github.com/searxng/searxng/pull/1652
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>