Commit graph

47 commits

Author SHA1 Message Date
0xhtml
8b6a3f3e11 [enh] engine: mojeek - add language support
Improve region and language detection / all locale

Testing has shown the following behaviour for the different
default and empty values of Mojeek's parameters:

| param    | idx | value  | behaviour                 |
| -------- | --- | ------ | ------------------------- |
| region   |  0  | ''     | detect region based on IP |
| region   |  1  | 'none' | all regions               |
| language |  0  | ''     | all languages             |
2024-10-15 06:37:01 +02:00
Markus Heiser
11fe88bb40 [fix] update wikidata units - remove URL prefix from Q-name
Sometimes the URL prefix switches from http to https.  This patch hardens the
code that removes the URL prefix from the wikidata Q-name.  The issue has been
reported in [1].

[1] https://github.com/searxng/searxng/pull/3437#issuecomment-2082121730
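
A minimal sketch of scheme-independent prefix removal.  The helper name
``strip_entity_prefix`` and the regular expression are assumptions made for
illustration; they are not taken from the patch itself.

```python
import re

# Hypothetical pattern: matches the entity prefix with either http or https.
_ENTITY_PREFIX = re.compile(r"^https?://www\.wikidata\.org/entity/")


def strip_entity_prefix(uri: str) -> str:
    """Return the bare Q-name, whether the URI uses http or https."""
    return _ENTITY_PREFIX.sub("", uri)
```

A value without any prefix passes through unchanged, so the helper can be
applied to already cleaned Q-names as well.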

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-05-01 18:58:28 +02:00
Bnyro
46efb2f36d [feat] plugins: new unit converter plugin 2024-04-27 18:11:33 +02:00
Markus Heiser
542f7d0d7b [mod] pylint all files with one profile / drop PYLINT_SEARXNG_DISABLE_OPTION
In the past, some files were tested with the standard profile, others with a
profile in which most of the messages were switched off ... some files were not
checked at all.

- ``PYLINT_SEARXNG_DISABLE_OPTION`` has been abolished
- the distinction ``# lint: pylint`` is no longer necessary
- the pylint tasks have been reduced from three to two

  1. ./searx/engines -> lint engines with additional builtins
  2. ./searx ./searxng_extra ./tests -> lint all other python files

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-11 14:55:38 +01:00
Markus Heiser
ce4aaf6cad [mod] comprehensive revision of the searxng_extra/update/ scripts
- pylint all scripts
- fix some errors reported by pyright
- from searx.data import data_dir (Path.open)
- fix import from pygments.formatters.html

NOTE: no functional changes!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-10 15:56:50 +01:00
Markus Heiser
cff0097289 [fix] update_external_bangs: BANGS_URL 'https://duckduckgo.com/bang.js'
JSON file which contains the bangs / there is no longer a versioning of this
file.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-10 11:58:20 +01:00
Markus Heiser
50d5a9ff60 [fix] issues reported by pylint 3.1.0
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-09 09:28:13 +01:00
Markus Heiser
894f164869 [fix] sort RTL_LOCALES before written into locales.json
To avoid unnecessary changes to the file, the list should be sorted before it is
written to the file.

You can test it by calling multiple times::

    make data.locales

and searx/data/locales.json should be unchanged.
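
Why sorting helps can be sketched as follows, assuming the RTL locales are
collected in an unordered container such as a set.  The function name
``dump_locales`` is hypothetical; only the keys ``LOCALE_NAMES`` and
``RTL_LOCALES`` come from the project.

```python
import json


def dump_locales(locale_names: dict, rtl_locales) -> str:
    """Serialize the locale data; the RTL list is sorted for stable output."""
    content = {
        "LOCALE_NAMES": locale_names,
        "RTL_LOCALES": sorted(rtl_locales),  # order no longer depends on the input
    }
    return json.dumps(content, indent=2, ensure_ascii=False)
```

With the sort in place, two runs that collect the same locales in a different
order produce byte-identical JSON.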

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-02-20 12:29:13 +01:00
Alexandre Flament
ed66ed758d [mod] reduce memory footprint by not calling babel.Locale.parse at runtime
babel.Locale.parse loads more than 60MB in RAM.  The only purpose is to get:

    LOCALE_NAMES   - searx.data.LOCALES["LOCALE_NAMES"]
    RTL_LOCALES    - searx.data.LOCALES["RTL_LOCALES"]

This commit calls babel.Locale.parse when the translations are updated from
weblate and stores the result in::

    searx/data/locales.json

This file can be built by::

    ./manage data.locales

By storing these variables in searx.data when the translations are updated, we
save about 65MB per worker (usually 4 workers = 260MB of RAM saved).

Suggested-by: https://github.com/searxng/searxng/discussions/2633#discussioncomment-8490494
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-02-20 10:43:20 +01:00
dalf
14f73ef3d9 Update searx.data - update_engine_traits.py 2024-01-29 14:02:30 +01:00
Markus Heiser
3665b32aff Revert "[fix] update user agent"
This reverts commit 3c6549a17f.

Related:

- https://github.com/searxng/searxng/pull/2826
2023-12-23 07:48:38 +01:00
jazzzooo
3c6549a17f [fix] update user agent 2023-09-25 22:46:22 +02:00
jazzzooo
223b3487c3 [fix] spelling 2023-09-18 16:20:27 +02:00
Markus Heiser
935aed7ca4 [feature] dark theme for code highlighter in the result list
Closes: https://github.com/searxng/searxng/issues/1354

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-09-11 12:27:56 +02:00
Markus Heiser
0ebff871a5 [fix] update_currencies.py - AttributeError: 'str' object has no attribute 'insert'
Replace one-item lists by the item itself only after the last currency has been
added.  In this traceback 'MXN' is added to 'pesos' while pesos is no longer a
list, because the optimization was carried out too early.

    $ ./local/py3/bin/python searxng_extra/update/update_currencies.py
    Traceback (most recent call last):
      File "searxng_extra/update/update_currencies.py", line 164, in <module>
        main()
      File "searxng_extra/update/update_currencies.py", line 157, in main
        add_currency_name(db, "pesos", 'MXN')
      File "searxng_extra/update/update_currencies.py", line 89, in add_currency_name
        iso4217_set.insert(0, iso4217)
      AttributeError: 'str' object has no attribute 'insert'
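
The ordering issue can be illustrated with a simplified sketch.  The flat
``db`` layout and the helper ``collapse_single_items`` are assumptions for
illustration; the real script uses a richer data structure.

```python
def add_currency_name(db: dict, name: str, iso4217: str) -> None:
    """Register an ISO 4217 code for a name; values must still be lists here."""
    db.setdefault(name, []).insert(0, iso4217)


def collapse_single_items(db: dict) -> None:
    """Replace one-item lists by the item; call only after the last add."""
    for name, codes in db.items():
        if isinstance(codes, list) and len(codes) == 1:
            db[name] = codes[0]


db = {}
add_currency_name(db, "pesos", "ARS")
add_currency_name(db, "pesos", "MXN")  # safe: 'pesos' is still a list
add_currency_name(db, "euro", "EUR")
collapse_single_items(db)              # optimization runs last
```

Calling ``collapse_single_items`` before the second ``add_currency_name`` would
turn ``db["pesos"]`` into a string and trigger the ``AttributeError`` shown in
the traceback.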

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-28 21:21:53 +02:00
Marc Abonce Seguin
d72bfb8ef5 [fix] Israeli flag emoji in locale dropdown
🇮🇱 https://emojipedia.org/flag-israel/
2023-04-16 08:40:23 +02:00
Alexandre FLAMENT
bb5db079c7 [fix] searxng_extra/update/update_engine_descriptions.py (part 2)
Wikipedia descriptions are fetched without the help of the wikipedia engine:

* the SPARQL query returns the wikipedia URL of the article
2023-04-15 16:04:05 +02:00
Markus Heiser
27369ebec2 [fix] searxng_extra/update/update_engine_descriptions.py (part 1)
Follow up of #2269

The script to update the descriptions of the engines no longer works since
PR #2269 has been merged.

searx/engines/wikipedia.py
==========================

1. There was a misuse of zh-classical.wikipedia.org:

   - `zh-classical` is dedicated to classical Chinese [1] which is not
     traditional Chinese [2].

   - zh.wikipedia.org has LanguageConverter enabled [3] and is going to
     dynamically show simplified or traditional Chinese according to the
     HTTP Accept-Language header.

2. The update_engine_descriptions.py needs a list of all wikipedias.  The
   implementation from #2269 included only a reduced list:

   - https://meta.wikimedia.org/wiki/Wikipedia_article_depth
   - https://meta.wikimedia.org/wiki/List_of_Wikipedias

searxng_extra/update/update_engine_descriptions.py
==================================================

Before PR #2269 there was a match_language() function that did an approximation
using various methods.  With PR #2269 there are only the types in the data model
of the languages, which can be recognized by babel.  The approximation methods,
which are needed (only here) in the determination of the descriptions, must be
replaced by other methods.

[1] https://en.wikipedia.org/wiki/Classical_Chinese
[2] https://en.wikipedia.org/wiki/Traditional_Chinese_characters
[3] https://www.mediawiki.org/wiki/Writing_systems#LanguageConverter

Closes: https://github.com/searxng/searxng/issues/2330
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15 16:03:59 +02:00
Markus Heiser
16f0db4493 [mod] replace utils.match_language by locales.match_locale
This patch replaces the *full of magic* ``utils.match_language`` function by a
``locales.match_locale``.  The ``locales.match_locale`` function is based on the
``locales.build_engine_locales`` introduced in 9ae409a0 [1].

In the past SearXNG supported only a search by language but not by region.
This was changed a long time ago and regions have been added to the SearXNG
core but not to the engines.  ``utils.match_language`` was the function that
handled the different aspects of languages/regions in the SearXNG core and the
supported *languages* in the engine.  ``utils.match_language`` did it with
some magic and works well for most use cases but fails in some edge cases.

To replace the concurrence of languages and regions in the SearXNG core the
``locales.build_engine_locales`` was introduced in 9ae409a0 [1].  With the last
patches all engines have been migrated to a ``fetch_traits`` and a
language/region concept that is based on ``locales.build_engine_locales``.

To summarize: there is no longer a need for ``utils.match_language``.

[1] https://github.com/searxng/searxng/pull/1652

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
c9cd376186 [mod] replace searx.languages by searx.sxng_locales
With the language and region tags from the EngineTraitsMap the handling of
SearXNG's tags of languages and regions has been normalized and is no longer
a *mystery*.  The "languages" became "locales" that are supported by babel and
by this, the update_engine_traits.py can be simplified a lot.

Other code places can be simplified as well, but these simplifications
should (respectively can) only be done when none of the engines work with the
deprecated EngineTraits.supported_languages interface anymore.

This commit replaces searx.languages by searx.sxng_locales and renames some
identifiers from "language" to "locale" (e.g. language_codes --> sxng_locales).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
6e5f22e558 [mod] replace engines_languages.json by engines_traits.json
Implementations of the *traits* of the engines.

Engine's traits are fetched from the origin engine and stored in a JSON file in
the *data folder*.  Most often traits are languages and region codes and their
mapping from SearXNG's representation to the representation in the origin search
engine.

To load traits from the persistence::

    searx.enginelib.traits.EngineTraitsMap.from_data()

For new traits new properties can be added to the class::

    searx.enginelib.traits.EngineTraits

.. hint::

   Implementation is downward compatible to the deprecated *supported_languages
   method* from the vintage implementation.

   The vintage code is tagged as *deprecated* and can be removed when all engines
   have been ported to the *traits method*.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
9a710587e8 [fix] remove usage of deprecated-module distutils
Closes: https://github.com/searxng/searxng/issues/2168

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-10 15:31:54 +01:00
Markus Heiser
4c06837a50 [mod] make python code pylint 2.16.1 compliant
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-10 13:59:21 +01:00
ArtikusHG
1f8f8c1e91 Replace langdetect with fasttext 2022-12-16 21:07:39 +02:00
Alexandre Flament
e473addaff User agent: don't include the patch number in the Firefox version
The Firefox version in the user agent doesn't include the patch version: 106.0 not 106.0.2
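
A minimal sketch of dropping the patch number.  The function name
``major_minor`` is hypothetical and not taken from the commit.

```python
def major_minor(version: str) -> str:
    """Keep only the first two components of a dotted version string."""
    return ".".join(version.split(".")[:2])
```

A version that already has only two components is returned unchanged.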

Close #1914
2022-11-05 22:04:37 +01:00
Markus Heiser
9933155a2e [fix] update_osm_keys_tags.py: sort JSON dump
To get a meaningful diff, the keys in the JSON dump need to be sorted.
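
One way to get a sorted dump with the standard library is ``sort_keys=True``.
This is a sketch: the function name ``dump_sorted`` and the remaining
arguments are assumptions, not necessarily what the script uses.

```python
import json


def dump_sorted(data: dict) -> str:
    """Serialize with sorted keys so that insertion order does not leak into the file."""
    return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False)
```

``sort_keys`` applies to nested dictionaries as well, so the whole document is
stable from one run to the next.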

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-10-11 11:45:26 +02:00
Markus Heiser
ba8959ad7c [fix] typos / reported by @kianmeng in searx PR-3366
[PR-3366] https://github.com/searx/searx/pull/3366

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-27 18:32:14 +02:00
Alexandre Flament
578b2a8183 fix searxng_extra/update/update*.py scripts
call searx.locales.locales_initialize before using LOCALE_NAMES

Related to https://github.com/searxng/searxng/pull/1306
2022-07-02 12:16:00 +02:00
Markus Heiser
e8541b6006 [theme] peel out oscar from SearXNG development
This is the first step of removing oscar theme

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-04-30 13:20:27 +02:00
Markus Heiser
62982c8812 [fix] add back missing languages & regions (followup of PR #1071)
In PR #1071 the language catalog of dailymotion has been cleaned up, before
there had been over 7000 "languages" in the catalog.

As a side effect of this clean-up the language & region catalog in SearXNG has
been reduced [1].

This patch reduces the ``min_engines_per_lang`` from 13 to 12 to get the missing
languages back into the language & region catalog of SearXNG.

[1] 3bb62823ec (diff-f3f00db0f87f95b882624a192e0aac21525638af0b18c9514e765fcf1991678d)

Requested-by: @tiekoetter in a Matrix chat
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-04-22 12:09:42 +02:00
Markus Heiser
effcde3d0e [fix] add missing territory (country) name
Related-to: https://github.com/searxng/searxng/pull/1029#issuecomment-1086824911
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-04-05 16:48:25 +02:00
Alexandre Flament
0379856712
Merge pull request #967 from return42/language-filter
[mod] add flags to the languages filter
2022-03-28 21:36:20 +02:00
Markus Heiser
34fd2021d8 [fix] pylint issue in py3.10
searxng_extra/update/update_firefox_version.py:16:0: W0402:
Uses of a deprecated module 'distutils.version' (deprecated-module)

[1] https://github.com/searxng/searxng/pull/1007

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-25 08:39:40 +01:00
Markus Heiser
2e4557f3f3 [fix] languages: show country name even if there is only one country
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-19 16:45:14 +01:00
Markus Heiser
a25e3767d4 [fix] don't show flags for languages without region identifier
SearXNG shows two different things:

region:
  "de-CH" is the equivalent of "Schweiz (de)" in DDG.

languages:
  "en" doesn't say anything about the location. It is up to the engines to do their
  best to select English results without a region.
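
The distinction can be sketched with a helper that derives a flag only from
tags that carry a region.  The name ``flag_for`` and the parsing rules are
assumptions for illustration; the flag itself is built from Unicode regional
indicator symbols.

```python
def flag_for(tag: str) -> str:
    """Return a flag emoji for tags like 'de-CH', or '' for plain languages."""
    parts = tag.split("-")
    if len(parts) < 2:
        return ""  # a language such as "en" says nothing about the location
    region = parts[-1].upper()
    if len(region) != 2 or not (region.isascii() and region.isalpha()):
        return ""
    # Map A..Z onto the regional indicator symbols starting at U+1F1E6.
    return "".join(chr(0x1F1E6 + ord(char) - ord("A")) for char in region)
```

With this sketch ``flag_for("de-CH")`` yields the Swiss flag while
``flag_for("en")`` yields an empty string.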

Suggested-by: @dalf https://github.com/searxng/searxng/pull/967#issuecomment-1072979693
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-19 15:09:13 +01:00
Markus Heiser
2841abaf55 [mod] add flags to the languages filter
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-19 15:09:13 +01:00
Markus Heiser
7cdd31440e [fix] external bangs: don't overwrite Bangs in data trie
Bangs with a `*` suffix (e.g. `!!d*`) overwrite Bangs with the same
prefix (e.g. `!!d`) [1].  This can be avoided when a non-printable character is
used to tag a LEAF_KEY.

[1] https://github.com/searxng/searxng/pull/740#issuecomment-1010411888
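
A simplified trie sketch shows why a non-printable leaf marker avoids the
collision.  The value ``chr(16)`` and the helpers ``add_bang`` and ``get_bang``
are assumptions for illustration and do not mirror the project's data layout.

```python
LEAF_KEY = chr(16)  # assumption: a character that cannot occur inside a bang


def add_bang(trie: dict, bang: str, url: str) -> None:
    """Insert a bang character by character and tag its end with LEAF_KEY."""
    node = trie
    for char in bang:
        node = node.setdefault(char, {})
    node[LEAF_KEY] = url


def get_bang(trie: dict, bang: str):
    """Return the URL stored for exactly this bang, or None."""
    node = trie
    for char in bang:
        node = node.get(char)
        if node is None:
            return None
    return node.get(LEAF_KEY)
```

Because ``*`` is an ordinary character while the leaf marker is not, ``d`` and
``d*`` end up in different slots of the same branch.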

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-12 19:37:13 +01:00
Markus Heiser
295876abaa [pylint] add scripts from searxng_extra/update to pylint
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-05 16:09:40 +01:00
Markus Heiser
ffea5d8ef5 [docs] add documentation for the scripts in searxng_extra/update
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-05 16:09:40 +01:00
Markus Heiser
8191e1a253 [fix] update_languages.py: generate code that passes CI
The file searx/languages.py, created by update_languages.py, has to pass the
quality check from CI::

    make format.python

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-01 18:32:21 +01:00
Markus Heiser
8a07559ab5 [fix] update_languages.py: no exception on unknown locale & language
Fix exception handling of unknown locales and languages::

    ERROR: ca_ES_valencia --> [Errno 2] No such file or directory: 'local/py3/lib/python3.8/site-packages/babel/locale-data/ca_ES_valencia.dat'
    ERROR: languages['fil-PH'] --> {'name': None, 'english_name': None}
    ERROR: languages['nb-NO'] --> {'name': None, 'english_name': None}

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-01 17:31:38 +01:00
Markus Heiser
3d96a9839a [format.python] initial formatting of the python code
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:26:22 +01:00
Markus Heiser
fcdc2c2cd2 [format.python] disable py code formatting for some hunks of code
Disable the python code formatting from python-black, where the readability of
code suffers by formatting.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:16:03 +01:00
Alexandre Flament
56e6d19b48
update_firefox_version.py: update user agent signature
The user agent from Windows is
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0

See https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent/Firefox#windows
2021-12-16 23:10:39 +01:00
Alexandre Flament
828088fa5a [mod] update_languages: min_engines_per_country=7
A (language, country) tuple is included if 7 engines have it; the threshold was 10 before.
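
The threshold rule can be sketched as a simple count.  The function name
``filter_locales`` and the input layout (engine name mapped to its supported
tuples) are assumptions for illustration, not the script's actual code.

```python
from collections import Counter


def filter_locales(engine_locales: dict, min_engines: int = 7) -> set:
    """Keep the (language, country) tuples supported by at least min_engines engines."""
    counts = Counter()
    for locales in engine_locales.values():
        counts.update(set(locales))  # count each engine at most once per tuple
    return {locale for locale, n in counts.items() if n >= min_engines}
```

Lowering ``min_engines`` admits tuples that fewer engines support, which is the
effect of going from 10 to 7.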

close #432
2021-10-26 12:13:23 +02:00
Markus Heiser
955eab8240 [mod] searxng_extras - minor improvements
- fix docs/searxng_extra/standalone_searx.py.rst
- add SPDX tag
- pylint standalone_searx.py and update_wikidata_units.py

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-03 19:04:18 +02:00
Alexandre Flament
1bb82a6b54 SearXNG: searxng_extra 2021-10-02 17:30:39 +02:00