Commit graph

106 commits

Author SHA1 Message Date
ArtikusHG
1f8f8c1e91 Replace langdetect with fasttext 2022-12-16 21:07:39 +02:00
Markus Heiser
ba8959ad7c [fix] typos / reported by @kianmeng in searx PR-3366
[PR-3366] https://github.com/searx/searx/pull/3366

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-27 18:32:14 +02:00
Alexandre Flament
2babf59adc [fix] pyright reported errors
These errors make pyright useless, since a new error would go unnoticed among them [1].

[1] https://github.com/searxng/searxng/pull/1569

```
  searx/compat.py:11:27 - error: Expression of type "Type[cached_property[_T@cached_property]]" cannot be assigned to declared type "Type[cached_property]"
    "Type[cached_property[_T@cached_property]]" is incompatible with "Type[cached_property]"
    Type "Type[cached_property[_T@cached_property]]" cannot be assigned to type "Type[cached_property]" (reportGeneralTypeIssues)
  searx/utils.py:69:36 - error: Expression of type "None" cannot be assigned to parameter of type "str"
    Type "None" cannot be assigned to type "str" (reportGeneralTypeIssues)
  searx/utils.py:573:85 - error: Expression of type "None" cannot be assigned to parameter of type "int"
    Type "None" cannot be assigned to type "int" (reportGeneralTypeIssues)
  searx/webapp.py:1306:22 - error: Argument of type "str" cannot be assigned to parameter "__a" of type "BytesPath" in function "join"
    Type "str" cannot be assigned to type "BytesPath"
      "str" is incompatible with "bytes"
      "str" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:68 - error: Argument of type "Literal['themes']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "Literal['themes']" cannot be assigned to type "BytesPath"
      "Literal['themes']" is incompatible with "bytes"
      "Literal['themes']" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:78 - error: Argument of type "str | Any | None" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "str | Any | None" cannot be assigned to type "BytesPath"
      Type "str" cannot be assigned to type "BytesPath"
        "str" is incompatible with "bytes"
        "str" is incompatible with protocol "PathLike[bytes]"
          "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:85 - error: Argument of type "Literal['img']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "Literal['img']" cannot be assigned to type "BytesPath"
      "Literal['img']" is incompatible with "bytes"
      "Literal['img']" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/engines/mongodb.py:8:6 - warning: Import "pymongo" could not be resolved (reportMissingImports)
  searx/engines/mysql_server.py:9:8 - warning: Import "mysql.connector" could not be resolved (reportMissingImports)
  searx/engines/postgresql.py:9:8 - warning: Import "psycopg2" could not be resolved from source (reportMissingModuleSource)
  searx/engines/xpath.py:187:28 - warning: "categories" is not defined (reportUndefinedVariable)
  searx/search/__init__.py:184:82 - warning: "flask" is not defined (reportUndefinedVariable)
  searx/search/checker/background.py:19:26 - error: Type of "schedule" is partially unknown
    Type of "schedule" is "(delay: Any, func: Any, *args: Any) -> Literal[True]" (reportUnknownVariableType)
  searx/shared/__init__.py:8:12 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
  searx/shared/shared_uwsgi.py:5:8 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
```
2022-07-30 18:04:44 +02:00
Markus Heiser
2de007138c [fix] prepare for pylint 2.14.0
Remove issues reported by Pylint 2.14.0:

- no-self-use: has been moved to optional extension [1]
- The refactoring checker now also raises 'consider-using-generator' messages
  for max(), min() and sum(). [2]

.pylintrc:
  - <option name>-hint was removed long ago; Pylint 2.14.0 raises an
    error on invalid options
  - bad-continuation and bad-whitespace have been removed [3]

[1] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/summary.html#removed-checkers
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/full.html#what-s-new-in-pylint-2-14-0
[3] https://pylint.pycqa.org/en/latest/whatsnew/2/2.6/summary.html#summary-release-highlights

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-06-03 15:41:52 +02:00
Markus Heiser
cf644b413e [test.pyright] suppress unneeded error & warning messages
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-04-22 11:16:41 +02:00
Alexandre Flament
4224607c62 searx.utils.html_to_text: replace <br/> by a space 2022-04-16 09:45:57 +02:00
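The effect of this change can be pictured with a minimal standard-library sketch. It is not the actual searx.utils code: the class name `_TextExtractor` is hypothetical, and the sketch only assumes that a `<br>` tag should become a space so that adjacent lines do not run together.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects text and turns <br> into a space."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # <br> and <br/> both end up here (the latter via handle_startendtag)
        if tag == "br":
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html_str):
    """Return the text content of html_str with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html_str)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

Without the space, `first line<br/>second line` would collapse into `first linesecond line`.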
Alexandre Flament
2d5929cc59 [mod] searx.utils: more typing 2022-01-30 22:14:12 +01:00
Alexandre Flament
0eacc46ee3 [mod] add documentation about searx.utils
This module is a toolbox for the engines.
It should be documented.

In addition, searx/utils.py is checked by pylint.
2022-01-29 22:49:42 +01:00
Markus Heiser
3d96a9839a [format.python] initial formatting of the python code
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:26:22 +01:00
Marc Abonce Seguin
66b7be0965 [fix] fix match_language issue to make zh-TW match to zh-Hant-TW
pybabel separates locales with underscores but we use hyphens
everywhere babel doesn't directly touch
2021-10-12 21:06:20 +02:00
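The separator mismatch described in this commit can be illustrated with two small conversion helpers. Both function names are hypothetical and are not taken from the searx code base; the sketch only shows converting between the underscore form and the hyphen form of a locale code.

```python
def to_hyphenated(locale_code):
    """Convert an underscore-separated locale code to the hyphenated form."""
    return locale_code.replace("_", "-")


def to_underscored(locale_code):
    """Convert a hyphenated locale code to the underscore-separated form."""
    return locale_code.replace("-", "_")
```

For example, `to_hyphenated("zh_Hant_TW")` gives `"zh-Hant-TW"`, which can then be compared against hyphenated codes such as `zh-TW`.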
Markus Heiser
de0249ddae [fix] don't mix loaded modules with imported modules (sys.modules)
The utils.load_module() function is used to load a python file (aka module) and
return the module's namespace.  SearXNG uses this function to load *engines and
answerers* from arbitrary locations with arbitrary modifications.  These are not
real python modules and it is not intended to mix these *engines and answerers*
with the python modules registered in sys.modules.

Closes: https://github.com/searxng/searxng/issues/312
Suggested-by: @dalf in https://github.com/searxng/searxng/issues/312
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-06 18:45:00 +02:00
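A minimal sketch of loading a file as a module without registering it in sys.modules, using importlib from the standard library. The signature `load_module(filename, module_dir)` is an assumption made for illustration and may differ from the real searx.utils function.

```python
import importlib.util
from pathlib import Path


def load_module(filename, module_dir):
    """Execute a Python file and return its namespace as a module object.

    The module is deliberately NOT added to sys.modules, so it cannot
    shadow or be confused with regularly imported modules.
    """
    modname = Path(filename).stem
    filepath = Path(module_dir) / filename
    spec = importlib.util.spec_from_file_location(modname, filepath)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module from {filepath}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Because neither `module_from_spec` nor `exec_module` touches sys.modules, two engines with the same file name in different directories can be loaded side by side.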
Alexandre Flament
697ebeddcc [mod] searx.utils.dict_subset: rewrite with comprehension 2021-08-24 15:28:08 +02:00
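A dict comprehension for this kind of helper might look like the sketch below. The parameter names are assumptions, and the real function may treat missing keys differently.

```python
def dict_subset(dictionary, properties):
    """Return a new dict with only the keys listed in properties.

    Keys that are absent from dictionary are skipped silently.
    """
    return {key: dictionary[key] for key in properties if key in dictionary}
```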
Alexandre Flament
4b43775c91 version based on the git repository
This commit removes the need to update the brand for GIT_URL and GIT_BRANCH:
they are read from the git repository.

It is possible to call python -m searx.version freeze to freeze the current version.
Useful when the code is installed outside git (distro package, docker, etc...)
2021-07-30 14:40:09 +02:00
Alexandre Flament
92c8a8829f [fix] strip spaces from searx user agent
h11 (used by httpx) rejects HTTP requests with a trailing space in HTTP headers
2021-06-09 18:08:23 +02:00
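The trailing space typically appears when an optional suffix is empty. A hedged sketch follows, with a hypothetical function name and hypothetical parameters that are not taken from the searx source:

```python
def build_user_agent(version, suffix=""):
    """Build a User-Agent value; strip() removes the space left by an empty suffix."""
    return f"searx/{version} {suffix}".strip()
```

With an empty suffix the unstripped value would be `"searx/1.0.0 "`, which is the kind of header value the commit message says h11 rejects.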
Alexandre Flament
4b07df62e5 [mod] move all default settings into searx.settings_defaults 2021-06-01 08:10:15 +02:00
Markus Heiser
96b223023a [mod] utils.get_value() - avoidance of a recursion
In a comment [1] dalf suggested to avoid a recursion of get_value()

[1] https://github.com/searxng/searxng/pull/99#discussion_r640833716

Suggested-by: Alexandre Flament <alex@al-f.net>
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-28 08:32:52 +02:00
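A nested lookup without recursion can be written as a simple loop over the keys. This is a sketch under the assumption that the function takes a dict, a sequence of keys, and a keyword-only default; the real signature may differ.

```python
def get_value(dictionary, *keys, default=None):
    """Walk nested dicts key by key; return default as soon as a key is missing."""
    current = dictionary
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

For example, `get_value({"a": {"b": 1}}, "a", "b")` returns `1`, while a missing key anywhere along the path yields the default.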
Markus Heiser
6ed4616da9 [enh] add settings option to enable/disable search formats
Access to formats can be denied by settings configuration::

    search:
        formats: [html, csv, json, rss]

Closes: https://github.com/searxng/searxng/issues/95
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-28 08:32:52 +02:00
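How such a setting could be checked at request time is sketched below. The helper name `is_format_allowed` and the in-memory `settings` dict are assumptions for illustration; only the `search` / `formats` keys come from the commit message.

```python
# Settings as they might look after the YAML above has been loaded
settings = {"search": {"formats": ["html", "csv", "json", "rss"]}}


def is_format_allowed(output_format, settings):
    """Return True if output_format is listed under search.formats."""
    return output_format in settings["search"]["formats"]
```

A deployment that lists only `html` would then deny `json`, `csv` and `rss` requests.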
Alexandre Flament
eaa694fb7d [enh] replace requests by httpx 2021-04-10 15:38:33 +02:00
Alexandre Flament
3f8ebf70b1 [fix] pylint: use "raise ... from ..." 2020-12-20 09:46:53 +01:00
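The pattern pylint asks for here (its raise-missing-from check) chains the new exception to the original one. A small illustration with hypothetical names, not code from searx:

```python
class EngineConfigError(Exception):
    """Hypothetical error type used only for this illustration."""


def parse_timeout(raw):
    """Convert raw to float, chaining the original error on failure."""
    try:
        return float(raw)
    except ValueError as exc:
        # "from exc" stores the ValueError in __cause__ and keeps it in the traceback
        raise EngineConfigError(f"invalid timeout: {raw!r}") from exc
```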
Alexandre Flament
de887c6347 [mod] bing_news: use eval_xpath_getindex
remove unused function searx.utils.list_get
2020-12-03 10:22:48 +01:00
Alexandre Flament
1d0c368746 [enh] record details exception per engine
add a new API /stats/errors
2020-12-03 10:22:48 +01:00
Alexandre Flament
b00d108673 [mod] pylint: numerous minor code fixes 2020-12-01 15:21:19 +01:00
Alexandre Flament
3038052c79 [mod] remove unused import
use
from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url  # NOQA
so that all unused imports can easily be removed using autoflake:
autoflake --in-place --recursive --remove-all-unused-imports searx tests
2020-11-14 14:11:02 +01:00
Alexandre Flament
ca593728af [mod] duckduckgo_definitions: display only user friendly attributes / URL
various bug fixes
2020-10-28 08:09:25 +01:00
Alexandre Flament
a9dc54bebc [mod] Add searx.data module
Instead of loading the data/*.json in different locations,
load these files in the new searx.data module.
2020-10-07 10:29:34 +02:00
Alexandre Flament
15013e64d8 [fix] drop Python 2: use importlib instead of imp.load_source
imp.load_source is not documented in Python 3
see documentation : https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly

partial fix of https://github.com/searx/searx/issues/1674
2020-10-06 09:42:11 +02:00
Alexandre Flament
8f914a28fa [mod] searx.utils.normalize_url: remove Yahoo hack
* The hack for Yahoo URLs is not necessary anymore. (see searx.engines.yahoo.parse_url)
* move the URL normalization in extract_url to normalize_url
2020-10-03 10:02:50 +02:00
Alexandre Flament
c1d10bde02 [mod] searx/utils.py: add docstring 2020-10-02 18:17:01 +02:00
Alexandre Flament
2006eb4680 [mod] move extract_text, extract_url to searx.utils 2020-10-02 18:13:56 +02:00
Alexandre Flament
ad0758e52a [mod] add searx/webutils.py
contains utility functions and classes used only by webapp.py
2020-09-22 11:57:06 +02:00
Alexandre Flament
6deb85072a [fix] searx.utils.HTMLTextExtractor: invalid HTML doesn't raise an Exception
Close #2188
2020-09-13 10:28:11 +02:00
Alexandre Flament
bdac99d4f0 Drop Python 2 (5/n): searx.utils.is_valid_lang, input parameter is a str instead of bytes
Fix bug in translated.py and dictzone.py
2020-09-10 10:49:42 +02:00
Dalf
c225db45c8 Drop Python 2 (4/n): SearchQuery.query is a str instead of bytes 2020-09-10 10:49:42 +02:00
Dalf
1022228d95 Drop Python 2 (1/n): remove unicode string and url_utils 2020-09-10 10:39:04 +02:00
Dalf
85b3723345 [mod] speed optimization
compile XPath only once
avoid redundant call to urlparse
get_locale(webapp.py): avoid useless call to request.accept_languages.best_match
2019-11-15 09:33:15 +01:00
Noémi Ványi
5796dc60c9 fix pep 8 check 2019-10-16 15:52:48 +02:00
Noémi Ványi
a6f20caf32 add initial support for offline engines && command engine 2019-10-16 15:52:48 +02:00
Adam Tauber
72459b246b [fix] convert bytes type to string in language detection (fixes dictzone) 2019-10-16 14:52:57 +02:00
Alexandre Flament
2179079a91 [fix] fix flickr_noapi decoding (#1655)
Characters that were not ASCII were incorrectly decoded.
Add a helper function: searx.utils.ecma_unescape (Python implementation of the JavaScript unescape function).
2019-08-02 13:37:13 +02:00
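A Python counterpart of JavaScript's unescape can be sketched with a single regular expression that handles both `%uXXXX` and `%XX` sequences in one pass. This is an illustration of the idea, not necessarily the code that was committed.

```python
import re

# %uXXXX (four hex digits) or %XX (two hex digits), matched in a single pass
_ECMA_UNESCAPE_RE = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def ecma_unescape(string):
    """Decode %uXXXX and %XX escape sequences into the corresponding characters."""

    def _replace(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _ECMA_UNESCAPE_RE.sub(_replace, string)
```

Using one pattern rather than two successive substitutions avoids decoding twice: `%u0025` followed by `41` stays `%41` instead of turning into `A`.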
Dalf
7e201cbf65 [mod] use cache in _match_language function to speed up searx start time significantly 2019-07-19 08:58:08 +02:00
rachmadani haryono
ec88fb8a0f [fix] secret_key can be bytes instead of a string (#1602)
Fix #1600
In settings.yml, the secret_key can be written as string or as base64 encoded data using !!binary notation.
2019-07-17 10:09:09 +02:00
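Accepting both forms usually comes down to normalizing the key to bytes before it is used. The following sketch uses a hypothetical signature and assumes UTF-8 encoding for string keys; the real code may handle this differently.

```python
import hashlib
import hmac


def new_hmac(secret_key, url):
    """Return a hex HMAC-SHA256 of url (bytes); secret_key may be str or bytes."""
    if isinstance(secret_key, bytes):
        key = secret_key
    else:
        # assumption: string keys are encoded as UTF-8
        key = secret_key.encode("utf-8")
    return hmac.new(key, url, hashlib.sha256).hexdigest()
```

A key written as a plain string and the same key supplied as bytes (for example via YAML's `!!binary` notation) then produce the same digest.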
Alex
50c836864a fetch_firefox_version.py : compatible with Python 3 and minor fixes. 2018-08-05 10:55:42 +02:00
Alexandre Flament
066bd916bf [mod] fetch firefox versions in a standalone script 2018-08-05 10:10:15 +02:00
Adam Tauber
d51732c0e5 Merge pull request #1303 from MarcAbonce/bing
Fix bing "garbage" results
2018-07-09 11:00:37 +02:00
Marc Abonce Seguin
c7000cd1df [fix] update user agent versions
this fixes duckduckgo error response
2018-06-23 16:24:06 -05:00
Adam Tauber
aef2b07969 [fix] add basestring for py3 2018-06-14 11:48:31 +02:00
Marc Abonce Seguin
75b276f408 fix bing "garbage" results (issue #1275) 2018-05-20 18:13:32 -05:00
Marc Abonce Seguin
772c048d01 refactor engine's search language handling
Add match_language function in utils to match any user given
language code with a list of engine's supported languages.

Also add language_aliases dict on each engine to translate
standard language codes into the custom codes used by the engine.
2018-03-27 00:08:03 -06:00
Adam Tauber
0969e50c5b [fix] convert json engine result attributes to string - closes #1006 2017-12-01 20:54:12 +01:00
Adam Tauber
b5071fea6a [fix] remove trailing 0x00 from csv output 2017-11-21 16:58:51 +01:00