Commit graph

101 commits

Author SHA1 Message Date
Alexandre Flament
4224607c62 searx.utils.html_to_text: replace <br/> by a space 2022-04-16 09:45:57 +02:00
Alexandre Flament
2d5929cc59 [mod] searx.utils: more typing 2022-01-30 22:14:12 +01:00
Alexandre Flament
0eacc46ee3 [mod] add documentation about searx.utils
This module is a toolbox for the engines.
Is should be documented.

In addition, searx/utils.py is checked by pylint.
2022-01-29 22:49:42 +01:00
Markus Heiser
3d96a9839a [format.python] initial formatting of the python code
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:26:22 +01:00
Marc Abonce Seguin
66b7be0965 [fix] fix match_language issue to make zh-TW match to zh-Hant-TW
pybabel separates locales with underscores but we use hyphens
everywhere babel doesn't directly touch
2021-10-12 21:06:20 +02:00
Markus Heiser
de0249ddae [fix] don't mix loaded modules with imported modules (sys.modules)
The utils.load_module() function is used to load a python file (aka module) and
return the module's namespace.  SearXNG uses this function to load *engines and
answerers* from arbitrary locations with arbitrary modifications.  These are not
real python modules and it is not intended to mix this *engines and answerers*
with the python modules registered in sys.modules.

Closes: https://github.com/searxng/searxng/issues/312
Suggested-by: @dalf in https://github.com/searxng/searxng/issues/312
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-06 18:45:00 +02:00
Alexandre Flament
697ebeddcc [mod] searx.utils.dict_subset: rewrite with comprehension 2021-08-24 15:28:08 +02:00
Alexandre Flament
4b43775c91 version based on the git repository
This commit remove the need to update the brand for GIT_URL and GIT_BRANCH:
there are read from the git repository.

It is possible to call python -m searx.version freeze to freeze the current version.
Useful when the code is installed outside git (distro package, docker, etc...)
2021-07-30 14:40:09 +02:00
Alexandre Flament
92c8a8829f [fix] strip spaces from searx user agent
h11 (used by httpx) rejects HTTP request with a trailing space in HTTP headers
2021-06-09 18:08:23 +02:00
Alexandre Flament
4b07df62e5 [mod] move all default settings into searx.settings_defaults 2021-06-01 08:10:15 +02:00
Markus Heiser
96b223023a [mod] utils.get_value() - avoidance of a recursion
In a comment [1] dalf suggested to avoid a recursion of get_value()

[1] https://github.com/searxng/searxng/pull/99#discussion_r640833716

Suggested-by: Alexandre Flament <alex@al-f.net>
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-28 08:32:52 +02:00
Markus Heiser
6ed4616da9 [enh] add settings option to enable/disable search formats
Access to formats can be denied by settings configuration::

    search:
        formats: [html, csv, json, rss]

Closes: https://github.com/searxng/searxng/issues/95
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-28 08:32:52 +02:00
Alexandre Flament
eaa694fb7d [enh] replace requests by httpx 2021-04-10 15:38:33 +02:00
Alexandre Flament
3f8ebf70b1 [fix] pylint: use "raise ... from ..." 2020-12-20 09:46:53 +01:00
Alexandre Flament
de887c6347 [mod] bing_news: use eval_xpath_getindex
remove unused function searx.utils.list_get
2020-12-03 10:22:48 +01:00
Alexandre Flament
1d0c368746 [enh] record details exception per engine
add an new API /stats/errors
2020-12-03 10:22:48 +01:00
Alexandre Flament
b00d108673 [mod] pylint: numerous minor code fixes 2020-12-01 15:21:19 +01:00
Alexandre Flament
3038052c79 [mod] remove unused import
use
from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url  # NOQA
so it is possible to easily remove all unused import using autoflake:
autoflake --in-place --recursive --remove-all-unused-imports searx tests
2020-11-14 14:11:02 +01:00
Alexandre Flament
ca593728af [mod] duckduckgo_definitions: display only user friendly attributes / URL
various bug fixes
2020-10-28 08:09:25 +01:00
Alexandre Flament
a9dc54bebc [mod] Add searx.data module
Instead of loading the data/*.json in different location,
load these files in the new searx.data module.
2020-10-07 10:29:34 +02:00
Alexandre Flament
15013e64d8 [fix] drop Python 2: use importlib instead of imp.load_source
imp.load_source is not documented in Python 3
see documentation : https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly

partial fix of https://github.com/searx/searx/issues/1674
2020-10-06 09:42:11 +02:00
Alexandre Flament
8f914a28fa [mod] searx.utils.normalize_url: remove Yahoo hack
* The hack for Yahoo URLs is not necessary anymore. (see searx.engines.yahoo.parse_url)
* move the URL normalization in extract_url to normalize_url
2020-10-03 10:02:50 +02:00
Alexandre Flament
c1d10bde02 [mod] searx/utils.py: add docstring 2020-10-02 18:17:01 +02:00
Alexandre Flament
2006eb4680 [mod] move extract_text, extract_url to searx.utils 2020-10-02 18:13:56 +02:00
Alexandre Flament
ad0758e52a [mod] add searx/webutils.py
contains utility functions and classes used only by webapp.py
2020-09-22 11:57:06 +02:00
Alexandre Flament
6deb85072a [fix] searx.utils.HTMLTextExtractor: invalid HTML don't raise an Exception
Close #2188
2020-09-13 10:28:11 +02:00
Alexandre Flament
bdac99d4f0 Drop Python 2 (5/n): searx.utils.is_valid_lang, input parameter is a str instead of bytes
Fix bug in translated.py and dictzone.py
2020-09-10 10:49:42 +02:00
Dalf
c225db45c8 Drop Python 2 (4/n): SearchQuery.query is a str instead of bytes 2020-09-10 10:49:42 +02:00
Dalf
1022228d95 Drop Python 2 (1/n): remove unicode string and url_utils 2020-09-10 10:39:04 +02:00
Dalf
85b3723345 [mod] speed optimization
compile XPath only once
avoid redundant call to urlparse
get_locale(webapp.py): avoid useless call to request.accept_languages.best_match
2019-11-15 09:33:15 +01:00
Noémi Ványi
5796dc60c9 fix pep 8 check 2019-10-16 15:52:48 +02:00
Noémi Ványi
a6f20caf32 add initial support for offline engines && command engine 2019-10-16 15:52:48 +02:00
Adam Tauber
72459b246b [fix] convert bytes type to string in language detection (fixes dictzone) 2019-10-16 14:52:57 +02:00
Alexandre Flament
2179079a91
[fix] fix flickr_noapi decoding (#1655)
Characters that were not ASCII were incorrectly decoded.
Add an helper function: searx.utils.ecma_unescape (Python implementation of unescape Javascript function).
2019-08-02 13:37:13 +02:00
Dalf
7e201cbf65 [mod] use cache in _match_language function to speed up searx start time significantly 2019-07-19 08:58:08 +02:00
rachmadani haryono
ec88fb8a0f [fix] secret_key can be bytes instead of a string (#1602)
Fix #1600
In settings.yml, the secret_key can be written as string or as base64 encoded data using !!binary notation.
2019-07-17 10:09:09 +02:00
Alex
50c836864a fetch_firefox_version.py : compatible with Python 3 and minor fixes. 2018-08-05 10:55:42 +02:00
Alexandre Flament
066bd916bf [mod] fetch firefox versions in a standalone script 2018-08-05 10:10:15 +02:00
Adam Tauber
d51732c0e5
Merge pull request #1303 from MarcAbonce/bing
Fix bing "garbage" results
2018-07-09 11:00:37 +02:00
Marc Abonce Seguin
c7000cd1df [fix] update user agent versions
this fixes duckduckgo error response
2018-06-23 16:24:06 -05:00
Adam Tauber
aef2b07969 [fix] add basestring for py3 2018-06-14 11:48:31 +02:00
Marc Abonce Seguin
75b276f408 fix bing "garbage" results (issue #1275) 2018-05-20 18:13:32 -05:00
Marc Abonce Seguin
772c048d01 refactor engine's search language handling
Add match_language function in utils to match any user given
language code with a list of engine's supported languages.

Also add language_aliases dict on each engine to translate
standard language codes into the custom codes used by the engine.
2018-03-27 00:08:03 -06:00
Adam Tauber
0969e50c5b [fix] convert json engine result attributes to string - closes #1006 2017-12-01 20:54:12 +01:00
Adam Tauber
b5071fea6a [fix] remove trailing 0x00 from csv output 2017-11-21 16:58:51 +01:00
Adam Tauber
3d6c67951a [fix] resurrect csv output in py2 2017-11-21 16:51:45 +01:00
Noémi Ványi
e73cb14889 fix hmac python3 compatibility 2017-09-08 21:33:11 +02:00
misnyo
33fd938016 [mod] int_or_zero refactored to searx_utils 2017-09-04 20:05:04 +02:00
potato
9b82cb1908 [fix] is_valid_lang fixed for new languages.py + dictzone engine encoding 2017-06-25 18:29:19 +02:00
Alexandre Flament
9c91ab33f8 [mod] settings.yml can be /etc/searx/settings.yml
The exact order is
* first from SEARX_SETTINGS_PATH,
* if not found then from searx code base,
* if not found then from /etc/searx/settings.yml
* if not found an exception stops searx loading
2017-05-15 22:19:42 +02:00