2596 commits

Author SHA1 Message Date
Markus Heiser
7f505bdc6f [fix] google: avoid unnecessary SearxEngineXPathException errors
Avoid SearxEngineXPathException errors when parsing invalid results::

    .//div[@class="yuRUbf"]//a/@href index 0 not found
    Traceback (most recent call last):
      File "./searx/engines/google.py", line 274, in response
        url = eval_xpath_getindex(result, href_xpath, 0)
      File "./searx/searx/utils.py", line 608, in eval_xpath_getindex
        raise SearxEngineXPathException(xpath_spec, 'index ' + str(index) + ' not found')
    searx.exceptions.SearxEngineXPathException: .//div[@class="yuRUbf"]//a/@href index 0 not found

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
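
The traceback above comes from asking for index 0 of an empty XPath result. A minimal sketch of the "return a default instead of raising" pattern follows. It does not use searx itself: `getindex`, `XPathIndexError`, `parse_urls` and the outer `class="g"` wrapper are hypothetical stand-ins, and the standard library's ElementTree replaces the engine's real XPath helpers.

```python
import xml.etree.ElementTree as ET

_NOTSET = object()  # sentinel meaning "no default was given"


class XPathIndexError(Exception):
    """Hypothetical stand-in for SearxEngineXPathException."""


def getindex(items, index, default=_NOTSET):
    """Return items[index]; fall back to `default` if one was given, else raise."""
    if -len(items) <= index < len(items):
        return items[index]
    if default is _NOTSET:
        raise XPathIndexError('index ' + str(index) + ' not found')
    return default


def parse_urls(root):
    """Collect one URL per result block, skipping blocks without a link."""
    urls = []
    for result in root.findall("./div[@class='g']"):  # 'g' is an assumed wrapper class
        hrefs = [
            a.get('href')
            for div in result.findall(".//div[@class='yuRUbf']")
            for a in div.iter('a')
            if a.get('href')
        ]
        url = getindex(hrefs, 0, default=None)
        if url is None:
            continue  # not a valid result: skip it instead of raising
        urls.append(url)
    return urls


PAGE = """
<body>
  <div class="g"><div class="yuRUbf"><a href="https://example.org/">ok</a></div></div>
  <div class="g"><span>no link in this block</span></div>
</body>
""".strip()

print(parse_urls(ET.fromstring(PAGE)))
```

The second block has no matching link, so it is dropped silently rather than being reported as an engine error.
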
Markus Heiser
e436287385 [mod] checker: add some additional tests
BTW: fix indentation by 2 spaces

The additional tests have been commented out in the google engines so that they
do not trigger any CAPTCHA issues.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
Markus Heiser
b1fefec40d [fix] normalize the language & region aspects of all google engines
BTW: make the engines ready for search.checker:

- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
- google_images: remove outer try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:46 +01:00
Markus Heiser
ff6804e545 [data] make engines.languages
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 09:52:32 +01:00
Markus Heiser
8cdad5d85d [fix] google-videos: parse values for 'length' & 'author'
The 'video.html' template from the 'oscar' design supports replacements
for *author* and *length*.  Google-videos does not have an author, so the
publisher info is used for the *author* instead.

Hint: these replacements are not supported by the 'simple' design.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 09:51:24 +01:00
Markus Heiser
89b3050b5c [fix] revise of the google-Video engine
This revision is based on the methods developed in the revision of the google engine
(see commit 410c2f9).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 09:39:30 +01:00
Alexandre Flament
8c46b767d0 [fix] google_news: avoid one HTTP redirect except for the English results
also add
params['soft_max_redirects'] = 1
to avoid false error reporting in /stats/errors
2021-01-24 08:53:35 +01:00
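
As an illustration of the parameter named in the commit message above, here is a minimal sketch of an engine `request` function that sets it. Only the `params['soft_max_redirects'] = 1` line and the `news.google.com/search` URL come from the log; the function body is simplified and hypothetical, and the real engine builds a longer query string.

```python
from urllib.parse import urlencode


def request(query, params):
    """Fill in the outgoing request parameters for one search query."""
    # Simplified URL: the real engine adds language and region arguments.
    params['url'] = 'https://news.google.com/search?' + urlencode({'q': query})
    # Per the commit message: set this to avoid false error reports
    # in /stats/errors when a single redirect happens.
    params['soft_max_redirects'] = 1
    return params


print(request('computer', {}))
```
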
Markus Heiser
5f92dfcdbe [fix] google-news: query uses locale without country tag
Without a country-region tag, Google redirects to correct the country tag [1]:

    SEARX_DEBUG=1 searx-checker -v "google news"
    ...
    https://news.google.com:443 "GET /search?q=computer&hl=en...      HTTP/1.1" 302 0
    https://news.google.com:443 "GET /search?q=computer&hl=en-US&.... HTTP/1.1" 200 None
    ...

[1] https://github.com/searx/searx/pull/2483#issuecomment-765600849

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-23 11:37:14 +01:00
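
The 302 in the log above goes from `hl=en` to `hl=en-US`. One way to avoid that round trip is to send a locale that already carries a region. The sketch below assumes a small, hypothetical `DEFAULT_REGION` table and helper name; the engine's actual language handling is not shown in this log.

```python
from urllib.parse import urlencode

# Hypothetical fallback regions, for illustration only.
DEFAULT_REGION = {'en': 'US', 'de': 'DE', 'fr': 'FR'}


def with_region(locale):
    """Return `locale` with a region tag, e.g. 'en' -> 'en-US'."""
    lang, _, region = locale.partition('-')
    if region:
        return lang + '-' + region.upper()
    if lang in DEFAULT_REGION:
        return lang + '-' + DEFAULT_REGION[lang]
    return lang  # unknown language: leave it unchanged


print(urlencode({'q': 'computer', 'hl': with_region('en')}))
```
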
Markus Heiser
baec54c492 [fix] revise of the google-news engine
This revision is based on the methods developed in the revision of the google engine
(see commit 410c2f9).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22 18:49:45 +01:00
Alexandre Flament
73c86f9bf2 [mod] checker: disable by default 2021-01-19 21:44:48 +01:00
Alexandre Flament
3b7b852aa8 [fix] checker: minor fix about language detection 2021-01-19 21:29:31 +01:00
Alexandre Flament
aa887eb375 [mod] checker : replace pycld3 by langdetect
pycld3 requires the native library cld3
langdetect is a pure python package
2021-01-19 21:26:04 +01:00
Alexandre Flament
67a1aab0d5 [fix] /stats/checker : remove the timestamp field when the checker is disabled 2021-01-18 08:19:53 +01:00
Alexandre Flament
d473407ec9 [fix] checker: fix engine statistics
Without this commit, the URL /stats/errors shows percentages above 100% after the checker has run.
2021-01-18 08:19:44 +01:00
Alexandre Flament
ca76f3119a [fix] error_recorder: record code and lineno about the engine
Since PR #2225, code and lineno were sometimes meaningless;
see /stats/errors.
2021-01-17 16:25:11 +01:00
Alexandre Flament
80d7411f2c
Merge pull request #2452 from kvch/add-wilby-engine
Add wiby.me engine
2021-01-16 22:36:31 +01:00
Alexandre Flament
b405646749
Merge pull request #2451 from mrwormo/invidious-engine
[Fix] Invidious Engine
2021-01-16 19:25:45 +01:00
Alexandre Flament
a4dcfa025c [enh] engines: add about variable
Move meta information from the comment to the about variable
so that the preferences and the documentation can show this information.
2021-01-14 20:57:17 +01:00
mrwormo
2dff3887f0 [fix] Invidious engine by enabling requests by randomly picking amongst working instances 2021-01-14 12:12:56 +01:00
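
The idea of spreading requests over several instances can be sketched in a few lines. The instance URLs and the `pick_base_url` helper below are hypothetical; how the engine determines which instances are "working" is not described in this log.

```python
import random

# Hypothetical instance list, for illustration only.
INSTANCES = ['https://invidious.example.org', 'https://yt.example.net']


def pick_base_url(instances, rng=random):
    """Pick one instance at random so that load is spread across them."""
    if not instances:
        raise ValueError('no working instance configured')
    return rng.choice(instances)


print(pick_base_url(INSTANCES))
```
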
Alexandre Flament
912c7e975c [fix] checker: don't run the checker when uwsgi is not properly configured
Before this commit, even with the scheduler disabled, the checker was running
at least once for each uwsgi worker.
2021-01-13 14:07:39 +01:00
Alexandre Flament
7f0c508598 [fix] checker: fix typo unknown instead of unknow 2021-01-12 11:47:17 +01:00
Alexandre Flament
a0c8b413a6 [mod] searx.shared: minor tweaks
searx.shared.shared_abstract.SharedDict inherits from abc.ABC
searx.shared.shared_uwsgi.schedule can schedule multiple functions without issue
2021-01-12 11:47:17 +01:00
Alexandre Flament
87bafbc32b [mod] checker: add status and timestamp to the result
for each engine: replace status by success
2021-01-12 11:47:17 +01:00
Alexandre Flament
f3e1bd308f [mod] checker: minor adjustments on the default tests
the query "time" is convenient because most search engines will return some results,
but some engines in the general category will return documentation about the HTML tags <time> or <input type="time">
2021-01-12 11:47:17 +01:00
Alexandre Flament
45bfab77d0 [mod] checker: improve searx-checker command line
* output is unbuffered
* verbose mode describes the errors more precisely
2021-01-12 11:47:17 +01:00
Alexandre Flament
3a9f513521 [enh] checker: background check
See settings.yml for the options
SIGUSR1 signal starts the checker.
The result is available at /stats/checker
2021-01-12 11:47:17 +01:00
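
A signal-triggered background run, as described above, might look roughly like the following. This is a sketch and not the checker's code: `run_checker`, `on_sigusr1` and `install` are hypothetical names, and `SIGUSR1` exists only on Unix-like systems.

```python
import signal
import threading

runs = []  # records each background run, for demonstration only


def run_checker():
    """Placeholder for the real work of checking the engines."""
    runs.append('checked')


def on_sigusr1(signum, frame):
    """Signal handler: start the check in a thread so the handler returns quickly."""
    thread = threading.Thread(target=run_checker, daemon=True)
    thread.start()
    return thread  # ignored by the signal machinery; handy when calling directly


def install():
    """Register the handler; must be called from the main thread."""
    if not hasattr(signal, 'SIGUSR1'):
        return False  # e.g. on Windows
    signal.signal(signal.SIGUSR1, on_sigusr1)
    return True
```

After `install()` has run in the main thread, sending the signal with `kill -USR1 <pid>` would start one background run.
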
Alexandre Flament
6e2872f436 [enh] add searx.shared
shared dictionary between the workers (UWSGI or werkzeug)
scheduler: run a task once every x seconds (UWSGI or werkzeug)
2021-01-12 11:47:17 +01:00
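
"Run a task once every x seconds" can be sketched with the standard library alone. The `schedule` function below is a hypothetical, thread-based stand-in and makes no claim about how searx.shared implements it for uWSGI or werkzeug.

```python
import threading


def schedule(delay, func, *args):
    """Call func(*args) every `delay` seconds; return a function that stops it."""
    stopped = threading.Event()

    def loop():
        # Event.wait returns False on timeout, True once stop() was called.
        while not stopped.wait(delay):
            func(*args)

    threading.Thread(target=loop, daemon=True).start()
    return stopped.set


if __name__ == '__main__':
    import time
    stop = schedule(0.1, print, 'tick')
    time.sleep(0.35)
    stop()
```
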
Markus Heiser
9c581466e1 [fix] do not colorize output on dumb terminals
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-12 11:47:17 +01:00
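
The check behind "do not colorize output on dumb terminals" is shown here as a Python sketch. The log does not say which file or language the fix touches, so the helper names `use_color` and `colorize` are hypothetical.

```python
import os
import sys


def use_color(stream=sys.stdout, env=os.environ):
    """Colorize only when writing to a terminal that is not 'dumb'."""
    isatty = getattr(stream, 'isatty', None)
    if isatty is None or not isatty():
        return False  # piped or redirected output
    return env.get('TERM', 'dumb') != 'dumb'


def colorize(text, code, enabled):
    """Wrap text in an ANSI color sequence when enabled."""
    if not enabled:
        return text
    return '\033[' + code + 'm' + text + '\033[0m'


class FakeTTY:
    """Demo helper: a stream that claims to be a terminal."""

    def isatty(self):
        return True


print(colorize('ok', '32', use_color(FakeTTY(), {'TERM': 'dumb'})))
```
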
Alexandre Flament
ca0889d488 [enh] checker: wikidata & ddd: add specific tests 2021-01-12 11:47:17 +01:00
Alexandre Flament
16a889dd8f [enh] checker: add rosebud test 2021-01-12 11:47:17 +01:00
Alexandre Flament
8cbc9f2d58 [enh] add checker 2021-01-12 11:47:17 +01:00
Alexandre Flament
f7e11fd722
Merge pull request #2459 from dalf/update-python
Update python
2021-01-12 11:02:58 +01:00
Alexandre Flament
9c55d772e9
Merge pull request #2408 from return42/rm-brand-make
[mod] move brand options from Makefile to settings.yml
2021-01-12 10:52:42 +01:00
Alexandre Flament
f5c3cb7afa [mod] drop Python 3.5 support 2021-01-12 09:45:16 +01:00
Alexandre Flament
8d0312d014
Merge pull request #2458 from MarcAbonce/hide-links-mobile2
Hide links panel in mobile screens
2021-01-12 08:27:24 +01:00
Marc Abonce Seguin
635c6516a4 hide links panel in mobile screens 2021-01-11 20:40:21 -07:00
Alexandre Flament
424e6abc7e [mod] settings.yml: move brand settings to a dedicated section 2021-01-11 22:59:52 +01:00
Markus Heiser
d0338cb504 [fix] add missing brand.CONTACT_URL to /config API endpoint
Suggested-by: @dalf / https://github.com/searx/searx-stats2/issues/59#issuecomment-747961582
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-11 22:12:38 +01:00
Markus Heiser
9e53470b4c [mod] get rid of searx/brand.py
Removes module searx/brand.py and creates a namespace at searx.brand.

This patch is a first 'proof of concept'.  Later we can decide to remove the
brand namespace entirely or not.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-11 22:12:38 +01:00
Markus Heiser
9485179064 [mod] move brand options from Makefile to settings.yml
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-11 22:12:38 +01:00
Alexandre Flament
c2646df496
Merge pull request #2454 from MarcAbonce/fix-empty-lang-bang
Fix empty colon in query from selecting Chinese
2021-01-10 11:01:32 +01:00
Marc Abonce Seguin
571ce9ff07 fix empty colon in query from selecting Chinese 2021-01-09 22:11:41 -07:00
Noémi Ványi
a6dd1de4a8 Add wiby.me engine
Closes #2339
2021-01-08 23:11:18 +01:00
Markus Heiser
b0bb0a3a0f [fix] Library Genesis links shifted by 1 #1998
Fixes: #1998
Suggested-by: @linuxmue
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-07 14:47:34 +01:00
Émilien Devos
fc6cfc3b58
Remove voat due to its shutdown
Voat shut down on December 25th, 2020 at 12 noon PST: https://voat.co/host/voat/static/inactive.min.html?ReturnUrl=/
2021-01-06 10:45:02 +00:00
Alexandre Flament
54e69d0367 [upd] update dependencies
minor change in the oscar theme because the latest version of jinja2
handles the spaces in the templates more carefully
2020-12-28 09:04:39 +01:00
Alexandre Flament
568b9465e9 [mod] check secret_key when searx.webapp is imported
Without this commit the module searx checks the secret_key value.

With this commit, make docs, utils/standalone_searx.py and
utils/fetch_firefox_version.py work without SEARX_DEBUG=1

For reference see https://github.com/searx/searx/pull/2386
2020-12-27 10:30:20 +01:00
Alexandre Flament
1956ab4b50
Merge pull request #2412 from dalf/update-buildenv
[fix] update buildenv
2020-12-27 08:31:23 +01:00
Markus Heiser
4de276e364 [upd] make SEARX_DEBUG=1 useragents.update
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-12-22 14:23:58 +01:00
Alexandre Flament
db5b060455 [fix] update buildenv
CONTACT_URL is unset in Makefile, but searx/brand.py and
utils/brand.env are not updated.

This commit fixes this issue.
2020-12-21 10:55:28 +01:00