63 commits

Author SHA1 Message Date
Markus Heiser
7f505bdc6f [fix] google: avoid unnecessary SearxEngineXPathException errors
Avoid SearxEngineXPathException errors when parsing non valid results::

    .//div[@class="yuRUbf"]//a/@href index 0 not found
    Traceback (most recent call last):
      File "./searx/engines/google.py", line 274, in response
        url = eval_xpath_getindex(result, href_xpath, 0)
      File "./searx/searx/utils.py", line 608, in eval_xpath_getindex
        raise SearxEngineXPathException(xpath_spec, 'index ' + str(index) + ' not found')
    searx.exceptions.SearxEngineXPathException: .//div[@class="yuRUbf"]//a/@href index 0 not found

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
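
The failure mode in the traceback above (an index lookup on an empty XPath result) can be illustrated with a small sketch. It is not the searx code: `get_index`, `XPathIndexError` and `extract_urls` are hypothetical stand-ins, and the standard library's `xml.etree.ElementTree` is assumed in place of the lxml-based helpers in `searx.utils`. The idea is to pass a default so that results without the expected link are skipped instead of raising.

```python
import xml.etree.ElementTree as ET

_NOTSET = object()  # sentinel meaning "no default supplied"


class XPathIndexError(Exception):
    """Hypothetical stand-in for SearxEngineXPathException."""


def get_index(items, index, default=_NOTSET):
    """Return items[index]; fall back to default if one is given, else raise."""
    if -len(items) <= index < len(items):
        return items[index]
    if default is _NOTSET:
        raise XPathIndexError('index ' + str(index) + ' not found')
    return default


def extract_urls(markup):
    """Collect result URLs, silently skipping result blocks without a link."""
    root = ET.fromstring(markup)
    urls = []
    for result in root.iter('div'):
        if result.get('class') != 'g':
            continue
        links = result.findall(".//div[@class='yuRUbf']//a")
        link = get_index(links, 0, None)  # default avoids the exception
        if link is None:
            continue  # not a regular result block
        urls.append(link.get('href'))
    return urls
```

With a page containing one regular result and one block without a link, `extract_urls` returns only the URL of the regular result, while `get_index([], 0)` without a default still raises.
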
Markus Heiser
b1fefec40d [fix] normalize the language & region aspects of all google engines
BTW: make the engines ready for search.checker:

- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
- google_images: remove outer try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:46 +01:00
Markus Heiser
baec54c492 [fix] revise of the google-news engine
This revise is based on the methods developed in the revise of the google engine
(see commit 410c2f9).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22 18:49:45 +01:00
Alexandre Flament
a4dcfa025c [enh] engines: add about variable
move meta information from comment to the about variable
so the preferences and the documentation can show this information
2021-01-14 20:57:17 +01:00
Alexandre Flament
64cccae99e [mod] various engines: use eval_xpath* functions and searx.exceptions.*
Engine list: ahmia, duckduckgo_images, elasticsearch, google, google_images, google_videos, youtube_api
2020-12-03 10:22:48 +01:00
Alexandre Flament
2006eb4680 [mod] move extract_text, extract_url to searx.utils 2020-10-02 18:13:56 +02:00
Markus Heiser
8162d7aff4 [fix] google engine - div classes have been renamed in HTML result
Since 1. October 2020 google has changed the 'class' attribute of the HTML
result page.

Fix the xpath expressions and ignore <div class="g" ../> sections which do not
match to title's xpath expression.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-10-01 09:44:29 +02:00
Marc Abonce Seguin
ecf5899153 fetch google's search langs rather than ui langs 2020-09-22 11:37:44 +02:00
Dalf
1022228d95 Drop Python 2 (1/n): remove unicode string and url_utils 2020-09-10 10:39:04 +02:00
Adam Tauber
52eba0c721 [fix] pep8 2020-07-08 00:46:03 +02:00
Markus Heiser
410c2f903d [fix] revise google engine
this commit is picked from #1985
2020-07-07 21:50:59 +02:00
Marc Abonce Seguin
ccaf6ca02c [fix] update xpaths for new google results page 2019-12-07 16:37:24 -07:00
Adam Tauber
731e34299d
Merge pull request #1744 from dalf/optimizations
[mod] speed optimization
2019-12-02 13:39:58 +00:00
Emilien Devos
8f51430f5c [fix] Force Google old UI with a new user agent 2019-11-22 23:01:41 +01:00
Dalf
85b3723345 [mod] speed optimization
compile XPath only once
avoid redundant call to urlparse
get_locale(webapp.py): avoid useless call to request.accept_languages.best_match
2019-11-15 09:33:15 +01:00
Emilien Devos
cbd1ebdce8 [fix] Force Google old UI (#1597) 2019-05-29 10:05:57 +09:00
Noémi Ványi
b63d645a52 Revert "remove 'all' option from search languages"
This reverts commit 4d1770398a.
2019-01-07 21:19:00 +01:00
Marc Abonce Seguin
0169b63e84 [fix] fetch google's supported languages 2019-01-06 21:31:45 -06:00
Marc Abonce Seguin
5568f24d6c [fix] check language aliases when setting search language 2019-01-06 20:31:57 -06:00
Marc Abonce Seguin
f7f9c50393 [fix] force English results in Google when using en-US 2018-04-18 23:29:48 -05:00
Marc Abonce Seguin
772c048d01 refactor engine's search language handling
Add match_language function in utils to match any user given
language code with a list of engine's supported languages.

Also add language_aliases dict on each engine to translate
standard language codes into the custom codes used by the engine.
2018-03-27 00:08:03 -06:00
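
The matching described in the commit above can be sketched roughly as follows. This is a simplified illustration, not the function from `searx.utils`: the signature, the lookup order and the `fallback` parameter are assumptions made for the example, and the language lists are made up.

```python
def match_language(code, supported, aliases=None, fallback='en-US'):
    """Map a user-given language code to one the engine supports.

    Simplified, assumed lookup order: exact match, alias, bare language,
    alias of the bare language, first regional variant, then the fallback.
    """
    aliases = aliases or {}

    # 1. exact match, e.g. 'zh-TW'
    if code in supported:
        return code

    # 2. engine-specific alias for the full code
    alias = aliases.get(code)
    if alias in supported:
        return alias

    # 3. bare language without region, e.g. 'en-GB' -> 'en'
    lang = code.split('-')[0]
    if lang in supported:
        return lang

    # 4. alias for the bare language, e.g. 'he' -> 'iw'
    alias = aliases.get(lang)
    if alias in supported:
        return alias

    # 5. first supported regional variant of the language
    for candidate in supported:
        if candidate.startswith(lang + '-'):
            return candidate

    return fallback


# Hypothetical engine data: supported codes plus a language_aliases dict
supported_languages = ['en', 'zh-CN', 'zh-TW', 'iw']
language_aliases = {'he': 'iw'}
```

For example, `match_language('he-IL', supported_languages, language_aliases)` resolves to the engine's custom code `'iw'`, and an unknown code falls through to the fallback.
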
Marc Abonce Seguin
d1eae9359f fix fetch_languages to be more accurate
Add languages supported by either all default general engines or 10 engines.
2018-03-20 17:58:20 -06:00
Noémi Ványi
2d5eed9b59 send constant cookie with query to Google 2017-12-18 21:38:52 +01:00
marc
4d1770398a remove 'all' option from search languages 2017-12-06 01:20:15 -06:00
Adam Tauber
1613c6319e [fix] handle /sorry redirects 2017-12-05 20:38:34 +01:00
Adam Tauber
6eb9503896 [fix] use english in google engine if no language was set - this prevents guessing the language by the IP of the instance 2017-11-22 22:56:47 +01:00
Adam Tauber
6fdb6640d9 [fix] revert language changes to prevent CAPTCHAs 2017-11-22 22:50:48 +01:00
Adam Tauber
9ab8536479 [fix] fix language support of google 2017-11-21 16:28:53 +01:00
Adam Tauber
52e615dede [enh] py3 compatibility 2017-05-15 12:02:30 +02:00
Adam Tauber
52d1087202 [enh] add result number parsing to google engine 2017-01-27 00:18:46 +01:00
David A Roberts
1d30141c20 [enh] show spelling corrections 2017-01-16 13:31:16 +10:00
Adam Tauber
0d4da30c7f [enh] add instant answers to google engine 2017-01-05 17:20:12 +01:00
marc
af35eee10b tests for _fetch_supported_languages in engines
and refactor method to make it testable without making requests
2016-12-15 00:40:21 -06:00
marc
f62ce21f50 [mod] fetch supported languages for several engines
utils/fetch_languages.py gets languages supported by each engine and
generates engines_languages.json with each engine's supported language.
2016-12-13 19:58:10 -06:00
marc
c677aee58a filter languages 2016-12-13 19:32:00 -06:00
marc
149802c569 [enh] add supported_languages on engines and auto-generate languages.py 2016-12-13 19:32:00 -06:00
Noémi Ványi
c59c76e6ee add year to time range to engines which support "Last year"
Engines:
 * Bing images
 * Flickr (noapi)
 * Google
 * Google Images
 * Google News
2016-12-11 16:58:31 +01:00
Adam Tauber
16bdc0baf4 [mod] do not escape html content in engines 2016-12-09 18:59:19 +01:00
Adam Tauber
350a84520d [fix] time range detection 2016-07-26 00:28:48 +02:00
Noemi Vanyi
2e5839503f add time range search for google 2016-07-25 23:28:14 +02:00
stepshal
b3ab221b98 Fix anomalous backslash in string 2016-07-11 23:53:13 +07:00
Adam Tauber
85c0351dca Merge pull request #526 from ukwt/anime
Add a few search engines
2016-04-14 10:59:31 +02:00
Kirill Isakov
90c51cb449 Fix a few typos in Google search engine 2016-04-13 23:04:53 +06:00
Adam Tauber
6d55642ab4 [fix] no more redirect ++ explicitly specify search language to avoid googles ip based heuristics 2016-03-25 18:38:02 +01:00
Adam Tauber
09b7673fbd [fix] temporarily disable google's inner links - #491 2016-01-18 13:10:21 +01:00
Adam Tauber
66f48c2bf5 [fix] google markup change - closes #489 2016-01-10 18:49:50 +01:00
Adam Tauber
5cea4f9445 [fix] prevent google engine from redirecting
nid/pref cookies are also removed
2015-12-22 20:05:42 +01:00
Adam Tauber
d8f8bdc951 [fix] quickfix for sometimes missing PREF cookie 2015-12-15 09:48:38 +01:00
Adam Tauber
5d49c15f79 [fix] google engine - ignore new useless result type 2015-10-29 12:47:12 +01:00
Adam Tauber
0ad272c5cb [fix] content escaping - closes #441
TODO check other engines too
2015-09-30 16:42:03 +02:00