Commit graph

659 commits

Author SHA1 Message Date
Azure Star
a42f20549b
Merge branch 'searxng:master' into master 2023-05-15 22:45:00 +02:00
Markus Heiser
007a615ffa [mod] donation_url: disable by default
SearXNG's donation campaign has ended.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-15 09:19:17 +02:00
Azure Star
d81262aa5d
Merge branch 'searxng:master' into master 2023-05-09 14:46:53 +02:00
Markus Heiser
45529f51a1
Merge pull request #2347 from return42/mod-lang-detection
If language recognition fails, use the Accept-Language header
2023-04-25 15:46:26 +02:00
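The fallback named in this merge can be illustrated with a minimal sketch. It is not the code from the pull request: `parse_accept_language`, `pick_language`, the injected `detect` callable, and the `"all"` default are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Optional


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a missing q-value means full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original position.
        weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(
    query: str,
    accept_language: str,
    detect: Callable[[str], Optional[str]],
    default: str = "all",
) -> str:
    """Use the detected language; fall back to Accept-Language, then default."""
    detected = detect(query)  # hypothetical detector, returns None on failure
    if detected:
        return detected
    candidates = parse_accept_language(accept_language)
    return candidates[0] if candidates else default
```

With a detector that always fails, `pick_language("x", "en;q=0.5, fr", lambda q: None)` falls back to the header and picks `fr`, the tag with the highest weight.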
Azure Star
86d8b02694
Merge branch 'searxng:master' into master 2023-04-17 13:53:36 +02:00
Markus Heiser
f1b6351ae1 [fix] engine: google play movies
Closes: https://github.com/searxng/searxng/pull/1746
Closes: https://github.com/searxng/searxng/issues/1599

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-16 19:15:44 +02:00
Markus Heiser
8adbc4fcec [mod] settings.yml: enable language detection by default_lang (auto)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15 22:24:59 +02:00
Azure Star
a4788f8d90
Merge branch 'searxng:master' into master 2023-04-13 08:36:38 +02:00
Markus Heiser
5234e45010 [fix] Gigablast.com has been erased
[1] https://www.reddit.com/r/searchengines/comments/128wdcp/gigablastcom_has_been_erased/

Closes: https://github.com/searxng/searxng/issues/2321
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-06 08:22:57 +02:00
Azure Star
d691555629
Merge branch 'searxng:master' into master 2023-03-30 00:46:26 +02:00
Markus Heiser
2b8dfab33f [fix] engine gigablast: add &userid=<User ID>&code=<Feed Code>
Gigablast's API blocks unauthorized requests [1].

[1] https://gigablast.com/searchfeed.html

Closes: https://github.com/searxng/searxng/issues/1454
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-29 16:18:02 +02:00
Azure Star
42bc05744b
Merge branch 'searxng:master' into master 2023-03-29 09:58:10 +02:00
Markus Heiser
2499899554 [mod] Google: reverse engineered & upgraded to data_type: traits_v1
Partial reverse engineering of the Google engines, including improved language
and region handling based on the engine.traits_v1 data.

Whenever possible, the implementations of the Google engines try to make use of
the async REST APIs.  The get_lang_info() function has been generalized to a
get_google_info() function; in particular, the region handling has been improved
by adding the cr parameter.

searx/data/engine_traits.json
  Add data type "traits_v1" generated by the fetch_traits() functions from:

  - Google (WEB),
  - Google images,
  - Google news,
  - Google scholar and
  - Google videos

  and remove data from obsolete data type "supported_languages".

  A traits.custom type that maps region codes to *supported_domains* is fetched
  from https://www.google.com/supported_domains

searx/autocomplete.py:
  Reverse-engineered autocomplete from Google WEB.  Supports Google's languages and
  subdomains.  The old API suggestqueries.google.com/complete has been replaced
  by the async REST API: https://{subdomain}/complete/search?{args}

searx/engines/google.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
  - always use the async REST API (formerly known as 'use_mobile_ui')
  - use *supported_domains* from traits
  - improved the result list by fetching './/div[@data-content-feature]'
    and parsing the type of the various *content features* --> thumbnails are
    added

searx/engines/google_images.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - if exists, freshness_date is added to the result
  - issue 1864: result list has been improved a lot (due to the new cr parameter)

searx/engines/google_news.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
    *supported_domains* is not needed but a ceid list has been added.
  - different region handling compared to Google WEB
  - fixed for various languages & regions (due to the new ceid parameter) /
    avoid CONSENT page
  - Google News no longer supports time range
  - result list has been fixed: XPath of pub_date and pub_origin

searx/engines/google_videos.py
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - add paging support
  - implement an async request ('asearch': 'arc' & 'async':
    'use_ac:true,_fmt:html')
  - simplified code (thanks to '_fmt:html' request)
  - issue 1359: fixed xpath of video length data

searx/engines/google_scholar.py
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - request(): include patents & citations
  - response(): fixed CAPTCHA detection (Scholar has its own CAPTCHA manager)
  - hardening XPath to iterate over results
  - fixed XPath of pub_type (has been changed from gs_ct1 to gs_cgt2 class)
  - issue 1769 fixed: new request implementation is no longer incompatible

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
a7fe22770a [mod] Peertube: re-engineered & upgraded to data_type: traits_v1
- fetch_traits(): Fetch languages from peertube's search-index source code.

  [mod] Include migration of the request method from 'supported_languages'
        to 'traits' (EngineTraits) object.
  [fix] old supported_languages_url is no longer valid since the sources
        have been moved to a different path.

- fixed code to pass pylint
- request(): complete re-implementation based on the API docs [1]
- response(): complete re-implementation, adds several fields missed before
- add source code documentation

[1] https://docs.joinpeertube.org/api-rest-reference.html#tag/Search/operation/searchVideos

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Azure Star
db3f18a6c2
Merge branch 'searxng:master' into master 2023-03-10 14:37:23 +01:00
Solirs
35fbb3578b Increase timeout for gentoo wiki engine 2023-02-28 13:54:44 +01:00
Fauli1221
661342adc6
Merge branch 'searxng:master' into master 2023-02-18 19:26:06 +01:00
Fauli1221
e01362fee2
Merge branch 'searxng:master' into master 2023-02-17 18:12:21 +01:00
Markus Heiser
5820dc78ce [doc] slight improvements to the doc of the settings (base_url)
Closes: https://github.com/searxng/searxng/issues/2190

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-17 12:08:58 +01:00
Markus Heiser
52f6bc745b
Merge pull request #2188 from ahmad-alkadri/fix/petalsearch
Fix the petalsearch engine
2023-02-15 13:57:28 +01:00
Ahmad Alkadri
f6af59899b Fix petalsearch and remove petalsearch news 2023-02-14 18:43:55 +01:00
Markus Heiser
7d446dfdb2 [mod] disable engine tineye by default
Tineye becomes active as soon as a https:// signature is found in the search
term, but most of the time a reverse image search is not requested when a URL is
specified; often the URL is just copied and pasted.

The frequent requests to tineye lead in the end to the SearXNG instance being
blocked by tineye and the user seeing unexpected error messages.

BTW: many maintainers have disabled this engine in their local SearXNG settings.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-14 08:27:19 +01:00
Markus Heiser
3abff182ea [fix] remove engine neeva from settings.yml
Engine is broken and can't be used any longer as a simple XPath engine.
@allendema tested an engines/neeva.py version using json from the dom, but
without luck: There was some kind of captcha for pagination.

[1] https://github.com/searxng/searxng/issues/2007#issuecomment-1426061698

Closes: https://github.com/searxng/searxng/issues/2007
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-10 18:46:37 +01:00
wibyweb
6707354bc8 [mod] engine wiby: add pagination
Suggested by: @wibyweb in searx https://github.com/searx/searx/pull/3465

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-10 15:31:24 +01:00
Fauli1221
67dba29442
Merge branch 'searxng:master' into master 2023-02-01 08:42:39 +01:00
Alexandre Flament
37addec69e search.suspended_time settings: bug fixes
* fix typo in settings.yml: replace suspend_times by suspended_times
* always use the delay defined in settings.yml:
  * HTTP status 402 and 403: read the value from settings.yml instead of using the hardcoded value of 1 day.
  * startpage engine: a CAPTCHA suspends the engine for one day instead of one week
2023-01-28 10:24:14 +00:00
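The idea of reading the suspension delay per exception from the settings, rather than hardcoding it, can be sketched as follows. This is an illustration under stated assumptions, not the SearXNG implementation: the exception classes, the `SUSPENDED_TIMES` keys and values, and the helper names are all hypothetical stand-ins for the search.suspended_times section of settings.yml.

```python
from typing import Mapping

ONE_DAY = 86400  # seconds; hypothetical fallback when a key is missing

# Hypothetical stand-in for the search.suspended_times section of settings.yml.
SUSPENDED_TIMES = {
    "AccessDenied": ONE_DAY,
    "Captcha": ONE_DAY,
    "TooManyRequests": 3600,
}


class EngineError(Exception):
    """Hypothetical base class for engine failures."""


class AccessDenied(EngineError):
    pass


class Captcha(EngineError):
    pass


class TooManyRequests(EngineError):
    pass


def exception_for_status(status: int) -> EngineError:
    """Map the HTTP statuses named in the commit (402, 403) to an exception."""
    if status in (402, 403):
        return AccessDenied(f"HTTP {status}")
    if status == 429:  # assumption: 429 is not mentioned in the commit
        return TooManyRequests("HTTP 429")
    return EngineError(f"HTTP {status}")


def suspended_time_for(
    exc: Exception, settings: Mapping[str, int] = SUSPENDED_TIMES
) -> int:
    """Look up the delay by exception class name instead of hardcoding it."""
    return settings.get(type(exc).__name__, ONE_DAY)
```

An administrator who overrides one key changes the delay for that exception only; for example `suspended_time_for(Captcha(), {"Captcha": 60})` yields 60 seconds while the other exceptions keep their fallback.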
Alexandre Flament
13b0c251c4
Merge pull request #2100 from nexryai/master
Add goo engine
2023-01-15 23:08:28 +01:00
Léon Tiekötter
0cedb1c6d8 Add search.suspended_times settings
Make suspended_time changeable in settings.yml
Allow different values to be set for different exceptions.

Co-authored-by: Alexandre Flament <alex@al-f.net>
2023-01-15 09:00:32 +00:00
nexryai
4e7bb1bf9a
Add goo engine 2023-01-12 16:28:09 +09:00
Paul-Luca-Schugardt
c8a13b8503 Merge remote-tracking branch 'searxng/master' 2023-01-10 15:12:24 +01:00
Milad-Laly
cf4db4be37 [fix] Mojeek Xpath showing suggestions and searches + add lang support 2023-01-09 09:33:47 +01:00
Fauli1221
fa82fc17ec
Merge branch 'searxng:master' into master 2022-12-21 22:06:25 +01:00
Markus Heiser
ed901ab18e [mod] improve 'Autodetect search language' plugin
- Add documentation to the plugin
- Harmonize FastText language model with SearXNG's language model

Resources::

    import fasttext                                    # --> +10 MB
    fasttext.load_model(str(data_dir / 'lid.176.ftz')) # --> +4MB

Suggested-by: @dalf

- To speed up and simplify the deployment use fasttext-wheel instead of fasttext
- Building numpy on the Alpine Linux of docker-images takes ages --> install
  py3-numpy from Alpine's package manager (apk)
- Alpine Linux on docker-images (musl libc) does not support fasttext-wheel (gnu
  libc) --> patch Dockerfile and build from fasttext:

     sed -i s/fasttext-wheel/fasttext/ requirements.txt

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-12-11 11:26:07 +01:00
ArtikusHG
9925a20950 [mod] new plugin: Autodetect search language 2022-12-10 13:11:47 +01:00
Fauli1221
bcea4906dc
Merge branch 'searxng:master' into master 2022-11-18 11:02:06 +01:00
Ryan Draga
408200c87e [fix] disabling zlibrary due to z-lib.org domain seizure 2022-11-10 21:18:21 +01:00
Alexandre Flament
8f19bdaf17
Merge pull request #1882 from fehho/metacpan
Add MetaCPAN engine
2022-11-07 21:54:11 +01:00
fehho
fe351c2802 Add MetaCPAN engine 2022-11-07 08:07:06 -06:00
pau sch
b0aa367ed6 updated settings to add derpibooru 2022-11-06 19:03:09 +01:00
Alexandre FLAMENT
e92755d358 Initialize Redis in searx/webapp.py
settings.yml:
* The default URL was unix:///usr/local/searxng-redis/run/redis.sock?db=0
* The default URL is now "false"

The default URL makes the log difficult to deal with:
if the admin didn't install a Redis instance, the logs record a false error.

It worked before because SearXNG initialized the Redis connection when the limiter started.

In this commit, SearXNG initializes Redis in searx/webapp.py
so various components can use Redis without taking care of the initialization step.
2022-11-05 17:45:52 +01:00
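The behaviour described in this commit (a default URL of "false" means no Redis connection is attempted, and initialization happens once at application start) might look roughly like the sketch below. The function name `initialize_redis` and the injected `connect` callable are hypothetical; a real deployment would pass a function that opens an actual Redis client.

```python
from typing import Any, Callable, Optional, Union


def initialize_redis(
    url: Union[str, bool, None],
    connect: Callable[[str], Any],
) -> Optional[Any]:
    """Open a Redis connection once, or skip quietly when it is disabled.

    `url` mirrors the settings value: a URL string, or false (the new
    default) when the admin has not installed a Redis instance.
    `connect` is a hypothetical factory that takes the URL and returns
    a client object.
    """
    if not url:
        # Disabled by default: no connection attempt, so no false error
        # is written to the log.
        return None
    return connect(url)
```

Components that need Redis can then check for `None` instead of each performing their own initialization step.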
Alexandre Flament
32e8c2cf09 searx.network: add "verify" option to the networks
Each network can define a verify option:
* false to disable certificate verification
* a path to an existing certificate.

SearXNG uses SSL_CERT_FILE and SSL_CERT_DIR when they are defined
see https://www.python-httpx.org/environment_variables/#ssl_cert_file
2022-10-14 13:59:22 +00:00
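The two accepted forms of the verify option can be validated with a small helper. This is a sketch only: `resolve_verify` is a hypothetical name, and treating a missing option as "verification enabled" is an assumption, not something stated in the commit.

```python
import os
from typing import Union


def resolve_verify(option: Union[bool, str, None] = None) -> Union[bool, str]:
    """Normalize a per-network verify option.

    - None (option not set): keep certificate verification enabled
      (assumed default).
    - False: disable certificate verification.
    - str: path to an existing certificate file or directory.
    """
    if option is None:
        return True
    if isinstance(option, bool):
        return option
    if isinstance(option, str):
        if not os.path.exists(option):
            raise ValueError(f"certificate path does not exist: {option}")
        return option
    raise TypeError(f"unsupported verify option: {option!r}")
```

The returned value has the shape (boolean or path) that HTTP client libraries commonly accept for a verify argument; whether it can be passed on unchanged depends on the client version in use.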
Mohamed Elashri
8d5653e60d
Merge branch 'searxng:master' into master 2022-09-30 23:06:54 +00:00
Markus Heiser
ba8959ad7c [fix] typos / reported by @kianmeng in searx PR-3366
[PR-3366] https://github.com/searx/searx/pull/3366

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-27 18:32:14 +02:00
Mohamed Elashri
5832c70680
correct sci-hub links/ add .ru and remove other 3rd party domains. 2022-09-24 11:03:57 -04:00
Markus Heiser
caebafdd06 [fix] typo in crossref settings: disable --> disabled
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-24 08:12:36 +02:00
Alexandre Flament
d6446be38f [mod] science category: various updates related to PR 1705 2022-09-23 20:52:55 +02:00
Alexandre FLAMENT
e36f85b836 Science category: update the engines
* use the paper.html template
* fetch more data from the engines
* add crossref.py
2022-09-23 20:45:58 +02:00
Alexandre Flament
bef3984d03
Merge pull request #1728 from liimee/eng-ddw
add duckduckgo weather engine
2022-09-23 18:14:09 +02:00
Alexandre Flament
d3fec1388c
Merge pull request #1624 from liimee/eng-wttr
Add wttr.in engine
2022-09-23 18:13:37 +02:00
Alexandre FLAMENT
33b43763b9 Brave engine: fix BrotliDecoderDecompressStream error 2022-09-18 22:08:38 +00:00