Commit graph

1326 commits

Author SHA1 Message Date
Alexandre Flament
cd2dd5dd55 Wikidata engine: ignore dummy entities
Close #641
2022-06-11 11:09:21 +02:00
Alexandre Flament
d068b67a71 Wikidata engine: minor change of the SPARQL request
The engine can be slow especially when the query won't return any answer.
See https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/MWAPI#Find_articles_in_Wikipedia_speaking_about_cheese_and_see_which_Wikibase_items_they_correspond_to

Related to #1290
2022-06-11 10:50:11 +02:00
Markus Heiser
2de007138c [fix] prepare for pylint 2.14.0
Remove issues reported by Pylint 2.14.0:

- no-self-use: has been moved to optional extension [1]
- The refactoring checker now also raises 'consider-using-generator' messages
  for max(), min() and sum(). [2]

.pylintrc:
  - <option name>-hint was removed long ago; Pylint 2.14.0 raises an
    error on invalid options
  - bad-continuation and bad-whitespace have been removed [3]

[1] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/summary.html#removed-checkers
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/full.html#what-s-new-in-pylint-2-14-0
[3] https://pylint.pycqa.org/en/latest/whatsnew/2/2.6/summary.html#summary-release-highlights

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-06-03 15:41:52 +02:00
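The 'consider-using-generator' remark mentioned in the commit above can be illustrated with a minimal sketch. The variable names are hypothetical and not taken from the SearXNG code base; the point is only that max(), min() and sum() accept a generator expression directly, so the intermediate list is unnecessary.

```python
# Hypothetical data, for illustration only.
word_lengths = [3, 1, 4, 1, 5]

# Flagged by Pylint >= 2.14.0 (consider-using-generator):
# the list comprehension builds a temporary list first.
total_flagged = sum([n * 2 for n in word_lengths])

# Preferred form: pass a generator expression instead.
total_preferred = sum(n * 2 for n in word_lengths)
longest = max(n * 2 for n in word_lengths)
```

Both forms compute the same value; the second one simply avoids allocating the temporary list.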
Allen
43dc9eb7d6 [enh] Initial Petalsearch Images support
Upstream example query:

  https://petalsearch.com/search?query=test&channel=image&ps=50&pn=1&region=de-de&ss_mode=off&ss_type=normal

Depending on locale it will internally use some/all results from other
engines. See:

  https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/#general-indexing-search-engines
2022-06-02 14:32:37 +02:00
Émilien Devos
06cb15cbf7
Reflect the real world parameter from settings.yml 2022-05-10 20:44:35 +00:00
Markus Heiser
4326009d00 [format.python] based on bugfix in 9ed626130 2022-05-07 18:23:10 +02:00
capric98
8c7e6cc983 [fix] FutureWarning from lxml
When content is None, the original code skipped extract_text() and appended
the None value to 'content'. Passing allow_none=True makes extract_text()
return None instead of raising a ValueError.
2022-04-22 16:09:36 +02:00
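The allow_none behaviour described in the commit above might look roughly like the sketch below. This extract_text() is a simplified stand-in written for illustration, not the SearXNG implementation; only the allow_none parameter name and the ValueError come from the commit message.

```python
def extract_text(value, allow_none=False):
    """Simplified stand-in: normalise whitespace in a text value.

    Assumption: None raises ValueError unless allow_none is set,
    mirroring the behaviour the commit message describes.
    """
    if value is None:
        if allow_none:
            return None
        raise ValueError("extract_text(None, allow_none=False)")
    return " ".join(str(value).split())


# A missing 'content' field no longer needs a separate None check.
content = extract_text(None, allow_none=True)
title = extract_text("  Some   title \n")
```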
Alexandre Flament
bbf13a4657
Merge pull request #1101 from allendema/pass-cookies-from-settings
[enh] Allow passing headers/cookies from settings.yml
2022-04-17 11:37:07 +02:00
Allen
dae8a08089
[fix] Update only cookies/headers 2022-04-17 11:29:23 +02:00
Allen
67fb6fba84
[lint] Remove whitespace
From GH GUI
2022-04-17 10:42:25 +02:00
Allen
15862ebc35
[mod] Pass desired ebay domain in settings
https://www.ebay.de
https://www.ebay.com
https://www.ebay.es

etc
2022-04-16 19:10:35 +02:00
Allen
155333f625
[enh] Allow passing headers/cookies from settings.yml
Example:

   - engine: xpath
   - search_url: example.org
   - headers: {'example_header': 'example_header'}
   - cookies: {'safesearch': 'off'}
2022-04-16 17:42:04 +02:00
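How such engine-level headers and cookies could reach the outgoing request is sketched below. This is an assumption-laden illustration, not the code from the commit: the module-level headers/cookies names follow the settings.yml keys shown above, and the shape of params (a dict holding 'headers' and 'cookies' dicts) is assumed.

```python
# Values as they might be set from settings.yml (taken from the example above).
headers = {"example_header": "example_header"}
cookies = {"safesearch": "off"}


def request(query, params):
    """Merge the configured headers/cookies into the request parameters.

    update() is used so that entries already present in params
    (for example a User-Agent) are kept rather than replaced wholesale.
    """
    params["headers"].update(headers)
    params["cookies"].update(cookies)
    return params


params = request("test", {"headers": {"User-Agent": "demo"}, "cookies": {}})
```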
Alexandre Flament
c474616642
Merge pull request #1071 from return42/fix-lang-dailymotion
[fix] dailymotion engine: filter by language & country
2022-04-16 11:54:49 +02:00
Alexandre Flament
1a82e79b50 dailymotion: send valid value for the language parameter 2022-04-16 09:27:34 +02:00
Markus Heiser
3bb62823ec [fix] dailymotion engine: filter by language & country
- fix the issue of fetching more than 7000 *languages*
- improve the request function and filter by language & country
- implement time_range_support & safesearch
- add more fields to the response from dailymotion (allow_embed, length)
- better clean up of HTML tags in the 'content' field.

This is more or less a complete rework based on the '/videos' API from [1].
This patch cleans up the language list in SearXNG that has been polluted by the
ISO-639-3 2 and 3 letter codes from dailymotion languages which have never been
used.

[1] https://developers.dailymotion.com/tools/

Closes: https://github.com/searxng/searxng/issues/1065
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-04-16 09:27:34 +02:00
Jabster28
9eb1b04f48
change "Wolfram|Alpha" to "Wolfram Alpha" in search results 2022-04-12 10:37:33 +01:00
Alexandre Flament
592cea0e5e
Merge pull request #1030 from austinhuang0131/master
(feat) add jisho.org
2022-04-09 18:57:20 +02:00
Alexandre Flament
74c7aee9ec jisho : code refactoring 2022-04-09 18:01:57 +02:00
Austin Huang
19fa0095a0
(fix) satisfy the linter, and btw reduce timeout 2022-04-01 09:23:24 -04:00
Austin Huang
a399248f56
update jisho.py according to suggestions 2022-04-01 09:18:19 -04:00
Alexandre FLAMENT
f00cdb5e51 bing engine: _fetch_supported_languages: don't use the language code as a country
ref #1029
2022-03-31 20:03:34 +00:00
Austin Huang
934ae4e086
(feat) add jisho.org
Closes #1016
2022-03-31 14:45:39 -04:00
Alexandre Flament
378b29be2f fix startpage: update XPath in _fetch_supported_languages 2022-03-19 14:16:37 +01:00
Markus Heiser
53b5a804e2 [fix] engine mediathekviewweb: replace http links by https
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-07 19:49:16 +01:00
Markus Heiser
20f4538e13 [fix] engine: Semantic Scholar (Science) // rework & fix
Closes: https://github.com/searxng/searxng/issues/939
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-05 11:53:41 +01:00
Markus Heiser
8d937179ab
Merge pull request #913 from return42/add-artwork
[mod] add artwork to mixcloud & soundcloud engines
2022-02-21 22:24:40 +01:00
Markus Heiser
b08b81b434 [mod] bandcamp & genius: in result set img_src instead of thumbnail
Suggested-by: @dalf https://github.com/searxng/searxng/pull/900#issuecomment-1046009057
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-21 22:12:07 +01:00
Markus Heiser
bded1ee280 [fix] genius: add player and avoid exception-driven programming
Add player:

- The players only play 30 seconds of the title.  Some of the players will be
  blocked because of a cross-origin request and some players will link to Apple
  when you press the play button.

Avoid exceptions (and BTW improve results)

-  ERROR   searx.engines.genius          : list index out of range

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-21 22:12:07 +01:00
Markus Heiser
36aee70c24
Merge pull request #910 from tiekoetter/fix-909
[fix] google images engine: Fix 'scrap_img_by_id' function
2022-02-20 18:29:50 +01:00
Markus Heiser
2921d3cd17 [mod] add artwork to mixcloud & soundcloud engines
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-19 21:59:12 +01:00
Markus Heiser
4a28b593c2 [fix] google images engine: Fix 'scrap_img_by_id' function
The 'scrap_img_by_id' function no longer returned anything useful.  This
fix allows the google images engine to present the full source image instead of
only the thumbnail.

The function scrap_img_by_id() is replaced by a full rewrite that parses image
URLs with a regular expression. The new function parse_urls_img_from_js(dom)
returns a mapping of data-id to image URL.

Closes: https://github.com/searxng/searxng/issues/909
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-19 14:33:56 +01:00
Alexandre Flament
ace5401632
Merge pull request #900 from return42/fix-883
[fix] bandcamp: fix itemtype (album|track) and exceptions
2022-02-19 13:42:53 +01:00
Markus Heiser
943a7fdcb5 [mod] mediathekviewweb engine: add iframe_src and use videos template
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-19 00:50:54 +01:00
Markus Heiser
05c105b837 [fix] bandcamp: fix itemtype (album|track) and exceptions
BTW: polish implementation and show tracklist for albums

Closes: https://github.com/searxng/searxng/issues/883
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-18 22:44:43 +01:00
Markus Heiser
7352c6bc79 [mod] templates: rename field for <iframe> URL to iframe_src
Rename result field data_src to iframe_src

Suggested-by: @dalf https://github.com/searxng/searxng/pull/882#issuecomment-1037997402
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-18 19:00:49 +01:00
Markus Heiser
98cab4cf75 [mod] result_templates/default.html replace embedded HTML by data_src audio_src
Embedded HTML breaks SearXNG architecture.  To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src' (and 'audio_src'), an URL for embedded content (<iframe>).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-13 14:20:47 +01:00
Markus Heiser
46e131fdad [mod] result_templates/videos.html: replace embedded HTML by data_src
Embedded HTML breaks SearXNG architecture.  To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src', an URL for embedded content (<iframe>).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-13 14:20:47 +01:00
Émilien Devos
7d3e8118b0
Update the XPath for fetching the Google results 2022-02-09 14:34:14 +01:00
Markus Heiser
906a0a99cd [fix] openstreetmap: load thumbnail from uploads.wikimedia.org
OpenStreetMap images are now loaded from uploads.wikimedia.org instead of
commons.wikimedia.org to prevent redirects.

With `image_proxy` enabled, images from commons.wikimedia.org can't be loaded
since they are redirected.  We already discussed this issue [875] and
@tiekoetter fixed this issue in PR [878].

Related-to:
- [875] https://github.com/searxng/searxng/issues/875
- [878] https://github.com/searxng/searxng/pull/878
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-07 13:05:52 +01:00
Markus Heiser
a967e59590 [pylint] searx/engines/wikidata.py (no functional change)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-07 10:15:32 +01:00
Léon Tiekötter
1c151ae92b
[fix] wikidata: URL decoding and file extension handling
Add '.png' to the second img_src_name if it has the extension '.svg'.
Use urllib.parse.unquote for URL decoding.
2022-02-07 00:21:02 +01:00
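The two steps named in the commit above (URL decoding with urllib.parse.unquote and appending '.png' to '.svg' names) can be sketched as follows. The helper name and the sample file name are hypothetical; how the resulting name is placed into the thumbnail URL is not shown and is left to the engine code.

```python
from urllib.parse import unquote


def thumbnail_file_name(img_src_name):
    """Decode a percent-encoded file name and add '.png' for SVG files.

    Hypothetical helper for illustration only.
    """
    name = unquote(img_src_name)
    if name.lower().endswith(".svg"):
        name += ".png"
    return name


svg_name = thumbnail_file_name("Flag%20of%20France.svg")
jpg_name = thumbnail_file_name("Eiffel%20Tower.jpg")
```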
Markus Heiser
a13c5d70c7 [fix] wikidata engine: select image with higher (not lower) priority
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-06 23:35:55 +01:00
Léon Tiekötter
a50f32bcfc
wikidata: load thumbnail instead of full image 2022-02-06 23:25:50 +01:00
Léon Tiekötter
560a14e77b
[fix] wikidata info box images
Wikidata info box images are now loaded from uploads.wikimedia.org instead of commons.wikimedia.org to prevent redirects

Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-06 22:16:06 +01:00
Markus Heiser
b35ef9789b [pylint] engines/invidious.py
Fix remarks from pylint and remove useless comments

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 15:42:06 +01:00
Markus Heiser
e2ec6b4211 [fix] invidious engine: store random base_url in param
Two different threads ( = two different user queries) can call the request
function one after the other and only then the response function.  The
namespace is the same for both since this is the same engine.

To keep exactly the same value, ``base_url`` must be stored in params and then
retrieved using ``resp.search_params["base_url"]``.

Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#discussion_r799324861
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 15:42:06 +01:00
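The pattern described in the commit above can be sketched as below. Only ``base_url`` and ``resp.search_params["base_url"]`` come from the commit message; the instance list, the URL paths and the ``video_ids`` attribute standing in for a parsed response are assumptions made for illustration.

```python
import random
from types import SimpleNamespace
from urllib.parse import urlencode

# Hypothetical instance list; a real engine would read this from its settings.
BASE_URLS = ["https://invidious.example.org", "https://yt.example.net"]


def request(query, params):
    """Pick a random instance and remember it in this request's params."""
    base_url = random.choice(BASE_URLS)
    params["base_url"] = base_url  # per-request state, not a module global
    params["url"] = f"{base_url}/api/v1/search?{urlencode({'q': query})}"
    return params


def response(resp):
    """Build result URLs from the base URL stored with the request."""
    base_url = resp.search_params["base_url"]
    return [{"url": f"{base_url}/watch?v={video_id}"} for video_id in resp.video_ids]


# Usage: the response object carries the params of its own request.
params = request("test", {})
resp = SimpleNamespace(search_params=params, video_ids=["abc123"])
results = response(resp)
```

Because the chosen URL travels with the request instead of living in a module-level variable, a second query that picks a different instance cannot change the value seen by the first query's response handler.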
Markus Heiser
ddc2102a07 [fix] solidtorrents engine: store random base_url in param
Two different threads ( = two different user queries) can call the request
function one after the other and only then the response function.  The
namespace is the same for both since this is the same engine.

To keep exactly the same value, ``base_url`` must be stored in params and then
retrieved using ``resp.search_params["base_url"]``.

Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#discussion_r799324861
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 14:55:21 +01:00
Markus Heiser
d6061b7c8a [mod] solidtorrents engine: add metadata & torrentfile
BTW: define min_len in eval_xpath_list of 'stats' list

Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#pullrequestreview-872910744
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 14:53:42 +01:00
Markus Heiser
f9c4868142 [fix] solidtorrents engine: use get_torrent_size from searx.utils
Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#pullrequestreview-872858489
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 14:53:42 +01:00
Markus Heiser
d92b3d96fd [fix] solidtorrents engine: JSON API no longer exists
The API endpoint we were using does not exist anymore.  This patch is a
rewrite that parses the HTML page.

Related: https://github.com/paulgoio/searxng/issues/17
Closes: https://github.com/searxng/searxng/issues/858

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-04 14:53:37 +01:00