Two different threads ( = two different user queries) can call the request
function in a row and then the response function. The namespace will be same
since this is the same engine.
To keep exactly the same value ``base_url`` must be stored in params and then
retrieve using ``resp.search_params["base_url"]``.
Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#discussion_r799324861
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Two different threads ( = two different user queries) can call the request
function in a row and then the response function. The namespace will be same
since this is the same engine.
To keep exactly the same value ``base_url`` must be stored in params and then
retrieve using ``resp.search_params["base_url"]``.
Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#discussion_r799324861
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
can replace filtron:
* rate limite the number of request per IP and per (IP, User-Agent)
* block some bots
use Redis
data stored in Redis never contains the IP addresses, only HMAC using the secret_key
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
Now that about.html extends page_with_header.html
it already has a link to the start page and removing
the link makes it easier to extract the page title
from the Markdown for the following commit.
Currency engine has DuckDuckGo metadata
In the engine selector of the preferences window, the currency search engine has
the same metadata and wikidata url as duckduckgo, I'd assume there should be a
difference of some sort there clarifying what source the currency uses or, if
it's a duckduckgo service, at least clarifying that it's a currency service by
duck duck go.
Closes: https://github.com/searxng/searxng/issues/787
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Previously the preferences & stats templates contained the markup:
<a href="{{ url_for('index') }}"><h1><span>SearXNG</span></h1></a>
There are many things wrong with this:
1. the markup was duplicated
2. the CSS needed to be changed whenever a new page wanted to use this
header (since the CSS used page-specific selectors)
3. h1 should be reserved for the actual page title
(e.g. Preferences or Engine stats)
4. the image was set via CSS which also set:
span { visibility: hidden; }
which however removes the alternative text from the accessibility
tree (meaning screen readers will ignore it).
This commit fixes all these problems.
Other optional parameter ..
`&sort=crawl_date`
can be appended to search_string to sort results by date.
`&domain=example.org`
can be implemented to search_string to get results from just one domain.
Public instances could get relatively fast timed-out for 3600s.
--
Merged from @allendema's commit [1] and slightly modfied / see [2].
Related-to: [1] 455b2b4460
Related-to: [2] https://github.com/searx/searx/pull/3040
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Check 'using_tor_proxy' for each engine individually instead of checking globally
[fix] searx.network: update _rdns test to the last httpx version
Co-authored-by: Alexandre Flament <alex@al-f.net>
The macro "checkbox" in macros.html uses the macro "icon_small"
from icons.html
The commit imports icon_small in macros.html to fix the issue.
It works because the macros in macros.html are imported with the Jinja2 context.
See https://jinja.palletsprojects.com/en/3.0.x/templates/#import-visibilityclose#819
Engine description can be configured, this is needed e.g. by custom search
engines. Here is an example of a command engine with a description in the about
section::
- name: locate
engine: command
command: ['locate', '{{QUERY}}']
disabled: true
categories: files
about:
description: local files
website: 'https://www.man7.org/linux/man-pages/man1/locate.1.html'
delimiter:
chars: ' '
keys: ['line']
Closes: https://github.com/searxng/searxng/issues/788
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Use httpx.Response.json() to avoid charset_normalizer issues:
DEBUG charset_normalizer : override steps (5) and chunk_size (512) as content does not fit (153 byte(s) given) parameters.
INFO charset_normalizer : ascii passed initial chaos probing. Mean measured chaos is 0.000000 %
DEBUG charset_normalizer : ascii should target any language(s) of ['Latin Based']
INFO charset_normalizer : ascii is most likely the one. Stopping the process.
[1] https://www.python-httpx.org/api/#response
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Currently we have two kinds of user documentation:
* the about page[1] which is written in HTML and part of the web
application and can therefore link instance-specific pages
(like e.g. the preferences) via Jinja variables
* the Sphinx documentation[2] which is written in reStructuredText
and cannot link instance-specific pages since it doesn't know
which instance the user is using
The plan is to integrate the user documentation currently in Sphinx
into the application, so that it can also link instance specific pages.
We also want to enable the user documentation to be translated.
This commit implements the first step in this endeavor (see #722).
[1]: searx/templates/__common__/about.html
[2]: docs/user/ (currently served at https://docs.searxng.org/user/)
Since https://github.com/searxng/searxng/pull/354
the searx.network.stream(...) returns a tuple
This commits update the checker code according to
this function signature change.
webapp.py monkey-patches the Flask request global.
This commit adds a type cast so that e.g. Pyright[1]
doesn't show "Cannot access member" errors everywhere.
[1]: https://github.com/microsoft/pyright
* mirror all inline SVGs so that direction SVGs display correctly on RTL
* set the bold list element in info box to RTL so the colon gets displayed on the right side
* set correct .ltr function for the left border on the search button in #q
* move text to the right in autocomplete
* move search form in lign with result article on RTL
* add the correct padding for img thumbnails in categories like music on RTL
* apply RTL to result table for map results
* align text in tables part of /preferences on RTL
* move burger menu on index page to the left on RTL
* fix positioning of drop down arrow on select boxes on RTL
* align result URL on the right (written LTR)
* align vim hotkeys help on the left since it is not translated
* image detail:
* labels (author, format, URL, etc...) are written on the right,
values are on the left.
* URL are written LTR and overflow on the right
The less grunt runner silently ignore missing files and continue with the build[1]::
Running "less:production" (less) task
>> Destination css/searxng.min.css not written because no source files were found.
>> 1 stylesheet created.
>> 1 sourcemap created.
Add filter function that calls grunt.fail() if the scr file does not exists.
[1] https://github.com/searxng/searxng/pull/750#discussion_r784357031
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Bangs with a `*` suffix (e.g. `!!d*`) overwrite Bangs with the same
prefix (e.g. `!!d`) [1]. This can be avoid when a non printable character is
used to tag a LEAF_KEY.
[1] https://github.com/searxng/searxng/pull/740#issuecomment-1010411888
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
There is an issue with redis v4.1.0 [1] / for the interim lets remove this
python dependency.
[1] https://github.com/searxng/searxng/issues/741
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
An ambiguous bang like `!!d` raises an exception in function get_bang_url(). A
bang is only unique when the bang_definition from get_bang_definition_and_ac() is
a string / for a ambiguous bang the returned bang_definition is a dictionary.
Reported-by: user prg at #searxng:matrix.org on 2022/01/11
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In case of CAPTCHA raise a SearxEngineCaptchaException and suspend for 7 days.
When get_sc_code() fails raise a SearxEngineResponseException and suspend for 7
days.
[1] https://github.com/searxng/searxng/pull/695
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Startpage has introduced new anti-scraping measures that make SearXNG instances
run into captchas:
1. some arguments has been removed and a new `sc` has been added.
2. search path changed from `do/search` to `sp/search`
3. POST request is no longer needed
Closes: https://github.com/searxng/searxng/issues/692
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
api.openverse.engineering is a little picky and wants to have a trailing slash
in the path:
/v1/images? -->/ v1/images/?
otherwise it redirects, here is the debug log:
DEBUG searx.network.openverse : HTTP Request: GET https://api.openverse.engineering/v1/images?&page=1&page_size=20&format=json&q=foo "HTTP/2 301 Moved Permanently" (text/html; charset=utf-8)
DEBUG searx.network.openverse : HTTP Request: GET https://api.openverse.engineering/v1/images/?&page=1&page_size=20&format=json&q=foo "HTTP/2 200 OK" (application/json)
WARNING searx.engines.openverse : ErrorContext('searx/search/processors/online.py', 105, 'count_error(', None, '1 redirects, maximum: 0', ('200', 'OK', 'api.openverse.engineering')) True
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The implementation of the etools engine is poor. No date-range support, no
language support and it is broken by a CAPTCHA.
etools is a metasearch engine, the major search engines it supports (google,
bing, wikipedia, Yahoo) are already available in SeaarXNG.
While etools does support several engines we currently don't support directly,
support for them should be added directly to SearXNG if there is demand.
In practice: in SearXNG the worse etools results will be mixed with good results
from other engines we have (as long as there is no captcha).
At best case, what we win with etools is in e.g. results from de.ask.com in a
query from a german request .. in all other cases worse results are bubble up in
SearXNG's result list.
[1] https://github.com/searxng/searxng/issues/696#issuecomment-1005855499
Closes: https://github.com/searxng/searxng/issues/696
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The previous implementation used two hash sets and a list.
... that's not necessary ... a single hash map suffices.
And it's also less error prone ... because the previous data structure
allowed a setting to be enabled and disabled at the same time.
Previously the default_value was abused for the cookie name.
Having SwitchableSetting subclass Setting doesn't even make sense
in the first place since none of the Setting methods apply.