[docs] revision of the article "Engine Overview"
This patch revises the article "Engine Overview":
- add links & anchors
- improve formatting of the tables

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
parent f122cb0e27
commit 5caebb379f
1 changed file with 232 additions and 225 deletions
|
@ -1,12 +1,20 @@
|
||||||
|
|
||||||
.. _engines-dev:
|
.. _engines-dev:
|
||||||
|
|
||||||
===============
|
===============
|
||||||
Engine overview
|
Engine Overview
|
||||||
===============
|
===============
|
||||||
|
|
||||||
.. _metasearch-engine: https://en.wikipedia.org/wiki/Metasearch_engine
|
.. _metasearch-engine: https://en.wikipedia.org/wiki/Metasearch_engine
|
||||||
|
|
||||||
|
.. sidebar:: Further reading ..
|
||||||
|
|
||||||
|
- :ref:`general engine settings`
|
||||||
|
- :ref:`settings engine`
|
||||||
|
|
||||||
|
.. contents::
|
||||||
|
:depth: 3
|
||||||
|
:backlinks: entry
|
||||||
|
|
||||||
searx is a metasearch-engine_, so it uses different search engines to provide
|
searx is a metasearch-engine_, so it uses different search engines to provide
|
||||||
better results.
|
better results.
|
||||||
|
|
||||||
|
@ -14,297 +22,296 @@ Because there is no general search API which could be used for every search
|
||||||
engine, an adapter has to be built between searx and the external search
|
engine, an adapter has to be built between searx and the external search
|
||||||
engines. Adapters are stored under the folder :origin:`searx/engines`.
|
engines. Adapters are stored under the folder :origin:`searx/engines`.
|
||||||
|
|
||||||
.. contents::
|
|
||||||
:depth: 3
|
|
||||||
:backlinks: entry
|
|
||||||
|
|
||||||
|
|
||||||
.. _general engine configuration:
|
.. _general engine configuration:
|
||||||
|
|
||||||
general engine configuration
|
General Engine Configuration
|
||||||
============================
|
============================
|
||||||
|
|
||||||
It is required to tell searx the type of results the engine provides. The
|
It is required to tell searx the type of results the engine provides. The
|
||||||
arguments can be set in the engine file or in the settings file
|
arguments can be set in the engine file or in the settings file (normally
|
||||||
(normally ``settings.yml``). The arguments in the settings file override
|
``settings.yml``). The arguments in the settings file override the ones in the
|
||||||
the ones in the engine file.
|
engine file.
|
||||||
|
|
||||||
It does not matter if an option is stored in the engine file or in the
|
It does not matter if an option is stored in the engine file or in the settings.
|
||||||
settings. However, the standard way is the following:
|
However, the standard way is the following:
|
||||||
|
|
||||||
.. _engine file:
|
.. _engine file:
|
||||||
|
|
||||||
engine file
|
Engine File
|
||||||
-----------
|
-----------
|
||||||
|
|
||||||
======================= =========== ========================================================
|
.. table:: Common options in the engine module
|
||||||
argument type information
|
:width: 100%
|
||||||
======================= =========== ========================================================
|
|
||||||
categories list pages, in which the engine is working
|
======================= =========== ========================================================
|
||||||
paging boolean support multible pages
|
argument type information
|
||||||
time_range_support boolean support search time range
|
======================= =========== ========================================================
|
||||||
engine_type str ``online`` by default, other possibles values are
|
categories list pages, in which the engine is working
|
||||||
``offline``, ``online_dictionnary``, ``online_currency``
|
paging boolean support multiple pages
|
||||||
======================= =========== ========================================================
|
time_range_support boolean support search time range
|
||||||
|
engine_type str - ``online`` :ref:`[ref] <demo online engine>` by
|
||||||
|
default, other possible values are:
|
||||||
|
- ``offline`` :ref:`[ref] <offline engines>`
|
||||||
|
- ``online_dictionary``
|
||||||
|
- ``online_currency``
|
||||||
|
======================= =========== ========================================================
|
||||||
|
|
||||||
.. _engine settings:
|
.. _engine settings:
|
||||||
|
|
||||||
settings.yml
|
Engine ``settings.yml``
|
||||||
------------
|
-----------------------
|
||||||
|
|
||||||
======================= =========== =============================================
|
For a more detailed description, see :ref:`settings engine` in the :ref:`settings.yml`.
|
||||||
argument type information
|
|
||||||
======================= =========== =============================================
|
.. table:: Common options in the engine setup (``settings.yml``)
|
||||||
name string name of search-engine
|
:width: 100%
|
||||||
engine string name of searx-engine
|
|
||||||
(filename without ``.py``)
|
======================= =========== ===============================================
|
||||||
enable_http bool enable HTTP
|
argument type information
|
||||||
(by default only HTTPS is enabled).
|
======================= =========== ===============================================
|
||||||
shortcut string shortcut of search-engine
|
name string name of search-engine
|
||||||
timeout string specific timeout for search-engine
|
engine string name of searx-engine (filename without ``.py``)
|
||||||
display_error_messages boolean display error messages on the web UI
|
enable_http bool enable HTTP (by default only HTTPS is enabled).
|
||||||
proxies dict set proxies for a specific engine
|
shortcut string shortcut of search-engine
|
||||||
|
timeout string specific timeout for search-engine
|
||||||
|
display_error_messages boolean display error messages on the web UI
|
||||||
|
proxies dict set proxies for a specific engine
|
||||||
(e.g. ``proxies : {http: socks5://proxy:port,
|
(e.g. ``proxies : {http: socks5://proxy:port,
|
||||||
https: socks5://proxy:port}``)
|
https: socks5://proxy:port}``)
|
||||||
======================= =========== =============================================
|
======================= =========== ===============================================
|
||||||
|
|
||||||
|
.. _engine overrides:
|
||||||
|
|
||||||
overrides
|
Overrides
|
||||||
---------
|
---------
|
||||||
|
|
||||||
A few of the options have default values in the engine, but are often
|
.. sidebar:: engine's global names
|
||||||
overwritten by the settings. If ``None`` is assigned to an option in the engine
|
|
||||||
file, it has to be redefined in the settings, otherwise searx will not start
|
|
||||||
with that engine.
|
|
||||||
|
|
||||||
The naming of overrides is arbitrary. But the recommended overrides are the
|
Global names with a leading underscore are *private to the engine* and will
|
||||||
following:
|
not be overwritten.
|
||||||
|
|
||||||
======================= =========== ===========================================
|
A few of the options have default values in the namespace of the engine's Python
|
||||||
argument type information
|
module, but are often overwritten by the settings. If ``None`` is assigned to an
|
||||||
======================= =========== ===========================================
|
option in the engine file, it has to be redefined in the settings, otherwise
|
||||||
base_url string base-url, can be overwritten to use same
|
searx will not start with that engine.
|
||||||
engine on other URL
|
|
||||||
number_of_results int maximum number of results per request
|
|
||||||
language string ISO code of language and country like en_US
|
|
||||||
api_key string api-key if required by engine
|
|
||||||
======================= =========== ===========================================
|
|
||||||
|
|
||||||
example code
|
Here is a very simple example of the global names in the namespace of the engine's
|
||||||
------------
|
module:
|
||||||
|
|
||||||
.. code:: python
|
.. code:: python
|
||||||
|
|
||||||
# engine dependent config
|
# engine dependent config
|
||||||
categories = ['general']
|
categories = ['general']
|
||||||
paging = True
|
paging = True
|
||||||
|
_non_overwritten_global = 'foo'
|
||||||
|
|
||||||
|
|
||||||
|
.. table:: The naming of overrides is arbitrary / recommended overrides are:
|
||||||
|
:width: 100%
|
||||||
|
|
||||||
|
======================= =========== ===========================================
|
||||||
|
argument type information
|
||||||
|
======================= =========== ===========================================
|
||||||
|
base_url string base-url, can be overwritten to use same
|
||||||
|
engine on other URL
|
||||||
|
number_of_results int maximum number of results per request
|
||||||
|
language string ISO code of language and country like en_US
|
||||||
|
api_key string api-key if required by engine
|
||||||
|
======================= =========== ===========================================
|
||||||
|
|
||||||
.. _engine request:
|
.. _engine request:
|
||||||
|
|
||||||
making a request
|
Making a Request
|
||||||
================
|
================
|
||||||
|
|
||||||
To perform a search, a URL has to be specified. In addition to specifying a
|
To perform a search, a URL has to be specified. In addition to specifying a
|
||||||
URL, arguments can be passed to the query.
|
URL, arguments can be passed to the query.
|
||||||
|
|
||||||
passed arguments
|
.. _engine request arguments:
|
||||||
----------------
|
|
||||||
|
Passed Arguments (request)
|
||||||
|
--------------------------
|
||||||
|
|
||||||
These arguments can be used to construct the search query. Furthermore,
|
These arguments can be used to construct the search query. Furthermore,
|
||||||
parameters with default value can be redefined for special purposes.
|
parameters with default value can be redefined for special purposes.
|
||||||
|
|
||||||
If the ``engine_type`` is ``online```:
|
|
||||||
|
|
||||||
====================== ============== ========================================================================
|
.. table:: If the ``engine_type`` is ``online``
|
||||||
argument type default-value, information
|
:width: 100%
|
||||||
====================== ============== ========================================================================
|
|
||||||
url str ``''``
|
====================== ============== ========================================================================
|
||||||
method str ``'GET'``
|
argument type default-value, information
|
||||||
headers set ``{}``
|
====================== ============== ========================================================================
|
||||||
data set ``{}``
|
url str ``''``
|
||||||
cookies set ``{}``
|
method str ``'GET'``
|
||||||
verify bool ``True``
|
headers set ``{}``
|
||||||
headers.User-Agent str a random User-Agent
|
data set ``{}``
|
||||||
category str current category, like ``'general'``
|
cookies set ``{}``
|
||||||
safesearch int ``0``, between ``0`` and ``2`` (normal, moderate, strict)
|
verify bool ``True``
|
||||||
time_range Optional[str] ``None``, can be ``day``, ``week``, ``month``, ``year``
|
headers.User-Agent str a random User-Agent
|
||||||
pageno int current pagenumber
|
category str current category, like ``'general'``
|
||||||
language str specific language code like ``'en_US'``, or ``'all'`` if unspecified
|
safesearch int ``0``, between ``0`` and ``2`` (normal, moderate, strict)
|
||||||
====================== ============== ========================================================================
|
time_range Optional[str] ``None``, can be ``day``, ``week``, ``month``, ``year``
|
||||||
|
pageno int current pagenumber
|
||||||
|
language str specific language code like ``'en_US'``, or ``'all'`` if unspecified
|
||||||
|
====================== ============== ========================================================================
|
||||||
|
|
||||||
|
|
||||||
If the ``engine_type`` is ``online_dictionnary```, in addition to the ``online`` arguments:
|
.. table:: If the ``engine_type`` is ``online_dictionary``, in addition to the
|
||||||
|
``online`` arguments:
|
||||||
|
:width: 100%
|
||||||
|
|
||||||
====================== ============ ========================================================================
|
====================== ============== ========================================================================
|
||||||
argument type default-value, information
|
argument type default-value, information
|
||||||
====================== ============ ========================================================================
|
====================== ============== ========================================================================
|
||||||
from_lang str specific language code like ``'en_US'``
|
from_lang str specific language code like ``'en_US'``
|
||||||
to_lang str specific language code like ``'en_US'``
|
to_lang str specific language code like ``'en_US'``
|
||||||
query str the text query without the languages
|
query str the text query without the languages
|
||||||
====================== ============ ========================================================================
|
====================== ============== ========================================================================
|
||||||
|
|
||||||
If the ``engine_type`` is ``online_currency```, in addition to the ``online`` arguments:
|
.. table:: If the ``engine_type`` is ``online_currency``, in addition to the
|
||||||
|
``online`` arguments:
|
||||||
|
:width: 100%
|
||||||
|
|
||||||
====================== ============ ========================================================================
|
====================== ============== ========================================================================
|
||||||
argument type default-value, information
|
argument type default-value, information
|
||||||
====================== ============ ========================================================================
|
====================== ============== ========================================================================
|
||||||
amount float the amount to convert
|
amount float the amount to convert
|
||||||
from str ISO 4217 code
|
from str ISO 4217 code
|
||||||
to str ISO 4217 code
|
to str ISO 4217 code
|
||||||
from_name str currency name
|
from_name str currency name
|
||||||
to_name str currency name
|
to_name str currency name
|
||||||
====================== ============ ========================================================================
|
====================== ============== ========================================================================
|
||||||
|
|
||||||
|
|
||||||
parsed arguments
|
Specify Request
|
||||||
----------------
|
---------------
|
||||||
|
|
||||||
The function ``def request(query, params):`` always returns the ``params``
|
The function :py:func:`def request(query, params):
|
||||||
variable. Inside searx, the following paramters can be used to specify a search
|
<searx.engines.demo_online.request>` always returns the ``params`` variable, the
|
||||||
request:
|
following parameters can be used to specify a search request:
|
||||||
|
|
||||||
=================== =========== ==========================================================================
|
.. table::
|
||||||
argument type information
|
:width: 100%
|
||||||
=================== =========== ==========================================================================
|
|
||||||
url str requested url
|
|
||||||
method str HTTP request method
|
|
||||||
headers set HTTP header information
|
|
||||||
data set HTTP data information
|
|
||||||
cookies set HTTP cookies
|
|
||||||
verify bool Performing SSL-Validity check
|
|
||||||
allow_redirects bool Follow redirects
|
|
||||||
max_redirects int maximum redirects, hard limit
|
|
||||||
soft_max_redirects int maximum redirects, soft limit. Record an error but don't stop the engine
|
|
||||||
raise_for_httperror bool True by default: raise an exception if the HTTP code of response is >= 300
|
|
||||||
=================== =========== ==========================================================================
|
|
||||||
|
|
||||||
|
=================== =========== ==========================================================================
|
||||||
example code
|
argument type information
|
||||||
------------
|
=================== =========== ==========================================================================
|
||||||
|
url str requested url
|
||||||
.. code:: python
|
method str HTTP request method
|
||||||
|
headers set HTTP header information
|
||||||
# search-url
|
data set HTTP data information
|
||||||
base_url = 'https://example.com/'
|
cookies set HTTP cookies
|
||||||
search_string = 'search?{query}&page={page}'
|
verify bool Performing SSL-Validity check
|
||||||
|
allow_redirects bool Follow redirects
|
||||||
# do search-request
|
max_redirects int maximum redirects, hard limit
|
||||||
def request(query, params):
|
soft_max_redirects int maximum redirects, soft limit. Record an error but don't stop the engine
|
||||||
search_path = search_string.format(
|
raise_for_httperror bool True by default: raise an exception if the HTTP code of response is >= 300
|
||||||
query=urlencode({'q': query}),
|
=================== =========== ==========================================================================
|
||||||
page=params['pageno'])
|
|
||||||
|
|
||||||
params['url'] = base_url + search_path
|
|
||||||
|
|
||||||
return params
|
|
||||||
|
|
||||||
|
|
||||||
.. _engine results:
|
.. _engine results:
|
||||||
|
.. _engine media types:
|
||||||
|
|
||||||
returned results
|
Media Types
|
||||||
================
|
===========
|
||||||
|
|
||||||
Searx is able to return results of different media-types. Currently the
|
Each result item of an engine can be of different media-types. Currently the
|
||||||
following media-types are supported:
|
following media-types are supported. To set another media-type as ``default``,
|
||||||
|
the parameter ``template`` must be set to the desired type.
|
||||||
|
|
||||||
- default_
|
.. table:: Parameter of the **default** media type:
|
||||||
- images_
|
:width: 100%
|
||||||
- videos_
|
|
||||||
- torrent_
|
|
||||||
- map_
|
|
||||||
|
|
||||||
To set another media-type as default, the parameter ``template`` must be set to
|
========================= =====================================================
|
||||||
the desired type.
|
result-parameter information
|
||||||
|
========================= =====================================================
|
||||||
|
url string, url of the result
|
||||||
|
title string, title of the result
|
||||||
|
content string, general result-text
|
||||||
|
publishedDate :py:class:`datetime.datetime`, time of publish
|
||||||
|
========================= =====================================================
|
||||||
|
|
||||||
default
|
|
||||||
-------
|
|
||||||
|
|
||||||
========================= =====================================================
|
.. table:: Parameter of the **images** media type:
|
||||||
result-parameter information
|
:width: 100%
|
||||||
========================= =====================================================
|
|
||||||
url string, url of the result
|
|
||||||
title string, title of the result
|
|
||||||
content string, general result-text
|
|
||||||
publishedDate :py:class:`datetime.datetime`, time of publish
|
|
||||||
========================= =====================================================
|
|
||||||
|
|
||||||
images
|
========================= =====================================================
|
||||||
------
|
result-parameter information
|
||||||
|
------------------------- -----------------------------------------------------
|
||||||
To use this template, the parameter:
|
template is set to ``images.html``
|
||||||
|
========================= =====================================================
|
||||||
========================= =====================================================
|
url string, url to the result site
|
||||||
result-parameter information
|
title string, title of the result *(partly implemented)*
|
||||||
========================= =====================================================
|
content *(partly implemented)*
|
||||||
template is set to ``images.html``
|
publishedDate :py:class:`datetime.datetime`,
|
||||||
url string, url to the result site
|
|
||||||
title string, title of the result *(partly implemented)*
|
|
||||||
content *(partly implemented)*
|
|
||||||
publishedDate :py:class:`datetime.datetime`,
|
|
||||||
time of publish *(partly implemented)*
|
time of publish *(partly implemented)*
|
||||||
img\_src string, url to the result image
|
img\_src string, url to the result image
|
||||||
thumbnail\_src string, url to a small-preview image
|
thumbnail\_src string, url to a small-preview image
|
||||||
========================= =====================================================
|
========================= =====================================================
|
||||||
|
|
||||||
videos
|
|
||||||
------
|
|
||||||
|
|
||||||
========================= =====================================================
|
.. table:: Parameter of the **videos** media type:
|
||||||
result-parameter information
|
:width: 100%
|
||||||
========================= =====================================================
|
|
||||||
template is set to ``videos.html``
|
|
||||||
url string, url of the result
|
|
||||||
title string, title of the result
|
|
||||||
content *(not implemented yet)*
|
|
||||||
publishedDate :py:class:`datetime.datetime`, time of publish
|
|
||||||
thumbnail string, url to a small-preview image
|
|
||||||
========================= =====================================================
|
|
||||||
|
|
||||||
torrent
|
========================= =====================================================
|
||||||
-------
|
result-parameter information
|
||||||
|
------------------------- -----------------------------------------------------
|
||||||
|
template is set to ``videos.html``
|
||||||
|
========================= =====================================================
|
||||||
|
url string, url of the result
|
||||||
|
title string, title of the result
|
||||||
|
content *(not implemented yet)*
|
||||||
|
publishedDate :py:class:`datetime.datetime`, time of publish
|
||||||
|
thumbnail string, url to a small-preview image
|
||||||
|
========================= =====================================================
|
||||||
|
|
||||||
.. _magnetlink: https://en.wikipedia.org/wiki/Magnet_URI_scheme
|
.. _magnetlink: https://en.wikipedia.org/wiki/Magnet_URI_scheme
|
||||||
|
|
||||||
========================= =====================================================
|
.. table:: Parameter of the **torrent** media type:
|
||||||
result-parameter information
|
:width: 100%
|
||||||
========================= =====================================================
|
|
||||||
template is set to ``torrent.html``
|
========================= =====================================================
|
||||||
url string, url of the result
|
result-parameter information
|
||||||
title string, title of the result
|
------------------------- -----------------------------------------------------
|
||||||
content string, general result-text
|
template is set to ``torrent.html``
|
||||||
publishedDate :py:class:`datetime.datetime`,
|
========================= =====================================================
|
||||||
|
url string, url of the result
|
||||||
|
title string, title of the result
|
||||||
|
content string, general result-text
|
||||||
|
publishedDate :py:class:`datetime.datetime`,
|
||||||
time of publish *(not implemented yet)*
|
time of publish *(not implemented yet)*
|
||||||
seed int, number of seeder
|
seed int, number of seeder
|
||||||
leech int, number of leecher
|
leech int, number of leecher
|
||||||
filesize int, size of file in bytes
|
filesize int, size of file in bytes
|
||||||
files int, number of files
|
files int, number of files
|
||||||
magnetlink string, magnetlink_ of the result
|
magnetlink string, magnetlink_ of the result
|
||||||
torrentfile string, torrentfile of the result
|
torrentfile string, torrentfile of the result
|
||||||
========================= =====================================================
|
========================= =====================================================
|
||||||
|
|
||||||
|
.. table:: Parameter of the **map** media type:
|
||||||
|
:width: 100%
|
||||||
|
|
||||||
map
|
========================= =====================================================
|
||||||
---
|
result-parameter information
|
||||||
|
------------------------- -----------------------------------------------------
|
||||||
========================= =====================================================
|
template is set to ``map.html``
|
||||||
result-parameter information
|
========================= =====================================================
|
||||||
========================= =====================================================
|
url string, url of the result
|
||||||
url string, url of the result
|
title string, title of the result
|
||||||
title string, title of the result
|
content string, general result-text
|
||||||
content string, general result-text
|
publishedDate :py:class:`datetime.datetime`, time of publish
|
||||||
publishedDate :py:class:`datetime.datetime`, time of publish
|
latitude latitude of result (in decimal format)
|
||||||
latitude latitude of result (in decimal format)
|
longitude longitude of result (in decimal format)
|
||||||
longitude longitude of result (in decimal format)
|
boundingbox boundingbox of result (array of 4 values
|
||||||
boundingbox boundingbox of result (array of 4 values
|
|
||||||
``[lat-min, lat-max, lon-min, lon-max]``)
|
``[lat-min, lat-max, lon-min, lon-max]``)
|
||||||
geojson geojson of result (https://geojson.org/)
|
geojson geojson of result (https://geojson.org/)
|
||||||
osm.type type of osm-object (if OSM-Result)
|
osm.type type of osm-object (if OSM-Result)
|
||||||
osm.id id of osm-object (if OSM-Result)
|
osm.id id of osm-object (if OSM-Result)
|
||||||
address.name name of object
|
address.name name of object
|
||||||
address.road street name of object
|
address.road street name of object
|
||||||
address.house_number house number of object
|
address.house_number house number of object
|
||||||
address.locality city, place of object
|
address.locality city, place of object
|
||||||
address.postcode postcode of object
|
address.postcode postcode of object
|
||||||
address.country country of object
|
address.country country of object
|
||||||
========================= =====================================================
|
========================= =====================================================
|
||||||
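The ``request`` example that this revision drops from the article can be restated as a self-contained sketch. It relies only on the ``params`` keys listed in the request tables (``url``, ``pageno``). The base URL, the query-string layout and the one-key ``params`` dict in the usage lines are placeholders for illustration, not a real engine or the full dict searx passes in.

```python
from urllib.parse import urlencode

# engine dependent config (global names in the engine module's namespace)
categories = ['general']
paging = True

# search-url: placeholder values, not a real search service
base_url = 'https://example.com/'
search_string = 'search?{query}&page={page}'


def request(query, params):
    """Set params['url'] for the current query and page, then return params."""
    search_path = search_string.format(
        query=urlencode({'q': query}),  # URL-encode the user's query
        page=params['pageno'],          # current page number
    )
    params['url'] = base_url + search_path
    return params


# Hypothetical call with only the key this sketch reads.
params = request('searx engine', {'pageno': 2})
print(params['url'])
```

The function mutates and returns the same ``params`` object, which matches the statement in the article that ``request`` always returns the ``params`` variable.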
|
|
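The media-type tables can be read as a description of the keys of a result item. The sketch below assumes that a result item is a plain Python dict, which the diff itself does not state. The helper names ``default_result`` and ``image_result`` are hypothetical and exist only for this illustration; the keys (``url``, ``title``, ``content``, ``publishedDate``, ``template``, ``img_src``, ``thumbnail_src``) are taken from the tables.

```python
from datetime import datetime


def default_result(url, title, content, published=None):
    """Build an item for the default media type (no 'template' key)."""
    result = {'url': url, 'title': title, 'content': content}
    if published is not None:
        result['publishedDate'] = published  # a datetime.datetime
    return result


def image_result(url, title, img_src, thumbnail_src=None):
    """Build an item for the images media type."""
    result = {
        'template': 'images.html',  # selects the images media type
        'url': url,                 # url to the result site
        'title': title,
        'img_src': img_src,         # url to the result image
    }
    if thumbnail_src is not None:
        result['thumbnail_src'] = thumbnail_src
    return result


# Placeholder data, not taken from any real engine response.
results = [
    default_result('https://example.com/a', 'Page A', 'Some text',
                   published=datetime(2021, 1, 1)),
    image_result('https://example.com/b', 'Image B',
                 'https://example.com/b.png'),
]
```

Optional keys are left out of the dict when no value is available, so each item carries only the parameters the table lists for its media type.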