[docs] revision of the article "Engine Overview"
This patch is a revision of the article "Engine Overview":

- add links & anchors
- improve formatting of the tables

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
parent f122cb0e27
commit 5caebb379f
1 changed file with 232 additions and 225 deletions
|
@@ -1,12 +1,20 @@
|
|||
|
||||
.. _engines-dev:
|
||||
|
||||
===============
|
||||
Engine overview
|
||||
Engine Overview
|
||||
===============
|
||||
|
||||
.. _metasearch-engine: https://en.wikipedia.org/wiki/Metasearch_engine
|
||||
|
||||
.. sidebar:: Further reading ..
|
||||
|
||||
- :ref:`general engine settings`
|
||||
- :ref:`settings engine`
|
||||
|
||||
.. contents::
|
||||
:depth: 3
|
||||
:backlinks: entry
|
||||
|
||||
searx is a metasearch-engine_, so it uses different search engines to provide
|
||||
better results.
|
||||
|
||||
|
@@ -14,297 +22,296 @@ Because there is no general search API which could be used for every search
|
|||
engine, an adapter has to be built between searx and the external search
|
||||
engines. Adapters are stored under the folder :origin:`searx/engines`.
|
||||
|
||||
.. contents::
|
||||
:depth: 3
|
||||
:backlinks: entry
|
||||
|
||||
|
||||
.. _general engine configuration:
|
||||
|
||||
general engine configuration
|
||||
General Engine Configuration
|
||||
============================
|
||||
|
||||
It is required to tell searx the type of results the engine provides. The
|
||||
arguments can be set in the engine file or in the settings file
|
||||
(normally ``settings.yml``). The arguments in the settings file override
|
||||
the ones in the engine file.
|
||||
arguments can be set in the engine file or in the settings file (normally
|
||||
``settings.yml``). The arguments in the settings file override the ones in the
|
||||
engine file.
|
||||
|
||||
It does not matter if an option is stored in the engine file or in the
|
||||
settings. However, the standard way is the following:
|
||||
It does not matter if an option is stored in the engine file or in the settings.
|
||||
However, the standard way is the following:
|
||||
|
||||
.. _engine file:
|
||||
|
||||
engine file
|
||||
Engine File
|
||||
-----------
|
||||
|
||||
======================= =========== ========================================================
|
||||
argument type information
|
||||
======================= =========== ========================================================
|
||||
categories list pages, in which the engine is working
|
||||
paging boolean support multiple pages
|
||||
time_range_support boolean support search time range
|
||||
engine_type str ``online`` by default, other possible values are
|
||||
``offline``, ``online_dictionnary``, ``online_currency``
|
||||
======================= =========== ========================================================
|
||||
.. table:: Common options in the engine module
|
||||
:width: 100%
|
||||
|
||||
======================= =========== ========================================================
|
||||
argument type information
|
||||
======================= =========== ========================================================
|
||||
categories list pages, in which the engine is working
|
||||
paging boolean support multiple pages
|
||||
time_range_support boolean support search time range
|
||||
engine_type str - ``online`` :ref:`[ref] <demo online engine>` by
|
||||
default, other possible values are:
|
||||
- ``offline`` :ref:`[ref] <offline engines>`
|
||||
- ``online_dictionary``
|
||||
- ``online_currency``
|
||||
======================= =========== ========================================================
|
||||
|
||||
.. _engine settings:
|
||||
|
||||
settings.yml
|
||||
------------
|
||||
Engine ``settings.yml``
|
||||
-----------------------
|
||||
|
||||
======================= =========== =============================================
|
||||
argument type information
|
||||
======================= =========== =============================================
|
||||
name string name of search-engine
|
||||
engine string name of searx-engine
|
||||
(filename without ``.py``)
|
||||
enable_http bool enable HTTP
|
||||
(by default only HTTPS is enabled).
|
||||
shortcut string shortcut of search-engine
|
||||
timeout string specific timeout for search-engine
|
||||
display_error_messages boolean display error messages on the web UI
|
||||
proxies dict set proxies for a specific engine
|
||||
(e.g. ``proxies : {http: socks5://proxy:port,
|
||||
https: socks5://proxy:port}``)
|
||||
======================= =========== =============================================
|
||||
For a more detailed description, see :ref:`settings engine` in the :ref:`settings.yml`.
|
||||
|
||||
.. table:: Common options in the engine setup (``settings.yml``)
|
||||
:width: 100%
|
||||
|
||||
overrides
|
||||
======================= =========== ===============================================
|
||||
argument type information
|
||||
======================= =========== ===============================================
|
||||
name string name of search-engine
|
||||
engine string name of searx-engine (filename without ``.py``)
|
||||
enable_http bool enable HTTP (by default only HTTPS is enabled).
|
||||
shortcut string shortcut of search-engine
|
||||
timeout string specific timeout for search-engine
|
||||
display_error_messages boolean display error messages on the web UI
|
||||
proxies dict set proxies for a specific engine
|
||||
(e.g. ``proxies : {http: socks5://proxy:port,
|
||||
https: socks5://proxy:port}``)
|
||||
======================= =========== ===============================================
|
||||
|
||||
.. _engine overrides:
|
||||
|
||||
Overrides
|
||||
---------
|
||||
|
||||
A few of the options have default values in the engine, but are often
|
||||
overwritten by the settings. If ``None`` is assigned to an option in the engine
|
||||
file, it has to be redefined in the settings, otherwise searx will not start
|
||||
with that engine.
|
||||
.. sidebar:: engine's global names
|
||||
|
||||
The naming of overrides is arbitrary. But the recommended overrides are the
|
||||
following:
|
||||
Global names with a leading underscore are *private to the engine* and will
|
||||
not be overwritten.
|
||||
|
||||
======================= =========== ===========================================
|
||||
argument type information
|
||||
======================= =========== ===========================================
|
||||
base_url string base-url, can be overwritten to use same
|
||||
engine on other URL
|
||||
number_of_results int maximum number of results per request
|
||||
language string ISO code of language and country like en_US
|
||||
api_key string api-key if required by engine
|
||||
======================= =========== ===========================================
|
||||
A few of the options have default values in the namespace of the engine's Python
|
||||
module, but are often overwritten by the settings. If ``None`` is assigned to an
|
||||
option in the engine file, it has to be redefined in the settings, otherwise
|
||||
searx will not start with that engine.
|
||||
|
||||
example code
|
||||
------------
|
||||
Here is a very simple example of the global names in the namespace of the engine's
|
||||
module:
|
||||
|
||||
.. code:: python
|
||||
|
||||
# engine dependent config
|
||||
categories = ['general']
|
||||
paging = True
|
||||
_non_overwritten_global = 'foo'
|
||||
|
||||
|
||||
.. table:: The naming of overrides is arbitrary / recommended overrides are:
|
||||
:width: 100%
|
||||
|
||||
======================= =========== ===========================================
|
||||
argument type information
|
||||
======================= =========== ===========================================
|
||||
base_url string base-url, can be overwritten to use same
|
||||
engine on other URL
|
||||
number_of_results int maximum number of results per request
|
||||
language string ISO code of language and country like en_US
|
||||
api_key string api-key if required by engine
|
||||
======================= =========== ===========================================
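The override rule described above can be illustrated with a minimal sketch. It is not searx's actual loader: ``apply_settings`` and the sample values are hypothetical and only model the documented behaviour, namely that settings replace the engine's global names, names with a leading underscore are left untouched, and an option that is still ``None`` afterwards is an error.

```python
from types import SimpleNamespace


def apply_settings(engine, settings):
    """Copy settings onto an engine namespace (hypothetical helper)."""
    for name, value in settings.items():
        # Names with a leading underscore are private to the engine.
        if name.startswith('_'):
            continue
        setattr(engine, name, value)

    # An option left at None was never redefined in the settings.
    missing = [name for name, value in vars(engine).items()
               if value is None and not name.startswith('_')]
    if missing:
        raise ValueError('missing engine options: ' + ', '.join(sorted(missing)))
    return engine


# Stand-in for the global names of an engine module.
engine = SimpleNamespace(
    categories=['general'],
    paging=True,
    base_url=None,  # has to be redefined in the settings
    _non_overwritten_global='foo',
)

apply_settings(engine, {
    'base_url': 'https://example.org/',
    '_non_overwritten_global': 'bar',  # ignored: private name
})
```

Calling ``apply_settings`` without a ``base_url`` entry would raise ``ValueError`` in this sketch, which mirrors the statement that the engine will not start when a ``None`` option is not redefined.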
|
||||
|
||||
.. _engine request:
|
||||
|
||||
making a request
|
||||
Making a Request
|
||||
================
|
||||
|
||||
To perform a search, a URL has to be specified. In addition to specifying a
|
||||
URL, arguments can be passed to the query.
|
||||
|
||||
passed arguments
|
||||
----------------
|
||||
.. _engine request arguments:
|
||||
|
||||
Passed Arguments (request)
|
||||
--------------------------
|
||||
|
||||
These arguments can be used to construct the search query. Furthermore,
|
||||
parameters with default value can be redefined for special purposes.
|
||||
|
||||
If the ``engine_type`` is ``online```:
|
||||
|
||||
====================== ============== ========================================================================
|
||||
argument type default-value, information
|
||||
====================== ============== ========================================================================
|
||||
url str ``''``
|
||||
method str ``'GET'``
|
||||
headers set ``{}``
|
||||
data set ``{}``
|
||||
cookies set ``{}``
|
||||
verify bool ``True``
|
||||
headers.User-Agent str a random User-Agent
|
||||
category str current category, like ``'general'``
|
||||
safesearch int ``0``, between ``0`` and ``2`` (normal, moderate, strict)
|
||||
time_range Optional[str] ``None``, can be ``day``, ``week``, ``month``, ``year``
|
||||
pageno int current page number
|
||||
language str specific language code like ``'en_US'``, or ``'all'`` if unspecified
|
||||
====================== ============== ========================================================================
|
||||
.. table:: If the ``engine_type`` is ``online``
|
||||
:width: 100%
|
||||
|
||||
====================== ============== ========================================================================
|
||||
argument type default-value, information
|
||||
====================== ============== ========================================================================
|
||||
url str ``''``
|
||||
method str ``'GET'``
|
||||
headers set ``{}``
|
||||
data set ``{}``
|
||||
cookies set ``{}``
|
||||
verify bool ``True``
|
||||
headers.User-Agent str a random User-Agent
|
||||
category str current category, like ``'general'``
|
||||
safesearch int ``0``, between ``0`` and ``2`` (normal, moderate, strict)
|
||||
time_range Optional[str] ``None``, can be ``day``, ``week``, ``month``, ``year``
|
||||
pageno int current page number
|
||||
language str specific language code like ``'en_US'``, or ``'all'`` if unspecified
|
||||
====================== ============== ========================================================================
|
||||
|
||||
|
||||
If the ``engine_type`` is ``online_dictionnary```, in addition to the ``online`` arguments:
|
||||
.. table:: If the ``engine_type`` is ``online_dictionary``, in addition to the
|
||||
``online`` arguments:
|
||||
:width: 100%
|
||||
|
||||
====================== ============ ========================================================================
|
||||
argument type default-value, information
|
||||
====================== ============ ========================================================================
|
||||
from_lang str specific language code like ``'en_US'``
|
||||
to_lang str specific language code like ``'en_US'``
|
||||
query str the text query without the languages
|
||||
====================== ============ ========================================================================
|
||||
====================== ============== ========================================================================
|
||||
argument type default-value, information
|
||||
====================== ============== ========================================================================
|
||||
from_lang str specific language code like ``'en_US'``
|
||||
to_lang str specific language code like ``'en_US'``
|
||||
query str the text query without the languages
|
||||
====================== ============== ========================================================================
|
||||
|
||||
If the ``engine_type`` is ``online_currency```, in addition to the ``online`` arguments:
|
||||
.. table:: If the ``engine_type`` is ``online_currency``, in addition to the
|
||||
``online`` arguments:
|
||||
:width: 100%
|
||||
|
||||
====================== ============ ========================================================================
|
||||
argument type default-value, information
|
||||
====================== ============ ========================================================================
|
||||
amount float the amount to convert
|
||||
from str ISO 4217 code
|
||||
to str ISO 4217 code
|
||||
from_name str currency name
|
||||
to_name str currency name
|
||||
====================== ============ ========================================================================
|
||||
====================== ============== ========================================================================
|
||||
argument type default-value, information
|
||||
====================== ============== ========================================================================
|
||||
amount float the amount to convert
|
||||
from str ISO 4217 code
|
||||
to str ISO 4217 code
|
||||
from_name str currency name
|
||||
to_name str currency name
|
||||
====================== ============== ========================================================================
|
||||
|
||||
|
||||
parsed arguments
|
||||
----------------
|
||||
Specify Request
|
||||
---------------
|
||||
|
||||
The function ``def request(query, params):`` always returns the ``params``
|
||||
variable. Inside searx, the following paramters can be used to specify a search
|
||||
request:
|
||||
The function :py:func:`def request(query, params):
|
||||
<searx.engines.demo_online.request>` always returns the ``params`` variable. The
|
||||
following parameters can be used to specify a search request:
|
||||
|
||||
=================== =========== ==========================================================================
|
||||
argument type information
|
||||
=================== =========== ==========================================================================
|
||||
url str requested url
|
||||
method str HTTP request method
|
||||
headers set HTTP header information
|
||||
data set HTTP data information
|
||||
cookies set HTTP cookies
|
||||
verify bool Performing SSL-Validity check
|
||||
allow_redirects bool Follow redirects
|
||||
max_redirects int maximum redirects, hard limit
|
||||
soft_max_redirects int maximum redirects, soft limit. Record an error but don't stop the engine
|
||||
raise_for_httperror bool True by default: raise an exception if the HTTP code of response is >= 300
|
||||
=================== =========== ==========================================================================
|
||||
.. table::
|
||||
:width: 100%
|
||||
|
||||
|
||||
example code
|
||||
------------
|
||||
|
||||
.. code:: python
|
||||
|
||||
# search-url
|
||||
base_url = 'https://example.com/'
|
||||
search_string = 'search?{query}&page={page}'
|
||||
|
||||
# do search-request
|
||||
def request(query, params):
|
||||
search_path = search_string.format(
|
||||
query=urlencode({'q': query}),
|
||||
page=params['pageno'])
|
||||
|
||||
params['url'] = base_url + search_path
|
||||
|
||||
return params
|
||||
=================== =========== ==========================================================================
|
||||
argument type information
|
||||
=================== =========== ==========================================================================
|
||||
url str requested url
|
||||
method str HTTP request method
|
||||
headers set HTTP header information
|
||||
data set HTTP data information
|
||||
cookies set HTTP cookies
|
||||
verify bool Performing SSL-Validity check
|
||||
allow_redirects bool Follow redirects
|
||||
max_redirects int maximum redirects, hard limit
|
||||
soft_max_redirects int maximum redirects, soft limit. Record an error but don't stop the engine
|
||||
raise_for_httperror bool True by default: raise an exception if the HTTP code of response is >= 300
|
||||
=================== =========== ==========================================================================
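As a sketch of how these parameters might be filled in, the following ``request`` function sends the query as a POST form instead of a GET URL. The endpoint in ``base_url``, the form field names and the ``Accept`` header are assumptions made for illustration and do not describe any real engine.

```python
base_url = 'https://example.com/search'  # hypothetical endpoint


def request(query, params):
    """Fill in the request parameters and return them."""
    params['url'] = base_url
    params['method'] = 'POST'
    # Form fields are made up for this example.
    params['data'] = {'q': query, 'page': params['pageno']}
    params['headers']['Accept'] = 'application/json'
    # Do not follow redirects; raise on HTTP error codes.
    params['allow_redirects'] = False
    params['raise_for_httperror'] = True
    return params


# A subset of the defaults listed for the ``online`` engine type.
params = {'url': '', 'method': 'GET', 'headers': {}, 'data': {},
          'cookies': {}, 'verify': True, 'pageno': 2}
params = request('searx', params)
```

The function only mutates and returns ``params``; sending the HTTP request is left to the caller.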
|
||||
|
||||
|
||||
.. _engine results:
|
||||
.. _engine media types:
|
||||
|
||||
returned results
|
||||
================
|
||||
Media Types
|
||||
===========
|
||||
|
||||
Searx is able to return results of different media-types. Currently the
|
||||
following media-types are supported:
|
||||
Each result item of an engine can be of different media-types. Currently the
|
||||
following media-types are supported. To use a media-type other than ``default``,
|
||||
the parameter ``template`` must be set to the desired type.
|
||||
|
||||
- default_
|
||||
- images_
|
||||
- videos_
|
||||
- torrent_
|
||||
- map_
|
||||
.. table:: Parameter of the **default** media type:
|
||||
:width: 100%
|
||||
|
||||
To set another media-type as default, the parameter ``template`` must be set to
|
||||
the desired type.
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
========================= =====================================================
|
||||
url string, url of the result
|
||||
title string, title of the result
|
||||
content string, general result-text
|
||||
publishedDate :py:class:`datetime.datetime`, time of publish
|
||||
========================= =====================================================
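A result item of the default media type can be pictured as a plain dictionary with the keys from the table. The helper ``make_default_result`` below is hypothetical, and treating ``publishedDate`` as optional is an assumption of this sketch.

```python
from datetime import datetime


def make_default_result(url, title, content, published=None):
    """Build one result item of the default media type (hypothetical helper)."""
    result = {'url': url, 'title': title, 'content': content}
    if published is not None:
        # publishedDate is a datetime.datetime, as stated in the table.
        result['publishedDate'] = published
    return result


results = [
    make_default_result(
        'https://example.org/article',
        'Example article',
        'Short text that describes the result.',
        datetime(2021, 1, 1),
    ),
    make_default_result('https://example.org/other', 'Other', 'No date known.'),
]
```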
|
||||
|
||||
default
|
||||
-------
|
||||
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
========================= =====================================================
|
||||
url string, url of the result
|
||||
title string, title of the result
|
||||
content string, general result-text
|
||||
publishedDate :py:class:`datetime.datetime`, time of publish
|
||||
========================= =====================================================
|
||||
.. table:: Parameter of the **images** media type:
|
||||
:width: 100%
|
||||
|
||||
images
|
||||
------
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
------------------------- -----------------------------------------------------
|
||||
template is set to ``images.html``
|
||||
========================= =====================================================
|
||||
url string, url to the result site
|
||||
title string, title of the result *(partly implemented)*
|
||||
content *(partly implemented)*
|
||||
publishedDate :py:class:`datetime.datetime`,
|
||||
time of publish *(partly implemented)*
|
||||
img\_src string, url to the result image
|
||||
thumbnail\_src string, url to a small-preview image
|
||||
========================= =====================================================
|
||||
|
||||
To use this template, the parameter:
|
||||
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
========================= =====================================================
|
||||
template is set to ``images.html``
|
||||
url string, url to the result site
|
||||
title string, title of the result *(partly implemented)*
|
||||
content *(partly implemented)*
|
||||
publishedDate :py:class:`datetime.datetime`,
|
||||
time of publish *(partly implemented)*
|
||||
img\_src string, url to the result image
|
||||
thumbnail\_src string, url to a small-preview image
|
||||
========================= =====================================================
|
||||
.. table:: Parameter of the **videos** media type:
|
||||
:width: 100%
|
||||
|
||||
videos
|
||||
------
|
||||
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
========================= =====================================================
|
||||
template is set to ``videos.html``
|
||||
url string, url of the result
|
||||
title string, title of the result
|
||||
content *(not implemented yet)*
|
||||
publishedDate :py:class:`datetime.datetime`, time of publish
|
||||
thumbnail string, url to a small-preview image
|
||||
========================= =====================================================
|
||||
|
||||
torrent
|
||||
-------
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
------------------------- -----------------------------------------------------
|
||||
template is set to ``videos.html``
|
||||
========================= =====================================================
|
||||
url string, url of the result
|
||||
title string, title of the result
|
||||
content *(not implemented yet)*
|
||||
publishedDate :py:class:`datetime.datetime`, time of publish
|
||||
thumbnail string, url to a small-preview image
|
||||
========================= =====================================================
|
||||
|
||||
.. _magnetlink: https://en.wikipedia.org/wiki/Magnet_URI_scheme
|
||||
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
========================= =====================================================
|
||||
template is set to ``torrent.html``
|
||||
url string, url of the result
|
||||
title string, title of the result
|
||||
content string, general result-text
|
||||
publishedDate :py:class:`datetime.datetime`,
|
||||
time of publish *(not implemented yet)*
|
||||
seed int, number of seeders
|
||||
leech int, number of leechers
|
||||
filesize int, size of file in bytes
|
||||
files int, number of files
|
||||
magnetlink string, magnetlink_ of the result
|
||||
torrentfile string, torrentfile of the result
|
||||
========================= =====================================================
|
||||
.. table:: Parameter of the **torrent** media type:
|
||||
:width: 100%
|
||||
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
------------------------- -----------------------------------------------------
|
||||
template is set to ``torrent.html``
|
||||
========================= =====================================================
|
||||
url string, url of the result
|
||||
title string, title of the result
|
||||
content string, general result-text
|
||||
publishedDate :py:class:`datetime.datetime`,
|
||||
time of publish *(not implemented yet)*
|
||||
seed int, number of seeders
|
||||
leech int, number of leechers
|
||||
filesize int, size of file in bytes
|
||||
files int, number of files
|
||||
magnetlink string, magnetlink_ of the result
|
||||
torrentfile string, torrentfile of the result
|
||||
========================= =====================================================
|
||||
|
||||
map
|
||||
---
|
||||
.. table:: Parameter of the **map** media type:
|
||||
:width: 100%
|
||||
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
========================= =====================================================
|
||||
url string, url of the result
|
||||
title string, title of the result
|
||||
content string, general result-text
|
||||
publishedDate :py:class:`datetime.datetime`, time of publish
|
||||
latitude latitude of result (in decimal format)
|
||||
longitude longitude of result (in decimal format)
|
||||
boundingbox boundingbox of result (array of 4 values
|
||||
``[lat-min, lat-max, lon-min, lon-max]``)
|
||||
geojson geojson of result (https://geojson.org/)
|
||||
osm.type type of osm-object (if OSM-Result)
|
||||
osm.id id of osm-object (if OSM-Result)
|
||||
address.name name of object
|
||||
address.road street name of object
|
||||
address.house_number house number of object
|
||||
address.locality city, place of object
|
||||
address.postcode postcode of object
|
||||
address.country country of object
|
||||
========================= =====================================================
|
||||
========================= =====================================================
|
||||
result-parameter information
|
||||
------------------------- -----------------------------------------------------
|
||||
template is set to ``map.html``
|
||||
========================= =====================================================
|
||||
url string, url of the result
|
||||
title string, title of the result
|
||||
content string, general result-text
|
||||
publishedDate :py:class:`datetime.datetime`, time of publish
|
||||
latitude latitude of result (in decimal format)
|
||||
longitude longitude of result (in decimal format)
|
||||
boundingbox boundingbox of result (array of 4 values
|
||||
``[lat-min, lat-max, lon-min, lon-max]``)
|
||||
geojson geojson of result (https://geojson.org/)
|
||||
osm.type type of osm-object (if OSM-Result)
|
||||
osm.id id of osm-object (if OSM-Result)
|
||||
address.name name of object
|
||||
address.road street name of object
|
||||
address.house_number house number of object
|
||||
address.locality city, place of object
|
||||
address.postcode postcode of object
|
||||
address.country country of object
|
||||
========================= =====================================================
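The order of the ``boundingbox`` values is easy to get wrong, so here is a small sketch that derives it from a list of coordinates. The helper ``bounding_box``, the sample points and the result values are made up for illustration; the ``osm.*`` and ``address.*`` parameters are left out because their layout is not spelled out in the table.

```python
def bounding_box(points):
    """Return [lat-min, lat-max, lon-min, lon-max] for (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return [min(lats), max(lats), min(lons), max(lons)]


# Hypothetical outline of a place, as (latitude, longitude) pairs.
points = [(52.5, 13.3), (52.6, 13.5), (52.4, 13.4)]

result = {
    'template': 'map.html',
    'url': 'https://example.org/place',
    'title': 'Example place',
    'content': 'A made-up place used for illustration.',
    'latitude': 52.5,
    'longitude': 13.4,
    'boundingbox': bounding_box(points),
}
```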
|
||||
|
|