Ponysearch/docs/admin/filtron.rst

192 lines
5.9 KiB
ReStructuredText
Raw Normal View History

.. _searx filtron:
==========================
2016-10-30 01:01:22 +02:00
How to protect an instance
==========================
.. sidebar:: further reading
- :ref:`filtron.sh`
- :ref:`nginx searx site`
.. contents:: Contents
:depth: 2
:local:
:backlinks: entry
.. _filtron: https://github.com/asciimoo/filtron
Searx depends on external search services. To avoid the abuse of these services
it is advised to limit the number of requests processed by searx.
2016-10-30 01:01:22 +02:00
An application firewall, filtron_ solves exactly this problem. Filtron is just
a middleware between your web server (nginx, apache, ...) and searx, we describe
such infratructures in chapter: :ref:`architecture`.
filtron & go
============
.. _Go: https://golang.org/
.. _filtron README: https://github.com/asciimoo/filtron/blob/master/README.md
Filtron needs Go_ installed. If Go_ is preinstalled, filtron_ is simply
installed by ``go get`` package management (see `filtron README`_). If you use
filtron as middleware, a more isolated setup is recommended. To simplify such
an installation and the maintenance of, use our script :ref:`filtron.sh`.
2016-10-30 01:01:22 +02:00
.. _Sample configuration of filtron:
2016-10-30 01:01:22 +02:00
Sample configuration of filtron
===============================
2016-10-30 01:01:22 +02:00
.. sidebar:: Tooling box
- :origin:`/etc/filtron/rules.json <utils/templates/etc/filtron/rules.json>`
An example configuration can be find below. This configuration limits the access
of:
2016-10-30 01:01:22 +02:00
- scripts or applications (roboagent limit)
- webcrawlers (botlimit)
- IPs which send too many requests (IP limit)
- too many json, csv, etc. requests (rss/json limit)
- the same UserAgent of if too many requests (useragent limit)
2016-10-30 01:01:22 +02:00
.. code:: json
[
{
"name": "search request",
"filters": [
"Param:q",
"Path=^(/|/search)$"
],
"interval": "<time-interval-in-sec (int)>"
"limit": "<max-request-number-in-interval (int)>",
"subrules": [
{
"name": "missing Accept-Language",
"filters": ["!Header:Accept-Language"],
"limit": "<max-request-number-in-interval (int)>",
"stop": true,
"actions": [
{"name":"log"},
{"name": "block",
"params": {"message": "Rate limit exceeded"}}
]
},
{
"name": "suspiciously Connection=close header",
"filters": ["Header:Connection=close"],
"limit": "<max-request-number-in-interval (int)>",
"stop": true,
"actions": [
{"name":"log"},
{"name": "block",
"params": {"message": "Rate limit exceeded"}}
]
},
{
"name": "IP limit",
"interval": "<time-interval-in-sec (int)>"
"limit": "<max-request-number-in-interval (int)>",
"stop": true,
"aggregations": [
"Header:X-Forwarded-For"
],
"actions": [
{ "name": "log"},
{ "name": "block",
"params": {
"message": "Rate limit exceeded"
}
}
]
},
{
"name": "rss/json limit",
"filters": [
"Param:format=(csv|json|rss)"
],
"interval": "<time-interval-in-sec (int)>"
"limit": "<max-request-number-in-interval (int)>",
"stop": true,
"actions": [
{ "name": "log"},
{ "name": "block",
"params": {
"message": "Rate limit exceeded"
}
}
]
},
{
"name": "useragent limit",
"interval": "<time-interval-in-sec (int)>"
"limit": "<max-request-number-in-interval (int)>",
"aggregations": [
"Header:User-Agent"
],
"actions": [
{ "name": "log"},
{ "name": "block",
"params": {
"message": "Rate limit exceeded"
}
}
]
}
]
}
]
2016-10-30 01:01:22 +02:00
.. _filtron route request:
2016-10-30 01:01:22 +02:00
Route request through filtron
=============================
2016-10-30 01:01:22 +02:00
.. sidebar:: further reading
- :ref:`filtron.sh overview`
- :ref:`installation nginx`
- :ref:`installation apache`
2016-10-30 01:01:22 +02:00
Filtron can be started using the following command:
.. code:: sh
2016-10-30 01:01:22 +02:00
$ filtron -rules rules.json
2016-10-30 01:01:22 +02:00
It listens on ``127.0.0.1:4004`` and forwards filtered requests to
``127.0.0.1:8888`` by default.
2016-10-30 01:01:22 +02:00
Use it along with ``nginx`` with the following example configuration.
.. code:: nginx
2016-10-30 01:01:22 +02:00
# https://example.org/searx
location /searx {
proxy_pass http://127.0.0.1:4004/;
proxy_set_header Host $http_host;
proxy_set_header Connection $http_connection;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Scheme $scheme;
proxy_set_header X-Script-Name /searx;
}
2016-10-30 01:01:22 +02:00
location /searx/static {
/usr/local/searx/searx-src/searx/static;
}
Requests are coming from port 4004 going through filtron and then forwarded to
port 8888 where a searx is being run. For a complete setup see: :ref:`nginx
searx site`.