Nginx error log json - Fixing errors and finding optimal solutions to problems

I’m trying to generate a JSON log from nginx.

I’m aware of solutions like this one but some of the fields I want to log include user generated input (like HTTP headers) which need to be escaped properly.

I’m aware of the nginx changelog entries from Oct 2011 and May 2008 that say:

*) Change: now the 0x7F-0xFF characters are escaped as \xXX in an
   access_log.
*) Change: now the 0x00-0x1F, '"' and '\' characters are escaped as \xXX
   in an access_log.

but this still doesn’t help since \xXX is invalid in a JSON string.

I’ve also looked at the HttpSetMiscModule module which has a set_quote_json_str directive, but this just seems to add \x22 around the strings which doesn’t help.

Any idea for other solutions to log in JSON format from nginx?

asked Jul 31, 2014 at 1:54

Jules Olléon

Finally, it looks like we have a good way to do this with vanilla nginx without any modules. Just define:

log_format json_combined escape=json
  '{'
    '"time_local":"$time_local",'
    '"remote_addr":"$remote_addr",'
    '"remote_user":"$remote_user",'
    '"request":"$request",'
    '"status": "$status",'
    '"body_bytes_sent":"$body_bytes_sent",'
    '"request_time":"$request_time",'
    '"http_referrer":"$http_referer",'
    '"http_user_agent":"$http_user_agent"'
  '}';

Note that escape=json was added in nginx 1.11.8.
http://nginx.org/en/docs/http/ngx_http_log_module.html#log_format
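To put the format to use, it has to be referenced from an access_log directive. The following is only a sketch: the log path and the extra $http_x_forwarded_for field are assumptions chosen to illustrate a user-controlled header of the kind the question mentions.

```nginx
http {
    # escape=json (nginx 1.11.8+) escapes quotes, backslashes and control
    # characters in variable values so each log line stays valid JSON.
    log_format json_combined escape=json
      '{'
        '"time_local":"$time_local",'
        '"remote_addr":"$remote_addr",'
        '"request":"$request",'
        '"status":"$status",'
        '"http_x_forwarded_for":"$http_x_forwarded_for",'
        '"http_user_agent":"$http_user_agent"'
      '}';

    # Hypothetical path; adjust to your own layout.
    access_log /var/log/nginx/access.json.log json_combined;
}
```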

answered Mar 2, 2017 at 19:53

pva

Source

The ngx_http_log_module module writes request logs
in the specified format.

Requests are logged in the context of the location where processing ends.
It may be different from the original location if an internal redirect
happens during request processing.

Example configuration

log_format compression '$remote_addr - $remote_user [$time_local] '
                       '"$request" $status $bytes_sent '
                       '"$http_referer" "$http_user_agent" "$gzip_ratio"';

access_log /spool/logs/nginx-access.log compression buffer=32k;

Directives

Syntax:	`access_log path [format [buffer=size] [gzip[=level]] [flush=time] [if=condition]];` `access_log off;`
Default:	access_log logs/access.log combined;
Context:	`http`, `server`, `location`, `if in location`, `limit_except`

Sets the path, format, and settings for buffered log writes.
Several logs can be used on the same configuration level.
Logging to syslog
is configured by specifying the
“syslog:” prefix in the first parameter.
The special value off cancels all
access_log directives on the current level.
If the format is not specified, the predefined
“combined” format is used.

If a buffer size is set with the buffer parameter or
the gzip parameter (1.3.10, 1.2.7) is specified, writes to the log
are buffered.

The buffer size must not exceed the size of an atomic write to a disk file.
For FreeBSD this size is unlimited.

When buffering is enabled, the data is written to the file:

if the next log line does not fit into the buffer;
if the buffered data is older than the interval set by
the flush parameter (1.3.10, 1.2.7);
when the log file is re-opened or
a worker process shuts down.

If the gzip parameter is set, the buffer is compressed before
being written to the file.
The compression level can be set between 1 (faster, less compression)
and 9 (slower, better compression).
By default, a buffer of 64K bytes and compression level 1 are used.
The data is compressed in atomic blocks, so the log file can be
decompressed or read with the “zcat” utility at any time.

Example:

access_log /path/to/log.gz combined gzip flush=5m;

For gzip compression of logs to work, nginx must be built with the zlib library.

The file path can contain variables (0.7.6+),
but such logs have some limitations:

the user
whose credentials are used by worker processes must
have permission to create files in the directory with such logs;
buffered writes do not work;
the file is opened for each log write and closed immediately afterwards.
Keep in mind, however, that since the descriptors of frequently used files
can be stored in a cache,
writing may continue to the old file during log rotation for the time set by the
valid parameter of the open_log_file_cache directive.
during each log write the existence of the request’s
root directory
is checked, and if this directory does not exist the log is not created.
For this reason root
and access_log should be specified on the same configuration level:
```
server {
    root       /spool/vhost/data/$host;
    access_log /spool/vhost/logs/$host;
    ...
```

The if parameter (1.7.0) enables conditional logging.
A request is not logged if the condition evaluates to
“0” or an empty string.
In the following example, requests with response codes 2xx and 3xx
are not logged:

map $status $loggable {
    ~^[23]  0;
    default 1;
}

access_log /path/to/access.log combined if=$loggable;

Syntax:	`log_format name [escape=default|json|none] string ...;`
Default:	log_format combined "...";
Context:	`http`

Specifies the log format.

The escape parameter (1.11.8) selects
json or default character escaping
in variables; default escaping is used if nothing is specified.
The none value (1.13.10) disables
escaping.

With default escaping,
the characters “"”, “\”,
and other characters with values less than 32 (0.7.0) or above 126 (1.1.6)
are escaped as “\xXX”.
If the variable value is not found,
a hyphen (“-”) is logged as the value.

With json escaping,
all characters not allowed
in JSON strings are escaped:
the characters “"” and
“\” are escaped as
“\"” and “\\”,
and characters with values less than 32 are escaped as
“\n”,
“\r”,
“\t”,
“\b”,
“\f”, or
“\u00XX”.
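The difference between the three escaping modes can be illustrated with a small sketch. The format names are hypothetical, and the comments assume a client that sends the header `User-Agent: my "agent"`; they follow from the escaping rules described above rather than from a captured log.

```nginx
# Three formats that log the same request header with different escaping.
# Assumed request header:  User-Agent: my "agent"

# default: the double quote is written as \x22
log_format ua_default escape=default '$http_user_agent';

# json: the double quote is written as \", so the quoted value stays valid JSON
log_format ua_json    escape=json    '"$http_user_agent"';

# none (1.13.10+): the value is written exactly as received
log_format ua_none    escape=none    '$http_user_agent';
```

Only the json mode produces output that can be embedded safely between the quotes of a JSON string.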

In addition to common variables, the format can contain variables
that exist only at the time of a log write:

$bytes_sent: the number of bytes sent to a client
$connection: connection serial number
$connection_requests: the current number of requests made through a connection (1.1.18)
$msec: time in seconds with a milliseconds resolution at the time of the log write
$pipe: “p” if the request was pipelined, “.” otherwise
$request_length: request length (including request line, header, and request body)
$request_time: request processing time in seconds with a milliseconds resolution;
the time elapsed between the first bytes being read from the client and
the log write after the last bytes were sent to the client
$status: response status
$time_iso8601: local time in the ISO 8601 standard format
$time_local: local time in the Common Log Format
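As a rough illustration, a format that combines several of these log-time variables might look like the sketch below. The format name, the key=value layout, and the log path are assumptions made for the example.

```nginx
# Hypothetical "timing" format built from the log-time variables listed above.
log_format timing '$time_iso8601 conn=$connection reqs=$connection_requests '
                  'pipe=$pipe len=$request_length sent=$bytes_sent '
                  'status=$status rt=$request_time msec=$msec';

# Hypothetical path; log_format itself must be declared in the http block.
access_log /var/log/nginx/timing.log timing;
```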

In modern nginx versions, the variables
$status
(1.3.2, 1.2.2),
$bytes_sent
(1.3.8, 1.2.5),
$connection
(1.3.8, 1.2.5),
$connection_requests
(1.3.8, 1.2.5),
$msec
(1.3.9, 1.2.6),
$request_time
(1.3.9, 1.2.6),
$pipe
(1.3.12, 1.2.7),
$request_length
(1.3.12, 1.2.7),
$time_iso8601
(1.3.12, 1.2.7),
and
$time_local
(1.3.12, 1.2.7)
are also available as common variables.

Header lines sent to a client have the prefix
“sent_http_”, for example,
$sent_http_content_range.
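For instance, response headers could be logged next to the request line along these lines. This is a sketch only: the format name is hypothetical, and $sent_http_content_type is used as a second example of the same prefix.

```nginx
# Logs two response headers via the sent_http_ prefix.
log_format resp_headers '$remote_addr "$request" $status '
                        'type="$sent_http_content_type" '
                        'range="$sent_http_content_range"';
```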

The configuration always includes the predefined
“combined” format:

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

Syntax:	`open_log_file_cache max=N [inactive=time] [min_uses=N] [valid=time];` `open_log_file_cache off;`
Default:	open_log_file_cache off;
Context:	`http`, `server`, `location`

Defines a cache that stores the file descriptors of frequently used
logs whose names contain variables.
The directive has the following parameters:

max: sets the maximum number of descriptors in a cache;
if the cache becomes full, the least recently used (LRU)
descriptors are closed
inactive: sets the time after which a cached descriptor is closed
if it has not been accessed during this time;
by default, 10 seconds
min_uses: sets the minimum number of file uses during
the time defined by the inactive parameter
after which the descriptor stays open in the cache;
by default, 1
valid: sets the time after which it should be checked that the file still
exists with the same name;
by default, 60 seconds
off: disables caching

Usage example:

open_log_file_cache max=1000 inactive=20s valid=1m min_uses=2;

Source

In this tutorial, you will learn everything you need to know about logging in
NGINX and how it can help you troubleshoot and quickly resolve any problem you
may encounter on your web server. We will discuss where the logs are stored and
how to access them, how to customize their format, and how to centralize them in
one place with Syslog or a log management service.

Here’s an outline of what you will learn by following through with this tutorial:

Where NGINX logs are stored and how to access them.
How to customize the NGINX log format and storage location to fit your needs.
How to utilize a structured format (such as JSON) for your NGINX logs.
How to centralize NGINX logs through Syslog or a managed cloud-based service.

Prerequisites

To follow through with this tutorial, you need the following:

A Linux server that includes a non-root user with sudo privileges. We tested
the commands shown in this guide on an Ubuntu 20.04 server.
The
NGINX web server installed
and enabled on your server.


Step 1 — Locating the NGINX log files

NGINX writes logs of all its events in two different log files:

Access log: this file contains information about incoming requests and
user visits.
Error log: this file contains information about errors encountered while
processing requests, or other diagnostic messages about the web server.

The location of both log files is dependent on the host operating system of the
NGINX web server and the mode of installation. On most Linux distributions, both
files will be found in the /var/log/nginx/ directory as access.log and
error.log, respectively.

A typical access log entry might look like the one shown below. It describes an
HTTP GET request to the server for a favicon.ico file.

Output

217.138.222.101 - - [11/Feb/2022:13:22:11 +0000] "GET /favicon.ico HTTP/1.1" 404 3650 "http://135.181.110.245/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Safari/537.36" "-"

Similarly, an error log entry might look like the one below, which was generated
due to the inability of the server to locate the favicon.ico file that was
requested above.

Output

2022/02/11 13:12:24 [error] 37839#37839: *7 open() "/usr/share/nginx/html/favicon.ico" failed (2: No such file or directory), client: 113.31.102.176, server: _, request: "GET /favicon.ico HTTP/1.1", host: "192.168.110.245:80"

In the next section, you’ll see how to view both NGINX log files from the
command line.

Step 2 — Viewing the NGINX log files

Examining the NGINX logs can be done in a variety of ways. One of the most
common methods involves using the tail command to view log entries in
real-time:

sudo tail -f /var/log/nginx/access.log

You will observe the following output:

Output

107.189.10.196 - - [14/Feb/2022:03:48:55 +0000] "POST /HNAP1/ HTTP/1.1" 404 134 "-" "Mozila/5.0"
35.162.122.225 - - [14/Feb/2022:04:11:57 +0000] "GET /.env HTTP/1.1" 404 162 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0"
45.61.172.7 - - [14/Feb/2022:04:16:54 +0000] "GET /.env HTTP/1.1" 404 197 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"
45.61.172.7 - - [14/Feb/2022:04:16:55 +0000] "POST / HTTP/1.1" 405 568 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"
45.137.21.134 - - [14/Feb/2022:04:18:57 +0000] "GET /dispatch.asp HTTP/1.1" 404 134 "-" "Mozilla/5.0 (iPad; CPU OS 7_1_2 like Mac OS X; en-US) AppleWebKit/531.5.2 (KHTML, like Gecko) Version/4.0.5 Mobile/8B116 Safari/6531.5.2"
23.95.100.141 - - [14/Feb/2022:04:42:23 +0000] "HEAD / HTTP/1.0" 200 0 "-" "-"
217.138.222.101 - - [14/Feb/2022:07:38:40 +0000] "GET /icons/ubuntu-logo.png HTTP/1.1" 404 197 "http://168.119.119.25/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Safari/537.36"
217.138.222.101 - - [14/Feb/2022:07:38:42 +0000] "GET /favicon.ico HTTP/1.1" 404 197 "http://168.119.119.25/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Safari/537.36"
217.138.222.101 - - [14/Feb/2022:07:44:02 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Safari/537.36"
217.138.222.101 - - [14/Feb/2022:07:44:02 +0000] "GET /icons/ubuntu-logo.png HTTP/1.1" 404 197 "http://168.119.119.25/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Safari/537.36"

The tail command prints the last 10 lines from the selected file. The -f
option causes it to continue displaying subsequent lines that are added to the
file in real-time.

To examine the entire contents of an NGINX log file, you can use the cat
command or open it in your text editor:

sudo cat /var/log/nginx/error.log

If you want to filter the lines that contain a specific term, you can use the
grep command as shown below:

sudo grep "GET /favicon.ico" /var/log/nginx/access.log

The command above will print all the lines that contain GET /favicon.ico so we
can see how many requests were made for that resource.

Step 3 — Configuring NGINX access logs

The NGINX access log stores data about incoming client requests to the server
which is beneficial when deciphering what users are doing in the application,
and what resources are being requested. In this section, you will learn how to
configure what data is stored in the access log.

One thing to keep in mind while following through with the instructions below is
that you’ll need to restart the nginx service after modifying the config file
so that the changes can take effect.

sudo systemctl restart nginx

Enabling the access log

The NGINX access log should be enabled by default. However, if this is not the
case, you can enable it manually in the NGINX configuration file
(/etc/nginx/nginx.conf) using the access_log directive within the http
block.

Output

http {
  access_log /var/log/nginx/access.log;
}

This directive is also applicable in the server and location configuration
blocks for a specific website:

Output

server {
   access_log /var/log/nginx/app1.access.log;

  location /app2 {
    access_log /var/log/nginx/app2.access.log;
  }
}

Disabling the access log

In cases where you’d like to disable the NGINX access log, you can use the
special off value:

access_log off;

You can also disable the access log on a virtual server or specific URIs by
editing its server or location block configuration in the
/etc/nginx/sites-available/ directory:

Output

server {
  listen 80;

  access_log off;

  location ~* \.(woff|jpg|jpeg|png|gif|ico|css|js)$ {
    access_log off;
  }
}

Logging to multiple access log files

If you’d like to duplicate the access log entries in separate files, you can do
so by repeating the access_log directive in the main config file or in a
server block as shown below:

Output

access_log /var/log/nginx/access.log;
access_log /var/log/nginx/combined.log;

Don’t forget to restart the nginx service afterward:

sudo systemctl restart nginx

Explanation of the default access log format

The access log entries produced using the default configuration will look like
this:

Output

127.0.0.1 - alice [07/May/2021:10:44:53 +0200] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4531.93 Safari/537.36"

Here’s a breakdown of the log message above:

127.0.0.1: the IP address of the client that made the request.
-: a literal hyphen that the combined format writes in the position where the
Common Log Format has the remote log name.
alice: remote username (the name supplied with HTTP Basic authentication; - is
used when there is none).
[07/May/2021:10:44:53 +0200]: date and time of the request.
"GET / HTTP/1.1": request method, path and protocol.
200: the HTTP response code.
396: the size of the response body in bytes.
"-": the value of the Referer request header (- is used when it is not
available).
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4531.93 Safari/537.36":
detailed user agent information.

Step 4 — Creating a custom log format

Customizing the format of the entries in the access log can be done using the
log_format directive, which is only valid in the http block. Here’s an example
of what it could look like:

Output

log_format custom '$remote_addr - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent"';

This yields a log entry in the following format:

Output

217.138.222.109 - - [14/Feb/2022:10:38:35 +0000] "GET /favicon.ico HTTP/1.1" 404 197 "http://192.168.100.1/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Safari/537.36"

The syntax for configuring an access log format is shown below. First, you need
to specify a nickname for the format that will be used as its identifier, and
then the log format string that represents the details and formatting for each
log message.

Output

log_format <nickname> '<formatting_variables>';

Here’s an explanation of each variable used in the custom log format shown
above:

$remote_addr: the IP address of the client.
$remote_user: the username supplied with HTTP Basic authentication.
$time_local: the server’s date and time.
$request: actual request details like path, method, and protocol.
$status: the response code.
$body_bytes_sent: the size of the response body in bytes.
$http_referer: the value of the Referer request header.
$http_user_agent: detailed user agent information.

You may also use the following variables in your custom log format
(see here for the complete list):

$upstream_connect_time: the time spent establishing a connection with an
upstream server.
$upstream_header_time: the time between establishing a connection and
receiving the first byte of the response header from the upstream server.
$upstream_response_time: the time between establishing a connection and
receiving the last byte of the response body from the upstream server.
$request_time: the total time spent processing a request.
$gzip_ratio: ratio of gzip compression (if gzip is enabled).
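As a sketch of how these timing variables might be combined, the format below appends them to the familiar combined layout. The format name, the short key prefixes (rt, uct, uht, urt), and the log path are assumptions made for readability, not required names.

```nginx
# Hypothetical format that adds request and upstream timings to each entry.
log_format upstream_time '$remote_addr - $remote_user [$time_local] '
                         '"$request" $status $body_bytes_sent '
                         '"$http_referer" "$http_user_agent" '
                         'rt=$request_time uct=$upstream_connect_time '
                         'uht=$upstream_header_time urt=$upstream_response_time';

access_log /var/log/nginx/upstream_time.log upstream_time;
```

For requests that are not proxied to an upstream server, the upstream variables have no value and are logged as a hyphen under default escaping.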

After you create a custom log format, you can apply it to a log file by
providing a second parameter to the access_log directive:

Output

access_log /var/log/nginx/access.log custom;

You can use this feature to log different information into separate log files.
Create the log formats first:

Output

log_format custom '$remote_addr - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer"';
log_format agent "$http_user_agent";

Then, apply them as shown below:

Output

access_log /var/log/nginx/access.log custom;
access_log /var/log/nginx/agent_access.log agent;

This configuration ensures that user agent information for all incoming requests
is logged into a separate access log file.

Step 5 — Formatting your access logs as JSON

A common way to customize NGINX access logs is to format them as JSON. This is
quite straightforward to achieve by combining the log_format directive with
the escape=json parameter introduced in Nginx 1.11.8 to escape characters that
are not valid in JSON:

Output

log_format custom_json escape=json
  '{'
    '"time_local":"$time_local",'
    '"remote_addr":"$remote_addr",'
    '"remote_user":"$remote_user",'
    '"request":"$request",'
    '"status": "$status",'
    '"body_bytes_sent":"$body_bytes_sent",'
    '"request_time":"$request_time",'
    '"http_referrer":"$http_referer",'
    '"http_user_agent":"$http_user_agent"'
  '}';

After applying the custom_json format to a log file and restarting the nginx
service, you will observe log entries in the following format:

{
  "time_local": "14/Feb/2022:11:25:44 +0000",
  "remote_addr": "217.138.222.109",
  "remote_user": "",
  "request": "GET /icons/ubuntu-logo.png HTTP/1.1",
  "status": "404",
  "body_bytes_sent": "197",
  "request_time": "0.000",
  "http_referrer": "http://192.168.100.1/",
  "http_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Safari/537.36"
}
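In the format above every value is quoted, so a consumer sees "body_bytes_sent": "197" as a string. One possible variant, assuming your log pipeline prefers numeric types, leaves variables that contain plain numbers unquoted. The name custom_json_typed is hypothetical, and $status is deliberately kept quoted here on the assumption that edge cases (such as a zero-padded status for an aborted request) could otherwise break a strict JSON parser.

```nginx
log_format custom_json_typed escape=json
  '{'
    '"time_iso8601":"$time_iso8601",'
    '"remote_addr":"$remote_addr",'
    '"request":"$request",'
    '"status":"$status",'
    '"body_bytes_sent":$body_bytes_sent,'
    '"request_time":$request_time,'
    '"http_referrer":"$http_referer",'
    '"http_user_agent":"$http_user_agent"'
  '}';
```

Variables that may be empty or contain several comma-separated values, such as $upstream_response_time, are better left quoted.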

Step 6 — Configuring NGINX error logs

Whenever NGINX encounters an error, it stores the event data in the error log so
that it can be referred to later by a system administrator. This section will
describe how to enable and customize the error logs as you see fit.

Enabling the error log

The NGINX error log should be enabled by default. However, if this is not the
case, you can enable it manually in the relevant NGINX configuration file
(either at the http, server, or location levels) using the error_log
directive.

Output

error_log /var/log/nginx/error.log;

The error_log directive can take two parameters. The first one is the location
of the log file (as shown above), while the second one is optional and sets the
severity level of the log. Events with a lower severity level than the one set
will not be logged.

Output

error_log /var/log/nginx/error.log info;

These are the possible levels of severity (from lowest to highest) and their
meaning:

debug: messages used for debugging.
info: informational messages.
notice: a notable event occurred.
warn: something unexpected happened.
error: something failed.
crit: critical conditions.
alert: errors that require immediate action.
emerg: the system is unusable.

Disabling the error log

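The error_log directive has no special off value: error_log off; simply writes
errors to a file named off. To effectively discard the error log, redirect it
to /dev/null, optionally raising the level to emerg so that almost nothing is
generated in the first place:

Output

error_log /dev/null emerg;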

Logging errors into multiple files

As is the case with access logs, you can log errors into multiple files, and you
can use different severity levels too:

Output

error_log /var/log/nginx/error.log info;
error_log /var/log/nginx/emerg_error.log emerg;

This configuration will log every event except those at the debug level
to the error.log file, while emergency events are also written to a separate
emerg_error.log file.

Step 7 — Sending NGINX logs to Syslog

Apart from logging to a file, it’s also possible to set up NGINX to transport
its logs to the syslog service especially if you’re already using it for other
system logs. Logging to syslog is done by specifying the syslog: prefix to
either the access_log or error_log directive:

Output

error_log  syslog:server=unix:/var/log/nginx.sock debug;
access_log syslog:server=127.0.0.1:1234,facility=local7,tag=nginx,severity=info;

Log messages are sent to a server which can be specified in terms of a domain
name, IPv4 or IPv6 address or a UNIX-domain socket path.

In the example above, error log messages are sent to a UNIX domain socket at the
debug logging level, while the access log is written to a syslog server with
an IPv4 address and port 1234. The facility= parameter specifies the type of
program that is logging the message, the tag= parameter applies a custom tag
to syslog messages, and the severity= parameter sets the severity level of
the syslog entry for access log messages.
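Because several access_log directives can coexist on one level, a local file can be kept alongside the syslog destination. The sketch below assumes the local syslog daemon listens on the /dev/log socket, which is common on Linux but not guaranteed on every system.

```nginx
# Keep a local file and also ship the same entries to syslog.
access_log /var/log/nginx/access.log combined;
access_log syslog:server=unix:/dev/log,facility=local7,tag=nginx,severity=info combined;

# Errors at warn level and above go to the same (assumed) syslog socket.
error_log syslog:server=unix:/dev/log,tag=nginx warn;
```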

For more information on using Syslog to manage your logs, you can check out our
tutorial on viewing and configuring system logs on
Linux.

Step 8 — Centralizing your NGINX logs

In this section, we’ll describe how you can centralize your NGINX logs in a log
management service through Vector, a
high-performance tool for building observability pipelines. This is a crucial
step when administrating multiple servers so that you can monitor all your logs
in one place (you can also centralize your logs with an Rsyslog
server).

The following instructions assume that you’ve signed up for a free
Logtail account and retrieved your source
token. Go ahead and follow the relevant
installation instructions for Vector
for your operating system. For example, on Ubuntu, you may run the following
commands to install the Vector CLI:

curl -1sLf  'https://repositories.timber.io/public/vector/cfg/setup/bash.deb.sh'  | sudo -E bash
sudo apt install vector

After Vector is installed, confirm that it is up and running through
systemctl:

sudo systemctl status vector

You should observe that it is active and running:

Output

● vector.service - Vector
     Loaded: loaded (/lib/systemd/system/vector.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-02-08 10:52:59 UTC; 48s ago
       Docs: https://vector.dev
    Process: 18586 ExecStartPre=/usr/bin/vector validate (code=exited, status=0/SUCCESS)
   Main PID: 18599 (vector)
      Tasks: 3 (limit: 2275)
     Memory: 6.8M
     CGroup: /system.slice/vector.service
             └─18599 /usr/bin/vector

Otherwise, go ahead and start it with the command below.

sudo systemctl start vector

Afterward, change into a root shell and append your Logtail vector configuration
for NGINX into the /etc/vector/vector.toml file using the command below. Don’t
forget to replace the <your_logtail_source_token> placeholder below with your
source token.

sudo -s
wget -O ->> /etc/vector/vector.toml \
    https://logtail.com/vector-toml/nginx/<your_logtail_source_token>

Then restart the vector service:

sudo systemctl restart vector

You will observe that your NGINX logs will start coming through in Logtail.

Conclusion

In this tutorial, you learned about the different types of logs that the NGINX
web server keeps, where you can find them, and how to understand their formatting.
We also discussed how to create your own custom log formats (including a
structured JSON format), and how to log into multiple files at once. Finally, we
demonstrated the process of sending your logs to Syslog or a log management
service so that you can monitor them all in one place.

Thanks for reading, and happy logging!


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Source


Troubleshooting in Production Without Tuning the Error Log

Editor – This blog is one of several that discuss logging with NGINX and NGINX Plus. Please also see:

Application Tracing with NGINX and NGINX Plus

Sampling Requests with NGINX Conditional Logging

It’s also one of many blogs about use cases for the NGINX JavaScript module. For the complete list, see Use Cases for the NGINX JavaScript Module.

NGINX helps organizations of all sizes to run their mission‑critical websites, applications, and APIs. Regardless of your scale and choice of deployment infrastructure, running in production is not easy. In this article we talk about just one of the hard things about a production deployment – logging. More specifically, we discuss the balancing act of collecting the right amount of detailed logs for troubleshooting without being swamped with unnecessary data.

Logging Basics

NGINX provides two different logging mechanisms: access logging for client requests, and error logging for when things go wrong. These mechanisms are available in both the HTTP and Stream (TCP/UDP) modules, but here we focus on HTTP traffic. (There is also a third logging mechanism which uses the debug severity level, but we won’t discuss it here.)

A typical, default NGINX logging configuration looks like this.

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
 
    access_log /var/log/nginx/access.log main; # Log using the 'main' format
    error_log  /var/log/nginx/error.log  warn; # Log up to 'warn' severity level
    ...
}

The log_format directive describes the contents and structure of the log entries created when the access_log directive is included in the configuration. The example above is an extension of the Common Log Format (CLF) used by many web servers. With the error_log directive, you specify the severity level of the messages to log, but not the content or format of entries, which are fixed. More on that in the next section.

Other noteworthy aspects of NGINX logging include:

Logging directives are automatically inherited by lower‑level configuration contexts. For example, an access_log directive at the http context is applied to all server{} blocks.
A logging directive in a child context overrides inherited directives.
Multiple logging directives may exist in the same context. For example, two access_log directives might be used to create both a standard CLF log file and a second, more detailed log.
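The inheritance rules above can be sketched as follows. The detail format, the file paths, and the listen ports are hypothetical and exist only to make the fragment self-contained.

```nginx
http {
    log_format main   '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent';
    log_format detail '$remote_addr [$time_iso8601] "$request" $status '
                      'rt=$request_time ua="$http_user_agent"';

    # Applies to every server{} block that declares no access_log of its own.
    access_log /var/log/nginx/access.log main;

    server {
        listen 8080;
        # No access_log here, so the http-level directive is inherited.
    }

    server {
        listen 8081;
        # These two directives replace the inherited one for this server:
        # a standard log plus a second, more detailed log.
        access_log /var/log/nginx/app.access.log main;
        access_log /var/log/nginx/app.detail.log detail;
    }
}
```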

The Reality of Logging in Production

In general terms, we want to use the access log to provide analytics and usage statistics, and use the error log for failure detection and troubleshooting. But running a production system is seldom that simple. Here are some common challenges:

Access logs lack sufficient detail for troubleshooting
Error logs disclose good detail at the info severity level but that is too verbose for normal operations
The error log format is fixed and so cannot be customized to include variables of particular interest
Entries in the error log don’t include the context of the request and are difficult to match with the corresponding access log entry

What’s more, changing the NGINX configuration to add or remove logging detail in a production environment may also require going through a change‑control process, and redeploying the configuration. Entirely safe, but somewhat cumbersome when troubleshooting a live issue such as, “why am I seeing a spike in 4xx/5xx errors?”. This is of course magnified when there are multiple NGINX instances handling the same traffic across a cluster.

Using a Second Access Log for Errors

Customizing the format of the access log to enrich the data collected for each request is a common approach for enhanced analytics, but doesn’t scale well for diagnostics or troubleshooting. Asking the main access log to do two jobs is a contrived solution, because we typically want a lot more information for troubleshooting than we do for regular analytics. Adding numerous variables to the main access log can dramatically increase log volumes with data that is only occasionally useful.

Instead we can use a second access log and write to it only when we encounter an error that warrants debugging. The access_log directive supports conditional logging with the if parameter – requests are only logged when the specified variable evaluates to a non‑zero, non‑empty value.

map $status $is_error {
    400     1; # Bad request, including expired client cert
    495     1; # Client cert error
    502     1; # Bad gateway (no upstream server could be selected)
    504     1; # Gateway timeout (couldn't connect to selected upstream)
    default 0;
}

access_log /var/log/nginx/access_debug.log access_debug if=$is_error; # Diagnostic logging
access_log /var/log/nginx/access.log main;

With this configuration, we pass the $status variable through a map block to determine the value of the $is_error variable, which is then evaluated by the access_log directive’s if parameter. When $is_error evaluates to 1 we write a special log entry to the access_debug.log file.

However, this configuration doesn’t detect errors encountered during request processing that are ultimately resolved, and therefore result in status 200 OK. One such example is when NGINX is load balancing between multiple upstream servers. If an error is encountered with the selected server then NGINX passes the request to the next server under the conditions configured by the proxy_next_upstream directive. As long as one of the upstream servers responds successfully, then the client receives a successful response, which gets logged with status 200. However, the user experience is likely to be poor due to the retries, and it may not be immediately obvious that an upstream server is unhealthy. After all, we logged a 200.

When NGINX attempts to proxy to multiple upstream servers, their addresses are all captured in the $upstream_addr variable. The same is true for other $upstream_* variables, for example $upstream_status which captures the response code from each attempted server. So when we see multiple entries in these variables, we know that something bad happened – we probably have a problem with at least one of the upstream servers.

How about we also write to access_debug.log when the request was proxied to multiple upstream servers?

map $upstream_status $multi_upstreams {
    "~,"    1; # Has a comma
    default 0;
} 

map $status $is_error {
    400     1; # Bad request, including expired client cert
    495     1; # Client cert error
    502     1; # Bad gateway (no upstream server could be selected)
    504     1; # Gateway timeout (couldn't connect to selected upstream)
    default $multi_upstreams; # If we tried more than one upstream server
}

access_log /var/log/nginx/access_debug.log access_debug if=$is_error; # Diagnostic logging
access_log /var/log/nginx/access.log main; # Regular logging

Here we use another map block to produce a new variable ($multi_upstreams) whose value depends on the presence of a comma (,) in $upstream_status. A comma means there is more than one status code, and therefore more than one upstream server was encountered. This new variable determines the value of $is_error when $status is not one of the listed error codes.

At this point we have our regular access log and a special access_debug.log file that contains erroneous requests, but we haven’t yet defined the access_debug log format. Let’s now ensure that we have all the data we need in the access_debug.log file to help us diagnose problems.

Getting diagnostic data into access_debug.log is not difficult. NGINX provides over 100 variables related to HTTP processing, and we can define a special log_format directive that captures as many of them as we need. However, there are a few downsides to building out a naïve log format for this purpose.

It is a custom format – you need to train a log parser how to read it
Entries can be very long and hard for humans to read during live troubleshooting
You continuously need to refer to the log format in order to interpret entries
It is not possible to log non‑deterministic values such as “all request headers”
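
To make these downsides concrete, here is a sketch of what such a naive format might look like. The format name access_debug_naive and the choice of fields are assumptions for illustration; the $ssl_* variables additionally assume NGINX was built with the SSL module.

```nginx
# Hypothetical "naive" diagnostic format; must be declared in the http context.
log_format access_debug_naive
    '$time_iso8601 client=$remote_addr:$remote_port '
    'request="$request" status=$status rt=$request_time '
    'upstream_addr="$upstream_addr" upstream_status="$upstream_status" '
    'upstream_connect_time="$upstream_connect_time" '
    'upstream_response_time="$upstream_response_time" '
    'ssl_protocol=$ssl_protocol ssl_cipher=$ssl_cipher '
    'client_verify="$ssl_client_verify"';
```

Every field has to be named individually in the format string, the resulting line is long, and an open-ended set such as all request headers cannot be expressed this way.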

We can address these challenges by writing log entries in a structured format such as JSON, using the NGINX JavaScript module (njs). JSON format is also widely supported by log processing systems such as Splunk, LogStash, Graylog, and Loggly. By offloading the log_format syntax to a JavaScript function, we benefit from native JSON syntax, with access to all of the NGINX variables and additional data from the njs ‘r’ object.

js_import conf.d/json_log.js;
js_set $json_debug_log json_log.debugLog;

log_format access_debug escape=none $json_debug_log; # Offload to njs 
access_log /var/log/nginx/access_debug.log access_debug if=$is_error;

The js_import directive specifies the file containing the JavaScript code and imports it as a module. The code itself can be found here. Whenever we write an access log entry that uses the access_debug log format, the $json_debug_log variable is evaluated. This variable is evaluated by executing the debugLog JavaScript function as defined by the js_set directive.

This combination of JavaScript code and NGINX configuration produces diagnostic logs that look like this.

$ tail --lines=1 /var/log/nginx/access_debug.log | jq
{
   "timestamp": "2020-09-21T11:25:55+00:00",
   "connection": {
      "request_count": 1,
      "elapsed_time": 0.555,
      "pipelined": false,
      "ssl": {
         "protocol": "TLSv1.2",
         "cipher": "ECDHE-RSA-AES256-GCM-SHA384",
         "session_id": "b302f76a70dfec92f6bd72de5732692481ebecbbc69a4d81c900ae4dc928485c",
         "session_reused": false,
         "client_cert": {
            "status": "NONE"
         }
      }
   },
   "request": {
      "client": "127.0.0.1",
      "port": 443,
      "host": "foo.example.com",
      "method": "GET",
      "uri": "/one",
      "http_version": 1.1,
      "bytes_received": 87,
      "headers": {
         "Host": "foo.example.com:443",
         "User-Agent": "curl/7.64.1",
         "Accept": "*/*"
      }
   },
   "upstreams": [
      {
         "server_addr": "10.37.0.71",
         "server_port": 443,
         "connect_time": null,
         "header_time": null,
         "response_time": 0.551,
         "bytes_sent": 0,
         "bytes_received": 0,
         "status": 504
      },
      {
         "server_addr": "10.37.0.72",
         "server_port": 443,
         "connect_time": 0.004,
         "header_time": 0.004,
         "response_time": 0.004,
         "bytes_sent": 92,
         "bytes_received": 4161,
         "status": 200
      }
   ],
   "response": {
      "status": 200,
      "bytes_sent": 186,
      "headers": {
         "Content-Type": "text/html",
         "Content-Length": "4161"
      }
   }
}

The JSON format enables us to have separate objects for information related to the overall HTTP connection (including SSL/TLS), request, upstreams, and response. Notice how the first upstream (10.37.0.71) returned status 504 (Gateway Timeout) before NGINX tried the next upstream (10.37.0.72), which responded successfully. The half‑second timeout (reported as response_time in the first element of the upstreams object) accounts for most of the overall latency for this successful response (reported as elapsed_time in the connection object).

Here is another example of a (truncated) log entry, for a client error caused by an expired client certificate.

{
   "timestamp": "2020-09-22T10:20:50+00:00",
   "connection": {
      "ssl": {
         "protocol": "TLSv1.2",
         "cipher": "ECDHE-RSA-AES256-GCM-SHA384",
         "session_id": "30711efbe047c38a98c2209cc4b5f196988dcf2d7f1f2c269fde7269c370432e",
         "session_reused": false,
         "client_cert": {
            "status": "FAILED:certificate has expired",
            "serial": "1006",
            "fingerprint": "0c47cc4bd0fefbc2ac6363345cfbbf295594fe8d",
            "subject": "emailAddress=liam@nginx.com,CN=test01,OU=Demo CA,O=nginx,ST=CA,C=US",
            "issuer": "CN=Demo Intermediate CA,OU=Demo CA,O=nginx,ST=CA,C=US",
            "starts": "Sep 20 12:00:11 2019 GMT",
            "expires": "Sep 20 12:00:11 2020 GMT",
            "expired": true,
         ...
   "response": {
      "status": 400,
      "bytes_sent": 283,
      "headers": {
      "Content-Type": "text/html",
      "Content-Length": "215"
   }
}

Summary

By generating rich diagnostic data only when we encounter an error, we enable real‑time troubleshooting without needing to perform any reconfiguration. Successful requests are not impacted because the JavaScript code runs only when we detect an error at the logging phase, after the last byte was sent to the client.

The complete configuration is available on GitHub – why not try it in your environment?

Module ngx_http_log_module

The ngx_http_log_module module writes request logs in the specified format.

Requests are logged in the context of a location where processing ends. It may be different from the original location, if an internal redirect happens during request processing.

Example Configuration
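
A representative configuration, following the sample given in the nginx reference documentation (the format name compression and the log path are placeholders to adapt):

```nginx
# Declared in the http context. $gzip_ratio is provided by the gzip module.
log_format compression '$remote_addr - $remote_user [$time_local] '
                       '"$request" $status $bytes_sent '
                       '"$http_referer" "$http_user_agent" "$gzip_ratio"';

# Write entries in that format through a 32k buffer.
access_log /spool/logs/nginx-access.log compression buffer=32k;
```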

Directives

Default:	access_log logs/access.log combined;
Syntax:	access_log path [format [buffer=size] [gzip[=level]] [flush=time] [if=condition]];
	access_log off;
Context:	http, server, location, if in location, limit_except

Sets the path, format, and configuration for a buffered log write. Several logs can be specified on the same configuration level. Logging to syslog can be configured by specifying the “ syslog: ” prefix in the first parameter. The special value off cancels all access_log directives on the current level. If the format is not specified then the predefined “ combined ” format is used.

If either the buffer or gzip (1.3.10, 1.2.7) parameter is used, writes to log will be buffered.

The buffer size must not exceed the size of an atomic write to a disk file. For FreeBSD this size is unlimited.

When buffering is enabled, the data will be written to the file:

if the next log line does not fit into the buffer;
if the buffered data is older than specified by the flush parameter (1.3.10, 1.2.7);
when a worker process is re-opening log files or is shutting down.

If the gzip parameter is used, then the buffered data will be compressed before writing to the file. The compression level can be set between 1 (fastest, less compression) and 9 (slowest, best compression). By default, the buffer size is equal to 64K bytes, and the compression level is set to 1. Since the data is compressed in atomic blocks, the log file can be decompressed or read by “ zcat ” at any time.

For gzip compression to work, nginx must be built with the zlib library.
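
A short sketch of the buffering parameters described above. The paths are placeholders, and the gzip variant assumes an nginx binary built with zlib.

```nginx
# Buffered and gzip-compressed (default 64K buffer, compression level 1);
# buffered data is written out at least every 5 minutes.
access_log /path/to/log.gz combined gzip flush=5m;

# Buffered but uncompressed, with an explicit 32k buffer flushed at least once a minute.
access_log /path/to/access.log combined buffer=32k flush=1m;
```

With flush set, entries reach the file once the buffered data is older than the interval, even if the buffer has not filled up.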

The file path can contain variables (0.7.6+), but such logs have some constraints:

the user whose credentials are used by worker processes should have permissions to create files in a directory with such logs;
buffered writes do not work;
the file is opened and closed for each log write. However, since the descriptors of frequently used files can be stored in a cache, writing to the old file can continue during the time specified by the open_log_file_cache directive’s valid parameter;
during each log write the existence of the request’s root directory is checked, and if it does not exist the log is not created. It is thus a good idea to specify both root and access_log on the same configuration level:
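
A minimal sketch of this arrangement, along the lines of the nginx reference documentation (the /spool/vhost paths are illustrative):

```nginx
server {
    # root and access_log sit at the same level, so the per-host log is
    # only written when the matching per-host root directory exists.
    root       /spool/vhost/data/$host;
    access_log /spool/vhost/logs/$host;
}
```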

The if parameter (1.7.0) enables conditional logging. A request will not be logged if the condition evaluates to “0” or an empty string. In the following example, the requests with response codes 2xx and 3xx will not be logged:
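
A sketch of such a condition, following the nginx reference documentation (the variable name $loggable and the log path are placeholders):

```nginx
# http context: $loggable is "0" for 2xx and 3xx responses, "1" otherwise.
map $status $loggable {
    ~^[23]  0;
    default 1;
}

# Only requests for which $loggable is neither "0" nor empty are written.
access_log /path/to/access.log combined if=$loggable;
```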

Default:	log_format combined "...";
Syntax:	log_format name [escape=default|json|none] string ...;
Context:	http

Specifies log format.

The escape parameter (1.11.8) selects json or default character escaping for variable values; default escaping is used unless specified otherwise. The none value (1.13.10) disables escaping.

For default escaping, characters “"”, “\”, and other characters with values less than 32 (0.7.0) or above 126 (1.1.6) are escaped as “\xXX”. If the variable value is not found, a hyphen (“-”) will be logged.

For json escaping, all characters not allowed in JSON strings will be escaped: characters “"” and “\” are escaped as “\"” and “\\”, characters with values less than 32 are escaped as “\n”, “\r”, “\t”, “\b”, “\f”, or “\u00XX”.
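
To compare the two modes side by side, the following sketch logs the same header with default and with JSON escaping. The format names ua_default and ua_json and the file paths are hypothetical.

```nginx
# http context. With default escaping, a double quote inside the User-Agent
# header is written as \x22, which is not valid inside a JSON string.
log_format ua_default '$remote_addr "$http_user_agent"';

# With escape=json the same character is written as \" and the line stays valid JSON.
log_format ua_json escape=json
    '{"remote_addr":"$remote_addr","user_agent":"$http_user_agent"}';

access_log /var/log/nginx/ua_default.log ua_default;
access_log /var/log/nginx/ua_json.log    ua_json;
```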

The log format can contain common variables, and variables that exist only at the time of a log write:

$bytes_sent: the number of bytes sent to a client
$connection: connection serial number
$connection_requests: the current number of requests made through a connection (1.1.18)
$msec: time in seconds with a milliseconds resolution at the time of the log write
$pipe: “p” if request was pipelined, “.” otherwise
$request_length: request length (including request line, header, and request body)
$request_time: request processing time in seconds with a milliseconds resolution; time elapsed between the first bytes were read from the client and the log write after the last bytes were sent to the client
$status: response status
$time_iso8601: local time in the ISO 8601 standard format
$time_local: local time in the Common Log Format

Header lines sent to a client have the prefix “ sent_http_ ”, for example, $sent_http_content_range .

The configuration always includes the predefined “ combined ” format:
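
For reference, the predefined format is equivalent to the following definition from the nginx reference documentation:

```nginx
# Reference only: this format is built in and does not need to be declared.
log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';
```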

Default:	open_log_file_cache off;
Syntax:	open_log_file_cache max=N [inactive=time] [min_uses=N] [valid=time];
	open_log_file_cache off;
Context:	http, server, location

Defines a cache that stores the file descriptors of frequently used logs whose names contain variables. The directive has the following parameters:

max: sets the maximum number of descriptors in a cache; if the cache becomes full the least recently used (LRU) descriptors are closed
inactive: sets the time after which the cached descriptor is closed if there were no access during this time; by default, 10 seconds
min_uses: sets the minimum number of file uses during the time defined by the inactive parameter to let the descriptor stay open in a cache; by default, 1
valid: sets the time after which it should be checked that the file still exists with the same name; by default, 60 seconds
off: disables caching
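
A usage sketch with all four parameters set, matching the sample values in the nginx reference documentation (tune them to your own traffic):

```nginx
# Keep up to 1000 descriptors; close those unused for 20 seconds; check once
# a minute that each file still exists under the same name; keep a descriptor
# open only after the file has been used at least twice within the inactive window.
open_log_file_cache max=1000 inactive=20s valid=1m min_uses=2;
```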

Tech Blog: How to configure JSON logging in nginx?

By Ivan Borko on July 21, 2022

In previous posts from this series, we discussed how we formatted UWSGI and Python logs in JSON format. We still have one important production component left: the Nginx server. This blog post will describe how the Nginx logging module works, and showcase a simple logging configuration where Nginx logger is configured to output JSON logs.

This blog post is a part of the Velebit AI Tech Blog series where we discuss good practices for a scalable and robust production deployment.

Logging architecture

Our goal is to have all services produce JSON logs so that we can directly feed them to Elasticsearch, without additional processing services (like Logstash) that require additional maintenance, consume a lot of CPU, and thus incur extra costs. This enables us to, instead of having Filebeat, Logstash, and Elasticsearch, use just Fluent Bit and Elasticsearch.

All our services are deployed as docker containers but this solution can work with or without docker. You can read more about our logging system components and architecture here.

Nginx logging

We use Nginx as a load balancer, for SSL termination (HTTPS), as a cache, and as a reverse proxy. Incoming requests for all our services go through Nginx, so access logs are the place where you get the best picture of the actual usage and performance of your system from the client’s perspective.

Nginx by default has two log files: the access log and the error log. The access log records every request to the Nginx server, while the error log records issues with the Nginx service itself, but not errors of our services or bad responses (these go to the access log).

The Nginx server contains a logging module, ngx_http_log_module, that writes request logs in a specified format. The default format is called combined and logs values with a space delimiter.

JSON format

The log format is specified using characters and variables, so you can create your own JSON format. The main issue with this approach is that variables can contain strings that will break the JSON format (if a variable contains " or \ characters, for example). To fix this, Nginx version 1.11.8 added support for JSON escaping.

There are 3 options for escaping: default, json, and none. JSON escaping will escape characters not allowed in JSON strings: characters “"” and “\” are escaped as “\"” and “\\”, characters with values less than 32 are escaped as “\n”, “\r”, “\t”, “\b”, “\f”, or “\u00XX”.

Every log format has a name (identifier) that can be used multiple times. Log formats can be configured outside of server directive, and then used in multiple servers.

In our nginx configuration, this looks like this:

The important thing to note here is that variables holding numbers are written without double quotes, so that Elasticsearch indexes them as the long type, while variables holding strings are wrapped in double quotes.

In the server directive, we reference the log format ( logger-json ) we want to use when specifying access log parameters.
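
As a rough sketch of what such a setup could look like (the field selection, listener, and server name are assumptions for illustration, not the authors' actual configuration): string variables are quoted, numeric variables are emitted bare, and the server block refers to the format by name.

```nginx
# Defined once in the http context; usable from any server block.
# $status, $body_bytes_sent, $request_length and $request_time are assumed
# to be numeric for every logged request, so they are left unquoted.
log_format logger-json escape=json
    '{'
        '"time":"$time_iso8601",'
        '"remote_addr":"$remote_addr",'
        '"method":"$request_method",'
        '"uri":"$request_uri",'
        '"status":$status,'
        '"body_bytes_sent":$body_bytes_sent,'
        '"request_length":$request_length,'
        '"request_time":$request_time,'
        '"user_agent":"$http_user_agent"'
    '}';

server {
    listen      80;             # hypothetical listener
    server_name example.com;    # hypothetical name
    access_log  /var/log/nginx/access.log logger-json;
}
```

Leaving a variable unquoted is only safe when it always holds a plain number. A variable such as $upstream_response_time can contain “-” or several comma-separated values, so quoting it is the more cautious choice.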

What the result looks like (sample)

In the access.log file (located in /var/log/nginx ), every line contains one log record. Below you can see an example of a log record (formatted for readability).

This record is easily processed by FluentBit and inserted into Elasticsearch (more on FluentBit in this article). In most setups, Elasticsearch is then connected to Grafana, Kibana, or a similar tool used for log visualizations, creating graphs, and alerting.

We are using both Kibana and Grafana. Grafana is better suited for alerting, while Kibana is better when you have a problem and you need to deep dive into logs (discovery). You can read more about connecting Elasticsearch to Grafana and Kibana in our previous blog post.

This article was originally posted on SigNoz Blog and is written by Selvaganesh.

NGINX is a prominent web server, reverse proxy server, and mail proxy utilized by many websites and applications to serve content to their users. One important aspect of managing a web server is logging, which refers to the process of recording information about the server’s activity and performance.

In NGINX, logging is done using the error_log and access_log directives.

The error_log directive specifies the file where NGINX should log errors.

The access_log directive specifies the file where NGINX should log information about incoming requests and responses.

What are Nginx Error Logs?

The error_log directive is typically used to log information about errors and other important events that occur on the server. This can include messages about failed requests, issues with the server configuration, and other issues that may require attention.

An example of an error log is shown in the picture below:

Nginx Error log example

What are Nginx Access Logs?

The access_log directive, on the other hand, is used to log information about incoming requests and responses. This can include details such as the IP address of the client making the request, the URL of the requested resource, the response status code, and the size of the response.

An example of access logs is shown in the picture below:

Nginx Access log example

NGINX logs can be useful for various purposes, including tracking the server’s performance, identifying potential issues or errors, and analyzing the usage patterns of the server. However, managing logs can also be challenging, as they can quickly grow in size and become difficult to manage.

In this tutorial, we will illustrate the following:

How to configure Nginx access logs
How to configure Nginx error logs
How to send Nginx logs to Syslog
Collecting and analyzing Nginx logs with SigNoz

Let’s get started.

Prerequisites

Docker
Nginx

Installing Nginx

Installing NGINX on Linux

sudo apt update
sudo apt install nginx

To start NGINX:

sudo service nginx start

Installing NGINX on Mac

You can install NGINX on Mac using Homebrew :

brew install nginx

To start NGINX:

brew services start nginx

You can then access your NGINX server on localhost. Go to http://localhost, and you should see a screen like the one below.

NGINX server running on localhost

Configuring NGINX to generate access logs

Let’s go ahead and make the necessary changes to the nginx.conf file in order to change the location and structure of the logs.

With the packaged defaults on most Linux distributions, NGINX logs all incoming requests to the access.log file in the /var/log/nginx directory. The format of the log entries in this file is defined by the log_format directive in the NGINX configuration file.

Let’s add a custom log format to the nginx.conf file (/etc/nginx/nginx.conf), as shown below.

log_format logger-json escape=json
'{'
'"source": "nginx",'
'"message":"nginx log captured",'
'"time": "$time_iso8601",'
'"resp_body_size": $body_bytes_sent,'
'"host": "$http_host",' 
'"address": "$remote_addr",' 
'"request_length": $request_length,'
'"method": "$request_method",' 
'"uri": "$request_uri",' 
'"status": $status,'  
'"user_agent": "$http_user_agent",' 
'"resp_time": $request_time,' 
'"upstream_addr": "$upstream_addr"'
'}';

To use the format, reference its name (logger-json) in an access_log directive.

The example below defines the format and the access log together in the http context, so every server block inherits them.

http {
        sendfile on;
        tcp_nopush on;
        tcp_nodelay on;
        keepalive_timeout 65;
        types_hash_max_size 2048;

        log_format logger-json escape=json '{"source": "nginx","message":"nginx log captured","time": "$time_iso8601", "resp_body_size": $body_bytes_sent, "host": "$http_host", "address": "$remote_addr", "request_length": $request_length, "method": "$request_method", "uri": "$request_uri", "status": $status,  "user_agent": "$http_user_agent", "resp_time": $request_time, "upstream_addr": "$upstream_addr"}';

        include /etc/nginx/mime.types;
        default_type application/octet-stream;

        ##
        # Logging Settings
        ##

        access_log /home/user/Work/logs/access.log logger-json;
}

A list of available variables can be found here.

Restart Nginx for the configuration to take effect:

sudo service nginx restart

Note: With Homebrew on macOS, the Nginx configuration typically lives under /usr/local/etc/nginx (Intel) or /opt/homebrew/etc/nginx (Apple Silicon), so make the changes in the file there.

Go to url http://localhost and now look at the access.log file.

Each line in /home/user/Work/logs/access.log holds one log record. An example record, formatted here for readability, is shown below.

{
  "source": "nginx",
  "message": "nginx log captured",
  "time": "2022-12-11T03:52:58-08:00",
  "resp_body_size": 396,
  "host": "192.168.1.2",
  "address": "192.168.1.8",
  "request_length": 198,
  "method": "GET",
  "uri": "/",
  "status": 200,
  "user_agent": "PostmanRuntime/7.29.2",
  "resp_time": 0.000,
  "upstream_addr": ""
}

You can view the access.log file through the terminal using the cat command as shown below.

access.log file which generates Nginx logs

The next step is to send these logs to the SigNoz platform.

Configuring NGINX to generate error logs

To enable the error log, choose the log level and the log file location with the error_log directive in the nginx.conf configuration file, as shown below:

error_log  /home/user/Work/logs/nginx_error.log emerg;
error_log  /home/user/Work/logs/nginx_info.log info;

There are several levels of error logging that you can use to specify the types of errors that should be logged. These log levels are:

debug: Debug-level messages are very detailed and are typically used for debugging purposes.
info: Information-level messages are used to log important events, such as the start and stop of the Nginx server.
notice: Notice-level messages are used to log events that are not necessarily error conditions, but are worth noting.
warn: Warning-level messages are used to log potential error conditions that may require attention.
error: Error-level messages are used to log actual error conditions that have occurred.
crit: Critical-level messages are used to log very severe error conditions that may require immediate attention.
alert: Alert-level messages are used to log conditions that require immediate action.
emerg: Emergency-level messages are used to log conditions that are so severe that the Nginx server may be unable to continue running.
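
The levels above are listed from least to most severe, and setting a level logs messages of that level and of every more severe one. A minimal sketch, assuming hypothetical file paths, listener, and server name:

```nginx
# main context: applies to the whole instance. "warn" also captures
# error, crit, alert and emerg messages.
error_log /var/log/nginx/error.log warn;

events {}

http {
    server {
        listen      8080;                  # hypothetical listener
        server_name debug.example.com;     # hypothetical name
        # More verbose logging for this one server only.
        error_log /var/log/nginx/debug.example.com.error.log info;
    }
}
```

If no level is given, error is used. The debug level only produces output when nginx has been built with debugging support (--with-debug).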

You need to reload the Nginx configuration for these changes to take effect, for example by running sudo service nginx reload (a full sudo service nginx restart also works).

Nginx error log configuration

Sending NGINX logs to Syslog

Syslog is a standard for logging system events. It is used to record and store the log messages produced by various system components, including the kernel, system libraries, and applications. Syslog provides a centralised method for managing and storing log messages, making it easier to monitor and resolve system issues.

To collect syslog from Nginx, you will need to configure Nginx to send its log messages to syslog.

Add the following lines to the configuration file, replacing "syslog_server_hostname" with the hostname or IP address of your syslog server:


error_log syslog:server=syslog_server_hostname:54527,facility=local7,tag=nginx,severity=error;
access_log syslog:server=syslog_server_hostname:54527,facility=local7,tag=nginx,severity=debug;

Save the configuration file and restart Nginx.

Now, Nginx will send its log messages to the syslog server, which can be accessed and analyzed as needed.

There are several options that you can use to customize the way that Nginx sends syslog messages. Here are a few examples:

"facility": This option specifies the syslog facility of the log message. The facility is used to categorize log messages and can be used to filter log data on the syslog server. Common facilities include "local0" through "local7", "user", "daemon", and "kern". The default is "local7".
"tag": This option specifies a tag to be added to the log message. The tag can be used to identify the source of the log message, and can be used to filter log data on the syslog server.
"severity": This option specifies the severity level assigned to access_log messages. The accepted values are the same as the error_log levels: "emerg", "alert", "crit", "error", "warn", "notice", "info", and "debug". The default is "info".

💡 Note: For error_log, Nginx determines the severity of each message itself and ignores the "severity" parameter. To control which error messages are sent to syslog, set the log level as the last argument of the error_log directive (e.g. info, warn, etc.).
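
On the wire, syslog combines facility and severity into a single priority number, PRI = facility * 8 + severity, which is what a syslog server filters on. The sketch below shows that arithmetic for the options used above. The numeric codes are the standard syslog ones, the table lists only a subset of facilities, and the function name syslog_pri is hypothetical; this is not how Nginx itself is written.

```python
# Standard syslog facility codes (subset) and severity codes.
FACILITIES = {"kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4}
FACILITIES.update({f"local{i}": 16 + i for i in range(8)})  # local0..local7 -> 16..23

# Nginx level names mapped to syslog severity codes (0 is most severe).
SEVERITIES = {
    "emerg": 0, "alert": 1, "crit": 2, "error": 3,
    "warn": 4, "notice": 5, "info": 6, "debug": 7,
}


def syslog_pri(facility: str, severity: str) -> int:
    """Compute the syslog priority value: facility * 8 + severity."""
    return FACILITIES[facility] * 8 + SEVERITIES[severity]


# The access_log line above uses facility=local7 and severity=debug.
print(syslog_pri("local7", "debug"))  # 191
# With the default access_log severity (info), the value would be 190.
print(syslog_pri("local7", "info"))   # 190
```

A message sent with facility=local7 and severity=debug therefore starts with the header <191>.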

To configure syslog on the SigNoz platform, refer to this documentation.

NGINX Logging and Analysis with SigNoz

SigNoz is a full-stack open source APM that can be used for analyzing NGINX logs. SigNoz provides all three telemetry signals — logs, metrics, and traces under a single pane of glass. You can easily correlate these signals to get more contextual information while debugging your application.

SigNoz stores logs in ClickHouse, a columnar database. Columnar databases like ClickHouse are very efficient at ingesting and storing log data and making it available for analysis.

Using SigNoz for NGINX logs can make troubleshooting easier. SigNoz comes with an advanced log query builder, live tail logs, and the ability to filter log data across multiple fields.

Let us see how to collect and analyze Nginx logs with SigNoz.

Installing SigNoz

SigNoz can be installed on macOS or Linux machines in three steps using an install script.

Docker Engine is installed automatically on Linux by the installation script. However, before running the setup script on macOS, you must manually install Docker Engine.

git clone -b main https://github.com/SigNoz/signoz.git
cd signoz/deploy/
./install.sh

You can visit the documentation for instructions on how to install SigNoz using Docker Swarm and Helm Charts.

Steps for collecting Nginx logs into SigNoz

Modify the docker-compose.yaml file present inside deploy/docker/clickhouse-setup to mount the log file into the otel-collector container. Add the path where the Nginx access logs are available as a volume, in the form ~/Work/logs/access.log:/tmp/access.log.

otel-collector:
    image: signoz/signoz-otel-collector:0.66.0
    command: ["--config=/etc/otel-collector-config.yaml"]
    user: root # required for reading docker container logs
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - ~/Work/logs/access.log:/tmp/access.log

Here we are mounting the Nginx log file into the /tmp directory of the SigNoz otel-collector container. You will have to replace ~/Work/logs/access.log with the path where your log file is present.

Add the filelog receiver to otel-collector-config.yaml, which is present inside deploy/docker/clickhouse-setup, and include the path /tmp/access.log:

receivers:
  filelog:
    include: [  "/tmp/access.log" ]
    start_at: beginning
    operators:
    - type: json_parser
      timestamp:
        parse_from: attributes.time
        layout: '%Y-%m-%dT%H:%M:%S%z'
    - type: move
      id: parse_body
      from: attributes.message
      to: body
    - type: remove
      id: time
      field: attributes.time
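
To make the operator chain above easier to follow, here is a rough Python sketch of what it does to one log line: parse the JSON, read the timestamp from the time field, move message into the body, and drop time from the attributes. It is an illustration only, not the collector's implementation. The sample line and its field values are assumptions, and the function name parse_line is hypothetical; the receiver only works if your Nginx log_format really emits JSON with time and message fields in this layout.

```python
import json
from datetime import datetime

# Same strptime layout as in the receiver configuration.
LAYOUT = "%Y-%m-%dT%H:%M:%S%z"


def parse_line(line: str) -> dict:
    """Mimic the json_parser, move, and remove operators for one log line."""
    attributes = json.loads(line)                              # json_parser
    timestamp = datetime.strptime(attributes["time"], LAYOUT)  # timestamp.parse_from
    body = attributes.pop("message")                           # move: attributes.message -> body
    del attributes["time"]                                     # remove: attributes.time
    return {"timestamp": timestamp, "body": body, "attributes": attributes}


# Hypothetical JSON access log line; the fields are assumed for illustration.
sample = '{"time": "2023-01-10T12:34:56+0000", "message": "GET / HTTP/1.1", "status": "200"}'

record = parse_line(sample)
print(record["timestamp"].isoformat())
print(record["body"])
print(record["attributes"])
```

Any field other than time and message (status in this sample) stays in the attributes, which is what you can later filter on in the SigNoz log view.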

Next we will modify our pipeline inside otel-collector-config.yaml to include the receiver we have created above.

service:
    ....
    logs:
        receivers: [otlp, filelog]
        processors: [batch]
        exporters: [clickhouselogsexporter]

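Once the changes are made, we need to restart the OTel Collector container to apply them. Use the command docker-compose restart. Because the volume mount in docker-compose.yaml also changed, the container has to be recreated for the mount to take effect; docker-compose up -d does that.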

Check if all the containers are running properly by using the command docker ps:

We can now go to the URL http://localhost and generate some Nginx logs in the access.log file.

Go to the SigNoz URL (http://localhost:3301/), click on the Logs tab, and search for the word "nginx".

Nginx logs sent to SigNoz

Click on the individual log to get a detailed view:

Details of individual Nginx logs

You can also view the logs in JSON format.

Nginx logs in JSON format

Conclusion

NGINX logs provide useful information for debugging Nginx web servers. By using the error_log and access_log directives, you can track the performance and usage of your server and identify potential issues or errors.

The error_log directive can give you information about all errors, and you can use these logs to identify exactly what went wrong. The access_log directive gives you information about HTTP requests received by the Nginx server, and other client information like IP address, response status code, etc.

Debugging a single Nginx server by reading its logs directly is possible, but production setups rarely look like that. Managing multiple Nginx servers and troubleshooting them effectively requires a centralized log management solution. In case of server downtime, you need to troubleshoot issues quickly, and effective dashboards for Nginx monitoring are needed.

With SigNoz logs, you can effectively manage your Nginx logging. You can check out its GitHub repo now.
