Lsyncd error terminating since out of inotify watches

Following on from a previous article, where I demonstrated how to synchronise data to a remote host using lsyncd, this one will show how to deal with a problem I faced.

The problem was that, although lsyncd had started successfully, the synchronisation had not begun. In the previous article I created a directory and some files to test the sync, and this worked. However, when I tried to copy some data for real, 500G odd, the sync did not happen.

After checking the log file:

root@host:~# less /var/log/lsyncd/lsyncd.log

…I found the following error:

...
Wed Apr  7 14:27:40 2021 Normal: Finished (list): 0
Wed Apr  7 16:48:55 2021 Normal: --- TERM signal, fading ---
Wed Apr  7 16:49:13 2021 Normal: recursive startup rsync: /srv/repositories-remote/ -> 192.168.124.16:/srv/repositories/
Wed Apr  7 16:49:14 2021 Normal: Startup of "/srv/repositories-remote/" finished: 0
Wed Apr  7 16:50:38 2021 Normal: --- TERM signal, fading ---
Wed Apr  7 16:53:50 2021 Error: Terminating since out of inotify watches.
Consider increasing /proc/sys/fs/inotify/max_user_watches
Wed Apr  7 17:07:53 2021 Normal: --- TERM signal, fading ---
...

Fortunately this problem is easy to solve: we just need to raise a kernel limit. As I understand it, one inotify watch is required per watched file/directory and each uses about 1kB on a 64-bit system. The variable we need to alter is max_user_watches.

You can check the current value with:

root@host:~# cat /proc/sys/fs/inotify/max_user_watches
8192
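
Since lsyncd places roughly one inotify watch on each directory it monitors, a rough way to estimate how many watches a sync will need is to count the directories under the source. A quick sketch, using the source path from the log excerpt above:

root@host:~# find /srv/repositories-remote -type d | wc -l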

To adjust this value, first stop the lsyncd daemon process:

root@host:~# systemctl stop lsyncd.service

Issue the following command to change the current running value in the kernel:

root@host:~# sysctl fs.inotify.max_user_watches=400000

And then restart the lsyncd process:

root@host:~# systemctl start lsyncd.service

Then check the log again and make sure the previous error is no longer showing up, in which case the synchronisation should begin.
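
A quick way to do that, based on the log lines shown earlier, is to pull out recent error lines alongside the startup "finished" messages:

root@host:~# grep -E 'Error|finished' /var/log/lsyncd/lsyncd.log | tail -n 5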

To change this permanently, run something like this:

root@host:~# sysctl -w fs.inotify.max_user_watches=400000 | tee -a /etc/sysctl.conf && sysctl -p

The above command sets the current kernel value as well as updating the configuration file. We also force sysctl to reload the configuration for good measure.
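
On distributions that read /etc/sysctl.d at boot, an alternative is to keep the setting in its own drop-in file rather than appending to /etc/sysctl.conf. A sketch (the file name is just an example):

root@host:~# echo 'fs.inotify.max_user_watches = 400000' > /etc/sysctl.d/90-lsyncd-inotify.conf
root@host:~# sysctl --system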

Lsyncd Technical Session

So what is lsyncd?

Lsyncd is a tool used to keep a source directory in sync with other local or remote directories. It is well suited to keeping directories in sync by batch-processing changes over to the synced directories.

When would we use lsyncd?

So the generic use case is to keep a source directory in sync with one or more local and remote directories.

This could mean:

  • Creating a live backup of a directory which would be easy to fail over to.
  • Eliminating a single point of failure by distributing the data to multiple servers.
  • Scaling out a web application (e.g. WordPress).

How does lsyncd work?

Rsync + SSH

Primarily lsyncd is used as a combination of rsync + ssh. This is how we keep folders on remote servers in sync with our source.

Lsyncd is written in the Lua language and thus the configuration is valid Lua syntax. This allows us to configure lsyncd at various depths of complexity. On their website the configuration is broken down into four layers. You can read further on this topic in their documentation.

Configuration

Lsyncd’s configuration file is located at /etc/lsyncd.conf. This is where we configure the behavior of lsyncd. The two main sections of our configuration file are settings and sync.

In the settings section we define some of the global options for our daemon.

settings {
        logfile         = "/var/log/lsyncd/lsyncd.log",
        statusFile      = "/var/log/lsyncd/lsyncd.stat",
        statusInterval  = 1,
        nodaemon        = false
}

Caution: You may see the settings directive written as "settings = {…}". In previous versions settings was defined as a variable; it is now a function call and thus no longer needs an "=" between settings and {.

Here is an example of a sync section.

sync {
        default.rsyncssh,
        source = "/var/www/html",   
        targetdir = "/var/www/html",
        host = "192.168.33.20",
        delay           = 5,
        rsync = { rsh="/usr/bin/ssh -l webuser -i /home/webuser/.ssh/id_rsa -o StrictHostKeyChecking=no"}
}

It is possible to use just rsync or bash to keep two locations in sync locally. We do not cover these in this session and instead focus mainly on using rsync + ssh.

Batch Processing

Lsyncd monitors files and directories for changes. These changes are observed, aggregated and batched out to the target servers. The default interval for batching is 15 seconds. This behavior can be tuned with the maxDelays setting in the settings section and the delay directive in the sync section.

Note on File Monitoring: This is done via the kernel's file monitoring interface (inotify or fsevents). The maximum number of inotify watches is defined under /proc/sys/fs/inotify/max_user_watches.

Example: maxDelays configures the number of events to queue up before running an rsync.

settings {
...
maxDelays = 10
...
}

Example: delay sets how long to wait before syncing the queued events.

sync {
	default.rsyncssh,
	...,
	delay = 5,
	...
}

Setup and Installation

As of this writing (8/29/2016) the most recent version available in the EPEL channel is lsyncd-2.1.5. This section will cover installation and setup on a pair of CentOS 6 servers.

Prerequisites

  • Two or more servers.
  • The appropriate EPEL channel configured
  • lsyncd from the EPEL channel
  • SSH Keys
  • A source and target location defined

Installation

For our CentOS 6 servers, installation is pretty straightforward.
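
If the EPEL repository is not yet configured, it can usually be enabled first. A sketch, assuming the epel-release package is available from the distribution's standard repositories:

# yum -y install epel-release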

# yum -y install lsyncd

SSH Keys

Let us assume we have two servers, web1 and web2.

  • Web1 — Our source server has the IP of 192.168.33.10
  • Web2 — Our target server has the IP of 192.168.33.20

In this example we are creating an SSH key pair on web1 and distributing the public key to web2.

# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
a7:a3:70:ce:f7:1c:11:a4:56:cd:7a:47:87:c3:1b:a3 root@lb
The key's randomart image is:
+--[ RSA 2048]----+
|          oo . . |
|         +  o B .|
|        o .. o * |
|       .  ..E o  |
|        S o. .   |
|         o .     |
|    . . o .      |
|     = ..o .     |
|      +. .o      |
+-----------------+
# ssh-copy-id user@web2
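
Before wiring lsyncd up to the remote server, it is worth confirming that key-based login works without a password prompt. A quick check (BatchMode makes ssh fail rather than fall back to asking for a password):

# ssh -o BatchMode=yes user@web2 hostname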

Source and Target Locations

Here is an example configuration file. This could be copied and pasted to have a working configuration. We will discuss it in detail below.

-- Two dashes define a comment
settings {
        logfile         = "/var/log/lsyncd/lsyncd.log",
        statusFile      = "/var/log/lsyncd/lsyncd.stat",
        statusInterval  = 1,
        nodaemon        = false,
}

sync {
        default.rsyncssh,
        source = "/var/www/html",
        host = "192.168.33.20",
        targetdir = "/var/www/html",
        delay           = 5,
        rsync = { rsh="/usr/bin/ssh -l webuser -i /home/webuser/.ssh/id_rsa -o StrictHostKeyChecking=no"}
}

Under the settings function most items are self-explanatory, but let us talk about statusInterval and nodaemon.

statusInterval = 1 — writes the status file at shortest after this number of seconds has passed (default: 10)

nodaemon = false — Determines whether lsyncd runs as a daemon or in the foreground (false means it runs as a daemon).

With the sync section there is a bit more to elaborate on.

default.rsyncssh — Specifies that we are using SSH in addition to rsync to sync to remote hosts.
source = "/var/www/html" — The path we want to sync from.
host = "192.168.33.20" — The host we want to sync to.
targetdir = "/var/www/html" — The destination path we want to sync to.
delay = 5 — Changes the default batch time from 15 to 5 seconds.
rsync = { rsh="/usr/bin/ssh -l webuser -i /home/webuser/.ssh/id_rsa -o StrictHostKeyChecking=no" } — This entry provides a way to pass additional options to rsync. In our example we are connecting as the webuser user.

One extra Note

The -o StrictHostKeyChecking=no setting is not strictly required, but it will make life easier if you are syncing with a server you have never logged in to. It bypasses having to ssh in to the server and accept the host key before syncing.
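
If you would rather leave strict host key checking enabled, an alternative (sketched here; run it as the local user that lsyncd runs as, so the right known_hosts file is updated) is to record the target's host key ahead of time:

# ssh-keyscan -H 192.168.33.20 >> ~/.ssh/known_hosts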

And That’s It

With our servers set up, the keys in place, and the lsyncd.conf file written, we should be good to start the service and watch our files begin to sync.
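
Optionally, before enabling the service, you can run lsyncd in the foreground once so that any configuration mistakes show up straight away on the terminal (a quick sanity check using the -nodaemon command line option; stop it with Ctrl-C once it starts cleanly):

# lsyncd -nodaemon /etc/lsyncd.conf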

# chkconfig lsyncd on
# service lsyncd start

Upon startup we can look at our log and status files to gather some information.

/var/log/lsyncd/lsyncd.log

Tue Aug 30 04:57:54 2016 Normal: recursive startup rsync: /var/www/html/ -> 192.168.33.20:/var/www/html/
Tue Aug 30 04:57:55 2016 Normal: Startup of "/var/www/html/" finished: 0

/var/log/lsyncd/lsyncd.stat

Lsyncd status report at Tue Aug 30 04:58:05 2016

Sync1 source=/var/www/html/
There are 0 delays
Excluding:
  nothing.


Inotify watching 136 directories
  1: /var/www/html/
  2: /var/www/html/wordpress/
  3: /var/www/html/wordpress/wp-content/
  4: /var/www/html/wordpress/wp-content/plugins/
  5: /var/www/html/wordpress/wp-content/plugins/akismet/
  6: /var/www/html/wordpress/wp-content/plugins/akismet/_inc/
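
The status file also makes it easy to see how close lsyncd is getting to the kernel watch limit discussed earlier:

# grep 'Inotify watching' /var/log/lsyncd/lsyncd.stat
# cat /proc/sys/fs/inotify/max_user_watches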

One More Thing

WARNING — Restarting lsyncd can delete files you may not be expecting it to. — WARNING

From the lsyncd documentation:

By default Lsyncd will delete files on the target that are not present at the source since this is a fundamental part of the idea of keeping the target in sync with the source. However, many users requested exceptions for this, for various reasons, so all default implementations take delete as an additional parameter.

By default the delete directive is true, which means Lsyncd will delete from the target whatever is not present in the source, both at startup and during normal operation.

Other Valid Options

delete = false: Lsyncd will not delete any files on the target, neither on startup nor during normal operation (overwrites are possible, though).

delete = 'startup': Lsyncd will delete files on the target when it starts up but not during normal operation.

delete = 'running': Lsyncd will not delete files on the target when it starts up but will delete those that are removed during normal operation.

So by default lsyncd will delete any files on the targets that are not currently on the source. This is in keeping with the idea of keeping one path in sync with a source path.

If you find yourself in a situation where data is being written to one of the target directories outside of the lsyncd process, restarting lsyncd on the master WILL DELETE THOSE FILES.

In that case, before restarting lsyncd you will want to sync all of the targets back to the source first.
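
A hedged sketch of what syncing a target back to the source might look like with the example hosts used above: the first command is a dry run (-n) that only previews what would be copied, and the second performs the copy for real once the preview looks right.

# rsync -avn webuser@192.168.33.20:/var/www/html/ /var/www/html/
# rsync -av webuser@192.168.33.20:/var/www/html/ /var/www/html/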

Lsyncd Configuration


Hi All,
Hope you are all doing well.

I’m facing an issue while syncing data using lsyncd. I’m working on a project to migrate data from a source S3 bucket to a target S3 bucket. Both buckets have been configured via AWS Storage Gateway and shared to Linux servers as NFS shares.
The data sizes on the servers below are:
Server A : approx 2 TB
Server B : approx 8 TB
Server C : approx 25 TB

NFS shares :
Server A : (src-nfsshare1, tgt-nfsshare2)
Server B : (src-nfsshare3, tgt-nfsshare4)
Server C : (src-nfsshare5, tgt-nfsshare6)

Our approach to the data migration is to mount the source S3 share (src-nfsshare x) and the target S3 share (tgt-nfsshare x) on each server.
I used the tar command for the initial full copy of the data, which was quite a bit quicker than rsync.

Code:

ex: cd /src/data/01;tar -c * |tar -xvf - -k -C /tgt/data/01 >> /root/tar_01.log

It took me 2 months to complete the full copy process. Now I’ve set up lsyncd to replicate the delta data as soon as the source has new or updated data. I preferred lsyncd over rsync scheduled via cron.

Now, for Server A it works fine. I can see the logs updating and showing the data being synced.

Logs Server A:

Code:

# tail -f /var/log/lsyncd/lsyncd.log
Mon Sep  2 23:50:25 2019 Normal: --- Startup ---
Mon Sep  2 23:51:27 2019 Normal: recursive startup rsync: /src/data/01/ -> /tgt/data/01/ excluding
*.snapshot
lost+found
Mon Sep  2 23:51:27 2019 Normal: recursive startup rsync: /src/data/01/ -> /tgt/data/01/ excluding
*lost+found
*.snapshot
Mon Sep  2 23:56:22 2019 Normal: Startup of /src/data/01/ -> /tgt/data/01/ finished.

Lsyncd Configuration:

Code:

# cat /etc/lsyncd.conf
----
-- User configuration file for lsyncd.
--
-- Simple example for default rsync, but executing moves through on the target.
--
-- For more examples, see /usr/share/doc/lsyncd*/examples/
--
settings {
        logfile        = "/var/log/lsyncd/lsyncd.log",
        statusFile     = "/var/log/lsyncd/lsyncd.status",
        statusInterval = 60,
}

sync {
        default.rsync,
        source  = "/src/data/01",
        target  = "/tgt/data/01",
        delete  = "false",
        delay   = 300,
        exclude = { 'lost+found', '*.snapshot' },
        rsync   = { archive = true, compress = false, verbose = true }
}
#

I increased the value below from the default to 150000, which worked for me. The reason: I was getting the error "Consider increasing /proc/sys/fs/inotify/max_user_watches" while lsyncd was initiating the sync.

# cat /proc/sys/fs/inotify/max_user_watches
150000
#

But for servers B and C, although lsyncd doesn’t throw any error, I think there is some issue. The service has been running since yesterday but does not seem to be doing anything.

The lsyncd configuration for servers B & C is the same as above (only the src & tgt paths are different).
Here I set the inotify watches limit a little higher, because it was still failing at 500000.

Code:

# cat /proc/sys/fs/inotify/max_user_watches
600000
#

Lsyncd logs for servers B & C are the same:

Code:

# tail -f /var/log/lsyncd/lsyncd.log
Consider increasing /proc/sys/fs/inotify/max_user_watches
Wed Sep  4 12:10:56 2019 Normal: --- Startup ---
Wed Sep  4 12:30:44 2019 Error: Terminating since out of inotify watches.
Consider increasing /proc/sys/fs/inotify/max_user_watches
Wed Sep  4 12:59:58 2019 Normal: --- Startup ---
Wed Sep  4 18:10:36 2019 Error: Terminating since out of inotify watches.
Consider increasing /proc/sys/fs/inotify/max_user_watches
Wed Sep  4 23:21:16 2019 Normal: --- Startup ---
Thu Sep  5 02:14:52 2019 Normal: --- TERM signal, fading ---
Thu Sep  5 02:14:52 2019 Normal: --- Startup ---

It’s been more than 25 hours but nothing has been logged in the log files, and the lsyncd status file is also blank. Can anybody shed light on this?
Also, I’m looking to add other rsync options with lsyncd (e.g. --stats, --progress or --info=progress2). How can I do that?
How can I redirect errors that occur during the file syncing operation (e.g. I/O errors or timeout errors) to a separate log file?
