An error occurred while taking a snapshot: msg.snapshot.error-QUIESCINGERROR

    Solving the problem described below took six months. Many different fixes, updates, and timer changes were tried, and a great many forums and communities were combed through. In the end, the solution was found experimentally with the help of VMware technical support.

   How the system works: virtual machines are backed up with Veritas NetBackup. One way to back up virtual machines is to take snapshots of them and write those to tape storage. This is not the best option, because a snapshot of a virtual machine cannot really be regarded as a full backup: I/O errors may occur and result in an "inconsistent backup". Nevertheless, this option was chosen and implemented because it allowed the backup to run over the SAN instead of the shared LAN, which significantly increased copy speed and offloaded the LAN.

   Ideally, a virtual machine should be powered off before it is backed up via a snapshot, but that is hard to do in production. Instead, special pre- and post-freeze scripts are used: they stop the required services before the snapshot is taken and restore them once the snapshot has been created. Other backup products use such scripts as well, and the VMware knowledge base even has KB articles on writing them.

   The problem: backups are performed by Veritas NetBackup through the VMware vStorage APIs for Data Protection using SAN transport. To back up a machine, NetBackup sends a request to VMware, which in turn uses VMware Tools to run the previously created scripts that stop the services. However, with heavily loaded applications that continuously receive and send large volumes of data, the services may not stop quickly because of the high I/O load.

   As a result, we see the following picture: after 15 minutes the snapshot creation is aborted, and a message like this appears in the VM's vmware.log:


2016-09-19T14:43:46.971Z| vmx| I120: Msg_Post: Warning
2016-09-19T14:43:46.971Z| vmx| I120: [msg.snapshot.quiesce.timeout] Timed out while quiescing the virtual machine.
2016-09-19T14:43:46.971Z| vmx| I120: ----------------------------------------
2016-09-19T14:43:46.976Z| vmx| I120: ToolsBackup: changing quiesce state: STARTED -> DONE
2016-09-19T14:43:46.976Z| vmx| I120: SnapshotVMXTakeSnapshotComplete: Done with snapshot 'test_snap_vm_2': 0
2016-09-19T14:43:46.976Z| vmx| I120: SnapshotVMXTakeSnapshotComplete: Snapshot 0 failed: Failed to quiesce the virtual machine (40).

The interface (vSphere Client, vSphere Web Client) shows the corresponding error message, msg.snapshot.error-QUIESCINGERROR.

     The problem occurs regardless of which host the VM runs on, regardless of which tool creates the snapshot (vSphere Client, vSphere Web Client, or a NetBackup job), and regardless of whether we are connected to the host or to vCenter. Because the problem reproduces within VMware and the NetBackup timeout works correctly (more on that below), NetBackup was excluded from the diagnosis.
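
To check many vmware.log files for this signature, something like the following Python sketch might help. It is only an illustration: the pattern is derived from the excerpt above, the wording may differ between ESXi builds, and the function name `find_quiesce_failures` is made up for this example.

```python
import re

# Pattern derived from the vmware.log excerpt above (assumption: other
# ESXi builds may word these messages differently).
QUIESCE_FAILURE = re.compile(
    r"^(?P<timestamp>\S+)\|\s*vmx\|.*"
    r"(?P<reason>\[msg\.snapshot\.quiesce\.timeout\]"
    r"|Failed to quiesce the virtual machine)"
)


def find_quiesce_failures(lines):
    """Return (timestamp, reason) pairs for lines reporting a quiesce failure."""
    hits = []
    for line in lines:
        match = QUIESCE_FAILURE.search(line)
        if match:
            hits.append((match.group("timestamp"), match.group("reason")))
    return hits


sample = [
    "2016-09-19T14:43:46.971Z| vmx| I120: Msg_Post: Warning",
    "2016-09-19T14:43:46.971Z| vmx| I120: [msg.snapshot.quiesce.timeout] "
    "Timed out while quiescing the virtual machine.",
    "2016-09-19T14:43:46.976Z| vmx| I120: SnapshotVMXTakeSnapshotComplete: "
    "Snapshot 0 failed: Failed to quiesce the virtual machine (40).",
]

for timestamp, reason in find_quiesce_failures(sample):
    print(timestamp, reason)
```

In practice the `lines` argument would be an open vmware.log file object rather than the hard-coded sample.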

Resolution steps:

   1. To rule out a shortage of storage or other server resources, or errors in the pre/post scripts, a simple script was created that runs for a little over 15 minutes and reliably reproduces the problem. The script looks roughly like this:

echo start timer
ping -n 1500 127.0.0.1
echo stop timer

    After several tests it became clear that the problem lies entirely on the VMware side.
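
As a rough sanity check on the test script's duration, here is a small Python sketch. It assumes that Windows `ping -n N` against localhost waits about one second between echo requests, so the run lasts roughly N - 1 seconds; real timing will vary, and the helper names are invented for this illustration.

```python
def ping_script_seconds(echo_count, interval_seconds=1.0):
    """Approximate run time of `ping -n <echo_count>` against localhost.

    Assumption: one interval elapses between consecutive echo requests.
    """
    return max(echo_count - 1, 0) * interval_seconds


def fits_within(timeout_seconds, script_seconds):
    """True if the script would finish before the quiesce timeout expires."""
    return script_seconds < timeout_seconds


script = ping_script_seconds(1500)
# 900 s is the default 15-minute limit; 1500 and 2100 are the values
# mentioned in this article.
for timeout in (900, 1500, 2100):
    print(timeout, fits_within(timeout, script))
```

Under that assumption the script clearly exceeds the 15-minute limit and fits within 1500 seconds only marginally, so a larger value leaves more headroom.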

   2. Next come the actions described in the KB articles available on the VMware site. The main files to edit and their locations:


/etc/vmware/vpxa/vpxa.cfg (on each ESXi host)
C:\ProgramData\VMware\vCenterServer\cfg\vmware-vpx\vpxd.cfg (on the vCenter server)
C:\ProgramData\VMware\VMware Tools\tools.conf (inside each virtual machine)

   2a. Increase the timeout on vCenter:
https://kb.vmware.com/kb/1004790 describes a 15-minute timeout. Change the timeout to 1500 seconds in C:\ProgramData\VMware\vCenterServer\cfg\vmware-vpx\vpxd.cfg.
   2b. Increase the timeout for the ESX host as described in https://kb.vmware.com/kb/1017253. Reboot the host after making the changes.
   2c. Set the timeout in C:\ProgramData\VMware\VMware Tools\tools.conf as follows:

[vmbackup]

timeout=<count>
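
Step 2c could be scripted roughly as follows. This is a minimal sketch that treats tools.conf as an INI-style file and uses Python's standard `configparser`; the function name and the value 1800 are hypothetical, and note that `configparser` discards comments when it writes the file back.

```python
import configparser
import io


def set_vmbackup_timeout(conf_text, count):
    """Return tools.conf text with `timeout` set in the [vmbackup] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case as written (e.g. enableSyncDriver)
    parser.read_string(conf_text)
    if not parser.has_section("vmbackup"):
        parser.add_section("vmbackup")
    parser.set("vmbackup", "timeout", str(count))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Hypothetical existing content and a hypothetical timeout value.
print(set_vmbackup_timeout("[logging]\nlog=true\n", 1800))
```

The function works on text rather than on a file path, so it can be tried safely before touching the real tools.conf.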

   None of the publicly recommended steps above helped. After contacting technical support, we received an additional recommendation to edit the virtual machine's vmx file:

snapshot.quiesce.timeout = "<numSeconds>"

   The recommended value is 2100.

   This line must be added to the VMX file of every virtual machine. After that, snapshots worked correctly: quiescing runs for exactly as long as the timeout specifies, and the configured value is enough for the pre/post script to finish.
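
Since the line has to go into every VMX file, a small helper can make the edit repeatable. The Python sketch below is only an illustration: `set_vmx_option` is an invented name, it assumes the usual `key = "value"` line format, and it assumes keys can be matched case-insensitively. It does not cover how to get the VM to pick up the change (typically the VM has to be powered off or its configuration reloaded).

```python
import re


def set_vmx_option(vmx_text, key, value):
    """Return .vmx text with `key = "value"` set, replacing an existing entry."""
    entry = f'{key} = "{value}"'
    # Assumption: one option per line, keys matched case-insensitively.
    pattern = re.compile(
        rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.IGNORECASE | re.MULTILINE
    )
    if pattern.search(vmx_text):
        return pattern.sub(lambda _match: entry, vmx_text, count=1)
    separator = "" if (not vmx_text or vmx_text.endswith("\n")) else "\n"
    return f"{vmx_text}{separator}{entry}\n"


original = 'displayName = "test_vm"\n'  # hypothetical VMX content
print(set_vmx_option(original, "snapshot.quiesce.timeout", 2100))
```

Running it twice with different values leaves a single `snapshot.quiesce.timeout` entry, which avoids duplicate keys in the file.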

tommyv

Influencer
Posts: 19
Liked: 1 time
Joined: Dec 17, 2014 10:47 am
Full Name: Tom Vernon
Location: England

msg.snapshot.error-QUIESCINGERROR

Hi there,

I have a Linux SLES11 VM that has been backing up successfully via B&R for some time. Last week we added an application to the VM (GlusterFS, to be precise), and now every time a backup is performed it bombs out with the VMware error "msg.snapshot.error-QUIESCINGERROR". Disabling VMware Tools quiescence allows the backup to complete successfully. This is on vSphere 5.5 with the latest VMware Tools installed and B&R v8p2.

I'm just wondering if anyone else has come across similar issues when quiescing a Linux guest OS? All of the VMware KBs with similar errors point to Windows VSS issues, so they aren't helpful.

Thanks


veremin

Product Manager
Posts: 19894
Liked: 2153 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin

by veremin » May 26, 2015 9:28 am

Are you able to reproduce this issue by taking manual snapshot with quiescence enabled via vSphere Client?


by tommyv » May 26, 2015 11:07 am

v.Eremin wrote: Are you able to reproduce this issue by taking manual snapshot with quiescence enabled via vSphere Client?

It's a production box, so I can't test it. Just rigging up a test box now to see what happens.


by veremin » May 26, 2015 11:55 am

So, you can back up the VM in question, which also requires a snapshot to be created, but cannot create that snapshot manually? Tests conducted on a similar test VM might not necessarily reveal a VM-specific issue. Thanks.


by tommyv » May 26, 2015 2:17 pm

I forgot to mention that the quiesced snapshot causes Linux to crash and requires a hard reboot, which is why I'm now trying to recreate the issue on a test box.


by veremin » May 26, 2015 2:22 pm

So, if a quiesced snapshot taken outside of VB&R results in a VM crash, maybe it's time to contact the VMware support team and let them investigate it. Thanks.


by tommyv » May 27, 2015 8:09 am

I haven't tried a snapshot outside of VB&R; as I said, it's a production box, so I'm working on recreating a test environment. I haven't approached VMware support since they only offer break/fix assistance, not fault finding. I'm not really looking for any support, just curious if anyone else has had similar experiences and under what circumstances. Thanks.


foggy

Veeam Software
Posts: 20911
Liked: 2063 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

by foggy » May 27, 2015 11:03 am

You seem to be the first one to report such issue, so if you want to find the reasons and prevent this behavior, contacting support looks to be the most effective way of further action.


kirchk

Lurker
Posts: 1
Liked: never
Joined: May 28, 2015 2:21 pm

by kirchk » May 28, 2015 2:31 pm

I just wanted to chime in that I am also having this issue with SLES 11 VMware virtual machines while using a different backup solution. Attempting to create a snapshot within the vSphere Client with the quiesce option does not appear to reproduce the issue. I've been working with VMware technical support for close to two months to resolve this issue. Technical support suggested updating ESX 5.5 to the recently published Patch 6; however, QUIESCINGERRORs still occur seemingly at random after patching the hosts.

I’m curious to know too if anyone else has had this issue occur to them and if they were able to go about resolving it. I’m also curious to know if anyone hears anything differently from VMware technical support.


by tommyv » May 29, 2015 1:33 pm

Hi kirchk, thanks for the input. It's interesting that you are also seeing the same issue on SLES. Would you mind letting me know what applications and filesystems are running on the affected VMs?

On a side note, I have set up a test VM and been able to recreate the issue. Interestingly the issue occurs when performing a Veeam backup but not when taking a snapshot directly through the vSphere client.


by veremin » May 29, 2015 2:45 pm

Probably, I have misinterpreted your answer. If the issue is reproducible only while VB&R is present, please, open a case with our support team and let them analyze the problem directly. Thanks.


by foggy » Jun 01, 2015 2:02 pm

Guys, could you please check if the affected VMs have disk.enableUUID parameter set to true in the vmx file and if yes, try to backup with it set to false?

And yes, we would be grateful if you open cases with our support as well and provide their ID’s for reference. Thanks!
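
For anyone who wants to script that check across several VMs, here is a minimal Python sketch. It assumes the .vmx file uses `key = "value"` lines and that the key can be matched case-insensitively (the thread spells it both disk.enableUUID and disk.EnableUUID); `read_vmx_option` and the sample content are invented for illustration.

```python
import re


def read_vmx_option(vmx_text, key):
    """Return the value of `key` from .vmx text, or None when the key is absent."""
    pattern = re.compile(
        rf'^[ \t]*{re.escape(key)}[ \t]*=[ \t]*"?(?P<value>[^"\r\n]*)"?[ \t]*\r?$',
        re.IGNORECASE | re.MULTILINE,
    )
    match = pattern.search(vmx_text)
    return match.group("value") if match else None


# Hypothetical .vmx content.
sample_vmx = 'displayName = "sles11-test"\ndisk.EnableUUID = "TRUE"\n'
print(read_vmx_option(sample_vmx, "disk.enableUUID"))
```

A result of None means the parameter is not set in the file at all, which is a different case from an explicit "false".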


zerina

Novice
Posts: 3
Liked: never
Joined: Jun 02, 2015 7:39 am

by zerina » Jun 02, 2015 7:54 am

Hello,

I found this thread while searching for a similar problem.

In our case, a SLES 11 SP3 VM (fully patched) is also affected.

We are running vSphere 5.5 (fully patched), but we are NOT using Veeam; we use a different product (I am not sure whether naming it is allowed here).

Most vSphere backup solutions rely on VDDK, so the process is mostly the same. To me that means it is somehow a SLES (kernel) + vSphere VM Tools problem (probably VDDK related).

We activated the vmtools debug log; unfortunately it does not tell us anything useful (while the VM dies). The only thing I can guess is that the affected SLES VM runs a Sybase DB (SQL Anywhere), which performs some scheduled DB jobs.

Every time the VM dies during the backup, these jobs are running, so it looks like an overload on the VM (just a guess).

We are still searching for a solution.

Regards


by veremin » Jun 02, 2015 9:04 am

Just out of curiosity — does setting disk.enableUUID parameter to false allow your backup solution to proceed any further?


by zerina » Jun 02, 2015 9:23 am

disk.enableUUID is currently not in the .vmx file. What is the default value for disk.enableUUID?

Our VM does not die during every backup. I can try that setting, but I need a few days to prove that it helps.
I will test that and report back.


by tommyv » Jun 02, 2015 9:50 am

Interesting that a few people are having the same issues on SLES.

Foggy, v.Eremin, I've just tried a few test runs with disk.enableUUID=false and the backup went fine without any issues. When setting disk.enableUUID=true we encounter the error and the VM crashes. I imagine this setting gives us a crash-consistent backup, so is it the equivalent of having quiescence disabled?


by foggy » Jun 02, 2015 10:35 am

Right, "true" (default) means application-consistent backup and "false" means crash-consistent; however, the setting applies to Windows guests only, so no need to bother with Linux VMs. We are still investigating the issue.


by tommyv » Jun 02, 2015 10:51 am

foggy wrote: Right, "true" (default) means application-consistent backup and "false" means crash-consistent; however, the setting applies to Windows guests only, so no need to bother with Linux VMs. We are still investigating the issue.

Are you sure about that? The test shows very different results in Linux when toggling the flag. I can’t find much documentation around disk.EnableUUID but I’m guessing it enables ID disk presentation to the OS. Can anyone elaborate?


by zerina » Jun 17, 2015 8:20 am

In our case, the test failed: we had a crash this weekend during backup.
The test was run with the vmx setting disk.enableUUID = false.


Wocka

Enthusiast
Posts: 44
Liked: 7 times
Joined: Oct 01, 2014 12:04 am
Full Name: Warwick Ferguson

by Wocka » Jul 01, 2015 10:47 pm

Hello,

I'm also experiencing this issue in a hosted datacentre on 7 RedHat servers after upgrading VMware Tools to 9.0.5.21789 (build-1065307). Previously we were on open vmware tools 8.3.7.4937 (build-381511).

The hoster sent me this error:
An error occured while taking a snapshot: msg.snapshot.error-QUIESCINGERROR
An error occured while saving a snapshot: msg.snapshot.error-QUIESCINGERROR

Environment:
Vmware: 5.5 — I don’t know the patch level as it’s a hosted server
OS: RedHat Enterprise Linux Server release 6.6 (Santiago)
Backup Software: CommVault 10

Manual Tests I have performed with VMWare tools logging on:
(http://kb.vmware.com/selfservice/micros … Id=1007873 )
1. Snapshot through vCenter (no quiesce) = no error or crash
2. Snapshot through vCenter (with quiesce) = no error or crash
3. Snapshot through CommVault = no error or crash

I could not reproduce the error, but we are still experiencing it randomly across our 7 servers: at least one server per night, sometimes up to 3.
When the issue occurs, I can still ping the server, SSH to it, and enter root and my password, but the prompt never comes back after entering the password. A hard reset is the only way to resolve this.

Interesting, on our internal UAT environment we have not seen this issue:
Vmware: 5.1.0 no patches
OS: RedHat Enterprise Linux Server release 6.6 (Santiago)
All Yum Updates are the same between Dev, Test, UAT and Production
Backup Software: vRanger 6.1.0.35402 (still trying to get mgmt approval for my Veeam solution)

I have found the following articles for Linux; they talk about setting "disk.enableUUID=false", as previously mentioned.
http://kb.vmware.com/selfservice/micros … Id=2038606
http://kb.vmware.com/selfservice/micros … Id=2079220

I’m about to make these changes today.


by veremin » Jul 02, 2015 9:33 am

Kindly, keep us updated about the results you get. Based on the experience of previous posters, the mentioned setting should be a way to go, indeed.


by Wocka » Jul 07, 2015 1:00 am

A follow up from my post above.

After our hosting provider made the required changes to the .vmx files and I made the changes in the VMware Tools (tools.conf) file, we have not had a server become unresponsive during a snapshot for the past 5 nights. The above KB2079220 from VMware has resolved the issue.


by veremin » Jul 07, 2015 5:20 am

Thank you for confirming that setting disk.EnableUUID parameter in the .vmx file to false resolves the problem for you; much appreciated.


by tommyv » Jul 13, 2015 7:30 am

Thanks for the update Wocka. Interestingly KB2079220 claims to be fixed in ESXi5.5u2 however we still see the issue running on the latest version. Setting disk.EnableUUID to false or disabling VMware tools quiesce does work as a fix though.


jspeicher

Service Provider
Posts: 17
Liked: 1 time
Joined: May 09, 2013 7:49 pm
Full Name: Jason

by jspeicher » Sep 28, 2016 7:35 pm

We're seeing the same issue with vCenter/Veeam on ESXi 5.5.0 3568722.

We're getting random issues as described in this thread. We can keep retrying the jobs and eventually get the job to complete for all the VMs. Was any resolution found by anyone?


stuartmacgreen

Expert
Posts: 146
Liked: 33 times
Joined: May 01, 2012 11:56 am
Full Name: Stuart Green

by stuartmacgreen » Sep 29, 2016 1:15 pm

I had this issue and was given a fix by VMware. Here is the environment and versions I am running.

VM Guest OS: SUSE Linux Enterprise Server 11 SP4
ESXi: 5.5 U3
VirtualCenter Server: 5.5 U3e

The VMs were updated to VMware Tools 10.x from 9.x and got a load of VM backup failures with "msg.snapshot.error-QUIESCINGERROR".
The VMs would sometimes back up and sometimes fail. VERY inconsistent, but consistently failing on the v10.x tools compared to v9.x.

I reverted my VMs back to v9.x.
Logged a case with VMware to report the issue.
Several VMware Tools releases in the 10.x stream have since come out; all were applied to a sample VM, but it exhibited the same "msg.snapshot.error-QUIESCINGERROR" every now and again.

Now, these VMs had the following setting configured in the OS via /etc/vmware-tools/tools.conf:

[vmbackup]
enableSyncDriver = false

This was set on a recommendation from VMware for the v9.x tools, because without it the whole filesystem of the OS HUNG, which is even more severe than the current predicament of just not quiescing, where at least you get the OS back!

VMware recommended installing VMware Tools 10.x and setting the following in tools.conf:

[vmbackup]
enableSyncDriver = true

Now we are getting successful VM backups without any QUIESCE errors.



Source: Волчье логово / Ulvens Lair / Wolfshöhle / Wolfs Lair (cheat sheets and notes on networking, servers, storage systems, and IT in general), Tuesday, February 14, 2017, "Resolving the msg.snapshot.error-QUIESCINGERROR problem when creating VMware snapshots"

Vmware snapshot quiescing error

I'm using vCenter 5.5 with the vSphere Web Client to take a snapshot of a RHEL 5.4 32-bit VM on ESXi 5.5. When I choose "Quiesce the guest file system" instead of "snapshot the virtual machine memory", the snapshot creation fails at 90% with "An error occurred while taking a snapshot: Failed to quiesce the virtual machine".

I have no problem creating the snapshot with the "snapshot the virtual machine memory" option.

Your help is appreciated

It is a RHEL 5.4 kernel bug. Thanks everyone who helped.

As it turns out, we have two root partitions on the VM. If the VM is up and running from the first root partition (file system label /), there is no issue taking the snapshot, but if the VM is up on the second partition (/2), the snapshot fails and causes the VM to panic.

Re-installing and re-configuring VMware Tools on the second partition does not make any difference. When the snapshot approaches 95%, the VM panics.

Please check the virtual machine's usage and whether it is generating high I/O inside the OS.

Please check the below link if it helps you,

Just for information: RHEL, or any Linux or Unix VM, does not support VMware Tools quiescing.

A Unix/Linux file system does not quiesce the machine in a way that yields an application-consistent backup; the result is only a crash-consistent copy of the OS.

Thanks and Regards,

I'm not sure how you took a snapshot of the VM running from the first partition, since Linux doesn't support quiescing: quiescing requires VSS, so only machines with VSS components can be snapshotted with quiesce.

Let me provide more information. I was taking a VM backup of a RHEL 5.4 based VM using the vSphere Web Client. The backup failed due to the quiesce error. That is why I tried a manual snapshot using "Quiesce the guest file system" instead of "snapshot the virtual machine memory".

So the question now is: why does the EBR plug-in quiesce the RHEL VM if that is not supported?

The KB article provided by Krishant did not help. I cannot find the snapshot.redoNotWithParent parameter.

Here is the hostd.log of the ESXi:

2014-02-24T18:41:20.030Z [3D8C2B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection: Sysinfo error on operation returned status : Not initialized. Please see the VMkernel log for detailed error information
2014-02-24T18:41:20.030Z [3D8C2B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection. Turn on 'trivia' log for details
2014-02-24T18:41:22.812Z [3D881B70 verbose 'SoapAdapter'] Responded to service state request
2014-02-24T18:41:24.615Z [3D881B70 verbose 'Cimsvc'] Ticket issued for CIMOM version 1.0, user root
2014-02-24T18:41:25.710Z [3C240B70 verbose 'Hostsvc.DvsManager'] PersistAllDvsInfo called
2014-02-24T18:41:25.939Z [3D881B70 info 'Hostsvc' opID=hostd-9f65] VsanSystemVmkProvider : GetRuntimeInfo: Start
2014-02-24T18:41:25.939Z [3D881B70 info 'Hostsvc' opID=hostd-9f65] VsanSystemVmkProvider : GetRuntimeInfo: Complete, runtime info: (vim.vsan.host.VsanRuntimeInfo) <
2014-02-24T18:41:31.114Z [3BEC2B70 verbose 'Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx'] Updating current heartbeatStatus: yellow
2014-02-24T18:41:32.031Z [3D8C2B70 verbose 'SoapAdapter'] Responded to service state request
2014-02-24T18:41:35.396Z [FFDE4B70 verbose 'Default' opID=HB-host-516@6836-77a33b2-eb user=vpxuser] AdapterServer: target='vim.PerformanceManager:ha-perfmgr', method='GetPerfCounter'
2014-02-24T18:41:35.404Z [3D8C2B70 verbose 'Locale' opID=HB-host-516@6836-77a33b2-eb user=vpxuser] Default resource used for 'counter.vsanDomObj.writeThroughput.summary' expected in module 'perf'.
2014-02-24T18:41:35.423Z [FFDE4B70 verbose 'Default' opID=HB-host-516@6836-77a33b2-eb user=vpxuser] AdapterServer: target='vim.HostSystem:ha-host', method='retrieveInternalCapability'
2014-02-24T18:41:35.425Z [FFDE4B70 verbose 'Default' opID=HB-host-516@6836-77a33b2-eb user=vpxuser] AdapterServer: target='vim.PerformanceManager:ha-perfmgr', method='queryPerfCounterInt'
2014-02-24T18:41:35.437Z [FFDE4B70 verbose 'Default' opID=HB-host-516@6836-77a33b2-eb user=vpxuser] AdapterServer: target='vim.LicenseManager:ha-license-manager', method='GetLicenses'
2014-02-24T18:41:35.437Z [FFDE4B70 verbose 'Vimsvc.ha-license-manager' opID=HB-host-516@6836-77a33b2-eb user=vpxuser] Load: Loading existing file: /etc/vmware/license.cfg
2014-02-24T18:41:35.452Z [FFDE4B70 verbose 'Default' opID=HB-host-516@6836-77a33b2-eb user=vpxuser] ha-license-manager:Validate -> Valid license detected for "VMware ESX Server 5.0" (lastError=0, desc.IsValid:Yes)
2014-02-24T18:41:40.029Z [3D8C2B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection: Sysinfo error on operation returned status : Not initialized. Please see the VMkernel log for detailed error information
2014-02-24T18:41:40.030Z [3D8C2B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection. Turn on 'trivia' log for details
2014-02-24T18:41:45.422Z [3D881B70 verbose 'Default' opID=HB-host-516@6837-6cc68d7b-c6 user=vpxuser] AdapterServer: target='vim.PerformanceManager:ha-perfmgr', method='GetPerfCounter'
2014-02-24T18:41:45.430Z [3D881B70 verbose 'Locale' opID=HB-host-516@6837-6cc68d7b-c6 user=vpxuser] Default resource used for 'counter.vsanDomObj.writeThroughput.summary' expected in module 'perf'.
2014-02-24T18:41:45.452Z [3D881B70 verbose 'Default' opID=HB-host-516@6837-6cc68d7b-c6 user=vpxuser] AdapterServer: target='vim.HostSystem:ha-host', method='retrieveInternalCapability'
2014-02-24T18:41:45.454Z [3C240B70 verbose 'Default' opID=HB-host-516@6837-6cc68d7b-c6 user=vpxuser] AdapterServer: target='vim.PerformanceManager:ha-perfmgr', method='queryPerfCounterInt'
2014-02-24T18:41:45.466Z [FFDE4B70 verbose 'Default' opID=HB-host-516@6837-6cc68d7b-c6 user=vpxuser] AdapterServer: target='vim.LicenseManager:ha-license-manager', method='GetLicenses'
2014-02-24T18:41:45.466Z [3D881B70 verbose 'Vimsvc.ha-license-manager' opID=HB-host-516@6837-6cc68d7b-c6 user=vpxuser] Load: Loading existing file: /etc/vmware/license.cfg
2014-02-24T18:41:45.481Z [3D881B70 verbose 'Default' opID=HB-host-516@6837-6cc68d7b-c6 user=vpxuser] ha-license-manager:Validate -> Valid license detected for "VMware ESX Server 5.0" (lastError=0, desc.IsValid:Yes)
2014-02-24T18:41:51.393Z [3C240B70 verbose 'Default' opID=1381cd91-97 user=vpxuser] AdapterServer: target='vim.VirtualMachine:8', method='createSnapshot'
2014-02-24T18:41:51.394Z [3C240B70 info 'Vimsvc.TaskManager' opID=1381cd91-97 user=vpxuser] Task Created : haTask-8-vim.VirtualMachine.createSnapshot-228252373
2014-02-24T18:41:51.394Z [3D8C2B70 verbose 'Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx' opID=1381cd91-97 user=vpxuser] Create Snapshot: Avamar-1393267311a6d576b91e72d43fdb15e1a7fc3dea10e9a47796, memory=false, quiescent=true state=4
2014-02-24T18:41:51.394Z [3D8C2B70 info 'Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx' opID=1381cd91-97 user=vpxuser] State Transition (VM_STATE_ON -> VM_STATE_CREATE_SNAPSHOT)
2014-02-24T18:41:52.641Z [3BEC2B70 verbose 'Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx'] Handling message _vmx3: The CPU has been disabled by the guest operating system. Power off or reset the virtual machine.
2014-02-24T18:41:52.641Z [3BEC2B70 info 'Vimsvc.ha-eventmgr'] Event 243 : Message on slu003 on esxi14. in ha-datacenter: The CPU has been disabled by the guest operating system. Power off or reset the virtual machine.
2014-02-24T18:41:52.642Z [3BEC2B70 info 'Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx'] Answered question _vmx3
2014-02-24T18:41:52.813Z [3D8C2B70 verbose 'SoapAdapter'] Responded to service state request
2014-02-24T18:42:00.029Z [3D581B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection: Sysinfo error on operation returned status : Not initialized. Please see the VMkernel log for detailed error information
2014-02-24T18:42:00.029Z [3D581B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection. Turn on 'trivia' log for details
2014-02-24T18:42:01.116Z [3D8C2B70 verbose 'Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx'] Updating current heartbeatStatus: green
2014-02-24T18:42:02.033Z [3D581B70 verbose 'SoapAdapter'] Responded to service state request
2014-02-24T18:42:03.555Z [FFD815B0 verbose 'Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx'] Version status of tools changed to: 3
2014-02-24T18:42:05.401Z [3D581B70 verbose 'Default' opID=HB-host-516@6840-7c1e5c65-1e user=vpxuser] AdapterServer: target='vim.PerformanceManager:ha-perfmgr', method='GetPerfCounter'
2014-02-24T18:42:05.409Z [3D581B70 verbose 'Locale' opID=HB-host-516@6840-7c1e5c65-1e user=vpxuser] Default resource used for 'counter.vsanDomObj.writeThroughput.summary' expected in module 'perf'.
2014-02-24T18:42:05.428Z [3D581B70 verbose 'Default' opID=HB-host-516@6840-7c1e5c65-1e user=vpxuser] AdapterServer: target='vim.HostSystem:ha-host', method='retrieveInternalCapability'
2014-02-24T18:42:05.430Z [3D581B70 verbose 'Default' opID=HB-host-516@6840-7c1e5c65-1e user=vpxuser] AdapterServer: target='vim.PerformanceManager:ha-perfmgr', method='queryPerfCounterInt'
2014-02-24T18:42:05.441Z [3D581B70 verbose 'Default' opID=HB-host-516@6840-7c1e5c65-1e user=vpxuser] AdapterServer: target='vim.LicenseManager:ha-license-manager', method='GetLicenses'
2014-02-24T18:42:05.441Z [3D581B70 verbose 'Vimsvc.ha-license-manager' opID=HB-host-516@6840-7c1e5c65-1e user=vpxuser] Load: Loading existing file: /etc/vmware/license.cfg
2014-02-24T18:42:05.456Z [3D581B70 verbose 'Default' opID=HB-host-516@6840-7c1e5c65-1e user=vpxuser] ha-license-manager:Validate -> Valid license detected for "VMware ESX Server 5.0" (lastError=0, desc.IsValid:Yes)
2014-02-24T18:42:08.332Z [3C240B70 verbose 'Hostsvc.ResourcePool ha-root-pool'] Root pool capacity changed from 30597MHz/60241MB to 30597MHz/60239MB
2014-02-24T18:42:08.335Z [3D581B70 verbose 'Default' opID=SWI-dda157cf user=vpxuser] AdapterServer: target='vim.ResourcePool:ha-root-pool', method='GetConfig'

2014-02-24T18:42:08.336Z [3BEC2B70 verbose ‘Default’ opID=SWI-dda157cf user=vpxuser] AdapterServer: target=’vim.ResourcePool:ha-root-pool’, method=’GetName’

2014-02-24T18:42:15.402Z [3C240B70 verbose ‘Default’ opID=HB-host-516@6842-41f5ef53-39 user=vpxuser] AdapterServer: target=’vim.PerformanceManager:ha-perfmgr’, method=’GetPerfCounter’

2014-02-24T18:42:15.410Z [3C240B70 verbose ‘Locale’ opID=HB-host-516@6842-41f5ef53-39 user=vpxuser] Default resource used for ‘counter.vsanDomObj.writeThroughput.summary’ expected in module ‘perf’.

2014-02-24T18:42:15.429Z [3C240B70 verbose ‘Default’ opID=HB-host-516@6842-41f5ef53-39 user=vpxuser] AdapterServer: target=’vim.HostSystem:ha-host’, method=’retrieveInternalCapability’

2014-02-24T18:42:15.430Z [3C240B70 verbose ‘Default’ opID=HB-host-516@6842-41f5ef53-39 user=vpxuser] AdapterServer: target=’vim.PerformanceManager:ha-perfmgr’, method=’queryPerfCounterInt’

2014-02-24T18:42:15.442Z [3C240B70 verbose ‘Default’ opID=HB-host-516@6842-41f5ef53-39 user=vpxuser] AdapterServer: target=’vim.LicenseManager:ha-license-manager’, method=’GetLicenses’

2014-02-24T18:42:15.442Z [3C240B70 verbose ‘Vimsvc.ha-license-manager’ opID=HB-host-516@6842-41f5ef53-39 user=vpxuser] Load: Loading existing file: /etc/vmware/license.cfg

2014-02-24T18:42:15.457Z [3C240B70 verbose ‘Default’ opID=HB-host-516@6842-41f5ef53-39 user=vpxuser] ha-license-manager:Validate -> Valid license detected for «VMware ESX Server 5.0» (lastError=0, desc.IsValid:Yes)

2014-02-24T18:42:15.711Z [3C240B70 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Version status of tools changed to: 3

2014-02-24T18:42:18.569Z [3C5D2B70 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Create Snapshot translated error to vim.fault.GenericVmConfigFault

2014-02-24T18:42:18.569Z [3C5D2B70 info ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Create Snapshot failed: vim.fault.GenericVmConfigFault

2014-02-24T18:42:18.569Z [3C5D2B70 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Create Snapshot message: An error occurred while saving the snapshot: Failed to quiesce the virtual machine.

--> An error occurred while taking a snapshot: Failed to quiesce the virtual machine.

2014-02-24T18:42:18.569Z [3C240B70 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] guest operations are not ready

2014-02-24T18:42:18.569Z [3C240B70 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Version status of tools changed to: 3

2014-02-24T18:42:18.571Z [FFD815B0 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Tools are not auto-upgrade capable

2014-02-24T18:42:18.571Z [FFD815B0 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Skipped tools ManifestInfo update.

2014-02-24T18:42:18.574Z [3D8C2B70 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Time to gather Snapshot information ( read from disk, build tree): 2 msecs. needConsolidate is false.

2014-02-24T18:42:18.597Z [3D8C2B70 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Time to gather config: 22 (msecs)

2014-02-24T18:42:18.597Z [3D581B70 verbose ‘Hbrsvc’] Replicator: ReconfigListener triggered for config VM 8

2014-02-24T18:42:18.599Z [3D8C2B70 verbose ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Time to gather snapshot file layout: 0 (msecs)

2014-02-24T18:42:18.617Z [3D8C2B70 warning ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] Failed operation

2014-02-24T18:42:18.617Z [3D8C2B70 info ‘Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx’] State Transition (VM_STATE_CREATE_SNAPSHOT -> VM_STATE_ON)

2014-02-24T18:42:18.618Z [3D881B70 verbose ‘Hostsvc’] Received state change for VM ‘8’

2014-02-24T18:42:18.618Z [3D881B70 info ‘Guestsvc.GuestFileTransferImpl’] Entered VmPowerStateListener

2014-02-24T18:42:18.618Z [3D8C2B70 info ‘Vimsvc.TaskManager’] Task Completed : haTask-8-vim.VirtualMachine.createSnapshot-228252373 Status error

2014-02-24T18:42:18.618Z [3D881B70 info ‘Guestsvc.GuestFileTransferImpl’] VmPowerStateListener succeeded

2014-02-24T18:42:18.618Z [3D581B70 verbose ‘Hbrsvc’] Replicator: VmReconfig ignoring VM 8 not configured for replication

2014-02-24T18:42:18.618Z [3D881B70 info ‘Hbrsvc’] Replicator: powerstate change VM: 8 Old: 1 New: 1

2014-02-24T18:42:18.619Z [3D881B70 verbose ‘Hbrsvc’] Replicator: Remove group no matching entry for VM (id=8)

2014-02-24T18:42:20.029Z [3C240B70 verbose ‘Statssvc.vim.PerformanceManager’] HostCtl Exception in stats collection: Sysinfo error on operation returned status : Not initialized. Please see the VMkernel log for detailed error information

2014-02-24T18:42:20.030Z [3C240B70 verbose ‘Statssvc.vim.PerformanceManager’] HostCtl Exception in stats collection. Turn on ‘trivia’ log for details

2014-02-24T18:42:22.815Z [3BEC2B70 verbose ‘SoapAdapter’] Responded to service state request
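The excerpt above shows the createSnapshot task being created at 18:41:51.394 and failing with "Failed to quiesce the virtual machine" at 18:42:18.569. A minimal Python sketch for pulling that interval out of a hostd log follows. The function names are illustrative, and the marker strings are simply the ones visible in this excerpt; other ESXi versions may word them differently.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the excerpt, e.g. 2014-02-24T18:41:51.394Z
TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z")


def parse_ts(line):
    """Return the line's timestamp, or None if it has no timestamp prefix."""
    m = TS.match(line)
    return datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%f") if m else None


def quiesce_failure_delay(lines):
    """Seconds from createSnapshot task creation to 'Create Snapshot failed'.

    Returns None if either marker is missing.
    """
    started = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # blank lines and continuation lines such as "--> ..."
        if started is None and "Task Created" in line and "createSnapshot" in line:
            started = ts
        elif started is not None and "Create Snapshot failed" in line:
            return (ts - started).total_seconds()
    return None


# Two lines taken from the excerpt above (quotes normalized to ASCII).
sample = [
    "2014-02-24T18:41:51.394Z [3C240B70 info 'Vimsvc.TaskManager' "
    "opID=1381cd91-97 user=vpxuser] Task Created : "
    "haTask-8-vim.VirtualMachine.createSnapshot-228252373",
    "2014-02-24T18:42:18.569Z [3C5D2B70 info "
    "'Vmsvc.vm:/vmfs/volumes/d9b2943f-86ca1719/slu003/slu003.vmx'] "
    "Create Snapshot failed: vim.fault.GenericVmConfigFault",
]
delay = quiesce_failure_delay(sample)
```

For this excerpt the two markers are roughly 27 seconds apart, which gives a quick way to compare failed attempts across hosts or backup windows.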

Source

VMware backup failure with msg.snapshot.error-QUIESCINGERROR.

The VMware backup job failed with the error message "An error occurred while saving a snapshot: msg.snapshot.error-QUIESCINGERROR". I restarted the whole instance and, during the backup schedule, saw in the vSphere client that the same error appears while the VM snapshot is in progress. Please advise how to fix this issue, as we need valid backup data by the end of this weekend.

Re: VMware backup failure with msg.snapshot.error-QUIESCINGERROR.

The reported issue specifically affects Windows Server 2008 R2 and Windows Server 2008 SP2, where the Virtual Disk service does not start in the Windows guest operating system on the affected virtual machine. This is addressed in VMware KB articles.

To resolve this issue, start the Windows Virtual Disk service by following these steps:

  • Log in to the Windows operating system as an Administrator.
  • Click Start, type services.msc, and press Enter.
  • Right-click the Virtual Disk service and click Start.
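The same steps can also be run from a script inside the guest. The sketch below is a hedged illustration, not part of the original answer: it assumes the Virtual Disk service has the short name `vds`, that the script runs in an elevated session, and that `sc query` prints its output in English. The function names are hypothetical.

```python
import re
import subprocess

SERVICE = "vds"  # assumed short name of the Virtual Disk service


def parse_sc_state(output):
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    m = re.search(r"STATE\s*:\s*\d+\s+(\w+)", output)
    return m.group(1) if m else None


def ensure_running(service=SERVICE):
    """Start the service if `sc query` does not report it as RUNNING.

    Returns the state observed before any start attempt.
    """
    query = subprocess.run(["sc", "query", service],
                           capture_output=True, text=True)
    state = parse_sc_state(query.stdout)
    if state != "RUNNING":
        # Raises CalledProcessError if the service cannot be started,
        # for example when it is disabled.
        subprocess.run(["sc", "start", service], check=True)
    return state


# Illustrative `sc query` output for a stopped service.
sample_output = """SERVICE_NAME: vds
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 0  (0x0)
"""
observed = parse_sc_state(sample_output)
```

Calling `ensure_running()` on the affected guest before the backup window would mirror the manual steps; if the service is set to Disabled, the start attempt fails and the startup type has to be changed first.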

Now try scheduling the problematic backup job again, check whether the reported issue is fixed, and update us.

You can determine whether the backup failure is due to the Windows Virtual Disk service by checking for the symptoms below.

  • You cannot create a quiesced snapshot of the problematic virtual machine.
  • The snapshot operation fails at approximately 30%.
  • In the vSphere Client, you see an error similar to the following:

    An error occurred while saving the snapshot: msg.snapshot.error-QUIESCINGERROR

  • You can see the following entries in the vmware.log file, located at /vmfs/volumes/datastore/Affected_VM/.

    The error message was: The service cannot be started, either because it is disabled or because it has no enabled devices associated with it.
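Checking vmware.log for that message can be automated. The following is a small sketch under stated assumptions: the search string is the message quoted above, while the function names and the first sample line are hypothetical. The log path is left to the caller because the location given above is only a placeholder.

```python
SYMPTOM = ("The service cannot be started, either because it is disabled "
           "or because it has no enabled devices associated with it")


def find_symptom(lines, needle=SYMPTOM):
    """Return 1-based line numbers of log lines containing the symptom text."""
    return [n for n, line in enumerate(lines, start=1) if needle in line]


def scan_file(path):
    """Scan a vmware.log file; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return find_symptom(fh)


# The first line is a made-up placeholder; the second is the message from the post.
sample = [
    "unrelated log line",
    "The error message was: The service cannot be started, either because "
    "it is disabled or because it has no enabled devices associated with it.",
]
hits = find_symptom(sample)
```

An empty result does not rule out other quiescing problems; it only means this particular Virtual Disk service symptom was not logged.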

  • Question

  • Hi,

    We use CommVault Simpana 10 to back up our VMs. For the last two weeks, two virtual machines can no longer be backed up. I always see this error in the vSphere client:
    An error occurred while taking a snapshot: msg.snapshot.error-QUIESCINGERROR.
    An error occurred while saving the snapshot: msg.snapshot.error-QUIESCINGERROR.

    In CommVault I see this error:
    Unable to quiesce guest file system during snapshot creation

    In event viewer I see this when the backup starts:

    I already installed the latest VMware updates on the server and the latest VMware Tools on the VMs.

    I also checked this KB article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2069952

    How can I fix this problem?

Answers

  • Hi StijnS,

    Thanks for your post.

    Your error seems to be VMware-related. According to EV100450 (VMware KB: 2006849), you may experience this type of problem when using the new version of VMware Tools in ESXi/ESX 4.1 Update 1 or Update 2 or ESXi 5.x. I suggest you post in the VMware forum for more support on this.

    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2006849

    https://communities.vmware.com/thread/309844?start=0&tstart=0

    Please Note: Since the web site is not hosted by Microsoft, the link may change without notice. Microsoft does not guarantee the accuracy of this information.

    Besides, since this is a Windows Server backup forum, we mainly focus on backing up Windows Server with the Windows Server Backup tool from Microsoft. For third-party software, you could contact the software vendor for more support.

    Thanks for your support and understanding.

    Best Regards,

    Mary Dong



    • Proposed as answer by

      Friday, January 1, 2016 7:55 AM

    • Marked as answer by
      Mary Dong
      Monday, January 4, 2016 9:07 AM
