Bug 1099375
Summary: | docker fail to start when dm metadata are corrupted | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Lukáš Doktor <ldoktor> |
Component: | kernel | Assignee: | LVM and device-mapper development team <lvm-team> |
kernel sub component: | Thin Provisioning | QA Contact: | Storage QE <storage-qe> |
Status: | CLOSED WONTFIX | Docs Contact: | |
Severity: | unspecified | ||
Priority: | unspecified | CC: | agk, dwalsh, mclasen, msnitzer, thornber, vgoyal |
Version: | 7.0 | ||
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-04-01 16:17:45 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Description
Lukáš Doktor
2014-05-20 07:09:45 UTC
The output of /var/log/messages:

May 20 08:47:38 t530 kernel: [ 227.203463] loop: module loaded
May 20 08:47:38 t530 kernel: loop: module loaded
May 20 08:47:38 t530 kernel: [ 227.235138] device-mapper: space map common: index_check failed: csum 2396624857 != wanted 2396666900
May 20 08:47:38 t530 kernel: [ 227.235163] device-mapper: block manager: index validator check failed for block 1054
May 20 08:47:38 t530 kernel: [ 227.235165] device-mapper: transaction manager: couldn't open metadata space map
May 20 08:47:38 t530 kernel: [ 227.235170] device-mapper: thin metadata: tm_open_with_sm failed
May 20 08:47:38 t530 kernel: device-mapper: space map common: index_check failed: csum 2396624857 != wanted 2396666900
May 20 08:47:38 t530 kernel: device-mapper: block manager: index validator check failed for block 1054
May 20 08:47:38 t530 kernel: device-mapper: transaction manager: couldn't open metadata space map
May 20 08:47:38 t530 kernel: device-mapper: thin metadata: tm_open_with_sm failed
May 20 08:47:38 t530 kernel: [ 227.239388] device-mapper: table: 253:4: thin-pool: Error creating metadata object
May 20 08:47:38 t530 kernel: [ 227.239399] device-mapper: ioctl: error adding target to table
May 20 08:47:38 t530 kernel: device-mapper: table: 253:4: thin-pool: Error creating metadata object
May 20 08:47:38 t530 kernel: device-mapper: ioctl: error adding target to table

I don't see what docker could do in this situation. It seems like the error is in the kernel. Dunno how that could have worked before... Should we reassign to the kernel?

I tested this with the same kernel using two docker versions. The older version just logged a message about this error and proceeded; the newer version logged the same error and stopped. So apparently it can be worked around...
moving docker bugs off alexl

On docker-1.3, if you corrupt the metadata file, it fails with the following:

```
vbatts@valse ~ (master) $ sudo docker -d -D -g /home/docker
2014/10/21 12:44:39 docker daemon: 1.3.0 c78088f; execdriver: native; graphdriver:
[ab4be98f] +job serveapi(unix:///var/run/docker.sock)
[debug] deviceset.go:565 Generated prefix: docker-253:2-4980740
[debug] deviceset.go:568 Checking for existence of the pool 'docker-253:2-4980740-pool'
[debug] deviceset.go:587 Pool doesn't exist. Creating it.
[debug] deviceset.go:455 libdevmapper(3): ioctl/libdm-iface.c:1769 (-1) device-mapper: reload ioctl on docker-253:2-4980740-pool failed: Invalid or incomplete multibyte or wide character
2014/10/21 12:44:39 Error running DeviceCreate (createPool) dm_task_run failed
```

I don't think the solution is to fall back to the 'vfs' graph driver, but perhaps to have clearer output that the metadata is corrupt.

After chatting with Vivek, I don't think this is an issue any longer. Can you reproduce with a recent version of docker?

Hello Vincent, I'm not sure about RHEL, but on F25 it still fails. My steps are:

$ dd of=/var/lib/docker/devicemapper/devicemapper/metadata if=/dev/urandom count=1 bs=5 skip=10
$ systemctl start docker
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
$ journalctl -xe
...
bře 14 12:27:40 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=docker-stor
bře 14 12:27:40 localhost.localdomain systemd[1]: docker-storage-setup.service: Unit entered failed state.
bře 14 12:27:40 localhost.localdomain systemd[1]: docker-storage-setup.service: Failed with result 'exit-code'.
bře 14 12:27:40 localhost.localdomain systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has begun starting up.
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: space map common: index_check failed: blocknr 0 != wanted 148
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: block manager: index validator check failed for block 148
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: transaction manager: couldn't open metadata space map
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: thin metadata: tm_open_with_sm failed
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: table: 253:8: thin-pool: Error creating metadata object
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: ioctl: error adding target to table
bře 14 12:27:40 localhost.localdomain dockerd-current[30313]: time="2017-03-14T12:27:40.742875167+01:00" level=error msg="[graphdriver] prior storage driver \"
bře 14 12:27:40 localhost.localdomain dockerd-current[30313]: time="2017-03-14T12:27:40.742954033+01:00" level=fatal msg="Error starting daemon: error initiali
bře 14 12:27:40 localhost.localdomain systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
bře 14 12:27:40 localhost.localdomain systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has failed.
--
-- The result is failed.
bře 14 12:27:40 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=docker comm
bře 14 12:27:40 localhost.localdomain systemd[1]: docker.service: Unit entered failed state.
bře 14 12:27:40 localhost.localdomain systemd[1]: docker.service: Failed with result 'exit-code'.
I have F25 with docker-1.12.6-6.gitae7d637.fc25.x86_64

Is this still an issue?

Vivek, do you know whether this even applies to containers/storage now?

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
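
One comment above suggests that clearer output about corrupt metadata would help more than falling back to another graph driver. As a rough illustration only, the sketch below scans log text for the device-mapper message fragments that appear in both the 2014 and 2017 logs in this report. The function name `find_thin_metadata_errors` and the `SIGNATURES` tuple are hypothetical, and treating these fragments as a corruption signature is an assumption of the sketch: it is a heuristic, not something docker or the kernel defines, and the same messages might have other causes.

```python
# Message fragments copied from the kernel logs quoted in this report.
# Treating them as a "corrupt thin-pool metadata" signature is an assumption
# of this sketch, not a rule defined by docker or the kernel.
SIGNATURES = (
    "space map common: index_check failed",
    "block manager: index validator check failed",
    "transaction manager: couldn't open metadata space map",
    "thin metadata: tm_open_with_sm failed",
    "thin-pool: Error creating metadata object",
)


def find_thin_metadata_errors(log_text):
    """Return the signature fragments found in log_text, in order of first appearance."""
    found = []
    for line in log_text.splitlines():
        # Only consider lines emitted by the device-mapper subsystem.
        if "device-mapper:" not in line:
            continue
        for sig in SIGNATURES:
            if sig in line and sig not in found:
                found.append(sig)
    return found


if __name__ == "__main__":
    # Two lines taken from the /var/log/messages excerpt in the description.
    sample = "\n".join([
        "May 20 08:47:38 t530 kernel: loop: module loaded",
        "May 20 08:47:38 t530 kernel: device-mapper: thin metadata: tm_open_with_sm failed",
    ])
    for sig in find_thin_metadata_errors(sample):
        print("possible corrupt thin-pool metadata:", sig)
```

A caller could run this over `journalctl -k` or `/var/log/messages` output after a failed start and print a single hint when the list is non-empty, rather than leaving the user to interpret the raw ioctl failure.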