Bug 1099375 - docker fails to start when dm metadata is corrupted
Summary: docker fails to start when dm metadata is corrupted
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
: ---
Assignee: LVM and device-mapper development team
QA Contact: Storage QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-05-20 07:09 UTC by Lukáš Doktor
Modified: 2023-09-14 23:57 UTC
6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-01 16:17:45 UTC
Target Upstream Version:
Embargoed:


Attachments

Description Lukáš Doktor 2014-05-20 07:09:45 UTC
Description of problem:
Hello guys, today after a docker daemon crash I had problems getting it working again. The problem was that the metadata (/var/lib/docker/devicemapper/devicemapper/metadata) was corrupted. An older version of docker seems to ignore/work around this and works perfectly, but the latest version of docker simply stops.

Version-Release number of selected component (if applicable):
docker-0.11.1-2.el7.x86_64

How reproducible:
On metadata corruption

Steps to Reproduce:
1. corrupt the metadata (e.g. hexedit /var/lib/docker/devicemapper/devicemapper/metadata and overwrite the first few characters)
2. docker -D -d
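The corruption step above can be sketched safely against a scratch file instead of the live metadata; the path, file size, and byte values below are illustrative only:

```shell
# Sketch: reproduce the byte-level corruption on a scratch copy rather
# than the real /var/lib/docker/devicemapper/devicemapper/metadata.
meta=$(mktemp)
head -c 4096 /dev/zero > "$meta"        # stand-in for the metadata file
before=$(cksum "$meta" | awk '{print $1}')
# Overwrite the first few bytes in place, as hexedit would:
printf 'XXXX' | dd of="$meta" bs=1 count=4 conv=notrunc 2>/dev/null
after=$(cksum "$meta" | awk '{print $1}')
echo "checksum before=$before after=$after"
```

Any change to those leading bytes invalidates the on-disk checksums, which is what the thin-pool metadata validator then rejects when docker tries to (re)create the pool.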

Actual results:
2014/05/20 08:55:40 docker daemon: 0.11.1-dev abf8689/0.11.1; execdriver: native; graphdriver: 
[47686eaa] +job serveapi(unix:///var/run/docker.sock)
[47686eaa] +job initserver()
[47686eaa.initserver()] Creating server
[debug] deviceset.go:527 Generated prefix: docker-253:3-4210871
[debug] deviceset.go:530 Checking for existence of the pool 'docker-253:3-4210871-pool'
[debug] deviceset.go:549 Pool doesn't exist. Creating it.
[debug] daemon.go:747 Using graph driver devicemapper
[debug] daemon.go:764 Creating images graph
[debug] graph.go:65 Restored 34 elements
[debug] daemon.go:776 Creating volumes graph
[debug] graph.go:65 Restored 3 elements
[debug] daemon.go:781 Creating repository list
[47686eaa] +job init_networkdriver()
[DEBUG] [iptables]: /sbin/iptables, [-C POSTROUTING -t nat -s 172.17.42.1/16 ! -d 172.17.42.1/16 -j MASQUERADE]
2014/05/20 08:55:40 Listening for HTTP on unix (/var/run/docker.sock)
[debug] server.go:1095 Registering GET, /events
[debug] server.go:1095 Registering GET, /images/search
[debug] server.go:1095 Registering GET, /images/{name:.*}/history
[debug] server.go:1095 Registering GET, /containers/{name:.*}/json
[debug] server.go:1095 Registering GET, /containers/{name:.*}/top
[debug] server.go:1095 Registering GET, /_ping
[debug] server.go:1095 Registering GET, /info
[debug] server.go:1095 Registering GET, /version
[debug] server.go:1095 Registering GET, /images/{name:.*}/get
[debug] server.go:1095 Registering GET, /containers/ps
[debug] server.go:1095 Registering GET, /containers/{name:.*}/export
[debug] server.go:1095 Registering GET, /containers/{name:.*}/attach/ws
[debug] server.go:1095 Registering GET, /images/{name:.*}/json
[debug] server.go:1095 Registering GET, /containers/{name:.*}/logs
[debug] server.go:1095 Registering GET, /images/json
[debug] server.go:1095 Registering GET, /images/viz
[debug] server.go:1095 Registering GET, /containers/json
[debug] server.go:1095 Registering GET, /containers/{name:.*}/changes
[debug] server.go:1095 Registering POST, /images/create
[debug] server.go:1095 Registering POST, /containers/{name:.*}/kill
[debug] server.go:1095 Registering POST, /containers/{name:.*}/copy
[debug] server.go:1095 Registering POST, /commit
[debug] server.go:1095 Registering POST, /images/{name:.*}/tag
[debug] server.go:1095 Registering POST, /containers/{name:.*}/restart
[debug] server.go:1095 Registering POST, /containers/{name:.*}/wait
[debug] server.go:1095 Registering POST, /build
[debug] server.go:1095 Registering POST, /images/{name:.*}/push
[debug] server.go:1095 Registering POST, /containers/{name:.*}/start
[debug] server.go:1095 Registering POST, /containers/{name:.*}/resize
[debug] server.go:1095 Registering POST, /auth
[debug] server.go:1095 Registering POST, /images/{name:.*}/insert
[debug] server.go:1095 Registering POST, /images/load
[debug] server.go:1095 Registering POST, /containers/create
[debug] server.go:1095 Registering POST, /containers/{name:.*}/stop
[debug] server.go:1095 Registering POST, /containers/{name:.*}/attach
[debug] server.go:1095 Registering DELETE, /containers/{name:.*}
[debug] server.go:1095 Registering DELETE, /images/{name:.*}
[debug] server.go:1095 Registering OPTIONS, 
[debug] server.go:1185 docker group found. gid: 985
[DEBUG] [iptables]: /sbin/iptables, [-D FORWARD -i docker0 -o docker0 -j DROP]
[DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -i docker0 -o docker0 -j ACCEPT]
[DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -i docker0 ! -o docker0 -j ACCEPT]
[DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D PREROUTING -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -F DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -X DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -N DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -A OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER]
[47686eaa] -job init_networkdriver() = OK (0)
[47686eaa.initserver()] Creating pidfile
[47686eaa.initserver()] Setting up signal traps
[47686eaa] -job initserver() = OK (0)
[47686eaa] +job acceptconnections()
[47686eaa] -job acceptconnections() = OK (0)
^C2014/05/20 08:55:53 Received signal 'interrupt', starting shutdown of docker...
[debug] daemon.go:861 starting clean shutdown of all containers...
[debug] deviceset.go:832 [deviceset docker-253:3-4210871] shutdown()
[debug] deviceset.go:833 [devmapper] Shutting down DeviceSet: /home/medic/Work/Projekty/Docker/root/devicemapper
[debug] deviceset.go:719 [devmapper] deactivateDevice()
[debug] deviceset.go:815 Waiting for unmount of : opencount=0
[debug] deviceset.go:741 [devmapper] deactivateDevice END
[debug] deviceset.go:703 [devmapper] deactivatePool()
[debug] devmapper.go:525 [devmapper] removeDevice START
[debug] devmapper.go:539 [devmapper] removeDevice END
[debug] deviceset.go:716 [devmapper] deactivatePool END
[debug] deviceset.go:881 [deviceset docker-253:3-4210871] shutdown END


Expected results (obtained from upstream "Docker version 0.10.0, build dc9c28f"):
[root@t530 ~ ]$ docker-upstream -d -D | tee /tmp/docker.log
[/home/medic/Work/Projekty/Docker/root|262a1674] +job serveapi(unix:///var/run/docker.sock)
[/home/medic/Work/Projekty/Docker/root|262a1674] +job initserver()
[/home/medic/Work/Projekty/Docker/root|262a1674.initserver()] Creating server
2014/05/20 08:59:26 Listening for HTTP on unix (/var/run/docker.sock)
[debug] driver.go:78 Error loading driver aufs: AUFS was not found in /proc/filesystems
[debug] driver.go:78 Error loading driver btrfs: /home/medic/Work/Projekty/Docker/root is not a btrfs filesystem
[debug] deviceset.go:528 Generated prefix: docker-253:3-4210871
[debug] deviceset.go:531 Checking for existence of the pool 'docker-253:3-4210871-pool'
[debug] deviceset.go:545 Pool doesn't exist. Creating it.
[debug] deviceset.go:408 libdevmapper(3): ioctl/libdm-iface.c:1768 (-1) device-mapper: reload ioctl on docker-253:3-4210871-pool failed: Invalid or incomplete multibyte or wide character
[debug] deviceset.go:562 
--->Err: Error running DeviceCreate (createPool)

[debug] driver.go:78 Error loading driver devicemapper: Error running DeviceCreate (createPool)
[debug] runtime.go:690 Using graph driver vfs
[debug] runtime.go:707 Creating images graph
[debug] graph.go:63 Restored 34 elements
[debug] runtime.go:719 Creating volumes graph
[debug] graph.go:63 Restored 3 elements
[debug] runtime.go:724 Creating repository list
[/home/medic/Work/Projekty/Docker/root|262a1674] +job init_networkdriver()
[DEBUG] [iptables]: /sbin/iptables, [-C POSTROUTING -t nat -s 172.17.42.1/16 ! -d 172.17.42.1/16 -j MASQUERADE]
[DEBUG] [iptables]: /sbin/iptables, [-D FORWARD -i docker0 -o docker0 -j DROP]
[DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -i docker0 -o docker0 -j ACCEPT]
[debug] server.go:1048 Registering GET, /events
[debug] server.go:1048 Registering GET, /containers/{name:.*}/top
[debug] server.go:1048 Registering GET, /containers/{name:.*}/attach/ws
[debug] server.go:1048 Registering GET, /images/json
[debug] server.go:1048 Registering GET, /images/{name:.*}/history
[debug] server.go:1048 Registering GET, /containers/ps
[debug] server.go:1048 Registering GET, /containers/json
[debug] server.go:1048 Registering GET, /containers/{name:.*}/json
[debug] server.go:1048 Registering GET, /images/viz
[debug] server.go:1048 Registering GET, /images/{name:.*}/json
[debug] server.go:1048 Registering GET, /info
[debug] server.go:1048 Registering GET, /version
[debug] server.go:1048 Registering GET, /images/search
[debug] server.go:1048 Registering GET, /images/{name:.*}/get
[debug] server.go:1048 Registering GET, /containers/{name:.*}/export
[debug] server.go:1048 Registering GET, /containers/{name:.*}/changes
[debug] server.go:1048 Registering POST, /auth
[debug] server.go:1048 Registering POST, /images/create
[debug] server.go:1048 Registering POST, /images/{name:.*}/tag
[debug] server.go:1048 Registering POST, /containers/{name:.*}/restart
[debug] server.go:1048 Registering POST, /containers/{name:.*}/start
[debug] server.go:1048 Registering POST, /containers/{name:.*}/copy
[debug] server.go:1048 Registering POST, /build
[debug] server.go:1048 Registering POST, /containers/{name:.*}/stop
[debug] server.go:1048 Registering POST, /commit
[debug] server.go:1048 Registering POST, /containers/{name:.*}/kill
[debug] server.go:1048 Registering POST, /containers/{name:.*}/wait
[debug] server.go:1048 Registering POST, /images/{name:.*}/insert
[debug] server.go:1048 Registering POST, /images/load
[debug] server.go:1048 Registering POST, /images/{name:.*}/push
[debug] server.go:1048 Registering POST, /containers/create
[debug] server.go:1048 Registering POST, /containers/{name:.*}/resize
[debug] server.go:1048 Registering POST, /containers/{name:.*}/attach
[debug] server.go:1048 Registering DELETE, /containers/{name:.*}
[debug] server.go:1048 Registering DELETE, /images/{name:.*}
[debug] server.go:1048 Registering OPTIONS, 
[debug] server.go:1138 docker group found. gid: 985
[DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -i docker0 ! -o docker0 -j ACCEPT]
[DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D PREROUTING -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -F DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -X DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -N DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]
[DEBUG] [iptables]: /sbin/iptables, [-t nat -A OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER]
[/home/medic/Work/Projekty/Docker/root|262a1674] -job init_networkdriver() = OK (0)
[/home/medic/Work/Projekty/Docker/root|262a1674.initserver()] Creating pidfile
[/home/medic/Work/Projekty/Docker/root|262a1674.initserver()] Setting up signal traps
[/home/medic/Work/Projekty/Docker/root|262a1674] -job initserver() = OK (0)
[/home/medic/Work/Projekty/Docker/root|262a1674] +job acceptconnections()
[/home/medic/Work/Projekty/Docker/root|262a1674] -job acceptconnections() = OK (0)
# Docker is up and running

Additional info:
The workaround is simple: just rm -rf /var/lib/docker/devicemapper/ ...
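For reference, the workaround amounts to stopping the daemon and discarding the devicemapper state so docker recreates the pool on the next start. The sketch below exercises the removal against a throwaway directory; on a real host the root is /var/lib/docker and the commented systemctl lines apply:

```shell
# Sketch of the workaround on a throwaway directory. On a real host this
# is destructive: it deletes all devicemapper-backed images and containers.
DOCKER_ROOT=$(mktemp -d)                # stand-in for /var/lib/docker
mkdir -p "$DOCKER_ROOT/devicemapper/devicemapper"
touch "$DOCKER_ROOT/devicemapper/devicemapper/metadata"
# systemctl stop docker                 # on a real host, stop the daemon first
rm -rf "$DOCKER_ROOT/devicemapper"      # discard the corrupted pool metadata
# systemctl start docker                # docker recreates the pool from scratch
echo "devicemapper state removed from $DOCKER_ROOT"
```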

Comment 1 Lukáš Doktor 2014-05-20 07:12:35 UTC
The output of /var/log/messages:
May 20 08:47:38 t530 kernel: [  227.203463] loop: module loaded
May 20 08:47:38 t530 kernel: loop: module loaded
May 20 08:47:38 t530 kernel: [  227.235138] device-mapper: space map common: index_check failed: csum 2396624857 != wanted 2396666900
May 20 08:47:38 t530 kernel: [  227.235163] device-mapper: block manager: index validator check failed for block 1054
May 20 08:47:38 t530 kernel: [  227.235165] device-mapper: transaction manager: couldn't open metadata space map
May 20 08:47:38 t530 kernel: [  227.235170] device-mapper: thin metadata: tm_open_with_sm failed
May 20 08:47:38 t530 kernel: device-mapper: space map common: index_check failed: csum 2396624857 != wanted 2396666900
May 20 08:47:38 t530 kernel: device-mapper: block manager: index validator check failed for block 1054
May 20 08:47:38 t530 kernel: device-mapper: transaction manager: couldn't open metadata space map
May 20 08:47:38 t530 kernel: device-mapper: thin metadata: tm_open_with_sm failed
May 20 08:47:38 t530 kernel: [  227.239388] device-mapper: table: 253:4: thin-pool: Error creating metadata object
May 20 08:47:38 t530 kernel: [  227.239399] device-mapper: ioctl: error adding target to table
May 20 08:47:38 t530 kernel: device-mapper: table: 253:4: thin-pool: Error creating metadata object
May 20 08:47:38 t530 kernel: device-mapper: ioctl: error adding target to table

Comment 3 Alexander Larsson 2014-06-18 13:14:05 UTC
I don't see what docker could do in this situation. The error seems to be in the kernel. I don't know how that could have worked before...

Comment 4 Daniel Walsh 2014-06-18 14:00:58 UTC
Should we reassign to the kernel?

Comment 5 Lukáš Doktor 2014-06-30 14:02:27 UTC
I tested this with the same kernel using two docker versions. The older version just logged a message about this error and proceeded; the newer version logged the same error and stopped. So apparently it can be worked around...

Comment 6 Matthias Clasen 2014-09-30 15:00:14 UTC
moving docker bugs off alexl

Comment 7 Vincent Batts 2014-10-21 16:48:05 UTC
On docker-1.3 if you corrupt the metadata file, then it fails with the following

```
vbatts@valse ~ (master) $ sudo docker -d -D -g /home/docker
2014/10/21 12:44:39 docker daemon: 1.3.0 c78088f; execdriver: native; graphdriver: 
[ab4be98f] +job serveapi(unix:///var/run/docker.sock)
[debug] deviceset.go:565 Generated prefix: docker-253:2-4980740
[debug] deviceset.go:568 Checking for existence of the pool 'docker-253:2-4980740-pool'
[debug] deviceset.go:587 Pool doesn't exist. Creating it.
[debug] deviceset.go:455 libdevmapper(3): ioctl/libdm-iface.c:1769 (-1) device-mapper: reload ioctl on docker-253:2-4980740-pool failed: Invalid or incomplete multibyte or wide character
2014/10/21 12:44:39 Error running DeviceCreate (createPool) dm_task_run failed
```


I don't think the solution is to fall back to the 'vfs' graph driver, but perhaps the output should state more clearly that the metadata is corrupt.

Comment 9 Vincent Batts 2017-03-13 13:22:15 UTC
After chatting with Vivek, I don't think this is an issue any longer. Can you reproduce it with a recent version of docker?

Comment 10 Lukáš Doktor 2017-03-14 11:30:33 UTC
Hello Vincent,

I'm not sure about RHEL, but on F25 it still fails. My steps are:

$ dd of=/var/lib/docker/devicemapper/devicemapper/metadata if=/dev/urandom count=1 bs=5 skip=10
$ systemctl start docker
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.
$ journalctl -xe
...
bře 14 12:27:40 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=docker-stor
bře 14 12:27:40 localhost.localdomain systemd[1]: docker-storage-setup.service: Unit entered failed state.
bře 14 12:27:40 localhost.localdomain systemd[1]: docker-storage-setup.service: Failed with result 'exit-code'.
bře 14 12:27:40 localhost.localdomain systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit docker.service has begun starting up.
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: space map common: index_check failed: blocknr 0 != wanted 148
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: block manager: index validator check failed for block 148
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: transaction manager: couldn't open metadata space map
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: thin metadata: tm_open_with_sm failed
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: table: 253:8: thin-pool: Error creating metadata object
bře 14 12:27:40 localhost.localdomain kernel: device-mapper: ioctl: error adding target to table
bře 14 12:27:40 localhost.localdomain dockerd-current[30313]: time="2017-03-14T12:27:40.742875167+01:00" level=error msg="[graphdriver] prior storage driver \"
bře 14 12:27:40 localhost.localdomain dockerd-current[30313]: time="2017-03-14T12:27:40.742954033+01:00" level=fatal msg="Error starting daemon: error initiali
bře 14 12:27:40 localhost.localdomain systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
bře 14 12:27:40 localhost.localdomain systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit docker.service has failed.
-- 
-- The result is failed.
bře 14 12:27:40 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=docker comm
bře 14 12:27:40 localhost.localdomain systemd[1]: docker.service: Unit entered failed state.
bře 14 12:27:40 localhost.localdomain systemd[1]: docker.service: Failed with result 'exit-code'.


I have F25 with docker-1.12.6-6.gitae7d637.fc25.x86_64

Comment 11 Vincent Batts 2018-11-14 14:37:38 UTC
Is this still an issue? Vivek, do you know whether this even applies to containers/storage now?

Comment 14 Red Hat Bugzilla 2023-09-14 23:57:40 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

