Bug 1099375
| Summary: | docker fail to start when dm metadata are corrupted | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Lukáš Doktor <ldoktor> |
| Component: | kernel | Assignee: | LVM and device-mapper development team <lvm-team> |
| kernel sub component: | Thin Provisioning | QA Contact: | Storage QE <storage-qe> |
| Status: | CLOSED WONTFIX | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | agk, dwalsh, mclasen, msnitzer, thornber, vgoyal |
| Version: | 7.0 | ||
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-04-01 16:17:45 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
The output of /var/log/messages: May 20 08:47:38 t530 kernel: [ 227.203463] loop: module loaded May 20 08:47:38 t530 kernel: loop: module loaded May 20 08:47:38 t530 kernel: [ 227.235138] device-mapper: space map common: index_check failed: csum 2396624857 != wanted 2396666900 May 20 08:47:38 t530 kernel: [ 227.235163] device-mapper: block manager: index validator check failed for block 1054 May 20 08:47:38 t530 kernel: [ 227.235165] device-mapper: transaction manager: couldn't open metadata space map May 20 08:47:38 t530 kernel: [ 227.235170] device-mapper: thin metadata: tm_open_with_sm failed May 20 08:47:38 t530 kernel: device-mapper: space map common: index_check failed: csum 2396624857 != wanted 2396666900 May 20 08:47:38 t530 kernel: device-mapper: block manager: index validator check failed for block 1054 May 20 08:47:38 t530 kernel: device-mapper: transaction manager: couldn't open metadata space map May 20 08:47:38 t530 kernel: device-mapper: thin metadata: tm_open_with_sm failed May 20 08:47:38 t530 kernel: [ 227.239388] device-mapper: table: 253:4: thin-pool: Error creating metadata object May 20 08:47:38 t530 kernel: [ 227.239399] device-mapper: ioctl: error adding target to table May 20 08:47:38 t530 kernel: device-mapper: table: 253:4: thin-pool: Error creating metadata object May 20 08:47:38 t530 kernel: device-mapper: ioctl: error adding target to table I don't see what docker could do in this situation. It seems like the error is in the kernel. Dunno how that could have worked before... Should we reassign to the kernel? I tested this with the same kernel using two docker versions. Older version just put a message about this error and proceed, newer version logged the same error and stopped. So apparently it can be worked around... moving docker bugs off alexl On docker-1.3 if you corrupt the metadata file, then it fails with the following ``` vbatts@valse ~ (master) $ sudo docker -d -D -g /home/docker 2014/10/21 12:44:39 docker daemon: 1.3.0 c78088f; execdriver: native; graphdriver: [ab4be98f] +job serveapi(unix:///var/run/docker.sock) [debug] deviceset.go:565 Generated prefix: docker-253:2-4980740 [debug] deviceset.go:568 Checking for existence of the pool 'docker-253:2-4980740-pool' [debug] deviceset.go:587 Pool doesn't exist. Creating it. [debug] deviceset.go:455 libdevmapper(3): ioctl/libdm-iface.c:1769 (-1) device-mapper: reload ioctl on docker-253:2-4980740-pool failed: Invalid or incomplete multibyte or wide character 2014/10/21 12:44:39 Error running DeviceCreate (createPool) dm_task_run failed ``` I don't think the solution is to failback to the 'vfs' graph driver, but perhaps have clearer output that the metadata is corrupt. After chatting with Vivek, I don't this is any longer an issue. Can you reproduce with a recent version of docker? Hello Vincent, not sure how about RHEL, but on F25 it still fails. My steps are: $ dd of=/var/lib/docker/devicemapper/devicemapper/metadata if=/dev/urandom count=1 bs=5 skip=10 $ systemctl start docker Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details. $ journalctl -xe ... bře 14 12:27:40 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=docker-stor bře 14 12:27:40 localhost.localdomain systemd[1]: docker-storage-setup.service: Unit entered failed state. bře 14 12:27:40 localhost.localdomain systemd[1]: docker-storage-setup.service: Failed with result 'exit-code'. bře 14 12:27:40 localhost.localdomain systemd[1]: Starting Docker Application Container Engine... -- Subject: Unit docker.service has begun start-up -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit docker.service has begun starting up. bře 14 12:27:40 localhost.localdomain kernel: device-mapper: space map common: index_check failed: blocknr 0 != wanted 148 bře 14 12:27:40 localhost.localdomain kernel: device-mapper: block manager: index validator check failed for block 148 bře 14 12:27:40 localhost.localdomain kernel: device-mapper: transaction manager: couldn't open metadata space map bře 14 12:27:40 localhost.localdomain kernel: device-mapper: thin metadata: tm_open_with_sm failed bře 14 12:27:40 localhost.localdomain kernel: device-mapper: table: 253:8: thin-pool: Error creating metadata object bře 14 12:27:40 localhost.localdomain kernel: device-mapper: ioctl: error adding target to table bře 14 12:27:40 localhost.localdomain dockerd-current[30313]: time="2017-03-14T12:27:40.742875167+01:00" level=error msg="[graphdriver] prior storage driver \" bře 14 12:27:40 localhost.localdomain dockerd-current[30313]: time="2017-03-14T12:27:40.742954033+01:00" level=fatal msg="Error starting daemon: error initiali bře 14 12:27:40 localhost.localdomain systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE bře 14 12:27:40 localhost.localdomain systemd[1]: Failed to start Docker Application Container Engine. -- Subject: Unit docker.service has failed -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit docker.service has failed. -- -- The result is failed. bře 14 12:27:40 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=docker comm bře 14 12:27:40 localhost.localdomain systemd[1]: docker.service: Unit entered failed state. bře 14 12:27:40 localhost.localdomain systemd[1]: docker.service: Failed with result 'exit-code'. I have F25 with docker-1.12.6-6.gitae7d637.fc25.x86_64 Is this still an issue? Vivek, do you know whether this even applies to containers/storage now? The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days |
Description of problem: Hello guys, today after a docker daemon crash I had problems to get it working again. The problem was, that metadata (/var/lib/docker/devicemapper/devicemapper/metadata) was corrupted. The older version of docker seems to ignores/workarounds that and works perfectly, but the lates version of docker simply stops. Version-Release number of selected component (if applicable): docker-0.11.1-2.el7.x86_64 How reproducible: On metadata corruption Steps to Reproduce: 1. corrupt metadata (eg. hexedit /var/lib/docker/devicemapper/devicemapper/metadata and override first few characters) 2. docker -D -d Actual results: 2014/05/20 08:55:40 docker daemon: 0.11.1-dev abf8689/0.11.1; execdriver: native; graphdriver: [47686eaa] +job serveapi(unix:///var/run/docker.sock) [47686eaa] +job initserver() [47686eaa.initserver()] Creating server [debug] deviceset.go:527 Generated prefix: docker-253:3-4210871 [debug] deviceset.go:530 Checking for existence of the pool 'docker-253:3-4210871-pool' [debug] deviceset.go:549 Pool doesn't exist. Creating it. [debug] daemon.go:747 Using graph driver devicemapper [debug] daemon.go:764 Creating images graph [debug] graph.go:65 Restored 34 elements [debug] daemon.go:776 Creating volumes graph [debug] graph.go:65 Restored 3 elements [debug] daemon.go:781 Creating repository list [47686eaa] +job init_networkdriver() [DEBUG] [iptables]: /sbin/iptables, [-C POSTROUTING -t nat -s 172.17.42.1/16 ! -d 172.17.42.1/16 -j MASQUERADE] 2014/05/20 08:55:40 Listening for HTTP on unix (/var/run/docker.sock) [debug] server.go:1095 Registering GET, /events [debug] server.go:1095 Registering GET, /images/search [debug] server.go:1095 Registering GET, /images/{name:.*}/history [debug] server.go:1095 Registering GET, /containers/{name:.*}/json [debug] server.go:1095 Registering GET, /containers/{name:.*}/top [debug] server.go:1095 Registering GET, /_ping [debug] server.go:1095 Registering GET, /info [debug] server.go:1095 Registering GET, /version [debug] server.go:1095 Registering GET, /images/{name:.*}/get [debug] server.go:1095 Registering GET, /containers/ps [debug] server.go:1095 Registering GET, /containers/{name:.*}/export [debug] server.go:1095 Registering GET, /containers/{name:.*}/attach/ws [debug] server.go:1095 Registering GET, /images/{name:.*}/json [debug] server.go:1095 Registering GET, /containers/{name:.*}/logs [debug] server.go:1095 Registering GET, /images/json [debug] server.go:1095 Registering GET, /images/viz [debug] server.go:1095 Registering GET, /containers/json [debug] server.go:1095 Registering GET, /containers/{name:.*}/changes [debug] server.go:1095 Registering POST, /images/create [debug] server.go:1095 Registering POST, /containers/{name:.*}/kill [debug] server.go:1095 Registering POST, /containers/{name:.*}/copy [debug] server.go:1095 Registering POST, /commit [debug] server.go:1095 Registering POST, /images/{name:.*}/tag [debug] server.go:1095 Registering POST, /containers/{name:.*}/restart [debug] server.go:1095 Registering POST, /containers/{name:.*}/wait [debug] server.go:1095 Registering POST, /build [debug] server.go:1095 Registering POST, /images/{name:.*}/push [debug] server.go:1095 Registering POST, /containers/{name:.*}/start [debug] server.go:1095 Registering POST, /containers/{name:.*}/resize [debug] server.go:1095 Registering POST, /auth [debug] server.go:1095 Registering POST, /images/{name:.*}/insert [debug] server.go:1095 Registering POST, /images/load [debug] server.go:1095 Registering POST, /containers/create [debug] server.go:1095 Registering POST, /containers/{name:.*}/stop [debug] server.go:1095 Registering POST, /containers/{name:.*}/attach [debug] server.go:1095 Registering DELETE, /containers/{name:.*} [debug] server.go:1095 Registering DELETE, /images/{name:.*} [debug] server.go:1095 Registering OPTIONS, [debug] server.go:1185 docker group found. gid: 985 [DEBUG] [iptables]: /sbin/iptables, [-D FORWARD -i docker0 -o docker0 -j DROP] [DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -i docker0 -o docker0 -j ACCEPT] [DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -i docker0 ! -o docker0 -j ACCEPT] [DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D PREROUTING -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -F DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -X DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -N DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -A OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER] [47686eaa] -job init_networkdriver() = OK (0) [47686eaa.initserver()] Creating pidfile [47686eaa.initserver()] Setting up signal traps [47686eaa] -job initserver() = OK (0) [47686eaa] +job acceptconnections() [47686eaa] -job acceptconnections() = OK (0) ^C2014/05/20 08:55:53 Received signal 'interrupt', starting shutdown of docker... [debug] daemon.go:861 starting clean shutdown of all containers... [debug] deviceset.go:832 [deviceset docker-253:3-4210871] shutdown() [debug] deviceset.go:833 [devmapper] Shutting down DeviceSet: /home/medic/Work/Projekty/Docker/root/devicemapper [debug] deviceset.go:719 [devmapper] deactivateDevice() [debug] deviceset.go:815 Waiting for unmount of : opencount=0 [debug] deviceset.go:741 [devmapper] deactivateDevice END [debug] deviceset.go:703 [devmapper] deactivatePool() [debug] devmapper.go:525 [devmapper] removeDevice START [debug] devmapper.go:539 [devmapper] removeDevice END [debug] deviceset.go:716 [devmapper] deactivatePool END [debug] deviceset.go:881 [deviceset docker-253:3-4210871] shutdown END Expected results (obtained from upstream "Docker version 0.10.0, build dc9c28f"): [root@t530 ~ ]$ docker-upstream -d -D | tee /tmp/docker.log [/home/medic/Work/Projekty/Docker/root|262a1674] +job serveapi(unix:///var/run/docker.sock) [/home/medic/Work/Projekty/Docker/root|262a1674] +job initserver() [/home/medic/Work/Projekty/Docker/root|262a1674.initserver()] Creating server 2014/05/20 08:59:26 Listening for HTTP on unix (/var/run/docker.sock) [debug] driver.go:78 Error loading driver aufs: AUFS was not found in /proc/filesystems [debug] driver.go:78 Error loading driver btrfs: /home/medic/Work/Projekty/Docker/root is not a btrfs filesystem [debug] deviceset.go:528 Generated prefix: docker-253:3-4210871 [debug] deviceset.go:531 Checking for existence of the pool 'docker-253:3-4210871-pool' [debug] deviceset.go:545 Pool doesn't exist. Creating it. [debug] deviceset.go:408 libdevmapper(3): ioctl/libdm-iface.c:1768 (-1) device-mapper: reload ioctl on docker-253:3-4210871-pool failed: Invalid or incomplete multibyte or wide character [debug] deviceset.go:562 --->Err: Error running DeviceCreate (createPool) [debug] driver.go:78 Error loading driver devicemapper: Error running DeviceCreate (createPool) [debug] runtime.go:690 Using graph driver vfs [debug] runtime.go:707 Creating images graph [debug] graph.go:63 Restored 34 elements [debug] runtime.go:719 Creating volumes graph [debug] graph.go:63 Restored 3 elements [debug] runtime.go:724 Creating repository list [/home/medic/Work/Projekty/Docker/root|262a1674] +job init_networkdriver() [DEBUG] [iptables]: /sbin/iptables, [-C POSTROUTING -t nat -s 172.17.42.1/16 ! -d 172.17.42.1/16 -j MASQUERADE] [DEBUG] [iptables]: /sbin/iptables, [-D FORWARD -i docker0 -o docker0 -j DROP] [DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -i docker0 -o docker0 -j ACCEPT] [debug] server.go:1048 Registering GET, /events [debug] server.go:1048 Registering GET, /containers/{name:.*}/top [debug] server.go:1048 Registering GET, /containers/{name:.*}/attach/ws [debug] server.go:1048 Registering GET, /images/json [debug] server.go:1048 Registering GET, /images/{name:.*}/history [debug] server.go:1048 Registering GET, /containers/ps [debug] server.go:1048 Registering GET, /containers/json [debug] server.go:1048 Registering GET, /containers/{name:.*}/json [debug] server.go:1048 Registering GET, /images/viz [debug] server.go:1048 Registering GET, /images/{name:.*}/json [debug] server.go:1048 Registering GET, /info [debug] server.go:1048 Registering GET, /version [debug] server.go:1048 Registering GET, /images/search [debug] server.go:1048 Registering GET, /images/{name:.*}/get [debug] server.go:1048 Registering GET, /containers/{name:.*}/export [debug] server.go:1048 Registering GET, /containers/{name:.*}/changes [debug] server.go:1048 Registering POST, /auth [debug] server.go:1048 Registering POST, /images/create [debug] server.go:1048 Registering POST, /images/{name:.*}/tag [debug] server.go:1048 Registering POST, /containers/{name:.*}/restart [debug] server.go:1048 Registering POST, /containers/{name:.*}/start [debug] server.go:1048 Registering POST, /containers/{name:.*}/copy [debug] server.go:1048 Registering POST, /build [debug] server.go:1048 Registering POST, /containers/{name:.*}/stop [debug] server.go:1048 Registering POST, /commit [debug] server.go:1048 Registering POST, /containers/{name:.*}/kill [debug] server.go:1048 Registering POST, /containers/{name:.*}/wait [debug] server.go:1048 Registering POST, /images/{name:.*}/insert [debug] server.go:1048 Registering POST, /images/load [debug] server.go:1048 Registering POST, /images/{name:.*}/push [debug] server.go:1048 Registering POST, /containers/create [debug] server.go:1048 Registering POST, /containers/{name:.*}/resize [debug] server.go:1048 Registering POST, /containers/{name:.*}/attach [debug] server.go:1048 Registering DELETE, /containers/{name:.*} [debug] server.go:1048 Registering DELETE, /images/{name:.*} [debug] server.go:1048 Registering OPTIONS, [debug] server.go:1138 docker group found. gid: 985 [DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -i docker0 ! -o docker0 -j ACCEPT] [DEBUG] [iptables]: /sbin/iptables, [-C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D PREROUTING -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -D OUTPUT -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -F DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -X DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -N DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER] [DEBUG] [iptables]: /sbin/iptables, [-t nat -A OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER] [/home/medic/Work/Projekty/Docker/root|262a1674] -job init_networkdriver() = OK (0) [/home/medic/Work/Projekty/Docker/root|262a1674.initserver()] Creating pidfile [/home/medic/Work/Projekty/Docker/root|262a1674.initserver()] Setting up signal traps [/home/medic/Work/Projekty/Docker/root|262a1674] -job initserver() = OK (0) [/home/medic/Work/Projekty/Docker/root|262a1674] +job acceptconnections() [/home/medic/Work/Projekty/Docker/root|262a1674] -job acceptconnections() = OK (0) # Docker is up and running Additional info: Workaround is simple, just rm -rf /var/lib/docker/devicemapper/ ...