block/nvme: invoke blk_io_plug_call() outside q->lock
commit 66547f416a61e0cb711dc76821890242432ba193
author Stefan Hajnoczi <stefanha@redhat.com>
Wed, 12 Jul 2023 19:16:28 +0000 (12 15:16 -0400)
committer Stefan Hajnoczi <stefanha@redhat.com>
Mon, 17 Jul 2023 13:17:41 +0000 (17 09:17 -0400)
tree d3db3ff275fc9c07eb27e678fe361944bdb0cb62
parent ed8ad9728a9c0eec34db9dff61dfa2f1dd625637

blk_io_plug_call() is invoked outside a blk_io_plug()/blk_io_unplug()
section while opening the NVMe drive, through this call chain:

  nvme_file_open() ->
  nvme_init() ->
  nvme_identify() ->
  nvme_admin_cmd_sync() ->
  nvme_submit_command() ->
  blk_io_plug_call()

blk_io_plug_call() immediately invokes the given callback when the
current thread is not plugged, as is the case during nvme_file_open().
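
The immediate-versus-deferred behavior described above can be modeled
with a small thread-local sketch. This is a simplified illustration,
not QEMU's implementation: every name in it (sketch_plug(),
sketch_plug_call(), sketch_unplug(), count_call()) is hypothetical, and
the coalescing of duplicate calls and the fixed capacity are assumptions
of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a plug API. Not QEMU's code. */
typedef void (*sketch_fn)(void *opaque);

#define SKETCH_MAX_DEFERRED 8

typedef struct {
    sketch_fn fn;
    void *opaque;
} SketchCall;

/* All state is per thread, so the sketch needs no locking. */
static _Thread_local unsigned plug_depth;
static _Thread_local SketchCall deferred[SKETCH_MAX_DEFERRED];
static _Thread_local size_t num_deferred;

/* Begin a plugged section; sections may nest. */
static void sketch_plug(void)
{
    plug_depth++;
}

/* Run fn(opaque) now if unplugged, otherwise defer it until unplug. */
static void sketch_plug_call(sketch_fn fn, void *opaque)
{
    if (plug_depth == 0) {
        fn(opaque); /* runs in the caller's context, held locks included */
        return;
    }
    for (size_t i = 0; i < num_deferred; i++) {
        if (deferred[i].fn == fn && deferred[i].opaque == opaque) {
            return; /* same pair already queued: keep a single entry */
        }
    }
    assert(num_deferred < SKETCH_MAX_DEFERRED); /* sketch: fixed capacity */
    deferred[num_deferred].fn = fn;
    deferred[num_deferred].opaque = opaque;
    num_deferred++;
}

/* End a plugged section; the outermost unplug runs the deferred calls. */
static void sketch_unplug(void)
{
    SketchCall calls[SKETCH_MAX_DEFERRED];
    size_t n;

    assert(plug_depth > 0);
    if (--plug_depth > 0) {
        return;
    }
    /* Copy first so a callback may safely plug and defer again. */
    n = num_deferred;
    memcpy(calls, deferred, n * sizeof(calls[0]));
    num_deferred = 0;
    for (size_t i = 0; i < n; i++) {
        calls[i].fn(calls[i].opaque);
    }
}

/* Example callback: counts how many times it ran. */
static void count_call(void *opaque)
{
    ++*(int *)opaque;
}
```

With no plugged section active, sketch_plug_call(count_call, &n) runs
count_call() before returning, which mirrors the nvme_file_open() case.
Between sketch_plug() and sketch_unplug(), repeated calls with the same
arguments run the callback once, at unplug time.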

Unfortunately, nvme_submit_command() calls blk_io_plug_call() with
q->lock still held:

    ...
    q->sq.tail = (q->sq.tail + 1) % NVME_QUEUE_SIZE;
    q->need_kick++;
    blk_io_plug_call(nvme_unplug_fn, q);
    qemu_mutex_unlock(&q->lock);
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^

nvme_unplug_fn() then deadlocks trying to acquire q->lock because the
same thread already holds it and the mutex is not recursive. The symptom
is that QEMU hangs during startup while opening the NVMe drive.

Fix this by moving the blk_io_plug_call() call outside q->lock, so that
it runs after qemu_mutex_unlock(&q->lock). This is safe because no other
thread runs code related to this queue, and blk_io_plug_call()'s
internal state is thread-local and therefore not subject to thread
safety issues.
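
The difference between the two orderings can be reproduced in a
standalone sketch. It is not the driver code: SketchQueue,
sketch_unplug_fn(), submit_buggy() and submit_fixed() are hypothetical
stand-ins, the callback is invoked directly to model an unplugged
thread, and a POSIX error-checking mutex replaces the QEMU mutex so that
the relock attempt reports EDEADLK instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the driver's queue; only what is needed. */
typedef struct {
    pthread_mutex_t lock;
    int need_kick;
    int kick_result; /* what the callback's lock attempt returned */
} SketchQueue;

static void sketch_queue_init(SketchQueue *q)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    /* Error-checking type: relocking by the owner fails with EDEADLK. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&q->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    q->need_kick = 0;
    q->kick_result = -1;
}

/* Stand-in for the unplug callback: it takes the queue lock itself. */
static void sketch_unplug_fn(SketchQueue *q)
{
    q->kick_result = pthread_mutex_lock(&q->lock);
    if (q->kick_result == 0) {
        q->need_kick = 0; /* pending kicks handled */
        pthread_mutex_unlock(&q->lock);
    }
}

/* Ordering before the fix: the callback runs while the lock is held. */
static void submit_buggy(SketchQueue *q)
{
    int ret = pthread_mutex_lock(&q->lock);

    assert(ret == 0);
    q->need_kick++;
    sketch_unplug_fn(q); /* unplugged thread: callback runs right away */
    pthread_mutex_unlock(&q->lock);
}

/* Ordering after the fix: drop the lock first, then run the callback. */
static void submit_fixed(SketchQueue *q)
{
    int ret = pthread_mutex_lock(&q->lock);

    assert(ret == 0);
    q->need_kick++;
    pthread_mutex_unlock(&q->lock);
    sketch_unplug_fn(q);
}
```

After sketch_queue_init(&q), calling submit_buggy(&q) leaves
q.kick_result set to EDEADLK and the kick unprocessed, whereas
submit_fixed(&q) lets the callback take the lock and clear need_kick.
With a default, non-error-checking mutex the buggy ordering would
typically hang, which matches the startup hang described above.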

Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Lukas Doktor <ldoktor@redhat.com>
Message-id: 20230712191628.252806-1-stefanha@redhat.com
Fixes: f2e590002bd6 ("block/nvme: convert to blk_io_plug_call() API")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block/nvme.c