Search Results (2317 CVEs found)

CVE Vendors Products Updated CVSS v3.1
CVE-2026-32723 1 Nyariv 1 Sandboxjs 2026-03-19 N/A
SandboxJS is a JavaScript sandboxing library. Prior to 0.8.35, SandboxJS timers have an execution-quota bypass. A global tick state (`currentTicks.current`) is shared between sandboxes. Timer string handlers are compiled at execution time using that global tick state rather than the scheduling sandbox's tick object. In multi-tenant / concurrent sandbox scenarios, another sandbox can overwrite `currentTicks.current` between scheduling and execution, causing the timer callback to run under a different sandbox's tick budget and bypass the original sandbox's execution quota/watchdog. Version 0.8.35 fixes this issue.
CVE-2026-32700 1 Heartcombo 1 Devise 2026-03-19 6.8 Medium
Devise is an authentication solution for Rails based on Warden. Prior to version 5.0.3, a race condition in Devise's Confirmable module allows an attacker to confirm an email address they do not own. This affects any Devise application using the `reconfirmable` option (the default when using Confirmable with email changes). By sending two concurrent email change requests, an attacker can desynchronize the `confirmation_token` and `unconfirmed_email` fields. The confirmation token is sent to an email the attacker controls, but the `unconfirmed_email` in the database points to a victim's email address. When the attacker uses the token, the victim's email is confirmed on the attacker's account. This is patched in Devise v5.0.3. Users should upgrade as soon as possible. As a workaround, applications can override a specific method from Devise models to force `unconfirmed_email` to be persisted when unchanged. Note that Mongoid does not seem to respect that `will_change!` should force the attribute to be persisted, even if it did not really change, so the user might have to implement a workaround similar to Devise by setting `changed_attributes["unconfirmed_email"] = nil` as well.
CVE-2025-43510 1 Apple 11 Ios, Ipad Os, Ipados and 8 more 2026-03-19 7.8 High
A memory corruption issue was addressed with improved lock state checking. This issue is fixed in watchOS 26.1, iOS 18.7.2 and iPadOS 18.7.2, macOS Tahoe 26.1, visionOS 26.1, tvOS 26.1, macOS Sonoma 14.8.2, macOS Sequoia 15.7.2, iOS 26.1 and iPadOS 26.1. A malicious application may cause unexpected changes in memory shared between processes.
CVE-2026-23071 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: regmap: Fix race condition in hwspinlock irqsave routine Previously, the address of the shared member '&map->spinlock_flags' was passed directly to 'hwspin_lock_timeout_irqsave'. This creates a race condition where multiple contexts contending for the lock could overwrite the shared flags variable, potentially corrupting the state for the current lock owner. Fix this by using a local stack variable 'flags' to store the IRQ state temporarily.
CVE-2025-71221 1 Linux 1 Linux Kernel 2026-03-18 7.0 High
In the Linux kernel, the following vulnerability has been resolved: dmaengine: mmp_pdma: Fix race condition in mmp_pdma_residue() Add proper locking in mmp_pdma_residue() to prevent use-after-free when accessing descriptor list and descriptor contents. The race occurs when multiple threads call tx_status() while the tasklet on another CPU is freeing completed descriptors: CPU 0 CPU 1 ----- ----- mmp_pdma_tx_status() mmp_pdma_residue() -> NO LOCK held list_for_each_entry(sw, ..) DMA interrupt dma_do_tasklet() -> spin_lock(&desc_lock) list_move(sw->node, ...) spin_unlock(&desc_lock) | dma_pool_free(sw) <- FREED! -> access sw->desc <- UAF! This issue can be reproduced when running dmatest on the same channel with multiple threads (threads_per_chan > 1). Fix by protecting the chain_running list iteration and descriptor access with the chan->desc_lock spinlock.
CVE-2026-23207 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: spi: tegra210-quad: Protect curr_xfer check in IRQ handler Now that all other accesses to curr_xfer are done under the lock, protect the curr_xfer NULL check in tegra_qspi_isr_thread() with the spinlock. Without this protection, the following race can occur: CPU0 (ISR thread) CPU1 (timeout path) ---------------- ------------------- if (!tqspi->curr_xfer) // sees non-NULL spin_lock() tqspi->curr_xfer = NULL spin_unlock() handle_*_xfer() spin_lock() t = tqspi->curr_xfer // NULL! ... t->len ... // NULL dereference! With this patch, all curr_xfer accesses are now properly synchronized. Although all accesses to curr_xfer are done under the lock, in tegra_qspi_isr_thread() it checks for NULL, releases the lock and reacquires it later in handle_cpu_based_xfer()/handle_dma_based_xfer(). There is a potential for an update in between, which could cause a NULL pointer dereference. To handle this, add a NULL check inside the handlers after acquiring the lock. This ensures that if the timeout path has already cleared curr_xfer, the handler will safely return without dereferencing the NULL pointer.
CVE-2025-38617 2 Debian, Linux 2 Debian Linux, Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: net/packet: fix a race in packet_set_ring() and packet_notifier() When packet_set_ring() releases po->bind_lock, another thread can run packet_notifier() and process an NETDEV_UP event. This race and the fix are both similar to that of commit 15fe076edea7 ("net/packet: fix a race in packet_bind() and packet_notifier()"). There too the packet_notifier NETDEV_UP event managed to run while a po->bind_lock critical section had to be temporarily released. And the fix was similarly to temporarily set po->num to zero to keep the socket unhooked until the lock is retaken. The po->bind_lock in packet_set_ring and packet_notifier precede the introduction of git history.
CVE-2026-23161 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: mm/shmem, swap: fix race of truncate and swap entry split The helper for shmem swap freeing is not handling the order of swap entries correctly. It uses xa_cmpxchg_irq to erase the swap entry, but it gets the entry order before that using xa_get_order without lock protection, and it may get an outdated order value if the entry is split or changed in other ways after the xa_get_order and before the xa_cmpxchg_irq. And besides, the order could grow and be larger than expected, and cause truncation to erase data beyond the end border. For example, if the target entry and following entries are swapped in or freed, then a large folio was added in place and swapped out, using the same entry, the xa_cmpxchg_irq will still succeed, it's very unlikely to happen though. To fix that, open code the Xarray cmpxchg and put the order retrieval and value checking in the same critical section. Also, ensure the order won't exceed the end border, skip it if the entry goes across the border. Skipping large swap entries crosses the end border is safe here. Shmem truncate iterates the range twice, in the first iteration, find_lock_entries already filtered such entries, and shmem will swapin the entries that cross the end border and partially truncate the folio (split the folio or at least zero part of it). So in the second loop here, if we see a swap entry that crosses the end order, it must at least have its content erased already. I observed random swapoff hangs and kernel panics when stress testing ZSWAP with shmem. After applying this patch, all problems are gone.
CVE-2026-23167 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: nfc: nci: Fix race between rfkill and nci_unregister_device(). syzbot reported the splat below [0] without a repro. It indicates that struct nci_dev.cmd_wq had been destroyed before nci_close_device() was called via rfkill. nci_dev.cmd_wq is only destroyed in nci_unregister_device(), which (I think) was called from virtual_ncidev_close() when syzbot close()d an fd of virtual_ncidev. The problem is that nci_unregister_device() destroys nci_dev.cmd_wq first and then calls nfc_unregister_device(), which removes the device from rfkill by rfkill_unregister(). So, the device is still visible via rfkill even after nci_dev.cmd_wq is destroyed. Let's unregister the device from rfkill first in nci_unregister_device(). Note that we cannot call nfc_unregister_device() before nci_close_device() because 1) nfc_unregister_device() calls device_del() which frees all memory allocated by devm_kzalloc() and linked to ndev->conn_info_list 2) nci_rx_work() could try to queue nci_conn_info to ndev->conn_info_list which could be leaked Thus, nfc_unregister_device() is split into two functions so we can remove rfkill interfaces only before nci_close_device(). [0]: DEBUG_LOCKS_WARN_ON(1) WARNING: kernel/locking/lockdep.c:238 at hlock_class kernel/locking/lockdep.c:238 [inline], CPU#0: syz.0.8675/6349 WARNING: kernel/locking/lockdep.c:238 at check_wait_context kernel/locking/lockdep.c:4854 [inline], CPU#0: syz.0.8675/6349 WARNING: kernel/locking/lockdep.c:238 at __lock_acquire+0x39d/0x2cf0 kernel/locking/lockdep.c:5187, CPU#0: syz.0.8675/6349 Modules linked in: CPU: 0 UID: 0 PID: 6349 Comm: syz.0.8675 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026 RIP: 0010:hlock_class kernel/locking/lockdep.c:238 [inline] RIP: 0010:check_wait_context kernel/locking/lockdep.c:4854 [inline] RIP: 0010:__lock_acquire+0x3a4/0x2cf0 kernel/locking/lockdep.c:5187 Code: 18 00 4c 8b 74 24 08 75 27 90 e8 17 f2 fc 02 85 c0 74 1c 83 3d 50 e0 4e 0e 00 75 13 48 8d 3d 43 f7 51 0e 48 c7 c6 8b 3a de 8d <67> 48 0f b9 3a 90 31 c0 0f b6 98 c4 00 00 00 41 8b 45 20 25 ff 1f RSP: 0018:ffffc9000c767680 EFLAGS: 00010046 RAX: 0000000000000001 RBX: 0000000000040000 RCX: 0000000000080000 RDX: ffffc90013080000 RSI: ffffffff8dde3a8b RDI: ffffffff8ff24ca0 RBP: 0000000000000003 R08: ffffffff8fef35a3 R09: 1ffffffff1fde6b4 R10: dffffc0000000000 R11: fffffbfff1fde6b5 R12: 00000000000012a2 R13: ffff888030338ba8 R14: ffff888030338000 R15: ffff888030338b30 FS: 00007fa5995f66c0(0000) GS:ffff8881256f8000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7e72f842d0 CR3: 00000000485a0000 CR4: 00000000003526f0 Call Trace: <TASK> lock_acquire+0x106/0x330 kernel/locking/lockdep.c:5868 touch_wq_lockdep_map+0xcb/0x180 kernel/workqueue.c:3940 __flush_workqueue+0x14b/0x14f0 kernel/workqueue.c:3982 nci_close_device+0x302/0x630 net/nfc/nci/core.c:567 nci_dev_down+0x3b/0x50 net/nfc/nci/core.c:639 nfc_dev_down+0x152/0x290 net/nfc/core.c:161 nfc_rfkill_set_block+0x2d/0x100 net/nfc/core.c:179 rfkill_set_block+0x1d2/0x440 net/rfkill/core.c:346 rfkill_fop_write+0x461/0x5a0 net/rfkill/core.c:1301 vfs_write+0x29a/0xb90 fs/read_write.c:684 ksys_write+0x150/0x270 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fa59b39acb9 Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fa5995f6028 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fa59b615fa0 RCX: 00007fa59b39acb9 RDX: 0000000000000008 RSI: 0000200000000080 RDI: 0000000000000007 RBP: 00007fa59b408bf7 R08: ---truncated---
CVE-2026-23169 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: mptcp: fix race in mptcp_pm_nl_flush_addrs_doit() syzbot and Eulgyu Kim reported crashes in mptcp_pm_nl_get_local_id() and/or mptcp_pm_nl_is_backup() Root cause is list_splice_init() in mptcp_pm_nl_flush_addrs_doit() which is not RCU ready. list_splice_init_rcu() can not be called here while holding pernet->lock spinlock. Many thanks to Eulgyu Kim for providing a repro and testing our patches.
CVE-2026-23126 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: netdevsim: fix a race issue related to the operation on bpf_bound_progs list The netdevsim driver lacks a protection mechanism for operations on the bpf_bound_progs list. When the nsim_bpf_create_prog() performs list_add_tail, it is possible that nsim_bpf_destroy_prog() is simultaneously performs list_del. Concurrent operations on the list may lead to list corruption and trigger a kernel crash as follows: [ 417.290971] kernel BUG at lib/list_debug.c:62! [ 417.290983] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 417.290992] CPU: 10 PID: 168 Comm: kworker/10:1 Kdump: loaded Not tainted 6.19.0-rc5 #1 [ 417.291003] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 417.291007] Workqueue: events bpf_prog_free_deferred [ 417.291021] RIP: 0010:__list_del_entry_valid_or_report+0xa7/0xc0 [ 417.291034] Code: a8 ff 0f 0b 48 89 fe 48 89 ca 48 c7 c7 48 a1 eb ae e8 ed fb a8 ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 80 a1 eb ae e8 d9 fb a8 ff <0f> 0b 48 89 d1 48 c7 c7 d0 a1 eb ae 48 89 f2 48 89 c6 e8 c2 fb a8 [ 417.291040] RSP: 0018:ffffb16a40807df8 EFLAGS: 00010246 [ 417.291046] RAX: 000000000000006d RBX: ffff8e589866f500 RCX: 0000000000000000 [ 417.291051] RDX: 0000000000000000 RSI: ffff8e59f7b23180 RDI: ffff8e59f7b23180 [ 417.291055] RBP: ffffb16a412c9000 R08: 0000000000000000 R09: 0000000000000003 [ 417.291059] R10: ffffb16a40807c80 R11: ffffffffaf9edce8 R12: ffff8e594427ac20 [ 417.291063] R13: ffff8e59f7b44780 R14: ffff8e58800b7a05 R15: 0000000000000000 [ 417.291074] FS: 0000000000000000(0000) GS:ffff8e59f7b00000(0000) knlGS:0000000000000000 [ 417.291079] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 417.291083] CR2: 00007fc4083efe08 CR3: 00000001c3626006 CR4: 0000000000770ee0 [ 417.291088] PKRU: 55555554 [ 417.291091] Call Trace: [ 417.291096] <TASK> [ 417.291103] nsim_bpf_destroy_prog+0x31/0x80 [netdevsim] [ 417.291154] __bpf_prog_offload_destroy+0x2a/0x80 [ 417.291163] bpf_prog_dev_bound_destroy+0x6f/0xb0 [ 417.291171] bpf_prog_free_deferred+0x18e/0x1a0 [ 417.291178] process_one_work+0x18a/0x3a0 [ 417.291188] worker_thread+0x27b/0x3a0 [ 417.291197] ? __pfx_worker_thread+0x10/0x10 [ 417.291207] kthread+0xe5/0x120 [ 417.291214] ? __pfx_kthread+0x10/0x10 [ 417.291221] ret_from_fork+0x31/0x50 [ 417.291230] ? __pfx_kthread+0x10/0x10 [ 417.291236] ret_from_fork_asm+0x1a/0x30 [ 417.291246] </TASK> Add a mutex lock, to prevent simultaneous addition and deletion operations on the list.
CVE-2026-25536 2 Lfprojects, Modelcontextprotocol 2 Mcp Typescript Sdk, Typescript-sdk 2026-03-18 7.1 High
MCP TypeScript SDK is the official TypeScript SDK for Model Context Protocol servers and clients. From version 1.10.0 to 1.25.3, cross-client response data leak when a single McpServer/Server and transport instance is reused across multiple client connections, most commonly in stateless StreamableHTTPServerTransport deployments. This issue has been patched in version 1.26.0.
CVE-2026-23153 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: firewire: core: fix race condition against transaction list The list of transaction is enumerated without acquiring card lock when processing AR response event. This causes a race condition bug when processing AT request completion event concurrently. This commit fixes the bug by put timer start for split transaction expiration into the scope of lock. The value of jiffies in card structure is referred before acquiring the lock.
CVE-2026-23110 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: scsi: core: Wake up the error handler when final completions race against each other The fragile ordering between marking commands completed or failed so that the error handler only wakes when the last running command completes or times out has race conditions. These race conditions can cause the SCSI layer to fail to wake the error handler, leaving I/O through the SCSI host stuck as the error state cannot advance. First, there is an memory ordering issue within scsi_dec_host_busy(). The write which clears SCMD_STATE_INFLIGHT may be reordered with reads counting in scsi_host_busy(). While the local CPU will see its own write, reordering can allow other CPUs in scsi_dec_host_busy() or scsi_eh_inc_host_failed() to see a raised busy count, causing no CPU to see a host busy equal to the host_failed count. This race condition can be prevented with a memory barrier on the error path to force the write to be visible before counting host busy commands. Second, there is a general ordering issue with scsi_eh_inc_host_failed(). By counting busy commands before incrementing host_failed, it can race with a final command in scsi_dec_host_busy(), such that scsi_dec_host_busy() does not see host_failed incremented but scsi_eh_inc_host_failed() counts busy commands before SCMD_STATE_INFLIGHT is cleared by scsi_dec_host_busy(), resulting in neither waking the error handler task. This needs the call to scsi_host_busy() to be moved after host_failed is incremented to close the race condition.
CVE-2026-23115 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: serial: Fix not set tty->port race condition Revert commit bfc467db60b7 ("serial: remove redundant tty_port_link_device()") because the tty_port_link_device() is not redundant: the tty->port has to be confured before we call uart_configure_port(), otherwise user-space can open console without TTY linked to the driver. This tty_port_link_device() was added explicitly to avoid this exact issue in commit fb2b90014d78 ("tty: link tty and port before configuring it as console"), so offending commit basically reverted the fix saying it is redundant without addressing the actual race condition presented there. Reproducible always as tty->port warning on Qualcomm SoC with most of devices disabled, so with very fast boot, and one serial device being the console: printk: legacy console [ttyMSM0] enabled printk: legacy console [ttyMSM0] enabled printk: legacy bootconsole [qcom_geni0] disabled printk: legacy bootconsole [qcom_geni0] disabled ------------[ cut here ]------------ tty_init_dev: ttyMSM driver does not set tty->port. This would crash the kernel. Fix the driver! WARNING: drivers/tty/tty_io.c:1414 at tty_init_dev.part.0+0x228/0x25c, CPU#2: systemd/1 Modules linked in: socinfo tcsrcc_eliza gcc_eliza sm3_ce fuse ipv6 CPU: 2 UID: 0 PID: 1 Comm: systemd Tainted: G S 6.19.0-rc4-next-20260108-00024-g2202f4d30aa8 #73 PREEMPT Tainted: [S]=CPU_OUT_OF_SPEC Hardware name: Qualcomm Technologies, Inc. Eliza (DT) ... tty_init_dev.part.0 (drivers/tty/tty_io.c:1414 (discriminator 11)) (P) tty_open (arch/arm64/include/asm/atomic_ll_sc.h:95 (discriminator 3) drivers/tty/tty_io.c:2073 (discriminator 3) drivers/tty/tty_io.c:2120 (discriminator 3)) chrdev_open (fs/char_dev.c:411) do_dentry_open (fs/open.c:962) vfs_open (fs/open.c:1094) do_open (fs/namei.c:4634) path_openat (fs/namei.c:4793) do_filp_open (fs/namei.c:4820) do_sys_openat2 (fs/open.c:1391 (discriminator 3)) ... Starting Network Name Resolution... Apparently the flow with this small Yocto-based ramdisk user-space is: driver (qcom_geni_serial.c): user-space: ============================ =========== qcom_geni_serial_probe() uart_add_one_port() serial_core_register_port() serial_core_add_one_port() uart_configure_port() register_console() | | open console | ... | tty_init_dev() | driver->ports[idx] is NULL | tty_port_register_device_attr_serdev() tty_port_link_device() <- set driver->ports[idx]
CVE-2026-23118 1 Linux 1 Linux Kernel 2026-03-18 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: rxrpc: Fix data-race warning and potential load/store tearing Fix the following: BUG: KCSAN: data-race in rxrpc_peer_keepalive_worker / rxrpc_send_data_packet which is reporting an issue with the reads and writes to ->last_tx_at in: conn->peer->last_tx_at = ktime_get_seconds(); and: keepalive_at = peer->last_tx_at + RXRPC_KEEPALIVE_TIME; The lockless accesses to these to values aren't actually a problem as the read only needs an approximate time of last transmission for the purposes of deciding whether or not the transmission of a keepalive packet is warranted yet. Also, as ->last_tx_at is a 64-bit value, tearing can occur on a 32-bit arch. Fix both of these by switching to an unsigned int for ->last_tx_at and only storing the LSW of the time64_t. It can then be reconstructed at need provided no more than 68 years has elapsed since the last transmission.
CVE-2026-0121 1 Google 1 Android 2026-03-17 2.9 Low
In VPU, there is a possible use-after-free read due to a race condition. This could lead to local information disclosure with no additional execution privileges needed. User interaction is not needed for exploitation.
CVE-2026-32398 2 Subratamal, Wordpress 2 Terawallet For Woocommerce, Wordpress 2026-03-17 5.3 Medium
Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition') vulnerability in Subrata Mal TeraWallet – For WooCommerce woo-wallet allows Leveraging Race Conditions.This issue affects TeraWallet – For WooCommerce: from n/a through <= 1.5.15.
CVE-2025-37920 1 Linux 1 Linux Kernel 2026-03-17 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: xsk: Fix race condition in AF_XDP generic RX path Move rx_lock from xsk_socket to xsk_buff_pool. Fix synchronization for shared umem mode in generic RX path where multiple sockets share single xsk_buff_pool. RX queue is exclusive to xsk_socket, while FILL queue can be shared between multiple sockets. This could result in race condition where two CPU cores access RX path of two different sockets sharing the same umem. Protect both queues by acquiring spinlock in shared xsk_buff_pool. Lock contention may be minimized in the future by some per-thread FQ buffering. It's safe and necessary to move spin_lock_bh(rx_lock) after xsk_rcv_check(): * xs->pool and spinlock_init is synchronized by xsk_bind() -> xsk_is_bound() memory barriers. * xsk_rcv_check() may return true at the moment of xsk_release() or xsk_unbind_dev(), however this will not cause any data races or race conditions. xsk_unbind_dev() removes xdp socket from all maps and waits for completion of all outstanding rx operations. Packets in RX path will either complete safely or drop.
CVE-2025-38104 1 Linux 1 Linux Kernel 2026-03-17 4.7 Medium
In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu: Replace Mutex with Spinlock for RLCG register access to avoid Priority Inversion in SRIOV RLCG Register Access is a way for virtual functions to safely access GPU registers in a virtualized environment., including TLB flushes and register reads. When multiple threads or VFs try to access the same registers simultaneously, it can lead to race conditions. By using the RLCG interface, the driver can serialize access to the registers. This means that only one thread can access the registers at a time, preventing conflicts and ensuring that operations are performed correctly. Additionally, when a low-priority task holds a mutex that a high-priority task needs, ie., If a thread holding a spinlock tries to acquire a mutex, it can lead to priority inversion. register access in amdgpu_virt_rlcg_reg_rw especially in a fast code path is critical. The call stack shows that the function amdgpu_virt_rlcg_reg_rw is being called, which attempts to acquire the mutex. This function is invoked from amdgpu_sriov_wreg, which in turn is called from gmc_v11_0_flush_gpu_tlb. The [ BUG: Invalid wait context ] indicates that a thread is trying to acquire a mutex while it is in a context that does not allow it to sleep (like holding a spinlock). Fixes the below: [ 253.013423] ============================= [ 253.013434] [ BUG: Invalid wait context ] [ 253.013446] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G U OE [ 253.013464] ----------------------------- [ 253.013475] kworker/0:1/10 is trying to lock: [ 253.013487] ffff9f30542e3cf8 (&adev->virt.rlcg_reg_lock){+.+.}-{3:3}, at: amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.013815] other info that might help us debug this: [ 253.013827] context-{4:4} [ 253.013835] 3 locks held by kworker/0:1/10: [ 253.013847] #0: ffff9f3040050f58 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680 [ 253.013877] #1: ffffb789c008be40 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680 [ 253.013905] #2: ffff9f3054281838 (&adev->gmc.invalidate_lock){+.+.}-{2:2}, at: gmc_v11_0_flush_gpu_tlb+0x198/0x4f0 [amdgpu] [ 253.014154] stack backtrace: [ 253.014164] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14 [ 253.014189] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 253.014203] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/18/2024 [ 253.014224] Workqueue: events work_for_cpu_fn [ 253.014241] Call Trace: [ 253.014250] <TASK> [ 253.014260] dump_stack_lvl+0x9b/0xf0 [ 253.014275] dump_stack+0x10/0x20 [ 253.014287] __lock_acquire+0xa47/0x2810 [ 253.014303] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.014321] lock_acquire+0xd1/0x300 [ 253.014333] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.014562] ? __lock_acquire+0xa6b/0x2810 [ 253.014578] __mutex_lock+0x85/0xe20 [ 253.014591] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.014782] ? sched_clock_noinstr+0x9/0x10 [ 253.014795] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.014808] ? local_clock_noinstr+0xe/0xc0 [ 253.014822] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.015012] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.015029] mutex_lock_nested+0x1b/0x30 [ 253.015044] ? mutex_lock_nested+0x1b/0x30 [ 253.015057] amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.015249] amdgpu_sriov_wreg+0xc5/0xd0 [amdgpu] [ 253.015435] gmc_v11_0_flush_gpu_tlb+0x44b/0x4f0 [amdgpu] [ 253.015667] gfx_v11_0_hw_init+0x499/0x29c0 [amdgpu] [ 253.015901] ? __pfx_smu_v13_0_update_pcie_parameters+0x10/0x10 [amdgpu] [ 253.016159] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.016173] ? smu_hw_init+0x18d/0x300 [amdgpu] [ 253.016403] amdgpu_device_init+0x29ad/0x36a0 [amdgpu] [ 253.016614] amdgpu_driver_load_kms+0x1a/0xc0 [amdgpu] [ 253.0170 ---truncated---