How to Find Queries Causing RESOURCE_SEMAPHORE Waits in SQL Server

How to Find Queries Causing RESOURCE_SEMAPHORE Waits in SQL Server

SQL Server resource_semaphore waits

The resource_semaphore wait can have devastating consequences for SQL Server performance. This wait essentially means that some of the queries in your workload have memory grants that are larger than the memory for the server can support. When that happens, the SQL Server feels like it is frozen and unresponsive. Queries are likely running, but this wait causes a queue to build up while submitted queries wait for memory to run.

What is the resource_semaphore wait?

A wait type of resource_semaphore means that there isn’t enough available memory to grant for queries to run. At a high level, here is what is happening. A query is submitted to the SQL Server engine for execution. As part of the pre-execution phase, SQL Server estimates how much memory it thinks a query will need to run. Several factors influence this memory estimation. Assuming there is free memory to grant to the query, then the query moves along the execution phases and starts running.

But let’s say you have a server with 128 GB of RAM allocated to SQL Server. A series of queries are submitted to the SQL Server database engine that are each granted 15 GB of RAM. At most, SQL Server can handle 8 of those queries before it runs out of memory to allocate. The next query that comes along and needs another 15 GB of RAM is prevented from starting its execution because 8 X 15 = 120 GB of RAM. This 9th query, if granted memory, would cause a total of 135 GB to be allocated. The server doesn’t have that much RAM allocated for queries. So, it has to wait.

As other queries are submitted, they wait behind this 9th query that needs the additional 15 GB of RAM. If the other 8 queries that are already executing are long-running queries, it might be several minutes, or longer, before memory is available. Queries start stacking up behind each other. Users start noticing that the app is slow, pages aren’t refreshing, and reports aren’t completing. Soon, everyone is hitting F5 on the web app to resubmit queries because nothing is happening. Immediately following this, your phone or your Slack messages start blowing up!

How to find queries with large memory grants

First, you can use sp_whoisactive to find the queries and the offending wait in real-time. This is very useful for the scenario above, where users feel the pain and start entering tickets and messaging people for help.

However, let’s say, for example, that your server is just teetering on the edge with the memory allocation. Memory grants are occasionally high enough that SQL Server is registering the resource_semaphore wait, but it isn’t happening for long enough, or frequently enough, that users really notice and start complaining. Your SQL Server is experiencing slowness, at times, it’s just not causing excruciating pain. This may show up by this wait appearing low in the result set from Paul Randall’s wait stats query. Maybe it’s only causing a few seconds of wait, on average, when it happens. Users might notice this because their report or application screen completes after a brief wait, so they assume this is “normal.” Consequently, they don’t report it. This doesn’t mean you can or should ignore what the wait stats information is telling you. Act now before this becomes a full-blown emergency.

Second, in the above scenario, you can use Extended Events to find queries with large memory grants. Extended Events are light-weight trace objects that allow for the capture of far more events than Profiler or a server-side trace. They also work differently by only firing and capturing when an event defined in the session happens, versus capturing everything and then filtering like Profiler does.

Setting Up The Extended Events Session

After some poking around and some experimenting, I was able to arrive at the T-SQL below to create the extended event session. This code will capture any query with a memory grant greater than 1 GB. Adjust this higher or lower as makes sense for your environment. The session stores the database ID, plan handle, session ID, and the T-SQL text for the offending query in a file. It will write to as many as 5 files, each 1 GB in size. When the 5th file is full, it will delete the oldest files and start writing a new file.

One thing to be aware of is the path for the file that will hold the data. That path must exist first. I’m using C:\XE\NameofExtendedEventSession.xel. Be sure to update that path to a location that your SQL Server instance can access.

CREATE EVENT SESSION [TrackHighMemoryGrants] ON SERVER 
ADD EVENT sqlserver.query_memory_grant_usage(
ACTION(sqlserver.database_id,sqlserver.plan_handle,sqlserver.session_id,sqlserver.sql_text)
    WHERE ([granted_memory_kb]>(1024000))),
ADD EVENT sqlserver.query_memory_grant_wait_end(
    ACTION(sqlserver.database_id,sqlserver.session_id,sqlserver.sql_text)
    WHERE ([sqlserver].[database_id]=(8) AND [granted_memory_kb]>(1024000)))
ADD TARGET package0.event_file(SET filename=N'C:\XE\HighMemoryGrants.xel',max_file_size=(1024),max_rollover_files=(5))
WITH (MAX_MEMORY=51200 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=5 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF)
GO

To start the Extended Events session, run the following T-SQL:

ALTER EVENT SESSION [TrackingHighMemoryGrants]
ON SERVER
STATE = START;

This can also be done from SSMS. Traverse the UI under the SQL instance name to Management > Extended Events > “TrackingHighMemoryGrants”, right click and select the “Start Session” option.

How to query Extended Events files

The extended event is gathering data. The query below can parse the collected files to show which queries have the highest memory grants on average.


WITH ParsedEvents AS (
SELECT 
event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS sql_text,
event_data.value('(event/data[@name="granted_memory_kb"]/value)[1]', 'bigint') / 1024.0 AS granted_mb
FROM (
    SELECT CAST(event_data AS XML) AS event_data
    FROM sys.fn_xe_file_target_read_file('C:\XE\HighMemoryGrants*.xel', NULL, NULL, NULL)
    ) AS x
)

SELECT 
sql_text,
COUNT(*) AS execution_count,
AVG(granted_mb) AS avg_granted_mb,
MAX(granted_mb) AS max_granted_mb,
MIN(granted_mb) AS min_granted_mb
FROM ParsedEvents
GROUP BY sql_text
ORDER BY avg_granted_mb DESC;


Now you can start dealing with the queries involved in that pesky wait before the problem brings your server to a screeching halt! By the way, the queries may lead you to a design problem in your tables that can cause high memory grants. More about that in a future post!

Want to work with The SERO Group?

Want to learn more about how The SERO Group helps organizations take the guesswork out of managing their SQL Servers? Schedule a no-obligation discovery call with us to get started.