Clarifying queue_depth_limit and queue_time_limit Recommendations for Large Pool Configuration

Hi everyone,

I've come across several F5 DevCentral discussions and docs recommending smaller queue_depth_limit percentages (typically 0.05%–0.1%) for large pools (e.g., 10–100 members). The reasoning seems to be:

Large pools have a higher total connection capacity, so smaller percentages are enough to handle burst traffic without causing memory strain.

Smaller pools, with fewer total connections, can tolerate higher queue depths (e.g., up to 0.5%) because the absolute number of queued connections stays small.
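
To make sure I'm reading those percentages correctly, here is a rough Python sketch of how I'm converting them. As I understand it, queue_depth_limit itself takes an absolute connection count, so a percentage has to be applied to something. I'm assuming it means a percentage of total pool capacity (members × per-member connection limit); that interpretation, the function name, and the example numbers are all mine, not from F5 docs.

```python
def queue_depth_from_percent(members: int, conn_limit_per_member: int, percent: float) -> int:
    """Convert a percentage recommendation into an absolute queue depth.

    ASSUMPTION: the percentage applies to total pool capacity, i.e.
    members * per-member connection limit. This is my interpretation only.
    """
    if members <= 0 or conn_limit_per_member <= 0:
        raise ValueError("members and conn_limit_per_member must be positive")
    if percent < 0:
        raise ValueError("percent must not be negative")
    total_capacity = members * conn_limit_per_member
    # round() avoids off-by-one results from floating-point error
    return round(total_capacity * percent / 100)


if __name__ == "__main__":
    # Hypothetical pools: 100 members vs. 4 members, 10,000 connections each
    large_pool = queue_depth_from_percent(100, 10_000, 0.05)
    small_pool = queue_depth_from_percent(4, 10_000, 0.5)
    print(f"large pool queue depth: {large_pool}")
    print(f"small pool queue depth: {small_pool}")
```

With those made-up numbers, 0.05% of the large pool works out to 500 queued connections, while 0.5% of the small pool is only 200. If that reading is right, the smaller percentage still gives the large pool the deeper queue in absolute terms.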

Could anyone confirm whether this aligns with real-world best practices? Are there any production examples where tuning beyond these ranges made a meaningful difference?

Also, I'm looking for guidance on queue_time_limit tuning:

I understand that the default of 0 means “unlimited wait time,” but in practice, that's usually not desirable.

Some sources suggest setting it slightly higher than the average backend response time, but below the typical client timeout threshold.

For example, for apps whose backend responses occasionally spike to 10–20 s, a queue_time_limit of 30,000–40,000 ms might be appropriate, provided clients wait longer than that before timing out.
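
Here is how I'd sketch that heuristic in Python, just to make the question concrete. The headroom factor, the safety margin, and the function name are my own placeholders rather than anything F5 recommends: take the backend response time you want to ride out, add some headroom, then cap the result below the client timeout.

```python
def suggest_queue_time_limit_ms(
    backend_response_ms: int,
    client_timeout_ms: int,
    headroom: float = 1.5,          # ASSUMPTION: arbitrary multiplier
    safety_margin_ms: int = 5_000,  # ASSUMPTION: gap kept below client timeout
) -> int:
    """Pick a queue_time_limit above backend latency but below client timeout."""
    if backend_response_ms <= 0 or client_timeout_ms <= 0:
        raise ValueError("times must be positive")
    if headroom < 1:
        raise ValueError("headroom must be at least 1")
    ceiling = client_timeout_ms - safety_margin_ms
    if ceiling <= 0:
        raise ValueError("client timeout leaves no room for the safety margin")
    candidate = int(backend_response_ms * headroom)
    # Never wait in the queue longer than the client is willing to wait
    return min(candidate, ceiling)


if __name__ == "__main__":
    # Hypothetical app: 20 s worst-case backend spikes, 60 s client timeout
    print(suggest_queue_time_limit_ms(20_000, 60_000))
```

For a 20 s spike and a 60 s client timeout this lands on 30,000 ms, the low end of the range above. If the client timeout were only 30 s, the cap would pull it down to 25,000 ms, which is the case I'm least sure how people handle.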

Can anyone share which queue_time_limit values have worked best for them in interactive or high-traffic environments? Are there general thresholds or formulas you follow?

Appreciate any input or examples from your deployments. Thanks!