vllm.v1.worker.gpu.async_utils
stream
A lightweight version of the torch.cuda.stream() context manager that avoids the current_stream and device lookups.
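The idea can be illustrated with a minimal sketch. This is not the code from async_utils.py. It assumes that the lookups are avoided by having the caller pass in both the stream to switch to and the stream to restore. The name switch_stream and its parameters are hypothetical, and the stream setter is injected so the example runs without a GPU. With PyTorch one would presumably pass torch.cuda.set_stream and real torch.cuda.Stream objects.

```python
import contextlib
from typing import Any, Callable, Iterator


@contextlib.contextmanager
def switch_stream(
    to_stream: Any,
    from_stream: Any,
    set_stream: Callable[[Any], None],
) -> Iterator[None]:
    """Hypothetical helper: run the body on to_stream, then restore from_stream.

    Both streams are supplied by the caller, so nothing has to be
    queried on entry or exit.
    """
    set_stream(to_stream)
    try:
        yield
    finally:
        # Restore the caller's stream even if the body raises.
        set_stream(from_stream)


# Stand-in usage: strings play the role of streams and a list records
# each switch. With PyTorch the setter could be torch.cuda.set_stream.
calls: list = []
with switch_stream("copy_stream", "main_stream", calls.append):
    calls.append("work")
```

Because the previous stream is passed in rather than queried, the caller has to track which stream is current. That is the trade made for skipping the lookups.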
vllm/v1/worker/gpu/async_utils.py