I recently bought a Framework Desktop intending to run Qwen 3.5 as the model behind my AI assistant on Nanobot. At first I could not run the model on this hardware because of a weird bug in one of the libraries; I explained the cause and the fix here. In this post I will just list the packages I ended up using to run Qwen, along with the vLLM command and its switches and parameters.
Here is the list of packages that I used to finally get it working:
- vllm 0.17.1+rocm700
- amd-aiter 0.1.10.post2
- torch 2.9.1+git8907517
- triton 3.4.0
- rocm 7.2.0.70200-43~24.04
And here is the script that I am using:

```shell
TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 \
VLLM_ROCM_USE_AITER=1 \
vllm serve \
  cyankiwi/Qwen3.5-35B-A3B-AWQ-4bit \
  --host 0.0.0.0 \
  --port 8000 \
  --reasoning-parser qwen3 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --dtype float16 \
  --max-model-len 128k \
  --gpu-memory-utilization 0.33
```
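Once the server is up, vLLM exposes an OpenAI-compatible API on the host and port given above, so a quick smoke test from another terminal might look like the following (a sketch, not from the original setup; the `model` field must match the repo id passed to `vllm serve`, and the prompt is just an example):

```shell
# Send a minimal chat-completion request to the local vLLM server
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "cyankiwi/Qwen3.5-35B-A3B-AWQ-4bit",
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 64
      }'
```

If the server is healthy, this returns a JSON body with a `choices` array containing the model's reply.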
Happy hacking!

