Framework Desktop AI assistant, part 3

Recently I started playing with Nanobot, which is a bit like OpenClaw but, in my opinion, much better, since it is small and simple and has native integration with custom OpenAI API providers. I decided to use vLLM because I wanted to run Qwen 3.5: according to my quick research it is pretty good at this kind of agentic usage, and Qwen models are built with SGLang and vLLM integration in mind. AMD also tests its own drivers and ROCm libraries on docker images with SGLang and vLLM, and the Framework Desktop is built around an AMD APU. So all of it seemed like a good idea once I decided to go with Strix Halo's unified memory architecture for my AI assistant. I bought a Framework Desktop motherboard and played with it a little to test model performance on ROCm inside docker. It was not blazing fast, but fast enough to actually have a working solution.
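
For the curious, here is roughly how the pieces talk to each other. This is a minimal sketch, not my exact setup: it assumes a vLLM server already running inside the ROCm container on vLLM's default port 8000, and the model name below is just a placeholder for whatever Qwen build you actually serve.

    # Minimal smoke test against a local vLLM server (OpenAI-compatible API).
    # Assumes the server is already up, e.g. started inside a ROCm docker
    # container, and listening on vLLM's default port 8000.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8000/v1",  # vLLM exposes an OpenAI-style API here
        api_key="not-needed-locally",         # vLLM does not require a real key by default
    )

    response = client.chat.completions.create(
        model="Qwen/your-qwen-model",  # placeholder: use the model id you served
        messages=[{"role": "user", "content": "Say hello from my APU."}],
    )
    print(response.choices[0].message.content)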

After a while, once I had played with this device and learned about its capabilities, I was able to recognise my own mistakes and correct them, so I could finally do what I had intended from the beginning: run Qwen 3.5 as my agentic model. I have already integrated some of my own smart home devices into it and taught my assistant to browse the web and recognise my voice commands.
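
To give an idea of what "agentic usage" means here: the assistant gets a list of tool definitions and decides when to call one. Below is a minimal, hypothetical sketch of a single smart home tool. The tool name, its parameters, and the dispatch at the end are all made up for illustration, but the tools format is the standard OpenAI chat completions one, which vLLM's server understands.

    # Hypothetical example of exposing one smart home action as a tool.
    import json
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

    tools = [{
        "type": "function",
        "function": {
            "name": "set_light",                      # made-up tool name
            "description": "Turn a light on or off.",
            "parameters": {
                "type": "object",
                "properties": {
                    "room": {"type": "string"},
                    "on": {"type": "boolean"},
                },
                "required": ["room", "on"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="Qwen/your-qwen-model",  # placeholder model id
        messages=[{"role": "user", "content": "Turn off the kitchen light."}],
        tools=tools,
    )

    # If the model decided to call the tool, dispatch it to your own device code.
    for call in response.choices[0].message.tool_calls or []:
        args = json.loads(call.function.arguments)
        print(f"Would call {call.function.name} with {args}")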

So far it really feels like a great experience. And I already have big plans to make it even better with integrations for my calendar, todo list, notifications and similar.

I will see how things go, but maybe at some point I will add an external GPU to the Framework Desktop PC. For example, a Radeon AI PRO R9700 would be a good addition: it could run some medium-sized models really fast and leave slower reasoning for non-urgent tasks to the APU.

It would probably also help to have better storage for heavy docker images and model caches; my other server, with several TBs of storage, would be a better home for them. But for that, better networking is needed, and sadly, reaching speeds like 10Gb/s requires an external NIC or a PCIe card with a NIC. It would also be easier with a better switch – right now my whole network is running at 1Gb/s.

Last but not least: I integrated my own model with the Rider IDE, and now I can run my own coding assistant 🙂 That is great because I can now even work on some proprietary stuff without compromising security – since everything stays on my own network.
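
A quick way to sanity-check that the coding assistant path never leaves the LAN is to stream a completion from the local endpoint directly. Again a sketch, not the Rider plumbing itself; the endpoint and placeholder model id are the same assumptions as above.

    # Stream a code-generation request from the local server to confirm
    # everything is answered on my own network.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

    stream = client.chat.completions.create(
        model="Qwen/your-qwen-model",  # placeholder model id
        messages=[{"role": "user",
                   "content": "Write a C# extension method that reverses a string."}],
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)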

It is not the best device for running some LLMs, but I still think it was money well spent. If not for actual usability, then for the opportunity to play with some 'AI' stuff and do some hacking.
