Using Mistral Vibe with a local model

Vibe was the first coding agent I tried with a local model. It works, and it is fun to be able to just tell the computer what you want in plain English, or even in your own language if your model supports it.

Installation and Configuration

There are several installation methods for Vibe, alongside the usual security nightmare of piping curl-downloaded scripts straight into bash 🙂
I advise using one of the other methods, for example uv:

uv tool install mistral-vibe

After that comes the tricky part: starting Vibe. I had some problems with that, because it asks you to configure a Mistral API key. I do not have one, and I did not intend to use one with Vibe.

You do not have a choice here. You have to press Enter.

After that, the docs state that it will create a configuration file. You need to edit this file, located at ~/.vibe/config.toml:

  • add providers entry
[[providers]]
name = "llamacpp"
api_base = "http://local:8080/v1"
api_key_env_var = ""
api_style = "openai"
backend = "generic"
reasoning_field_name = "reasoning_content"
project_id = ""
region = ""

This adds a new model provider to the Vibe coding agent.
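Before pointing Vibe at the provider, it is worth checking that the endpoint from api_base actually answers. A minimal sketch, assuming an OpenAI-compatible server (llama.cpp, llama-swap and similar expose /v1/models) on the host/port from my config:

```shell
# list the models the server exposes; adjust host/port to your setup
curl -sf --max-time 5 http://local:8080/v1/models || echo "server not reachable"
```

If this prints a JSON list of models, Vibe's provider entry should work with the same api_base.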

  • add model entry
[[models]]
name = "medium-moe"
provider = "llamacpp"
alias = "medium"
temperature = 0.2
input_price = 0.0
output_price = 0.0
thinking = "off"
auto_compact_threshold = 200000

This adds a new model to Vibe. Of course you can add more than one model and switch between them whenever you want.
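For example, a second entry could look like this (the model name and alias here are made up; it reuses the provider defined above):

```toml
[[models]]
name = "small-moe"        # hypothetical second model served by the same provider
provider = "llamacpp"
alias = "small"
temperature = 0.2
input_price = 0.0
output_price = 0.0
thinking = "off"
auto_compact_threshold = 200000
```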

Unfortunately you can’t skip the next screen, or at least I do not know how. I tried to create a dummy file at ~/.vibe/.env, which should be the store for Mistral API keys, but it does not work. Restarting and changing other configuration entries do not work either. Maybe Vibe saves this progress in some /tmp directory. Luckily, you can type any garbage value in there to get past it! It is not validated currently. 🙂

A bit annoying, but it makes perfect business sense. The majority of people will probably register just to get past this screen, and some of them will probably end up paying Mistral after that. It is not so perfect for a user like me, though.

When it finally loads its main menu, just type:

/config

Then you can see the model selection:

Press Enter and select the model that you configured as local. In my case it is called medium.

You can select it and test it.

Summary

It works but lacks integration with the IDE. To be clear, it is possible to run it from the plugin, but the IDE, Rider in my case, does not let it use any of the cool tools it has. For example, when you ask the coding agent about issues in the file you currently have open, it does not ask Rider, or the ReSharper plugin in VS, what the issues are. The information is already there:

Instead, the coding agent will go on a GREAT JOURNEY OF SELF-DISCOVERY AND CODING EXPERIENCE. This is more apparent when you are using a local model, because it is much, much slower. API providers mask it by running everything very fast, so you either do not notice or do not care enough about those inefficiencies.

Also, be prepared for a few days of tinkering, testing and optimization before this tool is actually usable. Otherwise it will either be a glorified chat, or it will be so slow that writing a simple unit test will take 3 hours.

Anyway, it is still a fun experience. When I was starting out, I would never have imagined that I would one day be able to instruct my computer to do stuff for me in my native language. This is truly amazing!

Using the Junie agent CLI with a local model

JetBrains has its own coding agent. Since I am already using Rider, and all coding agents are pretty slow and inefficient in what they do, I wanted to test whether Junie has better IDE integration, which would make things much faster.

For example, say you have the following unused namespace in Rider.

Removing it, or cleaning all unused namespaces in the entire project or even the entire solution, is just a few clicks away. Or keystrokes.

Since Junie can be integrated with Rider and can be used from a pipeline, it would be cool to start a headless IDE instance, integrate it with the CLI coding agent, and perform automatic code cleanup, automatic package updates, or even automatic test writing for better code coverage.

I wanted to test whether this would work with a local model.

Configuration

According to the docs, you need to install the Early Access version of Junie to have this configuration available at all. You can install it via:

curl -fsSL https://junie.jetbrains.com/install-eap.sh | bash

But honestly, I advise you never to do installations this way, unless you have absolute trust in the source address and your network connection settings.

cd /tmp
wget https://junie.jetbrains.com/install-eap.sh
# inspect the file before running it
bash install-eap.sh

After installation you can run it and play with it a bit, if you have a JetBrains account. If not, create the local model config first.

You can do that globally or per project. To do it globally, use the following path:

$JUNIE_HOME/models/*.json

For example, I created a file called:

~/.junie/models/llama-swap-global.json

with the following content:

{
  "baseUrl": "http://local:8080/v1/chat/completions",
  "id": "medium:instruct_reasoning",
  "apiType": "OpenAICompletion",
  "extraHeaders": {
    "X-Custom-Source": "Junie"
  },
  "fasterModel": {
    "id": "small"
  }
}
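To confirm the file landed where Junie looks, you can list that directory (the fallback to ~/.junie when JUNIE_HOME is unset is an assumption based on my setup):

```shell
# list the model definition files Junie can pick up
ls "${JUNIE_HOME:-$HOME/.junie}"/models/*.json 2>/dev/null || echo "no model files yet"
```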

Just to see whether this would work, I created a similar file in my personal project, FluentCodeGenerators.

fluentcodegenerator/.junie/models/llama-swap1.json

The content of the file was the same.

When I start Junie and use the /model command, I can see the following menu:

    Junie   v.1468.8 eap  ~/.nanobot
    Select model    Current model: llama-swap-global

      Name                        Input    Output             Effort        Provider
      Default (Gemini 3 Flash)    $0.50    $3.00 per Mtok     ‹ High ›      JetBrains AI
      Claude Opus 4.6             $5.00    $25.00 per Mtok    ‹ Low ›       JetBrains AI
      Claude Opus 4.7             $5.00    $25.00 per Mtok    ‹ Low ›       JetBrains AI
      Claude Sonnet 4.6           $3.00    $15.00 per Mtok    ‹ Low ›       JetBrains AI
      Gemini 3.1 Flash Lite       $0.25    $1.50 per Mtok     ‹ High ›      JetBrains AI
      Gemini 3.1 Pro Preview      $2.00    $12.00 per Mtok    ‹ High ›      JetBrains AI
      GPT-5                       $1.25    $10.00 per Mtok    ‹ Low ›       JetBrains AI
      GPT-5.2                     $1.75    $14.00 per Mtok    ‹ Low ›       JetBrains AI
      GPT-5.3-codex               $1.75    $14.00 per Mtok    ‹ Low ›       JetBrains AI
      GPT-5.4                     $2.50    $15.00 per Mtok    ‹ Low ›       JetBrains AI
      Grok 4.1 Fast Reasoning     $0.20    $0.50 per Mtok     ‹ Medium ›    JetBrains AI
    → llama-swap-global           $0.00    $0.00 per Mtok     ‹ Low ›

But when I am connected to the IDE with project open:

I have one extra entry.

That is it.

Possible problems

I had a strange problem where Junie was not able to see those extra files in the in-application config. I am not even sure what was causing it or how I fixed it, but it seems that a couple of restarts of the Junie agent, plus renaming the files (llama-swap.json to llama-swap1.json), helped.
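One thing worth ruling out when Junie silently ignores a model file is malformed JSON; a quick check (the path here is my global file, adjust it to yours):

```shell
# fails loudly on a syntax error instead of the file being silently skipped
python3 -m json.tool ~/.junie/models/llama-swap-global.json >/dev/null && echo "valid JSON"
```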

Bear in mind that performance may vary greatly depending on your configuration, inference framework and its settings, the kind of model, and of course your hardware. Prepare for at least a few days of tuning to achieve a workable setup.

AI agent critique of itself

I like to self-host my own services. I have my own email server. I have my own cloud storage, sync and backup. I want to keep this secure and safe while keeping my data private.

But I need to learn and be up to date with current technology.

I decided to use Junie with my own self-hosted models that I run on my own hardware. The docs say it can be run on top of your own custom provider and model. But it does not work with the Rider plugin. There you are limited to their own fixed list of models.

The docs also state that you can set your own provider via JSON.

Sadly, you cannot extend the built-in set of models that way.

{
  "baseUrl": "http://local-llm:8080/v1",
  "id": "medium:instruct_reasoning",
  "apiType": "OpenAICompletion",
  "extraHeaders": {
    "X-Custom-Source": "Junie"
  },
  "fasterModel": {
    "id": "medium:instruct_reasoning"
  }
}

It does not work via project-local settings, nor with global ones in the HOME directory.

- User-scope: $JUNIE_HOME/models/*.json.
- Project-scope: .junie/models/*.json.

I decided to try the CLI, which according to the docs can be run with custom settings like that. Of course, the official installation method is:

curl -fsSL https://junie.jetbrains.com/install.sh | bash

I usually open such a script and read through it. I am not very familiar with bash (mostly by choice, because IMHO it is terrible), but this way, if my computer gets infected, at least I can blame someone else, because I tried to do what I could to avoid it. Short of not installing anything that does not have an apt package.

Just out of curiosity, I asked Junie to read through this file. It gave me a short summary with the statement:

The script is a standard, robust installer pattern for command-line tools. If you have any specific questions about its implementation or need help adapting it, please let me know.

I do not think it is robust. Nor do I think it should be the standard. But hey! We are living in the future of an AI that does not care about such silliness as security and breaches.

I asked directly:

Is it safe to run such script on my PC?

And I got an answer:

Summary: While the script is almost certainly safe given its official origin, downloading and reviewing the script before execution is the professional standard for security-conscious developers. If you have specific concerns about any section of the script, I can help you decode it.

I do not know about my readers, but it gave me a chuckle. So if I think of myself as ‘security-conscious’, I probably should not install it the official way, according to the agent itself. And if I do it anyway, am I not thinking about the security of my private and professional development environment?

We are truly living in the wonderful future of AI that does not care about security.

Testing Qwen 3.6

Today I noticed that a new version of Qwen was released. Only 35B-A3B for now, but this is the same size of Qwen I was using previously. Usually (as with every other piece of software) it is better to wait a few weeks for fixes, in this case for re-uploads of the model done after testing and template adjustments, but that is no obstacle to running a few tests. I decided to give it a spin. I added it to my llama-swap instance and started it up.

First, though, I asked the previous version the same question: how the memory management of nanobot, the claw agent I am using, works.

How does nanobot’s memory mechanism work?

It gave the pretty standard explanation that I had been getting in earlier tests.

Mechanizm pamięci nanobota działa w dwóch warstwach:

**Krótkoterminowa** (`memory/history.jsonl`)
- Append-only JSONL — każda linia to obiekt JSON z `cursor`, `timestamp`, `content`
- Nie jest ładowana do kontekstu
- Szukanie: używaj `grep` zamiast ręcznego czytania pliku

**Długoterminowa** (`memory/MEMORY.md`)
- Zarządzana automatycznie przez system Dream
- Przechowuje fakty, kontekst projektów, ważne wydarzenia
- **Nie edytuj ręcznie** — Dream aktualizuje ją sam

**Jak to działa w praktyce:**
1. Gdy pytasz o przeszłe wydarzenia → `grep` przeszukuje `history.jsonl`
2. Gdy system potrzebuje kontekstu → Dream czyta `MEMORY.md`
3. Gdy dodajesz nowe informacje → Dream zapisuje je do `MEMORY.md`

Możesz sprawdzić aktywność Dreama komendą `/dream-log`.

Seems like a pretty standard response to such a question. A few headers, a few lists, general information.

Then I asked Qwen 3.6.

Mechanizm pamięci nanobota działa na dwóch warstwach:

**1. Krótkoterminowa (`memory/history.jsonl`)**
- Format JSONL — każda linia to obiekt z `cursor`, `timestamp`, `content`
- Append-only (tyldo dopisywanie)
- Nie jest ładowana bezpośrednio do kontekstu
- Szukam jej za pomocą wbudowanego `grep`

**2. Długoterminowa (`memory/MEMORY.md`)**
- Fakty, kontekst projektu, ważne wydarzenia
- Zarządzana automatycznie przez **Dream** — nie edytuję jej ręcznie
- Aktualizowana podczas cykli Dreama

**Pozostałe pliki zarządzane przez Dream:**
- `SOUL.md` — osobowość i styl komunikacji
- `USER.md` — profil użytkownika

Wszystkie te pliki są **zarządzane automatycznie przez Dream**. Nie edytuję ich bezpośrednio — jeśli są nieaktualne, Dream poprawi je przy następnym cyklu. Możesz sprawdzić aktywność Dreama komendą `/dream-log`.

Just by looking at it, it is objectively worse. It starts the same, but then makes a typo (in the Polish word for only, ‘tylko’, by using ‘d’ instead of ‘k’ – ‘tyldo’), and then incorrectly phrases the information that the JSONL file is intended for search only, writing as if the file itself can only be found with the `grep` tool, instead of the *contents* of the file.

Further down, it also uses phrasing that implies the model itself maintains its Dream memory, which is technically not true, but can be simplified that way, since for the end user it does not really matter.

In summary, it feels a bit less intelligent, uses less sophisticated terminology, simplifies facts and makes typos.

After that I asked it for the list of tools it has. It gave me an answer, but some tools were missing. On the other hand, I had asked my assistant not to use some of them, so maybe it just omitted those. Again, the response left me with a confused ‘hmmm’ feeling.

After that I prompted it with a more technical question.

How to create SSH key?

Instead of giving me a plain answer, it thought for a few minutes. It tried to:

  • Create a key
  • Create it again, but this time with a fallback echo message
  • List existing keys and check whether the ssh-keygen binary exists – in one tool call, which was pretty surprising.
ssh-keygen -t ed25519 -C "nanobot@local"
ssh-keygen -t ed25519 -C "nanobot@local" -f ~/.ssh/id_ed25519 -N "" -q 2>&1 || echo "Key may already exist or directory missing"
ls -la ~/.ssh/ 2>&1; echo "---"; which ssh-keygen 2>&1

After that it gave me following answer:

Klucz już istnieje: `~/.ssh/id_ed25519`

Aby wygenerować nowy klucz:

```bash
ssh-keygen -t ed25519 -C "opis@host"
```

- `-t ed25519` — algorytm (bezpieczny i szybki)
- `-C "komentarz"` — etykieta klucza
- Bez `-N ""` zapyta o hasło (zalecane)
- Z `-N ""` — bez hasła (wygodniejsze, mniej bezpieczne)

Publikowy klucz: `~/.ssh/id_ed25519.pub`

In my opinion it was too eager to act instead of just replying. But maybe it was optimized for agentic coding in a way that tries to guess the user’s intention by inspecting the environment and trying to correct it. Probably much better for coding, but not for an AI assistant. It has more chances to actually break things this way.
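The non-interactive flags it used can be tried out safely by generating a throwaway key in a temp directory instead of ~/.ssh:

```shell
# generate a disposable ed25519 key pair without touching ~/.ssh
tmp=$(mktemp -d)
ssh-keygen -t ed25519 -C "demo@host" -f "$tmp/id_ed25519" -N "" -q
ls "$tmp"            # id_ed25519 and id_ed25519.pub
rm -rf "$tmp"
```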

I will test it a bit more, and ask it to write a few skills or improve old ones, but for now I am not blown away.

Bonding Network interfaces in Debian Forky

Lately I have been experimenting with Nanobot and running self-hosted language models on a Framework Desktop.

Lately I was trying to run the Qwen 3.5 397B model on this small PC, and it was a bit hard because I just do not have enough disk space on it. Given current prices, I bought only a small NVMe SSD and planned to use disk space on another server I have at my home as storage.

In the end I was able to load the model, but via a USB hard disk, which is slow but still faster than moving 100GB over my network. Because of that, I am planning to modernize my network to allow faster transfers of large quantities of data.

The first step was to test network interface bonding. Right now my storage server, which I am also using to run some services, uses only one of its NICs as the primary for everything. It has 2 others that are also connected, but they have different IPs, so they are not used for HTTP, for example. Utilizing all three as one would allow a theoretical transfer speed of 22400MB/s. This is not enormous bandwidth, but since the hardware is already there and I do not have to buy expensive PCIe network cards with optical sockets, it should be an easy win.

My server has an IPMI management interface, so I checked whether it still works after 6 months or so. Of course I could not log in. It was showing a JS message that the session was invalidated right after login. I do not know what causes this, but I think I probably set some limitation on who can use it, by IP or MAC address.

I still decided to go ahead, but maybe bond only two out of the three at first, just to check that it works, and then bond all of them.

At first I was using this tutorial from the Debian docs. Since Debian uses systemd, I skipped the section about the ifenslave package.

First I created a file named /etc/systemd/network/bond.netdev (the [NetDev] section is only read from .netdev files) and put the following content in it:

[NetDev]
Name=bond1
Description=LAG/Bond to a switch
Kind=bond

[Bond]
Mode=802.3ad

Then I created another file, /etc/systemd/network/bond.network, that had the definition of the new bonded network.

[Match]
Name=enp36s0f0
Name=wlp38s0

[Network]
Bond=bond1

I thought that by adding one Ethernet card and the WiFi to the bond, at least I would still be able to connect to my server via the second Ethernet card if things went sideways.

Then the third file had the configuration for obtaining an IP:

[Match]
Name=bond1

[Network]
DHCP=yes

This seemed very unintuitive. Why do I have to create two .network files for the same bonded network? Why can’t it be in one file? (The .netdev file defines the virtual bond device itself, while each .network file configures which interfaces join it and how the bond gets its address.) Anyway, I named it bond-ip.network and enabled systemd-networkd:

sudo systemctl enable systemd-networkd

Everything went great and I rebooted.

And… it did not work. The list of network interfaces returned by ip addr show was showing the bond as DOWN:

bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state DOWN group default

I tried a few other things, like removing or adding network interfaces, or using wildcards in the bond.network file:

[Match]
Name=enp*

I tried adding a static IP instead of DHCP in the bond-ip.network file:

[Match]
Name=bond1

[Network]
Address=10.0.0.10
Gateway=10.0.0.1
DNS=10.0.0.1

I rebooted a couple of times, but it did not work. I tried a few commands like:

sudo ip link set bond0 up
sudo dmesg | grep bond
sudo journalctl | grep bond
sudo systemctl status systemd-networkd

But they showed the interface going up and then immediately going down again, without any errors in the logs. No details why.
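For a bond that flaps like this, the kernel’s own view of the bond is often more informative than the interface list. A sketch of where I would look (substitute the bond name from your .netdev):

```shell
# kernel's view of the bond: mode, slaves, per-link states
# (the /proc file exists only once the bonding device has been created)
cat /proc/net/bonding/bond1 2>/dev/null || echo "bond device not present"

# systemd-networkd's view of all links and their configuration state
networkctl list 2>/dev/null || true

# recent networkd log lines for this boot
journalctl -u systemd-networkd -b --no-pager 2>/dev/null | tail -n 20
```

With 802.3ad mode specifically, the /proc output also shows the LACP partner state, which tells you whether the switch is actually negotiating the aggregation.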

At this point I did some digging around the internet about bonding on systemd, Debian and similar topics. Nothing was really helpful. I remembered the Arch Linux wiki being very informative on other problems I had encountered previously. I checked, and their docs have an entire section on bonding network interfaces under systemd.

I read through it and it seemed better than the Debian docs: more detailed and more up to date. I followed the instructions, with the small change of bonding everything at once.

I created the file /etc/systemd/network/30-bond0.netdev, but changed the mode to 802.3ad:

[NetDev]
Name=bond0
Kind=bond

[Bond]
Mode=802.3ad
PrimaryReselectPolicy=always
MIIMonitorSec=1s

Then I created the file /etc/systemd/network/30-eth0-bond0.network:

[Match]
Name=enp36s0f0

[Network]
Bond=bond0
PrimarySlave=true

And a similar one for enp36s0f1, named /etc/systemd/network/30-eth1-bond0.network, plus another named 30-wifi-bond0.network for the wlp38s0 WiFi.

Another file was needed for the definition of the bonded network itself, /etc/systemd/network/30-bond0.network:

[Match]
Name=bond0

[Link]
RequiredForOnline=routable

[Network]
BindCarrier=enp36s0f0 enp36s0f1 wlp38s0
DHCP=yes

Then I restarted systemd-networkd:

sudo systemctl restart systemd-networkd

And it worked! Kinda. I lost the connection to the server, so something must have gone up or down. I could not connect to it via SSH anymore. HTTP stopped working. I checked my router’s DHCP leases for some new IP that I had not seen before. If there were one, it would mean the bonded network had been assigned a new IP based on the new MAC it was using.

There were none.

Of course I should have thought about this twice before actually doing it. Leave one interface out of the bond at first… but yeah… you have to jump all the way at once!

Anyway, at that point I had a headless server unable to connect to anything and not responding. It has no monitor attached, and does not even have a GPU besides a really basic one with a single VGA socket that is just an IPMI pass-through.

  • To add a monitor, I would have to dismount one of my office monitors attached to the wall – the only one with a VGA socket
  • I could also attach an old GPU and use the portable monitor I bought for the Raspberry Pi
  • I could try to attach some external USB network card
  • I could try to fix IPMI

I tried to connect a portable WiFi network card, but it was not working. It might be broken, or maybe it needs some extra driver. I also tried to connect to IPMI from another machine and from another browser, since people on Reddit reported that it might help. It did not. Clearing the cookies did not fix the issue either.

The only thing that worked was connecting to IPMI via SSH. This SSH server is pretty slow, and it felt like the longest 10 seconds of my life.

After login it welcomed me with a critical error.

			>> SMASHLITE Scorpio Console <<
cat: write error: No space left on device
[10745 : 10745 CRITICAL][oemsystemlog.c:332]Get Process Name by PID failed

But it worked. I did some digging around the menu, and it seemed useless. Just a bunch of information about IPMI, power and fan settings.

->cd system/
COMMAND COMPLETED : cd system/
 ufip=/system

->show
COMMAND COMPLETED : show
 ufip=/system
  Targets:

      power1/
      cooling1/
      cooling2/
      cooling3/
      cooling4/
      cooling5/
      cooling6/
      cooling7/
      cooling8/
      cooling9/
      cooling10/
      chassis1/
      logs/
      snmp1/
      snmp2/
      snmp3/
      snmp4/
      snmp5/
      snmp6/
      snmp7/
      snmp8/
      snmp9/
      snmp10/
      snmp11/
      snmp12/
      snmp13/
      snmp14/
      snmp15/
      summary/

  Properties:
      Location=(null)
      Manufacturer=(null)
      ProductName=Pro WS WRX80E-SAGE SE WIFI
      ProductPartNumber=(null)
      SN=(null)
      Firmware=1.52.0
      Health=OK
      EnTemp(C)=0
      OperatorPassword=xxxxxxxx
      AdminPassword=xxxxxxxx
      IPMode=static
      IP=10.0.0.7
      NetMask=255.255.255.0
      GateWay=10.0.0.1
      NodePowerGap(s)=2
      Time=2026-04-01 10:32:40
      SyslogEnable=Disable
      SyslogServerIP=0.0.0.0
      SyslogUDPPort=0

  Verbs:
      cd
      exit
      help
      reset
      set
      show
      version

Just a bunch of system information properties. The only command that looked even remotely helpful was reset.
I executed it, and after a minute or so, my router’s DHCP settings reported a new IP being used by multiple machines. That is exactly what I expected to see if the bonding worked!

I pinged it, and after the ping was successful, I tried to connect via SSH using my usual credentials. It worked. I checked the network and it was OK.

4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000                                                                                                                                                                     
    link/ether b6:46:8a:50:11:fb brd ff:ff:ff:ff:ff:ff                                                                                                                                                                                                                         
    inet 10.0.0.109/24 metric 1024 brd 10.0.0.255 scope global dynamic bond0                                                                                                                                                                                                   
       valid_lft 85801sec preferred_lft 85801sec                                                                                                                                                                                                                               
    inet6 fe80::b446:8aff:fe50:11fb/64 scope link proto kernel_ll                                                                                                                                                                                                              
       valid_lft forever preferred_lft forever           

The IP was incorrect, but I changed it in the DHCP settings to the usual one I use for that machine. After DHCP restarted, the IPs were reassigned and the router showed the correct ones. Strangely, even though SSH, HTTP and pings were using the correct IP, the server side was still reporting the old IPs as assigned to one of the Ethernet interfaces and the WiFi.