Thursday, November 20, 2025

Since the release of functions/tools for LLMs, I have been working in many directions to get open-source models to be decent for consumer use. Meta's Llama was the first that gave consumers some access to tools/functions, but it was lacking, especially since all the frameworks at the time were built for closed-source models, which was pretty frustrating. I jumped around between many other models like DeepSeek, Qwen, OpenHermes, Command, etc. with very generic results, and I had to switch models for specific use cases, draining my 24GB of VRAM. Then OpenAI released GPT-OSS and that was such a breakthrough: it is the best well-rounded open-source model, and it's interesting that not many people talk about how good it is and how much it helps the open-source community.

Anyway, this allowed me to move forward with my work, and it was awesome until I started hitting context walls: the more tools I provided, the harder and more unstable it became, and don't forget the speed degradation.

So, since GPT-OSS was better than everything I had tried before, I went back to framework testing. Until then all my work was custom; I had to code my own agent framework and tools (open source wasn't good enough). I found that GPT-OSS was good enough to use the existing frameworks, reducing my effort, since I can reuse their code and their already-working solutions. After testing almost all of them, I really like LangGraph and the Google ADK.

But I still faced the context problem: if you have too many tools, things get unstable once you run out of context, and it's obvious we are feeding ridiculous amounts of text to LLMs. So with the release of MCP I gave it a try, and I really like the idea of being able to share code across initiatives. For example, if you create an agent with MCP at work and a coworker wants part of your ideas in their agent, pointing them to your MCP server is a really easy way to share. Just think about an enterprise that shares MCP servers for each department: teams building agentic frameworks will speed up integrations across agentic workloads.
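To make the sharing part concrete, here is a minimal sketch of an MCP server using the official Python SDK (pip install mcp); the Docker helpers are hypothetical placeholders, not tools from this post:

```python
# Minimal MCP server sketch using the official Python SDK (pip install mcp).
# The Docker helpers are hypothetical placeholders, not tools from this post.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("docker-helpers")

@mcp.tool()
def list_containers() -> str:
    """List the containers on this host."""
    return "web-1 (running), db-1 (running)"  # placeholder; swap in real Docker calls

@mcp.tool()
def restart_container(name: str) -> str:
    """Restart a container by name."""
    return f"restarted {name}"  # placeholder

if __name__ == "__main__":
    mcp.run()  # a coworker just points their MCP client at this server
```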


Anyway, the problem persisted: if you need 1 function and the MCP server has 20, all of those go into the context, which makes no sense, especially with limited hardware.

So, I gave the concept of Agents as Tools a try, and it works great to reduce the context you are giving back to the coordinator agent. The idea is to point the MCP server to a dedicated agent that handles all the requests to it, so the coordinator agent only sees a summary of that tool. For example, a Docker agent exposes back to the coordinator "hey, I can do anything about Docker" instead of 20 tools.
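Here is a rough sketch of the pattern, assuming any OpenAI-compatible local endpoint (the URL, model name, and empty tool list are placeholders): the full tool set lives in the sub-agent's context, and the coordinator only registers a one-line summary.

```python
# Agents-as-tools sketch. Endpoint and model names are assumptions; any
# OpenAI-compatible local server (llama.cpp, Ollama, etc.) works the same way.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="none")

DOCKER_TOOLS: list = []  # the 20 MCP tool schemas live HERE, hidden from the coordinator

def docker_agent(request: str) -> str:
    """Sub-agent that owns every Docker tool and returns only a final answer."""
    # (the tool-execution loop is omitted for brevity)
    reply = client.chat.completions.create(
        model="gpt-oss",  # assumed local model name
        messages=[{"role": "user", "content": request}],
        tools=DOCKER_TOOLS,
    )
    return reply.choices[0].message.content

# All the coordinator ever sees is this single summarized tool:
COORDINATOR_TOOLS = [{
    "type": "function",
    "function": {
        "name": "docker_agent",
        "description": "Hey, I can do anything about Docker.",
        "parameters": {
            "type": "object",
            "properties": {"request": {"type": "string"}},
            "required": ["request"],
        },
    },
}]
```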

But we still have the scaling issue: if I am too simplistic with the agent-as-tools descriptions to save context tokens, too many of them become unstable, and if I expand the descriptions I'll have to reduce the number of tools.

So, it seems that it's impossible to have an AI that does everything, and if you look around you will see that specialized agents are the way to go; this is even the reality with closed-source models in the cloud.
And if you keep looking around you'll see that everybody is talking about agents of agents, agents that talk to other agents, so I decided to give it a try.

So, I started learning the A2A protocol and seeing how I could benefit from the agent cards, which seem like a great idea. After building multi-agents that talk to each other, I found out that A2A doesn't really search the network for available agents that can do the work; you predefine each A2A connection as a sub-agent. What a bummer. I don't know why, but in my head this would work just like IoT devices that discover compatible devices on the network through the main hub; I thought there was functionality that looked for the correct agent to accomplish the task and built its tools dynamically, instead of predefining them one by one as sub-agents.
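For reference, an agent card is basically a static manifest you publish per agent (served at /.well-known/agent.json). The shape below is from my reading of the spec, so treat the exact fields as approximate:

```python
# Rough shape of an A2A agent card, written as a Python dict for clarity.
# Field names are from my reading of the spec; treat them as approximate.
agent_card = {
    "name": "docker-agent",
    "description": "Handles anything Docker on the internal network.",
    "url": "http://agents.internal:8001",
    "version": "1.0.0",
    "capabilities": {"streaming": False},
    "skills": [{
        "id": "docker-ops",
        "name": "Docker operations",
        "description": "List, inspect, and restart containers.",
    }],
}
```

Every consumer has to be pointed at this card ahead of time; nothing in the protocol goes out and finds it for you.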

I ended up at the same place -_-)'. I still like the concept of agents over the network that can talk to each other; at an enterprise level this is priceless, but it's lacking the main functionality I was looking for.
Meaning, at an enterprise, all workflows will have to be predefined, which means manual work, and that defeats the whole point of AI. Still very useful, but it's not the AI assistant I want.

So, back to square one: I need to build my own solution. I have been thinking about it for a while and trying many ways to generate tools dynamically, but there are even frameworks that won't allow you to do this, which to me is mind-blowing.

Why is there no programmatically generated solution?

So, I started trying to do just that: how can I programmatically generate my solutions with the LLM? Well, this is not new; many frameworks allow code execution, but LLMs are unreliable at generating the code. They are getting better, but remember I am using open source. I am in hard mode here, and even paid solutions are not trusted to do this.

So, how can I steer the LLM to produce good code that works?!

The first thing I did was remove all the garbage context I don't need and all the layers of complexity surrounding the functions I am providing as tools. Instead of that, I am telling the LLM to import the module and use it, and that surprisingly worked very well. I was able to join forces with the LLM and tell it: yo! just use my code, I already tested it and you can find it here.
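A minimal sketch of what that looks like, assuming a hypothetical docker_helpers module holding my tested code and a very naive execute tool (a real setup should sandbox this):

```python
# "Just import my code" sketch. `docker_helpers` is a hypothetical module
# name, and exec() with no sandbox is for illustration only.
SYSTEM_PROMPT = """You write Python. Do NOT reimplement infrastructure code.
My tested helpers live in the module `docker_helpers`.
Import it and call its functions to accomplish the task."""

def execute(code: str) -> str:
    """Run the LLM-generated snippet and capture whatever it prints."""
    import contextlib, io
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {"__name__": "__llm__"})  # the snippet mostly calls my tested code
    return buffer.getvalue()
```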

Great, this is starting to look like the AI assistant I want, but I am still providing extra context: I don't want the LLM to process the whole module, since it might have several functions. So how can I tell the LLM "yo! look at this folder with all my code and reuse what you can to accomplish your task" without affecting the context that much?
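For illustration, one possible direction is to collapse the whole folder into a one-line-per-function index instead of shipping source code. A sketch (the package name is a placeholder):

```python
# Walk a package and build a tiny index: "module.function: first docstring line".
import importlib
import inspect
import pkgutil

def index_package(package_name: str) -> list[str]:
    """Summarize every public function in a package in one line each."""
    package = importlib.import_module(package_name)
    lines = []
    for info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(f"{package_name}.{info.name}")
        for name, fn in inspect.getmembers(module, inspect.isfunction):
            if name.startswith("_"):
                continue
            first_doc_line = (inspect.getdoc(fn) or "no docs").splitlines()[0]
            lines.append(f"{info.name}.{name}: {first_doc_line}")
    return lines

# e.g. index_package("my_tools")
# -> ["docker.restart_container: Restart a container by name.", ...]
```

An index like this is cheap, but it still grows linearly with the number of tools, so on its own it doesn't solve the problem.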

I tried so many solutions without success, and then I saw the streamer Theo (T3) talking about Anthropic's blog post about MCP, which was no good news, as I already knew that. But they mentioned a similar solution to what I was working on, and they casually said:

"Alternatively, a search_tools tool can be added to the server to find relevant definitions."

And I was like YES, where is it? No code or sample :'( But knowing that it's possible encourages me; it means I might be able to pull something off. In my solution, though, I am taking everything away: no MCP server, no more layers of complexity. Just like Theo said on his video, he was right; I have been on this train of thought for a while.

So, I did what Anthropic said and just made a search tool. And what is the best search tool for LLMs? RAG! So I built one and focused on the contextual description of the problem, the "yo! just use my code" part, instead of the code itself, to keep the context as small as possible. I added 28 tools and it worked just fine. I have finally created an agent that works with dynamic tools: the agent is completely unaware of what tools it has access to, and I can add tools without changing a single line of code in the agent itself. The most interesting part is that the system prompt (6 lines) does not have to change either as I add tools to the RAG (pgvector). This is very interesting; it makes me wonder if I even need another agent.
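Here is the shape of that search tool, assuming Postgres with the pgvector extension and a local embedding model (the table name, model, and connection string are my own placeholders, not the exact code from this post):

```python
# RAG tool index sketch. Assumes: CREATE EXTENSION vector; and
# CREATE TABLE tools (name text, description text, embedding vector(384));
import psycopg
from pgvector.psycopg import register_vector
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim vectors
conn = psycopg.connect("dbname=tools")
register_vector(conn)

def add_tool(name: str, description: str) -> None:
    """Index the tool's description (the 'yo! just use my code' part), not its code."""
    conn.execute(
        "INSERT INTO tools (name, description, embedding) VALUES (%s, %s, %s)",
        (name, description, embedder.encode(description)),
    )
    conn.commit()

def search_tools(task: str, k: int = 3) -> list[tuple]:
    """Return the k tools whose descriptions are closest to the task."""
    return conn.execute(
        "SELECT name, description FROM tools"
        " ORDER BY embedding <=> %s LIMIT %s",
        (embedder.encode(task), k),
    ).fetchall()
```

The agent itself then only needs search_tools plus an execute tool like the one sketched earlier; adding a tool is just another row in the table, which is why neither the agent code nor the 6-line system prompt has to change.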

Is an agent with 2 tools (search / execute) all we need?

I'll keep testing this and add more tools to find the breaking point, and see how it behaves on more complex tasks.

Monday, June 30, 2025

ADK Quick Review

I got a new job and had to switch cloud vendors, so part of my learning curve is getting involved with new SDKs like the ADK from Google. I hadn't paid much attention to that one, as I thought it was Gemini-only, but it seems we can run local LLMs with it and take advantage of its features.
So, I decided to run a quick demo to see how fast we can implement a quick agent with tools. In this example I will let the agent talk to the Docker API on the system.
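A sketch of the first attempt, using plain Python functions as tools (ADK wraps them automatically); the LiteLLM wiring and the model name are assumptions for running a local model:

```python
# Quick ADK agent sketch (pip install google-adk docker litellm).
import docker
from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm

def list_containers() -> str:
    """List every container on this host with its status."""
    client = docker.from_env()
    return "\n".join(f"{c.name}: {c.status}"
                     for c in client.containers.list(all=True))

root_agent = Agent(
    name="docker_agent",
    model=LiteLlm(model="ollama_chat/gpt-oss"),  # assumed local model via LiteLLM
    instruction="You manage Docker on this host. Use your tools, not the CLI.",
    tools=[list_containers],
)
```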

Now, let's let the LLM look into the logs, find errors, and provide suggestions:
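That just needs one more function in the agent's tools list; the tail size here is arbitrary:

```python
# Second tool for the same sketch: hand the model recent logs to analyze.
import docker

def get_logs(container_name: str, tail: int = 200) -> str:
    """Return the last `tail` log lines of the named container."""
    client = docker.from_env()
    return client.containers.get(container_name).logs(tail=tail).decode(
        "utf-8", errors="replace")
```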

This was a quick night after work, so I didn't have much time to set it up, but I am quite impressed with the tracing/sessions/states/artifacts. This is quite the framework for working with agents; it does make things much easier at first glance.
Of course, in this sample the LLM is a bit confused and tries to talk me into using the CLI, but that's just because I didn't set a proper prompt; I was testing how fast we can get on track.
I switched to an MCP server, as they are very convenient to move around between agents, and it worked right away.
I'll try to create some better workflows and test the A2A framework another time.

Monday, May 19, 2025

From Overwhelmed to Efficient: Qwen 3 and the Future of Ticket System Management

Qwen3 recently came out, and it seems to be the first thinking model with decent results. I was not too impressed with DeepSeek's results running locally, so I decided to give it a try.

For this model I have decided to mimic a ticket system for Lv1 support and create a simple agent with Qwen3 that will be integrated with the ticket system to manage its tickets through actions.

I am going to try 2 approaches here:

    • Support Engineer uses an agent to work on his queue.
    • Ticket owner works directly with the Agent. 

So before I start playing with Qwen3, I need to create a basic ticket system that allows the basic CRUD operations, and I decided to use Flask & SQLite to quickly prototype it.
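A sketch of the prototype's shape (the routes and columns are my reconstruction, not the exact code from this post):

```python
# Minimal Flask + SQLite ticket CRUD sketch.
import sqlite3
from flask import Flask, jsonify, request

app = Flask(__name__)

def db():
    conn = sqlite3.connect("tickets.db")
    conn.row_factory = sqlite3.Row
    return conn

with db() as conn:
    conn.execute("""CREATE TABLE IF NOT EXISTS tickets (
        id INTEGER PRIMARY KEY, title TEXT, status TEXT DEFAULT 'open')""")

@app.post("/tickets")
def create_ticket():
    with db() as conn:
        cur = conn.execute("INSERT INTO tickets (title) VALUES (?)",
                           (request.json["title"],))
    return jsonify(id=cur.lastrowid), 201

@app.get("/tickets/<int:ticket_id>")
def read_ticket(ticket_id):
    row = db().execute("SELECT * FROM tickets WHERE id = ?",
                       (ticket_id,)).fetchone()
    return (jsonify(dict(row)), 200) if row else ("not found", 404)

@app.put("/tickets/<int:ticket_id>")
def update_ticket(ticket_id):
    with db() as conn:
        conn.execute("UPDATE tickets SET status = ? WHERE id = ?",
                     (request.json["status"], ticket_id))
    return jsonify(ok=True)

@app.delete("/tickets/<int:ticket_id>")
def delete_ticket(ticket_id):
    with db() as conn:
        conn.execute("DELETE FROM tickets WHERE id = ?", (ticket_id,))
    return jsonify(ok=True)
```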

And this is how it looks: basic, but good enough:

Now that we have the ticket system ready, I can start integrating the LLM to perform actions.
A visual representation of this would look something like this:

This is a very basic implementation, but the goal here is to see how good the open-source models that run locally have become.

After drafting some code, I got a basic agent to handle the basic CRUD (Create/Read/Update/Delete) operations on the ticket system.
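The core of the agent is just tool calling against the Flask API. A sketch, assuming Qwen3 behind an OpenAI-compatible local server (the endpoint, model name, and the single tool shown are placeholders; the real agent covers the full CRUD set):

```python
# Ticket agent sketch: Qwen3 + tool calling against the Flask API above.
import json
import requests
from openai import OpenAI

llm = OpenAI(base_url="http://localhost:11434/v1", api_key="none")  # assumed endpoint
TICKETS = "http://localhost:5000/tickets"

TOOLS = [{
    "type": "function",
    "function": {
        "name": "update_ticket",
        "description": "Set the status of a ticket.",
        "parameters": {
            "type": "object",
            "properties": {
                "ticket_id": {"type": "integer"},
                "status": {"type": "string",
                           "enum": ["open", "in_progress", "closed"]},
            },
            "required": ["ticket_id", "status"],
        },
    },
}]

def run(user_message: str) -> str:
    """One agent turn: let the model answer and/or act on the ticket system."""
    reply = llm.chat.completions.create(
        model="qwen3",  # assumed local model name
        messages=[{"role": "user", "content": user_message}],
        tools=TOOLS,
    ).choices[0].message
    for call in reply.tool_calls or []:
        args = json.loads(call.function.arguments)
        requests.put(f"{TICKETS}/{args['ticket_id']}",
                     json={"status": args["status"]})
    return reply.content or "done"
```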

Here is a quick test:

It was very interesting to see the LLM being able to answer the questions and interact with the ticket system from a basic agent.

This shows that there could be use cases for engineers who work with a ticket system: they can have an agent working with them on the tickets they have assigned to speed up the resolution process.

Now, if you look at the industry for AI, it's quite focused on chatbots like the previous example, but from my personal view, there is more power in removing the user interaction and integrating the AI agent directly into the pipeline, something like this:

In this way the ticket creator will interact directly with the Agent.

So after a few code changes I have integrated the LLM into the ticketing system.

Now the ticket user interacts directly with the agent, and the agent helps with the current issue and any new issues that show up during the troubleshooting process, as a regular ticket interaction.
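Wiring that in is one more route on the prototype: the agent answers every new ticket automatically, with no chat UI in between (this reuses app/db from the CRUD sketch and run() from the agent sketch, so it is equally hypothetical):

```python
# Pipeline integration sketch: the agent replies the moment a ticket is created.
@app.post("/tickets-with-agent")
def create_ticket_with_agent():
    with db() as conn:
        cur = conn.execute("INSERT INTO tickets (title) VALUES (?)",
                           (request.json["title"],))
        ticket_id = cur.lastrowid
    # no human in the loop: the agent's reply lands as the first response
    agent_reply = run(f"Ticket {ticket_id}: {request.json['title']}. "
                      "Help the user troubleshoot or update the ticket status.")
    return jsonify(id=ticket_id, agent_reply=agent_reply), 201
```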

For a demo this is pretty nice, but currently all the answers from the agent come from its own knowledge. So in a real corporate/enterprise scenario we can't use this; we would need the LLM to know about proprietary data, like company processes, policies, or an internal knowledge base.

So I guess the next step for me is to find something similar to that on the public internet and integrate it, but maybe for part 2.

I might need to stop here and change the LLM, as Anthropic recently published a paper casting doubt on thinking models:

 Reasoning models don't always say what they think 

So if the reasoning is not accurate, we might be spending time/tokens/money for no reason.