Even though I did not know about Andrej Karpathy's tweet from earlier this month, I ended up converging on something very similar.
A couple of weeks ago I built a git-based knowledge base designed to run agent prompts on top of it.
I connected our company's ticketing system, wiki, GitHub, Jenkins, etc., and spent several hours effectively "onboarding" the AI (I used Claude Opus 4.6). I explained where to find company policies, how developers work, how the build system operates, and how different projects relate to each other.
In practice, I treated it like onboarding a new engineer: I fed it a lot of context and had it organize everything into AI-friendly documentation (including an AGENTS.md). I barely wrote anything myself. Mostly I just instructed the AI to write and update the files, while I guided the overall structure and refactored as needed.
The result was a git-based knowledge base that agents could operate on directly. Since the agent had access to multiple parts of the company, I could give high-level prompts like: investigate this bug (with not much context), produce a root cause analysis, open a ticket, fix it, and verify a build on Jenkins. I did not even need to have the repos locally. The AI would figure it out, clone them, analyze them, create branches following our company policy, etc.
For me, this ended up working as a multi-project coordination layer across the company, and it worked much better than I expected.
It wasn't all smooth, though. When the AI failed at a task, I had to step in, provide more context, and let it update the documentation itself. But through incremental iterations, each failure improved the system, and its capabilities compounded very quickly.
Well, my comment was meant as an example of a setup for actually building something real with reasonable quality. I was responding to that part of the previous comment.
In my experience, the difference is context. Agents without structure produce slop, but with a well-curated knowledge base and iteration, they can be useful. I was just sharing a setup that has been working for me lately.
The problem with your comment is that the word "real" is just there to move the goalposts. There are people building high-quality stuff like this, yes.
Yesterday I built a tiny utility like this, and it works very well:
I remember the personal wiki was a bit of a trend 5 years ago, but it kind of died because it had an unclear purpose for the most part. I kept one but never really referred to any of the notes, and then just went back to paper and a to-do list. I’m sure this is useful for those who kept up the habit.
We have been using it as a sounding board. I think that in its current state it's actually more useful for someone to learn about how to run a business - "what does a CEO vs PM do" and/or learn about the pros/cons of running a bunch of agents at once.
let's talk about real stuff. we built an AI-native CRM backed by HubSpot founder Dharmesh Shah last year before this, had revenue, iterated to focus on context graph infra which looked like the right moat to focus on, did enterprise PoCs, and all of that distilled into this personal project i built on the side to help my own work. it turned out to be the right interface for making context infra usable.
the team is 4 HubSpotters who built HubSpot's largest platforms - search, nav, notifs, permissions, AI.
we are in the process of opening up large pieces of our enterprise context architecture to WUPHF and also shipping the cloud enterprise version of WUPHF (https://nex.ai/new-home).
In my experience this is the same in AWS and Azure. I would love a kill switch that triggers if usage goes above a critical threshold. Five hours of downtime will not kill my app, but a huge cloud bill might.
It's been a year since I last looked at this, but when I did you could get near-realtime cost metrics for AWS Bedrock via CloudWatch (you get input & output token counts and have to generate the actual price yourself).
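To make the "generate the actual price yourself" step concrete, here is a rough sketch of turning token counts into an estimated spend and checking it against a budget. It assumes you have already pulled the input and output token counts from your metrics; the per-1,000-token prices, the budget, and all the names (`TokenPrice`, `estimate_cost`, `should_trip_kill_switch`) are made up for illustration and are not any provider's API or real rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Hypothetical per-1,000-token prices in USD; substitute your model's real rates."""
    input_per_1k: float
    output_per_1k: float


def estimate_cost(input_tokens: int, output_tokens: int, price: TokenPrice) -> float:
    """Convert raw token counts into an estimated spend in USD."""
    input_cost = (input_tokens / 1000) * price.input_per_1k
    output_cost = (output_tokens / 1000) * price.output_per_1k
    return input_cost + output_cost


def should_trip_kill_switch(cost_so_far: float, budget: float) -> bool:
    """Return True once estimated spend reaches or exceeds the budget."""
    return cost_so_far >= budget


if __name__ == "__main__":
    # Placeholder prices and counts, purely for illustration.
    price = TokenPrice(input_per_1k=0.003, output_per_1k=0.015)
    cost = estimate_cost(input_tokens=2_000_000, output_tokens=500_000, price=price)
    print(f"estimated spend: ${cost:.2f}")
    if should_trip_kill_switch(cost, budget=10.0):
        print("over budget: this is where you would disable the workload")
```

What "disable the workload" means is up to you (revoking a key, scaling to zero, flipping a feature flag); the sketch only covers the arithmetic and the threshold check.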
And vote with your wallet. In general, contribute your 2 cents to the good cause. Let your voice be heard, join initiatives that work on solutions. Whatever you do. It does matter.
See how many vile, empathy-void sociopaths are elected across the world because they have the political skills required to make things happen.
What I try in practice is to lead by example, and be a good neighbor/citizen. In my village, 90% of the people would rather bitch about the mayor or public services not cleaning up the park for months or years, when you can simply grab a few plastic bags and clean most of it in a few hours. I'm glad my initiative has pushed more citizens to actively care and take matters into their own hands.
Yet the degeneracy and the risks now play out on the global stage, and I don't think I can do much when people with the wrong incentives are elected to office, if there are elections at all.
I've worked as a software engineer with different types of engineers (electrical, mechanical and automation).
Their testing is often stricter, but that is a natural consequence of their products being significantly harder to fix in the field than a software product is.
Other than that, my experience is that our way of working on projects across disciplines is very similar.
Which only reinforces that someone just lit $60M on fire. It's trivial to do this, and there are so many ways people do things that having the AI build something custom for you is better than paying some VC-funded platform to build something for the average.
Every time I hear someone say "I have a team of agents", what I hear is "I'm shipping heaps of AI slop".