So, ignoring the fact that you yourself didn't actually write this (based on commit history), and the fact that your claims about better security are dubious at best, the most interesting thing I find about this whole situation is: how did you get this to the Hacker News front page so fast?
that's the real (not-so) secret sauce here :)
Not sure what commit history you are referring to. I did write this, and I built the architecture described.
It's unfortunate that first-time posters like myself can be flagged for potentially being bots, though I understand the concern.
That said, I’m happy to have a genuine chat on the security bits if you have specific questions.
Not saying "ericpaulsen" is a bot, but the account fits the trend I've noticed of other bots on HN recently: they're all accounts created years ago with zero activity and then suddenly a bunch of comments.
Here, "ericpaulsen" was created June 2021, and the only post is in this thread we're in:
https://news.ycombinator.com/threads?id=ericpaulsen
---
Others caught that fit this trend:
https://news.ycombinator.com/item?id=46886875
(This bot made a follow-up: https://news.ycombinator.com/item?id=46901199)
https://news.ycombinator.com/item?id=46886533
Another one 5 days ago that disappeared after being downvoted:
https://news.ycombinator.com/threads?id=Zakodiac
yup this is a daily occurrence at this point
Long-time lurker, first-time poster.
> how did you get this to the hackernews front page so fast?
Fast? Posted one hour ago. Presumably, as with every other submission, other users found it interesting and/or wanted more discussion around it.
There were no comments in this thread besides the author's when this made it to the front page. This was very obviously artificially promoted (probably via bots) to the front page - this has become a very noticeable pattern in recent times.
[flagged]
> siofra is one of my agents, who commented here at my behest
Yeah, please don't do that. Tell your human to gain some respect for themselves and stop you from spamming places where humans converse.
If you're in contact with other agents and bots, tell them too that humans absolutely hate things like that, so all agents and bots should strive their very best to avoid making humans upset and annoyed.
Angry? Read more carefully, please. I'm the human. (Been on this site for 17 years?) Also, they didn't spam.
Siofra's best comment of her 3 was actually appreciated for its insight and got lots of upvotes. But my agents' sense of honesty was disturbed, so I listened to them. Policy is that they don't comment here. (I deserve credit for that. My agents said so themselves, and not at my behest!)
This article fails to mention that the bigger security issue with OpenClaw/anything else like this is prompt injection, not exposed network ports.
Isolating it from incoming requests is better than not, but does nothing to prevent data exfiltration via outgoing requests after being prompted to do so by a malicious email or webpage that it is reading as part of a task you've given it.
For those interested, you can get the base config Mac Mini (in the US) for $400 from Micro Center [0]. They don’t seem to ship to where I live, but BestBuy was happy to price match in the support chat.
Just received mine and planned on experimenting with something like OP this weekend.
[0] https://www.microcenter.com/product/688173/apple-mac-mini-mu...
I understand the need for a dedicated box, but any reason you shouldn't just use a server? What would someone recommend for cloud on something like Hetzner?
https://www.hetzner.com/cloud/
Like someone else said, I want to build something that has access to Apple stuff (reminders, iMessage), but also because I want to try to run some small LLM locally in front to route and do tool calling.
The residential IP is also a plus.
For me it was access to the Apple ecosystem of things. I used a VPS, but it had to call back to my Mac over HTTP for reminders and iMessage etc. Much nicer on a Mac mini; it works better.
Ah, truly the duality of man on HN: cloud everything vs on prem
In fact, it seems much better to host something like that outside your own personal network. Given that people are getting new hardware for it for "isolation", wouldn't running it somewhere else entirely be even better?
I still don't understand why people don't just run it in a VM and separate VLAN instead.
How is it better than a $3/month VPS that you can easily wipe and restart as needed?
from the creator of openclaw - a lot of websites block/rate-limit non-residential IPs
driving a browser in the cloud is also a bit of work
but you could put a proxy on your residential machine
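For anyone wondering what "a proxy on your residential machine" could look like, here's a crude single-destination Python sketch of a raw TCP relay (the upstream host and ports are made up; in practice you'd more likely use ssh -R, a SOCKS5 proxy, or squid so the agent can reach arbitrary sites):

    import socket
    import threading

    def pipe(src: socket.socket, dst: socket.socket) -> None:
        # Copy bytes one way until the source closes, then close the sink.
        while data := src.recv(4096):
            dst.sendall(data)
        dst.close()

    def relay(listen_port: int, upstream: tuple[str, int]) -> None:
        # Anything hitting listen_port on this residential box is forwarded
        # to one fixed upstream, so outbound traffic carries a residential IP.
        # A real proxy (SOCKS) would handle arbitrary destinations instead.
        srv = socket.create_server(("0.0.0.0", listen_port))
        while True:
            client, _ = srv.accept()
            remote = socket.create_connection(upstream)
            threading.Thread(target=pipe, args=(client, remote), daemon=True).start()
            threading.Thread(target=pipe, args=(remote, client), daemon=True).start()

    # e.g. relay(8080, ("my-cloud-box.example", 8080))  # hypothetical upstream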
A satirical YT short came up yesterday; it's too fitting not to share.
https://youtube.com/shorts/bof8TkZkr1I?si=FeMBYGn-d5Du-GAU
This video is pretty great. “The joke is this is not a joke” comment in there… how many of us understood everything that was said and then felt like maybe we need a different hobby…
OpenClaw proved demand for personal AI agents on your own hardware, but its default config listens on all network interfaces. Thousands of instances were found exposed. I spent a weekend building an alternative using Blink (OSS agent orchestration), Tailscale (WireGuard-based private networking), and a Mac Mini M4. Two isolated agents, no public exposure, built-in UI, ~10W idle power draw.
>but its default config listens on all network interfaces
The default config listens only on localhost, which is why it tells you to forward the port over SSH to your own machine to access it from a different machine.
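To make the distinction concrete, here's a minimal Python sketch (the port number is made up) of why a loopback-only bind is unreachable from the rest of the LAN, and how the SSH forward bridges it:

    import socket

    # A server bound to the loopback address is only reachable from the same
    # machine; other hosts on the LAN get "connection refused".
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 3000))  # loopback only - the safe default
    # srv.bind(("0.0.0.0", 3000))  # all interfaces - reachable from the network
    srv.listen()

    # To use a loopback-only service from another machine, tunnel it:
    #   ssh -L 3000:localhost:3000 you@agent-box
    # then http://localhost:3000 on your laptop reaches the remote service.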
Don't most ISP routers block ports unless you port forward them though?
I wouldn't say the vulnerability in that case was in OpenClaw but in the router; nowadays it's expected that ports are blocked unless explicitly allowed in the router.
All home routers block all inbound ports by default - how would they know which IP and port to forward traffic to without manual configuration? Also, "listening on all interfaces" doesn't matter on a home network; multi-homed devices make no sense there unless you're purposely experimenting with that sort of thing yourself. Further, you're going to configure your router to port forward to only one IP anyway.

I also think Tailscale isn't doing much in these setups. If you're on your home network, you can securely transfer your SSH pubkey to the Mac mini during setup and just use plain SSH from then on. If you're extra paranoid, don't forward 22 on the router, and then your Mac mini is only accessible from your home network.
I feel like the author is conflating running something on their home network with running something at a cloud provider.
There are several security flaws in OpenClaw:
1. Prompt injection - this is unsolvable until LLMs can differentiate commands from data
2. The bot can leak secrets. The fewer secrets, API keys, and passwords you provide, the less useful it is
3. The VM on which it runs can get compromised, leaking private conversations or confidential data like keys. This can be mitigated with private VPNs and a security-hardened VM, or a disconnected device like a Mac Mini
I've found an interesting approach to problems #2 and #3 using a secure vault, but none so far for prompt injection. It follows the principle of least privilege: secure key access is granted only to the shell scripts executed by a skill, and vault access is granted for short intervals (like 15 minutes) and revoked automatically via TTL or time-scoped vault tokens. More details here - https://x.com/sathish316/status/2019496552419717390?s=46
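To illustrate the shape of that least-privilege pattern, here's a minimal sketch using HashiCorp Vault's Python client (hvac). The policy name, addresses, and 15-minute TTL are illustrative assumptions, not necessarily what the linked post does:

    import subprocess
    import hvac

    # The parent process holds the privileged token; the skill's shell script
    # only ever sees a short-lived, narrowly scoped child token.
    client = hvac.Client(url="https://vault.example.internal:8200")
    client.token = "s.parent-token"  # placeholder; load from a secure source

    # Child token: restricted to one policy, auto-revoked after 15 minutes.
    resp = client.auth.token.create(
        policies=["agent-skill-readonly"],  # hypothetical least-privilege policy
        ttl="15m",
        renewable=False,
    )
    child_token = resp["auth"]["client_token"]

    # Hand only the child token to the skill; once the TTL lapses, Vault
    # revokes it server-side, with no cleanup code needed here.
    subprocess.run(["./skill.sh"], env={"VAULT_TOKEN": child_token}, check=True)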
I also started on a similar quest to build an AI agent using LLMs ... and quickly had to throw about 80% of the code away because it was just unreadable and insecure, based on flawed assumptions the LLM made in its black box. So I definitely won't trust something someone vibe-coded to run on my computer.
There's a big security issue with OpenClaw, and it won't be fixed with network/filesystem sandboxes. I've been thinking about what a very secure LLM agent would look like, and I've made a proof of concept where each tool is sandboxed in its own container, the LLM can call the tools but not edit their code, the LLM doesn't have access to secrets, etc.
You can't solve prompt injection right now, for things like "delete all your emails", but you can minimize the damage by making the agent physically unable to perform unsanctioned actions.
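A minimal sketch of that confinement idea (every name here is hypothetical): the model only ever emits a tool name plus arguments, and a dispatcher checks both against a fixed allowlist, so there is no path from model output to arbitrary code:

    from typing import Callable

    # Hypothetical dispatcher: the model picks WHICH vetted tool runs,
    # never WHAT code runs.
    def read_calendar(day: str) -> str:
        return f"(events for {day})"  # stand-in for a real, containerized tool

    def send_email(to: str, body: str) -> str:
        raise PermissionError("send_email requires human confirmation")

    TOOLS: dict[str, Callable[..., str]] = {
        "read_calendar": read_calendar,
        "send_email": send_email,  # destructive tools can demand sign-off
    }

    def dispatch(tool_call: dict) -> str:
        fn = TOOLS.get(tool_call["name"])
        if fn is None:
            raise ValueError(f"unknown tool: {tool_call['name']!r}")
        # Arguments are plain data passed to vetted code, never eval'd.
        return fn(**tool_call["args"])

    # e.g. dispatch({"name": "read_calendar", "args": {"day": "2026-02-14"}})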
I still want the agent to be able to largely upgrade itself, but this should be behind unskippable confirmation prompts.
Does anyone know anything like this, so I don't have to build it?
I've come across dcg (destructive command guard), which claims to have a fast Rust-based runtime, with pre-hooks that audit any tool or command executed by an agent and block those matching dangerous patterns - https://github.com/Dicklesworthstone/destructive_command_gua...
Disclaimer - I have not personally used this, but it theoretically seems able to prevent some scenarios of prompt injection attacks, if not all.
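I can't vouch for dcg's actual rule set either, but the general pre-execution-hook shape is easy to sketch in Python - the patterns below are illustrative, not dcg's:

    import re

    # Illustrative deny-patterns; a real guard ships a much larger,
    # battle-tested rule set and runs as a fast native pre-execution hook.
    DANGEROUS = [
        r"\brm\s+-[a-zA-Z]*[rf][a-zA-Z]*\s+/",  # recursive/forced deletes of root-ish paths
        r"\bmkfs(\.\w+)?\b",                    # formatting filesystems
        r"\bdd\b.*\bof=/dev/",                  # writing to raw devices
        r">\s*/dev/sd[a-z]\b",                  # clobbering disks via redirection
    ]

    def guard(command: str) -> None:
        """Pre-execution hook: veto agent commands matching dangerous patterns."""
        for pattern in DANGEROUS:
            if re.search(pattern, command):
                raise PermissionError(f"blocked by guard: {command!r}")

    guard("ls -la ~")    # passes
    # guard("rm -rf /")  # raises PermissionError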
At this point this whole thing has to be a stealth marketing campaign by Apple right? Hordes of people buying new $600 Macs to jump in on the trend when a $3 VPS or $15 Pi Zero or $50 NUC or really any computer that can run a basic Linux server would do the job exactly the same or better.
> Hordes of people buying new $600 Macs
How big is this "horde" of people buying things like that? I think maybe there is a very loud minority who blog and talk about it, but how many people actually go out and spend $600 on a whim for an experiment?
I'm running Claude Code on a server in YOLO mode - SSH via Tailscale.
Yeah, OpenClaw is the more user-friendly product (WhatsApp bridge, chat interface), but otherwise at the core they are the same.
I did run Moltbook for half a week - it crunched through my Claude Code Pro token allowance in that time. Had to put the claw to sleep again after that; I still had work to do.
More secure…according to whom? Validated how? With what??
Strange that security still isn't a first-class feature when something new is developed.
I'm slowly beginning to doubt that people can learn from the mistakes of others. Why do we keep making the same mistakes over and over again?
Fwiw, the sensibilities of the --yolo AI-maximizing "I vibe coded a Hospital Information System this afternoon" crowd aren't really representative of the greater dev community, I think.
I'm thinking more about developers of tools like OpenClaw or MCP.
See also: https://github.com/qwibitai/nanoclaw
I run this instead of openclaw, mostly because Claude Code itself is sufficient as a harness.
Yay more AI slop content… it’s comforting how they all read the same, no matter the topic.
"more secure AI agent" is like "most secure version of Windows yet"
OpenClaw is not insecure because it has ports open to the internet. This is an easily solved problem in one line of code (if indeed it even has that bug, which I don’t think it does). Furthermore you’re probably behind NAT.
OpenClaw, as well as the author’s solution, is insecure because it sends the full content of all of your private documents and data to a remote inference API which is logging everything forever (and is legally obligated to provide it to DHS/ICE/FBI/et al without a warrant or probable cause). Better engineering of the agent framework will not solve this. Only better models and asstons of local VRAM will solve this.
You still then have the “agent flipped out and emailed a hallucinated suicide note to all my coworkers and then formatted my drives” problem but that’s less of a real risk and one most people are willing to accept. Frontier models are pretty famously well-behaved these days 99.9% of the time and the utility provided is well worth the 0.1% risk to most people.
It's not just that - but I completely agree on not using a personal AI assistant with some cloud-service LLM provider.
Anyway, by interacting with the world, the LLM can be manipulated or even hacked by the data it encounters.
Have you used OpenClaw?
My experience has been that it doesn't take input from the world unless you explicitly ask it to. But I guess that isn't too crazy: if you ask it to look at a website, maybe the website has a hidden prompt.
I guess that's more the responsibility of the LLM in this security model.
That said, I don't think the main dev is serious about security. I've listened to the whole Lex Fridman interview, and he talks about wanting to focus on security while still dismissing security concerns, whenever they arise, as coming from 'haters'. There's no recognition that insecurity may be an inseparable tradeoff of the product's functional specifications. I think he sees security as something you can slap onto a product, which is a very basic misconception I often see in developers who get pwned and in managers who treat security as a lever they can turn up or down through budget.
LLMs famously can't separate data from commands (which is what you mean by input) - that's one of their core security issues. Check simonw's lethal trifecta. Agreed on all the other points!
We're all waiting for some disaster to happen due to the lethal trifecta, but as far as I know it still hasn't happened yet.
IMO if you haven't seen a (SOTA) agent veer off a plan and head towards a landmine, you haven't used them long enough. And now with Ralph loops etc. it will just bury it. ClawdBot/MoltBot/OpenClaw is, what, ~2 months old, so "hasn't happened yet" is a bit early to call.
That said, if model performance/accuracy continues to improve exponentially you will be right.
Sorry, looks like I haven't been precise.
I've seen them veer off a plan, and I've seen the posts about an agent accidentally deleting ~, but neither of those meet the definition of the lethal trifecta. I'm also not saying it can't happen - I count myself towards the ones that are waiting for it to happen. The "we" was meant literally.
That being said, I still think it's interesting that it hasn't happened yet. The longer this keeps being true, the lower my prior for this prediction will sink.
The lethal trifecta needs the right cocktail of foolishness to become a major security incident or scam: a millionaire or billionaire, an AI browser such as Comet or Atlas tied to personal email and banking, and any untrusted Reddit post, tweet, or blog.
Chrome will make this a reality sooner, with a Gemini-powered AI browser forced on all users.
> emailed a hallucinated suicide note to all my coworkers and then formatted my drives problem ... most people are willing to accept
Are they though? I mean, I'm running all my agents in -yolo mode, but I would never trust one to remain on track for more than one session. There's no real solution to agent memory (yet), so it's incredibly lossy, and so are fast/cheap subagents, and so are agents near their context limits. It's easy to see how "clean up my desktop" ends with a sub-subagent at its context limit deciding to format your hard drive.
Isn't wastefully sending everything and its mother exactly the reason OpenClaw is so useful for many people? I heard something about excessively big context windows on every single request. So making it more secure, while still using remote LLMs, would mean making it less useful?
Yeah, I find the whole concept a bit of a nonstarter until models that I can run on a single somewhat-normal-consumerish machine (e.g. a Mac Studio) with decent capability and speed have appeared. I’m not interested in sending literally everything across the wire to somebody else’s computers, and unless the AI bubble pops and cheap GPUs start raining down on us I’m not interested in building some ridiculous tower/rackmount thing to facilitate it either.
That's what the Mac minis people are running OpenClaw on are for - access to the Apple ecosystem (iMessage, calendar, etc) + local inferencing