433 points by adam_gyroscope 32 days ago | 15 comments
ryanMVP 31 days ago
Reading the whitepaper, the inference provider still has the ability to access the prompt and response plaintext. This scheme does seem to guarantee that plaintext cannot be read for all other parties (e.g. the API router), and that the client's identity is hidden and cannot be associated with their request. Perhaps the precise privacy guarantees and allowances should be summarized in the readme.

With that in mind, does this scheme offer any advantage over the much simpler setup of a user sending an inference request:

- directly to an inference provider (no API router middleman)

- that accepts anonymous crypto payments (I believe such things exist)

- using a VPN to mask their IP?

macrael 30 days ago
Howdy, head of Eng at confident.security here, so excited to see this out there.

I'm not sure I understand what you mean by inference provider here? Once decrypted, the inference workload is not shipped off the compute node to e.g. OpenAI; it runs directly on the compute machine, on open source models loaded there. Those machines cryptographically attest to the software they are running, proving, ultimately, that there is no software logging sensitive info off the machine, and that the machine is locked down, with no SSH access.

This is how Apple's PCC does it as well: clients of the system will not even send requests to compute nodes that aren't making these promises, and you can audit the code running on those compute machines to check that they aren't doing anything nefarious.

The privacy guarantee we are making here is that no one, not even people operating the inference hardware, can see your prompts.
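In rough Go-flavored pseudocode, the client-side rule boils down to something like this (illustrative names only, not our actual API):

    package main

    import (
        "errors"
        "fmt"
    )

    // Illustrative types only; not the real OpenPCC API.
    type Attestation struct {
        SignatureValid   bool   // quote chains to a hardware vendor root
        MeasurementOnLog bool   // booted-software hash is on the transparency log
        NodePublicKey    []byte // key the prompt would be encrypted to
    }

    // submit models the client-side rule: no valid attestation, no prompt.
    func submit(att Attestation, prompt []byte) error {
        if !att.SignatureValid || !att.MeasurementOnLog {
            return errors.New("refusing node: it cannot prove what it is running")
        }
        // Only now would the prompt be encrypted to att.NodePublicKey and sent.
        fmt.Printf("ok: would send %d encrypted bytes\n", len(prompt))
        return nil
    }

    func main() {
        fmt.Println(submit(Attestation{SignatureValid: true, MeasurementOnLog: true}, []byte("hi")))
        fmt.Println(submit(Attestation{SignatureValid: true, MeasurementOnLog: false}, []byte("hi")))
    }

If either check fails, the prompt never leaves the client.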

bjackman 30 days ago
> no one, not even people operating the inference hardware

You need to be careful with these claims IMO. I am not involved directly in CoCo so my understanding lacks nuance but after https://tee.fail I came to understand that basically there's no HW that actually considers physical attacks in scope for their threat model?

The Ars Technica coverage of that publication has some pretty yikes contrasts between quotes from people making claims like yours, and the actual reality of the hardware features.

https://arstechnica.com/security/2025/10/new-physical-attack...

My current understanding of the guarantees here is:

- even if you completely pwn the inference operator, steal all root keys etc, you can't steal their customers' data as a remote attacker

- as a small cabal of arbitrarily privileged employees of the operator, you can't steal the customers' data without a very high risk of getting caught

- BUT, if the operator systematically conspires to steal the customers' data, they can. If the state wants the data and is willing to spend money on getting it, it's theirs.

macrael 30 days ago
I'm happy to be careful, you are right we are relying on TEEs and vTPMs as roots of trust here and TEEs have been compromised by attackers with physical access.

This is actually part of why we think it's so important to have the non-targetability part of the security stack as well, so that even if someone were to physically compromise some machines at a cloud provider, there would be no way for them to reliably route a target's requests to those machines.
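To put rough numbers on it: if an attacker has physically compromised k out of N otherwise-identical nodes and routing is blind and uniform, any given request only lands on a compromised node with probability k/N, and there is no knob that lets them steer a specific target's traffic toward those nodes without also subverting the routing layer, which is a separate and much noisier attack.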

michaelt 30 days ago
> I came to understand that basically there's no HW that actually considers physical attacks in scope for their threat model?

Xbox, PlayStation, and some smartphone activation locks.

Of course, you may note those products have certain things in common...

bjackman 29 days ago
Yeah, that's a good point. I don't call that confidential compute though; it's a different use case.

CoCo = protecting consumer data from the industry. DRM = protecting industry bullshit from the consumer.

TBF my understanding is that in the DRM use cases they achieve actual security by squeezing the TCB into a single die. And I think if anyone tries, they generally still get pwned by physical attackers even though it's supposedly in scope for the threat model.

bragr 30 days ago
All things that were compromised with physical attacks? What are mod chips if not physical attack as a service?
jon-wood 30 days ago
I'm not aware of working jailbreaks for either Xbox Series or PS5. It's possible that's just a matter of time, but they've both been out for quite a while now, so it seems like the console manufacturers have finally worked out how to secure them.
einsteinx2 28 days ago
Older firmware versions of PS5 are in fact jailbroken (google ps5 jailbreak and you’ll find a bunch of info). I’m not aware of any for Xbox Series but I think that’s more due to lack of interest and the fact that you can run homebrew in development mode already.
zeusk 30 days ago
Nvidia has been investing in confidential compute for inference workloads in the cloud - that covers physical ownership/attacks in their threat model.

https://www.nvidia.com/en-us/data-center/solutions/confident...

https://developer.nvidia.com/blog/protecting-sensitive-data-...

bjackman 29 days ago
It's likely I'm mistaken about details here but I _think_ tee.fail bypassed this technology and the AT article covers exactly that.
jiveturkey 30 days ago
> The privacy guarantee we are making here is that no one, not even people operating the inference hardware, can see your prompts.

that cannot be met, period. Your assumptions around physical protections are invalid or at least incorrect. It works for Apple (well enough) because of the high trust we place in their own physical controls, and their market incentive to protect that at all costs.

> This is how Apple's PCC does it as well [...] and you can audit the code running on those compute machines to check that they aren't doing anything nefarious.

This is just based on my recollection, and I'm not going to take a new look at it to validate what I'm saying here, but with PCC, no, you can't actually do that. With PCC you do get an attestation, but there isn't actually a "confidential compute" aspect where an attestation that you can independently trust proves that is what is running. You have to trust Apple at that lowest layer of the "attestation trust chain".

I feel like with your bold misunderstandings you are really believing your own hype. Apple can do that, sure, but a new challenger cannot. And I mean your web page doesn't even have an "about us" section.

dcliu 30 days ago
That's a strong claim for not looking into it at all.

From a brief glance at the white paper it looks like they are using a TEE, which would mean that the root of trust is the hardware chip vendor (e.g. Intel). Then, it is possible for confidentiality guarantees to work if you can trust the vendor of the software that is running. That's the whole purpose of a TEE.

jiveturkey 30 days ago
I guess you're unaware that Intel's TEE does not provide physical protection. Literally out of scope, at least per the runZero CEO (which I didn't verify). But anyway, in scope or not, it doesn't succeed at it.

And I mean I get it. As a not-hardware-manufacturer, they have to have a root of trust they build upon. I gather that no one undertakes something like this without very, very, very high competence and that their part of the stack _is_ secure. But it's built on sand.

I mean it's fine. Everything around us is built that way. Who among us uses a Raptor Talos II and has x-ray'd the PCB? The difference is they are making an overly strong claim.

9dev 30 days ago
It doesn’t matter either way. Intel is an American company as well, and thus unsuitable as a trust root.
bangaladore 30 days ago
A company of what country would you prefer?

Everyone likes to dunk on the US, but I doubt you could provide a single example of a country that is certainly a better alternative (to be clear, I believe much of the West is in the same boat).

9dev 30 days ago
A European one. Pulling the kind of tricks the NSA does is considerably harder if you don’t have a secret court with secret orders.
bangaladore 24 days ago
You might want to look into what GCHQ, DGSE, and BND (as examples) actually do. Europe is not some surveillance-free zone.
jiveturkey 30 days ago
> Intel is an American company

Literally.

brookst 30 days ago
If you’re moving the goalposts from tech implementation to political vibes, it’s just more post-fact nabobism.
9dev 30 days ago
"SSL added and removed here :-)"

It’s not about vibes, but clear proof of a strategy to undermine global information security. Is anyone supposed to believe they don’t do that anymore?

macrael 30 days ago
Apple actually attests to signatures of every single binary they install on their machines, before soft booting into a mode where no further executables can be installed: https://security.apple.com/documentation/private-cloud-compu...

We don't _quite_ have the funding to build out our own custom OS to match that level of attestation, so we settled for attesting to a hash of every file on the booted VM instead.
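For intuition, that measurement is conceptually just a deterministic digest over everything on the booted filesystem, along the lines of the sketch below (the real pipeline is dm-verity plus our build tooling, not this code):

    package main

    import (
        "crypto/sha256"
        "fmt"
        "io"
        "io/fs"
        "os"
        "path/filepath"
        "sort"
    )

    // measureTree builds one digest over every regular file under root:
    // hash each file, then hash the sorted (path, hash) pairs, so any
    // change to any file changes the final measurement.
    func measureTree(root string) (string, error) {
        entries := map[string][]byte{}
        err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
            if err != nil || !d.Type().IsRegular() {
                return err
            }
            f, err := os.Open(path)
            if err != nil {
                return err
            }
            defer f.Close()
            h := sha256.New()
            if _, err := io.Copy(h, f); err != nil {
                return err
            }
            entries[path] = h.Sum(nil)
            return nil
        })
        if err != nil {
            return "", err
        }
        paths := make([]string, 0, len(entries))
        for p := range entries {
            paths = append(paths, p)
        }
        sort.Strings(paths)
        agg := sha256.New()
        for _, p := range paths {
            fmt.Fprintf(agg, "%s %x\n", p, entries[p])
        }
        return fmt.Sprintf("%x", agg.Sum(nil)), nil
    }

    func main() {
        root := "."
        if len(os.Args) > 1 {
            root = os.Args[1]
        }
        m, err := measureTree(root)
        if err != nil {
            fmt.Println("error:", err)
            return
        }
        fmt.Println("measurement:", m)
    }

dm-verity then enforces at runtime that what's actually read off disk matches that measurement, which is what gets published on the transparency log.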

jiveturkey 30 days ago
> Apple actually attests to signatures

But (based on light reading, forgive errors) the only way to attest them is to ask _Apple_! It reminds me of what I call e2e2e encryption. iMessage is secure e2e, but you have to trust that Apple is sending you the correct keys. (There's some recent update, maybe 1-2 years old, where you can verify the other party's keys in person I think? But it's closed software, so you _still_ have to trust that what you're being shown isn't a coordinated deception.)

Apple claims to operate the infrastructure securely, and while I believe they would never destroy their business by not operating as rigorously as they claim, OTOH they gave all the data to China for Chinese users, so YMMV. And their OS spams me with ads for their services. I absolutely hate that.

Again, anyway, I am comfortable putting my trust in Apple. My data aren't state secrets. But I wouldn't put my trust in a random cloud operator based on your known-invalid claim of physical protection. Not if the whole point is to protect against an untrustworthy operator. I would much sooner trust a Nitro Enclave.

brookst 30 days ago
You should read the PCC paper: https://security.apple.com/blog/private-cloud-compute/

You are not in fact trusting Apple at all. You are trusting some limited number of independent security researchers, which is not perfect, but the system is very carefully designed to give Apple themselves no avenue to exploit without detection.

astrange 30 days ago
> OTOH they gave all the data to China for Chinese users, so YMMV

This is true for the same reason that American data is in the US. China is frequently a normal and competent country and has data privacy laws too.

ryanMVP 30 days ago
Thanks for the reply! By "inference provider" I meant someone operating a ComputeNode. I initially skimmed the paper, but I've now read more closely and see that we're trying to get guarantees that even a malicious operator is unable to e.g. exfiltrate prompt plaintext.

Despite recent news of vulnerabilities, I do think that hardware-root-of-trust will eventually be a great tool for verifiable security.

A couple follow-up questions:

1. For the ComputeNode to be verifiable by the client, does this require that the operator makes all source code running on the machine publicly available?

2. After a client validates a ComputeNode's attestation bundle and sends an encrypted prompt, is the client guaranteed that only the ComputeNode running in its attested state can decrypt the prompt? Section 2.5.5 of the whitepaper mentions expiring old attestation bundles, so I wonder if this is to protect against a malicious operator presenting an attestation bundle that doesn't match what's actually running on the ComputeNode.

macrael 30 days ago
Great questions!

1. The mechanics of the protocol are that a client will check that the software attested to has been released on a transparency log. dm-verity enforces that the hashes of the booted filesystem on the compute node match what was built, so those hashes are what get put on the transparency log, with a link to the deployed image that matches them. The point of the transparency log is that anyone could then go inspect the code related to that release to confirm that it isn't maliciously logging. So if you don't publish the code for your compute nodes, then the fact of it being on the log isn't really useful.

So I think the answer is yes, to be compliant with OpenPCC you would need to publish the code for your compute nodes, though the client can't actually technically check that for you.

2. Absolutely yes. The client encrypts its prompt to a public key specific to a single compute node (well, technically it will encrypt the prompt N times for N specific compute nodes), where the private half of that key is resident only in the vTPM; the machine itself has no access to it. If the machine were swapped or rebooted for another one, it would be impossible for that computer to decrypt the prompt. The fact that the private key is in the vTPM is part of the attestation bundle, so you can't fake it.
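Stripped down to its essence, the per-node encryption step looks roughly like the sketch below (stand-in crypto, not our actual wire format; in the real system the node key is generated inside the vTPM and never leaves it):

    package main

    import (
        "crypto/aes"
        "crypto/cipher"
        "crypto/ecdh"
        "crypto/rand"
        "crypto/sha256"
        "fmt"
    )

    // sealToNode sketches "encrypt the prompt to one specific compute node":
    // an ephemeral X25519 agreement against the node's attested public key,
    // then AES-GCM. (Simplified: a real HPKE construction would use a proper
    // KDF and bind the attestation bundle into the AAD.)
    func sealToNode(nodePub *ecdh.PublicKey, prompt []byte) (ephPub, nonce, ct []byte, err error) {
        eph, err := ecdh.X25519().GenerateKey(rand.Reader)
        if err != nil {
            return nil, nil, nil, err
        }
        shared, err := eph.ECDH(nodePub)
        if err != nil {
            return nil, nil, nil, err
        }
        key := sha256.Sum256(shared) // stand-in for a real KDF
        block, err := aes.NewCipher(key[:])
        if err != nil {
            return nil, nil, nil, err
        }
        gcm, err := cipher.NewGCM(block)
        if err != nil {
            return nil, nil, nil, err
        }
        nonce = make([]byte, gcm.NonceSize())
        if _, err := rand.Read(nonce); err != nil {
            return nil, nil, nil, err
        }
        return eph.PublicKey().Bytes(), nonce, gcm.Seal(nil, nonce, prompt, nil), nil
    }

    func main() {
        // In the real protocol the private half of this key lives only in the
        // node's vTPM; here it's generated locally just to show the flow.
        nodeKey, _ := ecdh.X25519().GenerateKey(rand.Reader)
        ephPub, nonce, ct, err := sealToNode(nodeKey.PublicKey(), []byte("my prompt"))
        if err != nil {
            panic(err)
        }
        fmt.Printf("ephemeral pub %d bytes, nonce %d bytes, ciphertext %d bytes\n",
            len(ephPub), len(nonce), len(ct))
    }

Because only that one attested node holds the matching private key, a swapped or rebooted machine simply can't decrypt the prompt.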

Terretta 31 days ago
> the inference provider still has the ability to access the prompt and response plaintext

Folks may underestimate the difficulty of providing compute that the provider “cannot”* access to reveal even at gunpoint.

BYOK does cover most of it, but oh look, you brought me and my code your key, thanks… Apple's approach, and certain other systems such as AWS's Nitro Enclaves, aim at this last step of the problem:

- https://security.apple.com/documentation/private-cloud-compu...

- https://aws.amazon.com/confidential-computing/

NCC Group verified AWS's approach and found:

1. There is no mechanism for a cloud service provider employee to log in to the underlying host.

2. No administrative API can access customer content on the underlying host.

3. There is no mechanism for a cloud service provider employee to access customer content stored on instance storage and encrypted EBS volumes.

4. There is no mechanism for a cloud service provider employee to access encrypted data transmitted over the network.

5. Access to administrative APIs always requires authentication and authorization.

6. Access to administrative APIs is always logged.

7. Hosts can only run tested and signed software that is deployed by an authenticated and authorized deployment service. No cloud service provider employee can deploy code directly onto hosts.

- https://aws.amazon.com/blogs/compute/aws-nitro-system-gets-i...

Points 1 and 2 are more unusual than 3 - 7.

Folks who enjoy taking things apart to understand them can hack at Apple's here:

https://security.apple.com/blog/pcc-security-research/

* Except by, say, withdrawing the system (see Apple in UK) so users have to use something less secure, observably changing the system, or other transparency trippers.

michaelt 30 days ago
> 3. There is no mechanism for a cloud service provider employee to access customer content stored on instance storage and encrypted EBS volumes.

Are you telling me customer services can't reset a customer's forgotten console login password?

Terretta 18 days ago
In these systems secured to this degree, yes.
jiggawatts 30 days ago
My logic is that these "confidential compute" problems suffer from some of the same issues as "immutable storage in blockchain".

I.e.: If the security/privacy guarantees really are as advertised, then ipso facto someone could store child porn in the system and the provider couldn't detect this.

Then by extension, any truly private system exposes its operator to significant business, legal, and moral risk of being tarred and feathered along with the pedos that used their system.

It's a real issue, and has come up regularly with blockchain based data storage. If you make it "censorship proof", then by definition you can't scrub it of illegal data!

Similarly, if cloud providers allow truly private data hosting, then they're exposing themselves to the risk of hosting data that is being stored with that level of privacy guarantees precisely because it is so very, very illegal.

(Or substitute: Stolen state secrets that will have the government come down on you like a ton of bricks. Stolen intellectual property. Blackmail information on humourless billionaires. Illegal gambling sites. Nuclear weapons designs. So on, and so forth.)

avianlyric 30 days ago
This is hardly a new problem that only appears in the cloud. Any bank that offers a private secure storage facility (i.e. a safety deposit box), or anyone that offers a PO Box service, is also exposed to the same risk.

But both of these services exist, and have existed for hundreds of years, and don't require service providers to go snooping through their customers' possessions or communications.

bangaladore 30 days ago
> I.e.: If the security/privacy guarantees really are as advertised, then ipso facto someone could store child porn in the system and the provider couldn't detect this.

But what they would be storing in this case is not illegal content. Straight up. Encrypted bits without a key are meaningless.

There is nothing stopping a criminal from uploading illegal content to Google Drive as an encrypted blob. There's nothing Google can do about it, and there is no legal repercussion (to my knowledge) for holding such a blob.

saithound 30 days ago
You're simply wrong about this. "I don't know the key" is not a legal defense even against hosting an encrypted blob of copyright-infringing content, much less an encrypted blob of illegal pornography.
bangaladore 24 days ago
If this were the case nobody would ever offer file hosting services (e.g. Google Drive). Do you have any case history to show any company getting prosecuted for unknowingly hosting encrypted blobs of illegal material?

Obviously if they have the ability to know material is illegal, that's a problem.

And exactly what algorithm can you provide to me that takes an encrypted blob as input and returns whether or not it is illegal material? Clearly that doesn't exist, so your point makes zero sense.

You may be conflating "I forgot the key" vs. "I've never been provided the key".

saithound 23 days ago
I think you misunderstand jiggawatt, who wasn't talking about unknowingly hosting illegal material.

We're talking about knowingly hosting encrypted illegal material without knowing the key. This is unambiguously illegal whether or not you ever knew the key.

If the police show up and tell you that your site has an encrypted zip file containing illegal porn, of course they can instruct you to stop hosting it, and hold you liable if you refuse to follow those instructions.

They're not going to give you the decryption key to check for yourself, and it'd not even be legal for them to do so.

Jiggawatt is saying that if you have a truly uncensorable system, it's impossible to comply with the police instructions to selectively remove the illegal material, and so the whole thing becomes illegal.

> And exactly what algorithm can you provide to me that takes an encrypted blob as input and returns whether or not it is illegal material? Clearly that doesn't exist, so your point makes zero sense.

This, on the other hand, tells me you don't know much about how legal systems work. I recommend you start with the essay "What color are your bits?" [1]

[1] https://ansuz.sooke.bc.ca/entry/23

bangaladore 19 days ago
I'm not sure I agree that the OC's argument was focused on knowing that you were hosting illegal material that is encrypted. I'd argue that nowhere in jiggawatt's comment is that argued. I think that's your argument, which is fine, and I agree with it. I also agree that you can be compelled to remove data, encrypted or not, from your servers through lawful orders, and if your system is designed in a blockchain-like manner where it is not possible to remove illegal content, that's an even bigger issue.

My point all along is that Google is not liable for someone uploading previously encrypted blobs of illegal content to Google Drive. And even more so, Google isn't liable if someone uploads illegal content to Google Drive that isn't encrypted. Google simply needs to remove it and follow the correct processes if it's reported / detected.

Could you make an argument for either that theoretically they could be? Sure. But in reality, no, they are not liable.

This is the law due to Section 230:

> Section 230 of the Communications Act of 1934, enacted as part of the Communications Decency Act of 1996, provides limited federal immunity to providers and users of interactive computer services. The statute generally precludes providers and users from being held liable—that is, legally responsible—for information provided by another person, but does not prevent them from being held legally responsible for information that they have developed or for activities unrelated to third-party content. Courts have interpreted Section 230 to foreclose a wide variety of lawsuits and to preempt laws that would make providers and users liable for third-party content. For example, the law has been applied to protect online service providers like social media companies from lawsuits based on their decisions to transmit or take down user-generated content.

https://www.congress.gov/crs-product/R46751 https://www.eff.org/issues/cda230

Also, the blockchain problem already exists. I've linked some commentary about it.

https://ethereum.stackexchange.com/questions/94558/what-prev...

brookst 30 days ago
You’ve hit on exactly why Apple proposed that client-side, pre-upload CSAM detection which everyone freaked out about.

Instead, iCloud, Google Drive, and similar all rely on being able to hash content post-upload for exactly that reason.

sublimefire 31 days ago
Yes, but at the end of the day you need to trust the cloud provider's tools, which expands the trust boundary beyond just the hardware root of trust. Who is to guarantee they will not create a malicious tool update, push it, then retract it? It is nowhere captured and you cannot prove it.
avianlyric 30 days ago
You can detect and prove it because the hardware attestation signature will change.

You might not know what change was made, or have any prior warning of the change. But you will be able to detect it happening. Which means an operator only gets to play that card once, after which nobody will trust them again.
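A toy sketch of what that detection amounts to, just to illustrate the idea of pinning the last measurement you saw (hypothetical code, not any particular client):

    package main

    import "fmt"

    // lastSeen pins the software measurement each endpoint's attestation
    // reported previously; any silent change shows up on the next request.
    var lastSeen = map[string]string{}

    // checkMeasurement returns false when an endpoint's attested software
    // changed since we last talked to it, so the client can stop and ask why.
    func checkMeasurement(endpoint, measurement string) bool {
        prev, ok := lastSeen[endpoint]
        lastSeen[endpoint] = measurement
        if ok && prev != measurement {
            fmt.Printf("ALERT: %s changed from %s to %s\n", endpoint, prev, measurement)
            return false
        }
        return true
    }

    func main() {
        checkMeasurement("compute-1.example.com", "sha256:aaa")
        checkMeasurement("compute-1.example.com", "sha256:bbb") // fires the alert
    }
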

7e 31 days ago
At the end of the day, Nitro Enclaves are still “trust Amazon”, which is a poor guarantee. NVIDIA+AMD offer hardware-backed enclave features for their GPUs, which is the superior solution here.
iancarroll 31 days ago
Aren’t they both hardware backed, just changing the X in “trust X”?
Terretta 18 days ago
You think Nitro Enclaves aren't hardware backed?
almostgotcaught 30 days ago
be sure to let us know when you can run e.g. nginx on a GPU in said enclave.
amelius 31 days ago
> Folks may underestimate the difficulty of providing compute that the provider “cannot”* access to reveal even at gunpoint.

It's even harder to do this plus the hard requirement of giving the NSA access.

Or alternatively, give the user a verifiable guarantee that nobody has access.

astrange 30 days ago
That's what non-targetability is for.
immibis 31 days ago
It's probably illegal for a business to take anonymous cryptocurrency payments in the EU. Businesses are allowed to take traceable payments only, or else it's money laundering.

With the caveat that it's not clear what precisely is illegal about these payments and to what level it's illegal. It might be that a business isn't allowed to have any at all, or isn't allowed to use them for business, or can use them for business but can't exchange them for normal currency, or can do all that but has to check their customer's passport and fill out reams of paperwork.

https://bitcoinblog.de/2025/05/05/eu-to-ban-trading-of-priva...

anon721656321 31 days ago
at that point, it seems easier to run a slightly worse model locally. (or on a rented server)
rimeice 31 days ago
Which is Apple's own approach, until the compute requirements mean they need to run some compute in the cloud.
bigyabai 30 days ago
Just a shame they spent so long skimping on iPhone memory. The tail-end of support for 4gb and 6gb handsets is going to push that compute barrier pretty low.
brookst 30 days ago
Eh, maybe a bit, but those era devices also have much lower memory bandwidth. I suspect that the utility of client models will rule out those devices for other reasons than memory.
bigyabai 29 days ago
> much lower memory bandwidth

Not really? The A11 Bionic chip that shipped with the iPhone X has 3GB of ~30GB/s memory. That's plenty fast for small LLMs if they'll fit in memory; it's only ~1/3rd of the M1's memory speed, and it only gets faster on the LPDDR5 handsets.
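Back of the envelope: token generation is basically memory-bandwidth bound, since each generated token streams roughly all the weights through the bus once, so a ~1.5GB model (about 3B params at 4-bit) on 30GB/s tops out somewhere around 20 tokens/s, which is perfectly usable for on-device tasks.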

A big part of Apple's chip design philosophy was investing in memory controller hardware to take advantage of the iOS runtime better. They just didn't foresee any technologies besides GC that could potentially inflate memory consumption.

rasengan 31 days ago
We are introducing Verifiably Private AI [1] which actually solves all of the issues you mention. Everything across the entire chain is verifiably private (or in other words, transparent to the user in such a way they can verify what is running across the entire architecture).

[1] https://ai.vp.net/

jmort 31 days ago
It should be able to support connecting via an OpenPCC client, then!
poly2it 31 days ago
Whitepaper?
derpsteb 31 days ago
I was part of a team that does the same thing. Arguably as a paid service, but with source availability and meaningful attestation.

Service: https://www.privatemode.ai/ Code: https://github.com/edgelesssys/privatemode-public

jmort 30 days ago
OpenPCC is Apache 2.0 without a CLA to prevent rugpulls, whereas Edgeless is BSL.
jiveturkey 30 days ago
<3
m1ghtym0 31 days ago
Exactly, attestation is what matters. Excluding the inference provider from the prompt is the USP here. Privatemode can do that via an attestation chain (source code -> reproducible build -> TEE attestation report) + code/stack that ensures isolation (Kata/CoCo, runtime policy).
saurik 31 days ago
Yes: "provably" private... unless you have $1000 for a logic analyzer and a steady hand to solder together a fake DDR module.

https://news.ycombinator.com/item?id=45746753

Lord-Jobo 31 days ago
well, also indefinite time and physical access.
saurik 31 days ago
Which is what the provider themselves have, by definition. The people who run these services are literally sitting next to the box day in and day out... this isn't "provably" anything. You can trust them not to take advantage of the fact that they own the hardware, and you can even claim it makes it ever so slightly harder for them to do so, but this isn't something where the word "provably" is anything other than a lie.
anon721656321 31 days ago
yeah, for a moment I was reading it as being a homomorphic encryption type setup, which I think is the only case where you can say 'provably private'.

It's better than nothing, I guess...

But if you placed the server at the NSA, and said "there is something on here that you really want, it's currently powered on and connected to the network, and the user is accessing it via ssh", it seems relatively straightforward for them to intercept and access.

anon5739483 30 days ago
[dead]
sublimefire 31 days ago
If you trust the provider, then such an architecture doesn't make things much better. If you do not, then at least the execution should happen inside a confidential system, so that even soldering would not get you to the data.
rossjudson 30 days ago
GCP can and does live migrate confidential VMs between machines. Which of the 50k machines in a cluster were you going to attach your analyzer to?
saurik 30 days ago
1) If you were GCP (as they are the attacker in this scenario), you'd attach the analyzer to ANY (!) ONE (!) server and then you migrate the user's workload that you wanted to snoop on (or were required to snoop on by the FBI) to your evil server. Like, you are clearly trying to say this makes it harder (though even if this were true that doesn't make it at all "provable")... but, if you support migration, you actually made it EASIER for you (aka, GCP) to abuse your privileged position.

2) These attacks are actually worse than what I am pretty sure you are assuming (and so where I started my response), as you actually just need one hacked server and then you can simulate working servers on other hardware that isn't hacked by either stealing an attested key or stealing the attestation key itself. You often wouldn't even then need to have the hacked server anymore.

comex 30 days ago
Apple’s PCC has an explicit design goal to mitigate your point 1, by designing the protocol such that the load balancer has to decide which physical server to route a request to without knowing which user made the request. If you compromise a single physical server, you will get a random sample of requests, but you can’t target any particular user, not even if you also compromise the load balancer. At least that’s the theory; see [1] under the heading “non-targetability”. I have no idea whether OpenPCC replicates this, but I have to imagine they would. The main issue is that you need large scale in order for the “random sample of requests” limitation to actually protect anyone.

[1]: https://security.apple.com/blog/private-cloud-compute/

saurik 30 days ago
There are many things one can do to mitigate the (weaker) point 1, including simply not supporting any kind of migration at all. I only bothered to go there to demonstrate that the ability to live migrate is a liability here, not a benefit.

> targeting users should require a wide attack that’s likely to be detected

Regardless, Apple's attacker here doesn't sound like Apple: the "wide attack that's likely to be detected" is going to be detected by them. We even seemingly have to trust them that this magic hardware has the properties they claim it does.

This is way worse than most of these schemes, as if I run one of these on Intel hardware, you inherently are working with multiple parties (me and Intel).

That we trust Apple to not be lying about the entire scheme so they can see the data they are claiming not to be able to see is thereby doing the heavy lifting.

kiwicopple 31 days ago
impressive work jmo - thanks for open sourcing this (and OSI-compliant)

we are working on a challenge which is somewhat like a homomorphic encryption problem - I'm wondering if OpenPCC could help in some way? :

When developing websites/apps, developers generally use logs to debug production issues. However, with wearables, logs can be a privacy issue: imagine some AR glasses logging visual data (like someone's face). Would OpenPCC help to extract/clean/anonymize this sort of data for developers to help with their debugging?

jmort 31 days ago
Yep, you could run an anonymization workload inside the OpenPCC compute node. We target inference as the "workload" but under the hood it's really just an attested HTTP server where you can't see inside. So, in this case your client (the wearable) would send its data first through OpenPCC to a server that runs some anonymization process.
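As a toy illustration of such a workload (purely hypothetical, and a real anonymizer is much harder to get right than this):

    package main

    import (
        "io"
        "net/http"
        "regexp"
    )

    // Toy "anonymization workload" running inside the attested compute node:
    // it receives raw logs over the OpenPCC channel and returns them with
    // obvious identifiers scrubbed before they ever reach a developer.
    var email = regexp.MustCompile(`[\w.+-]+@[\w.-]+\.\w+`)

    func redact(w http.ResponseWriter, r *http.Request) {
        body, err := io.ReadAll(r.Body)
        if err != nil {
            http.Error(w, err.Error(), http.StatusBadRequest)
            return
        }
        w.Write(email.ReplaceAll(body, []byte("[REDACTED]")))
    }

    func main() {
        http.HandleFunc("/redact", redact)
        _ = http.ListenAndServe(":8080", nil)
    }
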

If it's possible to anonymize on the wearable, that would be simpler.

The challenge is what does the anonymizer "do" to be perfect?

As an aside, IMO homomorphic encryption (still) isn't ready...

wferrell 31 days ago
Really nice release. Excited to see this out in the wild and hopeful more companies leverage this for better end user privacy.
sublimefire 31 days ago
Quite similar to what Azure did with confidential AI inference [1].

[1] https://techcommunity.microsoft.com/blog/azureconfidentialco...

jmort 30 days ago
I haven’t been able to find their source code. Pretty important for the transparency side of it. Have you seen it?
DeveloperOne 31 days ago
Glad to see Golang here. Go will surpass Python in the AI field, mark my words.
jabedude 31 days ago
Where is the compute node source code?
utopiah 31 days ago
That's nice... in theory. Like it could be cool, and useful... but like what would I actually run on it if I'm not a spammer?

Edit: reminds me of federated learning and FlowerLLM (training only AFAIR, not inference), like... yes, nice, I ALWAYS applaud any way to disentangle from proprietary software and walled gardens... but like what for? What actual usage?

utopiah 31 days ago
Gimme an actual example instead of downvoting, help me learn.

Edit on that too: makes me think of OpenAI Whisper as a service via /e/OS and supposedly anonymous proxying (by mixing), namely running STT remotely. That would be an actual potential usage... but IMHO that's low-end enough to be run locally. So I'm still looking for an application here.

wat10000 31 days ago
Are you looking for a general application of LLMs too large to run locally? Because anything you might use remote inference for, you might want to use privately.
utopiah 30 days ago
Sure, that'd do. What I'm looking for is a:

- useful thing (according to someone's specific requirements, maybe hallucinations are OK, maybe not) that

- needs privacy (for example generating code that will be open source probably does not need that)

- can't be run locally

- can be trusted to actually process things the way it says it does

wat10000 30 days ago
I've found LLMs to be extremely useful for writing helper scripts (debugger enhancements, analyzing large disassembly dumps, that sort of thing) and as a next-level source code search. Take a large code base and you get an error in component A involving a type in component B and it's not immediately obvious how the two are connected. I've had great success in giving an LLM the error message and access to the code and asking it how B got to A and why that's an error. This is something I could certainly do myself, but there are times when it would take 100x longer.

The key is that these are all things I can verify without much difficulty: read over the script, spot-check the analysis, look at the claimed connection between A and B and see if it's real. And I don't really care about style, quality, maintainability.

You certainly can run this locally, but anything that will fit into reasonable local hardware won't be as good.

I don't need to trust it to process as it says it does, because I'm verifying the output.

And as far as I'm concerned, "needs privacy" is always true. I don't care if the code will be open source. I don't care if it's analyzing existing code that's already open source. Other people have no business seeing what I'm doing unless I explicitly allow it. In any case, I work on a lot of proprietary code as well, and my employer would be most displeased if I were exposing it to others.

fragmede 30 days ago
> would I actually run on it if I'm not a spammer?

> Gimme an actual example instead of downvoting, help me learn.

Basically you asked a bunch of people on a privacy minded forum, why should they be allowed to encrypt their data? What are you (they) hiding!? Are you a spammer???

Apple is beloved for their stance on privacy, and you basically called everyone who thinks that's more than marketing, a spammer. And before you start arguing no you didn't, it doesn't matter that you didn't, what matters is that that's how your comment made people feel. You can say they're the stupid ones because that's not what you wrote, but if you're genuinely asking for feedback about the downvotes, there you are.

You seriously can't imagine any reason to want to use an LLM privately other than to use it to write spam bots and to spam people? At the very least expand your scope past spamming to, like, also using it to write ransomware.

The proprietary models that can't be run locally are SOTA, and local models, even if they come close, simply aren't what people want.

utopiah 30 days ago
Seems I formulated my question in a way that wasn't clear.

I specifically like the privacy aspect (even though honestly I think most people in this forum claim they do and yet rely on BigTech, with which they share their data, so IMHO most people on HN are not as demanding as you describe). The question is precisely about what to run.

FWIW I specifically keep a page on self hosting AI (you can check if you want to see if it's real at https://fabien.benetou.fr/Content/SelfHostingArtificialIntel... ) so again, the privacy aspect is crucial to me.

The question is... what to actually run. What can't be run locally that would be useful? What model for which tasks? For example coding (which I don't think models are good enough for) typically would NOT need this, because one wouldn't share any PII over there, and hopefully would even publish the resulting code as open source.

So my provocation about spammers is... because they often ARE actual users of LLMs. They use LLMs for their language capabilities, namely crafting a message that is always slightly different yet conveys (roughly) the same meaning (the scam) to avoid detection. A random person using an LLM, though, might NOT be OK with hallucinations when they use it for their own private journal or chat.

So... what for?

Edit: I did use Apple for years, recommended by someone at Mozilla, and I moved away from them earlier this year precisely because even though they are better than others, e.g. Google, IMHO it's just not good enough for me. No intermediary is better than one with closed source.

fragmede 28 days ago
Okay, so you had a whole agenda. Could have just been more transparent if you came out and said that directly.

It's an LLM. The user can ask it anything they can think of! But then you jump to "only spammers could eke out anything from them".

What do users ask OpenAI about? My boyfriend/girlfriend/wife/mistress cheated on me, what should I do? My mom is... My dad is... My friends are... I have this problem with this company and I want to sue them... I have this weird lump on my foot... More nefariously, I'm sure someone out there asking "how do I make cocaine" is seriously considering it, and not just testing the machine. I want to talk to somebody about 1994, the TV series from 2019. I want to write a fiction book about the near future but one where I won the lottery or I grew up rich or I was Harry Potter or a murder mystery or utopian sci-fi or dystopian sci-fi or an alt-history where there are still dinosaurs or or or.

I don't know if it's a failure of your imagination, or if mine is overactive, but making a Venn diagram of all the world's humans broken down into spammers and not spammers, and then placing the circle of local-LLM users entirely inside the spammers, with no one else, just seems a bit reductive.

nixpulvis 31 days ago
Thought this was going to be about Orchard from the title.
MangoToupe 31 days ago
@dang can we modify the title to acknowledge that it's specific to chatbots? The title reads like this is about generic compute, and the content is emphatically not about generic compute.

I realize this is just bad branding by apple but it's still hella confusing.

jmort 31 days ago
It does work generically. Like Apple, we initially targeted inference, but under the hood it's just an anonymous, attested HTTP server wrapper. The ComputeNode can run an arbitrary workload.
MangoToupe 31 days ago
Interesting!
mr_windfrog 31 days ago
[dead]
okelahbos28 31 days ago
[flagged]
pjmlp 31 days ago
[flagged]
kreetx 31 days ago
I read this and your reply to the sibling; you seem to have a reputation for being sensible - what are you trying to say? If someone re-implements or reverses a service then it doesn't need to be in the same language.
almostgotcaught 31 days ago
This dude stays commenting on things he doesn't actually understand anything about. I have run into him multiple times in threads on what I do (compilers) and he's clueless but insistent.
pjmlp 31 days ago
Thankfully the Internet is full of analysts with such a deep understanding of others' personalities, otherwise we would all be lost.

As for the matter at hand, I assume that such an implementation is validated against which specification, exactly?

Glad to be educated in my cluelessness.

almostgotcaught 31 days ago
I guess English isn't your first language so nbd but I have no idea what you're asking.
pjmlp 31 days ago
To be educated on the public specification of Apple's Private Cloud Compute so that I become less clueless, according to you.
almostgotcaught 31 days ago
This question makes zero sense - PCC is a (proprietary) system not an interface. There is no spec just like there's no spec for how you have the furniture arranged in your own house.
pjmlp 30 days ago
Please explain to the audience the title of this submission, given that answer.

Apparently some of us actually have a clue.

pjmlp 31 days ago
Pedantic yes, sensible not really, sensible folks don't survive the level of BBS and USENET discussion forums.

To make a full implementation of an Apple product, the specification for that Apple product must exist in some form.

kreetx 27 days ago
Right, but whatever languages Apple used to implement their cloud are not really related to the Swift language. I guess you should have said what you said just now first.
mlnj 31 days ago
It is an implementation. As long as it behaves the same...
rrdharan 30 days ago
I think the parent has a valid point. The actual README says "inspired by Apple’s Private Cloud Compute".

I think it's more fair to say it implements the same idea, but it is not an open source implementation of Apple's Private Cloud Compute the way e.g. minio is an implementation of S3, so the HN title is misleading.

pjmlp 31 days ago
[flagged]
parting0163 31 days ago
the point here wasn't to be a complete clone of Apple's PCC.
pjmlp 31 days ago
Title says otherwise.
Robin_Message 31 days ago
It's not a drop-in replacement; rather it is an implementation of the same ideas (+ some extra ones) but open source so it can be used for things other than Apple devices.
pjmlp 30 days ago
Which isn't the same as the title suggests.