Fresh Hacker News | Cloudflare's AI Platform: an inference layer designed for agents

▲Cloudflare's AI Platform: an inference layer designed for agents (blog.cloudflare.com)

197 points by nikitoci 7 hours ago | 16 comments

▲mips_avatar 8 minutes ago

So it's basically just openrouter with cloudflare argo networking? I feel like they could do so much more interesting stuff with their replicate acquisition. Application specific RL is getting so good but there's no good way to deploy these models in a scalable way. Even the providers like fireworks which claim to let you deploy LoRAs in a scalable way can't do it. For now I literally have to host the base load for my application on a rack of 3090s in my garage which seems silly but it saves me $1k a month.

▲james2doyle 2 hours ago

I find it really confusing that the worker AI models on here: https://developers.cloudflare.com/workers-ai/models/ do not have full overlap with the ones on here: https://developers.cloudflare.com/ai/models/

Yes, you can see the same "hosted" ones on there, but when you look at the models endpoint, there are far fewer options in the "workers-ai/*" namespace. Is that intentional?

▲james2doyle 1 hour ago

To better clarify, I don’t see "workers-ai/@cf/google/gemma-4-26b-a4b-it" in the /models endpoint in gateway.ai.cloudflare.com but it does seem to exist as a hosted model. Same with "workers-ai/@cf/nvidia/nemotron-3-120b-a12b" which I would expect to see

▲samjs 1 hour ago

Hey James.

Thanks for the feedback, and good catch. Looks like that endpoint is pulling from a slightly out of date data source. The docs/dashboard currently are the best resources for the full catalog, but we'll update that API to match.

▲whereistejas 5 hours ago

This actually looks very useful. Cloudflare seems to be bringing together a great set of tools. Not to mention, D2 is literally the only sqlite-as-a-service solution out there whose reliability is great and free tier limits are generous.

▲mikeocool 2 hours ago

Agreed -- except that all of their docs and marketing pitch it for use cases like "per-user, per-tenant or per-entity databases" -- which would be SO great.

But in practice, it's basically impossible to use that way in conjunction with workers, since you have to bind every database you want to use to the worker and binding a new database requires redeploying the worker.

▲AgentME 57 minutes ago

If you want to dynamically create sqlite databases, then moving to durable objects which are each backed by an sqlite database seems to be the way to go currently.

▲eis 1 hour ago

D1 reliability has been bad in our experience. We've had queries hanging on their internal network layer for several seconds, sometimes double digits, over extended periods (on the order of weeks). Recently I've seen plain network exceptions a few times - again, these are internal between their worker and the D1 hosts. And many of the hung queries wouldn't even show up under traces in their observability dashboard, so unless you have your own timeout detection you wouldn't even know things are not working. It was hard to get someone on their side to take a look and actually acknowledge and understand the problem.

But even without the network issues that have plagued it I would hesitate to build anything for production on it because it can't even do transactions and the product manager for D1 openly stated they won't implement them [0]. Your only way to ensure data consistency is to use a Durable Object which comes with its own costs and tradeoffs.

[0] https://github.com/cloudflare/workers-sdk/issues/2733#issuec...

The basic idea of D1 is great. I just don't trust the implementation.

For a hobby project it's a neat product for sure.
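
The "own timeout detection" mentioned above could look roughly like the sketch below. It uses Python's asyncio purely for illustration and is not written against the D1 or Workers API; `run_with_timeout`, `QueryTimeout`, and the demo queries are hypothetical names, and the query is any awaitable stand-in for a database call.

```python
import asyncio
import time


class QueryTimeout(Exception):
    """Raised when a query exceeds its deadline."""


async def run_with_timeout(query, timeout_s, on_timeout=None):
    """Await query() but give up after timeout_s seconds.

    query is a zero-argument coroutine function (a hypothetical stand-in
    for a database call). on_timeout, if given, receives the elapsed
    seconds so the hang is recorded even when platform traces miss it.
    """
    started = time.monotonic()
    try:
        return await asyncio.wait_for(query(), timeout=timeout_s)
    except asyncio.TimeoutError:
        elapsed = time.monotonic() - started
        if on_timeout is not None:
            on_timeout(elapsed)
        raise QueryTimeout(f"query exceeded {timeout_s}s") from None


async def demo():
    async def fast_query():
        return ["row"]

    async def hung_query():
        await asyncio.sleep(10)  # simulates a query stuck on the network

    hangs = []  # elapsed times of every query that timed out
    rows = await run_with_timeout(fast_query, 0.5, hangs.append)
    try:
        await run_with_timeout(hung_query, 0.05, hangs.append)
        timed_out = False
    except QueryTimeout:
        timed_out = True
    return rows, timed_out, len(hangs)
```

Calling `asyncio.run(demo())` runs one fast and one hung query; only the hung one is reported through `on_timeout`, which is where your own logging or alerting would hook in.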

▲rs_rs_rs_rs_rs 1 hour ago

Yeah but the 10GB limit for D1 is crazy, can you really start building on that? Other than toy projects?

▲dpark 32 minutes ago

Really depends on what you’re putting in the DB. Cloudflare is clear that these are supposed to be very localized DBs. Per user or tenant.

▲kylehotchkiss 3 hours ago

* D1, but agreed. I wish Cloudflare would offer a built-in D1-R2 backups system though! (Can be done with custom code in a worker, but wish it was first-party)

▲BoorishBears 2 hours ago

> For those who don’t use Workers, we’ll be releasing REST API support in the coming weeks, so you can access the full model catalog from any environment.

Cloudflare seems to be building for lock-in and I don't love it. I especially don't understand how you build an OpenRouter and only have bindings for your custom runtime at launch.

▲switz 1 hour ago

Workers runtime is open source and permissively licensed fwiw

https://github.com/cloudflare/workerd

▲messh 13 minutes ago

So, is this similar to openrouter?

▲mips_avatar 7 minutes ago

with Argo networking

▲datadrivenangel 2 hours ago

Good to see their purchase of Replicate paying off!

▲bm-rf 6 hours ago

Not seeing any pricing info on the models[1] page. Wonder how much of a lift this is over paying providers directly. Perhaps Cloudflare is doing this at cost? Also interesting that zero data retention is not on by default, and is not supported with all providers[2]. Finally, would be great if this could return OpenAI AND Anthropic style completions.

[1] https://developers.cloudflare.com/ai/models/

[2] https://developers.cloudflare.com/ai-gateway/features/unifie...

▲samjs 3 hours ago

Hey! I'm one of the engineers who built this :)

We'll be adding prices to the docs and the model catalog in the dashboard shortly.

In short: currently the pricing matches whatever the provider charges. You can buy unified billing credits [1] which charges a small processing fee.

> Finally, would be great if this could return OpenAI AND Anthropic style completions.

Agreed! This will be coming shortly. Currently we'll match the provider themselves, but we plan to make it possible to specify an API format when using LLMs.

[1]: https://developers.cloudflare.com/ai-gateway/features/unifie...

▲agentifysh 1 hour ago

excellent! please make sure to include rate limit details as well.

▲yoavm 6 hours ago

Workers AI pricing is this: https://developers.cloudflare.com/workers-ai/platform/pricin...

▲bm-rf 5 hours ago

Thanks, I don't see pricing for foundation models however, such as GPT-5.4

▲ashleypeacock 3 hours ago

It’s at-cost pricing I believe, no mark up

▲ramesh31 5 hours ago

Big, could be a viable Bedrock alternative. Probably better uptime than Anthropic or AWS, too.

▲Jack5500 6 hours ago

Sadly no mention on regions.

▲pjmlp 2 hours ago

It will work great in Spain! /s

▲pprotas 6 hours ago

Can't wait for the free tier!

▲yoavm 6 hours ago

Workers AI has had a free tier since it launched, I think? See the pricing page I linked to above.

▲indigodaddy 4 hours ago

So looks like the AI Platform free tier will have access to the open models only perhaps? And the 10,000 neuron thing? I don't see any mention of frontier models in the url you linked in the other comment ( https://news.ycombinator.com/item?id=47792538#47793142 )

▲throwpoaster 6 hours ago

Anthropic gonna acquire Cloudflare for stock. Solves their infrastructure problems in one shot.

▲kylehotchkiss 3 hours ago

No way! Cloudflare will buy anthropic when the economy begins self-correcting. Looking forward to Workers AI getting all those H100s to run more Qwens

▲neya 6 hours ago

I'm not ready for another rug pull, so please no :( I really enjoy Cloudflare's CDN.

▲ernsheong 5 hours ago

What is Cloudflare trying to be? Everything everywhere all at once?

▲charcircuit 5 hours ago

They want to be an edge networking platform. Anything that would be useful to do on an edge node close to the end user is in scope.

▲PUSH_AX 5 hours ago

A CSP.

▲6thbit 6 hours ago

don’t attach to a single AI provider when you can attach to cloudflare as your single AI gateway provider!

rant aside, they are well positioned network wise to offer this service, i wonder about their pricing and potential markup on top of token usage?

i presume they won't let you “manage all your AI spend in one place” for free.

▲koolba 6 hours ago

> i presume they wont let you “manage all your AI spend in one place” for free.

Of course they will. In return they get to control who they’re routing requests to. I wouldn’t be surprised if this turns into the LLM equivalent of “paying for order flow”.

▲6thbit 6 hours ago

i got shivers thinking about a future ai dynamic pricing and automatic gateway choosing the cheapest provider available

▲nhecker 5 hours ago

Openrouter already does this, unless I've misunderstood the premise.

▲nubg 2 hours ago

shivers? as in it frightens you? i believe there is no way around tokens being priced like gasoline at the gas station - it changes every hour. Any other system means you are either over- or underspending.
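
For readers wondering what "automatic gateway choosing the cheapest provider available" amounts to in this subthread, here is a minimal sketch. It is an assumption-laden toy, not how OpenRouter or Cloudflare actually route: the `Offer` type, the provider names, and the prices are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str                  # hypothetical provider name
    usd_per_million_tokens: float  # made-up price, would change over time
    available: bool                # whether the provider is currently up


def cheapest_available(offers):
    """Return the lowest-priced offer that is currently available, or None."""
    candidates = [o for o in offers if o.available]
    if not candidates:
        return None
    return min(candidates, key=lambda o: o.usd_per_million_tokens)


offers = [
    Offer("provider-a", 3.00, True),
    Offer("provider-b", 2.40, False),  # cheapest on paper, but down
    Offer("provider-c", 2.75, True),
]
choice = cheapest_available(offers)
```

With hourly price changes, the same function would simply be re-run against a fresh list of offers before each request; a real router would presumably also weigh latency and quality, which this sketch ignores.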

▲wahnfrieden 6 hours ago

No spending limit / no ability to set a budget, unlike Google or OpenAI. Be prepared for an eye-watering invoice if you have a bug or get hacked.

edit: Why downvote? It's correct, and it's a risk that competitors handle better, including for their CDN products (compared to Bunny CDN). Maybe you are just used to the risk and haven't felt the burn yourself yet. Or you have the mistaken notion that there is no price at which temporary downtime is worthwhile to avoid paying.

▲james2doyle 2 hours ago

I just added some credits to my account. You can set a daily $ spend limit as well as add credits without auto-refill

▲mbtrucks 5 hours ago

Can I set a hard cost limit? Else I'm not interested, don't be like Google's mess of billing.

▲james2doyle 2 hours ago

Seems like it. I just added some credits to my account. You can set a daily $ spend limit as well as add credits without auto-refill

▲mbtrucks 5 hours ago

Can I set a hard cost limit per day? With no drift, else I'm not interested.

▲tln 1 hour ago

I think you should look at OpenRouter. It has budget controls
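
Independent of whatever limits a provider offers, a hard daily cap can also be enforced in your own code before each request goes out. The sketch below is one way to do that under stated assumptions: `DailyBudget` and `BudgetExceeded` are hypothetical names, costs are tracked in integer cents to avoid rounding drift, and the per-request cost is assumed to be known or estimated by the caller.

```python
import datetime


class BudgetExceeded(Exception):
    """Raised when a charge would push the day's spend over the cap."""


class DailyBudget:
    """Refuses further spend once a per-day cap (in cents) would be exceeded."""

    def __init__(self, limit_cents, today=datetime.date.today):
        self.limit_cents = limit_cents
        self._today = today      # injectable clock, mainly for testing
        self._day = today()
        self._spent = 0

    def charge(self, cost_cents):
        """Record a charge and return the remaining budget for the day."""
        day = self._today()
        if day != self._day:     # a new day has started: reset the counter
            self._day, self._spent = day, 0
        if self._spent + cost_cents > self.limit_cents:
            raise BudgetExceeded(
                f"{self._spent + cost_cents} cents would exceed "
                f"the daily cap of {self.limit_cents} cents"
            )
        self._spent += cost_cents
        return self.limit_cents - self._spent
```

Calling `charge()` before each model request means a runaway loop fails fast with `BudgetExceeded` instead of accumulating cost. It only protects traffic that goes through this code path, so it does not help if an API key is stolen and used elsewhere.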

▲stult 5 hours ago

A few weeks ago, I ran into a bug with Cloudflare's DNS server not detecting when I updated the records with the registrar. The bug was 100% on their end, entirely unsolvable by me, yet they have made it literally impossible to contact them to file a bug report. Their standard user help workflow dead-ended by forcing me to talk to their absolutely useless AI help chatbot, which proceeded to regurgitate their FAQ (inaccurately, uselessly), then referred me to a phone number that was disconnected/not in service, then gave me an email address that auto-replied it was no longer in use, then just looped back to the FAQ. There was no way for me to even send them an email to let them know they have a major bug.

I immediately pulled all my sites off of Cloudflare and I will never use that godawful nightmare of a company for anything ever again. If they can't even host a generic help bot without screwing it up that badly, why would I ever use them for anything at all, never mind an AI platform?

▲allthetime 1 hour ago

What was the bug? I configure DNS for both public and private networks on cloudflare semi-frequently and always see changes in minutes or less.