In my opinion the question isn’t so much “if” but rather “when”.
When will AI research and hardware capabilities reach a point that it’s practical to embed something like that into a regular document?
We’ve already seen proof of concept LLMs embedded into OpenType fonts.
I guess the other question is then “what capabilities would these AI agents have?” You’d hope just permission to present within that document. But that depends entirely on what unpatched vulnerabilities are lurking (such as the Microsoft ANSI RCE also featured on the HN front page)
> // Use interpreted JS only to avoid RWX pages in our address space. Also, --jitless implies --no-expose-wasm, which reduce exposure since no PDF should contain web assembly.
The first widespread AI Malware will be a historic moment in this century. It will adapt like a real biological virus to its host and we have no cure for this.
Related: Ange Albertini, the creator of the .PDF/.ZIP/ELF reference diagrams (github/corkami) has started posting overview videos on his YT channel (@corkami-albertini) including creating .PDF+.PNG+.ZIP chimera files.
i recently discovered that the Canadian government depends on this for some fillable forms, because it shows a message at the top that says "JavaScript is disabled" and all the boxes show errors. i couldn't get it to work on Linux and had to dust off a Windows machine (and it still didn't work in firefox, it needed acrobat reader).
I have faced this exact problem with Canadian govt forms.
Evince doesn't support them. They are so specific about only adobe acrobat to fill out the forms.
I can open them in firefox but can't update them properly
The only option is to use my barely hanging on 10-yr old windows machine.
Let's hope that eventually they move on to a simpler web form.
I was joking in 2007, when I was working at Siemens, to my boss, that an Excel cell can contain God and the Multiverse when I put an ActiveX inside that was basically a program I made which would draw a 3D animation based on parameters contained on other cells. Let's say the boss was impressed though for me was just basic OLE.
I see from time to time that younger generations reinvent/rediscover the wheel and I chuckle.
I, for one, was surprised that Chrome's PDF renderer would allow persistent JS code like this to run - not just limited code in response to user actions, but a real game loop.
But there's a spec for all this and everything! https://www.t10.org/ftp/js_api_reference.pdf (2007) - be warned, the light of Ecma TC39 standardization does not extend to this place.
From a security perspective, they're able to build on top of V8 isolate primitives and Chrome's sandboxing systems - but from the logs, security improvements in PDFium are being continuously developed as recently as the past few weeks! I feel like I've stumbled upon a parallel universe, in the best possible way.
Gzipped PostScript documents were fairly popular during the 90's and are functionally identical to PDFs for 99% of use cases. (PDF is essentially PostScript, but with more features.)
Ok, I kinda knew it was possible (I guess, anybody did), but this should be a very illustrative example. And unfortunately it doesn't seem like PDFs are gonna go away (though, really, why the hell there isn't any alternative?!) So it raises the question: is there any way to handle this garbage safely? I.e. in a way it couldn't run JS? I'm pretty sure it is not really necessary to read 99.999% PDFs out there.
PDFs are still used to delver malware. Adobe gets picked on less often now since everyone has PDF readers in the browser but that just makes chrome the new target of choice (not that alternative viewers don't get attention too https://thehackernews.com/2024/05/foxit-pdf-reader-flaw-expl...) but what I see most often in malicious PDF files recently are just links to websites that contain malware since they can work no matter what your viewer is.
i've tried making "interactive" PDFs before but using POST and server side rendering rather than client, e.g. a PDF typewriter i made a little while back on http://news.coffee
Took a bit of prompting but was able to get a semi-working (only in Chrome) Flappy Bird out of Claude in ~10 minutes. Seems like the collision detection needs some work :)
"It was a bit tricky to find a union of features that work in both engines [..]"
I am curious what the constraints are to make this work and in which environments it does? Does it work in PDF viewers outside the browser? Is there documentation what is available in which environment? What is enabled by default, can be switched on or off?
I barely looked at Adobe Reader so not sure about that one, it definitely does not work with this PDF though, likely because it's not compliant in several ways. Besides that I wouldn't be surprised if it supports all the required JS APIs and more, just possibly behind some permission prompts.
It might work in Foxit as I believe it supports some scripting. Most of the other native PDF renderers are more static, as far as I know. In either case, I was most interested in the browser-native engines, as I always thought of them as more "static"/limited.
As for documentation on specific features: to be honest, I just looked at the implementations of PDF.js and PDFium. Both only support a subset of the "standard" API, likely for security reasons. But PDF.js for example allows changing a field's background color (colored pixels!), and PDFium allows modifying their position/bounding box (I tried a high res color display by moving a row vertically as if it's a scanline, but things become quite laggy).
I got the same conclusions. Unless I misunderstood, Pdfium is based on Foxit so that should work. And as both pdf.js and pdfium decided to implement only a thin part of the adobe js sdk, then there are good chances that it works there too.
Sadly, Adobe Acrobat Viewer cannot load it, but if go to Chrome and choose Open.. That should use chrome PDF to display it in the browser (depending on your settings maybe) which worked for me.
Works in Edge's PDF viewer, after exiting the initial mode via the <- in the upper left corner. (If you know how to avoid this being the default, let me know.)
The vulnerability was in images parsing, and exploit was distributed by sending an imessage to the target. So don't open any images, and don't read imessages.
They are also known to use browser exploits, so don't visit random websites.
That was sarcasm, in case it's not clear over the internet. Telling people to avoid "suspicious" pdfs/websites is common but ultimately not very useful advice.
The real takeaway is: don't become a target of a nation state intelligence agency. If you own a phone, they can take over it, and there's nothing you can do.
The Pegasus Project has shown that pretty much anyone could be targeted. It's enough to know someone in a publicly owned company or publicly say something negative about corruption or just be in the wrong place at the wrong time.
Nothing you do will guarantee that the state won't come after you.
Well, it's quite cool, but if PDF supports javascript, putting a javascript game in a PDF is something obviously possible. I don't know if it qualifies as genius. If the game was made from PostScript commands somehow, that would be genius.
Anyway, I love this content on Hacker news, as opposed to people explaining how they want Apple to take their freedom away, because freedom is dangerous.
I don't know how serious you are, but for others projects like this are virtually never a waste of time. There's opportunity cost of course, but that's very difficult to measure. I'm sure OP learned a ton about PDFs in the process, and there is/are no shortage of needs for PDF creation. More broadly they also deepened their knowledge of javascript and other things.
I was considering doing exactly that ahah. We should connect to share our hacks and pains. One could project would be to run wasm4 games because, yes, pdfium and pdf.js can run webassembly.
and this is why I can't read HN at work anymore........
I have increasing confidence that when AIs finally destroy the Internet the delivery vehicle will be the file format that was created, as the Internet itself was, as a form of digital paper.
could you use checkboxes for display? I'm no sure if you can style them, but I think you can access them in JS, and that should result in having basic "pixels" which you can use to draw anything.
I made a game of life in pdf using this technique, but pdf.js is less open to chromium to respect the standard on letting the pdf designer defining the ON and OFF state.
One other way would be to use normal text fields and leveraging custom fonts. I think there are an enormous potential with fonts in the realm of pdf hacking. I think there is also a story of past vuln on pdf.js because fonts were evaluated outside the sandbox.
The Canadian passport application PDF has Javascript that updates a QR code in the top-right corner of the first page whenever you change or fill in a field.
Seems like a pretty genius way of avoiding transcription errors. When I dropped my passport application off yesterday the passport officer marked up a few things on the PDF and then scanned it in, so I assume that they use the QR code to automatically fill in the data as I entered it and then make any updates necessary from after-the-fact modifications manually.
Only seemed to work correctly in Acrobat Reader, but I haven't tried others (like Foxit) or anything.
Yes, elsewhere in this thread people were complaining about how Canadian government PDFs only work in Acrobat Reader on Windows and what a PITA that is.
That's how it inevitably goes with Turing completeness :)
The real achievement here arguably isn't running code (that's provided by the PDF spec and implementations), but managing to hook it up to user input/output in an ergonomic-enough way to play Tetris.
The mention of Turing Completeness got me curious, so I looked something up. Behold, a C compiler written in Lambda Calculus: https://github.com/woodrush/lambda-8cc
The PDF [1] containing the Lambda calculus term manages to hang/glitch/crash both Firefox's and macOS Preview's PDF renderer, which in itself is quite the achievement in portability.
Update: Nevermind, Firefox handles it perfectly, it just (probably wisely) disables seamless scrolling and I have to use the "next/previous" page buttons manually. macOS got there after a minute or two of loading with no UI indications.
These 'tricks' are exactly what makes programming the passion I love. Thanks for capturing the difference between coding for joy and coding for a paycheck so succinctly. Also, it's not a wrapper—it's a full parser and compiler.
Back in school pdfs would circulate that had a bunch of flash games on them. I have no idea how or who made them, but they let us play dolphin olympics on lab computers with no internet connection.
Do the tutorials. If/when you outgrow it, the concepts will carry over to FreeCAD which otherwise has a steeper learning curve but has more capabilities.
An aside, but I found FreeCAD to be a real pain. The dependency tracking across sketches is really quite horrid. If I have sketch2 linked to sketch1, and I delete a line in sketch1, it will arbitrarily reassign all the sketch2->sketch1 dependencies. Maybe they fixed that since I've used it, but I've transferred over to Onshape for all my hobby stuff...
EDIT: looks like they finally addressed the topological naming problem, I guess I better give it a second chance!