edge(x, z) :- edge(x, y), edge(y, z).
you'd take for P the database instance containing two rows (a_1, a_2) and (a_2, a_3), and the database instance H contains additionally (a_1, a_3). Now saying that a Database D satisfies this rule means that every morphism P -> D (i.e., every match of the premise of the rule) can be completed to a commuting diagram P --> D
| ^
| /
⌄ /
Q
where the additional map is the arrow Q -> D, which corresponds to a match of both body and head.This kind of phenomenon is known in category theory as a "lifting property", and there's rich theory around it. For example, you can show in great generality that there's always a "free" way to add data to a database D so that it satisfies the lifting property (the orthogonal reflection construction/the small object argument). Those are the theoretical underpinnings of the Datalog engine I'm sometimes working on [1], and there they allow you to prove that Datalog evaluation is also well-defined if you allow adjoining new elements during evaluation in a controlled way. I believe the author of this post is involved in the egglog project [2], which might have similar features as well.
[1] https://github.com/eqlog/eqlog [2] https://github.com/egraphs-good/egglog
I had a look at eglog and egglog and if I'm understanding things correctly then one possible use case is type inference and optimization. I'm particular I looked at the example in [1].
I'm thinking that this could be useful in the PRQL [2] compiler, in particular for: a) inference of type restrictions on input relations and resultant output relation types, b) optimization of resultant SQL queries.
Would you be able to comment on whether that's correct?
Any links to related examples, papers, or work would be appreciated. Thanks!
1: https://egglog-python.readthedocs.io/latest/tutorials/sklear...
(if my question even makes sense, pardon the ignorance)
Someone had an issue because SQLite failed to optimize the following query
select * from t where x = 'x' or '' = 'x'
Someone said that SQLite could not optimize out the "or '' = 'x'" because it would be too expensive to compute. Which is obviously true only for huge queries on tiny datasets.But of course, in normal life, outside of the world of people having fun with Homomorphisms, queries are much smaller than databases.
Just a guess.
Well... there's your problem. SQLite is not a general-purpose RDBMS, it is marketed as a replacement for "fopen()", a purpose for which it excels.
A similar product is the Microsoft Jet database engine, used in products such as Microsoft Exchange and Active Directory. Queries have to be more-or-less manually optimised by the developer, but they run faster and more consistently than they would with a general-purpose query engine designed for ad-hoc queries.