74 points by Tomte 74 days ago | 6 comments
mcswell 73 days ago
Around 1985, while I was working at the Artificial Intelligence Center of (the now defunct) Boeing Computer Services, I evaluated Fernando Pereira's NLP code written in Prolog for his dissertation (he was one of the authors of the referenced 1987 article). My recollection is that his parser was very slow, and difficult to extend (adding rules to account for other English grammatical structures). Another fellow working at the AIC at the time had written a parser in LISP, and I ended up writing the English grammar for his parser.

That's not to say that LISP was faster than Prolog in general, just that this particular program was slow.

Nowadays, of course, nobody writes parsers or grammars by hand like that. Which makes me sad, because it was a lot of fun :).

mcswell 73 days ago
I should have added, Pereira was (and is) a lot smarter than I am. He went on to do great things in computational linguistics, whereas I went on to do...smaller things.
YeGoblynQueenne 72 days ago
Yeah, left-corner parsers can be slow for CFGs. There was another Prolog-for-NLP book of the same generation, not by Pereira or Shieber, that showed how to write an Earley parser in Prolog, and those can be much faster. I don't remember the title, though; I read it maybe ten years ago now.
mcswell 72 days ago
The LISP parser my co-worker built had the ability to encode in a grammar rule which "corner" to trigger that rule on. Often it was the head of the phrase, but it could be things like the word "and" (no need to trigger a conjunction rule unless there's a conjunction).
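
For readers who haven't seen the Prolog side of this, here is a minimal toy DCG sketch (an invented fragment, not Pereira's grammar or the LISP parser above) showing how grammar rules are written as clauses, and where plain top-down execution runs into trouble with left recursion:

    % Toy DCG fragment; all nonterminals and words are invented for illustration.
    s  --> np, vp.
    np --> det, n.
    np --> np, conj, np.   % left-recursive conjunction rule: plain top-down
                           % DCG execution can loop on this, which is where
                           % left-corner or chart/Earley strategies earn their keep
    vp --> v, np.
    det  --> [the].
    n    --> [parser].
    n    --> [grammar].
    v    --> [handles].
    conj --> [and].

    % ?- phrase(s, [the, parser, handles, the, grammar]).
    % true.
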
JimmyRuska 73 days ago
Pretty amusing that the old AI revolution was purely logic/reasoning/inference based. People knew that to be a believable AI, a system needed some level of believable reasoning and logic capabilities, but nobody wanted to decompose a business problem into disjunctive logic statements, and any additional logic can have implications across the whole universe of other logic, making it hard to predict and maintain.

LLMs brought this new revolution where it's not immediately obvious you're chatting with a machine, but, just like most humans, they still severely lack the ability to decompose unstructured data into logic statements and prove anything out. It would be amazing if they could write some Datalog or Prolog to approximate a more complex neural-network-based understanding of some problem, as logic-based systems are more explainable.
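
As a minimal sketch of what that kind of decomposition could look like in Prolog (all predicate names and facts here are invented for illustration):

    % Hypothetical business rule decomposed into facts and a rule.
    customer(acme).
    invoice(acme, inv42, 1200).
    overdue(inv42).

    % Flag a customer who has any overdue invoice over 1000.
    flagged(Customer) :-
        customer(Customer),
        invoice(Customer, Invoice, Amount),
        Amount > 1000,
        overdue(Invoice).

    % ?- flagged(acme).
    % true.

The proof behind flagged(acme) doubles as the explanation, which is the explainability that logic-based systems offer.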

LunaSea 72 days ago
One of the reasons why word vectors, sentence embeddings and LLMs won (for now) is that text, especially text found on the web, does not necessarily follow strict grammar and lexical rules.

Sentences can be incorrect but still understandable.

If you then include leet speak, acronyms and short-form writing (SMS / Tweets), it quickly becomes unmanageable.

puzzledobserver 72 days ago
I am not a linguist, but I don't think that many linguists would agree with your assessment that dialects, leet speak, short form writing, slang, creoles, or vernaculars are necessarily ungrammatical.

From what I understand, the modern view is that these point to the failure of grammar as a prescriptive exercise ("This is how thou shalt speak"). Human speech is too complex for simple grammar rules to fully capture its variety. Strict grammar and lexical rules were always fantasies of the grammar teacher anyway.

See, for example, the following article on double negatives and African American Vernacular English: https://daily.jstor.org/black-english-matters/.

mcswell 72 days ago
I am a linguist, and I agree. But it does complicate the grammar to allow for these other options. (I haven't studied leet speak, but my impression is that it's more a matter of vocabulary than grammar, and vocabulary is relatively easy to add.)

For the record, the parser I worked on ended up having the "interesting" rules removed, leaving it as a tool for finding sentences that didn't conform to a Basic English grammar with a controlled vocabulary--and it was used to QC aircraft repair manuals, which need to be readable by non-native English speakers.

throwaway11460 72 days ago
There are languages that have a fully codified grammar which completely covers everything people actually use (and more). But we spend 10 years learning the grammar itself, 1-2 hours every day at school (then you have literature etc. on top of that)...

I see it as a complete waste of my youth, BTW. Today I speak English that I learned through listening, reading and watching, and all of this mother tongue grammar nonsense that used to stress me out daily at school and during homework is absolutely useless to me.

agumonkey 72 days ago
I wonder if people approach NLP as a sea of semes rather than as semi-rigid grammatical structures onto which meaning is then mapped. (Probably, but I'm not monitoring the field.)
zcw100 72 days ago
That’s what Stardog is doing https://www.stardog.com/
srush 72 days ago
This book is great. Really mind-warping at first read. Fernando Pereira has had an incredible influence across NLP for his whole career. Here is an offhand list of papers to check out.

* Conditional random fields: Probabilistic models for segmenting and labeling sequence data (2001) - Central paper of structured supervised learning in the 2000s era

* Weighted finite-state transducers in speech recognition (2002) - This work and OpenFST are so clean

* Non-projective dependency parsing using spanning tree algorithms (2005) - Influential work connecting graph algorithms to syntax. Less relevant now, but still such a nice paper.

* Distributional clustering of English words (1994) - Proto word embeddings.

* The Unreasonable Effectiveness of Data (2009) - More high-level, but certainly explains the last 15 years

rhelz 72 days ago
I bought a copy of this book used. There was a stamp in the frontispiece, saying that it was from the library of Bell Labs.

I nearly cried---this is how a great institution crumbles---this is how great libraries are destroyed.

Future generations are going to really be scratching their heads, wondering why we disbanded the institution that brought us the transistor and Unix, and instead funneled billions of dollars into research on how to get us to click on buttons and doom scroll.

linguae 72 days ago
During Bell Labs’ heyday, it was the beneficiary of AT&T’s nationwide monopoly. AT&T was a monopoly subject to various restrictions by the US government as part of a series of legal cases. When a lab is part of a monopoly, whether it’s due to being a natural monopoly (like AT&T) or due to patent rights (Xerox in its heyday), it could lavishly fund labs that give its researchers large amounts of freedom. Such resources and freedom resulted in many groundbreaking discoveries and inventions.

However, AT&T was broken up in 1984 as the result of yet another lawsuit involving AT&T’s monopoly. Bell Labs still remained, but it no longer had the same amount of resources. Thus, the lab’s unfettered research culture gradually gave way to shorter-term research that showed promise of more immediate business impact. A similar thing happened to Xerox PARC when the federal government forced Xerox to license its xerography patents in the mid-1970s; this, combined with the end of a five-year agreement under which Xerox’s executives had promised not to meddle in the operations of Xerox PARC, led to increased pressure on the researchers (ironically, Xerox infamously didn’t take full advantage of the research PARC produced, but that’s another story).

Combine this with a business culture that emerged in the 1990s that disdains long-term, unfettered research and emphasizes short-term research with promises of immediate business impact, and this has resulted in the transformation of industrial research. There are some labs like Microsoft Research that still provide their researchers a great deal of freedom, but such labs are rare these days. It’s amazing that well-resourced companies like Apple don’t have labs like Bell Labs and Xerox PARC, but if businesses are beholden to quarterly results, why would they invest in long-term, risky research projects?

This leaves government and academia. Unfortunately government, too, is often subject to ROI demands from politicians (which is nothing new; check out how the Mansfield Amendment changed ARPA into DARPA), and academia is subject to “publish or perish” demands.

The running theme is that unfettered research with a proper amount of resources can result in world-changing discoveries and inventions. However, funding such research requires a large amount of resources as well as patience, since research takes time and results don’t always come in neat quarterly or even annual periods. Our business culture lacks this type of patience, and many businesses lack the resources to devote to maintaining labs at the level of Bell Labs or Xerox PARC. Even academia and government lack this type of patience.

The question is how can we encourage unfettered research in a world that is unwilling to fund it. I’ve been thinking of ideas for quite some time, but I haven’t fully fleshed them out yet.

verdverm 73 days ago
If you like this kind of stuff, CUE(lang) is highly influenced by Prolog and pre-90s NLP. Its creator, Marcel, worked on typed feature structures for optimally representing grammar rules to support the way NLP was approached at the time.

The CUE evaluator is a really interesting codebase for anyone interested in algos

YeGoblynQueenne 72 days ago
This was the book that taught me the difference between "facts", "rules" and "queries" in Prolog, which was the terminology I had been taught at my university Prolog course. The difference being: there is no difference. They are all Horn clauses, the "queries" are Horn goals and the "facts" and "rules" are definite program clauses. I was enlightened.
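
To make that concrete, a small invented example (not from the book) with the usual textbook labels:

    % "Facts" and "rules" are both definite clauses; a fact just has an empty body.
    parent(tom, bob).                                  % fact
    ancestor(X, Y) :- parent(X, Y).                    % rule
    ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).    % rule

    % A "query" is a headless Horn clause (a goal); the solver tries to
    % refute it, and the bindings it finds along the way are your answer.
    % ?- ancestor(tom, Who).
    % Who = bob.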

The NLP stuff -Definite Clause Grammars- is also interesting, in the sense that it's surprising that a notation that can be used to parse language (definite clauses) can also be used to represent any program a Universal Turing Machine can compute.
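
The connection is easier to see once you know that a DCG rule is just syntactic sugar for a definite clause; roughly (the exact expansion varies a little between Prolog systems):

    % A DCG rule such as:
    greeting --> [hello], name.
    name     --> [world].

    % is expanded by the system into ordinary definite clauses with two
    % extra arguments threading the input as a difference list, roughly:
    %
    %   greeting(S0, S) :- S0 = [hello|S1], name(S1, S).
    %   name(S0, S)     :- S0 = [world|S].
    %
    % ?- phrase(greeting, [hello, world]).
    % true.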