This paper describes an implementation of an algorithm that can learn rich classes of languages from examples. Here, a language is understood as a set of strings, and to learn a language means to induce a program that can generate all strings in the set. The base representation is a probabilistic programming language, and hence universal, which makes evaluating candidate programs expensive; the authors describe a few technical tricks that make this scale. Contrary to the Chomskyan argument that much has to be built in for humans to learn languages, given the relatively little data we receive as children before becoming proficient, this paper aims to show that a very small basis is enough for learning many interesting languages from just positive examples.
The assumption that makes inference work is that learners look for the simplest explanation of the observed strings (e.g., the shortest program, in the sense of minimum description length). Even with only positive examples, a large set is often better described by a small program that captures the underlying pattern than by a program that simply memorizes the given examples. This is what makes learning possible.
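To make the simplicity preference concrete, here is a minimal sketch in Python. It is not the paper's actual scoring function, only an MDL-style comparison between a memorizing hypothesis and a short generalizing program for the a^n b^n pattern; the constants (pattern_cost, max_n) and the data are assumptions chosen for illustration.

```python
import math

# Illustrative MDL-style comparison (assumed costs, not the paper's model):
# a hypothesis is preferred when the cost of describing it plus the cost
# of describing the data under it is small.

data = ["ab", "aabb", "aaabbb", "aaaabbbb", "aaaaabbbbb"]  # positive examples of a^n b^n

def bits(n_choices):
    """Cost in bits of one choice among n_choices equally likely options."""
    return math.log2(n_choices)

# Hypothesis 1: memorize the observed strings verbatim.
# Its description cost grows with the total length of the examples
# (one bit per character over the alphabet {a, b}).
memorize_cost = sum(len(s) * bits(2) for s in data)
memorize_data_cost = sum(bits(len(data)) for _ in data)  # pick each example from the memorized list

# Hypothesis 2: a short "program" for a^n b^n (assumed fixed cost),
# which explains each example by the single choice of n.
pattern_cost = 10.0   # assumed constant cost of the small program
max_n = 10            # assumed bound on n, for illustration only
pattern_data_cost = sum(bits(max_n) for _ in data)

print("memorizer total:", memorize_cost + memorize_data_cost)
print("pattern total:  ", pattern_cost + pattern_data_cost)
# As examples accumulate (or grow longer), the memorizer's cost keeps rising,
# while the pattern hypothesis pays only ~log2(max_n) bits per example,
# so the simplicity preference increasingly favors the generalizing program.
```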
The easiest and hardest languages for the algorithm to learn cross the boundaries of the Chomsky hierarchy (regular, context-free, context-sensitive), suggesting that this typology might not be the most cognitively accurate description of human learning. Some languages are shown to be hard for the algorithm, but humans also struggle with some languages. Ultimately, a simple combinatorial argument shows that not all languages can be learnable from few examples: there are vastly more candidate languages than there are small datasets to distinguish them.
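A minimal sketch of that counting argument, with assumed notation (an alphabet Σ and a length bound n) rather than the paper's exact formulation:

```latex
% Illustrative counting argument (assumed notation, not taken from the paper).
\[
  N \;=\; \sum_{i=0}^{n} |\Sigma|^{i}
  \qquad \text{strings of length at most } n,
\]
\[
  \underbrace{N^{m}}_{\text{possible datasets of } m \text{ examples}}
  \;\ll\;
  \underbrace{2^{N}}_{\text{languages over these strings}} .
\]
% A learner that maps each dataset to one hypothesis can single out at most
% N^m of the 2^N languages, so most languages cannot be identified from m
% examples; any successful learner must be biased toward a small subclass,
% here the languages with short programs.
```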
This is an interesting realization of a previously known idea (that a universal learner from positive examples alone is theoretically possible), though it is unclear how much a working implementation contributes to the theoretical debate about human language learning.