Recently Carson Chow posted a short quote of von Neumann's musings on what happens when mathematics strays from the empirical context in which an idea originated. [Hat tip Daniel Lemire for retweeting it around.]
“[M]athematical ideas originate in empirics, although the genealogy is sometimes long and obscure. But, once they are so conceived, the subject begins to live a peculiar life of its own and is better compared to a creative one, governed by almost entirely aesthetic considerations, than to anything else, and, in particular, to an empirical science. There is, however, a further point which, I believe, needs stressing. As a mathematical discipline travels far from its empirical source, or still more, if it is a second and third generation only indirectly inspired by ideas coming from ‘reality’, it is beset with very grave dangers. It becomes more and more purely aestheticising, more and more purely l’art pour l’art. This need not be bad, if the field is surrounded by correlated subjects, which still have closer empirical connections, or if the discipline is under the influence of men with an exceptionally well-developed taste. But there is a grave danger that the subject will develop along the line of least resistance, that the stream, so far from its source, will separate into a multitude of insignificant branches, and that the discipline will become a disorganised mass of details and complexities. In other words, at a great distance from its empirical source, or after much ‘abstract’ inbreeding, a mathematical subject is in danger of degeneration.”
The quote was unsourced in the post. It originally appeared in von Neumann's 1947 essay The Mathematician, which is republished here:
Part 1 and Part 2
Recently I was reading Emanuel Derman's book Models.Behaving.Badly.: Why Confusing Illusion with Reality Can Lead to Disaster, on Wall Street and in Life, which contains this passage:
Friedrich Hayek, the Austrian economist who received the 1974 Nobel Memorial Prize in Economics, pointed out that in the physical sciences we know the macroscopic through concrete experience and proceed from it to the microscopic by abstraction. [...] In economics, Hayek argued, the order of abstraction should be reversed: we know the individual agents and players from concrete personal experience, and the macroscopic "economy" is the abstraction. If the correct way to proceed is from the concrete to the abstract, he argued, in economics we should begin with agents and proceed to economies and markets rather than vice versa.
These are highly contrasting views!
Hayek argued that the market economy was not designed but rather emerged from the fairly simple actions of the actors within it (negotiating on price, and so on). Hayek is essentially saying that we should be able to reproduce the complexity of a given economy if only we could capture an accurate description of all of the 'rules' the actors are obeying.
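That emergence claim can be made concrete with a deliberately toy sketch (the numbers and rules here are entirely hypothetical, not anything from Hayek): sellers who each follow one local pricing rule, with nobody setting a market price from above.

```python
import random

# Hypothetical toy market: each seller follows one local rule --
# nudge your price up after a sale, cut it after a miss. No agent
# knows or sets the "market price", yet prices converge toward the
# buyers' valuation: order emerging from simple individual rules.

def simulate_market(rounds=200, n_sellers=20, buyer_value=10.0, seed=0):
    rng = random.Random(seed)
    prices = [rng.uniform(1.0, 20.0) for _ in range(n_sellers)]
    for _ in range(rounds):
        for i, p in enumerate(prices):
            if p <= buyer_value:          # buyer accepts: raise the price a bit
                prices[i] = p * 1.05
            else:                         # no sale: cut the price
                prices[i] = p * 0.95
    return prices

prices = simulate_market()
spread = max(prices) - min(prices)        # prices end up clustered near 10
```

The interesting part is that convergence is nowhere in the rules; it is a property of the whole system.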
Looking back on my own intellectual development, I was first drawn to these emergent ideas within the 'bottom-up' branch of Artificial Intelligence. It was fascinating to set up a simple rule in a cellular automaton and watch it generate amazing complexity. Similarly, the basic Darwinian model of evolution can be reproduced in a very short piece of code that, given an objective function, solves problems indirectly through a massive number of randomized trials steered by that function.
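That evolutionary loop really is short. A minimal sketch of a (1+1)-style evolutionary algorithm, with bit-counting ("OneMax") standing in as a placeholder objective function:

```python
import random

# Minimal Darwinian loop: mutate the parent, keep the child only if
# the objective function scores it at least as well. The objective
# here (count of 1-bits) is just a stand-in -- any scoring function
# over bitstrings would direct the randomized trials the same way.

def evolve(objective, length=32, trials=3000, seed=1):
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(length)]
    for _ in range(trials):
        # flip each bit with probability 1/length
        child = [b ^ (rng.random() < 1.0 / length) for b in parent]
        if objective(child) >= objective(parent):
            parent = child
    return parent

best = evolve(sum)   # objective: number of 1s in the bitstring
```

Nothing in the loop knows anything about the problem; the objective function alone steers the random trials toward a solution.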
Lately I'm a 'data scientist', which means I'm attempting to extract models from data. Common methods try to wash out enough complexity in the data to derive a model that is inspectable and understandable by a person. This is fundamentally grounded in empirics, since the extracted model is tested for accurate prediction against other data. The inspectable models are used mostly for storytelling, helping users of the system understand what is being learned from the data, yet we leave specific predictions exclusively to the algorithms.
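As a minimal illustration of that empirical grounding (synthetic linear data here, not any particular method or dataset of mine): the extracted "model" is two numbers a person can read, and it earns its keep by predicting held-out data it never saw during fitting.

```python
import random

# The extracted model is just a fitted line: inspectable (a slope and
# an intercept you can read and reason about) and empirically tested,
# since we judge it by prediction error on held-out data.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

rng = random.Random(0)
# synthetic data: y = 3x + 1 plus noise
data = [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.5))
        for x in [rng.uniform(0, 10) for _ in range(100)]]
train, test = data[:80], data[80:]

slope, intercept = fit_line(*zip(*train))
test_error = sum(abs((slope * x + intercept) - y)
                 for x, y in test) / len(test)
```

The fitted slope and intercept land near the true 3 and 1, and the mean error on the held-out points stays close to the noise level — the model is both a story and a prediction machine.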
Of course one can't resolve these abstract differences in a blog post; that's a tall order.
I would, however, assert that every learning algorithm should satisfy two fundamentals:
- It accurately predicts outcomes given input data.
- It emits a model that is inspectable and can inform the building of better 'toy models' of the system under study.
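As a toy illustration of both criteria at once (made-up data and a deliberately weak learner, chosen only because its output is readable): a one-split decision stump both predicts and emits a rule a person can inspect.

```python
# A one-split "decision stump": weak as a predictor, but it meets both
# fundamentals -- it classifies inputs, and it emits a rule a person
# can read and fold into a toy model of the system under study.

def fit_stump(points):
    # points: (x, label) pairs; pick the threshold t for the rule
    # "predict True if x >= t" that classifies the most points correctly
    def score(t):
        return sum((x >= t) == label for x, label in points)
    return max((x for x, _ in points), key=score)

data = [(1, False), (2, False), (3, False),
        (7, True), (8, True), (9, True)]
threshold = fit_stump(data)
rule = f"predict True if x >= {threshold}"   # the inspectable model
```

A deep black-box model might score no better on this toy data, but it could never hand you that one-line rule.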
No one disputes the first, yet many seem all too happy to ignore the second, content to build increasingly powerful black boxes with 'big data' machinery.