Follow the reluctant adventures in the life of a Welsh astrophysicist sent around the world for some reason, wherein I photograph potatoes and destroy galaxies in the name of science.

Tuesday, 4 September 2018

This Equation Shows You Can't Quantify Everything

Yeah, I used a clickbaity headline. So sue me.

Recently I went on an extensive rant about the fundamental assumptions of science. One of them, I said, was that things have to be measurable. And that's basically true, I think... but there are interesting subtleties. You might well be familiar with the weirdness of the quantum realms, as in the double-slit experiment where "particles" can apparently be in two places at once. What you might be less aware of is that much, much larger things can be just as hard to measure. You really don't need carefully controlled laboratory conditions to see how bizarre reality can get.

Measuring some things is hard...

In astronomy, if you're hunting for galaxies in a new data set, you have to try and estimate these things called completeness and reliability. They're quite simple concepts but they have very strict meanings - thankfully, for once, quite intuitive ones. Consider a naturalist trying to identify some meerkats at a great distance :

There are ten animals here - nine meerkats and one mere cat. Now the naturalist could, if he really wanted, shoot all of them dead or gas them or something, and count them at leisure. In that case there would be no uncertainty at all.

Real naturalists obviously aren't like that. They're more likely to try and count them from a safe distance, say using a small hand-held telescope. Our naturalist won't be able to hold it perfectly steady, it might be a bit blurry, and the animals are probably going to move around a bit - maybe it's getting a bit dark too. His observations therefore have limited sensitivity, resolution, and various sorts of errors. There are all kinds of reasons he might miss or misidentify some of the animals. Maybe he's also very stupid, blind drunk, or simultaneously fornicating with a rhinoceros. Tonnes of reasons.

Hey, I'm not judging.
If the naturalist correctly catalogues the nine actual meerkats, then we say his catalogue is 100% complete : he's found all the animals he was interested in. It doesn't matter if he also thinks the mere cat is a meerkat or if he goes completely mental and decides that some rocks and blades of grass are also meerkats, the completeness will still be 100%.

If, on the other hand, the naturalist is more diligent and demanding, but too much of a perfectionist, he might only identify one meerkat and nothing else. In this case his catalogue will be 100% reliable. The fact he's missed eight other meerkats doesn't diminish the reliability at all, it just means the completeness isn't as good as it could be.
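If you like to think in code, the two quantities are just different ratios over the same cross-match between reality and the catalogue. A toy sketch (the animal IDs are obviously made up) :

```python
def completeness(truth, catalogue):
    """Fraction of the real objects that made it into the catalogue."""
    return len(truth & catalogue) / len(truth)

def reliability(truth, catalogue):
    """Fraction of the catalogue entries that are real objects."""
    return len(truth & catalogue) / len(catalogue)

truth = {f"meerkat_{i}" for i in range(9)}    # nine real meerkats

# Over-eager naturalist : finds everything, plus a mere cat and some rocks.
eager = truth | {"mere_cat", "rock_1", "rock_2"}
print(completeness(truth, eager), reliability(truth, eager))   # 1.0 0.75

# Perfectionist : one meerkat, nothing else.
fussy = {"meerkat_0"}
print(completeness(truth, fussy), reliability(truth, fussy))   # 0.111... 1.0
```

Note that neither number knows anything about the other : you can make reliability perfect by throwing almost everything away, and completeness perfect by accepting every rock in sight.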

Ideally of course you want a catalogue which is both 100% complete (finding all the meerkats) and 100% reliable (only finding real meerkats). Of course in reality things are never this good. This terminology matters quite a lot... consider this shark-finding drone, which claims to have a 92% reliability. See the problem ? Reliability is independent of completeness, so - in principle - it could be missing thousands upon thousands of sharks !

And that would be bad.
Naturalists at least have the option of going out and catching their subjects, if they really want to. Astronomers don't have that luxury, making it crucial to understand the difference between sensitivity, reliability, and completeness. Sensitivity is about whether it's even possible to detect something at all, e.g. do you have enough light and/or a sufficiently powerful telescope to see the meerkats ? Completeness and reliability, on the other hand, are about whether you actually do detect them. You might have good enough vision and sufficient light, but all sorts of other errors can lead to misidentifications.

...but measuring other things is impossible

It's possible to rigorously quantify sensitivity. Let's switch to astronomy so we can have some hard numbers. In that case, we can quantify very precisely the smallest mass of a galaxy we can ever possibly detect. This is our theoretical sensitivity limit. With the data that we have, we'll never be able to detect galaxies less massive than this - not ever*. But does that mean we will absolutely definitely actually detect things above this limit ?

* As long as we don't reprocess the data in some fancy way. There are various methods for doing this, but they all have associated penalties.

Of course not. It's just like the meerkats : just because you can spot something doesn't mean you actually will. Except there's an added complication here that makes things fundamentally different and more philosophically interesting. We can never know for sure how many galaxies our data sets contain. It's as though we looked at the African savannah and decided that while we couldn't see any, we couldn't quite rule out the possible existence of a gigantic, fifty tonne super-meerkat.

Dammit, internet ! That meerkat is clearly much heavier than fifty tonnes ! Idiots...
One way to illustrate this is through low surface brightness galaxies. Here's a low-sensitivity image of a fairly boring-looking galaxy :

My word, that's dull. We could quite easily work out, though, how much light we'd need in any single pixel to be able to detect it. This would be our sensitivity limit : there'd be no way to detect anything fainter than the faintest thing we could see in one pixel. This lower limit would be nice and solid. The problem is that it tells us nothing much at all about features above this limit that we could detect but just don't. And in fact, a much more sensitive survey of the same region found this :

This is an astronomical fifty tonne super meerkat, otherwise known as the galaxy Malin 1. "Low surface brightness" just means that it doesn't emit much light per unit area, like spreading butter on toast so thin you can barely taste it in any bite. Malin 1 is massive, but so spread out it's difficult to see. This is why completeness is, strictly speaking, impossible to measure in astronomy catalogues - and you have to be extremely careful when you speak of sensitivity limits. Sensitivity limits are not at all the same as completeness limits.
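A quick back-of-the-envelope sketch of why this happens (all numbers invented for illustration, nothing to do with Malin 1's actual properties) : give two galaxies the same total light, spread one of them over vastly more pixels, and compare against a per-pixel detection threshold :

```python
noise_per_pixel = 1.0     # arbitrary units : our per-pixel detection threshold
total_flux = 500.0        # the same total light in both cases

for name, n_pixels in [("compact", 100), ("spread out", 100_000)]:
    flux_per_pixel = total_flux / n_pixels
    verdict = "detectable" if flux_per_pixel > noise_per_pixel else "invisible"
    print(f"{name} : {flux_per_pixel:g} per pixel -> {verdict}")
# compact : 5 per pixel -> detectable
# spread out : 0.005 per pixel -> invisible
```

Same total brightness, wildly different detectability : the butter is the same, it's the toast that's bigger.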

To be fair, the way you calculate sensitivity does matter : if you account for the surface brightness sensitivity, then Malin 1 was indeed undetectable in the first image. But that still means you can't give a mass completeness; you can't say, "I've definitely detected all the galaxies more massive than such-and-such", because there could always be something really big but very faint hiding in the noise. And worse, the problem remains that you can never guarantee everything detectable will actually be detected. Let me switch to radio astronomy for this.

Let's do some maths (but nothing difficult, I promise)

That's right : maths. Not math. That would be short for mathematic, and that doesn't make any sense at all.

Anyway, in radio astronomy what you often get is not an image (though of course we can get those too) but a spectrum. This plots brightness at different frequencies. Galaxies emit radio waves at different frequencies depending on how fast they're moving towards or away from us. Individual galaxies have stars and gas all moving at slightly different velocities, so each one is typically detected over some small frequency range. They can look like this, for example :

That would be a nice clear detection, very easy to spot. You can see there's quite a lot of random noise - this is due to a whole bunch of different effects and can never be eliminated completely - but the galaxy itself is obvious. A useful way to measure how detectable a galaxy is is via its signal-to-noise ratio (S/N). An S/N of 1.0 means the galaxy is only as bright as the typical noise values, so it would be impossible to distinguish from the noise and not detectable. That's what gives us our sensitivity limit.
Examples (fake) of a galaxy at lower and lower S/N levels from left to right.
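Measuring S/N on a spectrum is as straightforward as it sounds : peak signal divided by the noise level, with the noise estimated from signal-free channels. A fake-spectrum sketch (all values invented) :

```python
import numpy as np

rng = np.random.default_rng(0)
channels = np.arange(1024)

# Pure noise with an rms of 1.0 (arbitrary units), plus a fake galaxy :
# a Gaussian line of peak 8 centred on channel 512.
noise = rng.normal(0.0, 1.0, 1024)
signal = 8.0 * np.exp(-0.5 * ((channels - 512) / 15.0) ** 2)
spectrum = noise + signal

# Estimate the noise from channels well away from the line...
rms = spectrum[:400].std()
# ...and take the S/N as the peak divided by that estimate.
snr = spectrum.max() / rms
print(f"S/N ~ {snr:.1f}")    # around 8, give or take the noise
```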

But what about a completeness limit ? That's harder. An S/N of 2 probably wouldn't be detected either, because the noise level does tend to vary a fair bit. Neither would 3, 4, 5 or even higher values... depending on the frequency range the galaxy emits at. If it's very narrow, then we might need really high values - say 10 or 20 - to stand a good chance of detecting it. The reason is that real data sets are often plagued by very narrow spikes in the noise, due to the natural variation in the noise and artificial sources of interference. In contrast, if the range was a bit wider, it might be quite easy to detect at lower S/N levels.

Here's the equation that we need to understand this :
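It usually gets written something like this (the exact numerical constant depends on the definitions used, so take it as indicative rather than gospel) :

```latex
\mathrm{S/N} \;\approx\; \frac{M_{HI}}{2.36\times10^{5}\; d^{2}\, W\, \sigma_{rms}}
```

with $M_{HI}$ in solar masses, $d$ in Mpc, $W$ in km/s and $\sigma_{rms}$ in Jy : double the width at fixed mass and distance, and you halve the peak S/N.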

The numerical constants aren't important. What matters is that the S/N level is governed by distance (d), mass (MHI), and velocity (or frequency) width (W). The parameter σrms is a measure of how noisy the data is, and not important for us.

So let's imagine we have a galaxy at a fixed distance and of a fixed mass, but we're magically able to vary its velocity width. Real galaxies do have different widths because their rotation speed varies, so this example is very much applicable to real observations. This little animation shows what happens as we make the width greater and greater while keeping everything else constant :

We start off with a narrow spike, reach a happy middle where the galaxy is unambiguous, and then we get the galaxy appearing as little more than a bump in the noise. And the mass is the same at every stage. So again, we can't guarantee that we'd detect every galaxy of a certain mass, just because of the variation in galaxy properties. Mass completeness is impossible to measure. Literally impossible - it's not a matter of using different ways to examine the data, because if the galaxy is wide enough then it becomes absolutely indistinguishable from the noise. Objective algorithms and subjective visual inspection are equally hapless here.
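The animation can be mimicked with a toy model : keep the integrated flux (i.e. the mass) fixed, spread it over ever wider velocity widths, and watch the peak S/N sink into the noise. This is only a sketch with invented numbers, using a crude flat-topped line profile :

```python
import numpy as np

def peak_snr(width_kms, total_flux=400.0, rms=1.0,
             n_chan=1024, channel_width=5.0, seed=1):
    """Peak S/N of a flat-topped line of fixed integrated flux,
    injected into Gaussian noise of the given rms."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, rms, n_chan)
    n_wide = max(1, int(width_kms / channel_width))
    profile = np.zeros(n_chan)
    start = (n_chan - n_wide) // 2
    profile[start:start + n_wide] = total_flux / width_kms
    return (noise + profile).max() / rms

# Same "galaxy" every time, only the width changes :
for w in [25, 100, 400, 1600]:
    print(f"W = {w:4d} km/s -> peak S/N ~ {peak_snr(w):.1f}")
```

By the widest case the line adds so little per channel that the "peak" is just whichever noise spike happens to be tallest : the galaxy is gone, even though its mass never changed.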

Ironically then, this simple equation has led us to immeasurable properties. There are even more - quite a lot more, actually - subtleties to this, but the point has been made. While we can measure reliability by redoing the observations, we can't know if our survey has missed something. So we can't know what the full properties of the real galaxy population are really like. How wide a frequency range can they really span ? How massive can they get ? We can never know for sure.

Which brings me back to my original point. We have an equation - an actual honest-to-God equation, not some namby-pamby wishy-washy handwaving philosophy - showing us that there are things we can't measure. And I, for one, think that's rather neat.

You've killed science. Please don't do that.

Does this mean I was wrong to say science assumes things are measurable ? Not exactly, but it does need to be phrased more... delicately. We assume physical things are measurable, but not necessarily with perfect accuracy. The Uncertainty Principle already famously puts fundamental limits on things on ridonculously teeny-weeny scales, but here we have an example of uncertainty on a much, MUCH BIGGER scale. And just as quantum effects tend to reduce us to probabilistic estimates rather than forbidding measurements completely, so it is here, to some extent.

We can't measure the true completeness limit. But we can at least compare the completeness of different search techniques to each other. Remember, we can verify reliability by doing repeat observations to see if what we find is really there. So by combining all our different search techniques and follow-up measurements, we can at least estimate completeness if not measure it directly, and we can certainly get a handle on which methods are better.

The point is that completeness, while scientifically of undeniable importance, isn't a physical thing. Some properties are innate, others are relational. Take sheep. If we have two sheep charging across the fields at each other, they have both innate and relational properties. The mass of each sheep (or number of atoms if you want to avoid complications like the nature of mass*) is innate. The velocity of each sheep relative to each other is relational, by definition. While every property arguably does have relations to every other, they aren't all intrinsically relational. The number of atoms in each sheep might be related to what it was doing earlier (e.g. pooping), but at any given moment it doesn't depend on the properties of anything else at all. The relative speed of the sheep, on the other hand, is intrinsically a relational property. It can never be expressed except with reference to the other sheep.

* Let's leave the nature of number for now, mmmkay ?

Completeness isn't a physical thing. Is it a relational thing ? Arguably, in some sense. Completeness can be measured as a relational property, by comparing different measurements. But true completeness can never be measured. It's neither physical nor relational : it's conceptual. And conceptual properties, despite being very useful scientifically, can have disturbingly un-scientific aspects...

At least we can quantify completeness, even if we can't know the true numerical value (it's a bit like the difference between countable and uncountable infinities). But consider justice, or guilt, or yellow. Can you quantify them ? Can you put a number on how fair an action is ? Guilt's an especially nice one. If someone was discovered to have aided a criminal, the original criminal's guilt clearly isn't diminished, not even as a fraction of the total : having had assistance obviously doesn't confer diminished responsibility. Guilt isn't like mass or energy, which are conserved - you can't quantify it at all.

In case you thought I'd gone mad by suggesting colour as an immeasurable quality, there are no red pixels in this image.
Are conceptual properties real ? Clearly yes, but they're non-physical. Which means that reality is more than physicality. And if that seems like a very bold statement on such a profound issue, it probably is. I'll make it anyway for the sake of argument.

What exactly does this mean ?

It depends on how far we can extend this. A pet idea of mine is that notions like these imply that dualism - the old idea that the mind and body are distinct - is true at least in a very limited extent. Descartes had his famous mind-body problem (do read that link), where he couldn't work out how a non-physical mind could apparently control a physical body. Leaving aside the nature of mind and thought, the basic problem seems to be whether the non-physical can ever affect the physical. Maybe :
  • The world is entirely physical, with stuff interacting through direct contact though in ways we clearly don't yet fully understand that gives rise to the mere appearance of non-physicality.
  • The world is partially physical and non-physical. Non-physical properties could either interact somehow with physical ones (e.g. E.M. fields, gravity, ideas of justice, etc.), or simply be non-participating, essentially illusory artifacts, like rainbows. 
  • The world is entirely non-physical : a shard of the mind of God or a high-tech simulation. Causality may or may not be real.
Philosophy has the liberty to explore all of these possibilities and more, whereas science is constrained by the evidence of the time. While the two have undeniably grown apart, and sometimes estranged, I think this is one issue on which they remain inseparable.

Everyday intuition would probably suggest to most of us that the middle one is correct : conceptual properties are real, non-physical, but interact with the world. If we see something that goes against our idea of justice, we may take action to correct it. If our galaxy-finding algorithm performs badly, we may improve it. And we obviously can't act without having observed these problems. So these conceptual properties do have influence... ah. Oh dear.

Stop and think about that for a moment.

These are non-physical, immeasurable things, apparently having a profound effect on reality ! Does this mean there are some things we'll never be able to understand rationally, or simulate ? Is idealism correct after all ? Is the boundary between physical, objective reality and subjective thought more blurred than we might like ?

Which is something I've previously attempted to depict artistically using radio data.
Well, some of those questions are hard to answer. But don't panic ! We need not fear that the woo-woo merchants are about to disembowel science with ritual chanting and whatnot. Even if we grant that non-physical things affect the world, they do so very much indirectly. They affect our mental states, which in turn cause us to take direct physical action. They do not cause galaxies to explode or your cactus to sing or anything stupid like that. And your mental actions don't directly cause any crazy things to happen either. Your dream about the giant wombat with the staple gun poses no threat to society or anything else for that matter. It's a bit like the simple "Change" spells of Terry Pratchett's Discworld, as in Wyrd Sisters, where the young witch Magrat finds her broom has run out of energy mid-flight :
Some kind of Change spell was probably in order. Magrat concentrated.
Well, that seemed to work. 
Nothing in the sight of mortal man had in fact changed. What Magrat had achieved was a mere adjustment of the mental processes, from a bewildered and slightly frightened woman gliding inexorably toward the inhospitable ground to a clearheaded, optimistic and positive thinking woman who had really got it together, was taking full responsibility for her own life and in general knew where she was coming from although, unfortunately, where she was heading had not changed in any way. But she felt a lot better about it. 
So this doesn't appear to be something that can obliterate the scientific method, or even give it a nasty shock. What it does do is say that the scientific method is potentially limited, that there are some things we can't simulate... at least not mathematically. Which is very interesting, but it doesn't suggest that existing simulations or mathematical analyses are wrong.

Let me reinforce that. That some non-physical states exist, in this interpretation, doesn't mean that every conceivable non-physical state is actually possible, much less actually does exist somehow. You're no more compelled to believe in God or ghosts than you are in a Bose-Einstein condensate lurking in your closet or that you'd find a lump of strange matter in your cheese. Just because something can in principle exist in no way means that it's possible that it does exist*, still less that it actually does. Phew, thank goodness for that !

* E.g. in principle the Moon could be made of cheese; in practice this is impossible.

Some people consider even this limited influence of the non-corporeal to be a step too far. They quite rightly point out that it still doesn't solve the problem of how things of such different natures could interact. Most people, I'd say, are quite happy to let this be, accepting that while they can't explain how the physical and non-physical can interact, they quite clearly do, so nah-nah-nah-nah-nah. A more reductionist perspective finds this unsatisfying. They'd probably point out that E.M. fields and the like can be explained by force-carrying particles, so cases of apparent "spooky action at a distance" can be restored to normality.

Of course, there's more to the notion of action at a distance than E.M. fields. There's wave-particle duality, Many Worlds, pilot waves, and all that quantum craziness, not to mention curved spacetime in general relativity. The reductionist view is essentially that either non-physical things just don't exist - they're a sort of illusion but produced entirely by physical things - or that they do exist but have no influence of any kind, not even mentally. Consciousness, for example, is a process that just observes what physical processes get up to, whilst being completely unimportant by itself.

Here philosophy and science collide head-on, and anyone who thinks they definitely know what the answer is ought to be given a very wide berth indeed. Personally, while admitting that not being able to explain how the physical and non-physical can interact is clearly unsatisfactory, it seems to me at least equally unsatisfactory to suggest the non-physical doesn't really exist. I would even say it's contradicted not just by advanced contemporary science but also by simple relational properties. The reductionist perspective offers no real clue as to where the illusion of non-physical stuff actually comes from. And it seems to me that science does seem to allow things of very different natures to interact - e.g. massless photons can excite electrons, neutrinos mostly don't interact but sometimes do, etc. - even if, again, it can't necessarily explain exactly how.


I don't have any, though I do have preferences.

What seems to be reasonably clear is that some things are unmeasurable and unquantifiable. The consequences of that depend very strongly on the true nature of those quantities.

If, as in my preference, they can affect the world, then this means there's a limit to what we can simulate and describe through mathematical analysis. There would be aspects of the world that no amount of improvements to scientific accuracy would ever allow us to measure, because they're fundamentally unmeasurable. This doesn't imply the reality of any kind of Magical Mystical Woo* : the existence of some unmeasurable things doesn't necessitate the existence of all unmeasurable things. It would just mean that we can't know everything scientifically, no matter how carefully we examine the world.

* Someone should really name their child that. And they should grow up to become a teacher, so that all can benefit from the teachings of Magical Mystical Woo.

Obviously this viewpoint is not without its problems. It wouldn't solve how non-physical and physical things can apparently interact. While we don't necessarily observe the non-physical things, we do conceive them and are thus influenced by them.

The difference is interesting and important. For example, if I measure the completeness of a survey, or better yet something more mathematically complex that requires extended cognition, I have to write down the number before I can observe it. Doesn't matter how I write the number : I could use ink, bits of pasta, or arrange megalithic stones if I wanted. My brain doesn't care what configuration the number is in, it's able to discern the number itself from the infinite different ways it could have been presented. So I'm not observing completeness directly : that's a thing which only arises mentally. It still affects me, but it's very different from, say, a ghost, which would have to interact directly with the observable world to be visible.
Imagining and observing a ghost are clearly different things, despite absurd claims to the contrary.
And this idea wouldn't solve what thoughts are either, or what makes some electrochemical processes give rise to awareness while others, like those in calculators and possibly in plants, apparently do not. But since this viewpoint holds that some things are unknowable anyway, that ought not to be a major issue... science, so far as I'm aware, isn't yet capable of saying how things of such different natures as photons and atoms can interact. It just describes the ways in which they do.

I also favour the view that our awareness allows us control we wouldn't otherwise have. Extended cognition is a great example : here it seems we actually need to be conscious to make calculations, and we can't act on the results until they are consciously observed. Blindsight is interesting, but this seems more like a flawed consciousness than truly lacking one. Anyway, consciousness isn't exactly a binary state : we may be unconscious while dreaming, but it's hardly as if closing our eyes gives us the mental capacity of a rock. We're still thinking, still perceiving our thoughts. It would seem to me a highly contrived scenario if we could do all this unconsciously but somehow, for whatever reason, just didn't. Much more likely we actually do genuinely need awareness for some things.

That's my view then : not everything is measurable, but I discount Mystical Woo; I believe our mental concepts allow us to interact with the world through our own choices. I don't claim to know how it all works. And this view is somewhat dependent on scientific findings, so I'd have to revise it if suitable scientific models came along.

More reductionist approaches aren't uninteresting, but I find them unsatisfactory. I rather like the idea of consciousness as a sort of pure observer that isn't able to influence the world, with everything we think we control being a deception. Yet this seems a strange and completely unnecessary process, and doesn't seem to really do away with the unphysical as it might appear to. It doesn't say anything much about how vast, complex, imaginary concepts arise from atoms bashing about. And anyway pure observation is considered by mainstream science to be impossible.

I'm more intrigued by studies on emergence. Rather than doing away with non-physical things completely, the idea is that relational, non-physical properties arise only with sufficient complexity, e.g. two atoms can have relative velocities but enough atoms together can have a sense of social justice and a burning desire for pizza. There are even particularly strange notions wherein complexity is required for emergent behaviour but doesn't directly cause it...

In any case, emergent complexity is intriguing. But it doesn't seem to me to be terribly convincing. I don't think it's going to help with understanding non-physical concepts much at all. Not at their root, at any rate.

So I say the common sense view has it right in this case. Imaginary things are imaginary and exist in a different sense than physical things. They can affect things but only mentally, not directly. It's an open question as to whether, as some have suggested, we might need new physics to explain this. And as for free will, that's a topic for another post.

Tuesday, 21 August 2018

Will No-One Rid Me Of This Turbulent Sphere ?


December, 1170 A.D. Henry II, king of England and de facto ruler of much of France, is holding court in Normandy. Though one of the most powerful men in Europe, Henry's Christmas season is about to be upset by one of the most infamous and darkest incidents of his long and successful reign. A capable warrior and masterful diplomat, even Henry is not wholly immune to mistakes, and his worst, a gnawing corrosion of his authority for much of the last decade, is reaching a climactic finale that will forever overshadow all of his achievements.

Then as now, despite a far greater direct influence of the Church in political affairs, the separation of Church and state is clear. Years before, the ambitious Henry, seeking further authority over the multitudinous subjects within his vast dominions, attempted to circumvent this by appointing his best friend as Archbishop of Canterbury. It was a rare but catastrophic misjudgement. Conscious of his own inadequacies as a bishop, Thomas Becket was unable to balance the competing demands of his energetic prince and the barely concealed sneers of his ecclesiastical flock. Henry, unwittingly, had forced his hand.

For reasons we will never fully understand, Becket sided firmly and irrevocably with the Church. Years of conflict destroyed his friendship with Henry. Finally, late in December of 1170, a relatively petty incident sealed his fate. Hearing of his latest misdemeanour, an exasperated Henry is said to have cried out one of the most famous lines in English history : "Will no-one rid me of this turbulent priest ?!?"

Henry's words, the story goes, were overheard by a group of soldiers, who dubiously interpreted this rhetoric as a royal command. Becket was cruelly martyred a few days later. It seems that Henry was genuinely distraught by this and never forgave himself for what may well have been, in defiance of cynical expectations, little more than a tragic accident.

That was the prologue. Did you enjoy it ? I hope so, but now it's time to talk about science. We'll get back to Becket later.

Introduction : turbulence is still a right pain

Almost 850 years have elapsed since that dark and fateful day. Fortunately, though astronomy is not itself without occasional conflict with the more literal elements of the priesthood, its connection to contemporary circumstances is merely allegorical. Our problems with turbulence are not poetic expressions of anger : they actually are problems with turbulence itself, albeit turbulent spheres rather than priests. Although I suppose if you could find a really fat, angry priest who hated astronomy, that would count too.

Turbulence is the subject of my latest paper*, specifically regarding dark galaxies. Regular readers know that I return to this topic like an over-zealous yo-yo, so I'll be brief. Cosmological simulations predict many more galaxies than we can see. In the most recent simulations, while most clouds of dark matter eventually accrete gas and start churning it into stars, many others do not. According to these models, most galaxies are surrounded by swarms of these dark galaxies - small, sterile, stillborn clumps condemned to a perpetually starless un-life.

* It took a surprising amount of effort to get MNRAS to accept the title and the final version is long-winded and boring. At least they let us mention Henry II in the acknowledgements though.

That sounds like a very ugly scenario that's hard to test and fiendish to disprove. How do you go about finding something that, by definition, you can't see ?

Fortunately a few of these ethereal bodies might not be quite as dark as the rest. Star formation, we know, is not an inevitable consequence of gas. While it doesn't appear to be particularly difficult to start, there are conditions under which it can be suppressed. Even normal, giant galaxies have gas discs which are typically much more extended than their stellar discs - they have enormous regions, often double the size (or more) of what we can see in visible light, without any star formation at all. So, maybe some of these much smaller dark halos could have just enough gas to be detectable, but not enough to trigger that vital process that turns dead gas into a shining nuclear furnace.

Candidates for such exotic objects are rare and controversial. A few do exist, but there are rival explanations that at first glance seem more plausible and less drastic than having to invoke starless galaxies. If we're to have a chance of working out what these things really are, rather than what we'd like them to be, we need to be rather more cautious than King Henry's turbulent priest.

Key to understanding the true nature of these starless atomic hydrogen clouds are their internal velocities. The critical requirement is that their motions must be so fast that without a binding dark matter halo they would quickly disintegrate (that's why we think dark matter exists in the first place). Of course, it's also possible that we happen to detect clouds which actually are in the process of disintegration, so the velocity dispersion by itself is of limited benefit. But we can do better than that : we can map how the gas is moving throughout the galaxy. So what would be much more persuasive is if those motions had the neat, ordered structure expected from rotation but hard to create through other processes.

From the SDSS blog. A galaxy dominated by nice, ordered rotational motion would have a velocity structure like the middle panel, whereas one dominated by random motions would look more like the right panel.

And if we really get to indulge ourselves, we'd like our candidates to be as antisocial and far away from other galaxies as possible. It's well established that many starless clouds can be produced by gravitational interactions that tear gas out of galaxies if they pass sufficiently close. Our previous works have attempted to quantify the parameters of such debris, so that we can say which clouds are more likely to be tidal debris and which might be genuine dark galaxies. The answer depends on the size, mass, isolation and velocity width of the clouds all at once.

Juggling all these parameters together is tricky, but we can simplify things. We were particularly interested in a few clouds in the Virgo cluster which I'd found during my PhD, partly out of sheer ego if I'm completely honest, and partly because of their remarkably high velocity dispersions. Velocity dispersions have to be coupled with mass and size to tell you anything about an object's total mass. That's not so easy with Arecibo, which has unrivalled sensitivity but spatial resolution about the same as the human eye's. So those measurements can only give us an upper size limit. Even so, the velocities of the Virgo clouds seemed so high there was no way to avoid a significant unseen dark matter component at any plausible size*.

* In principle the clouds could be teeny-tiny and super dense, which would mean they wouldn't need any dark matter. But at that size they'd be so dense they ought to be undergoing an insane orgy of wanton star formation, and they're not.
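To see why the numbers force that conclusion, here's a rough virial-style estimate in Python. The formula (M ~ σ²R/G) is the standard back-of-the-envelope dynamical mass; the HI mass and the sizes are illustrative round numbers, not the paper's exact values :

```python
# Crude virial-style dynamical mass estimate : M_dyn ~ sigma^2 * R / G.
# G in units of pc * (km/s)^2 / M_sun; cloud numbers are illustrative only.
G = 4.301e-3

def dynamical_mass(sigma_kms, radius_pc):
    """Total mass (in solar masses) needed to bind motions of sigma_kms at radius_pc."""
    return sigma_kms ** 2 * radius_pc / G

m_hi = 1e8      # illustrative HI mass in solar masses
sigma = 180.0   # characteristic velocity of the Virgo clouds in km/s

for radius_pc in (100.0, 1000.0, 8000.0):  # from tiny up to roughly the beam-scale upper limit
    ratio = dynamical_mass(sigma, radius_pc) / m_hi
    print(f"R = {radius_pc:7.0f} pc : M_dyn / M_HI ~ {ratio:,.0f}")
```

Even at the smallest plausible sizes the dynamical mass comfortably exceeds the visible gas mass, which is the whole problem in a nutshell.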

So if we fix the other parameters of the clouds to be similar to the observed ones in Virgo, we can reduce things down to just the velocity dispersion as our important measurement. That means we need any clouds in simulations to be small (no more than 50,000 light years across, about half the size of the Milky Way), isolated (at least 300,000 light years from the nearest galaxy), and have enough mass to be detectable to our survey (more than about 10 million solar masses of hydrogen). These criteria cover very broad ranges, so we're not restricting ourselves to impossibly specific demands.

In our simulations, we found that tidal encounters could easily explain clouds meeting these criteria if they had measured velocity dispersions less than 50 km/s. Objects with dispersions 50 - 100 km/s were unusual, but present. Features above 100 km/s were vanishingly rare. The clouds we were studying in reality were more like 180 km/s, so very unlikely to have been produced by these interactions. We also showed that this result isn't terribly model dependent : clouds of these parameters are fundamentally difficult to produce by tidal encounters.

Turbulent spheres are turbulent

So, the dark galaxy hypothesis naturally explains the properties of the observed clouds, while the tidal debris scenario has severe difficulties. In a testament to the scientific principle of dealing aggressively with confirmation bias, the dark galaxy hypothesis - which would also solve that long-standing missing galaxy problem - has proven chronically unpopular (though maybe less so these days), while the tidal debris scenario is generally accepted.

But perhaps there's a completely different explanation. In 2016, Burkhart & Loeb suggested that maybe there was another factor at work : the external pressure of the intracluster medium. Galaxy clusters, like the Virgo cluster where we found our interesting clouds, are not just great empty voids flecked with galaxies. Clusters themselves possess their own gas component : very hot, thin, and not bound to any particular galaxy but filling most of the cluster volume.

The intracluster medium as seen through ROSAT X-ray observations.
Burkhart & Loeb's model was a simple enough calculation. The external pressure from this hot gas acts to crush any unfortunate clouds, whereas pressure from the gas in the clouds themselves acts in the opposite way, trying to blow them apart. Their model said that maybe these two forces were in (more or less) balance, keeping the clouds stable for long enough that we'd be able to observe them. They used the pressure estimated from the X-ray data to infer the size of the clouds if they were to remain stable, and found it agreed with the size constraint from the original Arecibo measurements. Hurrah !
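The spirit of that calculation can be sketched in a few lines of Python : equate the cloud's internal pressure (roughly ρσ² for bulk motions) to the external intracluster pressure and solve for the radius. The ICM pressure (P/k ~ 10⁴ K cm⁻³) and cloud mass here are illustrative guesses of my own, not Burkhart & Loeb's actual inputs :

```python
# Sketch of a pressure-confinement size estimate (all inputs illustrative).
# Internal pressure ~ rho * sigma^2; balance against external pressure P_ext
# and solve for the radius : R = (3 M sigma^2 / (4 pi P_ext))^(1/3).
import math

K_B    = 1.380649e-23   # Boltzmann constant, J / K
M_SUN  = 1.989e30       # kg
PARSEC = 3.0857e16      # m

m_cloud  = 1e8 * M_SUN          # illustrative cloud gas mass
sigma    = 180e3                # m/s, characteristic internal velocity
p_over_k = 1e4 * 1e6            # assumed ICM P/k_B : 1e4 K cm^-3, in K m^-3
p_ext    = K_B * p_over_k       # external pressure in Pa

r_cubed   = 3 * m_cloud * sigma ** 2 / (4 * math.pi * p_ext)
radius_pc = r_cubed ** (1 / 3) / PARSEC
print(f"Equilibrium radius ~ {radius_pc:,.0f} pc")
```

For these made-up inputs the balance radius comes out at a few kiloparsecs, i.e. in the same general ballpark as the Arecibo upper size limit, which is the sort of agreement the model was claiming.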

And we should remember that there's an even more basic feature of the clouds that we often take for granted : they're made of neutral atomic hydrogen gas. Above a few tens of thousands of Kelvin, this gas becomes ionised and undetectable to our survey. The temperature of the clouds this new model demands is, if their pressure comes from ordinary thermal pressure, in excess of 100,000 K. This is much too hot to remain neutral. That means the source of the pressure can't be their internal heat, but it must be due to bulk motions of the gas moving in different directions : turbulence.
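That temperature claim is easy to check : if the observed dispersion were thermal, the required temperature follows from σ = √(kT/m), i.e. T = mσ²/k. The constants are standard; the dispersions below are round illustrative numbers :

```python
# If the clouds' internal motions were purely thermal, what temperature
# would be needed? T = m_H * sigma^2 / k_B for atomic hydrogen.
K_B = 1.380649e-23   # Boltzmann constant, J / K
M_H = 1.6735e-27     # mass of a hydrogen atom, kg

def thermal_temperature(sigma_kms):
    """Temperature (K) implied by a 1D thermal velocity dispersion in km/s."""
    sigma_ms = sigma_kms * 1e3
    return M_H * sigma_ms ** 2 / K_B

for sigma in (10.0, 50.0, 180.0):
    print(f"sigma = {sigma:5.1f} km/s -> T ~ {thermal_temperature(sigma):,.0f} K")
```

A 10 km/s dispersion corresponds to a comfortable ~10,000 K, but anything much above a few tens of km/s needs well over 100,000 K, far too hot for the hydrogen to stay neutral. Hence the pressure has to be turbulent, not thermal.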

There are quite a lot of different possibilities for the clouds' high velocity dispersions, then. Let me try and summarise this graphically. Click here for the image in its original format if it doesn't display correctly.

This turbulent sphere

It must be said that we weren't terribly optimistic about the turbulent sphere model. A cloud which is simultaneously trying to tear itself apart in different directions while being crushed from outside does not suggest it has a happy, balanced existence. Thermal pressure would be okay, since that could act uniformly in all directions, neatly balancing the external gas pressure since that too acts uniformly in all directions. But this dynamic, turbulent pressure simply can't do that.

So we were pretty confident that such clouds would rapidly become a big ugly mess of one sort or another. The question was how rapidly this would happen, and for how long the clouds would be compatible with the observations. The velocity dispersion and size of the clouds suggested about 100 million years, but given the complexities of two different gases with different processes acting on each, this could be an oversimplification. And it's much harder to predict how long the all-important velocity dispersion would last as the clouds simultaneously implode and explode.

Which is why we set up a series of numerical simulations. Each model contains a central sphere of neutral hydrogen with some turbulent velocity field, representing the Virgo clouds, surrounded by hot, thin gas representing the intracluster X-ray gas. We kept the intracluster gas the same in each model, since it doesn't vary all that much throughout the cluster. For the clouds, we tried varying their mass, size, and parameters of the turbulence (e.g. on what scales the velocity varies inside the cloud, the contrast between maximum and minimum velocity, etc.). And we made synthetic observations, so we could directly compare what our simulation did with what we would observe if it was a real gas cloud in the sky.

Naturally we began with a model that closely reproduced the Burkhart & Loeb calculation. What happened ? Well, it went splat. The whole thing garbled itself into a writhing mass of filaments that tore itself apart.

While the cloud did "survive", in the sense that there was still some gas left in the middle, it was no longer anything like the cloud it once was. It's a bit like if Rambo decided to cut off his penis and become a florist : he wouldn't die, exactly, but he'd hardly still be Rambo any more either.

None of our other models did any better. No matter how we varied the parameters, or what combination of values we used, we couldn't do much better than that first attempt. We think it's a fundamental property of the mechanism : bulk random motions from turbulence aren't the same as the much smaller-scale motions from temperature.

We did a bunch of simulations, and found three main behaviours. Top : the cloud just disperses, as in the gif. Middle : the cloud heats itself up through its own internal collisions, becoming undetectable. Bottom : the cloud's collisions cause it to collapse before it eventually disperses.

But it's good practice to ask not just if this model works as stated, but if there's any way it could be made to work. Simulations and observations are always limited, and, as someone once quipped, it's a terrible mistake to think that all the facts you have are all the facts there are - it's the basic mechanism we want to test, not one very specific case. Here, for example, we set up the turbulent velocity field by magic, without trying to model how it might have formed (we have really no idea how you'd get such a strong turbulence field in such a small object). We didn't model complex processes like thermal conduction or heating via the X-ray radiation from the hot cluster gas, or magnetic fields.

The problem is that even the combination of all these missing factors doesn't seem to give the model much wiggle room. The only way a cloud can remain stable is if the forces causing expansion and compression are balanced. So far as we know, there is absolutely no reason whatsoever to suppose that all of that missing physics could help solve that basic difficulty. What you'd need is some mechanism that continuously drives the turbulence but continuously, reactively adapts to how the hot gas behaves in response, constantly moving the gas around but always keeping it in the same general region. And we haven't got a sodding clue what could do that. At this stage, our model is more than sufficient to say that the basic idea just doesn't work in this case.

Which is not quite to say that it couldn't work at all. Remember that the high line width of the clouds is, like Rambo's penis, what we're interested in here. But we're interested in this precisely because it's unusual. Most other dark hydrogen clouds have much lower widths, and for those turbulence could indeed play an important role. What we tended to find in our models - purely by accident and not design - was that our clouds generally evolved to having those lower line widths found in more "normal" clouds.

While the paper was being refereed, a certain Bellazzini et al. did their own set of simulations of one of these more typical clouds. They found it could survive for a billion years, even while moving through the cluster gas. It ended up in an unpleasantly bedraggled state, mind you, but it survived.

A Rorschach test ? No, it's a figure from Bellazzini et al. The top row shows the temperature of the cloud at three different times while the bottom row shows its density.
These simulations didn't actually use turbulence at all, just thermal pressure. Their cloud had only a very low line width, so it wouldn't need a huge amount of turbulence anyway. Unlike our high line width clouds, it's easy to imagine that stripping gas out of a galaxy might give it a little bit of turbulence as it starts its lonely journey through the cluster. Turbulence could certainly play an important and interesting role in the evolution of clouds like this. Just not in the case of clouds that we were interested in.

What's next ?

So, to wrap up, there's this problem of observations not finding enough galaxies, and an idea that maybe they exist but are made only of dark matter and gas but no stars. Objects which seem to fit the description have actually been found. But these clouds might not just be dark galaxies. The three current leading ideas are in the following state :
  • The clouds might be dark galaxies. In this case their high line widths would be due to rotation. Simulations show they could be stable during encounters with other galaxies, giving them a good chance of long term survival. This also fits with recent discoveries of large numbers of large but incredibly faint (dim rather than dark) galaxies.
  • They could be tidal debris, gas ripped out of galaxies during interactions. Their line width would be due to streaming motions along the line of sight. Simulations show that this is possible but highly improbable - it's really difficult to produce a cloud which is small, isolated, and with that all-important very high line width in this way. Every cloud detected would be a case of us happening to observe a galaxy interaction during a very short phase of its evolution.
  • They could be turbulent spheres. Random bulk motions within the clouds could be kept in check by the compression from the hot external gas pressure. This doesn't work - such clouds disintegrate very rapidly, though in a variety of interesting ways.
None of this is to imply that tidal debris or pressure confinement aren't important. In fact, I'd say most clouds probably are tidal debris, because they have low line widths which the simulations say are easy to produce. And pressure confinement might well be important in those cases. Turbulence ? We're not so sure about that one. We don't know of a good way to keep injecting energy into a cloud after it's been separated from its parent galaxy.

The point is that while these processes have important roles to play for some clouds, it doesn't automatically imply that they matter very much for the particular clouds that we're interested in. So far as we can tell, turbulence and external pressure don't help in the slightest for these objects. In fact, because the clouds disperse in roughly the time predicted just based on their size and motion, it could be that it's making very little difference to most of them. Of course, some other explanation might need to use turbulent motions as well, but what we've shown so far is that by itself, turbulence simply doesn't work. And I doubt that it can be made to work without employing some fundamentally different physics.

I don't know what the clouds really are. The evidence thus far points quite clearly towards dark galaxies, but I have a sneaking suspicion that they might turn out to be some form of tidal debris after all. Oh, it won't be the sort we've tested. It'll be something more complicated, caused by a combination of tidal forces and the intracluster gas. Maybe. Of course, it's also possible that they'll turn out to be something else we haven't even thought of yet.

Finally, we should return to poor old Thomas Becket. For if his noble liege had been not a medieval prince but an astronomer, he would surely have cried out, "Will no-one rid me of this turbulent sphere ?". To which Becket, his postdoc, would have responded, "There is no need, sire, it hath destroyed itself." And they all lived happily ever after.

Conclusion : Thomas Becket should have gone in for astronomy instead. It's generally a lot less bloody.

Thursday, 26 July 2018

I Read Edward Gibbon So You Don't Have To

Literally. There is no point anyone reading this book unless they're an actual historian.

Pretty much any history of Rome will mention, at some point, Gibbon's mighty tome, often in exalted terms as though its sheer magisterialness can be absorbed through osmosis. Well, as y'all know, I prefer to read the source material myself. They usually turn out to be a darn good read and (sometimes) the modern interpretations aren't quite up to scratch.

See, the other thing that pretty much any history of Rome will go on about is how gosh darn amazing Gibbon is. Peerless rhetoric, they said. A masterful command of language and persuasion, they said. "I devoured Gibbon", said Churchill. "An undisputed masterpiece... a work that will only perish with the death of the language itself", it says on the back of my paperback abridged edition. You can see why I was particularly eager to read this one.

Well, I'm here to tell you that they're all wrong. It's a crappy book full of dreary, inscrutably dense and dry prose and whoever did the editing of this particular version ought to be thrown to the lions or given some other suitably Roman punishment.

I suppose competing in the modern incarnation of Gladiators would be appropriate.
This particular version is a hefty 40% or so of the full work; a mere 1,056 pages making it a positively light read compared to Plato's 1,800, my last big read. But Plato was lively, engaging, occasionally funny (and not dry humour either), deeply analytical and a work of genius. Gibbon, on the other hand... well, imagine the most stereotypically boring librarian you can conjure up and pretend you gave them unlimited funding for twenty years to write an encyclopedia about cabbage production. That's roughly what you get from Edward Gibbon.

Now, I'm not sure how much I should blame Gibbon and how much I should blame the editors of this cut-down version, but I'm pretty confident that both of them are morons so I'll blame 'em both. The editors because their chapter selection sucks, and Gibbon because during his twenty year project he apparently never learned how to explain things clearly, use commas correctly, stay on topic, assess relevancy, or have a single damn analytical thought in his life.

Gibbon's greatest strength is also his greatest weakness. His History is one of pure observation. As a compendium of citations with raw descriptions of what happened, it is indeed peerless and will never be surpassed. The amount of reading and note-taking the poor sap must have had to do in an age before the invention of ctrl+f should give everyone horrifying nightmares. The problem is that that is literally the only good thing about it.

Take his legendary rhetoric. At best, this is vastly over-rated. It's not that he didn't know how to turn a phrase so much as it was that he wouldn't stop doing it. Ever. Who needs clarity when you can have pseudo-poetry ? Except that it's rather worse than that... it's more pseudo-poetic ramblings. The occasional flash of rhetorical brilliance can't compensate for the impenetrable fug of verbal diarrhoea that fills most of the book. The real annoyance is that there's just enough good stuff in there - whole chapters, even - that I can't blame this on the writing style of the age. Some parts of the text are crystal clear, flowing narrative. The rest is what I'm calling Gibbonish... not exactly gibberish, but sort of half-narrative, half general commentary that ends up as the worst of both worlds. So focused is Gibbon, nay obsessed, with constructing rhetoric that it feels almost like deliberate obfuscation. It's not text you read so much as parse. And that quickly becomes mentally exhausting and whatever godforsaken point Gibbon was trying to make is utterly lost.

In particular, while the description of what happened is usually okay, deciphering who did what to who is often a fiendish challenge. For example Gibbon is overly-fond of referring to "that person", rather than saying, "he" or "she" or whatever. It feels lazy and weird, exactly like this cat :

"Yeah, it was that guy. You know, the emperor. The one with the shoes."
And it's much worse if multiple groups are interacting. Trying to understand which group did what is less of a challenging read and more of a decryption exercise from Bletchley Park's glory days. Often I found myself completely befuddled as to who attacked who, frequently baffled as to which side even inflicted an apparently decisive victory, and sometimes unsure whether it was a victory at all or if everyone just went home for tea instead. Gibbon might very well have penned the least pedagogical history of all time.

Then there's his analysis. Other works I've read - most notably Peter Heather's Fall of the Roman Empire - have been quite insistent that Gibbon largely attributes the causes of the collapse as primarily internal discord and Christianity. But Gibbon actually has very little explicit to say about this, and nothing at all about the deepest underlying causes. As an analyst he is hopeless, as a philosopher he is a non-entity.

And what little Gibbon does have to say just isn't terribly convincing. He paints a reasonable picture of an arrogant Rome in a state of decay following the Antonine Emperors. The problem is that he then seems to describe a reasonably successful recovery during the following century, making that particular little episode not really very important in the grand scheme of things. And a grand narrative is something Gibbon singularly fails to deliver. It's just a case of "this happened, and then this happened, and then this, and then there were some angry barbarians for some reason."

Now I can't avoid mentioning the problems of this cut-down version, just in case I'm doing poor Gibbon a massive disservice. I might be. The way this edition has been abridged is by presenting a selection of chapters in their entirety, with brief notes describing the omitted chapters. These chapter selections are simply awful.

Several included chapters are almost entirely irrelevant, have no meaningful context, and often they don't relate to the Empire at all, much less its downfall. Chapter 21, for example, is exclusively devoted to some stupid old priest who (so far as I could tell) did absolutely nothing and died. Later chapters are about the rise of the Prophet Mohammed and the crusades, which would be perfectly fine in the context of the end of the Byzantine Empire except that's not how they're told. They're just about those particular episodes for their own sake, and I'm pretty sure some chapters don't even mention the Romans at all. And dammit, I don't care about some stupid priest who did absolutely nothing and died, or even the Crusades : I want to hear about the goddamn Romans. You're a moron, Gibbon, and I don't like you.

Gibbon covers the fall of both the Western and Eastern empires. One might imagine that an edited version would focus on either one of these, and that's mostly what's done with this edition, but reeeeally badly. Crucial chapters detailing how barbarian tribes became incorporated into the Western empire are cut, so we go from the empire having some minor difficulties to suddenly OMG there are barbarians everywhere and where the hell did they come from ? 

Admittedly one does get a nice sense of the empire transmuting into something different in its last couple of decades, dissolving rather than falling... it's clear that by the nominal end the Empire was already quite, quite dead, the deposition of "Emperor" Romulus Augustulus a mere formality. Very good. But how did it come to that ? All those critical decisions in which Rome failed to properly deal with the massing barbarian tribes... those chapters are completely missing. And that sucks. The most important part of the story is gone. Why ? Buggered if I know.

Later on we miss out all the best bits of the Byzantine Empire as well, and that sucks too because it's a damned epic story : the Eastern Empire's spectacular (but tragically brief) recovery after its near-fatal wars against Sassanid Persia has been called one of the greatest military comebacks of all time. But we don't get any of it in this edition, we just skip straight to Mohammed, the Crusades, and the final siege of Constantinople. It's bloody stupid, is what it is.

I somehow doubt that the missing chapters would help that much though. Gibbon is at least half decent at writing narrative, so as far as individual stories go, he's fine as long as he doesn't slip into Gibbonish. Not brilliant, but okay (if you want a modern writer who could kick Gibbon's ass, I recommend the aforementioned Peter Heather, Tom Holland and Roger Crowley; Lars Brownworth tells a really gripping tale but I have my doubts about his accuracy). The chapter on the rise of Islam, for instance, is very long but a thoroughly good read, and Gibbon - never one to be shy of judgement - does at least venture some interesting commentary on the character of Mohammed.

But what about analysis ?

Nope nope nope nope nope. Gibbon doesn't do that; the closest he gets is judging people's personalities. Sorry Gibbon, but there's more to an Empire than the quality of its Emperor. How about some thoughts on what forces were at work to influence the choices of the soldiers when they brought good and bad emperors to power ? Under what conditions were emperors able to control their armies rather than the other way around ? What about the state of the legions and their organisation over time ? The economic forces ? The social changes ? Why did Rome, otherwise masterful at exporting its culture to distant lands and assimilating others into its vast edifice, later do such a crappy job of incorporating barbarian tribes when its Empire was still powerful ? No, no discussion of that. The person who read more of the original source materials than any other also seems to have thought about them the least. Even the final chapter, where Gibbon ventures to describe the conditions that led to Rome's ruin, is disappointingly and almost comically literal. He describes why the city itself fell into ruin, of which the causes are pathetically trivial : buildings fall down, it's a thing that they do.

Thanks, Gibbon. Thanks so much.

Of course you could read between the lines and infer your own hypotheses, which is what later historians have done. But that really is a job for historians, as without a starting point this ain't so easy. Comparisons and analysis of a stated position are relatively easy : if Gibbon were to say "Christians did it", you could point to examples of where his evidence supports the claim and examples where it doesn't. Then you can sum up and decide if this idea makes sense overall. Spotting the possible underlying trends from the raw facts - especially with Gibbon's focus on individual character rather than systemic trends - is much more difficult, and because Gibbon's text is largely drier than the surface of the Moon, I'm not going to try.

Even at a much more trivial level this edition is unnecessarily difficult to read : I'm talking about footnotes. I can understand why these are useful in a historical narrative where you want to state important caveats without interrupting the flow of the story. Fine. But the editing here is pretty horrendous; 95% of the footnotes are bibliographic references but some do contain interesting additions that are worth reading. It would not be a monumental task - seriously, the work of a week or two, and if anyone wants to pay me money then dammit I'll do it myself to prove it - to arrange these to appear at the bottom of the relevant page so that the reader doesn't have to treat the work as a gigantic flip book, leaving the bibliographic references at the back for the historian. Oh, and most of the non-English language footnotes have been translated, but some haven't and no explanation is provided. It's weird and silly and I don't like it.

How to summarise this nine month endeavour of mine ? Well, sorry Gibbon, but I'm rating your life's work 3/10. Low on clarity, long-winded, and while there are indeed flashes of genuine rhetorical brilliance, they're rarer than sightings of Bigfoot being abducted by a flying saucer. Obsessive descriptions of irrelevant minutiae are hardly a substitute for deep analysis, a fact obvious to just about everyone except Gibbon. Go away and read the dictionary instead. You'll learn more and make more friends.

Thursday, 12 July 2018

Ask An Astronomer Anything At All About Astronomy (XLVII)

Catching up on a remaining bunch of questions. As always, click the links for full answers, and if you want even more, have a browse through the enormous Q&A main page. Current grand total : 436 !

1) How hot does a pole get moving at mach 1 through the atmosphere ?
Ah, should I make an innuendo or a casually racist joke, or both ? SO HARD TO CHOOSE !

2) Are eyepieces overrated ?
Yeah they suck.

3) To stop the xenomorph infestation, should we nuke the ISS from the ground or from orbit ?
From the ground, because we don't have any nukes in orbit.

4) Are there really only four planets in the Solar System ?

5) You're lying about the size of Betelgeuse.
No I'm not.

6) What do you think of detecting belts of exoplanetary satellites to find aliens ?

7) Did you predict the battle of Callisto from the Expanse ?
Well I predicted a battle...

8) Could you make a colonial spaceship smaller using frozen genetic material instead of a living crew ?
Yes you could.

9)  Do you have to time everything just right to dock with a rotating space station ?
Yes, but it's not that bad.

10) Is the radial acceleration a good method to choose between the standard model and modified gravity theories ?
No it sucks.

11) What do you think of this paper on missing matter ?
It's OK. 7/10.

12) How much stuff has been found by microlensing ?
Not much.

Monday, 9 July 2018

Ask An Astronomer Anything At All About Astronomy (XLVI)

Yeah, I know, these aren't updated very much. I think I have to change the schedule as weekends are seldom available to update things these days. Still, it's not like there's been an absence of questions ! Here's a round-up of the last few months (I still have more stockpiled, but I reckon twenty is enough to be getting on with). As always, click on the links for the longer, generally less sarcastic answers, and have a browse of the enormous Q&A page. Current total stands at 425 questions !

1) Are you a rah-rah scientist ?

2) Is that my timeline of the Universe Stephen Hawking is using ?
Yes it is !

3) Your Discovery variant of Orion is all wrong.
Blame Kubrick.

4) What's neat about this galaxy without any dark matter ?
It looked like it was really cool... but it wasn't.

5) The galaxy can't have more than one black hole !
Err... yes, it can.

6) Can you make a side-by-side 3D version of this galaxy fly-through ?

7) Is this the Andromeda galaxy ?

8) What do you call it when you reach the point of closest/furthest approach to a black hole ?

9) Is every round object a planet ?

10) Is mainstream science being a douchebag by pretending the Big Bang is real ?
Umm, no.

11) Don't quasars disprove the ages of distant galaxies ?
What ? No.

12) I've disproved the expansion of the Universe !
No, you haven't.

13) Is the Cosmic Microwave Background homogeneous ?
Yes, very.

14) How come distance doesn't matter for surface brightness ?
Geometry works like that.

15) Wait, are your sure distance doesn't affect surface brightness ?
Yes. There are lots of subtleties but they can be accounted for.

16) Could time dilation affect our estimates of star formation rate in distant galaxies ?

17) Isn't space a vacuum, so how come turbulent clouds don't just fly apart ?
It's not a perfect vacuum, just a very good one.

18) What shape are black holes ?
Very, very pointy.

19) Is this asteroid impact video accurate ?
It's not bad.

20) What's the data set shown in this pretty movie ?
Your mum.

Tuesday, 5 June 2018

H One

Richard Feynman was both a great scientist and a wonderful philosopher of science (though he was also, and it's worth bringing this up a lot more often, a dick). The imagination of the artist is of course a very interesting thing indeed, and scientific and artistic creativity aren't always so unlike each other. For actual science, imagination is by necessity tempered by observation. But for data visualisation, sometimes it's better to let imagination off its leash and run wild and free without giving a flying poop about whether it's "useful" or "relevant" or not. I'm a firm believer in the principle of doing things because they are cool. All that spin-off gubbins can come later, or not at all.

Do you think this crocodile cares that data isn't uniformly sampled ?
No, neither do I.

A long time ago in an institute far far away, I made an abstract visualisation about neutral hydrogen data. Five years later, software, hardware, and techniques have improved quite a bit, so I decided it was time for another one. In general I think art should stand up on its own and let the viewer decide how they want to interpret it, but if and when an explanation is required, it'd better be a bit more than, "old clay pipe stuck in festering monkey's uterus" or whatever crap the modern art world is currently plagued with. So if you want, you can view the final video product below and walk away. Or, if you prefer, you can keep scrolling and look at some additional pretty pictures. If you're really enthusiastic, you might even read some of the accompanying text, think about the philosophy behind this, and then watch it again. I'll be asking a lot of questions here and answering hardly any of them. Be warned.

The main purpose of this work was to show how different data visualisation methods let us perceive the same data sets in radically different ways. In principle, scientific conclusions should be limited by the data itself. In practice, the process of interpreting the data is creative and subjective. This can be true even of numerical analysis of very simple data sets - raw numbers, by themselves, tend not to be terribly inspirational, restricting the contextual environment of the data to existing ideas. And more trivially, as we all know, statistical analyses of larger data sets can all too easily lead to conclusions which are objectively constructed but simply wrong. All interpretations are ultimately subjective because interpretation is never done by the data itself, but always by human judgement.

How you perceive the data is therefore freakin' important. It strongly affects which conclusions you can find and even consider, shaping your worldview. Data visualisation has many exploratory and communicative purposes : to help discover what the data set contains, to compare different observations and models, to persuade others of whatever conclusion seems most probable, and, not least, to inspire new ideas. Creativity is a complex process, but it's fair to say that most people are far more inspired by visual imagery than by mathematical text : the former is, after all, something we're far better evolved to process, since cave art tends to have more pictures of mammoths and stuff than differential equations.

And we're now in an era where data sets of millions of data points are commonplace. It's true that we're going to have to adapt our analysis methods to deal with the new problems associated with big data, but it doesn't mean we can stop looking at the data : not now, not ever. Rather it means that we have to come up with new visualisation methods, which will affect both our hard scientific conclusions and purely philosophical interpretations.

Here then is my offering to attempt philosophy through art driven by science. I'll briefly describe each sequence of the movie, both its aim and the method of its construction. I finally managed to get myself to learn Python for Blender > 2.49 for this (all of which is rendered in Blender, mostly in 2.78), so I've included links to scripts where useful. All of the data sets used here are, as is my speciality, 3D neutral atomic hydrogen data cubes. Again, links to public data sets are provided where available.

As this might interest different audiences, each section has a separate description of how the image was rendered which is written for Blender specialists and can be skipped by everyone else*. The main point is that these data cubes are three-dimensional maps of the hydrogen gas in different parts of the universe. Imagine them as a sort of slightly unconventional atlas. Every page shows a map of the density of gas at exactly the same locations in space, but at very slightly different frequencies. This corresponds to velocity, which, depending on the data set, can tell us how fast the gas is rotating and/or its distance away from us.
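To make the atlas analogy a bit more concrete, here's a toy sketch of how such a cube is structured - every array name, size, and value here is invented purely for illustration, not taken from any real survey :

```python
import numpy as np

# A toy stand-in for a real HI data cube, indexed as (channel, y, x).
# Real cubes have hundreds of channels and much larger maps.
cube = np.zeros((4, 8, 8))

# A fake "galaxy" at the same sky position on every page of the atlas,
# brightest in the middle channels (a crude spectral line profile).
profile = [0.2, 1.0, 0.8, 0.1]
for chan, flux in enumerate(profile):
    cube[chan, 3:5, 3:5] = flux

# Each "page" is one channel map : the gas density at one frequency...
page = cube[1]
# ...and reading down through the pages at one position gives the spectrum.
spectrum = cube[:, 4, 4]
```

Flipping through the pages (channels) at a fixed sky position recovers the line profile, which is where the velocity information comes from.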

* Originally I had this fancy thing of using buttons to hide the text, which worked brilliantly for about twenty minutes and then stopped for no reason whatsoever. So I'm afraid you'll have to skip this the old-fashioned way. "Computers are logical", they said. "They give repeatable, objective results", they said...

Too Much Information

We begin with a Matrix-style shot, panning back from a single number to a whole array of text. This is of course a cliché, but a very useful one. The data that we have to process is, ultimately, just numbers, so in a sense this is the simplest and most natural form of data visualisation. Yet we very rarely use this for large data sets, because, unlike characters in the Matrix, we can't train ourselves to automatically see "blonde, brunette, spiral, elliptical galaxy". We are compelled to process the data in different ways in order to make any sense of it.

Note that I said the data we have to process. Whether the data itself is really all numbers or not is another matter entirely. What we're ultimately processing is radio waves or photons (or particles, but let's not even go into how we're fundamentally unsure about the nature of the stuff we're measuring) from the sky, which induce minute electrical currents that experience an enormous amount of complex processing before they're reduced to the final numbers we get to muck around with.

So are numbers, in some sense, a very literal, accurate representation of reality, or are they as flawed as any abstract representation, like language ? Are they continuous or discrete ? One view is that because you can't contain infinity in a finite volume, physical properties - length, height, weight, acceleration, charge, etc. - can't have infinite precision. The problem with this is how you'd ever prove that volume is truly finite, that you can't just keep dividing space up into ever smaller parts. And if reality is continuous, that implies all the problems typically associated with infinities. So you can certainly represent data with numbers, but whether that means the world really is made of numbers or they are merely conceptual mental constructs is far less clear.

How it's made

This shot (the above image is actually from a later sequence) attempts to make sense of the data in a few different ways. First, I used this script to slice the data set into a series of text files. Each one contains a 2D slice of the 3D data, one for each frequency (velocity) channel. I've cheated here - despite appearances only one slice is ever visible, with each frame of the animation increasing the visible channel. The script to create the text grid is available here and the animation script is here. Rendering a true volume of text is just too computationally demanding, so I used two planes, one above and one below the text objects, with a mirror material to create the illusion of extra depth. This uses a very simple render layer setup to hide one of the planes, which you can find an example of here.
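The slicing step itself is simple enough to sketch in a few lines - this is my own minimal mock-up of the idea, not the actual script (which works on FITS data rather than the invented array below) :

```python
import os
import tempfile
import numpy as np

# Hypothetical stand-in for the real cube, indexed (channel, y, x).
cube = np.arange(3 * 4 * 5, dtype=float).reshape(3, 4, 5)

# Write one text file per frequency (velocity) channel,
# each containing a 2D slice of the 3D data.
outdir = tempfile.mkdtemp()
for chan in range(cube.shape[0]):
    fname = os.path.join(outdir, "channel_%03d.txt" % chan)
    np.savetxt(fname, cube[chan], fmt="%.3f")

files = sorted(os.listdir(outdir))
```

Each resulting file is then just a grid of numbers ready to be turned into text objects, one file per animation frame.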

The first sequence in the movie uses data of the Triangulum galaxy M33 available here. As the animation proceeds, the height (as well as the value) of the text is adjusted to correspond to its intensity, so you get a hint of a Matrix-style surface effect. I wanted the visuals to be driven as much as possible by the data, but I allowed myself a fair amount of "artificial" window dressing if I wanted to make a particular point. In this case the text has some random value variation to give it a more "Matrixy" appearance, since the numbers in the font I used weren't particularly unusual to look at.

The M33 text only samples a small fraction of the full data set, mainly because my first script was flawed : it linked each text object to the scene as it was created. Later I realised it's very much faster to create the objects and link them to the scene all at once, which made the image shown above possible. That one uses private data of the Pegasus cluster. Each of the whiteish columnar features is a galaxy. Whereas M33 is nice and close and well resolved by the Arecibo telescope, these guys are much further away. We detect them over many different channels because they're rotating, but they're essentially just blobs in the spatial plots - hence the long, cigar-like features.

Even with the improved script I couldn't render all the data at full resolution - it's just too large to display everything as text. The galaxies, however, are unresolved, so they did require the highest resolution or they wouldn't be shown at all. So I clipped the data : the brighter flux, corresponding to galaxies, is sampled at the maximum resolution possible, whereas the fainter noise is much more restricted. Creating the appearance of the galaxies as textual surfaces (rather than volumes filled with text) was done by restricting the flux shown to a narrow range, so the interiors are avoided. The appearance of the galaxies was animated by this very simple script which just controls their visibility.
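The clipping idea can be sketched with masks - again this is an invented, deliberately deterministic example of the general approach, not the real script or thresholds :

```python
import numpy as np

# Deterministic stand-in data : faint "noise" in [0, 1] plus one bright "galaxy".
cube = np.tile(np.linspace(0.0, 1.0, 32), (4, 32, 1))  # (channel, y, x)
cube[:, 10:14, 10:14] += 10.0

# Bright flux (the galaxies) keeps the maximum resolution...
bright = cube > 5.0
# ...while the faint noise is subsampled, here to every 4th pixel.
keep_faint = np.zeros_like(bright)
keep_faint[:, ::4, ::4] = cube[:, ::4, ::4] <= 5.0
shown = bright | keep_faint

# A textual "surface" rather than a filled volume : restrict the flux to a
# narrow range just above the clip level, so the bright interiors are skipped.
surface = (cube > 10.0) & (cube < 10.4)
```

The `surface` mask only selects the outer skin of the bright region, which is what gives the galaxies their hollow, shell-of-text appearance.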

The Dark Tower

We next move from representing the data as numbers to a landscape, via a brief transition sequence I'll describe later. A happy rendering accident produced the resulting murky scene, giving a slow, reluctant change from text to surface. Of course the data here is still numerical, but this has a completely different feel to it. I liked the idea of thinking of these highly abstract slices of data as real places. Of course they are places, but a diffuse, gaseous galaxy is conceptually different from a rocky landscape. And you don't even need the concept of a number to understand a landscape. Do our minds even quantify things at all, or are they making relative comparisons in some other way ?

I played around with this quite a bit. In the video we get this gloomy blue scene, blurring the line between discrete surface and volumetric fog - a running theme of the project. I'm fascinated by how continuous data can emerge from seemingly incompatible discrete points. Eventually we see the data (the same as in the first sequence) rendered in the style of a barren rocky desert, albeit with some reflections to keep it suitably surreal. At the end the sky changes from a clear blue sky to one filled with clouds generated from hydrogen data from our own galaxy : a hydrogen sky illuminating a hydrogen landscape, data of the same type shown with starkly contrasting methods. Altering the colour scheme is also fun.

As Above, So Below

Minas Morgul

Interestingly, despite having looked at this data (as above, of the Triangulum galaxy M33) a lot, I'd never noticed the twin peaks here before. So in some cases you really do get new information by changing your visualisation method. Which makes one wonder if and why it's more legitimate to view your data as a landscape, map, or volume. What exactly do we mean when we ask what the data really looks like ? Why do our minds perceive things as colours and not as heightfields, or smells, or boredom, or homoerotic ennui ?

For example, imagine that you had extremely densely-packed nerves in your fingertips. You could in principle receive tactile information as though you were seeing it, though no photons would be involved. Similarly, photons received by your eye could trigger the same sensation as when you touch something, though this wouldn't be accessing the same information as from a direct physical interaction. Or your brain could directly generate maps of emotion rather than brightness and colour. It's hard to imagine finding your way around based on how hilarious/erotic you find your surroundings, but why not ? It would be the same information as you have now, just processed differently. Indeed many animals lack a sense of sight entirely and get by just fine on smell and touch; others, such as sharks, have electrical senses quite beyond our ability to imagine. Does their electrical sense feel similar to touch, or is it as different from touch as touch is to sight ?

Synaesthesia, where one sense triggers another, is sort-of what I'm getting at. Blindsight, where the brain processes signals from the eyes only at an unconscious level, is perhaps closer but still an imperfect analogy. My questions are more about why we should consider our senses legitimate at all. To what degree are we experiencing the real world ? How does our perception shape our view of it ? Why do we only perceive things in such limited ways ?

All this comes about from spending too long looking at those channel maps. False colour images are one thing; maps of the motion of an object are quite another - and they all give real information. For comparison, here's how I would normally render the above data set :

So which one is real ? A mind-wrenchingly difficult question. None of them show the data as it would appear to our eye, but on what grounds do we grant visual sense a special privilege ? None, really. Maybe one day we'll have the equivalent of a Copernican Principle but for senses, holding information from smell and taste to have the same level of validity as that from photons.

How it's made

Rendering data as a landscape is fairly easy - I just used each channel map to displace a grid mesh vertically. Stupidly not realising that Blender can now easily do this for animated textures, I wrote a couple of Python scripts : one to create the initial mesh which you can find here, and a second to animate the meshes which you can find here. Pointless, but it got me learning to code in modern Blender at long last.
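The geometry side of this is easy to sketch outside Blender : each pixel of the channel map becomes a vertex, displaced vertically by its intensity. The names and numbers below are mine, purely for illustration; in Blender these coordinates would be assigned to a grid mesh's vertices :

```python
import numpy as np

# One hypothetical channel map : a tiny 2D slice of intensities.
channel_map = np.array([[0.0, 0.2, 0.1],
                        [0.3, 1.0, 0.4],
                        [0.1, 0.5, 0.2]])

ny, nx = channel_map.shape
xs, ys = np.meshgrid(np.arange(nx), np.arange(ny))
height_scale = 2.0  # arbitrary vertical exaggeration

# Each vertex sits at its pixel position, displaced vertically by intensity.
verts = np.stack([xs, ys, height_scale * channel_map], axis=-1).reshape(-1, 3)
```

Animating the landscape then just means swapping in a new channel map (and hence new vertex heights) every frame.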

The hydrogen sky data is taken from the Leiden Argentine Bonn survey and is available here. The data files are equirectangular so are easily mapped to a sphere in Blender without even needing UV mapping. To create the colours, I used a classic technique where different channels contribute to the RGB components separately.
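That channels-to-RGB technique can be sketched as follows - an invented toy cube, with each third of the channels driving one colour component, so colour ends up encoding velocity as well as brightness :

```python
import numpy as np

cube = np.zeros((9, 4, 4))   # (channel, y, x), hypothetical sizes
cube[1, 0, 0] = 1.0          # emission in a low-velocity channel
cube[4, 2, 2] = 1.0          # emission mid-cube
cube[7, 3, 3] = 1.0          # emission in a high-velocity channel

# Three different channel ranges drive R, G and B separately.
r = cube[0:3].sum(axis=0)
g = cube[3:6].sum(axis=0)
b = cube[6:9].sum(axis=0)
rgb = np.clip(np.stack([r, g, b], axis=-1), 0.0, 1.0)
```

Gas at different velocities then shows up in different hues, even when the total brightness is the same.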

I had a happy accident creating the murky appearance. I found that if you enable indirect lighting and ambient occlusion, and completely surround your objects with some closed mesh (which must be traceable so it can cast shadows), you get this wonderfully gloomy appearance. I never bothered to figure out how to change it, but fortunately it looked nice to me anyway.

Infinity Sphere

This works a little better as an animation but the stills are nice enough, I guess. As the hydrogen sky appears to set, we change to the reverse angle - a view from beyond the sky, with the hydrogen data mapped onto a sphere. Now here I'll admit I've used something that looks pretty rather than being driven directly by the data. The sphere is between two reflecting mirror spheres, so you get a series of "infinite" (well, okay, about ten) reflections fading into the finite volume of the sphere. This merges the smooth, solid appearance of the sphere with the diffuse nature of the hydrogen, combined with strange distortions from the spherical reflections and a deliberately ambiguous sense of motion.

It's true that slapping on a pair of shiny spheres might not necessarily be the most informative way of viewing data. But to hell with that ! You're looking at a series of literally timeless photons that have, from their perspective, instantaneously travelled fifty thousand light years to be intercepted by a series of reflective metal surfaces in order to cause a small, measurable electrical current, then be re-emitted as photons from a computer monitor and finally viewed through twin organic, refracting lenses which transform them back into electrical signals and then... lord knows. What exactly is strange about throwing a couple more reflecting spheres in there, hmm ?

More Than Darkness In The Depths

How it's made

Not much to say about this one - LAB data again, with offsets to generate the RGB channels, and a couple of reflecting spheres, with a similar render layer setup to prevent the outer one from being visible.


Next comes a very different style of sequence. Here the transition from an initially sharp, discrete, digital mesh to continuous volumetric cloud is very explicit, and hopefully very hard to pinpoint. You ought to be able to definitely say that the end points are discrete or fuzzy, but not determine where one ends and the other begins. Towards the end, you might spot hints of a crystalline structure that I'll return to later. I liked this sequence, but I wanted to exaggerate the effect still further.

If there's anyone out there still suffering under the naive impression (as I certainly used to) that once you've got a data set, you pretty quickly understand it, this may hopefully cure them of that.


The Fires of Mordor


The same data set with the same colour scheme, but the final image has the faintest material stripped away, revealing an inner core much cooler in appearance.

How it's made

This sequence relies on the volumetric rendering techniques of FRELLED, which renders volume data as a series of transparent planes. With enough planes the appearance of a continuous volume can be faked very convincingly at low computational cost. Modifying this to initially appear solid was done by the simple approach of removing most of the planes, making the visible ones opaque, and using the build modifier to show very discrete, square sections of the data that get gradually filled in. The data set used for this one is from the VLA GPS survey of our own Galaxy.
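The plane-stacking trick boils down to back-to-front alpha compositing. This is my own minimal mock-up of the idea, not FRELLED's actual code - with many nearly-transparent planes the sum converges to a smooth, continuous-looking volume :

```python
import numpy as np

# A tiny "volume" : 3 planes, each a 2x2 map of intensity in [0, 1].
planes = np.ones((3, 2, 2))
plane_alpha = 0.1   # each individual plane is nearly transparent

# Composite the planes back to front; stacking many of them
# fakes a continuous volume at low computational cost.
image = np.zeros((2, 2))
for plane in planes:
    a = plane_alpha * plane
    image = plane * a + image * (1.0 - a)
```

With uniform unit intensity the result is simply 1 - (1 - alpha)^N, so adding planes smoothly deepens the apparent opacity.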

The only quirk I encountered was towards the end of the sequence, where I wanted the data to fade out. Since this is a realtime render I wanted to animate the clip alpha value, which to my surprise I found was impossible. Also, the tooltip in Blender that tells you how to access this via Python just doesn't work. Eventually some Google searching gave me the correct answer, so here's a script to show how this was done. The sudden drops in intensity are not deliberate but a result of the very low alpha value of each mesh, which means that Blender can't set enough precision on the Clip Alpha value to give a smooth transition. Not what I was aiming for, but I liked it.

Non Spectral Lines

After another Matrix-style shot, we now get something very different. Instead of turning the data into a landscape as in the first sequence, now we see a series of advancing lines, rendered with infinite reflections. This is easier to explain with something we briefly saw in the second sequence where the M33 data gradually turned into a landscape :

Phantom Spectra

Normally with these data cubes we plot spectra. These show how the intensity of a source varies with frequency, which gives us clues to its rotation and total mass. The lines above are different : they are non-spectral lines that show how the intensity varies with position on the sky. Such plots are sometimes made in astronomy, but my point here is again that rendering the same data in a slightly different way produces a conceptually different result. All that's been done is to restrict the data displayed, and we get a sort of electrical appearance, very different from the landscape we had before.

You might also have noticed another phase during M33's lines-to-landscape transition sequence :

Spectral Surface

Here we see the same landscape but rendered as a partially transparent surface, with transparency depending on viewing angle. This helps to make the transition from lines to surface as gradual as possible.

How it's made

It was easy to modify the landscape scripts to restrict the meshes to a series of lines, and the creation script can be found here. For the M33 sequence these are trivially animated by just controlling their visibility (via layers rather than display) based on the current frame - a single velocity channel is used, so the height of each line never varies. That script can be found here. For the Pegasus sequence (the one with reflections), the lines are animated both in space and velocity channel. Each line that appears advances through the spatial and velocity pixels of the cube simultaneously, reversing direction whenever it reaches an edge. The animation script to do this is available here. The semi-transparent landscape is made using a very old "X-ray" technique for which you can find an example file here.
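The back-and-forth motion of each line is just an index that bounces off the ends of the axis. A minimal sketch of that bouncing logic (my own formulation, not the actual animation script) :

```python
def bouncing_index(step, size):
    """Advance through pixels 0..size-1, reversing direction at each edge."""
    period = 2 * (size - 1)
    phase = step % period
    return phase if phase < size else period - phase

# A line advancing along a 4-pixel axis bounces back and forth :
path = [bouncing_index(i, 4) for i in range(8)]
```

Running one such counter for the spatial axis and another for the velocity axis gives the simultaneous space-and-channel drift seen in the Pegasus sequence.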

Crystalline Hydrogen

Next we return to the crystalline sequence shown earlier, but this time attempting to make the effect explicit in a single frame. I wanted something that appeared diffuse and continuous on one side, but discrete and crystalline on the other - structured and structureless in the same form. This I found could be done by removing certain parts of the data. Again, there's no clear point at which one can say that the object is solid or gaseous. The idea of a crystal of gas is very appealing to me. What two substances could be more different ? Yet here they are, reconciled harmoniously. Or at least as well as I was able to manage. One day it might be fun to improve the crystalline appearance and merge the shadeless material of the volumetrics with something with specular reflections and other crystalline effects.

I'll admit that thoughts about mind-body duality, the apparent contradiction where mental, non-physical concepts somehow control physical ones, were more than a little influential here. Since I've decided I have absolutely no clue how this works (and will greet anyone claiming they've got an easy answer with a shifty-eyed look), I'm not even going to attempt a speculation. What the hell is a thought, anyway ? How can I experience something via electrochemical reactions whereas a plant or a calculator apparently does not ? Is our conscious perception just an emergent phenomenon of a flabbergastingly complex web of reactions, or something more mysterious ?

And then I got to thinking about blindsight again, wondering if perhaps we all do this constantly - only being truly aware of our surroundings for brief moments yet still receiving external information unconsciously. Can we truly be called intelligent if all we do is, like a camera hooked up to a monitor or computer screen, process data ? Where does data processing end and true consciousness begin ? Is objective intelligence possible or does it innately require subjectivity and bias ?

Buggered if I know. But I like the idea of a crystalline gas. Maybe the subjective and objective aren't so diametrically opposed as we think. Or then again maybe they are. I dunno. What am I, a magician ?

I also experimented with this display technique using M33, so here's yet another render showing how different the data can appear.

How it's made

The crystalline hydrogen sequence uses Milky Way data from the GALFA-HI survey. This is rendered using FRELLED, but with some modifications. Each image plane is heavily subdivided - not to the extent of having as many vertices as data points, but pretty heavily (there are a few million vertices in the scene). The first part of the sequence is easy : all the planes are parented to an empty, which is scaled down heavily in the vertical direction so the stretched data is compressed to the size of the visible area.

Making the data look half-crystalline is more subtle. Each mesh has a randomised build modifier. I wrote a Python script to alter the length of each build, depending non-linearly on the channel number but always keeping the same starting frame and random seed. This gives the appearance of long "crystals". Build modifiers were then applied at a well-chosen frame so that they were no longer animated, and the clip alpha script was used to fade everything out. You can get the build modifier script here - it was quite carefully calibrated to work on this data set, mind you.
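The build-length logic can be sketched as follows. All the specific numbers here are hypothetical - the real script was calibrated by hand for this one data set - but the shape of the idea is a length that grows non-linearly with channel number while the start frame and seed stay fixed :

```python
# Hypothetical calibration numbers, for illustration only.
n_channels = 64
base_length = 200    # frames
start_frame = 1
seed = 42

builds = []
for chan in range(n_channels):
    # Non-linear dependence on channel number gives the long "crystals";
    # keeping the same start frame and seed everywhere keeps the
    # randomised pattern coherent from channel to channel.
    length = 1 + int(base_length * (chan / (n_channels - 1)) ** 2)
    builds.append({"start": start_frame, "length": length, "seed": seed})
```

In Blender each dictionary would set one plane's build modifier before the modifiers are applied at the chosen frame.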

Ghosts of Virgo

In the next sequence we return to the idea of data emerging from a surface. Once again the surface represents a particular velocity channel of the data, but this time it's shown as a solid, reflecting pool rather than a wall of Matrix text. Galaxies emerge as partially transparent surfaces, this time showing their true spectral information. As with some of the other sequences, it can take a little while before the eye understands what it's looking at. What's particularly fun about this sequence is that these galaxy surfaces are genuinely useful ways of analysing the data - this is something I'm working on currently.

How it's made

The "surfaces" are actually a series of contour plots, one at each channel of each galaxy, extruded to look like a surface. They've got the X-ray material applied, whereas the flux surface is simply reflective. This data is from the Virgo cluster and is available here.

Broken Cloud

At the end of the Virgo sequence we again see LAB data fade into the sky. Suddenly we see the view from beyond the spherical sky, but this time rendered as a volumetric cloud. The simple, almost cartoonish sky is replaced with something far more dramatic and weird, yet it's the same data set. There's a mixture of the diffuse and discrete, coupled with strange sort of turbulent motion.

How it's made

This uses a modification of the FRELLED technique I describe here (and formally here), mapping the data to spheres rather than planes. Since the data is from the whole sky, this is better for removing its distortions. The weirdness of the motion arises partly thanks to a strongly varying camera field of view and exploiting Blender's problems of sorting multiple layers of transparency. That is, transparent meshes only display "correctly" (if there even is such a thing) from one direction. With spherical meshes, one can peer through meshes that would otherwise appear dull and boring by using the clip alpha value in combination with simple distance clipping.
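Mapping the all-sky data onto spheres is just the standard equirectangular-to-sphere projection. A minimal sketch (my own formulation of the textbook formula, not the FRELLED code) :

```python
import math

def equirect_to_sphere(col, row, ncols, nrows, radius=1.0):
    """Map an equirectangular pixel (col, row) onto a sphere of given radius."""
    lon = 2.0 * math.pi * col / ncols - math.pi       # longitude, -pi .. pi
    lat = math.pi * row / nrows - math.pi / 2.0       # latitude, -pi/2 .. pi/2
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return x, y, z

# The centre of the map lands on the sphere's equator :
centre = equirect_to_sphere(180, 90, 360, 180)
```

Because the LAB maps cover the whole sky in equirectangular form, wrapping them onto spheres this way removes the gross distortions a flat plane would show near the poles.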

Eye of Harmony

We now fly through the LAB data. Everything is once again smooth and continuous with no hint of flaws in the data.

Xeelee Tunnel

We stay with the LAB data for the endgame. Again we see a confusing, now especially asymmetric, view of data that is sometimes discrete and sometimes diffuse, often dark but occasionally flaring into waves of bold, primary colours, before finally fading back into nothing.

Reality Bomb


Shattered Mirror

The Dream Is Collapsing

How it's made

This sequence worked out far better than I dared hope. Blender can't automatically displace spherical textures - they have to be UV mapped for this to work. But all it is is the colourised LAB data displacing a sphere. Originally I just wanted to show the hydrogen as an object, something you could conceivably hold rather than a place you could visit. I found that the interior of the sphere looked so much more interesting than I'd anticipated that I abandoned my original idea and went for pure surrealism. This uses a wide angle camera moving on a complex orbit, together with distance clipping and a reflecting icosphere for the background. There are just two meshes in this scene, but it looks like so much more.


Data visualisation is the best thing ever. It just is. But if you think about it for too long, you end up spouting pretentious twaddle about "infinity" and "consciousness". You start muttering dark things about how life is a crystal and all knowledge is subjective. Pretty soon you go Full Philosopher, start asking random people on the street if they've ever wondered how language constructs reality, and eventually do all kinds of weird things a normal person should only do under the influence of psychoactive drugs, like pretending to be a small hummingbird named Hilda. At that point it's probably time to stop and go and plot some dang graphs.