Follow the reluctant adventures in the life of a Welsh astrophysicist sent around the world for some reason, wherein I photograph potatoes and destroy galaxies in the name of science. And don't forget about my website, www.rhysy.net



Tuesday 5 June 2018

H One


Richard Feynman was both a great scientist and a wonderful philosopher of science (though he was also, and it's worth bringing this up a lot more often, a dick). The imagination of the artist is of course a very interesting thing indeed, and scientific and artistic creativity aren't always so unlike each other. For actual science, imagination is by necessity tempered by observation. But for data visualisation, sometimes it's better to let imagination off its leash and run wild and free without giving a flying poop about whether it's "useful" or "relevant" or not. I'm a firm believer in the principle of doing things because they are cool. All that spin-off gubbins can come later, or not at all.

Do you think this crocodile cares that data isn't uniformly sampled ?
No, neither do I.

A long time ago in an institute far far away, I made an abstract visualisation about neutral hydrogen data. Five years later, software, hardware, and techniques have improved quite a bit, so I decided it was time for another one. In general I think art should stand up on its own and let the viewer decide how they want to interpret it, but if and when an explanation is required, it'd better be a bit more than, "old clay pipe stuck in festering monkey's uterus" or whatever crap the modern art world is currently plagued with. So if you want, you can view the final video product below and walk away. Or, if you prefer, you can keep scrolling and look at some additional pretty pictures. If you're really enthusiastic, you might even read some of the accompanying text, think about the philosophy behind this, and then watch it again. I'll be asking a lot of questions here and answering hardly any of them. Be warned.



The main purpose of this work was to show how different data visualisation methods let us perceive the same data sets in radically different ways. In principle, scientific conclusions should be limited by the data itself. In practice, the process of interpreting the data is creative and subjective. This can be true even of numerical analysis of very simple data sets - raw numbers, by themselves, tend not to be terribly inspirational, restricting the contextual environment of the data to existing ideas. And more trivially, as we all know, statistical analyses of larger data sets can all too easily lead to conclusions which are objectively constructed but simply wrong. All interpretations are ultimately subjective because interpretation is never done by the data itself, but always by human judgement.

How you perceive the data is therefore freakin' important. It strongly affects which conclusions you can find and even consider, shaping your worldview. Data visualisation has many exploratory and communicative purposes : to help discover what the data set contains, to compare different observations and models, to persuade others of whatever conclusion seems most probable, and, not least, to inspire new ideas. Creativity is a complex process, but it's fair to say that most people are far more inspired by visual imagery than mathematical text : the former is, after all, something we're much more evolved to process, since cave art tends to have more pictures of mammoths and stuff than differential equations.


And we're now in an era where data sets of millions of data points are commonplace. It's true that we're going to have to adapt our analysis methods to deal with the new problems associated with big data, but it doesn't mean we can stop looking at the data : not now, not ever. Rather it means that we have to come up with new visualisation methods, which will affect both our hard scientific conclusions and purely philosophical interpretations.

Here then is my offering to attempt philosophy through art driven by science. I'll briefly describe each sequence of the movie both in terms of the aim of each sequence and the method of its construction. I finally managed to get myself to learn Python for Blender > 2.49 for this (all of which is rendered in Blender, mostly in 2.78), so I've included links to scripts where useful. All of the data sets used here are, as is my speciality, 3D neutral atomic hydrogen data cubes. Again, links to public data sets are provided where available.

As this might interest different audiences, each section has a separate description of how the image was rendered which is written for Blender specialists and can be skipped by everyone else*. The main point is that these data cubes are three-dimensional maps of the hydrogen gas in different parts of the universe. Imagine them as a sort of slightly unconventional atlas. Every page shows a map of the density of gas at exactly the same locations in space, but at very slightly different frequencies. This corresponds to velocity, which, depending on the data set, can tell us how fast the gas is rotating and/or its distance away from us.
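For anyone who wants the atlas analogy in concrete terms, here's a minimal sketch (not one of the scripts used for the video) of how each page's frequency turns into a velocity. It assumes the standard radio Doppler convention and evenly spaced channels; the function names are made up for illustration.

```python
# Rest frequency of the 21 cm neutral hydrogen line, in MHz.
HI_REST_FREQ_MHZ = 1420.40575
SPEED_OF_LIGHT_KMS = 299792.458


def channel_velocity(freq_mhz):
    """Radio-convention velocity (km/s) for an observed frequency in MHz."""
    return SPEED_OF_LIGHT_KMS * (HI_REST_FREQ_MHZ - freq_mhz) / HI_REST_FREQ_MHZ


def cube_velocities(first_freq_mhz, channel_width_mhz, n_channels):
    """Velocity of every 'page' of the atlas, assuming evenly spaced channels."""
    return [channel_velocity(first_freq_mhz + i * channel_width_mhz)
            for i in range(n_channels)]
```

Lower observed frequencies give larger positive velocities, i.e. gas moving away from us.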

* Originally I had this fancy thing of using buttons to hide the text, which worked brilliantly for about twenty minutes and then stopped for no reason whatsoever. So I'm afraid you'll have to skip this the old-fashioned way. "Computers are logical", they said. "They give repeatable, objective results", they said...


Too Much Information


We begin with a Matrix-style shot, panning back from a single number to a whole array of text. This is of course a cliché, but a very useful one. The data that we have to process is, ultimately, just numbers, so in a sense this is the simplest and most natural form of data visualisation. Yet we very rarely use this for large data sets, because, unlike characters in the Matrix, we can't train ourselves to automatically see "blonde, brunette spiral, elliptical galaxy". We are compelled to process the data in different ways in order to make any sense of it.

Note that I said the data we have to process. Whether the data itself is really all numbers or not is another matter entirely. What we're ultimately processing is radio waves or photons (or particles, but let's not even go into how we're fundamentally unsure about the nature of the stuff we're measuring) from the sky, which induce minute electrical currents that experience an enormous amount of complex processing before they're reduced to the final numbers we get to muck around with.

So are numbers, in some sense, a very literal, accurate representation of reality, or are they as flawed as any abstract representation, like language ? Are they continuous or discrete ? One view is that because you can't contain infinity in a finite volume, physical properties - length, height, weight, acceleration, charge, etc. - can't have infinite precision. The problem with this is how you'd ever prove that volume is truly finite, that you can't just keep dividing space up into ever smaller parts. And if reality is continuous, that implies all the problems typically associated with infinities. So you can certainly represent data with numbers, but whether that means the world really is made of numbers or they are merely conceptual mental constructs is far less clear.


How it's made

This shot (the above image is actually from a later sequence) attempts to make sense of the data in a few different ways. First, I used this script to slice the data set into a series of text files. Each one contains a 2D slice of the 3D data, one for each frequency (velocity) channel. I've cheated here - despite appearances only one slice is ever visible, with each frame of the animation increasing the visible channel. The script to create the text grid is available here and the animation script is here. Rendering a true volume of text is just too computationally demanding, so I used two planes, one above and one below the text objects, with a mirror material to create the illusion of extra depth. This uses a very simple render layer setup to hide one of the planes, which you can find an example of here.
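The slicing step can be sketched in plain Python as below. This is an illustration rather than the linked script: it assumes the cube is held as nested lists indexed cube[channel][y][x], and it returns strings instead of writing files. All the names are hypothetical.

```python
def slice_to_text(channel_map, fmt="{:.2f}"):
    """Turn one 2D channel map (a list of rows) into lines of text."""
    return "\n".join(" ".join(fmt.format(v) for v in row) for row in channel_map)


def cube_to_text_slices(cube):
    """Return one block of text per velocity channel.

    `cube` is assumed to be indexed as cube[channel][y][x].
    """
    return [slice_to_text(channel_map) for channel_map in cube]


def visible_channel(frame, n_channels, first_frame=1):
    """Only one slice is shown per frame; later frames show later channels."""
    return min(max(frame - first_frame, 0), n_channels - 1)
```

The last function is the "cheat" in miniature: the animation never holds more than one slice on screen, it just steps the channel index with the frame number.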

The first sequence in the movie uses data of the Triangulum galaxy M33 available here. As the animation proceeds, the height (as well as the value) of the text is adjusted to correspond to its intensity, so you get a hint of a Matrix-style surface effect. I wanted the visuals to be driven as much as possible by the data, but I allowed myself a fair amount of "artificial" window dressing if I wanted to make a particular point. In this case the text has some random value variation to give it a more "Matrixy" appearance, since the numbers in the font I used weren't particularly unusual to look at.

The M33 text only samples a small fraction of the full data set, mainly because my first script was flawed : it links each text object to the scene as it's created. Later I realised it's very much faster to create the objects and link them to the scene all at once, which made the image shown above possible. That one uses private data of the Pegasus cluster. Each of the whiteish columnar features is a galaxy. Whereas M33 is nice and close and well resolved by the Arecibo telescope, these guys are much further away. We detect them over many different channels because they're rotating, but they're essentially just blobs in the spatial plots - hence the long, cigar-like features.

Even with the improved script I couldn't render all the data at full resolution - it's just too large to display everything as text. The galaxies, however, are unresolved, so they did require the highest resolution or they wouldn't be shown at all. So I clipped the data : the brighter flux, corresponding to galaxies, is sampled at the maximum resolution possible, whereas the fainter noise is much more restricted. Creating the appearance of the galaxies as textual surfaces (rather than volumes filled with text) was done by restricting the flux shown to a narrow range, so the interiors are avoided. The appearance of the galaxies was animated by this very simple script which just controls their visibility.
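The clipping idea might look something like the sketch below. It is a guess at the logic rather than the actual script: the threshold, the coarse sampling step and the function names are all assumptions made for illustration.

```python
def sample_cube(cube, bright_threshold, noise_step):
    """Pick which voxels become text objects.

    Voxels at or above `bright_threshold` are always kept (galaxies are
    unresolved, so they need every pixel); fainter voxels are kept only on a
    coarse grid of spacing `noise_step`. Returns (channel, y, x, value) tuples.
    """
    kept = []
    for c, channel_map in enumerate(cube):
        for y, row in enumerate(channel_map):
            for x, value in enumerate(row):
                on_coarse_grid = (c % noise_step == 0 and y % noise_step == 0
                                  and x % noise_step == 0)
                if value >= bright_threshold or on_coarse_grid:
                    kept.append((c, y, x, value))
    return kept


def surface_only(samples, flux_min, flux_max):
    """Keep a narrow flux range so only the 'skin' of each galaxy is drawn."""
    return [s for s in samples if flux_min <= s[3] <= flux_max]
```

Restricting the flux to a narrow band is what leaves the interiors empty: the brightest voxels in the middle of a galaxy fall above `flux_max` and are dropped.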


The Dark Tower



We next get a transition from representing the data as numbers to a landscape, via a brief transition sequence I'll describe later. A happy rendering accident produced the murky scene that resulted, giving a slow, reluctant change from text to surface. Of course the data here still is numerical, but this has a completely different feel to it. I liked the idea of thinking of these highly abstract slices of data as real places. Of course they are places, but a diffuse, gaseous galaxy is conceptually different from a rocky landscape. And you don't even need the concept of a number to understand a landscape. Do our minds even quantify things at all, or are they making relative comparisons in some other way ?

I played around with this quite a bit. In the video we get this gloomy blue scene, blurring the line between discrete surface and volumetric fog - a running theme of the project. I'm fascinated by how continuous data can emerge from seemingly incompatible discrete points. Eventually we see the data (the same as in the first sequence) rendered in the style of a barren rocky desert, albeit with some reflections to keep it suitably surreal. At the end the sky changes from a clear blue sky to one filled with clouds generated from hydrogen data from our own galaxy : a hydrogen sky illuminating a hydrogen landscape, data of the same type shown with starkly contrasting methods. Altering the colour scheme is also fun.


As Above, So Below



Minas Morgul


Interestingly, despite having looked at this data (as above, of the Triangulum galaxy M33) a lot, I've never noticed the twin peaks here before. So in some cases you really do get new information by changing your visualisation method. Which makes one wonder if and why it's more legitimate to view your data as a landscape, map, or volume. What exactly do we mean when we ask what the data really looks like ? Why do our minds perceive things as colours and not as heightfields, or smells, or boredom, or homoerotic ennui ?

For example, imagine that you had extremely densely-packed nerves in your fingertips. You could in principle receive tactile information as though you were seeing it, though no photons would be involved. Similarly, photons received by your eye could trigger the same sensation as when you touch something, though this wouldn't be accessing the same information as from a direct physical interaction. Or your brain could directly generate maps of emotion rather than brightness and colour. It's hard to imagine finding your way around based on how hilarious/erotic you find your surroundings, but why not ? It would be the same information as you have now, just processed differently. Indeed many animals lack a sense of sight entirely and get by just fine on smell and touch; others, such as sharks, have electrical senses quite beyond our ability to imagine. Does their electrical sense feel similar to touch, or is it as different from touch as touch is to sight ?

Synaesthesia, where one sense triggers another, is sort-of what I'm getting at. Blindsight, where the brain processes signals from the eyes only at an unconscious level, is perhaps closer but still an imperfect analogy. My questions are more why we should consider our senses legitimate at all. To what degree are we experiencing the real world ? How does our perception shape our view of it ? Why do we only perceive things in such limited ways ?

All this comes about from spending too long looking at those channel maps. False colour images are one thing; maps of the motion of an object are quite another - and they all give real information. For comparison, here's how I would normally render the above data set :


So which one is real ? A mind-wrenchingly difficult question. None of them show the data as it would appear to our eye, but on what grounds do we grant visual sense a special privilege ? None, really. Maybe one day we'll have the equivalent of a Copernican Principle but for senses, holding information from smell and taste to have the same level of validity as that from photons.


How it's made

Rendering data as a landscape is fairly easy - I just used each channel map to displace a grid mesh vertically. Stupidly not realising that Blender can now easily do this for animated textures, I wrote a couple of Python scripts : one to create the initial mesh which you can find here, and a second to animate the meshes which you can find here. Pointless, but it got me learning to code in modern Blender at long last.
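The displacement itself boils down to building a grid whose heights come from one channel map. Here is a minimal sketch in plain Python (not the linked script; the function name and the height_scale parameter are invented for illustration):

```python
def heightfield_mesh(channel_map, height_scale=1.0):
    """Build vertices and quad faces for a grid displaced by a 2D channel map.

    Returns (verts, faces): verts are (x, y, z) tuples, faces are tuples of
    four vertex indices.
    """
    ny = len(channel_map)
    nx = len(channel_map[0])
    verts = [(float(x), float(y), height_scale * channel_map[y][x])
             for y in range(ny) for x in range(nx)]
    faces = []
    for y in range(ny - 1):
        for x in range(nx - 1):
            i = y * nx + x  # index of this cell's first corner
            faces.append((i, i + 1, i + nx + 1, i + nx))
    return verts, faces
```

Inside Blender, lists in this form are the kind of thing that can be handed to `Mesh.from_pydata(verts, [], faces)`; animating the landscape then amounts to recomputing the z values from a different channel on each frame.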

The hydrogen sky data is taken from the Leiden Argentine Bonn survey and is available here. The data files are equirectangular so are easily mapped to a sphere in Blender without even needing UV mapping. To create the colours, I used a classic technique where different channels contribute to the RGB components separately.
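The colour technique can be sketched as follows. Which channel feeds which colour, the size of the offset and the scaling are all assumptions here, chosen only to show the idea.

```python
def rgb_from_channels(cube, channel, offset, vmax):
    """Colour one frame by sending three velocity channels to R, G and B.

    Red uses `channel - offset`, green `channel`, blue `channel + offset`
    (which colour gets which channel is an arbitrary choice here).
    Values are scaled by `vmax` and clamped to the range 0-1.
    """
    n = len(cube)
    # Clamp the three channel indices so they stay inside the cube.
    picks = [min(max(channel + k * offset, 0), n - 1) for k in (-1, 0, 1)]
    ny, nx = len(cube[0]), len(cube[0][0])
    image = []
    for y in range(ny):
        row = []
        for x in range(nx):
            row.append(tuple(min(max(cube[c][y][x] / vmax, 0.0), 1.0)
                             for c in picks))
        image.append(row)
    return image
```

Gas that only appears in one of the three channels shows up as a pure colour, while gas present in all three tends towards white.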

I had a happy accident creating the murky appearance. I found that if you enable indirect lighting and ambient occlusion, and completely surround your objects with some closed mesh (which must be traceable so it can cast shadows), you get this wonderfully gloomy appearance. I never bothered to figure out how to change it, but fortunately it looked nice to me anyway.


Infinity Sphere



This works a little better as an animation but the stills are nice enough, I guess. As the hydrogen sky appears to set, we change to the reverse angle - a view from beyond the sky, with the hydrogen data mapped onto a sphere. Now here I'll admit I've used something that looks pretty rather than being driven directly by the data. The sphere is between two reflecting mirror spheres, so you get a series of "infinite" (well, okay, about ten) reflections fading into the finite volume of the sphere. This merges the smooth, solid appearance of the sphere with the diffuse nature of the hydrogen, combined with strange distortions from the spherical reflections and a deliberately ambiguous sense of motion.

It's true that slapping on a pair of shiny spheres might not necessarily be the most informative way of viewing data. But to hell with that ! You're looking at a series of literally timeless photons that have, from their perspective, instantaneously travelled fifty thousand light years to be intercepted by a series of reflective metal surfaces in order to cause a small, measurable electrical current and then re-emitted as photons on a computer monitor and finally viewed through twin organic, refracting lenses which transform them back into electrical signals and then... lord knows. What exactly is strange about throwing a couple more reflecting spheres in there, hmm ?


More Than Darkness In The Depths



How it's made

Not much to say about this one - LAB data again, with offsets to generate the RGB channels, and a couple of reflecting spheres, with a similar render layer setup to prevent the outer one from being visible.


Divide



Next comes a very different style of sequence. Here the transition from an initially sharp, discrete, digital mesh to continuous volumetric cloud is very explicit, and hopefully very hard to pinpoint. You ought to be able to definitely say that the end points are discrete or fuzzy, but not determine where one begins and the other ends. Towards the end, you might spot hints of a crystalline structure that I'll return to later. I liked this sequence, but I wanted to exaggerate the effect still further.

If there's anyone out there still suffering the naive impression (as I certainly used to) that once you've got a data set, you pretty quickly understand it, this may hopefully cure them of that.


Flux



The Fires of Mordor





Ice


The same data set with the same colour scheme, but the final image has the faintest material stripped away, revealing an inner core much cooler in appearance.



How it's made

This sequence relies on the volumetric rendering techniques of FRELLED, which renders volume data as a series of transparent planes. With enough planes the appearance of a continuous volume can be faked very convincingly at low computational cost. Modifying this to initially appear solid was done by the simple approach of removing most of the planes, making the visible ones opaque, and using the build modifier to show very discrete, square sections of the data that get gradually filled in. The data set used for this one is from the VLA GPS survey of our own Galaxy.
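To see why a stack of transparent planes can pass for a volume, here's a toy calculation along a single line of sight. It assumes ordinary front-to-back alpha blending, which may not match exactly what Blender or FRELLED does internally; the function names are hypothetical.

```python
def composite_planes(values, alpha_per_plane):
    """Brightness seen looking through a stack of semi-transparent planes.

    `values` lists the brightness of each plane along one line of sight,
    nearest plane first. Each plane has the same opacity `alpha_per_plane`.
    """
    seen = 0.0
    transmitted = 1.0  # fraction of light that still gets through
    for value in values:
        seen += transmitted * alpha_per_plane * value
        transmitted *= 1.0 - alpha_per_plane
    return seen


def solid_version(values, keep_every):
    """The 'solid' look: drop most planes and make the survivors opaque."""
    kept = values[::keep_every]
    return composite_planes(kept, 1.0)
```

With many planes of low opacity, every plane contributes a little and the result looks like fog; with a few opaque planes, only the nearest one is seen and the data looks like a solid block.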

The only quirk I encountered was towards the end of the sequence, where I wanted the data to fade out. Since this is a realtime render I wanted to animate the clip alpha value, which to my surprise I found was impossible. Also, the tooltip in Blender that tells you how to access this via Python just doesn't work. Eventually some Google searching gave me the correct answer, so here's a script to show how this was done. The sudden drops in intensity are not deliberate but a result of the very low alpha value of each mesh, which means that Blender can't set enough precision on the Clip Alpha value to give a smooth transition. Not what I was aiming for, but I liked it.


Non Spectral Lines



After another Matrix-style shot, we now get something very different. Instead of turning the data into a landscape as in the first sequence, now we see a series of advancing lines, rendered with infinite reflections. This is easier to explain with something we briefly saw in the second sequence where the M33 data gradually turned into a landscape :


Phantom Spectra


Normally with these data cubes we plot spectra. These show how the intensity of a source varies with frequency, which gives us clues to its rotation and total mass. The lines above are different : they are non-spectral lines that show how the intensity varies with position on the sky. Such plots are sometimes made in astronomy, but my point here is again that rendering the same data in a slightly different way produces a conceptually different result. All that's been done is to restrict the data displayed and we get a sort of electrical appearance, very different from the landscape we had before.

You might also have noticed another phase during M33's lines-to-landscape transition sequence :


Spectral Surface


Here we see the same landscape but rendered as a partially transparent surface, with transparency depending on viewing angle. This helps to make the transition from lines to surface as gradual as possible.


How it's made

It was easy to modify the landscape scripts to restrict the meshes to a series of lines, and the creation script can be found here. For the M33 sequence these are trivially animated by just controlling their visibility (via layers rather than display) based on the current frame - a single velocity channel is used, so the height of each line never varies. That script can be found here. For the Pegasus sequence (the one with reflections), the lines are animated both in space and velocity channel. Each line that appears advances through the spatial and velocity pixels of the cube simultaneously, reversing direction whenever it reaches an edge. The animation script to do this is available here. The semi-transparent landscape is made using a very old "X-ray" technique for which you can find an example file here.
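The bouncing behaviour of the Pegasus lines can be sketched like this. It is an illustration of the idea rather than the linked animation script: it assumes each line moves one pixel and one channel per frame, and that the cube is at least two pixels wide in each direction.

```python
def bounce(position, step, size):
    """Advance one pixel index, reversing direction at the cube's edges.

    Returns the new (position, step). Positions run from 0 to size - 1,
    and `size` is assumed to be at least 2.
    """
    new_position = position + step
    if new_position < 0 or new_position > size - 1:
        step = -step
        new_position = position + step
    return new_position, step


def line_track(start_x, start_channel, n_x, n_channels, n_frames):
    """Spatial pixel and velocity channel of one line at each frame."""
    x, dx = start_x, 1
    c, dc = start_channel, 1
    track = []
    for _ in range(n_frames):
        track.append((x, c))
        x, dx = bounce(x, dx, n_x)
        c, dc = bounce(c, dc, n_channels)
    return track
```

Because the spatial and velocity axes usually have different lengths, the two indices fall out of step with each other, so a line rarely retraces exactly the same path.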


Crystalline Hydrogen



Next we return to the crystalline sequence shown earlier, but this time attempting to make it explicit in a single frame. I wanted something that appeared diffuse and continuous on one side, but discrete and crystalline on the other - structured and structureless in the same form. This I found could be done by removing certain parts of the data. Again, there's no clear point at which one can say that the object is solid or gaseous. The idea of a crystal of gas is very appealing to me. What two substances could be more different ? Yet here they are, reconciled harmoniously. Or at least as well as I was able. One day it might be fun to improve the crystalline appearance and merge the shadeless material of the volumetrics with something with specular reflections and other crystalline effects.

I'll admit that thoughts about mind-body duality, the apparent contradiction where mental, non-physical concepts somehow control physical ones, were more than a little influential here. Since I've decided I have absolutely no clue how this works (and will greet anyone claiming they've got an easy answer with a shifty-eyed look), I'm not even going to attempt a speculation. What the hell is a thought, anyway ? How can I experience something via electrochemical reactions whereas a plant or a calculator apparently does not ? Is our conscious perception just an emergent phenomenon of a flabbergastingly complex web of reactions or something more mysterious ?

And then I got to thinking about blindsight again, wondering if perhaps we all do this constantly - only being truly aware of our surroundings for brief moments yet still receiving external information unconsciously. Can we truly be called intelligent if all we do is, like a camera hooked up to a monitor or computer screen, process data ? Where does data processing end and true consciousness begin ? Is objective intelligence possible or does it innately require subjectivity and bias ?

Buggered if I know. But I like the idea of a crystalline gas. Maybe the subjective and objective aren't so diametrically opposed as we think. Or then again maybe they are. I dunno. What am I, a magician ?

I also experimented with this display technique using M33, so here's yet another render showing how different the data can appear.



How it's made

The crystalline hydrogen sequence uses Milky Way data from the GALFA-HI survey. This is rendered using FRELLED, but with some modifications. Each image plane is heavily subdivided - not to the extent of having as many vertices as data points, but pretty heavy (there are a few million vertices in the scene). The first part of the sequence is easy : all the planes are parented to an empty, which is scaled down heavily in the vertical direction so the stretched data is compressed to the size of the visible area.

Making the data look half-crystalline is more subtle. Each mesh has a randomised build modifier. I wrote a Python script to alter the length of each build, depending non-linearly on the channel number but always keeping the same starting frame and random seed. This gives the appearance of long "crystals". Build modifiers were then applied at a well-chosen frame so that they were no longer animated, and the clip alpha script was used to fade everything out. You can get the build modifier script here - it was quite carefully calibrated to work on this data set, mind you.
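As a rough idea of what "depending non-linearly on the channel number" could mean, here is a sketch that uses a simple power law. The real script's calibration isn't reproduced here: the power law, the minimum and maximum lengths and the function names are all assumptions, and the dictionary keys merely echo the names of Blender's Build modifier settings.

```python
def build_length(channel, n_channels, min_length, max_length, power=2.0):
    """Build duration (in frames) for one image plane.

    Grows non-linearly with channel number; `power` sets how strongly.
    """
    fraction = channel / (n_channels - 1) if n_channels > 1 else 0.0
    return min_length + (max_length - min_length) * fraction ** power


def build_settings(n_channels, start_frame=1, seed=0):
    """Same start frame and seed for every plane; only the length varies."""
    return [{"frame_start": start_frame,
             "frame_duration": build_length(c, n_channels, 10.0, 200.0),
             "seed": seed}
            for c in range(n_channels)]
```

Sharing the seed means every plane fills in its faces in the same random order, so a face that is still missing in one plane is likely missing in its neighbours too, which is one plausible reading of where the long "crystals" come from.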


Ghosts of Virgo


In the next sequence we return to the idea of data emerging from a surface. Once again the surface represents a particular velocity channel of the data, but this time it's shown as a solid, reflecting pool rather than a wall of Matrix text. Galaxies emerge as partially transparent surfaces, this time showing their true spectral information. As with some of the other sequences, it can take a little while before the eye understands what it's looking at. What's particularly fun about this sequence is that these galaxy surfaces are genuinely useful ways of analysing the data - this is something I'm working on currently.


How it's made

The "surfaces" are actually a series of contour plots, one at each channel of each galaxy, extruded to look like a surface. They've got the X-ray material applied whereas the flux surface is simply purely reflective. This data is from the Virgo cluster and is available here.


Broken Cloud



At the end of the Virgo sequence we again see LAB data fade into the sky. Suddenly we see the view from beyond the spherical sky, but this time rendered as a volumetric cloud. The simple, almost cartoonish sky is replaced with something far more dramatic and weird, yet it's the same data set. There's a mixture of the diffuse and discrete, coupled with a strange sort of turbulent motion.


How it's made

This uses a modification of the FRELLED technique I describe here (and formally here), mapping the data to spheres rather than planes. Since the data is from the whole sky, this is better for removing its distortions. The weirdness of the motion arises partly thanks to a strongly varying camera field of view and exploiting Blender's problems of sorting multiple layers of transparency. That is, transparent meshes only display "correctly" (if there even is such a thing) from one direction. With spherical meshes, one can peer through meshes that would otherwise appear dull and boring by using the clip alpha value in combination with simple distance clipping.


Eye of Harmony


We now fly through the LAB data. Everything is once again smooth and continuous with no hint of flaws in the data.


Xeelee Tunnel


We stay with the LAB data for the endgame. Again we see a confusing, now especially asymmetric, view of data that is sometimes discrete and sometimes diffuse, often dark but occasionally flaring into waves of bold, primary colours, before finally fading back into nothing.


Reality Bomb




Chrysalis



Shattered Mirror



The Dream Is Collapsing



How it's made

This sequence worked out far better than I dared hope. Blender can't automatically displace spherical textures - they have to be UV mapped for this to work. But all it is is the colourised LAB data displacing a sphere. Originally I just wanted to show the hydrogen as an object, something you could conceivably hold rather than a place you could visit. I found that the interior of the sphere looked so much more interesting than I'd anticipated that I abandoned my original idea and went for pure surrealism. This uses a wide angle camera moving on a complex orbit, together with distance clipping and a reflecting icosphere for the background. There are just two meshes in this scene, but it looks like so much more.
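Stripped of Blender, displacing a sphere with an all-sky map comes down to the geometry below. This is a sketch under stated assumptions rather than what the modifier does internally: it treats the map as a plain equirectangular grid of brightness values and pushes each pixel outwards along its radius. The names and the strength parameter are invented for illustration.

```python
import math


def displaced_sphere_point(lon_deg, lat_deg, value, radius=1.0, strength=0.1):
    """Cartesian position of one sky pixel pushed outwards by its brightness."""
    r = radius + strength * value
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    return (r * math.cos(lat) * math.cos(lon),
            r * math.cos(lat) * math.sin(lon),
            r * math.sin(lat))


def displaced_sphere(sky_map, radius=1.0, strength=0.1):
    """Displace every pixel of an equirectangular map.

    `sky_map[row][col]` is assumed to run from latitude +90 (row 0) to -90
    and longitude 0 to 360, sampled at pixel centres.
    """
    n_lat, n_lon = len(sky_map), len(sky_map[0])
    points = []
    for row in range(n_lat):
        lat = 90.0 - (row + 0.5) * 180.0 / n_lat
        for col in range(n_lon):
            lon = (col + 0.5) * 360.0 / n_lon
            points.append(displaced_sphere_point(lon, lat, sky_map[row][col],
                                                 radius, strength))
    return points
```

Bright gas ends up as ridges sticking out of the sphere, which is why the view from inside looks nothing like the flat sky it was made from.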


Summary

Data visualisation is the best thing ever. It just is. But if you think about it for too long, you end up spouting pretentious twaddle about "infinity" and "consciousness". You start muttering dark things about how life is a crystal and all knowledge is subjective. Pretty soon you go Full Philosopher, start asking random people on the street if they've ever wondered how language constructs reality, and eventually do all kinds of weird things a normal person should only do under the influence of psychoactive drugs, like pretending to be a small hummingbird named Hilda. At that point it's probably time to stop and go and plot some dang graphs.

