How can you trust your own senses ? How can you be sure that what you're seeing isn't all just some kind of elaborate illusion set up by a powerful entity with a truly warped sense of humour ?
Don't worry, this isn't going to be much of a philosophical rant. Actually, it's going to be an extended rant about how I hurt myself by looking at over 170,000 pictures of static.
Yes, really. All of which relates, of course, to radio astronomy. It's time for the public-friendly explanation of my latest paper, in which I pit human vision against a set of different algorithms and find that there's life in us old monkeys yet. I also inflict things upon myself that I hope never to repeat ever ever again.
1) How To Hunt For Gassy Galaxies
Regular readers will be only too well aware by now of what galaxies look like in neutral hydrogen (HI - "H one") data cubes. There's a longer explanation here, but in general they look like this :
"Yes, yes," you say, "but how do you ever know if what you've found is real ?"
Indeed. That's a question that plagues both visual and automatic searches. Oh, sure, it's fine if you find a great big blazing beast of a galaxy (like some of those in the above animations), but what about when you've got something more piddly like this ? How could you be sure it wasn't just some slightly brighter-than-usual bit of noise ?
Well, there are a few measurements we can make of the gas itself that give us some clues : quantitative values can be a lot more reliable than simply eyeballing it. But first, in the above images you can also see optical data alongside the radio, and that's a very powerful verification check. Gas clouds without associated optical emission aren't non-existent, but they're extremely rare (about 1% of all HI detections by some estimates). You'd be forgiven for missing this one based only on looking at the data cube, but here at least the optical galaxy is unmistakable.
Understandably then, hardly anyone relies exclusively on the original HI data. We almost always have some optical data to act as independent confirmation (except where the Milky Way blocks our view in the aptly-named Zone of Avoidance) and can usually get follow-up radio observations to directly confirm at least the most interesting signals. That's the absolute gold-standard for verification, and how we know that most optically dark sources aren't real : we've checked them many, many times, and indeed still do when we find anything interesting.
But... what if we can't get either of these ? Exactly how good is the eye at distinguishing signal from noise, and what fraction of the faintest signals does it pick up at all ? That's what I wanted to answer with this paper.
2) Yes, But Why ? And How ?
The key aspect of the problem is that, once you get down to the faint stuff, there are no objective criteria, no magical algorithms, that can 100% reliably distinguish between real signals and noise. Some truly pathetic signals turn out to be real while some fairly convincing ones end up being discarded as worthless junk. The only way to be really sure is to do more observations.
BUT... algorithms are at least objective and repeatable : throw 'em the same data set and search with the same parameters, and you'll get the same objects every time. Different search methods or parameters can give you different catalogues, but at least if you keep everything the same, you'll get the same results again and again. That's a big advantage over using squishy, emotional humans that might get distracted because they haven't had enough tea or they got sick or their pet hamster died or something.
[Image] I thought about refining ChatGPT's weird take on this, but decided the bizarreness of putting the hamster in a box labelled NO TEA was just too funny to alter.
How much does this matter though, really ? Exactly how good are humans compared to algorithms ? Can we even quantify it, or are we all such an emotional, whimsical bunch of wet blankets that we just come up with totally different results every time ? Or if you're a Daily MFail reader, HAS WOKENESS KILLED ASTRONOMY ?
The only way I could see to test this was to look for lots and lots and lots of sources, all with different parameters. Throw enough sheer statistics at the problem and it ought to be possible to see if human abilities could be quantified or not.
Now, doing this requires full knowledge of what's there for us to find. Ordinarily this isn't the case at all, because the whole point of the problem is that we don't know for sure which sources we've missed. So the only way to do this is by using fake sources – only then can we be absolutely sure whether we've found everything. The basic idea is very simple : try to find as many artificial signals as we can and measure their parameters.
Of course, for this we also need a data set which doesn't have any actual galaxies in it, otherwise we'll confuse the artificial signals with real ones. Fortunately one of our unpublished data sets includes just such a cube, spanning a frequency range in which real signals just can't happen. To find emission here would require galaxies moving towards us at insane velocities, many thousands of kilometres per second – no real galaxy is known which moves at anything even close to this*. Bingo ! We've got a real data cube with all the imperfections of real observational data, but with absolutely no real galaxies in it. Perfect.
* Redshifts of this magnitude are normal, due to the expansion of the Universe. But there the galaxies are moving away from us. Here we'd need galaxies of extreme blueshift, and there's no known mechanism by which this could happen.
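To put rough numbers on that, here's a back-of-the-envelope check in Python (a minimal sketch; the observed frequency used is invented for illustration, not the actual range of our cube). The HI line has a rest frequency of about 1420.406 MHz, so any emission detected above that frequency would have to be blueshifted :

```python
# What velocity does a given observed HI frequency imply ?
# (The 1450 MHz example is made up purely for illustration.)
C_KM_S = 299792.458     # speed of light in km/s
F_REST_MHZ = 1420.4058  # rest frequency of the 21 cm HI line

def radio_velocity(f_obs_mhz):
    # Radio-convention velocity; negative values mean motion towards us.
    return C_KM_S * (F_REST_MHZ - f_obs_mhz) / F_REST_MHZ

print(radio_velocity(1450.0))  # ~ -6250 km/s : an absurdly extreme blueshift
```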
Once we've done the detecting, how to parameterise the results ? Well, with any catalogue it's important to understand its completeness and reliability : that is, what fraction of the sources present it detected, and what fraction of its detections are real. With enough sources we could also see if the eye is especially sensitive to particular properties, like the total brightness and velocity width, and maybe also figure out what sort of false signals fool the eye into thinking there's something present. And I also wanted to test the street wisdom that, apart from speed, humans are generally better than algorithms when it comes to sheer detection capabilities.
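For concreteness, those two catalogue metrics are just simple fractions, sketched here as trivial Python functions (my own names, not anything from the paper) :

```python
def completeness(n_real_found, n_injected):
    # Fraction of the sources actually present that were detected.
    return n_real_found / n_injected

def reliability(n_real_found, n_claimed):
    # Fraction of the claimed detections that turned out to be real.
    return n_real_found / n_claimed

# E.g. recovering 80 of 100 injected sources, plus 10 false positives :
print(completeness(80, 100))  # 0.8
print(reliability(80, 90))    # ~0.89
```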
3) The Experiment
Figuring out the best approach took a lot of trial and error. The simplest method would be to inject lots of signals into a single large data cube, but this wasn't feasible. This would mean I'd have to mask each galaxy as I went along to avoid cataloguing it twice, which is... not a huge amount of work, but it adds up. And for an experiment of the scale this one became, this would have been unbearable.
The problem is that galaxies themselves have two parameters which control their detectability : their width and their brightness. Here's an example spectrum I use in lectures :
What this is showing is a signal of fixed total flux but a varying velocity width. At the very beginning, all that flux is confined to just a few velocity channels, so it's very narrow but bright. Even though it's so bright, because of the way we typically display the data, the narrowness of the signal makes it hard to spot. As the movie advances the velocity width increases, so that it gets wider and wider but appears dimmer and dimmer. At first this makes it much easier to see : it's still bright but it's no longer narrow, so it's really obvious that there's something atypical here. But eventually that flux is spread out over so many channels that it's barely distinguishable from the background noise at all, even though the total amount of flux is the same throughout the animation.
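There's a simple statistical reason the signal eventually vanishes. Assuming Gaussian noise, which adds in quadrature, a fixed total flux spread over N channels has a per-channel (peak) S/N falling as 1/N, while the integrated S/N over all those channels falls only as 1/√N. A toy calculation with made-up numbers :

```python
import numpy as np

total_flux = 1.0   # fixed total flux, arbitrary units (as in the animation)
sigma = 0.01       # per-channel noise level

for n in (2, 10, 50, 250):
    peak_snr = (total_flux / n) / sigma          # flux per channel vs noise
    int_snr = total_flux / (sigma * np.sqrt(n))  # summed signal vs summed noise
    print(f"{n:4d} channels : peak S/N = {peak_snr:6.1f}, "
          f"integrated S/N = {int_snr:5.1f}")
```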
I've long thought it an interesting question as to which one matters most. If a source is wide enough, does this compensate for its dimness ? Or is it brightness alone which determines detectability ? My PhD supervisor took it for granted it was the latter, but I was never quite convinced of this.
The only way I could see to tackle the problem was to inject many galaxies, each of a given width and brightness. I'd inject, say, 100 with some combination of values, see how many I could find, and then repeat this ad nauseam. I'd need to have plenty of objects for each combination to get a statistically significant result. Since I had very little clue ahead of time where exactly the detectability threshold would be, this would mean injecting a lot of galaxies.
That made the idea of using a single cube a complete non-starter. Eventually I figured out a working strategy, which goes like this (there's a code sketch just after the list) :
- Pick a width and brightness (signal to noise, S/N) level of the signal.
- Extract 100 small "cubelets" at random from the main cube.
- For each cubelet, randomly inject (or not inject) a signal of the specified parameters at a random position within it.
- Modify my source extraction program so I could go through each cubelet sequentially, just clicking on a source if I thought I could see one, or clicking outside the data set if I thought there wasn't one.
- Choose new signal parameters and do the whole thing again.
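For the curious, here's roughly what one batch of that procedure looks like in code. This is a minimal sketch under my own assumptions, not the actual pipeline : the function names, cubelet size, and crude top-hat signal shape are all stand-ins.

```python
import numpy as np

rng = np.random.default_rng()

# Stand-in for the real source-free data cube : pure Gaussian noise here.
big_cube = rng.normal(0.0, 1.0, size=(100, 100, 200))

def extract_cubelet(cube, size=(50, 50, 100)):
    # Pull a small cubelet from a random position in the big cube.
    x, y, z = [rng.integers(0, full - small)
               for full, small in zip(cube.shape, size)]
    return cube[x:x+size[0], y:y+size[1], z:z+size[2]].copy()

def inject_source(cubelet, peak_snr, width, noise_sigma):
    # Add a crude top-hat signal of the chosen peak S/N and velocity width
    # at a random position (a real injection would use a more realistic profile).
    x = rng.integers(0, cubelet.shape[0])
    y = rng.integers(0, cubelet.shape[1])
    z = rng.integers(0, cubelet.shape[2] - width)
    cubelet[x, y, z:z+width] += peak_snr * noise_sigma
    return (x, y, z)

# One batch : fixed width and S/N, 100 cubelets, each containing a source
# only about half the time.
noise_sigma = big_cube.std()
answers = []
for i in range(100):
    cubelet = extract_cubelet(big_cube)
    has_source = rng.random() < 0.5
    answers.append(inject_source(cubelet, peak_snr=5.0, width=10,
                                 noise_sigma=noise_sigma) if has_source else None)
    # ...show the cubelet to the human searcher, record where they click,
    # and only compare against `answers` once the whole batch is done.
```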
[Image] The cubelet is an adorable animal, but, like tribbles, they tend to multiply exponentially if you're not careful.
Without any of these it becomes too easy to be lost in the visual fog - one needs some clue as to what one is looking for, or the experience is unendurably frustrating... Because this is a visual search process, one needs to take much more account of the psychological, emotional experience than when using a pure algorithm.
[Image] 170,000 images. I mean, FFS.
- Green points : these test whether it mattered that I knew ahead of time that the fraction of cubelets containing a source was always about 50%. Here I instead set this fraction to a random number, but lo, the detection statistics were unchanged. In fact, even when I went back and searched cubelets where I'd previously missed the source, now with the full knowledge that a source was present, I still couldn't find them. Foreknowledge just doesn't help much.
- Red points : randomised source properties. Here I injected sources at three different integrated S/N levels, designed to give 25, 50 and 75% completeness, but with entirely randomised widths, in random order, and without any indication of which identifications were correct until the experiment was over. Essentially, for each cubelet I had no idea whether it contained no source at all or one that would be marginally, modestly, or probably detectable, nor what it would look like. This again made no difference to the results.
- Orange points : as above, but now injecting the sources into a single large cube instead of many cubelets. In the main experiment I knew there was at most one source per cubelet; here I didn't know how many had been injected (randomised with some sensible upper limit) or their properties. This was about as similar to real-world conditions as it's ever going to get, and it still made no difference.
[Image] "It's a movie about a killer robot radio astronomer who travels back in time for some reason."
[Image] "Fiducial" is just the main visual experiment. It sounds more sciency than "the one what I did earlier".
[Image] After all, the question "real or fake ?" has caused much debate elsewhere. Methinks I ran the wrong experiment.