Adventures with InfoGANs: towards generative models of biological images (part 2)
In the last post I introduced neural networks, generative adversarial networks (GANs) and InfoGANs.
In this post I’ll describe the motivation and strategy for creating a GAN which generates images of biological cells, like this:
Two microscope images of cells from the same cell line are never going to be pixel-for-pixel identical, yet they still share key similarities in terms of morphology. One way to get a computer to demonstrate that it “understands” these morphological properties is to force it to create images of new cells. When a model is capable of generating any image that might be taken of a specific cell, one might argue that it has gained some knowledge about the cell.
This is an appealing approach as it is easy to acquire large numbers of static images of cells, for instance by using an imaging flow cytometer. This device flows cells past a camera and acquires thousands of images each second, each with a large number of fluorescent channels. These are static images of course, but if they come from an asynchronous population they should contain representatives of every stage in the cell cycle. If our generative model were capable of truly understanding the structure in the data (and there is evidence that such models are), and if we could arrange for one dimension of the generator's input to correspond to pseudotime, then we could generate timecourse “videos” from models trained on these single images. This would, apart from anything else, have practical utility, avoiding the problems of photobleaching in imaging.
What I did
From my work at the Rayner lab at the Sanger Institute I have access to a large dataset of images of infected red blood cells acquired with an imaging flow cytometer. These images contain both a brightfield transmitted light channel and a fluorescent channel showing DNA in the parasite. I used some filtering (with a classification network) to isolate only images containing single cells which were fully visible. I then trained an InfoGAN in much the same way as for the digit dataset described in the last post.
I tried a couple of different versions, one with a large number of “communicated” values – which is what you see above.
The network learns about quite a few aspects of Plasmodium biology, and also some basic optics:
- It learns about red blood cell morphology and to produce images of plausible cells
- It learns that the nucleus in the DAPI channel is always within the bounds of the RBC in the brightfield channel.
- It learns that there is a subset of cells (schizonts) which have both black haemozoin and widespread, bright DAPI nuclei (often arranged in a circular shape)
- It learns that cells can appear in front of or behind the focal plane and how to render both types.
- I could go on... Almost all of the images produced by the network are plausible images that a biologist would be unable to distinguish from true parasites.
As an aside, I really like watching this network cycle between samples; it is strikingly similar to watching a motile cell wriggling:
I also trained a second version, with many fewer communicated variables, which really lets us see what the network sees as the most salient features. Unsurprisingly, these are mostly about the location and orientation of the red blood cell. Interestingly this network makes one of the parameters the focal position of the cell:
For any cell you look at, one end of this variable produces one type of halo, and the other the opposite, just as if this parameter was controlling the focus of a microscope.
This illustrates the sort of property it would be really exciting to get the network to learn. In an ideal world the sliders would represent something like: focus, cell orientation and parasite lifecycle stage. So if you had a young ring stage parasite and you wanted to watch it mature you could just drag the slider across. Or if you wanted to focus on the far side of the cell you could just drag another slider. Such an idea may sound fanciful, but I don’t think it’s too far away.
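To make the slider idea concrete, here is a minimal sketch of what "dragging a slider" means in terms of generator inputs: hold the noise and all but one communicated value fixed, and sweep the remaining value across its range. This is not the code used for the images in this post. The function names, the 62 noise dimensions, the 3 communicated values and the [-1, 1] range are all assumptions chosen for illustration, and the trained generator itself is not shown.

```python
import random


def make_latent(n_noise, n_codes, rng):
    """Sample one generator input as (noise, codes).

    Assumption: noise is standard Gaussian and each communicated
    value is uniform in [-1, 1]. A real model may differ.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
    codes = [rng.uniform(-1.0, 1.0) for _ in range(n_codes)]
    return noise, codes


def sweep_code(noise, codes, index, steps=9, low=-1.0, high=1.0):
    """Return `steps` generator inputs that differ only in codes[index]."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    frames = []
    for i in range(steps):
        swept = list(codes)  # copy so the caller's codes are untouched
        swept[index] = low + (high - low) * i / (steps - 1)
        frames.append(list(noise) + swept)  # noise first, then codes
    return frames


rng = random.Random(0)
noise, codes = make_latent(n_noise=62, n_codes=3, rng=rng)  # sizes are illustrative
frames = sweep_code(noise, codes, index=0, steps=5)
# Each entry of `frames` would then be fed to the trained generator
# (hypothetical, not defined here) to render one frame of the "video".
```

If the swept value really had been learned as focus or lifecycle stage, rendering the frames in order would give the timecourse-style video described above.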
There are a number of simple ways I could improve the work I’ve done here. The network has to capture the complete distribution of cellular images presented as “real”. In this case you can see that these are in many different orientations, and different positions in the image. So a vast amount of the network’s effort goes into recapturing that not very interesting spatial variation. I made some basic attempts to align the images presented to the network, but this is something I could definitely improve. Something you will see as the cells dance from one space in the network to another is that while the nucleus tends to move smoothly around, the black haemozoin will often fade out in one location and appear in another. The lack of smoothness here is an example of the “mode collapse” that often haunts GANs, but there are a number of ways to tackle it.
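As an illustration of the kind of alignment mentioned above, here is a minimal sketch that removes positional variation by shifting each image so that its intensity-weighted centroid sits at the centre of the frame. It is not the alignment actually used for this post. It assumes NumPy is available and that the input is a background-subtracted image or a binary cell mask, so that background pixels are close to zero. The function name is my own, and orientation is not handled.

```python
import numpy as np


def center_on_centroid(image):
    """Shift a 2D image so its intensity-weighted centroid is at the centre."""
    image = np.asarray(image, dtype=float)
    total = image.sum()
    if total <= 0:
        return image.copy()  # empty image: nothing to align on
    rows, cols = np.indices(image.shape)
    centroid_r = (rows * image).sum() / total
    centroid_c = (cols * image).sum() / total
    # Whole-pixel shift that moves the centroid to the middle of the frame.
    shift_r = int(round((image.shape[0] - 1) / 2 - centroid_r))
    shift_c = int(round((image.shape[1] - 1) / 2 - centroid_c))
    # np.roll wraps pixels around the edges, which is only harmless
    # when the border of the image is empty background.
    return np.roll(image, (shift_r, shift_c), axis=(0, 1))


# Toy example: a 3x3 "cell" in the top-right of a 9x9 frame.
frame = np.zeros((9, 9))
frame[1:4, 5:8] = 1.0
aligned = center_on_centroid(frame)
```

Rotating each cell to a common orientation (for instance along its principal axis) would be a natural next step, but it is harder to do robustly for roughly circular cells.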
In short, this work just scratches the surface of what I suspect is possible with generative models of biological cells.
In some organisms there are genome-wide fluorescent-tag libraries available. Building a generative model using these (possibly with the need for some pairwise imaging) could allow the creation of a synthetic cell in which every protein can be simultaneously visualised. It’s an exciting prospect, and I think it’s nearer than it seems.
P.S. I belatedly looked for similar published work, and found two cool papers. The second of these introduces a star-shaped network designed to allow alignment in much the way I imagine at the end of this post. And more generally there are a ton of GAN papers applying the technique to super-resolution microscopy, in silico staining, etc.