Computational Microbiology at the Institute of Food Research

Research Leader: Arthur Thompson



  • The core of the IFR microarray facility is a fast linear-servo driven spotting robot, which was built in-house by Arthur Thompson and Matthew Rolfe in 2004. The robot is based on plans developed at UCSF and Stanford University by Joe DeRisi, Pat Brown and others and was featured in the 2002 Cold Spring Harbor course ‘Making & Using DNA Microarrays’.

The IFR's fast linear-servo driven spotting robot

  • The robot is capable of printing 30 000 to 40 000 features (or spots) onto 261 slides. It supersedes the smaller model built in-house by Arthur Thompson, Sacha Lucchini and Bruce Pearson in 2000 according to plans originally developed at Stanford University by Pat Brown and Joe DeRisi (ref. 1).

The Stanford Mk II Arrayer

  • For array analysis we use an Axon GenePix 4000A scanner and for data mining we use GeneSpring™ (Silicon Genetics), although we are in the process of developing some of our own analysis software.

The Axon GenePix

  • The handling of PCR reactions, aliquoting and dilutions is carried out using a customised MWG RoboAmp 4200P.

The MWG RoboAmp

  • The facility was funded by the BBSRC.

Microarray background

  • Assessment of transcription at the genomic scale has been achieved with DNA microarrays, which are glass slides containing an ordered mosaic of the entire genome as a collection of either oligonucleotides (oligonucleotide microarrays) or PCR products representing individual genes (commonly referred to as cDNA microarrays).

  • Microarrays containing up to 50 000 or more features on a single microscope slide can be achieved using highly accurate robotic 'spotting' technologies (ref. 2).

  • To assess genome-wide transcriptional profiles, the microarrays are hybridised against RNA or DNA samples labelled with fluorescent dyes (ref. 3).

E. coli K12 hybridised microarray

1 DeRisi, J., Penland, L., Brown, P.O., Bittner, M.L., Meltzer, P.S., Ray, M., Chen, Y., Su, Y.A. & Trent, J.M. (1996). Use of a cDNA microarray to analyse gene expression patterns in human cancer. Nature Genet 14, 457-460.


3 Schena, M., Shalon, D., Davis, R.W. & Brown, P.O. (1995). Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467-470.

Current IFR Microarrays:

  • E. coli K12
  • E. coli O157
  • Shigella flexneri
  • Salmonella Typhimurium



Data analysis:


  • Our comprehensive set of controls consists of ten in vitro transcribed yeast RNAs, diluted to various concentrations, that can be spiked into labelling reactions. Each microarray is printed in duplicate both with the genome and with a serial dilution of the ten yeast PCR products. This provides an indication of the level of sensitivity of the hybridisation.
  • Lucidea ScoreCard™ (Amersham Biosciences)
  • Other controls include both ss and ds M13 DNA, and E. coli genes expressed at low and high levels, printed as serial dilutions.

Data Centring

  • Normalisation or 'centring' of microarray data is very important because it evens out experimental inconsistencies such as differential labelling efficiencies and batch-to-batch variations in chemicals, slides etc., and thus enables intra- and inter-experimental comparisons.
  • Data centring is performed by bringing the median ln(Red/Green) for each block to zero (one block being defined as the group of spots printed by the same pin) using the following equation: ln(Ti) = ln(Ri/Gi) - c, where T is the centred ratio, i is the gene index, R and G are the red and green intensities and c is the 50th percentile (median) of the ln(Red/Green) values in the block.
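
The centring equation above can be sketched in a few lines of Python. This is a minimal illustration only, not the group's actual pipeline: the function name `centre_by_block`, the tuple layout and the example intensities are all assumptions made for the example.

```python
import math
from collections import defaultdict
from statistics import median

def centre_by_block(spots):
    """Centre ln(R/G) so that each block's median becomes zero.

    spots: list of (block_id, red, green) tuples with positive intensities
    (a hypothetical layout chosen for this sketch).
    Returns the centred ln ratios in the same order as the input.
    """
    log_ratios = [math.log(red / green) for _, red, green in spots]

    # Collect the ln ratios of all spots printed by the same pin (one block).
    by_block = defaultdict(list)
    for (block, _, _), lr in zip(spots, log_ratios):
        by_block[block].append(lr)

    # c in the equation: the median ln(R/G) of each block.
    block_median = {block: median(values) for block, values in by_block.items()}

    # ln(Ti) = ln(Ri/Gi) - c
    return [lr - block_median[block] for (block, _, _), lr in zip(spots, log_ratios)]

# Made-up intensities for two blocks of three spots each.
spots = [
    (0, 200.0, 100.0), (0, 400.0, 100.0), (0, 800.0, 100.0),
    (1, 50.0, 100.0), (1, 100.0, 100.0), (1, 300.0, 100.0),
]
centred = centre_by_block(spots)
```

After centring, the middle spot of each block has a ln ratio of zero, and the other spots keep their distance from it.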


  • Each slide carries a duplicate printed microarray.
  • Hybridisations are also performed in duplicate, so at least a fourfold hybridisation is performed for each sample: two 'technical replicates' of the RNA sample itself and two 'biological replicates' of RNA from separate experiments.
  • For statistical analysis we use a parametric filter based on a two-sample t-test for two groups, or ANOVA for multiple groups, with the Benjamini and Hochberg multiple testing correction to adjust individual p-values. These tests are features of the GeneSpring™ (Silicon Genetics) microarray analysis software package, which we also use for data visualisation and mining.
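
For readers unfamiliar with the Benjamini and Hochberg correction, the standard step-up adjustment can be sketched as follows. This is a generic textbook version written for illustration; it is not taken from GeneSpring, whose internal implementation is not described here, and the function name and example p-values are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up per-gene p-values, e.g. from a two-sample t-test.
p = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(p)
```

Genes whose adjusted p-value falls below the chosen false discovery rate would then be kept as significant.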

It should be noted that the protocols, microarray design and data analysis methods described above are used specifically in the Molecular Microbiology group and may differ from those used by other groups or individuals within the IFR.



Institut für Mikrobiologie und Tierseuchen Freie Universität Berlin

Collaboration on ppGpp and global gene regulation in Salmonella with Dr Karsten Tedin

Ghent University, Belgium: Frank Pasmans, Department of Pathology, Bacteriology and Poultry Diseases

Current microarrays


Microarray No. of

ShE. coli microarray, representing the complete genomes of E. coli K12, O157 and S. flexneri plus a selection of E. coli virulence genes
# E. coli K12 specific genes
# E. coli O157 EDL933 specific genes
# S. flexneri 2a str. 301 specific genes
# Other E. coli virulence genes



S. Typhimurium LT2a microarray (includes pSLT)
‘Salsa’ Salmonella serovar microarray

# S. Typhimurium LT2a (and pSLT) genes
# S. Typhimurium DT104 specific genes
# S. Typhimurium SL1344 specific genes
# S. Enteritidis PT4 specific genes
# S. Gallinarum 287/91 specific genes



Campylobacter jejuni microarray ~ 1700
Human microarray 13971


Future microarrays at the IFR:

  • Lactobacillus
  • Bifidobacterium


Why do we use genomic DNA as a reference rather than compare RNA to RNA?

An experiment where sample RNAs are labelled with Cy3 and Cy5 and hybridised to each other is designated a 'type I' experiment (DeRisi et al, Science (1997), 278:680-686). For example, in an experiment comparing a mutant with a wild-type strain, RNA is extracted from each strain, labelled with Cy3 and Cy5, then mixed and hybridised to the array. When one of the dyes is used to label a reference (this may be genomic DNA from the strain of interest, or a mixture of all of the RNAs from a particular experiment), this is designated a 'type II' experiment. The characteristic of the labelled reference is that it should hybridise to all of the spots on the array. The other dye is used to label the sample RNA. We prefer type II experiments because they allow us to compare many different experiments in which a common reference has been used. The labelled genomic DNA also acts as a quality control because it should hybridise to every spot on the array. The original concept of type I and II experiments is described in the above reference and also in Yang and Speed, Nature Reviews Genetics (2002) 3:579-588.
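
The reason a common reference makes separate hybridisations comparable is that the reference signal cancels out: ln(A/B) = ln(A/Ref) - ln(B/Ref). A minimal sketch of this arithmetic for a single gene is given below. The function names and intensities are hypothetical, and in practice the ratios would be centred before being compared.

```python
import math

def ln_ratio_vs_reference(sample, reference):
    """ln(sample/reference) for one spot on one array."""
    return math.log(sample / reference)

def indirect_ln_ratio(sample_a, ref_a, sample_b, ref_b):
    """Compare two samples hybridised on different arrays via the shared reference.

    ln(A/B) = ln(A/Ref) - ln(B/Ref); the reference term cancels.
    """
    return ln_ratio_vs_reference(sample_a, ref_a) - ln_ratio_vs_reference(sample_b, ref_b)

# Made-up intensities for one gene: mutant RNA on array 1, wild-type RNA on array 2,
# each co-hybridised with labelled genomic DNA as the reference.
mutant_vs_wild_type = indirect_ln_ratio(800.0, 200.0, 400.0, 400.0)
fold_change = math.exp(mutant_vs_wild_type)
```

In this made-up case the mutant signal is four times the reference and the wild-type signal equals the reference, so the indirect comparison gives a fourfold difference.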

What is the median coefficient of variation for our microarray experiments?

Our median CV for labellings using 16 µg of RNA is around 5% for technical replicates and around 15% for biological replicates. This increases as the amount of RNA decreases. For the alternative labelling method for reduced amounts of RNA, the CVs are less than 10%.
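
As an illustration of how a median CV can be computed, the sketch below takes the CV (standard deviation divided by mean) of each gene across replicates and then the median over genes. Which values the CV is calculated on (raw intensities or centred ratios) is not stated above, so the input here is an assumption, as are the gene names and numbers.

```python
from statistics import mean, median, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

def median_cv(replicates_by_gene):
    """Median of the per-gene CVs across all genes."""
    return median([coefficient_of_variation(v) for v in replicates_by_gene.values()])

# Made-up replicate measurements for three placeholder genes.
replicates = {
    "geneA": [100.0, 100.0],
    "geneB": [90.0, 110.0],
    "geneC": [80.0, 120.0],
}
result = median_cv(replicates)
```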

What is 'data centring'?

Raw intensity data are often skewed due to unequal incorporation of Cy-dyes, variations in slide coatings, etc. Data centring methods compensate for this skew by evening out these variations. This can be done using defined control spots scattered throughout the array, by using smoothing functions such as the Lowess transformation, or by applying an adjustment to the median intensity ratios.

What is the best way to visualise microarray data?

Scatter plots are a good way to visualise the data initially and can show how well data centring has worked, as well as the effect of applying a fold cut-off. We prefer to use GeneSpring™, a software package available from Silicon Genetics, which provides clustering algorithms and generates scatter plots, diagrams, and plots of intensity versus experimental condition in which the genes appear as lines. These are colour coded for 'over-expressed' and 'under-expressed' genes.
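
A fold cut-off can be expressed as a simple filter on centred ln ratios, as sketched below. The twofold threshold, the function name and the gene names are assumptions chosen for the example; no specific cut-off is stated here.

```python
import math

def passes_fold_cutoff(centred_ln_ratio, fold=2.0):
    """True if the gene changes by at least `fold` in either direction."""
    return abs(centred_ln_ratio) >= math.log(fold)

# Made-up centred ln ratios for three placeholder genes.
ratios = {
    "geneA": math.log(3.0),   # threefold up
    "geneB": math.log(1.5),   # 1.5-fold up
    "geneC": math.log(0.25),  # fourfold down
}
selected = sorted(gene for gene, lr in ratios.items() if passes_fold_cutoff(lr))
```

Working on the log scale treats increases and decreases symmetrically, so a fourfold decrease passes the same threshold as a fourfold increase.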

What is our microarray distribution policy?

We only distribute microarrays to collaborators, but are willing to provide advice, and in some cases perform microarray experiments in-house using RNA sent to us.

Which labelling method do we prefer?

At the moment we use the 'Direct' method, where Cy-dCTP is incorporated into cDNA during reverse transcription of the RNA. We label genomic DNA using the 'Direct' method with the other dye and perform type II experiments. See the 'Protocols' section for details of our labelling procedures.

How to tell if the labelling reaction was successful

As a general rule of thumb, it is possible to 'see' a colour after the clean-up step of labelling. There are, however, spectrophotometric methods for determining the efficiency of labelling.

How much RNA do we label?

We prefer to label 16 µg of RNA, but can go down to 10 µg.
