
Streptococcus pneumoniae

Multilocus sequence typing of Streptococcus pneumoniae


The pneumococcal MLST database currently contains over 5000 isolates, obtained from serious invasive disease, acute otitis media and nasopharyngeal carriage, as well as penicillin-resistant and multiply antibiotic-resistant isolates.

The database will be expanded by the addition of further invasive isolates and antibiotic-resistant isolates, plus further isolates from other pneumococcal diseases and from nasopharyngeal carriage.

It is envisaged that the allelic profiles of reference isolates of all published clones of antibiotic-resistant pneumococci will be maintained in our database, which will allow the characterisation of penicillin-resistant pneumococci via the Internet. Our studies have shown that members of the major antibiotic-resistant clones usually have the same allelic profile, or differ from that profile at only a single locus (Zhou, J., Enright, M.C., and Spratt, B.G. Identification of the major Spanish clones of penicillin-resistant pneumococci via the Internet using multilocus sequence typing. J. Clin. Microbiol., 38, 977-986, 2000).

This database will be maximally useful if you deposit the allelic profiles and epidemiological information on your strains at this site. If you are not able to carry out MLST in your own laboratory, it would also be helpful if a typical isolate of any novel penicillin-resistant or multi-resistant clone, or any other novel clone of interest, could be sent to us, so that we can determine its allelic profile and enter it in the database.

Acknowledging the use of the MLST database in your publications

Please acknowledge the use of this site in your publications as follows: 'We acknowledge the use of the pneumococcal MLST database which is located at Imperial College London and is funded by the Wellcome Trust'.


Obtaining an allelic profile and comparing your strains with those in our database

The allelic profile of a pneumococcal strain is obtained by sequencing internal fragments of seven house-keeping genes. The primers for the amplification and sequencing of these gene fragments can be obtained here. The sequences must be obtained on both strands, and they must be 100% accurate, since even a single error may convert a known allele into a novel allele.
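The both-strands requirement can be checked mechanically before submission. The following is a minimal sketch, assuming both reads have already been trimmed to the same region and contain only A, C, G and T; the function names are hypothetical and the example reads are made up, not real alleles.

```python
# Illustrative sketch: compare a forward read with the reverse-strand read.
# Assumes both reads are already trimmed to the same region.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatch_positions(forward_read, reverse_read):
    """Return 0-based positions at which the two strands disagree."""
    rc = reverse_complement(reverse_read)
    fwd = forward_read.upper()
    if len(rc) != len(fwd):
        raise ValueError("reads must be trimmed to the same region first")
    return [i for i, (a, b) in enumerate(zip(fwd, rc)) if a != b]


def strands_agree(forward_read, reverse_read):
    """True only if the two strands give exactly the same sequence."""
    return mismatch_positions(forward_read, reverse_read) == []


if __name__ == "__main__":
    # Made-up reads: the second pair disagrees at the first base.
    print(strands_agree("ATGCC", "GGCAT"))
    print(mismatch_positions("ATGCC", "GGCAA"))
```

Any position reported by `mismatch_positions` would need to be resolved from the trace data before the sequence is submitted, since a single wrong base would create an apparently novel allele.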

The sequences have to be trimmed so that they correspond exactly to the region that we use to define the alleles. The sequences of the seven loci from a typical pneumococcus can be obtained here and can be used to ensure that your sequences have been trimmed correctly.

You then need to access our databases. This involves a simple registration process, which allows us to inform you of new developments by e-mail.

Select the Streptococcus pneumoniae database, and the multiple locus and allelic profile query, followed by submit. You then cut and paste your seven sequences into the corresponding boxes and submit them.

The software will check that the sequences are the correct length and that they do not contain any unrecognised characters. A check is also made to see if the submitted sequence is at least 70% similar to another allele at that locus (in case you have cut and pasted a sequence into the wrong box).
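The three checks described above can be sketched as follows. This is not the site's actual software: the expected length, the known alleles and the way similarity is computed (simple position-by-position identity between equal-length sequences) are all assumptions made for illustration.

```python
# Illustrative sketch of the submission checks; not the database's real code.
def percent_identity(a, b):
    """Percentage of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def check_submission(seq, expected_length, known_alleles, min_identity=70.0):
    """Return a list of problems; an empty list means every check passed.

    known_alleles is assumed to hold sequences of expected_length for the
    locus whose box the sequence was pasted into.
    """
    seq = seq.upper()
    problems = []
    bad = sorted(set(seq) - set("ACGT"))
    if bad:
        problems.append("unrecognised characters: " + "".join(bad))
    if len(seq) != expected_length:
        problems.append(f"length {len(seq)}, expected {expected_length}")
        return problems  # the identity check below needs equal lengths
    if not bad and not any(
        percent_identity(seq, allele) >= min_identity for allele in known_alleles
    ):
        problems.append(
            f"less than {min_identity:g}% identical to every known allele (wrong box?)"
        )
    return problems


if __name__ == "__main__":
    toy_alleles = ["ACGTACGTAC"]  # made-up allele, far shorter than a real one
    print(check_submission("ACGTACGTAC", 10, toy_alleles))
    print(check_submission("TTTTTTTTTT", 10, toy_alleles))
```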

After submitting the seven sequences, you will obtain the allelic profile of your isolate and details of any pneumococcal isolates that are identical to the one you submitted. You can also search for isolates whose allelic profiles are similar to yours, for example isolates that match the submitted allelic profile at no fewer than 4/7, 5/7 or 6/7 loci.
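Searching for identical or similar profiles amounts to counting the loci at which two isolates carry the same allele number. A minimal sketch, assuming profiles are held as dictionaries keyed by locus name; the isolate names and allele numbers are invented for illustration, and the function names are hypothetical.

```python
# Illustrative sketch: rank database isolates by the number of shared alleles.
LOCI = ("aroE", "gdh", "gki", "recP", "spi", "xpt", "ddl")


def shared_loci(profile_a, profile_b):
    """Number of the seven loci at which two profiles carry the same allele."""
    return sum(profile_a[locus] == profile_b[locus] for locus in LOCI)


def similar_isolates(query, database, min_matches=7):
    """Return (name, matches) pairs with at least min_matches shared loci.

    Results are sorted by decreasing number of matches, then by name.
    """
    hits = [(name, shared_loci(query, profile)) for name, profile in database.items()]
    hits = [hit for hit in hits if hit[1] >= min_matches]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


if __name__ == "__main__":
    # Made-up allele numbers, for illustration only.
    database = {
        "isolate_A": dict(zip(LOCI, (1, 2, 3, 4, 5, 6, 7))),
        "isolate_B": dict(zip(LOCI, (1, 2, 3, 4, 5, 6, 9))),
        "isolate_C": dict(zip(LOCI, (9, 9, 9, 9, 5, 6, 7))),
    }
    query = dict(zip(LOCI, (1, 2, 3, 4, 5, 6, 7)))
    print(similar_isolates(query, database, min_matches=6))
```

Setting `min_matches=6` returns identical isolates together with those that differ at a single locus.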

Further details about strains that are identical, or similar, to the submitted strain can be obtained by clicking on the strain names.

There are also options to assign the allele at a single locus, to enter an allelic profile and find isolates in the database that match or nearly match it, to browse the database (e.g. to look at the details of all strains of a particular serotype), and to run advanced queries.

Is it really a pneumococcus?

If a query allele differs by more than one or two percent from every known allele, the strain may not be a pneumococcus. For example, if the strain is non-typeable and has novel alleles at six or seven loci, and these differ by several percent from the pneumococcal alleles in the MLST database, it is likely that the strain is an 'atypical' pneumococcus or a closely-related species. Check whether the strain is serotypeable, as such strains will not have a pneumococcal serotype.

Some non-typeable pneumococci are authentic non-capsulated isolates, and these have typical pneumococcal alleles at six or all seven loci. Some of these alleles may not yet be in the database, but they should be similar in sequence to those that are. N.B. Authentic pneumococci do sometimes have diverged alleles at one of the seven loci.
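The rule of thumb above can be expressed as a count of loci whose query allele is diverged from its nearest known allele. The sketch below assumes equal-length sequences and uses a 2% cut-off, which is one reading of "one or two percent" rather than a threshold defined by the database; the function names and sequences are hypothetical.

```python
# Illustrative sketch: list the loci whose allele is diverged from all known alleles.
def percent_difference(a, b):
    """Percentage of positions that differ between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x != y for x, y in zip(a, b)) / len(a)


def nearest_allele_difference(query, known_alleles):
    """Percent difference between the query and its closest known allele."""
    return min(percent_difference(query, allele) for allele in known_alleles)


def diverged_loci(query_by_locus, known_by_locus, threshold=2.0):
    """Loci at which the query differs by more than threshold percent (assumed cut-off)."""
    return [
        locus
        for locus, seq in query_by_locus.items()
        if nearest_allele_difference(seq, known_by_locus[locus]) > threshold
    ]


if __name__ == "__main__":
    # Made-up 20-base sequences, for illustration only.
    known = {"aroE": ["A" * 20], "gdh": ["C" * 20]}
    query = {"aroE": "A" * 20, "gdh": "C" * 19 + "T"}
    print(diverged_loci(query, known))
```

A diverged allele at a single locus is compatible with an authentic pneumococcus; diverged alleles at six or seven loci point towards an atypical pneumococcus or a related species.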

A new facility allows you to concatenate the DNA sequences at six of the seven MLST loci (ddl is omitted, as this gene often contains highly diverged sequences in penicillin-resistant isolates) and to compare this concatenated sequence with the concatenated sequences of a reference set of pneumococci and pneumococcal-like organisms. If your isolate clusters within the pneumococcal strains on the resulting tree, you can be confident that it is a pneumococcus. If it is clearly separated from the pneumococcal strains on the tree, it is almost certainly not an authentic pneumococcus.
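The concatenation step itself is simple and can be sketched as below. The order in which the six loci are joined is an assumption (the text does not state the order the site uses), and the sequences are made up; tree building is left to whatever phylogenetics tool you normally use.

```python
# Illustrative sketch: join six of the seven loci, leaving ddl out.
CONCAT_LOCI = ("aroE", "gdh", "gki", "recP", "spi", "xpt")  # assumed order; ddl omitted


def concatenate_loci(sequences):
    """Concatenate the six non-ddl loci in a fixed order.

    sequences maps locus name to trimmed sequence; a ddl entry is ignored.
    """
    missing = [locus for locus in CONCAT_LOCI if locus not in sequences]
    if missing:
        raise KeyError("missing loci: " + ", ".join(missing))
    return "".join(sequences[locus].upper() for locus in CONCAT_LOCI)


if __name__ == "__main__":
    # Made-up three-base "alleles", for illustration only.
    toy = {"aroE": "aaa", "gdh": "ccc", "gki": "ggg", "recP": "ttt",
           "spi": "acg", "xpt": "tgc", "ddl": "nnn"}
    print(concatenate_loci(toy))
```

Using the same locus order for the query and for every reference strain matters more than which order is chosen, since the concatenated sequences are compared position by position.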


The seven loci and the primers and conditions used for PCR

The pneumococcal MLST scheme uses internal fragments of the following seven house-keeping genes:-

aroE (shikimate dehydrogenase)

gdh (glucose-6-phosphate dehydrogenase)

gki (glucose kinase)

recP (transketolase)

spi (signal peptidase I)

xpt (xanthine phosphoribosyltransferase)

ddl (D-alanine-D-alanine ligase)

The primer pairs used for the PCR amplification of internal fragments of these genes are:-

aroE-up, 5'-GCC TTT GAG GCG ACA GC and


gdh-up, 5'-ATG GAC AAA CCA GC(G/A/T/C) AG(C/T) TT and

gdh-dn, 5'-GCT TGA GGT CCC AT(G/A) CT(G/A/T/C) CC

gki-up, 5'-GGC ATT GGA ATG GGA TCA CC and


recP-up, 5'-GCC AAC TCA GGT CAT CCA GG and


spi-up, 5'-TTA TTC CTC CTG ATT CTG TC and


xpt-up, 5'-TTA TTA GAA GAG CGC ATC CT and


ddl-up, 5'-TGC (C/T)CA AGT TCC TTA TGT GG and

ddl-dn, 5'-CAC TGG GT(G/A) AAA CC(A/T) GGC AT
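Several of the primers above are degenerate: a bracketed group such as (C/T) means that the primer is a mixture carrying either base at that position. As a sketch of how to read this notation, the hypothetical helpers below expand a primer written in this style into every sequence it represents.

```python
# Illustrative sketch: expand the bracketed degenerate-primer notation used above.
import itertools
import re


def parse_primer(text):
    """Turn a primer such as "5'-AG(C/T) TT" into a list of base choices per position."""
    text = text.replace("5'-", "").replace(" ", "")
    positions = []
    for group, base in re.findall(r"\(([ACGT/]+)\)|([ACGT])", text):
        positions.append(group.split("/") if group else [base])
    return positions


def expand_primer(text):
    """Return every unambiguous sequence represented by a degenerate primer."""
    return ["".join(combo) for combo in itertools.product(*parse_primer(text))]


if __name__ == "__main__":
    for variant in expand_primer("5'-TGC (C/T)CA AGT TCC TTA TGT GG"):
        print(variant)
```

For gdh-up, which has one four-fold and one two-fold degenerate position, the same function returns eight 20-base sequences.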

PCR amplification is carried out on chromosomal DNA using an extension time of 30 seconds and an annealing temperature of 50 °C, with Qiagen Taq polymerase. As the same primers are used for amplification and sequencing, it is important that only a single DNA fragment is amplified in the initial PCR. This may involve some optimisation of the annealing temperature.

The DNA fragments are purified using QIAquick (Qiagen), and sequencing reactions are carried out in each direction using the primers that were used for the initial PCR amplification. The samples are applied to an automated DNA sequencer with d-Rhodamine-labelled terminators (PE Applied Biosystems).



McGee, L., McDougal, L., Zhou, J., Spratt, B.G., Tenover, F.C., George, R., Hakenbeck, R., Hryniewicz, W., Lefevre, J.C., Tomasz, A., and Klugman, K.P. Nomenclature of major antimicrobial-resistant clones of Streptococcus pneumoniae defined by the pneumococcal molecular epidemiology network. J. Clin. Microbiol. 39, 2565-2571, 2001.

Maiden, M.C.J., Bygraves, J.A., Feil, E., Morelli, G., Russell, J.E., Urwin, R., Zhang, Q., Zhou, J., Zurth, K., Caugant, D.A., Feavers, I.M., Achtman, M., and Spratt, B.G. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. USA, 95, 3140-3145, 1998.

Enright, M. and Spratt, B.G. A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiology 144, 3049-3060, 1998.

Shi, Z.-Y., Enright, M.C., Wilkinson, P., Griffiths, D., and Spratt, B.G. Identification of three major clones of multiply antibiotic-resistant Streptococcus pneumoniae in Taiwanese hospitals using multilocus sequence typing. J. Clin. Microbiol. 36, 3514-3519, 1998.

Spratt, B.G. Multilocus sequence typing: Molecular typing of bacterial pathogens in an era of rapid DNA sequencing and the Internet. Current Opinion in Microbiology, 2, 312-316, 1999.

Enright, M.C., Fenoll, A., Griffiths, D., and Spratt B.G.  The three major Spanish clones of penicillin-resistant Streptococcus pneumoniae are the most common clones recovered from recent cases of meningitis in Spain.  J. Clin. Microbiol. 37, 3210-3216, 1999.

Coffey, T.J., Daniels, M., Enright, M.C., and Spratt, B.G. Serotype 14 variants of the Spanish penicillin-resistant serotype 9V clone of Streptococcus pneumoniae arose by large recombinational replacements of the cpsA-pbp1a region. Microbiology 145, 2023-2031, 1999.

Zhou, J., Enright, M.C., and Spratt, B.G. Identification of the major Spanish clones of penicillin-resistant pneumococci via the Internet using multilocus sequence typing.  J. Clin. Microbiol., 38, 977-986, 2000.

Enright, M.C., Knox, K., Griffiths, D., Crook, D.W.M., and Spratt, B.G. Multilocus sequence typing of Streptococcus pneumoniae directly from cerebrospinal fluid. Eur. J. Clin. Microbiol. Infect. Dis. 19, 627-630, 2000.
