Virus Structure

In a series of seminal experiments in 1955, Fraenkel-Conrat and Williams demonstrated that tobacco mosaic virus (TMV) formed spontaneously when mixtures of purified coat protein and its genomic RNA were incubated together, i.e. the structure that TMV adopts is self-ordered and corresponds to a free energy minimum. This was, and remains, a remarkable discovery.

Despite the great variability shown in virus properties, at a structural level all are based on a few basic designs which will be described in this lecture.

Methods of Analysis

Electron microscopy has been very useful in giving information at a low resolution about virus structures. Resolution in this context means the size of the smallest structural unit that can be clearly visualized. With electron microscopy the level of resolution is 5nm (1nm = 10^-9 metres). To put this into some kind of perspective:

Electron microscopy can therefore tell you about the overall shape of a virus. Quantitative statistical analysis of many different images, which improves the signal to noise ratio, can give better information at a higher resolution. However, for detailed atomic resolution structures the only suitable technique is X-ray crystallography. This requires that a virus can be crystallized. In many cases they can, and crystallization of a virus was first reported in the 1930s. It was not known then that this meant viruses had a defined structure, nor that the structure could be determined from a diffraction pattern. [You may know that Fred Sanger did not determine the first amino acid sequence of a protein, insulin, until the mid 1950s and that John Kendrew did not report the first crystal structure of a protein, myoglobin, until the early 1960s]. It is not surprising therefore that the first atomic resolution structure of a virus was not solved until 1978, when that of tomato bushy stunt virus appeared.

Many viral structures have been determined since then and new ones appear on a regular basis in scientific journals. In some cases the structures are determined from crystals of viruses. In other cases this direct approach is not possible. If so, a biochemical analysis can be used to determine what constituents are present in the virus and in what ratio. The individual components of the virus are then crystallized and their structures determined. Gene cloning and sequencing techniques facilitate the isolation of large amounts of purified proteins and nucleic acids for this purpose. The overall structure is then deduced by 'building' the virus with these subunit structures, taking into account information from other techniques such as electron microscopy.

Types of Virus Structures

Electron microscopy suggests that many viruses are spherical. A small virus (e.g. a parvovirus) has a diameter of about 25nm. A large virus (e.g. a poxvirus) has a diameter of up to 300nm.

Many viruses are enveloped. Some are roughly spherical, having a symmetry based on an icosahedron, e.g. HIV-1. Others are filamentous, e.g. the rabies virus.

HIV: a typical enveloped virus

HIV particle

HIV contains 2 identical copies of a positive sense (i.e. same as mRNA) single stranded RNA about 9500 nucleotides long. This RNA genome is associated with a basic nucleocapsid protein. Nucleocapsid proteins are usually basic (+ve charged) proteins which can neutralize acidic (-ve charged) nucleic acid and facilitate its packaging. This nucleoprotein filament may be helical (see below). The nucleoprotein filament is encapsidated by a capsid layer made up of multiple copies of capsid protein. The capsid layer may have an icosahedral type structure. It is in turn enclosed by a layer of matrix protein, which may also show icosahedral symmetry. The matrix protein is associated with a lipid bilayer or envelope. The HIV envelope is derived from the cell plasma membrane and is acquired when the virus buds through the cell membrane. The envelope is thought to contain the lipid and protein constituents of the cell plasma membrane from which it is derived. In addition it contains viral proteins forming spikes or peplomers. The major HIV protein associated with the envelope is gp120/41, which functions as the viral antireceptor or attachment protein: gp41 traverses the envelope, while gp120 is present on the outer surface and is attached to gp41.

Variations on this theme:

An envelope is a common feature in animal viruses but uncommon in plant viruses. In some other viruses the envelope is derived from the nuclear membrane or the Golgi membrane. A detailed examination of spherical viruses shows that they often have icosahedral symmetry or a symmetry based on the icosahedron. Icosahedral viruses are very common among both plant and animal viruses. Some icosahedral viruses, e.g. those of the Picornaviridae, are not enveloped and do not have a matrix layer; in these the capsid is the outer layer of the virus.

The icosahedral capsid in more detail:

Icosahedral symmetry

The subunits of the capsid are located around the vertices or faces of an icosahedron. An icosahedron has 20 equilateral triangular faces arranged around the surface of a sphere. It is defined by having 2, 3 and 5 fold axes of symmetry. Viruses having 20 subunits are not known to exist. A few viruses with only 60 subunits have been found, e.g. ΦX174. Most viruses fit 60 x N subunits into their capsids. N is sometimes called the triangulation number and values of 1, 3, 4, 7, 9, 12 and more are found. An icosahedral virus containing 60 subunits has perfect symmetry. However, geometrically it is not possible to arrange more than 60 subunits in an equivalent fashion around an icosahedron. Instead the subunits have to be arranged in a quasi-equivalent fashion.
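The values of N listed above are not arbitrary. In the standard Caspar-Klug treatment of quasi-equivalence (an assumption here, since these notes do not name it), N = h^2 + hk + k^2 for non-negative integers h and k. A minimal Python sketch under that assumption, listing the allowed values and the corresponding 60 x N subunit counts; the function names are illustrative only:

```python
def triangulation_numbers(limit):
    """Return the sorted values N = h*h + h*k + k*k (h, k >= 0, N > 0) up to limit."""
    values = set()
    h = 0
    while h * h <= limit:
        k = 0
        while h * h + h * k + k * k <= limit:
            n = h * h + h * k + k * k
            if n > 0:  # h = k = 0 does not describe a capsid
                values.add(n)
            k += 1
        h += 1
    return sorted(values)


def subunit_count(n):
    """Number of capsid subunits for triangulation number n (60 x N)."""
    return 60 * n


for n in triangulation_numbers(12):
    print("N =", n, "subunits =", subunit_count(n))
```

Values such as 2, 5 or 6 never appear, which is why the list of observed triangulation numbers has gaps.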

To illustrate this consider a particle with 180 subunits. Protein subunits are not spaced independently but cluster because this maximizes the intermolecular interactions which stabilize the particle. 3 kinds of clustering are possible:

Icosahedral viruses

In polio they cluster at the centre of the triangle, giving rise to 60 morphological structures or capsomers composed of trimers. In turnip crinkle virus they cluster at the centre of the edges, giving 90 capsomers composed of dimers. In turnip yellow mosaic virus they cluster at the vertices of the triangles, giving 20 hexamers and 12 pentamers, i.e. 32 capsomers. One consequence of this clustering is that bonds between subunits within a capsomer are stronger than bonds between capsomers, which means that capsomers can be isolated for functional and structural studies.
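The capsomer numbers quoted for the three clustering patterns follow from simple counting on a 180-subunit particle. A short Python sketch of that arithmetic; the only outside fact used is that an icosahedron has 12 vertices, and the function name is illustrative:

```python
ICOSAHEDRON_VERTICES = 12  # the five-fold positions of an icosahedron


def capsomer_counts(subunits=180):
    """Count capsomers for the three ways of clustering the subunits."""
    trimers = subunits // 3           # clustering at the centre of each triangle
    dimers = subunits // 2            # clustering at the centre of each edge
    pentamers = ICOSAHEDRON_VERTICES  # one pentamer at each five-fold vertex
    hexamers = (subunits - 5 * pentamers) // 6  # the remaining subunits
    return {
        "trimers": trimers,
        "dimers": dimers,
        "pentamers": pentamers,
        "hexamers": hexamers,
        "pentamers_plus_hexamers": pentamers + hexamers,
    }


print(capsomer_counts())
```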

Molecular structure of an icosahedral capsid - structure of bean pod mottle virus (BPMV):

BPMV is a comovirus. It has a bipartite, single stranded RNA genome of positive sense. It is a T3 virus. Many T3 viruses have evolved from a common ancestor and have a similar structure. In a T3 virus there are 3 subunits in each of 60 triangular units, 180 in total. The amino acid sequence of a protein defines its secondary and tertiary structure. In this case each unit is composed of 3 antiparallel beta-barrel domains, giving 180 in total. The beta-barrel is a tertiary structural motif formed by beta-pleated sheets and common to many proteins.

BPMV has 60 copies of each of two coat proteins: the S subunit with a MW of 22 kDa and the L subunit with a MW of 42 kDa. The S and L subunits are made as a polyprotein, C-B-A. A is cleaved off to give the S subunit, which has 1 beta-barrel domain. This leaves C-B, the L subunit, which has 2 beta-barrel domains. In total therefore there are 120 subunits but 180 barrel domains. Each domain is 180-190 amino acids long. Subunit structures have been determined by crystallography. In the L subunit the B and C domains are covalently linked, their interface stabilized by hydrophobic interactions. A helix in the S subunit (A domain) interacts with a helix in the B domain of the large subunit. The whole ensemble of the S and L subunits forms a wedge 5nm long, 1.7nm at the narrow end and 3nm at the wide end. In the diagram the large subunit is coloured in two shades of green, the small subunit is blue. The quaternary structure is roughly spherical with icosahedral symmetry.
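The subunit and domain counts for BPMV, and a rough consistency check on the quoted molecular weights, can be sketched in Python. The mean residue mass of 110 Da is an assumed rule-of-thumb value, not a figure from these notes, and 185 residues is simply the midpoint of the 180-190 range given above:

```python
COPIES_PER_PROTEIN = 60         # copies of S and of L per particle (from the text)
DOMAINS = {"S": 1, "L": 2}      # beta-barrel domains per subunit (from the text)
QUOTED_MW_KDA = {"S": 22, "L": 42}

RESIDUES_PER_DOMAIN = 185       # midpoint of the quoted 180-190 amino acids
MEAN_RESIDUE_MASS_DA = 110      # assumed rule of thumb, not from the text


def total_subunits():
    """Total coat protein molecules per particle."""
    return COPIES_PER_PROTEIN * len(DOMAINS)


def total_domains():
    """Total beta-barrel domains per particle."""
    return COPIES_PER_PROTEIN * sum(DOMAINS.values())


def estimated_mw_kda(subunit):
    """Rough molecular weight of a subunit estimated from its domain count."""
    return DOMAINS[subunit] * RESIDUES_PER_DOMAIN * MEAN_RESIDUE_MASS_DA / 1000


print(total_subunits(), "subunits,", total_domains(), "domains")
for name in ("S", "L"):
    print(name, "estimated", estimated_mw_kda(name), "kDa, quoted", QUOTED_MW_KDA[name], "kDa")
```

The estimates come out close to, though slightly below, the quoted 22 and 42 kDa, which is about as much agreement as such a crude estimate can be expected to give.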

Filamentous Viruses

Many viruses when examined by electron microscopy are found to be rod shaped, e.g. filoviruses are about 80nm wide and up to 14000nm long. Many plant viruses are also filamentous. Their exact length often depends on the length of the genome, but 300-500nm is typical; their diameter is usually 15-20nm. Tobacco mosaic virus (TMV) is a particularly well understood example of this type of virus:

TMV

Protein subunits can be placed around the circumference of a circle to form a disc. If the discs are stacked, then a tube is created with room for the nucleic acid genome in the middle. A closer examination of these virus structures shows that the coat proteins are not arranged cylindrically but helically. This is because of the propensity for nucleic acids to adopt helical structures. By arranging the protein subunits helically, equivalent bonds between the proteins and the nucleic acid can be made, except for the subunits at the two ends. All known filamentous viruses are helical. The structure of TMV can be described in terms of the number of subunits per turn of the helix, i.e. 16.3; the pitch or rise per turn of the helix, i.e. 2.28nm; and the axial rise per subunit, i.e. 0.14nm. It is possible that the nucleoprotein filament of HIV has a similar structure.
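The three helix parameters quoted for TMV are not independent: the axial rise per subunit is the pitch divided by the number of subunits per turn. A small Python sketch of that relationship, which also estimates the number of coat protein subunits in a rod; the 300nm rod length is an assumption for illustration (the lower end of the typical range quoted earlier), and the function names are illustrative:

```python
SUBUNITS_PER_TURN = 16.3  # from the text
PITCH_NM = 2.28           # rise per turn of the helix, from the text


def axial_rise_per_subunit(pitch_nm=PITCH_NM, subunits_per_turn=SUBUNITS_PER_TURN):
    """Distance travelled along the helix axis per subunit, in nm."""
    return pitch_nm / subunits_per_turn


def subunits_in_rod(length_nm, pitch_nm=PITCH_NM, subunits_per_turn=SUBUNITS_PER_TURN):
    """Approximate number of subunits in a helical rod of the given length."""
    turns = length_nm / pitch_nm
    return round(turns * subunits_per_turn)


print("axial rise per subunit (nm):", round(axial_rise_per_subunit(), 2))
print("subunits in an assumed 300nm rod:", subunits_in_rod(300))
```

A rod of this length therefore contains on the order of two thousand copies of the coat protein, all making essentially the same contacts.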

Rhabdoviruses (e.g. the rabies virus) also have a helical nucleofilament similar to that described here, but they are also enveloped and have a matrix layer like HIV.

Complex Virus Structures

Many viruses have a more complicated structure than that described here, although they are often made up of units which have either icosahedral or helical symmetry. Well known examples are the tailed bacteriophages such as T4. The head of these viruses is icosahedral, often with a triangulation number of 7 (the head of T4 itself is an elongated icosahedron). This is attached by a collar to a contractile tail with helical symmetry.

Concluding Remarks.

Why bother to encapsidate the genome?

Why is subunit construction common to all viruses?

Necessity.

A triplet codon has a MW approximating to 1000 and codes for an amino acid of average MW 150. So at best a nucleic acid can only code for 15% of its weight as protein. As viruses are composed of 50-90% protein by weight, there must be more than one copy of the protein in each particle, and subunit construction is essential.
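The necessity argument can be turned into numbers. The Python sketch below uses the figures quoted above and makes the best-case assumption that the whole genome codes for a single coat protein; it then asks how many copies of that protein are needed to reach a given protein content by weight. The function names are illustrative, and exact fractions are used to avoid rounding artefacts:

```python
import math
from fractions import Fraction

CODON_MW = 1000       # approximate MW of a triplet codon (from the text)
AMINO_ACID_MW = 150   # average MW of an amino acid (from the text)


def coding_ratio():
    """Maximum weight of protein encoded per unit weight of nucleic acid."""
    return Fraction(AMINO_ACID_MW, CODON_MW)


def min_protein_copies(protein_percent):
    """Fewest copies of the encoded protein needed to reach protein_percent by weight.

    Best-case assumption: the entire genome codes for one protein.
    """
    protein_per_genome_weight = Fraction(protein_percent, 100 - protein_percent)
    return math.ceil(protein_per_genome_weight / coding_ratio())


print("coding ratio:", float(coding_ratio()))
for percent in (50, 90):
    print(percent, "% protein needs at least", min_protein_copies(percent), "copies")
```

Even at the low end of the range several copies are required, and a real genome that also codes for non-structural proteins would need more.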

Self Assembly.

In seminal experiments in 1955, Fraenkel-Conrat and Williams showed that TMV formed spontaneously when mixtures of purified coat protein and its genomic RNA were incubated together. This means that the structure that TMV adopts is self-ordered and therefore corresponds to a free energy minimum. Incorporating multiple copies of one or a very few different subunits is presumably an easy way to accomplish this.

Fidelity.

DNA, RNA and protein synthesis are all subject to occasional error. Using a smaller protein, and hence a smaller gene, means there is less chance of an error occurring.
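The fidelity argument can be illustrated with a simple model in which each residue is made incorrectly with the same small, independent probability. Both the independence assumption and the error rate used below are hypothetical, chosen only to show the trend; they are not measured values from these notes:

```python
def error_free_probability(residues, error_rate_per_residue):
    """Chance that a chain of the given length contains no errors at all.

    Assumes every residue fails independently with the same probability.
    """
    return (1.0 - error_rate_per_residue) ** residues


ERROR_RATE = 1e-4  # hypothetical per-residue error rate, for illustration only

for residues in (200, 2000, 20000):
    print(residues, "residues:", round(error_free_probability(residues, ERROR_RATE), 3))
```

Under this model the chance of an error-free product falls steadily as the chain gets longer, so many small subunits are more likely to be made correctly than one very large protein.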

Economy.

The correct structure can be formed with the minimum of waste, since if a subunit is synthesized or folded incorrectly then only a small unit has to be discarded.

Complexity.

There are physical constraints which prevent the tight packing of, say, octahedra or tetrahedra. Put crudely, the holes between the subunits would be too big and the particle too leaky. A small number of contacts might also be insufficient for stability. The larger the number of subunits, the more stable the virus becomes, the larger the virus particle can be, and the bigger and more complex its genome can be.


Reference: Cann, A.J. Principles of Molecular Virology, 2nd Edition. Academic Press, 1997. Chapter 2.




© Dr Shaun Heaphy.