Phylogeny and Antigenicity


Microbial mutation and the host immune response to it form an iterative process: 1) the pathogens mutate into new strains to evade the host immune system; 2) the host adaptive immunity changes to detect the new strains; 3) the microbes mutate again to thwart the new host defenses, returning to the first step.

Therefore, the F^P/A-PM developed here approaches the problem from those two opposite ends: 1) Phylogeny, the virology of the forward-moving virus mutations, and 2) Antigenicity, the adaptive immunology opposing them.

Mutation History of H1N1


H1N1 was quiescent prior to the 2009 H1N1 swine flu pandemic. The then-current vaccine, A/Brisbane/59/2007, matched the prevalent strains well (> 95%). The antigenic shift during the pandemic left 439 of the 566 residues (about 77.6%) incompatible with the vaccine, quickly making it obsolete.
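
The mismatch figure quoted above is a per-residue comparison between the vaccine strain and a circulating strain. A minimal sketch of that arithmetic, assuming the two HA protein sequences have already been aligned to equal length. The function name `mismatch_fraction` and the toy fragments are illustrative only and are not taken from the F^P/A-PM code:

```python
def mismatch_fraction(vaccine: str, strain: str) -> float:
    """Fraction of aligned positions where the strain differs from the vaccine."""
    if len(vaccine) != len(strain):
        raise ValueError("sequences must be aligned to the same length")
    if not vaccine:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for v, s in zip(vaccine, strain) if v != s)
    return mismatches / len(vaccine)


# Toy aligned fragments (hypothetical, not real HA sequences).
vaccine_fragment = "MKAILVVLLY"
strain_fragment = "MKVKLLVLLC"
toy = mismatch_fraction(vaccine_fragment, strain_fragment)

# The same ratio using the counts reported in the text:
# 439 incompatible residues out of 566.
reported = 439 / 566

print(f"toy mismatch: {toy:.1%}")
print(f"reported mismatch: {reported:.1%}")
```

A plain position-by-position count treats every substitution equally. Whether the model weights residues differently (for example, by epitope site) is not stated here, so the sketch does not attempt it.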

The virus has stayed in a metastable state since then, as seen in the graph above. The current vaccine, A/Michigan/45/2015, is still effective.

Mutation History of H3N2

In comparison with H1N1, which is relatively stable, H3N2 has been mutating aggressively and erratically. The strain entered the human population during the 1968 H3N2 Hong Kong pandemic. The vaccinologists are still struggling with it 50 years later, changing the vaccine frequently, with little success.

H3N2 is the most prevalent and virulent strain, accounting for about 78% of the reported cases. It is also responsible for this year's pandemic. The reverse vaccinology analysis shows that a new vaccine is needed for the 2018-19 season.

Data and Tools

The F^P/A-PM engine is currently analyzing 262,226,616 nucleotides from the hemagglutinin (HA) segments of 172,230 strains. New data is added as labs worldwide make it available. The data spans the last century (1917-2018) and all continents.
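
Totals like these can be tallied directly from sequence files. The sketch below assumes the HA segments are stored in FASTA format with one record per strain. The helper names `read_fasta` and `dataset_size` and the two example records are hypothetical, and the parser is a deliberately small stand-in for a full library reader:

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dataset_size(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of strains, total number of nucleotides)."""
    strains = 0
    nucleotides = 0
    for _, sequence in read_fasta(lines):
        strains += 1
        nucleotides += len(sequence)
    return strains, nucleotides


# Two made-up records, only to show the input shape.
example = """>A/Example/1/2009
ATGAAGGCA
ATACTA
>A/Example/2/2015
ATGAAG
""".splitlines()

n_strains, n_nucleotides = dataset_size(example)
print(n_strains, n_nucleotides)
```

For a real dataset the same function can be fed an open file handle instead of the in-memory example, since it only iterates over lines.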

The tools comprise custom-developed code in C/C++ and Python, using Biopython together with libraries such as NumPy and scikit-learn. The source code has been augmented with online bioinformatics tools such as RAxML, Clustal Omega and MEME.

Fourier-enhanced Phylogeny/Antigenicity Predictive Model (F^P/A-PM)

F^P/A-PM is the core processing engine of this research. It is a mathematical model based on reverse vaccinology. It uses biostatistics and calculus to model the phylogeny and the antigenicity of the prevalent influenza strains.

The phylogeny component models the mutations in the influenza genome over spatial and temporal domains. The antigenicity component computes the avidity of the vaccine-induced antibodies for the infecting strains.
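
The model's actual formulation is not given here. Purely to illustrate the temporal side of the phylogeny component, the sketch below tracks how far sampled strains drift from a reference sequence year by year, using a mean per-position mismatch fraction. The function names, the reference, and the sample data are all hypothetical, and the plain mismatch count is an assumed stand-in for whatever distance the engine really uses:

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two aligned sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def divergence_by_year(
    reference: str, samples: Iterable[Tuple[int, str]]
) -> Dict[int, float]:
    """Mean mismatch fraction against the reference, grouped by sampling year."""
    per_year = defaultdict(list)
    for year, sequence in samples:
        per_year[year].append(hamming_fraction(reference, sequence))
    return {
        year: sum(values) / len(values)
        for year, values in sorted(per_year.items())
    }


# Made-up aligned nucleotide fragments with their sampling years.
reference = "ATGAAGGCAA"
samples = [
    (2008, "ATGAAGGCAA"),
    (2009, "ATGTAGGCAT"),
    (2009, "ATGTAGGCAA"),
    (2010, "ACGTAGGCTT"),
]

trend = divergence_by_year(reference, samples)
for year, value in trend.items():
    print(year, f"{value:.2f}")
```

A flat trend would correspond to the stable behavior described for H1N1 after 2009, and a steadily rising or erratic one to the behavior described for H3N2. Adding a region key alongside the year would extend the same grouping to the spatial domain.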

Together, the computed phylogenetic and antigenic vectors 1) optimize the design of a new vaccine and 2) quantify the effectiveness of an existing vaccine.


This project is a work in progress. It uses the weekly influenza data published by the World Health Organization (WHO) and the Centers for Disease Control and Prevention (CDC). Please revisit for updates.