Protein families, Orthologs and Paralogs from Phytozome.
Current data
Phytozome Families
All proteins from the Phytozome database are grouped into a hierarchical structure of related polypeptide sequences. A family consisting of more than one protein also has a computed multiple sequence alignment (MSA) and a computed centroid sequence.
Ortholog/Paralog Data
Phytozome has compiled a collection of orthologs and paralogs for the genes in our database.

Ortholog calls were generated using InParanoid 4.1. InParanoid was run on the proteome sequences of every possible organism-organism pair in Phytozome, using its default two-pass BLAST strategy. Genes from different species that fell in the same InParanoid ortholog cluster were considered orthologs.
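The cluster rule above can be illustrated with a minimal Python sketch. It is not Phytozome's pipeline code and does not read InParanoid's actual output format: the function name `ortholog_pairs`, the dictionary layout, and the species and gene names are all hypothetical, chosen only to show how cross-species pairs follow from cluster membership.

```python
from itertools import combinations


def ortholog_pairs(clusters):
    """Return the set of gene pairs that share a cluster but come from different species.

    `clusters` maps a cluster id to a list of (species, gene) tuples.
    This layout is an assumption for illustration, not InParanoid's file format.
    """
    pairs = set()
    for members in clusters.values():
        for (sp_a, gene_a), (sp_b, gene_b) in combinations(members, 2):
            # Only genes from different species count as orthologs here.
            if sp_a != sp_b:
                pairs.add(frozenset((gene_a, gene_b)))
    return pairs


# Hypothetical clusters for one organism-organism comparison.
clusters = {
    "cluster_1": [("speciesA", "geneA1"), ("speciesA", "geneA2"), ("speciesB", "geneB1")],
    "cluster_2": [("speciesA", "geneA3"), ("speciesB", "geneB2")],
}

pairs = ortholog_pairs(clusters)
for pair in sorted(tuple(sorted(p)) for p in pairs):
    print(pair)
```

In this sketch geneA1 and geneA2 share a cluster but belong to the same species, so they are not reported as an ortholog pair.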

The scale of the homolog data makes it impractical to store in our InterMine database. To find homologs of a gene or set of genes, use our BioMart interface instead.

Bulk download
Bulk data files for all organisms in Phytozome are available for download from the JGI Download Portal.
