Variant Filtration SOPs

This chapter contains SOPs directly related to the filtration, prioritization, and interpretation of variants. The first SOPs cover the filtration of variants for singleton and trio exomes under various modes of inheritance. Other case structures (e.g., siblings or only one parent present) can be handled with adjusted trio SOPs. These are followed by SOPs for assessing variants for pathogenicity and suitability as candidate variants.

SOP: Filtering Singletons for Autosomal Variants

Aims and Scope

The aim of this SOP is the filtration of singleton data for variants on the autosomal chromosomes. Depending on the hypothesis on the mode of inheritance the steps differ slightly. Alternative actions are given for de novo, dominant, homozygous recessive, and compound recessive variants.

Filtration for variants on the X chromosome is described in SOP: Filtering Singletons for X-chromosomal Variants. The evaluation of variants is described in SOP: Variant Assessment; the use of phenotype and pathogenicity scores is described in SOP: Prioritization with Phenotype and Pathogenicity Scores.

Result

The result is a list of variants that are compatible with the hypothesized mode of inheritance and have an appropriate population frequency. These can then be assessed as described in SOP: Variant Assessment. A typical WES data set yields the following variant counts (numbers will vary depending on the enrichment kit):

de novo    dominant    hom. rec.    comp. rec.
0-80       100-500     0-30         TODO

Steps

  1. Use the Load Preset button to load filter presets (according to the table below and your mode of inheritance).

  2. Configure the Genotype according to the table below.

    setting     de novo    dominant    hom. rec.    comp. rec.
    presets     De Novo    Strict      Recessive    Recessive
    genotype    0/1        0/1         1/1          c/h index

    • For compound recessive mode of inheritance, selecting “c/h index” as mode of inheritance for the child enables the comp. het. mode.

  3. Click Filter & Display.

  4. Compare the resulting variant count with the numbers from the table above. Also check that all query result records are displayed [1].

  5. Handle unexpectedly high or low numbers of variants.

    • In case of too few variants try relaxing the Quality settings, e.g., by setting DP het. to 8 and min AAB to 0.2.

    • Try adjusting the Frequency settings (keep in mind incidence rates of the case’s disorder).

    • The presets Relaxed and Super Strict can be used for non-recessive modes of inheritance to adjust multiple thresholds at once.

[1] Check the First N of M records label above the results table; if needed, adjust the Result row limit setting in the More … ‣ Miscellaneous tab.
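The comparison in step 4 can be sketched in a few lines of Python. This is only an illustration and not part of VarFish: the function names are hypothetical, the ranges are copied from the result table of this SOP, and the compound recessive range is left undefined because the table still lists it as TODO.

```python
# Expected variant counts for singleton autosomal filtration, copied from the
# result table of this SOP. None marks a range that the SOP still lists as TODO.
EXPECTED_COUNTS = {
    "de novo": (0, 80),
    "dominant": (100, 500),
    "hom. rec.": (0, 30),
    "comp. rec.": None,
}


def check_variant_count(mode, count):
    """Return 'ok', 'too few', 'too many', or 'unknown' for a result count."""
    expected = EXPECTED_COUNTS[mode]
    if expected is None:
        return "unknown"
    low, high = expected
    if count < low:
        return "too few"
    if count > high:
        return "too many"
    return "ok"


def all_records_displayed(first_n, of_m):
    """Mirror the 'First N of M records' check: True if nothing was cut off."""
    return first_n >= of_m
```

A result of "too few" or "too many" leads to step 5; a False value from all_records_displayed means the Result row limit should be raised before the counts are compared.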

Thresholds

SOP: Filtering Singletons for X-chromosomal Variants

Aims and Scope

The aim of this SOP is the filtration of singleton data for variants on the X chromosome. Depending on the hypothesis on the mode of inheritance the steps differ slightly. Alternative actions are given for de novo, dominant, homozygous recessive, and compound recessive variants.

Filtration for variants on the autosomes is described in SOP: Filtering Singletons for Autosomal Variants. The evaluation of variants is described in SOP: Variant Assessment; the use of phenotype and pathogenicity scores is described in SOP: Prioritization with Phenotype and Pathogenicity Scores.

Result

The result is a list of variants that are compatible with the hypothesized mode of inheritance and have an appropriate population frequency. These can then be assessed as described in SOP: Variant Assessment. A typical WES data set yields the following variant counts (numbers will vary depending on the enrichment kit):

X de novo    X dominant    X hom. rec.    X comp. rec.
TODO         TODO          TODO           TODO

Steps

Note

The following needs work by a geneticist, also in terms of practicability.

  1. Use the Load Preset button to load filter presets (according to the table below and your mode of inheritance).

  2. Configure the Genotype according to the table below.

    setting         X de novo    X dominant    X hom. rec.    X comp. rec.
    presets         De Novo      Strict        Recessive      Recessive
    genotype (M)    1/1          1/1           N/A            N/A
    genotype (F)    0/1          0/1           1/1            c/h index

    • The genotype of the index is chosen based on its sex (male M, female F).

    • For compound recessive mode of inheritance, selecting “c/h index” as mode of inheritance for the daughter enables the comp. het. mode.

  3. Enter chrX into the field Gene Lists & Regions ‣ Genomic Region.

  4. Click Filter & Display.

  5. Compare the resulting variant count with the numbers from the table above. Also check that all query result records are displayed [1].

  6. Handle unexpectedly high or low numbers of variants.

    • In case of too few variants try relaxing the Quality settings, e.g., by setting DP het. to 8 and min AAB to 0.2.

    • Try adjusting the Frequency settings (keep in mind incidence rates of the case’s disorder).

    • The presets Relaxed and Super Strict can be used for non-recessive modes of inheritance to adjust multiple thresholds at once.
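To illustrate what relaxing the Quality settings does, the following Python sketch shows a hypothetical quality check for a heterozygous call. It assumes that DP het. is a minimum coverage and that min AAB is a minimum alternate allele balance that is applied symmetrically; the actual VarFish implementation may differ, and the function name is made up for this example.

```python
def passes_het_quality(depth, alt_reads, min_dp_het, min_aab):
    """Hypothetical quality check for a heterozygous call.

    Assumption: the call passes if the coverage reaches min_dp_het and the
    fraction of alternative reads lies within [min_aab, 1 - min_aab].
    """
    if depth == 0 or depth < min_dp_het:
        return False
    aab = alt_reads / depth  # alternate allele balance
    return min_aab <= aab <= 1.0 - min_aab
```

Under these assumptions, a call with 9 reads of which 2 support the variant passes with DP het. set to 8 and min AAB set to 0.2, but it would be removed by any DP het. threshold above 9.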

Thresholds

SOP: Filtering Trios for Autosomal Variants

Aims and Scope

The aim of this SOP is the filtration of trio data for variants on the autosomal chromosomes. Depending on the hypothesis on the mode of inheritance the steps differ slightly. Alternative actions are given for de novo, dominant, homozygous recessive, and compound recessive variants.

Filtration for variants on the X chromosome is described in SOP: Filtering Trios for X-chromosomal Variants. The evaluation of variants is described in SOP: Variant Assessment; the use of phenotype and pathogenicity scores is described in SOP: Prioritization with Phenotype and Pathogenicity Scores.

Result

The result is a list of variants that are compatible with the hypothesized mode of inheritance and have an appropriate population frequency. These can then be assessed as described in SOP: Variant Assessment. A typical WES data set yields the following variant counts (numbers will vary depending on the enrichment kit):

de novo    dominant    hom. rec.    comp. rec.
0-3        50-150      2-75         2-20

Steps

  1. Use the Load Preset button to load filter presets (according to the table below and your mode of inheritance).

  2. Configure the Genotype according to the table below.

    setting               de novo     dominant    hom. rec.    comp. rec.
    presets               Strict      Strict      Recessive    Recessive
    genotype (index)      0/1         0/1         1/1          c/h index
    genotype (parents)    0/0, 0/0    0/0, 0/1    0/1, 0/1

    • For dominant mode of inheritance, set the genotypes of the affected parent to 0/1 and the unaffected parent to 0/0.

    • For compound recessive mode of inheritance, selecting “c/h index” as mode of inheritance for the child enables the comp. het. mode, and the parents’ genotype does not have to be selected.

  3. Click Filter & Display.

  4. Compare the resulting variant count with the numbers from the table above. Also check that all query result records are displayed [1].

  5. Handle unexpectedly high or low numbers of variants.

    • Too many de novo and too few variants in the other modes of inheritance can be an indicator of issues with the sample relatedness (cf. SOP: Quality Control).

    • In case of too few variants try relaxing the Quality settings, e.g., by setting DP het. to 8 and min AAB to 0.2. In the case of too few de novo variants, try setting the max AD setting of the parents to 2.

    • Try adjusting the Frequency settings (keep in mind incidence rates of the case’s disorder).

    • The presets Relaxed and Super Strict can be used for non-recessive modes of inheritance to adjust multiple thresholds at once.
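The genotype patterns from the table in step 2 can be written down as a small lookup, which may help when checking individual calls by hand. The Python sketch below is an illustration only, with hypothetical names. It covers the single-variant modes; compound recessive is left out because it depends on pairs of variants in the same gene rather than on one genotype pattern.

```python
# Genotype patterns from the trio table: mode -> (index, parents).
# The parents are stored sorted so that the order mother/father does not matter.
TRIO_PATTERNS = {
    "de novo": ("0/1", ("0/0", "0/0")),
    "dominant": ("0/1", ("0/0", "0/1")),
    "hom. rec.": ("1/1", ("0/1", "0/1")),
}


def compatible_modes(index_gt, mother_gt, father_gt):
    """Return the modes of inheritance whose pattern matches the trio genotypes."""
    parents = tuple(sorted((mother_gt, father_gt)))
    return [
        mode
        for mode, (idx, par) in TRIO_PATTERNS.items()
        if idx == index_gt and par == parents
    ]
```

For example, compatible_modes("0/1", "0/0", "0/0") yields ["de novo"], while a homozygous index with one homozygous reference parent matches none of the patterns.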

Thresholds

TODO

SOP: Filtering Trios for X-chromosomal Variants

Aims and Scope

The aim of this SOP is the filtration of trio data for variants on the X chromosome. Depending on the hypothesis on the mode of inheritance the steps differ slightly. Alternative actions are given for X-linked de novo, dominant, and recessive variants.

Filtration for variants on the autosomes is described in SOP: Filtering Trios for Autosomal Variants. The evaluation of variants is described in SOP: Variant Assessment; the use of phenotype and pathogenicity scores is described in SOP: Prioritization with Phenotype and Pathogenicity Scores.

Result

The result is a list of variants that are compatible with the hypothesized mode of inheritance and have an appropriate population frequency. These can then be assessed as described in SOP: Variant Assessment. A typical WES data set yields the following variant counts (numbers will vary depending on the enrichment kit):

X de novo    X dominant    X hom. rec.    X comp. rec.
TODO         TODO          TODO           TODO

Steps

Note

The following needs work by a geneticist, also in terms of practicability.

  1. Use the Load Preset button to load filter presets (according to the table below and your mode of inheritance).

  2. Configure the Genotype according to the table below.

    setting                X de novo    X dominant    X hom. rec.    X comp. rec.
    presets                Strict       Strict        Recessive      Recessive
    genotype (index, M)    1/1          1/1           N/A            c/h index
    genotype (index, F)    0/1          0/1           1/1            c/h index
    genotype (mother)      0/0          0/1 or 0/0    0/1
    genotype (father)      0/0          1/1 or 0/0    1/1

    • The genotype of the index is chosen based on its sex (male M, female F).

    • For dominant mode of inheritance, set the genotypes of the affected parent to variant (0/1 or 1/1 according to the table) and of the unaffected to 0/0.

    • For compound recessive mode of inheritance, selecting “c/h index” as mode of inheritance for the child enables the comp. het. mode, and the parents’ genotype does not have to be selected.

  3. Enter chrX into the field Gene Lists & Regions ‣ Genomic Region.

  4. Click Filter & Display.

  5. Compare the resulting variant count with the numbers from the table above. Also check that all query result records are displayed (check the First N of M records label above the results table; if needed, adjust the Result row limit setting in the More … ‣ Miscellaneous tab).

  6. Handle unexpectedly high or low numbers of variants.

    • Too many de novo and too few variants in the other modes of inheritance can be an indicator of issues with the sample relatedness (cf. SOP: Quality Control).

    • In case of too few variants try relaxing the Quality settings, e.g., by setting DP het. to 8 and min AAB to 0.2. In the case of too few de novo variants, try setting the max AD setting of the parents to 2.

    • Try adjusting the Frequency settings (keep in mind incidence rates of the case’s disorder).

    • The presets Relaxed and Super Strict can be used for non-recessive modes of inheritance to adjust multiple thresholds at once.
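Because the index genotype in step 2 depends on both the sex of the index and the mode of inheritance, it can be convenient to keep the table as a lookup. The following Python sketch simply transcribes the index rows of the trio table of this SOP; the names are hypothetical and nothing here is a VarFish API.

```python
# Index genotype settings for X-chromosomal trio filtration, transcribed from
# the index rows of the genotype table (M = male index, F = female index).
X_INDEX_GENOTYPE = {
    "M": {
        "X de novo": "1/1",
        "X dominant": "1/1",
        "X hom. rec.": "N/A",
        "X comp. rec.": "c/h index",
    },
    "F": {
        "X de novo": "0/1",
        "X dominant": "0/1",
        "X hom. rec.": "1/1",
        "X comp. rec.": "c/h index",
    },
}


def index_genotype(sex, mode):
    """Look up the genotype setting for the index by sex and mode of inheritance."""
    return X_INDEX_GENOTYPE[sex][mode]
```

An unknown sex or mode raises a KeyError in this sketch, which avoids silently falling back to a wrong setting.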

Thresholds

SOP: Prioritization with Phenotype and Pathogenicity Scores

Aims and Scope

The aim of this SOP is to use scores for prioritizing a list of candidate variants. Phenotype scores can be used for ranking variants by their affected gene’s match to the patient’s phenotypes. Pathogenicity scores can be used for estimating the impact of a variant.

The filtration of variants is described in the SOPs above. For guidelines on interpreting the scores see SOP: Phenotype Score Interpretation and SOP: Pathogenicity Score Interpretation.

Result

The result is a list of variants annotated with phenotype and/or pathogenicity scores that can be used for sorting and ranking variants. Further, by putting thresholds on the largest rank to consider or thresholds on the scores, the list of variants to be assessed can be shortened.

Steps

  1. Open the More … ‣ Prioritization tab.

  2. For using phenotype-based prioritization

    • tick the Enable phenotype-based prioritization box,

    • select an appropriate prioritization algorithm, and

    • enter (or paste) the HPO terms into the HPO Terms field.

  3. For using variant pathogenicity prioritization

    • tick the Enable variant pathogenicity-based prioritization box, and

    • select the scoring method [2] to use.

  4. Click Filter & Display to trigger the filtration.

    • Also check that all query result records are displayed [1]. The row limit is applied before the variants are sent for prioritization. If the limit N is smaller than the query result size, you will therefore not see the N top-ranking records but a ranking of an arbitrary selection of N records.

  5. Click on the score and rank heading below the phenotype, pathogenicity, and/or pheno. & patho. columns to sort the table by phenotype, pathogenicity, or a combination of both scores.

  6. Consider the top variants by one of the sorting methods from above, stop based on the rank or score:

    • Rank: Consider the top N (e.g., N = 20) variants only.

      • If you are in a time-limited setting, you should pick the number N in advance of your study to get reproducible results in terms of diagnostic yield.

    • Score: (Note that the distribution of the different scores varies significantly).

      • Consider the top-scoring variants until the score drops by a factor of 2 from one variant to the next.

      • Consider the top-scoring variants until the score drops below a threshold T.

See SOP: Phenotype Score Interpretation and SOP: Pathogenicity Score Interpretation for more information on score interpretation.
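The three stopping rules from step 6 can be sketched in Python as follows. The sketch assumes positive scores where a higher score means a better candidate; the function names and the example values are hypothetical and only illustrate the rules.

```python
def top_by_rank(scores, n):
    """Rank rule: keep only the n highest-scoring entries."""
    return sorted(scores, reverse=True)[:n]


def top_until_drop(scores, factor=2.0):
    """Score rule: stop at the first score that dropped by `factor` or more
    compared to the previous (higher) score."""
    ranked = sorted(scores, reverse=True)
    kept = ranked[:1]
    for prev, cur in zip(ranked, ranked[1:]):
        if cur * factor <= prev:
            break
        kept.append(cur)
    return kept


def top_above_threshold(scores, threshold):
    """Score rule: keep all entries whose score is at least the threshold T."""
    return [s for s in sorted(scores, reverse=True) if s >= threshold]
```

With the example scores 0.9, 0.8, 0.35, 0.3, and 0.1, the factor-of-2 rule stops after 0.8 because 0.35 is less than half of 0.8, whereas a threshold of 0.3 keeps the first four entries.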

[2] To use the UMD Predictor score, you have to obtain an API token from https://umd-predictor.eu/ and enter it in your VarFish user profile. You can reach the user profile by clicking on the person icon on the top left, then User Profile ‣ Settings ‣ Update ‣ UMD Predictor API Token. Note that UMD Predictor can only score SNVs.

Thresholds

SOP: Variant Assessment

Aims and Scope

This SOP describes how to assess variants with the information integrated into VarFish. Clicking the little “>” on the left of the result table folds out the details of the given variant.

Result

The result is a better understanding of the variant and gene.

Steps

Note

The following needs refinement. Actually, it does not read like an SOP but rather like an extended manual.

  1. Consider the Gene information box.

    • The Name, Gene Family, and NCBI Summary give a first impression of the gene, its molecular function, and its implication in diseases. Genes with a missing or very short NCBI Summary are often not well characterized, and such genes are hard to link to diseases.

    • ClinVar for Gene gives the number of pathogenic and likely pathogenic variants in the gene and shows how often the gene has been implicated in disease in ClinVar.

    • HPO Terms displays all HPO terms associated with a gene and, if present, the annotated modes of inheritance of diseases linked to this gene.

    • OMIM Phenotypes gives the OMIM diseases linked to the gene.

    • Gene RIFs displays short “reference into function” notes on PubMed articles that report on the gene.

    • Constraints shows gene constraint scores from ExAC and gnomAD for this gene.

    • The remaining fields provide link-outs into NCBI Entrez, ENSEMBL, and OMIM.

  2. The ClinVar for Variant table shows ClinVar annotations for the given variant, if any.

  3. The Frequency Details table provides detailed information about the frequency of the variant in different populations given in the different population databases.

  4. The Transcript Information table shows the impact of the variant on all transcripts of the gene.

  5. The Genotype and Call Infos box provides detailed information about the variant call.

  6. The UCSC 100 Vertebrate Conservation box shows the alignment of the corresponding amino acid in the UCSC 100 vertebrate alignment (the evolutionary distance to human decreases from left to right), if available. This information can be used to get a feeling for how conserved the location is in the gene.