This section describes how to configure the VarFish server.
When running with the varfish-docker-compose files and the provided database files, VarFish comes preconfigured with sensible default settings and also contains some example datasets to try out.
There are a few things that you might want to tweak.
Please note that there might be more settings that you can change when exploring the VarFish source code, but right now their use is not supported for external users.
VarFish & Docker Compose
The recommended (and supported) way to deploy VarFish is using Docker Compose.
The VarFish server and its components are not installed on the system itself; rather, a number of Docker containers with fixed Docker images are run and work together.
The provided docker-compose.yml file starts a fully functional VarFish server.
Docker Compose supports using so-called override files.
Basically, the mechanism works by providing a docker-compose.override.yml file that is automatically read at startup when running docker-compose commands.
This file is listed in the .gitignore, so it is not part of the varfish-docker-compose repository but rather created in the checkouts (e.g., manually or using a configuration management tool such as Ansible).
On startup, Docker Compose will first read the base docker-compose.yml file.
It will then read the override file (if it exists) and recursively merge both YAML files, with the override file taking precedence over the base file.
Note that the recursive merging is done on YAML dicts only; lists will be overwritten.
The mechanism is described in detail in the official documentation.
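As a minimal illustration of the merge behaviour (the service name is real, but the image tag and environment variable are examples only):

```yaml
# docker-compose.yml (base file):
#
# services:
#   varfish-web:
#     image: example/varfish-web:latest
#     environment:
#       - PROJECTROLES_SEND_EMAIL=0
#
# docker-compose.override.yml:
services:
  varfish-web:
    environment:
      - PROJECTROLES_SEND_EMAIL=1
```

The services dict is merged recursively, so the image setting is kept from the base file; the environment entry, however, is a list and is replaced wholesale by the override.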
We provide the following files that you can use/combine into the local docker-compose.override.yml file of your installation.

- docker-compose.override.yml-cert – use TLS encryption with your own certificate from your favourite certificate provider (by default, an automatically generated self-signed certificate will be used by traefik, the reverse proxy).
- docker-compose.override.yml-letsencrypt – use letsencrypt to obtain a certificate.
- docker-compose.override.yml-cadd – spawn Docker containers that allow pathogenicity annotation of your variants with CADD.

The overall process is to copy any of the *.override.yml-* files to docker-compose.override.yml and adjust it to your needs (e.g., merging with another such file).
Note that you could also explicitly provide multiple override files, but we do not consider this further. For more information on the override mechanism, see the official documentation.
The following sections describe the possible adjustments with Docker Compose override files.
TLS / SSL Configuration
The varfish-docker-compose setup uses traefik as a reverse proxy, and it must be reconfigured if you want to change the default behaviour of using self-signed certificates.
Use the contents of docker-compose.override.yml-cert for providing your own certificate.
You have to put the server certificate and key into the files referenced there (e.g., server.crt and server.key) and then restart the traefik container.
Make sure to provide the full certificate chain if needed (e.g., for DFN-issued certificates).
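If your provider hands you the server certificate and the intermediate certificates separately, assembling the full chain can be sketched as follows. The file names are assumptions; use whatever paths your override file references, and note that placeholder contents stand in for the real PEM data here.

```shell
# Placeholder contents stand in for the real PEM data from your provider.
printf -- '-----BEGIN CERTIFICATE-----\nserver-cert\n-----END CERTIFICATE-----\n' > server.crt
printf -- '-----BEGIN CERTIFICATE-----\nintermediate-cert\n-----END CERTIFICATE-----\n' > intermediate.crt
# The server certificate must come first, followed by the intermediates.
cat server.crt intermediate.crt > fullchain.crt
```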
If your site is reachable from the internet, you can also use the contents of docker-compose.override.yml-letsencrypt, which will use [letsencrypt](https://letsencrypt.org/) to obtain the certificates.
Make sure to adjust the line with --certificatesresolvers.le.acme.email= to your email address.
Note well that if you make your site reachable from the internet, you should be aware of the implications.
VarFish is MIT-licensed software, which means that it comes “without any warranty of any kind”; see the LICENSE file for details.
After changing the configuration, restart the site (e.g., with docker-compose down && docker-compose up -d if it is running in detached mode).
LDAP Configuration

VarFish can be configured to use up to two upstream LDAP servers (e.g., OpenLDAP or Microsoft Active Directory).
For this, you have to set the following environment variables in the file .env in your varfish-docker-compose checkout and restart the site.
The variables are given with their default values.
- Enable the primary LDAP authentication server.
- URI for the primary LDAP server.
- Distinguished name (DN) to use for binding to the LDAP server.
- Password to use for binding to the LDAP server.
- DN to use for the search base.
- Domain to use for user names; e.g., with EXAMPLE, users from this domain can log in with user@EXAMPLE.
- Domain used for printing the user name.
If you have the first LDAP server configured, you can also enable and configure the second one.

- Enable the secondary LDAP authentication server.

The remaining variable names are derived from the ones of the primary server but use the prefix AUTH_LDAP2 instead of AUTH_LDAP.
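For illustration, a primary-server configuration in .env might look as follows. All variable names here follow sodar-core conventions and are assumptions; verify them against the documentation of your VarFish version before use.

```shell
# Illustrative .env fragment for the primary LDAP server; variable names
# are assumptions based on sodar-core conventions -- verify before use.
ENABLE_LDAP=1
AUTH_LDAP_SERVER_URI=ldap://ldap.example.com
AUTH_LDAP_BIND_DN="cn=binduser,dc=example,dc=com"
AUTH_LDAP_BIND_PASSWORD=changeme
AUTH_LDAP_USER_SEARCH_BASE="dc=example,dc=com"
AUTH_LDAP_USERNAME_DOMAIN=EXAMPLE
AUTH_LDAP_DOMAIN_PRINTABLE=example.com
```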
SAML Authentication

Besides LDAP configuration, it is also possible to authenticate with existing SAML 2.0 identity providers (e.g., Keycloak). Since VarFish is built on top of sodar-core, you can also refer to the sodar-core documentation for further help in configuring the identity providers.
To enable SAML authentication with your identity provider, a few steps are necessary. First, add a SAML client for your identity provider of choice. The sodar-core documentation features examples for Keycloak. Make sure you have assertion signing turned on and allow redirects to your VarFish site.
The SAML processing URL should be set to the externally visible address of your VarFish deployment.
Next, you need to obtain your metadata.xml as well as the signing certificate and key file from the identity provider. Make sure you convert these keys to standard OpenSSL format before starting your VarFish instance.
If you deploy VarFish without Docker, you can pass the file paths of your metadata.xml and key pair directly. Otherwise, make sure that you have put them into a single folder and added the corresponding folder to your docker-compose.yml (or add it in a docker-compose.override.yml), like in the following snippet.
```yaml
varfish-web:
  # ...
  volumes:
    - "/path/to/my/secrets:/secrets:ro"
```
Then, define at least the following variables in your docker-compose .env file (or the corresponding environment variables when running the server natively).
- [Default 0] Enable (1) or disable (0) SAML authentication.
- The SAML client ID set in the identity provider configuration (e.g., “varfish”).
- The externally visible URL of your VarFish deployment.
- The path to the metadata.xml file retrieved from your identity provider. If you deploy using Docker, this must be a path inside the container.
- The URL of your identity provider. In the case of Keycloak, it can look something like https://keycloak.example.com/auth/realms/<my_varfish_realm>.
- Path to the SAML signing key for the client.
- Path to the SAML certificate for the client.
- [Default /usr/bin/xmlsec1] Path to the xmlsec1 executable.
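A sketch of the corresponding .env entries follows. The variable names here are purely illustrative assumptions; take the exact names from the sodar-core documentation for your VarFish version.

```shell
# Purely illustrative .env fragment for SAML; all variable names are
# assumptions -- take the exact names from the sodar-core documentation.
ENABLE_SAML=1
SAML_CLIENT_NAME=varfish
SAML_METADATA_FILE=/secrets/metadata.xml
SAML_KEY_FILE=/secrets/sp.key
SAML_CERT_FILE=/secrets/sp.crt
SAML_XMLSEC1=/usr/bin/xmlsec1
```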
By default, the SAML attribute map is configured to work with Keycloak as the SAML authentication provider. If you are using a different identity provider or different settings, you also need to adjust the attribute map: a dictionary identifying the SAML claims needed to retrieve user information, typically at least the claims for the user name and email address.
To set initial user permissions on first login, you can use the following options:

- Comma-separated list of groups for a new user to join.
- [Default True] Whether a new user is considered active.
- [Default True] Whether new users get staff status.
- [Default False] Whether new users are marked as superusers (we advise leaving this one alone).
If you encounter any trouble with this rather involved procedure, feel free to take a look at the discussion forums on GitHub and open a thread.
Sending of Emails
You can configure VarFish to send out emails, e.g., when permissions are granted to users.
- Enable sending of emails.
- String to use for the sender.
- Prefix to use for email subjects.
- URL of the SMTP server to use.
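For instance, the email settings might be provided via .env as follows. The variable names follow sodar-core conventions and are assumptions here; verify them against the documentation of your VarFish version.

```shell
# Illustrative .env fragment; variable names are assumptions based on
# sodar-core conventions -- verify against your VarFish version.
PROJECTROLES_SEND_EMAIL=1
EMAIL_SENDER=varfish@example.com
EMAIL_SUBJECT_PREFIX="[VarFish]"
EMAIL_URL=smtp://mailserver.example.com:25
```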
External Postgres Server
In some setups, it might make sense to run your own Postgres server. The most common use case would be that you want to run VarFish in a setting where fast disks are not available (virtual machines or in a “cloud” setting). You might still have a dedicated, fast Postgres server running (or available as a service from your cloud provider). In this case, you can configure the database connection settings as follows.
Adjust to the credentials, server, and database name that you want to use.
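For illustration, assuming the database connection is configured through a standard connection-string variable (the name DATABASE_URL and all credentials below are assumptions; verify the exact variable name for your VarFish version):

```shell
# Hypothetical connection string for an external Postgres server.
DATABASE_URL=postgres://varfish:changeme@postgres.example.com:5432/varfish
```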
The default settings are not secure in the general case.
However, Docker Compose will create a private network that is only available to the Docker containers.
In the default docker-compose setup, the Postgres server is thus not exposed to the outside and is only reachable by the VarFish web server and the queue workers.
Miscellaneous Configuration

Text to display on the login page.
Key to use for encrypting secrets in the database (such as saved public keys for the Beacon Site feature). You can generate such a key with the following command:

```shell
python -c 'import os, base64; print(base64.urlsafe_b64encode(os.urandom(32)))'
```
Maximal number of cases to query for at the same time in joint queries.
Sentry Configuration

Sentry is a service for monitoring web apps. Its open-source version can be installed on premise. You can configure Sentry support as follows.

- Enable Sentry support.
- A Sentry DSN to report to. See the Sentry documentation for details.
HGMD Professional Documentation
Users can enable a gene- and variant-wise link-out to HGMD Professional as follows.

- Enable the HGMD Professional link-out.
- Configure the URL prefix for HGMD Professional link-outs.
System and Docker (Compose) Tweaks
A number of customizations of the installation can be done using Docker or Docker Compose. Other customizations have to be done on the system level. This section lists those that the authors are aware of, but network-related settings in particular can be done on many levels.
Using Non-Default HTTP(S) Ports
If you want to use non-standard HTTP and HTTPS ports (defaults are 80 and 443), you can tweak this in the traefik container section.
You have to adjust two parts; below we give them separately with full YAML “key” paths. The first is the port mapping:

```yaml
services:
  traefik:
    ports:
      - "80:80"
      - "443:443"
```

To listen on ports 8080 and 8443 instead, your override file should have:

```yaml
services:
  traefik:
    ports:
      - "8080:8080"
      - "8443:8443"
```
Also, you have to adjust the command line arguments to traefik for the web (HTTP) and websecure (HTTPS) entrypoints.

```yaml
services:
  traefik:
    command:
      # ...
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
```

Use the following in your override file.

```yaml
services:
  traefik:
    command:
      # ...
      - "--entrypoints.web.address=:8080"
      - "--entrypoints.websecure.address=:8443"
```
Based on the docker-compose.yml file alone, your docker-compose.override.yml file should contain the following lines.
You will have to adjust the file accordingly if you want to use a custom static certificate or letsencrypt by incorporating the files from the provided examples.

```yaml
services:
  traefik:
    ports:
      - "8080:8080"
      - "8443:8443"
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:8080"
      - "--entrypoints.web.http.redirections.entryPoint.to=websecure"
      - "--entrypoints.web.http.redirections.entryPoint.scheme=https"
      - "--entrypoints.web.http.redirections.entrypoint.permanent=true"
      - "--entrypoints.websecure.address=:8443"
```
Then, restart by calling docker-compose up -d in the directory with the docker-compose.yml file.
Listening on Specific IPs
By default, the traefik container will listen on all IPs and interfaces of the host machine.
You can change this by prefixing the entries in the ports list with the IPs to listen on.
The settings to adjust here are:

```yaml
services:
  traefik:
    ports:
      - "80:80"
      - "443:443"
```

They need to be overwritten as follows in your override file.

```yaml
services:
  traefik:
    ports:
      - "10.0.0.1:80:80"
      - "10.0.0.1:443:443"
```
More details can be found in the corresponding section of the Docker Compose manual.
Of course, you can combine this with adjusting the ports, e.g., using a mapping such as "10.0.0.1:8080:8080".
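For illustration, a combined override that restricts traefik to one IP and uses non-default ports might look like this (the IP address and ports are examples; this assumes you also adjust the entrypoint addresses as described above):

```yaml
services:
  traefik:
    ports:
      - "10.0.0.1:8080:8080"
      - "10.0.0.1:8443:8443"
    command:
      # ...
      - "--entrypoints.web.address=:8080"
      - "--entrypoints.websecure.address=:8443"
```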
Limit Incoming Traffic
In some settings you might want to limit incoming traffic to certain networks / IP ranges.
In principle, this is possible with adjusting the Traefik load balancer/reverse proxy.
However, we would recommend you to use the firewall of your operating system or your overall network for this purpose.
Consult the corresponding manual (e.g., of firewalld for CentOS/Red Hat or of ufw for Debian/Ubuntu) for instructions.
We remark that in most cases it is better to perform an actual separation of networks and place each (virtual) machine into one network only.
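As a sketch only, restricting HTTPS access to an internal network could look as follows; the network range is an example, and the commands must run as root.

```shell
# Illustrative only: restrict HTTPS access to the 10.0.0.0/8 network.
# ufw (Debian/Ubuntu):
ufw allow from 10.0.0.0/8 to any port 443 proto tcp
# firewalld (CentOS/Red Hat):
firewall-cmd --permanent \
  --add-rich-rule='rule family="ipv4" source address="10.0.0.0/8" port port="443" protocol="tcp" accept'
firewall-cmd --reload
```

Note that Docker publishes ports by manipulating iptables directly, which can bypass ufw rules for published container ports; check your distribution's documentation on how Docker interacts with the host firewall.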
Volumes

The volumes sub-directory of the varfish-docker-compose directory contains the data for the containers.
These are as follows.

- Databases for variant annotation with CADD (large).
- Databases for variant prioritization (medium).
- Transcript databases for annotation (small).
- Storage for files uploaded from clients via the REST API (big).
- PostgreSQL databases (very big).
- Storage for the work queues (small).
- Configuration and certificates for the load balancer (very small).
In principle, you can put these on different storage systems (e.g., some over the network and some on directly attached disks).
The main motivation is that fast storage is expensive, and putting the small and medium-sized directories on slower, cheaper storage saves little money.
At the same time, access to the exomiser directories should be fast.
As for postgres, this storage is accessed most heavily and should be on storage as fast as you can afford.
The cadd-rest-api data should also be on fast storage, but it is accessed almost exclusively read-only.
In summary:

- You can put the minio folder on slower, cheaper storage to shave off some storage costs from your VarFish installation.
- As for cadd-rest-api, you can probably get away with putting this on cheaper storage as well.
- Put everything else, in particular postgres, on storage as fast as you can afford.
As described in the section Performance Tuning, the authors recommend using an advanced file system such as ZFS on multiple SSDs for large, fast storage and enabling compression. You will get excellent performance and can expect storage savings of about 50%.
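A sketch of such a setup follows; the pool name, device names, and mirror layout are assumptions, and the commands must run as root on a host with ZFS installed.

```shell
# Illustrative only: create a mirrored ZFS pool on two SSDs and enable
# compression; pool/device names and layout are assumptions.
zpool create tank mirror /dev/sda /dev/sdb
zfs set compression=lz4 tank
zfs create tank/varfish-volumes
# Later, check the achieved compression ratio with:
zfs get compressratio tank/varfish-volumes
```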
Beacon Site (Experimental)
VarFish has experimental support for the GA4GH Beacon protocol.

- Whether or not to enable the experimental beacon site support.
The following points remain to be implemented with Docker Compose and documented.
Updating Extras Data