Authorisation Model

Cafe Variome uses a mixed but mainly local authorisation model to manage user permissions. In short, each user or node must be approved locally.

User account and encryption key

To log into a CV3 instance, a user needs an account in the system. This account is created in the OIDC provider and should have a corresponding local database entry. To query data, a user also needs an encryption key, which is generated by the system and stored in the vault. The encryption key is used to encrypt and decrypt the query data.

Here is a table of user types, their account conditions and permissions to perform queries:

User type	OIDC account	Local DB account	Encryption key	Permission
Anonymous	No	No	No	None*
Remote user	Yes	No	No	None**
Remote query user	Yes	Yes	No	Query
Local user	Yes	Yes	Yes	Login and query

*: By default, anonymous users cannot query the system. However, admins can selectively allow anonymous querying. If the option is turned on, anonymous users have access to all metadata and to the record-level sources that have been assigned to the anonymous internal user.
**: Remote users have no local database entry, so their queries are rejected. However, admins may turn on automatic registration, which creates a local database entry for any user who has an OIDC account but no local database entry. This turns the user into a remote query user. Refer to Automatic user registration for more details.
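The table and the automatic registration note can be read as a small decision procedure. The following is a minimal sketch of that reading, not the actual CV3 implementation: the names AccountState, classify_user and auto_registration are hypothetical, and the optional anonymous querying setting is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountState:
    """Hypothetical summary of what an instance knows about a caller."""
    has_oidc_account: bool
    has_local_db_entry: bool
    has_encryption_key: bool


def classify_user(state: AccountState, auto_registration: bool = False) -> tuple[str, str]:
    """Return (user type, permission) following the table above.

    auto_registration is an assumed flag standing in for the admin option
    that creates a local database entry for OIDC users who lack one.
    """
    if not state.has_oidc_account:
        return ("Anonymous", "None")
    if not state.has_local_db_entry:
        if auto_registration:
            # The user gains a local entry and becomes a remote query user.
            return ("Remote query user", "Query")
        return ("Remote user", "None")
    if not state.has_encryption_key:
        return ("Remote query user", "Query")
    return ("Local user", "Login and query")
```

For instance, classify_user(AccountState(True, False, False)) yields a remote user with no permission, while the same call with auto_registration=True yields a remote query user.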

Discovery group

Discovery group model

Access to data sources is managed through a model called Discovery Groups. Each discovery group is associated with a discovery network and contains one or more data sources, one or more users, and a defined discovery policy.

The available discovery policies are:

Boolean response: This indicates whether there are records within a data source that meet the filtering criteria.
Range response: This provides the minimum and maximum possible count of records within a data source that meet the filtering criteria. The range is calculated from factors such as user access, query sensitivity, and the data source itself. For example, if 50 records fit the criteria, the result might be displayed as a range of 25–65, with the exact range varying for each user. This feature is still under active development and is not available in the current release.
Count response: This shows the exact number of records within a data source that meet the filtering criteria.
Count with subject ID: This provides both the exact number of records within a data source that meet the filtering criteria and the subject IDs associated with those records. The subject ID is the unique identifier assigned to each record in the database. It may be the actual identifier or a de-identified ID, depending on the data ingestion process and the data governance policies of the providing user.
Count with details: This displays the records matching the filtering criteria, along with all associated data for each record. Please be aware that this may significantly increase the response size, especially if the data includes large representations like gene sequences. Use with caution.

The Cafe Variome software only provides the platform to allow data discovery. The end user must ensure that they have all of the appropriate approvals in place to display the level of detail desired in their query responses.
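To make the difference between the policies concrete, here is a minimal sketch of how a set of matching records could be reduced to each level of detail. It is an illustration only: the Policy enum, the shape_response function, the response keys and the subject_id field are all assumptions rather than the Cafe Variome API, and the range response is left out because it is not available in the current release.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical names for the discovery policies described above."""
    BOOLEAN = "boolean"
    COUNT = "count"
    COUNT_WITH_SUBJECT_ID = "count_with_subject_id"
    COUNT_WITH_DETAILS = "count_with_details"


def shape_response(policy, matches):
    """Reduce matching records to the detail allowed by the policy.

    matches is a list of dicts, each assumed to carry a 'subject_id' key.
    """
    if policy is Policy.BOOLEAN:
        # Only reveal whether any record matches.
        return {"exists": len(matches) > 0}
    response = {"count": len(matches)}
    if policy is Policy.COUNT_WITH_SUBJECT_ID:
        response["subject_ids"] = [m["subject_id"] for m in matches]
    elif policy is Policy.COUNT_WITH_DETAILS:
        # Full records; this can make the response large.
        response["records"] = list(matches)
    return response
```

Under this sketch, two matching records produce {"exists": True} for the boolean policy and {"count": 2} for the count policy, with subject IDs or full records added only at the more detailed levels.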

Managing discovery groups

A discovery group assigns the same level of access to all users within it for all data sources contained in the group. However, both users and data sources can be included in multiple discovery groups, so a user can hold different access levels for different sources. If a user is granted access to the same source more than once with different access levels, the highest access level always applies.

For example, consider the following discovery groups:

Group 1 grants users A, B, and C boolean access to sources 1 and 2.
Group 2 grants users C and D count access to sources 1 and 3.

In this scenario:

User C will have count access to sources 1 and 3, and boolean access to source 2.
Users A and B will have boolean access to sources 1 and 2, but no access to source 3.
User D will have count access to sources 1 and 3, but no access to source 2.
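The resolution rule in this scenario can be sketched as taking the maximum level over every group that contains both the user and the source. This is a minimal sketch under stated assumptions: AccessLevel, DiscoveryGroup and effective_access are hypothetical names, and the numeric ordering of the levels is assumed from the order in which the policies are listed.

```python
from dataclasses import dataclass
from enum import IntEnum


class AccessLevel(IntEnum):
    # Assumed ordering from least to most detailed. The range policy is
    # omitted because it is not available in the current release.
    BOOLEAN = 1
    COUNT = 2
    COUNT_WITH_SUBJECT_ID = 3
    COUNT_WITH_DETAILS = 4


@dataclass(frozen=True)
class DiscoveryGroup:
    users: frozenset
    sources: frozenset
    level: AccessLevel


def effective_access(groups, user, source):
    """Return the highest level granted to user for source, or None."""
    levels = [g.level for g in groups if user in g.users and source in g.sources]
    return max(levels) if levels else None


# The two groups from the example above.
groups = [
    DiscoveryGroup(frozenset({"A", "B", "C"}), frozenset({1, 2}), AccessLevel.BOOLEAN),
    DiscoveryGroup(frozenset({"C", "D"}), frozenset({1, 3}), AccessLevel.COUNT),
]
```

With these groups, effective_access(groups, "C", 1) resolves to AccessLevel.COUNT because both groups cover that pair and the higher level wins, while effective_access(groups, "D", 2) resolves to None because no group grants user D access to source 2.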

Additionally, a user may be granted different access levels through different discovery networks. However, if the same source is involved, the highest access level always applies, regardless of the network from which the query originates. Because the highest granted level always takes precedence, it is important to follow the principle of least privilege when granting user access.

Last modified: 31 March 2025