Invoking the menu command Tools | Search Similar Samples in Database, or clicking the icon Button16-SearchSamplesLocally.png in the sample's workspace or in the context menu of a loaded sample, searches the database for samples that are similar to the selected sample(s), based on their cgMLST allele profiles.

SearchSimilarSamplesInDatabase.png
  • Search Similar Samples for
Selects the query sample(s), either by a database search or from the loaded sample(s). This option is only available when the menu command Tools | Search Similar Samples in Database is invoked.
  • Task Template for Distance
If the project contains multiple Task Templates, the one to be used for distance calculation can be selected here.
  • Search recursively for found similar samples
If the recursive option is turned on (default is off), all samples that are within the defined allelic threshold of any already found sample are returned as well. The search is repeated until no additional samples are found.
  • Allelic Profile Distance Criteria
Defines the threshold for similar samples. By default, the clustering threshold of the cgMLST Task Template is used. Alternatively, a multiple of the clustering threshold, an absolute allele distance, or a similarity percentage threshold can be chosen.
A filter excludes samples with too many missing values from the search. By default, all samples with more than 10% missing values are excluded.
If a S. aureus spa-typing Task Template is chosen for distance calculation, this section is replaced by a Spa-Type Distance Criteria section. Here it can be chosen whether samples with the same spa-type or with a similar spa-type should be searched.
  • Metadata Criteria
Defines metadata criteria that limit the search. If multiple limits are defined, they are linked by the AND operator.
  • Limit to same project: By default, only samples in the same project are searched. If turned off, the whole database is searched.
  • Limit to same location: The hit sample(s) must have the same location as the query sample(s). If this limit is used, the option match if missing is set by default, so samples are reported even if the location field of the query and/or hit sample is empty. If this option is deselected, no matches are reported when the location field of the query and/or hit sample is empty. Either the Country of Isolation or the City of Isolation can be defined as location field.
  • Limit to date: The date of the hit sample must lie within a given time period (days, weeks, months, or years) of the date of the query sample. If this limit is used, the option match if missing is set by default, so samples are reported even if the date field of the query and/or hit sample is empty. Either the Collection Date or Created (the sample entry creation date) can be defined as date field.
  • Limit to same value in fields: Database fields other than the location and date fields that must match can be selected here. If this limit is used, the option match if missing is deselected by default, so no matches are reported if the selected field of the query and/or hit sample is empty.
  • Limit by tags: Only query sample(s) with the defined tag(s) initiate the search, and only similar samples that fulfill the tag criteria are reported as hit samples. If multiple tags are selected, they are linked by the AND operator.
  • Similarity Search Comparison Table Settings
The number of samples in the comparison table that is shown as search result can be set here. By default, only the first 100 samples of a search result are shown. Additionally, a number of nearest samples that are not within the defined threshold can be added to the table (turned off by default).
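The interplay of the allelic threshold, the missing-values filter, and the recursive option can be illustrated with a minimal Python sketch. It is not the software's implementation. All names (`allele_distance`, `search_similar`, `MISSING`) are hypothetical, the threshold is taken as an absolute allele distance, the distance is assumed to ignore loci where either sample has a missing value, and the missing-values filter is assumed to apply to candidate samples only.

```python
from collections import deque

MISSING = None  # assumed marker for a locus without an allele call


def missing_fraction(profile):
    """Fraction of loci in a profile that have no allele call."""
    return sum(a is MISSING for a in profile) / len(profile)


def allele_distance(p, q):
    """Number of loci where both profiles have a call and the calls differ.

    Ignoring loci with missing values is an assumption of this sketch.
    """
    return sum(
        1 for a, b in zip(p, q)
        if a is not MISSING and b is not MISSING and a != b
    )


def search_similar(query_ids, profiles, threshold, recursive=False,
                   max_missing=0.10):
    """Return the ids of samples within `threshold` of a query sample.

    With recursive=True, every found sample is itself used as a query
    until no additional samples are found.
    """
    # Exclude candidates with more than max_missing missing values.
    candidates = {
        sid: prof for sid, prof in profiles.items()
        if missing_fraction(prof) <= max_missing
    }
    seen = set(query_ids)
    hits = set()
    frontier = deque(query_ids)
    while frontier:
        current = frontier.popleft()
        for sid, prof in candidates.items():
            if sid in seen:
                continue
            if allele_distance(profiles[current], prof) <= threshold:
                seen.add(sid)
                hits.add(sid)
                if recursive:
                    frontier.append(sid)  # search again from this hit
    return hits
```

In this sketch, a sample that is two alleles away from the query but only one allele away from a first hit is reported only when `recursive=True` and the threshold is 1.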
SearchSimilarSamplesInDatabaseResult.png

The search result can be opened in a comparison table. The query samples are highlighted in orange and the found similar samples are highlighted in yellow. If nearest samples that are not within the defined threshold were added, they are not highlighted. The default epidemiological and procedure metadata fields shown in a comparison table are defined at project level in the Project Editor.
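The AND-linked metadata limits and the match if missing option described under Metadata Criteria can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the dictionary keys (`country`, `collection_date`), and the user-defined field `ward` are hypothetical, and the time period is expressed in days only.

```python
from datetime import date


def same_value(query_value, hit_value, match_if_missing):
    """Compare two field values; an empty field falls back to match_if_missing."""
    if query_value in (None, "") or hit_value in (None, ""):
        return match_if_missing
    return query_value == hit_value


def within_period(query_date, hit_date, max_days, match_if_missing=True):
    """Check that two dates are at most max_days apart."""
    if query_date is None or hit_date is None:
        return match_if_missing
    return abs((hit_date - query_date).days) <= max_days


def passes_metadata(query, hit, max_days):
    """All limits are linked by AND; defaults follow the dialog description."""
    return (
        # Location limit: match if missing is set by default.
        same_value(query.get("country"), hit.get("country"), True)
        # Date limit: match if missing is set by default.
        and within_period(query.get("collection_date"),
                          hit.get("collection_date"), max_days)
        # Other field ("ward" is hypothetical): match if missing is off by default.
        and same_value(query.get("ward"), hit.get("ward"), False)
    )
```

With these defaults, a hit sample with an empty location or date is still reported, whereas a hit sample with an empty value in the additionally selected field is not.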