Visualization
Interactable Model
The model’s name can be found above its visualization, along with an info button for further information.
The visualization is interactive: it supports click and drag as well as zooming (with either the mouse wheel or the buttons found right below the model’s name). The zoom can be reset with the rightmost button at the top, as seen in the figure below. Hovering over an individual cell in the visualization brings up a tooltip that displays the cell’s subcategory, which depends on the current category selection.
The visualization can be set to show the query data, the reference data, or both, using the leftmost buttons at the top, as seen in the figure below.
Models and Workflow
scArches is a novel deep learning model that enables mapping query datasets to reference datasets. The model allows the user to construct single-modal or multi-modal (CITE-seq) references and to classify unlabelled query cells. Currently, mapping of query to reference datasets is offered on GeneCruncher.
We use a REST API that provides a unified endpoint for the different scArches models. The backend computes the mapping for the query file and parses the result for the visualization.
We support the following models: scVI, scANVI and totalVI.
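As a rough illustration of what a unified endpoint implies, the sketch below builds one request body whose shape is the same for every supported model. Only the three model names come from this page. The function name `build_mapping_request` and the field names `model` and `query_file` are assumptions made for this example and are not the actual GeneCruncher API.

```python
import json

# Model names as listed on this page; everything else is hypothetical.
SUPPORTED_MODELS = ("scVI", "scANVI", "totalVI")


def build_mapping_request(model: str, query_file: str) -> str:
    """Return a JSON body for a hypothetical unified mapping endpoint."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(
            f"unsupported model {model!r}; expected one of {SUPPORTED_MODELS}"
        )
    # The field names "model" and "query_file" are illustrative assumptions.
    return json.dumps({"model": model, "query_file": query_file})


if __name__ == "__main__":
    print(build_mapping_request("scANVI", "query.h5ad"))
```

Because the model is just a field in the request, the client code stays the same regardless of which model is selected.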
scVI
scVI is an unsupervised model and does not require cell type labels for the mapping. It also generally takes the least time to train of the supported models. scVI maps query datasets to atlases.
scANVI
scANVI supports labeled and unlabeled data and predicts the cell types. For this reason, there is an additional button at the top left (next to the query/reference button) that toggles the predicted cells in the visualization. scANVI is a semi-supervised variant of scVI designed to leverage any available cell state annotations. Compared to unsupervised models, it performs better integration when cell type labels are partially available in the query.
totalVI
totalVI is a multi-modal model for CITE-seq RNA and protein data that can be used to map to multi-modal reference atlases. Of the supported models, totalVI takes the most time, and it imputes the proteins that were observed.
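The model descriptions above can be condensed into a simple rule of thumb for picking a model. The sketch below is only an illustration derived from those descriptions. The function `choose_model` and its parameters are hypothetical and do not reflect how GeneCruncher selects a model.

```python
def choose_model(has_protein_data: bool, has_cell_type_labels: bool) -> str:
    """Suggest a model name from two properties of the data (rule of thumb)."""
    if has_protein_data:
        # Multi-modal CITE-seq data (RNA and protein) is the totalVI use case.
        return "totalVI"
    if has_cell_type_labels:
        # scANVI is semi-supervised and can use partially available labels.
        return "scANVI"
    # scVI is unsupervised and needs no cell type labels.
    return "scVI"


if __name__ == "__main__":
    print(choose_model(has_protein_data=False, has_cell_type_labels=True))
```

Protein data is checked first because totalVI is the only one of the three models described here as multi-modal.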