ArchMap Mapping Documentation

ArchMap (https://www.archmap.bio) is a free, no-code query-to-reference mapping framework that extends to Python-based mapping methods and includes out-of-the-box cell type label transfer, uncertainty estimation, and collaborative analysis features. Its fully automated approach and graphical user interface make query mapping accessible to users with or without coding and machine learning expertise. The cloud-based setup enables secure sharing of results, codeless downstream analysis, and collaborative annotation of the user’s mapping results. ArchMap query-to-reference mapping is powered by scArches. More information about scArches can be found here.

First steps

First, log in or sign up using your academic affiliation. Alternatively, click on “Map” to go straight to mapping your data without logging in.

_images/homepage1.png

Click on the plus button to create a new mapping project.

_images/mappings_dashboard.png

On the page below, you can pick the reference atlas you want to map your data to and the model associated with this atlas. Please note that some models are not compatible with some atlases; incompatible models are disabled. You can also choose a classifier for transferring cell type labels from the reference to your query data. Available classifiers include KNN, XGBoost, and the native classifiers for scANVI and scVI, if applicable.

_images/choose_atlas.png _images/choose_model.png

After selecting your atlas, model, and classifier, you can upload your query data by drag-and-drop or by clicking on the upload field. To map your data successfully, please follow the instructions on the left of the upload page. To check whether your file satisfies these requirements, you can run the relevant Colab tutorial:

_images/upload_query.png

Once your data is uploaded, click on “Create Mapping”, give your project an appropriate name, and submit. You will then be returned to the homepage, where you can review the result once processing is complete. Processing time depends on the size of the query data but should take no longer than one hour. Wait until both the launch button and the download button are clickable before launching a CellxGene Annotate instance. After you click “Launch”, a colour-coded build status appears to indicate that a CellxGene instance is loading. Once it has fully loaded, a “View Results” button appears. Clicking this button opens CellxGene in a new tab, where you can view a UMAP of your combined mapping.

_images/mapping_page.png

You can also view the mapping info and all evaluation metrics calculated by ArchMap for each mapping by clicking on the info icon to the left of the “Launch” button (if you are logged in, the icon is to the left of the “Add To Team” button). For more information on how to interpret the evaluation metrics, check out the “Mapping evaluation” page.

_images/evaluation_metrics.png

Once the CellxGene page has loaded, you can view the UMAP of the combined query and reference. The cells in the UMAP can be coloured using the toggleable categorical and continuous variables on the left. For example, you can colour by the cell type label predictions, by query vs reference cells, or by the uncertainty scores assigned to the query. Additionally, you can create new categories, subset your cells, and conduct differential gene expression analysis. You can read more about the functionalities of CellxGene Annotate here. In the screenshot below, you see the UMAP coloured by query and reference:

_images/cellxgene_query.png

Below, we colour the query cells based on the Euclidean uncertainty scores assigned to each cell post-mapping.

_images/cellxgene_uncertainty.png

We can also subset to only the query cells and colour by the cell type labels assigned by the KNN classifier. The cell type classification results are saved in the columns that end with “prediction_knn” (or “prediction_xgboost” if the XGBoost classifier is selected) and “prediction_knn_filtered_by_uncert>0.5”. For example, for the HLCA, label transfer using the “ann_level_5” annotation level of the reference atlas outputs the columns “ann_level_5_prediction_knn” and “ann_level_5_prediction_knn_filtered_by_uncert>0.5”. In the second column, cells with an uncertainty score greater than 0.5 are labelled as “Unknown”.

_images/cellxgene_labels.png
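
The relationship between the two prediction columns described above can be illustrated with a minimal Python sketch. This is not ArchMap's implementation: the function name `filter_by_uncertainty`, the key `uncertainty`, and the toy cell records (including the labels `type_A`, `type_B`, and `type_C`) are hypothetical and used only for illustration. The sketch assumes that a score of exactly 0.5 keeps its label, following the wording “greater than 0.5”.

```python
# Illustrative sketch of the "filtered_by_uncert>0.5" rule; not ArchMap's own code.
# "uncertainty" is a hypothetical key for the per-cell uncertainty score.

UNCERTAINTY_THRESHOLD = 0.5


def filter_by_uncertainty(cells, label_key, uncertainty_key="uncertainty",
                          threshold=UNCERTAINTY_THRESHOLD):
    """Return copies of the cell records with an added filtered-label entry."""
    filtered_key = f"{label_key}_filtered_by_uncert>{threshold}"
    result = []
    for cell in cells:
        row = dict(cell)  # copy so the input records are left unchanged
        # Strictly greater than the threshold becomes "Unknown";
        # otherwise the transferred label is kept as-is.
        if row[uncertainty_key] > threshold:
            row[filtered_key] = "Unknown"
        else:
            row[filtered_key] = row[label_key]
        result.append(row)
    return result


# Toy records with made-up labels and scores.
cells = [
    {"ann_level_5_prediction_knn": "type_A", "uncertainty": 0.12},
    {"ann_level_5_prediction_knn": "type_B", "uncertainty": 0.50},
    {"ann_level_5_prediction_knn": "type_C", "uncertainty": 0.81},
]

filtered = filter_by_uncertainty(cells, "ann_level_5_prediction_knn")
labels = [row["ann_level_5_prediction_knn_filtered_by_uncert>0.5"] for row in filtered]
print(labels)  # ['type_A', 'type_B', 'Unknown']
```

Comparing the two columns in this way shows which query cells lost their transferred label because of high uncertainty.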

Check out a short tutorial video of the mapping process here:

Note

This project is under active development. If you come across any issues, please write an email to archmap.bio@gmail.com.