cospar.pp.refine_state_info_by_marker_genes

cospar.pp.refine_state_info_by_marker_genes(adata, marker_genes, express_threshold=0.1, selected_key='state_info', selected_values=None, new_cluster_name='new_cluster', confirm_change=False, add_neighbor_N=5)

Refine state info according to marker gene expression.

In this method, a state is selected if it expresses all genes in marker_genes and the expression of each gene is above the relative express_threshold. You can restrict the selection to particular clusters or time points with selected_key and selected_values. Cell states neighboring these selected states are also included to smooth the selection (controlled by add_neighbor_N).
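The selection rule described above can be sketched in plain Python. This is an illustration only, not CoSpar's implementation: it assumes that "relative" means a fraction of each gene's maximum expression across cells, and it finds neighbors by Euclidean distance in an embedding rather than from a precomputed KNN graph. The function name select_states and the toy data are hypothetical.

```python
from math import dist


def select_states(expression, coords, express_threshold=0.1, add_neighbor_N=5):
    """Return sorted indices of selected cells (illustrative sketch only).

    expression: dict mapping gene name -> list of per-cell expression values
    coords:     list of per-cell embedding coordinates (tuples)
    """
    n_cells = len(coords)
    selected = [True] * n_cells
    for values in expression.values():
        # Assumption: the threshold is relative to the gene's maximum expression.
        cutoff = express_threshold * max(values)
        # A cell stays selected only if it passes the cutoff for every gene.
        selected = [keep and v > cutoff for keep, v in zip(selected, values)]

    seeds = [i for i, keep in enumerate(selected) if keep]
    expanded = set(seeds)
    for i in seeds:
        # Rank all other cells by distance to the seed and add the closest ones.
        others = sorted(
            (j for j in range(n_cells) if j != i),
            key=lambda j: dist(coords[i], coords[j]),
        )
        expanded.update(others[:add_neighbor_N])
    return sorted(expanded)


# Toy data: five cells, two hypothetical marker genes.
expression = {
    "GeneA": [0.0, 5.0, 4.0, 0.2, 0.0],
    "GeneB": [3.0, 2.0, 0.1, 0.0, 0.0],
}
coords = [(0.0, 0.0), (1.0, 0.0), (1.5, 0.0), (3.0, 0.0), (10.0, 0.0)]

seeds_only = select_states(expression, coords, add_neighbor_N=0)
with_one_neighbor = select_states(expression, coords, add_neighbor_N=1)
```

In the toy data only the second cell passes the cutoff for both genes, so a larger add_neighbor_N grows the selection outward from that single cell.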

When you run it for the first time, set confirm_change=False. Once you are happy with the result, set confirm_change=True to update adata.obs['state_info']. The original state_info will be stored in adata.obs['old_state_info'].
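The preview-then-confirm workflow might look like the sketch below. It assumes CoSpar is installed and importable as cs, and that adata is an AnnData object already prepared for CoSpar with state_info in adata.obs. The wrapper name refine_by_markers and the gene name in the usage comments are placeholders; the keyword arguments are the ones listed in the signature above.

```python
def refine_by_markers(adata, marker_genes, confirm=False):
    """Hypothetical convenience wrapper around the documented function."""
    # Imported lazily so that this sketch can be defined without CoSpar installed.
    import cospar as cs

    cs.pp.refine_state_info_by_marker_genes(
        adata,
        marker_genes,
        express_threshold=0.1,
        selected_key="state_info",
        selected_values=None,
        new_cluster_name="new_cluster",
        confirm_change=confirm,  # False: preview only; True: update state_info
        add_neighbor_N=5,
    )


# Typical use (placeholder gene name):
#   refine_by_markers(adata, ["GeneA"])                # first pass, preview only
#   refine_by_markers(adata, ["GeneA"], confirm=True)  # commit once satisfied
```

Keeping the first call at confirm=False lets you adjust express_threshold or add_neighbor_N before anything in adata.obs is changed.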

Parameters
adata : AnnData object

marker_genes : list or str

List of marker genes to be used for defining cell states.

express_threshold : float, optional (default: 0.1)

Relative threshold of marker gene expression, in the range [0,1]. To be included, a state must express every marker gene above this threshold.

selected_key : str, optional (default: 'state_info')

A key in adata.obs, such as 'state_info' or 'time_info'.

selected_values : list, optional (default: include all)

A list of clusters or time points to consider for further sub-clustering. Each value must be present in adata.obs[selected_key].

new_cluster_name : str, optional (default: 'new_cluster')

confirm_change : bool, optional (default: False)

If True, update adata.obs['state_info'].

add_neighbor_N : int, optional (default: 5)

Add the neighbors of each qualifying high-expressing state to the new cluster, using the KNN graph with K=add_neighbor_N.

Returns

Updates adata.obs['state_info'] if confirm_change=True.