cospar.tl.differential_genes

cospar.tl.differential_genes(adata, cell_group_A=None, cell_group_B=None, FDR_cutoff=0.05, sort_by='ratio', min_frac_expr=0.05, pseudocount=1)

Perform differential gene expression (DGE) analysis and plot the top differentially expressed genes.

P values are calculated with the Wilcoxon rank-sum test and then corrected for multiple testing with the Benjamini-Hochberg procedure.
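The statistical procedure named above can be illustrated with a minimal sketch. It is not CoSpar's internal code: it assumes SciPy's `ranksums` for the per-gene test and uses a small hand-written helper, `benjamini_hochberg` (a hypothetical name), on toy data.

```python
import numpy as np
from scipy.stats import ranksums


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P values (Q values)."""
    pvals = np.asarray(pvals, dtype=float)
    n = pvals.size
    order = np.argsort(pvals)
    # Scale the i-th smallest P value by n / i.
    scaled = pvals[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest P value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    qvals = np.empty(n)
    qvals[order] = np.clip(scaled, 0.0, 1.0)
    return qvals


rng = np.random.default_rng(0)
n_genes = 20
# Toy expression matrices (cells x genes) for the two groups.
group_a = rng.poisson(2.0, size=(50, n_genes)).astype(float)
group_b = rng.poisson(2.0, size=(60, n_genes)).astype(float)
group_a[:, 0] += 10.0  # make gene 0 strongly up-regulated in group A

# One Wilcoxon rank-sum test per gene, then multiple-testing correction.
pvals = np.array(
    [ranksums(group_a[:, g], group_b[:, g]).pvalue for g in range(n_genes)]
)
qvals = benjamini_hochberg(pvals)
significant = np.flatnonzero(qvals < 0.05)
```

Here `significant` holds the indices of genes whose corrected P value falls below 0.05, which mirrors the role of `FDR_cutoff`.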

Parameters

adata : AnnData object: Must contain the gene expression matrix.
cell_group_A : np.array, optional (default: None): A boolean array of size adata.shape[0] that defines population A. If not specified, it is set to adata.obs['cell_group_A'].
cell_group_B : np.array, optional (default: None): A boolean array of size adata.shape[0] that defines population B. If not specified, it is set to adata.obs['cell_group_B'].
FDR_cutoff : float, optional (default: 0.05): Cutoff for the corrected P value of each gene. Only genes with a corrected P value below this cutoff are shown.
sort_by : str, optional (default: 'ratio'): The key used to sort the differentially expressed genes. It can be 'ratio' or 'Qvalue'.
min_frac_expr : float, optional (default: 0.05): Minimum expression fraction among the selected states for a gene to be considered for DGE analysis.
pseudocount : int, optional (default: 1): Pseudocount used when taking the gene expression ratio between the two groups.
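To make the roles of the boolean masks, `min_frac_expr`, and `pseudocount` concrete, here is a minimal NumPy sketch on a toy matrix. It rests on two assumptions that the text above does not confirm: that the "expression fraction" means the fraction of selected cells with non-zero expression, and that the ratio is a plain ratio of pseudocount-shifted means (the library may, for example, use a log ratio).

```python
import numpy as np

# Toy expression matrix: 6 cells x 3 genes.
X = np.array([
    [4.0, 0.0, 1.0],
    [6.0, 0.0, 1.0],
    [5.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 2.0],
    [1.0, 1.0, 0.0],
])

# Boolean masks of length X.shape[0], one per population.
cell_group_A = np.array([True, True, True, False, False, False])
cell_group_B = ~cell_group_A

min_frac_expr = 0.2
pseudocount = 1

# Assumed filter: fraction of selected cells with non-zero expression.
selected = cell_group_A | cell_group_B
frac_expr = (X[selected] > 0).mean(axis=0)
keep = frac_expr >= min_frac_expr

mean_A = X[cell_group_A].mean(axis=0)
mean_B = X[cell_group_B].mean(axis=0)
# The pseudocount keeps the ratio finite when a gene is silent in one group.
ratio = (mean_A + pseudocount) / (mean_B + pseudocount)
```

In this toy case the second gene is expressed in only one of six cells, so it falls below `min_frac_expr` and would be excluded from testing.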

Returns

diff_gene_A (pd.DataFrame) – Genes differentially expressed in cell state group A, ranked by the ratio of mean expression between the two groups, with the most differentially expressed genes at the top.
diff_gene_B (pd.DataFrame) – Genes differentially expressed in cell state group B, ranked by the ratio of mean expression between the two groups, with the most differentially expressed genes at the top.
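The combined effect of `FDR_cutoff` and `sort_by` on a result table can be sketched with pandas. The table below is invented for illustration, and its column names (`gene`, `Qvalue`, `ratio`) are assumptions rather than the documented layout of the returned DataFrames.

```python
import pandas as pd

# Hypothetical result table; values and column names are illustrative only.
table = pd.DataFrame({
    "gene": ["Gata1", "Klf1", "Actb", "Cebpa"],
    "Qvalue": [0.001, 0.03, 0.60, 0.02],
    "ratio": [3.2, 1.8, 1.0, 0.4],
})

FDR_cutoff = 0.05
sort_by = "ratio"

# Keep only genes whose corrected P value is below the cutoff.
passed = table[table["Qvalue"] < FDR_cutoff]
if sort_by == "ratio":
    # Largest ratio first.
    ranked = passed.sort_values("ratio", ascending=False)
else:
    # Smallest Q value first.
    ranked = passed.sort_values("Qvalue", ascending=True)
ranked = ranked.reset_index(drop=True)
```

Switching `sort_by` to `"Qvalue"` would order the same filtered genes by statistical significance instead of effect size.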