python_bioinformagicks.plotting package#

Module contents#

plot_gprofiler_results(ora_df: DataFrame, title: str = '', cmap: str = 'Blues', n_terms: int = 15, sort_by: str = 'fold_enrichment', min_FE: int = 0)

Generates a bar plot of gProfiler over-representation analysis (ORA) results, with:

x = -log10(FDR)

y = term name

color = term fold enrichment

Parameters#

ora_df: pandas.DataFrame

The results of a gProfiler ORA.

title: str (default: “”)

The base title of the plot; may be modified if iterating over gProfiler data sources

cmap: str (default: “Blues”)

The matplotlib colormap to map to the color parameter (fold-enrichment, FE)

n_terms: int (default: 15)

The number of over-represented terms to keep for plotting, after ranking by the criterion given in sort_by.

sort_by: str (default: ‘fold_enrichment’)

One of (FE, FDR). If FE, sort the statistically significant terms by term fold enrichment, i.e. (n_in_term / n_expected_in_term), before cutting off to the top n_terms. If FDR, sort by statistical significance instead before cutting off.

min_FE: int (default: 0)

Ignore terms whose fold enrichment is below this minimum value.

Returns#

fig: matplotlib.figure.Figure

The resulting figure, with one plot per source, aligned vertically.
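The filtering and ranking described by min_FE, sort_by and n_terms can be sketched without any plotting. This is a minimal, dependency-free illustration and not the package's implementation: the row fields (term, n_in_term, n_expected, fdr), the helper name select_terms, and the example values are all hypothetical.

```python
import math

# Hypothetical ORA rows; field names are illustrative, not gProfiler's real columns.
rows = [
    {"term": "T cell activation", "n_in_term": 12, "n_expected": 3.0, "fdr": 1e-6},
    {"term": "cell cycle", "n_in_term": 20, "n_expected": 10.0, "fdr": 1e-9},
    {"term": "translation", "n_in_term": 5, "n_expected": 4.0, "fdr": 0.03},
]


def select_terms(rows, n_terms=15, sort_by="FE", min_fe=0.0):
    """Filter by fold enrichment, rank by FE or FDR, and keep the top n_terms."""
    scored = []
    for row in rows:
        fe = row["n_in_term"] / row["n_expected"]  # fold enrichment
        if fe < min_fe:
            continue  # drop terms below the minimum fold enrichment
        # -log10(FDR) is the quantity placed on the x axis
        scored.append({**row, "fe": fe, "neg_log10_fdr": -math.log10(row["fdr"])})

    if sort_by == "FE":
        scored.sort(key=lambda r: r["fe"], reverse=True)  # largest FE first
    elif sort_by == "FDR":
        scored.sort(key=lambda r: r["fdr"])  # smallest FDR (most significant) first
    else:
        raise ValueError("sort_by must be 'FE' or 'FDR'")
    return scored[:n_terms]


top_by_fe = select_terms(rows, n_terms=2, sort_by="FE", min_fe=1.5)
top_by_fdr = select_terms(rows, sort_by="FDR")
print([r["term"] for r in top_by_fe])
print([r["term"] for r in top_by_fdr])
```

Note that the two criteria can disagree: a term with a modest fold enrichment may still rank first by FDR if it is large and well supported.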

plot_grouped_proportions(adata: AnnData, factor_to_plot: str, split_by: str, batch_key: str = 'batch', stacked: bool = True)

Generates (stacked) bar plots of item (cell) counts and proportions, with each bar representing how many items are in each group of “factor_to_plot” across each group of “split_by”. Item counts are reported on a per-batch basis with batch assignments found in the “batch_key” column of the adata.obs table.

If adata.uns[split_by + “_colors”] exists, bars will be colored to match.

Parameters#

adata: ad.AnnData

The anndata object.

factor_to_plot: str

The factor in adata.obs to generate count/proportion plots for. Commonly celltype, leiden, cluster, etc.

split_by: str

The factor in adata.obs to compare/split by, such that groups of count/proportion bars are split across this factor. Commonly condition, treatment, age, etc.

batch_key: str (default: “batch”)

The column in adata.obs that indicates the batch an item belongs to. This is used to normalize reported counts to item counts per batch. The normalization matters when some conditions are represented by multiple batches and others by only one, because it makes comparisons between conditions fairer.

stacked: bool (default: True)

If True, generate a stacked bar graph, else stagger bars side-by-side.

Returns#

fig: matplotlib.figure.Figure

Contains two axes, one with item counts per batch and one with proportions.
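The per-batch normalization can be illustrated with a small, plain-Python sketch. It rests on two assumptions that the documentation above does not confirm: that "counts per batch" means the raw count divided by the number of distinct batches in the same split_by group, and that proportions are taken within each factor_to_plot group. The records and the helper name grouped_counts are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical obs records: (factor_to_plot, split_by, batch) for each cell.
obs = [
    ("T", "ctrl", "b1"), ("T", "ctrl", "b1"), ("B", "ctrl", "b1"),
    ("T", "ctrl", "b2"), ("B", "ctrl", "b2"), ("B", "ctrl", "b2"),
    ("T", "treated", "b3"), ("B", "treated", "b3"), ("B", "treated", "b3"),
]


def grouped_counts(obs):
    """Return per-batch counts and proportions keyed by (factor, split)."""
    raw = Counter()
    batches = defaultdict(set)
    for factor, split, batch in obs:
        raw[(factor, split)] += 1
        batches[split].add(batch)  # distinct batches seen for each split_by group

    # Assumption: normalize each count by the number of batches in its split group.
    per_batch = {key: n / len(batches[key[1]]) for key, n in raw.items()}

    # Assumption: proportions sum to 1 within each factor_to_plot group.
    totals = defaultdict(float)
    for (factor, _), value in per_batch.items():
        totals[factor] += value
    proportions = {key: value / totals[key[0]] for key, value in per_batch.items()}
    return per_batch, proportions


per_batch, proportions = grouped_counts(obs)
print(per_batch[("T", "ctrl")], proportions[("T", "ctrl")])
```

In this toy data, "ctrl" has three T cells spread over two batches while "treated" has one T cell in a single batch, so the raw 3:1 ratio shrinks to 1.5:1 after normalization.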

plot_split_embedding(adata: AnnData, groupby: str, color: list[str] | str, use_rep: str = 'X_umap', last_legend_only: bool = True, **kwargs)

Plots a split embedding, where each panel shows the same colors (features) but for a different group in the groupby column.

Additional kwargs are passed to the relevant sc.pl function.

Parameters#

adata: ad.AnnData

The anndata object.

groupby: str

The name of the categorical column in obs to split by.

color: str or list of str

The var_names and/or obs columns to plot. One item in color is plotted on each row.

use_rep: str (default: “X_umap”)

The embedding in adata.obsm to use.

last_legend_only: bool (default: True)

If True, only the plots in the last (right-most) column will have a legend. This is useful to avoid crowding, but it can be problematic when plotting categoricals: any category missing from the subset of data used for the final column will not appear in the legend.

Returns#

fig: matplotlib.figure.Figure
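The panel grid described above (one row per item in color, one column per group, legend optionally restricted to the last column) can be sketched as a layout plan. This is only an illustration of the described behavior, not the package's code; the helper name plan_panels and the example group and feature names are hypothetical.

```python
def plan_panels(groups, color, last_legend_only=True):
    """List the panels of a split embedding: rows are features, columns are groups."""
    # Accept a single name or a list of names, as the color parameter does.
    features = [color] if isinstance(color, str) else list(color)
    plan = []
    for row, feature in enumerate(features):
        for col, group in enumerate(groups):
            is_last_col = col == len(groups) - 1
            plan.append({
                "row": row,
                "col": col,
                "group": group,      # subset of cells shown in this panel
                "color": feature,    # var_name or obs column used for coloring
                # With last_legend_only, only the right-most column keeps a legend.
                "legend": is_last_col or not last_legend_only,
            })
    return plan


plan = plan_panels(["young", "old"], ["CD3E", "leiden"])
for panel in plan:
    print(panel)
```

Because the legend of each row is drawn from the last column's subset, a category present only in "young" would be absent from the legend in this layout, which is the caveat noted for last_legend_only.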