python_bioinformagicks.plotting package#
Module contents#
- plot_gprofiler_results(ora_df: DataFrame, title: str = '', cmap: str = 'Blues', n_terms: int = 15, sort_by: str = 'fold_enrichment', min_FE: int = 0)[source]#
Generates a bar plot of gProfiler ORA (over-representation analysis) results, with x = -log10(FDR), y = term name, and color = term fold enrichment.
Parameters#
- ora_df: pandas.DataFrame
The results of a gProfiler ORA.
- title: str (default: “”)
The base title of the plot; may be modified if iterating over gProfiler data sources
- cmap: str (default: “Blues”)
The matplotlib colormap to map to the color parameter (fold-enrichment, FE)
- n_terms: uint (default: 15)
The number of over-represented terms to keep for plotting, ranked according to sort_by (term fold enrichment, FE, or FDR).
- sort_by: str (default: ‘fold_enrichment’)
One of (FE, FDR). If FE, sort the statistically significant terms by term fold enrichment, i.e. n_in_term / n_expected_in_term, before keeping the top n_terms. If FDR, sort by statistical significance instead before keeping the top n_terms.
- min_FE: uint (default: 0)
Ignore terms whose fold enrichment is below this minimum value.
Returns#
- fig: matplotlib.figure.Figure
The resulting figures, one per source, aligned vertically.
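The way n_terms, sort_by, and min_FE interact can be sketched in pandas as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the column names `name`, `fdr`, and `fold_enrichment`, the helper `select_terms`, and the toy values are all hypothetical, and the real ora_df layout may differ.

```python
import numpy as np
import pandas as pd

# Toy ORA table; the column names are assumptions made for this sketch only.
ora_df = pd.DataFrame(
    {
        "name": ["term A", "term B", "term C", "term D"],
        "fdr": [1e-6, 1e-2, 1e-4, 1e-3],
        "fold_enrichment": [2.0, 8.0, 0.5, 4.0],
    }
)


def select_terms(df, n_terms=15, sort_by="FE", min_FE=0):
    """Hypothetical helper: filter by fold enrichment, rank, keep the top terms."""
    # Drop terms whose fold enrichment is below the minimum.
    kept = df[df["fold_enrichment"] >= min_FE].copy()
    # The x value of each bar is -log10(FDR).
    kept["neg_log10_fdr"] = -np.log10(kept["fdr"])
    if sort_by == "FE":
        kept = kept.sort_values("fold_enrichment", ascending=False)
    elif sort_by == "FDR":
        # Smaller FDR means more significant, so sort ascending.
        kept = kept.sort_values("fdr", ascending=True)
    else:
        raise ValueError("sort_by must be 'FE' or 'FDR' in this sketch")
    return kept.head(n_terms)


top_by_fe = select_terms(ora_df, n_terms=2, sort_by="FE", min_FE=1)
top_by_fdr = select_terms(ora_df, n_terms=2, sort_by="FDR", min_FE=1)
```

With these toy values, the two sort modes keep different terms, which is why the choice of sort_by matters whenever n_terms is smaller than the number of significant terms.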
- plot_grouped_proportions(adata: AnnData, factor_to_plot: str, split_by: str, batch_key: str = 'batch', stacked: bool = True)[source]#
Generates (stacked) bar plots of item (cell) counts and proportions, with each bar representing how many items are in each group of “factor_to_plot” across each group of “split_by”. Item counts are reported on a per-batch basis with batch assignments found in the “batch_key” column of the adata.obs table.
If adata.uns[split_by + “_colors”] exists, bars will be colored to match.
Parameters#
- adata: ad.AnnData
The anndata object
- factor_to_plot: str
The factor in adata.obs to generate count/proportion plots for. Commonly celltype, leiden, cluster, etc.
- split_by: str
The factor in adata.obs to compare/split by, such that groups of count/proportion bars are split across this factor. Commonly condition, treatment, age, etc.
- batch_key: str (default: “batch”)
The column in adata.obs that indicates the batch an item belongs to. This is used to normalize reported counts to item counts per batch. This matters when some conditions are represented by multiple batches and others are not, because per-batch counts make comparisons between conditions fairer.
- stacked: bool (default: True)
If True, generate a stacked bar graph, else stagger bars side-by-side.
Returns#
- fig: matplotlib.figure.Figure
Contains two axes, one with item counts per batch and one with proportions.
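One plausible reading of the per-batch normalization and the proportion calculation is sketched below on a toy stand-in for adata.obs. The column values are made up, and dividing each group's counts by its number of distinct batches is an assumption about how "counts per batch" is computed; the package may do this differently.

```python
import pandas as pd

# Toy stand-in for adata.obs; values are made up for this sketch.
obs = pd.DataFrame(
    {
        "celltype": ["T", "T", "B", "T", "B", "B", "T", "T"],
        "condition": ["ctrl", "ctrl", "ctrl", "ctrl", "ctrl", "ctrl", "treated", "treated"],
        "batch": ["b1", "b1", "b1", "b2", "b2", "b2", "b3", "b3"],
    }
)

factor_to_plot, split_by, batch_key = "celltype", "condition", "batch"

# Raw item counts: one row per split_by group, one column per factor level.
counts = pd.crosstab(obs[split_by], obs[factor_to_plot])

# Number of distinct batches contributing to each split_by group.
n_batches = obs.groupby(split_by)[batch_key].nunique()

# Counts per batch, so groups backed by more batches are not inflated.
counts_per_batch = counts.div(n_batches, axis=0)

# Proportions within each split_by group (each row sums to 1).
proportions = counts.div(counts.sum(axis=1), axis=0)
```

Here "ctrl" has two batches and "treated" has one, so the raw "ctrl" counts are halved before comparison. Either table could then be drawn with `DataFrame.plot(kind="bar", stacked=True)`, which corresponds to the stacked option described above.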
- plot_split_embedding(adata: AnnData, groupby: str, color: list[str] | str, use_rep: str = 'X_umap', last_legend_only: bool = True, **kwargs)[source]#
Plots a split embedding, where each panel shows the same colors (features) but for a different group in the groupby column.
Additional kwargs are passed to the relevant sc.pl function.
Parameters#
- adata: ad.AnnData
The anndata object.
- groupby: str
The name of the categorical column in obs to split by.
- color: str or list of str
The var_names and/or obs columns to plot. One item in color is plotted on each row.
- use_rep: str (default: “X_umap”)
The embedding in adata.obsm to use.
- last_legend_only: bool (default: True)
If True, only the plots in the last (rightmost) column will have a legend. This helps avoid crowding, but it can be problematic when plotting categoricals: any category missing from the subset of data used for the final column will not appear in the legend.
Returns#
- fig: matplotlib.figure.Figure
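The last_legend_only caveat can be checked ahead of plotting with a short pandas sketch. The toy table stands in for adata.obs, and the assumption (taken from the parameter description above) is that the legend only lists categories present in the subset drawn in the final column; the variable names are illustrative.

```python
import pandas as pd

# Toy stand-in for adata.obs; values are made up for this sketch.
obs = pd.DataFrame(
    {
        "condition": pd.Categorical(["ctrl", "ctrl", "ctrl", "treated", "treated"]),
        "celltype": pd.Categorical(["T", "B", "NK", "T", "B"]),
    }
)

groupby, color = "condition", "celltype"

# One panel (column) per group, in category order; the last one carries the legend.
groups = list(obs[groupby].cat.categories)
last_group = groups[-1]

# All categories of the colored column versus those present in the last panel.
all_categories = set(obs[color].cat.categories)
in_last_panel = set(obs.loc[obs[groupby] == last_group, color].unique())

# Categories drawn in earlier panels but absent from the last panel's data.
missing_from_legend = sorted(all_categories - in_last_panel)
```

If missing_from_legend is non-empty, setting last_legend_only=False is the safer choice for that categorical.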