MA plot to visualize gene expression data using Python

Renesh Bedre

What is an MA plot?

  • 2-dimensional (2D) scatter plot to visualize gene expression datasets
  • Visualize and identify gene expression changes between two conditions (e.g. normal vs. treated) in terms of log fold change (M) on the Y-axis and the log of the mean of normalized expression counts of normal and treated samples (A) on the X-axis
  • Genes with similar expression values in both normal and treated samples cluster around M=0, i.e. genes with no substantial expression difference between treatments
  • Points away from the M=0 line indicate genes with large expression changes. For example, a gene is upregulated if its point is above the M=0 line and downregulated if it is below
  • An MA plot does not consider statistical measures (P-values or adjusted P-values), so it cannot tell which genes differ with statistical significance between normal and treated samples (use a Volcano plot if you want to identify genes with statistically significant differences)
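
The definitions of M and A can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, assuming M = log2(treated / normal) and A = log2 of the mean of the two counts, as described in the list above. The function name `ma_values` and the toy counts are made up for this example and are not part of bioinfokit.

```python
import math


def ma_values(normal, treated):
    """Return (M, A) for one gene from two normalized, non-zero counts.

    M: log2 fold change (treated vs. normal), plotted on the Y-axis
    A: log2 of the mean of the two counts, plotted on the X-axis
    """
    m = math.log2(treated / normal)
    a = math.log2((normal + treated) / 2)
    return m, a


# hypothetical (normal, treated) counts; zero counts would need a pseudocount first
toy_counts = {"geneA": (100, 100), "geneB": (8, 32), "geneC": (64, 16)}

for gene, (normal, treated) in toy_counts.items():
    m, a = ma_values(normal, treated)
    print(f"{gene}: M={m:.2f}, A={a:.2f}")
```

Here geneA sits on the M=0 line (equal counts), geneB lies above it (M=2, a four-fold increase), and geneC lies below it (M=-2, a four-fold decrease).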

How to create an MA plot in Python?

  • We will use bioinfokit v0.8.8 or later
  • Check the bioinfokit documentation for installation instructions and usage
  • For generating the MA plot, I have used gene expression data published in Bedre et al. 2016 to identify statistically significantly induced or downregulated genes in response to salt stress in Spartina alterniflora (Read paper). You can download the gene expression dataset used for generating the MA plot here: dataset
# you can use interactive python interpreter, jupyter notebook, spyder or python code
# I am using interactive python interpreter (Python 3.7)
>>> from bioinfokit import analys, visuz
# load dataset as pandas dataframe
>>> df = analys.get_data('ma').data
>>> df.head()
          GeneNames  value1  value2    log2FC       p-value
0  LOC_Os09g01000.1    8862   32767 -1.886539  1.250000e-55
1  LOC_Os12g42876.1    1099     117  3.231611  1.050000e-55
2  LOC_Os12g42884.2     797      88  3.179004  2.590000e-54
3  LOC_Os03g16920.1     274       7  5.290677  4.690000e-54
4  LOC_Os05g47540.4     308      18  4.096862  2.190000e-54

>>> visuz.gene_exp.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2')
# the plot will be saved in the same directory (ma.png)
# set parameter show=True if you want to view the image instead of saving it

MA plot generated by the above code (green: upregulated and red: downregulated genes)

Add a legend to the plot

>>> visuz.gene_exp.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', plotlegend=True)

Change color of MA plot

# change colormap
>>> visuz.gene_exp.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', plotlegend=True, 
    color=('#00239CFF', 'grey', '#E10600FF'))

Change the log fold change threshold

>>> visuz.gene_exp.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', lfc_thr=2, plotlegend=True, 
    color=('#00239CFF', 'grey', '#E10600FF'))
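
To make the effect of the threshold concrete, the sketch below labels genes by their log2 fold change. It assumes a symmetric cutoff where values at or above `lfc_thr` count as upregulated and values at or below `-lfc_thr` count as downregulated. This is an illustration, not bioinfokit's internal code, and the function name `classify_lfc` and the labels are hypothetical. The log2FC values are the five rows shown by `df.head()` above.

```python
def classify_lfc(log2fc, lfc_thr=2):
    """Label one gene by its log2 fold change using a symmetric threshold."""
    if log2fc >= lfc_thr:
        return "upregulated"
    if log2fc <= -lfc_thr:
        return "downregulated"
    return "unchanged"


# log2FC values from the first five rows of the example dataset
head_lfc = {
    "LOC_Os09g01000.1": -1.886539,
    "LOC_Os12g42876.1": 3.231611,
    "LOC_Os12g42884.2": 3.179004,
    "LOC_Os03g16920.1": 5.290677,
    "LOC_Os05g47540.4": 4.096862,
}

labels = {gene: classify_lfc(lfc, lfc_thr=2) for gene, lfc in head_lfc.items()}
for gene, label in labels.items():
    print(gene, label)
```

With a threshold of 2, LOC_Os09g01000.1 (log2FC of about -1.89) falls inside the band and is labeled unchanged, whereas a threshold of 1 would label it downregulated. Raising the threshold therefore moves points from the colored groups into the grey group.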

Change the shape of the points

# add star shape
# check more shapes at https://matplotlib.org/3.1.1/api/markers_api.html
>>> visuz.gene_exp.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', color=('#00239CFF', 'grey', '#E10600FF'), 
    markerdot='*', plotlegend=True)

Change the transparency of the points

# set point transparency with valpha (star markers kept from the previous example)
>>> visuz.gene_exp.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', color=('#00239CFF', 'grey', '#E10600FF'), 
    markerdot='*', valpha=0.5, plotlegend=True)

Draw log fold change threshold lines

# draw log fold change threshold lines with fclines=True
>>> visuz.gene_exp.ma(df=df, lfc='log2FC', ct_count='value1',  st_count='value2', color=('#00239CFF', 'grey', '#E10600FF'),
    fclines=True, plotlegend=True)

Change X and Y range ticks, font size and name for tick labels

>>> visuz.gene_exp.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', color=('#00239CFF', 'grey', '#E10600FF'), 
    markerdot='*', figtype='svg', xlm=(0,16,1), ylm=(-6,6,1), axtickfontsize=10, axtickfontname='Verdana', plotlegend=True)

Change legend position and labels

>>> visuz.gene_exp.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', color=('#00239CFF', 'grey', '#E10600FF'),
    plotlegend=True, legendpos='lower right', legendlabels=['Upregulated', 'Normal', 'Downregulated'])

In addition to these parameters, the parameters for figure type (figtype), X and Y axis ticks range (xlm, ylm), axis labels (axxlabel, axylabel),
axis labels font size and name (axlabelfontsize, axlabelfontname), and axis tick labels font size and name (axtickfontsize, axtickfontname) can be provided.

Check detailed usage

How to cite?
Renesh Bedre. (2020, July 29). reneshbedre/bioinfokit: Bioinformatics data analysis and visualization toolkit (Version v0.9). Zenodo. http://doi.org/10.5281/zenodo.3965241

If you have any questions, comments or recommendations, please email me at reneshbe@gmail.com

Last updated: July 02, 2020

This work is licensed under a Creative Commons Attribution 4.0 International License.