
make_3d_plots.py – Make 3D PCoA plots

Description:

This script automates the construction of 3D plots (kinemage format) from the PCoA output file generated by principal_coordinates.py (e.g. P1 vs. P2 vs. P3, P2 vs. P3 vs. P4, etc., where P1 is the first component).

Usage: make_3d_plots.py [options]

Input Arguments:

[REQUIRED]

-i, --coord_fname
This is the path to the principal coordinates file (i.e., resulting file from principal_coordinates.py), or to a directory containing multiple coord files for averaging (e.g. resulting files from multiple_rarefactions_even_depth.py, followed by multiple beta_diversity.py, followed by multiple principal_coordinates.py).
-m, --map_fname
This is the metadata mapping file [default=None]

[OPTIONAL]

-b, --colorby
These are the categories from the user-generated mapping file to color by in the plots. Each category must exactly match a column header in the mapping file, and multiple categories can be listed by separating them with commas, without spaces. The user can also combine columns in the mapping file by separating the categories with “&&”, without spaces [default=None]
-a, --custom_axes
This is the category from the user-generated mapping file to use as a custom axis in the plot. For instance, if there is a pH category and the user would like to see the samples plotted on that axis instead of PC1, PC2, etc., they can use this option. It is also useful for plotting time-series data [default: None]
-p, --prefs_path
This is the user-generated preferences file. NOTE: This is a file with a dictionary containing preferences for the analysis [default: None]
-k, --background_color
This is the background color to use in the plots. Options are ‘black’ or ‘white’. [default: None]
-o, --output_dir
Path to the output directory
--ellipsoid_smoothness
The level of smoothness used in plotting ellipsoids for a summary plot (i.e. using a directory of coord files instead of a single coord file). Valid range is 0-3. A value of 0 produces very coarse “ellipsoids” but is fast to render. The default value is 2. If you encounter a memory error when generating or displaying the plots, try including just one metadata column in your plot. If you still have trouble, reduce the smoothness level to 0 or 1.
--ellipsoid_opacity
Used when plotting ellipsoids for a summary plot (i.e. using a directory of coord files instead of a single coord file). Valid range is 0-1. A value of 0 produces completely transparent (invisible) ellipsoids. A value of 1 produces completely opaque ellipsoids. The default value is 0.33.
--ellipsoid_method
Used when plotting ellipsoids for a summary plot (i.e. using a directory of coord files instead of a single coord file). Valid values are “IQR” (The Interquartile Range) and “sdev” (The standard deviation). The default is IQR.
-t, --taxa_fname
If you wish to perform a biplot, where taxa are plotted along with samples, supply a file in OTU table format. Typically this is the output from summarize_taxa.py.
--n_taxa_keep
If performing a biplot, the number of taxa to display; use -1 to display all. [default: 10]
--biplot_output_file
If performing a biplot, save the biplot coordinates in this file. [default: None]
--master_pcoa
If performing averaging on multiple coord files, the other coord files will be aligned to this one through procrustes analysis. This master file will not be included in the averaging. If this master coord file is not provided, one of the other coord files will be chosen arbitrarily as the target alignment. [default: None]
--output_format
Output format. Valid choices are: king, invue. If this option is set to invue, you must also use the -b option to define which column(s) from the metadata file the script will write an output file from; the script will also ignore any other optional parameters passed. [default: king]
-n, --interpolation_points
Number of extra points to interpolate between samples; the minimum is 2. Only used with the inVUE output. [default: 0]
--polyhedron_points
Points to be generated to create a frame around the PCoA plots. Only used when --output_format is inVUE. [default: 4]
--polyhedron_offset
Offset to be added to the points created by the --polyhedron_points option. Only used when --output_format is inVUE. [default: 1.5]
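To illustrate what -n, --interpolation_points refers to, here is a minimal Python sketch that places extra points between two consecutive samples. It assumes evenly spaced, linear interpolation in three dimensions, which this page does not confirm; the function name interpolate and the coordinates are hypothetical and are not taken from the script.

```python
def interpolate(start, end, n_points):
    """Return n_points evenly spaced points strictly between start and end.

    Assumption: straight-line (linear) interpolation; the script itself
    may use a different scheme.
    """
    return [
        tuple(a + (b - a) * i / (n_points + 1) for a, b in zip(start, end))
        for i in range(1, n_points + 1)
    ]


# Hypothetical coordinates of two consecutive samples on three axes.
sample_a = (0.0, 0.0, 0.0)
sample_b = (3.0, 6.0, 9.0)

# Two extra points, the documented minimum for -n.
points = interpolate(sample_a, sample_b, 2)
print(points)
```

The endpoints themselves are excluded, so the returned list contains only the extra points.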

Output:

By default, the script will plot the first three dimensions in your file. Other combinations can be viewed using the “Views:Choose viewing axes” option in the KiNG viewer (Chen, Davis, & Richardson, 2009), which may require the installation of kinemage software. The first 10 components can be viewed using the “Views:Parallel coordinates” option or by typing “/”. The mouse can be used to modify display parameters, to click and rotate the viewing axes, to select specific points (clicking on a point shows the sample identity in the lower left corner), or to select different analyses (upper right window). Although samples are most easily viewed in 2D, the third dimension is indicated by coloring each sample (dot/label) along a gradient corresponding to the depth along the third component (bright colors indicate points close to the viewer).
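As a rough illustration of what “the first three dimensions in your file” means, the Python sketch below pulls the first three coordinates per sample out of a principal coordinates file. The tab-delimited layout it assumes (a “pc vector number” header row, one row per sample, then “eigvals” and “% variation explained” rows) is not described on this page, and the function name first_three_axes and the sample data are hypothetical.

```python
# Row labels that are not samples in the assumed file layout.
NON_SAMPLE_ROWS = ("pc vector number", "eigvals", "% variation explained")


def first_three_axes(lines):
    """Map each sample ID to its coordinates on the first three axes."""
    coords = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines, the header row, and the trailing summary rows.
        if not fields[0] or fields[0] in NON_SAMPLE_ROWS:
            continue
        coords[fields[0]] = tuple(float(value) for value in fields[1:4])
    return coords


# Hypothetical file contents with four components and two samples.
example = [
    "pc vector number\t1\t2\t3\t4",
    "S1\t0.25\t-0.10\t0.05\t0.01",
    "S2\t-0.25\t0.10\t-0.05\t-0.01",
    "",
    "eigvals\t0.5\t0.2\t0.1\t0.05",
    "% variation explained\t50.0\t20.0\t10.0\t5.0",
]

coords = first_three_axes(example)
print(coords)
```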

Default Usage:

If you just want to use the default output, you can supply the principal coordinates file (i.e., resulting file from principal_coordinates.py) and a user-generated mapping file, where the default coloring will be based on the SampleID as follows:

make_3d_plots.py -i beta_div_coords.txt -m Mapping_file.txt

Additionally, the user can supply their mapping file (“-m”) and a specific category to color by (“-b”) or any combination of categories. When using the -b option, the user can specify the coloring for multiple mapping labels, where each mapping label is separated by a comma, for example: -b 'mapping_column1,mapping_column2'. The user can also combine mapping labels and color by the combined label that is created by inserting an ‘&&’ between the input columns, for example: -b 'mapping_column1&&mapping_column2'.
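The following Python sketch shows one way to read a -b value as described above: commas separate categories, and ‘&&’ joins columns into a single combined category. It is not the script's own parser. The function names, the column names Treatment and DOB, and the way the combined label is joined are all assumptions made for illustration.

```python
def parse_colorby(colorby, headers):
    """Split a -b value into categories; each category is a list of columns."""
    categories = []
    for category in colorby.split(","):
        columns = category.split("&&")
        for column in columns:
            # Category names must match a mapping file column header exactly.
            if column not in headers:
                raise ValueError(f"{column!r} is not a mapping file column header")
        categories.append(columns)
    return categories


def combined_label(row, columns):
    """Join one sample's values for the given columns (assumed label format)."""
    return "&&".join(row[column] for column in columns)


# Hypothetical mapping file headers and one sample row.
headers = ["SampleID", "Treatment", "DOB"]
row = {"SampleID": "S1", "Treatment": "Control", "DOB": "20061218"}

categories = parse_colorby("Treatment,Treatment&&DOB", headers)
labels = [combined_label(row, columns) for columns in categories]
print(categories)
print(labels)
```

Because the split is on a bare comma, a value such as 'Treatment, DOB' (with a space) would be rejected here, which mirrors the “without spaces” requirement of the option.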

If the user would like to color by all categories in their metadata mapping file, they can pass ‘ALL’ to the ‘-b’ option, as follows:

make_3d_plots.py -i beta_div_coords.txt -m Mapping_file.txt -b ALL

As an alternative, the user can supply a preferences (prefs) file, using the -p option. The prefs file allows the user to give specific samples their own colors within a given mapping column. This file also allows the user to apply a color gradient, given a specific mapping column.

If the user wants to color by using the prefs file (e.g. prefs.txt), they can use the following code:

make_3d_plots.py -i beta_div_coords.txt -m Mapping_file.txt -p prefs.txt

Output Directory:

If you want to give a specific output directory (e.g. “3d_plots”), use the following code:

make_3d_plots.py -i principal_coordinates-output_file -m Mapping_file.txt -o 3d_plots/

Background Color Example:

If the user would like to color the background white they can use the ‘-k’ option as follows:

make_3d_plots.py -i beta_div_coords.txt -m Mapping_file.txt -b ALL -k white

Jackknifed Principal Coordinates (w/ confidence intervals):

If you have created jackknifed PCoA files, you can pass the folder containing those files instead of a single file. The user can also specify the opacity of the ellipsoids around each point with “--ellipsoid_opacity”, which takes a value from 0 to 1. Currently, two metrics can be used for generating the ellipsoids, selected with “--ellipsoid_method”: ‘IQR’ and ‘sdev’. The user can specify all of these options as follows:

make_3d_plots.py -i jackknifed_pcoas/ -m Mapping_file.txt -b 'mapping_column1,mapping_column1&&mapping_column2' --ellipsoid_opacity=0.5 --ellipsoid_method=IQR
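To give a feel for the difference between the two --ellipsoid_method choices, the Python sketch below computes both spread measures for one sample's position on a single axis across several jackknifed replicates. It only illustrates the statistics; how the script turns these values into ellipsoid dimensions is not covered here. The function name axis_spread, the replicate values, and the choice of quantile method are assumptions.

```python
import statistics


def axis_spread(values, method="IQR"):
    """Spread of one sample's coordinate on one axis across replicates."""
    if method == "IQR":
        # Interquartile range: third quartile minus first quartile.
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return q3 - q1
    if method == "sdev":
        # Sample standard deviation.
        return statistics.stdev(values)
    raise ValueError("method must be 'IQR' or 'sdev'")


# Hypothetical PC1 coordinates of one sample in five jackknifed replicates;
# the fourth replicate is an outlier.
pc1 = [0.10, 0.12, 0.11, 0.30, 0.09]

iqr = axis_spread(pc1, "IQR")
sdev = axis_spread(pc1, "sdev")
print(iqr, sdev)
```

With these values the single outlying replicate inflates the standard deviation far more than the interquartile range, which is the practical reason to prefer one method over the other.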

Bi-Plots:

If the user would like to see which taxa are more prevalent in different areas of the PCoA plot, they can generate Bi-Plots by passing a principal coordinates file or folder “-i”, a mapping file “-m”, and a summarized taxa file “-t” from summarize_taxa.py. This can be combined with jackknifed principal coordinates.

make_3d_plots.py -i pcoa.txt -m Mapping_file.txt -t otu_table_level3.txt
