make_3d_plots.py – Make 3D PCoA plots

Description:

This script automates the construction of 3D plots (kinemage format) from the PCoA output file generated by principal_coordinates.py (e.g. P1 vs. P2 vs. P3, P2 vs. P3 vs. P4, etc., where P1 is the first component).

Usage: make_3d_plots.py [options]

Input Arguments:

[REQUIRED]

-i, --coord_fname
Input principal coordinates filepath (i.e., resulting file from principal_coordinates.py). Alternatively, a directory containing multiple principal coordinates files for jackknifed PCoA results.
-m, --map_fname
Input metadata mapping filepath

[OPTIONAL]

-b, --colorby
Comma-separated list of metadata categories (column headers) to color by in the plots. Each category must exactly match the name of a column header in the mapping file. Multiple categories can be listed by separating them with commas, without spaces. The user can also combine columns in the mapping file by separating the categories with “&&”, without spaces. [default=color by all]
-a, --custom_axes
The category from the metadata mapping file to use as a custom axis in the plot. For instance, if there is a pH category and you would like to see the samples plotted on that axis instead of PC1, PC2, etc., use this option. It is also useful for plotting time-series data. Note: if there is any non-numeric data in the column, it will not be plotted. [default: None]
-p, --prefs_path
Input user-generated preferences filepath. NOTE: This is a file with a dictionary containing preferences for the analysis. [default: None]
-k, --background_color
Background color to use in the plots. [default: black]
--ellipsoid_smoothness
Used only when plotting ellipsoids for jackknifed beta diversity (i.e. using a directory of coord files instead of a single coord file). Valid choices are 0-3. A value of 0 produces very coarse “ellipsoids” but is fast to render. If you encounter a memory error when generating or displaying the plots, try including just one metadata column in your plot. If you still have trouble, reduce the smoothness level to 0. [default: 1]
--ellipsoid_opacity
Used only when plotting ellipsoids for jackknifed beta diversity (i.e. using a directory of coord files instead of a single coord file). The valid range is from 0 to 1: 0 produces completely transparent (invisible) ellipsoids and 1 produces completely opaque ellipsoids. [default=0.33]
--ellipsoid_method
Used only when plotting ellipsoids for jackknifed beta diversity (i.e. using a directory of coord files instead of a single coord file). Valid values are “IQR” and “sdev”. [default=IQR]
--master_pcoa
Used only when plotting ellipsoids for jackknifed beta diversity (i.e. using a directory of coord files instead of a single coord file). These coordinates will be the center of each ellipsoid. [default: None; arbitrarily chosen PC matrix will define the center point]
-t, --taxa_fname
Used only when generating BiPlots. Input summarized taxa filepath (i.e., from summarize_taxa.py). Taxa will be plotted with the samples. [default=None]
--n_taxa_keep
Used only when generating BiPlots. This is the number of taxa to display. Use -1 to display all. [default: 10]
--biplot_output_file
Used only when generating BiPlots. Output coordinates filepath when generating a biplot. [default: None]
--output_format
Output format. If this option is set to invue you will need to also use the option -b to define which column(s) from the metadata file the script should use when writing an output file. [default: king]
-n, --interpolation_points
Used only when generating inVUE plots. Number of points between samples for interpolation. [default: 0]
--polyhedron_points
Used only when generating inVUE plots. The number of points to be generated when creating a frame around the PCoA plots. [default: 4]
--polyhedron_offset
Used only when generating inVUE plots. The offset to be added to each point created when using the --polyhedron_points option. This is only used when using the invue output_format. [default: 1.5]
--add_vectors
Create vectors based on a column of the mapping file. This parameter accepts up to 2 columns: (1) create the vectors, (2) sort them. To group by Species and order by SampleID, pass --add_vectors=Species; to group by Species but order by DOB, pass --add_vectors=Species,DOB. This is useful when you use the --custom_axes parameter. [default: None]
--rms_algorithm
The algorithm used to calculate the RMS, either avg or trajectory. Both algorithms use all the dimensions, weight them by their percentage explained, and return the norm of the created vectors along with their confidence using ANOVA. The vectors are created as follows: avg calculates the average at each timepoint (averaging within a group), then calculates the norm of each point; trajectory calculates the norm from the 1st to the 2nd point, from the 2nd to the 3rd, etc. [default: None]
--rms_path
Name of the file to save the root mean square (RMS) of the vectors grouped by the column used with the --add_vectors option. Note that this option only works with --add_vectors. The file is created inside the output_dir and its name will start with “RMS”. [default: RMS_output.txt]
-o, --output_dir
Path to the output directory
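
As a rough illustration of the trajectory mode described under --rms_algorithm, the sketch below computes the norm between consecutive points of one vector. It is not the script's implementation: the function name trajectory_norms is hypothetical, and weighting each axis by multiplying it by its fraction of variance explained is an assumption made here for the example.

```python
import math


def trajectory_norms(points, percent_explained):
    """Return the norms between consecutive points (1st-2nd, 2nd-3rd, ...).

    points: ordered list of coordinate tuples for one group.
    percent_explained: one weight per axis.
    Hypothetical helper for illustration only; not part of QIIME.
    """
    # Assumption: each axis is weighted by multiplying by its weight.
    weighted = [
        tuple(c * w for c, w in zip(point, percent_explained))
        for point in points
    ]
    norms = []
    for current, following in zip(weighted, weighted[1:]):
        squared = sum((b - a) ** 2 for a, b in zip(current, following))
        norms.append(math.sqrt(squared))
    return norms


if __name__ == "__main__":
    # Three time points on two axes, both axes weighted equally.
    print(trajectory_norms([(0, 0), (3, 4), (3, 4)], [1.0, 1.0]))
```

A vector with n ordered points yields n-1 norms, so a group needs at least two samples to produce any output.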

Output:

By default, the script will plot the first three dimensions in your file. Other combinations can be viewed using the “Views:Choose viewing axes” option in the KiNG viewer (Chen, Davis, & Richardson, 2009), which may require the installation of kinemage software. The first 10 components can be viewed using the “Views:Parallel coordinates” option or by typing “/”. The mouse can be used to modify display parameters, to click and rotate the viewing axes, to select specific points (clicking on a point shows the sample identity in the lower left corner), or to select different analyses (upper right window). Although samples are most easily viewed in 2D, the third dimension is indicated by coloring each sample (dot/label) along a gradient corresponding to the depth along the third component (bright colors indicate points close to the viewer).

Default Usage:

If you just want to use the default output, you can supply the principal coordinates file (i.e., resulting file from principal_coordinates.py) and a user-generated mapping file, where the default coloring will be based on the SampleID as follows:

make_3d_plots.py -i beta_div_coords.txt -m Mapping_file.txt

Additionally, the user can supply their mapping file (“-m”) and a specific category to color by (“-b”) or any combination of categories. When using the -b option, the user can specify the coloring for multiple mapping labels, where each mapping label is separated by a comma, for example: -b 'mapping_column1,mapping_column2'. The user can also combine mapping labels and color by the combined label that is created by inserting an '&&' between the input columns, for example: -b 'mapping_column1&&mapping_column2'.
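
To make the -b syntax concrete, here is a minimal sketch of how such a string breaks down into categories and combined columns. The function parse_colorby is a hypothetical helper written for this example and is not the parser the script itself uses.

```python
def parse_colorby(spec):
    """Split a -b style string into groups of mapping-file column names.

    Commas separate categories; "&&" joins columns into one combined
    category. Hypothetical helper for illustration; not part of QIIME.
    """
    groups = []
    for category in spec.split(","):
        columns = category.split("&&")
        # Reject empty names such as those produced by "a,,b" or "a&&".
        if any(not column for column in columns):
            raise ValueError(f"empty column name in {category!r}")
        groups.append(columns)
    return groups


if __name__ == "__main__":
    print(parse_colorby("mapping_column1,mapping_column1&&mapping_column2"))
```

Each inner list is one coloring category: a single-element list is a plain column, and a longer list is a combined label built from several columns.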

If the user would like to color all categories in their metadata mapping file, they can pass ‘ALL’ to the ‘-b’ option, as follows:

make_3d_plots.py -i beta_div_coords.txt -m Mapping_file.txt -b ALL

As an alternative, the user can supply a preferences (prefs) file, using the -p option. The prefs file allows the user to give specific samples their own colors within a given mapping column. This file also allows the user to apply a color gradient to a specific mapping column.

If the user wants to color by using the prefs file (e.g. prefs.txt), they can use the following code:

make_3d_plots.py -i beta_div_coords.txt -m Mapping_file.txt -p prefs.txt

Output Directory:

If you want to give a specific output directory (e.g. “3d_plots”), use the following code:

make_3d_plots.py -i principal_coordinates-output_file -m Mapping_file.txt -o 3d_plots/

Background Color Example:

If the user would like to color the background white they can use the ‘-k’ option as follows:

make_3d_plots.py -i beta_div_coords.txt -m Mapping_file.txt -b ALL -k white
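
When the same options are reused across many runs, it can help to assemble the command programmatically. The sketch below builds the argument list from the options documented above (-i, -m, -b, -k, -o); the function build_command and its parameter names are hypothetical, and the script is assumed to be on the PATH if the list is later passed to subprocess.run.

```python
import shlex


def build_command(coord_path, map_path, colorby=None,
                  background_color=None, output_dir=None):
    """Assemble a make_3d_plots.py argument list.

    Hypothetical wrapper for illustration; only flags listed in this
    document are used.
    """
    args = ["make_3d_plots.py", "-i", coord_path, "-m", map_path]
    if colorby:
        # Categories are comma separated, without spaces.
        args += ["-b", ",".join(colorby)]
    if background_color:
        args += ["-k", background_color]
    if output_dir:
        args += ["-o", output_dir]
    return args


if __name__ == "__main__":
    command = build_command("beta_div_coords.txt", "Mapping_file.txt",
                            colorby=["ALL"], background_color="white")
    # shlex.join quotes arguments that need it, e.g. values containing "&&".
    print(shlex.join(command))
```

Passing the list form to subprocess.run avoids shell quoting problems with values that contain “&&”.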

Jackknifed Principal Coordinates (w/ confidence intervals):

If you have created jackknifed PCoA files, you can pass the folder containing those files instead of a single file. The user can also specify the opacity of the ellipsoids around each point with “--ellipsoid_opacity”, which takes a value from 0 to 1. Two methods are currently available through “--ellipsoid_method” for generating the ellipsoids: ‘IQR’ and ‘sdev’. The user can specify all of these options as follows:

make_3d_plots.py -i jackknifed_pcoas/ -m Mapping_file.txt -b 'mapping_column1,mapping_column1&&mapping_column2' --ellipsoid_opacity=0.5 --ellipsoid_method=IQR
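
To give an intuition for the difference between the ‘IQR’ and ‘sdev’ methods, the sketch below measures the per-axis spread of one sample's positions across jackknife replicates. It is only an illustration: the function names are hypothetical, and the quantile convention, the use of the sample standard deviation, and how the script turns a spread into an ellipsoid size are all assumptions rather than documented behavior.

```python
import statistics


def axis_spread(values, method="IQR"):
    """Spread of one sample's coordinates along a single axis.

    Hypothetical helper for illustration; not part of QIIME.
    """
    if method == "IQR":
        # Assumption: inclusive quartiles; other conventions differ slightly.
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return q3 - q1
    if method == "sdev":
        # Assumption: sample (n-1) standard deviation.
        return statistics.stdev(values)
    raise ValueError(f"unknown method: {method!r}")


def ellipsoid_spreads(replicates, method="IQR"):
    """Per-axis spread for a list of (x, y, z) positions of one sample."""
    return [axis_spread(axis, method) for axis in zip(*replicates)]


if __name__ == "__main__":
    # Eight jackknife replicates of a single sample (made-up coordinates).
    replicates = [(float(i), 2.0 * i, 0.0) for i in range(1, 9)]
    print(ellipsoid_spreads(replicates, "IQR"))
    print(ellipsoid_spreads(replicates, "sdev"))
```

The interquartile range ignores the most extreme replicates, so it tends to be less sensitive to a single outlying jackknife run than the standard deviation.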

Bi-Plots:

If the user would like to see which taxa are more prevalent in different areas of the PCoA plot, they can generate Bi-Plots by passing a principal coordinates file or folder “-i”, a mapping file “-m”, and a summarized taxa file “-t” from summarize_taxa.py. Bi-Plots can be combined with jackknifed principal coordinates.

make_3d_plots.py -i pcoa.txt -m Mapping_file.txt -t otu_table_level3.txt
