Software Tools
Innovation and development of software tools, in partnership with software consultants, will enable researchers to use their data to better characterize protein complex structures, their interactions, and how they relate to biological functions.
Data Processing Tools
Several research groups have presented tools to aid in data processing and interpretation. These mainly assist with deconvolution of mass spectra or with the interpretation of ion mobility data, particularly in protein unfolding studies. This section describes these tools in more detail and provides links for downloading the software.
UniDec is a suite of computational tools built around a core Bayesian deconvolution algorithm. UniDec is fast, robust, and easily generalized to mass and ion mobility spectra. It scales from very simple spectra to highly polydisperse ensembles. UniDec was originally described in “Bayesian Deconvolution of Mass and Ion Mobility Spectra: From Binary Interactions to Polydisperse Ensembles, Michael T. Marty, Andrew J. Baldwin, Erik G. Marklund, Georg K. A. Hochberg, Justin L. P. Benesch, and Carol V. Robinson, Analytical Chemistry, 2015, 4370.” UniDec is freely available from: https://github.com/michaelmarty/UniDec/releases
See also UniDec tutorial presented by Michael Marty
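As a rough illustration of what deconvolution has to recover, the sketch below uses the textbook relationship between the m/z values of two adjacent charge states of the same species. It is not UniDec's Bayesian algorithm or its API; the function names are hypothetical, and the ions are assumed to be charged by protons only.

```python
PROTON_MASS = 1.007276  # Da; assumes every charge is carried by a proton


def mz_of(neutral_mass, z):
    """m/z of a species with the given neutral mass and z protons attached."""
    return neutral_mass / z + PROTON_MASS


def charge_from_adjacent_peaks(mz_lower, mz_upper):
    """Charge of the peak at mz_lower, assuming the peak at mz_upper
    is the same species carrying exactly one charge fewer."""
    z = (mz_upper - PROTON_MASS) / (mz_upper - mz_lower)
    return round(z)


def neutral_mass(mz, z):
    """Neutral mass implied by one peak once its charge is known."""
    return z * (mz - PROTON_MASS)


# Illustrative values only: a 64 kDa species observed at 16+ and 15+.
lower, upper = mz_of(64000.0, 16), mz_of(64000.0, 15)
z = charge_from_adjacent_peaks(lower, upper)
print(z, round(neutral_mass(lower, z), 1))
```

Real spectra contain overlapping species, adducts, and noise, which is why dedicated deconvolution software is needed rather than this two-peak calculation.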
ORIGAMI is a two-component software suite for faster acquisition of activated ion mobility mass spectrometry (IM-MS) datasets and for analysis of MS and IM-MS data. The acquisition software (ORIGAMIMS) works by interfacing MassLynx and Waters Research Enabled Software (WREnS) and enables the automated activation of ions via a sequential ramping of collision voltage prior to ion mobility analysis. Following acquisition, the data can be analyzed in the second component, ORIGAMIANALYSE, which allows the user to visualize the effect of activation on the mobility of the parent ion, as well as on any fragment ion. Data can then be exported as heat maps, waterfall or wire plots, or interactive and shareable webpages. ORIGAMI was originally described in “ORIGAMI: A software suite for activated ion mobility mass spectrometry (aIM-MS) applied to multimeric protein assemblies, Lukasz G. Migas, Aidan P. France, Bruno Bellina, Perdita E. Barran, International Journal of Mass Spectrometry, 2018, 427, 20.” ORIGAMI is freely available from: https://github.com/lukasz-migas/ORIGAMI
iFAMS is a Python algorithm for Fourier analysis of mass spectra with repeated subunits. The method is parameter-free and requires no initial guesses of charge states, total mass, or subunit mass. The algorithm facilitates identification of the charge states, subunit mass, and charge-state specific total mass distribution present in the ion population. iFAMS was originally described in “Fourier Analysis Method for Analyzing Highly Congested Mass Spectra of Ion Populations with Repeated Subunits, Sean P. Cleary, Avery M. Thompson, and James S. Prell, Analytical Chemistry, 2016, 88, 6205.” iFAMS is freely available from: https://github.com/seanpatcleary/ifams and has recently also been incorporated into UniDec.
Demo (download slides)
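The idea behind Fourier analysis of spectra with repeated subunits can be sketched as follows: peaks that differ by one subunit are evenly spaced in m/z (by subunit mass divided by charge), so the Fourier magnitude peaks at the reciprocal of that spacing. This is a simplified illustration with a single synthetic charge state and hypothetical function names, not the iFAMS code, which also infers the charge states.

```python
import cmath
import math


def synthetic_spectrum(mz_axis, peak_centers, sigma):
    """Sum of equal-height Gaussian peaks sampled on mz_axis."""
    return [sum(math.exp(-0.5 * ((mz - c) / sigma) ** 2) for c in peak_centers)
            for mz in mz_axis]


def dominant_spacing(mz_axis, intensities, f_min, f_max, f_step):
    """Scan frequencies (cycles per m/z unit) and return the peak spacing
    that corresponds to the strongest Fourier component in the range."""
    mean = sum(intensities) / len(intensities)
    centered = [y - mean for y in intensities]  # remove the zero-frequency term
    best_f, best_mag = f_min, 0.0
    n_steps = int(round((f_max - f_min) / f_step))
    for k in range(n_steps + 1):
        f = f_min + k * f_step
        total = sum(y * cmath.exp(-2j * math.pi * f * x)
                    for x, y in zip(mz_axis, centered))
        if abs(total) > best_mag:
            best_f, best_mag = f, abs(total)
    return 1.0 / best_f


# Illustrative values only: a 700 Da subunit at charge 10 gives 70 m/z spacing.
mz_axis = [5000.0 + i for i in range(1001)]
centers = [5035.0 + 70.0 * n for n in range(14)]
spectrum = synthetic_spectrum(mz_axis, centers, sigma=5.0)
spacing = dominant_spacing(mz_axis, spectrum, 0.005, 0.03, 0.0001)
print(f"estimated peak spacing: {spacing:.1f} m/z")
```

Multiplying the recovered spacing by the charge gives the subunit mass. In this sketch the charge is assumed known, which is a simplification relative to the parameter-free method described above.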
CIUSuite is a suite of Python modules designed for the rapid processing, analysis, comparison, and classification of collision induced unfolding (CIU) data, including the generation and manipulation of CIU fingerprints. CIUSuite consists of six modules that allow the user to readily access statistical and structural information from CIU experiments by designing user-defined workflows. CIUSuite was originally described in “CIUSuite: A Quantitative Analysis Package for Collision Induced Unfolding Measurements of Gas-Phase Protein Ions, Joseph D. Eschweiler, Jessica N. Rabuck-Gibbons, Yuwei Tian, and Brandon T. Ruotolo, Analytical Chemistry, 2015, 87, 11516.” CIUSuite is freely available from: https://sites.lsa.umich.edu/ruotolo/software/ciu-suite/
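One simple way to compare two CIU fingerprints is a root-mean-square deviation over normalized intensity matrices (rows as collision voltages, columns as drift-time bins). The sketch below illustrates that idea only; the function names and the row-wise normalization are assumptions and do not reproduce the CIUSuite modules.

```python
import math


def normalize_rows(fingerprint):
    """Scale each collision-voltage row so its most intense bin equals 1."""
    normalized = []
    for row in fingerprint:
        peak = max(row)
        normalized.append([v / peak if peak > 0 else 0.0 for v in row])
    return normalized


def fingerprint_rmsd(a, b):
    """RMSD between two fingerprints sampled on the same voltage/drift-time grid."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("fingerprints must share the same grid")
    a, b = normalize_rows(a), normalize_rows(b)
    squared = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative 2 x 2 fingerprints: the first row unfolds differently in `bound`.
apo = [[1.0, 0.0], [0.0, 1.0]]
bound = [[0.0, 1.0], [0.0, 1.0]]
print(fingerprint_rmsd(apo, bound))
```

Because each row is normalized before comparison, two fingerprints that differ only in overall signal intensity give an RMSD of zero.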
PULSAR is software for analyzing ion-mobility mass spectrometry data. It handles importing mobility data, organizing spectra by experimental meta-information, and viewing mass and mobility spectra. It provides a tool to fit the mass spectrum and automatically calculates collision cross section (CCS) values of ions from measurements made on both travelling-wave and drift-tube ion mobility mass spectrometers. It also provides a workflow for modelling gas-phase unfolding trajectories, along with tools to quantify and compute gas-phase stabilization of proteins. PULSAR was originally described in “Quantifying the stabilizing effects of protein–ligand interactions in the gas phase, Timothy M. Allison, Eamonn Reading, Idlir Liko, Andrew J. Baldwin, Arthur Laganowsky, and Carol V. Robinson, Nature Communications, 2015, 6, 8551.” PULSAR is freely available from: http://pulsar.chem.ox.ac.uk/
TWIMExtract is a data extraction tool that exports defined slices of liquid chromatography/ion mobility/mass spectrometry (LC-IM-MS) data, providing a route to quantify ion mobility resolution from a commercial traveling-wave ion mobility time-of-flight mass spectrometer. TWIMExtract collapses multidimensional data to a single dimension (drift time, m/z, or retention time) chosen by the user, allowing rapid generation and comparison of CIU fingerprints, mass spectra, m/z-specific chromatograms, and more. It provides a simple interface for automated extraction of slices of multidimensional data from the Waters .raw format. Ranges of drift time, m/z, and retention time can be specified for any extraction, allowing flexible and specific extraction of features of interest from an arbitrary number of raw files. TWIMExtract was originally described in “Variable-Velocity Traveling-Wave Ion Mobility Separation Enhancing Peak Capacity for Data-Independent Acquisition Proteomics, Sarah E. Haynes, Daniel A. Polasky, Sugyan M. Dixit, Jaimeen D. Majmudar, Kieran Neeson, Brandon T. Ruotolo, and Brent R. Martin, Analytical Chemistry, 2017, 89, 5669.” TWIMExtract is freely available from: https://sites.lsa.umich.edu/ruotolo/software/twim-extract/
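Collapsing multidimensional data to one dimension within user-defined ranges can be sketched as below. This is a conceptual illustration on an in-memory list of (retention time, drift time, m/z, intensity) tuples; the function name and data layout are hypothetical, and the sketch does not read Waters .raw files or use TWIMExtract's interface.

```python
from collections import defaultdict

AXES = {"rt": 0, "dt": 1, "mz": 2}  # position of each coordinate in a data point


def collapse(points, keep, rt_range=None, dt_range=None, mz_range=None):
    """Sum intensity along the `keep` axis for points inside every given range.

    A range is an inclusive (low, high) tuple; None means "no restriction".
    """
    ranges = {"rt": rt_range, "dt": dt_range, "mz": mz_range}
    profile = defaultdict(float)
    for point in points:
        *coords, intensity = point
        inside = all(r is None or r[0] <= coords[AXES[name]] <= r[1]
                     for name, r in ranges.items())
        if inside:
            profile[coords[AXES[keep]]] += intensity
    return dict(sorted(profile.items()))


# Illustrative data only: (retention time, drift-time bin, m/z, intensity).
points = [
    (1.0, 5, 1000.0, 10.0),
    (1.0, 6, 1000.0, 20.0),
    (2.0, 5, 1500.0, 5.0),
    (2.0, 6, 1000.0, 1.0),
]
# Drift-time profile for ions between m/z 900 and 1100.
print(collapse(points, "dt", mz_range=(900.0, 1100.0)))
```

Repeating the drift-time extraction at each collision voltage and stacking the resulting profiles is one way to assemble a CIU fingerprint.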
Theoretical CCS Calculation Tools
Several research groups have presented tools for determining theoretical CCS from previously solved structures, whether from X-ray crystallography, NMR, electron density maps, or computational models. These theoretical CCS values can then be compared to experimental CCS values determined from IM experiments. This section describes these tools in more detail and provides links for downloading the software.
IMPACT is a computational tool for calculating collision cross-sections from structural models. It is designed for the high-throughput processing of large molecular structures and models (>10,000 atoms) without significant decrease in accuracy. It can accept input coordinate files for single structures (e.g. from X-ray crystallography), ensembles (e.g. from NMR), electron density maps (e.g. from electron microscopy), and coarse-grained models (e.g. from SAXS). IMPACT is 2 to 6 orders of magnitude faster than other algorithms, which have typically been designed for much smaller molecules, and can be invoked directly as a command-line tool in UNIX/Linux or Mac OSX, or used as a library linked with other software for molecular dynamics and integrated structural biology applications. IMPACT was originally described in: “Collision Cross Sections for Structural Proteomics, Erik G. Marklund, Matteo T. Degiacomi, Carol V. Robinson, Andrew J. Baldwin, Justin L. P. Benesch, Structure, 2015, 23, 791.” IMPACT is freely available from: http://impact.chem.ox.ac.uk/
The EMnIM software package works by modelling the electron density map obtained from EM experiments as a collection of tightly packed spheres (isosurface). IMPACT is then used to calculate a collisional cross section from the model. EMnIM allows analysis of larger, more flexible protein targets. EMnIM was first reported in “EM∩IM: software for relating ion mobility mass spectrometry and electron microscopy data, Matteo T. Degiacomi and Justin L. P. Benesch, Analyst, 2016, 141, 70.” EMnIM is freely available from: http://emnim.chem.ox.ac.uk/
In the PSA framework, molecular collision cross sections are computed as a projection approximation modified to account for collective size and shape effects. The PSA algorithm has been reported to handle complex molecular shapes (concave, convex, pores, cavities, channels) as well as the range of molecular sizes typical of proteins. PSA offers the advantage of smaller computational demand than traditional approaches such as the trajectory method. PSA was originally described in “A novel projection approximation algorithm for the fast and accurate computation of molecular collision cross sections (I). Method, Christian Bleiholder, Thomas Wyttenbach, Michael T. Bowers, International Journal of Mass Spectrometry, 2011, 308, 1.” The PSA webserver is hosted by Christian Bleiholder at Florida State University and can be accessed here: http://psa.chem.fsu.edu/login
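For orientation, the plain projection approximation that such methods build on can be sketched as a Monte Carlo estimate: the molecule is treated as a set of spheres, projected onto a plane along a random direction, and the shadow area is averaged over orientations. This is the unmodified approximation under simplifying assumptions, not the PSA or IMPACT algorithm; the function names are hypothetical, and each radius is assumed to already include the buffer-gas probe radius.

```python
import math
import random


def _unit_vector(rng):
    """Draw a direction uniformly on the unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _plane_basis(u):
    """Two orthonormal vectors spanning the plane perpendicular to u."""
    helper = [1.0, 0.0, 0.0] if abs(u[0]) < 0.9 else [0.0, 1.0, 0.0]
    e1 = _cross(u, helper)
    norm = math.sqrt(sum(c * c for c in e1))
    e1 = [c / norm for c in e1]
    return e1, _cross(u, e1)


def projection_ccs(centers, radii, n_orientations=20, n_points=2000, seed=0):
    """Orientation-averaged shadow area of a union of spheres."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_orientations):
        e1, e2 = _plane_basis(_unit_vector(rng))
        # Project every sphere centre onto the plane; each sphere casts a disk.
        disks = [(sum(c * a for c, a in zip(xyz, e1)),
                  sum(c * a for c, a in zip(xyz, e2)), r)
                 for xyz, r in zip(centers, radii)]
        x_lo = min(x - r for x, _, r in disks)
        x_hi = max(x + r for x, _, r in disks)
        y_lo = min(y - r for _, y, r in disks)
        y_hi = max(y + r for _, y, r in disks)
        hits = 0
        for _ in range(n_points):
            px, py = rng.uniform(x_lo, x_hi), rng.uniform(y_lo, y_hi)
            if any((px - x) ** 2 + (py - y) ** 2 <= r * r for x, y, r in disks):
                hits += 1
        total += (x_hi - x_lo) * (y_hi - y_lo) * hits / n_points
    return total / n_orientations


# Sanity check: a single sphere of radius 2 should give roughly pi * 2**2.
print(projection_ccs([[0.0, 0.0, 0.0]], [2.0]), math.pi * 4.0)
```

The estimate carries Monte Carlo noise that shrinks as `n_orientations` and `n_points` grow. The collective size and shape effects that PSA accounts for are deliberately left out of this sketch.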
Collidoscope is an open-source program for calculating collisional cross sections via the Trajectory Method. Using pdb files or simple xyz-type files, users can compute low-field collisional cross sections for many organic and biological molecules ranging in size from 100 Da to >1 MDa in helium or nitrogen gas. Parallelization is automatically implemented (though it can be turned off as needed) to speed up computations, and a built-in option automatically places positive charges on protein ions in low-energy configurations before calculating collisional cross sections. Collidoscope was originally described in “Collidoscope: An Improved Tool for Computing Collisional Cross-Sections with the Trajectory Method, Simon A. Ewing, Micah T. Donor, Jesse W. Wilson, James S. Prell, Journal of The American Society for Mass Spectrometry, 2017, 28, 587.” Collidoscope is freely available here: https://github.com/prellgroup/Collidoscope
Demo (download slides)
Top-Down Software
There are many tools to aid in the identification of fragments from top-down experiments. Below we detail the packages used most frequently within the resource for top-down experiments; links for downloading the software are also provided.
The National Resource for Translational and Developmental Proteomics (NRTDP) TDPortal is a high-performance computing location that assists in analysis of high-throughput top-down proteomics data. This portal assigns a characterization score (C-score) to every identified proteoform, allowing users to filter and define proteoforms based on confidence. NRTDP TDPortal was originally described in “Advancing Top-down Analysis of the Human Proteome Using a Benchtop Quadrupole-Orbitrap Mass Spectrometer, Luca Fornelli, Kenneth R. Durbin, Ryan T. Fellers, Bryan P. Early, Joseph B. Greer, Richard D. LeDuc, Philip D. Compton, and Neil L. Kelleher, Journal of Proteome Research, 2017, 609-618.” TDPortal can be requested here: http://nrtdp.northwestern.edu/tdportal-request/
The National Resource for Translational and Developmental Proteomics (NRTDP) TDViewer is a free Windows application that displays results from the TDPortal search system. It can display .tdReport files by protein or by proteoform and can also filter by false discovery rate. NRTDP TDViewer was originally described in “Top-Down Proteomics Enables Comparative Analysis of Brain Proteoforms Between Mouse Strains, Roderick G. Davis, Hae-Min Park, Kyunggon Kim, Joseph B. Greer, Ryan T. Fellers, Richard D. LeDuc, Elena V. Romanova, Stanislav S. Rubakhin, Jonathan A. Zombeck, Cong Wu, Peter M. Yau, Peng Gao, Alexandra J. van Nispen, Steven M. Patrie, Paul M. Thomas, Jonathan V. Sweedler, Justin S. Rhodes, and Neil L. Kelleher, Analytical Chemistry, 2018, 3802-3810.” TDViewer can be requested here: http://topdownviewer.northwestern.edu/
ProSight PTM is a free web portal that enables identification and characterization of proteins analyzed via top-down methods. ProSight PTM allows users to search MS/MS data against UniProt. ProSight warehouses are annotated with all known PTMs, alternative splicing events, and single nucleotide polymorphisms (SNPs). ProSight PTM was originally described in “ProSight PTM: an integrated environment for protein identification and characterization by top-down mass spectrometry, Richard D. LeDuc, Gregory K. Taylor, Yong-Bin Kim, Thomas E. Januszyk, Lee H. Bynum, Joseph V. Sola, John S. Garavelli, and Neil L. Kelleher, Nucleic Acids Research, 2004, W340-W345.” ProSight PTM can be found here: https://prosightptm2.northwestern.edu/
A free Windows version of ProSight is available in the form of ProSight Lite. ProSight Lite allows users to match a candidate protein sequence and modifications to observed mass spectrometry data. ProSight Lite was originally described in “ProSight Lite: Graphical Software to Analyze Top-Down Mass Spectrometry Data, Ryan T. Fellers, Joseph B. Greer, Bryan P. Early, Xiang Yu, Richard D. LeDuc, Neil L. Kelleher, and Paul M. Thomas, Proteomics, 2015, 1235-1238.” ProSight Lite can be found here: http://prosightlite.northwestern.edu/
- FlashDeconv: https://pubmed.ncbi.nlm.nih.gov/32078799/
- MASH Explorer: http://ge.crb.wisc.edu/software.html
- MS-Align+: http://bix.ucsd.edu/projects/msalign/
- MS-DECONV: http://bix.ucsd.edu/projects/msdeconv/
- MSPathFinder: https://omics.pnl.gov/software/mspathfinder
- ProteinGoggle: http://proteingoggle.tongji.edu.cn/
- Proteoform Characterization Tool: http://pcs.kelleher.northwestern.edu/
- Protter: http://wlab.ethz.ch/protter/start/
- pTop: http://pfind.ict.ac.cn/software/pTop/index.html
- Search Engine for Multi-Protein Complexes: http://complexsearch.kelleher.northwestern.edu/
- SpectroGene: https://github.com/fenderglass/SpectroGene
- TopFIND: http://clipserve.clip.ubc.ca/topfind
- TopPIC: http://proteomics.informatics.iupui.edu/software/toppic/index.html
- YADA: http://patternlabforproteomics.org/yada20/download/
Crosslinking Software
A couple of tools that aid in identifying crosslinks and interpreting crosslinking data have been made freely available. This section describes them in more detail and provides links for downloading.
DynamXL provides a means to compare experimental cross-linking data against protein structural information. DynamXL accounts for the flexible nature of both protein and cross-linker molecules by predicting the accessible space of amino acid side chains, aggregating measures from multiple protein conformations and measuring distances with a powerful shortest path algorithm. DynamXL was originally described in “Accommodating Protein Dynamics in the Modeling of Chemical Crosslinks, Matteo T. Degiacomi, Carla Schmidt, Andrew J. Baldwin, Justin L.P. Benesch, Structure, 2017, 25, 1751.” DynamXL is freely available here: http://dynamxl.chem.ox.ac.uk/
StavroX & MeroX are tools for identifying cross-linked peptides with mass spectrometry. StavroX identifies multiple kinds of cross-linked peptides (including DSS, BS³, disulfides, and zero-length cross-links), while MeroX identifies cross-links formed by CID-cleavable cross-linkers. StavroX was first described in “StavroX—A Software for Analyzing Crosslinked Products in Protein Interaction Studies, Michael Götze, Jens Pettelkau, Sabine Schaks, Konstanze Bosse, Christian H. Ihling, Fabian Krauth, Romy Fritzsche, Uwe Kühn, Andrea Sinz, Journal of The American Society for Mass Spectrometry, 2012, 23, 76.” MeroX was first described in “Automated Assignment of MS/MS Cleavable Cross-Links in Protein 3D-Structure Analysis, Michael Götze, Jens Pettelkau, Romy Fritzsche, Christian H. Ihling, Mathias Schäfer, Andrea Sinz, Journal of The American Society for Mass Spectrometry, 2015, 26, 83.” StavroX & MeroX are freely available here: https://www.stavrox.com/
Vendor Software
There are several commercially available software tools for either deconvolution of protein mass spectra or interpretation and identification of protein crosslinking data. The software packages we have been using within the resource are detailed below.
XlinkX is a node available within Proteome Discoverer, Thermo Fisher Scientific. The XlinkX node features a fully integrated crosslink peptide search engine for crosslinking MS analysis and enables crosslinked peptide annotation with assignment of inter- and intra-crosslinked peptides and mono-adducts. XlinkX is compatible with non-cleavable and MS-cleavable crosslinkers. XlinkX was originally reported in “Proteome-wide profiling of protein assemblies by cross-linking mass spectrometry, Fan Liu, Dirk T S Rijkers, Harm Post, Albert J R Heck, Nature Methods, 2015, 12, 1179.” XlinkX is incorporated into Proteome Discoverer and more information about the software and purchase is available here: https://www.thermofisher.com/nl/en/home/industrial/mass-spectrometry/proteomics-protein-mass-spectrometry/proteomics-protein-mass-spectrometry-workflows/crosslinking-mass-spectrometry.html
Intact Mass is a deconvolution tool for native MS data, and can analyze both low and high resolution data. It can analyze either direct infusion or LC-MS runs. Using a modern, parsimonious algorithm, Intact Mass analyzes any number of elution peaks, avoiding the need for each chromatographic region to be processed independently. Intact Mass enables workflow automation, from file reading to report generation, for streamlined processing of large numbers of samples. Intact Mass was first reported in “Parsimonious Charge Deconvolution for Native Mass Spectrometry, Marshall Bern, Tomislav Caval, Yong J. Kil, Wilfred Tang, Christopher Becker, Eric Carlson, Doron Kletter, K. Ilker Sen, Nicolas Galy, Dominique Hagemans, Vojtech Franc, and Albert J. R. Heck, Journal of Proteome Research, 2018, 17, 1216.” More information on Intact Mass, including purchasing, can be found here: https://www.proteinmetrics.com/products/intact-mass/
BioPharma Finder is a deconvolution tool for native MS data, which can analyze LC-MS or direct infusion data. BioPharma Finder has been designed for high resolution data analysis. It can be used to study intact proteins in addition to subunit analysis. BioPharma Finder can also be used for top-down data analysis for protein identification and for peptide mapping, and it can analyze CID, HCD, EThcD and ETD fragmentation spectra. More information on BioPharma Finder, including purchasing, can be found here: http://planetorbitrap.com/biopharma-finder#tab:overview