Comparison and Evaluation of Clustering Algorithms for Tandem Mass Spectra

Bernhard Blank-Landeshammer, Laura Kerschke, Jörg Rahnenführer, Karin U. Schork, Albert Sickmann, Vera Rieder
In proteomics, liquid chromatography–tandem mass spectrometry (LC–MS/MS) is established for identifying peptides and proteins. Duplicated spectra, that is, multiple spectra of the same peptide, occur both in single MS/MS runs and in large spectral libraries. Clustering tandem mass spectra is used to find consensus spectra, with manifold applications. First, it speeds up database searches, as performed for instance by Mascot. Second, it helps to identify novel peptides across species. Third, it is used for quality control to detect wrongly annotated spectra. We compare different clustering algorithms based on the cosine distance between spectra. CAST, MS-Cluster, and PRIDE Cluster are popular algorithms to cluster tandem mass spectra. We add well-known algorithms for large data sets, hierarchical clustering, DBSCAN, and connected components of a graph, as well as the new method N-Cluster. All algorithms are evaluated on real data with varied parameter settings. Cluster results are compared with each other and with peptide annotations based on validation measures such as purity. Quality control, regarding the detection of wrongly (un)annotated spectra, is discussed for exemplary resulting clusters. N-Cluster proves to be highly competitive. All clustering results benefit from the so-called DISMS2 filter that integrates additional information, for example, on precursor mass.
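To make the ideas in the abstract concrete, the following is a minimal sketch of cosine distance between two spectra, clustering by connected components of a threshold graph, and the purity measure against peptide annotations. It is not the authors' implementation: the fixed-width binning, the bin width of 1.0, the distance threshold of 0.1, the toy peak lists, and all function names are assumptions chosen for illustration, and the all-pairs comparison would not scale to large spectral libraries.

```python
import math
from collections import Counter, defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (assumed preprocessing)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return dict(binned)


def cosine_distance(a, b):
    """1 - cosine similarity of two binned spectra stored as {bin: intensity}."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an empty spectrum as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def connected_components(spectra, threshold):
    """Link spectra whose distance is <= threshold; return one label per spectrum."""
    parent = list(range(len(spectra)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(spectra)):
        for j in range(i + 1, len(spectra)):
            if cosine_distance(spectra[i], spectra[j]) <= threshold:
                parent[find(i)] = find(j)

    label_of = {}
    return [label_of.setdefault(find(i), len(label_of)) for i in range(len(spectra))]


def purity(labels, annotations):
    """Fraction of spectra that carry the majority annotation of their cluster."""
    members = defaultdict(list)
    for label, annotation in zip(labels, annotations):
        members[label].append(annotation)
    majority = sum(Counter(m).most_common(1)[0][1] for m in members.values())
    return majority / len(labels)


# Toy data (hypothetical): (m/z, intensity) peak lists and peptide annotations.
raw = [
    [(100.2, 10.0), (200.1, 5.0)],
    [(100.4, 9.0), (200.3, 6.0)],
    [(300.0, 8.0), (400.5, 4.0)],
    [(300.2, 7.0), (400.1, 5.0)],
]
annotations = ["PEPTIDEA", "PEPTIDEA", "PEPTIDEB", "PEPTIDEA"]

spectra = [bin_spectrum(peaks) for peaks in raw]
labels = connected_components(spectra, threshold=0.1)
print(labels, purity(labels, annotations))  # [0, 0, 1, 1] 0.75
```

In this toy example the fourth spectrum lands in a cluster dominated by a different annotation, which lowers purity to 0.75. A mismatch of this kind is what a quality-control step might flag as a possibly wrongly annotated spectrum.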

Publisher URL: http://dx.doi.org/10.1021/acs.jproteome.7b00427

DOI: 10.1021/acs.jproteome.7b00427
