This article provides a comprehensive framework for interpreting complex datasets generated by high-resolution mass spectrometry (HRMS) in non-targeted analysis (NTA). Covering foundational principles, methodological workflows, optimization strategies, and validation protocols, we address critical challenges including uncertainty management, data processing techniques, machine learning integration, and quantitative interpretation. Designed for researchers and analytical professionals, this guide synthesizes current best practices to enhance confidence in chemical identification, support robust study design, and facilitate the transition of NTA from research tool to decision-support application in biomedical and environmental health contexts.
In the fields of environmental monitoring, food safety, and pharmaceutical development, the ability to comprehensively characterize complex chemical mixtures is paramount. Traditional targeted analytical methods have long been the gold standard for quantifying specific, predefined analytes. However, the expanding universe of chemical substances, including numerous emerging environmental contaminants (EECs) and non-intentionally added substances (NIAS), has revealed the limitations of targeted approaches [1] [2]. These challenges have propelled the adoption of non-targeted analysis (NTA), a powerful paradigm that enables the detection and identification of unknown or unexpected chemicals without prior knowledge of their presence [3] [4].
NTA, often used as a blanket term that encompasses both suspect screening and true non-targeted analysis, represents a fundamental shift in analytical strategy [3]. This in-depth technical guide delineates the core principles, methodological workflows, and key differentiators of NTA from traditional targeted approaches, providing researchers and drug development professionals with a framework for selecting and implementing appropriate analytical strategies for their specific applications.
Targeted analysis is a quantitative analytical method designed to detect and measure specific, predefined analytes with a high degree of confidence [5]. This approach relies on the availability of authentic chemical standards for each target compound to optimize detection parameters, establish retention times, and generate calibration curves for accurate quantification [3]. The performance of targeted methods is evaluated using well-established metrics including selectivity (ability to differentiate the target analyte from interferents), sensitivity (limit of detection and quantification), accuracy (closeness to the true value), and precision (reproducibility across measurements) [5]. Targeted methods are ideal for regulatory compliance monitoring, routine quantification of known contaminants, and any application where the chemical targets are well-defined and reference standards are available.
Non-targeted analysis (NTA), also referred to as non-target screening or untargeted screening, is a theoretical concept broadly defined as "the characterization of the chemical composition of any given sample without the use of a priori knowledge regarding the sample's chemical content" [3]. Unlike targeted methods, NTA does not focus on specific predefined analytes but aims to comprehensively detect a wide range of chemicals present in a sample [4]. The resulting detections may be used to classify samples based on their entire chemical profile, and subsequent analyses may focus on identifying individual chemicals of interest [3].
A related approach, suspect screening analysis (SSA), occupies a middle ground between targeted and true non-targeted analysis. SSA involves the identification of chemicals by comparison to a predefined list or library containing known chemicals of interest, essentially narrowing the scope of the investigation to compounds of suspected relevance [3]. In practical usage, the term "NTA" is often applied as a blanket term that encompasses both suspect screening and true non-targeted analysis, particularly when workflows incorporate elements of both approaches [3].
Table 1: Fundamental Differences Between Targeted, Suspect Screening, and Non-Targeted Analysis
| Aspect | Targeted Analysis | Suspect Screening Analysis (SSA) | Non-Targeted Analysis (NTA) |
|---|---|---|---|
| Objective | Quantify specific, predefined analytes | Identify suspected chemicals from a predefined list | Comprehensively characterize sample chemical composition |
| Prior Knowledge Requirement | Complete (reference standards required) | Partial (suspect list required) | None |
| Scope of Analysis | Narrow (limited to target analytes) | Moderate (limited to suspect list) | Broad (theoretically unlimited) |
| Quantitative Capability | Fully quantitative | Semi-quantitative or qualitative | Primarily qualitative |
| Standard Dependence | Requires authentic standards | Standards not required, though confirmation with standards increases confidence | Standards not required |
| Primary Application | Regulatory compliance, routine monitoring | Chemical forensics, hypothesis testing | Discovery, exploratory research, hazard identification |
The relationship between targeted, suspect screening, and non-targeted approaches can be visualized as a spectrum of analytical strategies with varying levels of prior knowledge requirements and chemical scope. The following diagram illustrates the conceptual workflow and relationship between these approaches:
The implementation of NTA relies heavily on high-resolution mass spectrometry (HRMS) platforms, which provide the mass accuracy and resolving power necessary to distinguish between thousands of chemical features in complex samples [1] [5]. Common HRMS instruments used in NTA include quadrupole time-of-flight (QTOF) and Orbitrap mass spectrometers, often coupled with separation techniques such as liquid chromatography (LC) or gas chromatography (GC) [6] [2]. The emergence of multidimensional separation techniques, including two-dimensional chromatography (LC×LC or GC×GC) and high-resolution ion mobility spectrometry (HRIMS), has further enhanced the peak capacity and separation power available for NTA, enabling more comprehensive analysis of complex mixtures [6].
The data acquisition modes commonly employed in NTA include data-dependent acquisition (DDA), which selects the most abundant ions for fragmentation, and data-independent acquisition (DIA), which fragments all ions within predefined mass windows [2]. Both approaches generate MS/MS spectral data that are crucial for compound identification, with each offering distinct advantages in coverage and reproducibility.
A comprehensive NTA study involves multiple interconnected steps, from initial study design to final data interpretation. The following diagram illustrates a generalized NTA workflow, highlighting key stages and decision points:
Targeted methods employ optimized sample preparation techniques specifically tailored to the physicochemical properties of the target analytes, aiming to maximize extraction efficiency and minimize matrix effects for those specific compounds [3]. In contrast, NTA utilizes generic sample preparation protocols designed to extract a broad range of chemicals with diverse properties, inevitably introducing biases toward certain compound classes while potentially missing others [3] [2]. The study design in NTA must intentionally incorporate quality assurance and quality control (QA/QC) approaches, including procedural blanks, quality control materials, and internal standards, to enable performance assessment after data acquisition and analysis is complete [3].
Data processing in targeted analysis is relatively straightforward, focusing on quantifying specific precursor-product ion transitions for each target analyte [5]. NTA, however, generates complex, high-dimensional datasets that require sophisticated data processing pipelines for feature detection, peak alignment, and molecular formula assignment [1]. The identification process in NTA follows confidence levels based on the available evidence, ranging from level 1 (confirmed structure with authentic standard) to level 5 (exact mass of interest) [7] [3].
Compound identification in NTA relies on multiple lines of evidence, including accurate mass and isotopic pattern, MS/MS fragmentation spectra, and chromatographic retention behavior.
The integration of computational tools and spectral libraries is essential for NTA data interpretation. Resources such as the NIST Mass Spectral Library, NORMAN Suspect List Exchange, and various open-source spectral databases support compound identification by providing reference spectra for comparison [7] [8].
While targeted analysis provides absolute quantification using authentic standards and calibration curves, NTA typically offers semi-quantitative estimates based on assumed response factors or class-based calibration [4]. Recent advancements in quantitative non-targeted analysis (qNTA) aim to address this limitation by developing approaches for more accurate concentration estimation without reference standards for every compound [1] [4].
The criteria for assessing method performance differ significantly between targeted and non-targeted approaches. The table below summarizes key performance metrics and their application in each paradigm:
Table 2: Performance Assessment in Targeted vs. Non-Targeted Analysis
| Performance Metric | Targeted Analysis Application | Non-Targeted Analysis Application |
|---|---|---|
| Selectivity | Ability to distinguish target analyte from interferents using unique ion transitions | Chemical space coverage, isomeric resolution, specificity of identification |
| Sensitivity | Limit of detection (LOD) and quantification (LOQ) for specific analytes | Feature detection rate, minimum identifiable concentration across chemical space |
| Accuracy | Agreement with true value using certified reference materials | Structural identification correctness, database matching accuracy |
| Precision | Reproducibility of quantitative results across replicates | Consistency of feature detection and identification across replicates |
| Uncertainty | Well-defined confidence intervals for concentrations | Multiple sources: feature detection, compound identification, quantification estimation |
Quality assurance in NTA presents unique challenges due to the absence of reference standards for most detected compounds [5]. Best practices include the use of procedural blanks, quality control materials, internal standards, and replicate analyses to characterize background signals and measurement variability.
Initiatives such as the Best Practices for Non-Targeted Analysis (BP4NTA) have developed reporting frameworks and quality control metrics to improve the reliability and comparability of NTA results across different laboratories and studies [7] [3].
The large number of features detected in NTA studies (often thousands per sample) creates a bottleneck in data interpretation, necessitating effective prioritization strategies to focus resources on the most relevant compounds [9]. Recent approaches include integrating chemical metadata, predicted spectra, and hazard or toxicity information to rank candidate compounds for follow-up.
The US Environmental Protection Agency's INTERPRET NTA tool exemplifies efforts to streamline NTA data review by integrating chemical metadata, predicted spectra, and hazard information to support defensible prioritization of chemical candidates [8].
Recent advancements in machine learning (ML) and artificial intelligence (AI) are addressing key challenges in NTA, including retention time prediction, in silico fragmentation and spectral prediction, standard-free concentration estimation, and toxicity prediction directly from MS/MS spectra.
These developments are gradually transforming NTA from a purely exploratory tool toward a more robust approach capable of supporting chemical risk assessment and regulatory decision-making [1] [8].
Table 3: Key Research Reagent Solutions for Non-Targeted Analysis
| Resource Category | Examples | Function and Application |
|---|---|---|
| Spectral Libraries | NIST Mass Spectral Library, MassBank, mzCloud | Reference spectra for compound identification via spectral matching |
| Suspect Lists | NORMAN Suspect List Exchange, EPA's CompTox Chemicals Dashboard | Predefined lists of potential contaminants for suspect screening |
| Data Processing Tools | MS-DIAL, XCMS, OpenMS | Feature detection, peak alignment, and data preprocessing |
| Quantitative Prediction | MS2Quant | Concentration prediction from MS/MS spectra without standards |
| Toxicity Prediction | MS2Tox, MSFragTox | Toxicity estimation from fragmentation patterns |
| Identification Tools | CSI:FingerID, SIRIUS, CFM-ID | In silico fragmentation and compound structure elucidation |
| Data Integration Platforms | INTERPRET NTA (EPA) | Tools for reviewing, interpreting, and reporting NTA data quality |
Non-targeted analysis represents a paradigm shift in analytical chemistry, moving from hypothesis-driven targeted approaches to discovery-oriented comprehensive characterization. While targeted methods remain essential for precise quantification of known analytes, NTA provides unparalleled capability for detecting unknown and unexpected compounds across diverse sample matrices [4] [2]. The key differentiators between these approaches extend beyond technical implementation to encompass fundamental differences in philosophy, application, and performance assessment.
The ongoing development of standardized practices, advanced computational tools, and harmonized reporting frameworks is addressing current limitations in NTA, particularly regarding compound identification confidence and quantitative capability [5]. As these advancements mature, the integration of targeted and non-targeted approaches within unified analytical workflows offers the most powerful strategy for comprehensive chemical characterization, combining the quantitative rigor of targeted methods with the expansive scope of non-targeted discovery [3] [9].
For researchers and drug development professionals, understanding these complementary analytical paradigms is crucial for selecting appropriate methodologies to address specific research questions, whether the goal is precise quantification of defined targets or exploratory investigation of complex chemical mixtures. As NTA continues to evolve, its integration with emerging technologies like machine learning and high-resolution ion mobility spectrometry promises to further enhance its capabilities, ultimately strengthening environmental monitoring, pharmaceutical development, and public health protection.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) represents a paradigm shift in analytical chemistry, moving from the detection of predefined analytes to the comprehensive investigation of all detectable chemical species in a sample [11]. This approach is particularly crucial for addressing emerging environmental contaminants (EECs) such as pharmaceuticals, pesticides, and industrial chemicals that pose significant challenges for detection and identification due to their structural diversity and lack of analytical standards [1]. Unlike traditional targeted methods that screen for a specific list of known compounds, NTA focuses on assigning structures or formulae to unknown signals in HRMS data, making it an indispensable tool for discovering novel contaminants, characterizing complex mixtures, and responding to unknown chemical releases [12] [13]. The versatility of HRMS-based NTA allows it to be applied to virtually any sample medium, including air, water, sediment, soil, food, consumer products, and biological specimens, providing researchers with a powerful capability for chemical discovery and exposure characterization [14].
The exceptional resolution and mass accuracy of HRMS instruments form the foundational principle enabling effective NTA. High mass resolution allows the instrument to distinguish between ions with subtle mass differences, which is critical for separating compounds in complex mixtures and reducing false positives from isobaric interferences [11]. Modern HRMS systems, including Time of Flight (TOF), Orbitrap, and Fourier Transform Ion Cyclotron Resonance (FT-ICR) instruments, achieve resolving powers ranging from tens of thousands to several million, enabling precise separation of ions with minute mass differences [11]. Mass accuracy, typically measured in parts per million (ppm), determines how closely the measured mass-to-charge ratio (m/z) aligns with the theoretical value. For NTA applications, mass accuracy within 3-5 ppm is generally required for confident molecular formula assignment, with higher accuracy significantly reducing the number of candidate formulas [13].
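The ppm criterion above reduces to a simple calculation. A minimal sketch (the theoretical m/z is that of protonated caffeine; the measured value is illustrative):

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Mass measurement error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

# Protonated caffeine, theoretical m/z 195.0877; measured value is illustrative
err = ppm_error(195.0881, 195.0877)
print(f"{err:.1f} ppm")   # ~2 ppm, inside a typical 3-5 ppm tolerance
print(abs(err) <= 5.0)    # True
```

Because the error scales with the theoretical mass, the same absolute error in daltons corresponds to fewer ppm at higher m/z, which is why ppm (rather than Da) is the conventional unit for accuracy specifications.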
The combination of high resolution and accuracy allows for the determination of elemental compositions with high confidence, a capability that is fundamental for identifying unknown compounds when reference standards are unavailable [15]. This is particularly valuable for emerging contaminants like per- and poly-fluoroalkyl substances (PFAS), where nontarget HRMS methods have led to the discovery of more than 750 PFASs belonging to more than 130 diverse classes in environmental samples, biofluids, and commercial products [15].
Effective NTA requires the integration of high-performance chromatographic separation with HRMS detection to manage sample complexity and reduce ion suppression. Liquid chromatography (LC), particularly reversed-phase liquid chromatography (RPLC), coupled with HRMS has emerged as a prominent methodology for analyzing complex mixtures [16]. The chromatographic step separates compounds based on their chemical properties before they enter the mass spectrometer, reducing matrix effects and allowing for the detection of co-eluting isomers that would be indistinguishable by MS alone.
The retention time (RT) of a compound provides valuable supplementary information for identification. While not a direct HRMS parameter, RT can be predicted from chemical structure using quantitative structure-retention relationship (QSRR) models and used as an additional filter to increase identification confidence [16]. Recent advancements have focused on developing calibrant-free predicted retention time indices (RTIs) through machine learning models to enhance identification probability without the need for extensive calibration standards [16]. For high-quality data, laboratories should monitor critical chromatographic parameters including resolution, peak shape, and retention time reproducibility across samples; in one reported QC scheme, more than 94% of compounds showed less than 20% relative standard deviation in peak height [13].
Tandem mass spectrometry (MS/MS or MSn) provides structural information critical for confident compound identification in NTA [1]. In MS/MS mode, precursor ions are isolated and fragmented through collisions with gas molecules, producing fragment ions that reveal structural characteristics of the original molecule [13]. The resulting fragmentation patterns serve as molecular fingerprints that can be matched against experimental or in silico reference spectra.
The acquisition of MS/MS data in NTA can follow either data-dependent acquisition (DDA) or data-independent acquisition (DIA) approaches. DDA selects the most abundant ions from the survey scan for fragmentation, providing clean spectra but potentially missing lower-abundance compounds. DIA fragments all ions within predefined mass windows, ensuring comprehensive coverage but producing more complex spectra that require advanced computational deconvolution [14]. For unknown identification, MS/MS spectra are searched against spectral libraries using similarity matching algorithms, though library coverage remains a challenge, with molecular networking and in silico fragmentation prediction serving as complementary strategies [13].
Table 1: Key HRMS Instrument Types and Their Characteristics in NTA
| Instrument Type | Typical Resolution | Mass Accuracy (ppm) | Strengths in NTA | Limitations |
|---|---|---|---|---|
| Time of Flight (TOF) | 40,000-100,000 | 1-5 ppm | Fast acquisition speed, wide dynamic range | Requires frequent calibration for high accuracy |
| Orbitrap | 100,000-500,000+ | 1-3 ppm | High resolution and accuracy, stable mass calibration | Slower acquisition at highest resolutions |
| FT-ICR | 1,000,000+ | <1 ppm | Ultra-high resolution, exceptional mass accuracy | High cost, complex operation, limited accessibility |
The end-to-end NTA workflow encompasses multiple stages from sample preparation to final reporting, with each stage requiring careful execution to ensure data quality and interpretability. The workflow can be visualized as a connected process of sequential stages:
The initial stage involves sample preparation and quality control, where implementing robust quality assurance/quality control (QA/QC) procedures is essential for generating trustworthy data [13]. This includes using non-targeted standard quality control (NTS/QC) mixtures containing compounds covering a wide range of physicochemical properties to monitor critical data quality parameters such as mass accuracy (typically within 3 ppm), isotopic ratio accuracy, and peak height reproducibility [13]. HRMS data acquisition follows, employing either data-dependent or data-independent approaches to comprehensively capture the chemical composition of samples.
Feature detection and peak picking algorithms then process the raw HRMS data to detect chromatographic peaks and extract relevant information including m/z values, retention times, and intensities [14]. The subsequent compound annotation and identification phase represents the core interpretive challenge, where multiple lines of evidence are combined to assign chemical structures to detected features [1]. Finally, prioritization and confirmation steps help focus efforts on the most relevant compounds, followed by comprehensive data interpretation and reporting to translate findings into actionable insights [17].
Data processing in NTA converts raw instrument data into chemically meaningful information through a multi-step procedure. Feature detection algorithms identify chromatographic peaks and assemble related ions (isotopes, adducts, and fragments) into features representing unique chemical entities [14]. This step requires careful parameter optimization to balance sensitivity (detecting true features) and specificity (avoiding false positives from noise).
Once features are detected, compound annotation begins with molecular formula assignment based on accurate mass measurements and isotopic patterns. The number of possible molecular formulas grows rapidly with increasing mass and with poorer mass accuracy, highlighting the critical importance of high mass accuracy in HRMS instruments [11]. For example, a mass measurement of 300.1000 Da with 5 ppm accuracy could correspond to dozens of plausible molecular formulas, while the same mass with 1 ppm accuracy might yield only a few possibilities.
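The narrowing effect of tighter tolerances can be made concrete with a hypothetical candidate list (the labels and masses below are illustrative, not curated formula values):

```python
def ppm_window(mz: float, tol_ppm: float) -> tuple[float, float]:
    """Absolute m/z interval corresponding to a ppm tolerance."""
    delta = mz * tol_ppm * 1e-6
    return mz - delta, mz + delta

# Hypothetical candidates near m/z 300.1000 (labels and masses illustrative)
candidates = {
    "candidate_A": 300.0999,
    "candidate_B": 300.0905,
    "candidate_C": 300.1150,
    "candidate_D": 300.1004,
}

def hits(tol_ppm: float) -> list[str]:
    """Candidates whose mass falls inside the ppm window around 300.1000."""
    lo, hi = ppm_window(300.1000, tol_ppm)
    return [name for name, m in candidates.items() if lo <= m <= hi]

print(hits(5.0))   # two candidates survive a 5 ppm window (about ±0.0015 Da)
print(hits(1.0))   # only one survives a 1 ppm window (about ±0.0003 Da)
```

At real masses the candidate pool also shrinks with isotopic-pattern filters and heuristic element-ratio rules, but the ppm window is the first and sharpest cut.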
Structural elucidation leverages MS/MS fragmentation data through spectral matching against reference databases. The Universal Library Search Algorithm (ULSA) and similar tools compare experimental spectra to reference libraries, generating matching scores that indicate similarity [16]. However, library coverage remains incomplete, necessitating complementary approaches including molecular networking which groups compounds based on spectral similarity to identify structurally related compounds, and in silico fragmentation which predicts MS/MS spectra for candidate structures to expand identification capabilities beyond library coverage [13].
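Spectral matching scores are commonly variants of cosine similarity between peak-matched spectra. A minimal sketch (greedy peak pairing within an m/z tolerance; this is a simplification, not the ULSA algorithm itself):

```python
import math

def cosine_score(spec_a, spec_b, tol=0.01):
    """Cosine similarity between two centroided spectra given as
    (m/z, intensity) lists, pairing peaks greedily within an m/z tolerance."""
    matched, used = [], set()
    for mz_a, int_a in spec_a:
        for j, (mz_b, int_b) in enumerate(spec_b):
            if j not in used and abs(mz_a - mz_b) <= tol:
                matched.append((int_a, int_b))
                used.add(j)
                break
    num = sum(a * b for a, b in matched)
    norm_a = math.sqrt(sum(i ** 2 for _, i in spec_a))
    norm_b = math.sqrt(sum(i ** 2 for _, i in spec_b))
    return num / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Illustrative query and reference spectra (values are made up)
query = [(91.054, 100.0), (119.049, 45.0), (147.044, 20.0)]
ref   = [(91.055, 98.0), (119.050, 50.0), (147.043, 18.0)]
print(round(cosine_score(query, ref), 3))   # close to 1 for near-identical spectra
```

Production tools refine this core idea with intensity weighting, optimal (rather than greedy) peak assignment, and neutral-loss matching, but the score returned is interpreted the same way: values near 1 indicate strong spectral agreement.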
A critical aspect of NTA data interpretation is establishing confidence levels for compound identifications, as NTA data are inherently less certain than targeted data [14]. The scientific community has developed reporting standards that categorize identifications into different confidence levels based on the supporting evidence:
Table 2: Confidence Levels for Compound Identification in NTA
| Confidence Level | Required Evidence | Typical Uncertainty | Reporting Considerations |
|---|---|---|---|
| Level 1: Confirmed Structure | Match to reference standard using RT and MS/MS spectrum | Minimal | Considered definitive identification |
| Level 2: Probable Structure | Library spectrum match or in silico evidence | Moderate | Structure is plausible but not confirmed |
| Level 3: Tentative Candidate | Diagnostic evidence (e.g., class-specific fragments) | High | Class may be known but not exact structure |
| Level 4: Unknown Feature | Molecular formula or m/z only | Very High | Insufficient evidence for structural assignment |
This framework acknowledges that in NTA, unlike targeted analysis, if an analyst reports that a chemical is present in a sample, it may actually be absent (e.g., it may be an isomer or an incorrect identification), and if an analyst reports that a chemical is not present, it may actually be present but not correctly identified during data processing [14]. This inherent uncertainty necessitates careful reporting and interpretation of NTA results, with clear communication of confidence levels to stakeholders.
Machine learning (ML) approaches are revolutionizing NTA by enhancing identification confidence and streamlining data interpretation workflows. ML models can leverage multiple data dimensions including MS/MS spectra, retention time information, and molecular descriptors to improve the discrimination between true positive and false positive identifications [1] [16]. One innovative approach introduces class probability of true positives (P(TP)) as a metric that leverages data from MS/MS spectra and calibrant-free predicted retention time indices through multiple ML models to enhance identification probability [16].
A demonstrated implementation involves three sequential ML models: first, a molecular fingerprint-based regression model that correlates molecular fingerprints to retention time indices; second, a cumulative neutral loss-based regression model that predicts expected RTI values using experimental MS/MS spectra; and finally, a binary classification model that integrates information from both retention and m/z domains to calculate P(TP) for each matched reference spectrum [16]. This approach has shown significant improvements in identification probability, with reported increases of 54.5%, 52.1%, and 46.7% for pesticides spiked in blank, 10× diluted, and 100× diluted tea matrices, respectively, compared to library matching alone [16].
The application of machine learning in NTA follows a structured workflow that integrates traditional analytical data with computational predictions:
The workflow begins with input data collection including MS/MS spectra, retention times, and m/z values. The first model (MF-to-RTI) uses random forest regression to correlate molecular fingerprints to true RTI values of calibrants, trained on diverse chemical structures to ensure coverage of structural diversity [16]. The second model (CNL-to-RTI) employs cumulative neutral loss masses as features to predict expected RTI values using experimental MS/MS spectra from known compounds, leveraging the discriminative power of fragmentation patterns [16].
The third model (binary classification) integrates features from both RTI and m/z domains, including RTI error between values derived from the first two models, monoisotopic mass, and parameters obtained from spectral matching algorithms [16]. This model calculates the probability of true positive (P(TP)) for each matched reference spectrum, with larger RTI errors indicating true negative spectral matches while smaller errors correlate with true positive matches [16]. The final output provides an enhanced identification probability that incorporates multiple dimensions of evidence, significantly improving upon traditional spectral matching alone.
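The three-model cascade can be sketched end to end. This toy version substitutes nearest-neighbour lookups for the random forest regressors of the cited work, and the logistic weights in the final step are illustrative, not fitted; it is meant only to show how the pieces connect:

```python
import math

def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between binary fingerprints (sets of on-bits)."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def predict_rti_from_fingerprint(fp, training):
    """Model 1 stand-in: nearest-neighbour RTI from molecular fingerprints
    (the cited work uses random forest regression)."""
    return max(training, key=lambda rec: tanimoto(fp, rec[0]))[1]

def predict_rti_from_cnl(cnls, training):
    """Model 2 stand-in: nearest-neighbour RTI from cumulative neutral losses."""
    return max(training, key=lambda rec: len(cnls & rec[0]))[1]

def p_true_positive(rti_error, match_score, w_err=-0.05, w_score=4.0, bias=-1.0):
    """Model 3 stand-in: logistic combination of the RTI error and the
    spectral match score; weights here are illustrative, not fitted."""
    z = w_err * abs(rti_error) + w_score * match_score + bias
    return 1.0 / (1.0 + math.exp(-z))

# Toy training data: fingerprints as sets of on-bits, CNLs as rounded losses
fp_train = [({1, 2, 3}, 250.0), ({4, 5, 6}, 610.0)]
cnl_train = [({18, 28}, 240.0), ({44, 46}, 620.0)]

rti_mf = predict_rti_from_fingerprint({1, 2, 7}, fp_train)   # structure-based RTI
rti_cnl = predict_rti_from_cnl({18, 28, 17}, cnl_train)      # spectrum-based RTI
p_tp = p_true_positive(rti_mf - rti_cnl, match_score=0.92)
print(rti_mf, rti_cnl, round(p_tp, 2))
```

The essential logic survives the simplification: a small disagreement between the two independently predicted RTI values, combined with a high spectral match score, yields a high P(TP), while a large RTI discrepancy pulls the probability down even when the spectra match well.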
Assessing and communicating the performance of NTA methods presents unique challenges compared to targeted analyses. While targeted methods rely on well-established performance metrics for selectivity, sensitivity, accuracy, and precision, defining analogous metrics for NTA requires consideration of different study objectives [14]. Performance assessment in NTA typically focuses on three primary objectives: sample classification (distinguishing sample groups based on chemical patterns), chemical identification (confidently assigning structures to detected features), and chemical quantitation (estimating concentrations without reference standards) [14].
For chemical identification, performance can be evaluated using metrics derived from the confusion matrix, including recall (ability to correctly identify present compounds), precision (ability to avoid false identifications), and overall accuracy [14]. However, these metrics require knowledge of ground truth, which is often unavailable in true non-targeted applications. Alternative approaches include using identification confidence levels and reporting the distribution of identifications across these levels, or employing benchmark compounds with known presence/absence to characterize method performance [14].
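These metrics follow directly from confusion-matrix counts. A minimal sketch with hypothetical benchmark numbers (a spiked-sample study where ground truth is known):

```python
def identification_metrics(tp, fp, fn, tn):
    """Recall, precision, and accuracy from confusion-matrix counts."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return recall, precision, accuracy

# Hypothetical benchmark: 100 spiked compounds, 80 correctly identified and
# 20 missed; 10 false identifications among 50 true absences
r, p, a = identification_metrics(tp=80, fp=10, fn=20, tn=40)
print(r, p, a)   # 0.8, ~0.889, 0.8
```

Note that in true non-targeted applications the "true negative" count is ill-defined (the absent chemical universe is unbounded), which is why benchmark sets with known presence/absence are needed to compute these metrics at all.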
For quantitative NTA (qNTA), performance assessment becomes even more challenging due to the lack of reference standards for most compounds. Performance can be estimated using a set of chemical standards that represent different chemical classes, with metrics including accuracy (closeness to true concentration), precision (reproducibility across replicates), and linear dynamic range [14]. However, these metrics are necessarily limited to the available standards and may not represent performance for all detected compounds.
Implementing robust quality assurance and control (QA/QC) procedures is essential for generating reliable NTA data. Recommended practices include the routine analysis of NTS/QC mixtures covering diverse physicochemical properties, procedural blanks to characterize background contamination, and replicate injections to assess reproducibility [13].
Data quality parameters should be continuously monitored, including mass accuracy (typically within 3-5 ppm), retention time stability, intensity reproducibility, and chromatographic peak shape [13]. Any deviations beyond predefined thresholds should trigger investigation and potentially re-analysis of affected samples. These QA/QC measures help ensure that the complex data generated in NTA studies is trustworthy and fit for its intended purpose, whether for exploratory research or decision-support applications.
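Such threshold-based monitoring is straightforward to automate. A sketch that flags QC compounds against the 3 ppm mass-accuracy and 20% peak-height-RSD criteria cited above (the compound names, masses, and intensities are hypothetical):

```python
import statistics

def qc_flags(records, ppm_tol=3.0, rsd_tol=20.0):
    """Flag QC compounds whose mass error exceeds ppm_tol (ppm) or whose
    peak-height RSD across replicates exceeds rsd_tol (%)."""
    flags = {}
    for name, rec in records.items():
        err = (rec["measured_mz"] - rec["theoretical_mz"]) \
              / rec["theoretical_mz"] * 1e6
        rsd = statistics.stdev(rec["peak_heights"]) \
              / statistics.mean(rec["peak_heights"]) * 100
        issues = []
        if abs(err) > ppm_tol:
            issues.append(f"mass error {err:.1f} ppm")
        if rsd > rsd_tol:
            issues.append(f"peak-height RSD {rsd:.1f}%")
        if issues:
            flags[name] = issues
    return flags

# Hypothetical QC records (names and values are illustrative)
qc = {
    "qc_compound_A": {"theoretical_mz": 221.1267, "measured_mz": 221.1270,
                      "peak_heights": [1.00e6, 1.05e6, 0.98e6]},
    "qc_compound_B": {"theoretical_mz": 412.9664, "measured_mz": 412.9710,
                      "peak_heights": [2.1e5, 3.6e5, 1.2e5]},
}
print(qc_flags(qc))   # only qc_compound_B is flagged, on both criteria
```

Running such a check on every batch turns the qualitative "monitor data quality parameters" recommendation into an auditable pass/fail record that can accompany reported results.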
Table 3: Essential Research Reagents and Materials for HRMS-Based NTA
| Reagent/Material | Function in NTA Workflow | Key Considerations |
|---|---|---|
| NTS/QC Mixture | Quality control for data quality assessment | Should contain 80+ compounds covering diverse physicochemical properties (molecular weights 126–1100 Da, log Kow -8 to 8.5) [13] |
| HRMS Instrument | High-resolution mass measurement | Orbitrap, TOF, or FT-ICR systems with resolution >50,000 and mass accuracy <5 ppm [11] |
| Chromatography System | Compound separation before MS detection | UHPLC systems with C18 columns most common; method should resolve early eluting polar analytes [13] |
| Spectral Libraries | Reference for MS/MS spectrum matching | Combine commercial, public, and in-house libraries; recognize limitations in coverage [16] |
| Molecular Networking Tools | Grouping related compounds by spectral similarity | Identifies molecular families when reference spectra are unavailable [13] |
| Retention Time Prediction Models | Providing additional evidence for compound identification | Machine learning models trained on diverse chemical sets improve transferability [16] |
| In Silico Fragmentation Tools | Predicting MS/MS spectra for candidate structures | Expands identification beyond library coverage; domain of applicability is crucial [13] |
Effective interpretation of NTA data requires a solid understanding of core HRMS concepts including mass resolution, accuracy, chromatographic separation principles, and fragmentation pattern analysis. The integration of machine learning approaches with traditional analytical techniques significantly enhances identification confidence by leveraging multiple dimensions of chemical information. As the field continues to evolve, standardized performance assessment methods and robust quality control procedures will be essential for generating reliable, reproducible data that can support environmental monitoring, public health protection, and regulatory decision-making. By mastering these essential HRMS concepts and maintaining critical assessment of data quality and uncertainty, researchers can fully leverage the powerful capabilities of non-targeted analysis to discover and characterize novel chemicals in complex samples.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) represents a paradigm shift in analytical chemistry, enabling the detection and identification of unknown chemicals without a priori knowledge of sample composition. This technical guide explores the fundamental concept of "chemical space" in NTA, defining the theoretical and practical boundaries of what is detectable and identifiable within a given analytical workflow. We examine the key methodological parameters that define the "detectable space," present standardized workflows for chemical space mapping, and discuss advanced computational tools that enhance NTA capabilities. Understanding and communicating the coverage and limitations of NTA methods is critical for advancing environmental monitoring, exposomics research, and drug development applications, ultimately supporting more reliable and reproducible chemical exposure assessments.
The concept of "chemical space" in non-targeted analysis refers to the multidimensional domain of chemical properties that characterizes the constituents of a sample [18]. In principle, NTA can detect and identify a broad range of organic compounds across diverse matrices, but in practice, no single method can cover the entirety of the chemical universe, which encompasses >10⁶⁰ possible organic compounds [18]. Instead, each NTA method accesses specific domains of chemical space through a combination of sample preparation, instrumental analysis, and data processing techniques [18] [19].
The fundamental challenge in NTA lies in determining whether the non-detection of an analyte indicates its true absence above a detection limit or represents a false negative resulting from workflow limitations [18]. To address this, researchers have proposed mapping the "detectable space" of NTA methods—the region of chemical space where compounds are amenable to detection and identification given specific methodological constraints [18] [19]. This conceptual framework allows for better communication of method capabilities, more accurate interpretation of results, and direct comparison between different NTA workflows [18] [20].
In NTA methodology, chemical space is partitioned into three distinct regions [18]: the total chemical space, encompassing all compounds potentially present in a sample; the detectable space, comprising compounds amenable to detection given the method's sample preparation, instrumental, and data processing constraints; and the identifiable space, comprising the subset of detectable compounds for which sufficient confirmatory evidence can be obtained.
The relationship between these spaces is hierarchical; the identifiable space is necessarily a subset of the detectable space, as identification requires additional confirmatory data beyond mere detection [18].
Eight fundamental analytical parameters predominantly influence the region of chemical space accessible by an NTA method [18] [19]:
Table 1: Key Parameters Defining the Detectable Space in NTA
| Parameter Category | Specific Factors | Impact on Chemical Space |
|---|---|---|
| Sample Preparation | Sample matrix type, extraction solvent, extract pH, extraction/cleanup media, elution buffers | Determines which compounds are extracted from the matrix and prepared for analysis |
| Instrument Platform | LC-MS, GC-MS, ion mobility | Defines the physicochemical properties of amenable compounds (e.g., polarity, volatility) |
| Ionization Technique | Ionization type (ESI, APCI, EI), ionization mode (positive/negative) | Influences which compounds can be effectively ionized for detection |
| Separation | Chromatographic method, retention time | Separates complex mixtures into individual components |
These parameters work in concert to define the "method applicability domain" [18]. For example, LC-MS with electrospray ionization (ESI) is more amenable to polar, water-soluble compounds, while GC-MS with electron ionization (EI) better detects volatile, non-polar compounds [19]. The selection of extraction solvents and pH further narrows the chemical space by determining which compounds are efficiently recovered from specific sample matrices [18].
A generalized NTA workflow encompasses multiple stages, from sample preparation to compound identification, with each step influencing the final detectable space [21].
This workflow illustrates the comprehensive process for NTA, where each stage progressively refines the detectable chemical space [21]. Sample preparation methods (e.g., extraction solvents, pH adjustment, clean-up media) determine initial compound recovery [18]. Data acquisition parameters (chromatography, ionization, mass analysis) further constrain detectable compounds based on their physicochemical properties [18] [19]. Finally, data processing approaches (feature detection, annotation algorithms) influence which detected features are ultimately identified [21].
The proposed Chemical Space Tool (ChemSpaceTool) would implement a systematic approach to defining the detectable space of any NTA method through sequential filtering [18].
This conceptual framework pares down the vast chemical universe into an Amenable Compound List (ACL) specific to a given methodology [18]. Each filtering step corresponds to a key methodological decision that excludes compounds outside the operable parameters. The resulting ACL represents the plausible detectable space and can be used both prospectively (to guide method selection) and retrospectively (to filter identification candidates) [18].
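The sequential-filtering idea behind an ACL can be sketched in a few lines of Python. The platform rules, property thresholds, and compound records below are illustrative assumptions for this sketch, not values taken from the ChemSpaceTool proposal:

```python
# Illustrative sketch of sequential chemical-space filtering to build an
# Amenable Compound List (ACL). Thresholds and platform rules are hypothetical.

def build_acl(compounds, platform="LC-ESI+"):
    """Sequentially filter a candidate list by method constraints."""
    filters = {
        # LC-ESI+ favors polar-to-moderately-polar, positively ionizable compounds
        "LC-ESI+": lambda c: -2.0 <= c["log_kow"] <= 5.0 and c["ionizable_pos"],
        # GC-EI favors volatile, non-polar compounds
        "GC-EI":   lambda c: c["log_kow"] >= 2.0 and c["volatile"],
    }
    amenable = [c for c in compounds if filters[platform](c)]
    # Further restrict by the instrument's scanned mass range (assumed here)
    return [c for c in amenable if 100 <= c["monoisotopic_mass"] <= 1200]

candidates = [
    {"name": "caffeine", "log_kow": -0.07, "ionizable_pos": True,
     "volatile": False, "monoisotopic_mass": 194.0804},
    {"name": "hexachlorobenzene", "log_kow": 5.73, "ionizable_pos": False,
     "volatile": True, "monoisotopic_mass": 281.8131},
]

acl = build_acl(candidates, platform="LC-ESI+")
print([c["name"] for c in acl])  # caffeine passes; hexachlorobenzene does not
```

Run prospectively, the same filter set predicts which candidates a planned method could plausibly detect; run retrospectively, it prunes implausible identification candidates.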
Various software tools have been developed to implement NTA workflows, each with distinct capabilities and limitations:
Table 2: Software Tools for Non-Targeted Analysis
| Software Tool | Type | Key Features | Applications |
|---|---|---|---|
| patRoon | Open-source (R) | Comprehensive workflow, environmental focus, combines multiple algorithms | Environmental monitoring, suspect screening [21] |
| Compound Discoverer | Commercial | Vendor-specific optimizations, user-friendly interface | General NTA, metabolomics [19] |
| MetFrag | Open-source | In silico fragmentation, compound identification | Structure elucidation [21] |
| SIRIUS/CSI:FingerID | Open-source | Molecular formula identification, structure database searching | Unknown compound identification [21] |
| XCMS | Open-source | Feature detection, peak alignment, statistical analysis | Metabolomics, exposomics [21] |
| MZmine | Open-source | Modular pipeline, visualization, vendor-neutral | General NTA, metabolomics [21] |
These tools help researchers navigate the complex data generated in NTA studies. Open-source platforms like patRoon provide tailored functionality for environmental applications and allow integration of various algorithms [21]. Commercial software often offers more user-friendly interfaces and vendor-specific optimizations but may limit data transparency and sharing [19].
Machine learning (ML) approaches are increasingly applied throughout NTA workflows to enhance chemical space characterization [1]. ML algorithms improve compound identification through better prediction of retention times, collision cross-section values, and mass fragmentation patterns [1] [22]. These models can also prioritize features for identification based on likelihood of detection and potential toxicological concern [1]. Furthermore, ML techniques enable quantitative structure-property relationship modeling to predict a compound's amenability to specific analytical methods based on its physicochemical properties [1].
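As a minimal illustration of one such model, the sketch below fits a retention-time predictor from two descriptors (log Kow and molecular weight) by ordinary least squares. The training data are fabricated, and real NTA workflows use far richer descriptor sets and non-linear learners:

```python
import numpy as np

# Minimal sketch of retention-time prediction from molecular descriptors
# using ordinary least squares. Descriptor values and retention times below
# are fabricated for illustration.

# Columns: [log_kow, molecular_weight]; target: retention time (min)
X_train = np.array([[0.5, 180.0], [2.1, 250.0], [3.8, 310.0], [5.0, 400.0]])
y_train = np.array([2.8, 6.1, 9.5, 12.9])

# Add an intercept column and solve the least-squares problem
A = np.hstack([X_train, np.ones((len(X_train), 1))])
coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)

def predict_rt(log_kow, mw):
    return float(coef[0] * log_kow + coef[1] * mw + coef[2])

# A predicted RT far from the observed RT argues against a candidate structure
rt_pred = predict_rt(2.5, 270.0)
rt_obs = 7.0
print(f"predicted {rt_pred:.1f} min; |error| = {abs(rt_pred - rt_obs):.1f} min")
```

In an identification workflow, the prediction serves as orthogonal evidence: candidates whose predicted retention time deviates strongly from the observed value can be down-ranked or rejected.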
Table 3: Essential Materials and Tools for NTA Experiments
| Category | Specific Examples | Function in NTA Workflow |
|---|---|---|
| Extraction Media | HLB (hydrophilic-lipophilic balance) sorbents, C18 silica, ion-exchange resins | Isolate and concentrate analytes from complex matrices [18] |
| Chromatography Columns | Reverse-phase C18, HILIC, GC capillary columns | Separate complex mixtures into individual components [18] |
| Ionization Sources | Electrospray ionization (ESI), atmospheric pressure chemical ionization (APCI), electron ionization (EI) | Generate gas-phase ions for mass analysis [19] |
| Mass Analyzers | Orbitrap, time-of-flight (TOF), quadrupole | Provide high-resolution mass measurements for molecular formula assignment [22] [19] |
| Data Processing Tools | patRoon, XCMS, MZmine, Compound Discoverer | Extract features, align chromatograms, and annotate compounds [21] [19] |
| Chemical Databases | PubChem, CompTox, NIST, mzCloud | Support compound identification through spectral matching [21] |
The detectable chemical space varies significantly across different analytical platforms and application areas. A review of NTA literature revealed that 51% of studies use only LC-HRMS, 32% use only GC-HRMS, and 16% use both platforms to expand chemical coverage [19]. This distribution reflects the complementary nature of these techniques, with LC-HRMS better suited for polar, thermally labile compounds and GC-HRMS more appropriate for volatile, non-polar analytes [19].
In environmental applications, the frequently detected chemical classes reflect both environmental prevalence and methodological biases [19]. Per- and polyfluoroalkyl substances (PFAS) and pharmaceuticals predominate in water samples due to their polarity and ionization efficiency in LC-ESI-MS [19]. Pesticides and polyaromatic hydrocarbons are more common in soil and sediment analyses, while flame retardants and plasticizers feature prominently in dust and consumer product studies [19]. In human biospecimens, plasticizers, pesticides, and halogenated compounds are frequently detected, reflecting exposure patterns and analytical considerations [19].
Characterizing the chemical space and detectable coverage of NTA methods is fundamental to producing reliable, interpretable, and comparable results across studies. The proposed frameworks for chemical space mapping, including the ChemSpaceTool concept, provide systematic approaches to define method boundaries and communicate capabilities [18]. As NTA continues to evolve, integration of machine learning approaches [1], development of open-source computational platforms [21], and community-wide standardization efforts [20] will enhance our ability to navigate the chemical exposome. Understanding the detectable space of NTA methods enables researchers to make informed decisions about method selection, appropriately interpret negative findings, and advance toward more comprehensive chemical exposure assessment.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) represents a paradigm shift from traditional targeted methods, enabling researchers to characterize the chemical composition of complex samples without a priori knowledge of their content [19] [20]. This discovery-based approach generates immensely complex datasets that require sophisticated data processing and interpretation strategies. The fundamental challenge lies in accurately reducing raw instrumental data into meaningful chemical information, a process that hinges on precise terminology and standardized workflows [23] [20]. Unlike targeted analysis, which focuses on specific predefined chemicals, NTA aims to capture a broader "chemical space" – the conceptual collection of all possible chemicals within a sample, limited only by methodological choices [19] [3]. Within this framework, three critical concepts form the foundation of data interpretation: features (the raw observations from instrumentation), annotations (the attribution of chemical characteristics to these features), and identifications (the confident assignment of a specific chemical structure) [23]. This technical guide establishes precise definitions, methodologies, and confidence assessment protocols for these core terminologies, providing researchers with a standardized framework for NTA data interpretation within drug development and related chemical research fields.
In NTA, a feature represents the primary signal entity detected during data processing. Specifically, a feature is defined as a set of grouped, associated m/z-retention time pairs (mz@RT) that represent MS1 components for an individual compound, which may include isotopologues, adducts, and in-source product ion m/z peaks [23] [20]. Where no such associations exist, a feature may simply be a single mz@RT pair. This grouping is crucial as it distinguishes signals arising from a single chemical entity from the thousands of individual data points collected by the HRMS instrument [23]. The process of feature formation involves several computational steps that transform raw mass spectral data into these defined chemical signals, which subsequently serve as the substrates for all further annotation and identification efforts.
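The grouping logic can be illustrated with a toy sketch that associates co-eluting peaks with a base peak via known isotopologue and adduct mass offsets. The peak list and tolerances are hypothetical; the mass offsets are standard values:

```python
# Sketch of grouping co-eluting mz@RT pairs into a single feature.
# The peak list and tolerances are fabricated for illustration.

C13_SPACING = 1.003355   # 13C isotopologue spacing (Da)
NA_MINUS_H = 21.981944   # [M+Na]+ vs [M+H]+ mass difference (Da)

def group_feature(peaks, rt_tol=0.05, mz_tol=0.005):
    """Group peaks that co-elute with the most intense peak and match
    known isotopologue/adduct mass offsets relative to it."""
    base = max(peaks, key=lambda p: p["intensity"])
    feature = {"base": base, "isotopologues": [], "adducts": []}
    for p in peaks:
        if p is base or abs(p["rt"] - base["rt"]) > rt_tol:
            continue
        delta = p["mz"] - base["mz"]
        if abs(delta - C13_SPACING) <= mz_tol:
            feature["isotopologues"].append(p)
        elif abs(delta - NA_MINUS_H) <= mz_tol:
            feature["adducts"].append(p)
    return feature

peaks = [
    {"mz": 195.0877, "rt": 4.21, "intensity": 1.0e6},  # [M+H]+ of caffeine
    {"mz": 196.0911, "rt": 4.21, "intensity": 8.0e4},  # 13C isotopologue
    {"mz": 217.0697, "rt": 4.22, "intensity": 2.5e5},  # [M+Na]+
    {"mz": 301.1410, "rt": 6.80, "intensity": 5.0e5},  # unrelated peak
]
f = group_feature(peaks)
print(len(f["isotopologues"]), len(f["adducts"]))  # 1 1
```

The three co-eluting peaks collapse into one feature, while the unrelated peak at a different retention time is left out, which is exactly the data reduction the feature concept formalizes.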
Annotation represents the next critical step in NTA data interpretation, defined as the attribution of one or more properties or molecular characteristics to an MS1 feature or its components (such as isotopologues or adducts), or to MS/MS product ions [23]. It is essential to recognize that annotations provide evidence but do not typically constitute sufficient proof to confidently identify a single compound. Examples of common annotations include: designation of an observed mz@RT as a specific adduct (e.g., [M+H]+, [M+Na]+), assignment of a molecular formula to a feature or an MS/MS product ion, and assignment of a suggested substructure to an MS/MS product ion [23]. Annotations thus represent hypothetical assignments that require further evidence before progressing to confident identification.
Identification constitutes the highest level of confidence in NTA workflows, occurring when the annotated components, features, and/or product ions collectively provide sufficient evidence to attribute a specific compound to the detected feature, within a stated identification scope or confidence level [23]. The key distinction from annotation lies in the evidentiary standard – identification requires multiple lines of concordant evidence that collectively point to a single chemical structure with high confidence. This evidence hierarchy typically includes retention time matching with authentic standards, spectral library matching, and consistent fragmentation patterns, among other confirmatory data [23] [14].
Table 1: Comparative Definitions of Core NTA Terminology
| Term | Formal Definition | Key Characteristics | Examples |
|---|---|---|---|
| Feature | A set of grouped mz@RT pairs representing MS1 components for an individual compound [23] [20] | Represents raw instrumental observations; grouped signals from the same compound; foundation for further analysis | Set of m/z peaks for isotopologues; group of adduct signals; single mz@RT where no grouping exists |
| Annotation | Attribution of properties/molecular characteristics to features or their components [23] | Provides evidence but is not conclusive; represents a hypothetical assignment; multiple annotations possible per feature | Adduct designation ([M+H]+); molecular formula assignment; substructure assignment from MS/MS |
| Identification | Confident attribution of a specific compound to a detected feature [23] | Requires sufficient evidence; states confidence level/scope; represents a conclusive assignment | Match to authentic standard (RT, MS/MS); Level 1 confidence identification; definitive structure elucidation |
The transformation of raw HRMS data into meaningful features follows a multi-step computational workflow that intentionally reduces data complexity while preserving chemically relevant information. This data processing segment encompasses all steps that transform raw data into meaningful information prior to annotation and identification efforts, with inputs being raw or converted data files and outputs being lists of features in each sample with associated chromatography, MS, and MS/MS data [23]. As detailed in Table 2, these steps include both fundamental signal processing and advanced grouping algorithms that collectively enable the detection and definition of molecular features.
Table 2: Key Data Processing Steps in NTA Workflows
| Processing Step | Description | Purpose | Common Algorithms/Software |
|---|---|---|---|
| Initial m/z Detection | Selection of unique mz@RT pairs from raw data | Identify potential signals of interest | Vendor software, MZmine, MS-DIAL |
| Retention Time Alignment | Modifies RTs within a dataset based on representative compounds | Correct for chromatographic shift between runs | Cross-sample correlation algorithms |
| Isotopologue Grouping | Groups mz@RT pairs representing isotopologues of same compound | Reduce redundancy; link related signals | Isotope pattern recognition |
| Adduct Grouping | Groups mz@RT pairs representing adducts of same compound | Consolidate signals from same chemical entity | Adduct rule application |
| Between-Sample Alignment | Comparison of features across multiple samples | Identify same feature in different samples | MZ/RT window matching |
| Gap-Filling | Detection of features missed during initial selection | Improve comprehensiveness of feature detection | Recursive peak extraction |
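As a simplified illustration of the between-sample alignment step in Table 2, the sketch below matches features across two samples using m/z and RT windows. The tolerances and feature lists are illustrative, and production tools typically correct retention-time drift before matching:

```python
# Sketch of between-sample feature alignment by m/z and RT window matching.
# Tolerances and feature lists (mz, rt) are fabricated for illustration.

def align(features_a, features_b, mz_tol=0.005, rt_tol=0.1):
    """Match each feature in sample A to at most one feature in sample B."""
    matched, used = [], set()
    for fa in features_a:
        best, best_score = None, None
        for i, fb in enumerate(features_b):
            if i in used:
                continue
            d_mz, d_rt = abs(fa[0] - fb[0]), abs(fa[1] - fb[1])
            if d_mz <= mz_tol and d_rt <= rt_tol:
                score = d_mz / mz_tol + d_rt / rt_tol  # lower is better
                if best_score is None or score < best_score:
                    best, best_score = i, score
        if best is not None:
            used.add(best)
            matched.append((fa, features_b[best]))
    return matched

sample_a = [(195.0877, 4.21), (301.1410, 6.80)]
sample_b = [(195.0879, 4.25), (450.2001, 9.10)]
pairs = align(sample_a, sample_b)
print(len(pairs))  # only the 195.09 feature aligns across the two samples
```

Features with no counterpart within the windows remain sample-specific and become candidates for the gap-filling step.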
The following diagram illustrates the complete NTA workflow, highlighting the critical pathway from raw data acquisition through feature detection, annotation, and finally to confident identification, with key decision points and confidence assessment stages.
Robust quality assurance procedures are essential throughout the NTA workflow to ensure reliable results. Unlike targeted analyses with well-established validation frameworks, NTA requires specialized approaches to performance assessment [14]. Key considerations include: implementing blank samples to identify and subtract contamination; using quality control spikes to monitor instrument performance and feature detection rates; employing pooled quality control samples to assess reproducibility; and utilizing internal standards to evaluate extraction efficiency and matrix effects [23] [14]. For performance assessment, promising approaches include using the confusion matrix for qualitative study outputs (sample classification and chemical identification) and adapting estimation procedures from targeted methods for quantitative applications, with consideration for additional sources of uncontrolled experimental error [14]. These procedures help address the inherent uncertainties in NTA, where false positives (reporting a chemical present when it is actually absent) and false negatives (failing to detect a present chemical) can significantly impact data interpretation and subsequent decision-making [14].
Confidence in chemical identification exists on a spectrum, and standardized levels have been established to communicate the degree of certainty unambiguously. The highest confidence (Level 1) requires confirmation with an authentic chemical standard analyzed under identical experimental conditions, providing matching retention time and MS/MS spectrum [23] [14]. Level 2 identification, considered a probable structure, requires compelling evidence such as a library spectrum match without retention time confirmation or characteristic fragmentation patterns indicative of a specific compound class [14]. Level 3 confidence, representing tentative candidates, applies when the available evidence (for example, a molecular formula with one or more possible structures) is insufficient for confident structural elucidation [23]. For features without structural information (Level 4), the chemical identity remains unknown but distinguishable based on accurate mass and spectral data [14]. This tiered confidence framework enables researchers to appropriately communicate the certainty of their findings and prevents overinterpretation of insufficient data.
Comprehensive reporting of experimental details is essential for interpreting NTA results and assessing their reliability. The Benchmarking and Publications for Non-Targeted Analysis (BP4NTA) Working Group has established guidelines to improve transparency and reproducibility across NTA studies [20]. Critical reporting elements include: complete description of sample preparation procedures; detailed instrumentation parameters and data acquisition methods; comprehensive documentation of data processing steps and software parameters with version information; clear description of annotation and identification criteria including scoring thresholds; and full disclosure of quality assurance/quality control measures and results [23] [20]. For suspect screening analyses, researchers must report the specific suspect list used, its source, version, and size, while for true NTA, the approaches for unknown compound identification should be thoroughly documented [3]. These reporting standards enable proper evaluation of study limitations, facilitate inter-laboratory comparisons, and support the growing integration of NTA data into regulatory decision-making frameworks.
The computational demands of NTA necessitate specialized software tools for data processing, annotation, and identification. Both commercial and open-source options are available, each with distinct strengths and applications. As evidenced by literature reviews, vendor-specific software (such as Thermo Compound Discoverer and Agilent MassHunter) is currently used in most studies (approximately 57 of 76 reviewed), while open-source platforms (including MZmine, MS-DIAL, and Cardinal) offer flexible, customizable alternatives [19] [24]. The selection of appropriate software depends on multiple factors including instrumental platform, sample type, study objectives, and computational resources. As shown in Table 3, these tools encompass the complete NTA workflow from raw data processing to final identification.
Table 3: Essential Research Reagents and Computational Tools for NTA
| Tool Category | Specific Examples | Primary Function | Application in NTA Workflow |
|---|---|---|---|
| Data Conversion Tools | ProteoWizard MSConvert, Reifycs Abf Converter | Convert proprietary vendor files to open formats | Pre-processing: Enables cross-platform data analysis |
| Commercial Processing Software | Compound Discoverer, MassHunter | Comprehensive workflow management | Data Processing: Feature detection, alignment, annotation |
| Open-Source Processing Platforms | MZmine, MS-DIAL, XCMS | Flexible data processing and analysis | Data Processing: Customizable workflows for feature detection |
| Spectral Libraries | NIST, mzCloud, MassBank | Reference spectra for compound matching | Annotation & Identification: Spectral comparison and matching |
| Visualization Tools | QUIMBI, Cardinal, SCiLS | Interactive data exploration and visualization | Data Interpretation: Spatial distribution, spectral analysis |
| In Silico Prediction Tools | CFM-ID, MetFrag, SIRIUS | Predict fragmentation spectra and structures | Annotation: Generate hypotheses for unknown compounds |
While NTA aims to detect unknowns, reference standards remain essential for method validation, retention time calibration, and confident identification. Key categories include: isotope-labeled internal standards for quality control and semi-quantification; chemical class-specific standards for evaluating method performance for particular compound classes; retention time index standards for chromatographic alignment; and authentic chemical standards for definitive confirmation of identifications (Level 1) [14]. The strategic use of these materials throughout the NTA workflow, from initial method development to final confirmation, significantly enhances data quality and reliability.
The precise definitions and standardized application of the terms feature, annotation, and identification form the critical foundation for rigorous non-targeted analysis using high-resolution mass spectrometry. Features represent the fundamental observations detected through data processing; annotations provide interpretive hypotheses about chemical characteristics; and identifications constitute confident assignments of specific chemical structures based on sufficient evidence. As NTA methodologies continue to evolve and expand into new application areas including drug development, environmental monitoring, and exposomics research, consistent terminology and reporting standards become increasingly vital for scientific communication and data interoperability [19] [20] [14]. The frameworks, workflows, and confidence assessments presented in this technical guide provide researchers with a standardized approach for NTA data interpretation that promotes transparency, reproducibility, and appropriate confidence in analytical results. Through the continued adoption and refinement of these standards by the scientific community, NTA will increasingly deliver on its potential to comprehensively characterize complex chemical mixtures and uncover previously unrecognized chemical exposures and transformations.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful discovery tool for identifying unknown and unexpected chemicals across diverse sample matrices, from environmental samples to biological specimens [19]. Unlike targeted methods that provide definitive, quantitative data for predefined analytes, NTA generates information-rich data with inherent uncertainties that prevent its widespread acceptance by decision-makers [14]. If an analyst reports that a chemical is present in a sample, it may actually be absent; if reported absent, it may be present; and if a concentration is reported, the true value could be orders of magnitude different [14]. This technical guide examines the fundamental sources of uncertainty in NTA workflows and presents structured strategies for their management, enabling researchers to communicate data reliability effectively within chemical exposure and drug development research.
Uncertainty in NTA stems from multiple interconnected sources throughout the analytical workflow. These can be categorized into three primary domains: analytical, data processing, and identification uncertainties.
The initial stage of NTA introduces significant uncertainties through sample preparation and instrumental analysis. The "detectable chemical space" – the subset of chemicals ultimately observed – is heavily influenced by eight key analytical considerations: sample matrix type, extraction solvent, pH, extraction/cleanup media, elution buffers, instrument platform, ionization type, and ionization mode [19]. For instance, the choice between liquid chromatography (LC) and gas chromatography (GC) platforms fundamentally alters detectable chemicals, with LC being more amenable to water-soluble compounds with polar functional groups, while GC better captures non-polar, volatile compounds [19]. This methodological bias represents a fundamental uncertainty in comprehensive exposome characterization.
Table 1: Analytical Techniques and Their Influence on Detectable Chemical Space
| Analytical Technique | Chemical Space Bias | Common Ionization Modes | Frequency of Use in NTA Studies |
|---|---|---|---|
| LC-HRMS | Polar, water-soluble compounds | ESI+, ESI-, APCI | 51% (LC-only); 16% (combined with GC) |
| GC-HRMS | Non-polar, volatile compounds | EI, CI | 32% (GC-only); 16% (combined with LC) |
| Direct Injection HRMS | No chromatographic separation | Various | 1% |
The transformation of raw HRMS data into meaningful chemical information introduces computational uncertainties through peak picking, feature alignment, and database searching. Most NTA studies (approximately 57 of 76 reviewed studies) rely on vendor software (e.g., Thermo Compound Discoverer, Agilent MassHunter), while only a minority use open-source platforms (e.g., MZmine, MS-DIAL) [19]. This software dependency creates consistency challenges as different algorithms employ distinct parameters and confidence metrics. The feature detection process must distinguish true chemical signals from instrumental noise, with variations in sensitivity thresholds directly affecting reported chemical spaces. Similarly, molecular formula assignment from accurate mass measurements carries uncertainty, particularly for compounds with complex isotopic patterns or those containing unusual elemental compositions.
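A minimal sketch of molecular-formula assignment by accurate mass follows, using exact monoisotopic masses and a ppm tolerance. The two candidate formulas are illustrative: real tools enumerate candidates exhaustively and additionally score isotope patterns and elemental plausibility:

```python
# Sketch of candidate molecular-formula assignment by accurate mass.
# Monoisotopic masses are standard values; the candidate list is a tiny,
# fabricated example.

MONO = {"C": 12.0, "H": 1.0078250319, "N": 14.0030740052, "O": 15.9949146221}

def formula_mass(formula):
    """Neutral monoisotopic mass from an element-count dict."""
    return sum(MONO[el] * n for el, n in formula.items())

def ppm_error(measured, theoretical):
    return (measured - theoretical) / theoretical * 1e6

def assign(measured_neutral_mass, candidates, ppm_tol=5.0):
    hits = []
    for name, f in candidates:
        err = ppm_error(measured_neutral_mass, formula_mass(f))
        if abs(err) <= ppm_tol:
            hits.append((name, round(err, 2)))
    return hits

candidates = [
    ("C8H10N4O2 (caffeine)", {"C": 8, "H": 10, "N": 4, "O": 2}),
    ("C9H10N2O3",            {"C": 9, "H": 10, "N": 2, "O": 3}),
]
print(assign(194.0805, candidates))  # only the caffeine formula is within 5 ppm
```

Note that the rejected decoy differs by only about 0.01 Da, which illustrates why high mass accuracy is central to narrowing candidate formulas.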
The most significant uncertainty in NTA lies in compound identification, where multiple orthogonal criteria are needed to establish confidence. The absence of universal standards for unknown identification means that, unlike targeted methods, NTA cannot provide unambiguous links between detected features and specific chemical structures [14]. Research indicates substantial variability in identification confidence, with many studies relying on level 2 or level 3 identifications (probable or tentative structures based on spectral library matching or in silico evidence, without reference standards) rather than level 1 (confirmed with reference standards) [19]. This uncertainty is compounded by the presence of isomeric compounds that may share virtually identical mass spectra but possess different toxicological properties, potentially leading to misidentification in exposure assessment.
Effective uncertainty management requires a systematic approach addressing each stage of the NTA workflow. The following strategies provide a structured framework for enhancing reliability in NTA data interpretation.
Implementing rigorous quality assurance/quality control (QA/QC) protocols forms the foundation for uncertainty management. Blank samples (method, procedural, and instrumental) must be incorporated to identify contamination, while pooled quality control samples (PQC) assess system stability and aid in distinguishing true features from artifacts [14]. For quantitative estimations, standard addition methods with internal standards spanning diverse physicochemical properties help account for matrix effects. Analytical standardization should also include reference materials when available, with established criteria for retention time stability, mass accuracy (< 5 ppm error), and signal intensity variation (< 30% RSD in QC samples) [14].
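The two numeric acceptance criteria mentioned above (mass accuracy below 5 ppm and signal intensity variation below 30% RSD in QC samples) can be checked with a short sketch; the replicate intensities and measured mass are fabricated:

```python
import statistics

# Sketch of QC acceptance checks: mass accuracy within 5 ppm and intensity
# RSD below 30% across pooled-QC injections. All data are fabricated.

def mass_ppm_error(measured, theoretical):
    return (measured - theoretical) / theoretical * 1e6

def intensity_rsd(intensities):
    return statistics.stdev(intensities) / statistics.mean(intensities) * 100

def qc_pass(measured_mz, theoretical_mz, qc_intensities,
            ppm_limit=5.0, rsd_limit=30.0):
    return (abs(mass_ppm_error(measured_mz, theoretical_mz)) < ppm_limit
            and intensity_rsd(qc_intensities) < rsd_limit)

# Five replicate injections of a pooled QC sample for one internal standard
qc = [9.8e5, 1.05e6, 1.01e6, 9.5e5, 1.08e6]
print(qc_pass(195.0881, 195.0877, qc))  # True
```

Features or runs failing either criterion would be flagged for review rather than carried forward into annotation.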
Table 2: Confidence Levels for Chemical Identification in NTA
| Confidence Level | Identification Criteria | Required Data | Uncertainty Level |
|---|---|---|---|
| Level 1: Confirmed Structure | Match to authentic standard | Retention time, MS/MS spectrum, accurate mass | Low |
| Level 2: Probable Structure | Library spectrum match or diagnostic evidence | MS/MS spectrum, accurate mass | Medium |
| Level 3: Tentative Candidate | Library match without MS/MS or in silico prediction | Molecular formula, possible structure | High |
| Level 4: Unknown Feature | No structural information | Accurate mass, retention time | Highest |
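The four levels in Table 2 can be expressed as a simple decision rule. The boolean evidence flags below are an illustrative simplification of the criteria listed in the table:

```python
# Sketch mapping available evidence to the identification confidence levels
# of Table 2. The evidence flags are an illustrative simplification.

def confidence_level(std_match=False, library_msms=False, formula=False):
    """Return the identification confidence level for a feature."""
    if std_match:
        return 1   # confirmed against an authentic standard (RT + MS/MS)
    if library_msms:
        return 2   # probable structure from library MS/MS match
    if formula:
        return 3   # tentative candidate with molecular formula
    return 4       # unknown feature: accurate mass and RT only

assert confidence_level(std_match=True) == 1
assert confidence_level(library_msms=True) == 2
assert confidence_level(formula=True) == 3
assert confidence_level() == 4
```

Encoding the scale this way forces each reported identification to carry an explicit, machine-readable statement of its evidentiary basis.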
Managing computational uncertainties requires transparent reporting of all processing parameters and implementation of standardized confidence frameworks. The Schymanski scale for confidence assessment provides a harmonized approach for communicating identification certainty [14]. For peak picking, signal-to-noise thresholds should be optimized and consistently applied, with manual verification of high-priority features. Molecular formula assignment should incorporate multiple scoring algorithms that consider isotopic patterns, elemental probability, and ring double bond equivalents. Implementing open-source computational workflows enhances reproducibility and allows cross-validation between different software platforms, addressing a critical gap in current NTA research [19].
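Two of the formula-scoring heuristics mentioned, ring double bond equivalents (RDBE) and rejection of chemically implausible assignments, can be sketched as follows; the plausibility bounds are illustrative:

```python
# Sketch of formula plausibility filters: ring and double-bond equivalents
# (RDBE) and rejection of non-integer RDBE, which flags a formula that
# cannot be a valid even-electron neutral molecule. Bounds are illustrative.

def rdbe(f):
    """RDBE = C + Si - (H + halogens)/2 + (N + P)/2 + 1."""
    c = f.get("C", 0) + f.get("Si", 0)
    h = f.get("H", 0) + sum(f.get(x, 0) for x in ("F", "Cl", "Br", "I"))
    n = f.get("N", 0) + f.get("P", 0)
    return c - h / 2 + n / 2 + 1

def plausible(f, max_rdbe=40):
    """Reject formulas with negative, extreme, or non-integer RDBE."""
    r = rdbe(f)
    return 0 <= r <= max_rdbe and r == int(r)

caffeine = {"C": 8, "H": 10, "N": 4, "O": 2}
print(rdbe(caffeine), plausible(caffeine))            # 6.0 True
print(plausible({"C": 10, "H": 12, "N": 1, "O": 3}))  # False: half-integer RDBE
```

Filters like these cheaply eliminate many spurious formula candidates before the more expensive isotope-pattern and fragmentation scoring steps.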
Translating NTA results into actionable information requires clear quantification and communication of uncertainties. Confusion matrices can assess qualitative performance (sample classification, chemical identification) by comparing true positives, false positives, true negatives, and false negatives [14]. For quantitative applications, traditional accuracy and precision metrics should be applied with consideration for additional sources of uncontrolled experimental error [14]. Stakeholder communication should include clear confidence statements using standardized terminology, distinguishing between identified, annotated, and unknown features, with explicit acknowledgment of methodological limitations, such as coverage biases introduced by platform selection (LC vs. GC) and ionization techniques.
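The confusion-matrix assessment can be sketched directly with set arithmetic over spiked (ground-truth) and reported chemicals; the chemical lists are fabricated:

```python
# Sketch of confusion-matrix metrics for qualitative NTA output, comparing
# reported detections against a spiked ground truth. Lists are fabricated.

def confusion(truth, reported, universe):
    tp = len(truth & reported)
    fp = len(reported - truth)
    fn = len(truth - reported)
    tn = len(universe - truth - reported)
    return tp, fp, fn, tn

def rates(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true-positive rate
    fdr = fp / (tp + fp) if tp + fp else 0.0          # false-discovery rate
    return sensitivity, fdr

spiked = {"atrazine", "caffeine", "PFOA", "DEET"}
found = {"atrazine", "caffeine", "ibuprofen"}
universe = spiked | found | {"bisphenol A"}

tp, fp, fn, tn = confusion(spiked, found, universe)
print(tp, fp, fn, tn, rates(tp, fp, fn, tn))
```

Reporting sensitivity alongside the false-discovery rate gives stakeholders a direct, quantitative statement of how often a reported chemical is likely to be a false positive.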
This protocol establishes a standardized approach for assigning confidence levels to chemical identifications in NTA studies, adapted from community-established guidelines.
Level 1 Identification (Confirmed Structure)
Level 2 Identification (Probable Structure)
Level 3 Identification (Tentative Candidate)
This procedure quantifies method performance characteristics using chemically spiked samples to establish uncertainty metrics.
Sample Preparation
Data Acquisition and Processing
Performance Metric Calculation
NTA Uncertainty Sources and Relationships: This diagram illustrates the sequential nature of uncertainty propagation throughout the NTA workflow, from sample preparation to final quantification, highlighting specific uncertainty contributors at each stage.
NTA Uncertainty Management Workflow: This workflow diagram outlines key stages in managing NTA uncertainties, highlighting quality control checkpoints (green), high-uncertainty stages requiring special attention (red), and critical reporting phases (blue), with mitigation strategies at each step.
Table 3: Key Research Reagents for NTA Uncertainty Management
| Reagent / Material | Function in NTA Workflow | Uncertainty Addressed |
|---|---|---|
| Authentic Analytical Standards | Confirmation of compound identity | Identification uncertainty |
| Stable Isotope-Labeled Internal Standards | Correction for matrix effects and recovery | Quantitative uncertainty |
| Reference Materials (NIST, EPA) | Method validation and benchmarking | Method performance uncertainty |
| Quality Control Pooled Samples | Monitoring instrumental performance | Analytical variability |
| Blank Samples (Method, Procedural) | Contamination identification | False positive uncertainty |
| Retention Time Index Standards | Retention time alignment and prediction | Chromatographic variability |
| Ionization Efficiency Calibrants | Response factor estimation | Semi-quantification uncertainty |
Uncertainty management in non-targeted analysis represents both a fundamental challenge and critical opportunity for advancing exposure science and drug development research. The inherent uncertainties throughout the NTA workflow – from analytical biases affecting detectable chemical spaces to computational limitations in compound identification – necessitate systematic approaches to quality assurance and validation [14] [19]. By implementing the structured frameworks, experimental protocols, and uncertainty quantification strategies outlined in this guide, researchers can enhance the reliability and interpretability of NTA data. The advancing standardization of confidence assessment frameworks and growing availability of open-source computational tools promise to reduce current limitations, ultimately supporting the transition of NTA from a research tool to a methodology capable of informing regulatory decisions and public health protection. As the field progresses, transparent communication of uncertainties will remain essential for appropriate interpretation and utilization of NTA results by diverse stakeholders across scientific disciplines.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful analytical approach for detecting and identifying unknown and unexpected compounds across diverse sample matrices, including environmental, biological, and food samples [14]. Unlike targeted methods that focus on specific predefined analytes, NTA generates global chemical information, providing a comprehensive characterization of sample composition [14]. This capability makes NTA particularly valuable for discovering novel chemical stressors, retrospectively assessing past exposures through archived samples, and classifying samples based on their chemical profiles [14]. However, this information-rich data presents significant challenges in evaluation and interpretation, necessitating specialized experimental designs and performance assessment approaches distinct from traditional targeted methods.
The inherent uncertainty in NTA data represents a fundamental characteristic that differentiates it from targeted analysis [14]. When analysts report chemical presence through NTA, the compound may actually be absent due to misidentification as an isomer; conversely, reported absence may reflect failed detection rather than true absence [14]. Similarly, sample classification models may lack repeatability over time or transferability between instruments, and concentration estimates often lack confidence intervals, potentially deviating from true values by orders of magnitude [14]. These uncertainties have limited broader adoption of NTA data by decision-makers, creating an urgent need for methods that accurately measure and communicate uncertainty extent and implications for specific use cases. This article defines and explores the three primary NTA study objectives—sample classification, chemical identification, and chemical quantitation—that structure most NTA projects and yield results most useful for stakeholders [14].
Sample classification represents a fundamental NTA objective focused on distinguishing samples into categories based on their overall chemical profiles rather than individual compounds [14]. This approach utilizes the entire chemical fingerprint detected by HRMS to differentiate samples according to source, biological response, temporal changes, or spatial distribution [14]. In practical applications, sample classification enables researchers to identify patterns indicative of environmental contamination sources, disease states in biological specimens, or geographical origins of food products. The performance of qualitative studies emphasizing sample classification can be assessed using confusion matrices, though with recognized challenges and limitations [14]. This objective is particularly valuable when specific chemical markers remain unknown but differential patterns can still distinguish sample classes effectively.
Chemical identification constitutes a core NTA objective aimed at discovering and characterizing unknown or unexpected chemicals present in samples [14]. This process involves detecting chemical features in HRMS data and subsequently determining their molecular structures through various confirmation strategies [14]. The confidence levels for presumed identifications vary significantly based on available information, including molecular formula determination, fragmentation spectra matching, retention time consistency, isotopic distribution analysis, and library comparison [14]. Within chemical identification, researchers often distinguish between suspect screening analysis (SSA) and true non-targeted analysis [3]. SSA focuses on identifying chemicals through comparison with predefined libraries of known compounds of interest, effectively narrowing the study scope [3]. In contrast, true NTA aims to identify chemicals without a priori knowledge, including compounds not represented in established databases [3].
Chemical quantitation represents the most challenging NTA objective, seeking to estimate concentrations of identified chemicals without prior method optimization for specific analytes [14]. While targeted analytical methods provide precise quantitative data with well-defined confidence intervals for predefined chemicals, NTA quantitation carries greater uncertainty [14]. Quantitative NTA (qNTA) approaches typically employ estimation procedures adapted from targeted methods but must account for additional sources of uncontrolled experimental error [14]. Recent advancements incorporate machine learning models to improve quantification methods, leveraging computational approaches to overcome limitations in traditional calibration techniques [1]. Despite these innovations, quantitative results from NTA should be interpreted with appropriate caution, recognizing that true concentrations may differ significantly from reported values without proper validation [14].
Table 1: Comparison of Primary NTA Study Objectives
| Study Objective | Primary Focus | Key Outputs | Performance Assessment Approaches | Common Applications |
|---|---|---|---|---|
| Sample Classification | Overall chemical patterns | Sample categories, differentiation models | Confusion matrix, model repeatability and transferability | Source tracking, disease diagnosis, product authentication |
| Chemical Identification | Individual unknown compounds | Molecular structures, compound identities | Confidence levels based on supporting data (MS/MS, retention time, etc.) | Discovery of novel contaminants, metabolite identification |
| Chemical Quantitation | Concentration estimation | Semi-quantitative or quantitative concentrations | Estimation procedures with error consideration, machine learning validation | Exposure assessment, risk characterization |
Qualitative NTA performance, encompassing sample classification and chemical identification, requires specialized assessment approaches distinct from traditional targeted analysis [14]. For sample classification, the confusion matrix provides a foundational framework for evaluating model performance by comparing predicted versus actual class assignments across sample categories [14]. This matrix enables calculation of standard classification metrics including accuracy, precision, recall, and F1-score, though specific challenges emerge when applying these metrics to NTA data, particularly regarding class imbalance and uncertain ground truth [14].
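The metrics above derive directly from the four confusion-matrix counts. A minimal sketch (the counts in the test case are illustrative, not from any cited study):

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts:
    true/false positives (tp, fp) and true/false negatives (tn, fn)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Note that with the class imbalance common in NTA datasets, accuracy alone can be misleading, which is one reason precision, recall, and F1-score are reported alongside it.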
For chemical identification, confidence ranking systems represent the primary performance assessment approach, categorizing identifications based on the strength of supporting evidence [14]. The highest confidence levels typically require matching retention times and fragmentation spectra with authentic analytical standards analyzed under identical conditions [14]. Lower confidence levels may rely on library spectrum matching, in silico fragmentation prediction, or molecular formula assignment alone [14]. The evolving nature of confidence frameworks for chemical identification reflects ongoing community efforts to standardize reporting and communicate uncertainty effectively to stakeholders [14].
Quantitative performance assessment in NTA adapts established metrics from targeted analysis while acknowledging additional uncertainty sources [14]. Key metrics include accuracy (closeness to true values), precision (measurement reproducibility), and sensitivity (detection limits), though each requires careful interpretation in NTA contexts [14]. Accuracy evaluation proves particularly challenging without authentic standards for all identified compounds, often necessitating surrogate approaches using structurally similar compounds or standard addition methods [25]. Precision assessment must account for additional variability sources throughout the non-targeted workflow, including feature detection consistency and alignment reliability across sample batches [14] [25].
Sensitivity characterization in NTA extends beyond traditional limit of detection (LOD) calculations to include the concept of feature detectability across chemical space [14]. This broader sensitivity perspective acknowledges that detection capabilities vary significantly across different chemical classes and concentrations in non-targeted approaches [14]. Recent research has developed specific quality control guidelines to assure reliable quantitative NTA data, evaluating method specificity, precision, accuracy, and reproducibility in terms of peak area and retention time variability, true positive identification rates, and intraday/interday variations [25].
Table 2: Performance Assessment Metrics for NTA Study Objectives
| Performance Aspect | Targeted Analysis Approach | NTA Adaptation | Key Challenges in NTA |
|---|---|---|---|
| Selectivity/Specificity | Ability to distinguish target analyte from interferents [14] | Feature detection specificity, chromatographic separation, mass resolution [14] [25] | Unknown interferents, isobaric compounds, matrix effects |
| Sensitivity | Limit of detection (LOD) for specific analytes [14] | Feature detectability across chemical space, chemical coverage [14] | Variable detection capabilities across chemical classes |
| Accuracy | Agreement between measured and true values [14] | Identification confidence, quantitative agreement when standards available [14] [25] | Lack of authentic standards for most compounds |
| Precision | Reproducibility of measurements [14] | Feature detection repeatability, retention time stability, alignment consistency [14] [25] | Additional variability sources in non-optimized workflow |
Comprehensive experimental design precedes successful NTA implementation, requiring clear definition of study objectives and analytical scope [3]. Researchers must deliberately determine whether their approach emphasizes targeted analysis, suspect screening, non-targeted analysis, or integrated combinations thereof [3]. This intentional scope definition directly influences subsequent methodological choices across the analytical workflow, from sample preparation to data interpretation [3]. Practical NTA applications frequently combine targeted, suspect screening, and non-targeted approaches within unified workflows [3]. For example, data acquisition may operate comprehensively to maximize collected information, while data analysis sequentially applies suspect screening against defined chemical lists followed by true non-targeted identification efforts for remaining features [3]. Such integrated approaches efficiently leverage methodological strengths while acknowledging practical constraints.
Sample preparation strategies for NTA require careful consideration to balance comprehensive chemical coverage with practical analytical constraints [26]. For complex matrices like biological fluids or environmental samples, preparation often incorporates divide-and-conquer strategies that reduce sample complexity into manageable fractions [26]. These approaches may include abundant protein depletion in plasma samples, fractionation through chromatographic techniques, and enzymatic digestion for bottom-up proteomic approaches [26]. Quality assurance and quality control (QA/QC) implementation provides critical foundation for reliable NTA results, requiring intentional incorporation throughout study design [3]. Essential QA/QC elements include procedural blanks, replicate samples, reference materials, and internal standards that monitor analytical performance across sample batches [3] [25]. Specific quality control guidelines proposed for NTA methodologies evaluate method specificity, precision, accuracy and reproducibility using standardized approaches assessing peak area and retention time variability, true positive identification rates, and intraday/interday variations [25].
High-resolution mass spectrometry represents the foundational analytical platform for NTA, with different mass analyzers offering complementary capabilities [14] [26]. Common HRMS platforms include time-of-flight (TOF), Fourier transform ion cyclotron resonance (FT-ICR), and Orbitrap instruments, each providing the mass accuracy and resolution essential for confident molecular formula assignment [26]. Data acquisition strategies significantly influence chemical space coverage, with electrospray ionization (ESI) and atmospheric pressure chemical ionization (APCI) offering complementary selectivity for different compound classes [25]. Tandem mass spectrometry (MS/MS or MSn) fragmentation data provides critical structural information for compound identification, with acquisition methods ranging from data-dependent acquisition to unbiased fragmentation approaches [14] [1]. Advanced instrumental configurations combining multiple separation dimensions with high-resolution mass analysis further enhance chemical coverage and detection capabilities for complex samples [26].
NTA Experimental Workflow: This diagram illustrates the generalized workflow for non-targeted analysis, from sample preparation through data interpretation.
Machine learning (ML) and artificial intelligence (AI) applications represent the most significant advancement in NTA methodologies, offering transformative potential across all study objectives [1]. ML algorithms enhance sample classification through improved pattern recognition capabilities, enabling more accurate sample categorization based on complex chemical fingerprints [1]. For chemical identification, ML approaches facilitate structure annotation through improved in silico fragmentation prediction and spectral similarity assessment, significantly expanding identifiable chemical space beyond reference libraries [1]. In quantitative applications, ML models enable more accurate concentration predictions without authentic standards by leveraging chemical structure-property relationships [1]. These computational advancements also improve toxicity prediction capabilities through quantitative structure-activity relationship (QSAR) modeling, directly supporting risk assessment frameworks [1]. Current research focuses on refining ML tools for complex mixture analysis, improving inter-laboratory validation, and further integrating computational models into environmental risk assessment paradigms [1].
Interlaboratory studies and method validation initiatives address critical needs for standardization and reproducibility in NTA [25]. The Environmental Protection Agency's Non-Targeted Analysis Collaborative Trial (ENTACT) represents a prominent example, evaluating performance across multiple laboratories analyzing identical complex mixtures [25]. Such studies reveal substantial variability in reported identifications and concentrations, highlighting the urgent need for standardized protocols and performance benchmarks [14] [25]. Validation approaches for NTA methods continue evolving, with recent proposals advocating for standardized quality control metrics assessing accuracy, precision, selectivity, and reproducibility using defined reference materials [25]. These community-wide efforts aim to establish fit-for-purpose criteria enabling broader acceptance of NTA data in regulatory decision-making contexts [14].
Table 3: Essential Research Reagents and Materials for NTA Studies
| Tool/Reagent | Function | Application Context |
|---|---|---|
| High-Resolution Mass Spectrometer | Provides accurate mass measurements for elemental composition assignment [14] [26] | All NTA study objectives; different analyzers (TOF, FT-ICR, Orbitrap) offer complementary capabilities [26] |
| Chromatography Systems | Separate complex mixtures to reduce ionization suppression and enable isomer differentiation [26] | All NTA study objectives; LC-MS most common, with GC-MS expanding coverage for volatile compounds |
| Authentic Analytical Standards | Confirm compound identities and enable quantitative calibration [14] | Method development/validation, identification confidence assessment, quantitation |
| Stable Isotope-Labeled Internal Standards | Monitor analytical performance, correct for matrix effects, enable quantitative accuracy [14] | Quality control, quantitative NTA, method performance assessment |
| Sample Preparation Materials | Extract, clean-up, and concentrate analytes from complex matrices [26] | All NTA applications; specific materials (SPE cartridges, extraction solvents) determine chemical space coverage |
| Compound Databases & Spectral Libraries | Support chemical identification through mass, retention time, and fragmentation matching [14] [3] | Suspect screening, compound identification; examples include NIST, MassBank, mzCloud |
| Data Processing Software | Convert raw data to features, align across samples, perform statistical analyses [14] [1] | All NTA study objectives; both commercial and open-source platforms available |
| In Silico Fragmentation Tools | Predict MS/MS spectra for structural elucidation of unknowns without standards [1] | Compound identification, particularly for true NTA without library matches |
NTA Identification Confidence: This diagram outlines the decision process for compound identification in NTA, from initial feature detection through confirmation.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) generates complex, information-rich datasets that are impossible to interpret manually without sophisticated computational approaches [23]. The primary challenge in NTA research lies not merely in detecting chemical signals, but in developing robust computational methods to extract meaningful environmental information from the vast chemical datasets generated by HRMS instruments [27]. A critical, yet often poorly defined aspect of this process involves the terminology used to describe detections that are not yet annotated or identified. Establishing consistent definitions is essential for improving reproducibility and readability across NTA studies from different research groups [23].
Fundamental terminology used throughout this pipeline includes: m/z-retention time pair (mz@RT), defined as a unique pairing of mass-to-charge ratio and retention time values; and Feature, which represents a set of mz@RT pairs that form a grouping of associated MS1 components (e.g., isotopologue, adduct, and in-source product ion m/z peaks), represented as a tensor of observed retention time, monoisotopic mass, and intensity [23]. The comprehensive data processing pipeline transforms raw instrument data into chemically meaningful features through a structured sequence of computational steps that will be detailed in this guide.
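To make this terminology concrete, a minimal data model for mz@RT pairs and features might look like the following; the class and field names are illustrative, not a community standard.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class MzRtPair:
    """A unique m/z-retention time pairing (mz@RT)."""
    mz: float   # mass-to-charge ratio
    rt: float   # retention time (e.g., seconds)

@dataclass
class Feature:
    """A grouping of associated MS1 components (isotopologue, adduct,
    and in-source product ion peaks) for one putative compound."""
    monoisotopic_mass: float
    rt: float
    intensity: float
    components: list = field(default_factory=list)  # list of MzRtPair

# Example: a glucose-like feature with its protonated adduct peak
f = Feature(monoisotopic_mass=180.0634, rt=312.4, intensity=1.2e6)
f.components.append(MzRtPair(mz=181.0707, rt=312.4))  # [M+H]+
```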
The complete NTA workflow encompasses four critical stages that transform environmental samples into actionable insights about contamination sources. While this guide focuses primarily on Stage III (Data Processing), understanding the entire context is essential for proper implementation. Stage I involves sample treatment and extraction, requiring careful optimization to balance selectivity and sensitivity through techniques such as solid phase extraction (SPE), Soxhlet extraction, gel permeation chromatography (GPC), and pressurized liquid extraction (PLE) [27]. Stage II covers data generation and acquisition through HRMS platforms including quadrupole time-of-flight (Q-TOF) and Orbitrap systems coupled with liquid or gas chromatographic separation (LC/GC), which generate the complex datasets essential for NTA [27].
Stage III represents the ML-oriented data processing and analysis phase, where the transition from raw HRMS data to interpretable patterns occurs through sequential computational steps including data preprocessing, dimensionality reduction, and statistical analysis [27]. Stage IV focuses on result validation through a three-tiered approach incorporating analytical confidence verification using certified reference materials, model generalizability assessment on independent external datasets, and environmental plausibility checks correlating model predictions with contextual data [27]. The following diagram illustrates the complete workflow and the interrelationships between each stage:
The initial stage of data processing involves converting proprietary instrument data files into open, accessible formats that can be processed by subsequent algorithms. This conversion must occur before any intentional data interpretation or processing can take place [23]. Although idealized data format conversions are not intended to remove any data, it is possible that data losses may occur during this process. Researchers should carefully evaluate their format conversion steps and associated settings to assess whether such data losses are occurring, potentially by evaluating multiple format conversion platforms and proceeding through subsequent data processing steps to screen for known compounds [23].
Table 1: Data Format Conversion Methods
| Method Step | Description | Common Tools/Formats |
|---|---|---|
| File conversion to open-source format | Changes the raw data file format (e.g., .d, .raw, .wiff, etc.) to a different file format | mzML, mzXML, ANDI-MS (netCDF), ABF; Tools: Proteowizard MSConvert, Reifycs Abf Converter |
| Mass spectrum centroiding | For mass spectra collected in profile/continuous mode, centroiding reduces the individual m/z peaks to a single peak (a centroid) | Not necessary for mass spectra collected in centroid mode |
Data processing relies on user-defined settings and thresholds to reduce the raw or converted analytical data to meaningful information, such as a list of features with relative abundance information and accompanying LC data and MS & MS/MS spectra [23]. As data is intentionally reduced during processing, researchers should consider evaluating the impact of various settings through the use of QC spikes and/or QC samples [23]. The terminology differences across both open-source and proprietary software platforms present a significant challenge, as different software tools may use the same term to describe different steps. Until these discrepancies are resolved through communication and consensus among software developers, NTA researchers should carefully read software user manuals to ensure they correctly interpret the purpose of each step in the workflow [23].
Table 2: Detailed Data Processing Steps
| Processing Step | Description | Key Parameters |
|---|---|---|
| Initial m/z detection | Selection of unique mz@RT pairs | Sensitivity threshold, mass accuracy |
| Retention time alignment | Modifies retention times within a single dataset based on representative compounds | Alignment algorithm, reference compounds |
| Shoulder peaks filtering | Removes noise signals known as "shoulder peaks" (relevant to Fourier transform MS instruments) | Peak width, symmetry thresholds |
| Signal thresholding | Removes signal (m/z values) below a designated abundance threshold | Absolute value or signal-to-noise ratio |
| Chromatogram smoothing | Reduces the noise of a selected chromatogram | Smoothing algorithm, window size |
| Spectral deconvolution | Removal of undesired m/z peaks within a mass spectrum | Deconvolution algorithm, peak model |
| Isotopologue grouping | Grouping of unique mz@RT pairs that represent isotopologues of the same compound | Mass difference, isotopic pattern |
| Adduct grouping | Grouping of unique mz@RT pairs that represent adducts of the same compound | Common adduct masses, retention time tolerance |
| Between-sample alignment | Comparison of detected features in multiple samples to determine if the same feature was detected | m/z and RT windows, variance thresholds |
| Gap-filling | Detection of features that were missed during initial m/z selection | Lower thresholds, peak prediction algorithms |
| Feature filtering | Filtering detected features based on retention time or m/z range | m/z range, RT range, intensity thresholds |
| Duplicate feature removal | Removing duplicate features based on designated m/z and RT windows | m/z and RT tolerance windows |
| Replicate filter | Evaluating feature detection frequency across analytical replicates | Replicate frequency threshold |
| Abundance thresholding and/or blank comparison | Applying absolute or relative abundance thresholds | Blank subtraction methods, fold-change thresholds |
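The blank-comparison step in the table above can be sketched as a simple fold-change filter over a feature table; the 3x default threshold is a common but study-specific choice, not a universal rule, and the feature IDs are illustrative.

```python
def blank_filter(sample_intensities, blank_intensities, fold=3.0):
    """Keep features whose sample intensity exceeds `fold` times the
    corresponding blank intensity. Features absent from the blank are
    kept. Inputs are dicts keyed by feature ID."""
    kept = {}
    for feat, s in sample_intensities.items():
        b = blank_intensities.get(feat, 0.0)
        if b == 0.0 or s / b >= fold:
            kept[feat] = s
    return kept

samples = {"F1": 9000.0, "F2": 1200.0, "F3": 500.0}
blanks = {"F1": 1000.0, "F2": 1000.0}
# F1: 9x the blank -> kept; F2: 1.2x -> removed; F3: absent in blank -> kept
```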
The data processing workflow involves both sequential and parallel operations that transform raw data into curated features ready for statistical analysis and machine learning. The following diagram illustrates the logical flow and relationships between these critical processing steps:
A typical output from the data generation and acquisition stage is a peak table that records the intensities of detected signals [27]. Preprocessing this data, including tasks such as harmonizing the dataset and minimizing noise, is necessary for ensuring data quality and consistency prior to machine learning applications. A high-quality preprocessing workflow is critical for enhancing the reliability and robustness of subsequent machine learning outcomes [27].
Variations in mass spectrometry data may arise due to differences in analytical platforms or acquisition dates, making data alignment essential to ensure the comparability of chemical features across all samples. This alignment process mainly includes three key steps: retention time correction, mass-to-charge ratio recalibration, and peak matching [27]. Retention time correction compensates for slight shifts in retention times caused by variations in chromatographic conditions, while m/z recalibration standardizes mass accuracy across different batches. Peak matching algorithms align identical chemical features detected across different batches, facilitating accurate compound identification and cross-sample comparison [27].
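The peak-matching step can be sketched as pairing features across two batches within a ppm mass window and a retention time tolerance; the greedy strategy and the tolerance values below are illustrative simplifications of what production alignment algorithms do.

```python
def match_features(batch_a, batch_b, ppm=10.0, rt_tol=0.3):
    """Greedy matching of (mz, rt) features between two batches.
    Each batch_a feature is paired with the first unused batch_b
    feature within the m/z ppm window and RT tolerance."""
    used = set()
    pairs = []
    for i, (mz_a, rt_a) in enumerate(batch_a):
        for j, (mz_b, rt_b) in enumerate(batch_b):
            if j in used:
                continue
            if (abs(mz_a - mz_b) / mz_a * 1e6 <= ppm
                    and abs(rt_a - rt_b) <= rt_tol):
                pairs.append((i, j))
                used.add(j)
                break
    return pairs

a = [(181.0707, 5.20), (256.2635, 9.81)]
b = [(181.0710, 5.25), (300.1000, 12.00)]
# Only the first pair matches: ~1.7 ppm and 0.05 min apart.
```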
Exploratory ML-oriented data processing identifies significant features via univariate statistics (t-tests, Analysis of Variance [ANOVA]) and prioritizes compounds with large fold changes [27]. Dimensionality reduction techniques like principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) simplify high-dimensional data, while clustering methods (hierarchical cluster analysis [HCA], k-means clustering) group samples by chemical similarity [27]. Supervised ML models, including Random Forest and Support Vector Classifier, are subsequently trained on labeled datasets to classify contamination sources. Feature selection algorithms (e.g., recursive feature elimination) refine input variables, optimizing model accuracy and interpretability [27].
Objective: To process raw HRMS data into a curated feature table suitable for statistical analysis and machine learning applications.
Materials and Equipment:
Procedure:
Feature Detection and Processing:
Feature Grouping and Alignment:
Feature Curation:
Quality Control:
Table 3: Essential Research Reagents and Computational Tools for NTA
| Category | Item/Software | Function/Application |
|---|---|---|
| Sample Preparation | Solid Phase Extraction (SPE) cartridges | Compound enrichment and clean-up |
| | Multi-sorbent strategies (Oasis HLB, ISOLUTE ENV+) | Broad-spectrum extraction coverage |
| | QuEChERS kits | Rapid sample preparation for complex matrices |
| Reference Materials | Certified Reference Materials (CRMs) | Method validation and quality control |
| | Internal standards (isotope-labeled compounds) | Retention time alignment and quantification |
| | Batch-specific quality control samples | Monitoring instrumental performance |
| Data Conversion | Proteowizard MSConvert | Conversion of proprietary formats to open formats |
| | Reifycs Abf Converter | Alternative conversion tool for ABF format |
| | mzML, mzXML formats | Standardized open data formats |
| Data Processing | XCMS, MS-DIAL, OpenMS | Feature detection, alignment, and processing |
| | Computational algorithms | Isotopologue grouping, adduct identification |
| | Statistical packages (R, Python) | Data analysis and visualization |
| Advanced Analysis | PCA, t-SNE algorithms | Dimensionality reduction and pattern recognition |
| | Random Forest, SVC classifiers | Supervised machine learning for source identification |
| | Feature selection algorithms | Identification of statistically significant features |
The integration of statistical and chemometric analysis, particularly through machine learning (ML), is revolutionizing pattern recognition in non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS). This paradigm shift enables researchers to translate complex HRMS datasets into actionable insights for contaminant source identification and chemical exposure assessment. This guide provides a detailed technical framework for ML-oriented NTA, covering core workflows, algorithmic selection, performance validation, and essential research tools to advance data interpretation in environmental and pharmaceutical research.
The transformation of raw HRMS data into interpretable patterns for source identification follows a systematic, multi-stage workflow. This process is critical for managing the high-dimensionality of NTA data, where the number of detected chemical features often far exceeds the number of samples. A structured approach ensures that the chemical signals are accurately processed, modeled, and validated to support environmentally actionable decisions [27]. The workflow can be conceptualized in four primary stages: (i) sample treatment and extraction, (ii) data generation and acquisition, (iii) ML-oriented data processing and analysis, and (iv) result validation [27]. The following diagram illustrates the logical sequence and key components of this integrated workflow.
The goal of this initial stage is to achieve a broad and sensitive extraction of compounds while minimizing matrix interference, thereby laying a reliable foundation for subsequent ML analysis [27].
Detailed Protocol: Solid Phase Extraction (SPE)
Detailed Protocol: QuEChERS (Quick, Easy, Cheap, Effective, Rugged, and Safe)
HRMS platforms, such as quadrupole time-of-flight (Q-TOF) and Orbitrap systems, generate the complex datasets required for NTA [27]. The output of this stage is a structured feature-intensity matrix, where rows represent samples and columns correspond to aligned chemical features (defined by mass-to-charge ratio m/z and retention time). This matrix is the fundamental input for all subsequent chemometric and ML analyses [27].
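As a concrete illustration, the feature-intensity matrix can be represented as a pandas DataFrame with samples as rows and aligned features as columns; the sample names, feature labels, and intensities below are hypothetical, and the `MxxxTyyy`-style column labels merely mimic the naming convention some processing tools use for exported feature groups:

```python
import pandas as pd

# Hypothetical aligned feature-intensity matrix: rows = samples,
# columns = chemical features labelled by m/z and retention time (min).
features = ["M283.2644_T12.4", "M413.2662_T14.1", "M498.9302_T9.7"]
matrix = pd.DataFrame(
    [[1.2e5, 3.4e4, 0.0],
     [9.8e4, 2.9e4, 5.1e3],
     [1.5e5, 0.0,   4.7e3]],
    index=["site_A", "site_B", "site_C"],
    columns=features,
)

# A zero intensity indicates the feature was not detected in that sample;
# such entries are often treated as missing values in downstream steps.
n_samples, n_features = matrix.shape
```

Every subsequent chemometric step, from imputation through classification, operates on a matrix of exactly this shape.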
This stage represents the core of pattern recognition, where statistical and ML models are applied to extract meaningful patterns from the feature-intensity matrix.
Data Preprocessing Protocol: Preprocessing is critical to ensure data quality and model reliability.
- Align m/z features across samples using algorithms within software like XCMS [27].
- Impute missing values with the k-nearest-neighbors function in R (e.g., impute::impute.knn), with k=10 neighbors, estimating missing values from feature similarity across samples.

Exploratory Data Analysis and Pattern Recognition: This phase involves both unsupervised and supervised learning.
- Unsupervised exploration: Apply k-means clustering (e.g., sklearn.cluster.KMeans in Python) to the preprocessed and normalized feature-intensity matrix. Visualize the resulting clusters using a PCA score plot.
- Supervised classification: Train a Random Forest classifier (e.g., sklearn.ensemble.RandomForestClassifier with n_estimators=100) on the training set. Validate model performance on the held-out test set by calculating accuracy, precision, and recall.

A tiered validation strategy is essential to ensure the reliability of ML-NTA outputs [27].
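The preprocessing and modeling steps described above can be sketched end-to-end on synthetic data; all sample counts, intensities, and source labels below are simulated for illustration only:

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic feature-intensity matrix: 60 samples x 40 features,
# two simulated "sources" with a source-specific intensity signature.
X = rng.lognormal(mean=10.0, sigma=1.0, size=(60, 40))
y = np.array([0] * 30 + [1] * 30)            # known source labels
X[y == 1, :10] *= 5.0                        # source-specific signature
X[rng.random(X.shape) < 0.05] = np.nan       # ~5% missing values

# Preprocessing: kNN imputation, log transform, then autoscaling.
X_imp = KNNImputer(n_neighbors=5).fit_transform(X)
X_scaled = StandardScaler().fit_transform(np.log10(X_imp))

# Unsupervised: k-means clustering, visualised via PCA score coordinates.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
scores = PCA(n_components=2).fit_transform(X_scaled)

# Supervised: Random Forest classification with a held-out test set.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_scaled, y, test_size=0.25, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
```

Because the simulated signature is strong, the classifier separates the two sources almost perfectly; real NTA data will be noisier and demands the tiered validation discussed below.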
Evaluating the performance of both the qualitative and quantitative outputs of an NTA study is crucial for assessing its reliability. The table below summarizes key metrics adapted from targeted analysis and their application in NTA, addressing objectives like sample classification and chemical identification [14].
Table 1: Performance Assessment Metrics for NTA Studies
| Metric | Definition in Targeted Analysis | Application in NTA | Considerations & Challenges |
|---|---|---|---|
| Selectivity | A method's ability to differentiate a unique chemical from interferents [14]. | The confidence in correctly identifying a chemical structure among isomers or similar compounds. | NTA identifications are probabilistic. A reported compound may be an isomer, leading to false positives [14]. |
| Sensitivity (LOD) | The lowest concentration at which a chemical can be reliably detected [14]. | The minimum response or concentration at which a feature can be detected and reliably annotated. | Defining a universal Limit of Detection (LOD) is complex due to varying ion efficiencies. "Not detected" does not guarantee absence [14]. |
| Accuracy | Closeness of agreement between a reported concentration and a true value [14]. | For classification: Model accuracy in predicting the correct source category. For quantitation: Agreement of semi-quantitative estimates with true concentrations. | Quantitative NTA estimates can have high uncertainty, with true concentrations potentially orders of magnitude different [28]. |
| Precision | The consistency of reported values across repeated measurements [14]. | The reproducibility of feature detection and sample classification across analytical replicates or different model runs. | Model predictions may not be fully repeatable over time or transferable between different HRMS instruments [14]. |
The choice of ML algorithm depends heavily on the specific research objective, data structure, and the need for interpretability. The following diagram outlines a decision pathway for selecting an appropriate algorithm for pattern recognition in NTA data.
Table 2: Key Machine Learning Algorithms for NTA Pattern Recognition
| Algorithm | Type | Primary Use Case in NTA | Strengths | Technical References |
|---|---|---|---|---|
| Principal Component Analysis (PCA) | Unsupervised, Dimensionality Reduction | Exploring data structure, identifying outliers, visualizing sample groupings [27]. | Simplifies complex data, reveals major trends without sample labels. | [27] |
| k-Means Clustering | Unsupervised, Clustering | Grouping samples with similar chemical profiles to hypothesize common sources [27]. | Simple and efficient for finding intrinsic patterns in unlabeled data. | [27] |
| Random Forest (RF) | Supervised, Classification | Classifying samples into predefined source categories with high accuracy [27]. | Robust to overfitting, provides feature importance rankings for interpretability. | [27] |
| Support Vector Classifier (SVC) | Supervised, Classification | Effective for binary classification tasks (e.g., contaminant vs. control) in high-dimensional spaces [27]. | Performs well with complex, non-linear decision boundaries. | [27] |
| Logistic Regression (LR) | Supervised, Classification | A baseline model for classification; useful when model interpretability is paramount [27]. | Highly interpretable, outputs probabilities for class membership. | [27] |
| Partial Least Squares Discriminant Analysis (PLS-DA) | Supervised, Classification | Identifying source-specific indicator compounds through variable importance metrics [27]. | Powerful for finding features that best discriminate between known classes. | [27] |
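As the table notes, Random Forest's feature-importance ranking supports interpretability. The sketch below uses synthetic data in which features 0-2 play the role of hypothetical source-specific indicator compounds; in a real study the top-ranked column indices would map back to m/z / retention-time features for annotation:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Synthetic data: 200 samples x 30 features; features 0-2 act as
# hypothetical source-specific indicator compounds.
X = rng.normal(size=(200, 30))
y = (X[:, 0] + X[:, 1] + X[:, 2] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)

# Rank features by mean decrease in impurity; the informative features
# should rise well above the noise features in this ranking.
ranking = np.argsort(clf.feature_importances_)[::-1]
```

PLS-DA variable-importance-in-projection (VIP) scores serve the same indicator-discovery purpose when a latent-variable model is preferred.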
Table 3: Essential Research Reagent Solutions for NTA Workflows
| Item | Function / Application | Technical Notes |
|---|---|---|
| Oasis HLB SPE Sorbent | Broad-spectrum extraction of polar and non-polar analytes from water samples [27]. | Often used in multi-sorbent strategies with WAX/WCX for comprehensive coverage [27]. |
| QuEChERS Extraction Kits | Rapid, efficient sample preparation for solid and complex matrices (e.g., soil, food, biological tissues) [27]. | Reduces solvent usage and processing time, ideal for large-scale environmental studies [27]. |
| ISOLUTE ENV+ / Strata WAX/WCX | Mixed-mode or ion-exchange sorbents used in multi-sorbent SPE to target a wider range of compound classes, particularly ionic species like PFAS [27]. | Expands the "detectable space" beyond what single-sorbent SPE can achieve [27] [19]. |
| Certified Reference Materials (CRMs) | Critical for tiered validation, used to confirm compound identities and support quantitative estimates [27] [14]. | Essential for establishing analytical confidence in Level 1 identifications [27]. |
| Quality Control (QC) Samples (e.g., procedural blanks, pooled QC samples) | Monitor instrument stability, evaluate background contamination, and assess data quality throughout acquisition [27]. | Batch-specific QC samples are a fundamental part of data integrity assurance in Stage (ii) [27]. |
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) is a powerful, discovery-based approach for identifying unknown and suspected chemicals in complex samples without a priori knowledge of their presence [29] [19]. The interpretation of NTA data requires careful annotation of chemical features and systematic assessment of identification confidence. Unlike targeted methods that quantify predefined analytes, NTA attempts to characterize thousands of chemicals simultaneously, inevitably introducing varying degrees of uncertainty regarding chemical structures and concentrations [29]. Communicating this uncertainty consistently is crucial for the scientific acceptance and regulatory use of NTA data.
The fundamental challenge in NTA lies in moving from an observed instrumental signal to a confident chemical identification. This process involves gathering multiple lines of evidence to support or reject potential structures. To address this need, the scientific community has developed confidence frameworks that standardize how identification certainty is communicated [30] [31]. These frameworks enable researchers, reviewers, and end-users to properly evaluate the reliability of reported identifications and facilitate more meaningful comparisons between studies and laboratories.
The foundational framework for communicating identification confidence in NTA was established by Schymanski et al. (2014) and has been widely adopted across environmental chemistry, metabolomics, and exposomics research [31]. This framework defines multiple confidence levels ranging from confirmed structure (Level 1) to exact mass of interest (Level 5). The general principles of this framework have been specialized for particular chemical classes and applications, including per- and polyfluoroalkyl substances (PFAS) [30] [31].
PFAS present specific identification challenges due to their complex isomerism, existence in homologous series, and limited availability of reference standards. A specialized confidence framework has been developed to address these nuances (Table 1) [31]. This PFAS-specific scale maintains the same overall structure as the general framework but incorporates criteria particularly relevant to fluorinated compounds, such as the detection of homologous series and characteristic mass defect ranges.
Table 1: Confidence Levels and Criteria for PFAS Identification via HRMS
| Confidence Level | Level Name | Required Criteria | Additional Supporting Evidence |
|---|---|---|---|
| Level 1a | Confirmed by reference standard | Match to analytical standard for MS/MS spectrum, retention time, and accurate mass | Isotope pattern confirmation in matrix-matched sample |
| Level 1b | Indistinguishable from reference standard | Match to standard but existence of indistinguishable isomers | Distinction of branched/linear isomers when possible |
| Level 2a | Probable structure by library spectrum | Match to library MS/MS spectrum without reference standard | Sufficient diagnostic fragments to rule out isomers |
| Level 2b | Probable structure by diagnostic evidence | ≥3 diagnostic MS/MS fragments supporting specific headgroup | Retention time consistency; homologue detection |
| Level 2c | Probable structure by homologue evidence | ≥2 homologues identified at Level 2a or higher | Characteristic fragmentation pattern within series |
| Level 3 | Tentative candidate | Exact match to suspect list; limited spectral evidence | Class-specific mass defect; isotope pattern |
| Level 4 | Unequivocal molecular formula | Assigned molecular formula based on accurate mass | Elemental composition constraints |
| Level 5 | Exact mass of interest | Mass anomaly or suspect list match | Insufficient evidence for structure or formula |
This PFAS-specific confidence framework clarifies distinctions between isomeric forms and provides guidance on leveraging homologous series for identification. For example, branched and linear PFAS isomers that are chromatographically resolvable may be reported with higher confidence than those that co-elute [31]. The framework emphasizes that position of isomerization matters—isomerization in the headgroup typically creates distinct PFAS, while branching in the tail may not, reflecting conventional regulatory practices.
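Level 2c in the table rests on detecting homologous series: members of a CF2-based PFAS series differ by an integer multiple of the CF2 repeat-unit mass (49.99681 Da). The sketch below flags candidate homologue pairs in a feature list; the masses are approximate [M-H]- values for PFOA, PFNA, and PFDA plus one unrelated feature, and the mass tolerance is an assumed value:

```python
# Minimal sketch: flag candidate PFAS homologues as feature masses
# separated by an integer number of CF2 units.
CF2 = 49.99681          # exact mass of a CF2 repeat unit, Da
TOL = 0.002             # assumed per-unit mass tolerance, Da

def homologue_pairs(masses, max_units=8):
    """Return (i, j, n) where mass j = mass i + n * CF2 within tolerance."""
    pairs = []
    for i, a in enumerate(masses):
        for j, b in enumerate(masses):
            if b <= a:
                continue
            diff = b - a
            n = round(diff / CF2)
            if 1 <= n <= max_units and abs(diff - n * CF2) <= TOL * n:
                pairs.append((i, j, n))
    return pairs

# Hypothetical [M-H]- masses: PFOA, PFNA, PFDA (one CF2 apart each),
# plus an unrelated feature that should not join the series.
masses = [412.9664, 462.9632, 512.9600, 300.1234]
pairs = homologue_pairs(masses)
```

Detecting two or more series members in this way supplies the homologue evidence that the framework accepts in lieu of a reference standard.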
Establishing identification confidence requires executing a systematic analytical workflow from sample preparation to data interpretation. The following diagram illustrates the complete NTA process with key decision points for confidence assessment:
NTA Confidence Assessment Workflow
The initial stages of the NTA workflow significantly impact the quality of final identifications. Sample preparation methods must balance comprehensiveness with selectivity:
Following data acquisition, raw instrument files undergo extensive processing:
Successful implementation of NTA confidence frameworks requires specific analytical resources and computational tools. The following table details essential components of the NTA research toolkit:
Table 2: Essential Research Toolkit for NTA Confidence Assessment
| Tool Category | Specific Examples | Function in Confidence Assessment |
|---|---|---|
| Reference Standards | Analytical-grade certified standards | Level 1 confirmation via retention time and spectrum matching [31] |
| Mass Spectral Libraries | NIST, MassBank, mzCloud | Level 2a identification through MS/MS spectrum matching [32] [31] |
| Suspect Lists | EPA PFAS, NORMAN, CEUR | Level 3 candidate identification via exact mass matching [33] |
| Data Processing Software | Compound Discoverer, MZmine, MS-DIAL | Feature detection, alignment, and formula prediction [19] |
| Quantitative Structure-Activity Relationship (QSAR) | ChemSpace, EPI Suite | Predicting chemical properties and chromatographic behavior [32] |
| Quality Control Materials | INTERPRET NTA, Standard Reference Materials | QA/QC procedures for method validation and performance tracking [33] |
Reference standards represent the most critical resource for achieving the highest confidence levels (Level 1), yet they exist for only a small fraction of potential environmental contaminants [31]. This limitation has driven development of technical approaches that maximize information obtained from available standards, such as read-across methods within homologous series and prediction of retention time behavior based on chemical structure.
Achieving Level 1 confidence requires analytical reference standards analyzed under identical conditions as samples:
When reference standards are unavailable, Level 2 identification relies on spectral interpretation and diagnostic evidence:
When spectral data is insufficient for structural proposals, lower confidence assignments are appropriate:
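For Level 4 assignments, a molecular formula is supported when the observed accurate mass agrees with the candidate formula's theoretical mass within a small mass-error window. A minimal sketch, using an approximate theoretical mass for the PFOA anion and an assumed 5 ppm acceptance window:

```python
# Parts-per-million mass error between an observed accurate mass and
# the theoretical mass of a candidate molecular formula.
def ppm_error(observed, theoretical):
    return (observed - theoretical) / theoretical * 1e6

# Hypothetical observed [M-H]- feature vs. the theoretical mass of the
# PFOA anion (C8F15O2-, ~412.96643 Da).
observed = 412.9660
theoretical = 412.96643
err = ppm_error(observed, theoretical)

# Assumed acceptance window for formula assignment: |error| <= 5 ppm.
within_tolerance = abs(err) <= 5.0
```

Mass error alone rarely yields an unequivocal formula; isotope-pattern fit and elemental-composition constraints (Table 1, Level 4) narrow the candidate set further.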
Significant efforts are underway to harmonize confidence reporting practices across the NTA community:
These initiatives collectively address the need for standardized quality assurance/quality control (QA/QC) frameworks, shared compound databases and libraries, and clear linkages between identification confidence and potential decision contexts [33]. Continued community adoption of these harmonized approaches will strengthen the reliability and acceptance of NTA data across scientific, regulatory, and public health domains.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful approach for discovering unknown chemical contaminants in environmental, food, and clinical samples. Unlike targeted analysis that focuses on predefined compounds, NTA aims to comprehensively characterize sample chemical composition without a priori knowledge of its content [3]. The principal challenge in contemporary NTA has shifted from detection capabilities to interpreting the vast, complex datasets generated by HRMS instruments [27]. Machine learning (ML) has redefined NTA potential by providing powerful pattern recognition capabilities that can identify latent patterns within high-dimensional data, making it particularly well-suited for contamination source identification and tracking [27] [35].
ML integration addresses critical gaps in traditional NTA workflows, particularly the inability to disentangle complex source signatures using conventional statistical methods. While early NTA data interpretation relied on univariate analysis and unsupervised clustering, these approaches often prioritize abundance over diagnostic chemical patterns, potentially overlooking low-concentration but high-risk contaminants and failing to account for source-specific chemical interactions [27]. ML-enhanced NTA represents a paradigm shift, enabling researchers to translate molecular features into environmentally actionable parameters through systematic computational frameworks [27] [1].
The integration of ML and NTA for contaminant source tracking follows a systematic four-stage workflow that transforms raw instrumental data into attributable contamination sources. This structured approach ensures analytical rigor while maximizing the extraction of meaningful environmental information from complex HRMS datasets [27].
Sample preparation requires careful optimization to balance selectivity and sensitivity, necessitating a compromise between removing interfering components and preserving as many compounds as possible with adequate sensitivity [27]. Key considerations include:
HRMS platforms, including quadrupole time-of-flight (Q-TOF) and Orbitrap systems, generate complex datasets essential for NTA [27]. Orbitrap systems generally show lower retention time drift than some Q-TOF instruments due to coupling with high-performance liquid chromatography systems, though their higher mass accuracy often necessitates more stringent alignment procedures [27]. Critical data processing steps include:
The transition from raw HRMS data to interpretable patterns involves sequential computational steps that leverage machine learning capabilities:
Validation ensures reliability of ML-NTA outputs through a three-tiered approach [27]:
Table 1: Machine Learning Algorithms for NTA Applications
| Algorithm Category | Specific Algorithms | NTA Application | Performance Examples |
|---|---|---|---|
| Supervised Learning | Random Forest, Support Vector Classifier, Logistic Regression, Decision Trees | Classification of contamination sources, quantitative structure-retention relationship modeling | 85.5-99.5% balanced accuracy for PFAS source classification [27] |
| Unsupervised Learning | k-means, Hierarchical Cluster Analysis, Principal Component Analysis | Sample clustering, dimensionality reduction, pattern discovery | Grouping samples by chemical similarity without prior labels [27] [36] |
| Deep Learning | Neural Networks, TensorFlow, PyTorch | Peak alignment, feature extraction, spectrum-structure relationship modeling | Automated feature extraction reducing manual operations [35] |
| Ensemble Methods | Random Forest, Model stacking | Improving prediction accuracy and robustness | Enhanced compound identification probability [16] |
The following workflow diagram illustrates the complete ML-NTA pipeline and the iterative validation process:
Objective: To identify and apportion contamination to specific sources using chemical fingerprinting and ML classification [27] [36].
Materials and Instruments:
Procedure:
Expected Outcomes: Source classification balanced accuracy of 85.5-99.5% has been demonstrated for PFAS source tracking [27].
Objective: To increase confidence in compound identifications by integrating spectral matching with predicted retention time indices using machine learning [16].
Materials and Instruments:
Procedure:
Expected Outcomes: This approach has demonstrated 54.5%, 52.1%, and 46.7% increases in identification probability for pesticides in blank, 10× diluted, and 100× diluted tea matrices, respectively, compared to library matching alone [16].
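The RTI-prediction step can be sketched with a random-forest regressor trained on molecular descriptors. The descriptor matrix and RTI values below are synthetic stand-ins, not the published model or dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(2)

# Synthetic stand-in: 200 compounds x 6 descriptors; the "true" RTI is
# a noisy function of the first two descriptors.
X = rng.normal(size=(200, 6))
rti = 500 + 120 * X[:, 0] + 60 * X[:, 1] + rng.normal(scale=20, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, rti, test_size=0.3, random_state=2)
model = RandomForestRegressor(n_estimators=200, random_state=2).fit(X_tr, y_tr)
r2 = r2_score(y_te, model.predict(X_te))

# Candidates whose predicted RTI deviates strongly from the observed
# retention behaviour can be down-weighted or rejected during annotation.
```

Combining such retention predictions with spectral-library scores is what drives the identification-probability gains reported above.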
Table 2: Performance Metrics for ML-Enhanced NTA Applications
| Application Area | ML Technique | Performance Metrics | Reference |
|---|---|---|---|
| PFAS Source Tracking | Random Forest, SVC, Logistic Regression | 85.5-99.5% balanced accuracy for source classification | [27] |
| Compound Identification | k-Nearest Neighbors with RTI integration | 54.5% average increase in identification probability | [16] |
| Feature Extraction | Mass-Suite algorithms | 99.5% feature extraction accuracy | [36] |
| Inter-platform Transferability | Random Forest RTI prediction | R² = 0.96 (training), 0.88 (testing) for RTI correlation | [16] |
Machine learning enables sophisticated data mining approaches that extend beyond basic compound identification in NTA workflows. The Mass-Suite package exemplifies this advancement with specialized modules for unsupervised clustering and source tracking that leverage ML algorithms to extract meaningful patterns from complex HRMS datasets [36].
Unsupervised ML approaches including hierarchical clustering analysis and k-means clustering enable pattern discovery in NTA data without prior knowledge of sample groupings [36]. These techniques:
The source tracking function within Mass-Suite represents a cutting-edge application of ML in NTA, moving beyond classification to quantitative source apportionment [36]. This approach:
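One common formulation of quantitative source apportionment (a sketch of the general idea, not necessarily Mass-Suite's implementation) expresses a sample's feature profile as a non-negative combination of source fingerprints and solves for the contributions by non-negative least squares:

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical source fingerprints: rows = features, columns = sources.
# In practice these would be mean feature-intensity profiles of
# reference samples from each candidate source.
F = np.array([[1.0, 0.0],
              [0.5, 0.2],
              [0.0, 1.0],
              [0.1, 0.8]])

# A mixed sample composed of 70% source 1 and 30% source 2 (noise-free
# here for clarity).
sample = F @ np.array([0.7, 0.3])

# Non-negative least squares recovers the source contributions.
contrib, residual = nnls(F, sample)
fractions = contrib / contrib.sum()
```

The non-negativity constraint is what makes the recovered fractions physically interpretable as source shares; with noisy real data the residual quantifies how well the fingerprint library explains the sample.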
The following diagram illustrates the relationship between different ML approaches and their applications in the NTA workflow:
Table 3: Essential Research Reagents and Computational Tools for ML-NTA
| Tool Category | Specific Tools/Resources | Function in ML-NTA Workflow | Key Features |
|---|---|---|---|
| HRMS Instrumentation | Q-TOF, Orbitrap systems | High-resolution data acquisition for comprehensive chemical analysis | High mass accuracy, resolution; tandem MS capabilities [27] |
| Open-Source Software | Mass-Suite, XCMS, MZmine, PatRoon | Data processing, feature detection, alignment | Flexible workflows, machine learning integration [36] |
| Spectral Libraries | NIST, MassBank, in-house databases | Compound identification via spectral matching | Reference spectra for annotation confidence [3] [16] |
| Machine Learning Frameworks | scikit-learn, TensorFlow, PyTorch | Predictive modeling, pattern recognition | Pre-built algorithms, neural network architectures [35] [36] |
| Chemical Databases | DSSTox, PubChem, ChEMBL | Structural information, metadata | Chemical properties, hazard data for prioritization [8] |
| Specialized ML-NTA Tools | INTERPRET NTA | Data quality assessment, chemical prioritization | Integrates metadata, spectral similarity, hazard scoring [8] |
Despite significant advances, several challenges remain in the full operationalization of ML-assisted NTA for environmental decision-making. The most critical gap lies in the absence of systematic frameworks bridging raw NTA data to environmentally actionable parameters [27]. Key areas for future development include:
Complex models like deep neural networks can achieve high classification accuracy, but their black-box nature limits transparency and hinders the ability to provide chemically plausible attribution rationale required for regulatory actions [27]. Future research should focus on:
Current validation strategies in ML-assisted NTA studies remain fragmented and overly reliant on laboratory-based tests, which may underperform in real-world conditions involving field-validated source-receptor relationships [27]. Enhanced validation should include:
ML-assisted NTA shows promise for enhancing risk assessment frameworks through improved contaminant identification and hazard evaluation [1]. Future applications should focus on:
As ML-NTA methodologies continue to mature, they hold tremendous potential for transforming how we monitor, assess, and manage chemical contaminants in the environment, ultimately contributing to more effective environmental protection and public health safeguards [27] [1].
Quantitative non-targeted analysis (qNTA) represents a significant advancement in the field of analytical chemistry, serving as an essential tool for characterizing emerging contaminants in environmental, biological, and product-based samples. While traditional non-targeted analysis (NTA) focuses primarily on chemical identification, qNTA extends this capability by producing quantitative chemical concentration estimates. These estimates provide crucial data that can inform provisional risk-based decisions and prioritize targets for follow-up analysis, effectively bridging the gap between compound discovery and risk assessment [37].
The fundamental difference between NTA and qNTA lies in their analytical outputs. Suspect screening aims to identify potentially known compounds, NTA works to discover and identify completely unknown chemicals, and qNTA adds the critical layer of concentration estimation for these identified unknowns. This quantitative dimension enables researchers to answer not just "what is present?" but also "how much is there?" – an essential question for meaningful risk assessment and regulatory decision-making [37].
Many common qNTA and "semi-quantitative" approaches rely on surrogate chemicals for calibration and model predictions. The selection of appropriate surrogates is therefore critical for generating accurate concentration estimates. Historically, surrogates have often been chosen based on a combination of intuition and/or availability rather than rational, structure-based selection. This limitation has constrained the degree to which qNTA can be objectively, mathematically assessed and improved [37].
Recent research has systematically assessed the extent to which chemical structure should inform the selection of qNTA surrogates using datasets from liquid chromatography high-resolution mass spectrometry (LC-HRMS) experiments. This work involves calculating a chemical space embedding using available LC-HRMS training data and 2D molecular descriptors deemed important to electrospray ionization efficiency. By implementing multiple structure-based surrogate selection strategies and comparing them to random selection using qNTA metrics for accuracy, uncertainty, and reliability, researchers have demonstrated that qNTA models can significantly benefit from rational surrogate selection strategies [37].
The following diagram illustrates the comprehensive workflow for quantitative non-targeted analysis, from sample preparation to final risk assessment:
This workflow begins with sample preparation and extraction, followed by analysis using liquid chromatography high-resolution mass spectrometry (LC-HRMS). The resulting data undergoes processing and feature detection before compound identification and structural elucidation. The quantitative phase involves careful surrogate selection and calibration, leading to concentration estimation and culminating in risk assessment and priority ranking of identified compounds [37].
Proper sample preparation is crucial for successful qNTA. Protocols vary depending on sample matrix but generally include:
For LC-HRMS analysis, the following parameters should be optimized:
Rational surrogate selection represents a significant advancement in qNTA methodology. The following protocol outlines the structure-based approach:
Chemical Space Embedding Calculation:
Leverage Calculation:
Surrogate Selection Strategies:
Coverage Assessment:
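The embedding and leverage steps outlined above can be sketched as follows: descriptors are mean-centred, and a query chemical's leverage h = x (X^T X)^{-1} x^T measures how far it sits from the space spanned by the surrogate training set (all matrices below are synthetic, and the descriptor count is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical descriptor matrix: rows = surrogate chemicals, columns =
# 2D descriptors relevant to electrospray ionization efficiency.
X = rng.normal(size=(40, 5))
mu = X.mean(axis=0)
Xc = X - mu  # mean-centre before computing leverage

def leverage(Xc, x_centred):
    """Leverage of a centred query in the surrogate descriptor space."""
    G = np.linalg.inv(Xc.T @ Xc)
    return float(x_centred @ G @ x_centred)

# A query near the centre of the surrogate space has low leverage (good
# coverage); a structural outlier has high leverage (poor coverage).
h_center = leverage(Xc, np.zeros(5) - mu)
h_outlier = leverage(Xc, 10.0 * np.ones(5) - mu)
```

High-leverage unknowns fall outside the surrogate chemical space, so their concentration estimates should be flagged as extrapolations with elevated uncertainty.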
The following table details key research reagent solutions and essential materials used in qNTA experiments:
| Reagent/Material | Function in qNTA | Specification Notes |
|---|---|---|
| LC-MS Grade Solvents | Mobile phase preparation; sample reconstitution | Acetonitrile, methanol, water with < 1 ppm additives |
| Surrogate Standards | Calibration and response factor estimation | Preferably structure-informed selection from chemical space |
| Internal Standards | System performance monitoring; retention time correction | Stable isotope-labeled compounds covering various chemical classes |
| SPE Cartridges | Sample clean-up and concentration | Various chemistries (C18, HLB, etc.) based on application |
| Reference Mass Compounds | Mass axis calibration during HRMS analysis | Compounds providing precise mass locks in positive/negative modes |
The table below summarizes quantitative data and performance metrics for qNTA methods:
| Quantitative Metric | Target Range | Application in Risk Assessment |
|---|---|---|
| Concentration Estimate Accuracy | Typically within ±50% of true value | Determines reliability for risk-based decisions |
| Chemical Space Coverage (LARD metric) | Higher values indicate better coverage | Ensures representative quantification across diverse structures |
| Response Factor Variability | Lower variability improves quantification | Impacts uncertainty of concentration estimates |
| Limit of Quantification (LOQ) | Compound-dependent; lower is better | Determines lowest measurable level for risk screening |
| Surrogate Selection Efficiency | Structure-based vs. random comparison | Informs optimal approach for specific applications |
Recent research has demonstrated that qNTA models benefit significantly from rational surrogate selection strategies. Interestingly, studies have also shown that a large enough random surrogate sample can perform as well as a smaller, chemically informed surrogate sample. This finding provides important practical guidance for researchers designing qNTA studies, suggesting that when sufficient surrogates are available, random selection may be adequate, but when working with limited surrogates, structure-based selection becomes crucial [37].
The following diagram illustrates the comparative effectiveness of structure-based versus random surrogate selection strategies in qNTA:
This diagram highlights the key finding that both structure-based selection and random selection with large sample sizes can achieve optimal performance, while small random sample sizes typically yield suboptimal results with higher uncertainty [37].
The application of high-resolution mass spectrometry continues to advance qNTA capabilities. Recent innovations in sample pretreatment and analysis have significantly reduced the unknown chemical space, while machine learning models have been increasingly incorporated into HRMS data mining workflows [38].
Effect-directed analysis represents another promising approach that aids in the discovery of toxic fractions, though the identification of specific toxicity drivers remains a challenge. As these methodologies mature, qNTA is poised to become an increasingly powerful tool for comprehensive chemical characterization and risk-based prioritization in complex environmental, biological, and product-based samples [38].
The integration of structure-based surrogate selection with advanced HRMS instrumentation and data processing algorithms will continue to enhance the accuracy, reliability, and application scope of qNTA methodologies. This progression promises to strengthen the bridge between compound discovery and meaningful risk assessment in increasingly complex sample matrices.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) represents a paradigm shift in analytical chemistry, enabling comprehensive characterization of complex samples without a priori knowledge of their chemical composition [14] [20]. This discovery-based approach has become indispensable across environmental monitoring, exposomics, and clinical biomarker discovery, where researchers face the challenge of identifying unknown or unexpected chemicals that may significantly impact ecosystem and human health [39] [19]. Unlike traditional targeted methods that quantify specific predefined analytes, NTA generates global chemical information, allowing researchers to detect novel chemical stressors, retrospectively screen archived samples, and classify samples based on chemical profiles [14] [39].
The versatility of HRMS platforms makes NTA amenable to virtually any sample medium, including air, water, soil, food, consumer products, and biological specimens [14] [19]. As momentum builds to integrate NTA into chemical monitoring and regulatory decision-making frameworks, standardized approaches for assessing and communicating method performance have become increasingly critical [14] [20]. This technical guide explores the foundational principles, current applications, methodological considerations, and future directions of NTA across three key domains, providing researchers with practical frameworks for implementing these powerful approaches in their own work.
Harmonized terminology is essential for accurate communication of NTA methods and results. The Benchmarking and Publications for Non-Targeted Analysis (BP4NTA) Working Group has established consensus definitions for key terms spanning all aspects of NTA workflows [20]:
Evaluating NTA method performance presents distinct challenges compared to targeted analyses due to inherent uncertainties in detecting and identifying unknown compounds [14]. In contrast to targeted methods where performance metrics like accuracy, precision, sensitivity, and selectivity are well-established, NTA requires specialized assessment approaches:
Table 1: Comparison of Targeted Analysis and Non-Targeted Analysis
| Characteristic | Targeted Analysis | Non-Targeted Analysis |
|---|---|---|
| Objective | Quantify predefined analytes | Discover unknown/unsuspected chemicals |
| Identification Confidence | High (reference standards) | Variable (level system) |
| Performance Metrics | Well-established (accuracy, precision, LOD/LOQ) | Evolving frameworks |
| Chemical Coverage | Limited (dozens to hundreds) | Extensive (thousands of features) |
| Standardization | Mature protocols | Under development |
NTA has revolutionized water quality assessment by enabling comprehensive detection of chemical contaminants, including transformation products and newly synthesized compounds that escape conventional targeted methods [19]. The chemical space captured in water samples predominantly includes per- and polyfluoroalkyl substances (PFAS), pharmaceuticals, pesticides, and their transformation products [19]. Successful NTA of water matrices requires careful consideration of sample preparation to concentrate low-abundance contaminants while minimizing interferences [39].
A systematic review of NTA applications found that in water studies, 51% used only LC-HRMS, 32% used only GC-HRMS, and 16% used both platforms to expand chemical coverage [19]. This multi-platform approach is crucial since LC-HRMS better captures polar, water-soluble compounds, while GC-HRMS excels for non-polar, volatile compounds [19]. Effect-directed analysis (EDA) combined with NTA has proven particularly valuable for identifying toxicity drivers in complex water samples, with studies demonstrating that NTA explains a median of 34% of observed toxicity compared to just 13% for targeted analysis alone [40].
In soil and sediment analysis, NTA frequently detects pesticides and polycyclic aromatic hydrocarbons (PAHs), while air monitoring focuses on volatile and semi-volatile organic compounds (VOCs/SVOCs) [19]. The choice of ionization technique significantly shapes the detectable chemical space.
A critical challenge in environmental NTA is the selective exclusion of polar, highly polar, and ionic compounds when using reverse-phase liquid chromatography (RPLC), which remains overrepresented in HRMS-NTA methods [40]. This analytical bias means many potentially significant environmental contaminants may be overlooked in standard monitoring campaigns.
The exposome encompasses all non-genetic exposures individuals experience throughout life, constituting a critical determinant of health [39]. HRMS-based exposomics aims to comprehensively profile small-molecule exposure agents (molecular weight ≤1000 Da), their transformation products, and associated biomolecules in human matrices [39]. This approach represents a fundamental shift from hypothesis-driven, quantitation-centric targeted analyses toward data-driven, hypothesis-generating chemical exposome-wide profiling [39].
Recent studies have demonstrated that all-cause mortality is driven more by the exposome than the genome, highlighting the critical importance of comprehensive exposure assessment [39]. The chemical space of the human exposome is vast, with global inventories cataloging over 350,000 compounds and mixtures in commercial production, approximately triple previous estimates [39]. Surprisingly, about 120,000 substances remain inconclusively identified due to corporate confidentiality, creating significant gaps in exposure knowledge [39].
Human exposomics faces unique analytical challenges distinct from other applications; the chemical classes most frequently detected in human matrices are summarized in Table 2.
Table 2: Chemical Classes Frequently Detected in Human Exposomics Studies
| Chemical Class | Detection Frequency | Major Sources | Analytical Platform |
|---|---|---|---|
| Plasticizers | High | Food packaging, consumer products | LC-ESI(+/-) |
| Pesticides | High | Diet, residential applications | LC-ESI(+), GC-EI |
| Halogenated Compounds | Medium | Flame retardants, industrial processes | GC-EI, LC-ESI(-) |
| Pharmaceuticals | Medium | Medication use | LC-ESI(+) |
| PFAS | Medium | Stain-resistant coatings, firefighting foam | LC-ESI(-) |
| Personal Care Products | Medium | Cosmetics, hygiene products | LC-ESI(+) |
Although the applications discussed so far focus primarily on environmental and exposomics contexts, NTA principles in clinical biomarker discovery share fundamental methodologies with exposomics research. The integration of NTA into clinical studies enables unprecedented discovery of metabolic signatures associated with disease states, treatment response, and environmental exposures [39]. The transdisciplinary field of exposomics provides a framework for discovering environmental drivers of disease through unbiased, scalable analytical approaches [39].
The transition from biomarker discovery to clinical validation requires careful consideration of analytical performance and standardization. As noted in NTA research, meaningful evaluation of study performance is predicated on harmonized terminology and clear guidance about best practices for analysis and reporting results [20]. This is particularly crucial in clinical applications where findings may inform diagnostic or therapeutic decisions.
Clinical biomarker discovery using NTA requires special attention to methodological aspects such as analytical performance, standardization, and transparent reporting.
Effective data processing is essential for transforming raw HRMS data into meaningful chemical information. A typical NTA data processing workflow includes three primary segments [23]:
Data Processing: Transforming raw data into a list of features with associated abundance information [23].
Statistical and Chemometric Analysis: Identifying trends, clusters, and relationships between samples and/or detections [23].
Annotation and Identification: Attributing molecular characteristics or specific compound identities to detected features [23].
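The three segments above can be sketched in miniature as plain Python. The intensity threshold, ppm tolerance, library entry, and compound name below are illustrative assumptions, not values from the cited workflows.

```python
from dataclasses import dataclass

@dataclass
class Feature:
    mz: float          # mass-to-charge ratio
    rt: float          # retention time (min)
    intensity: float   # abundance estimate

def process_raw(raw_signals, min_intensity=1e4):
    """Segment 1: reduce raw (m/z, rt, intensity) signals to a feature list."""
    return [Feature(*s) for s in raw_signals if s[2] >= min_intensity]

def summarize(features):
    """Segment 2: a trivial chemometric summary (mean feature abundance)."""
    return sum(f.intensity for f in features) / len(features)

def annotate(feature, library, ppm_tol=5.0):
    """Segment 3: annotate a feature by exact-mass lookup within a ppm window."""
    return [name for name, mz in library.items()
            if abs(feature.mz - mz) / mz * 1e6 <= ppm_tol]

signals = [(285.0795, 6.2, 5e5), (301.1410, 7.8, 2e3)]  # second is below threshold
feats = process_raw(signals)
hits = annotate(feats[0], {"candidate_A [M+H]+": 285.0790})  # hypothetical library
```

Even this toy version shows how each segment progressively reduces and enriches the data handed to the next.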
A significant challenge in NTA data processing is the terminology differences across software platforms, where different tools may use the same term to describe different steps [23]. This underscores the importance of detailed methodological reporting to ensure reproducibility.
Data visualization plays a crucial role throughout the NTA workflow, providing core components for data inspection, evaluation, and sharing [41].
Visualizations extend human cognitive abilities by translating data to a more accessible visual channel, particularly important for abstract data components like multi-dimensional chromatographic outputs or MS/MS spectral data [41].
Table 3: Essential Research Reagent Solutions for NTA Workflows
| Reagent/Category | Function | Application Examples |
|---|---|---|
| HLB SPE Cartridges | Broad-spectrum extraction of organic compounds | Water analysis, serum proteome precipitation |
| HybridSPE-Precipitation Plates | Phospholipid removal from biological samples | Serum/plasma exposome analysis |
| Stable Isotope-Labeled Standards | Quality control, retention time calibration | Internal standards for performance monitoring |
| QC Pooled Samples | Monitoring instrumental performance | Inter-batch normalization |
| Reference Standard Mixtures | MS/MS spectral library generation | Compound identification verification |
| Mobile Phase Additives | Modifying chromatography and ionization | Formic acid, ammonium acetate buffers |
Despite significant advances, NTA still faces several critical limitations that constrain its application, including incomplete spectral libraries, semi-quantitative concentration estimates, and analytical bias toward compounds amenable to common chromatographic and ionization conditions.
Promising developments, such as machine learning integration, harmonized terminology and reporting frameworks, and multi-platform analytical approaches, are addressing these limitations.
Non-targeted analysis using high-resolution mass spectrometry has emerged as a transformative approach across environmental monitoring, exposomics, and clinical biomarker discovery. By enabling comprehensive characterization of complex samples without a priori knowledge of chemical content, NTA provides unprecedented capabilities for discovering unknown environmental contaminants, mapping the human exposome, and identifying novel metabolic signatures of disease. As the field continues to mature, ongoing efforts in method harmonization, performance assessment standardization, and expanded chemical space coverage will be essential for realizing the full potential of NTA in protecting human health and the environment. The integration of advanced data visualization strategies, multi-platform analytical approaches, and community-wide collaboration frameworks will further enhance the impact and applicability of NTA across diverse scientific disciplines.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful approach for identifying unknown and unexpected chemical compounds in complex samples. Unlike targeted methods that focus on predefined analytes, NTA generates comprehensive chemical profiles without prior knowledge of sample composition [12] [14]. This capability makes NTA particularly valuable for environmental monitoring, pharmaceutical development, and emergency response scenarios where unknown chemical releases may occur [12].
However, the interpretive power of NTA depends entirely on data quality. Various technical challenges can compromise results, leading to false positives, missed detections, and erroneous quantitations [14]. This technical guide examines common data quality issues in HRMS-based NTA and provides diagnostic approaches to address them, framed within the broader context of NTA data interpretation research.
The complex, multi-step workflow of HRMS-based NTA introduces numerous potential sources of error that can affect data quality and interpretation.
Complex sample matrices can obscure chemical signals through ion suppression or enhancement, particularly in biological and environmental samples [44]. Matrix effects alter ionization efficiency, leading to inaccurate compound quantification and detection. Additionally, chemical noise from solvents, contaminants, and sample processing materials can generate false features or mask true signals of interest [45].
HRMS instruments exhibit performance fluctuations that affect data quality. Mass accuracy drift, retention time shifting, and intensity variations can occur due to environmental changes, calibration status, or instrument aging [14] [44]. Without proper monitoring and correction, these variations reduce confidence in compound identification and quantification across multiple analytical batches.
Feature detection algorithms may generate false positives through incorrect peak picking, alignment errors, or adduct misassignment [27]. Conversely, these algorithms can also produce false negatives by missing low-abundance features or failing to separate co-eluting compounds. Inconsistent data processing parameters across samples or batches introduces additional variability that complicates result interpretation [14].
Without authentic standards, compound identification relies on spectral matching and in-silico fragmentation prediction, both of which have inherent limitations [14]. Isomeric compounds often produce similar fragmentation patterns, leading to ambiguous annotations. Database incompleteness further exacerbates this issue, as unknown compounds cannot be matched to reference spectra [1].
NTA methods typically provide semi-quantitative estimates rather than precise concentration measurements [14]. Response factors vary significantly across chemical classes, making accurate quantification without compound-specific standards challenging. This limitation is particularly problematic for risk assessment applications where precise concentration data is essential [14].
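As a concrete illustration of why such estimates are only semi-quantitative, a common fallback is to scale an unknown's peak area by the response of a surrogate standard. The function and values below are a hypothetical sketch of that assumption.

```python
def estimate_concentration(analyte_area, surrogate_area, surrogate_conc_ngL):
    """Surrogate-based semi-quantitation: assumes the unknown ionizes with the
    same response factor as a surrogate standard, an assumption that can be
    off by orders of magnitude across chemical classes."""
    return analyte_area / surrogate_area * surrogate_conc_ngL

# Hypothetical peak areas, with a surrogate spiked at 100 ng/L
estimate = estimate_concentration(5.0e5, 1.0e6, 100.0)
```

The resulting value is an order-of-magnitude estimate at best, which is why compound-specific standards remain necessary for risk-assessment-grade quantification.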
Table 1: Common Data Quality Issues and Their Impacts on NTA Results
| Quality Issue | Primary Causes | Impact on Data Interpretation |
|---|---|---|
| Chemical Noise | Matrix effects, contaminant ions, solvent impurities | Reduced signal-to-noise ratio, false feature detection |
| Mass Accuracy Drift | Instrument calibration status, environmental fluctuations | Incorrect molecular formula assignment, reduced identification confidence |
| Retention Time Shifts | Chromatographic system variability, column aging | Misalignment across samples, incorrect peak matching |
| Feature Misannotation | Incorrect adduct assignment, isotope pattern misidentification | Wrong molecular formula assignment, structural misidentification |
| Spectral Library Gaps | Limited reference databases, absent authentic standards | Reduced identification rates, uncertain compound annotation |
Implementing systematic quality assessment protocols is essential for identifying and mitigating data quality issues in NTA workflows.
Incorporating quality control (QC) samples throughout the analytical batch enables continuous monitoring of system performance [44]. Pooled QC samples, prepared by combining small aliquots of all study samples, assess overall system stability. Process blanks identify contamination sources, while spiked samples with internal standards monitor extraction efficiency and matrix effects [44].
Monitoring specific technical parameters throughout analysis provides quantitative assessment of data quality. Key metrics include mass error (typically < 5 ppm for Orbitrap instruments), retention time stability (relative standard deviation < 2%), and peak intensity variance across replicate injections [44]. Establishing acceptance thresholds for these metrics ensures consistent data quality across batches.
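A minimal check of the two acceptance thresholds named above (mass error below 5 ppm, retention time RSD below 2%) might look like the following; the replicate values are invented for illustration.

```python
import statistics

def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def rsd_percent(values):
    """Relative standard deviation (%) across replicate measurements."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

# Invented replicate readings of one reference compound across a batch
measured_mzs = [285.0792, 285.0794, 285.0790]
theoretical_mz = 285.0790
retention_times = [6.21, 6.24, 6.19]  # minutes

mass_ok = all(abs(ppm_error(m, theoretical_mz)) < 5.0 for m in measured_mzs)
rt_ok = rsd_percent(retention_times) < 2.0
```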
Implementing a standardized confidence framework for compound identification clarifies uncertainty levels. The Schymanski et al. (2014) scale is widely adopted, ranging from Level 1 (confirmed structure with reference standard) to Level 5 (exact mass of interest but no structural information) [27]. This tiered system communicates identification certainty to stakeholders and supports appropriate data interpretation.
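The tiered scale can be encoded as a simple decision function. The boolean evidence flags and their ordering below are a simplified reading of the Schymanski et al. levels, not an official implementation.

```python
def confidence_level(reference_standard=False, library_msms_match=False,
                     tentative_candidate=False, unequivocal_formula=False):
    """Map available identification evidence to the Schymanski et al. (2014)
    confidence levels 1-5 (simplified decision logic for illustration)."""
    if reference_standard:
        return 1   # confirmed structure via authentic standard
    if library_msms_match:
        return 2   # probable structure (library/diagnostic evidence)
    if tentative_candidate:
        return 3   # tentative candidate(s)
    if unequivocal_formula:
        return 4   # unequivocal molecular formula
    return 5       # exact mass of interest only

best = confidence_level(reference_standard=True)
worst = confidence_level()
```

Encoding the scale this way makes it easy to attach a machine-readable confidence column to every annotation in a feature table.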
A comprehensive evaluation of NTA performance should address four key aspects: quality, boundary, accuracy, and precision [44]. Quality assessment verifies adherence to QA/QC protocols. Boundary evaluation defines the chemical space covered by the method. Accuracy measurement compares results to known values, while precision assessment examines repeatability and reproducibility [44].
Table 2: Key Performance Metrics for NTA Data Quality Assessment
| Performance Aspect | Assessment Method | Recommended Frequency |
|---|---|---|
| Mass Accuracy | Analysis of reference compounds with known m/z values | Each analytical batch |
| Retention Time Stability | Monitoring of internal reference compounds | Throughout analytical sequence |
| Signal Intensity Reproducibility | Relative standard deviation of QC sample features | Every 10-12 samples |
| Feature Detection Consistency | Comparison of features detected in replicate analyses | Each sample type |
| Blank Contamination | Analysis of process blanks with study samples | Each extraction batch |
Purpose: Verify instrument performance before sample analysis. Materials: Reference standard mixture containing compounds spanning relevant chemical space. Procedure:
Purpose: Distinguish real chemical features from artifacts. Materials: Study samples, process blanks, pooled QC samples. Procedure:
Purpose: Assign confidence levels to compound annotations. Materials: HRMS/MS data, spectral databases (e.g., NIST, MassBank), computational tools. Procedure:
NTA Data Quality Assessment Workflow
Table 3: Key Research Reagents and Materials for NTA Quality Assurance
| Reagent/Material | Function | Application Example |
|---|---|---|
| Quality Control Reference Standards | Monitor instrument performance and data quality | System suitability testing with known compounds |
| Internal Standard Mixture | Correct for matrix effects and injection variability | Isotopically-labeled compounds spiked into all samples |
| Solid Phase Extraction Cartridges | Concentrate analytes and remove matrix interferents | Sample preparation for trace analysis (HLB, WAX, WCX) |
| Retention Index Calibration Mix | Standardize retention time alignment across samples | Hydrocarbon series for GC-HRMS or homologous series for LC-HRMS |
| Spectral Libraries | Support compound identification through spectrum matching | NIST, MassBank, mzCloud databases |
| Certified Reference Materials | Validate method accuracy and performance | EPA methods, NIST standard reference materials |
Emerging computational approaches enhance data quality in NTA workflows. Machine learning algorithms improve compound identification accuracy by recognizing complex patterns in HRMS data [1] [27]. Prioritization strategies help focus resources on the most relevant chemical features, addressing the data overload challenge in NTA [45] [9].
Seven key prioritization strategies have been identified [45] [9]:

1. Target and suspect screening using reference databases
2. Data quality filtering to remove artifacts
3. Chemistry-driven prioritization focusing on specific compound classes
4. Process-driven prioritization using spatial/temporal comparisons
5. Effect-directed prioritization linking features to biological effects
6. Prediction-based prioritization using quantitative structure-property relationships
7. Pixel- or tile-based analysis for complex chromatographic data
Prioritization Strategy Workflow
Data quality is fundamental to generating reliable, interpretable results in HRMS-based non-targeted analysis. By understanding common quality issues and implementing systematic diagnostic approaches, researchers can improve the accuracy and reproducibility of NTA studies. Integrating robust quality assurance protocols, standardized confidence frameworks, and advanced computational approaches addresses the inherent challenges of NTA and supports its transition from exploratory research to regulatory applications. As the field evolves, continued development of standardized performance metrics and validation procedures will further enhance the reliability and interpretability of NTA data.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful approach for detecting unknown and unexpected compounds in complex samples, filling critical data gaps not easily addressed by targeted methods [5]. The principal challenge of contemporary NTA lies not in detection itself, but in developing computational methods to extract meaningful environmental information from the vast chemical datasets generated by HRMS instruments [27]. A typical LC-HRMS dataset comprises a series of high-resolution mass spectra collected over time, resulting in abstract feature triplets consisting of retention time (rt), mass-to-charge ratio (m/z), and intensity (I) for each detected substance [46]. The fundamental goal of data processing is to transform this raw, complex data into a structured list of chemically relevant features—groupings of associated MS1 components like isotopologues and adducts—which can then be used for statistical analysis, annotation, and identification [23].
The data processing workflow is critical yet challenging, involving numerous user-defined parameters with poorly understood interactions [46]. Variations in these parameters can significantly impact the final results, making optimization essential for reliable outcomes. Unlike targeted analyses, where performance metrics like selectivity, sensitivity, accuracy, and precision are well-defined, NTA methods lack standardized performance assessment procedures, creating a barrier to broader adoption and confident interpretation of results [5]. This guide addresses these challenges by providing a detailed framework for optimizing data processing parameters and thresholds, ensuring researchers can generate high-quality, reproducible data suitable for their specific research objectives, whether sample classification, chemical identification, or quantitative estimation.
The transition from raw HRMS data to interpretable chemical information follows a structured workflow involving sequential computational steps. This process intentionally reduces the data, transforming raw spectral information into aligned features ready for statistical analysis and chemical interpretation [27] [23]. Each stage employs specific algorithms with associated parameters that require careful optimization to balance sensitivity, specificity, and computational efficiency.
Before core processing begins, data often requires format conversion from proprietary vendor formats (.raw, .d, .wiff) to open standards like mzML or mzXML [23]. This step, while not intentionally interpretive, may cause data loss if not carefully evaluated. Subsequently, centroiding is typically the first user-applied processing step for data collected in profile mode. This process reduces the number of data points by a factor of 10–150 by converting high-resolution mass peak profiles into single centroids defined by an m/z value and intensity [46].
Table 1: Common Centroiding Algorithms and Their Characteristics
| Algorithm | Underlying Principle | m/z Determination | Key Considerations |
|---|---|---|---|
| Continuous Wavelet Transform (CWT) | Uses wavelet transforms for peak detection [46]. | Local maximum analysis of the scalogram (measured value) [46]. | Faster but may have larger m/z errors compared to interpolation methods [46]. |
| Full Width at Half Maximum (FWHM) | Identifies peaks based on width at half height [46]. | Interpolation of the center within the peak profile's FWHM range [46]. | An "exact mass" method; provides interpolated m/z values [46]. |
| Savitzky-Golay Derivative | Detects zero-crossings of the first-order derivative [46]. | Interpolation between data points [46]. | Improves m/z accuracy; based on well-established peak detection principles [46]. |
| Non-Linear Regression (e.g., Cent2Prof) | Fits a Gaussian peak model to the profile data [46]. | Regression coefficients from the model fit [46]. | Retains peak width information; computationally intensive [46]. |
| Linearized Regression | Linearizes the Gaussian function via log-transform [46]. | Faster linear regression coefficients [46]. | Significant time savings (factor 100–1000) vs. non-linear regression [46]. |
The choice of centroiding algorithm influences mass accuracy, a critical factor for subsequent compound identification. Vendor-specific algorithms are often used, but open alternatives like CWT and FWHM are widely implemented in common tools like MzMine and msConvert [46]. Furthermore, centroiding involves an inherent trade-off: while it achieves crucial data compression, it also results in information loss, such as peak width details that convey mass accuracy and precision. Advanced methods like Cent2Prof aim to mitigate this by retaining peak width as a regression coefficient [46].
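To make the data-reduction idea concrete, the sketch below implements a deliberately naive centroider: it finds local maxima in a profile spectrum and reports an intensity-weighted m/z over each maximum and its two neighbors. Real vendor and open-source algorithms (CWT, FWHM interpolation, regression fits) are considerably more sophisticated.

```python
def centroid_profile(mz, intensity, min_intensity=0.0):
    """Collapse profile-mode points into centroids at local maxima, using an
    intensity-weighted m/z over each maximum and its two neighbors.
    A deliberately simplified sketch of the centroiding step."""
    centroids = []
    for i in range(1, len(mz) - 1):
        if (intensity[i] >= intensity[i - 1] and intensity[i] > intensity[i + 1]
                and intensity[i] > min_intensity):
            window = range(i - 1, i + 2)
            total = sum(intensity[j] for j in window)
            wmz = sum(mz[j] * intensity[j] for j in window) / total
            centroids.append((wmz, intensity[i]))
    return centroids

# A symmetric three-point profile peak centered at m/z 200.0000
mz_profile = [199.999, 200.000, 200.001]
inten_profile = [50.0, 100.0, 50.0]
centroids = centroid_profile(mz_profile, inten_profile)
```

Note how the output discards peak-width information, which is exactly the loss that methods like Cent2Prof try to avoid by retaining width as a regression coefficient.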
Following centroiding, the processed data undergoes chromatographic peak detection. This step identifies features in the chromatographic dimension by analyzing extracted ion chromatograms (XICs). A key parameter here is signal thresholding, which removes signals below a designated abundance threshold (absolute value) or a signal-to-noise (S/N) ratio [23]. Setting this threshold is critical; too high a value risks missing low-abundance but chemically significant features, while too low a value drastically increases noise and false positives. Other relevant steps include chromatogram smoothing to reduce noise and shoulder peaks filtering, particularly for data from Fourier transform MS instruments [23].
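The dual thresholding described above (absolute abundance plus S/N ratio) reduces to a simple predicate; the default values here are illustrative, not recommendations.

```python
def passes_threshold(peak_height, noise_level, abs_threshold=1000.0, min_sn=3.0):
    """Keep a chromatographic peak only if it clears both an absolute abundance
    threshold and a signal-to-noise ratio; defaults are illustrative."""
    return peak_height >= abs_threshold and (peak_height / noise_level) >= min_sn

keep = passes_threshold(5000.0, 100.0)         # strong peak: S/N = 50
drop_low = passes_threshold(500.0, 10.0)       # fails the absolute threshold
drop_noisy = passes_threshold(2000.0, 1000.0)  # fails the S/N criterion (S/N = 2)
```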
After peak detection, retention time (RT) alignment corrects for minor shifts in chromatographic retention times across different samples within a batch. This is essential for ensuring that the same chemical feature is correctly matched across all samples. Alignment can be based on user-selected or algorithm-selected representative compounds [23]. It is important to note that different mass spectrometers exhibit different retention time stability; for instance, Orbitrap systems coupled with high-performance liquid chromatography often show lower retention time drift than some Q-TOF systems, which may influence alignment stringency [27].
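A minimal stand-in for RT alignment is a least-squares linear correction estimated from reference compounds, sketched below with invented retention times; production tools use more flexible warping functions.

```python
def linear_rt_correction(ref_observed, ref_expected):
    """Estimate a linear RT correction (slope, intercept) from reference
    compounds via least squares; a minimal stand-in for alignment algorithms."""
    n = len(ref_observed)
    mx = sum(ref_observed) / n
    my = sum(ref_expected) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(ref_observed, ref_expected))
    sxx = sum((x - mx) ** 2 for x in ref_observed)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Observed RTs drift by +0.1 min relative to the batch reference run
slope, intercept = linear_rt_correction([2.1, 5.1, 9.1], [2.0, 5.0, 9.0])
corrected_rt = slope * 7.1 + intercept  # correct an arbitrary observed RT
```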
A single compound generates multiple signals in HRMS, including various adducts, isotopologues, and in-source fragments. The next processing steps group these related signals into a single "feature" representing the molecular entity.
Gap filling is a crucial recursive step that attempts to detect features missed during initial peak picking, often by using slightly lower intensity or S/N thresholds than the initial settings [23]. This helps correct for instances where a compound is present in multiple samples but only detected above the threshold in some.
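Grouping related MS1 signals often relies on characteristic mass differences, such as the 13C isotope spacing (about 1.00336 Da) and the offset between [M+Na]+ and [M+H]+ (about 21.98194 Da). The sketch below classifies candidate signals by these offsets, assuming co-elution has already been verified; the tolerance is an illustrative choice.

```python
C13_DELTA = 1.003355    # 13C - 12C mass difference (Da)
NA_H_DELTA = 21.981944  # m/z difference between [M+Na]+ and [M+H]+ (Da)

def classify_related(base_mz, candidate_mzs, tol=0.003):
    """Flag co-eluting MS1 signals that look like isotopologues or sodium
    adducts of a base peak; the m/z tolerance is an illustrative assumption,
    and co-elution is assumed to have been checked already."""
    groups = {"isotopologue": [], "sodium_adduct": [], "unrelated": []}
    for mz in candidate_mzs:
        if abs(mz - base_mz - C13_DELTA) <= tol:
            groups["isotopologue"].append(mz)
        elif abs(mz - base_mz - NA_H_DELTA) <= tol:
            groups["sodium_adduct"].append(mz)
        else:
            groups["unrelated"].append(mz)
    return groups

g = classify_related(200.0000, [201.0034, 221.9819, 300.0000])
```

Collapsing such groups into one feature is what prevents a single compound from being counted several times in downstream statistics.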
The final stages of processing prepare the feature list for statistical analysis. Between-sample alignment compares detected features across all samples in the study, grouping them based on allowed variances in m/z and RT [23]. This creates a consolidated feature list across the entire dataset.
Following alignment, several filtering steps are applied to enhance data quality, such as subtracting features present in procedural and solvent blanks and removing features that are poorly reproducible across QC injections.
The final output is a feature-intensity matrix, where rows represent samples and columns correspond to aligned chemical features, serving as the foundation for all subsequent statistical and chemometric analyses [27].
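A greedy version of between-sample alignment, with assumed m/z (ppm) and RT tolerance windows, can be sketched as follows; real implementations use more robust grouping, but the output has the same feature-by-sample structure.

```python
def align_features(per_sample_features, ppm_tol=10.0, rt_tol=0.2):
    """Greedy between-sample alignment: merge features whose m/z (ppm) and RT
    fall within tolerance, yielding a consensus feature list with per-sample
    intensities. A minimal sketch, not a production algorithm."""
    consensus = []  # each entry: [mz, rt, {sample_index: intensity}]
    for s_idx, feats in enumerate(per_sample_features):
        for mz, rt, inten in feats:
            for row in consensus:
                if (abs(mz - row[0]) / row[0] * 1e6 <= ppm_tol
                        and abs(rt - row[1]) <= rt_tol):
                    row[2][s_idx] = inten
                    break
            else:
                consensus.append([mz, rt, {s_idx: inten}])
    return consensus

samples = [
    [(285.0790, 6.20, 1.0e5)],
    [(285.0792, 6.25, 0.8e5)],  # same feature, slight m/z and RT drift
    [(310.1200, 3.10, 2.0e5)],  # feature unique to one sample
]
matrix = align_features(samples)
```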
Optimizing data processing parameters is not a one-time task but an iterative process essential for ensuring data quality. The following protocols provide methodologies for systematically evaluating and optimizing key parameters.
Objective: To determine the optimal balance between sensitivity and selectivity for chromatographic peak detection parameters (intensity threshold and S/N ratio) and gap filling.
Methodology:
Objective: To optimize parameters for retention time alignment and between-sample alignment (m/z and RT tolerance windows).
Methodology:
For qualitative NTA objectives like sample classification or chemical identification, performance can be assessed using a confusion matrix, adapted from traditional metrics [5].
Table 2: Performance Assessment for Qualitative NTA Using a Confusion Matrix
| Performance Metric | Calculation | Interpretation in NTA Context |
|---|---|---|
| Accuracy | (TP + TN) / (TP + FP + FN + TN) | Overall correctness of sample classification or feature detection. |
| Precision | TP / (TP + FP) | Proportion of detected/classified features that are correct (reliability). |
| Recall (Sensitivity) | TP / (TP + FN) | Proportion of true features that were successfully detected (completeness). |
| False Discovery Rate (FDR) | FP / (TP + FP) | Proportion of detected features that are incorrect. |
Application: This framework can be used to evaluate a data processing workflow's performance against a ground-truth dataset, such as the spiked standard set from Protocol 3.1. It helps quantify the trade-offs inherent in parameter selection.
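The Table 2 metrics follow directly from the confusion-matrix counts; the counts in the example below are hypothetical. Note that FDR is the complement of precision, so the two always sum to 1.

```python
def nta_metrics(tp, fp, fn, tn):
    """Compute the qualitative NTA performance metrics from
    confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "fdr": fp / (tp + fp),
    }

# Hypothetical ground-truth evaluation: 80 spiked compounds detected,
# 20 missed, 10 spurious detections, 90 artifacts correctly rejected
m = nta_metrics(tp=80, fp=10, fn=20, tn=90)
```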
A robust NTA study relies on more than just optimized software parameters. Several physical reagents and reference materials are essential for quality control, method validation, and performance assessment throughout the data processing workflow.
Table 3: Essential Research Reagents and Materials for NTA
| Item | Function in NTA Workflow |
|---|---|
| Certified Reference Materials (CRMs) | Used to verify analytical confidence and confirm compound identities during the validation stage [27]. |
| Internal Standard Mixtures | Injected into every sample to monitor instrument performance, correct for signal drift, and aid in retention time alignment [27] [5]. |
| Quality Control (QC) Samples | A pooled sample from all study samples or a standard reference material analyzed repeatedly throughout the batch. Used to monitor system stability, evaluate feature reproducibility (e.g., via RSD), and for signal correction [27] [5]. |
| Procedural & Solvent Blanks | Samples taken through the entire sample preparation and analysis process without the sample matrix. Critical for identifying and filtering out background contamination and instrumental artifacts during data filtering [23]. |
| Spiked Standard Mixtures | Custom mixtures of known compounds, used to assess method performance metrics like detection limits, accuracy, and precision for the data processing workflow [5]. |
Machine learning (ML) is redefining the potential of NTA by identifying latent patterns within high-dimensional data, making it particularly well-suited for tasks like contamination source identification [27]. The integration of ML necessitates additional considerations for data processing.
Prior to ML analysis, the feature-intensity matrix requires specific preprocessing to ensure data quality and model robustness [27], typically including normalization, handling of missing values, and feature scaling.
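A typical, deliberately simplified preprocessing pass might drop sparse features, impute remaining missing values, and log-transform intensities. The detection-fraction cutoff and half-minimum imputation rule below are common conventions assumed for illustration, not prescriptions from the cited work.

```python
import math

def preprocess_matrix(matrix, min_detect_fraction=0.5):
    """Illustrative preprocessing of a samples x features intensity matrix:
    drop features detected in too few samples, impute missing values with
    half the feature minimum, then log2-transform."""
    n_samples = len(matrix)
    n_features = len(matrix[0])
    kept_columns = []
    for j in range(n_features):
        col = [matrix[i][j] for i in range(n_samples)]
        detected = [v for v in col if v is not None]
        if len(detected) / n_samples >= min_detect_fraction:
            fill = min(detected) / 2.0  # half-minimum imputation
            kept_columns.append([math.log2(v if v is not None else fill)
                                 for v in col])
    # transpose back to samples x features
    return [list(row) for row in zip(*kept_columns)]

raw = [[1024.0, None], [2048.0, None], [512.0, 64.0]]  # 3 samples x 2 features
X = preprocess_matrix(raw)  # second feature is dropped (detected in 1/3 samples)
```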
Given the complexity of ML-NTA workflows, a robust, multi-tiered validation strategy is crucial for ensuring reliable results [27].
Optimizing data processing parameters and thresholds is a critical, multi-faceted endeavor in non-targeted analysis. From the initial centroiding of raw data to the final alignment and filtering of features, each step contains user-defined parameters that directly impact the quality, reliability, and interpretability of the results. As the field moves towards greater integration with machine learning and aims to provide actionable insights for environmental and health decision-making, the importance of systematic parameter optimization and rigorous, tiered validation cannot be overstated. By adopting the structured protocols and frameworks outlined in this guide—such as using spiked standards for sensitivity assessment, QC samples for alignment evaluation, and confusion matrices for performance quantification—researchers can advance the field of NTA, improve the comparability of results across different studies, and enhance the translation of complex HRMS data into meaningful scientific knowledge.
In the realm of non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS), complex matrices represent one of the most significant challenges for accurate data interpretation and compound identification. Matrix effects (MEs), defined as the combined influence of all sample components other than the analyte on the measurement of the analyte, can substantially alter ionization efficiency in the mass spectrometer source when interfering species co-elute with target analytes [47]. These effects manifest primarily as ion suppression or, less frequently, ion enhancement, directly impacting method reproducibility, linearity, selectivity, accuracy, and sensitivity during validation [47]. For NTA, which aims to identify unknown or suspected environmental contaminants, pharmaceuticals, and transformation products without analytical standards, these interferences pose particular difficulties [1] [27]. The inherent variability of real-world samples, such as urban runoff, biological fluids, or environmental extracts, introduces unpredictable matrix components that can obscure the detection of low-abundance compounds and complicate the translation of raw HRMS data into actionable environmental insights [48] [27].
The following diagram illustrates the core challenge of matrix effects in the LC-ESI-MS process and the fundamental strategies to manage them:
Diagram: Matrix effects occur when sample components co-elute with analytes in LC-ESI-MS, leading to ion suppression/enhancement. Management follows compensation or minimization strategies.
Interferences in mass spectrometry can be systematically categorized into two primary classes: spectroscopic and nonspectroscopic interferences, each with distinct mechanisms and impacts on analytical results [49].
Spectroscopic interferences contribute directly to a specific analyte signal by sharing the same mass-to-charge ratio (m/z) as the analyte ion, and are further subdivided into three types [49].
Nonspectroscopic interferences, often termed matrix effects, do not create new signals but rather alter the response of analytes, most commonly through ion suppression or enhancement [49].
In electrospray ionization (ESI)—the most common ionization technique for LC-MS applications—matrix effects primarily occur in the liquid phase during droplet formation and charge transfer processes [47]. Co-eluting matrix components can compete for available charge, alter droplet formation dynamics, or impede the efficient transfer of ions into the gas phase [47].
Two complementary paradigms exist for addressing matrix effects: compensation and minimization. The choice between these approaches depends on sensitivity requirements, availability of blank matrices, and the specific analytical context [47].
Compensation strategies acknowledge the presence of matrix effects and employ techniques, such as matrix-matched calibration and isotopically labeled internal standards, to correct for their influence on quantitative results [47].
Minimization strategies aim to reduce the presence of interfering matrix components, for example through sample dilution, selective extraction, or improved chromatographic separation, before they reach the mass spectrometer [47].
Table 1: Quantitative Comparison of Matrix Effect Management Strategies
| Strategy | Relative Efficiency | Implementation Complexity | Cost Considerations | Best-Suited Applications |
|---|---|---|---|---|
| Internal Standard (IS-MIS) | High (<20% RSD for 80% of features) [48] | High (requires method development) | Moderate (cost of labeled standards) | Heterogeneous samples, quantitative NTA [48] |
| Sample Dilution | Variable (median suppression 0-67% at REF 50) [48] | Low (simple to implement) | Low (minimal additional resources) | Initial approach, less complex matrices [48] |
| Solid-Phase Extraction | Moderate to High (depends on selectivity) | Moderate (method optimization needed) | Moderate (cartridge costs) | Complex matrices, need for pre-concentration [50] [27] |
| Chromatographic Optimization | Moderate (reduces co-elution) | High (method redevelopment) | High (instrumentation, columns) | All applications, particularly targeted methods [50] |
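As a concrete sketch of the internal-standard compensation strategy in Table 1, the snippet below normalizes each feature's intensity by the isotopically labeled internal standard eluting closest in retention time. The nearest-RT matching rule and all feature/standard values are illustrative assumptions, not data from the cited studies.

```python
# Sketch of retention-time-matched internal standard (IS) normalization,
# one possible implementation of the IS-based compensation strategy.

def normalize_by_nearest_is(features, internal_standards):
    """Divide each feature intensity by the intensity of the IS eluting
    closest in retention time, correcting for local matrix effects."""
    normalized = []
    for rt, intensity in features:
        # pick the internal standard closest in retention time
        nearest = min(internal_standards, key=lambda s: abs(s[0] - rt))
        normalized.append((rt, intensity / nearest[1]))
    return normalized

# (retention time in min, peak intensity) pairs -- hypothetical values
features = [(2.1, 5.0e5), (6.8, 1.2e6), (11.3, 3.4e5)]
internal_standards = [(2.0, 1.0e6), (7.0, 8.0e5), (11.0, 6.8e5)]

print(normalize_by_nearest_is(features, internal_standards))
```

Each feature is thus expressed relative to a co-eluting labeled standard, so suppression affecting both species largely cancels out.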
The decision framework for selecting the optimal strategy involves assessing sensitivity requirements and blank matrix availability, as shown below:
Diagram: Decision framework for managing matrix effects based on sensitivity requirements and blank matrix availability [47].
Rigorous assessment of matrix effects is essential during method development and validation. Several established protocols provide qualitative and quantitative evaluation of matrix effects.
This qualitative approach, typically implemented by post-column infusion of the analyte while injecting blank matrix extract, identifies retention time zones most susceptible to ion enhancement or suppression [47].
Protocol:
Considerations: This method provides spatial information about matrix effects but only qualitative results. It is less efficient for highly diluted samples and can be laborious for multi-analyte methods [47].
This quantitative method, commonly known as the post-extraction spike approach, compares analyte response in neat solution to the response when the analyte is spiked into a blank matrix extract [47].
Protocol:
Considerations: This method requires access to blank matrix and provides quantitative data at a single concentration level [47].
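The spike-comparison protocol above reduces to a simple ratio. The sketch below uses one common convention (response in spiked blank-matrix extract divided by response in neat solvent, times 100) with hypothetical peak areas.

```python
# Matrix effect from a post-extraction spike experiment (hypothetical areas).

def matrix_effect_percent(area_matrix_spike, area_neat):
    """100% = no matrix effect; <100% = ion suppression; >100% = enhancement."""
    return 100.0 * area_matrix_spike / area_neat

me = matrix_effect_percent(area_matrix_spike=6.4e5, area_neat=8.0e5)
print(f"ME = {me:.0f}% ({'suppression' if me < 100 else 'enhancement'})")
```

Here a value of 80% would indicate roughly 20% signal suppression at the tested concentration.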
A semi-quantitative approach that evaluates matrix effects across a concentration range, typically by comparing calibration curves prepared in solvent and in matrix [47].
Protocol:
Considerations: This approach provides information across the calibration range but requires more extensive preparation [47].
Recent advances in machine learning (ML) and quantitative non-targeted analysis (qNTA) are revolutionizing how complex matrices and interferences are managed in HRMS data interpretation.
ML algorithms excel at identifying latent patterns in high-dimensional HRMS data, making them particularly suited for contaminant source identification in complex environmental samples [27]. The systematic workflow for ML-assisted NTA encompasses four key stages.
ML classifiers such as Support Vector Classifier (SVC), Logistic Regression (LR), and Random Forest (RF) have demonstrated balanced accuracy ranging from 85.5% to 99.5% for source identification across different environmental samples [27].
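A minimal sketch of such a classification workflow is shown below, using scikit-learn with a synthetic dataset standing in for aligned NTA feature intensities; the dataset parameters and model settings are illustrative assumptions, not those of the cited study.

```python
# Sketch: random-forest source classification with balanced-accuracy
# cross-validation, on synthetic stand-in data for NTA feature tables.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# rows = samples, columns = feature intensities (synthetic placeholder)
X, y = make_classification(n_samples=300, n_features=40,
                           n_informative=10, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="balanced_accuracy")
print(f"balanced accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```

Balanced accuracy is preferred over plain accuracy here because environmental source classes are often imbalanced.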
Traditional NTA has primarily focused on compound identification, but recent efforts have established frameworks for deriving quantitative estimates from NTA measurements [28]. qNTA bridges the gap between contaminant discovery and risk characterization by providing concentration estimates essential for risk assessment [28].
Table 2: Essential Research Reagent Solutions for Managing Matrix Effects
| Reagent/Material | Function | Application Examples | Technical Considerations |
|---|---|---|---|
| Isotopically Labeled Internal Standards | Compensation for analyte loss and matrix effects during sample preparation and analysis | IS-MIS normalization for urban runoff samples [48] | Match chemical properties and retention times with target analytes; limited availability for unknown compounds |
| Multi-Sorbent SPE Cartridges | Broad-spectrum extraction and clean-up | Oasis HLB + ISOLUTE ENV+ for comprehensive contaminant screening [27] | Different sorbents target specific compound classes; combination provides wider coverage |
| UHPLC Columns (C18, HILIC) | High-resolution chromatographic separation | BEH C18 column for urban runoff analysis [48] | Sub-2μm particles provide superior separation efficiency; requires high-pressure systems |
| QuEChERS Kits | Rapid sample preparation and clean-up | Food, environmental, and biological samples [27] | Combines extraction and partitioning salts with dispersive SPE for efficient matrix removal |
| Matrix-Matched Calibration Standards | Compensation for consistent matrix effects | Pharmaceutical and bioanalytical applications [47] | Requires access to appropriate blank matrices; challenging for unique sample types |
Effective management of complex matrices and interferences is fundamental to generating reliable, actionable data from non-target analysis using high-resolution mass spectrometry. A systematic approach—incorporating appropriate assessment methods, strategic application of compensation and minimization techniques, and leveraging advanced computational approaches—enables researchers to overcome the challenges posed by complex sample matrices. As ML-assisted NTA and quantitative frameworks continue to evolve, they promise to further bridge the gap between analytical capability and environmentally meaningful decision-making, ultimately supporting more effective chemical risk assessment and management across pharmaceutical, environmental, and public health domains.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful approach for detecting unknown and unexpected compounds in complex sample matrices, enabling applications from environmental monitoring to drug discovery [27] [14]. Unlike targeted methods that provide unambiguous results for predefined chemicals, NTA generates information-rich data with inherent uncertainties that complicate interpretation and validation [14]. If an analyst reports that a chemical is present in a sample, the reported structure may actually be absent (e.g., the detected feature may belong to an isomer or reflect an incorrect identification). Conversely, if a chemical is reported as absent, it may actually be present but not correctly identified during data processing [14]. These fundamental uncertainties create critical challenges for reliable data interpretation, necessitating robust strategies for managing false positives and annotation ambiguities throughout the NTA workflow.
The core of this challenge lies in the analytical gap between detection and confident identification. While HRMS instruments can detect thousands of features in a single sample, compound identification remains a major bottleneck, requiring sophisticated prioritization and validation strategies to focus identification efforts where they matter most [45] [52]. This technical guide examines systematic approaches for handling false positives and annotation ambiguities within the context of NTA-HRMS data interpretation, providing researchers with validated methodologies to enhance data reliability and support confident decision-making.
Effective management of false positives begins with strategic prioritization that filters out unreliable signals and focuses attention on the most chemically relevant features. Research by Zweigle et al. (2025) outlines seven complementary prioritization strategies that can be integrated into a comprehensive NTA workflow [45] [52] [9]. These strategies operate at different stages of the analytical process, collectively enabling stepwise reduction from thousands of detected features to a manageable number of high-confidence candidates worthy of further investigation.
Table 1: Seven Core Prioritization Strategies for NTA Workflows
| Strategy | Primary Function | Key Techniques | Impact on False Positives |
|---|---|---|---|
| Target & Suspect Screening (P1) | Filters known compounds | Library matching, predefined databases | Reduces false structural assignments |
| Data Quality Filtering (P2) | Removes analytical artifacts | Blank subtraction, replicate consistency, peak shape assessment | Eliminates instrument-derived false signals |
| Chemistry-Driven Prioritization (P3) | Identifies specific compound classes | Mass defect filtering, homologue series, halogenation patterns | Focuses on chemically plausible features |
| Process-Driven Prioritization (P4) | Highlights process-relevant features | Spatial/temporal comparisons, correlation analysis | Identifies environmentally relevant compounds |
| Effect-Directed Prioritization (P5) | Links features to biological activity | Bioassay integration, virtual EDA (vEDA) | Prioritizes toxicologically significant compounds |
| Prediction-Based Prioritization (P6) | Estimates risk and concentration | MS2Quant, MS2Tox, QSPR models | Ranks by potential environmental impact |
| Pixel/Tile-Based Analysis (P7) | Localizes regions of interest | Chromatographic image analysis, variance mapping | Identifies significant regions before peak detection |
The power of these prioritization strategies emerges from their integration rather than individual application. For example, an initial dataset containing thousands of features might be reduced through P1 (target and suspect screening) to several hundred candidates. Application of P2 (data quality filtering) then removes artifacts and unreliable signals, while P3 (chemistry-driven prioritization) focuses attention on compound classes of specific interest, such as halogenated substances [45] [9]. Subsequent application of process-driven (P4) and effect-directed (P5) prioritization can further refine the list to dozens of features linked to specific environmental processes or biological effects. Finally, prediction-based prioritization (P6) enables risk-based ranking, resulting in a focused shortlist of fewer than ten high-priority compounds deserving of comprehensive identification and confirmation [9].
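The stepwise reduction described above can be sketched in code: a quality filter (in the spirit of P2) drops features that are not reproducibly detected well above blank levels, and a crude chemistry filter (in the spirit of P3) keeps features whose mass defect is consistent with halogenation. All thresholds and feature values are illustrative assumptions.

```python
# Sketch of a two-stage prioritization funnel (quality filter + mass
# defect filter). Cutoffs are illustrative, not validated parameters.
from statistics import mean, stdev

def prioritize(features, blank_factor=10.0, max_rsd=30.0):
    """Keep features reproducibly above blank levels (P2-style filter),
    then keep those with a low/negative mass defect (P3-style cue for
    halogenated compounds)."""
    kept = []
    for f in features:
        m = mean(f["reps"])
        rsd = 100.0 * stdev(f["reps"]) / m
        if m < blank_factor * f["blank"] or rsd > max_rsd:
            continue                        # fails the quality filter
        mass_defect = f["mz"] - round(f["mz"])
        if mass_defect < 0.05:              # crude halogenation cue
            kept.append(f["mz"])
    return kept

features = [
    {"mz": 498.9302, "reps": [1.0e5, 1.1e5, 0.9e5], "blank": 1.0e3},  # PFAS-like
    {"mz": 285.1372, "reps": [5.0e4, 5.2e4, 4.8e4], "blank": 1.0e3},  # H-rich
    {"mz": 412.9766, "reps": [2.0e4, 9.0e4, 1.0e3], "blank": 1.0e3},  # irreproducible
]
print(prioritize(features))  # -> [498.9302]
```

Only the reproducible, halogen-suggestive feature survives both filters, mirroring the funnel logic of the integrated workflow.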
Mass spectral matching forms the foundation of compound annotation in NTA workflows, but traditional approaches suffer from significant limitations in accuracy and coverage. While library matching can identify compounds using authentic reference standards (Metabolomics Standards Initiative [MSI] level 1 identification), approximately 95% of measured spectra lack corresponding reference entries in databases, creating substantial annotation gaps [53]. This limitation has spurred development of more flexible spectral matching approaches that can tolerate analytical differences between experimental conditions and reference databases, though these introduce new challenges in distinguishing correct from incorrect matches [53].
Molecular networking and network annotation propagation (NAP) represent significant advances in computational metabolomics that address annotation ambiguities by grouping molecules of likely high chemical similarity based on their MS/MS spectra [53]. This approach allows propagation of chemical identities from confidently annotated molecules to structurally related unknowns within the same spectral network, effectively expanding annotation coverage across chemically related compound families. As noted in recent reviews, "identified (or annotated) molecules allow the propagation and use of this chemical identity to improve the annotation of other unidentified or unannotated members of this metabolite group or molecular family" [53]. This strategy is particularly valuable for characterizing novel transformation products or metabolic derivatives that share core structural elements with known compounds.
Machine learning (ML) approaches are redefining the potential of NTA by identifying latent patterns within high-dimensional data that traditional statistical methods often miss. ML classifiers such as Support Vector Classifier (SVC), Logistic Regression (LR), and Random Forest (RF) have demonstrated impressive performance in source tracking applications, with balanced accuracy ranging from 85.5% to 99.5% for classifying per- and polyfluoroalkyl substances (PFAS) across different contamination sources [27]. These pattern recognition capabilities make ML particularly valuable for discriminating between true signals and false positives based on subtle spectral and chromatographic features that may not be apparent through manual inspection.
Prediction-based prioritization represents another powerful approach for managing annotation uncertainties, using quantitative structure-property relationships (QSPR) and machine learning models to estimate risk parameters even when complete structural identification remains unresolved [45] [9]. Tools such as MS2Quant predict concentrations directly from MS/MS spectra, while MS2Tox estimates toxicity parameters (e.g., LC50) from fragmentation patterns [9]. These predictive approaches enable calculation of risk quotients (PEC/PNEC - Predicted Environmental Concentration vs. Predicted No Effect Concentration), providing a scientifically defensible basis for prioritizing features with potential high environmental or health impacts despite incomplete identification [9].
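Risk-quotient ranking of this kind ultimately reduces to sorting features by PEC/PNEC. In the sketch below the predicted values are hypothetical placeholders; in practice they might be derived from tools such as MS2Quant (concentration) and MS2Tox (toxicity).

```python
# Sketch of prediction-based prioritization via risk quotients (RQ).
# PEC/PNEC values are hypothetical placeholders.

def rank_by_risk_quotient(candidates):
    """Sort candidate features by RQ = PEC / PNEC, highest risk first."""
    return sorted(candidates, key=lambda c: c["pec"] / c["pnec"], reverse=True)

candidates = [
    {"feature": "F1", "pec": 0.8, "pnec": 2.0},   # RQ 0.4
    {"feature": "F2", "pec": 5.0, "pnec": 1.0},   # RQ 5.0 -> top priority
    {"feature": "F3", "pec": 0.1, "pnec": 10.0},  # RQ 0.01
]
ranked = rank_by_risk_quotient(candidates)
print([c["feature"] for c in ranked])  # -> ['F2', 'F1', 'F3']
```

An RQ above 1 (PEC exceeding PNEC) is the conventional trigger for prioritizing a feature despite incomplete identification.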
The integration of diverse computational strategies into a cohesive workflow significantly enhances the reliability of NTA annotations. The INTERPRET NTA platform developed by the US Environmental Protection Agency exemplifies this integrated approach, combining chemical metadata from the AMOS database, predicted spectra for approximately 1.2 million chemical substances from the DSSTox database, and hazard values from the Cheminformatics Hazard Module (CHM) to support defensible review and interpretation of NTA results [8]. Validation studies demonstrated that known chemicals showed higher values for metadata, MS2, and hazard scores in 99.0%, 80.5%, and 92.0% of cases, respectively, compared to false positives, providing multiple orthogonal metrics for distinguishing reliable annotations from ambiguous assignments [8].
Diagram 1: Integrated computational workflow for managing false positives and annotation ambiguities in NTA studies. The workflow progresses through three major phases, with iterative refinement mechanisms (dashed lines) to improve annotation confidence.
Background interference represents a significant source of false positives in mass spectrometry-based screening, particularly in high-throughput applications. Traditional approaches that limit background analysis to a narrow window of adjacent samples (e.g., "5 wells before and after" in plate-based assays) may miss distributed interference patterns, leading to false positive identification. A case study using a 250,000-compound library across 96 wells demonstrated that expanding the cross-hit analysis window to encompass the entire plate reduced confirmed hits by 33%, indicating that nearly one-third of initial "hits" were actually false positives resulting from recurring background noise [54].
Table 2: Experimental Protocol for Comprehensive Cross-Hit Analysis
| Step | Procedure | Parameters | Quality Control |
|---|---|---|---|
| 1. Sample Preparation | Distribute samples across plate wells with randomized controls | 96- or 384-well format, randomized control placement | Document plate layout and control positions |
| 2. Data Acquisition | Acquire MS data using standardized instrument methods | Consistent ionization settings across all wells | Include system suitability tests |
| 3. Cross-Hit Analysis | Perform experiment-wide background interference assessment | Expand search window to entire plate, not just adjacent wells | Automated analysis using tools like Virscidian's Analytical Studio |
| 4. Hit Confirmation | Apply consistent thresholding across all samples | Statistical significance above plate-wide background | Manual review of ambiguous signals |
| 5. Validation | Confirm true hits with orthogonal techniques | Retention time alignment, MS/MS fragmentation | Compare to reference standards when available |
This protocol emphasizes the critical importance of plate-wide background assessment rather than localized interference evaluation. Automated cross-hit analysis tools dramatically improve reliability by detecting recurring background peaks and applying consistent thresholds objectively across the entire dataset [54]. Implementation of this expanded search window approach conserves valuable resources by reducing false leads and increasing confidence in final hit lists.
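The plate-wide cross-hit idea can be sketched as a simple recurrence count: any m/z observed in more than a chosen fraction of wells across the entire plate is treated as recurring background rather than a hit. The 20% cutoff and the well data below are illustrative assumptions, not parameters of the cited commercial tools.

```python
# Sketch of plate-wide cross-hit (recurring background) flagging.
from collections import Counter

def flag_background(well_hits, n_wells, max_fraction=0.20):
    """well_hits: one set of (rounded) m/z values per well.
    Returns m/z values seen in more than max_fraction of all wells."""
    counts = Counter(mz for well in well_hits for mz in well)
    return {mz for mz, n in counts.items() if n / n_wells > max_fraction}

# hypothetical per-well hit lists; 391.3 recurs in every well
wells = [{212.1, 391.3}, {391.3}, {150.0, 391.3}, {391.3, 610.2}, {391.3}]
background = flag_background(wells, n_wells=len(wells))
print(background)  # 391.3 is flagged as plate-wide background
```

Restricting the count to adjacent wells only, as in the traditional approach, would miss exactly this kind of distributed interference pattern.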
Structurally similar compounds with nearly identical fragmentation patterns present particular challenges for accurate identification, as traditional spectral matching algorithms may struggle to distinguish between closely related analogs. This problem is especially pronounced in forensic and pharmaceutical applications where families of compounds share core structural elements. For example, in samples containing only MDMA, it is common for MDA to be falsely reported as present because both molecules fragment to produce similar core structures with nearly identical higher energy fragmentation profiles [55].
The experimental protocol for addressing this specific false positive mechanism involves targeted verification of molecular ions alongside conventional spectral matching. Research demonstrates that while MDA and MDMA produce nearly identical fragment profiles at higher cone voltages (e.g., 35V), they are clearly distinguished by their molecular ions in the lower-energy function (m/z 180 for MDA vs. m/z 194 for MDMA) [55]. By implementing a molecular ion target filter that gates identification based on the presence of the expected molecular ion, the false positive rate for structurally similar analogs can be significantly reduced.
Implementation Protocol:
This approach was experimentally validated using mixtures of MDA and MDMA across concentration ranges from 100% MDA to 100% MDMA. While standard non-target processing produced false positives for MDA in MDMA-only samples, implementation of the molecular ion target method eliminated these false identifications while correctly confirming the presence of MDA when the m/z 180 mass was detected [55].
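A minimal sketch of such a molecular-ion gate is shown below, using approximate [M+H]+ values for MDA (m/z ~180.10) and MDMA (m/z ~194.12); the scan contents and mass tolerance are hypothetical illustrations, not the published method parameters.

```python
# Sketch of a molecular-ion gate: accept a fragment-based library match
# only if the expected molecular ion appears in the low-energy function.

def confirm_hit(candidate_mz, low_energy_mz, tol=0.02):
    """Return True only if the candidate's molecular ion is present
    (within tol Da) in the low-energy scan."""
    return any(abs(mz - candidate_mz) <= tol for mz in low_energy_mz)

# hypothetical low-energy scan from an MDMA-only sample
low_energy_scan = [194.118, 163.075, 105.070]

print(confirm_hit(180.102, low_energy_scan))  # MDA gated out -> False
print(confirm_hit(194.118, low_energy_scan))  # MDMA confirmed -> True
```

Even though both drugs share a fragment profile at high collision energy, the gate rejects the MDA assignment because its molecular ion is absent from the low-energy scan.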
A comprehensive, tiered validation strategy is essential for establishing confidence in NTA results, particularly when supporting regulatory decisions or health-related assessments. This approach integrates multiple orthogonal validation measures throughout the analytical workflow, progressing from basic analytical confirmation to environmental plausibility assessments [27].
Table 3: Tiered Validation Protocol for NTA Studies
| Validation Tier | Assessment Methods | Acceptance Criteria | Documentation |
|---|---|---|---|
| Analytical Confidence | Certified reference materials (CRMs), spectral library matches, retention time prediction | MSI level 1-3 identification based on available standards | Spectral similarity scores, retention time deviations |
| Model Performance | Cross-validation (e.g., 10-fold), external dataset testing, balanced accuracy metrics | Accuracy >80%, precision appropriate to application | Confusion matrices, performance metrics, overfitting assessment |
| Environmental Plausibility | Geospatial correlation, known source signatures, chemical fate principles | Consistent with known transport/transformation pathways | Correlation with contextual data, source-receptor relationships |
This tiered approach bridges analytical rigor with real-world relevance, ensuring results are both chemically accurate and environmentally meaningful [27]. For ML-based NTA applications, particular emphasis should be placed on model interpretability, with strategies such as feature importance analysis and rational attribution provided to overcome the "black-box" limitations of complex algorithms like deep neural networks [27].
Successful implementation of false positive reduction strategies requires access to specialized computational tools, databases, and analytical resources. The following table summarizes key resources mentioned in the literature that support various aspects of NTA workflow optimization and validation.
Table 4: Essential Research Reagents and Computational Resources for NTA
| Resource Category | Specific Tools/Databases | Primary Function | Application in False Positive Reduction |
|---|---|---|---|
| Spectral Databases | NORMAN Suspect List Exchange, US EPA AMOS, PubChemLite | Reference spectra for suspect screening | Provides validated benchmarks for annotation |
| Chemical Structure Databases | US EPA DSSTox (~1.2 million substances) | Chemical structures and properties | Supports structure-based prediction and prioritization |
| Hazard Assessment Tools | US EPA Cheminformatics Hazard Module (CHM) | Hazard value calculation | Enables risk-based prioritization of features |
| Data Processing Platforms | INTERPRET NTA, XCMS, Analytical Studio | Automated data processing and analysis | Reduces manual review errors and subjective bias |
| Prediction Tools | MS2Quant, MS2Tox | Concentration and toxicity prediction | Supports risk-based prioritization without full identification |
| Statistical Analysis Software | AnalyzerPro XD, various R/Python packages | Multivariate statistical analysis | Identifies patterns distinguishing true signals from artifacts |
These resources collectively enable the implementation of the integrated strategies discussed throughout this guide. Platforms such as INTERPRET NTA are particularly valuable as they combine multiple functionalities—accessing chemical metadata from AMOS, retrieving predicted spectra from DSSTox, and obtaining hazard values from CHM—within a unified interface that supports defensible review and reporting of NTA results [8].
The expanding chemical landscape facing environmental and pharmaceutical researchers necessitates robust, systematic approaches for managing the uncertainties inherent in non-targeted analysis. By implementing the integrated prioritization strategies, computational workflows, and experimental protocols outlined in this technical guide, researchers can significantly enhance the reliability of their NTA results while efficiently focusing resources on the most chemically and toxicologically significant findings. The continued development and standardization of these approaches, particularly through improved benchmarking of computational tools and validation of machine learning applications, will further strengthen the translation of NTA data from exploratory research to actionable environmental and health decisions [53] [27] [14]. As the field progresses, emphasis should remain on creating transparent, defensible workflows that explicitly address uncertainty quantification and provide stakeholders with clear understanding of result limitations and appropriate applications.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) represents a powerful, discovery-focused approach for characterizing unknown chemicals in complex samples without a priori knowledge of the sample's chemical content [3]. Unlike traditional targeted methods, NTA aims to capture a broader chemical space, making robust quality assurance and quality control (QA/QC) measures imperative to ensure data quality and consistency [56]. The fundamental challenge in NTA lies in minimizing the risk of losing potential substances of interest (false negatives) while maintaining confidence in identified compounds [56]. Given the relative novelty of comprehensive NTA workflows and the absence of universal benchmarks for "good" performance, researchers must implement specialized QA/QC approaches throughout the entire analytical process [44] [57].
The quality assurance framework for NTA encompasses all practices, benchmarks, and assessments that ensure the reliability of non-targeted analysis [44]. This includes defining the chemical space (the physicochemical property space spanned by detectable and identifiable chemicals), assessing accuracy (closeness of agreement between results and known true values), and determining precision (closeness of agreement between replicated results) [44] [3]. This technical guide details the critical QA/QC measures required throughout the NTA workflow, providing researchers and drug development professionals with a structured approach to quality management in non-targeted applications.
A systematic QA/QC framework for NTA evaluates performance across two primary domains: data acquisition and data processing/analysis [44]. Within these domains, four key aspects should be assessed: quality, boundary, accuracy, and precision. The following table defines these core concepts as adapted from IUPAC 2019 guidelines for NTA performance assessment [44].
Table 1: Core QA/QC Performance Aspects in Non-Targeted Analysis
| Performance Aspect | Definition in NTA Context | Key Assessment Questions |
|---|---|---|
| Quality | The QA/QC practices, benchmarks, and assessments for the non-targeted analysis, including adherence to protocols and QC benchmarks [44]. | Were stated QA/QC protocols followed? Were deviations documented and their implications discussed? [44] |
| Boundary | Describes the chemical and analytical space of the non-targeted analysis, including limitations of sample prep, instrumentation, and data processing [44]. | Did authors discuss how methods impacted the observable chemical space (e.g., extraction recoveries, ionization efficiency)? [44] |
| Accuracy | The closeness of agreement between NTA results (e.g., mass error, identification) and known true values [44]. | Did authors report method performance in correctly classifying samples or identifying known chemicals? [44] |
| Precision | The closeness of agreement between results when components of the experiment are replicated, including repeatability and reproducibility [44]. | Did authors communicate the repeatability/reproducibility of key measures across replicates, samples, or batches? [44] |
The foundation of effective QA/QC begins with careful study design that clearly defines objectives and scope with respect to targeted analysis, suspect screening analysis (SSA), and true NTA [3]. Suspect screening analysis identifies chemicals by comparison to a predefined list or library, thereby narrowing the study's scope, while true NTA attempts characterization without a predefined library [3]. The chemical space—the physicochemical property space spanned by detectable and identifiable chemicals—is fundamentally shaped by methodological choices made during study design [3] [19]. Key analytical considerations that influence the detectable chemical space include: (1) sample matrix type, (2) extraction solvent and pH, (3) extraction/cleanup media, (4) elution buffers, (5) instrument platform, (6) chromatography conditions, (7) ionization type, and (8) ionization mode [19].
Sample preparation requires careful optimization to balance selectivity and sensitivity, aiming to remove interfering components while preserving as many compounds as possible with adequate sensitivity [27]. Essential QA/QC measures at this stage include procedural blanks, QC spike samples, and replicate extractions to monitor background contamination, recovery, and repeatability.
Data acquisition QA/QC covers performance with respect to objectives & scope, sample information & preparation, chromatography, and mass spectrometry [44]. Assessments typically rely on data from QC spikes and samples, though certain aspects can also be evaluated with real samples containing unknown chemical constituents [44].
Table 2: Data Acquisition QA/QC Measures and Assessment Protocols
| Analytical Domain | QA/QC Measures | Recommended Assessment Protocols |
|---|---|---|
| Chromatography | Retention time stability, separation efficiency, polarity range of detected compounds [44]. | Monitor deviation of retention time (RT) from expected RT for QC spikes; assess separation of isomeric compounds of interest [44]. |
| Mass Spectrometry | Mass accuracy, observed matrix effects, ionization efficiency, mass error range [44]. | Evaluate observed mass error range for QC spikes; list monoisotopic masses of known chemicals and describe deviations of observed accurate masses from known values [44]. |
| System Performance | Signal intensity stability, carryover assessment, instrument detection limits [44] [56]. | Analyze QC reference materials at regular intervals; monitor signal drift over time; implement cleaning procedures to minimize carryover [44] [56]. |
| Overall Precision | Variability (repeatability/reproducibility) across replicates, samples, or batches [44]. | Calculate relative standard deviation (RSD), standard deviation (SD), or coefficient of variation (CV) for mass error, RT, and peak intensity across replicate analyses [44]. |
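Two of the accuracy and precision metrics in Table 2 reduce to one-line calculations: ppm mass error for a QC spike against its theoretical monoisotopic mass, and %RSD across replicate injections. The replicate values below are illustrative.

```python
# Sketch of routine data-acquisition QC metrics: ppm mass error and %RSD.
from statistics import mean, stdev

def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return 1e6 * (observed_mz - theoretical_mz) / theoretical_mz

def rsd_percent(values):
    """Relative standard deviation (coefficient of variation) in percent."""
    return 100.0 * stdev(values) / mean(values)

# hypothetical QC spike: observed vs. theoretical m/z, replicate RTs (min)
print(f"mass error: {ppm_error(301.1412, 301.1406):.1f} ppm")
print(f"RT RSD: {rsd_percent([6.42, 6.44, 6.43]):.2f}%")
```

Tracking these values for QC spikes across a batch makes signal drift and calibration loss visible long before they compromise annotation.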
The data acquisition process for HRMS-based NTA typically uses information-dependent acquisition (IDA) or data-dependent acquisition (DDA), where the mass analyzer performs accurate mass scans of precursor ions and selects the most abundant ions for successive MS/MS analysis [58]. This cycle repeats throughout the chromatographic run, generating data files containing all precursor ion scans and dependent product ion scans for subsequent analysis [58].
Diagram 1: Comprehensive QA/QC Framework for NTA Workflow. This diagram illustrates the integrated quality assurance and quality control measures throughout the non-targeted analysis pipeline, from sample preparation to final validation.
Data processing and analysis QA/QC covers performance with respect to data processing, statistical & chemometric analysis, and annotation & identification [23]. These assessments rely on data from QC spikes and samples, and in some cases, can be evaluated with real samples containing unknown chemical constituents [44].
Data processing transforms raw data into meaningful information through steps that intentionally reduce data complexity [23]. Key QA/QC measures for data processing include checks for the detection of QC features, alignment of features across technical replicates, and filtering of blank-derived compounds [44].
Statistical and chemometric analyses aid summarization, evaluation, and interpretation of processed data [23].
Annotation attributes properties or molecular characteristics to MS1 features or MS/MS product ions, while identification provides enough evidence to attribute a specific compound to a detected feature [23].
Table 3: Data Processing and Analysis QA/QC Performance Assessment
| Processing Stage | QA/QC Assessment | Performance Metrics |
|---|---|---|
| Data Processing Quality | Results of QC checks throughout data processing workflow [44]. | Detection of QC features/compounds; alignment of features across technical replicates; filtering of blank compounds [44]. |
| Data Analysis Boundary | Description of capabilities of data processing and analysis methods [44]. | Chemical space of selected library/database; information available in libraries; estimated limits of detection/identification [44]. |
| Annotation Accuracy | Ability to correctly classify samples or identify known chemicals [44]. | Performance calculations from confusion matrix for samples with known classification or known compounds in QC spikes [44]. |
| Identification Precision | Repeatability/reproducibility of performance for QC samples analyzed multiple times [44]. | Consistency of correct identification across replicates, samples, or batches; performance measures from confusion matrix [44]. |
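The confusion-matrix performance measures referenced in Table 3 can be computed directly from QC results for spiked samples with known composition; the counts below are hypothetical.

```python
# Sketch of confusion-matrix metrics for identification accuracy/precision
# assessment on QC spikes with known composition. Counts are hypothetical.

def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (true positive rate) and specificity
    (true negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# e.g. 45 of 50 spiked compounds correctly identified,
# 4 false positives among 40 known-absent compounds
print(f"balanced accuracy: {balanced_accuracy(tp=45, fn=5, tn=36, fp=4):.3f}")
```

Repeating this calculation across batches gives the identification-precision measure in the table: consistency of the confusion-matrix metrics over replicated QC analyses.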
A robust, tiered validation strategy ensures the reliability of NTA outputs through multiple verification layers spanning analytical confidence, model performance, and environmental plausibility [27].
The NTA Study Reporting Tool (SRT) provides an interdisciplinary framework for comprehensive methods and results reporting, organized by study chronology with scored sections based on reporting quality [57].
Diagram 2: Tiered Validation Strategy for NTA Results. This diagram outlines the multi-layered approach required to validate non-targeted analysis outputs, incorporating analytical, statistical, and environmental plausibility assessments.
Implementation of robust QA/QC measures requires specific research reagents and software tools. The following table details essential resources for NTA workflows.
Table 4: Essential Research Reagent Solutions for NTA QA/QC
| Reagent/Software Category | Specific Examples | Function in QA/QC |
|---|---|---|
| QC Spikes & Reference Materials | Certified Reference Materials (CRMs), Isotope-labeled internal standards, Performance evaluation standards [44] [27] | Verify analytical accuracy, monitor instrument performance, assess matrix effects, enable quantification [44] [27] |
| Sample Preparation Media | Multi-sorbent SPE cartridges (Oasis HLB, ISOLUTE ENV+, Strata WAX/WCX), QuEChERS kits [27] | Ensure comprehensive analyte recovery, minimize matrix interference, maintain broad chemical coverage [27] |
| Data Processing Software | Open-source: XCMS, MZmine, SIRIUS, MS-DIAL, PatRoon, InSpectra [59] | Perform feature detection, alignment, and annotation; enable transparent and customizable processing workflows [59] |
| Commercial Data Analysis Platforms | Thermo Compound Discoverer, Agilent MassHunter [19] [59] | Provide integrated workflows for data processing, statistical analysis, and database searching [19] [59] |
| Spectral Libraries & Databases | NIST Mass Spectral Library, mzCloud, MassBank, in-house MS/MS databases [19] [59] | Enable compound identification and confirmation through spectral matching [19] [59] |
Implementing critical QA/QC measures throughout the NTA workflow is essential for producing reliable, reproducible results in non-targeted analysis using high-resolution mass spectrometry. While universal benchmarks for NTA performance remain elusive, researchers should always conduct performance self-assessments and transparently report findings using shared terminology [44]. The integrated framework presented in this guide—spanning study design, sample preparation, data acquisition, data processing, and validation—provides a structured approach to quality management in NTA studies. As the field continues to evolve, widespread adoption of comprehensive QA/QC protocols and reporting standards will enhance the scientific rigor necessary for utilizing NTA study data in regulatory decision-making and risk assessment contexts [57]. Future advancements will likely focus on establishing more harmonized guidelines, improving QA/QC measures for quantitative NTA, and integrating artificial intelligence/machine learning tools into quality assessment workflows [59].
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful approach for comprehensively characterizing unknown and unexpected chemicals in complex samples. Unlike targeted methods that focus on predefined analytes, NTA aims to detect and identify a broad range of chemical compounds without prior knowledge of sample composition [5]. This capability makes NTA particularly valuable for discovering emerging contaminants, characterizing complex mixtures, and identifying chemical signatures in environmental, biological, and product samples. However, the very nature of NTA—its openness to detecting the "unknown"—presents significant challenges for assessing and communicating method performance [5].
The establishment of robust performance assessment metrics is critical for advancing NTA from a research technique to a reliable analytical approach that can support regulatory decisions and risk assessments. Without standardized performance metrics, it remains difficult to compare results across different laboratories, instruments, and methods, or to determine whether NTA data are fit for specific purposes [33] [5]. Performance assessment in NTA must address both qualitative aspects (chemical identification and sample classification) and quantitative aspects (concentration estimation), each with distinct challenges and requirements for metrics [44] [5]. This technical guide provides a comprehensive overview of current frameworks, metrics, and experimental approaches for assessing both qualitative and quantitative NTA performance, contextualized within the broader field of HRMS data interpretation research.
Qualitative NTA performance primarily concerns the accuracy and reliability of chemical identifications. Unlike targeted analysis where identifications are confirmed using reference standards, NTA often relies on tiered confidence levels for reporting identifications when authentic standards are unavailable. The community has established a framework that classifies identifications into five confidence levels [27]:
The distribution of identifications across these confidence levels serves as a key qualitative performance metric, with higher proportions of Level 1-2 identifications indicating better performance.
For qualitative NTA studies focused on sample classification or chemical detection, performance can be assessed using a confusion matrix approach [44] [5]. This method evaluates a method's ability to correctly classify samples or identify known chemicals in quality control samples. The confusion matrix enables calculation of several key performance metrics:
Table 1: Performance Metrics Derived from Confusion Matrix Analysis
| Metric | Calculation | Interpretation |
|---|---|---|
| Accuracy | (TP + TN) / (TP + TN + FP + FN) | Overall correctness of classifications/identifications |
| Precision | TP / (TP + FP) | Reliability of positive findings |
| Recall (Sensitivity) | TP / (TP + FN) | Ability to detect true positives |
| Specificity | TN / (TN + FP) | Ability to exclude true negatives |
| F1-Score | 2 × (Precision × Recall) / (Precision + Recall) | Balanced measure of precision and recall |
TP = True Positive; TN = True Negative; FP = False Positive; FN = False Negative
These metrics are particularly valuable for assessing performance in studies involving sample classification (e.g., distinguishing contaminated vs. clean samples) or for evaluating detection capabilities using spiked quality control samples with known chemical composition [5].
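The metrics in Table 1 can be computed directly from confusion-matrix counts. A minimal sketch (function name and example counts are illustrative, not from the cited studies):

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute standard performance metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity, "f1": f1}

# Hypothetical QC-spike outcome: 40 spiked compounds detected, 5 missed,
# 3 false detections, 52 features correctly excluded
m = classification_metrics(tp=40, tn=52, fp=3, fn=5)
```

Note that the F1-score simplifies to 2TP / (2TP + FP + FN), which makes it robust to the choice of true-negative definition, a known ambiguity when counting undetected features in NTA data.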
Assessment of data acquisition quality provides fundamental metrics for qualitative NTA performance. These metrics evaluate the technical performance of the instrumental analysis, including mass accuracy, retention time stability, signal intensity reproducibility, and MS/MS spectral quality, and help identify potential issues that could affect chemical identifications [44].
These metrics are typically evaluated using quality control samples analyzed throughout the analytical sequence and compared against pre-established benchmarks where available [44].
Quantitative non-targeted analysis (qNTA) aims to provide concentration estimates for chemicals detected in NTA without requiring compound-specific calibration [60]. This represents a significant advancement beyond purely qualitative applications, but introduces additional challenges for performance assessment. A recently proposed framework for qNTA performance evaluation focuses on three key aspects: accuracy, uncertainty, and reliability [60].
Table 2: Core Performance Metrics for Quantitative NTA
| Performance Aspect | Definition | Calculation Approach |
|---|---|---|
| Accuracy | Closeness of agreement between estimated and true concentration | Ratio of predicted to true concentration or relative error |
| Uncertainty | Range within which the true value is expected to lie | 95% inverse confidence intervals from bootstrap approaches |
| Reliability | Proportion of cases where confidence intervals contain true values | Percentage of predictions where true concentration falls within confidence bounds |
This framework recognizes that qNTA approaches inherently exhibit higher uncertainty compared to targeted methods, and provides standardized metrics for communicating this uncertainty to stakeholders [60].
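The uncertainty and reliability aspects in Table 2 can be estimated with a percentile bootstrap. The sketch below is one simple operationalization under stated assumptions: predicted and true concentrations are hypothetical, and reliability is checked against the pooled ratio interval rather than per-prediction intervals as in the cited framework:

```python
import random
random.seed(1)

def bootstrap_ci(values, stat=lambda v: sum(v) / len(v), n_boot=2000, alpha=0.05):
    """Percentile bootstrap confidence interval for a statistic."""
    stats = []
    for _ in range(n_boot):
        sample = [random.choice(values) for _ in values]
        stats.append(stat(sample))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

# Hypothetical predicted vs. true concentrations for QC spike compounds
predicted = [12.0, 8.5, 30.1, 4.9, 22.7, 15.3]
true =      [10.0, 9.0, 25.0, 5.0, 20.0, 18.0]

ratios = [p / t for p, t in zip(predicted, true)]   # per-compound accuracy
ci_lo, ci_hi = bootstrap_ci(ratios)                 # uncertainty (95% interval)
reliability = sum(ci_lo <= r <= ci_hi for r in ratios) / len(ratios)
```

In practice, qNTA frameworks construct inverse confidence intervals per predicted concentration; this sketch only illustrates the bootstrap mechanics behind such intervals.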
Research comparing quantitative performance across methodological approaches reveals important patterns in qNTA capabilities. A recent study examining PFAS quantification found that the most generalizable qNTA approach (using "global" surrogates) showed decreased accuracy by a factor of ~4, increased uncertainty by a factor of ~1000, and decreased reliability by ~5% on average compared to a benchmark targeted approach using matched calibration curves and internal standard correction [60].
These performance differences highlight both the current limitations and potential utility of qNTA. While qNTA cannot match the performance of optimized targeted methods for specific compounds, it provides valuable semi-quantitative estimates that support preliminary risk assessments and prioritization decisions, particularly for chemicals lacking reference standards [60].
The accuracy of qNTA concentration estimates depends heavily on the selection of appropriate calibration surrogates. Different surrogate selection strategies, ranging from structurally similar or class-matched surrogates to "global" surrogates applied across all detected analytes, yield different performance characteristics [60].
The choice among these approaches involves trade-offs between accuracy, uncertainty, and applicability domains that must be considered when designing qNTA studies and interpreting their results [60].
A fundamental approach for assessing both qualitative and quantitative NTA performance involves analysis of quality control (QC) samples containing known chemicals at known concentrations [44] [5]. These samples are typically analyzed at regular intervals throughout the analytical sequence to monitor performance over time, with detection rates, mass accuracy, and signal stability tracked against pre-established acceptance criteria.
This approach provides direct assessment of method capabilities and limitations for specific chemical classes and concentration ranges [44].
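A QC-sample trend check of this kind is straightforward to automate. The sketch below (intensities, compound names, and the acceptance limit are all hypothetical) flags QC compounds whose relative standard deviation across repeat injections exceeds a tolerance:

```python
import statistics

def qc_rsd_check(intensities_by_compound, max_rsd_pct=30.0):
    """Flag QC compounds whose signal RSD across repeat injections is too high."""
    report = {}
    for name, values in intensities_by_compound.items():
        rsd = 100.0 * statistics.stdev(values) / statistics.mean(values)
        report[name] = {"rsd_pct": round(rsd, 1), "pass": rsd <= max_rsd_pct}
    return report

# Hypothetical peak areas of isotope-labeled QC spikes across four injections
qc = {
    "caffeine-d9": [1.00e6, 1.05e6, 0.97e6, 1.02e6],  # stable signal
    "PFOS-13C4":   [5.1e5, 2.2e5, 6.8e5, 3.0e5],      # drifting signal
}
report = qc_rsd_check(qc)
```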
Interlaboratory comparisons provide critical data on reproducibility and standardization of NTA methods [61]. While formal interlaboratory studies for NTA are still emerging, existing models from related fields provide guidance.
Such studies help identify major sources of variability and establish performance benchmarks for the community [61].
Performance assessment must extend to data processing and analysis steps, which can introduce significant variability in NTA results [44]. Recommended approaches include tracking QC features and compounds through the processing workflow, verifying feature alignment across technical replicates, and evaluating annotation accuracy against samples of known composition [44].
These measures help ensure that data processing steps do not inadvertently introduce errors or biases that affect study conclusions [44].
NTA Performance Assessment Workflow
This workflow illustrates the integrated approach to assessing both qualitative and quantitative NTA performance, highlighting the key metrics at each stage and their relationship to overall study objectives.
qNTA Methodology and Performance
This diagram illustrates the experimental approaches for quantitative NTA and typical performance outcomes relative to targeted analysis, highlighting the trade-offs between different surrogate selection strategies.
Table 3: Key Research Reagents and Materials for NTA Performance Assessment
| Reagent/Material | Function in Performance Assessment | Application Examples |
|---|---|---|
| QC Standard Mixtures | Benchmarking detection capabilities and quantitative performance | Community-defined mixtures for interlaboratory comparisons [33] |
| Stable Isotope-Labeled Standards | Internal standards for quantitative accuracy assessment | Isotope dilution methods for qNTA [60] |
| Matrix-Matched Calibrants | Evaluation of matrix effects on qualitative and quantitative performance | Studying quantitative bias in different environmental matrices [33] |
| Reference Materials | Ground truth for method validation and accuracy assessment | Certified Reference Materials (CRMs) for specific sample types [27] |
| Retention Time Index Standards | Chromatographic performance monitoring and alignment | Homologous series of compounds for retention time calibration |
| Ionization Efficiency Standards | Calibration of quantitative response factors | Compounds for predicting ionization efficiency in qNTA [60] |
These research reagents form the foundation for systematic assessment of NTA method performance, enabling laboratories to benchmark their capabilities and identify areas for improvement.
The establishment of robust performance assessment metrics for both qualitative and quantitative NTA represents a critical step toward broader adoption and application of these powerful techniques. While significant progress has been made in developing frameworks and metrics, particularly through community-driven initiatives such as the BP4NTA working group, challenges remain in standardizing approaches and establishing universal benchmarks [33] [44]. The metrics and experimental protocols outlined in this guide provide a foundation for researchers to systematically evaluate their NTA methods, communicate performance limitations transparently, and work toward generating comparable, reliable data across laboratories and applications. As the field continues to evolve, further refinement of these metrics—particularly for quantitative applications—will enhance the utility of NTA for environmental monitoring, exposure assessment, and regulatory decision-making.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful approach for detecting and identifying unknown and unexpected compounds across diverse sample matrices, including environmental, biological, and food samples [14] [1]. Unlike targeted analytical methods with well-established performance criteria, NTA generates information-rich data with inherent uncertainties that complicate performance assessment and interpretation [14]. The absence of standardized validation procedures has significantly limited the adoption of NTA data by stakeholders and regulatory bodies [14] [62]. Systematic validation strategies are therefore essential to establish confidence in NTA findings and enable their effective utilization in chemical risk assessment, exposure science, and drug development [62].
The fundamental challenge in NTA validation stems from the core difference between targeted and non-targeted approaches. In targeted analysis, performance metrics for selectivity, sensitivity, accuracy, and precision are well-defined, with results considered unambiguously true within defined tolerances [14]. In contrast, NTA data are inherently less certain: reported compound identifications may be incorrect, absent compounds may be missed, and quantitative estimates may lack confidence intervals [14]. This technical guide outlines comprehensive validation strategies to address these challenges, providing researchers with structured approaches to demonstrate the reliability of their NTA findings for scientific and regulatory applications.
Validation approaches for NTA must be aligned with study objectives, as different goals require distinct performance assessments. Research indicates that most NTA projects fall into three primary categories [14]:
Table 1: NTA Study Objectives and Corresponding Validation Focus Areas
| Study Objective | Primary Output | Key Validation Metrics |
|---|---|---|
| Sample Classification | Pattern recognition and group differentiation | Confusion matrix statistics, model repeatability, transferability between instruments |
| Chemical Identification | Compound annotation and structural elucidation | Confidence levels, identification error rates, spectral matching scores |
| Chemical Quantitation | Concentration estimates | Accuracy, precision, linearity, detection limits |
For sample classification, the focus is on correctly categorizing samples based on their chemical profiles, often using multivariate statistical models [14]. Validation here must assess whether classification models are repeatable over time and transferable between instruments or laboratories [14]. For chemical identification, the goal is to accurately annotate and identify unknown compounds, with validation requiring confidence levels and error rate assessments [14]. Chemical quantitation in NTA aims to provide reliable concentration estimates, necessitating traditional analytical validation approaches adapted to NTA's unique challenges [14] [62].
A robust validation framework for NTA should incorporate both qualitative and quantitative performance metrics, adapted from targeted analysis but acknowledging the distinct characteristics of NTA [14]. For qualitative studies focusing on sample classification and chemical identification, performance can be assessed using a confusion matrix approach, despite some limitations and challenges [14]. Key metrics derived from the confusion matrix include accuracy, precision, recall (sensitivity), specificity, and the F1-score.
For quantitative NTA studies, performance assessment should include estimation procedures developed for targeted methods, with consideration for additional sources of uncontrolled experimental error [14]. These include accuracy (closeness to true value), precision (measurement reproducibility), and sensitivity (limit of detection) [14]. The specific combination and application of these metrics depend on the NTA study objectives and the required confidence level for the intended application.
Implementing comprehensive QA/QC protocols throughout the NTA workflow is fundamental to generating reliable, validated results. While specific QA/QC approaches should be incorporated throughout any NTA workflow to evaluate specific method steps [14], measures such as method and instrument blanks, replicate analyses, and QC spikes of known compounds are particularly critical for validation.
The specific QA/QC measures should be documented thoroughly, including acceptance criteria for each parameter. For publications, primary method steps should be noted in the main text, with detailed procedures and settings provided in supporting information [23].
Effective validation requires carefully designed sample sets that challenge the NTA method across its intended application range, including blanks, replicates, and samples spiked at multiple concentration levels in representative matrices.
The validation sample set should be of sufficient size and diversity to provide statistical confidence in performance estimates, with recommended minimums of 15-20 representative compounds across the concentration range of interest [63].
The transformation of raw HRMS data into meaningful chemical information requires multiple processing steps, each requiring validation to ensure data integrity [23]. The typical workflow consists of three main segments:
Diagram 1: NTA Data Processing Workflow
Data format conversion transforms raw data files to usable formats (e.g., .d, .raw to mzML, mzXML) without intentional data interpretation [23]. Data processing then reduces the raw data to meaningful information through steps including retention time alignment, peak detection, adduct and isotopologue grouping, and between-sample alignment [23]. Statistical and chemometric analysis identifies trends and relationships between samples and detections, while annotation and identification attributes molecular characteristics and specific compounds to detected features [23]. Each step requires specific quality control checkpoints to ensure proper validation.
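The between-sample alignment step can be illustrated with a simple tolerance-based grouping of features by m/z and retention time. This is a didactic sketch under assumed tolerances; production tools such as XCMS or MZmine use considerably more sophisticated alignment algorithms:

```python
def align_features(feature_lists, mz_tol=0.005, rt_tol=0.2):
    """Group (mz, rt) features across samples by m/z and RT tolerance."""
    groups = []  # each group: {"mz": ..., "rt": ..., "hits": [sample indices]}
    for sample_idx, features in enumerate(feature_lists):
        for mz, rt in features:
            for g in groups:
                if abs(g["mz"] - mz) <= mz_tol and abs(g["rt"] - rt) <= rt_tol:
                    g["hits"].append(sample_idx)
                    break
            else:
                # no existing group matched: start a new feature group
                groups.append({"mz": mz, "rt": rt, "hits": [sample_idx]})
    return groups

# Two hypothetical samples; the 301.141 feature should align across both
samples = [
    [(301.1410, 5.21), (499.9970, 9.80)],
    [(301.1412, 5.25), (212.0431, 3.10)],
]
aligned = align_features(samples)
```

Validation of this step typically checks that features from technical replicates collapse into single groups rather than splitting into spurious new ones.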
Establishing confidence in compound identifications is a cornerstone of NTA validation. A standardized framework for communicating identification confidence should be implemented, typically consisting of multiple levels [23] [64]:
Table 2: Confidence Levels for Compound Identification in NTA
| Confidence Level | Required Evidence | Typical Applications |
|---|---|---|
| Level 1 | Confirmed by reference standard (match on retention time, accurate mass, and fragmentation spectrum) | Definitive identification for regulatory decisions |
| Level 2 | Probable structure based on library spectrum match (without RT confirmation) or characteristic fragmentation | Prioritization for further investigation |
| Level 3 | Tentative candidate based on diagnostic evidence (e.g., class-specific fragmentation) | Compound class identification |
| Level 4 | Molecular formula assignment based on accurate mass and isotope pattern | Elemental composition determination |
| Level 5 | Accurate mass of interest (exact mass match to database) | Suspect screening |
The confidence level should be explicitly reported for all compound identifications in NTA studies, along with the specific evidence supporting the assignment [23]. This framework enables appropriate interpretation of identification confidence by stakeholders and facilitates comparison across studies.
Parameter optimization is critical for reliable compound identification. An empirical approach to determining optimal positivity criteria has been demonstrated to achieve high identification efficiency [63]. Key parameters requiring optimization include the library match score, mass error, retention time error, and isotope pattern agreement.
One validated approach uses a combined scoring threshold of 70, with 70% weight given to library match and 10% weight each to mass error, retention time error, and isotope pattern difference, achieving identification efficiency of 99.2% [63]. This demonstrates the importance of library matching while incorporating orthogonal identification evidence.
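The weighted scoring scheme just described can be expressed directly. The sketch below assumes each sub-score has been normalized to a 0-100 scale, with the 70/10/10/10 weighting and threshold of 70 taken from the cited study [63]; function and variable names are illustrative:

```python
def combined_identification_score(library_match, mass_error_score,
                                  rt_error_score, isotope_score):
    """Weighted identification score: 70% library match, 10% each for
    mass error, RT error, and isotope pattern (all sub-scores 0-100)."""
    return (0.70 * library_match + 0.10 * mass_error_score
            + 0.10 * rt_error_score + 0.10 * isotope_score)

# Hypothetical candidate: strong library match, small mass/RT errors
score = combined_identification_score(85, 95, 90, 80)
positive = score >= 70  # positivity threshold from the optimized criteria
```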
While traditionally qualitative, NTA is increasingly being applied to generate quantitative estimates, necessitating robust validation approaches [62]. Strategies that have emerged for quantitative NTA (qNTA) validation include calibration with surrogate standards, predicted ionization efficiency models, and bootstrap-based uncertainty estimation.
Each approach requires specific validation experiments to establish performance characteristics, including linearity, accuracy, precision, and sensitivity over the concentration range of interest [62].
Comprehensive validation of quantitative NTA methods should establish performance across multiple parameters, adapted from targeted analysis but with consideration for NTA-specific challenges [14] [62]:
Table 3: Validation Parameters for Quantitative NTA
| Parameter | Assessment Method | Acceptance Criteria |
|---|---|---|
| Linearity | Analysis of calibration standards across concentration range | R² > 0.98, residual plots without pattern |
| Accuracy | Comparison to reference methods or spiked recovery | 70-120% recovery for most matrices |
| Precision | Repeated analysis of quality control samples | < 20% RSD for intermediate precision |
| Limit of Detection | Signal-to-noise approach or statistical methods | Sufficient for intended application |
| Matrix Effects | Comparison of solvent vs. matrix-matched standards | Consistent ionization suppression/enhancement |
| Carryover | Analysis of blanks after high concentration samples | < 20% of LOD |
Establishing these parameters provides stakeholders with confidence in quantitative estimates derived from NTA data, enabling their use in risk assessment and decision-making contexts [62].
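The linearity criterion in Table 3 (R² > 0.98, patternless residuals) can be checked with an ordinary least-squares fit. A minimal sketch with hypothetical calibration data:

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares slope, intercept, and R² for calibration data."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical calibration standards: concentration (ng/mL) vs. peak area
conc = [1, 5, 10, 50, 100]
area = [980, 5100, 9800, 51200, 99500]
slope, intercept, r2 = linear_fit_r2(conc, area)
acceptable = r2 > 0.98  # Table 3 linearity criterion
```

A high R² alone is not sufficient; the residuals should also be inspected for systematic curvature, as noted in the acceptance criteria.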
Machine learning (ML) approaches offer promising avenues for enhancing NTA validation through improved chemical structure identification, advanced quantification methods, and enhanced toxicity prediction capabilities [1]. ML applications in NTA validation include candidate structure ranking, retention time prediction, ionization efficiency modeling for quantification, and toxicity prediction.
While ML-assisted validation shows significant promise, challenges remain in refining these tools for complex mixtures and improving inter-laboratory validation [1].
Robust database infrastructure supports validation through curated reference data and standardized data exchange formats [65]. Key developments include openly shared, curated spectral databases and standardized formats such as mzML for data exchange.
Tools such as the Database Infrastructure for Mass Spectrometry (DIMSpec) provide open-source toolkits for creating portable databases with comprehensive metadata, supporting validation through contextual data preservation [65].
Comprehensive reporting of validation parameters is essential for interpreting NTA results and assessing their reliability. Minimum reporting standards should cover instrumental and data processing parameters, QA/QC results, identification confidence levels, and known limitations of the applied methods.
Adherence to standardized reporting frameworks facilitates comparison across studies and builds confidence in NTA findings among stakeholders [14].
Implementation of systematic NTA validation requires specific resources and tools. The following table details key solutions and their functions in the validation process:
Table 4: Essential Research Resources for NTA Validation
| Resource Category | Specific Tools/Resources | Function in Validation |
|---|---|---|
| Reference Spectral Libraries | NIST Mass Spectral Library, MassBank, DIMSpec databases [65] | Provide reference spectra for identification confirmation and confidence assessment |
| Chemical Databases | EPA CompTox Dashboard, PubChem, CAS [62] | Supply structural information and metadata for identification and hazard context |
| Data Processing Tools | patRoon, RMassBank, XCMS [65] | Enable reproducible data processing with documented parameters |
| QC Materials | Custom synthetic opioid libraries [66], PFAS standards [65] | Provide reference materials for method validation and performance monitoring |
| Statistical Packages | R-based tools, Python libraries for chemometrics [1] | Support statistical validation and model performance assessment |
| FAIR Data Infrastructure | DIMSpec toolkit, mzML format [65] | Ensure data preservation, sharing, and reusability for validation |
Systematic validation of NTA findings is essential for building confidence in results and enabling application in regulatory and decision-making contexts. This guide has outlined comprehensive strategies spanning experimental design, data processing, compound identification, quantitative analysis, and reporting. By implementing these structured approaches, researchers can demonstrate the reliability of their NTA findings, address inherent uncertainties in non-targeted approaches, and facilitate broader adoption of NTA data by stakeholders. As the field continues to evolve, standardization of validation practices across laboratories will further enhance the comparability and interpretability of NTA results, ultimately strengthening their utility in chemical safety assessment and public health protection.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful discovery-based approach for identifying unknown or unsuspected chemicals in complex samples across various fields, including environmental monitoring, food safety, and exposomics [19]. Unlike targeted methods that focus on predefined analytes, NTA aims to characterize sample composition without a priori knowledge of chemical content, making it particularly valuable for detecting emerging contaminants and identifying previously unrecognized chemical exposures [67]. However, the complexity of NTA workflows and the challenge of assigning identities to detected chemical features have highlighted the critical need for standardized confidence frameworks to ensure transparent and reproducible reporting of chemical identifications.
Confidence level frameworks provide systematic approaches for communicating the certainty associated with chemical identifications in NTA studies. These frameworks establish standardized criteria based on the type and quality of analytical data supporting each identification, enabling researchers, reviewers, and end-users to appropriately interpret and trust reported findings. The implementation of such frameworks addresses fundamental challenges in NTA, including variable data quality across laboratories, subjective interpretation of analytical results, and insufficient reporting of identification evidence [68]. This technical guide examines the established confidence frameworks, detailed methodologies for achieving different confidence levels, and practical implementation strategies within the broader context of NTA research using high-resolution mass spectrometry.
The most widely adopted confidence framework in NTA is the hierarchical scale proposed by Schymanski et al., which categorizes chemical identifications into five distinct levels based on the strength of supporting evidence [67]. This system has become the community standard for reporting identification confidence, with specific criteria that must be met at each level:
Level 1: Confirmed Structure - Requires matching of both retention time and mass spectral data (including fragmentation spectrum) with an authentic standard analyzed under identical analytical conditions [67]. This level provides the highest confidence and is considered definitive confirmation.
Level 2: Probable Structure - Demonstrates concordance of experimental MS/MS spectra with literature or library spectral data, but lacks confirmation with an authentic standard [67]. While highly confident, this level acknowledges potential for isomeric compounds to produce similar fragmentation patterns.
Level 3: Tentative Candidate - Assigns a specific structure based on diagnostic evidence such as spectral similarity to related compounds, characteristic fragmentation patterns, or class-specific retention time behavior, but with insufficient evidence for definitive identification [67].
Level 4: Unequivocal Molecular Formula - Determines a specific molecular formula based on accurate mass measurement and isotope pattern matching, but cannot distinguish between isomeric structures [67].
Level 5: Exact Mass of Interest - Identifies a feature of interest based solely on accurate mass measurement without additional supporting evidence for structure or molecular formula [67].
Table 1: Schymanski Confidence Levels for Chemical Identification in NTA
| Confidence Level | Identification Type | Required Evidence | Typical Applications |
|---|---|---|---|
| Level 1 | Confirmed structure | Match to authentic standard for RT and MS/MS | Definitive identification for risk assessment |
| Level 2 | Probable structure | Library spectrum match without reference standard | Compound identification when standards unavailable |
| Level 3 | Tentative candidate | Diagnostic evidence (class-specific fragments, etc.) | Structural class identification |
| Level 4 | Unequivocal molecular formula | Accurate mass + isotope pattern | Formula assignment for unknown prioritization |
| Level 5 | Exact mass feature | Accurate mass only | Feature detection and prioritization |
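The decision logic of Table 1 can be encoded as a simple rule cascade that assigns the highest level supported by the available evidence. A sketch assuming boolean evidence flags (all names are illustrative):

```python
def schymanski_level(rt_and_msms_match_standard=False,
                     library_msms_match=False,
                     diagnostic_evidence=False,
                     unequivocal_formula=False):
    """Assign a Schymanski confidence level from available evidence flags."""
    if rt_and_msms_match_standard:
        return 1  # confirmed structure (authentic standard)
    if library_msms_match:
        return 2  # probable structure (library spectrum match)
    if diagnostic_evidence:
        return 3  # tentative candidate
    if unequivocal_formula:
        return 4  # unequivocal molecular formula
    return 5      # exact mass of interest only

level = schymanski_level(library_msms_match=True)  # → 2
```

Encoding the cascade this way makes the reported level reproducible across analysts, rather than a subjective call made feature by feature.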
Beyond the Schymanski scale, additional approaches provide complementary confidence metrics for specific aspects of NTA. Kilgour et al. developed confidence metrics for automatic peak assignment that combine mass accuracy, relative ion abundance, and rings-plus-double-bonds equivalence with novel metrics based on interconnectivity of mass difference networks and confidence of initial library matches [69]. These metrics help analysts determine appropriate degrees of trust in automated elemental formula assignments, particularly for complex natural organic materials where manual verification is impractical.
The NTA Study Reporting Tool (SRT) provides a comprehensive framework for assessing reporting quality across all aspects of NTA studies, including chemical identification protocols [68]. Developed by the Benchmarking and Publications for Non-Targeted Analysis (BP4NTA) working group, the SRT structures reporting requirements according to study chronology and emphasizes harmonization between methods and results sections, with specific sub-categories for annotation and identification methods and corresponding identification outputs [68].
Achieving Level 1 confirmation requires analysis of authentic chemical standards under identical analytical conditions as the sample. The detailed protocol includes:
Reference Standard Acquisition: Obtain certified reference materials or purified standards of candidate compounds from commercial suppliers or through synthesis.
Chromatographic Alignment: Analyze standards using identical chromatographic conditions (column, mobile phase composition, gradient, flow rate, temperature) as experimental samples.
Retention Time Matching: Compare sample and standard retention times using acceptable tolerance windows (typically ±0.1-0.2 minutes for LC, ±5 seconds for GC), accounting for proper retention time indexing if needed.
Mass Spectral Verification: Confirm match between experimental and standard MS and MS/MS spectra using similarity scoring (e.g., dot product ≥0.8 for MS/MS) with appropriate mass tolerance (≤5 ppm for HRMS).
Ion Ratio Consistency: Verify consistency in adduct formation, isotope patterns, and fragment ion ratios between sample and standard analyses.
This protocol was successfully implemented in a study of semi-volatile organic compounds in indoor dust, where 128 compounds were identified at confidence level 1 or 2 in standard reference material dust, with verification using authentic standards [70].
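The retention time, accurate mass, and spectral checks of this protocol can be combined into a single acceptance test. Tolerances below follow the values quoted in the protocol (±0.1-0.2 min RT, ≤5 ppm mass error, MS/MS dot product ≥0.8); the data structures and names are illustrative:

```python
def level1_match(sample, standard, rt_tol_min=0.15, ppm_tol=5.0,
                 min_msms_similarity=0.8):
    """Accept a Level 1 confirmation only if RT, accurate mass, and MS/MS
    similarity all agree with the authentic standard."""
    rt_ok = abs(sample["rt"] - standard["rt"]) <= rt_tol_min
    ppm = 1e6 * abs(sample["mz"] - standard["mz"]) / standard["mz"]
    mass_ok = ppm <= ppm_tol
    msms_ok = sample["msms_similarity"] >= min_msms_similarity
    return rt_ok and mass_ok and msms_ok

# Hypothetical candidate feature vs. authentic standard measurement
candidate = {"rt": 7.42, "mz": 313.1496, "msms_similarity": 0.91}
reference = {"rt": 7.38, "mz": 313.1501}
confirmed = level1_match(candidate, reference)
```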
When authentic standards are unavailable, Level 2 identification can be achieved through comprehensive spectral matching:
High-Quality MS/MS Acquisition: Obtain fragmentation spectra using collision energies optimized for structural elucidation, typically employing data-dependent acquisition or inclusion lists.
Spectral Library Searching: Compare experimental MS/MS spectra against comprehensive reference databases such as mzCloud, MassBank, NIST, or GNPS.
Spectral Match Evaluation: Apply similarity scoring algorithms and manual spectral interpretation to confirm that key fragment ions and neutral losses match the proposed structure.
Orthogonal Evidence Integration: Support spectral matches with additional evidence including accurate mass measurement (≤5 ppm error), isotope pattern matching (≤10 mSigma fit), and retention time prediction when available.
In rapid response scenarios for unidentified chemicals, this approach has correctly assigned structures to more than half of the features investigated, achieving Level 2 or 3 identifications sufficient for initial hazard assessment [67].
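At its core, the spectral library searching step reduces to scoring spectral similarity against each reference entry. The sketch below bins centroided spectra onto a fixed m/z grid and ranks library entries by cosine similarity; the bin width, score threshold, and function names are illustrative assumptions, not the scoring used by mzCloud, MassBank, NIST, or GNPS.

```python
import math

def binned_vector(spectrum, bin_width: float = 0.01) -> dict:
    """Collapse a centroided spectrum (list of (m/z, intensity) pairs)
    onto a fixed m/z grid, summing intensities that share a bin."""
    vec = {}
    for mz, inten in spectrum:
        key = round(mz / bin_width)
        vec[key] = vec.get(key, 0.0) + inten
    return vec

def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two binned spectra."""
    if not a or not b:
        return 0.0
    num = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return num / (norm_a * norm_b)

def search_library(query_spectrum, library: dict, min_score: float = 0.7):
    """Rank library entries by similarity to the query spectrum,
    keeping only candidates above the score threshold (Level 2 evidence)."""
    q = binned_vector(query_spectrum)
    hits = []
    for name, ref_spectrum in library.items():
        score = cosine(q, binned_vector(ref_spectrum))
        if score >= min_score:
            hits.append((name, round(score, 3)))
    return sorted(hits, key=lambda h: h[1], reverse=True)
```

Real library search engines additionally weight peaks by m/z, penalize unmatched fragments, and report reverse-match scores; orthogonal evidence (mass error, isotope fit, predicted RT) would then be layered on top.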
For unknown compounds without reference spectra or standards, systematic structure elucidation approaches include:
Molecular Formula Determination: Use accurate mass measurement (typically ≤3 ppm error) combined with isotope abundance pattern matching to assign molecular formulas.
In Silico Fragmentation: Employ computational tools such as CFM-ID, MetFrag, or SIRIUS to generate predicted fragments for candidate structures and compare with experimental MS/MS spectra.
Retention Time Prediction: Develop quantitative structure-retention relationship (QSRR) models using machine learning to predict retention behavior for candidate structures [71]. Recent applications have achieved prediction errors below 0.5 minutes, significantly improving confidence in tentative identifications [71].
Class-Specific Diagnostic Evidence: Identify characteristic fragments, neutral losses, or mass defects indicative of specific chemical classes (e.g., -CF2- units for PFAS, halogen patterns for brominated/chlorinated compounds).
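The class-specific diagnostic logic can be illustrated for PFAS: homologue series appear as features spaced by the exact mass of a CF2 repeating unit (49.99681 Da). The following sketch (hypothetical function name and tolerances; a simplification of Kendrick mass defect analysis) chains features separated by successive CF2 steps.

```python
CF2_MASS = 49.99681  # exact mass of a CF2 repeating unit, in Da

def find_cf2_homologues(mz_values, mz_tol: float = 0.002, min_series: int = 3):
    """Group m/z values into candidate PFAS homologue series: chains of
    features spaced by one CF2 unit, within a small mass tolerance."""
    mzs = sorted(mz_values)
    series, used = [], set()
    for i, start in enumerate(mzs):
        if i in used:
            continue
        chain, current = [start], start
        extended = True
        while extended:
            extended = False
            for j, mz in enumerate(mzs):
                # Look for the next member one CF2 step above the chain end.
                if j not in used and abs(mz - (current + CF2_MASS)) <= mz_tol:
                    chain.append(mz)
                    used.add(j)
                    current = mz
                    extended = True
                    break
        if len(chain) >= min_series:
            series.append(chain)
    return series
```

An analogous search with 77.9105 Da spacing (Br/H exchange) or chlorine isotope-ratio filters would flag halogenated candidates.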
The complete workflow for implementing confidence frameworks in NTA studies involves sequential steps from initial detection to final confirmation, with decision points at each stage determining the achievable confidence level.
Confidence Level Determination Workflow
The NTA Study Reporting Tool provides a structured approach for comprehensive documentation of confidence-related methodologies and results throughout the research process.
NTA Study Reporting Structure
Successful implementation of confidence frameworks requires specific reagents, reference materials, and computational tools throughout the NTA workflow.
Table 2: Essential Research Reagents and Materials for Confidence Framework Implementation
| Category | Specific Items | Function in Confidence Assessment |
|---|---|---|
| Reference Standards | Authentic chemical standards, Certified reference materials (e.g., NIST SRM 2585 dust) | Level 1 confirmation via retention time and spectral matching [70] |
| Chromatography | LC columns (C18, HILIC, etc.), GC columns (non-polar, mid-polar), Mobile phase additives | Compound separation and reproducible retention behavior for identification |
| Ionization Sources | ESI, APCI, APPI, EI, CI sources for LC- and GC-HRMS | Optimal ionization for different compound classes to maximize detection |
| Spectral Libraries | Commercial (mzCloud, NIST) and public (MassBank, GNPS) databases | Level 2-3 identification via spectral matching and similarity scoring [19] |
| Software Tools | Vendor (Compound Discoverer, MassHunter) and open-source (MS-DIAL, MZmine) platforms | Data processing, feature detection, and identification workflow implementation [19] |
| QA/QC Materials | Internal standards, QC spikes, procedural blanks, pool quality control samples | Monitoring analytical performance and identifying potential artifacts [68] |
Recent advances in machine learning are enhancing confidence framework implementation through improved prediction capabilities. Quantitative structure-retention relationship (QSRR) models using machine learning algorithms can now predict retention times with errors below 0.5 minutes, providing valuable orthogonal evidence for compound identification [71]. These models establish correlations between molecular descriptors and chromatographic behavior, serving as supplementary evidence for confidence assignment in suspect and non-targeted screening.
Machine learning approaches are also being applied to MS/MS spectrum prediction, false discovery rate estimation, and automated structure annotation, addressing key challenges in NTA confidence assessment [1]. The integration of these computational tools with established confidence frameworks shows promise for reducing manual intervention while improving identification accuracy and reproducibility across laboratories.
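As a toy illustration of the QSRR idea, the sketch below fits a linear retention model to a synthetic descriptor table (all values invented for illustration; published QSRR models use far richer molecular descriptors and non-linear learners such as random forests or gradient boosting) and applies a ±0.5 min tolerance as an orthogonal-evidence check.

```python
import numpy as np

# Hypothetical training data: each row holds three molecular descriptors
# (logP, molecular weight / 100, H-bond donor count) for a calibrant,
# with its observed retention time in minutes. Values are synthetic.
X_train = np.array([
    [1.2, 1.5, 2.0],
    [2.8, 2.1, 1.0],
    [0.5, 1.1, 3.0],
    [3.5, 2.9, 0.0],
    [1.9, 1.8, 1.0],
])
rt_train = np.array([5.9, 9.2, 4.6, 10.9, 7.1])

# Fit a linear QSRR model by least squares (design matrix + intercept).
A = np.hstack([X_train, np.ones((len(X_train), 1))])
coef, *_ = np.linalg.lstsq(A, rt_train, rcond=None)

def predict_rt(descriptors) -> float:
    """Predict retention time for a candidate structure's descriptors."""
    return float(np.dot(np.append(descriptors, 1.0), coef))

def rt_supports_candidate(observed_rt: float, descriptors,
                          tol_min: float = 0.5) -> bool:
    """Orthogonal-evidence check: predicted RT within +/-0.5 min of observed."""
    return abs(predict_rt(descriptors) - observed_rt) <= tol_min
```

A candidate whose predicted retention time falls outside the tolerance window would be down-weighted rather than rejected outright, since QSRR predictions carry their own uncertainty.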
Hybrid approaches that integrate targeted and non-targeted analysis provide complementary advantages for comprehensive chemical characterization and confidence assessment. As demonstrated in a study of semi-volatile organic compounds in indoor dust, integrated NTA and TA approaches enable identification and prioritization of a wider range of chemicals while maintaining confidence in the results [70]. This strategy avoids the loss of critical trace compounds that might be missed by NTA alone while providing discovery capabilities beyond traditional targeted methods.
The integrated approach facilitates compound prioritization based on both exposure relevance and potential toxicity, supporting more informed risk-based assessment of identified chemicals [70]. This is particularly valuable in rapid response scenarios where timely and confident identification of unknown stressors is required, such as chemical threat incidents, illicit drug contamination, or accidental industrial spills [67].
The development and adoption of community-wide standards represent a critical direction for advancing confidence frameworks in NTA. The NTA Study Reporting Tool provides a living framework for assessing reporting quality, with integration into the BP4NTA website allowing continued evolution as community needs change [68]. Widespread implementation of such tools is expected to improve study design and standardize reporting practices, ultimately leading to broader use and acceptance of NTA data across scientific disciplines and regulatory applications.
Current efforts focus on addressing reporting areas that need immediate improvement, such as analytical sequence documentation and quality assurance/quality control information, which are essential for proper assessment of identification confidence [68]. As these community standards mature, they will enhance the transparency, reproducibility, and reliability of chemical identification confidence assignments in non-targeted analysis.
Non-targeted analysis (NTA) using high-resolution mass spectrometry has emerged as a powerful approach for characterizing the chemical composition of complex samples without prior knowledge of their content. This capability makes NTA invaluable across diverse fields including environmental science, food safety, exposomics, and drug development [68] [20]. Unlike traditional targeted methods that focus on predefined analytes, NTA workflows aim to detect and identify unknown chemicals, classify samples based on chemical profiles, and discover previously unrecognized compounds [68]. The exponential growth of NTA applications, however, has revealed significant challenges in research reproducibility and transparency due to the complexity of methodologies and lack of universal reporting standards [68] [72].
The fundamental challenge facing the NTA community lies in the tremendous diversity of analytical workflows, instrumentation, data processing techniques, and quality assurance practices employed across different laboratories and research domains [68]. This methodological heterogeneity, combined with insufficient reporting of critical experimental details, has hampered the ability to reproduce findings, compare results across studies, and build upon existing research [68] [57]. In response to these challenges, the Benchmarking and Publications for Non-Targeted Analysis (BP4NTA) working group developed the NTA Study Reporting Tool (SRT) as a comprehensive framework to standardize reporting practices and improve research transparency [73] [68] [72].
The SRT emerged from coordinated efforts by the BP4NTA working group, which formed in 2018 to address critical challenges in NTA research and reporting [20]. Comprising researchers from academic, government, and industrial sectors across North America and Europe, BP4NTA recognized that despite existing guidance on specific NTA elements—most notably for compound identification—no comprehensive reporting framework covered the complete NTA workflow [68] [20]. The development process involved rigorous validation where eleven NTA practitioners with diverse expertise evaluated eight published articles covering environmental, food, and health-based exposomic applications [68]. This evaluation demonstrated that the SRT provided a valid structure for guiding study design, manuscript preparation, and critical assessment of reporting quality [68] [57].
The SRT is strategically organized to follow the chronological progression of a typical NTA study, ensuring logical flow and comprehensive coverage of all critical workflow components [68] [74]. The tool divides the research process into two major sections (Methods and Results) containing five categories total, with each category further broken down into specific sub-categories for detailed evaluation [68].
Table: Structure of the NTA Study Reporting Tool
| Section | Category | Sub-categories |
|---|---|---|
| Methods | Study Design | Objectives & Scope, Sample Information & Preparation, QC Spikes & Samples |
| | Data Acquisition | Analytical Sequence, Chromatography, Mass Spectrometry |
| | Data Processing & Analysis | Data Processing, Statistical & Chemometric Analysis, Annotation & Identification |
| Results | Data Outputs | Statistical & Chemometric Outputs, Identification & Confidence Levels |
| | QA/QC Metrics | Data Acquisition QA/QC, Data Processing & Analysis QA/QC |
This organized structure enables systematic evaluation of reporting completeness at each stage of the NTA workflow, from initial study design through final quality assurance documentation [68]. Each sub-category includes examples of the specific information that should be reported, making the SRT accessible to both novice and experienced NTA researchers [73] [68].
The SRT employs a hybrid scoring system that combines color-coding with numerical values to facilitate clear and consistent evaluation of reporting quality [68] [72]. The current version uses a 4-level scoring system where reviewers assign ratings based on the completeness of reporting in each sub-category, with space provided for rationales explaining each score assignment [73] [68].
Table: SRT Scoring System and Criteria
| Score | Color | Description |
|---|---|---|
| 3 | Blue | All relevant reporting elements are present |
| 2 | Yellow | Some relevant reporting elements are present, but important information is missing |
| 1 | Red | Most or all relevant reporting elements are missing |
| NA | Gray | Reporting on this topic is outside the study scope |
It is crucial to emphasize that the SRT focuses exclusively on assessing reporting quality, not the scientific merit or quality of the research data itself [68] [72] [74]. This distinction ensures that the tool remains an objective framework for evaluating the transparency and reproducibility of NTA studies, regardless of their specific research goals or methodological approaches [74].
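The scoring rubric lends itself to simple aggregation across multiple reviewers, analogous to the score-comparison plotting in the Excel version of the SRT. This sketch uses hypothetical data structures (it is not BP4NTA code), representing NA as `None` and averaging per-sub-category scores while skipping NA entries.

```python
# Score-to-color mapping from the SRT rubric; None represents "NA".
SRT_COLORS = {3: "blue", 2: "yellow", 1: "red", None: "gray"}

def summarize_scores(reviews: dict) -> dict:
    """Average each sub-category's scores across reviewers, ignoring NA.

    `reviews` maps a sub-category name to the list of scores assigned by
    each reviewer (3, 2, 1, or None for NA)."""
    summary = {}
    for sub_category, scores in reviews.items():
        scored = [s for s in scores if s is not None]
        summary[sub_category] = (
            round(sum(scored) / len(scored), 2) if scored else None
        )
    return summary
```

Low average scores concentrated in one sub-category (e.g., QA/QC documentation) would flag a reporting gap for authors to address before submission.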
The SRT is freely available to the scientific community through the BP4NTA website in two practical formats designed to accommodate different user preferences and use cases [73] [74]. The Excel version provides interactive functionality with dropdown menus for scoring and built-in plotting capabilities to visualize scores from multiple reviewers, while the PDF version offers a static format that can be easily annotated and shared [73]. Both formats maintain the same core structure and scoring system, ensuring consistency regardless of the chosen format [73].
The SRT is designed for implementation at multiple stages of the research process, providing maximum benefit to the NTA community [68] [74]. During study design and planning, researchers can use the SRT as a checklist to ensure all critical methodological elements are incorporated into their experimental protocols [68] [72]. When preparing manuscripts or research proposals, authors can perform self-evaluation using the SRT to identify reporting gaps before submission [68]. Evaluation results demonstrated that 72% of author self-assigned scores fell within the range of peer-assigned scores, indicating that SRT use for self-evaluation effectively strengthens reporting practices [57]. For peer review, journal referees and editors can apply the SRT to conduct standardized assessments of manuscript completeness, with the Excel version specifically including plotting functionality to facilitate comparison of multiple reviewer scores [73] [68] [74].
Proper citation practices ensure appropriate credit for the SRT developers and enable tracking of the tool's adoption and impact [73] [74]. The BP4NTA provides specific template language for acknowledging SRT use in publications, with slightly different formats depending on whether the tool was used during manuscript preparation or during peer review [73] [74].
Table: SRT Citation Guidelines for Different Use Cases
| Use Case | Location in Manuscript | Template Language |
|---|---|---|
| Manuscript Preparation | Methods Section | "The NTA Study Reporting Tool (SRT) was used in the preparation of this manuscript (Peter et al., 2021; 10.6084/m9.figshare.19763482 [PDF] or 10.6084/m9.figshare.19763503 [Excel])." |
| Peer Review | Acknowledgements Section | "The NTA Study Reporting Tool (SRT) was used during peer review to document and improve the reporting and transparency of this study (10.1021/acs.analchem.1c02621; 10.6084/m9.figshare.19763482 [PDF] or 10.6084/m9.figshare.19763503 [Excel])." |
The SRT encompasses the complete NTA workflow, from initial study conception through final reporting of results and quality metrics. The following diagram illustrates the logical relationships between different SRT components and their progression through the research lifecycle:
The SRT recognizes that comprehensive reporting of research reagents and materials is fundamental to experimental reproducibility. The following table details key components that should be documented throughout the NTA workflow:
Table: Essential Research Reagents and Materials for NTA Studies
| Category | Component | Function and Reporting Requirements |
|---|---|---|
| Sample Preparation | QC Spikes & Samples | Internal standards, recovery surrogates, procedural blanks; report compounds, concentrations, and addition points [68] |
| | Purification Materials | SPE cartridges, filtration devices; report specific sorbents, formats, and processing conditions [27] |
| Data Acquisition | Chromatography System | Separation mechanism; report column type, dimensions, mobile phases, and gradient program [68] |
| | Mass Spectrometer | Mass analysis; report instrument type, resolution, mass accuracy, and acquisition mode [68] |
| | Analytical Sequence | Sample randomization; report injection order, QC frequency, and blank placement [68] |
| Data Processing | Reference Spectral Libraries | Compound identification; report specific libraries and versions used [68] [27] |
| | Software Platforms | Data processing; report software, algorithms, and parameter settings [68] [27] |
The initial validation of the SRT across eight published studies revealed significant disparities in reporting quality across different aspects of NTA workflows [68] [72]. While methods sections generally contained adequate descriptions of chromatography and mass spectrometry parameters, critical information about analytical sequences and quality assurance practices was frequently incomplete or entirely absent [68] [72]. Specifically, the evaluation identified that reporting scores were substantially lower and more variable for QA/QC metrics compared to other sub-categories, highlighting an urgent need for improved documentation of quality control practices across the NTA community [68] [57].
The SRT has already gained traction within the scientific community, with encouraging adoption by journals, researchers, and reviewers [74]. The Journal of Exposure Science and Environmental Epidemiology (JESEE) has incorporated the SRT into its author and reviewer guidelines, particularly for special issues focused on exposomics using non-targeted analysis [74]. This formal integration into the peer review process represents a significant step toward standardizing NTA reporting across the scientific literature [74]. As more researchers and journals adopt the SRT, the tool is expected to drive measurable improvements in the completeness, transparency, and reproducibility of NTA studies [57] [74].
The BP4NTA working group explicitly designed the SRT as a "living framework" that will evolve alongside advancements in NTA research [68] [72]. The digital distribution strategy, with version-controlled files hosted on the BP4NTA website, enables periodic updates based on community feedback, technological developments, and emerging best practices [73] [72]. Researchers are actively encouraged to submit comments, suggestions, and improvement ideas through a dedicated portal on the BP4NTA website, ensuring that the tool remains responsive to the changing needs of the NTA community [73] [72].
The NTA Study Reporting Tool represents a significant advancement in promoting reproducibility and transparency for non-targeted analysis studies using high-resolution mass spectrometry. By providing a comprehensive, standardized framework for evaluating reporting quality across the entire NTA workflow, the SRT addresses critical challenges that have hindered comparison, replication, and utilization of NTA data across research domains. The structured organization, practical scoring system, and flexible implementation options make the SRT accessible to researchers at all experience levels, from novices entering the field to experienced practitioners developing complex methodologies. As adoption increases, the SRT is poised to fundamentally transform reporting practices throughout the NTA community, ultimately enhancing the scientific rigor, reliability, and impact of non-targeted analysis in exposure science, environmental epidemiology, drug development, and related fields.
Non-targeted analysis (NTA) represents a paradigm shift in analytical chemistry, enabling the comprehensive characterization of chemical composition in complex samples without prior knowledge of their content. Powered primarily by high-resolution mass spectrometry (HRMS), NTA has become an indispensable tool for researchers across diverse fields, including environmental science, drug development, and exposomics [20]. Unlike traditional targeted methods that focus on predefined compounds, NTA aims to capture both known "unknowns" (compounds that exist but are not specifically monitored) and truly novel chemicals, providing a holistic view of the chemical universe within a sample [3]. This capability is particularly valuable for drug development professionals who must understand complex metabolite profiles, identify impurities, and characterize biotransformation products that may elude conventional targeted approaches.
The fundamental challenge in modern NTA lies not in data generation but in computational interpretation. Contemporary HRMS instruments generate immensely complex datasets that require sophisticated software tools for feature extraction, compound identification, and data reduction [27]. The analytical process transforms raw instrument data into chemically meaningful information through a multi-stage workflow encompassing data preprocessing, feature finding, compound annotation, and ultimately, identification and quantification. Each stage introduces specific computational challenges that different software tools address through varied algorithms and processing approaches. A critical study examining the consistency of data processing across different NTA software tools revealed startlingly low coherence, with only approximately 10% overlap of features between all four major programs tested (MZmine2, enviMass, Compound Discoverer, and XCMS online), while 40-55% of features were unique to each individual software platform [75]. This lack of consistency underscores the critical importance of understanding software capabilities, limitations, and algorithmic differences when designing NTA studies and interpreting results.
The divergence in results between NTA software platforms stems fundamentally from differences in their underlying algorithms and processing workflows. Each software package implements unique approaches to critical steps including chromatographic peak detection, chromatographic alignment, isotope and adduct grouping, and blank filtration. These algorithmic differences lead to substantial variation in the final feature lists, even when processing identical raw data files [75]. The implementation of replicate and blank filtering has been identified as a particularly significant source of observed divergences between software tools, suggesting that post-processing filters contribute substantially to the variability in reported results.
Performance benchmarking remains challenging due to the absence of standardized metrics for NTA software evaluation. Unlike targeted analysis, where metrics such as sensitivity and specificity can be calculated from known compounds, NTA must contend with unknown features whose true presence or absence is difficult to verify. Some researchers have proposed using the number of true positives detected in spiked samples or the consistency of feature detection across replicates as potential metrics, but these approaches have limitations for evaluating true non-targeted performance [75]. The confidence level of identification represents another critical performance dimension, typically following a five-level hierarchy from confirmed structure (Level 1) through probable structure (Level 2) and tentative candidate (Level 3) to unequivocal molecular formula (Level 4) and exact mass only (Level 5) [20].
Table 1: Comparative Analysis of Major NTA Software Platforms
| Software Tool | Algorithmic Approach | Strengths | Limitations | Optimal Use Cases |
|---|---|---|---|---|
| MZmine2 | Modular workflow with user-defined parameters | High flexibility, open-source, active development | Steeper learning curve, requires parameter optimization | Research requiring customized workflows and method development |
| XCMS Online | Cloud-based processing with standardized workflows | Accessibility, minimal setup, visualizations | Less customizable, dependency on internet connection | Initial exploratory analysis and collaborative projects |
| Compound Discoverer | Integrated workflow with automated processing | Streamlined workflow, commercial support, high throughput | Limited algorithmic transparency, cost | Routine screening in regulated environments |
| enviMass | R-based processing focused on environmental applications | Specialized for time-series data, trend analysis | Narrower focus, less broad applicability | Environmental monitoring and temporal trend analysis |
The transition from qualitative to quantitative NTA represents a critical frontier for the field, particularly for applications in risk assessment and regulatory decision-making where concentration estimates are essential. Significant efforts have been made in recent years to bridge this quantitative gap, with several approaches emerging for deriving quantitative estimates from NTA measurements [28]. These include the use of surrogate standards, machine learning-based prediction, and response factor modeling. However, quantitative NTA methods currently do not fully consider estimation uncertainty, and the effects of experimental recovery on this uncertainty remain largely unexplored in NTA studies [28].
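The surrogate-standard approach can be sketched as follows. The 10× bounding factor is an assumed placeholder for response-factor variability between the surrogate and the unknown; it is not a value from the cited work, and the function names are hypothetical.

```python
def estimate_concentration(area_unknown: float,
                           surrogate_area: float,
                           surrogate_conc: float,
                           rf_uncertainty_factor: float = 10.0) -> dict:
    """Surrogate-based concentration estimate with a bounding interval.

    Assumes the unknown's instrument response factor equals the
    surrogate's; the bounding factor (an assumed 10x here) expresses
    how far the true response factor might plausibly deviate."""
    response_factor = surrogate_area / surrogate_conc  # area per unit conc.
    estimate = area_unknown / response_factor
    return {
        "estimate": estimate,
        "lower": estimate / rf_uncertainty_factor,
        "upper": estimate * rf_uncertainty_factor,
    }
```

Reporting the bounding interval alongside the point estimate, and propagating experimental recovery into it, is precisely the uncertainty characterization the text identifies as underdeveloped.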
The integration of quantitative NTA estimates with available hazard metrics may facilitate provisional safety evaluations, creating a pathway for NTA data to directly support chemical risk characterization. This is particularly relevant for drug development professionals who must assess the potential risk of identified impurities and metabolites. The conceptual framework for incorporating NTA data into contemporary risk assessment involves linking contaminant discovery with risk characterization through quantitative estimates, though significant methodological challenges remain in properly characterizing and communicating the associated uncertainties [28].
Rigorous evaluation of NTA software performance requires carefully designed benchmarking studies that assess both method capabilities and limitations. The Benchmarking and Publications for Non-Targeted Analysis Working Group (BP4NTA) has established a framework for such evaluations, emphasizing the need for harmonized terminology and clear guidance about best practices for analysis and reporting results [20]. Effective benchmarking studies should incorporate quality assurance and quality control (QA/QC) approaches specifically designed for NTA, including the use of quality control materials, standardized sample preparation protocols, and data quality assessment metrics.
The Non-Targeted Analysis Collaborative Trial (ENTACT) exemplifies a large-scale collaborative effort to evaluate and benchmark NTA performance across multiple laboratories and platforms [20]. Such proficiency testing exercises reveal substantial variability in results that can be attributed to differences in sample preparation techniques, instrumentation, software, and user settings rather than true sample differences. This highlights the critical importance of standardizing experimental protocols when conducting comparative software evaluations. Key components of an effective benchmarking protocol include: (1) use of standardized reference materials with known composition; (2) implementation of blank samples to identify and filter contamination; (3) incorporation of quality control samples to monitor instrument performance; (4) consistent data processing parameters across software platforms; and (5) standardized reporting formats for features and identifications.
The integration of machine learning (ML) with NTA represents a transformative advancement for extracting meaningful environmental information from the vast chemical datasets generated by HRMS [27]. ML algorithms are particularly effective at identifying latent patterns within high-dimensional data, making them well-suited for contamination source identification, sample classification, and biomarker discovery. A comprehensive workflow for ML-assisted NTA encompasses four key stages: (1) sample treatment and extraction; (2) data generation and acquisition; (3) ML-oriented data processing and analysis; and (4) result validation [27].
In the data processing stage, ML techniques address several critical challenges. Data preprocessing methods including noise filtering, missing value imputation (e.g., k-nearest neighbors), and normalization (e.g., TIC normalization) help mitigate batch effects and improve data quality [27]. Dimensionality reduction techniques like principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) simplify high-dimensional data, while clustering methods (hierarchical cluster analysis, k-means clustering) group samples by chemical similarity. Supervised ML models, including Random Forest (RF) and Support Vector Classifier (SVC), can then be trained on labeled datasets to classify contamination sources or predict sample properties. For example, one study successfully implemented ML classifiers to screen 222 targeted and suspect per- and polyfluoroalkyl substances (PFASs) across 92 samples, achieving classification balanced accuracy ranging from 85.5% to 99.5% across different sources [27].
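Two of the preprocessing steps named above, TIC normalization and PCA, can be sketched compactly with NumPy. This is a minimal illustration under the assumption of a complete (no missing values) feature table; a real workflow would add imputation, scaling, and a trained classifier from a standard ML library.

```python
import numpy as np

def tic_normalize(feature_table: np.ndarray) -> np.ndarray:
    """Normalize each sample (row) to its total ion current so that
    intensities are comparable across injections."""
    totals = feature_table.sum(axis=1, keepdims=True)
    return feature_table / totals

def pca_scores(feature_table: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project mean-centered data onto its top principal components,
    computed via singular value decomposition."""
    centered = feature_table - feature_table.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

The resulting score matrix is what a PCA scores plot visualizes; clusters in that space often correspond to contamination sources or sample classes before any supervised model is trained.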
Table 2: Essential Research Reagents and Materials for NTA Studies
| Reagent/Material | Function/Purpose | Application Notes |
|---|---|---|
| Solid Phase Extraction (SPE) Cartridges | Sample cleanup and analyte enrichment | Multi-sorbent strategies (e.g., Oasis HLB with ISOLUTE ENV+) provide broader chemical coverage |
| Quality Control Materials | Monitoring instrument performance and data quality | Includes solvent blanks, procedural blanks, and quality control samples |
| Internal Standards | Correcting for matrix effects and instrumental variance | Both labeled and non-labeled compounds across different retention times |
| Certified Reference Materials (CRMs) | Method validation and compound verification | Essential for confirming compound identities and validating quantitative approaches |
| Retention Time Markers | Chromatographic alignment and performance monitoring | Critical for retention time correction across different batches |
Diagram 1: NTA Software Evaluation Workflow. This workflow illustrates the systematic process for comparing multiple NTA software platforms, from raw data processing through performance evaluation.
The experimental workflow for comparative software analysis begins with standardized raw data acquisition using high-resolution mass spectrometry, typically with liquid or gas chromatography separation (LC/GC-HRMS). Following data acquisition, the same raw data files are processed in parallel through multiple software platforms, ensuring consistent parameter settings where possible [75]. The resulting feature lists from each software undergo blank filtering and replicate analysis to remove artifacts and ensure only reproducible features are considered. The critical analysis phase involves comparing the feature lists to identify overlaps and unique detections, with studies consistently showing approximately 10% overlap between all four major programs and 40-55% of features unique to each software [75].
For the overlapping features, identification confidence is assessed using the five-level hierarchy, examining supporting evidence including exact mass, isotopic patterns, fragmentation spectra, and when available, retention time matching with authentic standards [20]. Performance metrics are then calculated for each software, including sensitivity (number of true positives detected), selectivity (discrimination of true features from noise), and reproducibility (consistency across replicates). The final comparative report should transparently communicate both the strengths and limitations of each software platform for the specific application, noting that software performance may vary significantly based on sample matrix, instrumentation, and analytical objectives.
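Feature-list comparison across platforms reduces to tolerance-based matching on (m/z, RT) pairs. Below is a minimal sketch with assumed tolerances and hypothetical function names; production comparisons would also resolve adducts and isotopologues before matching.

```python
def match_features(list_a, list_b, ppm_tol: float = 5.0, rt_tol: float = 0.2):
    """Count features shared between two software outputs. Features are
    (m/z, RT) tuples; a match must agree within both tolerances."""
    matched, used = 0, set()
    for mz_a, rt_a in list_a:
        for j, (mz_b, rt_b) in enumerate(list_b):
            if j in used:
                continue
            ppm = abs(mz_a - mz_b) / mz_b * 1e6
            if ppm <= ppm_tol and abs(rt_a - rt_b) <= rt_tol:
                matched += 1
                used.add(j)
                break
    return matched

def overlap_fraction(list_a, list_b, **tols) -> float:
    """Jaccard-style overlap: shared features over the union of both lists."""
    shared = match_features(list_a, list_b, **tols)
    return shared / (len(list_a) + len(list_b) - shared)
```

Pairwise overlap fractions computed this way are what underlie the reported ~10% agreement among all four platforms; low values indicate algorithmic divergence rather than true sample differences.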
Implementing robust quality assurance and quality control (QA/QC) procedures is essential for generating reliable and reproducible NTA data. The BP4NTA working group emphasizes that study design should intentionally incorporate QA/QC approaches and yield the necessary data to enable performance assessments after data acquisition and analysis is complete [3]. Key QA/QC elements include the analysis of blank samples to identify contamination, quality control samples to monitor instrument stability, and replicate analyses to assess precision. The use of internal standards helps correct for matrix effects and instrumental variance, while standardized protocols for sample preparation and data processing minimize technical variability.
Confidence in compound identification represents a particular challenge in NTA, and should be communicated using a standardized confidence hierarchy. Level 1 identifications require confirmation with an authentic standard analyzed under identical analytical conditions, providing definitive confirmation [20]. Level 2 identifications are supported by library spectrum matches without a reference standard, while Level 3 represents tentative candidates based on diagnostic evidence. Lower confidence levels (4 and 5) provide progressively less certain information, from unambiguous molecular formula to exact mass only. Transparent reporting of identification confidence levels is essential for proper interpretation of NTA results, particularly in regulatory contexts or when informing risk assessment decisions.
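The five-level hierarchy can be read as a decision cascade from strongest to weakest evidence. The boolean evidence flags below are a hypothetical encoding for illustration; the level definitions follow the hierarchy described above.

```python
# Illustrative mapping of available evidence to the five-level
# identification-confidence hierarchy (Schymanski-style levels).

def confidence_level(has_exact_mass, has_formula, has_tentative_structure,
                     has_library_match, has_reference_standard):
    if has_reference_standard:
        return 1   # confirmed structure: authentic standard, identical conditions
    if has_library_match:
        return 2   # probable structure: library spectrum match
    if has_tentative_structure:
        return 3   # tentative candidate: diagnostic evidence only
    if has_formula:
        return 4   # unambiguous molecular formula
    if has_exact_mass:
        return 5   # exact mass of interest only
    return None    # no usable evidence

# A feature with a library spectral match but no authentic standard:
print(confidence_level(True, True, True, True, False))  # prints 2
```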
The NTA field continues to evolve rapidly, with several community-wide initiatives working to address current challenges and advance methodological rigor. The Benchmarking and Publications for Non-Targeted Analysis Working Group (BP4NTA) represents a prominent effort to harmonize approaches and reporting practices across the NTA community [20]. With participation from academic, government, and industry sectors, BP4NTA has developed consensus definitions for NTA-relevant terms, reference content to support methodological reporting, and resources for both novice and experienced NTA researchers.
Future methodological developments are likely to focus on improving quantitative capabilities, with significant efforts underway to bridge the gap between contaminant discovery and risk characterization [28]. The integration of machine learning and artificial intelligence approaches will continue to advance, particularly for compound identification and source attribution [27]. Meanwhile, community-wide proficiency testing exercises and collaborative trials will help establish performance benchmarks and best practices. As these efforts mature, NTA data will see expanded application in regulatory decision-making and risk-based prioritization, moving beyond purely exploratory applications to directly support chemical management and safety assessment.
The comparative analysis of NTA software and computational tools reveals a complex landscape characterized by diverse algorithmic approaches and substantial variability in results. The finding that different software tools applied to identical datasets can yield dramatically different feature lists underscores the critical importance of software selection and transparent methodology reporting in NTA studies [75]. Researchers must carefully consider their analytical objectives, sample characteristics, and available resources when selecting software tools, recognizing that performance is highly context-dependent.
The ongoing harmonization efforts led by community initiatives like BP4NTA provide promising pathways toward improved consistency and reliability in NTA [20]. As the field continues to mature, the integration of quantitative approaches [28] and machine learning methodologies [27] will expand the applications and impact of NTA across diverse scientific domains. For drug development professionals and other researchers leveraging NTA approaches, maintaining awareness of evolving best practices, participating in community initiatives, and implementing rigorous QA/QC procedures will be essential for generating chemically meaningful and scientifically defensible results. Through continued methodological refinement and community-wide collaboration, NTA promises to remain at the forefront of analytical innovation, providing unprecedented insights into the chemical complexity of biological and environmental systems.
Non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS) has emerged as a powerful approach for detecting and identifying unknown and unexpected compounds in complex sample matrices [27] [20]. Unlike targeted methods that focus on predefined analytes, NTA aims to characterize the comprehensive chemical composition of samples without prior knowledge of their chemical content [3]. This capability makes NTA particularly valuable for discovering emerging environmental contaminants, identifying transformation products, and classifying samples based on chemical profiles [1] [22]. However, the inherent uncertainty in NTA results poses significant challenges for interpretation and acceptance by stakeholders, including regulatory bodies [5].
The complexity of NTA data, often comprising thousands of detected features in a single sample, necessitates robust validation strategies to ensure reliable results [22]. Without standardized validation approaches, the translation of NTA findings into actionable environmental insights remains problematic [27]. Tiered validation addresses this challenge by implementing multiple layers of verification that collectively enhance the confidence in NTA results [27]. This systematic framework bridges the gap between analytical capability and environmental decision-making by providing structured approaches to verify compound identifications, assess model generalizability, and contextualize findings within real-world scenarios [27].
The tiered validation framework for NTA consists of three complementary approaches: reference material verification, external dataset validation, and environmental plausibility assessment [27]. This multi-faceted strategy ensures that NTA results are both chemically accurate and environmentally meaningful by addressing different aspects of validation [27]. The framework is particularly crucial for supporting the interpretation of complex HRMS data and advancing the application of NTA in environmental research and decision-making [27].
Table 1: Overview of the Three-Tiered Validation Framework for NTA
| Validation Tier | Primary Objective | Key Techniques | Outcomes Assessed |
|---|---|---|---|
| Reference Material Verification | Confirm analytical confidence in compound identities | Certified reference materials (CRMs), spectral library matches, confidence-level assignments [27] | Chemical identity confirmation, analytical accuracy |
| External Dataset Testing | Evaluate model generalizability and robustness | Independent external datasets, cross-validation techniques (e.g., 10-fold) [27] | Model overfitting, transferability, performance stability |
| Environmental Plausibility Checks | Correlate model predictions with real-world context | Geospatial analysis, source-specific chemical markers, contextual data integration [27] | Environmental relevance, source-receptor relationships |
The integration of these three validation tiers addresses a critical gap in NTA research, where previous approaches often emphasized laboratory-based tests that might underperform in real-world conditions involving field-validated source-receptor relationships [27]. By implementing this comprehensive framework, researchers can more effectively contextualize machine learning outputs within actual contamination scenarios, thereby enhancing the practical utility of NTA data for environmental protection and public health decision-making [27].
Figure 1: The three-tiered validation framework for non-targeted analysis, illustrating the interconnected approach to verifying NTA results through reference materials, external datasets, and environmental plausibility assessments.
Reference material verification constitutes the foundational tier of NTA validation, focusing on confirming the analytical confidence in compound identities [27]. This process begins with the use of certified reference materials (CRMs) to verify the accuracy of compound identifications [27]. The experimental protocol involves analyzing CRMs alongside environmental samples using identical instrumentation and analytical conditions. This parallel analysis enables direct comparison of retention times, mass accuracy, and fragmentation spectra between suspected compounds in samples and authentic standards [27].
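The core comparison against an authentic standard reduces to a mass-error (ppm) check and a retention-time check. A minimal sketch follows, using the [M+H]+ m/z of carbamazepine (237.1022) as an example; the tolerances and retention times are illustrative laboratory choices, not prescribed values.

```python
# Sketch of the suspect-vs-standard comparison step (illustrative tolerances).

def ppm_error(measured_mz, reference_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - reference_mz) / reference_mz * 1e6

def consistent_with_standard(measured_mz, measured_rt,
                             std_mz, std_rt,
                             ppm_tol=5.0, rt_tol=0.1):
    """True if mass error and retention-time difference are within tolerance."""
    return (abs(ppm_error(measured_mz, std_mz)) <= ppm_tol
            and abs(measured_rt - std_rt) <= rt_tol)

# Suspected carbamazepine ([M+H]+ = 237.1022) vs. an authentic standard:
print(round(ppm_error(237.1030, 237.1022), 2))                   # mass error in ppm
print(consistent_with_standard(237.1030, 7.42, 237.1022, 7.45))  # prints True
```

Fragmentation-spectrum similarity (the third line of evidence mentioned above) would be scored separately, typically with a spectral similarity metric against the standard's MS/MS spectrum.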
Spectral library matching represents another critical component of reference material verification [27]. The protocol requires acquiring MS/MS spectra for detected features in samples and comparing them against reference spectra in curated databases. Confidence-level assignments according to established frameworks (e.g., Level 1-5) provide systematic approaches for communicating identification certainty [27] [20]. Level 1 identification, the highest confidence level, requires matching retention time and fragmentation spectrum with an authentic standard analyzed under identical analytical conditions [27]. This tiered confidence system helps stakeholders appropriately interpret and utilize NTA results based on the strength of evidence supporting compound identifications.
Successful implementation of reference material verification requires careful consideration of several practical factors. The selection of appropriate CRMs should reflect the chemical diversity expected in study samples and the specific research objectives [27]. For emerging contaminants where CRMs may not be commercially available, alternative approaches include synthesizing reference compounds or obtaining well-characterized materials from research collections [27]. Quality control measures, such as batch-specific QC samples and internal standards, should be incorporated throughout the analytical process to monitor instrument performance and ensure data quality [27] [20].
The implementation of confidence-level assignments requires transparent reporting of the evidence supporting each identification [20]. Schymanski et al.'s confidence level framework is widely adopted in NTA studies, with Level 1 representing confirmed structure via reference standard, Level 2 indicating probable structure through library spectrum match, Level 3 suggesting tentative candidate, Level 4 providing unequivocal molecular formula, and Level 5 representing exact mass of interest [27]. Clear communication of these confidence levels in research findings enables appropriate interpretation of results by stakeholders with different informational needs [20].
External dataset testing serves as the second validation tier, focusing on assessing the generalizability and robustness of models developed through NTA [27]. This process involves validating classifiers and statistical models on independent external datasets that were not used during model development [27]. The experimental protocol requires partitioning data into training and completely separate testing sets, with the latter representing different geographical areas, temporal periods, or sample matrices than the original training data [27].
Cross-validation techniques represent a complementary approach to external dataset testing [27]. Methods such as k-fold cross-validation (e.g., 10-fold) involve partitioning the original dataset into k subsets, using k-1 subsets for training, and the remaining subset for testing, with the process repeated k times [27]. This approach provides robust assessment of model performance while maximizing data utility. For NTA applications involving sample classification, performance metrics including balanced accuracy, precision, recall, and F1-score should be calculated for both internal cross-validation and external testing to comprehensively evaluate model efficacy [27] [5].
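The k-fold scheme described above can be sketched in pure Python. The round-robin fold assignment below is one simple strategy; for imbalanced sample classes, stratified splitting would be preferable.

```python
# Minimal k-fold partitioning sketch: each of the k folds serves once as
# the held-out test set while the remaining k-1 folds form the training set.

def k_fold_splits(n_samples, k):
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]   # round-robin fold assignment
    for i, test in enumerate(folds):
        train = [idx for j, f in enumerate(folds) if j != i for idx in f]
        yield train, test

# 10 samples, 5 folds: every sample is held out exactly once.
tested = []
for train, test in k_fold_splits(10, 5):
    assert len(train) + len(test) == 10
    tested.extend(test)
print(sorted(tested))  # prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

In an NTA classification study, the per-fold performance metrics would then be averaged (and their variance reported) alongside the external-dataset results.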
A primary objective of external dataset testing is evaluating model overfitting and transferability [27]. Overfitting occurs when models perform well on training data but poorly on new, unseen data, indicating limited generalizability [27]. The comparison of performance metrics between internal validation (e.g., cross-validation) and external testing provides critical insights into overfitting risks. Significant performance degradation with external datasets suggests overfitting and limited practical utility of the model [27].
Transferability assessment extends beyond simple performance metrics to evaluate model applicability across different environmental contexts [27]. This process involves testing models on datasets representing varying conditions, such as different sample matrices (e.g., water, sediment, biological tissues), seasonal variations, or geographical diversity [27]. Successful transferability demonstrates model robustness and enhances confidence in its application for environmental decision-making across diverse scenarios. The Benchmarking and Publications for Non-Targeted Analysis Working Group (BP4NTA) has emphasized the importance of transferability assessment for advancing the acceptability of NTA methods in regulatory contexts [20].
Table 2: Performance Metrics for External Dataset Validation in NTA
| Metric | Calculation | Interpretation | Optimal Range |
|---|---|---|---|
| Balanced Accuracy | (Sensitivity + Specificity)/2 | Overall classification performance accounting for class imbalance | >80% |
| Precision | True Positives / (True Positives + False Positives) | Proportion of correct positive identifications | Context-dependent |
| Recall (Sensitivity) | True Positives / (True Positives + False Negatives) | Ability to identify all relevant cases | Context-dependent |
| F1-Score | 2 × (Precision × Recall) / (Precision + Recall) | Harmonic mean of precision and recall | >0.7 |
| Cross-Validation Consistency | Performance stability across validation folds | Model robustness | Low variance |
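The formulas in Table 2 can be verified with a short computation from confusion-matrix counts; the counts below are illustrative.

```python
# The metrics of Table 2, computed from confusion-matrix counts
# (tp/fp/fn/tn values are illustrative).

def classification_metrics(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn)                  # recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"balanced_accuracy": balanced_accuracy,
            "precision": precision,
            "recall": sensitivity,
            "f1": f1}

m = classification_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts, balanced accuracy is 0.850 and F1 is about 0.842, both above the optimal-range guidelines listed in Table 2.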
Environmental plausibility assessment constitutes the third validation tier, focusing on correlating model predictions with real-world contextual data [27]. This process involves integrating multiple lines of evidence to evaluate whether NTA findings align with known environmental patterns and processes [27]. Geospatial analysis represents a key methodological approach, examining the spatial distribution of detected compounds in relation to potential contamination sources, such as industrial facilities, agricultural areas, or wastewater treatment plants [27]. This analysis can reveal spatial gradients that support hypothesized source-receptor relationships.
The identification and evaluation of source-specific chemical markers provides another critical approach for environmental plausibility assessment [27]. This protocol involves identifying compounds with known associations to specific contamination sources (e.g., specific PFAS compounds associated with fire-fighting foams, pharmaceuticals indicative of wastewater influence) and evaluating whether their detection patterns align with expected source contributions [27]. The consistent co-detection of multiple markers from the same source type strengthens the plausibility of source attribution. Additionally, examining temporal patterns, such as seasonal variations in detection frequencies or concentrations, can provide further support for environmental plausibility when aligned with known use patterns or environmental processes [27].
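Marker co-detection can be scored as the fraction of a source's known markers observed in a sample. The marker panels below are hypothetical examples for illustration, not authoritative source signatures.

```python
# Illustrative co-detection scoring for source attribution (hypothetical panels).

SOURCE_MARKERS = {
    "firefighting_foam": {"PFOS", "PFHxS", "6:2 FTS"},
    "wastewater": {"carbamazepine", "sucralose", "acesulfame"},
}

def source_scores(detected_compounds):
    """Fraction of each source's marker panel detected in the sample."""
    detected = set(detected_compounds)
    return {source: len(markers & detected) / len(markers)
            for source, markers in SOURCE_MARKERS.items()}

sample = ["PFOS", "PFHxS", "carbamazepine"]
print(source_scores(sample))
# firefighting_foam scores 2/3 vs. 1/3 for wastewater, so the co-detection
# evidence more strongly supports a fire-fighting foam influence.
```

In a full plausibility assessment, these scores would be weighed together with geospatial gradients and temporal patterns rather than interpreted in isolation.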
Effective environmental plausibility assessment requires systematic integration of diverse contextual data types [27]. Land use information, industrial activity records, hydrological data, and known contamination histories should be compiled and analyzed in relation to NTA findings [27]. This integration enables assessment of whether detected chemical patterns logically correspond to potential influences in the study area. For example, detections of agricultural pesticides would be more plausible in areas with documented agricultural activity, while industrial compounds would be more expected near manufacturing facilities.
The interpretation of environmental plausibility involves both confirmatory and exploratory elements [27]. Confirmatory evaluation assesses whether NTA results align with pre-existing knowledge and expectations, while exploratory analysis identifies novel patterns that may reveal previously unrecognized contamination sources or pathways [27]. This dual approach ensures that environmental plausibility assessment neither simply confirms biases nor indiscriminately accepts all findings without critical evaluation. The BP4NTA working group emphasizes the importance of transparently reporting both supporting and contradictory evidence when presenting environmental plausibility assessments [20].
Figure 2: Environmental plausibility assessment framework for NTA, showing the integration of geospatial, chemical marker, temporal, and contextual data to evaluate the real-world relevance of NTA findings.
Successful implementation of tiered validation in NTA requires a systematic, integrated workflow that coordinates activities across all three tiers [27]. The process should begin during study design, with planning for appropriate reference materials, external validation datasets, and contextual data collection [3]. Sample analysis should incorporate quality control materials that enable reference material verification, while data analysis protocols should explicitly include procedures for external validation and environmental plausibility assessment [27] [20].
The workflow should emphasize iterative evaluation across validation tiers [27]. Initial reference material verification establishes foundational confidence in compound identities, which then supports meaningful external dataset testing [27]. Results from both laboratory-based tiers subsequently inform environmental plausibility assessment, which may identify needs for additional reference material verification or model refinement [27]. This iterative approach ensures continuous refinement of NTA results and enhances overall confidence in findings. Documentation of all validation activities, including materials used, methodologies applied, and results obtained, is essential for transparent reporting and stakeholder acceptance [20].
Table 3: Essential Research Reagents and Materials for Tiered Validation in NTA
| Reagent/Material | Application | Function in Validation | Implementation Considerations |
|---|---|---|---|
| Certified Reference Materials (CRMs) | Tier 1: Reference Material Verification | Confirm compound identities through retention time and fragmentation spectrum matching [27] | Select CRMs representative of expected contaminant classes; include isotopically labeled analogs when possible |
| Quality Control Samples | All Tiers | Monitor instrument performance and data quality throughout analytical process [27] [20] | Include procedural blanks, solvent blanks, and matrix spikes in each analytical batch |
| Spectral Libraries | Tier 1: Reference Material Verification | Support compound identification through mass spectral comparison [27] [22] | Use curated, domain-specific libraries (e.g., for PFAS, pharmaceuticals, pesticides) |
| Independent Validation Datasets | Tier 2: External Dataset Testing | Assess model generalizability and robustness [27] | Secure datasets representing different temporal periods, geographical areas, or sample matrices |
| Contextual Data Resources | Tier 3: Environmental Plausibility Assessment | Correlate chemical patterns with potential sources and environmental factors [27] | Compile land use records, industrial facility data, hydrological information, and known contamination history |
| Internal Standards | Tier 1: Reference Material Verification | Monitor analytical performance and correct for matrix effects [27] | Select compounds not expected in samples but with similar chemical properties to analytes of interest |
The implementation of a systematic tiered validation framework addressing reference material verification, external dataset testing, and environmental plausibility assessment represents a critical advancement for non-targeted analysis using high-resolution mass spectrometry [27]. This comprehensive approach directly addresses key uncertainties in NTA results, enhancing their utility for environmental decision-making [27] [5]. By integrating these complementary validation strategies, researchers can bridge the gap between analytical capability and practical application, supporting more effective contamination source identification, risk assessment, and environmental management [27].
As NTA methodologies continue to evolve, further refinement of tiered validation approaches remains essential [27]. Promising directions include the development of more comprehensive reference material collections, standardized protocols for external validation, and advanced computational methods for environmental plausibility assessment [27] [20]. Community-wide adoption of systematic validation frameworks, as promoted by initiatives such as the BP4NTA working group, will accelerate the transition of NTA from a research tool to a reliable approach for supporting environmental protection and public health decisions [20]. Through continued refinement and implementation of tiered validation strategies, the environmental research community can fully realize the potential of NTA for addressing complex contamination challenges.
In the rapidly evolving field of non-targeted analysis (NTA) using high-resolution mass spectrometry (HRMS), benchmarking studies and interlaboratory comparisons have emerged as indispensable tools for advancing methodological rigor and data reliability. These initiatives address a fundamental challenge in NTA: the inherent uncertainty in identifying and quantifying unknown chemicals across different laboratories, instruments, and data processing workflows [14]. Unlike traditional targeted methods that benefit from well-established performance metrics and standardized protocols, NTA generates information-rich data where results are often ambiguous—if an analyst reports a chemical present, it may actually be absent (e.g., an isomer or incorrect identification), and if reported absent, it may actually be present [14]. This uncertainty has prevented broader adoption of NTA data in regulatory decision-making, creating an urgent need for community-wide efforts to establish performance assessment frameworks [14].
Interlaboratory studies (ILS) represent a strategic response to these challenges, enabling the environmental chemistry community to evaluate reproducibility, identify sources of variability, and agree on harmonized quality control procedures [76]. The growing recognition of this need is evidenced by initiatives such as the NTA collaborative trial organized by the Norman Network, which has promoted clear reporting strategies for confidence levels in identifying chemicals of emerging concern (CECs) in complex environmental samples [76]. Similarly, the Benchmarking and Publications for Non-Targeted Analysis (BP4NTA) working group was formed specifically to address challenges in NTA studies through community-driven subcommittees focused on education, study planning, PFAS analysis, and gas chromatography applications [77]. These collaborative frameworks are essential for transitioning NTA from a research tool to a reliable approach for chemical monitoring and risk assessment.
One of the most comprehensive benchmarking efforts documented in literature involved 21 participants from 11 countries organized through the Norman Network [76]. This study was strategically designed to assess uncertainty in identified compounds caused by different NTA workflows and spectral databases. The experimental design employed a passive sampling strategy of river water at a drinking water intake and post-treatment drinking water, enabling identification of substances present in source water, present in finished drinking water, removed during treatment, and generated during treatment processes [76].
The study architecture required all participants to analyze identical "ready for injection" samples consisting of two surface water and two drinking water passive sampler extracts using two distinct approaches: (1) a pre-defined method with detailed instructions on LC separation and MS data acquisition, and (2) individually developed in-house methods that reflected each laboratory's established protocols [76]. This dual approach enabled systematic assessment of method transfer, chromatography, data acquisition, and data processing impacts on the detectable chemical space. Participants provided raw data, converted files, raw feature lists, and their top 50 identified features with confidence levels based on the Schymanski scale—a tiered system for reporting identification confidence [76].
The BP4NTA working group has established a structured subcommittee system to address specific challenges in NTA method development and standardization [77]:
Table: BP4NTA Subcommittees and Their Primary Functions
| Subcommittee | Primary Focus | Key Activities |
|---|---|---|
| Educational Subcommittee | Knowledge dissemination | Developing upper-level college courses on NTA; maintaining literature references; creating educational materials |
| Study Planning Tool (SPT) | Method standardization | Developing tools for designing high-quality NTA studies; creating SOPs; quality assurance plans |
| PFAS Subcommittee | PFAS-specific challenges | Facilitating cross-sharing of PFAS spectral libraries; drafting quality control recommendations |
| GC NTA Subcommittee | Gas chromatography applications | Sharing resources and methods; providing direction on tool development for GC-based NTA |
These subcommittees address critical gaps in NTA standardization, particularly through developing guidance for transitioning between targeted and non-targeted analysis in both industrial and academic research settings [77]. The monthly meetings, forums, and collaborative publications facilitate ongoing community engagement and knowledge sharing essential for methodological advancement.
A critical foundation for benchmarking studies is the clear categorization of NTA research objectives. According to the performance assessment literature, most NTA projects fall into three distinct categories (sample classification, chemical identification, and chemical quantitation), each of which determines the appropriate evaluation metrics [14]:
For qualitative studies (sample classification and chemical identification), performance can be assessed using adaptations of the traditional confusion matrix, though with recognized challenges and limitations [14]. The Schymanski scale provides a standardized framework for reporting identification confidence, ranging from Level 1 (confirmed structure) to Level 5 (exact mass of interest only) [76]. For quantitative NTA studies, performance assessment can utilize estimation procedures developed for targeted methods, but must account for additional sources of uncontrolled experimental error [14].
The Norman Network ILS provided valuable quantitative data on reproducibility and confidence in chemical identification across multiple laboratories. While the study highlighted significant variability in identified features across different workflows, it also demonstrated the value of standardized reporting frameworks and raw data sharing for improving reproducibility [76].
Table: Performance Metrics for NTA Method Evaluation
| Performance Dimension | Assessment Approach | Challenges in NTA Context |
|---|---|---|
| Selectivity | Ability to differentiate chemical species from interferents | Difficult without reference standards; isomer distinction problematic |
| Sensitivity | Limit of detection (LOD) for chemical signals | Varies by compound; requires estimation without standards |
| Accuracy | Agreement between reported and true values | Challenging for identification without confirmed standards |
| Precision | Consistency across multiple measurements | Affected by instrumentation, data processing, and operator variability |
| Reproducibility | Consistency across different laboratories | Impacted by methodological differences; focus of ILS |
The integration of machine learning (ML) approaches shows particular promise for enhancing several aspects of NTA performance, including chemical structure identification, advanced quantification methods, and toxicity prediction capabilities [1]. However, challenges remain in refining ML tools for complex environmental mixtures and improving inter-laboratory validation [1].
The Norman Network study implemented a rigorous sample preparation protocol to ensure consistency across participating laboratories [76]. Horizon Atlantic HLB-L disks (47 mm diameter) were deployed for integrative sampling at both input and output of a drinking water treatment plant with exposure times of 2 and 4 days [76]. To increase sampling rates, passive samplers were placed in dynamic passive sampling devices (DPS) consisting of electrically driven large-volume water pumping systems coupled to exposure cells [76].
Sample processing followed a standardized protocol that generated samples with equivalent water volumes of 4.0-8.7 liters per vial, enabling trace-level detection of contaminants while minimizing matrix effects [76].
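As a rough consistency check, the equivalent extracted water volume of an integrative passive sampler is approximately the sampling rate multiplied by the exposure time (V_eq ≈ Rs × t). The sampling rate used below is an assumed figure for illustration only, not a value reported by the study.

```python
# Back-of-envelope passive sampling sketch: equivalent water volume for an
# integrative sampler. Rs = 2.0 L/day is an assumed, illustrative value.

def equivalent_volume(sampling_rate_l_per_day, exposure_days):
    """Approximate equivalent extracted water volume, V_eq = Rs * t."""
    return sampling_rate_l_per_day * exposure_days

# With an assumed Rs of 2.0 L/day, the 2- and 4-day deployments would
# correspond to roughly 4 and 8 L of water, consistent in magnitude with
# the 4.0-8.7 L per vial reported above.
print(equivalent_volume(2.0, 2))  # prints 4.0
print(equivalent_volume(2.0, 4))  # prints 8.0
```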
The interlaboratory study allowed assessment of both predefined methods and in-house protocols, revealing how analytical conditions influence the detectable chemical space [76]. The predefined method specified detailed conditions for liquid chromatography separation and mass spectrometric data acquisition, while participating laboratories used their own established in-house methods for comparison [76].
The 21 participating laboratories employed a range of HRMS systems and instrument configurations.
Among separation techniques, reversed-phase liquid chromatography (RPLC) was predominantly employed, potentially introducing selection bias against polar, highly polar, and ionic compounds [40]. This bias represents a significant methodological consideration for comprehensive chemical space assessment.
Figure: NTA benchmarking workflow.
Implementing robust NTA benchmarking studies requires specific instrumentation, analytical tools, and reference materials. The following table summarizes key components of the NTA research toolkit based on current methodologies and interlaboratory studies:
Table: Essential Research Toolkit for NTA Benchmarking Studies
| Tool Category | Specific Examples | Function in NTA Benchmarking |
|---|---|---|
| HRMS Instrumentation | Orbitrap, TOF, FT-ICR [11] | High-resolution mass measurement for accurate compound identification |
| Separation Techniques | RPLC, HILIC, GC [40] [19] | Compound separation prior to MS analysis; different techniques cover complementary chemical spaces |
| Passive Sampling Media | HLB disks, Silicone sheets [76] | Time-integrated sampling and preconcentration of contaminants from water |
| Internal Standards | Isotope-labeled compounds (Caffeine-13C3, Carbamazepine-D10) [76] | Retention time modeling and quality control |
| Performance Reference Compounds | 14 PRCs for silicone sheets [76] | Calibration of sampling rates and mass transfer coefficients |
| Data Processing Software | Vendor software (Compound Discoverer, MassHunter); Open-source platforms (MzMine, MS-DIAL) [19] | Feature detection, peak alignment, and compound identification |
| Spectral Libraries | MassBank EU, MoNA, NIST [76] [19] | Reference spectra for compound identification |
| Confidence Assessment | Schymanski scale [76] | Standardized framework for reporting identification confidence levels |
The selection of appropriate tools significantly influences the detectable chemical space, with studies showing that 51% of NTA investigations use only LC-HRMS, 32% use only GC-HRMS, and 16% employ both techniques to expand coverage of chemical properties [19]. This distribution highlights a critical methodological consideration, as exclusive use of RPLC-HRMS may systematically exclude certain classes of polar and ionic compounds [40].
The design and interpretation of NTA benchmarking studies must account for several analytical factors that significantly impact outcomes and reproducibility:
Extraction and Cleanup Bias: Sample preparation protocols, particularly solid-phase extraction (SPE) choices, strongly influence which chemical classes are recovered and detected. Different sorbents exhibit selective retention for compounds with specific physicochemical properties, potentially introducing systematic biases [40] [19].
Chromatographic Selectivity: The overrepresentation of reversed-phase liquid chromatography (RPLC) in NTA studies contributes to the systematic exclusion of highly polar and ionic compounds from analysis [40]. This limitation can be mitigated by incorporating alternative separation mechanisms such as hydrophilic interaction liquid chromatography (HILIC) or gas chromatography (GC).
Ionization Efficiency Variability: The choice of ionization techniques (e.g., ESI+, ESI-, APCI, EI) significantly impacts the detectable chemical space. Studies show 43% of LC-HRMS applications use both ESI+ and ESI-, while 18% use only ESI+, and 22% use only ESI- [19], creating substantial differences in coverage.
Data Processing Inconsistencies: The predominance of vendor-specific software (used in 57 studies) over open-source platforms (used in 7 studies) introduces challenges for method standardization and reproducibility [19]. Algorithmic differences in peak picking, alignment, and identification contribute to interlaboratory variability.
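The algorithmic differences in alignment mentioned above can be illustrated with a minimal tolerance-based feature matcher: two runs are paired feature-by-feature when their m/z values agree within a ppm window and their retention times within an RT window. The features and thresholds below are invented for illustration; production tools such as MZmine or MS-DIAL use considerably more elaborate scoring, which is one source of interlaboratory variability.

```python
# Sketch of tolerance-based feature alignment across two runs.
# Each feature is (m/z, RT in minutes); thresholds are illustrative.

def ppm_error(mz_obs, mz_ref):
    """Mass error in parts per million."""
    return abs(mz_obs - mz_ref) / mz_ref * 1e6

def align(features_a, features_b, ppm_tol=5.0, rt_tol=0.2):
    """Greedily pair features within ppm_tol (m/z) and rt_tol (RT)."""
    pairs, used = [], set()
    for i, (mz_a, rt_a) in enumerate(features_a):
        for j, (mz_b, rt_b) in enumerate(features_b):
            if j in used:
                continue
            if ppm_error(mz_a, mz_b) <= ppm_tol and abs(rt_a - rt_b) <= rt_tol:
                pairs.append((i, j))
                used.add(j)
                break
    return pairs

run1 = [(285.0789, 6.42), (195.0877, 3.10)]  # hypothetical feature lists
run2 = [(195.0879, 3.15), (285.0801, 6.40)]
print(align(run1, run2))  # → [(0, 1), (1, 0)]
```

Even in this toy version, the results depend directly on the chosen tolerances and on the greedy matching order, mirroring how parameter and algorithm choices in real software propagate into divergent feature tables.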
Despite considerable progress, significant methodological gaps remain in NTA benchmarking:
Spectral Library Limitations: Spectral databases for liquid chromatography remain far less comprehensive than established GC-MS libraries, leaving insufficient reference spectra for matching [40] [19]. This bottleneck partially explains why many detected features remain unidentified in NTA studies.
Quantitative Uncertainty: While qualitative NTA has advanced significantly, quantitative NTA (qNTA) still lacks standardized methods for addressing estimation uncertainty, particularly regarding experimental recovery effects [28].
Integration with Risk Assessment: Limited frameworks exist for incorporating NTA data into formal risk assessment paradigms, despite the potential for NTA to bridge contaminant discovery and risk characterization [28].
Effect-Directed Analysis Integration: Combining NTA with effect-directed analysis (EDA) shows promise for identifying toxicity drivers, with studies reporting that NTA explains a median of 34% of observed toxicity, compared with 8.86% for targeted analysis alone [40]. However, standardized approaches for this integration remain underdeveloped.
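The estimation-uncertainty gap in qNTA noted above can be made concrete with a simple sketch: when no authentic standard exists, a concentration is often inferred from surrogate response factors, and bootstrapping that surrogate set yields an interval rather than a false point estimate. The response factors and peak area below are invented for illustration; this is one possible approach, not a standardized qNTA protocol.

```python
# Sketch of expressing qNTA estimation uncertainty via a bootstrap over
# surrogate response factors. All numeric values are hypothetical.

import random

random.seed(0)

surrogate_rfs = [1.2e5, 0.8e5, 2.1e5, 1.5e5, 0.6e5]  # area per (ng/mL)
peak_area = 3.0e5                                     # unknown feature's area

def bootstrap_conc(area, rfs, n_boot=2000):
    """Bootstrap the median response factor; return a 95% concentration interval."""
    estimates = []
    for _ in range(n_boot):
        sample = sorted(random.choice(rfs) for _ in rfs)
        median_rf = sample[len(sample) // 2]
        estimates.append(area / median_rf)
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot)]

low, high = bootstrap_conc(peak_area, surrogate_rfs)
print(f"estimated concentration: {low:.2f}-{high:.2f} ng/mL")
```

The width of the resulting interval reflects the spread of surrogate ionization efficiencies; note that this sketch does not yet account for experimental recovery effects, which is exactly the unresolved component the literature highlights.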
NTA Performance Assessment Framework
Benchmarking studies and interlaboratory comparisons represent fundamental pillars in the advancement of reliable NTA methods for environmental monitoring and exposure assessment. The growing body of research demonstrates that while significant variability exists across laboratories and methodologies, structured collaborative efforts can progressively improve reproducibility and confidence in results [76]. The future trajectory of NTA benchmarking will likely focus on several critical areas:
First, the integration of machine learning and artificial intelligence holds substantial promise for enhancing chemical structure identification, quantification accuracy, and toxicity prediction capabilities [1]. ML approaches may help address current bottlenecks in data processing and interpretation, particularly for complex environmental mixtures. Second, the development of standardized performance assessment protocols through initiatives like BP4NTA will be essential for translating NTA from a research tool to a reliable approach for regulatory decision-making [14] [77]. This includes establishing agreed-upon metrics for sensitivity, selectivity, accuracy, and precision in the NTA context.
Finally, the environmental chemistry community must address the critical need for expanded spectral libraries, particularly for LC-HRMS applications, and multimethod approaches that combine complementary analytical techniques to overcome the limitations of any single method [40] [19]. As these advancements mature, benchmarking studies will continue to provide the essential foundation for assessing progress, identifying persistent challenges, and directing future research investments toward the shared goal of comprehensive chemical exposure assessment.
Non-targeted analysis with high-resolution mass spectrometry represents a paradigm shift in analytical science, enabling comprehensive characterization of complex chemical mixtures beyond predefined targets. Successful implementation requires meticulous attention throughout the entire workflow—from experimental design and data acquisition to advanced interpretation and validation. The integration of machine learning, development of standardized reporting frameworks, and advancement of quantitative approaches are rapidly addressing historical limitations. As these methodologies mature, NTA is poised to transform exposure science, biomarker discovery, and environmental monitoring by providing unprecedented insights into previously uncharacterized chemical spaces. Future directions should focus on enhancing quantitative rigor, improving interoperability across platforms, establishing standardized performance criteria, and expanding applications in clinical and public health decision-making contexts.