Abstract
Nanopores are versatile single-molecule sensors that are being used to sense increasingly complex mixtures of structured molecules, with applications in molecular data storage and disease biomarker detection. However, increased molecular complexity presents additional challenges to the analysis of nanopore data, including more translocation events being rejected for not matching an expected signal structure and a greater risk of selection bias entering this event curation process. To highlight these challenges, here we present the analysis of a model molecular system consisting of a nanostructured DNA molecule attached to a linear DNA carrier. We make use of recent advances in the event segmentation capabilities of Nanolyzer, a graphical analysis tool provided for nanopore event fitting, and describe approaches to event substructure analysis. In the process, we identify and discuss important sources of selection bias that emerge in the analysis of this molecular system and consider the complicating effects of molecular conformation and variable experimental conditions (e.g., pore diameter). We then present additional refinements to existing analysis techniques, allowing for improved separation of multiplexed samples, fewer translocation events rejected as false negatives, and a wider range of experimental conditions for which accurate molecular information can be extracted. Increasing the coverage of analyzed events within nanopore data is not only important for characterizing complex molecular samples with high fidelity, but is also becoming essential to the generation of accurate, unbiased training data as machine learning approaches to data analysis and event identification continue to increase in prevalence.
Supplementary materials
Title
Supporting Information
Description
Section S1: Configuration Details of Nanolyzer Analysis
Section S2: List of Metadata Used in Nanolyzer Data Manager (“Data Dictionary”)
Section S3: Selection Filters Used – List of SQL Queries
Section S4: Removing Type 0 Events before Sublevel Clustering
Section S5: First Ten Events from Each Category of Table 1 / Figure 4
Section S6: Comparing ECD Distributions of Type 0 vs. Type 1/2 Events
Section S7: 1D Histograms of ECD/TrECD for Events of Figure 6
Section S8: Control – Two Carrier + Star Pairs Run Separately and Together on a Single Pore
Section S9: Sample Events of 2 kbp + 6/12-arm stars through Smaller (8-nm) Pore
Section S10: Filters A & B Applied to 2 kbp + 6/12-Arm Stars through Larger (12-nm) Pore
Section S11: Rigid Sorting Approach on Events from 2 kbp + 6/12-Arm Stars in Figure 7