Data underlying chapter 4 of PhD thesis: Building the genome of a minimal synthetic cell

DOI:10.4121/ad21c652-ad75-4a99-a09a-46c7d8f383d6.v1
The DOI displayed above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
DOI: 10.4121/ad21c652-ad75-4a99-a09a-46c7d8f383d6

Datacite citation style

Cleij, Céline; Daran-Lapujade, Pascale; Danelon, Christophe (2025): Data underlying chapter 4 of PhD thesis: Building the genome of a minimal synthetic cell. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/ad21c652-ad75-4a99-a09a-46c7d8f383d6.v1
Dataset

This dataset belongs to the PhD thesis of Céline Cleij titled "Building the genome of a minimal synthetic cell".

Specifically, the dataset belongs to Chapter 4 titled "De novo design and assembly of minimal genomes for the synthetic cell".


Authors: Céline Cleij, Pascale Daran-Lapujade, Christophe Danelon

Corresponding authors: Pascale Daran-Lapujade and Christophe Danelon

Contact information: [email protected] and [email protected]


This dataset contains data collected during experiments carried out as part of Céline Cleij's PhD project. The data was collected between 2023 and 2025.


All data processing and analysis steps are described in detail in the Methods section of thesis chapter 4.

Designed SynMG sequences (GenBank) were prepared with the SnapGene software, using the plasmid maps of the sequenced template plasmids and the designed primer sequences.

Raw Nanopore sequencing reads (fastq) were generated by Plasmidsaurus (Eugene, OR, USA).

Consensus SynMG sequences (GenBank) were obtained by Plasmidsaurus after processing of the raw reads, and were manually annotated in SnapGene.

The overview of relevant mutations ("Relevant mutations in SynMG variants") was prepared in Excel, based on mutations in consensus sequences and raw reads obtained from sequencing.

LC-MS data was obtained after processing in the Mascot software.
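
For a quick look at the raw Nanopore reads before opening them in dedicated tools, a read-length summary can be computed with the Python standard library alone. The sketch below is illustrative and is not part of the analysis described in the thesis. It assumes standard FASTQ with four lines per record. The function names and the sample reads are hypothetical, and the assumption that files may be gzip-compressed is ours, not stated in the dataset.

```python
import gzip
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            break  # end of file (or a trailing blank line)
        sequence = handle.readline().rstrip("\r\n")
        handle.readline()  # separator line starting with '+'
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        yield header[1:], sequence, quality


def summarize_reads(handle):
    """Return read count, total bases, mean length and longest read length."""
    lengths = [len(sequence) for _, sequence, _ in read_fastq(handle)]
    if not lengths:
        return {"reads": 0, "total_bases": 0, "mean_length": 0.0, "longest": 0}
    return {
        "reads": len(lengths),
        "total_bases": sum(lengths),
        "mean_length": sum(lengths) / len(lengths),
        "longest": max(lengths),
    }


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode (assumption: .gz suffix)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


# In-memory example with two made-up reads; real files would be opened
# with open_fastq("<path to a fastq file from this dataset>").
example = io.StringIO("@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nACGT\n+\nIIII\n")
summary = summarize_reads(example)
print(summary)
```

Because Nanopore reads of an assembled construct vary widely in length, the longest read and the mean length give a first indication of whether full-length molecules were sequenced. Any conclusion about the SynMG variants should still rest on the consensus sequences and the mutation overview provided in the dataset.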



The data is grouped into seven files:

i) Designed SynMG sequences. Files are named after the SynMG version (SynMG1 or SynMG2).

ii) S. cerevisiae - Raw Nanopore sequencing reads. Files are named after the yeast strain from which total DNA was extracted, and after the SynMG variant which was assembled in this strain.

iii) S. cerevisiae - Consensus SynMG sequences. Files are named after the yeast strain from which total DNA was extracted, and after the SynMG variant which was assembled in this strain.

iv) E. coli - Raw Nanopore sequencing reads. Files are named after the E. coli strain from which SynChr DNA was extracted, and after the SynMG variant which was amplified in this strain.

v) E. coli - Consensus SynMG sequences. Files are named after the E. coli strain from which SynChr DNA was extracted, and after the SynMG variant which was amplified in this strain.

vi) Relevant mutations in SynMG variants. This Excel file contains an overview of all relevant mutations in SynMG1.1, SynMG1.2, SynMG1.3, SynMG2.1 and SynMG2.2 compared to the designed maps.

vii) LC-MS data. This Excel file contains the LC-MS data used to make Figures 4A, 4B and 4D.


History

  • 2025-03-27 first online, published, posted

Publisher

4TU.ResearchData

Format

Raw Nanopore sequencing reads/fastq; Consensus SynMG sequences/GenBank; Designed SynMG sequences/GenBank; Relevant mutations in SynMG variants/Excel; LC-MS data/Excel

Funding

  • ANR grant (grant code ANR-22-CPJ2-0091-01) Agence Nationale de la Recherche
  • BaSyC – Building a Synthetic Cell Gravitation grant (grant code 024.003.019) NWO

Organizations

Department of Bionanoscience, Kavli Institute of Nanoscience, Delft University of Technology;
Faculty of Applied Sciences, Department of Biotechnology, Delft University of Technology;
Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA

DATA

Files (8)