cff-version: 1.2.0
abstract: "

This dataset belongs to the PhD thesis of Céline Cleij titled "Building the genome of a minimal synthetic cell".

Specifically, the dataset belongs to Chapter 3 titled "Synthetic chromosome assembly in yeast for cell-free protein synthesis".


Authors: Céline Cleij, Pascale Daran-Lapujade, Christophe Danelon

Department of Bionanoscience, Kavli Institute of Nanoscience, Delft University of Technology;

Department of Biotechnology, Delft University of Technology;

Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA


Corresponding authors: Pascale Daran-Lapujade and Christophe Danelon


Contact information: p.a.s.daran-lapujade@tudelft.nl and danelon@insa-toulouse.fr



*** General introduction ***

This dataset contains data collected during experiments for Céline Cleij's PhD project. The data was collected between 2021 and 2025.



*** Methodological information ***

Raw Nanopore sequencing reads (fastq) were obtained in-house using MinION technology (Oxford Nanopore Technologies, Oxford, UK).

Consensus SynChr sequences (GenBank) were obtained by de novo assembly of the processed Nanopore sequencing reads using Flye or Canu. Where necessary, a consensus SynChr sequence was assembled in SnapGene using information from the Flye and Canu assemblies and the raw reads.

Designed SynChr sequences (GenBank) were prepared in SnapGene from the plasmid maps of the sequenced template plasmids and the designed primer sequences.

Tables S3.1-S3.17 were prepared in Excel.


All data processing and analysis steps are described in detail in the Methods section of the publication.



*** Organization of the dataset ***

The data is grouped into four files:


i) Zip file "Raw Nanopore sequencing reads"

Files are named after the yeast strain from which total DNA was extracted. For IMF51 and IMF54, raw reads of the second sequencing run are deposited.


ii) Zip file "Consensus SynChr sequences"

Files are named after the yeast strain from which total DNA was extracted and after the SynChr version (SynChr_control or SynChr_PURE) that was assembled in that strain.


iii) Zip file "Designed SynChr sequences"

Files are named after the SynChr version (SynChr_control, SynChr_PURE, SynChr_control_2mu, SynChr_PURE_2mu).


iv) Excel file "Tables S3.1-S3.17"

This Excel file contains supplementary tables S3.1 to S3.17, which list all strains, synthetic chromosomes, plasmids, primers and SHRs used in this study.

" authors: - family-names: Cleij given-names: Céline orcid: "https://orcid.org/0000-0001-6580-1106" - family-names: Daran-Lapujade given-names: Pascale orcid: "https://orcid.org/0000-0002-4097-7831" - family-names: Danelon given-names: Christophe title: "Data underlying chapter 3 of PhD thesis: Building the genome of a minimal synthetic cell" keywords: version: 2 identifiers: - type: doi value: 10.4121/feb7423b-8194-4d99-89d8-593023e06473.v2 license: CC BY-NC 4.0 date-released: 2025-08-26