The Pathologist

The Missing 6 Percent: Why Arab DNA Is Changing the Future of Genomics

The discovery of millions of Arab-specific variants shows why inclusive genomics is essential for equity in diagnosis and precision medicine

By Jessica Allerton | 10/09/2025 | Discussion | 4 min read

Although Arabs make up nearly 6 percent of the world’s population, they are largely absent from genomic research datasets. This underrepresentation limits understanding of disease patterns in Arab populations and hinders effective prevention, diagnosis, and treatment. A study published in Nature on the assembly of the first Arab human pangenome marks a major step toward more inclusive genomic research.

Here, we speak with Neil Ward, VP and General Manager EMEA at PacBio, about the implications of this research.

What inspired this study?

Arab populations have a unique genetic heritage shaped by longstanding traditions, including frequent first- and second-cousin marriages. This consanguinity has contributed to a high prevalence of rare diseases in the region. In response, many Arab countries have launched large-scale genome projects – such as the Emirati Genome Project, which has sequenced nearly 900,000 people using short-read sequencing. However, the lack of high-quality Arab reference genomes limits how fully these data can be interpreted.

Building any human genome is complex. Until 2022, the global reference genome (GRCh38) was only 92 percent complete, missing difficult-to-sequence “dark” regions. The Telomere-to-Telomere (T2T) Consortium filled in the missing 8 percent using advanced long-read sequencing, which better captures complex and repetitive DNA.

This breakthrough raised a new question: if long-read sequencing could complete the reference genome, why not use it to explore genetic diversity in populations outside Europe? That idea became the driving force behind efforts to build the first Arab human pangenome.

Why build pangenomes for historically underrepresented populations?

A reference genome is a baseline DNA sequence used to compare individual genomes, helping identify disease-linked variants, predict drug responses, and develop targeted therapies. The most widely used human reference is based on a single Northern European individual – limiting accuracy for non-European populations.

A pangenome is more inclusive, combining genomes from many individuals to capture both shared and population-specific variants. This broader view improves variant interpretation, supports equitable access to precision medicine, and helps ensure research reflects global diversity.
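The idea of shared versus population-specific variants can be sketched with plain sets. This is only an illustration under simplifying assumptions: real pangenomes are typically built as sequence graphs rather than variant lists, and the function name `summarize_pangenome`, the sample names, and the variant labels below are all hypothetical.

```python
from collections import Counter


def summarize_pangenome(genomes):
    """Split variants into shared (in every genome) and private (in exactly one)."""
    # Count in how many genomes each variant appears.
    counts = Counter(v for variants in genomes.values() for v in set(variants))
    n = len(genomes)
    shared = {v for v, c in counts.items() if c == n}
    private = {v for v, c in counts.items() if c == 1}
    return shared, private


# Hypothetical toy data: variant labels are made up for illustration.
toy_genomes = {
    "sample_A": {"chr1:100:A>G", "chr2:500:C>T", "chr3:42:insAT"},
    "sample_B": {"chr1:100:A>G", "chr2:500:C>T"},
    "sample_C": {"chr1:100:A>G", "chr7:9:delG"},
}

shared, private = summarize_pangenome(toy_genomes)
all_variants = set().union(*toy_genomes.values())  # everything the collection captures
```

In this toy set, one variant is shared by all three samples and two are private to a single sample. A reference built from only one of the samples would miss whatever the others carry, which is the gap a pangenome is meant to close.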

What were the main outcomes of the study?

The study built a high-quality Arab pangenome using samples from 53 individuals across eight countries. By combining multiple long-read sequencing technologies, the researchers uncovered over 111 million base pairs of DNA missing from standard human reference genomes.

Long-read sequencing was key because it captures long, complex, and repetitive regions of DNA that short-read methods often miss.
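A toy example can show why read length matters in repetitive DNA. The sketch below assumes exact string matching and a made-up reference sequence; real aligners are far more sophisticated, and `count_placements` is a hypothetical helper, not part of any sequencing toolkit.

```python
def count_placements(reference: str, read: str) -> int:
    """Count how many positions in the reference match the read exactly."""
    return sum(
        1
        for i in range(len(reference) - len(read) + 1)
        if reference[i:i + len(read)] == read
    )


# Hypothetical toy reference: the unit "ACGT" repeated five times,
# flanked by unique sequence on both sides.
reference = "TTGCA" + "ACGT" * 5 + "GGATC"

short_read = "ACGTACGT"                  # lies entirely inside the repeat
long_read = "GCA" + "ACGT" * 5 + "GGA"   # spans the repeat plus both flanks

short_hits = count_placements(reference, short_read)
long_hits = count_placements(reference, long_read)
```

Here the short read fits at four different positions inside the repeat, so its true origin is ambiguous, while the long read reaches into the unique flanks and fits in only one place.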

Key findings included:

  • 8.94 million small variants and 235,000 structural variants unique to Arab individuals.

  • 883 duplicated genes, including TAF11L5, which was present in all participants; some of these duplications may be linked to recessive disease.

  • More than 1,400 new mitochondrial base pairs, improving maternal lineage tracking in Arab populations.

Why is the discovery of the structural variants important?

Structural variants – insertions, deletions, and duplications – are hard to detect with short-read sequencing but can significantly impact gene function. Identifying 235,000 Arab-specific structural variants fills major gaps in reference genomes and helps distinguish harmless changes from harmful ones. This is crucial for understanding recessive diseases and improving diagnostics for Arab patients.

Can you elaborate on the 15 percent of duplicated genes linked with recessive diseases?

A duplicated gene is a gene that appears more than once in the genome, often due to structural variants. Think of it like a jigsaw puzzle where some pieces are repeated or misplaced – making it harder to see the full picture. In the genome, these duplications can confuse variant interpretation and sometimes disrupt gene function.

In this study, 15 percent of duplicated genes in the Arab pangenome were linked to autosomal recessive conditions. Identifying these duplications can improve carrier screening, reduce diagnostic uncertainty, and enable more accurate genetic testing in populations where such patterns are common.

How should labs handle new DNA sequences when interpreting variants?

Existing variant databases and reference genomes may not include these newly discovered regions. As Arab pangenome resources become publicly available, it will be essential for clinical and research labs to integrate them into their analysis pipelines. Doing so will improve variant detection, classification, and clinical interpretation for individuals of Arab ancestry.

Without these updates, population-specific variants may be overlooked or misclassified as uncertain – delaying diagnosis or producing inaccurate results. Incorporating these sequences into reference datasets and tools is a critical step toward equitable and reliable genomic analysis.
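As a rough sketch of what integrating a population resource could look like, the lookup below checks a variant against a global database first and then against an optional population-specific set. Everything here is an assumption for illustration: `lookup_variant`, the database contents, and the labels are hypothetical, and a real pipeline would rely on established annotation tools and clinical classification guidelines.

```python
def lookup_variant(variant, global_db, population_db=None):
    """Return a coarse label for where a variant has been seen before."""
    if variant in global_db:
        return "seen in global database"
    if population_db is not None and variant in population_db:
        return "seen in population resource"
    return "not previously observed"


# Hypothetical toy data: identifiers are made up for illustration.
global_db = {"chr1:100:A>G"}
arab_pangenome_db = {"chr5:2500:dupGENE"}
patient_variant = "chr5:2500:dupGENE"

without_update = lookup_variant(patient_variant, global_db)
with_update = lookup_variant(patient_variant, global_db, arab_pangenome_db)
```

With only the global database, the patient's variant comes back as not previously observed; once the population resource is added, the same variant is recognized. That difference is the practical reason for updating pipelines as these resources are released.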

What is the possible impact of the new mitochondrial DNA sequences that were found?

The study identified 1,419 new base pairs of mitochondrial DNA (mtDNA) not present in the standard Cambridge reference sequence. While smaller in scale than nuclear genome discoveries, this finding is significant. mtDNA is maternally inherited, essential for energy production, and widely used in population genetics and ancestry studies. These new sequences improve maternal lineage tracing in underrepresented populations and enhance the diagnosis of mitochondrial disorders, helping clarify how these diseases appear across different ancestral backgrounds.

Could the UAE Pangenome Reference (UPR) be used for genetic research beyond the UAE?

Yes – the UPR offers an important resource for studying genetically related populations across the Middle East and North Africa. The study included 53 individuals from eight countries – the UAE, Saudi Arabia, Kuwait, Oman, Iraq, Jordan, Lebanon, and Egypt – capturing a broad range of Arab ancestries from both the Gulf and North Africa. While country-specific genomic projects are still needed, the UPR greatly enhances variant detection and interpretation in a region long underrepresented in global reference datasets.

What are the overall implications for human health of building pangenomes?

Pangenomes have the potential to transform medicine by ensuring genomic research and clinical tools reflect the full diversity of human populations. By capturing both shared and population-specific variants, pangenomes improve the accuracy of variant detection, reduce diagnostic uncertainty, and reveal risk factors often missed in standard reference genomes. This leads to more equitable healthcare across ancestries.

Beyond diagnostics, pangenomes support inclusive drug discovery, more precise genetic counseling, and the development of population-appropriate screening programs. Ultimately, they move us closer to a global standard of genomic medicine that is accurate, representative, and fair.

About the Author(s)

Jessica Allerton

Deputy Editor, The Pathologist

Copyright © 2025 Texere Publishing Limited (trading as Conexiant), with registered number 08113419 whose registered office is at Booths No. 1, Booths Park, Chelford Road, Knutsford, England, WA16 8GS.