An organism’s genome is a collection of DNA instructions required for its growth, function, and reproduction. The genome of a modern organism contains information about its journey along an evolutionary path that begins with the “first universal common ancestor” of all life on Earth and ends with that organism.

An organism’s genome contains encoded information that can reveal its relationship to its ancestors and relatives.

Other dimensions of the genome

Our research explores the hypothesis that an organism’s genome may contain other types of information, beyond lineage or taxonomy. We asked: Could an organism’s genome contain information that would allow us to determine the type of environment in which the organism lives?

Unlikely as it may seem, our team of computer science and biology researchers at the University of Waterloo and Western University found that this is the case for organisms that survive and thrive in extreme conditions. These environmental conditions range from extreme heat (over 100°C) to extreme cold (below -12°C), intense radiation, or extremes of acidity or pressure.

DNA as a language

We viewed genomic DNA as text written in the “DNA language.” A DNA strand (or DNA sequence) consists of a sequence of basic units called nucleotides, strung together by a sugar-phosphate backbone. There are four different DNA units: adenine, cytosine, guanine and thymine (A, C, G, T).

Abstractly speaking, a DNA sequence can be thought of as a line of text, written with “letters” from the “DNA alphabet”. For example, “CAT” would be a three-letter “DNA word” corresponding to the three-unit DNA sequence cytosine-adenine-thymine.

In the 1990s, it was discovered that counting the occurrences of such DNA words in a short DNA sequence extracted from an organism’s genome can identify the species of the organism and its degree of relatedness to other organisms in the evolutionary “tree of life.”
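As a rough illustration of what counting DNA words means, the sketch below slides a window of length k along a sequence and tallies each word it sees. The function name `count_dna_words`, the choice of k = 3 and the example fragment are assumptions made for illustration, not details taken from the research described here.

```python
from collections import Counter


def count_dna_words(sequence: str, k: int = 3) -> Counter:
    """Count every overlapping k-letter DNA word in a sequence."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        # Skip windows that contain anything other than A, C, G or T
        # (for example the ambiguity code N).
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


# Hypothetical ten-letter fragment, chosen only for illustration.
fragment = "CATCATGCAT"
counts = count_dna_words(fragment, k=3)
print(counts.most_common(3))
```

In this made-up fragment the word “CAT” is counted three times, and every other three-letter word that appears is counted once.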

A schematic tree of life with the primary domains, archaea and bacteria, shown in purple and blue, respectively, and the secondary domain, eukaryotes, in green. Credit: Tara Mahendraraja, CC BY

This process of identifying or classifying an organism based on DNA word counts is similar to the process that allows us to distinguish an English book from a French book: given a page from each book, one can see that the English text has many occurrences of the three-letter word “the” while the French text has many occurrences of the three-letter word “les”.

Note that the word frequency profile of each book does not depend on the specific page we chose to read, or on whether we considered a single page, multiple pages, or an entire chapter. Similarly, the frequency profile of DNA words in a genome does not depend on the location and length of the DNA sequence that was chosen to represent that genome.
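The idea that a frequency profile stays roughly the same wherever the fragment is taken can be tried out on toy data. The sketch below is an assumption-laden illustration: `toy_genome` is a hypothetical stand-in that draws random letters with a fixed bias, which is far simpler than a real genome, and the Manhattan distance is just one simple way to compare two profiles.

```python
import random
from collections import Counter
from itertools import product


def word_profile(sequence, k=3):
    """Return relative frequencies of all 4**k DNA words, in a fixed order."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[w] / total for w in words]


def profile_distance(p, q):
    """Manhattan distance between two frequency profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def toy_genome(weights, length, seed):
    """Hypothetical stand-in for a genome: random letters with a fixed bias."""
    rng = random.Random(seed)
    return "".join(rng.choices("ACGT", weights=weights, k=length))


# Two made-up "genomes": one rich in A and T, one rich in C and G.
genome_a = toy_genome([0.35, 0.15, 0.15, 0.35], 20000, seed=1)
genome_b = toy_genome([0.15, 0.35, 0.35, 0.15], 20000, seed=2)

# Fragments of different lengths taken from different places.
a_start = word_profile(genome_a[:5000])
a_end = word_profile(genome_a[-8000:])
b_start = word_profile(genome_b[:5000])

same_genome = profile_distance(a_start, a_end)
different_genomes = profile_distance(a_start, b_start)
print(same_genome, different_genomes)
```

With these toy inputs, two fragments of the same made-up genome give profiles that are much closer to each other than to a fragment of the other one, which is the property that lets a profile act as a signature.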

That DNA word frequency profiles can serve as a “genomic signature” of an organism was a major discovery and, until now, it was thought that the DNA word frequency profile of a genome contained only evolutionary information about the species, genus, family, order, class, phylum, kingdom or domain to which the organism belonged.

Our team set out to ask whether a genome’s DNA word frequency profile might reveal other kinds of information—for example, information about the extreme environment in which a microbial extremophile thrives.

Imprints of the environment in extremophile DNA

We used a dataset of 700 microbial extremophiles living in extreme temperatures (either extreme heat or cold) or extreme pH conditions (strongly acidic or alkaline). We used both supervised machine learning and unsupervised machine learning computational approaches to test our hypothesis.

In both types of environmental conditions, we discovered that we can clearly detect an environmental signal that indicates the type of extreme environment a particular organism inhabits.

In the case of unsupervised machine learning, a “blind” algorithm was given a data set of extremophile DNA sequences (and no other information about their taxonomy or their habitat). The algorithm was then asked to group these DNA sequences into clusters, based on whatever similarities could be found in their DNA word frequency profiles.
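To make the idea of a “blind” grouping concrete, here is a minimal sketch that uses k-means as a simple stand-in. It is not the set of algorithms used in the study, and the six four-number profiles are invented for illustration. The algorithm sees only the numbers, with no labels for taxonomy or habitat.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Tiny k-means with a deterministic farthest-point start."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point farthest from every centroid chosen so far.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                # Move the centroid to the mean of its members.
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment against the last centroids.
    return [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]


# Invented word frequency profiles for six unlabeled sequences.
profiles = [
    [0.40, 0.10, 0.10, 0.40],
    [0.38, 0.12, 0.11, 0.39],
    [0.41, 0.09, 0.12, 0.38],
    [0.15, 0.35, 0.36, 0.14],
    [0.14, 0.37, 0.34, 0.15],
    [0.16, 0.33, 0.35, 0.16],
]
labels = kmeans(profiles, k=2)
print(labels)
```

The cluster numbers themselves carry no meaning. What matters is which sequences end up sharing a number, and only afterwards can one check whether those groups line up with taxonomy, habitat or something else.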

The expectation was that all clusters discovered in this way would be along taxonomic lines: bacteria group with bacteria, and archaea group with archaea. To our surprise, this was not always the case, and some archaea and bacteria were consistently grouped together, regardless of the algorithm we used.

The only apparent commonality that could explain their matching by multiple machine learning algorithms was that they were extreme thermophiles.

Shocking revelation

The tree of life, used in biology to represent genetic relationships among species, has three major branches, called domains: Bacteria, Archaea and Eukarya.

Eukaryotes are organisms that have a membrane-bound nucleus, and this domain includes animals, plants, fungi, and unicellular microscopic protists. In contrast, bacteria and archaea are single-celled organisms that do not have a membrane-bound nucleus containing the genome. What distinguishes bacteria from archaea is the structure of their cell walls.

The three domains of life are dramatically different from each other, and genetically, a bacterium is as different from an archaeon as a polar bear (Eukarya) is from E. coli (Bacteria).

The expectation was therefore that the genome of a bacterium and that of an archaeon would be as far apart as possible in any clustering, by any measure of genomic similarity. Our finding that some bacteria and archaea cluster together, apparently simply because they are both adapted to extreme heat, means that the extreme temperature environment they live in causes widespread, genome-wide, systemic changes in their genomic language.

This discovery is like finding a whole new dimension of the genome: an environmental one that exists in addition to its well-known taxonomic dimension.

Genomic effects of other environments

Aside from being unexpected, the finding could have implications for our understanding of the evolution of life on Earth, as well as guide our thinking about what it would take to survive in outer space.

Indeed, our ongoing research is exploring the existence of environmental signals in the genomic signature of radiation-resistant extremophiles such as Deinococcus radiodurans, which can withstand radiation exposure, cold, dehydration, vacuum conditions and acid, and was shown to survive in outer space for three years.

Provided by The Conversation

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Reference: Extreme environments are coded in the genomes of organisms that live there, research shows (2024, February 24), retrieved February 24, 2024 from https://phys.org/news/2024-02-extreme-environments-coded-genomes.html

This document is subject to copyright. No part may be reproduced without written permission, except for any fair dealing for the purpose of private study or research. The content is provided for informational purposes only.