Genomics

From the very beginning, we have been involved in genomic projects of various kinds. These projects span different areas of research, including medicine, molecular anthropology, and forensic genetics.

Medical genetics / genomics

Infectious diseases and vaccinomics

Today, this is the main area of research of the group. In particular, we are working on a variety of projects related to infectious diseases and vaccinomics. We have already analyzed genome data on Respiratory Syncytial Virus (RSV), rotavirus, Neisseria meningitidis, etc. We have recently received national and European funding for research in infectious diseases, and we are integrated in large consortia. We use a 'whole-omic' approach (genomics, transcriptomics, epigenomics, etc.) based on next-generation sequencing (NGS; WGS, WES, RNA-seq), aimed at understanding the role of the host in infectious diseases.

Complex and multifactorial diseases

In addition, we have carried out research on many other complex and multifactorial diseases, including Alzheimer's disease, Parkinson's disease, schizophrenia, breast cancer, etc., using approaches such as genome-wide association studies.

Molecular / genome anthropology

Our contribution to the field of molecular anthropology has been based on the analysis of uniparental markers (mtDNA and Y-chromosome) and whole-genome studies using genome-wide SNP genotyping or genome sequencing (e.g. WGS, WES). We have investigated the genome variation of human populations from many continental regions and countries.

Some of the projects have aimed at understanding variation at a continental scale, with Africa and America as the main focus of these studies. We have also dedicated considerable effort to understanding the transatlantic slave trade, with several publications dating back to the early 2000s.

We have also targeted specific populations located on different continents, including: a) Europe (Iberia, Italy, etc.); b) America (El Salvador, Guatemala, the Caribbean [Cuba], Colombia, Venezuela, Brazil, Bolivia, Chile, Argentina, etc.); and c) Asia (Vietnam, China, etc.).

Photo: Aconcagua mummy. Taken from our publication Gómez-Carballa et al. in Sci Rep.

We have also targeted specific ethnic groups, including Yao, Tongas, Shangaan, Chopi, Chwavo, Lomwe, Makonde, Makhwa, Ndau, Nguni, Nyungwe, Nyanja, Ronga, Shona, Sena, Tswa, diaguitas, koya, etc.

A few of these studies focused on population groups with particular cultural or social characteristics. Among them, we would like to highlight those on the Roma, Mennonites (from Argentina), 'Afro-Bolivian' Yungas (Bolivia), etc.

Finally, we were involved in studies of ancient DNA, both carried out at a continental scale (America and Europe), and a few targeting specific remains (Aconcagua mummy, Ötzi, etc).

Forensic genetics / genomics

A few of our initial projects were focused on technical issues related to forensic genetics (minisatellite variant repeat [MVR] variation, mitochondrial DNA, short tandem repeats [STRs], etc.).

We were directly involved in a few interesting inter-laboratory and collaborative studies related to the GHEP-ISFG (the Spanish and Portuguese-speaking working group of the International Society for Forensic Genetics) or other ad hoc collaborative groups of researchers interested in mtDNA or Y-chromosome variation.

We were very involved in the study of mtDNA variation in a forensic context. We paid special attention to applications of phylogeny to the detection of errors in databases.

More recently, we have focused on Y-chromosome variation, in both SNPs and STRs, with particular attention to variation in a few European and American human populations.

Most of our activity in this field of research has been linked to the INCIFOR (Instituto de Ciencias Forenses - Universidade de Santiago de Compostela), and related to the most important forensic genetics institution, the (GHEP-)ISFG.


Together with genomics, transcriptomics is one of the most important activities in the group nowadays. We generate our own data using RNA-seq (NGS) techniques and the NanoString nCounter platform. We also carry out a lot of work exploring datasets that are available in public data repositories, such as GEO. Below are a few lines of research in transcriptomics.

Signatures - virus vs. bacterial infection

The treatment of febrile patients is one of the difficulties that health professionals have to face every day. The distinction between bacterial and viral infections in clinical routine is unreliable, and current methods (e.g. cultures) are not fast enough. Countless patients worldwide are hospitalized, undergo invasive tests and receive antibiotics when, in fact, they suffer from a viral infection that would resolve by itself. According to the WHO, the overprescription of antibiotics is leading to the emergence of multi-resistant bacteria and is one of the main threats to global health. We are running different national (e.g. DIAVIR) and European (e.g. PERFORM) projects that use signatures of transcriptomic biomarkers to improve the diagnosis and treatment of febrile patients and to reduce the use of antibiotics in clinical practice.

Special effort is being made to develop a point-of-care test, based on the latest technology and non-invasive samples.

A few promising findings of the group:

  • Herberg et al. 2016, JAMA
  • Barral-Arca et al. 2018, Sci Rep
  • Gómez-Carballa et al. 2019, Sci Rep
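
As a rough illustration of how a small transcriptomic signature can be turned into a single score, the sketch below computes a simple "disease risk score": the summed log2 expression of transcripts assumed to be higher in viral infection minus that of transcripts assumed to be lower. The gene names (GENE_A, GENE_B), their direction of change, and the threshold are placeholders chosen for illustration; they are not taken from the publications listed above.

```python
import math

def disease_risk_score(expr, up_genes, down_genes):
    """Summed log2 expression of 'up' genes minus summed log2 expression of 'down' genes."""
    # A pseudocount of 1 avoids log2(0) for transcripts with zero counts.
    up = sum(math.log2(expr[g] + 1.0) for g in up_genes)
    down = sum(math.log2(expr[g] + 1.0) for g in down_genes)
    return up - down

def classify(score, threshold=0.0):
    """Label a score; the threshold is an assumption and would need calibration on real data."""
    return "viral-like" if score > threshold else "bacterial-like"

# Hypothetical two-transcript signature (placeholder gene names).
UP_IN_VIRAL = ["GENE_A"]
DOWN_IN_VIRAL = ["GENE_B"]

# Hypothetical expression values for one patient.
patient = {"GENE_A": 1023.0, "GENE_B": 63.0}
score = disease_risk_score(patient, UP_IN_VIRAL, DOWN_IN_VIRAL)
print(round(score, 2), classify(score))
```

In practice the transcripts and the cut-off would be selected and validated on independent patient cohorts; the point here is only that a handful of transcripts can be reduced to one interpretable number.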

Multi-class signature in infectious diseases

The development and introduction of improved methods to accurately diagnose the cause of fever would be of great benefit to individuals and health services worldwide. A rapid diagnostic test to establish the cause of fever would reduce inappropriate treatment and unnecessary investigation, lower health service costs and inappropriate antibiotic use, and thus contribute to the reduction of antimicrobial resistance (AMR) and to sustainable health care systems. The main focus of our European-funded project DIAMONDS (Diagnosis and Management of Febrile Illness using RNA Personalised Molecular Signature Diagnosis; Grant agreement ID: 848196; overall budget: €23,839,940.75; ongoing project, period 2020-2024) is to develop a multi-class approach aimed at identifying the pathogen responsible for the infection by analyzing the host. The approach therefore focuses on using each individual patient's RNA expression profile to identify the cause of illness.


It has the potential to revolutionize health care delivery for febrile patients and to transform the medical diagnostic process from one based on the sequential exclusion of different possible causes of fever, reliant predominantly on pathogen culture, to the rapid assignment of a diagnosis by an individual personalized molecular signature. The project will use the well-established Europe-wide network of researchers and their expertise in infectious diseases research, immunology and rheumatology, molecular and cellular biology, bioinformatics, computational modeling, and public health, linked to biotechnology and industrial expertise, to provide the evidence base for a new personalized medicine approach to the diagnosis of febrile patients through a large-scale multi-country pilot demonstration of personalized molecular signature diagnosis (PMSD).
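
To make the multi-class idea more concrete, here is a minimal sketch of one very simple classifier, a nearest-centroid rule, applied to host expression profiles. It is not the method used in DIAMONDS; the class labels, the three-gene profiles and all values are invented purely for illustration.

```python
import math

def fit_centroids(profiles, labels):
    """Average the expression vectors of each class to obtain one centroid per class."""
    sums, counts = {}, {}
    for vec, lab in zip(profiles, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict(centroids, vec):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], vec))

# Hypothetical log-expression profiles over three genes, two patients per class.
train_profiles = [
    [8.0, 2.0, 1.0], [9.0, 1.0, 2.0],   # bacterial
    [2.0, 9.0, 1.0], [1.0, 8.0, 2.0],   # viral
    [1.0, 2.0, 9.0], [2.0, 1.0, 8.0],   # inflammatory
]
train_labels = ["bacterial", "bacterial", "viral", "viral",
                "inflammatory", "inflammatory"]

centroids = fit_centroids(train_profiles, train_labels)
prediction = predict(centroids, [2.0, 7.0, 2.0])
print(prediction)
```

A real multi-class signature would involve feature selection, many more transcripts and classes, and validation on independent cohorts; the sketch only shows how one profile can be mapped to one of several candidate causes in a single step.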

Rare diseases

We are using a transcriptomic approach to investigate patterns of gene expression in Progressive Osseous Heteroplasia (POH) by examining genome-wide expression patterns in different tissues from patients.

Population transcriptomics

There is a growing body of evidence suggesting that gene expression varies within and between human populations. Ancestral background can be inferred very efficiently from RNA-seq data, even in datasets that include samples with complex patterns of admixture.

nCounter - nanostring technology

We have a NanoString nCounter platform (see here for more details on our services). The nCounter Analysis System uses digital counting based on color-coded barcodes for direct multiplexed measurement. The technology uses molecular "barcodes" and single-molecule imaging for the direct hybridization and detection of hundreds of unique transcripts in a single reaction.

n-counter - Nanostring

Each color-coded barcode is attached to a single target-specific probe corresponding to a molecule of interest, so the output is a direct count of the target molecules present in a given sample. The technology uses a multiplex system (expression of up to 800 genes simultaneously) that makes direct digital molecular detection measurements. It offers high precision, sensitivity, and reproducibility, and these are maintained even at low expression levels. This makes it a key technology for analyzing gene expression in compromised samples (low yields and/or highly degraded RNA). It works successfully with difficult samples such as FFPE samples, and also with crude cell lysates. All this is achieved without any prior enzymatic reaction (it does not require RT-PCR or PCR), which minimizes technical variation (r^2 > 0.99), bias, and the occurrence of false positives. Moreover, potential inhibitors present in the samples do not affect assay performance. All of these features make this technology a very strong candidate for gene expression validation studies.
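
Because the output is a table of raw counts per transcript, samples are usually rescaled before they are compared. The sketch below shows one common, generic approach, scaling each sample by the geometric mean of a few housekeeping genes. It is an illustrative assumption rather than a description of the vendor's software or of our own pipeline, and the gene names (HK1, HK2, TARGET) and counts are hypothetical.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive counts."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def housekeeping_normalize(samples, housekeeping):
    """Scale every sample so that its housekeeping geometric mean matches the cross-sample average."""
    hk_means = {
        name: geometric_mean([counts[g] for g in housekeeping])
        for name, counts in samples.items()
    }
    reference = sum(hk_means.values()) / len(hk_means)
    return {
        name: {g: c * reference / hk_means[name] for g, c in counts.items()}
        for name, counts in samples.items()
    }

# Hypothetical raw counts: sample s2 was loaded with twice as much RNA as s1.
raw = {
    "s1": {"HK1": 100.0, "HK2": 400.0, "TARGET": 50.0},
    "s2": {"HK1": 200.0, "HK2": 800.0, "TARGET": 100.0},
}
norm = housekeeping_normalize(raw, housekeeping=["HK1", "HK2"])
print(round(norm["s1"]["TARGET"], 2), round(norm["s2"]["TARGET"], 2))
```

After scaling, the two samples show the same level for the target transcript, which is what one would expect when the only difference between them is the amount of input material.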

Transcriptomics

Nanopore technology

Unlike traditional RNA-Seq techniques, long-read nanopore RNA sequencing allows accurate quantification and complete, full-length characterization of native RNA or cDNA without fragmentation or amplification – streamlining analysis and removing potential sources of bias. Direct RNA sequencing also enables the identification of base modifications alongside nucleotide sequences. Low input amounts combined with rapid, streamlined workflows enable highly sensitive gene expression analysis, even from single cells.

The main advantages of RNA sequencing using nanopore technology are:

  • Full-length transcripts — unambiguous identification of splice variants and gene fusions
  • Accurate transcript and isoform quantification
  • Eliminate PCR bias using direct cDNA or direct RNA sequencing
  • Detect base modifications alongside nucleotide sequence using direct RNA
  • Easy identification of anti-sense transcripts and lncRNA isoforms

Epigenetics refers to all heritable changes that cells inherit in addition to the genetic information, but that are not encoded in the nucleotide sequence of DNA. Epigenomics, in turn, is the study of the epigenome, that is, the total epigenetic state of a cell. The role of epigenetics is closely related to gene expression: epigenetic processes are thought to influence gene expression at the level of transcription, although other processes, like splicing and translation, may also be regulated epigenetically. The control of gene expression by epigenetic mechanisms is stably transmitted over multiple cell divisions, but it can be altered during cell differentiation, as it is flexible enough to respond to chemical and environmental influences.

Aberrant epigenetic alterations have been associated with several diseases, as they are involved in cellular processes, differentiation and tumorigenesis. Whilst the genetic risk of developing a disease is currently unmodifiable, epigenetic risk is likely to be reversible and flexible. Consistent with this view, Epigenome-Wide Association Studies (EWAS) are becoming increasingly important for investigating the association between epigenetic changes and biological traits, as well as for evaluating the molecular basis of disease risk, with the ultimate goal of finding epigenetic biomarkers that can be used as targets for personalized treatments and therapies.

DNA methylation is one of the most studied epigenetic mechanisms. It involves the transfer of a methyl group to the 5-carbon of the cytosine ring of DNA. Generally, cytosines are methylated soon after DNA synthesis by a group of DNA methyltransferases (DNMTs) that are responsible for setting up DNA methylation patterns in early development and for maintaining them.


DNA methylation is frequently described as a "silencing" epigenetic mark, as it is often associated with the loss of gene expression. It has been suggested that DNA methylation can directly affect transcription by restricting the access of transcription factors to the promoter of the gene. However, the role of the plethora of CpGs located in regions of the genome other than promoters is not clear.

The search for methylated CpGs is becoming an important tool for predicting clinical outcomes and for explaining different phenotypes associated with similar genotypes. Several approaches have been developed to detect DNA methylation changes in the genome; currently, the most widely used are high-throughput technologies such as NGS and microarrays. The most widely used microarray platform is the Illumina Infinium Methylation BeadArray, which makes it possible to analyse methylation status across the genome on a large scale and at single-base resolution. The Infinium platforms are easy to use, time-efficient and cost-effective, and they show good agreement with DNA methylation measurements from other platforms, such as whole-genome bisulphite sequencing. In 2015 Illumina introduced a newer version of the Infinium Methylation BeadArray, the MethylationEPIC (EPIC) BeadChip, which covers more than 850,000 CpGs. This large number of new probes makes it possible to interrogate gene bodies, intergenic and non-CpG island regions, as well as previously unexplored territories such as enhancers, which contribute to cell homeostasis and human diseases.
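
Methylation arrays of this kind report, for each CpG, a methylated and an unmethylated signal intensity, which are commonly summarized as a beta value (an approximate fraction of methylated molecules) or an M-value (its log2 ratio form, often preferred for statistical testing). The sketch below shows these two standard summaries; the intensities are invented, and the offset of 100 is the conventional stabilizing constant, stated here as an assumption rather than as a setting of our pipeline.

```python
import math

def beta_value(meth, unmeth, offset=100.0):
    """Beta value: methylated signal over total signal plus a small stabilizing offset."""
    return meth / (meth + unmeth + offset)

def m_value(beta, eps=1e-6):
    """M-value: log2 ratio of methylated to unmethylated fraction (logit2 of beta)."""
    b = min(max(beta, eps), 1.0 - eps)  # clamp to avoid division by zero or log of zero
    return math.log2(b / (1.0 - b))

# Hypothetical intensities for a single CpG probe.
beta = beta_value(meth=4500.0, unmeth=400.0)
m = m_value(beta)
print(round(beta, 3), round(m, 3))
```

Beta values are easier to interpret biologically (0 is unmethylated, 1 is fully methylated), whereas M-values have more homogeneous variance across the methylation range, so many analyses test on M-values and report effect sizes as beta values.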

Our laboratory focuses on the analysis of whole-blood and tissue-specific DNA methylation applied to the study of infectious diseases, with the aim of finding novel epigenetic biomarkers that can explain the behaviour of the host during bacterial and viral infections. Pathogens have developed several strategies to invade and establish long-term infections in their hosts, and evidence suggests that different pathogens can manipulate their hosts' processes in order to colonize them successfully. To better understand the interactions between infectious agents and their hosts, there is increasing interest in unravelling the modifications that occur in the host's epigenome, particularly the changes induced in the host's DNA methylome, during and after infection. The epigenetic signatures of the infected host cell may serve as epigenetic biomarkers for more accurate diagnosis of the infection or as predictors of disease outcome.


Proteomics is a high-throughput technology that refers to the large-scale study of the proteome, defined as the (full) set of proteins produced in an organism, tissue, etc. Like the transcriptome or the epigenome, the proteome is flexible and dynamic; it differs from cell to cell and changes over time.

The term proteomics encompasses all the modifications produced in a native protein when an organism is subjected to changes. The main aim of proteomics is to obtain a global and integrated view of proteins, of the way they interact with one another to function, and of how they respond to the large number of potential environmental stimuli.

Nowadays proteomics is attracting more and more interest in cell biology; it is in fact one of the most useful methodologies for understanding gene function. Focusing on a single gene or protein can be very limiting, while the genome, the proteome, etc. provide a more comprehensive view of the cell, tissue or organism. Moreover, proteins undergo post-translational modifications in response to intracellular and extracellular signals, and these cannot be predicted directly by studying the genome.

Two main types of proteomics can be distinguished: (i) structural proteomics, which determines the proteins present in a specific cellular organelle or the structure of protein complexes, and (ii) functional proteomics, which characterizes protein-protein interactions to determine protein functions and to show how proteins assemble into larger complexes.

Several protein technologies have been developed to improve the proteomics analysis; the most important are:

  • 2-D electrophoresis for separating proteins according to their physicochemical properties (isoelectric point and molecular weight), to study the protein family content in different cell types and to allow the observation of post-translational modifications.
  • Western blotting for the detection of specific proteins in a complex sample using antibodies.
  • Microscopy for protein visualization and protein-protein colocalization
  • Liquid chromatography and mass spectrometry, for the qualitative and quantitative studies of proteome from different types of sample.

Advances in high-resolution liquid-phase separation (liquid chromatography, LC), mass spectrometry technologies and bioinformatics tools for large-scale data analysis are having an important impact on the proteomic analysis of complex samples, such as human blood serum, plasma, biological fluids and tissues.

One of our main aims is to identify and measure protein biomarkers for a better comprehension of some diseases.

In our laboratory we use these technologies to analyse possible biomarkers in patient samples and their correlation with different pathways or their interaction with other proteins. At the same time, we use standard proteomic techniques for the basic identification or visualization of proteins in gene-editing experiments in cultured cells.

Recently, we have carried out the analysis of the proteome in a particular case of Progressive Osseous Heteroplasia (POH), an ultra-rare condition that is being investigated in our group from a multidisciplinary point of view that includes all kinds of 'omic' approaches, as well as cell cultures (primary cultures, cell lines, etc.) and an animal model (zebrafish).

Code

Bioinformatics is the art of making tools and methods to understand biological data. Over the last 20 years, huge amounts of such data have been pouring into what used to be exclusively wet labs, and nobody can even imagine how to start analysing it manually.

In our lab, we have some different approaches to data analysis:

  • Full analysis. We collect the biological samples and process them in our facilities to get the raw data. This is one of the more fulfilling working pipelines, as we get to walk the entire path from the organic source to the conclusions. Chances usually arise to learn a lot in the journey, and overcoming unexpected hurdles gives us a boost in confidence.
  • Post-analysis. One simply cannot have every machine on the market, much less compete in efficiency with behemoths completely focused on one line of work. In those scenarios we send the biological samples we collect to external wet labs (e.g. for NGS sequencing) and receive the raw data. Our team either runs the best-practice pipelines when available or creates new customized pipelines.
  • Data mining and leeching, more glamorously called "secondary data analysis". We have neither data nor samples, but we found some in the wild. Maybe someone released a dataset required by a journal, or as part of an Open Source Consortium: data that was used for some other purpose but that we think hides something more. After carefully checking that we can ethically re-analyse the newly found treasure, we squeeze from it every bit of information that we possibly can.
  • Tailored software. This is probably our least exploited field. In some cases, you have the biological samples and the raw data, obtained with large amounts of sweat and suffering, but you lack the bioinformatic muscle to analyse it reliably. Why release a poorly analysed dataset into a public repository? We can probably give you a hand there and add a layer of polish to your data.
We unload a lot of computing effort into the CESGA infrastructures. © CESGA

Those strategies demand a multifaceted team that works closely together, so that each member can add the maximum to the heap of results: the biologist who is knowledgeable in informatics knows what to get from the samples and what can realistically be achieved. The computer scientist throws the data at the silicon until it gets transformed into something that the statistician and mathematician turn into conclusions.

We have a team perfectly interwoven to achieve such tasks. Thanks to decades of combined expertise, each piece of the gearbox knows how to make the most of its efforts. The quote "the whole is more than the sum of its parts" is always hovering in our minds.

On a more technical note, we are proficient (among other things) in:

  • Statistical analysis.
  • Unix and supercomputing usage.
  • Big data tooling.
  • Programming, both scripting (R and Python) and performant code (C and Nim).
  • Database and dataset management.
  • Cooperating with others, whether as the small, equal or big fish.

Cell cultures in biomedical research

Cell culture is the process of removing cells from different tissues and subsequently maintaining their growth in a controlled artificial environment. The culture conditions depend on the cell type, but the success of the culture always depends on regulating the physicochemical environment (pH, osmotic pressure and temperature) and supplying the essential nutrients for proper growth. These include a growth substrate (culture medium) that supplies amino acids, carbohydrates, vitamins and minerals, as well as growth factors and hormones. A controlled level of CO2, a gas that the cells need to subsist in culture, must also be maintained. Most cells grow in adherent or monolayer culture because they need to be attached to a solid or semi-solid substrate, while other cells can float in the culture medium.


There are different types of cultured cells. PRIMARY CULTURE refers to the culture obtained after isolating cells from tissue and letting them proliferate under specific and controlled conditions. The proliferation of normal cells is limited, both in the number of divisions and in time. After this, the cells enter a state of permanent growth arrest called senescence, which limits the accumulation of changes in their genotypes.

There is an alternative to primary culture: CELL LINES. These are cells that have been immortalized by an appropriate chemical or viral treatment, which gives them the ability to divide indefinitely and maintain a uniform genotype.

For storage, and to ensure good subculture and viability in future analyses or experiments, the cells are treated with an appropriate cryoprotective agent (DMSO or glycerol). These reagents prevent ice crystal formation inside the cells and allow their adequate preservation. The cells should be cryopreserved (at temperatures below -130 °C) until they are needed again in the lab.

Cultured cells are among the best tools used in cellular and molecular biology; they provide an excellent experimental model for studying the physiology and biochemistry of cells in normal and disease contexts. Cultured cells make it possible to evaluate the effects of drugs and of biological compounds such as vaccines and therapeutic proteins.

In the "omics" field, cultured cells are very useful for functional experiments that evaluate the biological impact of changes in the genome, epigenome, transcriptome or proteome, and their association with a disease or its severity.

Another area that benefits from cultured cells is GENOME ENGINEERING (e.g. using CRISPR). We are able to introduce changes in the genome or epigenome and study how these changes affect the biology of the cells, or we can mimic diseases, e.g. rare diseases, and investigate therapeutic options using this technology.

In our laboratory, cellular models are proving very useful for studying infectious diseases, patterns of vaccination, immunological profiles and rare diseases.

Infectious diseases

In our infectious disease projects, we use primary cultures of cells from patients to analyze their genetic, epigenetic, and transcriptomic profiles, and to search for molecular biomarkers that allow specific diagnosis of the disease or point to possible therapeutic targets.

Vaccination and immunology

In projects related to vaccines and immunology, we isolate blood cells to study the impact of maternal vaccination on newborns. We use the whole-blood stimulation technique to evaluate the heterologous properties of vaccines against other antigens. We also isolate immunological cells from patients with different infectious diseases to measure proteins (e.g. cytokines) and their association with the severity of infections.

Rare diseases

We are working on the ultra-rare disease known as Progressive Osseous Heteroplasia (POH). We use primary cultures from surgical explants; this allows us to study the molecular and cellular basis of the disease, test therapeutic molecules and drugs, and determine how primary cells interact with immunological cells. We also use a cell line derived from mesenchymal stem cells for gene editing with CRISPR technology to obtain a cellular model of POH.


Among the different genetic engineering techniques used for gene editing, CRISPR is a novel technology used to accurately edit the genome of cells and organisms. By modifying either DNA or RNA, CRISPR provides a wide range of tools to modify the genome. Correcting mutations, eliminating pathogenic DNA sequences, and activating or suppressing gene expression are some of the most common applications nowadays.

Using this leading gene-editing technology, our group has integrated functional genomics into different lines of research. Some of them are:

a) CRISPR in culture cell lines in the context of Progressive Osseous Heteroplasia (POH) disease

By knocking out specific genes in cultured cell lines, we perform functional analyses of candidate transcriptomic markers that were found to be differentially expressed by the host in response to viral or bacterial infections. We are currently studying the molecular basis of the extremely rare disease POH. This innovative technology gives us the possibility to reproduce and analyze the mutations in the GNAS gene that are responsible for abnormal bone formation.


In addition, we are also creating an in vivo model for the study of the disease, using CRISPR-edited zebrafish lines, to reproduce the phenotype in a complex organism.

b) CRISPR in infectious diseases

We are using CRISPR to understand the function of a few genes that we have found to be related to infectious diseases, using both cell cultures and zebrafish.

Since the very beginning, we have been involved in the study of several 'rare' Mendelian diseases, among which we would like to highlight our studies on:

  • Mitochondriopathies
  • Insulin resistance
  • Wilson disease
  • Ichthyosis

In the last few years we have been heavily involved in Progressive Osseous Heteroplasia (POH) and heterotopic ossification.

Heterotopic ossification

Our research group, in its translational component, combines multidisciplinary clinical activity in the area of pediatrics and infectious diseases with intense activity in the research field.

For 8 years, our clinical group has attended the only reported case in the world of progressive osseous heteroplasia (POH) in monozygotic twins. Genetic analysis has confirmed the presence of an inactivating mutation in the GNAS gene in both girls. The most relevant aspect of this case, however, is that the two girls have a markedly different clinical evolution: while one of the twins remains practically asymptomatic, her sister suffers an unusually rapid, progressive and disabling form of the disease. At present, there is no effective or preventive treatment for this disease, and the only solution involves the surgical amputation of the affected limbs and the removal of well-defined lesions.


POH is an ultra-rare genetic disease of progressive formation of extra-skeletal bone that affects about 60 people worldwide. The disease belongs to a group of rare diseases, which also includes Albright's hereditary osteodystrophy (AHO), pseudohypoparathyroidism (PHP) and pseudopseudohypoparathyroidism (PPHP), characterized by the presence of heterotopic ossifications that originate from inactivating mutations of the GNAS complex. The GNAS gene encodes the alpha subunit of the stimulatory G protein (Gs), which activates the enzyme adenylate cyclase and plays a fundamental role in the signaling pathways that regulate bone development (osteogenesis). The complexity of the GNAS gene results from a marked phenomenon of genomic imprinting, which underlies the allele-specific regulation of the expression of the transcripts it generates and influences the spectrum of clinical phenotypes of the inactivating disorders associated with mutations in GNAS. In fact, due to differential methylation of its promoters, some GNAS transcripts exhibit only paternal or only maternal expression, since their promoter region is methylated in the paternal or maternal allele. Consequently, the phenotype associated with a mutation in GNAS depends on whether the mutation is inherited from the mother or the father.

Following this unique case, with which our group has maintained a relationship since its inception, we have put all our scientific effort into the study of heterotopic ossification (HO) associated with mutations in GNAS. Advances in the study of this disease can also be applied to the other related pathologies and could be key to understanding them.

Despite being a rare entity, heterotopic ossification in a developing body has a great impact on the quality of life of a child or adult. Ectopic bone plate removal is only a palliative solution, with lesions recurring after surgery. From these palliative surgeries, our group has managed to isolate different cell types and cultivate them for study. In addition, thanks to samples collected in palliative surgeries, we have analysed the gene expression patterns of different signalling pathways in skin and ossified tissue samples, seeking to understand the molecular mechanisms behind heterotopic ossification and to identify new potential therapeutic targets that could be used to curb the ossification process. The study of this unique case of monozygotic twins affected with extreme phenotypes is a great opportunity to reveal key aspects of the pathophysiology of heterotopic ossifications.