| Title: |
GENCODE 2025: reference gene annotation for human and mouse |
| Authors: |
Mudge, Jonathan M; Carbonell-Sala, Sílvia; Diekhans, Mark; Martinez, Jose Gonzalez; Hunt, Toby; Jungreis, Irwin; Loveland, Jane E; Arnan, Carme; Barnes, If; Bennett, Ruth; Berry, Andrew; Bignell, Alexandra; Cerdán-Vélez, Daniel; Cochran, Kelly; Cortés, Lucas T; Davidson, Claire; Donaldson, Sarah; Dursun, Cagatay; Fatima, Reham; Hardy, Matthew; Hebbar, Prajna; Hollis, Zoe; James, Benjamin T; Jiang, Yunzhe; Johnson, Rory; Kaur, Gazaldeep; Kay, Mike; Mangan, Riley J; Maquedano, Miguel; Gómez, Laura Martínez; Mathlouthi, Nourhen; Merritt, Ryan; Ni, Pengyu; Palumbo, Emilio; Perteghella, Tamara; Pozo, Fernando; Raj, Shriya; Sisu, Cristina; Steed, Emily; Sumathipala, Dulika; Suner, Marie-Marthe; Uszczynska-Ratajczak, Barbara; Wass, Elizabeth; Yang, Yucheng T; Zhang, Dingyao; Finn, Robert D; Gerstein, Mark; Guigó, Roderic; Hubbard, Tim JP; Kellis, Manolis; Kundaje, Anshul; Paten, Benedict; Tress, Michael L; Birney, Ewan; Martin, Fergal J; Frankish, Adam |
| Source: |
Nucleic Acids Research, vol 53, iss D1 |
| Publisher Information: |
eScholarship, University of California |
| Publication Year: |
2025 |
| Collection: |
University of California: eScholarship |
| Subject Terms: |
31 Biological Sciences (for-2020); 3102 Bioinformatics and Computational Biology (for-2020); 3105 Genetics (for-2020); Genetics (rcdc); Biotechnology (rcdc); Human Genome (rcdc); 1.5 Resources and infrastructure (underpinning) (hrcs-rac); Generic health relevance (hrcs-hc); Mice (mesh); Animals (mesh); Molecular Sequence Annotation (mesh); Humans (mesh); Software (mesh); Databases, Genetic (mesh); Genomics (mesh); Transcriptome (mesh); Genome (mesh); RNA, Long Noncoding (mesh); 05 Environmental Sciences (for); 06 Biological Sciences (for); 08 Information and Computing Sciences (for); Developmental Biology (science-metrix); 34 Chemical sciences (for-2020) |
| Pages: |
D966-D975 |
| Description: |
GENCODE produces comprehensive reference gene annotation for human and mouse. Entering its twentieth year, the project remains highly active as new technologies and methodologies allow us to catalog the genome at ever-increasing granularity. In particular, long-read transcriptome sequencing enables us to identify large numbers of missing transcripts and to substantially improve existing models, and our long non-coding RNA catalogs have undergone a dramatic expansion and reconfiguration as a result. Meanwhile, we are incorporating data from state-of-the-art proteomics and Ribo-seq experiments to fine-tune our annotation of translated sequences, while further insights into function can be gained from multi-genome alignments that grow richer as more species' genomes are sequenced. Such methodologies are combined into a fully integrated annotation workflow. However, the increasing complexity of our resources can present usability challenges, and we are resolving these with the creation of filtered genesets such as MANE Select and GENCODE Primary. The next challenge is to propagate annotations throughout multiple human and mouse genomes, as we enter the pangenome era. Our resources are freely available at our web portal www.gencodegenes.org, and via the Ensembl and UCSC genome browsers. |
| Document Type: |
article in journal/newspaper |
| File Description: |
application/pdf |
| Language: |
unknown |
| Relation: |
qt32v2p12s; https://escholarship.org/uc/item/32v2p12s; https://escholarship.org/content/qt32v2p12s/qt32v2p12s.pdf |
| DOI: |
10.1093/nar/gkae1078 |
| Availability: |
https://escholarship.org/uc/item/32v2p12s; https://escholarship.org/content/qt32v2p12s/qt32v2p12s.pdf; https://doi.org/10.1093/nar/gkae1078 |
| Rights: |
CC-BY |
| Accession Number: |
edsbas.7A38D8F0 |
| Database: |
BASE |