| Title: |
Where is the News? Improving Toponym Identification and Differentiation in Online News |
| Authors: |
Shingleton, Joseph; Basiri, Ana |
| Publication Year: |
2024 |
| Collection: |
University of Glasgow: Enlighten - Publications |
| Subject Terms: |
QA75 Electronic computers. Computer science |
| Description: |
Understanding the geographical context of unstructured textual data is a key challenge in information extraction. In many applications, however, simple identification of toponyms is insufficient and can often lead to ambiguities in the extracted information. One such application is the geolocation of online news, where a single article may mention multiple locations, with only one location referring to the article’s subject. In this paper, we present a transformer-based model, trained to identify the subject toponyms of news articles. Further, our model identifies likely parents of the subject toponym, potentially helping to improve later geolocation tasks. Our model is able to identify the subject of an article with an F1-score of 0.760 when tested on a human-tagged test dataset. |
| Document Type: |
conference object |
| File Description: |
text |
| Language: |
English |
| Relation: |
https://eprints.gla.ac.uk/321323/3/321323.pdf; Shingleton, Joseph (ORCID: 0000-0002-1628-3231) and Basiri, Ana (ORCID: 0000-0002-2399-1797) (2024) Where is the News? Improving Toponym Identification and Differentiation in Online News. In: Second International Workshop on Geographic Information Retrieval, Glasgow, 24 March 2024, pp. 34-42. |
| Availability: |
https://eprints.gla.ac.uk/321323/; https://eprints.gla.ac.uk/321323/3/321323.pdf; https://ceur-ws.org/Vol-3683/ |
| Rights: |
cc_by_4 |
| Accession Number: |
edsbas.22AD5010 |
| Database: |
BASE |