JurisTCU: a Brazilian Portuguese information retrieval dataset with query relevance judgments
Leandro Carísio Fernandes · Edans F. O. Sandes · Leonardo Augusto da Silva Pacheco · Marcos Vinícius Borela de Castro · Leandro dos Santos Ribeiro
This paper introduces JurisTCU, a Brazilian Portuguese dataset for legal information retrieval (LIR). The dataset is freely available (https://huggingface.co/datasets/LeandroRibeiro/JurisTCU) and consists of 16,045 jurisprudential documents from the Brazilian Federal Court of Accounts, along with 150 queries annotated with relevance judgments. It addresses the scarcity of Portuguese-language LIR datasets with query relevance annotations. The queries are organized into three groups: real user k
