Reading and Processing Tables (Table Processing)#
[2402.05121] Large Language Model for Table Processing: A Survey
| Task Name | Table Type | Description (related work) | Example Dataset |
|---|---|---|---|
| Table QA | WT | | WikiTableQuestions [30] |
| Table fact verification | WT | | TabFact [32] |
| Table-to-text | WT | Produce a NL question given a table ([11]) | ToTTo [33] |
| Data cleaning | WT/SS/DB | | - |
| Column/Row/Cell population | WT/SS/DB | | TURL [9] |
| Entity linking | WT | | TURL [9] |
| Column type annotation | WT | | TURL [9] |
| Spreadsheet manipulation | SS | | SpreadsheetBench [37] |
| NL2SQL | DB | | Spider [40] |
| Data analysis | SS/DB | Table data analysis pipeline, consisting of feature engineering, machine learning, etc. ([41, 42]) | DS-1000 [43] |
| Table detection | DOC | Locate tables in documents ([44]) | TableBank [45] |
| Table extraction | DOC | | PubTabNet [47] |
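To make a few of the task rows above concrete, here is a minimal sketch of what a single NL2SQL instance might look like, and how its result relates to Table QA and table fact verification. The `singer` table, its rows, the question, and the SQL string are all invented for illustration; they are not taken from Spider or any other dataset in the table, and the "predicted" SQL is hard-coded rather than generated by a model.

```python
import sqlite3

# Hypothetical database table; the schema and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, country TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?, ?)",
    [("Alice", "France", 31), ("Bob", "Japan", 45), ("Chie", "Japan", 28)],
)

# NL2SQL: given the question (and the schema), a system must output a SQL string.
# Here the output is hard-coded to stand in for a model prediction.
question = "How many singers are from Japan?"
predicted_sql = "SELECT COUNT(*) FROM singer WHERE country = 'Japan'"

# Executing the query yields the value a Table QA system would return directly.
(answer,) = conn.execute(predicted_sql).fetchone()

# Table fact verification: label a statement as entailed or refuted by the table.
claimed_count = 2  # the statement "Two singers are from Japan."
label = "entailed" if answer == claimed_count else "refuted"

print(answer, label)
```

The point of the sketch is only that these tasks share the same underlying table and differ in the expected output: a SQL string, an answer value, or a true/false label.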
Reading Annual Securities Reports#
UFO 2024 (securities report reading competition)#
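Background: a proposal for a competition targeting the tables in annual securities reports
Tasks:
Table Retrieval task: retrieve the table that is relevant to a given question
Table QA task: from a single table, extract the ID or the value of the cell that answers the question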
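The two tasks can be sketched as a small retrieve-then-read pipeline. The following is a toy illustration under stated assumptions, not the competition's actual data format or a recommended method: the table dictionaries, the `r{row}c{col}` cell-ID scheme, and the token-overlap scoring are all hypothetical, and the helper names (`retrieve_table`, `answer_cell`) are made up for this example.

```python
import re

# Hypothetical toy corpus; real securities-report tables are far messier.
TABLES = [
    {
        "id": "T1",
        "caption": "Net sales and operating income by fiscal year",
        "header": ["Item", "FY2022", "FY2023"],
        "rows": [
            ["Net sales", "1,200", "1,350"],
            ["Operating income", "150", "180"],
        ],
    },
    {
        "id": "T2",
        "caption": "Number of employees by segment",
        "header": ["Segment", "Employees"],
        "rows": [
            ["Manufacturing", "820"],
            ["Retail", "430"],
        ],
    },
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def table_text(table):
    """Flatten caption, header, and cells into one string for matching."""
    parts = [table["caption"], *table["header"]]
    for row in table["rows"]:
        parts.extend(row)
    return " ".join(parts)


def retrieve_table(question, tables):
    """Table Retrieval: return the table sharing the most tokens with the question."""
    q = tokenize(question)
    return max(tables, key=lambda t: len(q & tokenize(table_text(t))))


def answer_cell(question, table):
    """Table QA: choose the best-matching row label and column header.

    Returns (cell_id, value). The cell ID counts data rows from 1 and
    columns from 1 including the label column (an assumed scheme).
    """
    q = tokenize(question)
    rows, header = table["rows"], table["header"]
    row_idx = max(range(len(rows)), key=lambda i: len(q & tokenize(rows[i][0])))
    col_idx = max(range(1, len(header)), key=lambda j: len(q & tokenize(header[j])))
    return f"r{row_idx + 1}c{col_idx + 1}", rows[row_idx][col_idx]


question = "What was the operating income in FY2023?"
table = retrieve_table(question, TABLES)
cell_id, value = answer_cell(question, table)
print(table["id"], cell_id, value)
```

Token overlap is used here only because it keeps the example dependency-free; an actual system would more plausibly use an embedding-based retriever or an LLM for both steps.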