Main path analysis is a popular method for extracting the scientific backbone from the citation network of a research domain. Existing approaches ignored the semantic relationships between the citing and cited publications, resulting in several adverse issues, in terms of coherence of main paths and coverage of significant studies. This paper advocated the semantic main path analysis approach to alleviate these issues based on citation function analysis. A wide variety of SciBERT-based deep learning models were designed for identifying citation functions. Semantic citation networks were built by either including important citations, e.g., extension, motivation, usage and similarity, or excluding incidental citations like background and future work. Semantic main path network was built by merging the top-K main paths extracted from various time slices of semantic citation network. In addition, this study proposed a three-way framework for quantitative evaluation of main path analysis results. Both qualitative and quantitative analysis on three research areas of computational linguistics demonstrated that, compared to semantics-agnostic counterparts, different types of semantic main path networks provide complementary views scientific knowledge flows. Combining them together, we can obtain a more precise and comprehensive picture of domain evolution and uncover more coherent development pathways between scientific ideas.
|Number of pages
|Journal of the Association for Information Science and Technology
|Early online date
|21 Mar 2023
|Published - May 2023