Title: On the applicability of the longest-match rule in lexical analysis
Authors: Yang, W
Tsay, CW
Chan, JT
Department of Computer Science
Keywords: compiler; context-free grammar; finite-state automaton; lexical analyzer; Mealy automaton; Moore automaton; parser; regular expression; scanner
Issue Date: 1-Oct-2002
Abstract: The lexical analyzer of a compiler usually adopts the longest-match rule to resolve ambiguities when deciding the next token in the input stream. However, that rule may not be applicable in all situations. Because the longest-match rule is so widely used, language designers and compiler implementors frequently overlook its subtle implications. The consequence is either a flawed language design or a deficient implementation. We propose a method that automatically checks the applicability of the longest-match rule and identifies precisely the situations in which the rule is not applicable. The method is useful to both language designers and compiler implementors. In particular, it is indispensable to automatic generators of language translation systems: without it, the generated lexical analyzers can only apply the longest-match rule blindly, which results in erroneous behavior. The crux of the method consists of two algorithms. The first computes the regular set of the token sequences that a nondeterministic Mealy automaton produces when it processes the elements of an input regular set. The second uses a set of equations to determine whether a regular set and a context-free language have a nontrivial intersection. © 2002 Elsevier Science Ltd. All rights reserved.
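The longest-match rule mentioned in the abstract can be illustrated with a minimal sketch. The token names and patterns below (`SHR`, `GT`, `LT`, `IDENT`) are hypothetical and chosen only for illustration; they are not taken from the paper, and the scanner is not the authors' method. It only shows the rule being applied blindly.

```python
import re

# Hypothetical token set, for illustration only (not from the paper).
TOKEN_SPECS = [
    ("SHR", r">>"),
    ("GT", r">"),
    ("LT", r"<"),
    ("IDENT", r"[A-Za-z_]\w*"),
]


def longest_match_scan(text):
    """Tokenize text, always taking the longest token matching at the current position."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # skip whitespace between tokens
            continue
        best = None  # (token name, end position) of the longest match so far
        for name, pattern in TOKEN_SPECS:
            m = re.compile(pattern).match(text, pos)
            # Strictly longer matches win; on a tie the earlier spec is kept.
            if m and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            raise ValueError(f"no token matches at position {pos}")
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    return tokens


if __name__ == "__main__":
    # The trailing ">>" is returned as one SHR token, never as two GT tokens.
    print(longest_match_scan("a<b<c>>"))
```

On the input `a<b<c>>`, this scanner returns a single `SHR` token for the final two characters. A grammar that expects two separate closing `GT` tokens there would reject that token sequence, as happened with nested template argument lists in C++ before C++11. This is one plausible instance of the kind of situation in which the longest-match rule is not applicable.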
URI: http://dx.doi.org/10.1016/S0096-0551(02)00014-0
http://hdl.handle.net/11536/28507
ISSN: 1477-8424
DOI: 10.1016/S0096-0551(02)00014-0
Journal: COMPUTER LANGUAGES SYSTEMS & STRUCTURES
Volume: 28
Issue: 3
Begin Page: 273
End Page: 288
Appears in Collections: Articles