Abstract:
Compiler bugs can cause a variety of problems for software systems, including
unexpected program behavior, security flaws, and system failures. These defects can
arise from several sources, including improper handling of data types, incorrect code
generation, and incorrect code optimization. Because of their complexity, compiler
defects can be difficult to detect and, if ignored, can have severe consequences.
Identifying compiler defects is therefore a crucial but difficult undertaking, in
large part because reliable test programs are hard to produce.
Fuzzing is one of the most widely used software testing techniques for finding bugs
and vulnerabilities. It works by generating large numbers of inputs for a target
application and monitoring the application for anomalies. Among fuzzing techniques,
the most recent and promising methods for compiler validation are test program
generation and mutation. Both methods have proven effective at identifying numerous
bugs in a variety of compilers, although they remain constrained by the approaches
they use to generate and mutate valid code.
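The fuzzing loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_target` is a hypothetical stand-in for the application under test, with a deliberately planted defect, and the random byte generator is far simpler than the program generators used for compiler testing.

```python
import random


def toy_target(data: bytes) -> int:
    """Hypothetical target with a planted defect: fails on inputs starting with 0xFF."""
    if data and data[0] == 0xFF:
        raise RuntimeError("planted defect reached")
    return len(data)


def fuzz(target, iterations, seed=0, max_len=8):
    """Feed random byte strings to target and collect every input that raises."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    anomalies = []
    for _ in range(iterations):
        length = rng.randrange(1, max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            target(data)
        except Exception as exc:  # any exception counts as an anomaly here
            anomalies.append((data, repr(exc)))
    return anomalies
```

Calling `fuzz(toy_target, 5000)` returns the inputs that triggered the planted defect. A real compiler fuzzer would replace the byte generator with a source of programs and would also watch for crashes, hangs, and wrong output rather than exceptions alone.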
Code mutation has grown in popularity recently because it can find bugs that
standard testing can miss. The technique makes minor changes to a program's source
code to create variants of the original, which are then compiled and run to check
whether they produce the expected results. If the output of a mutated program
differs from the output of the original program, this may indicate a compiler
bug.
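As a rough illustration of this check, the sketch below mutates a small program and compares the output of the original with that of the mutant. It rests on several assumptions that are not taken from the thesis: Python's built-in `compile` stands in for the compiler under test (the thesis targets Ballerina), and `AddZero` is a hypothetical semantics-preserving mutation that rewrites every integer literal `n` as `n + 0`. It is not the mutation technique proposed in this work.

```python
import ast
import contextlib
import io

# Hypothetical seed program; any small script that prints its result would do.
SEED = "total = 0\nfor i in range(5):\n    total += i * 2\nprint(total)\n"


class AddZero(ast.NodeTransformer):
    """Hypothetical mutation: rewrite each integer literal n as (n + 0)."""

    def visit_Constant(self, node):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.BinOp(left=node, op=ast.Add(), right=ast.Constant(value=0))
        return node


def mutate(source):
    """Return a variant of source that should behave identically."""
    tree = AddZero().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later


def run(source):
    """Compile and execute source, returning everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(compile(source, "<test>", "exec"), {})
    return buffer.getvalue()


def outputs_differ(source):
    """True if the mutant prints something different from the original."""
    return run(source) != run(mutate(source))
```

Because the mutation is meant to preserve behavior, `outputs_differ(SEED)` returning `True` would point to a defect in the compiler or in the mutation itself. This toy operator is only safe where an arbitrary expression is allowed, so a real tool would need the language's syntax and semantic rules to keep mutants valid.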
Current mutation-based fuzzers alter a program's input randomly, without
understanding its underlying grammar or semantics. In this study, we propose a novel
mutation technique that mutates an existing program while automatically identifying
the syntax and semantic rules of its language. Because the method does not depend on
prior knowledge of a language's syntax or semantics, it can in principle be used to
test compilers for other languages as well. We focus on evaluating the Ballerina
compiler with respect to the language's syntax and semantics, because Ballerina is a
relatively new programming language.
In this work, we first construct a test suite from the language's existing test
cases. We then develop a syntax tree generator that identifies the syntax of the
language and a semantic generator that identifies its semantics. Using these
generators, we mutate the existing test cases. Furthermore, we analyze how the
performance of our model varies with the number of test cases used to train it and
with the number of tokens in the generated file.
Keywords: Compiler testing, random testing, random program generation, automated
testing
Citation:
Abeygunawardana, C.S. (2023). Finding compiler bugs via code mutation: a case study [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/23386