Compilers - Semantic Analysis with Formal Grammar
At least the following three tables are involved in semantic analysis:
- Symbol table: stores information about the identifiers the parser has seen during semantic analysis.
- Type table
- Constants table
Internal representation of identifiers
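#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Opaque handle to an entry in the type table.
using TypePointer = void *;
// Raw storage for a constant's value.
using Value = std::array<std::byte, 8>;

// NOLINTBEGIN
enum class IdentifierKind { type, constant, variable };

// @brief: AccessType determines whether the actual value changes when the
// variable's value changes inside the function.
//   direct:   the variable holds its own value, so the actual value is untouched.
//   indirect: the variable refers to the actual value, so changes are visible
//             outside the function.
enum class AccessType { direct, indirect };

// Symbol Table
struct IdentifierBase {
  TypePointer typeptr;  // type of the identifier
  IdentifierKind kind;  // which of the derived structs this entry is
};

struct Type : public IdentifierBase {};

struct Constant : public IdentifierBase {
  Value value;
};

struct Variable : public IdentifierBase {
  AccessType access;
  std::size_t level;  // nesting level of the declaring scope
  std::size_t off;    // offset of the variable's storage
};

enum class TypeKind { recordTy, arrayTy, pointerTy, unionTy };

// Type Table
struct TypeBase {
  std::size_t size;  // size of a value of this type
  TypeKind kind;
};

struct Enum {
  // ...
};

struct Record {
  // Fields are kept as a singly linked list.
  struct RecordField {
    std::string fieldname;
    TypeBase fieldtype;
    std::size_t off;    // offset of the field inside the record
    RecordField *next;  // next field, or nullptr for the last one
  };
  RecordField *next;  // head of the field list
};

struct Array {
  std::int_least64_t low;  // lower index bound
  std::int_least64_t up;   // upper index bound
  TypeBase *elemtype;
};

struct Pointer {
  TypeBase *basetype;
};
// ...
// NOLINTEND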
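As a small illustration of how the `size` field of a type table entry could be filled in, here is a sketch that derives the size of an array type from its bounds and element type. It assumes that `low` and `up` are both inclusive (as in Pascal-style `array [low..up]` declarations) and ignores alignment; the post does not state either point. `array_size` is a hypothetical helper, and the structs are trimmed copies of the ones above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Trimmed copies of the type table structs from the listing above.
struct TypeBase {
  std::size_t size;
};

struct Array {
  std::int_least64_t low;
  std::int_least64_t up;
  const TypeBase *elemtype;
};

// Hypothetical helper: size in bytes of an array type.
// Assumption: both bounds are inclusive and low <= up.
std::size_t array_size(const Array &array) {
  assert(array.low <= array.up);
  const auto count = static_cast<std::size_t>(array.up - array.low + 1);
  return count * array.elemtype->size;
}
```

With a 4-byte element type, an array indexed from 1 to 10 would occupy 40 bytes under these assumptions.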
Symbol table
When the parser sees
- a definition of a symbol, it registers the symbol in the symbol table.
- a usage of a symbol, it looks the symbol up in the symbol table.
Finding a symbol in the table is easy peasy, but be careful about scope.
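The register-on-definition, find-on-usage behaviour can be sketched with a hash map. This is a minimal illustration that ignores scope entirely; `SymbolInfo`, `SymbolTable`, `define`, and `lookup` are hypothetical names, and `type_name` is only a stand-in for a pointer into the type table.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

enum class IdentifierKind { type, constant, variable };

// Hypothetical, simplified symbol table entry.
struct SymbolInfo {
  IdentifierKind kind;
  std::string type_name;  // stand-in for a pointer into the type table
};

class SymbolTable {
public:
  // Called when the parser sees a definition.
  // Returns false if the name is already registered.
  bool define(const std::string &name, SymbolInfo info) {
    return entries_.emplace(name, std::move(info)).second;
  }

  // Called when the parser sees a usage.
  // Returns std::nullopt for an undeclared identifier.
  std::optional<SymbolInfo> lookup(const std::string &name) const {
    const auto it = entries_.find(name);
    if (it == entries_.end()) {
      return std::nullopt;
    }
    return it->second;
  }

private:
  std::unordered_map<std::string, SymbolInfo> entries_;
};
```

A failed `define` would typically be reported as a redeclaration error and an empty `lookup` as a use of an undeclared identifier.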
Scope
No two identifiers in the same scope can have the same name; otherwise a use of that name would be ambiguous.
Nesting rule:
- An identifier declared in a scope can't be accessed outside that scope, but it can be used in scopes nested inside it.
- If more than one identifier with the same name is visible in the current scope, the one declared at the deepest nesting level is used.
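One common way to implement these rules is a stack of per-scope maps: entering a scope pushes a map, leaving it pops the map, and lookup walks from the innermost scope outward. The sketch below is an assumption about how that could look, not the post's implementation; `ScopedSymbolTable` and its member names are hypothetical, and each entry stores only the nesting level at which the name was declared.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

class ScopedSymbolTable {
public:
  // The table starts with the global scope (level 0) already open.
  ScopedSymbolTable() { enter_scope(); }

  void enter_scope() { scopes_.emplace_back(); }

  void exit_scope() {
    assert(scopes_.size() > 1 && "cannot leave the global scope");
    scopes_.pop_back();
  }

  // Registers a name in the current (innermost) scope.
  // Returns false if that scope already declares the name.
  bool define(const std::string &name) {
    const std::size_t level = scopes_.size() - 1;
    return scopes_.back().emplace(name, level).second;
  }

  // Searches from the innermost scope outward, so the deepest
  // declaration wins. Returns the level where the name was declared.
  std::optional<std::size_t> lookup(const std::string &name) const {
    for (auto scope = scopes_.rbegin(); scope != scopes_.rend(); ++scope) {
      const auto it = scope->find(name);
      if (it != scope->end()) {
        return it->second;
      }
    }
    return std::nullopt;
  }

private:
  std::vector<std::unordered_map<std::string, std::size_t>> scopes_;
};
```

Because `exit_scope` discards the innermost map, names declared there stop being visible as soon as the scope ends, while an outer declaration that was shadowed becomes visible again.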
See also
- c4 - a minimal c compiler in four functions (500 lines).
This post is licensed under CC BY 4.0 by the author.