Syntax-Directed Translation
Overview
Having established the grammatical structure of a source program through parsing, we now proceed to the crucial phase of discerning its meaning. This process is systematically guided by the program's syntax, and the formal framework for achieving this is known as Syntax-Directed Translation (SDT). This powerful method involves associating semantic rules, or actions, with the productions of a context-free grammar. By executing these actions during parsing, we can translate a string from the source language into a target representation, which may be an abstract syntax tree, intermediate code, or even executable code itself.
The principles of syntax-directed translation form the theoretical and practical foundation for the semantic analysis phase of a compiler. Through this framework, we can perform a multitude of essential tasks, including the enforcement of type-compatibility rules, the collection of information for symbol tables, and the generation of intermediate representations such as three-address code. In essence, syntax-directed translation serves as the bridge between the structural analysis performed by the front end and the code synthesis conducted by the back end, ensuring that the compiler understands what the program is supposed to do, not just how it is structured.
For the Graduate Aptitude Test in Engineering (GATE), a thorough understanding of this topic is paramount. Questions frequently assess one's ability to work with attribute grammars, differentiate between S-attributed and L-attributed definitions, and trace the evaluation of attributes on a parse tree. Mastery of these concepts is indispensable for solving complex problems involving semantic checking and intermediate code generation, which are recurring and high-value themes in the Compiler Design syllabus.
---
Chapter Contents
| # | Topic | What You'll Learn |
|---|-------|-------------------|
| 1 | Semantic Analysis | Verifying program meaning and language rules. |
---
Learning Objectives
After completing this chapter, you will be able to:
- Define Syntax-Directed Definitions (SDDs) and differentiate between synthesized and inherited attributes.
- Distinguish between S-attributed and L-attributed definitions and their respective evaluation orders.
- Construct dependency graphs and annotated parse trees for given grammar productions and input strings.
- Develop translation schemes to perform type checking and generate intermediate code.
---
We now turn our attention to Semantic Analysis, which constitutes the first and most fundamental application of the syntax-directed framework.
## Part 1: Semantic Analysis
Introduction
Following the parsing phase, a compiler has successfully verified that a source program is syntactically correct according to the rules of a context-free grammar. However, syntactic correctness does not guarantee that the program is meaningful or logically sound. For instance, a program might attempt to add an integer to a string, or use a variable before it has been declared. The detection of such errors falls under the purview of semantic analysis. This phase serves as a critical bridge between the structural analysis of the parser and the generation of machine-level code, ensuring that the program adheres to the language's semantic rules.
The primary framework for performing semantic analysis is Syntax-Directed Translation (SDT). This powerful technique systematically associates semantic rules, or actions, with the productions of a grammar. As the parser constructs the parse tree, these semantic actions are executed to compute attribute values, perform checks, and generate intermediate representations. We can view this process as annotating the parse tree with semantic information that flows up, down, and across its nodes, ultimately determining the overall meaning of the source program. The successful completion of this phase confirms that the program is not only syntactically well-formed but also semantically coherent.
A Syntax-Directed Definition (SDD) is a context-free grammar where each grammar symbol is associated with a set of attributes, and each production is associated with a set of semantic rules for computing the values of the attributes.
---
Key Concepts
#
## 1. Attributes: Synthesized and Inherited
Attributes are values associated with grammar symbols that carry semantic information. They can be of any type, such as numbers, strings, types, memory locations, or even data structures. We classify attributes into two distinct categories based on how their values are computed.
a) Synthesized Attributes
A synthesized attribute for a non-terminal is computed from the attribute values of its children in the parse tree. This corresponds to a flow of information upwards from the leaves of the tree towards the root. For a production , a semantic rule computing a synthesized attribute for will take the form .
Consider the parse tree for a production . A synthesized attribute is computed using attributes from its children , , and .
b) Inherited Attributes
An inherited attribute for a non-terminal is computed from the attribute values of its parent and/or its siblings in the parse tree. This corresponds to a flow of information downwards or sideways across the tree. For a production , a semantic rule computing an inherited attribute for child might take the form .
Inherited attributes are particularly useful for propagating contextual information, such as the data type in a declaration, down to the individual identifiers.
---
#
## 2. Classes of Syntax-Directed Definitions
Based on the types of attributes and the structure of semantic rules, we can classify SDDs into two important categories, which have significant implications for parsing and implementation.
a) S-Attributed Definitions
An SDD that uses only synthesized attributes is called an S-attributed definition. These are the simplest form of SDDs and are highly compatible with bottom-up parsing methods, such as LR parsing.
An SDD is S-attributed if every attribute it uses is a synthesized attribute. Semantic rules for a production can only compute attributes for the head non-terminal .
The evaluation of attributes in an S-attributed definition naturally follows a post-order traversal of the parse tree. Since a node's attribute value depends only on its children, we can compute it as soon as the attributes for all its children have been computed. This aligns perfectly with the reduction step in a bottom-up parser.
Worked Example: S-Attributed Expression Evaluator
Problem: Consider the following S-attributed grammar for binary number evaluation. Compute the value of the input string `1011`.
| Production | Semantic Rule |
|-----------------------|---------------------------------|
| | |
| | |
| | |
| | |
Solution:
First, we construct the parse tree for the input string `1011`. The grammar is left-recursive, so the parse tree will be left-leaning.
Step 1: Parse the input and build the tree. The string `1011` is parsed as .
Step 2: Evaluate attributes in a post-order traversal (bottom-up).
- The first `1` is reduced by . Let's call this .
- Next, `10` is parsed. This corresponds to the production .
- Next, `101` is parsed. This corresponds to .
- Finally, `1011` is parsed. This corresponds to .
Answer: The computed value for the string `1011` is .
b) L-Attributed Definitions
L-attributed definitions are a more general class that allows both synthesized and inherited attributes, but with a key restriction on how inherited attributes are computed. This class is designed to permit attribute evaluation in a single left-to-right, depth-first traversal of the parse tree.
An SDD is L-attributed if for each production , each inherited attribute of (for ) depends only on:
- The inherited attributes of the parent, .
- The attributes (synthesized or inherited) of its left siblings, .
Furthermore, any rule for a synthesized attribute of can depend on any attributes of its children.
Crucially, an inherited attribute for a symbol cannot depend on any attributes from its right siblings (). This "left-to-right" information flow makes L-attributed definitions suitable for top-down parsing methods.
Every S-attributed definition is, by definition, also an L-attributed definition, as it satisfies the conditions vacuously (there are no inherited attributes to violate the rules). However, not all L-attributed definitions are S-attributed.
Example Analysis: S-Attributed vs. L-Attributed
Let us consider three different semantic rules.
This rule only defines a synthesized attribute () based on its children. Thus, it is S-attributed (and therefore also L-attributed).
Here, the attribute is an inherited attribute of . It depends on an attribute of its left sibling, `type`. This follows the L-attributed rule. Since it uses an inherited attribute, it is not S-attributed but is L-attributed.
The rule for requires the synthesized attribute from its right sibling . This violates the L-attributed condition. Therefore, this definition is neither S-attributed nor L-attributed.
---
#
## 3. Application: Type Checking
One of the most important roles of semantic analysis is type checking. SDDs provide a formal mechanism to implement this. We associate a `type` attribute with non-terminals representing expressions and use semantic rules to enforce type compatibility.
Consider a simple language with integer and boolean types.
| Production | Semantic Rule |
|-----------------------------|----------------------------------------------------------------------------------------------------------------|
| | |
| | |
| | |
| | |
| | |
Here, the `add_type` function would update the symbol table during a declaration, and `lookup_type` would retrieve the type from the symbol table when an identifier is used in an expression. The rules for expressions check if the operands have the correct types for the given operator. Notice the rule for , which correctly looks up the type, avoiding the flaw seen in a similar PYQ where the type was hardcoded.
---
Problem-Solving Strategies
When faced with SDT problems in GATE, a systematic approach is essential.
For Attribute Evaluation Problems:
- Construct the Parse Tree: First, correctly parse the input string according to the given grammar. Pay close attention to operator precedence and associativity, which are dictated by the grammar's structure (e.g., left-recursion implies left-associativity).
- Identify Attribute Dependencies: Look at the semantic rules. Are the attributes synthesized or inherited? This determines the evaluation order.
- Perform Traversal and Evaluate: For S-attributed grammars, perform a post-order traversal of the parse tree. Compute the attributes for a node only after all its children's attributes are known. For L-attributed grammars, a depth-first, left-to-right traversal is needed.
- Annotate the Tree: As you compute values, write them next to the nodes on your rough parse tree sketch. This helps keep track of intermediate values and prevents calculation errors.
For S/L-Attributed Classification Problems:
- Scan for Inherited Attributes: Quickly scan all semantic rules. If you find even one inherited attribute (an attribute of a symbol on the RHS of a production), the SDD is not S-attributed.
- Check L-Attributed Conditions: If it is not S-attributed, examine each rule with inherited attributes. For a production , verify that any inherited attribute for only depends on attributes of and/or . If any rule violates this by depending on a right sibling ( where ), the SDD is not L-attributed.
- Conclusion: If it passes the L-attributed check, it's L-attributed. If it has no inherited attributes, it's S-attributed (and also L-attributed).
---
Common Mistakes
- β Incorrect Parse Tree Construction: Misinterpreting operator precedence or associativity from the grammar leads to an incorrect parse tree, which makes any subsequent attribute calculation wrong.
- β Confusing S- and L-Attributed Definitions: Assuming any SDD with inherited attributes is not L-attributed.
- β Right-to-Left Dependency Violation: In an L-attributed check, seeing a rule like and failing to recognize it as a violation.
- β Ignoring Terminal Attributes: Forgetting that terminals can also have synthesized attributes (e.g., `id.lexeme`, `num.val`) which are typically provided by the lexical analyzer.
---
Practice Questions
:::question type="MCQ" question="Consider the following syntax-directed definition:
S \rightarrow L = R \quad \{ \text{check_type}(L.type, R.type); \}
L \rightarrow L_1 [E] \quad \{ L.type = \text{element_type}(L_1.type); \quad E.type = \text{INT}; \}
Which of the following statements is true?" options=["The SDD is S-attributed.","The SDD is L-attributed but not S-attributed.","The SDD is neither S-attributed nor L-attributed.","The SDD contains a syntax error."] answer="The SDD is S-attributed." hint="Analyze the dependencies for each attribute. Does any attribute for a symbol on the RHS depend on its parent or siblings?" solution="Let's analyze the attributes.
- In , the semantic action is a procedure call, not an attribute assignment. We consider the attributes used. and are synthesized from the children.
- In , is a synthesized attribute depending on its child .
- In , is a synthesized attribute.
- In , is synthesized from its child . The rule also checks that is INT, but it does not define any inherited attributes. All information flows up the parse tree.
:::
:::question type="NAT" question="Consider the following SDD, where `val` is a synthesized attribute.
-- Note the typo in the rule
There is a deliberate error in one semantic rule. Assuming the parser uses this SDD to evaluate the expression `3 (4 + 2)`. What is the value computed for ?" answer="18" hint="The rule F has a typo: depends on itself. This creates an infinite recursion for the attribute calculation. However, for the given expression, this production is used as and is not used at the top level for the multiplication. Let's re-read the question. Ah, the typo is . This is a circular definition. Let's assume the intended rule was . Let's solve with the intended rule. Wait, the question asks what is computed with the given SDD*. Let's trace it carefully.
The expression is `3 * (4 + 2)`.
The parse tree structure is .
derives `3`. derives `(4 + 2)`.
Let's compute for `(4 + 2)` first.
. We need for `4+2`.
. derives `4`, derives `2`.
. .
. .
So, .
This means for `(4+2)` is .
Now for the top-level production .
. .
.
The rule is . This is a circular definition. . This equation only holds if . However, compilers would typically get stuck in a loop or use an uninitialized value. Let's reconsider the question's intent. It's more likely a typo in the question itself and should be . Let's assume this was the intended rule, as this is a common GATE pattern.
With the corrected rule :
.
.
Let's assume the question is malicious. If , then , which implies . If the value is 0, the answer is 0. But this is semantically nonsensical. Competitive exam questions usually test understanding, not obscure language-lawyer issues with circular definitions. The most logical interpretation is that the `T.val` on the RHS was a typo for `F.val`.
Let's stick to the most reasonable interpretation.
" solution="
Step 1: Assume the semantic rule for multiplication has a typo and the intended rule is . A circular definition like is not evaluable and would likely be flagged as an error or result in undefined behavior, which is not typically tested via a NAT question seeking a specific value.
Step 2: Construct the parse tree for `3 (4 + 2)`. The top-level production for the expression will be , and for the term, it will be F.
- will correspond to the sub-expression `3`.
- will correspond to the sub-expression `(4 + 2)`.
Step 3: Evaluate the attribute for the sub-expression `3`.
.
Step 4: Evaluate the attribute for the sub-expression `(4 + 2)`.
This corresponds to the production . We must find the value of for `4 + 2`.
The production for this is .
- corresponds to `4`. . So, .
- corresponds to `2`. . So, .
Therefore, for the sub-expression `(4 + 2)`, we have:
Step 5: Compute the final value using the top-level production .
Result:
The final value computed for the expression is .
"
:::
:::question type="MSQ" question="Which of the following statements about Syntax-Directed Definitions (SDDs) and their evaluation are correct?" options=["All S-attributed definitions can be evaluated by a bottom-up parser during parsing.","An L-attributed definition can have inherited attributes that depend on attributes of right siblings.","A top-down parser, like a recursive-descent parser, can evaluate any L-attributed definition.","Type checking is an application of synthesized attributes only."] answer="A,C" hint="Evaluate each statement based on the core definitions of S-attributed and L-attributed grammars and their relationship with parsing methods." solution="
- A. All S-attributed definitions can be evaluated by a bottom-up parser during parsing. This is correct. S-attributed definitions use only synthesized attributes. In a bottom-up parser, a reduction occurs only after the entire right-hand side of a production (and its subtrees) have been processed. At this point, the attributes of all children are available on the stack, and the parent's synthesized attribute can be computed.
- B. An L-attributed definition can have inherited attributes that depend on attributes of right siblings. This is incorrect. This is the primary restriction of L-attributed definitions. An inherited attribute of a symbol can only depend on its parent and its left siblings ().
- C. A top-down parser, like a recursive-descent parser, can evaluate any L-attributed definition. This is correct. The left-to-right dependency flow in L-attributed definitions is designed to work with the depth-first, left-to-right traversal naturally performed by top-down parsers. Inherited attributes can be passed down as arguments to the parsing functions for non-terminals.
- D. Type checking is an application of synthesized attributes only. This is incorrect. While synthesized attributes are used to pass the type of an expression up the tree (e.g., ), inherited attributes are very useful for passing contextual type information down the tree, for example, in declarations like , the type is passed to via an inherited attribute.
:::
---
Summary
- Semantic Analysis bridges Syntax and Code Generation. Its primary role is to enforce context-sensitive rules, such as type checking and scope resolution, which cannot be handled by CFGs alone.
- Attributes Define Meaning. SDDs augment grammars with attributes (synthesized for upward information flow, inherited for downward/sideways flow) and semantic rules that define how these attributes are computed.
- S-Attributed vs. L-Attributed is a Core Distinction.
- Evaluation Requires a Systematic Approach. For any SDT evaluation problem, the first and most critical step is to correctly build the parse tree based on the grammar's implicit precedence and associativity.
- S-Attributed: Only synthesized attributes. Evaluated naturally during bottom-up parsing.
- L-Attributed: Synthesized attributes plus inherited attributes with a strict left-to-right dependency. Evaluated naturally during top-down parsing. Every S-attributed SDD is also L-attributed.
---
What's Next?
This topic is fundamental to understanding the compiler's front-end. It connects directly to:
- Intermediate Code Generation: The semantic actions in an SDT are often used to produce an intermediate representation like Three-Address Code. The attributes computed here become the operands and results in that code.
- Symbol Table Management: The `lookup` and `add_type` actions we've seen are interfaces to the symbol table. A deeper study of how symbol tables are implemented (e.g., hash tables, scope stacks) is essential.
Mastering how information flows through the parse tree via attributes is key to solving a wide range of compiler design problems in GATE.
---
Chapter Summary
In our study of Syntax-Directed Translation, we have established the formal mechanisms for bridging the gap between parsing and code generation. The semantic analysis phase, guided by the syntactic structure of the source program, appends meaning to the constructs recognized by the parser. For success in the GATE examination, a firm grasp of the following core concepts is essential.
- Syntax-Directed Definitions (SDD) vs. Translation Schemes (SDT): We must distinguish between these two concepts. An SDD is a high-level specification that associates semantic rules with grammar productions, independent of implementation. An SDT, in contrast, is an implementation-focused mechanism that embeds semantic action code within the productions themselves, explicitly defining the order of evaluation.
- Synthesized and Inherited Attributes: Attributes are the primary carriers of semantic information. A synthesized attribute at a node is computed from the attributes of its children, passing information up the parse tree. An inherited attribute is computed from the attributes of its parent and/or siblings, passing information down or across the tree.
- S-Attributed and L-Attributed Definitions: These classifications are critical as they dictate the evaluation strategy. An S-Attributed Definition uses only synthesized attributes and is naturally suited for evaluation during a bottom-up parse (e.g., by an LR parser). An L-Attributed Definition is a more general form where an inherited attribute of a symbol on the right-hand side of a production can only depend on attributes of symbols to its left. This class is well-suited for evaluation by a top-down parser. Every S-Attributed SDD is also L-Attributed, but not vice-versa.
- Dependency Graphs: For a given parse tree, the dependency graph is a directed graph that visualizes the flow of information among attribute instances. A directed edge from attribute to attribute signifies that 's value depends on 's value. A topological sort of this graph yields a valid evaluation order for the attributes.
- Attribute Evaluation Order: The choice of evaluation order is tied to the parser. S-Attributed definitions can be evaluated by a post-order traversal of the parse tree. L-Attributed definitions can be evaluated by a depth-first, left-to-right traversal. Understanding this link between attribute type and parser capability is frequently tested.
- Application in Type Checking: A principal application of SDT is static type checking. Semantic rules are used to determine the type of each expression from the types of its subexpressions. The type information, stored as attributes in the parse tree nodes, is checked for compatibility according to the language's type system.
---
Chapter Review Questions
:::question type="MCQ" question="Consider the following Syntax-Directed Definition (SDD), where `s` denotes a synthesized attribute and `i` denotes an inherited attribute.
Grammar:
Semantic Rules:
Which of the following statements is TRUE regarding this SDD?" options=["The SDD is both S-Attributed and L-Attributed.", "The SDD is L-Attributed but not S-Attributed.", "The SDD is S-Attributed but not L-Attributed.", "The SDD is neither S-Attributed nor L-Attributed."] answer="B" hint="Analyze the dependencies for each inherited attribute. Recall the specific constraint for an L-Attributed definition concerning dependencies on siblings." solution="
Let us analyze the given SDD to classify it.
1. Check for S-Attributed property:
An SDD is S-Attributed if it uses only synthesized attributes. The given SDD uses both synthesized attributes (`s`) and inherited attributes (`i`). Therefore, it is not S-Attributed. This immediately eliminates options A and C.
2. Check for L-Attributed property:
An SDD is L-Attributed if, for any production , the inherited attributes of (for ) depend only on:
- The inherited attributes of .
- The attributes (synthesized or inherited) of the symbols (i.e., symbols to the left of ).
Let's examine the rules for the production :
- Rule for : . The inherited attribute of depends on the inherited attribute of the parent . This is valid under the L-Attributed definition.
- Rule for : . The inherited attribute of depends on a synthesized attribute of its left sibling, . This is also valid under the L-Attributed definition.
Since all rules adhere to the constraints, the SDD is L-Attributed.
Conclusion:
The SDD is L-Attributed but not S-Attributed. Thus, option B is the correct answer.
"
:::
:::question type="NAT" question="Consider the S-Attributed Definition for evaluating binary numbers, given by the following productions and semantic rules, where `val` is a synthesized attribute.
{ }
{ }
{ }
{ }
{ }
Using this SDD, compute the value for the input string `1101`." answer="13" hint="Trace the evaluation of the `val` attribute up the parse tree in a bottom-up fashion, corresponding to an LR parse of the string." solution="
We need to evaluate the `val` attribute for the input string `1101`. The evaluation follows the structure of a bottom-up parse. Let's trace the reductions and the corresponding semantic rule applications.
The input string is `1101`.
- Reduction: . Rule: .
- Reduction: . Rule: .
- The value of the prefix `1` is 1.
- Reduction: . Rule: .
- We now have (from prefix `1`) and (from current `1`).
- Reduction: . Rule: .
- The value of the prefix `11` is 3.
- Reduction: . Rule: .
- We now have (from prefix `11`) and (from current `0`).
- Reduction: . Rule: .
- The value of the prefix `110` is 6.
- Reduction: . Rule: .
- We now have (from prefix `110`) and (from current `1`).
- Reduction: . Rule: .
- The value of the string `1101` is 13.
Finally, the start symbol reduction sets .
The final computed value is 13.
"
:::
:::question type="MCQ" question="A syntax-directed translation scheme is given for type checking expressions:
The `lookup` function consults a symbol table with the following entries: `p: int`, `q: float`, `r: int`. What is the synthesized type attribute of the root of the parse tree for the expression `p + r + q`?" options=["int", "float", "error", "The type cannot be determined"] answer="C" hint="Evaluate the type attributes from left to right, following the left-associativity of the `+` operator. Check if the type rules are satisfied at each step." solution="
Let us trace the evaluation of the `type` attribute for the expression `p + r + q`. The `+` operator is left-associative, so the expression is parsed as `(p + r) + q`.
The parse tree structure for this expression is:
```
E
/|\
E + q
/|\
p + r
```
We evaluate the `type` attributes in a bottom-up (post-order) fashion.
Step 1: Evaluate the types of the leaves.
- For `p`: The rule is . The action is . From the symbol table, the type of `p` is `int`. Let's call this node . So, .
- For `r`: The rule is . The action is . From the symbol table, the type of `r` is `int`. Let's call this node . So, .
- For `q`: The rule is . The action is . From the symbol table, the type of `q` is `float`. Let's call this node . So, .
Step 2: Evaluate the type of the subexpression `(p + r)`.
- This corresponds to the production , where is and is .
- We have and .
- The semantic rule is: `{ if } E_1.type = \text{int and } E_2.type = \text{int then } E.type = \text{int else } E.type = \text{error} }`.
- Since both conditions are met, the type of the resulting node (let's call it ) is `int`. So, .
Step 3: Evaluate the type of the final expression `(p + r) + q`.
- This corresponds to the production , where is the node for `(p + r)` (i.e., ) and is the node for `q` (i.e., ).
- We have and .
- We apply the same semantic rule: `{ if } E_1.type = \text{int and } E_2.type = \text{int then } E.type = \text{int else } E.type = \text{error} }`.
- The condition `E_2.type = int` is false (since it is `float`).
- Therefore, the `else` part of the rule is executed, and the type of the root node is set to `error`.
The final synthesized type is `error`.
"
:::
---
What's Next?
Having completed Syntax-Directed Translation, you have established a firm foundation for the subsequent, more pragmatic phases of compilation. The abstract concepts of attribute grammars and semantic rules now pave the way for concrete implementation tasks.
Key connections to your learning path:
- Relation to Previous Chapters: This chapter is a direct extension of Parsing. We have seen that the parse tree, the primary output of the parsing phase, is not merely a validation of syntax but the very scaffold upon which we build meaning. The distinctions between LL and LR parsing, which might have seemed purely algorithmic, now reveal their profound impact on attribute evaluation strategies. S-Attributed grammars align perfectly with bottom-up LR parsers, while L-Attributed grammars are the natural companion to top-down LL parsers.
- Foundation for Future Chapters: The concepts mastered here are indispensable for the next stages of your study in Compiler Design: