Workshop on Essential Abstractions in GCC

# More Details of Machine Descriptions

GCC Resource Center (www.cse.iitb.ac.in/grc)

Department of Computer Science and Engineering, Indian Institute of Technology, Bombay



2 July 2012


- Some details of MD constructs
  - On names of patterns in .md files
  - On the role of define\_expand
  - On the role of predicates and constraints
  - Mode and code iterators
  - Defining attributes
  - Other constructs
- Improving machine descriptions and instruction selection
  - New constructs to factor out redundancy
  - Cost based tree tiling for instruction selection



### Part 1

# More Features




















#### Role of define\_expand





#### Using define\_expand for Generating RTL statements
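
As a minimal sketch of what a define\_expand can do, the fragment below shows a move expander for a hypothetical load/store target. It is not taken from any particular port, and the choice of predicates is an assumption. The C fragment runs at expansion time and may adjust the operands before the RTL template is emitted.

```
;; Hypothetical expander for the standard pattern name "movsi".
(define_expand "movsi"
  [(set (match_operand:SI 0 "nonimmediate_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  ""
{
  /* Assumed target restriction: a store needs its source in a register,
     so copy any other source into a fresh pseudo register first.  */
  if (MEM_P (operands[0]) && !register_operand (operands[1], SImode))
    operands[1] = force_reg (SImode, operands[1]);
  /* No DONE here: falling through emits the (set ...) template above
     with the possibly updated operands.  */
})
```

Ending the C fragment with DONE instead would suppress the template, leaving only the insns emitted explicitly by the C code.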



#### **Use of Predicates**

```
(define_insn "<name>"
  [(set (match_operand:SI 0 "general_operand" "=r")
        (plus:SI (match_dup 0)
                (match_operand:SI 1 "general_operand" "r")))]
  ""
  "...")
```

Predicates are used for matching operands

- For constructing an insn during expansion (`<name>` must be a standard pattern name)
- For recognizing an instruction (in subsequent RTL passes, including pattern matching)
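
Besides the generic predicates such as general\_operand, a port can define its own. The sketch below is an illustration only: the predicate name and the 8-bit range are hypothetical and not taken from any port.

```
;; Hypothetical predicate: accepts a register or a signed 8-bit constant.
;; Inside match_test, the operand being checked is available as "op".
(define_predicate "reg_or_small_int_operand"
  (ior (match_operand 0 "register_operand")
       (and (match_code "const_int")
            (match_test "IN_RANGE (INTVAL (op), -128, 127)"))))
```

Such a name can then be used in the predicate field of a match\_operand in place of "general\_operand".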








### **Understanding Constraints**

```
(define_insn "<name>"
  [(set (match_operand:SI 0 "general_operand" "=r")
        (plus:SI (match_dup 0)
               (match_operand:SI 1 "general_operand" "r")))]
  ""
  "...")
```














- Reloading operands in the most suitable register class
- Fine tuning within the set of operands allowed by the predicate
- If omitted, operands will depend only on the predicates
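
Constraint letters other than the generic ones (such as "r") are defined by the port. The following sketch assumes a register class named GENERAL\_REGS and picks the letters "d" and "I" only for illustration; neither definition is taken from the slides.

```
;; Hypothetical register constraint: "d" selects the class GENERAL_REGS.
(define_register_constraint "d" "GENERAL_REGS"
  "A general-purpose register.")

;; Hypothetical constant constraint: "I" accepts a signed 16-bit integer.
;; Inside match_test, the integer value is available as "ival".
(define_constraint "I"
  "A signed 16-bit constant."
  (and (match_code "const_int")
       (match_test "IN_RANGE (ival, -32768, 32767)")))
```

The predicate decides whether an operand is acceptable at all; the constraint then narrows the choice for each alternative of the pattern.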



### **Role of Constraints**

Consider the following two instruction patterns:

```
(define_insn ""
  [(set (match_operand:SI 0 "general_operand" "=r")
        (plus:SI (match_dup 0)
                 (match_operand:SI 1 "general_operand" "r")))]
  ""
  "...")
```

- During expansion, the destination and left operands must match the same predicate
- During recognition, the destination and left operands must be identical
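
Only one of the two patterns is reproduced here. A sketch of what the second one would look like, following the matching-constraint example in the GCC internals manual (an assumption about the intended pattern), uses the constraint "0" instead of match\_dup:

```
;; Assumed form of the second pattern: operand 1 is a separate operand
;; whose constraint "0" says it must end up in the same place as operand 0.
(define_insn ""
  [(set (match_operand:SI 0 "general_operand" "=r")
        (plus:SI (match_operand:SI 1 "general_operand" "0")
                 (match_operand:SI 2 "general_operand" "r")))]
  ""
  "...")
```

With this form the predicates accept any two source operands, and the requirement that the destination and the left operand coincide is expressed only through the constraint.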

- Consider an insn for recognition

```
(insn n prev next
  (set (reg:SI 3)
       (plus:SI (reg:SI 6) (reg:SI 109)))
  ...)
```

- Predicates of the first pattern do not match (because they require identical operands during recognition)
- Constraints do not match for operand 1 of the second pattern
- The reload pass generates an additional insn so that the first pattern can be used

```
(insn n2 prev n
  (set (reg:SI 3) (reg:SI 6))
  ...)
(insn n n2 next
  (set (reg:SI 3)
       (plus:SI (reg:SI 3) (reg:SI 109)))
  ...)
```

#### Part 2

# Factoring Out Common Information


# Handling Mode Differences

```
(define_insn "subsi3"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (minus:SI (match_operand:SI 1 "register_operand" "d")
                  (match_operand:SI 2 "register_operand" "d")))]
  ""
  "subu\t%0,%1,%2"
  [(set_attr "type" "arith")
   (set_attr "mode" "SI")])

(define_insn "subdi3"
  [(set (match_operand:DI 0 "register_operand" "=d")
        (minus:DI (match_operand:DI 1 "register_operand" "d")
                  (match_operand:DI 2 "register_operand" "d")))]
  ""
  "dsubu\t%0,%1,%2"
  [(set_attr "type" "arith")
   (set_attr "mode" "DI")])
```




### Mode Iterators: Abstracting Out Mode Differences
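
The subsi3 and subdi3 patterns differ only in the mode, the mnemonic prefix and the value of the mode attribute. A sketch of how a mode iterator can fold them into a single pattern follows; the iterator name GPR and the attribute name d are modeled on the MIPS port but should be read as assumptions here.

```
;; GPR ranges over the two integer modes handled above.
(define_mode_iterator GPR [SI DI])

;; <d> expands to "" for SI and to "d" for DI (subu vs. dsubu).
(define_mode_attr d [(SI "") (DI "d")])

;; One pattern now generates both subsi3 and subdi3:
;; <mode> is the lower-case mode name, <MODE> the upper-case one.
(define_insn "sub<mode>3"
  [(set (match_operand:GPR 0 "register_operand" "=d")
        (minus:GPR (match_operand:GPR 1 "register_operand" "d")
                   (match_operand:GPR 2 "register_operand" "d")))]
  ""
  "<d>subu\t%0,%1,%2"
  [(set_attr "type" "arith")
   (set_attr "mode" "<MODE>")])
```

An entry of the iterator can also carry a condition, for example (DI "TARGET\_64BIT"), if one of the modes should be available only on some configurations.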



# Handling Code Differences

```
(define_expand "bunordered"
  [(set (pc) (if_then_else (unordered:CC (cc0) (const_int 0))
                           (label_ref (match_operand 0 ""))
                           (pc)))]
  ""
  { mips_expand_conditional_branch (operands, UNORDERED);
    DONE;
  })

(define_expand "bordered"
  [(set (pc) (if_then_else (ordered:CC (cc0) (const_int 0))
                           (label_ref (match_operand 0 ""))
                           (pc)))]
  ""
  { mips_expand_conditional_branch (operands, ORDERED);
    DONE;
  })
```


## **Code Iterators: Abstracting Out Code Differences**
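
The bunordered and bordered expanders differ only in the comparison code. A sketch of how a code iterator can merge them is given below; the iterator name any\_cond is an assumption, while mips\_expand\_conditional\_branch is the function already used in the two expanders.

```
;; Hypothetical iterator over the two comparison codes used above.
(define_code_iterator any_cond [unordered ordered])

;; <code> expands to the lower-case code name and <CODE> to the upper-case
;; one, so this single definition yields "bunordered" and "bordered".
(define_expand "b<code>"
  [(set (pc) (if_then_else (any_cond:CC (cc0) (const_int 0))
                           (label_ref (match_operand 0 ""))
                           (pc)))]
  ""
  { mips_expand_conditional_branch (operands, <CODE>);
    DONE;
  })
```

Adding further comparison codes to the iterator list would generate the corresponding branch expanders without any new pattern text.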



### Part 3

# Miscellaneous Features



- Classifications are need based
- Useful to GCC phases, e.g. pipelining

```
;; Instruction type.
(define_attr "type"
    "other,multi,alu,alu1,negnot, ... str,cld, ..."
    (const_string "other"))
```

Fields: attribute name, all possible values, default value

| Property   | Need                            | Construct    |
|------------|---------------------------------|--------------|
| Pipelining | To classify target instructions | define\_attr |






13/38

# **Specifying Instruction Attributes**

- Optional field of a define\_insn
- For i386, we choose to mark string instructions with the attribute value str

```
(define_insn "*strmovdi_rex_1"
  [(set (mem:DI (match_operand:DI 2 ...)]
  "TARGET_64BIT && (TARGET_SINGLE_ ...)"
  "movsq"
  [ (set_attr "type" "str")
   ...
   (set_attr "memory" "both")])
```

### NOTE

An instruction may have more than one attribute!



# **Using Attributes**

Pipeline specification requires the CPU type to be "pentium" and the instruction type to be "str"
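
This requirement maps onto a define\_insn\_reservation whose condition tests both attributes. The sketch below is an illustration under stated assumptions: the automaton and unit names (pent, pent\_np), the reservation name and the latency of 12 cycles are hypothetical, and a "cpu" attribute with the value "pentium" is presumed to be defined elsewhere in the MD.

```
;; Hypothetical automaton with a single functional unit.
(define_automaton "pent")
(define_cpu_unit "pent_np" "pent")

;; Applies only when cpu is "pentium" AND type is "str".
;; 12 is an illustrative latency; the unit is reserved for 12 cycles.
(define_insn_reservation "pent_str" 12
  (and (eq_attr "cpu" "pentium")
       (eq_attr "type" "str"))
  "pent_np*12")
```

Every insn whose attributes satisfy the condition is scheduled with this latency and unit reservation.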



### Some Other RTL Constructs

- define\_split: Splits a complex insn into simpler ones, e.g. for better use of delay slots
- define\_insn\_and\_split: A combination of define\_insn and define\_split, used when the split pattern matches an insn exactly
- **define\_peephole2**: Peephole optimization that replaces a sequence of insns with another sequence; run after register allocation and before scheduling
- define\_constants: Gives names to literal constants so that the names can be used in the rest of the MD
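
As a small sketch of the last construct, the fragment below names a register number once and uses the name afterwards. The constant name and the value 31 are hypothetical.

```
;; Hypothetical named constant for a register number.
(define_constants
  [(LINK_REGNUM 31)])

;; The name can then stand in for the number inside RTL templates, e.g.
;;   (clobber (reg:SI LINK_REGNUM))
```

The names are also made available to the C code of the port as macros, so the number is written in one place only.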



#### Part 4

# Machine Descriptions in specRTL





## The Need for Improving Machine Descriptions

The Problems:

- The specification mechanism for machine descriptions is quite ad hoc
  - Only the syntax is borrowed from LISP, neither the semantics nor the spirit!
  - Non-composable rules
  - Mode and code iterator mechanisms are insufficient
- Ad hoc design decisions
  - Honouring operand constraints is delayed until global register allocation
  - During GIMPLE to RTL translation, a lot of C code is required
  - Choice of insertion of NOPs



# Handling Constraints

- define\_insn patterns have operand predicates and constraints
- While generating an RTL insn from GIMPLE, only the predicates are checked; the constraints are completely ignored
- An insn generated by the expander is modified in the reload pass to satisfy the constraints
- It may be possible to generate this final form of RTL during expansion by honouring constraints
  - Honouring constraints earlier than at present
  - This may get rid of some C code in define\_expand



# **Design Flaws in Machine Descriptions**

Multiple patterns with same structure

- Repetition of nearly identical RTL expressions across multiple define\_insn and define\_expand patterns
  - Some modes, predicates, constraints, boolean conditions, or RTL expressions may differ while everything else is identical
  - One RTL expression may appear as a sub-expression of another RTL expression
- Repetition of C code along with RTL expressions in these patterns



# Redundancy in MIPS Machine Descriptions: Example 1

RTL Template:

[(set (match\_operand:<u>m</u> 0 "register\_operand" "<u>c0</u>") (plus:<u>m</u> (match\_operand:<u>m</u> 1 "register\_operand" "<u>c1</u>") (match\_operand:<u>m</u> 2 "<u>p</u>" "<u>c2</u>")))]



#### Details

| Pattern name                | <u>m</u> | <u>p</u>         | <u>c0</u> | <u>c1</u> | <u>c2</u> |
|-----------------------------|----------|------------------|-----------|-----------|-----------|
| define_insn `add<mode>3`    | ANYF     | register_operand | =f        | f         | f         |
| define_expand `add<mode>3`  | GPR      | arith_operand    |           |           |           |
| define_insn `*add<mode>3`   | GPR      | arith_operand    | =d,d      | d,d       | d,Q       |
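
Plugging the last row of the table into the template gives the RTL template of the corresponding pattern. The sketch below substitutes only the values listed in the table; the insn condition and the output template are not given in this section and are left elided.

```
;; m = GPR, p = arith_operand, c0 = "=d,d", c1 = "d,d", c2 = "d,Q"
(define_insn "*add<mode>3"
  [(set (match_operand:GPR 0 "register_operand" "=d,d")
        (plus:GPR (match_operand:GPR 1 "register_operand" "d,d")
                  (match_operand:GPR 2 "arith_operand" "d,Q")))]
  "..."   ; condition elided
  "...")  ; output template elided
```

The other two rows produce the same tree with different modes, predicates and constraints, which is exactly the repetition being pointed out.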



# Redundancy in MIPS Machine Descriptions: Example 2

RTL Template:

[(set (match\_operand:<u>m</u> 0 "register\_operand" "<u>c0</u>") (mult:<u>m</u> (match\_operand:<u>m</u> 1 "register\_operand" "<u>c1</u>") (match\_operand:<u>m</u> 2 "register\_operand" "<u>c2</u>")))]



#### Details

| Pattern name                           | <u>m</u> | <u>c0</u> | <u>c1</u> | <u>c2</u> |
|----------------------------------------|----------|-----------|-----------|-----------|
| define_insn `*mul<mode>3`              | SCALARF  | =f        | f         | f         |
| define_insn `*mul<mode>3_r4300`        | SCALARF  | =f        | f         | f         |
| define_insn `mulv2sf3`                 | V2SF     | =f        | f         | f         |
| define_expand `mul<mode>3`             | GPR      |           |           |           |
| define_insn `mul<mode>3_mul3_loongson` | GPR      | =d        | d         | d         |
| define_insn `mul<mode>3_mul3`          | GPR      | d,l       | d,d       | d,d       |


# Redundancy in MIPS Machine Descriptions: Example 3

RTL Template:

[(set (match\_operand:<u>m</u> 0 "register\_operand" "<u>c0</u>") (plus:<u>m</u> (mult:<u>m</u> (match\_operand:<u>m</u> 1 "register\_operand" "<u>c1</u>") (match\_operand:<u>m</u> 2 "register\_operand" "<u>c2</u>")) (match\_operand:<u>m</u> 3 "register\_operand" "<u>c3</u>")))]



| Pattern name         | <u>m</u> | <u>c0</u>         | <u>c1</u> | <u>c2</u> | <u>c3</u> |
|----------------------|----------|-------------------|-----------|-----------|-----------|
| `*mul_acc_si`        | SI       | `=l*?*?,d?`       | d,d       | d,d       | 0,d       |
| `*mul_acc_si_r3900`  | SI       | `=l*?*?,d*?,d?`   | d,d,d     | d,d,d     | 0,l,d     |
| `*macc`              | SI       | =l,d              | d,d       | d,d       | 0,l       |
| `*madd4<mode>`       | ANYF     | =f                | f         | f         | f         |
| `*madd3<mode>`       | ANYF     | =f                | f         | f         | 0         |
# **Insufficient Iterator Mechanism**

- Iterators cannot be used across define\_insn, define\_expand, define\_peephole2 and other kinds of patterns
- Defining an iterator attribute for each varying parameter becomes tedious
- For the same set of modes and rtx codes, a change in any other field of the pattern makes the use of iterators impossible
- Mode and code attributes cannot be defined for the operator, the operand number, the name of the pattern, etc.
- Patterns with different RTL templates may share an attribute value vector, but iterators cannot be used for them



## Many Similar Patterns Cannot be Combined

```
(define_expand "iordi3"
  [(set (match_operand:DI 0 "nonimmediate_operand" "")
        (ior:DI (match_operand:DI 1 "nonimmediate_operand" "")
                (match_operand:DI 2 "x86_64_general_operand" "")))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_64BIT"
  "ix86_expand_binary_operator (IOR, DImode, operands); DONE;")

(define_insn "*iordi_1_rex64"
  [(set (match_operand:DI 0 "nonimmediate_operand" "=rm,r")
        (ior:DI (match_operand:DI 1 "nonimmediate_operand" "%0,0")
                (match_operand:DI 2 "x86_64_general_operand" "re,rme")))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_64BIT
   && ix86_binary_operator_ok (IOR, DImode, operands)"
  "or{q}\t{%2, %0|%0, %2}"
  [(set_attr "type" "alu")
   (set_attr "mode" "DI")])
```



# Measuring Redundancy in RTL Templates

| MD File | Total number of patterns | Number of primitive trees | Number of times primitive trees are used to create composite trees |
|---------|--------------------------|---------------------------|--------------------------------------------------------------------|
| i386.md | 1303                     | 349                       | 4308                                                               |
| arm.md  | 534                      | 232                       | 1369                                                               |
| mips.md | 337                      | 147                       | 921                                                                |



# specRTL: Key Observations

- Davidson-Fraser insight
  - Register transfers are target specific but their form is target independent
- GCC's approach
  - Use target independent RTL for machine specification
  - Generate expander and recognizer by reading machine descriptions
- Main problem with GCC's approach
  - Although the shapes of RTL statements are target independent, they have to be provided in RTL templates
- Our key idea
  - Separate shapes of RTL statements from the target specific details



#### Specification Goals of specRTL

Support all of the following

- Separation of shapes from target specific details
- Creation of new shapes by composing shapes
- Associating concrete details with shapes
- Overriding concrete details



# Software Engineering Goals of specRTL

- Allow non-disruptive migration for existing machine descriptions
  - Incremental changes
  - No need to change GCC source until we are sure of the new specification

GCC must remain usable after each small change made in the machine descriptions






# Meeting the Specification Goals: Key Idea

- Separation of shapes from target specific details:
  - Shape $\equiv$ tree structure of RTL templates
  - Details $\equiv$ attributes of tree nodes (e.g. modes, predicates, constraints)
- Abstract patterns and concrete patterns
  - Abstract patterns are shapes with "holes" in them that represent missing information
  - Concrete patterns are shapes in which all holes are plugged using target specific information
- Abstract patterns capture *shapes* which can be concretized by providing details






# Meeting the Specification Goals: Operations

- Creating new shapes by composing shapes: extends
- Associating concrete details with shapes: instantiates
- Overriding concrete details: overrides



# **Properties of Operations**

| Operation    | Base<br>pattern | Derived pattern | Nodes<br>influenced | Can change |
|--------------|-----------------|-----------------|---------------------|------------|
| extends      | Abstract        | Abstract        | Leaf nodes          | Structure  |
| instantiates | Abstract        | Concrete        | All nodes           | Attributes |
| overrides    | Abstract        | Abstract        | Internal nodes      | Attributes |
|              | Concrete        | Concrete        | All nodes           | Attributes |



#### **Creating Abstract Patterns**






#### **Creating Concrete Patterns**







# **Generating Conventional Machine Descriptions**



{: /\* Conventional Machine Description Fragments \*/ :}



abstract set\_plus extends set
{ root.2 = plus; }

Resulting shape: root (=) has children root.1 and root.2 (+); root.2 has children root.2.1 and root.2.2.






#### **Overriding Details**



concrete \*add<mode>3.insn overrides add<mode>3.expand
{ allconstraints = ("=d,d", "d,d", "d,Q"); }






#### Some More Examples

Omitting conventional MD fragments

concrete \*mul<mode>3\_r4300.insn overrides \*mul<mode>3.insn
{}
concrete mulv2sf3 overrides \*mul<mode>3.insn
{ SCALARF -> V2SF; }




# Part 5

# Conclusions


#### **Current Status and Plans for Future Work**

- specRTL compiler is ready
- Many of the i386 instructions and all spim instructions have been rewritten
- We invite more people to try out specRTL in writing other descriptions



#### Conclusions

- Separating shapes from concrete details is very helpful
- It may be possible to identify a large number of common shapes
- Machine descriptions may become much smaller: only the concrete details need to be specified
- Non-disruptive and incremental migration to new machine descriptions
- GCC source need not change until these machine descriptions have been found useful

