#### Workshop on Essential Abstractions in GCC

# Machine Descriptions and Retargetability

GCC Resource Center (www.cse.iitb.ac.in/grc)

Department of Computer Science and Engineering, Indian Institute of Technology, Bombay



July 2009

- Influences on GCC Machine Descriptions
- Organization of GCC Machine Descriptions
- Machine description constructs
- The essence of retargetability in GCC
- Systematic construction of machine descriptions

#### Part 1

## **Examples of Influences on the Machine Descriptions**



```
int main ()
{
   int a;
   char b;
   float c;
   while (a != 0)
     a = b + sqr(c);
   return 0;
}
```















## GCC Architecture Influence on MD

- Standard pattern names with predefined semantics are used in MD
- New standard pattern names may be introduced (e.g. cbranch didn't exist in earlier versions)
- New MD constructs may be introduced
   (e.g. define\_predicate didn't exist in earlier versions)
- Macros may be added, removed, or changed in future versions

# **Target Influences on MD**

NOTE: Target System = Target ISA + Target System Software

Illustration of ISA Influence






### System Influences








#### Part 2

# Organization of GCC MD

# **GCC** Machine Descriptions

- Processor instructions useful to GCC
- Processor characteristics useful to GCC
- Target ASM syntax
- Target specific optimizations as IR-RTL → IR-RTL transformations (GCC code performs the transformation computations, MD supplies their target patterns)
  - Peephole optimizations
  - Transformations for enabling scheduling

#### Syntactic Entities in GCC MD

- Necessary Specifications
  - Processor instructions useful to GCC
    - One Gimple → One IR-RTL: define\_insn
    - One Gimple → More than one IR-RTL: define\_expand
  - Processor characteristics useful to GCC: define\_cpu\_unit
  - Target ASM syntax: part of define\_insn
  - IR-RTL → IR-RTL transformations: define\_split
  - Target Specific Optimizations: define\_peephole2
- Programming Conveniences (e.g. define\_insn\_and\_split, define\_constants, define\_cond\_exec, define\_automaton)


#### File Organization of GCC MD

## The GCC MD comprises

- <target>.h: A set of C macros that describe
  - HLL properties: e.g. INT\_TYPE\_SIZE (size of int in h/w bits)
  - Activation record structure
  - Target register (sub)sets and their characteristics (lists of read-only regs, dedicated regs, etc.)
  - System software details: formats of assembler, executable etc.
- <target>.md: Target instructions described using MD constructs. (Our main interest!)
- <target>.c: Optional, but usually required. C functions that implement target specific code (e.g. target specific activation layout).

#### Part 3

Essential Constructs in Machine Descriptions


# The GCC Phase Sequence

# Observe that

- RTL is a target specific IR
- GIMPLE → non-strict RTL → strict RTL
- SPN (Standard Pattern Name): "(Semantic) Glue" between GIMPLE and RTL
  - operator match + coarse operand match, and
  - refine the operand match
- Finally: Strict RTL ⇔ Unique target ASM string

#### Consider generating RTL expressions of GIMPLE nodes

Two constructs available: define\_insn and define\_expand

# **Running Example**

# Consider a data move operation

- reads data from source location, and
- writes it to the destination location.
- GIMPLE node: GIMPLE\_MODIFY\_STMT
- SPN: "movsi"

#### Some possible combinations are:

- Reg ← Reg : Register move
- Reg ← Mem : Load
- Reg ← Const : Load immediate
- Mem ← Reg : Store
- Mem ← Mem : Illegal instruction
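The combinations listed above can be written down as a small classification function. This is only an illustrative sketch in C: the names operand\_kind and classify\_move are hypothetical and are not part of GCC, and combinations that the list does not mention are simply treated as unsupported.

```c
#include <stddef.h>

/* Hypothetical operand kinds for illustration; these are not GCC's RTX codes. */
enum operand_kind { OP_REG, OP_MEM, OP_CONST };

/* Describe the move dst <- src, or return NULL when the combination is
   not one of the supported single instructions listed in the slide.  */
const char *
classify_move (enum operand_kind dst, enum operand_kind src)
{
  if (dst == OP_REG && src == OP_REG)
    return "register move";
  if (dst == OP_REG && src == OP_MEM)
    return "load";
  if (dst == OP_REG && src == OP_CONST)
    return "load immediate";
  if (dst == OP_MEM && src == OP_REG)
    return "store";
  /* Mem <- Mem is illegal; anything else is not listed and is assumed
     unsupported in this sketch.  */
  return NULL;
}
```

A machine description has to make sure that only the supported combinations ever reach the assembly output stage.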

# **Specifying Target Instruction Semantics**

```
(define_insn
   "movsi"
   (set
      (match_operand 0 "register_operand" "r")
      (match_operand 1 "const_int_operand" "k")
   )
   "" /* C boolean expression, if required */
   "li %0, %1"
)
```


# Instruction Specification and Translation

- GIMPLE: target independent
- RTL: target dependent
- Need: associate the semantics
- ⇒ GCC Solution: Standard Pattern Names

```
(define_insn "movsi"
   (set (match_operand 0 "register_operand" "r")
        (match_operand 1 "const_int_operand" "k"))
   "" /* C boolean expression, if required */
   "mov %0, %1"
)
```

The pattern serves both translations: Gimple → RTL and RTL → ASM.

# General Move Instruction

```
(define_insn "maybe_spn_like_movsi"
   (set (match_operand 0 "general_operand" "")
        (match_operand 1 "general_operand" ""))
   ""
   "mov %0, %1"
)
```

- This define\_insn can generate data movement patterns of all combinations
- Even Mem → Mem is possible.
- We need a mechanism to generate more restricted data movement RTX instances!

# The define\_expand Construct

```
(define_expand "movsi"
  [(set (match_operand:SI 0 "nonimmediate_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  ""
  {
    if (GET_CODE (operands[0]) == MEM &&
        GET_CODE (operands[1]) != REG)
      if (can_create_pseudo_p ())
        operands[1] = force_reg (SImode, operands[1]);
  }
)
```
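The effect of the preparation statements in the expander can be mimicked outside GCC. The following is a minimal sketch in C under stated assumptions: struct operand, force\_into\_reg and expand\_move are hypothetical stand-ins for GCC's rtx, force\_reg and the generated expander, and the pseudo register numbering is invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for RTL operands; not GCC's rtx type. */
enum kind { K_REG, K_MEM, K_CONST };
struct operand { enum kind kind; int value; };

static int next_pseudo = 100;  /* hypothetical pseudo register counter */
static int emitted_moves = 0;  /* number of extra "reg <- src" moves */

/* Stand-in for force_reg: copy the operand into a fresh pseudo register. */
static struct operand
force_into_reg (struct operand src)
{
  (void) src;                  /* a real compiler would emit reg <- src */
  emitted_moves++;
  struct operand reg = { K_REG, next_pseudo++ };
  return reg;
}

/* Mirrors the preparation statements of the movsi expander: a store whose
   source is not a register first has its source copied into a register. */
void
expand_move (struct operand ops[2], bool can_create_pseudo)
{
  if (ops[0].kind == K_MEM && ops[1].kind != K_REG)
    if (can_create_pseudo)
      ops[1] = force_into_reg (ops[1]);
}
```

With this preparation step a Mem ← Mem request becomes a load into a pseudo register followed by a store, so the illegal combination is never generated.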

# Relationship Between <target>.md, <target>.c, and <target>.h Files


#### Example:

- Register class constraints are used in <target>.md file
- Register class is defined in <target>.h file
- Checks for register class are implemented in <target>.c file

# Register Class Constraints in <target>.md File

```
;; Here z is the constraint character defined in
;; REG_CLASS_FROM_LETTER_P
;; The register $zero is used here.
(define_insn "IITB_move_zero"
   [(set
      (match_operand:SI 0 "nonimmediate_operand" "=r,m")
      (match_operand:SI 1 "zero_register_operand" "z,z")
   )]
   ""
   "@
   move \t%0,%1
   sw \t%1, %m0"
)
```

```
#define REG_CLASS_FROM_LETTER_P \
   reg_class_from_letter

enum reg_class
{
        NO_REGS,            ZERO_REGS,
        CALLER_SAVED_REGS,  CALLEE_SAVED_REGS,
        BASE_REGS,          GENERAL_REGS,
        ALL_REGS,           LIM_REG_CLASSES
};

#define REG_CLASS_CONTENTS \
{0x00000000, 0x00000001, 0x0200ff00, 0x00ff0000, \
 0xf2fffffc, 0xfffffffe, 0xffffffff}
```

The Register Class Enumeration and the Register Classes

```
enum reg_class
reg_class_from_letter (char ch)
{
  switch (ch)
    {
    case 'b': return BASE_REGS;
    case 'x': return CALLEE_SAVED_REGS;
    case 'y': return CALLER_SAVED_REGS;
    case 'z': return ZERO_REGS;
    }
  return NO_REGS;
}
```

Get the enumeration from the register class letter
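How the three files cooperate can be shown with a self-contained sketch in C. It assumes that each entry of REG\_CLASS\_CONTENTS is a 32-bit mask in which bit n stands for hard register n, and that the last two masks read 0xfffffffe and 0xffffffff. The helpers reg\_in\_class and reg\_satisfies\_letter are hypothetical names introduced here; they are not GCC functions.

```c
#include <stdbool.h>
#include <stdint.h>

enum reg_class
{
  NO_REGS, ZERO_REGS, CALLER_SAVED_REGS, CALLEE_SAVED_REGS,
  BASE_REGS, GENERAL_REGS, ALL_REGS, LIM_REG_CLASSES
};

/* One mask per class; bit n set means hard register n is a member
   (assumed encoding, with the mask values as read from the slide). */
static const uint32_t reg_class_contents[LIM_REG_CLASSES] =
{
  0x00000000, 0x00000001, 0x0200ff00, 0x00ff0000,
  0xf2fffffc, 0xfffffffe, 0xffffffff
};

/* Map a constraint letter to its register class. */
enum reg_class
reg_class_from_letter (char ch)
{
  switch (ch)
    {
    case 'b': return BASE_REGS;
    case 'x': return CALLEE_SAVED_REGS;
    case 'y': return CALLER_SAVED_REGS;
    case 'z': return ZERO_REGS;
    }
  return NO_REGS;
}

/* Hypothetical helper: is hard register REGNO a member of class CL? */
bool
reg_in_class (enum reg_class cl, unsigned regno)
{
  if (cl >= LIM_REG_CLASSES || regno >= 32)
    return false;
  return ((reg_class_contents[cl] >> regno) & 1u) != 0;
}

/* Hypothetical helper: does REGNO satisfy the constraint letter CH? */
bool
reg_satisfies_letter (char ch, unsigned regno)
{
  return reg_in_class (reg_class_from_letter (ch), regno);
}
```

Under these assumptions only register 0 satisfies the constraint z, which is why the IITB\_move\_zero pattern can rely on operand 1 being \$zero.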

### Part 5

Other Constructs in Machine Descriptions

### Defining Attributes

- Classifications are need based
- Useful to GCC phases, e.g. pipelining

Property: Pipelining

Need: To classify target instructions

Construct: define\_attr

```
;; Instruction type.
(define_attr "type"
   "other, multi, alu, alu1, negnot, ..., str, cld, ..."
   (const_string "other"))
```

Fields: attribute name, all possible values, and the default (one of the possible values).
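The role of the default value can be illustrated with a small lookup sketch in C. Everything here is hypothetical: the enumeration lists only a few of the attribute values, the insn name example\_string\_move is invented, and get\_type merely imitates what an attribute accessor has to do. It is not the code GCC generates.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of the values listed for the "type" attribute. */
enum attr_type { TYPE_OTHER, TYPE_MULTI, TYPE_ALU, TYPE_STR };

/* Hypothetical table of insns that set the attribute explicitly. */
struct insn_attr { const char *insn_name; enum attr_type type; };

static const struct insn_attr explicit_types[] =
{
  { "example_string_move", TYPE_STR },  /* invented insn name */
};

/* Return the "type" of an insn; insns that do not set the attribute
   receive the default, corresponding to (const_string "other"). */
enum attr_type
get_type (const char *insn_name)
{
  size_t n = sizeof explicit_types / sizeof explicit_types[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (explicit_types[i].insn_name, insn_name) == 0)
      return explicit_types[i].type;
  return TYPE_OTHER;
}
```

Because of the default, only the instructions that differ from "other" need to mention the attribute at all.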

### **Specifying Instruction Attributes**

Optional field of a define\_insn

For the i386, string instructions are marked with the attribute value str

```
(define_insn "*strmovdi_rex_1"
  [(set (mem:DI (match_operand:DI 2 ...)]
  "TARGET_64BIT && (TARGET_SINGLE_ ...)"
  "movsq"
  [ (set_attr "type" "str")
   (set_attr "memory" "both")])
```

### NOTE

An instruction may have more than one attribute!

# **Using Attributes**

```
(define_insn_reservation "pent_str" 12
  (and (eq_attr "cpu" "pentium")
       (eq_attr "type" "str") )
  "pentium-np*12")
```

This reservation applies to instructions whose cpu attribute is "pentium" and whose type attribute is "str"; 12 is the default latency of such instructions.
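The condition of the reservation is an ordinary boolean test over attribute values. A minimal sketch in C, assuming hypothetical enumerations for the two attributes and a hypothetical struct insn\_attrs that carries them, could read as follows. It only models the matching step, not the automaton that GCC builds from pipeline descriptions.

```c
#include <stdbool.h>

/* Hypothetical subsets of the "cpu" and "type" attribute values. */
enum attr_cpu { CPU_I386, CPU_PENTIUM };
enum attr_type { TYPE_OTHER, TYPE_STR };

/* Hypothetical record of the attribute values of one instruction. */
struct insn_attrs { enum attr_cpu cpu; enum attr_type type; };

/* Models (and (eq_attr "cpu" "pentium") (eq_attr "type" "str")). */
bool
pent_str_applies (struct insn_attrs a)
{
  return a.cpu == CPU_PENTIUM && a.type == TYPE_STR;
}

/* Latency taken from the "pent_str" reservation when it applies,
   or -1 when some other reservation would have to be consulted. */
int
pent_str_latency (struct insn_attrs a)
{
  return pent_str_applies (a) ? 12 : -1;
}
```

Each define\_insn\_reservation contributes one such condition, and the attribute values decide which reservation an instruction falls under.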

### Some Other RTL Constructs

- define\_split: Split a complex insn into simpler ones,
   e.g. for better use of delay slots
- define\_insn\_and\_split: A combination of define\_insn and define\_split.
   Used when the split pattern matches an insn exactly.
- define\_peephole: (Old) Peephole optimization over insns that substitutes target ASM text.
- define\_peephole2: (New) Peephole optimization over insns that substitutes insns. Run after register allocation, and before scheduling.
- define\_constants: Define symbolic names for literal constants used in the rest of the MD.

### Part 7

# The Essence of Retargetability


# Instruction Specification and Translation: A Recap

Parse → Gimplify → Tree SSA Optimize → Optimize RTL → Generate ASM

- GIMPLE: target independent
- RTL: target dependent
- Need: associate the *semantics*
- ⇒ GCC Solution: Standard Pattern Names

```
(define_insn "movsi"
   (set (match_operand 0 "register_operand" "r")
        (match_operand 1 "const_int_operand" "k"))
   "" /* C boolean expression, if required */
   "mov %0, %1"
)
```

The pattern serves both translations: Gimple → RTL and RTL → ASM (target dependent).



# Translation Sequence in GCC

```
(define_insn
    "movsi"
    (set
          (match_operand 0 "register_operand" "r")
          (match_operand 1 "const_int_operand" "k")
        )
    "" /* C boolean expression, if required */
    "li %0, %1"
)
```


- GIMPLE: D.1283 = 10;
- RTL: (set (reg:SI 58 [D.1283]) (const\_int 10 [0xa]))
- ASM: li \$t0, 10
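The last step of this sequence, from a matched pattern to assembly text, amounts to substituting operands into the output template. The following is a minimal sketch in C, assuming that only the %0 to %9 placeholders need handling; expand\_template is a hypothetical helper and not GCC's output routine, which supports many more template features.

```c
#include <string.h>

/* Expand %0..%9 in TMPL using the strings in OPERANDS and store the
   NUL-terminated result in OUT, truncating if OUT is too small.
   A '%' not followed by a valid operand digit is copied unchanged. */
void
expand_template (const char *tmpl, const char *const operands[],
                 size_t n_operands, char *out, size_t out_size)
{
  size_t used = 0;

  if (out_size == 0)
    return;

  for (const char *p = tmpl; *p != '\0' && used + 1 < out_size; p++)
    {
      if (p[0] == '%' && p[1] >= '0' && p[1] <= '9'
          && (size_t) (p[1] - '0') < n_operands)
        {
          const char *op = operands[p[1] - '0'];
          size_t len = strlen (op);
          if (len > out_size - 1 - used)
            len = out_size - 1 - used;   /* truncate to fit */
          memcpy (out + used, op, len);
          used += len;
          p++;                           /* skip the operand digit */
        }
      else
        out[used++] = *p;
    }
  out[used] = '\0';
}
```

Called with the template "li %0, %1" and the operand strings "\$t0" and "10", the sketch produces the assembly line of the example above.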


# The Essence of Retargetability

When are the machine descriptions read?

- During the build process
- When a program is compiled by gcc, the information gleaned from the machine descriptions is consulted











### Part 8

Systematic Construction of Machine Descriptions



**Phases of Compilation** 












- Define different levels of source language
- Identify the minimal information required in the machine description to support each level, ensuring
  - successful compilation of any program at that level, and
  - correct execution of the generated assembly program.
- Interesting observations
  - Increments in the source language, rather than increments in the target architecture, lead to understandable increments in machine descriptions.
  - If the levels are identified properly, the increments in machine descriptions are monotonic.

### Part 10

# Summary


- GCC achieves retargetability by reading the machine descriptions and generating a back end customised to the machine descriptions
- Machine descriptions are influenced by: The HLLs, GCC architecture, and properties of target, host and build systems
- Writing machine descriptions requires: specifying the C macros, target instructions and any required support functions
- define\_insn and define\_expand are used to convert a GIMPLE representation to RTL
- GCC machine descriptions can be constructed in a systematic manner
