Info file gcc.info, produced by Makeinfo, -*- Text -*- from input
file gcc.texinfo.
This file documents the use and the internals of the GNU compiler.
Copyright (C) 1988, 1989, 1990 Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.
Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided also
that the sections entitled "GNU General Public License" and "Protect
Your Freedom--Fight `Look And Feel'" are included exactly as in the
original, and provided that the entire resulting derived work is
distributed under the terms of a permission notice identical to this
one.
Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that the sections entitled "GNU General Public
License" and "Protect Your Freedom--Fight `Look And Feel'" and this
permission notice may be included in translations approved by the
Free Software Foundation instead of in the original English.
File: gcc.info, Node: Sharing, Prev: Calls, Up: RTL
Structure Sharing Assumptions
=============================
The compiler assumes that certain kinds of RTL expressions are
unique; there do not exist two distinct objects representing the same
value. In other cases, it makes an opposite assumption: that no RTL
expression object of a certain kind appears in more than one place in
the containing structure.
These assumptions refer to a single function; except for the RTL
objects that describe global variables and external functions, no RTL
objects are common to two functions.
* Each pseudo-register has only a single `reg' object to represent
it, and therefore only a single machine mode.
* For any symbolic label, there is only one `symbol_ref' object
referring to it.
* There is only one `const_int' expression with value zero, and
only one with value one.
* There is only one `pc' expression.
* There is only one `cc0' expression.
* There is only one `const_double' expression with mode `SFmode'
and value zero, and only one with mode `DFmode' and value zero.
* No `label_ref' appears in more than one place in the RTL
structure; in other words, it is safe to do a tree-walk of all
the insns in the function and assume that each time a
`label_ref' is seen it is distinct from all others that are seen.
* Only one `mem' object is normally created for each static
variable or stack slot, so these objects are frequently shared
in all the places they appear. However, separate but equal
objects for these variables are occasionally made.
* When a single `asm' statement has multiple output operands, a
distinct `asm_operands' RTX is made for each output operand.
However, these all share the vector which contains the sequence
of input operands. Because this sharing is used later on to
test whether two `asm_operands' RTX's come from the same
statement, the sharing must be guaranteed to be preserved.
* No RTL object appears in more than one place in the RTL
structure except as described above. Many passes of the
compiler rely on this by assuming that they can modify RTL
objects in place without unwanted side-effects on other insns.
* During initial RTL generation, shared structure is freely
introduced. After all the RTL for a function has been
generated, all shared structure is copied by `unshare_all_rtl'
in `emit-rtl.c', after which the above rules are guaranteed to
be followed.
* During the combiner pass, shared structure with an insn can
exist temporarily. However, the shared structure is copied
before the combiner is finished with the insn. This is done by
`copy_substitutions' in `combine.c'.
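The copying step described above can be pictured with a toy model. The following C sketch is not the code of `unshare_all_rtl'; it assumes a hypothetical `struct node' with at most two operands and a `used' flag, and it ignores the kinds of expression (such as `reg' and `symbol_ref') that are required to stay shared.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node; this is NOT the real layout of an rtx.  */
struct node {
    int code;             /* hypothetical expression code */
    struct node *op[2];   /* operands, NULL if absent */
    int used;             /* set once the node has been reached */
};

/* Allocate a fresh, unmarked node.  */
static struct node *make_node(int code, struct node *a, struct node *b)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->code = code;
    n->op[0] = a;
    n->op[1] = b;
    n->used = 0;
    return n;
}

/* Return X itself the first time it is reached and a private copy
   every later time, so that no node ends up in two places.  */
static struct node *copy_if_shared(struct node *x)
{
    int i;

    if (x == NULL)
        return NULL;
    if (x->used)
        x = make_node(x->code, x->op[0], x->op[1]);
    x->used = 1;
    /* The operands of a copy are already marked, so they are
       copied in turn; the result is a deep copy.  */
    for (i = 0; i < 2; i++)
        x->op[i] = copy_if_shared(x->op[i]);
    return x;
}
```

After one walk over an expression, every node in it is reachable through exactly one parent, which is the property that lets later passes modify nodes in place.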
File: gcc.info, Node: Machine Desc, Next: Machine Macros, Prev: RTL, Up: Top
Machine Descriptions
********************
A machine description has two parts: a file of instruction
patterns (`.md' file) and a C header file of macro definitions.
The `.md' file for a target machine contains a pattern for each
instruction that the target machine supports (or at least each
instruction that is worth telling the compiler about). It may also
contain comments. A semicolon causes the rest of the line to be a
comment, unless the semicolon is inside a quoted string.
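The comment rule can be sketched in C as follows. This is an illustration only, not the reader used by the compiler: the function name `strip_md_comment' is hypothetical, and the sketch assumes that a quoted string begins and ends on the same line and that `\' inside a string escapes the next character.

```c
#include <assert.h>
#include <stddef.h>

/* Truncate LINE at the first semicolon that lies outside a
   double-quoted string.  Return the new length of LINE.  */
static size_t strip_md_comment(char *line)
{
    int in_string = 0;
    size_t i;

    assert(line != NULL);
    for (i = 0; line[i] != '\0'; i++) {
        char c = line[i];
        if (in_string && c == '\\' && line[i + 1] != '\0') {
            i++;                    /* skip the escaped character */
        } else if (c == '"') {
            in_string = !in_string;
        } else if (c == ';' && !in_string) {
            line[i] = '\0';         /* the comment starts here */
            break;
        }
    }
    return i;
}
```

A semicolon inside an output template, as in `"tstl %0\;clrl %1"', is therefore left alone, while a semicolon after the closing quote starts a comment.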
See the next chapter for information on the C header file.
* Menu:
* Patterns:: How to write instruction patterns.
* Example:: An explained example of a `define_insn' pattern.
* RTL Template:: The RTL template defines what insns match a pattern.
* Output Template:: The output template says how to make assembler code
from such an insn.
* Output Statement:: For more generality, write C code to output
the assembler code.
* Constraints:: When not all operands are general operands.
* Standard Names:: Names mark patterns to use for code generation.
* Pattern Ordering:: When the order of patterns makes a difference.
* Dependent Patterns:: Having one pattern may make you need another.
* Jump Patterns:: Special considerations for patterns for jump insns.
* Peephole Definitions:: Defining machine-specific peephole optimizations.
* Expander Definitions:: Generating a sequence of several RTL insns
for a standard operation.
File: gcc.info, Node: Patterns, Next: Example, Prev: Machine Desc, Up: Machine Desc
Everything about Instruction Patterns
=====================================
Each instruction pattern contains an incomplete RTL expression,
with pieces to be filled in later, operand constraints that restrict
how the pieces can be filled in, and an output pattern or C code to
generate the assembler output, all wrapped up in a `define_insn'
expression.
A `define_insn' is an RTL expression containing four or five
operands:
1. An optional name. The presence of a name indicates that this
instruction pattern can perform a certain standard job for the
RTL-generation pass of the compiler. This pass knows certain
names and will use the instruction patterns with those names, if
the names are defined in the machine description.
The absence of a name is indicated by writing an empty string
where the name should go. Nameless instruction patterns are
never used for generating RTL code, but they may permit several
simpler insns to be combined later on.
Names that are not thus known and used in RTL-generation have
no effect; they are equivalent to no name at all.
2. The "RTL template" (*note RTL Template::.) is a vector of
incomplete RTL expressions which show what the instruction
should look like. It is incomplete because it may contain
`match_operand' and `match_dup' expressions that stand for
operands of the instruction.
If the vector has only one element, that element is the
template for the instruction pattern. If the vector has
multiple elements, then the instruction pattern is a `parallel'
expression containing the elements described.
3. A condition. This is a string which contains a C expression
that is the final test to decide whether an insn body matches
this pattern.
For a named pattern, the condition (if present) may not
depend on the data in the insn being matched, but only the
target-machine-type flags. The compiler needs to test these
conditions during initialization in order to learn exactly which
named instructions are available in a particular run.
For nameless patterns, the condition is applied only when
matching an individual insn, and only after the insn has matched
the pattern's recognition template. The insn's operands may be
found in the vector `operands'.
4. The "output template": a string that says how to output matching
insns as assembler code. `%' in this string specifies where to
substitute the value of an operand. *Note Output Template::.
When simple substitution isn't general enough, you can
specify a piece of C code to compute the output. *Note Output
Statement::.
5. Optionally, some "machine-specific information". The meaning of
this information is defined only by an individual machine
description; typically it might say whether this insn alters the
condition codes, or how many bytes of output it generates.
This operand is written as a string containing a C
initializer (complete with braces) for the structure type
`INSN_MACHINE_INFO', whose definition is up to you (*note
Misc::.).
File: gcc.info, Node: Example, Next: RTL Template, Prev: Patterns, Up: Machine Desc
Example of `define_insn'
========================
Here is an actual example of an instruction pattern, for the
68000/68020.
     (define_insn "tstsi"
       [(set (cc0)
             (match_operand:SI 0 "general_operand" "rm"))]
       ""
       "*
     { if (TARGET_68020 || ! ADDRESS_REG_P (operands[0]))
         return \"tstl %0\";
       return \"cmpl #0,%0\"; }")
This is an instruction that sets the condition codes based on the
value of a general operand. It has no condition, so any insn whose
RTL description has the form shown may be handled according to this
pattern. The name `tstsi' means "test a `SImode' value" and tells
the RTL generation pass that, when it is necessary to test such a
value, an insn to do so can be constructed using this pattern.
The output control string is a piece of C code which chooses which
output template to return based on the kind of operand and the
specific type of CPU for which code is being generated.
`"rm"' is an operand constraint. Its meaning is explained below.
File: gcc.info, Node: RTL Template, Next: Output Template, Prev: Example, Up: Machine Desc
RTL Template for Generating and Recognizing Insns
=================================================
The RTL template is used to define which insns match the
particular pattern and how to find their operands. For named
patterns, the RTL template also says how to construct an insn from
specified operands.
Construction involves substituting specified operands into a copy
of the template. Matching involves determining the values that serve
as the operands in the insn being matched. Both of these activities
are controlled by special expression types that direct matching and
substitution of the operands.
`(match_operand:M N PRED CONSTRAINT)'
This expression is a placeholder for operand number N of the
insn. When constructing an insn, operand number N will be
substituted at this point. When matching an insn, whatever
appears at this position in the insn will be taken as operand
number N; but it must satisfy PRED or this instruction pattern
will not match at all.
Operand numbers must be chosen consecutively counting from zero
in each instruction pattern. There may be only one
`match_operand' expression in the pattern for each operand
number. Usually operands are numbered in the order of
appearance in `match_operand' expressions.
PRED is a string that is the name of a C function that accepts
two arguments, an expression and a machine mode. During
matching, the function will be called with the putative operand
as the expression and M as the mode argument. If it returns
zero, this instruction pattern fails to match. PRED may be an
empty string; then it means no test is to be done on the
operand, so anything which occurs in this position is valid.
CONSTRAINT controls reloading and the choice of the best
register class to use for a value, as explained later (*note
Constraints::.).
People are often unclear on the difference between the
constraint and the predicate. The predicate helps decide
whether a given insn matches the pattern. The constraint plays
no role in this decision; instead, it controls various decisions
in the case of an insn which does match.
Most often, PRED is `"general_operand"'. This function checks
that the putative operand is either a constant, a register or a
memory reference, and that it is valid for mode M.
For an operand that must be a register, PRED should be
`"register_operand"'. It would be valid to use
`"general_operand"', since the reload pass would copy any
non-register operands through registers, but this would make GNU
CC do extra work, and it would prevent the register allocator
from doing the best possible job.
For an operand that must be a constant, either PRED should be
`"immediate_operand"', or the instruction pattern's extra
condition should check for constants, or both. You cannot
expect the constraints to do this work! If the constraints
allow only constants, but the predicate allows something else,
the compiler will crash when that case arises.
`(match_dup N)'
This expression is also a placeholder for operand number N. It
is used when the operand needs to appear more than once in the
insn.
In construction, `match_dup' behaves exactly like
`match_operand': the operand is substituted into the insn being
constructed. But in matching, `match_dup' behaves differently.
It assumes that operand number N has already been determined by
a `match_operand' appearing earlier in the recognition template,
and it matches only an identical-looking expression.
`(match_operator:M N "PREDICATE" [OPERANDS...])'
This pattern is a kind of placeholder for a variable RTL
expression code.
When constructing an insn, it stands for an RTL expression whose
expression code is taken from that of operand N, and whose
operands are constructed from the patterns OPERANDS.
When matching an expression, it matches an expression if the
function PREDICATE returns nonzero on that expression *and* the
patterns OPERANDS match the operands of the expression.
Suppose that the function `commutative_operator' is defined as
follows, to match any expression whose operator is one of the
six commutative arithmetic operators of RTL and whose mode is
MODE:
     int
     commutative_operator (x, mode)
          rtx x;
          enum machine_mode mode;
     {
       enum rtx_code code = GET_CODE (x);
       if (GET_MODE (x) != mode)
         return 0;
       return (code == PLUS || code == MULT || code == UMULT
               || code == AND || code == IOR || code == XOR);
     }
Then the following pattern will match any RTL expression
consisting of a commutative operator applied to two general
operands:
     (match_operator:SI 2 "commutative_operator"
       [(match_operand:SI 3 "general_operand" "g")
        (match_operand:SI 4 "general_operand" "g")])
Here the vector `[OPERANDS...]' contains two patterns because
the expressions to be matched all contain two operands.
When this pattern does match, the two operands of the
commutative operator are recorded as operands 3 and 4 of the
insn. (This is done by the two instances of `match_operand'.)
Operand 2 of the insn will be the entire commutative expression:
use `GET_CODE (operands[2])' to see which commutative operator
was used.
The machine mode M of `match_operator' works like that of
`match_operand': it is passed as the second argument to the
predicate function, and that function is solely responsible for
deciding whether the expression to be matched "has" that mode.
When constructing an insn, argument 2 of the gen-function will
specify the operation (i.e. the expression code) for the
expression to be made. It should be an RTL expression, whose
expression code is copied into a new expression whose operands
are arguments 3 and 4 of the gen-function. The subexpressions
of argument 2 are not used; only its expression code matters.
There is no way to specify constraints in `match_operator'. The
operand of the insn which corresponds to the `match_operator'
never has any constraints because it is never reloaded as a whole.
However, if parts of its OPERANDS are matched by `match_operand'
patterns, those parts may have constraints of their own.
`(address (match_operand:M N "address_operand" ""))'
This complex of expressions is a placeholder for an operand
number N in a "load address" instruction: an operand which
specifies a memory location in the usual way, but for which the
actual operand value used is the address of the location, not
the contents of the location.
`address' expressions never appear in RTL code, only in machine
descriptions. And they are used only in machine descriptions
that do not use the operand constraint feature. When operand
constraints are in use, the letter `p' in the constraint serves
this purpose.
M is the machine mode of the *memory location being addressed*,
not the machine mode of the address itself. That mode is always
the same on a given target machine (it is `Pmode', which
normally is `SImode'), so there is no point in mentioning it;
thus, no machine mode is written in the `address' expression.
If some day support is added for machines in which addresses of
different kinds of objects appear differently or are used
differently (such as the PDP-10), different formats would
perhaps need different machine modes and these modes might be
written in the `address' expression.
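The difference between `match_operand' and `match_dup' during matching can be sketched with a toy matcher in C. Everything here is an assumption made for illustration: `struct expr', the `enum code' values and the function names are hypothetical, predicates and machine modes are left out, and the real recognizer is generated from the machine description rather than written like this.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPERANDS 4

/* Toy expression and pattern codes; hypothetical, not GCC's.  */
enum code { REG, PLUS, SET, MATCH_OPERAND, MATCH_DUP };

struct expr {
    enum code code;
    int value;                  /* register number, or operand number N */
    const struct expr *op[2];   /* operands, NULL if absent */
};

/* Nonzero if A and B are identical-looking expressions.  */
static int same_expr(const struct expr *a, const struct expr *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return a->code == b->code && a->value == b->value
        && same_expr(a->op[0], b->op[0])
        && same_expr(a->op[1], b->op[1]);
}

/* Match expression X against pattern PAT, left to right,
   recording what is found in OPERANDS.  */
static int match(const struct expr *pat, const struct expr *x,
                 const struct expr *operands[MAX_OPERANDS])
{
    if (pat == NULL || x == NULL)
        return pat == NULL && x == NULL;
    switch (pat->code) {
    case MATCH_OPERAND:
        assert(pat->value >= 0 && pat->value < MAX_OPERANDS);
        operands[pat->value] = x;   /* no predicate in this sketch */
        return 1;
    case MATCH_DUP:
        assert(pat->value >= 0 && pat->value < MAX_OPERANDS);
        /* Operand N must already be known; accept only an
           identical-looking expression.  */
        return same_expr(operands[pat->value], x);
    default:
        return pat->code == x->code && pat->value == x->value
            && match(pat->op[0], x->op[0], operands)
            && match(pat->op[1], x->op[1], operands);
    }
}
```

With a pattern of the shape `(set (match_operand 0) (plus (match_dup 0) (match_operand 1)))', this matcher accepts an insn whose destination reappears as the first addend and rejects one where the two differ.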
File: gcc.info, Node: Output Template, Next: Output Statement, Prev: RTL Template, Up: Machine Desc
Output Templates and Operand Substitution
=========================================
The "output template" is a string which specifies how to output
the assembler code for an instruction pattern. Most of the template
is a fixed string which is output literally. The character `%' is
used to specify where to substitute an operand; it can also be used
to identify places where different variants of the assembler require
different syntax.
In the simplest case, a `%' followed by a digit N says to output
operand N at that point in the string.
`%' followed by a letter and a digit says to output an operand in
an alternate fashion. Four letters have standard, built-in meanings
described below. The machine description macro `PRINT_OPERAND' can
define additional letters with nonstandard meanings.
`%cDIGIT' can be used to substitute an operand that is a constant
value without the syntax that normally indicates an immediate operand.
`%nDIGIT' is like `%cDIGIT' except that the value of the constant
is negated before printing.
`%aDIGIT' can be used to substitute an operand as if it were a
memory reference, with the actual operand treated as the address.
This may be useful when outputting a "load address" instruction,
because often the assembler syntax for such an instruction requires
you to write the operand as if it were a memory reference.
`%lDIGIT' is used to substitute a `label_ref' into a jump
instruction.
`%' followed by a punctuation character specifies a substitution
that does not use an operand. Only one case is standard: `%%'
outputs a `%' into the assembler code. Other nonstandard cases can
be defined in the `PRINT_OPERAND' macro. You must also define which
punctuation characters are valid with the
`PRINT_OPERAND_PUNCT_VALID_P' macro.
The template may generate multiple assembler instructions. Write
the text for the instructions, with `\;' between them.
When the RTL contains two operands which are required by
constraint to match each other, the output template must refer only
to the lower-numbered operand. Matching operands are not always
identical, and the rest of the compiler arranges to put the proper
RTL expression for printing into the lower-numbered operand.
One use of nonstandard letters or punctuation following `%' is to
distinguish between different assembler languages for the same
machine; for example, Motorola syntax versus MIT syntax for the
68000. Motorola syntax requires periods in most opcode names, while
MIT syntax does not. For example, the opcode `movel' in MIT syntax
is `move.l' in Motorola syntax. The same file of patterns is used
for both kinds of output syntax, but the character sequence `%.' is
used in each place where Motorola syntax wants a period. The
`PRINT_OPERAND' macro for Motorola syntax defines the sequence to
output a period; the macro for MIT syntax defines it to do nothing.
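The simplest substitutions can be sketched in C. This is a toy, not the compiler's output routine: the name `expand_template' is hypothetical, operands are plain strings rather than RTL, and only `%DIGIT' and `%%' are handled; the letter forms such as `%cDIGIT' and any `PRINT_OPERAND' extensions are rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expand TMPL into OUT, which holds OUTSZ bytes.  Return 0 on
   success, or -1 if the result does not fit, an operand number is
   out of range, or a `%' sequence is not handled by this sketch.  */
static int expand_template(const char *tmpl, const char *const *operands,
                           int n_operands, char *out, size_t outsz)
{
    size_t len = 0;

    assert(tmpl != NULL && out != NULL && outsz > 0);
    while (*tmpl != '\0') {
        const char *piece = tmpl;   /* default: one literal character */
        size_t piece_len = 1;

        if (*tmpl == '%') {
            char next = tmpl[1];
            if (next == '%') {
                piece = "%";        /* `%%' outputs a single `%' */
            } else if (next >= '0' && next <= '9'
                       && next - '0' < n_operands) {
                piece = operands[next - '0'];
                piece_len = strlen(piece);
            } else {
                return -1;
            }
            tmpl += 2;
        } else {
            tmpl += 1;
        }
        if (len + piece_len >= outsz)
            return -1;
        memcpy(out + len, piece, piece_len);
        len += piece_len;
    }
    out[len] = '\0';
    return 0;
}
```

For example, with operands `d0' and `a1', the template `movel %1,%0' expands to `movel a1,d0'.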
File: gcc.info, Node: Output Statement, Next: Constraints, Prev: Output Template, Up: Machine Desc
C Statements for Generating Assembler Output
============================================
Often a single fixed template string cannot produce correct and
efficient assembler code for all the cases that are recognized by a
single instruction pattern. For example, the opcodes may depend on
the kinds of operands; or some unfortunate combinations of operands
may require extra machine instructions.
If the output control string starts with a `*', then it is not an
output template but rather a piece of C program that should compute a
template. It should execute a `return' statement to return the
template-string you want. Most such templates use C string literals,
which require doublequote characters to delimit them. To include
these doublequote characters in the string, prefix each one with `\'.
The operands may be found in the array `operands', whose C data
type is `rtx []'.
It is possible to output an assembler instruction and then go on
to output or compute more of them, using the subroutine
`output_asm_insn'. This receives two arguments: a template-string
and a vector of operands. The vector may be `operands', or it may be
another array of `rtx' that you declare locally and initialize
yourself.
When an insn pattern has multiple alternatives in its constraints,
often the appearance of the assembler code is determined mostly by
which alternative was matched. When this is so, the C code can test
the variable `which_alternative', which is the ordinal number of the
alternative that was actually satisfied (0 for the first, 1 for the
second alternative, etc.).
For example, suppose there are two opcodes for storing zero,
`clrreg' for registers and `clrmem' for memory locations. Here is
how a pattern could use `which_alternative' to choose between them:
     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "r,m")
             (const_int 0))]
       ""
       "*
       return (which_alternative == 0
               ? \"clrreg %0\" : \"clrmem %0\");
       ")
File: gcc.info, Node: Constraints, Next: Standard Names, Prev: Output Statement, Up: Machine Desc
Operand Constraints
===================
Each `match_operand' in an instruction pattern can specify a
constraint for the type of operands allowed. Constraints can say
whether an operand may be in a register, and which kinds of register;
whether the operand can be a memory reference, and which kinds of
address; whether the operand may be an immediate constant, and which
possible values it may have. Constraints can also require two
operands to match.
* Menu:
* Simple Constraints:: Basic use of constraints.
* Multi-Alternative:: When an insn has two alternative constraint-patterns.
* Class Preferences:: Constraints guide which hard register to put things in.
* Modifiers:: More precise control over effects of constraints.
* No Constraints:: Describing a clean machine without constraints.
File: gcc.info, Node: Simple Constraints, Next: Multi-Alternative, Prev: Constraints, Up: Constraints
Simple Constraints
------------------
The simplest kind of constraint is a string full of letters, each
of which describes one kind of operand that is permitted. Here are
the letters that are allowed:
`m'
A memory operand is allowed, with any kind of address that the
machine supports in general.
`o'
A memory operand is allowed, but only if the address is
"offsettable". This means that a small integer (actually, the
width in bytes of the operand, as determined by its machine
mode) may be added to the address and the result is also a
valid memory address.
For example, an address which is constant is offsettable; so is
an address that is the sum of a register and a constant (as long
as a slightly larger constant is also within the range of
address-offsets supported by the machine); but an autoincrement
or autodecrement address is not offsettable. More complicated
indirect/indexed addresses may or may not be offsettable
depending on the other addressing modes that the machine supports.
Note that in an output operand which can be matched by another
operand, the constraint letter `o' is valid only when
accompanied by both `<' (if the target machine has predecrement
addressing) and `>' (if the target machine has preincrement
addressing).
When the constraint letter `o' is used, the reload pass may
generate instructions which copy a nonoffsettable address into
an index register. The idea is that the register can be used as
a replacement offsettable address. But this method requires
that there be patterns to copy any kind of address into a
register. Auto-increment and auto-decrement addresses are an
exception; there need not be an instruction that can copy such
an address into a register, because reload handles these cases
specially.
Most older machine designs have "load address" instructions
which do just what is needed here. Some RISC machines do not
advertise such instructions, but the possible addresses on these
machines are very limited, so it is easy to fake them.
`<'
A memory operand with autodecrement addressing (either
predecrement or postdecrement) is allowed.
`>'
A memory operand with autoincrement addressing (either
preincrement or postincrement) is allowed.
`r'
A register operand is allowed provided that it is in a general
register.
`d', `a', `f', ...
Other letters can be defined in machine-dependent fashion to
stand for particular classes of registers. `d', `a' and `f' are
defined on the 68000/68020 to stand for data, address and
floating point registers.
`i'
An immediate integer operand (one with constant value) is allowed.
This includes symbolic constants whose values will be known only
at assembly time.
`n'
An immediate integer operand with a known numeric value is
allowed. Many systems cannot support assembly-time constants
for operands less than a word wide. Constraints for these
operands should use `n' rather than `i'.
`I', `J', `K', ...
Other letters in the range `I' through `M' may be defined in a
machine-dependent fashion to permit immediate integer operands
with explicit integer values in specified ranges. For example,
on the 68000, `I' is defined to stand for the range of values 1
to 8. This is the range permitted as a shift count in the shift
instructions.
`F'
An immediate floating operand (expression code `const_double')
is allowed.
`G', `H'
`G' and `H' may be defined in a machine-dependent fashion to
permit immediate floating operands in particular ranges of values.
`s'
An immediate integer operand whose value is not an explicit
integer is allowed.
This might appear strange; if an insn allows a constant operand
with a value not known at compile time, it certainly must allow
any known value. So why use `s' instead of `i'? Sometimes it
allows better code to be generated.
For example, on the 68000 in a fullword instruction it is
possible to use an immediate operand; but if the immediate value
is between -128 and 127, better code results from loading the
value into a register and using the register. This is because
the load into the register can be done with a `moveq'
instruction. We arrange for this to happen by defining the
letter `K' to mean "any integer outside the range -128 to 127",
and then specifying `Ks' in the operand constraints.
`g'
Any register, memory or immediate integer operand is allowed,
except for registers that are not general registers.
`N' (a digit)
An operand that matches operand number N is allowed. If a digit
is used together with letters, the digit should come last.
This is called a "matching constraint" and what it really means
is that the assembler has only a single operand that fills two
roles considered separate in the RTL insn. For example, an add
insn has two input operands and one output operand in the RTL,
but on most machines an add instruction really has only two
operands, one of them an input-output operand.
Matching constraints work only in circumstances like that add
insn. More precisely, the matching constraint must appear in an
input-only operand and the operand that it matches must be an
output-only operand with a lower number. Thus, operand N must
have `=' in its constraint.
For operands to match in a particular case usually means that
they are identical-looking RTL expressions. But in a few
special cases specific kinds of dissimilarity are allowed. For
example, `*x' as an input operand will match `*x++' as an output
operand. For proper results in such cases, the output template
should always use the output-operand's number when printing the
operand.
`p'
An operand that is a valid memory address is allowed. This is
for "load address" and "push address" instructions.
`p' in the constraint must be accompanied by `address_operand'
as the predicate in the `match_operand'.
In order to have valid assembler code, each operand must satisfy
its constraint. But a failure to do so does not prevent the pattern
from applying to an insn. Instead, it directs the compiler to modify
the code so that the constraint will be satisfied. Usually this is
done by copying an operand into a register.
Contrast, therefore, the two instruction patterns that follow:
     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "r")
             (plus:SI (match_dup 0)
                      (match_operand:SI 1 "general_operand" "r")))]
       ""
       "...")
which has two operands, one of which must appear in two places, and
     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "=r")
             (plus:SI (match_operand:SI 1 "general_operand" "0")
                      (match_operand:SI 2 "general_operand" "r")))]
       ""
       "...")
which has three operands, two of which are required by a constraint
to be identical. If we are considering an insn of the form
     (insn N PREV NEXT
       (set (reg:SI 3)
            (plus:SI (reg:SI 6) (reg:SI 109)))
       ...)
the first pattern would not apply at all, because this insn does not
contain two identical subexpressions in the right place. The pattern
would say, "That does not look like an add instruction; try other
patterns." The second pattern would say, "Yes, that's an add
instruction, but there is something wrong with it." It would direct
the reload pass of the compiler to generate additional insns to make
the constraint true. The results might look like this:
     (insn N2 PREV N
       (set (reg:SI 3) (reg:SI 6))
       ...)
     (insn N N2 NEXT
       (set (reg:SI 3)
            (plus:SI (reg:SI 3) (reg:SI 109)))
       ...)
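The transformation shown above can be sketched in a few lines. This Python model is an illustration under stated assumptions, not the reload pass itself: an insn is reduced to a tuple of register numbers `(dest, src1, src2)`, the function name `reload_for_match` is hypothetical, and the case where the destination is also the second source is deliberately left out.

```python
def reload_for_match(insn):
    """Make src1 identical to dest for a two-address add.

    `insn` is (dest, src1, src2), each a register number.  Returns the
    list of insns to emit, as simple tuples.
    """
    dest, src1, src2 = insn
    if dest == src1:
        # The matching constraint is already satisfied.
        return [("add", dest, src1, src2)]
    if dest == src2:
        # Copying src1 into dest would destroy src2; not modeled here.
        raise ValueError("case not handled by this sketch")
    # Copy the first input into the output register, then add in place.
    return [("move", dest, src1), ("add", dest, dest, src2)]
```

Applied to the insn above, `reload_for_match((3, 6, 109))` yields a move of register 6 into register 3 followed by an add whose first input is register 3.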
It is up to you to make sure that each operand, in each pattern,
has constraints that can handle any RTL expression that could be
present for that operand. (When multiple alternatives are in use,
each pattern must, for each possible combination of operand
expressions, have at least one alternative which can handle that
combination of operands.) The constraints don't need to *allow* any
possible operand--when this is the case, they do not constrain--but
they must at least point the way to reloading any possible operand so
that it will fit.
* If the constraint accepts whatever operands the predicate
permits, there is no problem: reloading is never necessary for
this operand.
For example, an operand whose constraints permit everything
except registers is safe provided its predicate rejects registers.
An operand whose predicate accepts only constant values is safe
provided its constraints include the letter `i'. If any
possible constant value is accepted, then nothing less than `i'
will do; if the predicate is more selective, then the
constraints may also be more selective.
* Any operand expression can be reloaded by copying it into a
register. So if an operand's constraints allow some kind of
register, it is certain to be safe. It need not permit all
classes of registers; the compiler knows how to copy a register
into another register of the proper class in order to make an
instruction valid.
* A nonoffsettable memory reference can be reloaded by copying the
address into a register. So if the constraint uses the letter
`o', all memory references are taken care of.
* A constant operand can be reloaded by allocating space in memory
to hold it as preinitialized data. Then the memory reference
can be used in place of the constant. So if the constraint uses
the letters `o' or `m', constant operands are not a problem.
If the operand's predicate can recognize registers, but the
constraint does not permit them, it can make the compiler crash.
When this operand happens to be a register, the reload pass will be
stymied, because it does not know how to copy a register temporarily
into memory.
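The rules above can be condensed into a rough checker. The Python below is a simplified sketch and makes several assumptions: operands are classified only as "reg", "mem" or "const", the register letters are `r' plus the 68000 letters `d' and `a', and the function name `can_reload` is hypothetical. It ignores many details of real constraints.

```python
REGISTER_LETTERS = set("rda")   # `d' and `a' are 68000 register classes
MEMORY_LETTERS = set("mo")


def can_reload(predicate_kinds, constraint):
    """Report whether every operand kind the predicate accepts can be
    made to fit `constraint', following the rules listed above."""
    letters = set(constraint)
    if "g" in letters:
        letters |= {"r", "m", "i"}
    # Anything can be reloaded by copying it into a register.
    if letters & REGISTER_LETTERS:
        return True
    for kind in predicate_kinds:
        if kind == "reg":
            # Reload cannot copy a register temporarily into memory.
            return False
        if kind == "mem" and not (letters & MEMORY_LETTERS):
            return False
        # A constant is accepted by `i', or can be placed in memory.
        if kind == "const" and not (letters & (MEMORY_LETTERS | {"i"})):
            return False
    return True
```

Here `can_reload({"reg", "mem"}, "m")` is false, which corresponds to the crash described in the last paragraph, while `can_reload({"mem", "const"}, "o")` is true.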
File: gcc.info, Node: Multi-Alternative, Next: Class Preferences, Prev: Simple Constraints, Up: Constraints
Multiple Alternative Constraints
--------------------------------
Sometimes a single instruction has multiple alternative sets of
possible operands. For example, on the 68000, a logical-or
instruction can combine a register or an immediate value into memory,
or it can combine any kind of operand into a register; but it cannot
combine one memory location into another.
These constraints are represented as multiple alternatives. An
alternative can be described by a series of letters for each operand.
The overall constraint for an operand is made from the letters for
this operand from the first alternative, a comma, the letters for
this operand from the second alternative, a comma, and so on until
the last alternative. Here is how it is done for fullword logical-or
on the 68000:
     (define_insn "iorsi3"
       [(set (match_operand:SI 0 "general_operand" "=m,d")
             (ior:SI (match_operand:SI 1 "general_operand" "%0,0")
                     (match_operand:SI 2 "general_operand" "dKs,dmKs")))]
       ...)
The first alternative has `m' (memory) for operand 0, `0' for
operand 1 (meaning it must match operand 0), and `dKs' for operand 2.
The second alternative has `d' (data register) for operand 0, `0' for
operand 1, and `dmKs' for operand 2. The `=' and `%' in the
constraints apply to all the alternatives; their meaning is explained
in the next section.
If all the operands fit any one alternative, the instruction is
valid. Otherwise, for each alternative, the compiler counts how many
instructions must be added to copy the operands so that that
alternative applies. The alternative requiring the least copying is
chosen. If two alternatives need the same amount of copying, the one
that comes first is chosen. These choices can be altered with the
`?' and `!' characters:
`?'
Disparage slightly the alternative that the `?' appears in, as a
choice when no alternative applies exactly. The compiler
regards this alternative as one unit more costly for each `?'
that appears in it.
`!'
Disparage severely the alternative that the `!' appears in.
When operands must be copied into registers, the compiler will
never choose this alternative as the one to strive for.
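The selection rule, including the effect of `?' and `!', can be sketched as follows. This Python is a simplified model rather than the compiler's algorithm: it assumes each operand is just "reg", "mem" or "const", that each operand that does not fit costs exactly one copy, and that only the letters `r', `m', `i' and `g' are used. The names `fits` and `choose_alternative` are hypothetical.

```python
ACCEPTS = {"r": {"reg"}, "m": {"mem"}, "i": {"const"},
           "g": {"reg", "mem", "const"}}


def fits(kind, letters):
    """True if an operand of this kind satisfies any letter present."""
    return any(kind in ACCEPTS.get(ch, ()) for ch in letters)


def choose_alternative(operand_constraints, kinds):
    """Return the index of the alternative to use, or None.

    `operand_constraints` has one comma-separated string per operand;
    `kinds` gives the kind of each actual operand.
    """
    columns = [text.split(",") for text in operand_constraints]
    exact, candidates = [], []
    for alt in range(len(columns[0])):
        letters = [col[alt] for col in columns]
        copies = sum(not fits(k, l) for k, l in zip(kinds, letters))
        if copies == 0:
            exact.append(alt)            # all operands fit as they are
        elif not any("!" in l for l in letters):
            # Each `?' makes the alternative one unit more costly.
            penalty = sum(l.count("?") for l in letters)
            candidates.append((copies + penalty, alt))
    if exact:
        return exact[0]
    # The least cost wins; on a tie, the earlier alternative wins.
    return min(candidates)[1] if candidates else None
```

With constraints `["=r,m", "m,r"]` and two memory operands, both alternatives need one copy and the first is chosen; writing `"=r?,m"` instead tips the choice to the second alternative.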
When an insn pattern has multiple alternatives in its constraints,
often the appearance of the assembler code is determined mostly by
which alternative was matched. When this is so, the C code for
writing the assembler code can use the variable `which_alternative',
which is the ordinal number of the alternative that was actually
satisfied (0 for the first, 1 for the second alternative, etc.). For
example:
     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "r,m")
             (const_int 0))]
       ""
       "*
       return (which_alternative == 0
               ? \"clrreg %0\" : \"clrmem %0\");
       ")
File: gcc.info, Node: Class Preferences, Next: Modifiers, Prev: Multi-Alternative, Up: Constraints
Register Class Preferences
--------------------------
The operand constraints have another function: they enable the
compiler to decide which kind of hardware register a pseudo register
is best allocated to. The compiler examines the constraints that
apply to the insns that use the pseudo register, looking for the
machine-dependent letters such as `d' and `a' that specify classes of
registers. The pseudo register is put in whichever class gets the
most "votes". The constraint letters `g' and `r' also vote: they
vote in favor of a general register. The machine description says
which registers are considered general.
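The voting described above can be pictured with a small sketch. The Python below is only a model: the table mapping letters to class names is a hypothetical stand-in for what a machine description would define, and the real compiler weighs more than a simple count.

```python
from collections import Counter

# Hypothetical letter-to-class table for a 68000-like machine.
CLASS_OF = {"d": "DATA_REGS", "a": "ADDR_REGS",
            "r": "GENERAL_REGS", "g": "GENERAL_REGS"}


def preferred_class(constraints):
    """Tally one vote per class letter found in the constraints of the
    insns that use a pseudo register; return the winning class."""
    votes = Counter()
    for text in constraints:
        for ch in text:
            if ch in CLASS_OF:
                votes[CLASS_OF[ch]] += 1
    if not votes:
        return None      # no register letters: no preference
    return votes.most_common(1)[0][0]
```

For a pseudo register used under the constraints `"=d"`, `"d,a"` and `"g"`, this model counts two votes for the data registers and one each for the other two classes, so it prefers the data registers.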
Of course, on some machines all registers are equivalent, and no
register classes are defined. Then none of this complexity is
relevant.
File: gcc.info, Node: Modifiers, Next: No Constraints, Prev: Class Preferences, Up: Constraints
Constraint Modifier Characters
------------------------------
`='
Means that this operand is write-only for this instruction: the
previous value is discarded and replaced by output data.
`+'
Means that this operand is both read and written by the
instruction.
When the compiler fixes up the operands to satisfy the
constraints, it needs to know which operands are inputs to the
instruction and which are outputs from it. `=' identifies an
output; `+' identifies an operand that is both input and output;
all other operands are assumed to be input only.
`&'
Means (in a particular alternative) that this operand is written
before the instruction is finished using the input operands.
Therefore, this operand may not lie in a register that is used
as an input operand or as part of any memory address.
`&' applies only to the alternative in which it is written. In
constraints with multiple alternatives, sometimes one
alternative requires `&' while others do not. See, for example,
the `movdf' insn of the 68000.
`&' does not obviate the need to write `='.
`%'
Declares the instruction to be commutative for this operand and
the following operand. This means that the compiler may
interchange the two operands if that is the cheapest way to make
all operands fit the constraints. This is often used in
patterns for addition instructions that really have only two
operands: the result must go in one of the arguments. Here, for
example, is how the 68000 halfword-add instruction is defined:
     (define_insn "addhi3"
       [(set (match_operand:HI 0 "general_operand" "=m,r")
             (plus:HI (match_operand:HI 1 "general_operand" "%0,0")
                      (match_operand:HI 2 "general_operand" "di,g")))]
       ...)
Note that in previous versions of GNU CC the `%' constraint
modifier always applied to operands 1 and 2 regardless of which
operand it was written in. The usual custom was to write it in
operand 0. Now it must be in operand 1 if the operands to be
exchanged are 1 and 2.
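The use the compiler makes of `%' can be modeled as trying both operand orders and keeping the cheaper one. The Python below is a sketch under simplifying assumptions: a single alternative, operands classified as "reg", "mem" or "const", only the letters `r', `m', `i' and `g', and no matching constraints. The names `misfits` and `best_order` are hypothetical.

```python
ACCEPTS = {"r": {"reg"}, "m": {"mem"}, "i": {"const"},
           "g": {"reg", "mem", "const"}}


def misfits(constraints, kinds):
    """Count the operands that do not satisfy their constraint."""
    return sum(
        not any(kind in ACCEPTS.get(ch, ()) for ch in text)
        for text, kind in zip(constraints, kinds))


def best_order(constraints, kinds):
    """Return the operand kinds, with the operand marked `%' and the one
    after it interchanged if that leaves fewer operands to fix up."""
    best = list(kinds)
    for i, text in enumerate(constraints):
        if "%" in text and i + 1 < len(kinds):
            swapped = list(kinds)
            swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
            if misfits(constraints, swapped) < misfits(constraints, best):
                best = swapped
    return best
```

Given the constraints `["=r", "%r", "ri"]` and operands of kinds register, constant, register, the model interchanges the last two so that the constant lands where `i' permits it.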
`#'
Says that all following characters, up to the next comma, are to
be ignored as a constraint. They are significant only for
choosing register preferences.
`*'
Says that the following character should be ignored when
choosing register preferences. `*' has no effect on the meaning
of the constraint as a constraint.
Here is an example: the 68000 has an instruction to sign-extend
a halfword in a data register, and can also sign-extend a value
by copying it into an address register. While either kind of
register is acceptable, the constraints on an address-register
destination are less strict, so it is best if register
allocation makes an address register its goal. Therefore, `*'
is used so that the `d' constraint letter (for data register) is
ignored when computing register preferences.
     (define_insn "extendhisi2"
       [(set (match_operand:SI 0 "general_operand" "=*d,a")
             (sign_extend:SI
              (match_operand:HI 1 "general_operand" "0,g")))]
       ...)
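The two roles of a constraint string, as a constraint proper and as a source of register preferences, can be separated by a short routine. The Python below is a sketch of how `#' and `*' divide those roles; the function name `split_roles` is hypothetical, and the input is assumed to be a single alternative, already split at the commas.

```python
def split_roles(alternative):
    """Return (constraint_letters, preference_letters) for one alternative."""
    constraint, preference = [], []
    ignore_as_constraint = False   # set once `#' has been seen
    skip_preference = False        # set by `*' for the next letter only
    for ch in alternative:
        if ch == "#":
            ignore_as_constraint = True
        elif ch == "*":
            skip_preference = True
        elif ch in "=+&%?!":
            continue               # modifiers are not operand letters
        else:
            if not ignore_as_constraint:
                constraint.append(ch)
            if not skip_preference:
                preference.append(ch)
            skip_preference = False
    return "".join(constraint), "".join(preference)
```

For operand 0 of `extendhisi2' above, `[split_roles(a) for a in "=*d,a".split(",")]` gives `("d", "")` for the first alternative and `("a", "a")` for the second: `d' still constrains the operand but casts no vote for the data registers.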
File: gcc.info, Node: No Constraints, Prev: Modifiers, Up: Constraints
Not Using Constraints
---------------------
Some machines are so clean that operand constraints are not
required. For example, on the Vax, an operand valid in one context
is valid in any other context. On such a machine, every operand
constraint would be `g', excepting only operands of "load address"
instructions which are written as if they referred to a memory
location's contents but actually refer to its address. They would have
constraint `p'.
For such machines, instead of writing `g' and `p' for all the
constraints, you can choose to write a description with empty
constraints. Then you write `""' for the constraint in every
`match_operand'. Address operands are identified by writing an
`address' expression around the `match_operand', not by their
constraints.
When the machine description has just empty constraints, certain
parts of compilation are skipped, making the compiler faster.
However, few machines can actually do without constraints; all machine
descriptions now in existence use constraints.