| author | Andrew Lee <alee14498@protonmail.com> | 2021-08-15 00:34:05 -0400 |
|---|---|---|
| committer | Andrew Lee <alee14498@protonmail.com> | 2021-08-15 00:34:05 -0400 |
| commit | 60cc83bf91bfc9bb02f6304b5d6c8234ba6d210f (patch) | |
| tree | fdc0be85a1ca35e34c3ae2c805fe9b718e3c1091 /gcc-1.40/gcc.info-5 | |
| parent | dd8dfab51b832a654365ed00c06bf802ff628bfa (diff) | |
Diffstat (limited to 'gcc-1.40/gcc.info-5')
| -rw-r--r-- | gcc-1.40/gcc.info-5 | 1117 |
1 files changed, 1117 insertions, 0 deletions
diff --git a/gcc-1.40/gcc.info-5 b/gcc-1.40/gcc.info-5 new file mode 100644 index 0000000..cbf4147 --- /dev/null +++ b/gcc-1.40/gcc.info-5 @@ -0,0 +1,1117 @@ +Info file gcc.info, produced by Makeinfo, -*- Text -*- from input +file gcc.texinfo. + + This file documents the use and the internals of the GNU compiler. + + Copyright (C) 1988, 1989, 1990 Free Software Foundation, Inc. + + Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + + Permission is granted to copy and distribute modified versions of +this manual under the conditions for verbatim copying, provided also +that the sections entitled "GNU General Public License" and "Protect +Your Freedom--Fight `Look And Feel'" are included exactly as in the +original, and provided that the entire resulting derived work is +distributed under the terms of a permission notice identical to this +one. + + Permission is granted to copy and distribute translations of this +manual into another language, under the above conditions for modified +versions, except that the sections entitled "GNU General Public +License" and "Protect Your Freedom--Fight `Look And Feel'" and this +permission notice may be included in translations approved by the +Free Software Foundation instead of in the original English. + + +File: gcc.info, Node: Bug Reporting, Prev: Bug Criteria, Up: Bugs + +How to Report Bugs +================== + + Send bug reports for GNU C to one of these addresses: + + bug-gcc@prep.ai.mit.edu + {ucbvax|mit-eddie|uunet}!prep.ai.mit.edu!bug-gcc + + *Do not send bug reports to `help-gcc', or to the newsgroup +`gnu.gcc.help'.* Most users of GNU CC do not want to receive bug +reports. Those that do, have asked to be on `bug-gcc'. + + The mailing list `bug-gcc' has a newsgroup which serves as a +repeater. The mailing list and the newsgroup carry exactly the same +messages. 
Often people think of posting bug reports to the newsgroup +instead of mailing them. This appears to work, but it has one +problem which can be crucial: a newsgroup posting does not contain a +mail path back to the sender. Thus, if I need to ask for more +information, I may be unable to reach you. For this reason, it is +better to send bug reports to the mailing list. + + As a last resort, send bug reports on paper to: + + GNU Compiler Bugs + Free Software Foundation + 675 Mass Ave + Cambridge, MA 02139 + + The fundamental principle of reporting bugs usefully is this: +*report all the facts*. If you are not sure whether to state a fact +or leave it out, state it! + + Often people omit facts because they think they know what causes +the problem and they conclude that some details don't matter. Thus, +you might assume that the name of the variable you use in an example +does not matter. Well, probably it doesn't, but one cannot be sure. +Perhaps the bug is a stray memory reference which happens to fetch +from the location where that name is stored in memory; perhaps, if +the name were different, the contents of that location would fool the +compiler into doing the right thing despite the bug. Play it safe +and give a specific, complete example. That is the easiest thing for +you to do, and the most helpful. + + Keep in mind that the purpose of a bug report is to enable me to +fix the bug if it is not known. It isn't very important what happens +if the bug is already known. Therefore, always write your bug +reports on the assumption that the bug is not known. + + Sometimes people give a few sketchy facts and ask, "Does this ring +a bell?" Those bug reports are useless, and I urge everyone to +*refuse to respond to them* except to chide the sender to report bugs +properly. + + To enable me to fix the bug, you should include all these things: + + * The version of GNU CC. You can get this by running it with the + `-v' option. 
+ + Without this, I won't know whether there is any point in looking + for the bug in the current version of GNU CC. + + * A complete input file that will reproduce the bug. If the bug + is in the C preprocessor, send me a source file and any header + files that it requires. If the bug is in the compiler proper + (`cc1'), run your source file through the C preprocessor by + doing `gcc -E SOURCEFILE > OUTFILE', then include the contents + of OUTFILE in the bug report. (Any `-I', `-D' or `-U' options + that you used in actual compilation should also be used when + doing this.) + + A single statement is not enough of an example. In order to + compile it, it must be embedded in a function definition; and + the bug might depend on the details of how this is done. + + Without a real example I can compile, all I can do about your + bug report is wish you luck. It would be futile to try to guess + how to provoke the bug. For example, bugs in register + allocation and reloading frequently depend on every little + detail of the function they happen in. + + * The command arguments you gave GNU CC to compile that example + and observe the bug. For example, did you use `-O'? To + guarantee you won't omit something important, list them all. + + If I were to try to guess the arguments, I would probably guess + wrong and then I would not encounter the bug. + + * The names of the files that you used for `tm.h' and `md' when + you installed the compiler. + + * The type of machine you are using, and the operating system name + and version number. + + * A description of what behavior you observe that you believe is + incorrect. For example, "It gets a fatal signal," or, "There is + an incorrect assembler instruction in the output." + + Of course, if the bug is that the compiler gets a fatal signal, + then I will certainly notice it. But if the bug is incorrect + output, I might not notice unless it is glaringly wrong. 
I + won't study all the assembler code from a 50-line C program just + on the off chance that it might be wrong. + + Even if the problem you experience is a fatal signal, you should + still say so explicitly. Suppose something strange is going on, + such as, your copy of the compiler is out of synch, or you have + encountered a bug in the C library on your system. (This has + happened!) Your copy might crash and mine would not. If you + told me to expect a crash, then when mine fails to crash, I + would know that the bug was not happening for me. If you had + not told me to expect a crash, then I would not be able to draw + any conclusion from my observations. + + Often the observed symptom is incorrect output when your program + is run. Sad to say, this is not enough information for me + unless the program is short and simple. If you send me a large + program, I don't have time to figure out how it would work if + compiled correctly, much less which line of it was compiled + wrong. So you will have to do that. Tell me which source line + it is, and what incorrect result happens when that line is + executed. A person who understands the test program can find + this as easily as a bug in the program itself. + + * If you send me examples of output from GNU CC, please use `-g' + when you make them. The debugging information includes source + line numbers which are essential for correlating the output with + the input. + + * If you wish to suggest changes to the GNU CC source, send me + context diffs. If you even discuss something in the GNU CC + source, refer to it by context, not by line number. + + The line numbers in my development sources don't match those in + your sources. Your line numbers would convey no useful + information to me. + + * Additional information from a debugger might enable me to find a + problem on a machine which I do not have available myself. 
+ However, you need to think when you collect this information if + you want it to have any chance of being useful. + + For example, many people send just a backtrace, but that is + never useful by itself. A simple backtrace with arguments + conveys little about GNU CC because the compiler is largely + data-driven; the same functions are called over and over for + different RTL insns, doing different things depending on the + details of the insn. + + Most of the arguments listed in the backtrace are useless + because they are pointers to RTL list structure. The numeric + values of the pointers, which the debugger prints in the + backtrace, have no significance whatever; all that matters is + the contents of the objects they point to (and most of the + contents are other such pointers). + + In addition, most compiler passes consist of one or more loops + that scan the RTL insn sequence. The most vital piece of + information about such a loop--which insn it has reached--is + usually in a local variable, not in an argument. + + What you need to provide in addition to a backtrace are the + values of the local variables for several stack frames up. When + a local variable or an argument is an RTX, first print its value + and then use the GDB command `pr' to print the RTL expression + that it points to. (If GDB doesn't run on your machine, use + your debugger to call the function `debug_rtx' with the RTX as + an argument.) In general, whenever a variable is a pointer, its + value is no use without the data it points to. + + In addition, include a debugging dump from just before the pass + in which the crash happens. Most bugs involve a series of + insns, not just one. + + Here are some things that are not necessary: + + * A description of the envelope of the bug. + + Often people who encounter a bug spend a lot of time + investigating which changes to the input file will make the bug + go away and which changes will not affect it. 
+ + This is often time consuming and not very useful, because the + way I will find the bug is by running a single example under the + debugger with breakpoints, not by pure deduction from a series + of examples. I recommend that you save your time for something + else. + + Of course, if you can find a simpler example to report *instead* + of the original one, that is a convenience for me. Errors in + the output will be easier to spot, running under the debugger + will take less time, etc. Most GNU CC bugs involve just one + function, so the most straightforward way to simplify an example + is to delete all the function definitions except the one where + the bug occurs. Those earlier in the file may be replaced by + external declarations if the crucial function depends on them. + (Exception: inline functions may affect compilation of functions + defined later in the file.) + + However, simplification is not vital; if you don't want to do + this, report the bug anyway and send me the entire test case you + used. + + * A patch for the bug. + + A patch for the bug does help me if it is a good one. But don't + omit the necessary information, such as the test case, on the + assumption that a patch is all I need. I might see problems + with your patch and decide to fix the problem another way, or I + might not understand it at all. + + Sometimes with a program as complicated as GNU CC it is very + hard to construct an example that will make the program follow a + certain path through the code. If you don't send me the + example, I won't be able to construct one, so I won't be able to + verify that the bug is fixed. + + And if I can't understand what bug you are trying to fix, or why + your patch should be an improvement, I won't install it. A test + case will help me to understand. + + * A guess about what the bug is or what it depends on. + + Such guesses are usually wrong. Even I can't guess right about + such things without first using the debugger to find the facts. 
+ + +File: gcc.info, Node: Portability, Next: Interface, Prev: Bugs, Up: Top + +GNU CC and Portability +********************** + + The main goal of GNU CC was to make a good, fast compiler for +machines in the class that the GNU system aims to run on: 32-bit +machines that address 8-bit bytes and have several general registers. +Elegance, theoretical power and simplicity are only secondary. + + GNU CC gets most of the information about the target machine from +a machine description which gives an algebraic formula for each of +the machine's instructions. This is a very clean way to describe the +target. But when the compiler needs information that is difficult to +express in this fashion, I have not hesitated to define an ad-hoc +parameter to the machine description. The purpose of portability is +to reduce the total work needed on the compiler; it was not of +interest for its own sake. + + GNU CC does not contain machine dependent code, but it does +contain code that depends on machine parameters such as endianness +(whether the most significant byte has the highest or lowest address +of the bytes in a word) and the availability of autoincrement +addressing. In the RTL-generation pass, it is often necessary to +have multiple strategies for generating code for a particular kind of +syntax tree, strategies that are usable for different combinations of +parameters. Often I have not tried to address all possible cases, +but only the common ones or only the ones that I have encountered. +As a result, a new target may require additional strategies. You +will know if this happens because the compiler will call `abort'. +Fortunately, the new strategies can be added in a machine-independent +fashion, and will affect only the target machines that need them. 
+ + +File: gcc.info, Node: Interface, Next: Passes, Prev: Portability, Up: Top + +Interfacing to GNU CC Output +**************************** + + GNU CC is normally configured to use the same function calling +convention normally in use on the target system. This is done with +the machine-description macros described (*note Machine Macros::.). + + However, returning of structure and union values is done +differently on some target machines. As a result, functions compiled +with PCC returning such types cannot be called from code compiled +with GNU CC, and vice versa. This does not cause trouble often +because few Unix library routines return structures or unions. + + GNU CC code returns structures and unions that are 1, 2, 4 or 8 +bytes long in the same registers used for `int' or `double' return +values. (GNU CC typically allocates variables of such types in +registers also.) Structures and unions of other sizes are returned +by storing them into an address passed by the caller (usually in a +register). The machine-description macros `STRUCT_VALUE' and +`STRUCT_INCOMING_VALUE' tell GNU CC where to pass this address. + + By contrast, PCC on most target machines returns structures and +unions of any size by copying the data into an area of static +storage, and then returning the address of that storage as if it were +a pointer value. The caller must copy the data from that memory area +to the place where the value is wanted. This is slower than the +method used by GNU CC, and fails to be reentrant. + + On some target machines, such as RISC machines and the 80386, the +standard system convention is to pass to the subroutine the address +of where to return the value. On these machines, GNU CC has been +configured to be compatible with the standard compiler, when this +method is used. It may not be compatible for structures of 1, 2, 4 +or 8 bytes. + + GNU CC uses the system's standard convention for passing +arguments. 
On some machines, the first few arguments are passed in +registers; in others, all are passed on the stack. It would be +possible to use registers for argument passing on any machine, and +this would probably result in a significant speedup. But the result +would be complete incompatibility with code that follows the standard +convention. So this change is practical only if you are switching to +GNU CC as the sole C compiler for the system. We may implement +register argument passing on certain machines once we have a complete +GNU system so that we can compile the libraries with GNU CC. + + If you use `longjmp', beware of automatic variables. ANSI C says +that automatic variables that are not declared `volatile' have +undefined values after a `longjmp'. And this is all GNU CC promises +to do, because it is very difficult to restore register variables +correctly, and one of GNU CC's features is that it can put variables +in registers without your asking it to. + + If you want a variable to be unaltered by `longjmp', and you don't +want to write `volatile' because old C compilers don't accept it, +just take the address of the variable. If a variable's address is +ever taken, even if just to compute it and ignore it, then the +variable cannot go in a register: + + { + int careful; + &careful; + ... + } + + Code compiled with GNU CC may call certain library routines. Most +of them handle arithmetic for which there are no instructions. This +includes multiply and divide on some machines, and floating point +operations on any machine for which floating point support is +disabled with `-msoft-float'. Some standard parts of the C library, +such as `bcopy' or `memcpy', are also called automatically. The +usual function call interface is used for calling the library routines. + + These library routines should be defined in the library `gnulib', +which GNU CC automatically searches whenever it links a program. 
On +machines that have multiply and divide instructions, if hardware +floating point is in use, normally `gnulib' is not needed, but it is +searched just in case. + + Each arithmetic function is defined in `gnulib.c' to use the +corresponding C arithmetic operator. As long as the file is compiled +with another C compiler, which supports all the C arithmetic +operators, this file will work portably. However, `gnulib.c' does +not work if compiled with GNU CC, because each arithmetic function +would compile into a call to itself! + + +File: gcc.info, Node: Passes, Next: RTL, Prev: Interface, Up: Top + +Passes and Files of the Compiler +******************************** + + The overall control structure of the compiler is in `toplev.c'. +This file is responsible for initialization, decoding arguments, +opening and closing files, and sequencing the passes. + + The parsing pass is invoked only once, to parse the entire input. +The RTL intermediate code for a function is generated as the function +is parsed, a statement at a time. Each statement is read in as a +syntax tree and then converted to RTL; then the storage for the tree +for the statement is reclaimed. Storage for types (and the +expressions for their sizes), declarations, and a representation of +the binding contours and how they nest, remains until the function is +finished being compiled; these are all needed to output the debugging +information. + + Each time the parsing pass reads a complete function definition or +top-level declaration, it calls the function `rest_of_compilation' or +`rest_of_decl_compilation' in `toplev.c', which are responsible for +all further processing necessary, ending with output of the assembler +language. All other compiler passes run, in sequence, within +`rest_of_compilation'. When that function returns from compiling a +function definition, the storage used for that function definition's +compilation is entirely freed, unless it is an inline function (*note +Inline::.). 
+ + Here is a list of all the passes of the compiler and their source +files. Also included is a description of where debugging dumps can +be requested with `-d' options. + + * Parsing. This pass reads the entire text of a function + definition, constructing partial syntax trees. This and RTL + generation are no longer truly separate passes (formerly they + were), but it is easier to think of them as separate. + + The tree representation does not entirely follow C syntax, + because it is intended to support other languages as well. + + C data type analysis is also done in this pass, and every tree + node that represents an expression has a data type attached. + Variables are represented as declaration nodes. + + Constant folding and associative-law simplifications are also + done during this pass. + + The source files for parsing are `c-parse.y', `c-decl.c', + `c-typeck.c', `c-convert.c', `stor-layout.c', `fold-const.c', + and `tree.c'. The last three files are intended to be + language-independent. There are also header files `c-parse.h', + `c-tree.h', `tree.h' and `tree.def'. The last two define the + format of the tree representation. + + * RTL generation. This is the conversion of syntax tree into RTL + code. It is actually done statement-by-statement during + parsing, but for most purposes it can be thought of as a + separate pass. + + This is where the bulk of target-parameter-dependent code is + found, since often it is necessary for strategies to apply only + when certain standard kinds of instructions are available. The + purpose of named instruction patterns is to provide this + information to the RTL generation pass. + + Optimization is done in this pass for `if'-conditions that are + comparisons, boolean operations or conditional expressions. + Tail recursion is detected at this time also. Decisions are + made about how best to arrange loops and how to output `switch' + statements. 
+ + The source files for RTL generation are `stmt.c', `expr.c', + `explow.c', `expmed.c', `optabs.c' and `emit-rtl.c'. Also, the + file `insn-emit.c', generated from the machine description by + the program `genemit', is used in this pass. The header file + `expr.h' is used for communication within this pass. + + The header files `insn-flags.h' and `insn-codes.h', generated + from the machine description by the programs `genflags' and + `gencodes', tell this pass which standard names are available + for use and which patterns correspond to them. + + Aside from debugging information output, none of the following + passes refers to the tree structure representation of the + function (only part of which is saved). + + The decision of whether the function can and should be expanded + inline in its subsequent callers is made at the end of RTL + generation. The function must meet certain criteria, currently + related to the size of the function and the types and number of + parameters it has. Note that this function may contain loops, + recursive calls to itself (tail-recursive functions can be + inlined!), gotos, in short, all constructs supported by GNU CC. + + The option `-dr' causes a debugging dump of the RTL code after + this pass. This dump file's name is made by appending `.rtl' to + the input file name. + + * Jump optimization. This pass simplifies jumps to the following + instruction, jumps across jumps, and jumps to jumps. It deletes + unreferenced labels and unreachable code, except that + unreachable code that contains a loop is not recognized as + unreachable in this pass. (Such loops are deleted later in the + basic block analysis.) + + Jump optimization is performed two or three times. The first + time is immediately following RTL generation. The second time + is after CSE, but only if CSE says repeated jump optimization is + needed. The last time is right before the final pass. 
That + time, cross-jumping and deletion of no-op move instructions are + done together with the optimizations described above. + + The source file of this pass is `jump.c'. + + The option `-dj' causes a debugging dump of the RTL code after + this pass is run for the first time. This dump file's name is + made by appending `.jump' to the input file name. + + * Register scan. This pass finds the first and last use of each + register, as a guide for common subexpression elimination. Its + source is in `regclass.c'. + + * Common subexpression elimination. This pass also does constant + propagation. Its source file is `cse.c'. If constant + propagation causes conditional jumps to become unconditional or + to become no-ops, jump optimization is run again when CSE is + finished. + + The option `-ds' causes a debugging dump of the RTL code after + this pass. This dump file's name is made by appending `.cse' to + the input file name. + + * Loop optimization. This pass moves constant expressions out of + loops, and optionally does strength-reduction as well. Its + source file is `loop.c'. + + The option `-dL' causes a debugging dump of the RTL code after + this pass. This dump file's name is made by appending `.loop' + to the input file name. + + * Stupid register allocation is performed at this point in a + nonoptimizing compilation. It does a little data flow analysis + as well. When stupid register allocation is in use, the next + pass executed is the reloading pass; the others in between are + skipped. The source file is `stupid.c'. + + * Data flow analysis (`flow.c'). This pass divides the program + into basic blocks (and in the process deletes unreachable + loops); then it computes which pseudo-registers are live at each + point in the program, and makes the first instruction that uses + a value point at the instruction that computed the value. 
+ + This pass also deletes computations whose results are never + used, and combines memory references with add or subtract + instructions to make autoincrement or autodecrement addressing. + + The option `-df' causes a debugging dump of the RTL code after + this pass. This dump file's name is made by appending `.flow' + to the input file name. If stupid register allocation is in + use, this dump file reflects the full results of such allocation. + + * Instruction combination (`combine.c'). This pass attempts to + combine groups of two or three instructions that are related by + data flow into single instructions. It combines the RTL + expressions for the instructions by substitution, simplifies the + result using algebra, and then attempts to match the result + against the machine description. + + The option `-dc' causes a debugging dump of the RTL code after + this pass. This dump file's name is made by appending + `.combine' to the input file name. + + * Register class preferencing. The RTL code is scanned to find + out which register class is best for each pseudo register. The + source file is `regclass.c'. + + * Local register allocation (`local-alloc.c'). This pass + allocates hard registers to pseudo registers that are used only + within one basic block. Because the basic block is linear, it + can use fast and powerful techniques to do a very good job. + + The option `-dl' causes a debugging dump of the RTL code after + this pass. This dump file's name is made by appending `.lreg' + to the input file name. + + * Global register allocation (`global-alloc.c'). This pass + allocates hard registers for the remaining pseudo registers + (those whose life spans are not contained in one basic block). + + * Reloading. This pass renumbers pseudo registers with the + hardware register numbers they were allocated. Pseudo + registers that did not get hard registers are replaced with + stack slots. 
Then it finds instructions that are invalid + because a value has failed to end up in a register, or has ended + up in a register of the wrong kind. It fixes up these + instructions by reloading the problematical values temporarily + into registers. Additional instructions are generated to do the + copying. + + Source files are `reload.c' and `reload1.c', plus the header + `reload.h' used for communication between them. + + The option `-dg' causes a debugging dump of the RTL code after + this pass. This dump file's name is made by appending `.greg' + to the input file name. + + * Jump optimization is repeated, this time including cross-jumping + and deletion of no-op move instructions. + + The option `-dJ' causes a debugging dump of the RTL code after + this pass. This dump file's name is made by appending `.jump2' + to the input file name. + + * Delayed branch scheduling may be done at this point. The source + file name is `dbranch.c'. + + The option `-dd' causes a debugging dump of the RTL code after + this pass. This dump file's name is made by appending `.dbr' to + the input file name. + + * Final. This pass outputs the assembler code for the function. + It is also responsible for identifying spurious test and compare + instructions. Machine-specific peephole optimizations are + performed at the same time. The function entry and exit + sequences are generated directly as assembler code in this pass; + they never exist as RTL. + + The source files are `final.c' plus `insn-output.c'; the latter + is generated automatically from the machine description by the + tool `genoutput'. The header file `conditions.h' is used for + communication between these files. + + * Debugging information output. This is run after final because + it must output the stack slot offsets for pseudo registers that + did not get hard registers. Source files are `dbxout.c' for DBX + symbol table format and `symout.c' for GDB's own symbol table + format. 
+ + Some additional files are used by all or many passes: + + * Every pass uses `machmode.def', which defines the machine modes. + + * All the passes that work with RTL use the header files `rtl.h' + and `rtl.def', and subroutines in file `rtl.c'. The tools + `gen*' also use these files to read and work with the machine + description RTL. + + * Several passes refer to the header file `insn-config.h' which + contains a few parameters (C macro definitions) generated + automatically from the machine description RTL by the tool + `genconfig'. + + * Several passes use the instruction recognizer, which consists of + `recog.c' and `recog.h', plus the files `insn-recog.c' and + `insn-extract.c' that are generated automatically from the + machine description by the tools `genrecog' and `genextract'. + + * Several passes use the header files `regs.h' which defines the + information recorded about pseudo register usage, and + `basic-block.h' which defines the information recorded about + basic blocks. + + * `hard-reg-set.h' defines the type `HARD_REG_SET', a bit-vector + with a bit for each hard register, and some macros to manipulate + it. This type is just `int' if the machine has few enough hard + registers; otherwise it is an array of `int' and some of the + macros expand into loops. + + +File: gcc.info, Node: RTL, Next: Machine Desc, Prev: Passes, Up: Top + +RTL Representation +****************** + + Most of the work of the compiler is done on an intermediate +representation called register transfer language. In this language, +the instructions to be output are described, pretty much one by one, +in an algebraic form that describes what the instruction does. + + RTL is inspired by Lisp lists. It has both an internal form, made +up of structures that point at other structures, and a textual form +that is used in the machine description and in printed debugging +dumps. The textual form uses nested parentheses to indicate the +pointers in the internal form. 
+ +* Menu: + +* RTL Objects:: Expressions vs vectors vs strings vs integers. +* Accessors:: Macros to access expression operands or vector elts. +* Flags:: Other flags in an RTL expression. +* Machine Modes:: Describing the size and format of a datum. +* Constants:: Expressions with constant values. +* Regs and Memory:: Expressions representing register contents or memory. +* Arithmetic:: Expressions representing arithmetic on other expressions. +* Comparisons:: Expressions representing comparison of expressions. +* Bit Fields:: Expressions representing bit-fields in memory or reg. +* Conversions:: Extending, truncating, floating or fixing. +* RTL Declarations:: Declaring volatility, constancy, etc. +* Side Effects:: Expressions for storing in registers, etc. +* Incdec:: Embedded side-effects for autoincrement addressing. +* Assembler:: Representing `asm' with operands. +* Insns:: Expression types for entire insns. +* Calls:: RTL representation of function call insns. +* Sharing:: Some expressions are unique; others *must* be copied. + + +File: gcc.info, Node: RTL Objects, Next: Accessors, Prev: RTL, Up: RTL + +RTL Object Types +================ + + RTL uses four kinds of objects: expressions, integers, strings and +vectors. Expressions are the most important ones. An RTL expression +("RTX", for short) is a C structure, but it is usually referred to +with a pointer; a type that is given the typedef name `rtx'. + + An integer is simply an `int', and a string is a `char *'. Within +RTL code, strings appear only inside `symbol_ref' expressions, but +they appear in other contexts in the RTL expressions that make up +machine descriptions. The written form of an integer uses decimal +digits. + + A string is a sequence of characters. In core it is represented +as a `char *' in usual C fashion, and it is written in C syntax as +well. However, strings in RTL may never be null. 
If you write an +empty string in a machine description, it is represented in core as a +null pointer rather than as a pointer to a null character. In +certain contexts, these null pointers instead of strings are valid. + + A vector contains an arbitrary, specified number of pointers to +expressions. The number of elements in the vector is explicitly +present in the vector. The written form of a vector consists of +square brackets (`[...]') surrounding the elements, in sequence and +with whitespace separating them. Vectors of length zero are not +created; null pointers are used instead. + + Expressions are classified by "expression codes" (also called RTX +codes). The expression code is a name defined in `rtl.def', which is +also (in upper case) a C enumeration constant. The possible +expression codes and their meanings are machine-independent. The +code of an RTX can be extracted with the macro `GET_CODE (X)' and +altered with `PUT_CODE (X, NEWCODE)'. + + The expression code determines how many operands the expression +contains, and what kinds of objects they are. In RTL, unlike Lisp, +you cannot tell by looking at an operand what kind of object it is. +Instead, you must know from its context--from the expression code of +the containing expression. For example, in an expression of code +`subreg', the first operand is to be regarded as an expression and +the second operand as an integer. In an expression of code `plus', +there are two operands, both of which are to be regarded as +expressions. In a `symbol_ref' expression, there is one operand, +which is to be regarded as a string. + + Expressions are written as parentheses containing the name of the +expression type, its flags and machine mode if any, and then the +operands of the expression (separated by spaces). + + Expression code names in the `md' file are written in lower case, +but when they appear in C code they are written in upper case. In +this manual, they are shown as follows: `const_int'. 
+ + In a few contexts a null pointer is valid where an expression is +normally wanted. The written form of this is `(nil)'. + + +File: gcc.info, Node: Accessors, Next: Flags, Prev: RTL Objects, Up: RTL + +Access to Operands +================== + + For each expression type `rtl.def' specifies the number of +contained objects and their kinds, with four possibilities: `e' for +expression (actually a pointer to an expression), `i' for integer, +`s' for string, and `E' for vector of expressions. The sequence of +letters for an expression code is called its "format". Thus, the +format of `subreg' is `ei'. + + Two other format characters are used occasionally: `u' and `0'. +`u' is equivalent to `e' except that it is printed differently in +debugging dumps, and `0' means a slot whose contents do not fit any +normal category. `0' slots are not printed at all in dumps, and are +often used in special ways by small parts of the compiler. + + There are macros to get the number of operands and the format of +an expression code: + +`GET_RTX_LENGTH (CODE)' + Number of operands of an RTX of code CODE. + +`GET_RTX_FORMAT (CODE)' + The format of an RTX of code CODE, as a C string. + + Operands of expressions are accessed using the macros `XEXP', +`XINT' and `XSTR'. Each of these macros takes two arguments: an +expression-pointer (RTX) and an operand number (counting from zero). +Thus, + + XEXP (X, 2) + +accesses operand 2 of expression X, as an expression. + + XINT (X, 2) + +accesses the same operand as an integer. `XSTR', used in the same +fashion, would access it as a string. + + Any operand can be accessed as an integer, as an expression or as +a string. You must choose the correct method of access for the kind +of value actually stored in the operand. You would do this based on +the expression code of the containing expression. That is also how +you would know how many operands there are. 
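A minimal sketch of how such accessors can work, assuming a toy expression layout. Every name beginning with `toy_' or `TOY_' is hypothetical and is not taken from `rtl.h'; the fixed two-slot operand array is a simplification.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: these names are illustrative, not GNU CC's own.  */
enum toy_code { TOY_REG, TOY_SUBREG };

struct toy_rtx;

/* One operand slot.  Nothing in the slot records which member is
   valid; the code of the containing expression decides that.  */
union toy_operand {
  int as_int;              /* format letter `i' */
  const char *as_str;      /* format letter `s' */
  struct toy_rtx *as_exp;  /* format letter `e' */
};

struct toy_rtx {
  enum toy_code code;
  union toy_operand fld[2];  /* two slots are enough for this sketch */
};

/* Each accessor expands into an lvalue, so it can be read or assigned.  */
#define TOY_XEXP(X, N) ((X)->fld[N].as_exp)
#define TOY_XINT(X, N) ((X)->fld[N].as_int)
#define TOY_XSTR(X, N) ((X)->fld[N].as_str)

/* Allocate a zeroed expression of the given code.  */
static struct toy_rtx *
toy_make (enum toy_code code)
{
  struct toy_rtx *x = (struct toy_rtx *) calloc (1, sizeof *x);
  if (x == NULL)
    abort ();
  x->code = code;
  return x;
}

/* Return the register number inside (subreg (reg N) WORD).  The code
   is checked first, because only the code says that operand 0 is an
   expression and operand 1 an integer.  */
static int
toy_subreg_regno (struct toy_rtx *x)
{
  assert (x->code == TOY_SUBREG);
  assert (TOY_XEXP (x, 0)->code == TOY_REG);
  return TOY_XINT (TOY_XEXP (x, 0), 0);
}
```

Because the slot is a union, reading it through the wrong accessor compiles cleanly and simply reinterprets the stored bits, which is why the choice of accessor has to follow the expression code.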
+ + For example, if X is a `subreg' expression, you know that it has +two operands which can be correctly accessed as `XEXP (X, 0)' and +`XINT (X, 1)'. If you did `XINT (X, 0)', you would get the address +of the expression operand but cast as an integer; that might +occasionally be useful, but it would be cleaner to write `(int) XEXP +(X, 0)'. `XEXP (X, 1)' would also compile without error, and would +return the second, integer operand cast as an expression pointer, +which would probably result in a crash when accessed. Nothing stops +you from writing `XEXP (X, 28)' either, but this will access memory +past the end of the expression with unpredictable results. + + Access to operands which are vectors is more complicated. You can +use the macro `XVEC' to get the vector-pointer itself, or the macros +`XVECEXP' and `XVECLEN' to access the elements and length of a vector. + +`XVEC (EXP, IDX)' + Access the vector-pointer which is operand number IDX in EXP. + +`XVECLEN (EXP, IDX)' + Access the length (number of elements) in the vector which is in + operand number IDX in EXP. This value is an `int'. + +`XVECEXP (EXP, IDX, ELTNUM)' + Access element number ELTNUM in the vector which is in operand + number IDX in EXP. This value is an RTX. + + It is up to you to make sure that ELTNUM is not negative and is + less than `XVECLEN (EXP, IDX)'. + + All the macros defined in this section expand into lvalues and +therefore can be used to assign the operands, lengths and vector +elements as well as to access them. + + +File: gcc.info, Node: Flags, Next: Machine Modes, Prev: Accessors, Up: RTL + +Flags in an RTL Expression +========================== + + RTL expressions contain several flags (one-bit bit-fields) that +are used in certain types of expression. Most often they are +accessed with the following macros: + +`EXTERNAL_SYMBOL_P (X)' + In a `symbol_ref' expression, nonzero if it corresponds to a + variable declared extern in the user's code. Zero for all other + variables.
Stored in the `volatil' field and printed as `/v'. + +`MEM_VOLATILE_P (X)' + In `mem' expressions, nonzero for volatile memory references. + Stored in the `volatil' field and printed as `/v'. + +`MEM_IN_STRUCT_P (X)' + In `mem' expressions, nonzero for reference to an entire + structure, union or array, or to a component of one. Zero for + references to a scalar variable or through a pointer to a scalar. + Stored in the `in_struct' field and printed as `/s'. + +`REG_USER_VAR_P (X)' + In a `reg', nonzero if it corresponds to a variable present in + the user's source code. Zero for temporaries generated + internally by the compiler. Stored in the `volatil' field and + printed as `/v'. + +`REG_FUNCTION_VALUE_P (X)' + Nonzero in a `reg' if it is the place in which this function's + value is going to be returned. (This happens only in a hard + register.) Stored in the `integrated' field and printed as `/i'. + + The same hard register may be used also for collecting the + values of functions called by this one, but + `REG_FUNCTION_VALUE_P' is zero in this kind of use. + +`RTX_UNCHANGING_P (X)' + Nonzero in a `reg' or `mem' if the value is not changed + explicitly by the current function. (If it is a memory + reference then it may be changed by other functions or by + aliasing.) Stored in the `unchanging' field and printed as `/u'. + +`RTX_INTEGRATED_P (INSN)' + Nonzero in an insn if it resulted from an in-line function call. + Stored in the `integrated' field and printed as `/i'. This may + be deleted; nothing currently depends on it. + +`INSN_DELETED_P (INSN)' + In an insn, nonzero if the insn has been deleted. Stored in the + `volatil' field and printed as `/v'. + +`CONSTANT_POOL_ADDRESS_P (X)' + Nonzero in a `symbol_ref' if it refers to part of the current + function's "constants pool". These are addresses close to the + beginning of the function, and GNU CC assumes they can be + addressed directly (perhaps with the help of base registers). 
+ Stored in the `unchanging' field and printed as `/u'. + + These are the fields which the above macros refer to: + +`used' + This flag is used only momentarily, at the end of RTL generation + for a function, to count the number of times an expression + appears in insns. Expressions that appear more than once are + copied, according to the rules for shared structure (*note + Sharing::.). + +`volatil' + This flag is used in `mem', `symbol_ref' and `reg' expressions + and in insns. In RTL dump files, it is printed as `/v'. + + In a `mem' expression, it is 1 if the memory reference is + volatile. Volatile memory references may not be deleted, + reordered or combined. + + In a `reg' expression, it is 1 if the value is a user-level + variable. 0 indicates an internal compiler temporary. + + In a `symbol_ref' expression, it is 1 if the symbol is declared + `extern'. + + In an insn, 1 means the insn has been deleted. + +`in_struct' + This flag is used in `mem' expressions. It is 1 if the memory + datum referred to is all or part of a structure or array; 0 if + it is (or might be) a scalar variable. A reference through a C + pointer has 0 because the pointer might point to a scalar + variable. + + This information allows the compiler to determine something + about possible cases of aliasing. + + In an RTL dump, this flag is represented as `/s'. + +`unchanging' + This flag is used in `reg' and `mem' expressions. 1 means that + the value of the expression never changes (at least within the + current function). + + In an RTL dump, this flag is represented as `/u'. + +`integrated' + In some kinds of expressions, including insns, this flag means + the RTL was produced by procedure integration. + + In a `reg' expression, this flag indicates the register + containing the value to be returned by the current function. On + machines that pass parameters in registers, the same register + number may be used for parameters as well, but this flag is not + set on such uses.
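The sharing of one flag field by several macros can be sketched as follows. This is a toy model under stated assumptions: the `toy_'/`TOY_' names are hypothetical, the struct holds only the flags, and the order in which the suffixes are appended is a choice made for this sketch rather than the order a real dump uses.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: one-bit bit-fields named after the flag fields
   described in this node.  */
struct toy_flags {
  unsigned int used : 1;
  unsigned int volatil : 1;
  unsigned int in_struct : 1;
  unsigned int unchanging : 1;
  unsigned int integrated : 1;
};

/* Several macros name the same bit; which meaning applies depends on
   the code of the expression that owns the flags.  */
#define TOY_MEM_VOLATILE_P(X)   ((X)->volatil)     /* in a `mem' */
#define TOY_REG_USER_VAR_P(X)   ((X)->volatil)     /* in a `reg' */
#define TOY_INSN_DELETED_P(X)   ((X)->volatil)     /* in an insn */
#define TOY_MEM_IN_STRUCT_P(X)  ((X)->in_struct)
#define TOY_RTX_UNCHANGING_P(X) ((X)->unchanging)
#define TOY_RTX_INTEGRATED_P(X) ((X)->integrated)

/* Write the dump suffixes for F into BUF, which must hold at least
   nine characters (four two-character suffixes plus the terminator).  */
static void
toy_flag_suffix (const struct toy_flags *f, char *buf)
{
  assert (f != NULL && buf != NULL);
  buf[0] = '\0';
  if (f->volatil)
    strcat (buf, "/v");
  if (f->in_struct)
    strcat (buf, "/s");
  if (f->unchanging)
    strcat (buf, "/u");
  if (f->integrated)
    strcat (buf, "/i");
}
```

Setting the bit through `TOY_MEM_VOLATILE_P' and reading it back through `TOY_REG_USER_VAR_P' yields the same value, which illustrates why a `/v' in a dump has to be interpreted according to the expression it is attached to.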
+ + +File: gcc.info, Node: Machine Modes, Next: Constants, Prev: Flags, Up: RTL + +Machine Modes +============= + + A machine mode describes a size of data object and the +representation used for it. In the C code, machine modes are +represented by an enumeration type, `enum machine_mode', defined in +`machmode.def'. Each RTL expression has room for a machine mode and +so do certain kinds of tree expressions (declarations and types, to +be precise). + + In debugging dumps and machine descriptions, the machine mode of +an RTL expression is written after the expression code with a colon +to separate them. The letters `mode' which appear at the end of each +machine mode name are omitted. For example, `(reg:SI 38)' is a `reg' +expression with machine mode `SImode'. If the mode is `VOIDmode', it +is not written at all. + + Here is a table of machine modes. + +`QImode' + "Quarter-Integer" mode represents a single byte treated as an + integer. + +`HImode' + "Half-Integer" mode represents a two-byte integer. + +`PSImode' + "Partial Single Integer" mode represents an integer which + occupies four bytes but which doesn't really use all four. On + some machines, this is the right mode to use for pointers. + +`SImode' + "Single Integer" mode represents a four-byte integer. + +`PDImode' + "Partial Double Integer" mode represents an integer which + occupies eight bytes but which doesn't really use all eight. On + some machines, this is the right mode to use for certain pointers. + +`DImode' + "Double Integer" mode represents an eight-byte integer. + +`TImode' + "Tetra Integer" (?) mode represents a sixteen-byte integer. + +`SFmode' + "Single Floating" mode represents a single-precision (four byte) + floating point number. + +`DFmode' + "Double Floating" mode represents a double-precision (eight + byte) floating point number. + +`XFmode' + "Extended Floating" mode represents a triple-precision (twelve + byte) floating point number. 
This mode is used for IEEE + extended floating point. + +`TFmode' + "Tetra Floating" mode represents a quadruple-precision (sixteen + byte) floating point number. + +`BLKmode' + "Block" mode represents values that are aggregates to which none + of the other modes apply. In RTL, only memory references can + have this mode, and only if they appear in string-move or vector + instructions. On machines which have no such instructions, + `BLKmode' will not appear in RTL. + +`VOIDmode' + Void mode means the absence of a mode or an unspecified mode. + For example, RTL expressions of code `const_int' have mode + `VOIDmode' because they can be taken to have whatever mode the + context requires. In debugging dumps of RTL, `VOIDmode' is + expressed by the absence of any mode. + +`EPmode' + "Entry Pointer" mode is intended to be used for function + variables in Pascal and other block structured languages. Such + values contain both a function address and a static chain + pointer for access to automatic variables of outer levels. This + mode is only partially implemented since C does not use it. + +`CSImode, ...' + "Complex Single Integer" mode stands for a complex number + represented as a pair of `SImode' integers. Any of the integer + and floating modes may have `C' prefixed to its name to obtain a + complex number mode. For example, there are `CQImode', + `CSFmode', and `CDFmode'. Since C does not support complex + numbers, these machine modes are only partially implemented. + +`BImode' + This is the machine mode of a bit-field in a structure. It is + used only in the syntax tree, never in RTL, and in the syntax + tree it appears only in declaration nodes. In C, it appears + only in `FIELD_DECL' nodes for structure fields defined with a + bit size. + + The machine description defines `Pmode' as a C macro which expands +into the machine mode used for addresses. Normally this is `SImode'. 
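The dump notation described at the start of this node (the mode follows the expression code after a colon, the trailing `mode' is dropped, and `VOIDmode' is not written) can be sketched like this. The function `toy_print_reg' is hypothetical and takes the mode as a name string purely for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: format a `reg' expression the way the text above
   describes its written form.  MODE_NAME is a full mode name such as
   "SImode"; BUF receives at most SIZE characters including the
   terminator.  */
static void
toy_print_reg (char *buf, size_t size, const char *mode_name, int regno)
{
  size_t len;

  assert (buf != NULL && size > 0 && mode_name != NULL);
  len = strlen (mode_name);

  if (strcmp (mode_name, "VOIDmode") == 0)
    /* VOIDmode is expressed by the absence of any mode.  */
    snprintf (buf, size, "(reg %d)", regno);
  else if (len > 4 && strcmp (mode_name + len - 4, "mode") == 0)
    /* Drop the trailing letters `mode'.  */
    snprintf (buf, size, "(reg:%.*s %d)", (int) (len - 4), mode_name, regno);
  else
    snprintf (buf, size, "(reg:%s %d)", mode_name, regno);
}
```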
+ + The only modes which a machine description must support are +`QImode', `SImode', `SFmode' and `DFmode'. The compiler will attempt +to use `DImode' for two-word structures and unions, but this can be +prevented by overriding the definition of `MAX_FIXED_MODE_SIZE'. +Likewise, you can arrange for the C type `short int' to avoid using +`HImode'. In the long term it might be desirable to make the set of +available machine modes machine-dependent and eliminate all +assumptions about specific machine modes or their uses from the +machine-independent code of the compiler. + + To help begin this process, the machine modes are divided into +mode classes. These are represented by the enumeration type `enum +mode_class' defined in `rtl.h'. The possible mode classes are: + +`MODE_INT' + Integer modes. By default these are `QImode', `HImode', + `SImode', `DImode', `TImode', and also `BImode'. + +`MODE_FLOAT' + Floating-point modes. By default these are `QFmode', `HFmode', + `SFmode', `DFmode' and `TFmode', but the MC68881 also defines + `XFmode' to be an 80-bit extended-precision floating-point mode. + +`MODE_COMPLEX_INT' + Complex integer modes. By default these are `CQImode', + `CHImode', `CSImode', `CDImode' and `CTImode'. + +`MODE_COMPLEX_FLOAT' + Complex floating-point modes. By default these are `CQFmode', + `CHFmode', `CSFmode', `CDFmode' and `CTFmode'. + +`MODE_FUNCTION' + Algol or Pascal function variables including a static chain. + (These are not currently implemented). + +`MODE_RANDOM' + This is a catchall mode class for modes which don't fit into the + above classes. Currently `VOIDmode', `BLKmode' and `EPmode' are + in `MODE_RANDOM'. + + Here are some C macros that relate to machine modes: + +`GET_MODE (X)' + Returns the machine mode of the RTX X. + +`PUT_MODE (X, NEWMODE)' + Alters the machine mode of the RTX X to be NEWMODE. + +`NUM_MACHINE_MODES' + Stands for the number of machine modes available on the target + machine.
This is one greater than the largest numeric value of + any machine mode. + +`GET_MODE_NAME (M)' + Returns the name of mode M as a string. + +`GET_MODE_CLASS (M)' + Returns the mode class of mode M. + +`GET_MODE_SIZE (M)' + Returns the size in bytes of a datum of mode M. + +`GET_MODE_BITSIZE (M)' + Returns the size in bits of a datum of mode M. + +`GET_MODE_UNIT_SIZE (M)' + Returns the size in bytes of the subunits of a datum of mode M. + This is the same as `GET_MODE_SIZE' except in the case of + complex modes and `EPmode'. For them, the unit size is the size + of the real or imaginary part, or the size of the function + pointer or the context pointer. + +
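The relationship between the size macros can be sketched with a small table. This is a toy model: the `toy_'/`TOY_' names are hypothetical, the byte sizes are taken from the table of modes earlier in this node, `CDFmode' is assumed by analogy with `CSImode' to be a pair of `DFmode' values, and an 8-bit byte is assumed. Unit sizes are given in bytes so that they are comparable with the full size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: one table row per mode.  */
struct toy_mode_info {
  const char *name;
  int size;       /* size of a datum, in bytes */
  int unit_size;  /* size of one subunit, in bytes */
};

static const struct toy_mode_info toy_modes[] = {
  { "QImode",  1,  1 },
  { "HImode",  2,  2 },
  { "SImode",  4,  4 },
  { "DImode",  8,  8 },
  { "SFmode",  4,  4 },
  { "DFmode",  8,  8 },
  { "CSImode", 8,  4 },  /* pair of SImode integers */
  { "CDFmode", 16, 8 },  /* assumed: pair of DFmode values */
};

#define TOY_NUM_MODES ((int) (sizeof toy_modes / sizeof toy_modes[0]))
#define TOY_BITS_PER_BYTE 8  /* assumption for this sketch */

#define TOY_MODE_NAME(M)      (toy_modes[M].name)
#define TOY_MODE_SIZE(M)      (toy_modes[M].size)
#define TOY_MODE_BITSIZE(M)   (toy_modes[M].size * TOY_BITS_PER_BYTE)
#define TOY_MODE_UNIT_SIZE(M) (toy_modes[M].unit_size)

/* Return the table index of the mode called NAME, or -1 if the table
   has no such mode.  */
static int
toy_find_mode (const char *name)
{
  int i;

  assert (name != NULL);
  for (i = 0; i < TOY_NUM_MODES; i++)
    if (strcmp (TOY_MODE_NAME (i), name) == 0)
      return i;
  return -1;
}
```

For the scalar rows the unit size equals the full size; only the complex rows differ, where the unit is the real or imaginary part.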
\ No newline at end of file
