[guss-hackers] machine description format

From:	Johan Rydberg
Subject:	[guss-hackers] machine description format
Date:	Mon, 2 Jun 2003 18:09:52 +0200


Hi,

It is not easy to develop and maintain all the parts of a
simulator for a specific architecture.  For that reason, it
is better to describe the architecture in some form of
pseudo-language.  GCC does this with its .md files and
CPU descriptions.

GCC uses RTL to describe instructions and attributes of
the architecture.  The format of the machine description file
is quite simple and quite a few people are familiar with
the format so I suggest that we develop something for GUSS
that resembles the GCC machine description files.

The key difference between the needs of a GUSS and a GCC
machine description is that the GCC one specifies how insns
should be generated from a particular sequence of RTL expressions.
GUSS must do the opposite: translate from an insn into a sequence
of RTL expressions.

So what should the machine description file contain? Insn
definitions of course, but also definitions of the insn fields.
It is also possible to describe the hardware pieces of the
architecture (such as the register file), but I'm not sure that
such descriptions are needed at the moment.

One problem with CISC-like ISAs, such as the IA-32 ISA, is that
insn fields can be positioned differently depending on the opcode.
This makes it quite a complex task to figure out which fields
should be extracted, and in what order.  Maybe it is better to make
the insn decoder dumb, and let hand-written functions handle the
extraction order.
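
To make the "dumb decoder plus hand-written extraction" idea concrete,
here is a minimal sketch in C.  It assumes the generated decoder only
looks at the opcode byte and leaves everything after it to a helper.
The struct and function names are hypothetical and not part of GUSS;
the bit layout of the ModRM byte (mod in bits 7..6, reg in bits 5..3,
r/m in bits 2..0) is the standard IA-32 one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three fields of an IA-32 ModRM byte. */
struct modrm_fields
{
  uint8_t mod;  /* bits 7..6: addressing mode */
  uint8_t reg;  /* bits 5..3: register operand */
  uint8_t rm;   /* bits 2..0: register or memory operand */
};

/* Hand-written extractor: it knows that, for this insn form, the ModRM
   byte directly follows a one-byte opcode.  Returns the number of bytes
   consumed, or 0 if the buffer is too short.  */
static size_t
extract_modrm_fields (const uint8_t *insn, size_t len,
                      struct modrm_fields *out)
{
  assert (insn != NULL && out != NULL);
  if (len < 2)
    return 0;

  uint8_t b = insn[1];
  out->mod = (b >> 6) & 0x3;
  out->reg = (b >> 3) & 0x7;
  out->rm = b & 0x7;
  return 2;
}
```

A helper like this would be called from the generated decoder once the
opcode has been matched, so the generator itself never needs to know
where the fields sit for a given opcode.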

The GCC machine descriptor RTX define_insn has the following
syntax:

  (define_insn NAME
    MATCH-SEQUENCE
    CONSTRAIN
    INSNS
    [ATTRS]
  )

The MATCH-SEQUENCE is a small tree or sequence of RTL expressions;
when it is recognized, one of INSNS is emitted.

I suggest the following format for the define_insn RTX:

  (define_insn NAME
    MATCH-SEQUENCE
    CONSTRAIN
    INSN-RTX
    [ATTRS]
  )

Where MATCH-SEQUENCE matches the opcode of the insn.  There may be
more than one insn with the same match sequence, as long as the
CONSTRAIN strings are different.

To extract fields/operands the match_operand RTX is used.  Its
first argument is the operand number (an insn typically has 3 operands).
The second argument is a function name that is used to extract
the operand from the insn stream.  The match_operand RTX will
expand into a function call to "extract_FN" with the mode of the
operand, the operand number and an optional argument as arguments.

The example below defines the "add Ev,Gv" insn from the IA-32 ISA.
The DATASIZE16 and DATASIZE32 constraints are used to distinguish the
data modes.

(define_insn "add+ew+gw"
  (+ (f-opcode 0x01))
  "DATASIZE16"
  (set (match_operand:HI 0 "general_modrm")
       (plus:HI (match_dup:HI 0)
                (match_operand:HI 1 "general_register")))
)

(define_insn "add+el+gl"
  (+ (f-opcode 0x01))
  "DATASIZE32"
  (set (match_operand:SI 0 "general_modrm")
       (plus:SI (match_dup:SI 0)
                (match_operand:SI 1 "general_register")))
)

The example could expand into the following decoder:

  switch (f_opcode)
    {
      ...
      case 1:
        if (DATASIZE16)
          {
            itype = INSN_add_ew_gw;
            extract_general_modrm (info, vpc, 0, HImode, -1);
            extract_general_register (info, vpc, 1, HImode, -1);
          }
        else if (DATASIZE32)
          {
            itype = INSN_add_el_gl;
            extract_general_modrm (info, vpc, 0, SImode, -1);
            extract_general_register (info, vpc, 1, SImode, -1);
          }
        break;
      ...
    }
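
For anyone who wants to experiment with the decoder shape, here is a
self-contained sketch of it in C.  Everything except the names taken
from the fragment above is an assumption: the decode_info struct, the
operand_kind enum, the datasize field standing in for the DATASIZE16
and DATASIZE32 conditions, and the stub extractors that only record
which operand was requested in which mode.

```c
#include <assert.h>
#include <stdint.h>

enum machine_mode { HImode, SImode };
enum insn_type { INSN_unknown, INSN_add_ew_gw, INSN_add_el_gl };
enum operand_kind { OPERAND_none, OPERAND_modrm, OPERAND_register };

#define MAX_OPERANDS 3

/* Hypothetical decode state; not GUSS's real structure. */
struct decode_info
{
  int datasize;                          /* 16 or 32 */
  enum operand_kind kind[MAX_OPERANDS];
  enum machine_mode mode[MAX_OPERANDS];
};

static void
record_operand (struct decode_info *info, int n,
                enum operand_kind kind, enum machine_mode mode)
{
  assert (n >= 0 && n < MAX_OPERANDS);
  info->kind[n] = kind;
  info->mode[n] = mode;
}

/* Stub extractors: a real one would read bytes at VPC.  ARG is the
   optional argument from the description (-1 when unused).  */
static void
extract_general_modrm (struct decode_info *info, const uint8_t *vpc,
                       int n, enum machine_mode mode, int arg)
{
  (void) vpc;
  (void) arg;
  record_operand (info, n, OPERAND_modrm, mode);
}

static void
extract_general_register (struct decode_info *info, const uint8_t *vpc,
                          int n, enum machine_mode mode, int arg)
{
  (void) vpc;
  (void) arg;
  record_operand (info, n, OPERAND_register, mode);
}

/* What a generator might emit for the two "add Ev,Gv" definitions. */
static enum insn_type
decode_insn (struct decode_info *info, const uint8_t *vpc)
{
  uint8_t f_opcode = vpc[0];

  switch (f_opcode)
    {
    case 0x01:
      if (info->datasize == 16)
        {
          extract_general_modrm (info, vpc, 0, HImode, -1);
          extract_general_register (info, vpc, 1, HImode, -1);
          return INSN_add_ew_gw;
        }
      else if (info->datasize == 32)
        {
          extract_general_modrm (info, vpc, 0, SImode, -1);
          extract_general_register (info, vpc, 1, SImode, -1);
          return INSN_add_el_gl;
        }
      break;
    default:
      break;
    }
  return INSN_unknown;
}
```

Two definitions that share a match sequence become one case with one
branch per condition string, which is why the condition strings must
differ.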

When executing the insn in the interpreter the match_operand RTX
plays another role.  The function name is used as a template for
getter and setter functions for the operand.  The add+ew+gw insn
above could result in the following C code:

  INSN_add_ew_gw:
    {
      set_general_modrm_HI (current_pc, 0,
                            PLUSHI (get_general_modrm_HI (current_pc, 0),
                                    get_general_register_HI (current_pc, 1)));
    }
  NEXT_INSN ();

The getters and setters can be inline functions or macros, so
the overhead does not have to be that great.
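
As a rough illustration of how cheap such accessors can be, here is a
sketch with inline functions and a PLUSHI macro.  It makes several
simplifying assumptions that are not in the proposal: both operands are
registers, the decoder has already stored their register numbers in a
hypothetical cpu_state struct (instead of looking them up via
current_pc), and flags are not updated.  A 16-bit write leaves the
upper half of the 32-bit register untouched, as on IA-32.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 8
#define MAX_OPERANDS 3

/* Hypothetical interpreter state; not GUSS's real structure. */
struct cpu_state
{
  uint32_t regs[NUM_REGS];
  int operand_reg[MAX_OPERANDS];  /* register number of each operand */
};

/* 16-bit addition that wraps around on overflow. */
#define PLUSHI(a, b) ((uint16_t) ((a) + (b)))

static inline uint16_t
get_general_register_HI (const struct cpu_state *cpu, int n)
{
  assert (n >= 0 && n < MAX_OPERANDS);
  return (uint16_t) (cpu->regs[cpu->operand_reg[n]] & 0xffffu);
}

static inline void
set_general_register_HI (struct cpu_state *cpu, int n, uint16_t value)
{
  assert (n >= 0 && n < MAX_OPERANDS);
  uint32_t *reg = &cpu->regs[cpu->operand_reg[n]];
  /* Replace only the low 16 bits. */
  *reg = (*reg & 0xffff0000u) | value;
}

/* Body of a 16-bit register-to-register add: op0 = op0 + op1. */
static inline void
exec_add_ew_gw (struct cpu_state *cpu)
{
  set_general_register_HI (cpu, 0,
                           PLUSHI (get_general_register_HI (cpu, 0),
                                   get_general_register_HI (cpu, 1)));
}
```

With inlining, a compiler can reduce the whole insn body to a few
loads, an add and a masked store.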

Another thing that can be generated from the machine descriptor is
the code that generates micro instructions that are later translated
into host code.  Here the match_operand function plays a third role.
It acts as a template for functions that generate micro insns that
either fetch or store an operand.


  case INSN_add_ew_gw:
     {
       set_general_modrm (info, current_pc, 0, HImode, -1,
                          gen_bex (PLUS, HImode,
                                   get_general_modrm (current_pc, 0, HImode, -1),
                                   get_general_register (current_pc, 1, HImode, -1)));
     }

The SImode counterpart of the code above would generate the following
micro insns for "addl %ecx, %ebx" (which can of course also be encoded
as an "add Gv, Ev" insn):

    GET %ecx, tmp0
    GET %ebx, tmp1
    PLUS tmp0, tmp1, tmp2
    PUT tmp2, %ebx

The GET bex (bex stands for "backend expression") fetches a value
from the register file and stores it in a temporary register.  PUT stores
a value in the register file.
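
A small sketch of how GET, PLUS and PUT could be emitted into a buffer
for a register-to-register add follows.  The bex struct, the buffer and
the gen_* helpers are all hypothetical names chosen for illustration;
only the three operation names and the use of temporaries come from the
description above.

```c
#include <assert.h>

/* Hypothetical micro-insn ("bex") representation. */
enum bex_op { BEX_GET, BEX_PLUS, BEX_PUT };

struct bex
{
  enum bex_op op;
  int src0;  /* guest register (GET) or temporary (PLUS, PUT) */
  int src1;  /* second temporary for PLUS, -1 otherwise */
  int dst;   /* temporary (GET, PLUS) or guest register (PUT) */
};

#define MAX_BEX 16

struct bex_buf
{
  struct bex insns[MAX_BEX];
  int count;     /* number of micro insns emitted so far */
  int next_tmp;  /* next free temporary number */
};

static int
emit (struct bex_buf *buf, enum bex_op op, int src0, int src1, int dst)
{
  assert (buf->count < MAX_BEX);
  buf->insns[buf->count++] = (struct bex) { op, src0, src1, dst };
  return dst;
}

/* GET: copy a guest register into a fresh temporary. */
static int
gen_get (struct bex_buf *buf, int guest_reg)
{
  return emit (buf, BEX_GET, guest_reg, -1, buf->next_tmp++);
}

/* PLUS: add two temporaries into a fresh temporary. */
static int
gen_plus (struct bex_buf *buf, int tmp_a, int tmp_b)
{
  return emit (buf, BEX_PLUS, tmp_a, tmp_b, buf->next_tmp++);
}

/* PUT: copy a temporary back into a guest register. */
static void
gen_put (struct bex_buf *buf, int tmp, int guest_reg)
{
  emit (buf, BEX_PUT, tmp, -1, guest_reg);
}

/* Micro insns for dst_reg = dst_reg + src_reg. */
static void
gen_add (struct bex_buf *buf, int dst_reg, int src_reg)
{
  int t0 = gen_get (buf, src_reg);
  int t1 = gen_get (buf, dst_reg);
  int t2 = gen_plus (buf, t0, t1);
  gen_put (buf, t2, dst_reg);
}
```

Calling gen_add on an empty buffer yields four micro insns (two GETs,
one PLUS, one PUT) using temporaries 0 to 2, which a later pass could
translate into host code.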

Of course this is just some of my ideas.  You are welcome to comment,
make suggestions and ask questions.  I would like this project to be
more than a one-man-show.

brgds,
johan rydberg
