
Re: [Lightning] i386 sse support


From: Paulo César Pereira de Andrade
Subject: Re: [Lightning] i386 sse support
Date: Fri, 17 Sep 2010 15:31:46 -0300

On 17 September 2010 04:28, Paolo Bonzini <address@hidden> wrote:
>>   BTW, I was about to change
>> http://code.google.com/p/exl/source/browse/trunk/lib/elightning.c#690
>> because I thought it was caused by the bad jumps in the x87 and/or SSE
>> code, but actually it still happens when SSE instructions are used to
>> add/sub/etc. vectors (using gcc __attribute__((vector_size(16)))) and the
>> generated JIT code uses x87. Maybe gcc is at fault for not calling EMMS?
>> Or is it right to assume that code compiled with -msse should not
>> care...
>> (or maybe Intel is at fault for messing with x87 when using some SSE
>> instructions :-)
>
> EMMS is for mmx only.

  Which is kind of a subset of SSE :-) I have not (yet) tested inserting an
"emms" after returning from the "vector" instructions, but I am almost
sure it would correct the problem. Things get somewhat crazy with FPE
exceptions, but my code handles FPE without looking at the details: it just
treats it as an integer divide by zero or a domain error, which should not
happen under normal conditions, since it checks values before calling
some functions, creates complex numbers, etc.
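
  For reference, the untested experiment described above could be sketched
roughly as follows. This is only an illustration, assuming gcc or clang:
vec_add, reset_x87_tags, x87_work and vector_then_x87 are made-up names, and
x87_work merely stands in for the JIT-generated x87 code. Note that EMMS only
resets the x87 tag word, which matters because the MMX registers alias the x87
stack; the XMM registers normally used for vector_size(16) do not, so whether
this changes anything is exactly what remains to be tested.

```c
#include <assert.h>
#if defined(__MMX__)
#include <mmintrin.h>   /* _mm_empty(), the intrinsic that emits EMMS */
#endif

/* 16-byte vector of four floats, using the gcc vector extension. */
typedef float v4sf __attribute__((vector_size(16)));

/* Element-wise add; with -msse this is normally compiled to SSE code. */
static v4sf vec_add(v4sf a, v4sf b)
{
    return a + b;
}

/* Hypothetical helper: mark all x87 registers as empty where MMX is
 * available, and do nothing on other targets. */
static void reset_x87_tags(void)
{
#if defined(__MMX__)
    _mm_empty();
#endif
}

/* Stand-in for calling JIT-generated x87 code (assumption: long double
 * arithmetic is done on the x87 unit, as it is with gcc on x86). */
static long double x87_work(long double x, long double y)
{
    return x / y + 1.0L;
}

/* Vector add first, then reset the x87 state, then the x87-style work. */
long double vector_then_x87(void)
{
    v4sf a = { 1.0f, 2.0f, 3.0f, 4.0f };
    v4sf b = { 4.0f, 3.0f, 2.0f, 1.0f };
    v4sf c = vec_add(a, b);

    assert(c[0] == 5.0f && c[3] == 5.0f);
    reset_x87_tags();
    return x87_work((long double)c[1], 2.0L);
}
```

  Calling vector_then_x87() with and without the reset_x87_tags() line would
show whether the "emms" makes any difference for the failing case.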

>>   Also, this should be discussed at some point :-) I hope this code will
>> be included in upstream lightning, so that others can benefit
>> from the changes, but I have already put in a lot of work, and plan to
>> do a lot more on it, so it would be fair to add my name
>> to the sources, and update the FSF copyright to 2010.
>
> I will add you to the AUTHORS file too.

  Ok, but I am talking only about the code in my branch, as I rewrote
almost all of the i386/x86_64 logic, although most of it was not written
from scratch. The changes include better logic for handling x87 registers,
which previously did not cover some "shortcuts", and better immediate
handling, such as using inc/dec, or using xchg when applicable instead of
push/pop to swap register values, etc.
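
  To illustrate the kind of shortcut meant here, a minimal sketch of an
emitter that picks the shortest encoding could look like the code below.
It is not the code from the branch and not lightning's API: emit_add_imm
and emit_swap are hypothetical names, registers are assumed to be the eight
32-bit registers numbered 0 (eax) to 7 (edi), and the opcodes are the
standard x86 ones (FF /0 inc, FF /1 dec, 83 /0 ib and 81 /0 id add,
87 /r xchg).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emitter: writes "add reg, imm" into buf and returns the
 * number of bytes written. inc/dec do not update the carry flag and the
 * imm == 0 case emits nothing, so these shortcuts are only valid when the
 * flags set by the add are not consumed afterwards. */
size_t emit_add_imm(uint8_t *buf, unsigned reg, int32_t imm)
{
    size_t n = 0;

    assert(reg < 8);
    if (imm == 0)
        return 0;                            /* nothing to emit */
    if (imm == 1) {                          /* inc reg: FF /0 */
        buf[n++] = 0xFF;
        buf[n++] = (uint8_t)(0xC0 | reg);
    } else if (imm == -1) {                  /* dec reg: FF /1 */
        buf[n++] = 0xFF;
        buf[n++] = (uint8_t)(0xC8 | reg);
    } else if (imm >= -128 && imm <= 127) {  /* add reg, imm8: 83 /0 ib */
        buf[n++] = 0x83;
        buf[n++] = (uint8_t)(0xC0 | reg);
        buf[n++] = (uint8_t)imm;
    } else {                                 /* add reg, imm32: 81 /0 id */
        uint32_t u = (uint32_t)imm;
        buf[n++] = 0x81;
        buf[n++] = (uint8_t)(0xC0 | reg);
        for (int i = 0; i < 4; i++)          /* little-endian immediate */
            buf[n++] = (uint8_t)(u >> (8 * i));
    }
    return n;
}

/* Hypothetical emitter: swaps two registers with a single xchg (87 /r)
 * instead of a push/pop sequence that goes through the stack. */
size_t emit_swap(uint8_t *buf, unsigned a, unsigned b)
{
    assert(a < 8 && b < 8);
    if (a == b)
        return 0;                            /* swapping with itself */
    buf[0] = 0x87;
    buf[1] = (uint8_t)(0xC0 | (b << 3) | a); /* mod=11, reg=b, rm=a */
    return 2;
}
```

  With this sketch, adding 1 to eax takes 2 bytes (FF C0) instead of the
3 bytes of the imm8 form, and swapping eax with ecx is the 2 bytes 87 C8.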

> Paolo

Paulo


