
Re: [Qemu-devel] [RFC PATCH 0/8] Towards an Heterogeneous QEMU


From: mar.krzeminski
Subject: Re: [Qemu-devel] [RFC PATCH 0/8] Towards an Heterogeneous QEMU
Date: Mon, 26 Oct 2015 18:12:46 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0



On 25.10.2015 at 22:38, Peter Crosthwaite wrote:
On Thu, Oct 22, 2015 at 2:21 AM, Christian Pinto
<address@hidden> wrote:
Hello Peter,


On 07/10/2015 17:48, Peter Crosthwaite wrote:
On Mon, Oct 5, 2015 at 8:50 AM, Christian Pinto
<address@hidden> wrote:
Hello Peter,

thanks for your comments

On 01/10/2015 18:26, Peter Crosthwaite wrote:
On Tue, Sep 29, 2015 at 6:57 AM, Christian Pinto
<address@hidden>  wrote:
Hi all,

This RFC patch series introduces the set of changes enabling the
architectural elements to model the architecture presented in a
previous RFC letter: "[Qemu-devel][RFC] Towards an Heterogeneous QEMU".
Sorry for the late response, unfortunately my M3+A9 SoC cannot be published (yet).
But I am working on it.
and the OS binary image needs
to be placed in memory at model startup.

I don't see what this limitation is exactly. Can you explain more? I
do see a need to work on the ARM bootloader for AMP flows; it is a
pure SMP bootloader that assumes total control.
The problem here, as I saw it, was that when we launch QEMU a binary
needs to be provided and put in memory in order to be executed. In
this patch series the slave doesn't have proper memory allocated when
first launched.
But it could though, couldn't it? Can't the slave guest just have full
access to its own address space (probably very similar to the master's
address space) from machine init time? This seems more realistic than
setting up the hardware based on guest-level information.

Actually the address space for a slave is built at init time; what is
not completely configured is the memory region modeling the RAM. That
region is configured in terms of size, but there is no pointer to the
actual memory. The pointer is mmap-ed later, before the slave boots.

based on what information? Is the master guest controlling this? If so
what is the real-hardware analogue for this concept where the address
map of the slave can change (i.e. be configured) at runtime?
I am not sure if it is the case since I haven't emulated this yet (and it has very low priority), but I might have a real case in my M3+A9 - M3 has 256MiB window that can be moved over the 1GiB system memory at runtime.


The information about memory (fd + offset for mmap) is sent only
later, when the boot is triggered. This is also safe since the slave
will be waiting in the incoming state, so no corruption or errors can
happen before the boot is triggered.
I was thinking more about your comment about slave-to-slave
interrupts. This would just trivially be a local software-generated
interrupts of some form within the slave cluster.

Sorry, I did not catch your comment the first time. You are right, if
cores are in the same cluster a software-generated interrupt is going
to be enough. Of course the eventfd-based interrupts make sense for a
remote QEMU.

Is eventfd a better implementation of remote-port GPIOs as in the Xilinx work?

Re the terminology, I don't like the idea of thinking of inter-qemu
"interrupts" as whatever system we decide on should be able to support
arbitrary signals going from one QEMU to another. I think the Xilinx
work already has reset signals going between the QEMU peers.

The multi client-socket is used for the master to trigger the boot of
a slave, and also for each master-slave couple to exchange the eventfd
file descriptors. The IDM device can be instantiated either as a PCI
or sysbus device.

So if everything is in one QEMU, IPIs can be implemented with just a
set of registers that put the master in "control" of each of the
slaves. The IDM device is already seen as a regular device by each of
the QEMU instances involved.

I'm starting to think this series is two things that should be
decoupled. One is the abstract device(s) to facilitate your AMP, the
other is the inter-qemu communication. For the abstract device, I
guess this would be a new virtio-idm device. We should try and involve
virtio people perhaps. I can see the value in it quite separate from
modelling the real sysctrl hardware.

Interesting, which other value/usage do you see in it? For me the IDM was
meant to
It has value in prototyping with your abstract toolkit even with
homogeneous hardware. E.g. I should be able to just use a single-QEMU
ARM virt machine with -smp 2 and create one of these virtio-AMP setups.
Homogeneous hardware with heterogeneous software using your new pieces
of abstract hardware.

It is also more practical for getting a merge of your work, as you are
targeting two different audiences with it. People interested in virtio
can handle the new devices you create, while the core maintainers can
handle your multi-QEMU work. They are two rather big new features.

work as an abstract system controller to centralize the management
of the slaves (boot_regs and interrupts).


But I think the implementation
should be free of any inter-QEMU awareness. E.g. from P4 of this
series:

+static void send_shmem_fd(IDMState *s, MSClient *c)
+{
+    int fd, len;
+    uint32_t *message;
+    HostMemoryBackend *backend = MEMORY_BACKEND(s->hostmem);
+
+    len = strlen(SEND_MEM_FD_CMD)/4 + 3;
+    message = malloc(len * sizeof(uint32_t));
+    strcpy((char *) message, SEND_MEM_FD_CMD);
+    message[len - 2] = s->pboot_size;
+    message[len - 1] = s->pboot_offset;
+
+    fd = memory_region_get_fd(&backend->mr);
+
+    multi_socket_send_fds_to(c, &fd, 1, (char *) message, len * sizeof(uint32_t));

The device itself is aware of shared-memory and multi-sockets. Using
the device for single-QEMU AMP would require neither - can the IDM
device be used in a homogeneous AMP flow in one of our existing SMP
machine models (eg on a dual core A9 with one core being master and
the other slave)?

Can this be architected in two phases for greater utility, with the
AMP devices as just normal devices, and the inter-qemu communication
as a separate feature?

I see your point, and it is an interesting proposal.

What I can think of here, to remove the awareness of how the IDM
communicates with the slaves, is to define a kind of AMP Slave
interface. There would be an instance of the interface for each of the
slaves, encapsulating the communication part (either local or
socket-based). The AMP Slave interfaces would be what you called the
AMP devices, with one device per slave.

Do we need this hard definition of master and slave in the hardware?
Can the virtio-device be more peer-peer and the master-slave
relationship is purely implemented by the guest?

Regards,
Peter

On the master side, besides the IDM, one would instantiate as many
interface devices as there are slaves. During initialization the IDM
will link with all those interfaces, and only call functions like
send_interrupt() or boot_slave() to interact with the slaves. The
interface will be the same for both local and remote slaves, while the
implementation of the methods will differ and reside in the specific
AMP Slave Interface device. On the slave side, if the slave is remote,
another instance of the interface is instantiated so as to connect to
the socket/eventfd.

So, as an example, the send_shmem_fd function you pointed out could be
hidden in the slave interface, and invoked only when the IDM calls the
slave_boot() function of a remote slave interface.

This would raise the level of abstraction and open the door to
potentially any communication mechanism between master and slave,
without the need to adapt the IDM device to each specific case. Or,
eventually, to mix local and remote instances.


Thanks,

Christian

Regards,
Peter

Regards,
Marcin


