Re: Acceptance Test BootLinuxAarch64.test_virt_tcg execution times
From: Philippe Mathieu-Daudé
Subject: Re: Acceptance Test BootLinuxAarch64.test_virt_tcg execution times
Date: Wed, 12 Aug 2020 14:19:44 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0
On 8/6/20 9:35 PM, Cleber Rosa wrote:
> TL;DR: This is a followup from an IRC chat about the
> tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg test
> taking many orders of magnitude longer than other acceptance (and even
> similar boot) tests. I could not find an easy way for this specific
> test (aarch64+tcg) to have a significant execution time improvement.
> The best solution may be to filter out tests that are known to be
> slow, and create a specific "test job" that includes them.
>
> First, in case it's not clear, this specific test runs QEMU with TCG,
> boots a Fedora 31 "cloud image", and waits until the cloud-init agent
> notifies the test that the boot is over.
>
> Of the four architectures tested with the same approach under
> "tests/acceptance/boot_linux.py", aarch64 was special, in the sense
> that many Linux "cloud images" got stuck very late in the boot
> process. If my memory serves me right, the cause seemed to be disk
> activity within the guest that made the kernel drain its random number
> sources. Giving the machine an RNG device fixed it. This can still be
> verified today if you comment out the virtio-rng lines in the
> aarch64 test.
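
For reference, the RNG workaround described above amounts to pairing a
host entropy backend with a virtio-rng device on the QEMU command line.
A minimal sketch in Python, assuming the backend reads from
/dev/urandom; the helper name rng_args and the id "rng0" are
illustrative and not necessarily what the test itself uses:

```python
def rng_args(rng_id="rng0", source="/dev/urandom"):
    """Return QEMU arguments that give the guest a virtio RNG device.

    rng_id and source are illustrative defaults (assumptions).
    """
    return [
        # Host-side entropy backend.
        "-object", f"rng-random,id={rng_id},filename={source}",
        # Guest-visible device wired to that backend.
        "-device", f"virtio-rng-pci,rng={rng_id}",
    ]


if __name__ == "__main__":
    print(" ".join(rng_args()))
```

Dropping these two options from the aarch64 guest should be enough to
reproduce the late-boot stall mentioned above.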
>
> So, even with the RNG device and the boot process not getting stuck, a
> lot of the test time is spent with QEMU actively using CPU time on
> work generated by the guest boot process. This may or may not be the
> cause of the slowness.
I believe this is related to the issue Richard recently addressed:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg729216.html
>
> One approach to a shorter test time would be to reduce the
> things that happen during the guest boot process. Choosing a minimal
> guest, such as CirrOS, would be an example of such a solution, but:
>
> * With fewer things happening during the guest boot, fewer things
> get tested within QEMU;
>
> * CirrOS cannot make use of the same cloud-init boot configuration
> and boot verification the test currently uses.
>
> So that leaves other non-minimal Linux "cloud images" as options. But
> still, the following things are required or nice to have:
>
> * Support for cloud-init;
>
> * Support for as many architectures as possible;
>
> * Wide user base;
>
> * Be thoroughly tested with this "boot_linux.py" test.
>
> So in the end, I picked Fedora 31, which was available and behaved
> well for four different architectures with and without KVM. Today, I
> checked whether switching distros would provide an "easy fix", but the
> results were negative. Any ideas on how to improve the test execution
> times are appreciated.
>
> For the record, one of the ways we're trying to improve the overall
> test experience is to allow tests to run in parallel (expected to be
> fully supported in the upcoming Avocado version 81.0).
>
> For those interested, these are the numbers I got, and how I tested
> with other distros. I'm using QEMU e1d322c405 with a vanilla
> configure on an x86_64 Fedora 32 host.
>
> Fedora 31 (baseline):
> ====================
>
> $ make check-venv
> $ ./tests/venv/bin/avocado run -t arch:aarch64,accel:tcg --keep-tmp on -- tests/acceptance/boot_linux.py{,,,,}
> JOB ID : 14802f9d5016a44d2937ed7b1fec63b2eaa06e89
> JOB LOG : /home/cleber/avocado/job-results/job-2020-08-06T13.19-14802f9/job.log
> (1/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (153.12 s)
> (2/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (149.57 s)
> (3/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (154.45 s)
> (4/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (148.97 s)
> (5/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (150.70 s)
> RESULTS : PASS 5 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
> JOB TIME : 757.50 s
>
> Fedora 32:
> ==========
>
> 1. Tweak version and image hash:
>
> ---
> diff --git a/tests/acceptance/boot_linux.py b/tests/acceptance/boot_linux.py
> index 0055dc7cee..44c62bd4a2 100644
> --- a/tests/acceptance/boot_linux.py
> +++ b/tests/acceptance/boot_linux.py
> @@ -48,7 +48,7 @@ class BootLinuxBase(Test):
> image_arch = 'ppc64le'
> try:
> boot = vmimage.get(
> - 'fedora', arch=image_arch, version='31',
> + 'fedora', arch=image_arch, version='32',
> checksum=self.chksum,
> algorithm='sha256',
> cache_dir=self.cache_dirs[0],
> @@ -160,7 +160,7 @@ class BootLinuxAarch64(BootLinux):
> :avocado: tags=machine:gic-version=2
> """
>
> - chksum = '1e18d9c0cf734940c4b5d5ec592facaed2af0ad0329383d5639c997fdf16fe49'
> + chksum = 'b367755c664a2d7a26955bbfff985855adfa2ca15e908baf15b4b176d68d3967'
>
> def add_common_args(self):
> self.vm.add_args('-bios',
> ---
>
> 2. Download the image before the test:
>
> $ ./tests/venv/bin/avocado vmimage get --distro=fedora --arch aarch64 --distro-version=32
> The image was downloaded:
> Provider Version Architecture File
> fedora   32      aarch64      /tmp/data/cache/by_location/7049001631a4b2eabf5766cc110e66d486e09821/Fedora-Cloud-Base-32-1.6.aarch64.qcow2
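
When swapping images like this, the chksum value in the test has to
match the SHA-256 of the downloaded file, since the test passes
algorithm='sha256'. A minimal sketch for checking a cached image by
hand; the function names are hypothetical, and the path argument is
whatever "avocado vmimage get" printed:

```python
import hashlib
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected):
    """Compare a file's SHA-256 with an expected hex string."""
    return sha256_of(path) == expected.strip().lower()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): check_image.py IMAGE EXPECTED_SHA256
    ok = checksum_matches(sys.argv[1], sys.argv[2])
    print("match" if ok else "MISMATCH")
```

Reading in chunks keeps memory use flat for multi-hundred-megabyte
cloud images.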
>
> 3. Run the tests:
>
> $ ./tests/venv/bin/avocado run -t arch:aarch64,accel:tcg --keep-tmp on -- tests/acceptance/boot_linux.py{,,,,}
> JOB ID : 09e740a41dc400f9fcbb9253f613734597fe0efc
> JOB LOG : /home/cleber/avocado/job-results/job-2020-08-06T13.53-09e740a/job.log
> (1/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (162.06 s)
> (2/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (167.78 s)
> (3/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (166.98 s)
> (4/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (171.13 s)
> (5/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (167.43 s)
> RESULTS : PASS 5 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
> JOB TIME : 836.05 s
>
> Ubuntu 20.04:
> =============
>
> 1. Tweak version and image hash:
>
> ---
> diff --git a/tests/acceptance/boot_linux.py b/tests/acceptance/boot_linux.py
> index 0055dc7cee..03c0e1bee9 100644
> --- a/tests/acceptance/boot_linux.py
> +++ b/tests/acceptance/boot_linux.py
> @@ -48,7 +48,7 @@ class BootLinuxBase(Test):
> image_arch = 'ppc64le'
> try:
> boot = vmimage.get(
> - 'fedora', arch=image_arch, version='31',
> + 'ubuntu', arch=image_arch, version='20.04',
> checksum=self.chksum,
> algorithm='sha256',
> cache_dir=self.cache_dirs[0],
> @@ -160,7 +160,7 @@ class BootLinuxAarch64(BootLinux):
> :avocado: tags=machine:gic-version=2
> """
>
> - chksum = '1e18d9c0cf734940c4b5d5ec592facaed2af0ad0329383d5639c997fdf16fe49'
> + chksum = '1d9e50f3381145835b11911adf611f455d674a570814086b7d6581ecc0718770'
>
> def add_common_args(self):
> self.vm.add_args('-bios',
> ---
>
> 2. Download the image before the test:
>
> $ ./tests/venv/bin/avocado vmimage get --distro=ubuntu --arch aarch64 --distro-version=20.04
> The image was downloaded:
> Provider Version Architecture File
> ubuntu   20.04   arm64        /tmp/data/cache/by_location/19db8c6d910a3f2660c4109ffb85d73d43e5cdf2/ubuntu-20.04-server-cloudimg-arm64.img
>
> 3. Run the tests:
>
> $ ./tests/venv/bin/avocado run -t arch:aarch64,accel:tcg --keep-tmp on -- tests/acceptance/boot_linux.py{,,,,}
> JOB ID : 92a1bdbb5e933e6dff8b882808a191f1de3c2600
> JOB LOG : /home/cleber/avocado/job-results/job-2020-08-06T12.13-92a1bdb/job.log
> (1/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (341.40 s)
> (2/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (345.82 s)
> (3/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (335.91 s)
> (4/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (320.32 s)
> (5/5) tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg: PASS (319.79 s)
> RESULTS : PASS 5 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
> JOB TIME : 1663.92 s
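
To put the three runs side by side, here is a small sketch that
summarizes the per-test times quoted above. The numbers are copied
from the logs; the summarize helper and its output format are my own
naming, nothing that avocado provides:

```python
from statistics import mean, stdev

# Per-test wall-clock times in seconds, copied from the job logs above.
TIMINGS = {
    "Fedora 31": [153.12, 149.57, 154.45, 148.97, 150.70],
    "Fedora 32": [162.06, 167.78, 166.98, 171.13, 167.43],
    "Ubuntu 20.04": [341.40, 345.82, 335.91, 320.32, 319.79],
}


def summarize(timings, baseline="Fedora 31"):
    """Map each distro to (mean, sample stdev, ratio to baseline mean)."""
    base = mean(timings[baseline])
    return {
        name: (mean(runs), stdev(runs), mean(runs) / base)
        for name, runs in timings.items()
    }


if __name__ == "__main__":
    for name, (avg, dev, ratio) in summarize(TIMINGS).items():
        print(f"{name:<13} mean={avg:7.2f}s stdev={dev:5.2f}s x{ratio:.2f}")
```

By this measure Fedora 32 is about 10% slower than the Fedora 31
baseline, and Ubuntu 20.04 takes roughly 2.2 times as long, so neither
looks like an easy fix.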
>