discuss-gnuradio
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Discuss-gnuradio] Docker


From: Nicholas McCarthy
Subject: Re: [Discuss-gnuradio] Docker
Date: Fri, 15 Apr 2016 06:27:01 +0000


Copying the source tree into the Docker image -- including the Dockerfile itself, and all the repo's Git history & remotes! -- is a great way of preserving continuity in image distribution. The source, the build instructions, the source history, and the source's source (the remote) are all preserved inside the container. Can't get more self-referential than that. But there's a size penalty for doing so. I don't believe that 'git clone' should generally be part of a Dockerfile recipe. It separates the Dockerfile from the repository itself, and facilitates moving the dependency record further from the source. You clone the source, then you use the source to build a Docker image. Just like 'make', but for a container.

This is a good point regarding the unintended stuff you suck into an image via git clone... but it's not too different from leaving the source in the image with or without repo history... and when the source isn't required, we want to get rid of it (and any repo information).

In the other case, you're actually developing the code... it helps to have the repo present... though, no, this is not a canonical use of Docker stuffs.  It's more of a workaround to the problem that the intended user chooses to run Windows or OSX software or something. 

I'm also partial to keeping the Dockerfile tied to the source in some sense, but I don't think it's a very important motivator... I mean, whether you wget the code in there or COPY it shouldn't matter very much... lots of Dockerfiles get made this way (I'm assuming it happens when the Dockerfile author is not the author of the code, mostly).

I think this is really an open-ended question with no best answer, because we're using Docker in a way it's not really intended to be used: as a virtual image to replace or supplant a USB drive, rather than as a container to provide a repeatable environment for, and to chroot, a single service.

But we're also talking about the intended use... say running a modem as a single service built out of gnuradio in a container... Maybe we should focus on what the right answer is for this case and let the Windows crowd deal with the consequences?

On the other hand, it may just be an open-ended question, regardless.



I guess the point is that it's necessary somewhere. You're just willing to do the work to provide recipes, instead of relying on the OOT author to do so instead.
 

No, I'm actually relying on the OOT author to do the work of providing recipes...  In fact, I'm assuming she's already done the work. I just don't want to have to rely on her to translate the same work into a Dockerfile in order for my Dockerfile to work.





And then if it makes sense to use pybombs for some OOTs, it makes sense to encourage everyone to do it... pybombs is happy if you use it as your one-stop shop for gnuradio installing, but trying to mix and match hand-installed stuff with a pybombs installation is just insane... and if your Docker image is trying to perpetuate that insanity to everyone using it, it's a great sin.

I don't agree with this. Especially if, as would make sense in a Docker container, Gnuradio and any OOTs are installed globally. An OOT shouldn't care at all how Gnuradio was installed on the base layer. If users want to use pybombs, great, but I don't see any reason it should preclude someone from using a simple Dockerfile.

So I don't think pybombs is smart enough to tell that you've installed something by hand (from source)... even if everything is installed globally...  I don't know the ins and outs of pybombs2 as well as I did the ins and outs of pybombs1, but I'm assuming this is still a problem.  If the pybombs-hating OOT author took the time to mark his packages "installed" in pybombs (or if his package isn't part of pybombs to begin with), then yeah... that works great.  Otherwise, if someone manually installed boost (for an outdated example), and then you try to pybombs install boost in the same image, you're (most likely) screwed.  It's not that I'm against freedom (Merica!), but I don't think you can have your pybombs and ignore it, too.


Definitely. We can do testing to see the merit of manually compressing things, but I suspect that the simple zlib (or whatever) that Docker Registry is currently using (that bug report is old, v2 registries compress on push and pull) is likely to provide most of the benefit already.


I'm running something in the 1.10's now, but my old version was old... 1.6 or earlier.  Can't remember.

Cheers,
NIck M.



 
--n
 

Cheers,
Nick M.  


On Thu, Apr 14, 2016 at 9:54 PM Tim O'Shea <address@hidden> wrote:


Such a friendly fellow



On Thu, Apr 14, 2016, 7:43 PM Nick Foster <address@hidden> wrote:
On Thu, Apr 14, 2016 at 5:40 PM Nicholas McCarthy <address@hidden> wrote:
Yeah, this is one of the things I like most about Dockerfiles... the self documenting aspect.  However... have you noticed how many times you end up dealing with Docker images apart from their Dockerfiles?  For a lot of useful images, it's difficult (maybe impossible?) to track the Dockerfile down... I've heard it's pretty easy to reverse engineer from the intermediate layers, but still.  With pybombs, the catalog is part of the image at the command line.  Also, for any OOT, there might just be one guy who's an expert at building the project (the guy who wrote the recipe, presumably).  Anyone can use the resulting recipe to install, so if you want to support a Docker image with dependencies in the pybombs recipe world, you don't have to learn how to install them... just use the work already in pybombs.

Distribute a Dockerfile in the root folder of each OOT. Then you don't have to have any expert build, you just do "docker build ." and pow, it's recreated anytime, anywhere. Whether you use pybombs or you just put the dependencies and build process in the Dockerfile shouldn't matter, the entire recipe should be in there for anyone to recreate.
 

Yeah, I should run a real test pitting compression versus non-compression.... I may have gotten confused with pre-existing image layers and the source-stripping feature (which definitely helps a lot).  If I can avoid the whole tar and untar, that'd make everything much easier.  The manual decompression takes a lot longer than the Docker decompression, though, so maybe bzip2 is just doing a better job?  This will have to wait until I have a really good connection, tho.

Okay, that's true... it does very much depend on what you're rebuilding.... So it makes sense to isolate code that changes a lot from stable code in layers.  To me, this means installing all the dependencies for uhd and gnuradio in a layer, then uhd, then gnuradio... Does it make sense to layer any more? (I mean, obviously, it could matter theoretically, but in practice?  Am I missing a volatile dependency?)

It means that if you're doing daily Gnuradio builds you'll have to pull the whole layer down each day. If you only want daily builds of your OOT module, then no, there's no reason you'd want to layer any more finely grained than that. The incremental cost of updating an OOT should be just a few MB.

--n
 

Cheers,
Nick M.

On Thu, Apr 14, 2016 at 7:48 PM Nick Foster <address@hidden> wrote:
On Thu, Apr 14, 2016 at 3:54 PM Nicholas McCarthy <address@hidden> wrote:
Three points to make about this OOT Docker example.
1) Everything you're doing manually here can be accomplished using pybombs2... and pybombs2 makes itself quite amenable to existing within the Docker image both as an interface to what's already installed on the image and a handy way to install new OOTs and dependencies on top of the image.  I would favor using pybombs2 as a basis for gnuradio Docker images... at least that's what I'm doing for myself.

The only argument I have against this is that not doing so (i.e., cataloging the dependencies manually in the Dockerfile) sort of obviates the need for pybombs, in this particular case. The dependencies and build procedures are either encoded in the individual Dockerfiles, which are simply part of the OOT project itself, or they're centralized into pybombs recipes. Their usefulness overlaps somewhat.

Where I see a clear win for pybombs in a Docker recipe is when you want to compile and install multiple OOTs in a single Docker image -- it's painful to "stack" targets to create a catchall image (i.e., uhd -> gnuradio -> gr-air-modes -> gr-ais...), and annoying to cut and paste Dockerfile recipes to concatenate them. It's not an issue from a production standpoint since half the reason to use Docker in production is to isolate services from each other, one image per service (if all your services depend on the same base layer, there's little/no image size overhead). But if you're putting together a single catchall image for folks to muck around with, a pybombs base layer seems like a great way to do that and still keep a sane-ish Dockerfile.
 
2) Supporting a pybombs2 Docker image is a great way to ensure that pybombs2 builds from a very, very minimal initial Linux installation.  Since pybombs2 is newish and experiencing lots of activity, the project might benefit from anchoring itself to nightly Docker builds. 
3) How to handle OOTs could be a bit more complicated if you try, as I have, to reduce image size using the pybombs2 "deploy" mechanism.  Deploy lets you do two nice things... it optionally strips all the src out of your build automatically (a significant weight reduction step, as you note, but installed .h files remain... sort of like a dev package).  It also compresses the rest of your build.  As long as you install and then compress in the same Docker RUN call, your UFS layer is small, too.  The result saves GBs over the wire (I think?).  When you're using Docker to deploy code, image sizes over the wire can be a big deal, and I'd love for it to be convenient enough to start each day off by docker pulling the latest build.  The "pybombs" image I posted is a vanilla pybombs2 install with uhd forcebuilt and the source stripped (so it's pretty much a complete gnuradio build, gui and all)... it's ~1.5G to download (Again, I think?  Sometimes you have hidden image layers that make things appear smaller than they actually are.)  That's big in Docker world from what I've seen, but it's not much of an inconvenience.  ~2.5G for the version retaining the source code is a little more glaringly bad.  Fortunately, unless you're working on code in the uhd or gnuradio repos, the 1.5G version is totally fine.

For me, these are the "natural" GR versions to support via Docker... I'd be interested in a version more like yours that strips the gui and other modules, too.  Maybe it's worth cutting uhd in an image, but 200MB doesn't sound worthwhile at that cost.

I'm interested in experiences/opinions regarding the use of pybombs deploy to compress between intermediate UFS layers.  Am I somehow overestimating the savings in image size?  Does it seem too complicated, and/or am I overlooking an easier way?  Maybe no one cares about image size?  (Actually, I know some people who definitely do care, or at least they claim to care.  I'm pretty sure I will, too.)

I'd be really interested to see what the OTW savings from deploy are! That seems pretty useful, generally speaking. But remember that Docker Registry already implements image compression over the wire for each of the UFS layers it's sending. So I'd be a little concerned that the additional compression step isn't helping actual transfer times, although optionally removing the src will certainly help.
 

I don't think building the distribution in explicit layers is very helpful.  It's something that seems useful when you hear a description of the UFS, but it seems to wind up being a waste of time.  One or several GR images makes sense as a basis for images building OOTs, but what the hell are you going to do with an image that's run pybombs install boost except run pybombs install gnuradio?

Only the incremental layers which have changed need to be sent over the wire. This can be pretty painful if you're building Gnuradio monolithically (i.e., in a single RUN command), and means building a distribution in explicit layers has possibly the largest effect on over-the-wire requirements for users who need daily Docker builds.

--n
 

Cheers,
Nick M.


On Thu, Apr 14, 2016 at 2:16 PM Nick Foster <address@hidden> wrote:
I think it would be really helpful for the GNU Radio project to support a standard, basic gnuradio docker install with uhd and grc enabled as well as an example or two to demonstrate sane ways to run OOT modules on top of that image.  As Ben mentioned, Docker seems like a pretty energy-efficient way to approach support for systems like Windows and OSX going forward.  Not having used boot2docker personally, I won't say that it's necessarily time to retire the live usb image, but I think Docker may evolve quickly into a pretty obvious replacement, if it hasn't already.  I also appreciate GNU Radio looking for ways to support users and potential users attempting to build and deploy applications that reach beyond the immediate environment of GNU Radio and its core devs.  

As far as OOT modules, that's easy. For instance, a Dockerfile for gr-air-modes could look like this (this is untested, don't get any ideas):

FROM bistromath/gnuradio:3.7.8
MAINTAINER address@hidden version: 0.1
RUN apt-get install -y python-zmq
WORKDIR /opt/gr-air-modes
COPY . /opt/gr-air-modes
RUN    mkdir build \
    && cd build \
    && cmake ../ \
    && make -j4 \
    && make install

...that's more or less the whole thing, although this particular example is broken for a couple of reasons (no Qt in my base layer, other missing prerequisites). It might be nice to include a Dockerfile template in the OOT example. The nice part about doing OOT modules in this manner is that Gnuradio users could potentially never have to compile Gnuradio -- just write their OOT and base its Dockerfile upon a precompiled Gnuradio base layer. Another benefit is bitrot is all but eliminated, as you're basing your module on top of a versioned base layer rather than master.
 

One problem we have to face, though, is image size.  I'm trying to tackle that problem by compressing the install for transactions over the wire and then uncompressing locally for applications (using pybombs2, of course).  This is all a little awkward for docker distribution, but lots of things in docker are a little awkward.  Developers could build on top by untarring the prefix, pybombs installing extra recipes (possibly custom recipes) and then using the deploy command again, all within the same Docker "RUN" section.  Locally, if you docker build applications beginning with the same commands to untar the image, then all applications can take advantage of that layer (you'll have to untar the base image only one time regardless of how many applications use the base image).  Alternatively, you can docker run with cmd or entry set to untar the image (and then, presumably, you'll want to commit the running container locally so you don't have to untar again).  
 
Does anyone have a better idea for bringing image size down without making it impossible to build and deploy OOTs?  Those Bistromath images are pretty tiny... I haven't really looked into the Alpine base image, either. 

The Docker image I put up on Docker Hub is small-ish because it only includes these components (and their prerequisites):

--   * python-support
--   * testing-support
--   * volk
--   * gnuradio-runtime
--   * gr-blocks
--   * gnuradio-companion
--   * gr-fft
--   * gr-filter
--   * gr-analog
--   * gr-digital
--   * gr-channels
--   * gr-uhd
--   * gr-utils
--   * gr-wxgui

It could be a lot smaller if I removed the GR build files (292MB), GR source files (88MB), and UHD source/build (200MB). That would cut it down to somewhat more than half its current size. I like having them there because if I'm working inside the environment I can compile changes incrementally. For a pure deployment system, though, they're unnecessary.

It's possible, albeit messy, to build a Gnuradio distribution in layers and tag the individual layers separately. Because each command in a Dockerfile produces an incremental UFS layer, if you can break the compilation of Gnuradio into separate commands for each component in the Dockerfile, then you can tag the various incremental layers to build different composite Gnuradio distributions. It's probably simpler just to provide a "bells-'n-whistles" version and a "bare bones" version.

If you like, I can see just how small I can reasonably get things. I'd argue, though, that a one-time, couple-of-GB download is a reasonable compromise for the convenience of versioned distribution in all but niche applications (embedded or offline come to mind). In other words, the benefit of getting the wire size down probably doesn't outweigh the effort for most people.

--n
 



_______________________________________________
Discuss-gnuradio mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio


_______________________________________________
Discuss-gnuradio mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
_______________________________________________
Discuss-gnuradio mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio

reply via email to

[Prev in Thread] Current Thread [Next in Thread]