Re: Building with many cores without OOM
Hi!
On Thu, 2024-11-28 at 10:54:37 +0100, Helmut Grohne wrote:
> I am one of those who builds a lot of different packages with different
> requirements and found that picking a good parallel=... value in
> DEB_BUILD_OPTIONS is hard. Go too low and your build takes very long. Go
> too high and you swap until the OOM killer terminates your build. (Usage
> of choom recommended in any case.)
> I think this demonstrates that we probably have something between 10 and
> 50 packages in unstable that would benefit from a generic parallelism
> limit based on available RAM. Do others agree that this is a problem
> worth solving in a more general way?
I think the general idea makes sense, yes.
> For one thing, I propose extending debhelper to provide
> --min-ram-per-parallel-core as that seems to be the most common way to
> do it. I've proposed
> http://salsa.debian.org/debian/debhelper/-/merge_requests/128
> to this end.
To me this looks too high in the stack (and too Linux-specific :).
> Unfortunately, the affected packages tend to not just be big, but also
> so special that they cannot use dh_auto_*. As a result, I also looked at
> another layer to support this and found /usr/share/dpkg/buildopts.mk,
> which sets DEB_BUILD_OPTION_PARALLEL by parsing DEB_BUILD_OPTIONS. How
> about extending this file with a mechanism to reduce parallelity? I am
> attaching a possible extension to it to this mail to see what you think.
> Guillem, is that something you consider including in dpkg?
I'm not a huge fan of the make fragment files, as make programming is
rather brittle, and it easily causes lots of processes to spawn if you
look at it the wrong way (ideally I'd really like to be able to get
rid of them once we can rely on something else!). I think we could
consider adding it there, but as a last resort option, if there's no
other better place.
> Are there other layers that could reasonably be used to implement a more
> general form of parallelism limiting based on system RAM? Ideally, we'd
> consolidate these implementations into fewer places.
I think adding this in dpkg-buildpackage itself would make most sense
to me, where it is already deciding what amount of parallelism to use
when specifying «auto» for example.
Given that this would be an outside-in interface, I think this would
imply declaring these parameters as debian/control fields, for example,
or in some other file to be parsed from the source tree.
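As a rough illustration of what such a RAM-based limit could compute,
here is a minimal sketch in Python. It is not how dpkg-buildpackage
works today: the function names and the per-job memory parameter are
invented for the example, and reading /proc/meminfo is Linux-specific,
so a portable version would need a backend per system.

```python
import os


def parse_mem_available_kib(meminfo_text):
    """Return the MemAvailable value (in KiB) from /proc/meminfo-style
    text, or None if the field is missing."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None


def limit_parallel(requested_jobs, mem_available_mib, min_ram_per_job_mib):
    """Cap the job count so that each job gets at least
    min_ram_per_job_mib of RAM. Never returns less than 1."""
    if min_ram_per_job_mib <= 0:
        return max(1, requested_jobs)
    by_ram = mem_available_mib // min_ram_per_job_mib
    return max(1, min(requested_jobs, by_ram))


def detect_parallel(min_ram_per_job_mib, meminfo_path="/proc/meminfo"):
    """Start from the CPU count (similar in spirit to parallel=auto) and
    reduce it based on available RAM. Falls back to the CPU count when
    the memory information cannot be read (for example on non-Linux)."""
    jobs = os.cpu_count() or 1
    try:
        with open(meminfo_path) as f:
            kib = parse_mem_available_kib(f.read())
    except OSError:
        kib = None
    if kib is None:
        return jobs
    return limit_parallel(jobs, kib // 1024, min_ram_per_job_mib)
```

With 16 requested jobs, 8192 MiB available and a hypothetical
requirement of 2048 MiB per job, limit_parallel() would give 4.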
My main concerns would be:
* Portability.
* Whether this is a local property of the package (so that the
maintainer has the needed information to decide on a value), or
whether this depends on the builder's setup, or perhaps both.
* We might need a way to percolate these parameters to children of
the build/test system (as Paul has mentioned), where sometimes
you cannot specify this directly in the parent. Setting some
standardized environment variables would seem sufficient I think,
but while all this seems kind of optional, this goes a bit into
reliance on dpkg-buildpackage being the only supported build
entry point. :)
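To make the last point a bit more concrete, passing the limit down to
children through the environment could look roughly like the sketch
below. The variable name EXAMPLE_BUILD_MAX_JOBS and the helper names
are purely hypothetical, chosen for the example; nothing like this is
defined by dpkg at the moment.

```python
import os
import subprocess
import sys

# Hypothetical variable name, used only for this illustration.
JOBS_VAR = "EXAMPLE_BUILD_MAX_JOBS"


def child_env(max_jobs, base_env=None):
    """Return a copy of the environment with the job limit exported."""
    env = dict(os.environ if base_env is None else base_env)
    env[JOBS_VAR] = str(max_jobs)
    return env


def jobs_from_env(default=1, env=None):
    """Read the job limit in a child process, falling back to a default
    when the variable is unset, not a number, or less than 1."""
    env = os.environ if env is None else env
    try:
        value = int(env.get(JOBS_VAR, ""))
    except ValueError:
        return default
    return value if value >= 1 else default


def run_child(max_jobs):
    """Spawn a child that prints the limit it sees in its environment,
    and return that output."""
    code = "import os; print(os.environ[%r])" % JOBS_VAR
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=child_env(max_jobs),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

A nested build or test runner would then only need the equivalent of
jobs_from_env(), without the parent having to pass an explicit option.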
> As I am operating build daemons (outside Debian), I note that I have to
> limit their cores below what is actually available to avoid OOM
> kills and even that is insufficient in some cases. In adopting such a
> mechanism, we could generally raise the core count per buildd and
> consider OOM a problem of the package to be fixed by applying a sensible
> parallelism limit.
See above, on whether this is really package or setup dependent.
Thanks,
Guillem