Improving Memory Behaviour to Make Self-Hosted PyPy Translations Practical
Briefly

"In our previous blog post, we talked about how fast PyPy can translate itself compared to CPython. However, the price to pay for the 2x speedup was an huge amount of memory: actually, it was so huge that a standard compilation could not be completed on 32-bit because it required more than the 4 GB of RAM that are addressable on that platform. On 64-bit, it consumed 8.3 GB of RAM instead of the 2.3 GB needed by CPython."
"In the past two weeks Anto and Armin attacked the issue in the branch, which has been recently merged to trunk. The branch solves several issues. The main idea of the branch is that if a loop has not been executed for a certain amount of time (controlled by the new loop_longevity JIT parameter) we consider it "old" and no longer needed, thus we deallocate it."
PyPy translation memory consumption soared because the JIT kept generated assembler and large execution data structures alive indefinitely, causing a 64-bit translation to use 8.3 GB versus CPython's 2.3 GB and making a standard 32-bit compilation impossible by exceeding the 4 GB addressable limit. A recently merged branch introduces loop deallocation governed by the new loop_longevity JIT parameter, which frees loops that have not been executed for a configurable amount of time. An oversight in freeing generators was also fixed. Measurements show the number of loops kept alive dropped from about 37,000 with infinite longevity to under 10,000 on trunk. Translation timings are reported in CPU Time Stamp Counter ticks.
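
For readers who want to experiment, PyPy's JIT parameters can typically be tuned at runtime through the pypyjit module (and are also typically accepted on PyPy's command line via the --jit option). The snippet below is a usage sketch, not taken from the post, and the value 1000 is arbitrary.

    # Usage sketch (the concrete value 1000 is arbitrary): lower values free
    # idle loops sooner, saving memory; higher values keep more code alive.
    try:
        import pypyjit  # only available when running on PyPy
        pypyjit.set_param("loop_longevity=1000")
    except ImportError:
        pass  # on CPython there is no JIT, so there is nothing to tune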
Read at Antocuni