Details about Tebako patching processes
Introduction
Creating single-file executable packages for Ruby applications presents significant technical challenges, particularly when it comes to static linking.
The industry’s preference for dynamic linking conflicts with our goal of producing self-contained executables that don’t rely on external dependencies at runtime.
In our post about the technology used by Tebako packager, we introduced the core components of our solution.
This article explores the specific challenges we encountered with dependency management and the creative solutions we implemented to overcome them.
Understanding key technologies
DwarFS filesystem in Tebako
DwarFS is a highly compressed, read-only filesystem that serves as the foundation for Tebako’s packaging strategy.
Amongst other filesystem options, including SquashFS, we selected DwarFS for its unparalleled performance and ability to efficiently store and retrieve Ruby applications and their dependencies within a single filesystem.
DwarFS offers several advantages for our use case:
-
High compression ratios for efficient storage
-
Fast read access for optimal performance
-
Read-only nature that ensures package integrity
These benefits make DwarFS an ideal choice for Tebako, but implementing it requires navigating a complex toolchain with numerous dependencies.
By default, DwarFS is not designed for the command line, it is not designed to be used as a software library.
Furthermore, since Tebako aims to be a statically-linked executable, we needed to create a library version of DwarFS that is also statically linked.
As a result, we needed to create a library version of DwarFS,
libdwarfs
, that provides a static
interface for our packaging process and allows Tebako to read/write DwarFS
filesystems within the library.
Static libraries and industry trends
When compiling software, there are two main approaches to handling library dependencies: static linking and dynamic linking.
- Static linking
-
the library code is directly embedded into the final executable at compile time. This creates a larger executable but ensures it can run independently without relying on system libraries.
- Dynamic linking
-
includes only references to libraries in the executable. The actual library code is loaded at runtime from the system, resulting in smaller executables but requiring the correct library versions to be present on the target system.
Generally, the software industry has increasingly favored dynamic linking for several reasons:
-
Smaller individual binaries
-
Separation of concerns and dependency management
-
Easier security updates
This trend means that many libraries prioritize dynamic linking support, making static linking more challenging to implement and maintain.
However, Tebako needs static linking:
-
It ensures the packaged Ruby application can run on any compatible system without additional dependencies.
-
The packaged Ruby runtime is compiled against a specific set of libraries to ensure consistent behavior.
-
The packaged executable needs to run independently of the host system’s platform and architecture. This means that a platform-specific compilation process is needed.
-
It cannot pass on the "dependency hell" issue to its end users.
This means that Tebako must address the complexities of static linking in a dynamic-linking-focused ecosystem.
Dependency chain analysis
The complexity of Tebako’s dependency management becomes apparent when examining the complete dependency chain:
┌─────────┐ ┌──────────┐ ┌────────┐ ┌─────────────┐ ┌────────────┐
│ Tebako │ --> │ libdwarfs│ --> │ dwarfs │ --> │ FB Libraries│ --> │ Google Libs│
└─────────┘ └──────────┘ └────────┘ └─────────────┘ └────────────┘
(memfs layer) (memfs) (folly/fbthrift) (glog/gflags)
┌─────────┐
│ Tebako │
└────┬────┘
│
v
┌──────────┐ ┌───────────────┐
│ libdwarfs│<----│ Static Libs │
└────┬─────┘ │ - gflags │
│ │ - glog │
v │ - libarchive │
┌────────┐ │ - jemalloc │
│ dwarfs │<------│ - libfolly │
└────────┘ └───────────────┘
The challenge of dependencies is evident in the number of libraries required to build a single Tebako executable.
On Ubuntu 20, the full dependency list comprises 271 packages. Alpine requires even more packages, while macOS needs fewer.
The sheer number of dependencies isn’t inherently problematic, but the requirement for static libraries introduces significant complexity.
Platform-specific challenges
Each platform presents unique challenges for static linking. Below, we detail a selection of issues we encountered and how we resolved them.
Missing static libraries in Homebrew
Problem: The Homebrew formula for gflags
doesn’t include static library builds.
Solution: We implemented a verification step that checks for the presence of
gflags.a
in the system. When missing, our build process detects this and takes
appropriate action.
Implementation:
# Check if static gflags library exists
if [ ! -f "${HOMEBREW_PREFIX}/lib/libgflags.a" ]; then
echo "Static gflags library not found, building from source"
# Build process follows
fi
This straightforward check allows us to handle the missing static library gracefully.
Build option issues on macOS
Problem: The Homebrew formula for glog
uses incorrect options for building
static libraries, likely omitting Position Independent Code (PIC) flags.
Solution: We build our own copy of glog.a
on macOS when the system version
is unsuitable.
Implementation: Our build script detects macOS and compiles glog
with the
correct flags:
if [[ "$OSTYPE" == "darwin"* ]]; then
# Build glog with proper PIC flags
./configure CFLAGS="-fPIC" CXXFLAGS="-fPIC" --enable-static --disable-shared
make
fi
This ensures we have a properly built static library regardless of the system version.
Complex static builds across platforms
Problem: Building static libarchive
is particularly challenging on all
platforms due to its numerous dependencies and configuration requirements.
Solution: We developed a comprehensive build process that works consistently across supported platforms.
Implementation: Our approach is documented in a detailed pull request
(DwarFS PR #55 from Ribose)
that addresses the specific challenges of static libarchive
builds.
The solution involves careful configuration of build flags, dependency management, and platform-specific adjustments to ensure consistent results.
System-level patches on Alpine
Problem: Alpine 3.16 didn’t support static jemalloc
libraries (the situation
may have changed in 3.18).
Solution: We implemented an unconventional approach that patches system includes to enable static jemalloc
linking.
Implementation: Our solution is implemented in a dedicated script (located in
the Tebako tools repository) that
modifies system headers to properly support static jemalloc
.
This approach demonstrates the sometimes creative solutions required when working with static linking in modern environments.
Version compatibility challenges
Problem: Newer versions of libfolly
heavily depend on weak references (a
feature of dynamic linking) to other libraries, effectively blocking static
linking capabilities.
Solution: We currently use an older version of libfolly
that supports static
linking, but this creates ongoing compatibility challenges with newer versions
of other libraries.
Implementation: We maintain a compatibility matrix that tracks which versions
of libfolly
work with our static linking requirements and which versions of
other libraries are compatible with our chosen libfolly
version.
This represents one of our most significant ongoing challenges, as it requires careful version management across multiple interdependent libraries.
Maintenance and testing strategy
To manage these complex dependencies effectively, we have implemented a multilayer CI/CD approach in the Tebako project.
This strategy allows us to:
-
Identify new issues at the specific layer where they originate
-
Isolate and address problems without disrupting the entire build chain
-
Maintain comprehensive test coverage across all supported platforms
-
Quickly adapt to upstream changes in dependencies
Our CI/CD pipeline tests each layer independently before integrating them, ensuring that issues are caught early and resolved efficiently.
Looking ahead
The challenges described in this article represent the current state of static linking in a dynamic-linking-focused ecosystem. As package managers and libraries continue to evolve, we expect some issues to resolve while new ones emerge.
Our approach focuses on:
-
Maintaining compatibility with current library versions where possible
-
Implementing targeted patches where necessary
-
Contributing upstream when appropriate
-
Documenting workarounds for the benefit of the community
This balanced strategy allows us to deliver a reliable packaging solution while managing the inherent complexity of static linking.
Summary
Building Ruby applications with minimal external dependencies presents significant technical challenges that require creative solutions:
-
The industry trend toward dynamic linking conflicts with the goal of creating self-contained executables
-
Each platform presents unique challenges for static library availability and compatibility
-
Complex dependency chains require careful management and version control
-
Unconventional approaches, including system-level patches, are sometimes necessary
Through our multilayer CI/CD approach and targeted solutions for specific dependencies, we’ve created a maintainable system for packaging Ruby applications as single executables. While the challenges are significant, the resulting capability to distribute Ruby applications as standalone executables provides substantial value to the Ruby ecosystem.
The Tebako project continues to evolve as we address these challenges and improve our packaging capabilities across all supported platforms.