Benchmarking of tebako package against original Ruby applications
Introduction
Performance is a critical consideration when packaging applications for distribution. Tebako, as an executable packager, introduces several architectural elements that could potentially impact performance. This article presents a comprehensive benchmarking analysis comparing Tebako-packaged applications against their original Ruby implementations.
Tebako packages introduce four specific features that might affect performance:
-
An additional filesystem access layer that routes calls either to the original implementation or to the DwarFS in-memory packaged filesystem
-
Resource consumption for decompression when accessing files from the DwarFS filesystem
-
Extraction of native extension shared libraries to temporary folders before loading
-
Inclusion of the entire Ruby standard library and all files used during installation, potentially increasing startup time
To quantify these potential impacts, we conducted systematic benchmarking across various application types and usage patterns.
Understanding key components
Tebako packaging
Tebako is an executable packager that combines a Ruby application and its dependencies into a single executable binary. It allows users to run the packaged software as if it were using a mounted filesystem, eliminating the need for separate installation steps.
As described in our earlier technology post, Tebako packages a patched, statically linked Ruby interpreter with an embedded DwarFS filesystem containing the application code and dependencies. This approach provides excellent portability but introduces architectural elements that could potentially impact performance.
The performance implications of this packaging approach vary based on the application’s characteristics, particularly its dependency on filesystem operations and native extensions.
DwarFS filesystem
DwarFS is a highly compressed, read-only filesystem that serves as the foundation for Tebako’s packaging strategy. It enables efficient storage of Ruby applications and their dependencies within a single executable.
DwarFS offers several advantages for application packaging:
-
High compression ratios for efficient storage
-
Fast read access for optimal performance
-
Read-only nature that ensures package integrity
From a performance perspective, DwarFS introduces a trade-off: while decompression requires computational resources, reading from an in-memory filesystem can be faster than accessing files from disk, particularly for mechanical hard drives.
Benchmark targets
To comprehensively evaluate performance across different usage patterns, we selected four distinct benchmark targets that represent a range of complexity and resource usage patterns.
"Hello World" script
The "Hello World" script serves as our baseline test, focusing on core Ruby interpreter performance with minimal external dependencies.
- Purpose
-
This simple script helps isolate the performance of Ruby’s core functionality, particularly the parsing and execution of basic Ruby code.
- Implementation
-
The script outputs a configurable number of "Hello, world" messages and displays the gem path, with execution count controlled via command-line arguments.
- Value
-
By minimizing external factors, this benchmark provides clear insights into the baseline performance differences between native Ruby and Tebako-packaged applications.
if (argv = ARGV).empty?
puts "No arguments given"
exit(1)
end
if argv[0].to_i < 1
puts "Argument must be a positive integer"
exit(1)
end
argv[0].to_i.times do |i|
puts "Hello, world number #{i}!"
puts "Gem path: #{Gem.path}"
end
Coradoc
Coradoc is a document markup processor designed for structured content handling.
- Purpose
-
This benchmark tests file processing performance, particularly how Tebako handles external file access and processing through pure Ruby code.
- Features tested
-
The benchmark exercises Coradoc’s document parsing, transformation, and OSCAL (Open Security Controls Assessment Language) conversion capabilities.
- Real-world relevance
-
Document processing represents a common use case for Ruby applications, making this benchmark relevant for assessing performance in content management and documentation systems.
if (argv = ARGV).empty?
puts "No arguments given"
exit(1)
end
if argv[0].to_i < 1
puts "Argument must be a positive integer"
exit(1)
end
argv[0].to_i.times do
require "coradoc"
sample_file = File.join(__dir__, "fixtures", "sample.adoc")
require "coradoc/legacy_parser"
Coradoc::LegacyParser.parse(sample_file)[:document]
require "coradoc/oscal"
sample_file = File.join(__dir__, "fixtures", "sample-oscal.adoc")
document = Coradoc::Document.from_adoc(sample_file)
Coradoc::Oscal.to_oscal(document)
syntax_tree = Coradoc::Parser.parse(sample_file)
Coradoc::Transformer.transform(syntax_tree)
end
Vectory
Vectory is a vector graphics processing library that includes native extensions for performance-critical operations.
- Purpose
-
This benchmark evaluates how Tebako handles applications with native extensions, particularly focusing on the performance impact of extracting and loading these extensions from temporary locations.
- Features tested
-
The benchmark exercises Vectory’s EMF to SVG conversion capabilities, which rely on native extensions for efficient processing.
- Real-world relevance
-
Many production Ruby applications leverage native extensions for performance-critical operations, making this benchmark essential for understanding Tebako’s impact on such applications.
require "tempfile"
if (argv = ARGV).empty?
puts "No arguments given"
exit(1)
end
if argv[0].to_i < 1
puts "Argument must be a positive integer"
exit(1)
end
argv[0].to_i.times do
require "emf2svg"
svg = Emf2svg.from_file(File.join(__dir__, "fixtures", "img.emf"))
Tempfile.create(["output", ".svg"]) do |tempfile|
tempfile.write(svg)
puts "SVG written to #{tempfile.path}"
end
end
Metanorma
Metanorma is a comprehensive standards document authoring and publishing suite that combines numerous Ruby gems, native extensions, and Java components.
- Purpose
-
This benchmark tests complex application performance, representing the most comprehensive real-world scenario in our test suite.
- Features tested
-
The benchmark exercises various Metanorma commands, including help, version, and site generation for different standards formats (IETF, IEEE, IEC, ISO, IHO).
- Real-world relevance
-
As an enterprise-level document processing system, Metanorma represents the kind of complex application that benefits most from packaging solutions like Tebako, making it an ideal test case for real-world performance assessment.
The Metanorma benchmark included execution of utility commands (metanorma
help
, metanorma version
) and generation of sample sites for various standards
formats using:
$ metanorma site generate samples -c samples/metanorma.yml -o site-<site name> --agree-to-terms
Testing methodology
Our benchmarking approach focused on comparing execution times between native Ruby applications and their Tebako-packaged equivalents across various workloads.
Environment specifications
All tests were conducted on the following hardware and software configuration:
Model Name: Mac mini Model Identifier: Macmini9,1 Chip: Apple M1 Total Number of Cores: 8 (4 performance and 4 efficiency) Memory: 16 GB Ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [arm64-darwin21] Tebako executable packager 0.5.5
Test procedures
For the Hello World, Coradoc, and Vectory benchmarks, we executed multiple runs with varying repetition counts to assess how performance differences scale with workload. This approach helps distinguish between fixed overhead (such as initialization time) and proportional overhead (such as ongoing filesystem access).
For the Metanorma benchmark, we executed various commands that generate different load profiles, providing insights into performance across diverse usage patterns.
Performance analysis
Our benchmarking results revealed several interesting patterns across the different test cases.
Parsing performance
Surprisingly, Tebako-packaged applications parsed Ruby code faster than the native interpreter in many cases. The combination of the filesystem access layer routing and DwarFS decompression proved more efficient than reading from disk, even when using an SSD.
This performance advantage was particularly evident in the Hello World benchmark, where the Tebako package consistently outperformed the native Ruby interpreter for code parsing and execution.

File processing performance
External file processing showed minimal performance differences between Tebako packages and native applications, despite the additional filesystem access layer introduced by Tebako.
The Coradoc benchmark, which focuses on document processing through pure Ruby code, demonstrated that Tebako’s filesystem routing has negligible impact on file processing performance.

Native extension handling
Applications with native extensions showed the most significant performance differences, primarily during initialization. The Vectory benchmark highlighted the additional time required to extract native extensions to temporary locations before loading.
However, once initialized, the ongoing performance impact was minimal, suggesting that the primary overhead is concentrated in the startup phase rather than during normal operation.

Complex application performance
The Metanorma benchmark provided the most comprehensive view of real-world performance implications. As expected, Tebako packages showed longer initialization times due to the combined effects of loading the entire Ruby standard library and extracting multiple native extensions.
However, the relative performance impact decreased as the workload increased, indicating that the initialization overhead becomes less significant for longer-running tasks.

Key findings
Our benchmarking analysis revealed several key insights about Tebako’s performance characteristics:
-
Faster code parsing: Tebako packages parse Ruby code faster than native interpreters. The combination of the filesystem access layer and DwarFS decompression outperforms direct disk access, even with SSDs.
-
Minimal file processing impact: External file processing shows negligible performance differences despite Tebako’s additional filesystem access layer.
-
Initialization overhead: Tebako packages incur a "penalty" during initialization, primarily due to:
-
Loading the Ruby standard library (which may not be fully utilized by the application)
-
Mapping the package to the application’s memory address space
-
Extracting and loading native extensions from temporary locations
-
-
Workload-dependent impact: The performance impact decreases as workload increases, making Tebako particularly suitable for longer-running applications where initialization overhead becomes less significant.
Conclusions and recommendations
Tebako packaging introduces a fixed initialization overhead that varies based on application size and the presence of native extensions. In our tests, this additional time ranged from 0.03 seconds for the "Hello, World" script to approximately 3 seconds for the complex Metanorma application.
Importantly, this overhead:
-
Is primarily concentrated during initialization
-
Does not scale with data processing complexity
-
Becomes less significant for longer-running tasks
Based on these findings, we can provide several recommendations for optimal performance with Tebako-packaged applications:
-
Ideal use cases: Tebako is particularly well-suited for applications where:
-
Distribution simplicity outweighs minor initialization delays
-
Tasks run long enough to amortize the initialization overhead
-
Simplified deployment provides significant operational benefits
-
-
Performance optimization: When packaging performance-critical applications:
-
Consider the initialization overhead in your application’s startup sequence
-
For applications with frequent short runs, evaluate whether the packaging benefits outweigh the cumulative initialization overhead
-
-
User experience: For interactive applications:
-
Consider implementing splash screens or progress indicators during initialization
-
Defer non-essential module loading to minimize initial startup time
-
Overall, our benchmarking demonstrates that Tebako provides an excellent balance between distribution simplicity and performance, with minimal impact on runtime performance for most real-world applications.