Blame | Last modification | View Log | Download | RSS feed
=head1 NAME
DataTransport - Efficient serialization format for nested data structures
=head1 SYNOPSIS
use DataTransport;
# Create instance with default pipe delimiter
my $dt = DataTransport->new();
# Or specify custom delimiter
my $dt = DataTransport->new(delimiter => '::');
# Encode a data structure
my $data = {
hostname => 'server.example.com',
config => {
network => {
interface => 'eth0',
address => '192.168.1.1'
}
},
packages => ['perl', 'python', 'gcc']
};
my $encoded = $dt->encode($data);
# Result:
# config|network|address=192.168.1.1
# config|network|interface=eth0
# hostname=server.example.com
# packages="perl" "python" "gcc"
# Decode back to hash
my $decoded = $dt->decode($encoded);
# Write to file
$dt->writeFile($data, 'output.dat');
# Read from file
my $data = $dt->readFile('output.dat');
=head1 DESCRIPTION
DataTransport provides a simple, efficient format for serializing nested
Perl data structures. It uses a flat key=value format where nested hash
keys are joined with a delimiter (default: pipe symbol '|').
The incentive for this is to have a human-readable, easy-to-edit format that
does not rely on any external modules. It is particularly useful for configuration
files or simple data exchange where simplicity and readability are desired.
Key features:
- Completely self contained; does not need any additional modules
- Flat text format, easy to read and edit
- Supports nested hashes, arrays, and scalar values
- Configurable delimiter for key paths
- File I/O with validation headers
- Keys can contain periods and special characters except for the delimiter and
equals sign
- Preserves empty strings and numeric zero values
=head1 FILE FORMAT
DataTransport files use the following format:
# DataTransport v1.0
key=value
nested|key=value
array|values="item1" "item2" "item3"
Lines starting with # are comments. Nested hash keys are joined with
the delimiter. Array values are tab-separated and quoted.
=head1 PACKAGE VARIABLES
=head2 $VERSION
Module version number.
=head2 $delimiter
Global default delimiter for key paths. Default is '|' (pipe symbol).
Can be overridden globally or per-instance.
# Override globally
$DataTransport::delimiter = '::';
my $dt = DataTransport->new(); # Uses '::'
# Or per-instance (recommended)
my $dt = DataTransport->new(delimiter => '::');
=head1 METHODS
=head2 new
my $dt = DataTransport->new();
my $dt = DataTransport->new(delimiter => '::');
Constructor. Creates a new DataTransport instance.
B<Parameters:>
=over 4
=item delimiter (optional)
Custom delimiter to use for joining nested keys. If not specified,
uses the package-level $delimiter variable (default: '|').
=back
B<Returns:> DataTransport object
=cut
package DataTransport;
use strict;
use warnings;
our $VERSION = '1.0';
our $delimiter = '|'; # Default delimiter for key paths
sub new {
my ($class, %options) = @_;
my $self = bless {
delimiter => $options{delimiter} || $delimiter
}, $class;
return $self;
}
=head2 encode
my $encoded = $dt->encode($hashref);
my @lines = $dt->encode($hashref);
Encodes a nested hash structure into DataTransport format.
B<Parameters:>
=over 4
=item $hashref
Reference to a hash to encode. Can contain nested hashes, arrays,
and scalar values.
=item $prefix (internal)
Internal parameter used during recursion. Do not pass this parameter.
=back
B<Returns:>
In scalar context, returns a string with newline-separated key=value pairs.
In array context, returns an array of key=value lines.
B<Format Rules:>
=over 4
=item * Nested hash keys are joined with the delimiter
=item * Arrays are encoded as tab-separated quoted values
=item * Scalar values are output as-is
=item * Keys are sorted alphabetically at each level
=back
B<Example:>
my $data = { host => 'server', config => { port => 80 } };
my $encoded = $dt->encode($data);
# config|port=80
# host=server
=cut
sub encode {
my ($self, $hashref, $prefix) = @_;
$prefix //= '';
my @lines;
my $delim = $self->{delimiter};
foreach my $key (sort keys %$hashref) {
my $full_key = $prefix ? "$prefix$delim$key" : $key;
my $value = $hashref->{$key};
if (ref($value) eq 'HASH') {
push @lines, $self->encode($value, $full_key);
} elsif (ref($value) eq 'ARRAY') {
my $encoded_array = join("\t", map { qq{"$_"} } @$value);
push @lines, "$full_key=$encoded_array";
} else {
push @lines, "$full_key=$value";
}
}
return wantarray ? @lines : join("\n", @lines);
}
=head2 decode
my $hashref = $dt->decode($encoded_string);
my $hashref = $dt->decode(\@lines);
Decodes DataTransport format back into a nested hash structure.
B<Parameters:>
=over 4
=item $data
Either a string containing newline-separated key=value pairs, or
an array reference of individual lines.
=back
B<Returns:>
Hash reference containing the decoded data structure.
B<Format Recognition:>
=over 4
=item * Lines with delimiters are split into nested hash keys
=item * Values starting and ending with quotes become arrays
=item * Tab-separated quoted values become array elements
=item * Other values are stored as scalars
=back
B<Example:>
my $encoded = "config|port=80\nhost=server";
my $data = $dt->decode($encoded);
# $data = { host => 'server', config => { port => 80 } }
=cut
sub decode {
my ($self, $data) = @_;
my $hashref = {};
my @lines = ref($data) eq 'ARRAY' ? @$data : split(/\n/, $data);
my $delim = quotemeta($self->{delimiter});
foreach my $line (@lines) {
next unless $line =~ /^(.+?)=(.*)$/;
my ($key_path, $value) = ($1, $2);
my @keys = split(/$delim/, $key_path);
# Check if value is an array (contains quotes at start and end, possibly with tabs)
# Single element: "value" or multiple: "val1"\t"val2"\t"val3"
if ($value =~ /^".*"$/) {
if ($value =~ /\t/) {
# Multiple elements
my @elements = split(/\t/, $value);
@elements = map { s/^"//; s/"$//; s/^\s+//; s/\s+$//; $_ } @elements;
$value = \@elements;
} else {
# Single element array
$value =~ s/^"//;
$value =~ s/"$//;
$value = [$value];
}
}
# Build nested hash structure
my $current = $hashref;
for (my $i = 0; $i < @keys - 1; $i++) {
$current->{$keys[$i]} //= {};
$current = $current->{$keys[$i]};
}
$current->{$keys[-1]} = $value;
}
return $hashref;
}
=head2 writeFile
$dt->writeFile($hashref, $filename);
Encodes a hash structure and writes it to a file with a validation header.
B<Parameters:>
=over 4
=item $hashref
Reference to hash to encode and write.
=item $filename
Path to output file. Will be created or overwritten.
=back
B<Returns:>
1 on success, undef on failure.
B<File Format:>
The file includes a comment header for validation:
# DataTransport v1.0
key1=value1
key2=value2
B<Example:>
my $data = { hostname => 'server', port => 80 };
$dt->writeFile($data, '/tmp/config.dat') or die "Write failed";
=cut
#######################################################
#
# writeFile( $hashref, $filename )
#
# Write encoded data structure to a file with DataTransport header
#
# Parameters:
# $hashref - reference to hash to encode and write
# $filename - path to output file
#
# Returns:
# 1 on success, undef on failure
#
# File format includes a comment header identifying it as
# a DataTransport file for validation during readFile
#
#######################################################
sub writeFile {
my ($self, $hashref, $filename) = @_;
return unless defined $hashref && defined $filename;
# Open file for writing
open(my $fh, '>', $filename) or return;
# Write header comment
print $fh "# DataTransport v$VERSION\n";
# Encode and write data
my $encoded = $self->encode($hashref);
print $fh $encoded;
print $fh "\n" unless $encoded =~ /\n$/;
close($fh);
return 1;
}
=head2 readFile
my $hashref = $dt->readFile($filename);
Reads and decodes a DataTransport file.
B<Parameters:>
=over 4
=item $filename
Path to DataTransport file to read.
=back
B<Returns:>
Hash reference on success, undef on failure.
B<Validation:>
Checks for the DataTransport header comment. If not found,
prints a warning and returns undef.
B<Comment Handling:>
Lines starting with # are treated as comments and ignored during
parsing (except for the header validation).
B<Example:>
my $data = $dt->readFile('/tmp/config.dat');
die "Failed to read file" unless defined $data;
print "Hostname: $data->{hostname}\n";
=cut
#######################################################
#
# readFile( $filename )
#
# Read and decode a DataTransport file
#
# Parameters:
# $filename - path to input file
#
# Returns:
# hashref on success, undef on failure
#
# Validates that the file contains the DataTransport header
# before attempting to decode the contents
#
#######################################################
sub readFile {
my ($self, $filename) = @_;
return unless defined $filename && -f $filename;
# Open file for reading
open(my $fh, '<', $filename) or return;
# Read all lines
my @lines = <$fh>;
close($fh);
return unless @lines;
# Check for DataTransport header
if ($lines[0] !~ /^# DataTransport v/) {
warn "File $filename does not appear to be a DataTransport file\n";
return;
}
# Remove header line
shift @lines;
# Filter out any other comment lines and empty lines
@lines = grep { !/^#/ && /\S/ } @lines;
# Decode the data
return $self->decode(\@lines);
}
1;
=head1 EXAMPLES
=head2 Basic Usage
use DataTransport;
my $dt = DataTransport->new();
my $data = {
name => 'John Doe',
email => 'john@example.com'
};
my $encoded = $dt->encode($data);
my $decoded = $dt->decode($encoded);
=head2 Nested Structures
my $config = {
server => {
hostname => 'web.example.com',
port => 443,
ssl => {
enabled => 1,
cert => '/etc/ssl/cert.pem'
}
}
};
my $encoded = $dt->encode($config);
# server|hostname=web.example.com
# server|port=443
# server|ssl|cert=/etc/ssl/cert.pem
# server|ssl|enabled=1
=head2 Arrays
my $data = {
packages => ['perl', 'python', 'ruby'],
versions => ['5.32', '3.9', '2.7']
};
my $encoded = $dt->encode($data);
# packages="perl" "python" "ruby"
# versions="5.32" "3.9" "2.7"
=head2 Custom Delimiter
# Keys with periods using pipe delimiter
my $dt = DataTransport->new();
my $data = {
'host.example.com' => 'server1',
'net.interface' => 'eth0'
};
my $encoded = $dt->encode($data);
# host.example.com=server1
# net.interface=eth0
# Using :: delimiter for compatibility
my $dt2 = DataTransport->new(delimiter => '::');
my $config = {
database => {
host => 'db.example.com'
}
};
my $encoded2 = $dt2->encode($config);
# database::host=db.example.com
=head2 File Operations
my $dt = DataTransport->new();
# Write data
my $system_info = {
hostname => 'server01',
kernel => '5.10.0',
memory => '16GB'
};
$dt->writeFile($system_info, '/tmp/sysinfo.dat');
# Read data
my $loaded = $dt->readFile('/tmp/sysinfo.dat');
print "Hostname: $loaded->{hostname}\n";
=head1 NOTES
=over 4
=item * Keys are always sorted alphabetically during encoding
=item * The delimiter should not appear in key names (use quotes in keys if needed)
=item * Empty strings and zeros are preserved correctly
=item * Tab characters in values may be lost during array decoding
=item * The module uses simple regex parsing - complex nested quotes may not work
=back
=head1 SEE ALSO
L<YAML::Tiny>, L<JSON>, L<Data::Dumper>
=head1 AUTHOR
R. W. Rodolico
=head1 COPYRIGHT AND LICENSE
Copyright (c) 2025 R. W. Rodolico
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
=cut
1;