Overview

Joins two files in the style of GNU "join", with the added convenience of accepting unsorted inputs and auto-detecting delimiters and headers.
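For comparison, here is a minimal sketch of the sorting step that GNU "join" requires and that ezjoin is meant to save you. The file names and sample rows are made up for illustration. Only standard "sort" and "join" options are used.

```shell
# Hypothetical sample inputs, deliberately not sorted by key.
tmp=$(mktemp -d)
printf '3,carol\n1,alice\n2,bob\n' > "$tmp/in1"
printf '2,10:30\n3,11:45\n1,09:15\n' > "$tmp/in2"

# GNU join expects both inputs to be sorted on the join field.
sort -t, -k1,1 "$tmp/in1" > "$tmp/in1.sorted"
sort -t, -k1,1 "$tmp/in2" > "$tmp/in2.sorted"

# Join on the first field of each file, with comma as the delimiter.
joined=$(join -t, -1 1 -2 1 "$tmp/in1.sorted" "$tmp/in2.sorted")
printf '%s\n' "$joined"
# prints:
# 1,alice,09:15
# 2,bob,10:30
# 3,carol,11:45
```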

Download

Source code

Git

Installation

make 
make test
make memtest
sudo make install PREFIX=/usr/local

Examples

# Print key and line numbers where keys "ID" and "COL_ID" match.
# Column delimiters are auto-detected.
ezjoin -k ID,COL_ID -o key,1.line,2.line in1 in2 out

# Specify output comma delimiter and output columns with field 
# names "NAME" and "TIME".
ezjoin -k ID,ID -o 1.NAME,2.TIME -O "," in1 in2 out

# When input files don't have headers, use column numbers for key.
# Here, we use the first column in file #1, and third in file #2.
# Also print "missing" in place of second-file columns when a key from
# the first file is not found in the second.
ezjoin -k 1,3 -o key,2.1 -m missing in1 in2 out

# Specify input file delimiters: first uses space, second uses comma.
ezjoin -k 1,1 -o key,1.line,2.line -d " ," in1 in2 out

# Output with header, reusing field names from each file.
# "key" will use the first file's field name for -k column.
ezjoin -k 1,1 -o key,1.NAME,2.ADDRESS in1 in2 out

# Suppress output header.
ezjoin -k 1,1 -o key,1.NAME,2.2 -n in1 in2 out

# Output custom header, regardless of whether inputs have headers.
# Input headers are suppressed automatically.
ezjoin -k 1,1 -o key,1.2,2.2 -h "ID NAME ADDRESS" in1 in2 out
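The examples above assume that in1 and in2 already exist. The following is a minimal end-to-end sketch that first creates two small comma-delimited files with headers. The sample rows and the temporary directory are assumptions made for illustration, and the ezjoin call uses only the options documented here. The call is skipped if ezjoin is not installed.

```shell
# Hypothetical sample data with header rows, not sorted by ID.
work=$(mktemp -d)
printf 'ID,NAME\n3,carol\n1,alice\n2,bob\n' > "$work/in1"
printf 'ID,TIME\n2,10:30\n3,11:45\n1,09:15\n' > "$work/in2"

# Run the join only if ezjoin is available in PATH.
if command -v ezjoin >/dev/null 2>&1; then
    # Key on the "ID" field in both files; output key, NAME and TIME,
    # comma-separated.
    ezjoin -k ID,ID -o key,1.NAME,2.TIME -O "," "$work/in1" "$work/in2" "$work/out"
    cat "$work/out"
else
    echo "ezjoin not found in PATH; skipping" >&2
fi
```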

Usage

ezjoin -- Join two files with column keys.

USAGE: ezjoin [OPTIONS] in1 in2 out

OPTIONS:

-D, --debug                 Print detailed messages for debugging.

-d, --delimiters ARG        Force column delimiters for both files, such as -d
                            ',;' will specify comma for the first file, and
                            semi-colon for the second file. By default,
                            delimiters are auto-detected in the first lines in
                            this order '|;  , '.

-H, --hasheader             If input has a header. This is autodetected if any
                            output column for the first file uses a field name
                            (instead of a digit). If only column digits are
                            given, and you want a custom header with -h, then
                            this can suppress the original header.

-h, --header ARG            Custom header to output as first line.

-k, --keys ARG1[,ARGn]      Column keys for both files, such as -k 1,NAME.
                            Defaults to first columns for both files. Column
                            numbers start with 1.

-m, --missing ARG1[,ARGn]   What to print for missing key-values in second
                            file. Default is to suppress missing lookups.

-n, --noheader              Suppress header output.

-O, --outdelim ARG          Output delimiter. Default is space.

-o, --output ARG1[,ARGn]    Columns to output, such as
                            -o 1.1,2.NAME,key,2.line.
                            A file is denoted by 1 or 2, followed by a period
                            and a column name or number. Column numbers start
                            with 1. 'key' will print the key value. X.line
                            will print the 1-based line number for file number
                            X.

-v, --version               Display version info.

--help ARG                  Display usage instructions.
                            There is a choice of three different layouts for
                            description alignment. Your choice can be any one
                            of the following to suit your style:

                            0 - align (default)
                            1 - interleave
                            2 - stagger

ezjoin 0.1.1 Copyright (C) 2011,2012 Remik Ziemlinski
This program is free and without warranty.

Distribution

make html
make clean
make dist VER=0.1.1

Publishing

ssh -t rsz,ezjoin@shell.sourceforge.net create
scp html/* rsz,ezjoin@shell.sourceforge.net:/home/project-web/ezjoin/htdocs
scp ../ezjoin-0.1.1.tar.gz rsz,ezjoin@shell.sourceforge.net:/home/frs/project/e/ez/ezjoin

Changelog

v0.1.1 20121130

v0.1.0 20120716

License

Copyright 2011,2012 Remik Ziemlinski (see LICENSE.MIT)