JDiff: Differentiation of Comma Separated Values...
JDiff is a small Java program that takes two sets of comma separated values and returns several results, including:
- All data items that are in both data sets.
- All data items that are mutually exclusive (mutex) - where the values are
either in one data set or the other, but not both.
- All data items that are mutually inclusive (mutin) - where the values are
in both data sets.
- All data items that are only in the first data set.
- All data items that are only in the second data set.
- Duplicate detection (but not removal).
When returned, the data is sorted for easy comparison.
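The results listed above map onto ordinary set operations. What follows is a minimal sketch only, not JDiff's actual code: the class and method names are hypothetical, and it assumes each data item is compared as a whole "name,value" string (so def,2.0 and def,3.0 count as different items), which this page does not confirm. Note also that a TreeSet collapses duplicates, whereas JDiff keeps them in its results.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration; none of these names come from JDiff itself.
class SetComparisonSketch {

    // Items present in both data sets ("mutin"), in sorted order.
    static SortedSet<String> inBoth(List<String> first, List<String> second) {
        SortedSet<String> result = new TreeSet<>(first);
        result.retainAll(second);
        return result;
    }

    // Items present in `first` but not in `second`, in sorted order.
    static SortedSet<String> onlyIn(List<String> first, List<String> second) {
        SortedSet<String> result = new TreeSet<>(first);
        result.removeAll(second);
        return result;
    }

    // Items present in exactly one of the two data sets ("mutex"), in sorted order.
    static SortedSet<String> mutuallyExclusive(List<String> first, List<String> second) {
        SortedSet<String> result = onlyIn(first, second);
        result.addAll(onlyIn(second, first));
        return result;
    }

    // Items that occur more than once within a single data set, in sorted order.
    static SortedSet<String> duplicates(List<String> items) {
        Set<String> seen = new HashSet<>();
        SortedSet<String> result = new TreeSet<>();
        for (String item : items) {
            if (!seen.add(item)) {   // add() returns false if the item was already seen
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("abc,1.0", "def,2.0", "ghi,3.0");
        List<String> second = Arrays.asList("abc,1.0", "ghi,3.0", "jkl,4.0", "def,3.0");

        System.out.println("In both:        " + inBoth(first, second));
        System.out.println("Mutex:          " + mutuallyExclusive(first, second));
        System.out.println("Only in first:  " + onlyIn(first, second));
        System.out.println("Only in second: " + onlyIn(second, first));
        System.out.println("Duplicates:     " + duplicates(first));
    }
}
```

Using a TreeSet gives the sorted output for free; a real implementation that needs to preserve duplicates would have to use sorted lists instead.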
Note: JDiff is not like other diff programs that you may be familiar with,
such as fc under DOS. JDiff was developed as a personal tool
to solve a task that I had, and as such is not very customizable (unless you
change the code).
How-To
Preparing the data
JDiff requires two data sets of comma separated values. You must create a text
file (the extension is of no consequence) containing both data sets, in the
following format:
- Each line must contain two values separated by a comma (,). The first value
may be of any form (number, letter, etc.), so long as it is a string. The
second value, however, must be a number, but it need not have any specific
number of decimal places.
- There can only be one data pair per line.
- There may not be any trailing spaces at the end of a line.
- There may not be any space between the values and the comma.
- Enter a -1 on a line by itself at the end of each data set (both the first
and the second).
Here is an example of an acceptable file:
abc,1.0
def,2.0
ghi,3.0
-1
abc,1.0
ghi,3.0
jkl,4.0
def,3.0
-1
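To make the format rules concrete, here is a rough sketch of how such a file could be read. It is an illustration under stated assumptions, not JDiff's own parser: the class InputFormatSketch, the Pair type, and the readDataSet method are all hypothetical names, and the error handling is a guess at reasonable behaviour.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from JDiff itself.
class InputFormatSketch {

    // One "name,value" line. A stand-in type, not JDiff's DataItem class.
    static final class Pair {
        final String name;
        final double value;

        Pair(String name, double value) {
            this.name = name;
            this.value = value;
        }

        @Override
        public String toString() {
            return name + "," + value;
        }
    }

    // Reads pairs until a line containing only "-1" ends the current data set.
    static List<Pair> readDataSet(BufferedReader in) throws IOException {
        List<Pair> items = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.equals("-1")) {
                return items;
            }
            int comma = line.indexOf(',');
            if (comma < 0) {
                throw new IOException("Expected name,value but got: " + line);
            }
            String name = line.substring(0, comma);
            // Throws NumberFormatException if the second value is not a number.
            double value = Double.parseDouble(line.substring(comma + 1));
            items.add(new Pair(name, value));
        }
        throw new IOException("Data set is missing its terminating -1 line");
    }

    public static void main(String[] args) throws IOException {
        // Same content as the example file above, held in a string for the demo.
        String sample = "abc,1.0\ndef,2.0\nghi,3.0\n-1\n"
                      + "abc,1.0\nghi,3.0\njkl,4.0\ndef,3.0\n-1\n";
        BufferedReader in = new BufferedReader(new StringReader(sample));

        List<Pair> first = readDataSet(in);   // lines up to the first -1
        List<Pair> second = readDataSet(in);  // lines up to the second -1

        System.out.println("First data set:  " + first);
        System.out.println("Second data set: " + second);
    }
}
```

Calling readDataSet twice on the same reader is what makes the two -1 lines necessary: each call stops at one terminator.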
Running the program
JDiff is run through the command line with the command:
java JDiff
JDiff will then ask you for the input file, the desired output file and the
desired options. The output will then be stored in the specified output file
for easy access. (If the output file already exists, a message will be
displayed on the screen and the file will then be overwritten.) You can also
use the JAR file if you only want to keep track of one file:
java -jar JDiff.jar
Similarly, you can use the provided JDiff.bat batch file to run JDiff. The
batch file requires the use of the JAR file. Both files must be in the same
directory. You can run it like so:
JDiff
Download
JDiff can be downloaded here (28.7 KB)
The archives contain the following directories:
- bytecode: Contains the class files needed to run JDiff
  - JDiff.class
  - DataItem.class
  - JDiff.jar
  - JDiff.bat (Assumes the java command is in your path; may require editing.
    Requires the accompanying JAR file.)
- documentation: Contains this page, as well as a javadoc subdirectory
- src: Contains the source java files
History
Version 1.1 (May 19th 2002):
- Added interactive support. You can now specify the options you want.
- Removed use of redirection; you can now specify the source file interactively.
- Removed support for manually entering your data (it was tedious and useless).
- Added support for saving results to a specified file. (If the specified file
exists, it will be overwritten.)
- Added a batch file to the executables. It provides for easier execution
of JDiff with the JAR file.
- Added duplicate detection. If a data set contains duplicates, JDiff will
not remove them from its results, so you can use this option to determine if
any exist.
Version 1.0 (May 18th 2002): First version.