Relational Junction CSV File JDBC Driver Documentation



Add the file csvdriver.jar to your classpath or extract the jar file in the directory of the application.

Driver Classes

Driver class (JDBC API v1.0)
Data Source class (JDBC API v2.0)

URL Syntax

The connection URL is jdbc:relationaljunction:csv:csvdir, where csvdir may be the following:

  • absolute or relative path to the local directory containing text files, e.g.:
  • path to the directory entry within the zip (jar) file, e.g.:
  • path to the resource directory within the CLASSPATH, e.g.:
  • FTP URL to the FTP-server directory (syntax: ftp://user:password@hostname[:port]/[dirpath/]), e.g.:
  • FTP URL to a ZIP archive containing CSV/text files (syntax: ftp://user:password@hostname[:port]/[dirpath/]zipfile), e.g.:
  • SFTP URL to the SFTP-server directory (syntax: sftp://user:password@hostname[:port]/[dirpath/], also required third-party libraries Commons VFS, Commons Logging and JSch for this protocol), e.g.:
  • HTTP URL to the web-server directory, e.g.:
  • HTTP URL to the dynamic server page (JSP, ASP, PHP, CGI, etc) that returns a CSV/text file as output. For more information please see useWebParam driver property.
  • SMB/CIFS URL to a directory located on a SMB/CIFS server (e.g.: MS Windows share or Samba server, syntax: smb://[user:password@]hostname/share/[dirpath/]):
  • an RMI server can be used as an alternative to FTP (SFTP, HTTP, etc) protocols for accessing remote files. For more information see the topic using the driver as a client/server (JDBC type 3) via RMI

Note: a path to some particular CSV/text file can be specified directly in a SQL query, e.g.:

  • absolute file path to a file, e.g.:
           SELECT * FROM "c:/csvfiles/test.csv"
  • relative path to a file:
           SELECT * FROM "subfolder/test.csv"
  • other URL path:

SELECT * FROM "zip://c:/"

Driver Properties

The driver supports a number of parameters that change default behavior of the driver.

These properties are:

charset is used to specify a different than the default encoding for a CSV/text file. Possible values are encodings supported in Java (Default is the JVM default charset)

commentLine is used to specify a string denoting comment line. Multiple values can be separated by “|” character, e.g.: // | #. The default value is not-defined.

dateFormat is used to specify a format for date/time values. A value of this property is a sequence of date/time formats separated by the ‘|’ character, e.g: “dd.MM.yy | dd.MM | dd”. (Default is “yyyy-MM-dd HH:mm:ss.SSS | yyyy-MM-dd HH:mm:ss | yyyy-MM-dd | HH:mm:ss.SSS | HH:mm:ss”). The output date/time format will be the first one in this sequence. The property may be set for each column separately. For more details about date/time format used in the driver please see the chapter “time format syntax” in the java.util.SimpleDateFormat class documentation.

defaultColumnType is used to specify the default data type for columns for a CSV/text file (Default is “Varchar”).

decimalFormatInput is used to specify input formats for numbers in a CSV/text file. For instance, you can use this parameter to specify currency values, e.g.: “###,###.##$”. Also, multiple formats can be separated by “|” character, e.g.: “#.#|###,###.#|###,###.#$”. The property may be set for each column separately. For more information about patterns used in this property please see the documentation for java.text.DecimalFormat class.

decimalFormatOutput is used to specify the output format for numbers in a CSV/text file. For example: “###,###.##$”. The property may be set for each column separately. For more information about patterns used in this property please see the documentation for java.text.DecimalFormat class.

dbInMemory, dbPath are used to set a driver mode. For more details see driver modes.

fileExtension is used to set a default file extension, if it is not specified explicitly in a file name (see a note below). For comma-separated values (CSV) files you should set the property value to “.csv”. For example: jdbc:relationaljunction:csv:c:/csvfiles?fileExtension=.csv The default value is “.txt” for compatibility with old versions.
Note: In SQL queries you can also use full file names within double quotes, e.g.: “myTable.txt”, “myTable.csv”, “c:/csvfiles/myTable.csv”. It is recommended to stick to one of these table naming types, i.e. either “myTable.txt” or myTable (without a file extension).

ignoreCase The property sets the case sensitivity for string values. By default the driver are not case sensitive (ignoreCase = true), i.e. SQL queries SELECT * FROM test WHERE string_column = ‘some_value’ and SELECT * FROM test WHERE string_column = ‘SOME_VALUE’ are identical.

rowDelimiter is used to specify a custom row delimiter for a CSV/text file. (Default is a line separator specified by JVM environment variable “line.separator”, e.g. for Windows – “rn”, for Linux – “r”)

schema is used to specify a path to the schema file. It can be absolute or relative to a CSV directory path, e.g.: “c:/schemas/schema1.xml” or “schemas/schema2.xml”. (Default is “schema.xml”)

separator is used to specify a different column separator in text files. For comma-separated values (CSV) files you should set the property value to “,”. For example: jdbc:relationaljunction:csv:c:/csvfiles?separator=, The default is value is tab (“t”) for compatibility with old versions.

suppressHeaders is used to specify if the first line contains a header row with column names (Default is false; the column header is on the first line).

watchFileModifications tells the driver to start an additional thread that checks regularly (every second once by default) if the file was modified by some external process. If the file was modified, the driver refreshes it in the cache. (Default value is false). This property is recommended, if your files are modified by an external program / process and you want the changes to be immediately available to the driver. Also, use the property “checkPeriod” to set regularity of the file check. By default it is 1000 (1000ms). Example: jdbc:relationaljunction:csv:c:/csvfiles?watchFileModifications=true&checkPeriod=3600000

Advanced Properties

escapeEOLInQuotes is used to escape row delimiters within double quotes. (Default is false).

escapeSeparatorInQuotes is used to escape column separators within double quotes. (Default is true).

ignoreRows is used to specify row numbers (including rows with the header and comments) to be ignored while parsing, e.g.: ignoreRows=1,3, 5-10, 100+

lockFiles is used to set locking for CSV files. If lockFiles = true the driver will use an exclusive file lock during write SQL operations and a shared lock during read SQL operations. Default is false.

logPath is used to set a file path to the log file.

emptyStringAsNull If ’emptyStringAsNull’ is set to ‘true’, empty strings (i.e. “”) are treated as NULL values. The property has no effect for CHAR values. (By default is true).

nullString is a string value that will be treated as the NULL value. (By default it is a regular expression “(?i)null”, i.e. case-insensitive “null”).

nullStringToWrite is a string value that will be written into a CSV/Text file for NULL values. (By default is “NULL”).

paddingChar padding char for fixed-length files. (By default is space).

parameter_xxx is used to pass parameter values to the schema. For more information see the schema chapter.

preSQL is used to specify a path to a SQL script that will be executed after creating a driver connection, e.g. c:/sql_script.txt. It may be useful for creating table indexes and other preparations.

propertiesFile is used to specify driver properties in a separate external properties file.

singletonConnection If “singletonConnection” is set to “true”, the driver creates only one java.sql.Connection instance for each unique JDBC URL and reuses that connection for all threads. It allows the driver to increase up performance and reduce memory usage in multithreading environments like web/application servers Tomcat, GlassFish, WebSphere, etc. For example JDBC URL with this property may look like the following: jdbc:relationaljunction:csv:c:/mydir/csvfiles?property1=value1&property2=value2&singletonConnection=true. Also, it is recommended to execute the SQL command “SHUTDOWN” while closing (undeploying) a web application. Default value is false.

trimBlanks If ‘trimBlanks’ is set to ‘true’, the driver trims leading and trailing spaces for string values when reading a text file. The property has no effect for CHAR values. (By default is true)

useWebParam is used to specify the name of the web parameter that will be used to pass a CSV/text file name to the dynamic server page. For instance, if you specify the following value: ‘tablename’, the driver will access server page using the following HTTP URL: (where ‘sometable’ is the table specified in the SQL query or the schema file)
Note: If you specify driver properties directly in the driver URL and server page has its own parameters as well, you should separate them with ‘??’:

quoteString is used to enable/disable writing of double quotes for string values while inserting and updating records in a CSV/text file. The property has no effect for CHAR values. (Default is true)

To set the driver properties, you can use one of the following ways:

1) using Properties class:

java.util.Properties props = new java.util.Properties();
props.put("separator", "|"); // separator is a bar
props.put("suppressHeaders", "false"); // column headers are on the first line
props.put("fileExtension", ".csv"); // default file extension is .csv
props.put("charset", "ISO-8859-2"); // file encoding is "ISO-8859-2"
props.put("commentLine", "--"); // string denoting comment line is "--"
// date/time format
props.put("dateFormat", "yyyy-MM-dd HH:mm | dd/MM/yyyy");
Connection conn = DriverManager.getConnection("jdbc:relationaljunction:csv:c:/csvfiles", props);

2) using a data source class com.relationaljunction.jdbc.csv.CsvDataSource2:

CsvDataSource2 csvDS = new CsvDataSource2();
csvDS.setPath("c:/csvfiles");        // path to the CSV directory
csvDS.setSeparator("|");             // separator is a bar
csvDS.setSuppressHeaders(false);     // column headers are on the first line
csvDS.setFileExtension(".csv");      // default file extension is .csv
Connection conn = csvDS.getConnection();

3) appending the properties to the URL (multiple properties should be separated by ‘&’ or ‘!’ character):

Connection conn = 

4) using an external properties file:

Connection conn =

5) using local properties specified in the schema for some particular file(s) and column(s).

Database Schema

The database schema file is called “schema.xml”. It is intended to specify data types and local properties for your CSV/text files. By default it should be located in the same directory where CSV/text files are contained. To specify another path use the driver property “schema”.

<table name = "my_table1.csv">
<!-- "my_table1.csv" is a relative path to the file which is located in a CSV directory specified in the driver URL. In other words if the driver URL is set to "jdbc:relationaljunction:csv:c:/csvfiles", this table tag describes "c:/csvfiles/my_table1.csv". -->
<column name = "my_column1" type = "Integer"/>
<!-- this tag assigns the SQL data type to the column by its name in the column header-->
<column pos = "2" type = "Integer"/>
<!-- this tag assigns the SQL data type to the column by its position in a text file-->
<!-- you can also use file masks, if your files have the same format -->
<table name = "*.csv">
<!-- all files with the template "*.csv" -->
<!-- the wildcard '*' denotes any string of zero or more characters -->
<column name = "col1" type = "Varchar"/>
<column name = "col2" type = "Integer"/>
<table name = "file????.*">
<!-- all files with the mask "file????.*" -->
<!-- the wildcard '?' denotes any single character -->
<column name = "id" type = "Integer"/>
<column name = "descr" type = "Varchar" size = "50"/>
<column name = "num" type = "Decimal" size = "15" decimalCount = "2"/>
<!-- the 'size' attribute specifies the maximum number of characters for the VARCHAR type or total number of digits that can be stored for the DECIMAL type -->
<!-- the 'decimalCount' attribute specifies the maximum number of digits that can be stored to the right of the decimal separator -->
<!-- regular expressions can be used as well. Add the "regex:" prefix for that. -->
<table name = "regex:products_201[0-2].csv">
<!-- this pattern describes the following files: "products_2010.csv", "products_2011.csv", "products_2012.csv" -->
<column name = "col1" type = "Varchar"/>
<column name = "col2" type = "Integer"/>
<!-- you can set an absolute path (URL) to a CSV/text file. To make an SQL query to one of such files you should specify an absolute path directly in the query, e.g.: SELECT * FROM "c:/csvfiles/mytable.csv" -->
<table name = "c:/csvfiles/*.csv">
<!-- absolute path to CSV/text files with the file mask '*' " -->
<column name = "id" type = "Integer"/>
<column name = "name" type = "Varchar" size = "50"/>
<!-- absolute path to a CSV/text file within a ZIP archive" -->
<table name = "zip://c:/" constraint = "PRIMARY KEY(product_id, region_id), UNIQUE (min_price), CHECK min_price > 100">
<!-- the 'constraint' attribute specifies a table constraint like a primary key, unique value, etc -->
<column name = "product_id" type = "Integer"/>
<column name = "region_id" type = "Integer"/>
<column name = "min_price" type = "Integer"/>
<column name = "max_price" type = "Integer"/>
<!-- you can set local properties for a specific table by using attributes of the same name. Local properties override global properties specified while creating a driver connection -->
<table name = "my_table2.csv" charset = "ISO-8859-2" suppressHeaders = "true" commentLine = "--" separator = "," dateFormat = "dd/MM/yyyy">
<column name = "column_name1" type = "Integer"/>
<column name = "column_name2" type = "Integer"/>
<column name = "column_name3" type = "Date"/>
<!-- Also, the properties dateFormat, decimalFormatInput, decimalFormatOutput can be set for each column separately. Column properties override table and global properties. -->
<table name = "localFormats*.csv">
<column name="int_col" type="Integer" decimalFormatInput="###,###$"/>
<column name="double_col" type="Double" decimalFormatInput="###,###.##$|###,###.##"/>
<column name="decimal_col" type="Decimal" decimalFormatInput="###,###.##$|###,###.##"  />
<column name="long_col" type="Long" decimalFormatInput="###,###$"/>
<column name="date_col" type="Date" dateFormat="dd/MM/yyyy"/>
<column name="time_col" type="Time" dateFormat="HH:mm"/>
<column name="datetime_col" type="Datetime"/>
<!-- To rename a column name use the 'alias' attribute along with 'name' or/and 'pos' attributes-->
<table name = "some_table.csv">
<column name = "initial_column_name" alias="renamed_column_name" type = "Integer"/>
<column name = "initial_column_name2" pos = "2" alias="renamed_column_name2" type = "Integer"/>
<column pos = "3" alias="renamed_column_name3" type = "Date"/>
<!-- if a text file contains the column header ("suppressHeaders" is "false"), you can use both the 'pos' and the 'name' attributes to override column names existing in the header by column names in a table description --> 
<table name = "file_with_header.csv" suppressHeaders = "false">
<column name = "new_column_name1" pos = "1" type = "Integer"/>
<column name = "new_column_name2" pos = "2" type = "Integer"/>
<column name = "new_column_name3" pos = "3" type = "Date"/>
<!-- The driver also supports the fixed-length files. For this, you should set the "fixed" value for the "separator" parameter. -->
<table name ="fixed_length_file.csv" separator = "fixed">
<column name = "id" begin = "1" end = "5" type = "Integer"/>
<!-- The "begin" attribute sets the initial position of a column in a text file and the "end" attribute sets its final position. -->
<column name = "name" begin = "6" end = "20" type = "Varchar"/>
<column name = "birthdate" begin = "21" end = "28" type = "Date"/>
<!-- you can use named parameters {@parameter_name} in table descriptions. Then a parameter value should be passed via the driver property "parameter_xxx", where "xxx" is a parameter name. For example: jdbc:relationaljunction:csv:c:/csvfiles/?parameter_fileName=my_file&parameter_columnName1=id&parameter_columnName2=title -->
<table name = "{@fileName}.csv">
<column name = "{@columnName1}" type = "Integer"/>
<column name = "{@columnName2}" type = "Varchar" size = "50"/>

Data Types

The driver supports the following data types:

Data TypeJDBC Returned type (java.sql.Types.*)Java class used
AUTOINCREMENT, IDENTITYjava.sql.Types.BIGINTjava.lang.Long
Int, Integer, Tinyint, Smallint, SHORTjava.sql.Types.INTEGERjava.lang.Integer
Long, Bigintjava.sql.Types.BIGINTjava.lang.Long
Float, Realjava.sql.Types.FLOATjava.lang.Float
String, Varchar, VARCHAR2java.sql.Types.VARCHARjava.lang.String
CHAR (the driver preserves all spaces and quotes in a value as is)java.sql.Types.VARCHARjava.lang.String
Datetime, Timestampjava.sql.Types.TIMESTAMPjava.util.Date
Date, YEARjava.sql.Types.DATEjava.sql.Date
BOOLEAN, LOGICAL, BITjava.sql.Types.BOOLEANjava.lang.Boolean


  • DOUBLE and FLOAT values should not be used to represent currency values, because of rounding problems.
  • The DECIMAL (NUMERIC) type is slower and requires more storage than the FLOAT and DOUBLE types.
  • Parsing and formatting takes longer for the DATETIME (TIMESTAMP) type than the numeric types.
  • Text types are slower to read, write, and compare than numeric types and generally require more storage.

Supported SQL Syntax

Relational Junction DBF version 5.x uses H2 database as an SQL engine and supports the most part of ANSI/ISO SQL grammar like SELECT, INSERT, UPDATE, DELETE and CREATE statements.

An SQL query must meet the following conditions:

  • It must corespond to the SQL syntax supported by the driver. For detailed information about supported SQL syntax please see the specification here.
  • А column using an SQL reserved word as a name or containing spaces and other delimiters must be quoted in double quotes in a query. For example, SELECT “Date”, “My integer-column” FROM “test.dbf”
  • To use single quotes (‘) within a string constant you should duplicate them, e.g.: SELECT ‘a”bcd”efgh’

Query examples:

 // ---- SELECT queries ---
SELECT SUM(a) AS col1, MAX(b) / MAX(c) AS col2 FROM "test.csv" GROUP BY a HAVING AVG(a) > 30;
SELECT name FROM "salesreps.csv" WHERE (rep_office IN ( 22, 11, 12 ))  OR (manager IS NULL AND hire_date >= PARSEDATETIME('01-05-2002','dd-MM-yyyy') OR (sales > quota AND NOT sales > 600000.0);
SELECT city, target, sales FROM "offices.csv" WHERE region = 'Eastern' AND sales > target ORDER BY city;
// ---- SELECT queries with join ----
SELECT * FROM "prices.csv" ps JOIN regions regs ON ps.regionid = JOIN "products.csv" prod ON prod.prodid = ps.prodid;
SELECT * FROM "prices.csv" ps, "products.csv" prod WHERE prod.prodid = ps.prodid;
// ---- INSERT, UPDATE and DELETE commands ----
INSERT INTO "salesreps.csv" (name, age, empl_num, sales, title) VALUES ('Henry Smith', 35, 111, NULL, 'Sales Mgr');
UPDATE "customers.csv" SET credit_limit = 50000.00 WHERE company = 'Acme Mfg.';
DELETE FROM "salesreps.csv" WHERE NAME LIKE 'Henry%';
// ---- CREATE TABLE command ----
CREATE TABLE "new_table.csv" (int_col INT, long_col LONG, float_col REAL, double_col DOUBLE, str_col VARCHAR(20), date_col DATETIME, bool_col BOOLEAN, num_col DECIMAL(15,2));
// ---- CREATE INDEX command ----
CREATE INDEX i_1 ON "new_table.csv" (int_col); 

See also:

Connection Example

This example code shows how the driver is used.

import java.sql.*;
public class DriverTest
public static void main(String[] args)
// load the driver into memory
// create a connection. CSV files are assumed to be in the current directory
Connection conn = DriverManager.getConnection("jdbc:relationaljunction:csv:.?separator=,");
// create a Statement object to execute the query with
Statement stmt = conn.createStatement();
// execute a query
ResultSet rs = stmt.executeQuery("SELECT * FROM "test.csv"");
// read the data and put it to the console
for (int j = 1; j <= rs.getMetaData().getColumnCount(); j++) {
System.out.print(rs.getMetaData().getColumnName(j) + "t");
while (
for(int j=1; j <= rs.getMetaData().getColumnCount(); j++){
System.out.print(rs.getObject(j)+ "t");
// close the objects
catch(Exception e)

Driver Modes

The driver works in the following way: it loads data from files to an intermediate database (further referred to as “synchrobase”) that is used for running SQL queries and synchronizing changes between this database and external CSV/text files.

The driver can work in the following three modes:

1) With a temporary synchrobase in RAM. The synchrobase is created in RAM and is removed from it after the connection is closed. It is a default mode.

2) With a temporary synchrobase on the hard drive. The synchrobase is created on the HDD every time a connection is opened and deleted after it’s closed. To use this mode, set the driver property dbInMemory to false.

 Connection conn = DriverManager.getConnection("jdbc:relationaljunction:csv:c:/csvfiles?dbInMemory=false&tempPath=c:/tempfiles");

3) With a persistent synchrobase on the hard drive. The synchrobase is created just once and is re-used afterwards. To use this mode set the property dbPath to the file path where is synchrobase will be stored.

 Connection conn = DriverManager.getConnection("jdbc:relationaljunction:csv:c:/csvfiles?dbPath=c:/synchrobases/syncro_db_name&tempPath=c:/tempfiles ");

In the first mode, all synchrobase data are stored in the system RAM, which ensures maximum performance. Keep in mind that Java Virtual Machine must have enough free memory allotted for storing large tables (use -Xms and -Xmx JVM options). You can also use the DROP TABLE

FROM CACHE command to force tables to be removed from the cache.

In the second case, a temporary synchrobase is created on the hard drive. This mode is used for processing large files, since it uses a minimum of RAM. The synchrobase is deleted after the connection is closed.

In the third mode, the synchrobase is created once and is reused afterwards. This mode requires additional files to store a synchrobase on the hard drive.

Note: if your files are modified by an external program / process and you want the changes to be immediately available to the driver, consider using watchFileModifications property.

There are also some properties for configuring these modes:

tempPath – directory where temporary files will be created (by default it is a OS temporary directory, specified by JVM environment variable “”). It is recommended to set your own value. For example it can be a directory on a fast SSD disk.


Properties props = new java.util.Properties();
props.setProperty("dbInMemory", "false");   // switch to the second mode (a temporary synchrobase on the hard drive)
props.setProperty("tempPath", "c:/temp");     
Connection conn = DriverManager.getConnection("jdbc:relationaljunction:csv:c:/csvfiles", props);

User-defined SQL functions

You can use your own SQL functions in the driver. To use this feature, you should do the following:

1) Create a static method that will act as an SQL function
Mind that:

  • The method must have the public modifier
  • The number of arguments in the method is arbitrary, but they must belong to one of the java classes specified in the data types table
  • It is advisable to process the null values for arguments in the method code
  • The method can return the null value as well

For example:

package my_pack;
public class MyFuncs{
// user-defined SQL function that formats the given argument into a date/time string with specified format
public static String format_date(java.util.Date d, String format) {
// process the null values
if (d == null || format == null)
return null;
java.text.SimpleDateFormat sdf = new java.text.SimpleDateFormat(format);
// return a function result with java.lang.String type
return sdf.format(d);

2) Register the function by executing CREATE ALIAS … FOR command. Use an unique name for each function.
For example:

CREATE ALIAS IF NOT EXISTS format_date FOR "my_pack.MyFuncs.format_date(java.util.Date, java.lang.String)"

Also, you can use the driver property function:.
For example:

Properties props = new java.util.Properties();
Connection conn = DriverManager.getConnection("jdbc:relationaljunction:dbf:c:/dbffiles", props);
// or append this property to the URL
Connection conn2 = DriverManager.getConnection("jdbc:relationaljunction:dbf:c:/dbffiles
+ "?function:formate_date=my_pack.MyFuncs.format_date");

3) Call the function in an SQL query
For example:

Statement st = connection.createStatement();
st.execute( "SELECT format_date(date_column , 'yyyy-MM-dd') FROM "test.dbf"" );

Performance and other hints

  • Wherever possible use the java.sql.PreparedStatement class for SELECT, INSERT, UPDATE and DELETE operations
  • For bulk operations use batching, i.e.:
    // bulk insertion
    PreparedStatement pst = conn.prepareStatement("INSERT INTO test(id, str) VALUES(?,?)");
    for (int i = 0; i < 10000; i++) {
    pst.setInt(1, i);
    pst.setString(2, "string " + i);
  • Describe your tables in “schema.xml” and specify data types and sizes for columns.
  • Create indexes using CREATE INDEX … operation
  • To check what indexes are used in a query use EXPLAIN command:
    Statement st = connection.createStatement();
    ResultSet rs = st.executeQuery("EXPLAIN SELECT * FROM test WHERE id = 5");
    // print a result set to the output to check the execution plan for a query
  • Use the same connection (java.sql.Connection instance) for multiple threads.
  • Set the property singletonConnection to “true” for web/application servers like Tomcat, GlassFish, WebSphere, etc.
  • If your tables are big (~> 100 MB), use modes with a syncrobase on the hard drive. For more information see Driver modes.
  • Use org.hibernate.dialect.H2Dialect in Hibernate.
  • Do not forget to close java.sql.ResultSet, java.sql.Statement and java.sql.Connection instances.
  • Use a modern JVM. Using JDK 1.8 can increase up the driver performance up to 50% as compared with JDK 1.5.

Advanced topics

Using the driver as a client/server (JDBC type 3) via RMI