Groovy VFS DSL

Introduction

How Groovy VFS differs from Apache VFS

Groovy VFS is built on top of http://commons.apache.org/proper/commons-vfs/index.html[Apache VFS2]. Besides simpler, boilerplate-free code, it introduces a bit of behavioural sanity. A couple of behaviours in Apache VFS catch out the unwary and newbies.

  • When copying a file onto a directory with the same name Apache VFS will simply delete the whole directory and its contents and replace it with the new file. Groovy VFS will not allow this. However, should you require this behaviour all you need to do is pass smash : true to the cp operation.

  • Groovy VFS does not try to replicate every operation that is available in Apache VFS; instead it concentrates on the most common use cases.

  • Unless a logger object is passed to the VFS during construction, Groovy VFS will not log anything.

  • Apache VFS does not create non-existing intermediate subdirectories in the target path when moving files or folders. In Groovy VFS this is default behaviour. To turn this off pass intermediates : false to mv.

  • Groovy VFS extends the URL format to include protocol options in the query string, e.g. vfs.ftp.passiveMode=1&vfs.ftp.userDirIsRoot=0.
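The behaviours listed above can be sketched as follows. This is a minimal sketch that assumes groovy-vfs and the relevant Apache VFS providers are on the classpath; the host names and paths are hypothetical and only serve as illustration.

```groovy
import org.ysb33r.groovy.dsl.vfs.VFS

def vfs = new VFS()

vfs {
  // Copying a file onto a same-named directory is refused by default;
  // smash : true opts in to replacing the directory with the file.
  cp 'ftp://ftp.example/pub/report', 'file:///tmp/backup/report', smash : true

  // Missing intermediate directories are created by default when moving;
  // intermediates : false switches that off.
  mv 'file:///tmp/in/data.csv', 'file:///tmp/archive/2014/data.csv', intermediates : false

  // Protocol options carried in the query string of the source URL.
  cp 'ftp://ftp.example/pub/readme.txt?vfs.ftp.passiveMode=1&vfs.ftp.userDirIsRoot=0',
     'file:///tmp/readme.txt'
}
```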

Default construction

import org.ysb33r.groovy.dsl.vfs.VFS
def vfs = new VFS()

Passing properties

A VFS object can be constructed with options specific to the underlying Apache VFS FileSystemManager.

def vfs = new VFS(
  temporaryFileStore : '/tmp/my_cache'
)

In addition, any default file system options can be passed, in the format vfs.PROTOCOL.OPTION.

def vfs = new VFS(
  temporaryFileStore : '/tmp/my_cache',
  'vfs.ftp.passiveMode' : true,
  'vfs.http.maxTotalConnections' : 4
)

Valid system properties are:

  • cacheStrategy - Sets the cache strategy to use when dealing with file object data

  • filesCache - Sets the file cache implementation

  • logger - Sets the logger to use. Either an Apache Commons Logging or an SLF4J instance is acceptable.

  • replicator - Sets the replicator

  • temporaryFileStore - Sets the temporary file store (File, String or VFS TemporaryFileStore object)

From v0.6 onwards better control over the loading of providers is possible:

  • ignoreDefaultProviders - Don’t load any providers (overrides scanForVfsProviderXml, legacyPluginLoader)

  • scanForVfsProviderXml - Look for META-INF/vfs-providers.xml files

  • legacyPluginLoader - Load using providers.xml from Apache VFS jar.

In v0.6 the defaultProvider property was removed. Anyone relying on it should supply the default provider via the extend DSL.
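As an illustration of the properties listed above, a VFS could be constructed with a logger and explicit provider-loading settings. This is a sketch only: it assumes Apache Commons Logging is on the classpath, the logger name is hypothetical, and passing booleans to the provider-loading properties is an assumption rather than something this document states.

```groovy
import org.apache.commons.logging.LogFactory
import org.ysb33r.groovy.dsl.vfs.VFS

def vfs = new VFS(
  // Without a logger Groovy VFS logs nothing.
  logger : LogFactory.getLog('vfs-example'),
  // A File, a String or a VFS TemporaryFileStore object is accepted.
  temporaryFileStore : new File('/tmp/my_cache'),
  // Assumed to take boolean values.
  scanForVfsProviderXml : false,
  legacyPluginLoader : true
)
```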

Copying Files

Options

There are a number of options that affect how a copy behaves. These options, with their defaults, are:

vfs {
  cp url_src, url_dest,
    overwrite : false,
    recursive : false,
    smash : false
}

| From FileType | To FileType | Overwrite? | Smash? | Recursive? | Action |
|---------------|-------------|------------|--------|------------|--------|
| FILE   | IMAGINARY | N | N | N | Copy, creating file |
| FILE   | FILE      | N | x | x | Don’t copy |
| FILE   | FILE      | Y | x | x | Overwrite file |
| FILE   | FOLDER    | N | N | x | If a same-named file does not exist in the folder, copy it, otherwise don’t copy |
| FILE   | FOLDER    | Y | N | x | Create same-named file in the folder, even if it exists. If a same-named directory exists, fail |
| FILE   | FOLDER    | x | Y | x | Replace same-named folder with the file |
| FOLDER | IMAGINARY | N | N | N | Don’t copy |
| FOLDER | IMAGINARY | N | N | Y | Copy directory and descendants |
| FOLDER | FILE      | N | N | N | Don’t copy |
| FOLDER | FILE      | Y | N | N | Don’t copy |
| FOLDER | FILE      | x | Y | x | Replace file with folder |
| FOLDER | FOLDER    | N | Y | N | Don’t copy |
| FOLDER | FOLDER    | N | N | Y | Copy as a subfolder. Existing files will be skipped |
| FOLDER | FOLDER    | Y | N | Y | Copy as a subfolder, replacing any same-named files along the way |
| FOLDER | FOLDER    | x | Y | x | |
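A few rows of the table above, expressed as cp calls. This is a minimal sketch assuming groovy-vfs and the required providers are on the classpath; the URLs are hypothetical.

```groovy
import org.ysb33r.groovy.dsl.vfs.VFS

def vfs = new VFS()

vfs {
  // FILE -> FILE with overwrite : true replaces the existing target file.
  cp 'http://files.example/notes.txt', 'file:///tmp/notes.txt', overwrite : true

  // FOLDER -> FOLDER with recursive : true copies the source as a subfolder,
  // skipping files that already exist in the destination.
  cp 'ftp://ftp.example/pub/images', 'file:///tmp/mirror', recursive : true

  // Adding overwrite : true replaces same-named files along the way.
  cp 'ftp://ftp.example/pub/images', 'file:///tmp/mirror',
     recursive : true,
     overwrite : true
}
```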

When the source is a folder, a filter can also be used to control which files are copied. Internally, filters are converted to FileSelector objects. If no filter is supplied, the behaviour is the same as if Selectors.SELECT_ALL had been supplied. Filters return two results:

  • Whether the file should be included

  • If the source file is actually a folder, whether its descendants should be traversed

Filters can be one of the following:

  • Regex pattern - This can be a Pattern class or a String. This is matched against the basename of a source file. Traversal will always occur.

  • FileSelector

  • Ant-style pattern
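The filter kinds above might be used as in the following sketch. The filter option name is taken from the VfsCopy example later in this document; passing it directly to cp, as well as the URLs, are assumptions made for illustration. Selectors.SELECT_FILES is one of the stock FileSelector constants in Apache VFS.

```groovy
import org.apache.commons.vfs2.Selectors
import org.ysb33r.groovy.dsl.vfs.VFS

def vfs = new VFS()

vfs {
  // Regex given as a Pattern; it is matched against the basename of each source file.
  cp 'ftp://ftp.example/pub/images', 'file:///tmp/images',
     recursive : true,
     filter : ~/.*\.jpg/

  // The same regex given as a String.
  cp 'ftp://ftp.example/pub/images', 'file:///tmp/images',
     recursive : true,
     filter : '.*\\.jpg'

  // An Apache VFS FileSelector that selects files but not folders.
  cp 'ftp://ftp.example/pub/images', 'file:///tmp/images',
     recursive : true,
     filter : Selectors.SELECT_FILES
}
```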

Moving Files

Options

There are a number of options that affect how a move behaves. These options, with their defaults, are:

vfs {
  mv url_src, url_dest,
    overwrite : false,
    smash : false,
    intermediates : true
}

| From FileType | To FileType | Overwrite? | Smash? | Action |
|---------------|-------------|------------|--------|--------|
| FILE   | IMAGINARY | N | N | Create new file, delete old file |
| FILE   | FILE      | N | N | Don’t move |
| FILE   | FILE      | Y | N | Overwrite existing file with source, delete old file |
| FILE   | FOLDER    | N | N | Move file into folder except if same-name target file exists |
| FILE   | FOLDER    | Y | N | Move file into folder, replacing any existing same-name target file |
| FILE   | FOLDER    | x | Y | Replace same-named folder with the source file |
| FOLDER | IMAGINARY | N | N | Create new folder with content. Delete old folder |
| FOLDER | FILE      | N | N | Don’t move |
| FOLDER | FILE      | Y | N | Don’t move |
| FOLDER | FILE      | x | Y | Replace file with folder |
| FOLDER | FOLDER    | N | N | Move folder as a sub-folder of destination even if the target folder has the same name as the source folder. Fails if a same-name target exists within the target folder |
| FOLDER | FOLDER    | Y | N | Move folder as a sub-folder of destination. Fails if a same-name target exists and is not empty |
| FOLDER | FOLDER    | x | | |

When intermediates is set to true (the default behaviour), non-existing intermediate subdirectories in the target path will be created. If set to false, a FileActionException will be raised if the target intermediate subdirectories do not exist.
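A minimal sketch of handling that failure follows. The paths are hypothetical, and the package of FileActionException is assumed to be the same as that of VFS; adjust the import if it lives elsewhere.

```groovy
import org.ysb33r.groovy.dsl.vfs.FileActionException  // package is an assumption
import org.ysb33r.groovy.dsl.vfs.VFS

def vfs = new VFS()

try {
  vfs {
    // With intermediates : false the move fails if /tmp/archive/2014 does not exist.
    mv 'file:///tmp/in/data.csv', 'file:///tmp/archive/2014/data.csv', intermediates : false
  }
} catch (FileActionException e) {
  println "Move refused: ${e.message}"
}
```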

The overwrite property can also be a closure of the following signature:

import org.apache.commons.vfs2.FileObject

def overwrite = { FileObject from, FileObject to ->
  // The returned value is evaluated according to Groovy truth
  true
}

The closure is passed the source object and the target object. If the closure returns true, the target file will be replaced by the source file.

vfs {
  mv url_src, url_dest,
    overwrite : { f,t ->
      f.name.baseName.startsWith('IMG')
    }
}
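Another possible use of the closure form is to overwrite only when the source is newer. The sketch below assumes both filesystems can report a modification time (FileContent.getLastModifiedTime in Apache VFS throws an exception otherwise); the paths are hypothetical.

```groovy
import org.apache.commons.vfs2.FileObject
import org.ysb33r.groovy.dsl.vfs.VFS

def vfs = new VFS()

// Replace the target only when the source has a later modification time.
def ifNewer = { FileObject from, FileObject to ->
  from.content.lastModifiedTime > to.content.lastModifiedTime
}

vfs {
  mv 'file:///tmp/in/report.pdf', 'file:///tmp/out/report.pdf', overwrite : ifNewer
}
```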

Move operations in groovy-vfs have been available since v0.3.

Protocol Options

TO BE COMPLETED…​

These are the options in VFS2 that are known at this point in time. They can be added to any URL in groovy-vfs in the form vfs.PROTOCOL.OPTION=VALUE. When used in this form, all values must be URL-encoded where appropriate.

These options can also be used in an options block, for example:

options {
  sftp {
    compression 'zlib'
  }
}
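When options are placed on the URL instead, values need URL-encoding. The helper below is a hypothetical convenience written for this illustration, not part of groovy-vfs; it builds the vfs.PROTOCOL.OPTION=VALUE query string with java.net.URLEncoder. The host name and file path are made up.

```groovy
// Hypothetical helper (not part of groovy-vfs): appends protocol options
// to a URL, URL-encoding every value.
String withOptions(String url, String protocol, Map<String, Object> opts) {
    String query = opts.collect { name, value ->
        "vfs.${protocol}.${name}=" + URLEncoder.encode(value.toString(), 'UTF-8')
    }.join('&')
    url + (url.contains('?') ? '&' : '?') + query
}

println withOptions('sftp://host.example/data', 'sftp',
        [compression: 'zlib', knownHosts: '/home/user/.ssh/known_hosts'])
// prints sftp://host.example/data?vfs.sftp.compression=zlib&vfs.sftp.knownHosts=%2Fhome%2Fuser%2F.ssh%2Fknown_hosts
```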

| Protocol | Option | Supported | Detail |
|----------|--------|-----------|--------|
| ftp | controlEncoding    | Y | See FTP.setControlEncoding(java.lang.String) |
| ftp | dataTimeout        | Y | Set data timeout |
| ftp | defaultDateFormat  | Y | Set default date format used by server |
| ftp | entryParser        | Y | FQCN of FileEntryParser used to parse directory listing, if the one from commons-net FTPFileEntryParserFactory is not used |
| ftp | entryParserFactory | N | Set parser factory |
| ftp | passiveMode        | Y | true for passive mode, false for active mode |
| ftp | recentDateFormat   | Y | See FTPClientConfig (TODO: need link) |
| ftp | serverLanguageCode | Y | Language code used by server |
| ftp | serverTimeZoneId   | Y | See FTPClientConfig (TODO: need link) |
| ftp | shortMonthNames    | N | See FTPClientConfig (TODO: need link) - might be supported in a future release |
| ftp | soTimeout          | Y | Socket timeout. Use 0 or null (config DSL only) to remove any socket timeout |
| ftp | userDirIsRoot      | Y | Sets the user directory as the root, instead of the filesystem |

| Protocol | Option | Supported | Detail |
|----------|--------|-----------|--------|
| sftp | compression             | Y | 'zlib' or 'none' |
| sftp | identities              | N | List of identity files |
| sftp | knownHosts              | Y | The known_hosts file |
| sftp | preferredAuthentication | Y | String containing the authentication order |
| sftp | proxyHost               | Y | Proxy host for sftp connection |
| sftp | proxyPort               | Y | Proxy port for sftp connection |
| sftp | strictHostKeyChecking   | Y | Host key checking to use. Accepts 'yes', 'no' & 'ask' |
| sftp | timeout                 | Y | Timeout value for a session |
| sftp | userDirIsRoot           | Y | Sets the user directory as the root, instead of the filesystem |
| sftp | userInfo                | N | UserInfo instance |

Providers

Adding Providers

The Extension DSL

From v0.6 onwards it is possible to add providers easily once they have been written. It is no longer necessary to embed your own META-INF/vfs-providers.xml file in a jar.

To accomplish this the DSL has been extended with the extend keyword.

Scheme Providers

Providers are the most common use-case. In the simplest use case we add a provider with a scheme.

vfs {
    extend {
        provider className : 'org.apache.commons.vfs2.provider.gzip.GzipFileProvider', schemes : ['gz']
    }
}

There are two properties in this case:

  • className - The name of the provider class. The class must be on the classpath, otherwise the provider will not be loaded.

  • schemes - One or more schemes that can be used in a URI.

Sometimes we would like to load a provider only if some supporting classes are available. For this we use the dependsOnClasses keyword, which takes a list of one or more classes. If any of these classes is not found, the provider will not be loaded.

vfs {
    extend {
        provider className : 'org.apache.commons.vfs2.provider.tar.TarFileProvider',
          schemes : ['tar'],
          dependsOnClasses : ['org.apache.commons.vfs2.provider.tar.TarInputStream']
    }
}

We can also decide to load a provider only if another provider is already present. For this we use the dependsOnSchemes keyword.

vfs {
    extend {
        provider className : 'org.apache.commons.vfs2.provider.tar.TarFileProvider',
          schemes : ['tgz'],
          dependsOnSchemes : ['tar','gz']
    }
}

It is important to note that Groovy VFS does not throw exceptions when providers cannot be loaded, but simply sends a debug log message. This is a design decision that matches that of Apache VFS.

Overriding the Default Provider

As of v0.6 the defaultProvider property can no longer be used to set a default provider during construction of a VFS. The correct way to do it is via the defaultProvider keyword.

vfs {
  extend {
    defaultProvider className : 'org.apache.commons.vfs2.provider.url.UrlFileProvider'
  }
}

Operation Providers

Adding an operation provider is similar to the basic use case of adding providers, except that the operationProvider keyword is used.

vfs {
    extend {
      operationProvider className : 'acl.AclOperationsProvider', schemes : ["s3","aws"]
    }
}

Adding MIME Type Maps and File Extension Maps

Maps can be added for MIME types with keyword mimeType or for file extensions with keyword ext. In both cases the first parameter will refer to the MIME Type or file extension and the second parameter to the scheme that it is applicable to.

vfs {
    extend {

      mimeType 'application/zip', 'zip'

      ext 'gzip', 'gz'

    }
}

Gradle Plugin

Bootstrap Gradle Plugin

From its inception, the plugin has made a VFS object available as an extension to the project class. The interface is very experimental and may change without much warning in future releases of this plugin.

buildscript {
    repositories {
        jcenter()
        mavenCentral()
    }
    dependencies {
        classpath 'org.ysb33r.gradle:vfs-gradle-plugin:0.5.1'
        classpath 'commons-net:commons-net:3.+'  // If you want to use ftp
        classpath 'commons-httpclient:commons-httpclient:3.1' // If you want http/https
        classpath 'com.jcraft:jsch:0.1.48'  // If you want sftp
    }
}
apply plugin : 'org.ysb33r.vfs'

// Create a VFS task (doLast is used because the << operator was removed in Gradle 5)
task copyReadme {
  doLast {
    vfs {
      cp 'https://raw.github.com/ysb33r/groovy-vfs/master/README.md', new File("${buildDir}/tmp/README.md")
    }
  }
}

// it is also possible to update global options for vfs
vfs {
  options {
    http {
      maxTotalConnections 4
    }
  }
}

If you want to see what VFS is doing, run gradle with --debug.

VfsCopy

In v1.0 a copy task was added.

import org.ysb33r.gradle.vfs.tasks.VfsCopy

task download ( type : VfsCopy ) {

    from 'http://somewhere.example/file' // (1)

    from 'http://somewhere.example/another_folder', {
      include '**/*.jpg' // (2)
    }

    from 'ftp://somewhere.else.example/folder', {
      options filter : ~/\.jpg$/  // (3)
    }

    into new File(buildDir,'downloadedFolder') // (4)

    options { // (5)
      ftp {
        passiveMode true
      }
    }
}

(1) Copy this file
(2) Ant-style patterns are also supported
(3) Standard Groovy VFS filter behaviour is achieved via options
(4) Set the destination root URI. Files will be copied into that folder
(5) Set options for all of the copy operations. These override global options, but are only applicable to source and destination URLs within this task

Up to date checks

Performing an up-to-date check against a variety of remote filesystems can be tricky. The task is therefore considered out of date if any of the following holds:

  • The destination root does not exist

  • Any file from source does not exist in the destination hierarchy

  • The source file is newer than the destination (folder timestamps are not checked)
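To make the three rules concrete, here is a sketch that applies them to local directories using java.io.File. It is purely illustrative: the function name is hypothetical, and the plugin itself works on VFS URLs, so this is not its implementation.

```groovy
import groovy.io.FileType

// Illustrative only: applies the three out-of-date rules to local directories.
boolean isOutOfDate(File sourceRoot, File destRoot) {
    if (!destRoot.exists()) {
        return true                      // rule 1: destination root is missing
    }
    boolean stale = false
    sourceRoot.eachFileRecurse(FileType.FILES) { File src ->
        String relative = sourceRoot.toPath().relativize(src.toPath()).toString()
        File dest = new File(destRoot, relative)
        // rule 2: file missing in destination; rule 3: source file is newer
        if (!dest.exists() || src.lastModified() > dest.lastModified()) {
            stale = true
        }
    }
    stale
}
```

Only files are visited, which mirrors the note that folder timestamps are not checked.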

In beta-3 and beta-4, input-output caching did not work (#49). This was rectified in beta-5 and updated again in beta-7 (#64).

Copy optimisations

As long as both the source and the destination filesystems (schemes) support the ability to check modification time, only files that are newer will be copied. If the modification time cannot be checked the copy will only occur if the destination does not exist.
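The per-file decision described above can be written as a small function. This is a sketch under stated assumptions, not the plugin's code: the function and parameter names are hypothetical, and copying when the destination is missing in the timestamp-capable case is inferred rather than stated in this document.

```groovy
// Illustrative decision function for a single file.
boolean shouldCopy(boolean canCheckModTime, boolean destExists,
                   long sourceModified, long destModified) {
    if (!destExists) {
        return true      // nothing to compare against (assumed for both cases)
    }
    // With timestamps available, copy only newer files; otherwise never replace.
    canCheckModTime && sourceModified > destModified
}
```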

Adding extra plugins

From v1.0 onwards additional plugins can be loaded via a new extend block. For more details see this gist: https://gist.github.com/ysb33r/9916940