This user guide explains how to start using DataSift in your software projects.
DataSift provides a set of simple predefined validations and transformations which
can be used directly, without any configuration. This is the purpose of
the DataSiftValidationUtils and DataSiftTransformationUtils classes. For example:
boolean isInteger = DataSiftValidationUtils.stringIsInteger("-1232");
Integer myInteger = DataSiftTransformationUtils.stringAsInteger("-1232");
Note, however, that using DataSift this way exploits very little of its power, and could even add overhead to your application if all you need is this kind of occasional, simple data validation.
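To make the two utility calls above more concrete, the following self-contained sketch shows roughly what a string-is-integer check and a string-to-integer conversion amount to in plain Java. The class IntegerStrings and its methods are hypothetical stand-ins written for illustration only; they are not DataSift's implementation, which may treat edge cases such as whitespace or a leading plus sign differently.

```java
final class IntegerStrings {

    private IntegerStrings() {}

    /** Returns true when the text parses as a 32-bit signed integer. */
    static boolean isInteger(String text) {
        if (text == null) {
            return false;
        }
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Converts the text to an Integer, or throws NumberFormatException. */
    static Integer asInteger(String text) {
        return Integer.valueOf(text);
    }

    public static void main(String[] args) {
        System.out.println(isInteger("-1232"));
        System.out.println(asInteger("-1232"));
    }
}
```

If an application only ever needs checks of this kind, a few lines like these may be enough, which is the overhead trade-off the note above refers to.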
Normal use of DataSift would mean instantiating an
org.auelproject.datasift.DataSift object, adding specs, resolvers,
validators or transformers to it if necessary (either via configuration files or
API calls) and using the configured instance to validate or transform the data we want
(the target objects).
...
DataSift dataSift = new DataSift();
ValidationResult validationResult = dataSift.validate(myObject, "ValidationSpec");
if (validationResult.isSuccessful()) {
    ...
} else {
    ...
}
...
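With this ValidationResult object, we can:
- Check whether the validation as a whole succeeded, with isSuccessful().
- Get the names of the items that passed and of the items that failed, with the getSuccessfulValidationNames() and getFailedValidationNames() methods.
- Get the result of each item as a java.util.Map object which will have the names of the spec validator items as keys and a Boolean object as value for each entry.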
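As an illustration of the per-item map described above, this sketch splits a map of item names to Boolean results into the names that passed and the names that failed. It does not call DataSift: the class ValidationResultsByItem, its methods, and the sample item names are all hypothetical, and the map is built by hand rather than obtained from a ValidationResult.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ValidationResultsByItem {

    private ValidationResultsByItem() {}

    /** Collects the item names whose Boolean result equals the wanted outcome. */
    static List<String> namesWithOutcome(Map<String, Boolean> resultsByItem, boolean wanted) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : resultsByItem.entrySet()) {
            if (Boolean.valueOf(wanted).equals(entry.getValue())) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    /** Hand-built, hypothetical results for three validation items. */
    static Map<String, Boolean> sampleResults() {
        Map<String, Boolean> results = new LinkedHashMap<>();
        results.put("loginLength", Boolean.TRUE);
        results.put("nameNotNull", Boolean.FALSE);
        results.put("ageNumber", Boolean.TRUE);
        return results;
    }

    public static void main(String[] args) {
        Map<String, Boolean> results = sampleResults();
        System.out.println("passed: " + namesWithOutcome(results, true));
        System.out.println("failed: " + namesWithOutcome(results, false));
    }
}
```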
...
DataSift dataSift = new DataSift();
TransformationResult transformationResult = dataSift.transform(myObject, "TransformationSpec");
if (transformationResult.isSuccessful()) {
    Map resultsByItem = transformationResult.getResults();
    ...
} else {
    ...
}
...
And with this TransformationResult object, we can:
- Check whether the transformation as a whole succeeded, with isSuccessful().
- Get the names of the items that succeeded and of the items that failed, with the getSuccessfulTransformationNames() and getFailedTransformationNames() methods.
- Get the result of each item as a java.util.Map object which will have the names of the spec transformation items as keys and the results of the transformations as values. If a transformation item has failed, the value in the map entry will be the generated TransformationNotPossibleException.
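Because a failed item stores an exception as its map value, code that reads the results map has to tell real results apart from failures. The sketch below shows one way to do that on a hand-built map. It is an illustration under assumptions: the class TransformationResultsByItem and the item names are hypothetical, and a plain IllegalArgumentException stands in for DataSift's TransformationNotPossibleException so that the example compiles without the library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TransformationResultsByItem {

    private TransformationResultsByItem() {}

    /** Keeps only the entries whose value is a real result, not an exception. */
    static Map<String, Object> successfulValues(Map<String, Object> resultsByItem) {
        Map<String, Object> successful = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : resultsByItem.entrySet()) {
            if (!(entry.getValue() instanceof Exception)) {
                successful.put(entry.getKey(), entry.getValue());
            }
        }
        return successful;
    }

    /** Hand-built, hypothetical results: one success and one failure. */
    static Map<String, Object> sampleResults() {
        Map<String, Object> results = new LinkedHashMap<>();
        results.put("ageAsInteger", Integer.valueOf(42));
        // Stand-in for the exception DataSift would store for a failed item.
        results.put("heightAsInteger", new IllegalArgumentException("not a number: tall"));
        return results;
    }

    public static void main(String[] args) {
        System.out.println(successfulValues(sampleResults()).keySet());
    }
}
```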
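DataSift defines specs to determine the way in which an object will be processed during validation or transformation operations. These specs are structures with the following attributes:
- A resolver, which is used to obtain data from the target object (for example, a BeanResolver to access the fields of a JavaBean). This resolver can optionally take a number of configuration parameters.
- A set of validation or transformation items. Each item names the validator or transformer to apply, its configuration parameters, and the selectors that tell the resolver which data to extract (for example, the selector "name" passed to the bean resolver will result in looking for a getName() method on the target object).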
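The selector-to-getter mapping mentioned above can be sketched with plain Java reflection. This is only an illustration of the idea, not the code of DataSift's BeanResolver: the class BeanSelectorSketch, its resolve method, and the Person bean are hypothetical, and details such as boolean "is" getters or nested selectors are left out.

```java
import java.lang.reflect.Method;

final class BeanSelectorSketch {

    private BeanSelectorSketch() {}

    /** Resolves a selector such as "name" by calling getName() on the target. */
    static Object resolve(Object target, String selector) {
        String getterName = "get"
                + Character.toUpperCase(selector.charAt(0))
                + selector.substring(1);
        try {
            Method getter = target.getClass().getMethod(getterName);
            return getter.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "No usable " + getterName + "() on " + target.getClass().getName(), e);
        }
    }

    /** Minimal JavaBean used only by this example. */
    public static final class Person {
        private final String name;

        public Person(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(new Person("Ada"), "name"));
    }
}
```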
Both validation and transformation specs can be
built programmatically by instantiating an
org.auelproject.datasift.config.SpecConfig
object, or declaratively in XML like this:
<validation-spec resolver="BeanResolver">
    <validation name="loginLength" validator="StringLength">
        <config name="acceptNull" value="false"/>
        <config name="minLength" value="4"/>
        <config name="maxLength" value="15"/>
        <data name="data" selector="login"/>
    </validation>
    <validation name="nameNotNull" validator="StringNotEmpty">
        <data name="data" selector="name"/>
    </validation>
    <validation name="ageNumber" validator="StringIsInteger">
        <data name="data" selector="age"/>
    </validation>
    <validation name="childrenAgesIter" validator="ArrayIterator">
        <config name="successCondition" value="AND"/>
        <data name="data" selector="childrenAges"/>
    </validation>
    <validation name="childreAgeIsInteger" validator="StringIsInteger" iterator="childrenAgesIter">
        <data name="data" selector="value"/>
    </validation>
</validation-spec>
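The last two items of the spec above combine an iterator with a per-element validation. A natural reading, and it is only an assumption here, is that a successCondition of "AND" makes the iteration succeed when every element of the array passes the dependent validation. The sketch below expresses that reading in plain Java; the class IteratedValidationSketch and its methods are hypothetical and do not reproduce DataSift's ArrayIterator.

```java
import java.util.function.Predicate;

final class IteratedValidationSketch {

    private IteratedValidationSketch() {}

    /** Assumed "AND" semantics: true only if every element passes the check. */
    static <T> boolean allMatch(T[] values, Predicate<? super T> itemValidator) {
        for (T value : values) {
            if (!itemValidator.test(value)) {
                return false;
            }
        }
        return true;
    }

    /** Simplified integer check: optional minus sign followed by digits. */
    static boolean looksLikeInteger(String text) {
        return text != null && text.matches("-?\\d+");
    }

    public static void main(String[] args) {
        String[] childrenAges = {"3", "7"};
        System.out.println(allMatch(childrenAges, IteratedValidationSketch::looksLikeInteger));
    }
}
```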
Specs can be declared in several places. The first one is the default specs file, datasift-specs.xml.
If, on initialization, DataSift finds a file called
datasift-specs.xml
in its classpath, the specs it may contain will be
loaded and made available for any DataSift
instance created
from then on.
The format of the datasift-specs.xml
file is like this (see also the
examples bundled with the software):
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE datasift-specs PUBLIC
    "-//AuelProject Datasift//DTD Datasift Specs Configuration 1.0//EN"
    "http://www.datasift.org/dtds/datasift-specs_1_0.dtd">
<datasift-specs>
    <validation-spec name="[anyName]" resolver="[anyResolver]">
        <resolver-config>
            <config name="[anyResolverConfigParam]" value="[anyValue]"/>
            ...
        </resolver-config>
        <validation name="[anyName]" validator="[anyValidator]">
            <config name="[anyValidatorConfigParam]" value="[anyValue]"/>
            <config name="[anyValidatorConfigParam]" value="[anyValue]"/>
            ...
            <data name="[anyName]" selector="[anyField]"/>
            ...
        </validation>
        <validation name="[anyName]" validator="[anyValidator]" iterator="[anyIteratorName]">
            <data name="[anyName]" selector="[anyField]"/>
        </validation>
        ...
    </validation-spec>
    ...
    <transformation-spec name="[anyName]" resolver="[anyResolver]">
        <resolver-config>
            <config name="[anyResolverConfigParam]" value="[anyValue]"/>
            ...
        </resolver-config>
        <transformation name="[anyName]" transformer="[anyTransformer]">
            <config name="[anyTransformerConfigParam]" value="[anyValue]"/>
            <config name="[anyTransformerConfigParam]" value="[anyValue]"/>
            ...
            <data name="data" selector="[anyField]"/>
            ...
        </transformation>
        <transformation name="[anyName]" transformer="[anyTransformer]" iterator="[anyIteratorName]">
            <data name="[anyName]" selector="[anyField]"/>
        </transformation>
    </transformation-spec>
    ...
</datasift-specs>
At any moment, the user may ask a DataSift instance to load and parse
(from the application classpath) a user-created specs file with the same
structure as the default specs file. This is done with:
DataSift dataSift = new DataSift();
dataSift.configureSpecsFromFile("mySpecsFile.xml");
...which will result in the specs defined in that file being registered and made accessible for any further use of this DataSift instance.
Besides registering specs via the datasift-specs.xml file or any other files
defined by the user, when DataSift is asked to use a spec that it has not previously
registered, it will try to load it from a single-spec file in the classpath with
the same name as the requested spec. For example, if we executed the following code:
DataSift dataSift = new DataSift();
ValidationResult result = dataSift.validate(myObject, "ANewSpec");
...and the spec called "ANewSpec" had not been previously registered in that
DataSift instance (which, in this case, could only have happened through the default file),
the framework would try to load a file called ANewSpec.dsv.xml from the classpath,
parse it, and register the validation spec it contains under the name "ANewSpec"
for further use.
The file extension for automatically loaded files is .dsv.xml for
validation spec files, and .dst.xml for transformation spec files.
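The lookup described above (use the registered spec if there is one, otherwise derive a file name from the spec name, load it, and register the result) can be sketched as follows. Everything in this example is hypothetical: the class LazySpecRegistry, its methods, and the use of a String as the "spec" are simplifications for illustration, and the file loader is passed in as a function so that the example runs without touching the classpath. Only the .dsv.xml and .dst.xml extensions come from the text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class LazySpecRegistry {

    private final Map<String, String> validationSpecsByName = new HashMap<>();
    private final Function<String, String> fileLoader;
    private int loads = 0;

    /** The loader receives a file name and returns the (simplified) spec. */
    LazySpecRegistry(Function<String, String> fileLoader) {
        this.fileLoader = fileLoader;
    }

    static String validationSpecFile(String specName) {
        return specName + ".dsv.xml";
    }

    static String transformationSpecFile(String specName) {
        return specName + ".dst.xml";
    }

    /** Returns the registered spec, loading and registering it on first use. */
    String validationSpec(String specName) {
        String spec = validationSpecsByName.get(specName);
        if (spec == null) {
            spec = fileLoader.apply(validationSpecFile(specName));
            validationSpecsByName.put(specName, spec);
            loads++;
        }
        return spec;
    }

    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        LazySpecRegistry registry = new LazySpecRegistry(file -> "spec parsed from " + file);
        System.out.println(registry.validationSpec("ANewSpec"));
        System.out.println(registry.validationSpec("ANewSpec")); // served from the registry
        System.out.println("files loaded: " + registry.loadCount());
    }
}
```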
These single-spec files have a format very similar to the one shown before, except that they have no spec name (the file name, without the extension, is used instead) and that each file can contain only one validation or transformation spec. This is the format for a single validation spec file:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE validation-spec PUBLIC
    "-//AuelProject Datasift//DTD Datasift Validation Spec Configuration 1.0//EN"
    "http://www.datasift.org/dtds/datasift-validation-spec_1_0.dtd">
<validation-spec resolver="[anyResolver]">
    <resolver-config>
        <config name="[anyResolverConfigParam]" value="[anyValue]"/>
        ...
    </resolver-config>
    <validation name="[anyName]" validator="[anyValidator]">
        <config name="[anyValidatorConfigParam]" value="[anyValue]"/>
        <config name="[anyValidatorConfigParam]" value="[anyValue]"/>
        ...
        <data name="[anyName]" selector="[anyField]"/>
    </validation>
    <validation name="[anyName]" validator="[anyValidator]" iterator="[anyIteratorName]">
        <data name="[anyName]" selector="[anyField]"/>
    </validation>
</validation-spec>
...and for the single transformation spec files:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE transformation-spec PUBLIC
    "-//AuelProject Datasift//DTD Datasift Transformation Spec Configuration 1.0//EN"
    "http://www.datasift.org/dtds/datasift-transformation-spec_1_0.dtd">
<transformation-spec resolver="[anyResolver]">
    <resolver-config>
        <config name="[anyResolverConfigParam]" value="[anyValue]"/>
        ...
    </resolver-config>
    <transformation name="[anyName]" transformer="[anyTransformer]">
        <config name="[anyTransformerConfigParam]" value="[anyValue]"/>
        <config name="[anyTransformerConfigParam]" value="[anyValue]"/>
        ...
        <data name="data" selector="[anyField]"/>
    </transformation>
    <transformation name="[anyName]" transformer="[anyTransformer]" iterator="[anyIteratorName]">
        <data name="[anyName]" selector="[anyField]"/>
    </transformation>
</transformation-spec>
Automatic loading of requested specs does not end here: DataSift can also load specs defined on a per-class basis, that is, specs defined to validate or transform objects of a specific class.
What this means is that, when called without specifying a spec name, DataSift will take the target object, examine its class, and look for a spec defined for that class in an XML file. For example, given the following code:
TransformationResult result = dataSift.transform(myObject);
...if myObject is an instance of class a.b.c.MyClass,
DataSift will automatically look in its classpath for a file called
a/b/c/MyClass.dst.xml and, if it exists, will parse it, register its spec
and use it to transform myObject. The same applies to validation
operations.
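The mapping from a class to its spec file can be sketched in a few lines. The class ClassSpecPaths and its methods are hypothetical helpers, not part of DataSift. The transformation path follows the a/b/c/MyClass.dst.xml example in the text; the validation path assumes the same convention with the .dsv.xml extension.

```java
final class ClassSpecPaths {

    private ClassSpecPaths() {}

    /** Turns a fully qualified class name into a classpath resource path. */
    static String transformationSpecPath(String className) {
        return className.replace('.', '/') + ".dst.xml";
    }

    /** Assumed to mirror the transformation case, with the .dsv.xml extension. */
    static String validationSpecPath(String className) {
        return className.replace('.', '/') + ".dsv.xml";
    }

    /** Convenience overload that starts from the target object's class. */
    static String transformationSpecPath(Class<?> type) {
        return transformationSpecPath(type.getName());
    }

    public static void main(String[] args) {
        System.out.println(transformationSpecPath("a.b.c.MyClass"));
        System.out.println(validationSpecPath("a.b.c.MyClass"));
        System.out.println(transformationSpecPath(java.util.ArrayList.class));
    }
}
```

A path built this way could then be handed to ClassLoader.getResourceAsStream to check whether the file exists on the classpath.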
Users can also add validation or transformation specs to a DataSift instance
without configuration files, by creating a SpecConfig
object and
adding it to the DataSift:
SpecConfig specConfig = new SpecConfig();
specConfig.setName("mySpec");
specConfig.setResolver("BeanResolver");

SpecItemConfig specItemConfig = new SpecItemConfig();
specItemConfig.setName("uniqueItem");
specItemConfig.setEntityName("StringIsInteger");
specItemConfig.addDataParamConfig(EntityUtils.SINGLE_DATA_PARAMETER_NAME, "age");

specConfig.addSpecItemConfig(specItemConfig);
dataSift.addValidationSpecConfig(specConfig);
DataSift provides a set of predefined resolvers that the developer can use without
defining them in any configuration file. For example, BeanResolver (to
get data from JavaBeans) or StringResolver (to use String objects as targets).
All the predefined resolvers and their configuration can be found in the reference documentation in the Predefined resolvers page.
At startup, DataSift will look in its classpath for a file named
datasift-resolvers.xml
, which may contain resolver declarations, and load
these declarations, making them available to all DataSift instances created from then
on.
This default configuration file would look like this:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE datasift-entities PUBLIC
    "-//AuelProject Datasift//DTD Datasift Entities Configuration 1.0//EN"
    "http://www.datasift.org/dtds/datasift-entities_1_0.dtd">
<datasift-entities>
    <resolvers>
        <resolver name="[anyName]" className="[anyClassName]">
            <config name="[anyParameter]" value="[anyDefaultValue]"/>
            <config name="[anyParameter]" value="[anyDefaultValue]"/>
            ...
        </resolver>
        <resolver name="[anyName]" className="[anyClassName]"/>
        <resolver name="[anyName]" className="[anyClassName]"/>
    </resolvers>
</datasift-entities>
Note that resolvers can receive configuration parameters both here, in the resolver declaration files, and later, in the spec declaration files. This allows some resolver parameters to be defined globally in the resolver declaration files and others to be defined per spec in the spec declaration files.
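One way to picture the two configuration levels is as two maps that are combined into the parameters a resolver finally sees. The sketch below makes an assumption the text does not state: when the same parameter appears at both levels, the per-spec value wins. The class ResolverConfigMerge, its method, and the parameter names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ResolverConfigMerge {

    private ResolverConfigMerge() {}

    /** Combines global and per-spec parameters; per-spec values win (assumed). */
    static Map<String, String> effectiveConfig(Map<String, String> globalParams,
                                               Map<String, String> perSpecParams) {
        Map<String, String> merged = new LinkedHashMap<>(globalParams);
        merged.putAll(perSpecParams);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> global = new LinkedHashMap<>();
        global.put("paramOne", "globalValue");
        global.put("paramTwo", "anotherValue");

        Map<String, String> perSpec = new LinkedHashMap<>();
        perSpec.put("paramOne", "specValue");

        System.out.println(effectiveConfig(global, perSpec));
    }
}
```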
As with specs, the user may at any moment ask a DataSift instance to load a configuration file that has the same structure as the default one but a different name. The resolvers declared there are registered only in that instance of DataSift.
DataSift dataSift = new DataSift();
dataSift.configureResolversFromFile("myResolversFile.xml");
Resolvers can also be programmatically registered in a DataSift instance by creating an
object of class org.auelproject.datasift.config.ResolverConfig
, in a way
similar to:
ResolverConfig resolverConfig = new ResolverConfig();
resolverConfig.setName("MyResolver");
resolverConfig.setClassName("mypackage.MyResolverClass");
resolverConfig.addConfigParam("paramOne", "aValue");
resolverConfig.addConfigParam("paramTwo", "anotherValue");
dataSift.addResolverConfig(resolverConfig);
As with resolvers, the framework provides a set of predefined validators and
transformers that can be used out of the box, without any configuration. For
example, validators like StringIsInteger (checks whether a String contains
an integer number), StringLength (checks whether the length of a String
is between a minimum and a maximum) or StringNotEmpty. Also transformers
like StringAsInteger (transforms a String object into an Integer).
All the predefined validators and transformers and their configurations can be found in the reference documentation in the Predefined validators or Predefined transformers page.
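To give an idea of what a validator of this kind does with its configuration parameters, here is a sketch of a length check driven by the minLength, maxLength and acceptNull parameters used earlier in this guide. It is not DataSift's StringLength validator: the class StringLengthSketch is hypothetical, and treating both bounds as inclusive is an assumption.

```java
final class StringLengthSketch {

    private StringLengthSketch() {}

    /**
     * Returns true when the length of data lies between minLength and
     * maxLength (both inclusive, by assumption). A null value passes only
     * when acceptNull is true.
     */
    static boolean isValid(String data, int minLength, int maxLength, boolean acceptNull) {
        if (data == null) {
            return acceptNull;
        }
        int length = data.length();
        return length >= minLength && length <= maxLength;
    }

    public static void main(String[] args) {
        System.out.println(isValid("jsmith", 4, 15, false));
        System.out.println(isValid("abc", 4, 15, false));
        System.out.println(isValid(null, 4, 15, false));
    }
}
```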
Validators and transformers can also be declared in XML files, and there are default configuration files that are read (and their entities registered) at startup.
These default files are named datasift-validators.xml for validators and
datasift-transformers.xml for transformers. The validators default configuration
file would look like this:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE datasift-entities PUBLIC
    "-//AuelProject Datasift//DTD Datasift Entities Configuration 1.0//EN"
    "http://www.datasift.org/dtds/datasift-entities_1_0.dtd">
<datasift-entities>
    <validators>
        <validator name="[anyName]" className="[anyClassName]">
            <config name="[anyParameter]" value="[anyDefaultValue]"/>
            <config name="[anyParameter]" value="[anyDefaultValue]"/>
            ...
        </validator>
        <validator name="[anyName]" className="[anyClassName]"/>
        <validator name="[anyName]" className="[anyClassName]"/>
    </validators>
</datasift-entities>
...and the default transformers file:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE datasift-entities PUBLIC
    "-//AuelProject Datasift//DTD Datasift Entities Configuration 1.0//EN"
    "http://www.datasift.org/dtds/datasift-entities_1_0.dtd">
<datasift-entities>
    <transformers>
        <transformer name="[anyName]" className="[anyClassName]">
            <config name="[anyParameter]" value="[anyDefaultValue]"/>
            <config name="[anyParameter]" value="[anyDefaultValue]"/>
            ...
        </transformer>
        <transformer name="[anyName]" className="[anyClassName]"/>
        <transformer name="[anyName]" className="[anyClassName]"/>
    </transformers>
</datasift-entities>
At any moment the user can ask DataSift to load a file declaring new validators or transformers with:
DataSift dataSift = new DataSift();
dataSift.configureValidatorsFromFile("myValidatorsFile.xml");
dataSift.configureTransformersFromFile("myTransformersFile.xml");
The files that register validators and transformers (and, in fact, resolvers) can also be mixed, so that the developer can use a single file to declare new resolvers, validators and transformers.
For example, we could create a file called myEntities.xml
with the following
contents:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE datasift-entities PUBLIC
    "-//AuelProject Datasift//DTD Datasift Entities Configuration 1.0//EN"
    "http://www.datasift.org/dtds/datasift-entities_1_0.dtd">
<datasift-entities>
    <resolvers>
        <resolver name="[anyName]" className="[anyClassName]"/>
        <resolver name="[anyName]" className="[anyClassName]"/>
    </resolvers>
    <validators>
        <validator name="[anyName]" className="[anyClassName]"/>
        <validator name="[anyName]" className="[anyClassName]"/>
    </validators>
    <transformers>
        <transformer name="[anyName]" className="[anyClassName]"/>
        <transformer name="[anyName]" className="[anyClassName]"/>
    </transformers>
</datasift-entities>
...and then load its entities with:
DataSift dataSift = new DataSift();
dataSift.configureResolversFromFile("myEntities.xml");
dataSift.configureValidatorsFromFile("myEntities.xml");
dataSift.configureTransformersFromFile("myEntities.xml");
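The reason one mixed file can be passed to three different configure calls is that each call only needs to pick out its own kind of element. The following sketch shows that idea with the JDK's DOM parser on an in-memory document. It does not reproduce DataSift's loader: the class EntitiesFileSketch, its method, and the sample entity names are hypothetical, and the DOCTYPE is left out of the sample so that parsing does not try to fetch the DTD.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class EntitiesFileSketch {

    private EntitiesFileSketch() {}

    /** Hypothetical mixed entities document, without DOCTYPE. */
    static final String SAMPLE =
            "<datasift-entities>"
            + "<resolvers><resolver name=\"MyResolver\" className=\"mypackage.MyResolverClass\"/></resolvers>"
            + "<validators><validator name=\"MyValidator\" className=\"mypackage.MyValidatorClass\"/></validators>"
            + "<transformers><transformer name=\"MyTransformer\" className=\"mypackage.MyTransformerClass\"/></transformers>"
            + "</datasift-entities>";

    /** Returns the name attribute of every element with the given tag name. */
    static List<String> entityNames(String xml, String tagName) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = document.getElementsByTagName(tagName);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse entities XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(entityNames(SAMPLE, "resolver"));
        System.out.println(entityNames(SAMPLE, "validator"));
        System.out.println(entityNames(SAMPLE, "transformer"));
    }
}
```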
Finally, validators and transformers can be added to a DataSift instance in the same
way as resolvers. To do this,
org.auelproject.datasift.config.ValidatorConfig or
org.auelproject.datasift.config.TransformerConfig objects should be
created and initialized, and then passed to the DataSift instance.
ValidatorConfig validatorConfig = new ValidatorConfig();
validatorConfig.setName("MyValidator");
validatorConfig.setClassName("mypackage.MyValidatorClass");
validatorConfig.addConfigParam("paramOne", "aValue");
validatorConfig.addConfigParam("paramTwo", "anotherValue");
dataSift.addValidatorConfig(validatorConfig);
TransformerConfig transformerConfig = new TransformerConfig();
transformerConfig.setName("MyTransformer");
transformerConfig.setClassName("mypackage.MyTransformerClass");
transformerConfig.addConfigParam("paramOne", "aValue");
transformerConfig.addConfigParam("paramTwo", "anotherValue");
dataSift.addTransformerConfig(transformerConfig);