Toolbox Description
Motivation
MATLAB does not natively support sets, so regular vectors are usually used to store them. Even though the standard MATLAB set routines return proper sets (vectors that are sorted and in which each value appears at most once), they do not assume that their inputs fulfil these conditions, since users can, in principle, manipulate sets between successive calls to set routines.
This safe approach is costly for those of us who work with large sets, because every call has to re-establish properties that the inputs may already have.
However, users performing operations on large sets might never alter these sets manually, or can easily ensure that they remain sorted and unique at all times. If this is the case, set operations can be optimized to take advantage of these properties. That's what FastSet is about.
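To illustrate why sorted, duplicate-free inputs help, here is a minimal C++ sketch of an intersection that walks both vectors once. It is not the FastSet source; the name intersect_sorted is hypothetical, and the sketch only shows the general merge technique under the assumption that both inputs are already sorted in ascending order with no repeated values.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of FastSet: intersect two vectors that are
// assumed to be sorted in ascending order with no repeated values.
// The inputs are not verified, so unsorted input gives a wrong result.
template <typename T>
std::vector<T> intersect_sorted(const std::vector<T>& a, const std::vector<T>& b) {
    std::vector<T> out;
    std::size_t i = 0, j = 0;
    // Each step advances at least one index, so the loop runs at most
    // a.size() + b.size() times; no sorting or searching is needed.
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;                  // a[i] cannot occur later in b
        } else if (b[j] < a[i]) {
            ++j;                  // b[j] cannot occur later in a
        } else {
            out.push_back(a[i]);  // common element; output stays sorted
            ++i;
            ++j;
        }
    }
    return out;
}
```

Because the result is produced in ascending order with no repeats, it is itself a valid input for the next call, which is the property the paragraph above relies on.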
Organization
The FastSet toolbox consists of five mex (compiled C++) files which cover all basic set operations. These functions are almost fully compatible with the standard MATLAB routines, so your learning curve will be negligible while you benefit from a noticeable performance increase of up to 10-fold.
The five functions are:
- fast_intersect_sorted
- fast_union_sorted
- fast_setdiff_sorted
- fast_setxor_sorted
- fast_ismember_sorted
The FastSet toolbox has been implemented in C++ and compiled as mex-files. One disadvantage of mex-files is that they do not contain help. Therefore, each one is accompanied by a corresponding m-file containing a function prototype and detailed help. The MATLAB help system extracts help from these m-files (when "help fast_union_sorted" is typed at the command prompt), while the MATLAB engine calls the mex-file when the function is executed.
Since mex-files are libraries called by the MATLAB engine, they have to match the MATLAB version you have. The ones I provide will work with any 7.x version on Windows. I support both 32- and 64-bit versions by providing .mexw32 and .mexw64 mex-files for each function.
Install
Installation of the toolbox couldn't be easier. Download the library, unpack the archive into a directory on your computer and add that directory to MATLAB's path (File->Set Path... in the MATLAB menu). Don't forget to save the new path (otherwise it will revert to the original in the next MATLAB session).
Notations
Ma
Limitations
- The major FastSet limitation arises from the very assumption that provides the greatest efficiency gain. All FastSet routines assume that their inputs are sorted and unique vectors, and they will fail if any of the inputs does not satisfy this assumption. FastSet was designed to be as efficient as possible, so the inputs are not even verified. If you are not sure whether your data is sorted and unique, pass it through MATLAB's standard unique function.
The good news is that it is usually not easy to violate this requirement. All set functions (whether standard MATLAB set routines or FastSet's own) return valid sets, so porting existing code is quite simple.
- FastSet routines support any numeric format: 8, 16, 32 and 64-bit integers as well as 64 and 80-bit floating point numbers. However, at this point they do not support string arrays (cell arrays of strings). Let me know if you really need them and I'll consider implementing this feature as well.
- Another incompatibility with the standard MATLAB routines is that FastSet functions do not accept the optional (and rarely used) 'rows' parameter.
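The "sorted and unique" requirement amounts to the vector being strictly increasing. As a hedged illustration in C++ (these helpers are hypothetical and not part of FastSet), the sketch below checks that property in one pass and, when it does not hold, normalizes the data by sorting and dropping duplicates, which is roughly the effect of passing a vector through a unique-style routine. It assumes the values contain no NaN.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper, not part of FastSet: true when v is strictly
// increasing, i.e. sorted in ascending order with no repeated values.
// Assumes the values contain no NaN.
template <typename T>
bool is_sorted_unique(const std::vector<T>& v) {
    // adjacent_find returns the first position k with v[k] >= v[k+1];
    // reaching end() means no neighbouring pair violates the order.
    return std::adjacent_find(v.begin(), v.end(), std::greater_equal<T>()) == v.end();
}

// Hypothetical helper: sort in place and erase repeated values, so the
// result satisfies is_sorted_unique.
template <typename T>
void make_sorted_unique(std::vector<T>& v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
}
```

The check costs a single linear pass, so it is cheap next to a full sort; running it once on data of uncertain origin and skipping it for the outputs of set routines keeps the overhead small.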
Download
I have compiled and tested FastSet on both 32 and 64-bit MS Windows versions of MATLAB. The toolbox comes with source code, so if any of you cares to build the toolbox for another OS, please send me the binaries and I'll put them here as well. Porting to other systems should be straightforward and I'm ready to provide whatever assistance you may require.
Download FastSet V1.0 Windows 32 & 64 bits
