| java.lang.Object weka.filters.Filter weka.filters.supervised.attribute.Discretize
Discretize | public class Discretize extends Filter implements SupervisedFilter, OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler(Code) | |
An instance filter that discretizes a range of numeric attributes in the dataset into nominal attributes. Discretization is by Fayyad & Irani's MDL method (the default).
For more information, see:
Usama M. Fayyad, Keki B. Irani: Multi-interval discretization of continuous-valued attributes for classification learning. In: Thirteenth International Joint Conference on Artificial Intelligence, 1022-1027, 1993.
Igor Kononenko: On Biases in Estimating Multi-Valued Attributes. In: 14th International Joint Conference on Artificial Intelligence, 1034-1040, 1995.
BibTeX:
@inproceedings{Fayyad1993,
author = {Usama M. Fayyad and Keki B. Irani},
booktitle = {Thirteenth International Joint Conference on Artificial Intelligence},
pages = {1022-1027},
publisher = {Morgan Kaufmann Publishers},
title = {Multi-interval discretization of continuous-valued attributes for classification learning},
volume = {2},
year = {1993}
}
@inproceedings{Kononenko1995,
author = {Igor Kononenko},
booktitle = {14th International Joint Conference on Artificial Intelligence},
pages = {1034-1040},
title = {On Biases in Estimating Multi-Valued Attributes},
year = {1995},
PS = {http://ai.fri.uni-lj.si/papers/kononenko95-ijcai.ps.gz}
}
Valid options are:
-R <col1,col2-col4,...>
Specifies list of columns to Discretize. First and last are valid indexes.
(default none)
-V
Invert matching sense of column indexes.
-D
Output binary attributes for discretized attributes.
-E
Use better encoding of split point for MDL.
-K
Use Kononenko's MDL criterion.
author: Len Trigg (trigg@cs.waikato.ac.nz) author: Eibe Frank (eibe@cs.waikato.ac.nz) version: $Revision: 1.7 $ |
m_CutPoints | protected double[][] m_CutPoints(Code) | | Store the current cutpoints
|
m_DiscretizeCols | protected Range m_DiscretizeCols(Code) | | Stores which columns to Discretize
|
m_MakeBinary | protected boolean m_MakeBinary(Code) | | Output binary attributes for discretized attributes.
|
m_UseBetterEncoding | protected boolean m_UseBetterEncoding(Code) | | Use better encoding of split point for MDL.
|
m_UseKononenko | protected boolean m_UseKononenko(Code) | | Use Kononenko's MDL criterion instead of Fayyad et al.'s
|
serialVersionUID | final static long serialVersionUID(Code) | | for serialization
|
Discretize | public Discretize()(Code) | | Constructor - initialises the filter
|
attributeIndicesTipText | public String attributeIndicesTipText()(Code) | | Returns the tip text for this property
tip text for this property suitable for displaying in the explorer/experimenter gui |
batchFinished | public boolean batchFinished()(Code) | | Signifies that this batch of input to the filter is finished. If the
filter requires all instances prior to filtering, output() may now
be called to retrieve the filtered instances.
Returns: true if there are instances pending output. throws: IllegalStateException - if no input structure has been defined |
calculateCutPoints | protected void calculateCutPoints()(Code) | | Generate the cutpoints for each attribute
|
calculateCutPointsByMDL | protected void calculateCutPointsByMDL(int index, Instances data)(Code) | | Set cutpoints for a single attribute using MDL.
Parameters: index - the index of the attribute to set cutpoints for Parameters: data - the data to work with |
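As a rough illustration of the MDL test that cut-point selection relies on, the sketch below checks whether one candidate binary split passes Fayyad and Irani's acceptance criterion: the information gain must exceed (log2(N - 1) + delta) / N, where delta = log2(3^k - 2) - (k*Ent(S) - k1*Ent(S1) - k2*Ent(S2)). It is self-contained and does not call any Weka class; the class name MdlSplitSketch and its method names are hypothetical, and it covers only the acceptance test, not the search for the best cut point or the recursion into each side.

```java
final class MdlSplitSketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    static int sum(int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        return total;
    }

    /** Shannon entropy, in bits, of a vector of per-class counts. */
    static double entropy(int[] counts) {
        int total = sum(counts);
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) {
                double p = (double) c / total;
                h -= p * log2(p);
            }
        }
        return h;
    }

    /** Number of classes that actually occur in the count vector. */
    static int classesPresent(int[] counts) {
        int k = 0;
        for (int c : counts) {
            if (c > 0) {
                k++;
            }
        }
        return k;
    }

    /**
     * Fayyad and Irani's MDL acceptance test for one binary split.
     * left[i] and right[i] are the counts of class i on each side of the cut.
     */
    static boolean acceptSplit(int[] left, int[] right) {
        int nLeft = sum(left);
        int nRight = sum(right);
        if (nLeft == 0 || nRight == 0) {
            return false; // a cut must leave instances on both sides
        }
        int n = nLeft + nRight;
        int[] all = new int[left.length];
        for (int i = 0; i < left.length; i++) {
            all[i] = left[i] + right[i];
        }
        double entAll = entropy(all);
        double entLeft = entropy(left);
        double entRight = entropy(right);
        double gain = entAll
                - ((double) nLeft / n) * entLeft
                - ((double) nRight / n) * entRight;
        int k = classesPresent(all);
        int k1 = classesPresent(left);
        int k2 = classesPresent(right);
        double delta = log2(Math.pow(3, k) - 2)
                - (k * entAll - k1 * entLeft - k2 * entRight);
        return gain > (log2(n - 1) + delta) / n;
    }

    public static void main(String[] args) {
        // A clean split: class 0 entirely on the left, class 1 on the right.
        System.out.println(acceptSplit(new int[] {10, 0}, new int[] {0, 10})); // prints true
        // An uninformative split: both sides have the same class mix.
        System.out.println(acceptSplit(new int[] {5, 5}, new int[] {5, 5}));   // prints false
    }
}
```

The -E and -K options change how this cost is computed (a different encoding of the split point, or Kononenko's criterion); neither variant is shown here.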
convertInstance | protected void convertInstance(Instance instance)(Code) | | Convert a single instance over. The converted instance is added to
the end of the output queue.
Parameters: instance - the instance to convert |
getAttributeIndices | public String getAttributeIndices()(Code) | | Gets the current range selection
a string containing a comma separated list of ranges |
getCapabilities | public Capabilities getCapabilities()(Code) | | Returns the Capabilities of this filter.
the capabilities of this object See Also: Capabilities |
getCutPoints | public double[] getCutPoints(int attributeIndex)(Code) | | Gets the cut points for an attribute
Parameters: attributeIndex - the index (from 0) of the attribute to get the cut points of. Returns: an array containing the cutpoints (or null if the attribute requested isn't being Discretized) |
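To show how an array of cut points such as the one returned by getCutPoints can be read, the following sketch maps a numeric value to the index of the interval it falls into. It is plain Java with no Weka dependency; the class CutPointBinningSketch is hypothetical, and two conventions are assumptions of the sketch: a value equal to a cut point is placed in the lower interval, and a null array is treated as a single interval.

```java
final class CutPointBinningSketch {

    /**
     * Returns the 0-based index of the interval containing value.
     * With n ascending cut points there are n + 1 intervals:
     * (-inf, c0], (c0, c1], ..., (c(n-1), +inf).
     */
    static int binIndex(double value, double[] cutPoints) {
        if (cutPoints == null) {
            return 0; // assumption: no cut points means one interval
        }
        int bin = 0;
        while (bin < cutPoints.length && value > cutPoints[bin]) {
            bin++;
        }
        return bin;
    }

    public static void main(String[] args) {
        double[] cutPoints = {2.5, 7.0};
        System.out.println(binIndex(1.0, cutPoints)); // prints 0
        System.out.println(binIndex(2.5, cutPoints)); // prints 0
        System.out.println(binIndex(3.0, cutPoints)); // prints 1
        System.out.println(binIndex(9.0, cutPoints)); // prints 2
    }
}
```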
getInvertSelection | public boolean getInvertSelection()(Code) | | Gets whether the supplied column selection is inverted, so that the columns outside the selection are discretized instead.
true if the selection is inverted |
getMakeBinary | public boolean getMakeBinary()(Code) | | Gets whether binary attributes should be made for discretized ones.
true if attributes will be binarized |
getOptions | public String[] getOptions()(Code) | | Gets the current settings of the filter.
an array of strings suitable for passing to setOptions |
getTechnicalInformation | public TechnicalInformation getTechnicalInformation()(Code) | | Returns an instance of a TechnicalInformation object, containing
detailed information about the technical background of this class,
e.g., paper reference or book this class is based on.
the technical information about this class |
getUseBetterEncoding | public boolean getUseBetterEncoding()(Code) | | Gets whether better encoding is to be used for MDL.
true if the better MDL encoding will be used |
getUseKononenko | public boolean getUseKononenko()(Code) | | Gets whether Kononenko's MDL criterion is to be used.
true if Kononenko's criterion will be used. |
globalInfo | public String globalInfo()(Code) | | Returns a string describing this filter
a description of the filter suitable for displaying in the explorer/experimenter gui |
input | public boolean input(Instance instance)(Code) | | Input an instance for filtering. Ordinarily the instance is processed
and made available for output immediately. Some filters require all
instances be read before producing output.
Parameters: instance - the input instance. Returns: true if the filtered instance may now be collected with output(). throws: IllegalStateException - if no input format has been defined. |
invertSelectionTipText | public String invertSelectionTipText()(Code) | | Returns the tip text for this property
tip text for this property suitable for displaying in the explorer/experimenter gui |
listOptions | public Enumeration listOptions()(Code) | | Gets an enumeration describing the available options.
an enumeration of all the available options. |
main | public static void main(String[] argv)(Code) | | Main method for testing this class.
Parameters: argv - should contain arguments to the filter: use -h for help |
makeBinaryTipText | public String makeBinaryTipText()(Code) | | Returns the tip text for this property
tip text for this property suitable for displaying in the explorer/experimenter gui |
setAttributeIndices | public void setAttributeIndices(String rangeList)(Code) | | Sets which attributes are to be Discretized (only numeric
attributes among the selection will be Discretized).
Parameters: rangeList - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1, e.g. first-3,5,6-last. throws: IllegalArgumentException - if an invalid range list is supplied |
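The range-list syntax accepted by setAttributeIndices (1-based, with the keywords first and last) can be illustrated by a small stand-alone parser that expands a string such as first-3,5,6-last into 0-based indices. This is only a sketch of the notation: RangeListSketch is a hypothetical class, it is not the Range class the filter uses internally, and its validation rules (for example rejecting a descending range) are assumptions.

```java
import java.util.Arrays;
import java.util.TreeSet;

final class RangeListSketch {

    /** Converts one 1-based token ("first", "last", or a number) to a 0-based index. */
    static int toIndex(String token, int numAttributes) {
        String t = token.trim().toLowerCase();
        int oneBased;
        if (t.equals("first")) {
            oneBased = 1;
        } else if (t.equals("last")) {
            oneBased = numAttributes;
        } else {
            try {
                oneBased = Integer.parseInt(t);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Invalid index: '" + token + "'");
            }
        }
        if (oneBased < 1 || oneBased > numAttributes) {
            throw new IllegalArgumentException("Index out of range: '" + token + "'");
        }
        return oneBased - 1;
    }

    /** Expands a comma-separated range list into sorted, distinct 0-based indices. */
    static int[] parse(String rangeList, int numAttributes) {
        TreeSet<Integer> selected = new TreeSet<>();
        for (String part : rangeList.split(",")) {
            // Limit -1 keeps trailing empty strings, so "5-" is rejected below.
            String[] ends = part.split("-", -1);
            if (ends.length == 1) {
                selected.add(toIndex(ends[0], numAttributes));
            } else if (ends.length == 2) {
                int lo = toIndex(ends[0], numAttributes);
                int hi = toIndex(ends[1], numAttributes);
                if (lo > hi) {
                    throw new IllegalArgumentException("Descending range: '" + part + "'");
                }
                for (int i = lo; i <= hi; i++) {
                    selected.add(i);
                }
            } else {
                throw new IllegalArgumentException("Invalid range: '" + part + "'");
            }
        }
        return selected.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // For a dataset with 8 attributes.
        System.out.println(Arrays.toString(parse("first-3,5,6-last", 8)));
        // prints [0, 1, 2, 4, 5, 6, 7]
    }
}
```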
setAttributeIndicesArray | public void setAttributeIndicesArray(int[] attributes)(Code) | | Sets which attributes are to be Discretized (only numeric
attributes among the selection will be Discretized).
Parameters: attributes - an array containing indexes of attributes to Discretize. Since the array will typically come from a program, attributes are indexed from 0. throws: IllegalArgumentException - if an invalid set of ranges is supplied |
setInputFormat | public boolean setInputFormat(Instances instanceInfo) throws Exception(Code) | | Sets the format of the input instances.
Parameters: instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required). Returns: true if the outputFormat may be collected immediately. throws: Exception - if the input format can't be set successfully |
setInvertSelection | public void setInvertSelection(boolean invert)(Code) | | Sets whether the column selection is inverted. If true, the
columns outside the supplied range are discretized. If false, the
columns inside the supplied range are discretized.
Parameters: invert - the new invert setting |
setMakeBinary | public void setMakeBinary(boolean makeBinary)(Code) | | Sets whether binary attributes should be made for discretized ones.
Parameters: makeBinary - if binary attributes are to be made |
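One way to picture the binary output mode is as one two-valued attribute per cut point, each recording on which side of that cut the original value lies. The sketch below follows that reading; it is plain Java, the class BinaryEncodingSketch is hypothetical, and the convention that a value equal to a cut point is coded 0 is an assumption of the sketch.

```java
import java.util.Arrays;

final class BinaryEncodingSketch {

    /**
     * Returns one indicator per cut point:
     * 0 if value <= cutPoints[j], 1 if value > cutPoints[j].
     */
    static int[] indicators(double value, double[] cutPoints) {
        int[] codes = new int[cutPoints.length];
        for (int j = 0; j < cutPoints.length; j++) {
            codes[j] = value <= cutPoints[j] ? 0 : 1;
        }
        return codes;
    }

    public static void main(String[] args) {
        double[] cutPoints = {2.5, 7.0};
        System.out.println(Arrays.toString(indicators(1.0, cutPoints))); // prints [0, 0]
        System.out.println(Arrays.toString(indicators(3.0, cutPoints))); // prints [1, 0]
        System.out.println(Arrays.toString(indicators(9.0, cutPoints))); // prints [1, 1]
    }
}
```

Compared with a single nominal attribute, this encoding keeps the ordering of the intervals visible to learners that treat nominal values as unordered.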
setOptions | public void setOptions(String[] options) throws Exception(Code) | | Parses a given list of options.
Valid options are:
-R <col1,col2-col4,...>
Specifies list of columns to Discretize. First and last are valid indexes.
(default none)
-V
Invert matching sense of column indexes.
-D
Output binary attributes for discretized attributes.
-E
Use better encoding of split point for MDL.
-K
Use Kononenko's MDL criterion.
Parameters: options - the list of options as an array of strings throws: Exception - if an option is not supported |
setOutputFormat | protected void setOutputFormat()(Code) | | Set the output format. Takes the currently defined cutpoints and
m_InputFormat and calls setOutputFormat(Instances) appropriately.
|
setUseBetterEncoding | public void setUseBetterEncoding(boolean useBetterEncoding)(Code) | | Sets whether better encoding is to be used for MDL.
Parameters: useBetterEncoding - true if better encoding to be used. |
setUseKononenko | public void setUseKononenko(boolean useKon)(Code) | | Sets whether Kononenko's MDL criterion is to be used.
Parameters: useKon - true if Kononenko's one is to be used |
useBetterEncodingTipText | public String useBetterEncodingTipText()(Code) | | Returns the tip text for this property
tip text for this property suitable for displaying in the explorer/experimenter gui |
useKononenkoTipText | public String useKononenkoTipText()(Code) | | Returns the tip text for this property
tip text for this property suitable for displaying in the explorer/experimenter gui |
Methods inherited from weka.filters.Filter |
public static void batchFilterFile(Filter filter, String[] options) throws Exception
public boolean batchFinished() throws Exception
protected void bufferInput(Instance instance)
protected void copyValues(Instance instance, boolean isInput)
protected void copyValues(Instance instance, boolean instSrcCompat, Instances srcDataset, Instances destDataset)
public static void filterFile(Filter filter, String[] options) throws Exception
protected void flushInput()
public Capabilities getCapabilities()
public Capabilities getCapabilities(Instances data)
protected Instances getInputFormat()
public Instances getOutputFormat()
protected void initInputLocators(Instances data, int[] indices)
protected void initOutputLocators(Instances data, int[] indices)
public boolean input(Instance instance) throws Exception
protected Instances inputFormatPeek()
public boolean isFirstBatchDone()
public boolean isNewBatch()
public boolean isOutputFormatDefined()
public static void main(String[] args)
public static Filter[] makeCopies(Filter model, int num) throws Exception
public static Filter makeCopy(Filter model) throws Exception
public int numPendingOutput()
public Instance output()
protected Instances outputFormatPeek()
public Instance outputPeek()
protected void push(Instance instance)
protected void resetQueue()
protected static void runFilter(Filter filter, String[] options)
public boolean setInputFormat(Instances instanceInfo) throws Exception
protected void setOutputFormat(Instances outputFormat)
protected void testInputFormat(Instances instanceInfo) throws Exception
public static Instances useFilter(Instances data, Filter filter) throws Exception
|
|
|