Source code for SVMAttributeEval.java (Science » weka » weka.attributeSelection)

/*
 *    This program is free software; you can redistribute it and/or modify
 *    it under the terms of the GNU General Public License as published by
 *    the Free Software Foundation; either version 2 of the License, or
 *    (at your option) any later version.
 *
 *    This program is distributed in the hope that it will be useful,
 *    but WITHOUT ANY WARRANTY; without even the implied warranty of
 *    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *    GNU General Public License for more details.
 *
 *    You should have received a copy of the GNU General Public License
 *    along with this program; if not, write to the Free Software
 *    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */

/*
 *    SVMAttributeEval.java
 *    Copyright (C) 2002 University of Waikato, Hamilton, New Zealand
 *
 */

package weka.attributeSelection;

import weka.classifiers.functions.SMO;
import weka.core.Capabilities;
import weka.core.Instances;
import weka.core.Option;
import weka.core.OptionHandler;
import weka.core.SelectedTag;
import weka.core.TechnicalInformation;
import weka.core.TechnicalInformationHandler;
import weka.core.Utils;
import weka.core.Capabilities.Capability;
import weka.core.TechnicalInformation.Field;
import weka.core.TechnicalInformation.Type;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.MakeIndicator;

import java.util.ArrayList;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.Vector;

/**
 <!-- globalinfo-start -->
 * SVMAttributeEval :<br/>
 * <br/>
 * Evaluates the worth of an attribute by using an SVM classifier. Attributes are ranked by the square of the weight assigned by the SVM. Attribute selection for multiclass problems is handled by ranking attributes for each class separately using a one-vs-all method and then "dealing" from the top of each pile to give a final ranking.<br/>
 * <br/>
 * For more information see:<br/>
 * <br/>
 * I. Guyon, J. Weston, S. Barnhill, V. Vapnik (2002). Gene selection for cancer classification using support vector machines. Machine Learning. 46:389-422.
 * <p/>
 <!-- globalinfo-end -->
 *
 <!-- technical-bibtex-start -->
 * BibTeX:
 * <pre>
 * &#64;article{Guyon2002,
 *    author = {I. Guyon and J. Weston and S. Barnhill and V. Vapnik},
 *    journal = {Machine Learning},
 *    pages = {389-422},
 *    title = {Gene selection for cancer classification using support vector machines},
 *    volume = {46},
 *    year = {2002}
 * }
 * </pre>
 * <p/>
 <!-- technical-bibtex-end -->
 *
 <!-- options-start -->
 * Valid options are: <p/>
 *
 * <pre> -X &lt;constant rate of elimination&gt;
 *  Specify the constant rate of attribute
 *  elimination per invocation of
 *  the support vector machine.
 *  Default = 1.</pre>
 *
 * <pre> -Y &lt;percent rate of elimination&gt;
 *  Specify the percentage of attributes to
 *  eliminate per invocation of
 *  the support vector machine.
 *  Trumps constant rate (above threshold).
 *  Default = 0.</pre>
 *
 * <pre> -Z &lt;threshold for percent elimination&gt;
 *  Specify the threshold below which
 *  percentage attribute elimination
 *  reverts to the constant method.</pre>
 *
 * <pre> -P &lt;epsilon&gt;
 *  Specify the value of P (epsilon
 *  parameter) to pass on to the
 *  support vector machine.
 *  Default = 1.0e-25</pre>
 *
 * <pre> -T &lt;tolerance&gt;
 *  Specify the value of T (tolerance
 *  parameter) to pass on to the
 *  support vector machine.
 *  Default = 1.0e-10</pre>
 *
 * <pre> -C &lt;complexity&gt;
 *  Specify the value of C (complexity
 *  parameter) to pass on to the
 *  support vector machine.
 *  Default = 1.0</pre>
 *
 * <pre> -N
 *  Whether the SVM should 0=normalize/1=standardize/2=neither.
 *  (default 0=normalize)</pre>
 *
 <!-- options-end -->
 *
 * @author Eibe Frank (eibe@cs.waikato.ac.nz)
 * @author Mark Hall (mhall@cs.waikato.ac.nz)
 * @author Kieran Holland
 * @version $Revision: 1.26 $
 */
public class SVMAttributeEval extends AttributeEvaluator implements
        OptionHandler, TechnicalInformationHandler {

    /** for serialization */
    static final long serialVersionUID = -6489975709033967447L;

    /** The attribute scores */
    private double[] m_attScores;

    /** Constant rate of attribute elimination per iteration */
    private int m_numToEliminate = 1;

    /** Percentage rate of attribute elimination, trumps constant
        rate (above threshold), ignored if = 0  */
    private int m_percentToEliminate = 0;

    /** Threshold below which percent elimination switches to
        constant elimination */
    private int m_percentThreshold = 0;

    /** Complexity parameter to pass on to SMO */
    private double m_smoCParameter = 1.0;

    /** Tolerance parameter to pass on to SMO */
    private double m_smoTParameter = 1.0e-10;

    /** Epsilon parameter to pass on to SMO */
    private double m_smoPParameter = 1.0e-25;

    /** Filter parameter to pass on to SMO */
    private int m_smoFilterType = 0;

    /**
     * Returns a string describing this attribute evaluator
     * @return a description of the evaluator suitable for
     * displaying in the explorer/experimenter gui
     */
    public String globalInfo() {
        return "SVMAttributeEval :\n\nEvaluates the worth of an attribute by "
                + "using an SVM classifier. Attributes are ranked by the square of the "
                + "weight assigned by the SVM. Attribute selection for multiclass "
                + "problems is handled by ranking attributes for each class separately "
                + "using a one-vs-all method and then \"dealing\" from the top of "
                + "each pile to give a final ranking.\n\n"
                + "For more information see:\n\n"
                + getTechnicalInformation().toString();
    }
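
/*
As a rough illustration of the rule stated in globalInfo() (attributes are
ranked by the square of the weight the SVM assigns), the sketch below sorts
attribute indices by squared weight. The class SquaredWeightRanking, its
method, and the sample weights are hypothetical and not part of Weka. It
assumes a linear SVM whose per-attribute weights are already available as a
double[], and it needs Java 8 or later.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Weka: orders attribute indices by squared weight.
final class SquaredWeightRanking {

    // Returns attribute indices sorted from smallest to largest squared weight,
    // so the first entries are the first candidates for elimination.
    static int[] ascendingBySquaredWeight(double[] weights) {
        Integer[] idx = new Integer[weights.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        Arrays.sort(idx, Comparator.comparingDouble(i -> weights[i] * weights[i]));
        int[] result = new int[idx.length];
        for (int i = 0; i < idx.length; i++) {
            result[i] = idx[i];
        }
        return result;
    }

    public static void main(String[] args) {
        double[] weights = {0.5, -2.0, 0.1, 1.5}; // made-up weights for four attributes
        // prints [2, 0, 3, 1]
        System.out.println(Arrays.toString(ascendingBySquaredWeight(weights)));
    }
}
```

Squaring makes the sign of a weight irrelevant: attribute 1 (weight -2.0)
ranks as the most important here even though its weight is negative.
*/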

    /**
     * Returns an instance of a TechnicalInformation object, containing
     * detailed information about the technical background of this class,
     * e.g., paper reference or book this class is based on.
     *
     * @return the technical information about this class
     */
    public TechnicalInformation getTechnicalInformation() {
        TechnicalInformation result;

        result = new TechnicalInformation(Type.ARTICLE);
        result.setValue(Field.AUTHOR,
                "I. Guyon and J. Weston and S. Barnhill and V. Vapnik");
        result.setValue(Field.YEAR, "2002");
        result.setValue(Field.TITLE,
                "Gene selection for cancer classification using support vector machines");
        result.setValue(Field.JOURNAL, "Machine Learning");
        result.setValue(Field.VOLUME, "46");
        result.setValue(Field.PAGES, "389-422");

        return result;
    }

    /**
     * Constructor
     */
    public SVMAttributeEval() {
        resetOptions();
    }

    /**
     * Returns an enumeration describing all the available options
     *
     * @return an enumeration of options
     */
    public Enumeration listOptions() {
        // one entry per option: X, Y, Z, P, T, C, N
        Vector newVector = new Vector(7);

        newVector.addElement(new Option(
                "\tSpecify the constant rate of attribute\n"
                        + "\telimination per invocation of\n"
                        + "\tthe support vector machine.\n"
                        + "\tDefault = 1.", "X", 1,
                "-X <constant rate of elimination>"));

        newVector.addElement(new Option(
                "\tSpecify the percentage of attributes to\n"
                        + "\teliminate per invocation of\n"
                        + "\tthe support vector machine.\n"
                        + "\tTrumps constant rate (above threshold).\n"
                        + "\tDefault = 0.", "Y", 1,
                "-Y <percent rate of elimination>"));

        newVector.addElement(new Option(
                "\tSpecify the threshold below which \n"
                        + "\tpercentage attribute elimination\n"
                        + "\treverts to the constant method.", "Z", 1,
                "-Z <threshold for percent elimination>"));

        newVector.addElement(new Option(
                "\tSpecify the value of P (epsilon\n"
                        + "\tparameter) to pass on to the\n"
                        + "\tsupport vector machine.\n"
                        + "\tDefault = 1.0e-25", "P", 1,
                "-P <epsilon>"));

        newVector.addElement(new Option(
                "\tSpecify the value of T (tolerance\n"
                        + "\tparameter) to pass on to the\n"
                        + "\tsupport vector machine.\n"
                        + "\tDefault = 1.0e-10", "T", 1,
                "-T <tolerance>"));

        newVector.addElement(new Option(
                "\tSpecify the value of C (complexity\n"
                        + "\tparameter) to pass on to the\n"
                        + "\tsupport vector machine.\n"
                        + "\tDefault = 1.0", "C", 1,
                "-C <complexity>"));

        newVector.addElement(new Option("\tWhether the SVM should "
                + "0=normalize/1=standardize/2=neither.\n"
                + "\t(default 0=normalize)", "N", 1, "-N"));

        return newVector.elements();
    }
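
/*
The -X, -Y and -Z options listed above interact: the percentage rate applies
while more attributes remain than the threshold, after which elimination falls
back to the constant rate. The sketch below simulates one possible schedule.
EliminationSchedule and its parameters are hypothetical and not part of Weka.
Two assumptions are made: the percentage is taken of the attributes still
remaining, and the last percentage step stops exactly at the threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: lists how many attributes are dropped
// at each SVM invocation for given -X, -Y and -Z style settings.
final class EliminationSchedule {

    static List<Integer> schedule(int numAttrs, int constantRate,
                                  int percentRate, int threshold) {
        List<Integer> steps = new ArrayList<>();
        int left = numAttrs;
        int constant = Math.max(1, constantRate); // at least one per step
        double pct = Math.max(0, Math.min(percentRate, 100)) / 100.0;
        while (left > 0) {
            int numToElim;
            if (pct > 0) {
                numToElim = Math.max(1, (int) (left * pct));
                if (left - numToElim <= threshold) {
                    // Assumption: land on the threshold, then use the constant rate.
                    pct = 0;
                    numToElim = Math.max(0, left - threshold);
                }
            } else {
                numToElim = Math.min(constant, left);
            }
            if (numToElim > 0) {
                steps.add(numToElim);
                left -= numToElim;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // 20 attributes, -X 1, -Y 50, -Z 5: prints [10, 5, 1, 1, 1, 1, 1]
        System.out.println(schedule(20, 1, 50, 5));
    }
}
```

With a percentage of 0 the schedule is purely constant, for example
schedule(5, 2, 0, 0) yields steps of 2, 2 and 1.
*/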

    /**
     * Parses a given list of options. <p/>
     *
     <!-- options-start -->
     * Valid options are: <p/>
     *
     * <pre> -X &lt;constant rate of elimination&gt;
     *  Specify the constant rate of attribute
     *  elimination per invocation of
     *  the support vector machine.
     *  Default = 1.</pre>
     *
     * <pre> -Y &lt;percent rate of elimination&gt;
     *  Specify the percentage of attributes to
     *  eliminate per invocation of
     *  the support vector machine.
     *  Trumps constant rate (above threshold).
     *  Default = 0.</pre>
     *
     * <pre> -Z &lt;threshold for percent elimination&gt;
     *  Specify the threshold below which
     *  percentage attribute elimination
     *  reverts to the constant method.</pre>
     *
     * <pre> -P &lt;epsilon&gt;
     *  Specify the value of P (epsilon
     *  parameter) to pass on to the
     *  support vector machine.
     *  Default = 1.0e-25</pre>
     *
     * <pre> -T &lt;tolerance&gt;
     *  Specify the value of T (tolerance
     *  parameter) to pass on to the
     *  support vector machine.
     *  Default = 1.0e-10</pre>
     *
     * <pre> -C &lt;complexity&gt;
     *  Specify the value of C (complexity
     *  parameter) to pass on to the
     *  support vector machine.
     *  Default = 1.0</pre>
     *
     * <pre> -N
     *  Whether the SVM should 0=normalize/1=standardize/2=neither.
     *  (default 0=normalize)</pre>
     *
     <!-- options-end -->
     *
     * @param options the list of options as an array of strings
     * @throws Exception if an error occurs
     */
    public void setOptions(String[] options) throws Exception {
        String optionString;

        optionString = Utils.getOption('X', options);
        if (optionString.length() != 0) {
            setAttsToEliminatePerIteration(Integer.parseInt(optionString));
        }

        optionString = Utils.getOption('Y', options);
        if (optionString.length() != 0) {
            setPercentToEliminatePerIteration(Integer.parseInt(optionString));
        }

        optionString = Utils.getOption('Z', options);
        if (optionString.length() != 0) {
            setPercentThreshold(Integer.parseInt(optionString));
        }

        optionString = Utils.getOption('P', options);
        if (optionString.length() != 0) {
            setEpsilonParameter(Double.parseDouble(optionString));
        }

        optionString = Utils.getOption('T', options);
        if (optionString.length() != 0) {
            setToleranceParameter(Double.parseDouble(optionString));
        }

        optionString = Utils.getOption('C', options);
        if (optionString.length() != 0) {
            setComplexityParameter(Double.parseDouble(optionString));
        }

        // -N falls back to normalization when it is not given
        optionString = Utils.getOption('N', options);
        if (optionString.length() != 0) {
            setFilterType(new SelectedTag(Integer.parseInt(optionString),
                    SMO.TAGS_FILTER));
        } else {
            setFilterType(new SelectedTag(SMO.FILTER_NORMALIZE,
                    SMO.TAGS_FILTER));
        }

        Utils.checkForRemainingOptions(options);
    }

    /**
     * Gets the current settings of SVMAttributeEval
     *
     * @return an array of strings suitable for passing to setOptions()
     */
    public String[] getOptions() {
        // seven flags, each followed by its value
        String[] options = new String[14];
        int current = 0;

        options[current++] = "-X";
        options[current++] = "" + getAttsToEliminatePerIteration();

        options[current++] = "-Y";
        options[current++] = "" + getPercentToEliminatePerIteration();

        options[current++] = "-Z";
        options[current++] = "" + getPercentThreshold();

        options[current++] = "-P";
        options[current++] = "" + getEpsilonParameter();

        options[current++] = "-T";
        options[current++] = "" + getToleranceParameter();

        options[current++] = "-C";
        options[current++] = "" + getComplexityParameter();

        options[current++] = "-N";
        options[current++] = "" + m_smoFilterType;

        while (current < options.length) {
            options[current++] = "";
        }

        return options;
    }
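
/*
setOptions() treats an empty string from Utils.getOption as "flag not given"
and ends with Utils.checkForRemainingOptions. The sketch below shows one way
such a flag lookup could behave. OptionLookupSketch and both of its methods
are hypothetical stand-ins, not Weka's implementation, and the blanking of
consumed entries is an assumption made for this illustration.

```java
import java.util.Arrays;

// Hypothetical stand-in for an option lookup; this is not Weka's Utils class.
final class OptionLookupSketch {

    // Returns the value that follows "-<flag>" and blanks both entries so a
    // later leftover check can spot unconsumed options. Returns "" if absent.
    static String getOption(char flag, String[] options) {
        String wanted = "-" + flag;
        for (int i = 0; i + 1 < options.length; i++) {
            if (options[i].equals(wanted)) {
                String value = options[i + 1];
                options[i] = "";
                options[i + 1] = "";
                return value;
            }
        }
        return "";
    }

    // True if any non-empty entry was never consumed.
    static boolean hasRemaining(String[] options) {
        for (String option : options) {
            if (!option.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] options = {"-X", "5", "-N", "2"};
        System.out.println(getOption('X', options));   // prints 5
        System.out.println(Arrays.toString(options));  // prints [, , -N, 2]
        System.out.println(hasRemaining(options));     // prints true
    }
}
```

An array shaped like the one getOptions() returns (flag, value, flag, value)
can be consumed flag by flag this way until nothing remains.
*/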

    //________________________________________________________________________

    /**
     * Returns a tip text for this property suitable for display in the
     * GUI
     *
     * @return tip text string describing this property
     */
    public String attsToEliminatePerIterationTipText() {
        return "Constant rate of attribute elimination.";
    }

    /**
     * Returns a tip text for this property suitable for display in the
     * GUI
     *
     * @return tip text string describing this property
     */
    public String percentToEliminatePerIterationTipText() {
        return "Percent rate of attribute elimination.";
    }

    /**
     * Returns a tip text for this property suitable for display in the
     * GUI
     *
     * @return tip text string describing this property
     */
    public String percentThresholdTipText() {
        return "Threshold below which percent elimination reverts to constant elimination.";
    }

    /**
     * Returns a tip text for this property suitable for display in the
     * GUI
     *
     * @return tip text string describing this property
     */
    public String epsilonParameterTipText() {
        return "P epsilon parameter to pass to the SVM";
    }

    /**
     * Returns a tip text for this property suitable for display in the
     * GUI
     *
     * @return tip text string describing this property
     */
    public String toleranceParameterTipText() {
        return "T tolerance parameter to pass to the SVM";
    }

    /**
     * Returns a tip text for this property suitable for display in the
     * GUI
     *
     * @return tip text string describing this property
     */
    public String complexityParameterTipText() {
        return "C complexity parameter to pass to the SVM";
    }

    /**
     * Returns a tip text for this property suitable for display in the
     * GUI
     *
     * @return tip text string describing this property
     */
    public String filterTypeTipText() {
        return "filtering used by the SVM";
    }

    //________________________________________________________________________

    /**
     * Set the constant rate of attribute elimination per iteration
     *
     * @param cRate the constant rate of attribute elimination per iteration
     */
    public void setAttsToEliminatePerIteration(int cRate) {
        m_numToEliminate = cRate;
    }

    /**
     * Get the constant rate of attribute elimination per iteration
     *
     * @return the constant rate of attribute elimination per iteration
     */
    public int getAttsToEliminatePerIteration() {
        return m_numToEliminate;
    }

    /**
     * Set the percentage of attributes to eliminate per iteration
     *
     * @param pRate percent of attributes to eliminate per iteration
     */
    public void setPercentToEliminatePerIteration(int pRate) {
        m_percentToEliminate = pRate;
    }

    /**
     * Get the percentage rate of attribute elimination per iteration
     *
     * @return the percentage rate of attribute elimination per iteration
     */
    public int getPercentToEliminatePerIteration() {
        return m_percentToEliminate;
    }

    /**
     * Set the threshold below which percentage elimination reverts to
     * constant elimination.
     *
     * @param pThresh the number of remaining attributes below which
     * percentage elimination reverts to constant elimination
     */
    public void setPercentThreshold(int pThresh) {
        m_percentThreshold = pThresh;
    }

    /**
     * Get the threshold below which percentage elimination reverts to
     * constant elimination.
     *
     * @return the threshold below which percentage elimination stops
     */
    public int getPercentThreshold() {
        return m_percentThreshold;
    }

    /**
     * Set the value of P for SMO
     *
     * @param svmP the value of P
     */
    public void setEpsilonParameter(double svmP) {
        m_smoPParameter = svmP;
    }

    /**
     * Get the value of P used with SMO
     *
     * @return the value of P
     */
    public double getEpsilonParameter() {
        return m_smoPParameter;
    }

    /**
     * Set the value of T for SMO
     *
     * @param svmT the value of T
     */
    public void setToleranceParameter(double svmT) {
        m_smoTParameter = svmT;
    }

    /**
     * Get the value of T used with SMO
     *
     * @return the value of T
     */
    public double getToleranceParameter() {
        return m_smoTParameter;
    }

    /**
     * Set the value of C for SMO
     *
     * @param svmC the value of C
     */
    public void setComplexityParameter(double svmC) {
        m_smoCParameter = svmC;
    }

    /**
     * Get the value of C used with SMO
     *
     * @return the value of C
     */
    public double getComplexityParameter() {
        return m_smoCParameter;
    }

    /**
     * The filtering mode to pass to SMO
     *
     * @param newType the new filtering mode
     */
    public void setFilterType(SelectedTag newType) {

        // ignore tags that do not belong to SMO's filter tag set
        if (newType.getTags() == SMO.TAGS_FILTER) {
            m_smoFilterType = newType.getSelectedTag().getID();
        }
    }

    /**
     * Get the filtering mode passed to SMO
     *
     * @return the filtering mode
     */
    public SelectedTag getFilterType() {

        return new SelectedTag(m_smoFilterType, SMO.TAGS_FILTER);
    }

    //________________________________________________________________________

    /**
     * Returns the capabilities of this evaluator.
     *
     * @return            the capabilities of this evaluator
     * @see               Capabilities
     */
    public Capabilities getCapabilities() {
        Capabilities result;

        result = new SMO().getCapabilities();

        result.setOwner(this);

        // only binary attributes are allowed, otherwise the NominalToBinary
        // filter inside SMO will increase the number of attributes which in turn
        // will lead to ArrayIndexOutOfBounds-Exceptions.
        result.disable(Capability.NOMINAL_ATTRIBUTES);
        result.enable(Capability.BINARY_ATTRIBUTES);
        result.disableAllAttributeDependencies();

        return result;
    }

    /**
     * Initializes the evaluator.
     *
     * @param data set of instances serving as training data
     * @throws Exception if the evaluator has not been
     * generated successfully
     */
    public void buildEvaluator(Instances data) throws Exception {
        // can evaluator handle data?
        getCapabilities().testWithFail(data);

        //System.out.println("Class attribute: " + data.attribute(data.classIndex()).name());
        // Check settings: clamp each option into its valid range
        m_numToEliminate = Math.max(m_numToEliminate, 1);
        m_percentToEliminate = Math.max(0, Math.min(m_percentToEliminate, 100));
        m_percentThreshold = Math.max(0,
                Math.min(m_percentThreshold, data.numAttributes() - 1));

        // Get ranked attributes for each class separately, one-vs-all
        int[][] attScoresByClass;
        int numAttr = data.numAttributes() - 1;
        if (data.numClasses() > 2) {
            attScoresByClass = new int[data.numClasses()][numAttr];
            for (int i = 0; i < data.numClasses(); i++) {
                attScoresByClass[i] = rankBySVM(i, data);
            }
        } else {
            // two-class problems need only a single ranking
            attScoresByClass = new int[1][numAttr];
            attScoresByClass[0] = rankBySVM(0, data);
        }

        // Cycle through class-specific ranked lists, popping the top one off for each class
        // and adding it to the overall ranked attribute list if it's not there already
        ArrayList<Integer> ordered = new ArrayList<Integer>(numAttr);
        for (int i = 0; i < numAttr; i++) {
            for (int j = 0; j < attScoresByClass.length; j++) {
                Integer rank = Integer.valueOf(attScoresByClass[j][i]);
                if (!ordered.contains(rank))
                    ordered.add(rank);
            }
        }

        // Best-ranked attribute gets score numAttr, the next numAttr - 1, and so on
        m_attScores = new double[data.numAttributes()];
        double score = (double) numAttr;
        for (Integer attIndex : ordered) {
            m_attScores[attIndex.intValue()] = score;
            score = score - 1.0;
        }
    }

    /**
     * Get SVM-ranked attribute indexes (best to worst) selected for
     * the class attribute indexed by classInd (one-vs-all).
     *
     * @param classInd index of the class value treated as the positive class
     * @param data the training instances
     * @return attribute indexes ordered from best to worst for this class
     */
    private int[] rankBySVM(int classInd, Instances data) {
        // Holds a mapping into the original array of attribute indices
        int[] origIndices = new int[data.numAttributes()];
        for (int i = 0; i < origIndices.length; i++)
            origIndices[i] = i;

        // Countdown of the number of attributes remaining
        int numAttrLeft = data.numAttributes() - 1;
        // Ranked attribute indices for this class, one-vs-all (highest->lowest)
        int[] attRanks = new int[numAttrLeft];

        try {
            MakeIndicator filter = new MakeIndicator();
            filter.setAttributeIndex("" + (data.classIndex() + 1));
            filter.setNumeric(false);
            filter.setValueIndex(classInd);
            filter.setInputFormat(data);
            Instances trainCopy = Filter.useFilter(data, filter);
            double pctToElim = ((double) m_percentToEliminate) / 100.0;
            while (numAttrLeft > 0) {
                int numToElim;
                if (pctToElim > 0) {
                    // Percentage mode: eliminate a fraction of the attributes, at least one
                    numToElim = (int) (trainCopy.numAttributes() * pctToElim);
                    numToElim = (numToElim > 1) ? numToElim : 1;
                    // Once the threshold is reached, switch to fixed-count mode for good
                    if (numAttrLeft - numToElim <= m_percentThreshold) {
                        pctToElim = 0;
                        numToElim = numAttrLeft - m_percentThreshold;
                    }
                } else {
                    // Fixed-count mode: eliminate m_numToEliminate, or whatever is left
                    numToElim = (numAttrLeft >= m_numToEliminate) ? m_numToEliminate
                            : numAttrLeft;
                }

                // Build the linear SVM (default kernel) with the configured
                // filter type, epsilon, tolerance and C
                SMO smo = new SMO();

                // SMO seems to get stuck if data not normalised when few attributes remain
                // smo.setNormalizeData(numAttrLeft < 40);
                smo.setFilterType(new SelectedTag(m_smoFilterType,
                        SMO.TAGS_FILTER));
                smo.setEpsilon(m_smoPParameter);
                smo.setToleranceParameter(m_smoTParameter);
                smo.setC(m_smoCParameter);
                smo.buildClassifier(trainCopy);

                // Compute weight^2 for each attribute; those with the smallest values are eliminated
                double[] weightsSparse = smo.sparseWeights()[0][1];
                int[] indicesSparse = smo.sparseIndices()[0][1];
                double[] weights = new double[trainCopy.numAttributes()];
                for (int j = 0; j < weightsSparse.length; j++) {
                    weights[indicesSparse[j]] = weightsSparse[j]
                            * weightsSparse[j];
                }
                // Never select the class attribute for elimination
                weights[trainCopy.classIndex()] = Double.MAX_VALUE;
                int minWeightIndex;
                int[] featArray = new int[numToElim];
                boolean[] eliminated = new boolean[origIndices.length];
                for (int j = 0; j < numToElim; j++) {
                    minWeightIndex = Utils.minIndex(weights);
                    attRanks[--numAttrLeft] = origIndices[minWeightIndex];
                    featArray[j] = minWeightIndex;
                    eliminated[minWeightIndex] = true;
                    weights[minWeightIndex] = Double.MAX_VALUE;
                }

                // Delete the worst attributes.
                weka.filters.unsupervised.attribute.Remove delTransform = new weka.filters.unsupervised.attribute.Remove();
                delTransform.setInvertSelection(false);
                delTransform.setAttributeIndicesArray(featArray);
                delTransform.setInputFormat(trainCopy);
                trainCopy = Filter.useFilter(trainCopy, delTransform);

                // Update the array of remaining attribute indices
                int[] temp = new int[origIndices.length - numToElim];
                int k = 0;
                for (int j = 0; j < origIndices.length; j++) {
                    if (!eliminated[j]) {
                        temp[k++] = origIndices[j];
                    }
                }
                origIndices = temp;
            }
        } catch (Exception e) {
            // Exceptions are only logged, not rethrown; attRanks may be
            // incompletely filled if one occurs
            e.printStackTrace();
        }
        return attRanks;
    }

    /**
     * Resets the evaluator by discarding any previously computed attribute scores.
     */
    protected void resetOptions() {
        m_attScores = null;
    }

    /**
     * Evaluates an attribute by returning the rank of the square of its coefficient in a
     * linear support vector machine.
     *
     * @param attribute the index of the attribute to be evaluated
     * @return the attribute's score; higher values indicate a better rank
     * @throws Exception if the attribute could not be evaluated
     */
    public double evaluateAttribute(int attribute) throws Exception {
        return m_attScores[attribute];
    }

    /**
     * Return a description of the evaluator
     * @return description as a string
     */
    public String toString() {

        StringBuffer text = new StringBuffer();
        if (m_attScores == null) {
            text.append("\tSVM feature evaluator has not been built yet");
        } else {
            text.append("\tSVM feature evaluator");
        }

        text.append("\n");
        return text.toString();
    }

    /**
     * Main method for testing this class.
     *
     * @param args the options
     */
    public static void main(String[] args) {
        runEvaluator(new SVMAttributeEval(), args);
    }
}
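
The way buildEvaluator combines the per-class rankings into one score array can be hard to follow inside the full class. The stand-alone sketch below reproduces just that step without any Weka dependency. The class name RankMergeSketch and the method mergeRanks are hypothetical and not part of Weka. The sketch also assumes attribute indices run from 0 to numAttr - 1 with no class attribute in between, whereas the original sizes its score array to include the class attribute.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch of the rank-merging step; not part of Weka. */
class RankMergeSketch {

    /**
     * Merges per-class rankings round-robin and converts the merged order
     * into scores. Each row of ranksByClass lists attribute indices, best first.
     * Assumes every row has numAttr entries with indices in 0..numAttr-1.
     */
    static double[] mergeRanks(int[][] ranksByClass, int numAttr) {
        List<Integer> ordered = new ArrayList<>(numAttr);
        // Visit position 0 of every class list, then position 1, and so on,
        // keeping only the first occurrence of each attribute index.
        for (int pos = 0; pos < numAttr; pos++) {
            for (int[] classRanks : ranksByClass) {
                Integer att = classRanks[pos];
                if (!ordered.contains(att)) {
                    ordered.add(att);
                }
            }
        }
        // The first merged attribute gets score numAttr, the next numAttr - 1, ...
        double[] scores = new double[numAttr];
        double score = numAttr;
        for (int att : ordered) {
            scores[att] = score;
            score -= 1.0;
        }
        return scores;
    }

    public static void main(String[] args) {
        // Three classes, three attributes; each row is one class's ranking.
        int[][] ranksByClass = { {2, 0, 1}, {1, 2, 0}, {2, 1, 0} };
        double[] scores = mergeRanks(ranksByClass, 3);
        System.out.println(java.util.Arrays.toString(scores)); // prints [1.0, 2.0, 3.0]
    }
}
```

In the example, attribute 2 tops two of the three class lists and is merged first, so it receives the highest score. Like the original, the sketch uses a linear contains check on the list, which is quadratic in the number of attributes. A LinkedHashSet would give the same first-seen order with constant-time lookups.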