Source Code Cross Referenced for AdditiveRegression.java in » Science » weka » weka.classifiers.meta


/*
 *    This program is free software; you can redistribute it and/or modify
 *    it under the terms of the GNU General Public License as published by
 *    the Free Software Foundation; either version 2 of the License, or
 *    (at your option) any later version.
 *
 *    This program is distributed in the hope that it will be useful,
 *    but WITHOUT ANY WARRANTY; without even the implied warranty of
 *    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *    GNU General Public License for more details.
 *
 *    You should have received a copy of the GNU General Public License
 *    along with this program; if not, write to the Free Software
 *    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */

/*
 *    AdditiveRegression.java
 *    Copyright (C) 2000 University of Waikato, Hamilton, New Zealand
 *
 */

package weka.classifiers.meta;

import weka.classifiers.Classifier;
import weka.classifiers.IteratedSingleClassifierEnhancer;
import weka.classifiers.rules.ZeroR;
import weka.core.AdditionalMeasureProducer;
import weka.core.Capabilities;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.Option;
import weka.core.OptionHandler;
import weka.core.TechnicalInformation;
import weka.core.TechnicalInformationHandler;
import weka.core.Utils;
import weka.core.WeightedInstancesHandler;
import weka.core.Capabilities.Capability;
import weka.core.TechnicalInformation.Field;
import weka.core.TechnicalInformation.Type;

import java.util.Enumeration;
import java.util.Vector;
/**
 <!-- globalinfo-start -->
 * Meta classifier that enhances the performance of a regression base classifier. Each iteration fits a model to the residuals left by the classifier on the previous iteration. Prediction is accomplished by adding the predictions of each classifier. Reducing the shrinkage (learning rate) parameter helps prevent overfitting and has a smoothing effect but increases the learning time.<br/>
 * <br/>
 * For more information see:<br/>
 * <br/>
 * J.H. Friedman (1999). Stochastic Gradient Boosting.
 * <p/>
 <!-- globalinfo-end -->
 *
 <!-- technical-bibtex-start -->
 * BibTeX:
 * <pre>
 * &#64;techreport{Friedman1999,
 *    author = {J.H. Friedman},
 *    institution = {Stanford University},
 *    title = {Stochastic Gradient Boosting},
 *    year = {1999},
 *    PS = {http://www-stat.stanford.edu/~jhf/ftp/stobst.ps}
 * }
 * </pre>
 * <p/>
 <!-- technical-bibtex-end -->
 *
 <!-- options-start -->
 * Valid options are: <p/>
 *
 * <pre> -S
 *  Specify shrinkage rate. (default = 1.0, ie. no shrinkage)
 * </pre>
 *
 * <pre> -I &lt;num&gt;
 *  Number of iterations.
 *  (default 10)</pre>
 *
 * <pre> -D
 *  If set, classifier is run in debug mode and
 *  may output additional info to the console</pre>
 *
 * <pre> -W
 *  Full name of base classifier.
 *  (default: weka.classifiers.trees.DecisionStump)</pre>
 *
 * <pre>
 * Options specific to classifier weka.classifiers.trees.DecisionStump:
 * </pre>
 *
 * <pre> -D
 *  If set, classifier is run in debug mode and
 *  may output additional info to the console</pre>
 *
 <!-- options-end -->
 *
 * @author Mark Hall (mhall@cs.waikato.ac.nz)
 * @version $Revision: 1.23 $
 */
public class AdditiveRegression extends
        IteratedSingleClassifierEnhancer implements OptionHandler,
        AdditionalMeasureProducer, WeightedInstancesHandler,
        TechnicalInformationHandler {

    /** for serialization */
    static final long serialVersionUID = -2368937577670527151L;

    /**
     * Shrinkage (learning rate). Default = no shrinkage.
     */
    protected double m_shrinkage = 1.0;

    /** The number of successfully generated base classifiers. */
    protected int m_NumIterationsPerformed;

    /** The model for the mean */
    protected ZeroR m_zeroR;

    /** whether we have suitable data or not (if not, the ZeroR model is used) */
    protected boolean m_SuitableData = true;
    /**
     * Returns a string describing this classifier
     * @return a description of the classifier suitable for
     * displaying in the explorer/experimenter gui
     */
    public String globalInfo() {
        return "Meta classifier that enhances the performance of a regression "
                + "base classifier. Each iteration fits a model to the residuals left "
                + "by the classifier on the previous iteration. Prediction is "
                + "accomplished by adding the predictions of each classifier. "
                + "Reducing the shrinkage (learning rate) parameter helps prevent "
                + "overfitting and has a smoothing effect but increases the learning "
                + "time.\n\n"
                + "For more information see:\n\n"
                + getTechnicalInformation().toString();
    }
    /**
     * Returns an instance of a TechnicalInformation object, containing
     * detailed information about the technical background of this class,
     * e.g., paper reference or book this class is based on.
     *
     * @return the technical information about this class
     */
    public TechnicalInformation getTechnicalInformation() {
        TechnicalInformation result;

        result = new TechnicalInformation(Type.TECHREPORT);
        result.setValue(Field.AUTHOR, "J.H. Friedman");
        result.setValue(Field.YEAR, "1999");
        result.setValue(Field.TITLE, "Stochastic Gradient Boosting");
        result.setValue(Field.INSTITUTION, "Stanford University");
        result.setValue(Field.PS,
                "http://www-stat.stanford.edu/~jhf/ftp/stobst.ps");

        return result;
    }

    /**
     * Default constructor specifying DecisionStump as the classifier
     */
    public AdditiveRegression() {

        this(new weka.classifiers.trees.DecisionStump());
    }

    /**
     * Constructor which takes the base classifier as argument.
     *
     * @param classifier the base classifier to use
     */
    public AdditiveRegression(Classifier classifier) {

        m_Classifier = classifier;
    }

    /**
     * String describing default classifier.
     *
     * @return the default classifier classname
     */
    protected String defaultClassifierString() {

        return "weka.classifiers.trees.DecisionStump";
    }

    /**
     * Returns an enumeration describing the available options.
     *
     * @return an enumeration of all the available options.
     */
    public Enumeration listOptions() {

        Vector newVector = new Vector(4);

        newVector.addElement(new Option("\tSpecify shrinkage rate. "
                + "(default = 1.0, ie. no shrinkage)\n", "S", 1, "-S"));

        Enumeration enu = super.listOptions();
        while (enu.hasMoreElements()) {
            newVector.addElement(enu.nextElement());
        }
        return newVector.elements();
    }
    /**
     * Parses a given list of options. <p/>
     *
     <!-- options-start -->
     * Valid options are: <p/>
     *
     * <pre> -S
     *  Specify shrinkage rate. (default = 1.0, ie. no shrinkage)
     * </pre>
     *
     * <pre> -I &lt;num&gt;
     *  Number of iterations.
     *  (default 10)</pre>
     *
     * <pre> -D
     *  If set, classifier is run in debug mode and
     *  may output additional info to the console</pre>
     *
     * <pre> -W
     *  Full name of base classifier.
     *  (default: weka.classifiers.trees.DecisionStump)</pre>
     *
     * <pre>
     * Options specific to classifier weka.classifiers.trees.DecisionStump:
     * </pre>
     *
     * <pre> -D
     *  If set, classifier is run in debug mode and
     *  may output additional info to the console</pre>
     *
     <!-- options-end -->
     *
     * @param options the list of options as an array of strings
     * @throws Exception if an option is not supported
     */
    public void setOptions(String[] options) throws Exception {

        String optionString = Utils.getOption('S', options);
        if (optionString.length() != 0) {
            Double temp = Double.valueOf(optionString);
            setShrinkage(temp.doubleValue());
        }

        super.setOptions(options);
    }
    /**
     * Gets the current settings of the Classifier.
     *
     * @return an array of strings suitable for passing to setOptions
     */
    public String[] getOptions() {

        String[] superOptions = super.getOptions();
        String[] options = new String[superOptions.length + 2];
        int current = 0;

        options[current++] = "-S";
        options[current++] = "" + getShrinkage();

        System.arraycopy(superOptions, 0, options, current,
                superOptions.length);

        current += superOptions.length;
        while (current < options.length) {
            options[current++] = "";
        }
        return options;
    }
    /**
     * Returns the tip text for this property
     * @return tip text for this property suitable for
     * displaying in the explorer/experimenter gui
     */
    public String shrinkageTipText() {
        return "Shrinkage rate. Smaller values help prevent overfitting and "
                + "have a smoothing effect (but increase learning time). "
                + "Default = 1.0, ie. no shrinkage.";
    }

    /**
     * Set the shrinkage parameter
     *
     * @param l the shrinkage rate.
     */
    public void setShrinkage(double l) {
        m_shrinkage = l;
    }

    /**
     * Get the shrinkage rate.
     *
     * @return the value of the learning rate
     */
    public double getShrinkage() {
        return m_shrinkage;
    }

    /**
     * Returns default capabilities of the classifier.
     *
     * @return      the capabilities of this classifier
     */
    public Capabilities getCapabilities() {
        Capabilities result = super.getCapabilities();

        // class
        result.disableAllClasses();
        result.disableAllClassDependencies();
        result.enable(Capability.NUMERIC_CLASS);
        result.enable(Capability.DATE_CLASS);

        return result;
    }
    /**
     * Build the classifier on the supplied data
     *
     * @param data the training data
     * @throws Exception if the classifier could not be built successfully
     */
    public void buildClassifier(Instances data) throws Exception {

        super.buildClassifier(data);

        // can classifier handle the data?
        getCapabilities().testWithFail(data);

        // remove instances with missing class
        Instances newData = new Instances(data);
        newData.deleteWithMissingClass();

        double sum = 0;
        double temp_sum = 0;
        // Add the model for the mean first
        m_zeroR = new ZeroR();
        m_zeroR.buildClassifier(newData);

        // only class? -> use only ZeroR model
        if (newData.numAttributes() == 1) {
            System.err.println("Cannot build model (only class attribute present in data!), "
                    + "using ZeroR model instead!");
            m_SuitableData = false;
            return;
        } else {
            m_SuitableData = true;
        }

        newData = residualReplace(newData, m_zeroR, false);
        for (int i = 0; i < newData.numInstances(); i++) {
            sum += newData.instance(i).weight()
                    * newData.instance(i).classValue()
                    * newData.instance(i).classValue();
        }
        if (m_Debug) {
            System.err.println("Sum of squared residuals "
                    + "(predicting the mean) : " + sum);
        }

        m_NumIterationsPerformed = 0;
        do {
            temp_sum = sum;

            // Build the classifier
            m_Classifiers[m_NumIterationsPerformed]
                    .buildClassifier(newData);

            newData = residualReplace(newData,
                    m_Classifiers[m_NumIterationsPerformed], true);
            sum = 0;
            for (int i = 0; i < newData.numInstances(); i++) {
                sum += newData.instance(i).weight()
                        * newData.instance(i).classValue()
                        * newData.instance(i).classValue();
            }
            if (m_Debug) {
                System.err.println("Sum of squared residuals : " + sum);
            }
            m_NumIterationsPerformed++;
        } while (((temp_sum - sum) > Utils.SMALL)
                && (m_NumIterationsPerformed < m_Classifiers.length));
    }
    /**
     * Classify an instance.
     *
     * @param inst the instance to predict
     * @return a prediction for the instance
     * @throws Exception if an error occurs
     */
    public double classifyInstance(Instance inst) throws Exception {

        double prediction = m_zeroR.classifyInstance(inst);

        // default model?
        if (!m_SuitableData) {
            return prediction;
        }

        for (int i = 0; i < m_NumIterationsPerformed; i++) {
            double toAdd = m_Classifiers[i].classifyInstance(inst);
            toAdd *= getShrinkage();
            prediction += toAdd;
        }

        return prediction;
    }

    /**
     * Replace the class values of the instances from the current iteration
     * with residuals after predicting with the supplied classifier.
     *
     * @param data the instances to predict
     * @param c the classifier to use
     * @param useShrinkage whether shrinkage is to be applied to the model's output
     * @return a new set of instances with class values replaced by residuals
     * @throws Exception if something goes wrong
     */
    private Instances residualReplace(Instances data, Classifier c,
            boolean useShrinkage) throws Exception {
        double pred, residual;
        Instances newInst = new Instances(data);

        for (int i = 0; i < newInst.numInstances(); i++) {
            pred = c.classifyInstance(newInst.instance(i));
            if (useShrinkage) {
                pred *= getShrinkage();
            }
            residual = newInst.instance(i).classValue() - pred;
            newInst.instance(i).setClassValue(residual);
        }
        //    System.err.print(newInst);
        return newInst;
    }
    /**
     * Returns an enumeration of the additional measure names
     * @return an enumeration of the measure names
     */
    public Enumeration enumerateMeasures() {
        Vector newVector = new Vector(1);
        newVector.addElement("measureNumIterations");
        return newVector.elements();
    }

    /**
     * Returns the value of the named measure
     * @param additionalMeasureName the name of the measure to query for its value
     * @return the value of the named measure
     * @throws IllegalArgumentException if the named measure is not supported
     */
    public double getMeasure(String additionalMeasureName) {
        if (additionalMeasureName
                .compareToIgnoreCase("measureNumIterations") == 0) {
            return measureNumIterations();
        } else {
            throw new IllegalArgumentException(additionalMeasureName
                    + " not supported (AdditiveRegression)");
        }
    }

    /**
     * Return the number of iterations (base classifiers) completed
     * @return the number of iterations (same as number of base classifier
     * models)
     */
    public double measureNumIterations() {
        return m_NumIterationsPerformed;
    }

    /**
     * Returns textual description of the classifier.
     *
     * @return a description of the classifier as a string
     */
    public String toString() {
        StringBuffer text = new StringBuffer();

        // only ZeroR model?
        if (!m_SuitableData) {
            StringBuffer buf = new StringBuffer();
            buf.append(this.getClass().getName()
                    .replaceAll(".*\\.", "")
                    + "\n");
            buf.append(this.getClass().getName()
                    .replaceAll(".*\\.", "").replaceAll(".", "=")
                    + "\n\n");
            buf.append("Warning: No model could be built, hence ZeroR model is used:\n\n");
            buf.append(m_zeroR.toString());
            return buf.toString();
        }

        if (m_NumIterations == 0) {
            return "Classifier hasn't been built yet!";
        }

        text.append("Additive Regression\n\n");

        text.append("ZeroR model\n\n" + m_zeroR + "\n\n");

        text.append("Base classifier "
                + getClassifier().getClass().getName() + "\n\n");
        text.append("" + m_NumIterationsPerformed
                + " models generated.\n");

        for (int i = 0; i < m_NumIterationsPerformed; i++) {
            text.append("\nModel number " + i + "\n\n"
                    + m_Classifiers[i] + "\n");
        }

        return text.toString();
    }

    /**
     * Main method for testing this class.
     *
     * @param argv should contain the following arguments:
     * -t training file [-T test file] [-c class index]
     */
    public static void main(String[] argv) {
        runClassifier(new AdditiveRegression(), argv);
    }
}
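The loop in `buildClassifier`/`classifyInstance` above (fit a ZeroR-style mean model, then repeatedly fit the base learner to the residuals, shrink its output, and subtract) can be sketched as a small standalone program. This is a minimal sketch, not Weka code: the one-dimensional `fitStump` learner below is a hypothetical stand-in for `weka.classifiers.trees.DecisionStump`, and instance weights and the stopping test on the sum of squared residuals are omitted; only the residual-fitting structure mirrors the class above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleUnaryOperator;

public class AdditiveRegressionSketch {

    // Hypothetical base learner: a one-split regression stump on 1-D data,
    // choosing the split that minimizes squared error on the residuals.
    static DoubleUnaryOperator fitStump(double[] x, double[] r) {
        double bestErr = Double.POSITIVE_INFINITY;
        double bestSplit = 0, bestLeft = 0, bestRight = 0;
        for (double split : x) {
            double sumL = 0, sumR = 0;
            int nL = 0, nR = 0;
            for (int i = 0; i < x.length; i++) {
                if (x[i] <= split) { sumL += r[i]; nL++; } else { sumR += r[i]; nR++; }
            }
            double left = nL > 0 ? sumL / nL : 0;
            double right = nR > 0 ? sumR / nR : 0;
            double err = 0;
            for (int i = 0; i < x.length; i++) {
                double p = x[i] <= split ? left : right;
                err += (r[i] - p) * (r[i] - p);
            }
            if (err < bestErr) {
                bestErr = err; bestSplit = split; bestLeft = left; bestRight = right;
            }
        }
        final double s = bestSplit, l = bestLeft, rt = bestRight;
        return v -> v <= s ? l : rt;
    }

    // Mirrors buildClassifier: start from the mean (ZeroR), then fit each
    // base model to the current residuals and shrink its contribution.
    static DoubleUnaryOperator train(double[] x, double[] y,
                                     int iters, double shrinkage) {
        double mean = 0;
        for (double v : y) mean += v;
        mean /= y.length;

        double[] resid = new double[y.length];
        for (int i = 0; i < y.length; i++) resid[i] = y[i] - mean;

        List<DoubleUnaryOperator> models = new ArrayList<>();
        for (int it = 0; it < iters; it++) {
            DoubleUnaryOperator stump = fitStump(x, resid);
            models.add(stump);
            // residualReplace: subtract the shrunken prediction from the class value
            for (int i = 0; i < x.length; i++) {
                resid[i] -= shrinkage * stump.applyAsDouble(x[i]);
            }
        }

        // Mirrors classifyInstance: mean plus the shrunken sum of base models.
        final double m = mean;
        return v -> {
            double p = m;
            for (DoubleUnaryOperator mod : models) p += shrinkage * mod.applyAsDouble(v);
            return p;
        };
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 6};
        double[] y = {1, 1, 1, 5, 5, 5};   // a step function
        DoubleUnaryOperator f = train(x, y, 20, 0.5);
        System.out.println(f.applyAsDouble(2));  // close to 1
        System.out.println(f.applyAsDouble(5));  // close to 5
    }
}
```

With shrinkage 0.5 each iteration removes only half of the remaining residual, so the model needs several iterations to converge on this toy step function; that trade of slower learning for smoother fits is exactly what the `-S` option controls in the class above.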