public final class JapaneseCarrot2TokenizerFactory
extends java.lang.Object
implements org.carrot2.text.linguistic.ITokenizerFactory
A tokenizer that uses JapaneseTokenizer and RemoveDuplicatesTokenFilter. Using JaBuzzPhraseCarrot2TokenizerFactory instead is preferable.

Example configuration in solrconfig.xml:
<searchComponent name="clustering"
                 enable="${solr.clustering.enabled:true}"
                 class="solr.clustering.ClusteringComponent" >
  <lst name="engine">
    <str name="name">default</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
    <str name="carrot.lexicalResourcesDir">clustering/carrot2</str>
    <str name="PreprocessingPipeline.tokenizerFactory">com.rondhuit.solr.search.JapaneseCarrot2TokenizerFactory</str>
  </lst>
</searchComponent>
<requestHandler name="/clustering"
                startup="lazy"
                enable="${solr.clustering.enabled:true}"
                class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="clustering">true</bool>
    <str name="clustering.engine">default</str>
    <bool name="clustering.results">true</bool>
    <str name="carrot.title">title</str>
    <str name="carrot.url">id</str>
    <str name="carrot.snippet">statement</str>
    <bool name="carrot.produceSummary">true</bool>
    <bool name="carrot.outputSubClusters">true</bool>
    <str name="df">statement</str>
    <str name="rows">10</str>
    <str name="fl">title</str>
  </lst>
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>

| Constructor and Description |
|---|
| JapaneseCarrot2TokenizerFactory() |

| Modifier and Type | Method and Description |
|---|---|
| org.carrot2.text.analysis.ITokenizer | getTokenizer(org.carrot2.core.LanguageCode arg0) |
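
Once the `/clustering` request handler from the configuration example is registered, clustered results can be requested over HTTP. The following is a minimal sketch of building such a request URL using only the Java standard library. The base URL `http://localhost:8983/solr`, the core name `collection1`, the `wt=json` parameter, and the class and method names are illustrative assumptions, not part of the original documentation. Because the handler defaults set `df` to `statement`, a bare query term is searched in that field.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ClusteringRequestSketch {

    /**
     * Builds a URL for the "/clustering" request handler.
     * solrBaseUrl and core are hypothetical placeholders; adjust them to the actual deployment.
     */
    static String buildClusteringUrl(String solrBaseUrl, String core, String query) {
        // URL-encode the query so that Japanese text and spaces are transmitted safely.
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return solrBaseUrl + "/" + core + "/clustering?q=" + q + "&wt=json";
    }

    public static void main(String[] args) {
        // Assumed local Solr instance and core name, for illustration only.
        String url = buildClusteringUrl("http://localhost:8983/solr", "collection1", "solr clustering");
        System.out.println(url);
    }
}
```

The resulting URL can be opened in a browser or passed to any HTTP client. The handler's other defaults (`rows`, `fl`, `carrot.title`, and so on) apply unless overridden with additional request parameters.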
Copyright © 2009-2018 RONDHUIT Co.,Ltd. All Rights Reserved.