JapaneseCarrot2TokenizerFactory (RONDHUIT-solr-plugin 4.0.1 API)

java.lang.Object
- com.rondhuit.solr.search.JapaneseCarrot2TokenizerFactory

All Implemented Interfaces: org.carrot2.text.linguistic.ITokenizerFactory

public final class JapaneseCarrot2TokenizerFactory
extends java.lang.Object
implements org.carrot2.text.linguistic.ITokenizerFactory

A tokenizer factory for search result clustering with Carrot2, built on JapaneseTokenizer and RemoveDuplicatesTokenFilter.

In most cases, JaBuzzPhraseCarrot2TokenizerFactory is the better choice.
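As a rough illustration of the deduplication step, the sketch below mimics what Lucene's RemoveDuplicatesTokenFilter does: it drops a token whose term text has already been seen at the same position. It does not use Lucene or this plugin; the `DedupSketch` and `Token` classes are hypothetical stand-ins, and how this factory actually wires the filter is not shown here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupSketch {
    /** Hypothetical stand-in for a Lucene token: term text plus position increment. */
    public static final class Token {
        public final String term;
        public final int positionIncrement;

        public Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Keeps only the first occurrence of each term at a given position. */
    public static List<Token> removeDuplicates(List<Token> tokens) {
        List<Token> out = new ArrayList<>();
        Set<String> seenAtPosition = new HashSet<>();
        for (Token t : tokens) {
            // A positive increment starts a new position, so forget earlier terms.
            if (t.positionIncrement > 0) {
                seenAtPosition.clear();
            }
            // Set.add returns false when the term was already seen at this position.
            if (seenAtPosition.add(t.term)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> in = List.of(
                new Token("東京", 1),
                new Token("東京", 0),  // same position, same text: dropped
                new Token("都", 1),
                new Token("東京", 1)); // new position: kept
        for (Token t : removeDuplicates(in)) {
            System.out.println(t.term + " (+" + t.positionIncrement + ")");
        }
    }
}
```

Only the second token is removed, because the same term at a later position is not a duplicate in this sense.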

Since:

0.10

solrconfig.xml sample

 <searchComponent name="clustering"
                  enable="${solr.clustering.enabled:true}"
                  class="solr.clustering.ClusteringComponent" >
   <lst name="engine">
     <str name="name">default</str>
     <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
     <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
     <str name="carrot.lexicalResourcesDir">clustering/carrot2</str>
     <str name="PreprocessingPipeline.tokenizerFactory">com.rondhuit.solr.search.JapaneseCarrot2TokenizerFactory</str>
   </lst>
 </searchComponent>

 <requestHandler name="/clustering"
                 startup="lazy"
                 enable="${solr.clustering.enabled:true}"
                 class="solr.SearchHandler">
   <lst name="defaults">
     <bool name="clustering">true</bool>
     <str name="clustering.engine">default</str>
     <bool name="clustering.results">true</bool>
     <str name="carrot.title">title</str>
     <str name="carrot.url">id</str>
     <str name="carrot.snippet">statement</str>
     <bool name="carrot.produceSummary">true</bool>
     <bool name="carrot.outputSubClusters">true</bool>
     <str name="df">statement</str>
     <str name="rows">10</str>
     <str name="fl">title</str>
   </lst>
   <arr name="last-components">
     <str>clustering</str>
   </arr>
 </requestHandler>
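
With the configuration above in place, a client can query the `/clustering` handler like any other Solr search handler. The following is a minimal sketch that only builds the request URL using the Java standard library; the base URL (host, port, and core name `collection1`) and the `ClusteringRequestUrl` class are assumptions for illustration and not part of this plugin.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ClusteringRequestUrl {
    /**
     * Builds a URL for the /clustering request handler.
     * baseUrl is the Solr core URL (hypothetical in this example).
     */
    public static String build(String baseUrl, String query) {
        // Encode the user query so Japanese text and spaces are URL-safe.
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/clustering?q=" + q + "&wt=json";
    }

    public static void main(String[] args) {
        // Assumed local Solr instance and core name; adjust to your deployment.
        System.out.println(build("http://localhost:8983/solr/collection1", "検索 エンジン"));
    }
}
```

Because `clustering`, `clustering.engine`, and the `carrot.*` field mappings are set as handler defaults in the sample, the request itself only needs to carry the query.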

Constructor Summary

Constructors
Constructor and Description

JapaneseCarrot2TokenizerFactory()

Method Summary

Modifier and Type	Method and Description
`org.carrot2.text.analysis.ITokenizer`	`getTokenizer(org.carrot2.core.LanguageCode arg0)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - JapaneseCarrot2TokenizerFactory
```
public JapaneseCarrot2TokenizerFactory()
```
- Method Detail
  - getTokenizer
```
public org.carrot2.text.analysis.ITokenizer getTokenizer(org.carrot2.core.LanguageCode arg0)
```
    Specified by:
    
    getTokenizer in interface org.carrot2.text.linguistic.ITokenizerFactory
