Mirror of https://github.com/djohnlewis/stackdump (synced 2024-12-04 23:17:37 +00:00)
Added PowerShell equivalents to launch and manage Stackdump on Windows.
Commit 2fea457b06
27
.hgignore
Normal file
@@ -0,0 +1,27 @@
^JAVA_CMD$
^PYTHON_CMD$

# ignore any data
^data/.*$

# ignore working bytecode
\.class$
\.pyc$

^datadump/.*

# ignore test and tutorial directories
test/.*$
tests/.*$
testsuite/.*$
tutorial/.*$

# Solr/Jetty
^java/solr/server/solr-webapp/.*
^java/solr/server/logs/.*

# ignore the downloaded logos
^python/media/images/logos/.*

# PyCharm project files
^.idea/
BIN
List-StackdumpCommands.ps1
Normal file
Binary file not shown.
158
README.textile
Normal file
@@ -0,0 +1,158 @@
h1. Stackdump - an offline browser for StackExchange sites.

Stackdump was conceived for those who work in environments that do not have easy access to the StackExchange family of websites. It allows you to host a read-only instance of the StackExchange sites locally, accessible via a web browser.

Stackdump comprises two components - the search indexer ("Apache Solr":http://lucene.apache.org/solr/) and the web application. It uses the "StackExchange Data Dumps":http://blog.stackoverflow.com/2009/06/stack-overflow-creative-commons-data-dump/, published quarterly by StackExchange, as its source of data.

h2. Screenshots

"Stackdump home":http://edgylogic.com/dynmedia/301/
"Stackdump search results":http://edgylogic.com/dynmedia/303/
"Stackdump question view":http://edgylogic.com/dynmedia/302/

h2. System Requirements

Stackdump was written in Python and requires Python 2.5 or later (but not Python 3). It leverages Apache Solr, which requires the Java runtime (JRE), version 6 or later.

Besides that, there are no OS-specific dependencies, and it should work on any platform that Python and Java run on (although it only comes bundled with Linux scripts at the moment). It was, however, developed and tested on CentOS 5 running Python 2.7 and JRE 6 update 27.

You will also need "7-zip":http://www.7-zip.org/ to extract the data dump files, but Stackdump does not use it directly so you can perform the extraction on another machine first.

It is recommended that Stackdump be run on a system with at least 3GB of RAM, particularly if you intend to import StackOverflow into Stackdump. Apache Solr requires a fair bit of memory during the import process. It should also have a fair bit of space available; having at least roughly the space used by the raw, extracted, data dump XML files is a good rule of thumb (note that once imported, the raw data dump XML files are not needed by Stackdump any more).
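
The disk space rule of thumb above can be checked before starting an import. The following is a minimal Python sketch and is not part of Stackdump; the helper names @xml_dump_size@ and @has_room_for@ are made up for illustration, and the caller is assumed to supply the free space figure (for example from @shutil.disk_usage@ on Python 3.3 or later).

```python
import os

def xml_dump_size(xml_dir):
    """Total size in bytes of the .xml files under xml_dir (hypothetical helper)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(xml_dir):
        for name in filenames:
            if name.lower().endswith(".xml"):
                total += os.path.getsize(os.path.join(dirpath, name))
    return total

def has_room_for(xml_dir, free_bytes):
    """Rule of thumb: free space should be at least the size of the raw XML files."""
    return free_bytes >= xml_dump_size(xml_dir)
```

Passing the free space in as a number keeps the sketch independent of how it is measured on a given platform.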

Finally, Stackdump has been tested and works in the latest browsers (IE9, FF10+, Chrome, Safari). It degrades fairly gracefully in older browsers, although some will have rendering issues, e.g. IE8.

h2. Changes and upgrading to v1.1

Version 1.1 fixes a few bugs, the major one being the inability to import the 2013 data dumps due to changes in the case of the filenames. It also adds a couple of minor features, including support for resolving and rewriting short question and answer permalinks.

Because changes have been made to the search schema and the search indexer has been upgraded (to Solr 4.5), all data will need to be re-indexed. Therefore there is no upgrade path; follow the instructions below to set up Stackdump again. It is recommended to install this new version in a new directory, instead of overwriting the existing one.

h2. Setting up

Stackdump was designed for offline environments or environments with poor internet access; it is therefore bundled with all the dependencies it requires (with the exception of Python, Java and 7-zip).

As long as you have:
* "Python":http://python.org/download/,
* "Java":http://java.com/en/download/manual.jsp,
* "Stackdump":https://bitbucket.org/samuel.lai/stackdump/downloads,
* the "StackExchange Data Dump":http://www.clearbits.net/creators/146-stack-exchange-data-dump (Note: this is only available as a torrent), and
* "7-zip":http://www.7-zip.org/ (needed to extract the data dump files)

...you should be able to get an instance up and running.

To provide a better experience, Stackdump can use the RSS feed content to pre-fill some of the required details during the import process, as well as to display the site logos in the app. Stackdump comes bundled with a script that downloads and places these bits in the right places. If you're in a completely offline environment however, it may be worth running this script on a connected box first.

h3. Extract Stackdump

Stackdump was designed to be self-contained, so to get it up and running, simply extract the Stackdump download to an appropriate location.

h3. Verify dependencies

Next, you should verify that the required Java and Python versions are accessible in the PATH. (If you haven't installed them yet, now is a good time to do so.)

Type @java -version@ and check that it is at least version 1.6.

bq. If you're using Java 7 on Linux and you see an error similar to the following -
@ Error: failed /opt/jre1.7.0_40/lib/i386/server/libjvm.so, because /opt/jre1.7.0_40/lib/i386/server/libjvm.so: cannot restore segment prot after reloc: Permission denied @
this is because you have SELinux enabled. You will need to tell SELinux to allow Java to run by using the following command as root (amending the path as necessary) -
@chcon -t textrel_shlib_t /opt/jre1.7.0_40/lib/i386/server/libjvm.so@

Then type @python -V@ and check that it is version 2.5 or later (and not Python 3).

If you would rather not put these versions in the PATH (e.g. you don't want to override the default version of Python in your Linux distribution), you can tell Stackdump which Java and/or Python to use explicitly by creating a file named @JAVA_CMD@ or @PYTHON_CMD@ respectively in the Stackdump root directory, and placing the path to the executable in there.
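
The version checks and the override files described above could be scripted roughly as follows. This is an illustrative Python sketch rather than what the bundled launch scripts actually do (their contents are not shown here); all function names are hypothetical, and the version parsing assumes the usual output shapes such as @Python 2.7.3@ and @java version "1.6.0_27"@.

```python
import os
import re

def resolve_command(stackdump_root, override_file, default):
    """Return the executable named in override_file (JAVA_CMD or PYTHON_CMD),
    or default when the file is missing or empty."""
    path = os.path.join(stackdump_root, override_file)
    if os.path.isfile(path):
        with open(path) as handle:
            command = handle.read().strip()
        if command:
            return command
    return default

def parse_version(text):
    """Return the first dotted number found in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))

def python_ok(version_output):
    # Stackdump needs Python 2.5 or later, but not Python 3.
    return (2, 5) <= parse_version(version_output)[:2] < (3, 0)

def java_ok(version_output):
    # Java 6 reports itself as 1.6, so anything from (1, 6) upwards passes.
    return parse_version(version_output)[:2] >= (1, 6)
```

Feed the two check functions the text printed by the respective command. Note that @java -version@ and Python 2's @python -V@ both write to standard error rather than standard output.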

h3. Download additional site information

As mentioned earlier, Stackdump can use additional information available in the StackExchange RSS feed to pre-fill required details during the site import process and to show the logos for each site.

To start the download, execute the following command in the Stackdump root directory -

@./manage.sh download_site_info@

If Stackdump will be running in a completely offline environment, it is recommended that you extract and run this command in a connected environment first. If that is not possible, you can manually download the required pieces -

* download the "RSS feed":http://stackexchange.com/feeds/sites to a file
* for each site you will be importing, work out the __site key__ and download the logo by substituting the site key into this URL: @http://sstatic.net/site_key/img/icon-48.png@ where *site_key* is the site key. The site key is generally the bit in the URL before .stackexchange.com, or just the domain without the TLD, e.g. for the Salesforce StackExchange at http://salesforce.stackexchange.com, it is just __salesforce__, while for Server Fault at http://serverfault.com, it is __serverfault__.

The RSS feed file should be copied to the file @stackdump_dir/data/sites@ (create the @data@ directory if it doesn't exist), and the logos should be copied to the @stackdump_dir/python/media/images/logos@ directory and named with the site key and file type extension, e.g. @serverfault.png@.
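
The manual steps above amount to a small mapping from a site URL to a site key, a logo URL and a destination file. A hedged Python sketch of that mapping follows; the function names are hypothetical, and because the README only says the key is "generally" derived this way, sites that do not follow the pattern will need their key worked out by hand.

```python
import os

def site_key(site_url):
    """Derive the site key from a site URL using the README's rule of thumb."""
    # drop an optional scheme and anything after the host name
    host = site_url.split("://", 1)[-1].split("/", 1)[0].lower()
    suffix = ".stackexchange.com"
    if host.endswith(suffix):
        return host[:-len(suffix)]
    # otherwise drop the TLD, e.g. serverfault.com -> serverfault
    return host.rsplit(".", 1)[0]

def logo_url(key):
    """URL to download the logo from, per the pattern given in the README."""
    return "http://sstatic.net/%s/img/icon-48.png" % key

def logo_path(stackdump_dir, key):
    """Where the downloaded logo should be saved inside the Stackdump directory."""
    return os.path.join(stackdump_dir, "python", "media", "images", "logos", key + ".png")
```

For example, @site_key("http://salesforce.stackexchange.com")@ gives @salesforce@ and @site_key("http://serverfault.com")@ gives @serverfault@, matching the two cases in the text.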

h3. Import sites

Each data dump for a StackExchange site is a "7-zip":http://www.7-zip.org/ file. Extract the file corresponding to the site you wish to import into a temporary directory. It should have a bunch of XML files in it when complete.

Now make sure you have the search indexer up and running. This can be done by simply executing the @stackdump_dir/start_solr.sh@ command.

To start the import process, execute the following command -

@stackdump_dir/manage.sh import_site --base-url site_url --dump-date dump_date path_to_xml_files@

... where site_url is the URL of the site you're importing, e.g. __android.stackexchange.com__; dump_date is the date of the data dump you're importing, e.g. __August 2012__; and path_to_xml_files is the path to the XML files you just extracted. The dump_date is a text string that is shown in the app only, so it can be in any format you want.

For example, to import the August 2012 data dump of the Android StackExchange site, you would execute -

@stackdump_dir/manage.sh import_site --base-url android.stackexchange.com --dump-date "August 2012" /tmp/android@

It is normal to get messages about unknown PostTypeIds and missing comments and answers. These messages are likely due to those posts being hidden via moderation.

This can take anywhere from a minute to 10 hours or more depending on the site you're importing. As a rough guide, __android.stackexchange.com__ took a minute on my VM, while __stackoverflow.com__ took just over 10 hours.

Repeat these steps for each site you wish to import. Do not attempt to import multiple sites at the same time; it will not work and you may end up with half-imported sites.
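
Since imports must not overlap, a small driver that runs them strictly one after another can help when several sites are queued up. The sketch below is an assumption-laden illustration in Python, not something shipped with Stackdump: @import_command@ and @import_all@ are made-up names, and only the @import_site@ flags shown above are used.

```python
import os
import subprocess

def import_command(stackdump_dir, base_url, dump_date, xml_dir):
    """Build the argument list for one import_site run."""
    return [
        os.path.join(stackdump_dir, "manage.sh"), "import_site",
        "--base-url", base_url,
        "--dump-date", dump_date,
        xml_dir,
    ]

def import_all(stackdump_dir, sites, run=subprocess.check_call):
    """Import each (base_url, dump_date, xml_dir) tuple, one at a time."""
    for base_url, dump_date, xml_dir in sites:
        # check_call blocks until the command exits and raises on failure,
        # so the next import never starts while one is still running.
        run(import_command(stackdump_dir, base_url, dump_date, xml_dir))
```

Passing the arguments as a list avoids shell quoting problems with values such as @August 2012@. The @run@ parameter exists only so the loop can be exercised without launching a real import.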

The import process can be cancelled at any time without any adverse effect; however, on the next run it will have to start from scratch again.

h3. Start the app

To start Stackdump, execute the following command -

@stackdump_dir/start_web.sh@

... and visit port 8080 on that machine. That's it - your own offline, read-only instance of StackExchange.

If you need to change the port that it runs on, modify @stackdump_dir/python/src/stackdump/settings.py@ and restart the app.

The aforementioned @settings.py@ file also contains some other settings that control how Stackdump works.

Stackdump comes bundled with some init.d scripts as well which were tested on CentOS 5. These are located in the @init.d@ directory. To use these, you will need to modify them to specify the path to the Stackdump root directory and the user to run under.

Both the search indexer and the app need to be running for Stackdump to work.

h2. Maintenance

Stackdump stores all its data in the @data@ directory under its root directory. If you want to start fresh, just stop the app and the search indexer, delete that directory and restart the app and search indexer.

To delete certain sites from Stackdump, use the manage_sites management command -

@stackdump_dir/manage.sh manage_sites -l@ to list the sites (and their site keys) currently in the system;
@stackdump_dir/manage.sh manage_sites -d site_key@ to delete a particular site.

It is not necessary to delete a site before importing a new data dump of it though; the import process will automatically purge the old copy.

h2. Credits

Stackdump leverages several open-source projects to do various things, including -

* "twitter-bootstrap":http://github.com/twitter/bootstrap for the UI
* "jQuery":http://jquery.com for the UI
* "bottle.py":http://bottlepy.org for the web framework
* "cherrypy":http://cherrypy.org for the built-in web server
* "pysolr":https://github.com/toastdriven/pysolr/ to connect from Python to the search indexer, Apache Solr
* "html5lib":http://code.google.com/p/html5lib/ for parsing HTML
* "Jinja2":http://jinja.pocoo.org/ for templating
* "SQLObject":http://www.sqlobject.org/ for writing and reading from the database
* "iso8601":http://pypi.python.org/pypi/iso8601/ for date parsing
* "markdown":http://pypi.python.org/pypi/Markdown for rendering comments
* "mathjax":http://www.mathjax.org/ for displaying mathematical expressions properly
* "httplib2":http://code.google.com/p/httplib2/ as a dependency of pysolr
* "Apache Solr":http://lucene.apache.org/solr/ for search functionality

h2. Things not supported... yet

* searching or browsing by tags
* tag wiki pages
* badges
* post history, e.g. reasons why a post was closed are not listed

h2. License

Stackdump is licensed under the "MIT License":http://en.wikipedia.org/wiki/MIT_License.
BIN
Run-StackdumpCommand.ps1
Normal file
Binary file not shown.
BIN
Start-Python.ps1
Normal file
Binary file not shown.
BIN
Start-Solr.ps1
Normal file
Binary file not shown.
BIN
Start-StackdumpWeb.ps1
Normal file
Binary file not shown.
142
init.d/stackdump_solr
Executable file
@@ -0,0 +1,142 @@
#! /bin/bash
#
# stackdump_solr: Starts the Solr instance for Stackdump
#
# chkconfig: 345 99 01
# description: This daemon provides the search engine capability for Stackdump.\
# It is a required part of Stackdump; Stackdump will not work \
# without it.

# Source function library.
. /etc/init.d/functions

# this needs to be the path of the Stackdump root directory.
STACKDUMP_HOME=/opt/stackdump/

# this is the user that Stackdump runs under
STACKDUMP_USER=stackdump

SOLR_PID_FILE=/var/run/stackdump_solr.pid

if [ ! -d "$STACKDUMP_HOME" ]
then
    echo "The STACKDUMP_HOME variable does not point to a valid directory."
    exit 1
fi

base=${0##*/}

start() {
    echo -n $"Starting Stackdump - Solr... "

    # create the logs directory if it doesn't already exist
    if [ ! -d "$STACKDUMP_HOME/logs" ]
    then
        runuser -s /bin/bash $STACKDUMP_USER -c "mkdir $STACKDUMP_HOME/logs"
    fi

    # check if it is already running
    SOLR_PID=`cat $SOLR_PID_FILE 2>/dev/null`
    if [ ! -z "$SOLR_PID" ]
    then
        if [ ! -z "$(pgrep -P $SOLR_PID)" ]
        then
            echo
            echo "Stackdump - Solr is already running."
            exit 2
        else
            # the PID is stale.
            rm $SOLR_PID_FILE
        fi
    fi

    # run it!
    runuser -s /bin/bash $STACKDUMP_USER -c "$STACKDUMP_HOME/start_solr.sh >> $STACKDUMP_HOME/logs/solr.log 2>&1" &
    SOLR_PID=$!
    RETVAL=$?

    if [ $RETVAL = 0 ]
    then
        echo $SOLR_PID > $SOLR_PID_FILE
        success $"$base startup"
    else
        failure $"$base startup"
    fi
    echo
    return $RETVAL
}

stop() {
    # check if it is running
    SOLR_PID=`cat $SOLR_PID_FILE 2>/dev/null`
    if [ -z "$SOLR_PID" ] || [ -z "$(pgrep -P $SOLR_PID)" ]
    then
        echo "Stackdump - Solr is not running."
        exit 2
    fi

    echo -n $"Shutting down Stackdump - Solr... "

    # it is running, so shut it down.
    # there are many levels of processes here and the kill signal needs to
    # be sent to the actual Java process for the process to stop, so let's
    # just kill the whole process group.
    RUNUSER_CMD_PID=`pgrep -P $SOLR_PID`
    RUNUSER_CMD_PGRP=`ps -o pgrp --no-headers -p $RUNUSER_CMD_PID`

    pkill -g $RUNUSER_CMD_PGRP
    RETVAL=$?
    [ $RETVAL = 0 ] && success $"$base shutdown" || failure $"$base shutdown"
    rm -f $SOLR_PID_FILE
    echo
    return $RETVAL
}

status() {
    # check if it is running
    SOLR_PID=`cat $SOLR_PID_FILE 2>/dev/null`
    if [ -z "$SOLR_PID" ]
    then
        echo "Stackdump - Solr is not running."
        exit 0
    else
        if [ -z "$(pgrep -P $SOLR_PID)" ]
        then
            rm -f $SOLR_PID_FILE
            echo "Stackdump - Solr is not running."
            exit 0
        else
            echo "Stackdump - Solr is running."
            exit 0
        fi
    fi
}

restart() {
    stop
    start
}

RETVAL=0

# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        status
        ;;
    restart)
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart}"
        exit 1
esac

exit $RETVAL

141
init.d/stackdump_web
Normal file
@@ -0,0 +1,141 @@
#! /bin/bash
#
# stackdump_web: Starts the Stackdump web app
#
# chkconfig: 345 99 01
# description: This daemon is the web server for Stackdump.\
# It requires the Solr instance to be running to function.

# Source function library.
. /etc/init.d/functions

# this needs to be the path of the Stackdump root directory.
STACKDUMP_HOME=/opt/stackdump/

# this is the user that Stackdump runs under
STACKDUMP_USER=stackdump

WEB_PID_FILE=/var/run/stackdump_web.pid

if [ ! -d "$STACKDUMP_HOME" ]
then
    echo "The STACKDUMP_HOME variable does not point to a valid directory."
    exit 1
fi

base=${0##*/}

start() {
    echo -n $"Starting Stackdump - Web... "

    # create the logs directory if it doesn't already exist
    if [ ! -d "$STACKDUMP_HOME/logs" ]
    then
        runuser -s /bin/bash $STACKDUMP_USER -c "mkdir $STACKDUMP_HOME/logs"
    fi

    # check if it is already running
    WEB_PID=`cat $WEB_PID_FILE 2>/dev/null`
    if [ ! -z "$WEB_PID" ]
    then
        if [ ! -z "$(pgrep -P $WEB_PID)" ]
        then
            echo
            echo "Stackdump - Web is already running."
            exit 2
        else
            # the PID is stale.
            rm $WEB_PID_FILE
        fi
    fi

    # run it!
    runuser -s /bin/bash $STACKDUMP_USER -c "$STACKDUMP_HOME/start_web.sh >> $STACKDUMP_HOME/logs/web.log 2>&1" &
    WEB_PID=$!
    RETVAL=$?

    if [ $RETVAL = 0 ]
    then
        echo $WEB_PID > $WEB_PID_FILE
        success $"$base startup"
    else
        failure $"$base startup"
    fi
    echo
    return $RETVAL
}

stop() {
    # check if it is running
    WEB_PID=`cat $WEB_PID_FILE 2>/dev/null`
    if [ -z "$WEB_PID" ] || [ -z "$(pgrep -P $WEB_PID)" ]
    then
        echo "Stackdump - Web is not running."
        exit 2
    fi

    echo -n $"Shutting down Stackdump - Web... "

    # it is running, so shut it down.
    # there are many levels of processes here and the kill signal needs to
    # be sent to the actual Python process for the process to stop, so let's
    # just kill the whole process group.
    RUNUSER_CMD_PID=`pgrep -P $WEB_PID`
    RUNUSER_CMD_PGRP=`ps -o pgrp --no-headers -p $RUNUSER_CMD_PID`

    pkill -g $RUNUSER_CMD_PGRP
    RETVAL=$?
    [ $RETVAL = 0 ] && success $"$base shutdown" || failure $"$base shutdown"
    rm -f $WEB_PID_FILE
    echo
    return $RETVAL
}

status() {
    # check if it is running
    WEB_PID=`cat $WEB_PID_FILE 2>/dev/null`
    if [ -z "$WEB_PID" ]
    then
        echo "Stackdump - Web is not running."
        exit 0
    else
        if [ -z "$(pgrep -P $WEB_PID)" ]
        then
            rm -f $WEB_PID_FILE
            echo "Stackdump - Web is not running."
            exit 0
        else
            echo "Stackdump - Web is running."
            exit 0
        fi
    fi
}

restart() {
    stop
    start
}

RETVAL=0

# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        status
        ;;
    restart)
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart}"
        exit 1
esac

exit $RETVAL

7412
java/solr/CHANGES.txt
Normal file
File diff suppressed because it is too large
Load Diff
226
java/solr/LICENSE.txt
Normal file
@@ -0,0 +1,226 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

==========================================================================
The following license applies to the JQuery JavaScript library
--------------------------------------------------------------------------
Copyright (c) 2010 John Resig, http://jquery.com/

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
||||
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
||||
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
||||
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
564
java/solr/NOTICE.txt
Normal file
File diff suppressed because it is too large
120
java/solr/README.txt
Normal file
@ -0,0 +1,120 @@
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
|
||||
Welcome to the Apache Solr project!
|
||||
-----------------------------------
|
||||
|
||||
Solr is the popular, blazing fast open source enterprise search platform
|
||||
from the Apache Lucene project.
|
||||
|
||||
For a complete description of the Solr project, team composition, source
|
||||
code repositories, and other details, please see the Solr web site at
|
||||
http://lucene.apache.org/solr
|
||||
|
||||
|
||||
Getting Started
|
||||
---------------
|
||||
|
||||
See the "example" directory for an example Solr setup. A tutorial
|
||||
using the example setup can be found at
|
||||
http://lucene.apache.org/solr/tutorial.html
|
||||
or linked from "docs/index.html" in a binary distribution.
|
||||
Also, there are Solr clients for many programming languages; see
|
||||
http://wiki.apache.org/solr/IntegratingSolr
|
||||
|
||||
|
||||
Files included in an Apache Solr binary distribution
|
||||
----------------------------------------------------
|
||||
|
||||
example/
|
||||
A self-contained example Solr instance, complete with a sample
|
||||
configuration, documents to index, and the Jetty Servlet container.
|
||||
Please see example/README.txt for information about running this
|
||||
example.
|
||||
|
||||
dist/solr-XX.war
|
||||
The Apache Solr Application. Deploy this WAR file to any servlet
|
||||
container to run Apache Solr.
|
||||
|
||||
dist/solr-<component>-XX.jar
|
||||
The Apache Solr libraries. To compile Apache Solr Plugins,
|
||||
one or more of these will be required. The core library is
|
||||
required at a minimum. (see http://wiki.apache.org/solr/SolrPlugins
|
||||
for more information).
|
||||
|
||||
docs/index.html
|
||||
The Apache Solr Javadoc API documentation and Tutorial
|
||||
|
||||
|
||||
Instructions for Building Apache Solr from Source
|
||||
-------------------------------------------------
|
||||
|
||||
1. Download the Java SE 6 JDK (Java Development Kit) or later from http://java.sun.com/
|
||||
You will need the JDK installed, and the $JAVA_HOME/bin (Windows: %JAVA_HOME%\bin)
|
||||
folder included on your command path. To test this, issue a "java -version" command
|
||||
from your shell (command prompt) and verify that the Java version is 1.6 or later.
|
||||
|
||||
2. Download the Apache Ant binary distribution (1.8.2+) from
|
||||
http://ant.apache.org/ You will need Ant installed and the $ANT_HOME/bin (Windows:
|
||||
%ANT_HOME%\bin) folder included on your command path. To test this, issue a
|
||||
"ant -version" command from your shell (command prompt) and verify that Ant is
|
||||
available.
|
||||
|
||||
You will also need to install the Apache Ivy binary distribution (2.2.0) from
|
||||
http://ant.apache.org/ivy/ and place the ivy-2.2.0.jar file in ~/.ant/lib -- if you skip
|
||||
this step, the Solr build system will offer to do it for you.
|
||||
|
||||
3. Download the Apache Solr distribution, linked from the above web site.
|
||||
Unzip the distribution to a folder of your choice, e.g. C:\solr or ~/solr
|
||||
Alternately, you can obtain a copy of the latest Apache Solr source code
|
||||
directly from the Subversion repository:
|
||||
|
||||
http://lucene.apache.org/solr/versioncontrol.html
|
||||
|
||||
4. Navigate to the "solr" folder and issue an "ant" command to see the available options
|
||||
for building, testing, and packaging Solr.
|
||||
|
||||
NOTE:
|
||||
To see Solr in action, you may want to use the "ant example" command to build
|
||||
and package Solr into the example/webapps directory. See also example/README.txt.
|
||||
|
||||
|
||||
Export control
|
||||
-------------------------------------------------
|
||||
This distribution includes cryptographic software. The country in
|
||||
which you currently reside may have restrictions on the import,
|
||||
possession, use, and/or re-export to another country, of
|
||||
encryption software. BEFORE using any encryption software, please
|
||||
check your country's laws, regulations and policies concerning the
|
||||
import, possession, or use, and re-export of encryption software, to
|
||||
see if this is permitted. See <http://www.wassenaar.org/> for more
|
||||
information.
|
||||
|
||||
The U.S. Government Department of Commerce, Bureau of Industry and
|
||||
Security (BIS), has classified this software as Export Commodity
|
||||
Control Number (ECCN) 5D002.C.1, which includes information security
|
||||
software using or performing cryptographic functions with asymmetric
|
||||
algorithms. The form and manner of this Apache Software Foundation
|
||||
distribution makes it eligible for export under the License Exception
|
||||
ENC Technology Software Unrestricted (TSU) exception (see the BIS
|
||||
Export Administration Regulations, Section 740.13) for both object
|
||||
code and source code.
|
||||
|
||||
The following provides more details on the included cryptographic
|
||||
software:
|
||||
Apache Solr uses the Apache Tika which uses the Bouncy Castle generic encryption libraries for
|
||||
extracting text content and metadata from encrypted PDF files.
|
||||
See http://www.bouncycastle.org/ for more details on Bouncy Castle.
|
13
java/solr/SYSTEM_REQUIREMENTS.txt
Normal file
@ -0,0 +1,13 @@
|
||||
# System Requirements
|
||||
|
||||
Apache Solr runs on Java 6 or greater. When using Java 7, be sure to
|
||||
install at least Update 1! With all Java versions it is strongly
|
||||
recommended to not use experimental `-XX` JVM options. It is also
|
||||
recommended to always use the latest update version of your Java VM,
|
||||
because bugs may affect Solr. An overview of known JVM bugs can be
|
||||
found on http://wiki.apache.org/lucene-java/JavaBugs.
|
||||
|
||||
CPU, disk and memory requirements are based on the many choices made in
|
||||
implementing Solr (document size, number of documents, and number of
|
||||
hits retrieved to name a few). The benchmarks page has some information
|
||||
related to performance on particular platforms.
|
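Reviewer note: the requirement above (Java 6 or greater) can be checked by parsing the text that `java -version` prints. The following is a minimal sketch, not part of this commit; the function names are hypothetical, and it assumes the usual `version "1.6.0_45"` / `version "11.0.2"` style of output.

```python
import re


def java_feature_version(version_output):
    """Return the Java feature release (6, 7, 11, ...) found in `java -version` text."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    # Legacy releases report "1.6", "1.7", ...; later ones report "9", "11", ...
    return minor if major == 1 else major


def meets_solr_requirement(version_output, minimum=6):
    """True when the reported Java release is at least `minimum` (hypothetical helper)."""
    return java_feature_version(version_output) >= minimum


if __name__ == "__main__":
    # Sample text only; in practice this would be captured from `java -version`.
    print(meets_solr_requirement('java version "1.6.0_45"'))
```

Note that `java -version` writes to standard error, so a caller capturing it from a subprocess would need to read that stream.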
BIN
java/solr/dist/solr-analysis-extras-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-cell-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-clustering-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-core-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-dataimporthandler-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-dataimporthandler-extras-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-langid-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-solrj-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-test-framework-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-uima-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solr-velocity-4.5.0.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/commons-io-2.1.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/httpclient-4.2.3.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/httpcore-4.2.2.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/httpmime-4.2.3.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/jcl-over-slf4j-1.6.6.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/jul-to-slf4j-1.6.6.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/log4j-1.2.16.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/noggit-0.5.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/slf4j-api-1.6.6.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/slf4j-log4j12-1.6.6.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/wstx-asl-3.2.7.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/solrj-lib/zookeeper-3.4.5.jar
vendored
Normal file
Binary file not shown.
6
java/solr/dist/test-framework/README.txt
vendored
Normal file
@ -0,0 +1,6 @@
|
||||
The Solr test-framework provides base classes and utility classes for
|
||||
writing JUnit tests exercising Solr functionality.
|
||||
|
||||
This test framework relies on the Lucene components found in the
|
||||
./lucene-libs/ directory, as well as the third-party libraries found
|
||||
in the ./lib directory.
|
BIN
java/solr/dist/test-framework/lib/ant-1.8.2.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/test-framework/lib/junit-4.10.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/test-framework/lib/junit4-ant-2.0.10.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/test-framework/lib/randomizedtesting-runner-2.0.10.jar
vendored
Normal file
Binary file not shown.
BIN
java/solr/dist/test-framework/lucene-libs/lucene-test-framework-4.5.0.jar
vendored
Normal file
Binary file not shown.
8
java/solr/server/contexts/solr-jetty-context.xml
Normal file
@ -0,0 +1,8 @@
|
||||
<?xml version="1.0"?>
|
||||
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
|
||||
<Configure class="org.eclipse.jetty.webapp.WebAppContext">
|
||||
<Set name="contextPath"><SystemProperty name="hostContext" default="/solr"/></Set>
|
||||
<Set name="war"><SystemProperty name="jetty.home"/>/webapps/solr.war</Set>
|
||||
<Set name="defaultsDescriptor"><SystemProperty name="jetty.home"/>/etc/webdefault.xml</Set>
|
||||
<Set name="tempDirectory"><Property name="jetty.home" default="."/>/solr-webapp</Set>
|
||||
</Configure>
|
37
java/solr/server/etc/create-solrtest.keystore.sh
Executable file
@ -0,0 +1,37 @@
|
||||
#!/bin/bash -ex
|
||||
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
############
|
||||
|
||||
# This script shows how the solrtest.keystore file used for solr tests
|
||||
# and these example configs was generated.
|
||||
#
|
||||
# Running this script should only be necessary if the keystore file
|
||||
# needs to be replaced, which shouldn't be required until sometime around
|
||||
# the year 4751.
|
||||
#
|
||||
# NOTE: the "-ext" option used in the "keytool" command requires that you have
|
||||
# the java7 version of keytool, but the generated key will work with any
|
||||
# version of java
|
||||
|
||||
echo "### remove old keystore"
|
||||
rm -f solrtest.keystore
|
||||
|
||||
echo "### create keystore and keys"
|
||||
keytool -keystore solrtest.keystore -storepass "secret" -alias solrtest -keypass "secret" -genkey -keyalg RSA -dname "cn=localhost, ou=SolrTest, o=lucene.apache.org, c=US" -ext "san=ip:127.0.0.1" -validity 999999
|
||||
|
||||
|
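Reviewer note: the script's comment that the keystore will not need replacing "until sometime around the year 4751" follows from `-validity 999999`, which is a number of days. A minimal sketch of that arithmetic is below. The creation date is an assumption (the diff does not record when the keystore was generated), and `keystore_expiry` is a hypothetical helper rather than anything in the script.

```python
from datetime import date, timedelta


def keystore_expiry(created, validity_days=999999):
    """Return the date on which a key created on `created` stops being valid."""
    return created + timedelta(days=validity_days)


if __name__ == "__main__":
    # Assumed creation date, chosen only for illustration.
    assumed_creation = date(2013, 6, 1)
    print(keystore_expiry(assumed_creation).year)
```

Any creation date in 2013 puts the expiry roughly 2,738 years later, which matches the comment's estimate to within a year.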
205
java/solr/server/etc/jetty.xml
Normal file
@ -0,0 +1,205 @@
|
||||
<?xml version="1.0"?>
|
||||
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
|
||||
|
||||
<!-- =============================================================== -->
|
||||
<!-- Configure the Jetty Server -->
|
||||
<!-- -->
|
||||
<!-- Documentation of this file format can be found at: -->
|
||||
<!-- http://wiki.eclipse.org/Jetty/Reference/jetty.xml_syntax -->
|
||||
<!-- -->
|
||||
<!-- =============================================================== -->
|
||||
|
||||
|
||||
<Configure id="Server" class="org.eclipse.jetty.server.Server">
|
||||
|
||||
<!-- =========================================================== -->
|
||||
<!-- Server Thread Pool -->
|
||||
<!-- =========================================================== -->
|
||||
<Set name="ThreadPool">
|
||||
<!-- Default queued blocking threadpool -->
|
||||
<New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
|
||||
<Set name="minThreads">10</Set>
|
||||
<Set name="maxThreads">10000</Set>
|
||||
<Set name="detailedDump">false</Set>
|
||||
</New>
|
||||
</Set>
|
||||
|
||||
<!-- =========================================================== -->
|
||||
<!-- Set connectors -->
|
||||
<!-- =========================================================== -->
|
||||
|
||||
<!--
|
||||
<Call name="addConnector">
|
||||
<Arg>
|
||||
<New class="org.eclipse.jetty.server.nio.SelectChannelConnector">
|
||||
<Set name="host"><SystemProperty name="jetty.host" /></Set>
|
||||
<Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>
|
||||
<Set name="maxIdleTime">50000</Set>
|
||||
<Set name="Acceptors">2</Set>
|
||||
<Set name="statsOn">false</Set>
|
||||
<Set name="confidentialPort">8443</Set>
|
||||
<Set name="lowResourcesConnections">5000</Set>
|
||||
<Set name="lowResourcesMaxIdleTime">5000</Set>
|
||||
</New>
|
||||
</Arg>
|
||||
</Call>
|
||||
-->
|
||||
|
||||
<!-- This connector is currently being used for Solr because it
|
||||
showed better performance than nio.SelectChannelConnector
|
||||
for typical Solr requests. -->
|
||||
<Call name="addConnector">
|
||||
<Arg>
|
||||
<New class="org.eclipse.jetty.server.bio.SocketConnector">
|
||||
<Call class="java.lang.System" name="setProperty"> <Arg>log4j.configuration</Arg> <Arg>etc/log4j.properties</Arg> </Call>
|
||||
<Set name="host"><SystemProperty name="jetty.host" /></Set>
|
||||
<Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>
|
||||
<Set name="maxIdleTime">50000</Set>
|
||||
<Set name="lowResourceMaxIdleTime">1500</Set>
|
||||
<Set name="statsOn">false</Set>
|
||||
</New>
|
||||
</Arg>
|
||||
</Call>
|
||||
|
||||
<!-- if the connector below is uncommented, then jetty will also accept SSL
|
||||
connections on port 8984, using a self signed certificate and can
|
||||
optionally require the client to authenticate with a certificate.
|
||||
(which can be the same as the server certificate)
|
||||
|
||||
# Run solr example with SSL on port 8984
|
||||
java -jar start.jar
|
||||
#
|
||||
# Run post.jar so that it trusts the server cert...
|
||||
java -Djavax.net.ssl.trustStore=../etc/solrtest.keystore -Durl=https://localhost:8984/solr/update -jar post.jar *.xml
|
||||
|
||||
# Run solr example with SSL requiring client certs on port 8984
|
||||
java -Djetty.ssl.clientAuth=true -jar start.jar
|
||||
#
|
||||
# Run post.jar so that it trusts the server cert,
|
||||
# and authenticates with a client cert
|
||||
java -Djavax.net.ssl.keyStorePassword=secret -Djavax.net.ssl.keyStore=../etc/solrtest.keystore -Djavax.net.ssl.trustStore=../etc/solrtest.keystore -Durl=https://localhost:8984/solr/update -jar post.jar *.xml
|
||||
|
||||
-->
|
||||
<!--
|
||||
<Call name="addConnector">
|
||||
<Arg>
|
||||
<New class="org.eclipse.jetty.server.ssl.SslSelectChannelConnector">
|
||||
<Arg>
|
||||
<New class="org.eclipse.jetty.http.ssl.SslContextFactory">
|
||||
<Set name="keyStore"><SystemProperty name="jetty.home" default="."/>/etc/solrtest.keystore</Set>
|
||||
<Set name="keyStorePassword">secret</Set>
|
||||
<Set name="needClientAuth"><SystemProperty name="jetty.ssl.clientAuth" default="false"/></Set>
|
||||
</New>
|
||||
</Arg>
|
||||
<Set name="port"><SystemProperty name="jetty.ssl.port" default="8984"/></Set>
|
||||
<Set name="maxIdleTime">30000</Set>
|
||||
</New>
|
||||
</Arg>
|
||||
</Call>
|
||||
-->
|
||||
|
||||
<!-- =========================================================== -->
|
||||
<!-- Set handler Collection Structure -->
|
||||
<!-- =========================================================== -->
|
||||
<Set name="handler">
|
||||
<New id="Handlers" class="org.eclipse.jetty.server.handler.HandlerCollection">
|
||||
<Set name="handlers">
|
||||
<Array type="org.eclipse.jetty.server.Handler">
|
||||
<Item>
|
||||
<New id="Contexts" class="org.eclipse.jetty.server.handler.ContextHandlerCollection"/>
|
||||
</Item>
|
||||
<Item>
|
||||
<New id="DefaultHandler" class="org.eclipse.jetty.server.handler.DefaultHandler"/>
|
||||
</Item>
|
||||
<Item>
|
||||
<New id="RequestLog" class="org.eclipse.jetty.server.handler.RequestLogHandler"/>
|
||||
</Item>
|
||||
</Array>
|
||||
</Set>
|
||||
</New>
|
||||
</Set>
|
||||
|
||||
<!-- =========================================================== -->
|
||||
<!-- Configure Request Log -->
|
||||
<!-- =========================================================== -->
|
||||
<!--
|
||||
<Ref id="Handlers">
|
||||
<Call name="addHandler">
|
||||
<Arg>
|
||||
<New id="RequestLog" class="org.eclipse.jetty.server.handler.RequestLogHandler">
|
||||
<Set name="requestLog">
|
||||
<New id="RequestLogImpl" class="org.eclipse.jetty.server.NCSARequestLog">
|
||||
<Set name="filename">
|
||||
logs/request.yyyy_mm_dd.log
|
||||
</Set>
|
||||
<Set name="filenameDateFormat">yyyy_MM_dd</Set>
|
||||
<Set name="retainDays">90</Set>
|
||||
<Set name="append">true</Set>
|
||||
<Set name="extended">false</Set>
|
||||
<Set name="logCookies">false</Set>
|
||||
<Set name="LogTimeZone">UTC</Set>
|
||||
</New>
|
||||
</Set>
|
||||
</New>
|
||||
</Arg>
|
||||
</Call>
|
||||
</Ref>
|
||||
-->
|
||||
|
||||
<!-- =========================================================== -->
|
||||
<!-- extra options -->
|
||||
<!-- =========================================================== -->
|
||||
<Set name="stopAtShutdown">true</Set>
|
||||
<Set name="sendServerVersion">false</Set>
|
||||
<Set name="sendDateHeader">false</Set>
|
||||
<Set name="gracefulShutdown">1000</Set>
|
||||
<Set name="dumpAfterStart">false</Set>
|
||||
<Set name="dumpBeforeStop">false</Set>
|
||||
|
||||
|
||||
|
||||
|
||||
<Call name="addBean">
|
||||
<Arg>
|
||||
<New id="DeploymentManager" class="org.eclipse.jetty.deploy.DeploymentManager">
|
||||
<Set name="contexts">
|
||||
<Ref id="Contexts" />
|
||||
</Set>
|
||||
<Call name="setContextAttribute">
|
||||
<Arg>org.eclipse.jetty.server.webapp.ContainerIncludeJarPattern</Arg>
|
||||
<Arg>.*/servlet-api-[^/]*\.jar$</Arg>
|
||||
</Call>
|
||||
|
||||
|
||||
<!-- Add a customize step to the deployment lifecycle -->
|
||||
<!-- uncomment and replace DebugBinding with your extended AppLifeCycle.Binding class
|
||||
<Call name="insertLifeCycleNode">
|
||||
<Arg>deployed</Arg>
|
||||
<Arg>starting</Arg>
|
||||
<Arg>customise</Arg>
|
||||
</Call>
|
||||
<Call name="addLifeCycleBinding">
|
||||
<Arg>
|
||||
<New class="org.eclipse.jetty.deploy.bindings.DebugBinding">
|
||||
<Arg>customise</Arg>
|
||||
</New>
|
||||
</Arg>
|
||||
</Call>
|
||||
-->
|
||||
|
||||
</New>
|
||||
</Arg>
|
||||
</Call>
|
||||
|
||||
<Ref id="DeploymentManager">
|
||||
<Call name="addAppProvider">
|
||||
<Arg>
|
||||
<New class="org.eclipse.jetty.deploy.providers.ContextProvider">
|
||||
<Set name="monitoredDirName"><SystemProperty name="jetty.home" default="."/>/contexts</Set>
|
||||
<Set name="scanInterval">0</Set>
|
||||
</New>
|
||||
</Arg>
|
||||
</Call>
|
||||
</Ref>
|
||||
|
||||
</Configure>
|
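Reviewer note: the active connector in jetty.xml takes its port from the `jetty.port` system property and falls back to `default="8983"`. The sketch below imitates that lookup on a trimmed copy of the connector element so the fallback is easy to see. It is an illustration only: `resolve_port` is a hypothetical helper, the dictionary stands in for JVM system properties, and this is not how Jetty itself loads the file.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the SocketConnector block from jetty.xml (DOCTYPE omitted).
CONNECTOR_XML = """
<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Call name="addConnector">
    <Arg>
      <New class="org.eclipse.jetty.server.bio.SocketConnector">
        <Set name="host"><SystemProperty name="jetty.host" /></Set>
        <Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>
      </New>
    </Arg>
  </Call>
</Configure>
"""


def resolve_port(xml_text, system_properties=None):
    """Return the port the first connector would use, given some system properties."""
    props = system_properties or {}
    root = ET.fromstring(xml_text)
    for setter in root.iter("Set"):
        if setter.get("name") != "port":
            continue
        prop = setter.find("SystemProperty")
        if prop is not None:
            # An explicit property wins; otherwise use the default attribute.
            return int(props.get(prop.get("name"), prop.get("default")))
    raise LookupError("no port setting found")


if __name__ == "__main__":
    print(resolve_port(CONNECTOR_XML))
    print(resolve_port(CONNECTOR_XML, {"jetty.port": "9000"}))
```

On a real installation the equivalent override would be passed on the command line as `-Djetty.port=...` when starting `start.jar`.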
38
java/solr/server/etc/logging.properties
Normal file
@ -0,0 +1,38 @@
|
||||
#
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
# To use this log config, start solr with the following system property:
|
||||
# -Djava.util.logging.config.file=etc/logging.properties
|
||||
|
||||
## Default global logging level:
|
||||
.level = INFO
|
||||
|
||||
## Log every update command (add, delete, commit, ...)
|
||||
#org.apache.solr.update.processor.LogUpdateProcessor.level = FINE
|
||||
|
||||
## Where to log (space separated list).
|
||||
handlers = java.util.logging.FileHandler
|
||||
|
||||
java.util.logging.FileHandler.level = FINE
|
||||
|
||||
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
|
||||
|
||||
# 1 GB limit per file
|
||||
java.util.logging.FileHandler.limit = 1073741824
|
||||
|
||||
# Log to the logs directory, with log files named solrxxx.log
|
||||
java.util.logging.FileHandler.pattern = ./logs/solr%u.log
|
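Reviewer note: the "1 GB limit per file" comment in logging.properties corresponds to the byte count 1073741824, which is 1024 cubed. The sketch below parses a two-line excerpt of the file and confirms the figure. `parse_properties` is a deliberately simplified, hypothetical reader: it ignores line continuations and escapes that real Java properties files allow.

```python
def parse_properties(text):
    """Parse simple key = value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Excerpt copied from java/solr/server/etc/logging.properties.
SAMPLE = """
# 1 GB limit per file
java.util.logging.FileHandler.limit = 1073741824
java.util.logging.FileHandler.pattern = ./logs/solr%u.log
"""

limit_bytes = int(parse_properties(SAMPLE)["java.util.logging.FileHandler.limit"])
limit_gib = limit_bytes / float(1024 ** 3)

if __name__ == "__main__":
    print(limit_bytes, limit_gib)
```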
BIN
java/solr/server/etc/solrtest.keystore
Normal file
Binary file not shown.
527
java/solr/server/etc/webdefault.xml
Normal file
File diff suppressed because it is too large
BIN
java/solr/server/lib/ext/jcl-over-slf4j-1.6.6.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/ext/jul-to-slf4j-1.6.6.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/ext/log4j-1.2.16.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/ext/slf4j-api-1.6.6.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/ext/slf4j-log4j12-1.6.6.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-continuation-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-deploy-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-http-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-io-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-jmx-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-security-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-server-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-servlet-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-util-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-webapp-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/jetty-xml-8.1.10.v20130312.jar
Normal file
Binary file not shown.
BIN
java/solr/server/lib/servlet-api-3.0.jar
Normal file
Binary file not shown.
24
java/solr/server/resources/log4j.properties
Normal file
@ -0,0 +1,24 @@
|
||||
# Logging level
|
||||
solr.log=logs/
|
||||
log4j.rootLogger=INFO, file, CONSOLE
|
||||
|
||||
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
|
||||
|
||||
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
|
||||
log4j.appender.CONSOLE.layout.ConversionPattern=%-4r [%t] %-5p %c %x \u2013 %m%n
|
||||
|
||||
#- size rotation with log cleanup.
|
||||
log4j.appender.file=org.apache.log4j.RollingFileAppender
|
||||
log4j.appender.file.MaxFileSize=4MB
|
||||
log4j.appender.file.MaxBackupIndex=9
|
||||
|
||||
#- File to log to and log format
|
||||
log4j.appender.file.File=${solr.log}/solr.log
|
||||
log4j.appender.file.layout=org.apache.log4j.PatternLayout
|
||||
log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd HH:mm:ss.SSS}; %C; %m\n
|
||||
|
||||
log4j.logger.org.apache.zookeeper=WARN
|
||||
log4j.logger.org.apache.hadoop=WARN
|
||||
|
||||
# set to INFO to enable infostream log messages
|
||||
log4j.logger.org.apache.solr.update.LoggingInfoStream=OFF
|
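Reviewer note: with `MaxFileSize=4MB` and `MaxBackupIndex=9`, the rolling file appender keeps the active solr.log plus up to nine rolled copies, so the log directory should stay near 40 MB. A minimal sketch of that estimate follows. The helper names are hypothetical, and the result is approximate because the active file can run slightly past the limit before it is rolled.

```python
# Size suffixes as binary multiples, the convention log4j 1.2 uses for MaxFileSize.
UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def to_bytes(size):
    """Convert a size such as '4MB' to a byte count; bare numbers are bytes."""
    size = size.strip().upper()
    for suffix, factor in UNITS.items():
        if size.endswith(suffix):
            return int(size[:-len(suffix)]) * factor
    return int(size)


def max_log_footprint(max_file_size, max_backup_index):
    """Approximate upper bound: one active file plus the rolled backups."""
    return to_bytes(max_file_size) * (max_backup_index + 1)


if __name__ == "__main__":
    total = max_log_footprint("4MB", 9)
    print(total, total // (1024 ** 2))
```

Raising `MaxBackupIndex` or `MaxFileSize` scales the bound linearly, which is useful when sizing the `logs/` directory for a long-running import.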
63
java/solr/server/solr/README.txt
Normal file
@ -0,0 +1,63 @@
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
|
||||
Example Solr Home Directory
|
||||
=============================
|
||||
|
||||
This directory is provided as an example of what a "Solr Home" directory
|
||||
should look like.
|
||||
|
||||
It's not strictly necessary that you copy all of the files in this
|
||||
directory when setting up a new instance of Solr, but it is recommended.
|
||||
|
||||
|
||||
Basic Directory Structure
|
||||
-------------------------
|
||||
|
||||
The Solr Home directory typically contains the following...
|
||||
|
||||
* solr.xml *
|
||||
|
||||
This is the primary configuration file Solr looks for when starting.
|
||||
This file specifies the list of "SolrCores" it should load, and high
|
||||
level configuration options that should be used for all SolrCores.
|
||||
|
||||
Please see the comments in ./solr.xml for more details.
|
||||
|
||||
If no solr.xml file is found, then Solr assumes that there should be
|
||||
a single SolrCore named "collection1" and that the "Instance Directory"
|
||||
for collection1 should be the same as the Solr Home Directory.
|
||||
|
||||
* Individual SolrCore Instance Directories *
|
||||
|
||||
Although solr.xml can be configured to look for SolrCore Instance Directories
|
||||
in any path, simple sub-directories of the Solr Home Dir using relative paths
|
||||
are common for many installations. In this directory you can see the
|
||||
"./collection1" Instance Directory.
|
||||
|
||||
* A Shared 'lib' Directory *
|
||||
|
||||
Although solr.xml can be configured with an optional "sharedLib" attribute
|
||||
that can point to any path, it is common to use a "./lib" sub-directory of the
|
||||
Solr Home Directory.
|
||||
|
||||
* ZooKeeper Files *
|
||||
|
||||
When running SolrCloud with the embedded ZooKeeper option for Solr, it is
|
||||
common to have a "zoo.cfg" file and a "zoo_data" directory in the Solr Home
|
||||
Directory. Please see the SolrCloud wiki page for more details...
|
||||
|
||||
https://wiki.apache.org/solr/SolrCloud
|
45
java/solr/server/solr/solr.xml
Normal file
@ -0,0 +1,45 @@
|
||||
<?xml version="1.0" encoding="UTF-8" ?>
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
-->
|
||||
|
||||
<!--
|
||||
This is an example of a simple "solr.xml" file for configuring one or
|
||||
more Solr Cores, as well as allowing Cores to be added, removed, and
|
||||
reloaded via HTTP requests.
|
||||
|
||||
More information about options available in this configuration file,
|
||||
and Solr Core administration can be found online:
|
||||
http://wiki.apache.org/solr/CoreAdmin
|
||||
-->
|
||||
|
||||
<solr>
|
||||
|
||||
<solrcloud>
|
||||
<str name="host">${host:}</str>
|
||||
<int name="hostPort">${jetty.port:8983}</int>
|
||||
<str name="hostContext">${hostContext:solr}</str>
|
||||
<int name="zkClientTimeout">${zkClientTimeout:15000}</int>
|
||||
<bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
|
||||
</solrcloud>
|
||||
|
||||
<shardHandlerFactory name="shardHandlerFactory"
|
||||
class="HttpShardHandlerFactory">
|
||||
<int name="socketTimeout">${socketTimeout:0}</int>
|
||||
<int name="connTimeout">${connTimeout:0}</int>
|
||||
</shardHandlerFactory>
|
||||
|
||||
</solr>
|
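The `${name:default}` placeholders in the solr.xml above take their value from a named property and fall back to the text after the colon. The Python sketch below imitates that substitution to show which values result when no properties are set. The `resolve` function and its `properties` dictionary are hypothetical stand-ins; Solr itself performs this resolution in Java, typically against system properties, and this is not its implementation.

```python
import re

# Matches ${name} or ${name:default}; the default may be empty.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def resolve(value, properties):
    """Replace each placeholder with a property value or its default."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError("no value or default for property %r" % name)

    return _PLACEHOLDER.sub(substitute, value)


# No properties set: the defaults written in solr.xml apply.
default_port = resolve("${jetty.port:8983}", {})
default_host = resolve("${host:}", {})

# A property overrides the default.
custom_port = resolve("${jetty.port:8983}", {"jetty.port": "8080"})

print(default_port, repr(default_host), custom_port)
```

With this reading, `${host:}` resolves to an empty string unless a `host` property is supplied, and `hostPort` falls back to 8983.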
50
java/solr/server/solr/stackdump/README.txt
Normal file
@ -0,0 +1,50 @@
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
|
||||
Example SolrCore Instance Directory
|
||||
=============================
|
||||
|
||||
This directory is provided as an example of what an "Instance Directory"
|
||||
should look like for a SolrCore.
|
||||
|
||||
It's not strictly necessary that you copy all of the files in this
|
||||
directory when setting up a new SolrCore, but it is recommended.
|
||||
|
||||
|
||||
Basic Directory Structure
|
||||
-------------------------
|
||||
|
||||
The Instance Directory typically contains the following sub-directories...
|
||||
|
||||
conf/
|
||||
This directory is mandatory and must contain your solrconfig.xml
|
||||
and schema.xml. Any other optional configuration files would also
|
||||
be kept here.
|
||||
|
||||
data/
|
||||
This directory is the default location where Solr will keep your
|
||||
index, and is used by the replication scripts for dealing with
|
||||
snapshots. You can override this location in the
|
||||
conf/solrconfig.xml. Solr will create this directory if it does not
|
||||
already exist.
|
||||
|
||||
lib/
|
||||
This directory is optional. If it exists, Solr will load any Jars
|
||||
found in this directory and use them to resolve any "plugins"
|
||||
specified in your solrconfig.xml or schema.xml (ie: Analyzers,
|
||||
Request Handlers, etc...). Alternatively you can use the <lib>
|
||||
syntax in conf/solrconfig.xml to direct Solr to your plugins. See
|
||||
the example conf/solrconfig.xml file for details.
|
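The layout rules in this README (a mandatory conf/ directory holding solrconfig.xml and schema.xml, with data/ and lib/ optional) can be checked with a few lines of Python. This is a minimal sketch under that reading of the README; `check_instance_dir` is a hypothetical helper, not something shipped with Solr or Stackdump, and the temporary directory only stands in for a real instance directory.

```python
import tempfile
from pathlib import Path


def check_instance_dir(path):
    """Return a list of problems found in a SolrCore instance directory."""
    root = Path(path)
    problems = []
    conf = root / "conf"
    if not conf.is_dir():
        problems.append("missing mandatory conf/ directory")
    else:
        for name in ("solrconfig.xml", "schema.xml"):
            if not (conf / name).is_file():
                problems.append("missing conf/%s" % name)
    # data/ and lib/ are optional, so their absence is not reported.
    return problems


# Build a throwaway layout that lacks schema.xml and inspect it.
with tempfile.TemporaryDirectory() as tmp:
    core = Path(tmp) / "stackdump"
    (core / "conf").mkdir(parents=True)
    (core / "conf" / "solrconfig.xml").write_text("<config/>")
    problems = check_instance_dir(core)

print(problems)
```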
24
java/solr/server/solr/stackdump/conf/admin-extra.html
Normal file
@ -0,0 +1,24 @@
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
-->
|
||||
|
||||
<!-- The content of this page will be statically included into the top-
|
||||
right box of the cores overview page. Uncomment this as an example to
|
||||
see where the content will show up.
|
||||
|
||||
<img src="img/ico/construction.png"> This line will appear at the top-
|
||||
right box on collection1's Overview
|
||||
-->
|
@ -0,0 +1,25 @@
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
-->
|
||||
|
||||
<!-- admin-extra.menu-bottom.html -->
|
||||
<!--
|
||||
<li>
|
||||
<a href="#" style="background-image: url(img/ico/construction.png);">
|
||||
LAST ITEM
|
||||
</a>
|
||||
</li>
|
||||
-->
|
@ -0,0 +1,25 @@
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
-->
|
||||
|
||||
<!-- admin-extra.menu-top.html -->
|
||||
<!--
|
||||
<li>
|
||||
<a href="#" style="background-image: url(img/ico/construction.png);">
|
||||
FIRST ITEM
|
||||
</a>
|
||||
</li>
|
||||
-->
|
67
java/solr/server/solr/stackdump/conf/currency.xml
Normal file
@ -0,0 +1,67 @@
|
||||
<?xml version="1.0" ?>
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
-->
|
||||
|
||||
<!-- Example exchange rates file for CurrencyField type named "currency" in example schema -->
|
||||
|
||||
<currencyConfig version="1.0">
|
||||
<rates>
|
||||
<!-- Updated from http://www.exchangerate.com/ at 2011-09-27 -->
|
||||
<rate from="USD" to="ARS" rate="4.333871" comment="ARGENTINA Peso" />
|
||||
<rate from="USD" to="AUD" rate="1.025768" comment="AUSTRALIA Dollar" />
|
||||
<rate from="USD" to="EUR" rate="0.743676" comment="European Euro" />
|
||||
<rate from="USD" to="BRL" rate="1.881093" comment="BRAZIL Real" />
|
||||
<rate from="USD" to="CAD" rate="1.030815" comment="CANADA Dollar" />
|
||||
<rate from="USD" to="CLP" rate="519.0996" comment="CHILE Peso" />
|
||||
<rate from="USD" to="CNY" rate="6.387310" comment="CHINA Yuan" />
|
||||
<rate from="USD" to="CZK" rate="18.47134" comment="CZECH REP. Koruna" />
|
||||
<rate from="USD" to="DKK" rate="5.515436" comment="DENMARK Krone" />
|
||||
<rate from="USD" to="HKD" rate="7.801922" comment="HONG KONG Dollar" />
|
||||
<rate from="USD" to="HUF" rate="215.6169" comment="HUNGARY Forint" />
|
||||
<rate from="USD" to="ISK" rate="118.1280" comment="ICELAND Krona" />
|
||||
<rate from="USD" to="INR" rate="49.49088" comment="INDIA Rupee" />
|
||||
<rate from="USD" to="XDR" rate="0.641358" comment="INTNL MON. FUND SDR" />
|
||||
<rate from="USD" to="ILS" rate="3.709739" comment="ISRAEL Sheqel" />
|
||||
<rate from="USD" to="JPY" rate="76.32419" comment="JAPAN Yen" />
|
||||
<rate from="USD" to="KRW" rate="1169.173" comment="KOREA (SOUTH) Won" />
|
||||
<rate from="USD" to="KWD" rate="0.275142" comment="KUWAIT Dinar" />
|
||||
<rate from="USD" to="MXN" rate="13.85895" comment="MEXICO Peso" />
|
||||
<rate from="USD" to="NZD" rate="1.285159" comment="NEW ZEALAND Dollar" />
|
||||
<rate from="USD" to="NOK" rate="5.859035" comment="NORWAY Krone" />
|
||||
<rate from="USD" to="PKR" rate="87.57007" comment="PAKISTAN Rupee" />
|
||||
<rate from="USD" to="PEN" rate="2.730683" comment="PERU Sol" />
|
||||
<rate from="USD" to="PHP" rate="43.62039" comment="PHILIPPINES Peso" />
|
||||
<rate from="USD" to="PLN" rate="3.310139" comment="POLAND Zloty" />
|
||||
<rate from="USD" to="RON" rate="3.100932" comment="ROMANIA Leu" />
|
||||
<rate from="USD" to="RUB" rate="32.14663" comment="RUSSIA Ruble" />
|
||||
<rate from="USD" to="SAR" rate="3.750465" comment="SAUDI ARABIA Riyal" />
|
||||
<rate from="USD" to="SGD" rate="1.299352" comment="SINGAPORE Dollar" />
|
||||
<rate from="USD" to="ZAR" rate="8.329761" comment="SOUTH AFRICA Rand" />
|
||||
<rate from="USD" to="SEK" rate="6.883442" comment="SWEDEN Krona" />
|
||||
<rate from="USD" to="CHF" rate="0.906035" comment="SWITZERLAND Franc" />
|
||||
<rate from="USD" to="TWD" rate="30.40283" comment="TAIWAN Dollar" />
|
||||
<rate from="USD" to="THB" rate="30.89487" comment="THAILAND Baht" />
|
||||
<rate from="USD" to="AED" rate="3.672955" comment="U.A.E. Dirham" />
|
||||
<rate from="USD" to="UAH" rate="7.988582" comment="UKRAINE Hryvnia" />
|
||||
<rate from="USD" to="GBP" rate="0.647910" comment="UNITED KINGDOM Pound" />
|
||||
|
||||
<!-- Cross-rates for some common currencies -->
|
||||
<rate from="EUR" to="GBP" rate="0.869914" />
|
||||
<rate from="EUR" to="NOK" rate="7.800095" />
|
||||
<rate from="GBP" to="NOK" rate="8.966508" />
|
||||
</rates>
|
||||
</currencyConfig>
|
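To make the structure of this rates file concrete, here is a small Python sketch that loads `<rate>` elements and converts an amount between two currencies. It is an illustration only: `load_rates` and `convert` are hypothetical names, the sample embeds just three of the rates listed above, and the fallback to an inverted rate is an assumption of this sketch rather than a statement about how Solr's currency field resolves rates.

```python
import xml.etree.ElementTree as ET

# A reduced copy of the rates above, enough for the demonstration.
SAMPLE = """<currencyConfig version="1.0">
  <rates>
    <rate from="USD" to="EUR" rate="0.743676" comment="European Euro" />
    <rate from="USD" to="GBP" rate="0.647910" comment="UNITED KINGDOM Pound" />
    <rate from="EUR" to="GBP" rate="0.869914" />
  </rates>
</currencyConfig>"""


def load_rates(xml_text):
    """Map (from, to) currency pairs to their exchange rate."""
    rates = {}
    for rate in ET.fromstring(xml_text).iter("rate"):
        rates[(rate.get("from"), rate.get("to"))] = float(rate.get("rate"))
    return rates


def convert(amount, source, target, rates):
    """Convert using a direct rate, or the inverse of the reverse rate."""
    if source == target:
        return amount
    if (source, target) in rates:
        return amount * rates[(source, target)]
    if (target, source) in rates:
        return amount / rates[(target, source)]
    raise KeyError("no rate between %s and %s" % (source, target))


rates = load_rates(SAMPLE)
usd_to_eur = convert(100, "USD", "EUR", rates)
gbp_to_eur = convert(10, "GBP", "EUR", rates)
print(round(usd_to_eur, 2), round(gbp_to_eur, 2))
```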
38
java/solr/server/solr/stackdump/conf/elevate.xml
Normal file
@ -0,0 +1,38 @@
|
||||
<?xml version="1.0" encoding="UTF-8" ?>
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
contributor license agreements. See the NOTICE file distributed with
|
||||
this work for additional information regarding copyright ownership.
|
||||
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
(the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
-->
|
||||
|
||||
<!-- If this file is found in the config directory, it will only be
|
||||
loaded once at startup. If it is found in Solr's data
|
||||
directory, it will be re-loaded every commit.
|
||||
|
||||
See http://wiki.apache.org/solr/QueryElevationComponent for more info
|
||||
|
||||
-->
|
||||
<elevate>
|
||||
<query text="foo bar">
|
||||
<doc id="1" />
|
||||
<doc id="2" />
|
||||
<doc id="3" />
|
||||
</query>
|
||||
|
||||
<query text="ipod">
|
||||
<doc id="MA147LL/A" /> <!-- put the actual ipod at the top -->
|
||||
<doc id="IW-02" exclude="true" /> <!-- exclude this cable -->
|
||||
</query>
|
||||
|
||||
</elevate>
|
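The effect of an elevate.xml entry can be sketched in Python: documents listed for a query are moved to the top in the listed order, and documents marked `exclude="true"` are dropped. The functions `load_elevations` and `apply_elevation` and the result IDs `"F8V7067-APL-KIT"` and `"6H500F0"` are made up for illustration; the real work is done inside Solr's QueryElevationComponent, whose matching rules are more involved than this exact-text lookup.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<elevate>
  <query text="ipod">
    <doc id="MA147LL/A" />
    <doc id="IW-02" exclude="true" />
  </query>
</elevate>"""


def load_elevations(xml_text):
    """Map query text to (elevated ids, excluded ids)."""
    rules = {}
    for query in ET.fromstring(xml_text).findall("query"):
        elevated, excluded = [], []
        for doc in query.findall("doc"):
            target = excluded if doc.get("exclude") == "true" else elevated
            target.append(doc.get("id"))
        rules[query.get("text")] = (elevated, excluded)
    return rules


def apply_elevation(rules, query_text, ranked_ids):
    """Put elevated ids first, then the remaining non-excluded results."""
    elevated, excluded = rules.get(query_text, ([], []))
    rest = [d for d in ranked_ids if d not in elevated and d not in excluded]
    return elevated + rest


rules = load_elevations(SAMPLE)
# Hypothetical relevance-ranked result ids for the query "ipod".
ranked = ["IW-02", "F8V7067-APL-KIT", "MA147LL/A", "6H500F0"]
reordered = apply_elevation(rules, "ipod", ranked)
print(reordered)
```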
@ -0,0 +1,8 @@
|
||||
# Set of Catalan contractions for ElisionFilter
|
||||
# TODO: load this as a resource from the analyzer and sync it in build.xml
|
||||
d
|
||||
l
|
||||
m
|
||||
n
|
||||
s
|
||||
t
|
@ -0,0 +1,15 @@
|
||||
# Set of French contractions for ElisionFilter
|
||||
# TODO: load this as a resource from the analyzer and sync it in build.xml
|
||||
l
|
||||
m
|
||||
t
|
||||
qu
|
||||
n
|
||||
s
|
||||
j
|
||||
d
|
||||
c
|
||||
jusqu
|
||||
quoiqu
|
||||
lorsqu
|
||||
puisqu
|
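These contraction lists feed an elision filter, which strips a leading article and its apostrophe from a token (for example "l'avion" becomes "avion"). The Python sketch below shows the idea using a few entries from the French list. The names `load_articles` and `strip_elision` are hypothetical, and the handling of the two apostrophe characters is a simplification of what the Lucene filter does.

```python
SAMPLE = """# Set of French contractions for ElisionFilter
l
m
t
qu
jusqu
"""


def load_articles(text):
    """Read one contraction per line, ignoring blanks and '#' comments."""
    articles = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            articles.add(line.lower())
    return articles


def strip_elision(token, articles):
    """Drop a leading contraction such as l' or qu' from the token."""
    for apostrophe in ("'", "\u2019"):
        head, sep, tail = token.partition(apostrophe)
        if sep and tail and head.lower() in articles:
            return tail
    return token


articles = load_articles(SAMPLE)
tokens = ["l'avion", "Qu'il", "aujourd'hui", "jusqu\u2019ici", "maison"]
stripped = [strip_elision(t, articles) for t in tokens]
print(stripped)
```

Note that "aujourd'hui" is left untouched because "aujourd" is not in the list.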
@ -0,0 +1,5 @@
|
||||
# Set of Irish contractions for ElisionFilter
|
||||
# TODO: load this as a resource from the analyzer and sync it in build.xml
|
||||
d
|
||||
m
|
||||
b
|
@ -0,0 +1,23 @@
|
||||
# Set of Italian contractions for ElisionFilter
|
||||
# TODO: load this as a resource from the analyzer and sync it in build.xml
|
||||
c
|
||||
l
|
||||
all
|
||||
dall
|
||||
dell
|
||||
nell
|
||||
sull
|
||||
coll
|
||||
pell
|
||||
gl
|
||||
agl
|
||||
dagl
|
||||
degl
|
||||
negl
|
||||
sugl
|
||||
un
|
||||
m
|
||||
t
|
||||
s
|
||||
v
|
||||
d
|
@ -0,0 +1,5 @@
|
||||
# Set of Irish hyphenations for StopFilter
|
||||
# TODO: load this as a resource from the analyzer and sync it in build.xml
|
||||
h
|
||||
n
|
||||
t
|
@ -0,0 +1,6 @@
|
||||
# Set of overrides for the dutch stemmer
|
||||
# TODO: load this as a resource from the analyzer and sync it in build.xml
|
||||
fiets fiets
|
||||
bromfiets bromfiets
|
||||
ei eier
|
||||
kind kinder
|
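A stemmer override dictionary like the one above maps a word to the stem that should be used instead of whatever the stemmer would produce. The following Python sketch shows that lookup order. Everything here is illustrative: `load_overrides`, `stem`, and `naive_stemmer` are hypothetical names, the fallback stemmer is a toy that only strips a trailing "en", and the parser assumes each entry is a word and a stem separated by whitespace.

```python
SAMPLE = """# Set of overrides for the dutch stemmer
fiets fiets
bromfiets bromfiets
ei eier
kind kinder
"""


def load_overrides(text):
    """Parse 'word stem' pairs, skipping blanks and '#' comments."""
    overrides = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        word, stem_value = line.split()
        overrides[word] = stem_value
    return overrides


def naive_stemmer(token):
    """Toy stand-in for a real Dutch stemmer (assumption of this sketch)."""
    return token[:-2] if token.endswith("en") else token


def stem(token, overrides, fallback=naive_stemmer):
    """Use the override when present, otherwise defer to the stemmer."""
    return overrides[token] if token in overrides else fallback(token)


overrides = load_overrides(SAMPLE)
stems = [stem(t, overrides) for t in ["fiets", "kind", "ei", "lopen"]]
print(stems)
```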
420
java/solr/server/solr/stackdump/conf/lang/stoptags_ja.txt
Normal file
@ -0,0 +1,420 @@
|
||||
#
|
||||
# This file defines a Japanese stoptag set for JapanesePartOfSpeechStopFilter.
|
||||
#
|
||||
# Any token with a part-of-speech tag that exactly matches those defined in this
|
||||
# file is removed from the token stream.
|
||||
#
|
||||
# Set your own stoptags by uncommenting the lines below. Note that comments are
|
||||
# not allowed on the same line as a stoptag. See LUCENE-3745 for frequency lists,
|
||||
# etc. that can be useful for building your own stoptag set.
|
||||
#
|
||||
# The entire possible tagset is provided below for convenience.
|
||||
#
|
||||
#####
|
||||
# noun: unclassified nouns
|
||||
#名詞
|
||||
#
|
||||
# noun-common: Common nouns or nouns where the sub-classification is undefined
|
||||
#名詞-一般
|
||||
#
|
||||
# noun-proper: Proper nouns where the sub-classification is undefined
|
||||
#名詞-固有名詞
|
||||
#
|
||||
# noun-proper-misc: miscellaneous proper nouns
|
||||
#名詞-固有名詞-一般
|
||||
#
|
||||
# noun-proper-person: Personal names where the sub-classification is undefined
|
||||
#名詞-固有名詞-人名
|
||||
#
|
||||
# noun-proper-person-misc: names that cannot be divided into surname and
|
||||
# given name; foreign names; names where the surname or given name is unknown.
|
||||
# e.g. お市の方
|
||||
#名詞-固有名詞-人名-一般
|
||||
#
|
||||
# noun-proper-person-surname: Mainly Japanese surnames.
|
||||
# e.g. 山田
|
||||
#名詞-固有名詞-人名-姓
|
||||
#
|
||||
# noun-proper-person-given_name: Mainly Japanese given names.
|
||||
# e.g. 太郎
|
||||
#名詞-固有名詞-人名-名
|
||||
#
|
||||
# noun-proper-organization: Names representing organizations.
|
||||
# e.g. 通産省, NHK
|
||||
#名詞-固有名詞-組織
|
||||
#
|
||||
# noun-proper-place: Place names where the sub-classification is undefined
|
||||
#名詞-固有名詞-地域
|
||||
#
|
||||
# noun-proper-place-misc: Place names excluding countries.
|
||||
# e.g. アジア, バルセロナ, 京都
|
||||
#名詞-固有名詞-地域-一般
|
||||
#
|
||||
# noun-proper-place-country: Country names.
|
||||
# e.g. 日本, オーストラリア
|
||||
#名詞-固有名詞-地域-国
|
||||
#
|
||||
# noun-pronoun: Pronouns where the sub-classification is undefined
|
||||
#名詞-代名詞
|
||||
#
|
||||
# noun-pronoun-misc: miscellaneous pronouns:
|
||||
# e.g. それ, ここ, あいつ, あなた, あちこち, いくつ, どこか, なに, みなさん, みんな, わたくし, われわれ
|
||||
#名詞-代名詞-一般
|
||||
#
|
||||
# noun-pronoun-contraction: Spoken language contraction made by combining a
|
||||
# pronoun and the particle 'wa'.
|
||||
# e.g. ありゃ, こりゃ, こりゃあ, そりゃ, そりゃあ
|
||||
#名詞-代名詞-縮約
|
||||
#
|
||||
# noun-adverbial: Temporal nouns such as names of days or months that behave
|
||||
# like adverbs. Nouns that represent amount or ratios and can be used adverbially,
|
||||
# e.g. 金曜, 一月, 午後, 少量
|
||||
#名詞-副詞可能
|
||||
#
|
||||
# noun-verbal: Nouns that take arguments with case and can appear followed by
|
||||
# 'suru' and related verbs (する, できる, なさる, くださる)
|
||||
# e.g. インプット, 愛着, 悪化, 悪戦苦闘, 一安心, 下取り
|
||||
#名詞-サ変接続
|
||||
#
|
||||
# noun-adjective-base: The base form of adjectives, words that appear before な ("na")
|
||||
# e.g. 健康, 安易, 駄目, だめ
|
||||
#名詞-形容動詞語幹
|
||||
#
|
||||
# noun-numeric: Arabic numbers, Chinese numerals, and counters like 何 (回), 数.
|
||||
# e.g. 0, 1, 2, 何, 数, 幾
|
||||
#名詞-数
|
||||
#
|
||||
# noun-affix: noun affixes where the sub-classification is undefined
|
||||
#名詞-非自立
|
||||
#
|
||||
# noun-affix-misc: Of adnominalizers, the case-marker の ("no"), and words that
|
||||
# attach to the base form of inflectional words, words that cannot be classified
|
||||
# into any of the other categories below. This category includes indefinite nouns.
|
||||
# e.g. あかつき, 暁, かい, 甲斐, 気, きらい, 嫌い, くせ, 癖, こと, 事, ごと, 毎, しだい, 次第,
|
||||
# 順, せい, 所為, ついで, 序で, つもり, 積もり, 点, どころ, の, はず, 筈, はずみ, 弾み,
|
||||
# 拍子, ふう, ふり, 振り, ほう, 方, 旨, もの, 物, 者, ゆえ, 故, ゆえん, 所以, わけ, 訳,
|
||||
# わり, 割り, 割, ん-口語/, もん-口語/
|
||||
#名詞-非自立-一般
|
||||
#
|
||||
# noun-affix-adverbial: noun affixes that can behave as adverbs.
|
||||
# e.g. あいだ, 間, あげく, 挙げ句, あと, 後, 余り, 以外, 以降, 以後, 以上, 以前, 一方, うえ,
|
||||
# 上, うち, 内, おり, 折り, かぎり, 限り, きり, っきり, 結果, ころ, 頃, さい, 際, 最中, さなか,
|
||||
# 最中, じたい, 自体, たび, 度, ため, 為, つど, 都度, とおり, 通り, とき, 時, ところ, 所,
|
||||
# とたん, 途端, なか, 中, のち, 後, ばあい, 場合, 日, ぶん, 分, ほか, 他, まえ, 前, まま,
|
||||
# 儘, 侭, みぎり, 矢先
|
||||
#名詞-非自立-副詞可能
|
||||
#
|
||||
# noun-affix-aux: noun affixes treated as 助動詞 ("auxiliary verb") in school grammars
|
||||
# with the stem よう(だ) ("you(da)").
|
||||
# e.g. よう, やう, 様 (よう)
|
||||
#名詞-非自立-助動詞語幹
|
||||
#
|
||||
# noun-affix-adjective-base: noun affixes that can connect to the indeclinable
|
||||
# connection form な (aux "da").
|
||||
# e.g. みたい, ふう
|
||||
#名詞-非自立-形容動詞語幹
|
||||
#
|
||||
# noun-special: special nouns where the sub-classification is undefined.
|
||||
#名詞-特殊
|
||||
#
|
||||
# noun-special-aux: The そうだ ("souda") stem form that is used for reporting news, is
|
||||
# treated as 助動詞 ("auxiliary verb") in school grammars, and attaches to the base
|
||||
# form of inflectional words.
|
||||
# e.g. そう
|
||||
#名詞-特殊-助動詞語幹
|
||||
#
|
||||
# noun-suffix: noun suffixes where the sub-classification is undefined.
|
||||
#名詞-接尾
|
||||
#
|
||||
# noun-suffix-misc: Of the nouns or stem forms of other parts of speech that connect
|
||||
# to ガル or タイ and can combine into compound nouns, words that cannot be classified into
|
||||
# any of the other categories below. In general, this category is more inclusive than
|
||||
# 接尾語 ("suffix") and is usually the last element in a compound noun.
|
||||
# e.g. おき, かた, 方, 甲斐 (がい), がかり, ぎみ, 気味, ぐるみ, (~した) さ, 次第, 済 (ず) み,
|
||||
# よう, (でき)っこ, 感, 観, 性, 学, 類, 面, 用
|
||||
#名詞-接尾-一般
|
||||
#
|
||||
# noun-suffix-person: Suffixes that form nouns and attach to person names more often
|
||||
# than other nouns.
|
||||
# e.g. 君, 様, 著
|
||||
#名詞-接尾-人名
|
||||
#
|
||||
# noun-suffix-place: Suffixes that form nouns and attach to place names more often
|
||||
# than other nouns.
|
||||
# e.g. 町, 市, 県
|
||||
#名詞-接尾-地域
|
||||
#
|
||||
# noun-suffix-verbal: Of the suffixes that attach to nouns and form nouns, those that
|
||||
# can appear before スル ("suru").
|
||||
# e.g. 化, 視, 分け, 入り, 落ち, 買い
|
||||
#名詞-接尾-サ変接続
|
||||
#
|
||||
# noun-suffix-aux: The stem form of そうだ (様態) that is used to indicate conditions,
|
||||
# is treated as 助動詞 ("auxiliary verb") in school grammars, and attaches to the
|
||||
# conjunctive form of inflectional words.
|
||||
# e.g. そう
|
||||
#名詞-接尾-助動詞語幹
|
||||
#
|
||||
# noun-suffix-adjective-base: Suffixes that attach to other nouns or the conjunctive
|
||||
# form of inflectional words and appear before the copula だ ("da").
|
||||
# e.g. 的, げ, がち
|
||||
#名詞-接尾-形容動詞語幹
|
||||
#
|
||||
# noun-suffix-adverbial: Suffixes that attach to other nouns and can behave as adverbs.
|
||||
# e.g. 後 (ご), 以後, 以降, 以前, 前後, 中, 末, 上, 時 (じ)
|
||||
#名詞-接尾-副詞可能
|
||||
#
|
||||
# noun-suffix-classifier: Suffixes that attach to numbers and form nouns. This category
|
||||
# is more inclusive than 助数詞 ("classifier") and includes common nouns that attach
|
||||
# to numbers.
|
||||
# e.g. 個, つ, 本, 冊, パーセント, cm, kg, カ月, か国, 区画, 時間, 時半
|
||||
#名詞-接尾-助数詞
|
||||
#
|
||||
# noun-suffix-special: Special suffixes that mainly attach to inflecting words.
|
||||
# e.g. (楽し) さ, (考え) 方
|
||||
#名詞-接尾-特殊
|
||||
#
|
||||
# noun-suffix-conjunctive: Nouns that behave like conjunctions and join two words
|
||||
# together.
|
||||
# e.g. (日本) 対 (アメリカ), 対 (アメリカ), (3) 対 (5), (女優) 兼 (主婦)
|
||||
#名詞-接続詞的
|
||||
#
|
||||
# noun-verbal_aux: Nouns that attach to the conjunctive particle て ("te") and are
|
||||
# semantically verb-like.
|
||||
# e.g. ごらん, ご覧, 御覧, 頂戴
|
||||
#名詞-動詞非自立的
|
||||
#
|
||||
# noun-quotation: text that cannot be segmented into words, proverbs, Chinese poetry,
|
||||
# dialects, English, etc. Currently, the only entry for 名詞 引用文字列 ("noun quotation")
|
||||
# is いわく ("iwaku").
|
||||
#名詞-引用文字列
|
||||
#
|
||||
# noun-nai_adjective: Words that appear before the auxiliary verb ない ("nai") and
|
||||
# behave like an adjective.
|
||||
# e.g. 申し訳, 仕方, とんでも, 違い
|
||||
#名詞-ナイ形容詞語幹
|
||||
#
|
||||
#####
|
||||
# prefix: unclassified prefixes
|
||||
#接頭詞
|
||||
#
|
||||
# prefix-nominal: Prefixes that attach to nouns (including adjective stem forms)
|
||||
# excluding numerical expressions.
|
||||
# e.g. お (水), 某 (氏), 同 (社), 故 (~氏), 高 (品質), お (見事), ご (立派)
|
||||
#接頭詞-名詞接続
|
||||
#
|
||||
# prefix-verbal: Prefixes that attach to the imperative form of a verb or a verb
|
||||
# in conjunctive form followed by なる/なさる/くださる.
|
||||
# e.g. お (読みなさい), お (座り)
|
||||
#接頭詞-動詞接続
|
||||
#
|
||||
# prefix-adjectival: Prefixes that attach to adjectives.
|
||||
# e.g. お (寒いですねえ), バカ (でかい)
|
||||
#接頭詞-形容詞接続
|
||||
#
|
||||
# prefix-numerical: Prefixes that attach to numerical expressions.
|
||||
# e.g. 約, およそ, 毎時
|
||||
#接頭詞-数接続
|
||||
#
|
||||
#####
|
||||
# verb: unclassified verbs
|
||||
#動詞
|
||||
#
|
||||
# verb-main:
|
||||
#動詞-自立
|
||||
#
|
||||
# verb-auxiliary:
|
||||
#動詞-非自立
|
||||
#
|
||||
# verb-suffix:
|
||||
#動詞-接尾
|
||||
#
|
||||
#####
|
||||
# adjective: unclassified adjectives
|
||||
#形容詞
|
||||
#
|
||||
# adjective-main:
|
||||
#形容詞-自立
|
||||
#
|
||||
# adjective-auxiliary:
|
||||
#形容詞-非自立
|
||||
#
|
||||
# adjective-suffix:
|
||||
#形容詞-接尾
|
||||
#
|
||||
#####
|
||||
# adverb: unclassified adverbs
|
||||
#副詞
|
||||
#
|
||||
# adverb-misc: Words that can be segmented into one unit and where adnominal
|
||||
# modification is not possible.
|
||||
# e.g. あいかわらず, 多分
|
||||
#副詞-一般
|
||||
#
|
||||
# adverb-particle_conjunction: Adverbs that can be followed by の, は, に,
|
||||
# な, する, だ, etc.
|
||||
# e.g. こんなに, そんなに, あんなに, なにか, なんでも
|
||||
#副詞-助詞類接続
|
||||
#
|
||||
#####
|
||||
# adnominal: Words that only have noun-modifying forms.
|
||||
# e.g. この, その, あの, どの, いわゆる, なんらかの, 何らかの, いろんな, こういう, そういう, ああいう,
|
||||
# どういう, こんな, そんな, あんな, どんな, 大きな, 小さな, おかしな, ほんの, たいした,
|
||||
# 「(, も) さる (ことながら)」, 微々たる, 堂々たる, 単なる, いかなる, 我が」「同じ, 亡き
|
||||
#連体詞
|
||||
#
|
||||
#####
|
||||
# conjunction: Conjunctions that can occur independently.
|
||||
# e.g. が, けれども, そして, じゃあ, それどころか
|
||||
接続詞
|
||||
#
|
||||
#####
|
||||
# particle: unclassified particles.
|
||||
助詞
|
||||
#
|
||||
# particle-case: case particles where the subclassification is undefined.
|
||||
助詞-格助詞
|
||||
#
|
||||
# particle-case-misc: Case particles.
|
||||
# e.g. から, が, で, と, に, へ, より, を, の, にて
|
||||
助詞-格助詞-一般
|
||||
#
|
||||
# particle-case-quote: the "to" that appears after nouns, a person’s speech,
|
||||
# quotation marks, expressions of decisions from a meeting, reasons, judgements,
|
||||
# conjectures, etc.
|
||||
# e.g. ( だ) と (述べた.), ( である) と (して執行猶予...)
|
||||
助詞-格助詞-引用
|
||||
#
|
||||
# particle-case-compound: Compounds of particles and verbs that mainly behave
|
||||
# like case particles.
|
||||
# e.g. という, といった, とかいう, として, とともに, と共に, でもって, にあたって, に当たって, に当って,
|
||||
# にあたり, に当たり, に当り, に当たる, にあたる, において, に於いて,に於て, における, に於ける,
|
||||
# にかけ, にかけて, にかんし, に関し, にかんして, に関して, にかんする, に関する, に際し,
|
||||
# に際して, にしたがい, に従い, に従う, にしたがって, に従って, にたいし, に対し, にたいして,
|
||||
# に対して, にたいする, に対する, について, につき, につけ, につけて, につれ, につれて, にとって,
|
||||
# にとり, にまつわる, によって, に依って, に因って, により, に依り, に因り, による, に依る, に因る,
|
||||
# にわたって, にわたる, をもって, を以って, を通じ, を通じて, を通して, をめぐって, をめぐり, をめぐる,
|
||||
# って-口語/, ちゅう-関西弁「という」/, (何) ていう (人)-口語/, っていう-口語/, といふ, とかいふ
|
||||
助詞-格助詞-連語
|
||||
#
|
||||
# particle-conjunctive:
|
||||
# e.g. から, からには, が, けれど, けれども, けど, し, つつ, て, で, と, ところが, どころか, とも, ども,
|
||||
# ながら, なり, ので, のに, ば, ものの, や ( した), やいなや, (ころん) じゃ(いけない)-口語/,
|
||||
# (行っ) ちゃ(いけない)-口語/, (言っ) たって (しかたがない)-口語/, (それがなく)ったって (平気)-口語/
|
||||
助詞-接続助詞
|
||||
#
|
||||
# particle-dependency:
|
||||
# e.g. こそ, さえ, しか, すら, は, も, ぞ
|
||||
助詞-係助詞
|
||||
#
|
||||
# particle-adverbial:
|
||||
# e.g. がてら, かも, くらい, 位, ぐらい, しも, (学校) じゃ(これが流行っている)-口語/,
|
||||
# (それ)じゃあ (よくない)-口語/, ずつ, (私) なぞ, など, (私) なり (に), (先生) なんか (大嫌い)-口語/,
|
||||
# (私) なんぞ, (先生) なんて (大嫌い)-口語/, のみ, だけ, (私) だって-口語/, だに,
|
||||
# (彼)ったら-口語/, (お茶) でも (いかが), 等 (とう), (今後) とも, ばかり, ばっか-口語/, ばっかり-口語/,
|
||||
# ほど, 程, まで, 迄, (誰) も (が)([助詞-格助詞] および [助詞-係助詞] の前に位置する「も」)
|
||||
助詞-副助詞
|
||||
#
|
||||
# particle-interjective: particles with interjective grammatical roles.
|
||||
# e.g. (松島) や
|
||||
助詞-間投助詞
|
||||
#
|
||||
# particle-coordinate:
|
||||
# e.g. と, たり, だの, だり, とか, なり, や, やら
|
||||
助詞-並立助詞
|
||||
#
|
||||
# particle-final:
|
||||
# e.g. かい, かしら, さ, ぜ, (だ)っけ-口語/, (とまってる) で-方言/, な, ナ, なあ-口語/, ぞ, ね, ネ,
|
||||
# ねぇ-口語/, ねえ-口語/, ねん-方言/, の, のう-口語/, や, よ, ヨ, よぉ-口語/, わ, わい-口語/
|
||||
助詞-終助詞
|
||||
#
|
||||
# particle-adverbial/conjunctive/final: The particle "ka" when unknown whether it is
|
||||
# adverbial, conjunctive, or sentence final. For example:
|
||||
# (a) 「A か B か」. Ex:「(国内で運用する) か,(海外で運用する) か (.)」
|
||||
# (b) Inside an adverb phrase. Ex:「(幸いという) か (, 死者はいなかった.)」
|
||||
# 「(祈りが届いたせい) か (, 試験に合格した.)」
|
||||
# (c) 「かのように」. Ex:「(何もなかった) か (のように振る舞った.)」
|
||||
# e.g. か
|
||||
助詞-副助詞/並立助詞/終助詞
|
||||
#
|
||||
# particle-adnominalizer: The "no" that attaches to nouns and modifies
|
||||
# non-inflectional words.
|
||||
助詞-連体化
|
||||
#
|
||||
# particle-adverbializer: The "ni" and "to" that appear following nouns and adverbs
|
||||
# that are giongo, giseigo, or gitaigo.
|
||||
# e.g. に, と
|
||||
助詞-副詞化
|
||||
#
|
||||
# particle-special: A particle that does not fit into one of the above classifications.
|
||||
# This includes particles that are used in Tanka, Haiku, and other poetry.
|
||||
# e.g. かな, けむ, ( しただろう) に, (あんた) にゃ(わからん), (俺) ん (家)
|
||||
助詞-特殊
|
||||
#
|
||||
#####
|
||||
# auxiliary-verb:
|
||||
助動詞
|
||||
#
|
||||
#####
|
||||
# interjection: Greetings and other exclamations.
|
||||
# e.g. おはよう, おはようございます, こんにちは, こんばんは, ありがとう, どうもありがとう, ありがとうございます,
|
||||
# いただきます, ごちそうさま, さよなら, さようなら, はい, いいえ, ごめん, ごめんなさい
|
||||
#感動詞
|
||||
#
|
||||
#####
|
||||
# symbol: unclassified Symbols.
|
||||
記号
|
||||
#
|
||||
# symbol-misc: A general symbol not in one of the categories below.
|
||||
# e.g. [○◎@$〒→+]
|
||||
記号-一般
|
||||
#
|
||||
# symbol-comma: Commas
|
||||
# e.g. [,、]
|
||||
記号-読点
|
||||
#
|
||||
# symbol-period: Periods and full stops.
|
||||
# e.g. [..。]
|
||||
記号-句点
|
||||
#
|
||||
# symbol-space: Full-width whitespace.
|
||||
記号-空白
|
||||
#
|
||||
# symbol-open_bracket:
|
||||
# e.g. [({‘“『【]
|
||||
記号-括弧開
|
||||
#
|
||||
# symbol-close_bracket:
|
||||
# e.g. [)}’”』」】]
|
||||
記号-括弧閉
|
||||
#
|
||||
# symbol-alphabetic:
|
||||
#記号-アルファベット
|
||||
#
|
||||
#####
|
||||
# other: unclassified other
|
||||
#その他
|
||||
#
|
||||
# other-interjection: Words that are hard to classify as noun-suffixes or
|
||||
# sentence-final particles.
|
||||
# e.g. (だ)ァ
|
||||
その他-間投
|
||||
#
|
||||
#####
|
||||
# filler: Aizuchi that occurs during a conversation or sounds inserted as filler.
|
||||
# e.g. あの, うんと, えと
|
||||
フィラー
|
||||
#
|
||||
#####
|
||||
# non-verbal: non-verbal sound.
|
||||
非言語音
|
||||
#
|
||||
#####
|
||||
# fragment:
|
||||
#語断片
|
||||
#
|
||||
#####
|
||||
# unknown: unknown part of speech.
|
||||
#未知語
|
||||
#
|
||||
##### End of file
|
125
java/solr/server/solr/stackdump/conf/lang/stopwords_ar.txt
Normal file
@ -0,0 +1,125 @@
|
||||
# This file was created by Jacques Savoy and is distributed under the BSD license.
|
||||
# See http://members.unine.ch/jacques.savoy/clef/index.html.
|
||||
# Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
# Cleaned on October 11, 2009 (not normalized, so use before normalization)
|
||||
# This means that when modifying this list, you might need to add some
|
||||
# redundant entries, for example containing forms with both أ and ا
|
||||
من
|
||||
ومن
|
||||
منها
|
||||
منه
|
||||
في
|
||||
وفي
|
||||
فيها
|
||||
فيه
|
||||
و
|
||||
ف
|
||||
ثم
|
||||
او
|
||||
أو
|
||||
ب
|
||||
بها
|
||||
به
|
||||
ا
|
||||
أ
|
||||
اى
|
||||
اي
|
||||
أي
|
||||
أى
|
||||
لا
|
||||
ولا
|
||||
الا
|
||||
ألا
|
||||
إلا
|
||||
لكن
|
||||
ما
|
||||
وما
|
||||
كما
|
||||
فما
|
||||
عن
|
||||
مع
|
||||
اذا
|
||||
إذا
|
||||
ان
|
||||
أن
|
||||
إن
|
||||
انها
|
||||
أنها
|
||||
إنها
|
||||
انه
|
||||
أنه
|
||||
إنه
|
||||
بان
|
||||
بأن
|
||||
فان
|
||||
فأن
|
||||
وان
|
||||
وأن
|
||||
وإن
|
||||
التى
|
||||
التي
|
||||
الذى
|
||||
الذي
|
||||
الذين
|
||||
الى
|
||||
الي
|
||||
إلى
|
||||
إلي
|
||||
على
|
||||
عليها
|
||||
عليه
|
||||
اما
|
||||
أما
|
||||
إما
|
||||
ايضا
|
||||
أيضا
|
||||
كل
|
||||
وكل
|
||||
لم
|
||||
ولم
|
||||
لن
|
||||
ولن
|
||||
هى
|
||||
هي
|
||||
هو
|
||||
وهى
|
||||
وهي
|
||||
وهو
|
||||
فهى
|
||||
فهي
|
||||
فهو
|
||||
انت
|
||||
أنت
|
||||
لك
|
||||
لها
|
||||
له
|
||||
هذه
|
||||
هذا
|
||||
تلك
|
||||
ذلك
|
||||
هناك
|
||||
كانت
|
||||
كان
|
||||
يكون
|
||||
تكون
|
||||
وكانت
|
||||
وكان
|
||||
غير
|
||||
بعض
|
||||
قد
|
||||
نحو
|
||||
بين
|
||||
بينما
|
||||
منذ
|
||||
ضمن
|
||||
حيث
|
||||
الان
|
||||
الآن
|
||||
خلال
|
||||
بعد
|
||||
قبل
|
||||
حتى
|
||||
عند
|
||||
عندما
|
||||
لدى
|
||||
جميع
|
193
java/solr/server/solr/stackdump/conf/lang/stopwords_bg.txt
Normal file
@ -0,0 +1,193 @@
|
||||
# This file was created by Jacques Savoy and is distributed under the BSD license.
|
||||
# See http://members.unine.ch/jacques.savoy/clef/index.html.
|
||||
# Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
а
|
||||
аз
|
||||
ако
|
||||
ала
|
||||
бе
|
||||
без
|
||||
беше
|
||||
би
|
||||
бил
|
||||
била
|
||||
били
|
||||
било
|
||||
близо
|
||||
бъдат
|
||||
бъде
|
||||
бяха
|
||||
в
|
||||
вас
|
||||
ваш
|
||||
ваша
|
||||
вероятно
|
||||
вече
|
||||
взема
|
||||
ви
|
||||
вие
|
||||
винаги
|
||||
все
|
||||
всеки
|
||||
всички
|
||||
всичко
|
||||
всяка
|
||||
във
|
||||
въпреки
|
||||
върху
|
||||
г
|
||||
ги
|
||||
главно
|
||||
го
|
||||
д
|
||||
да
|
||||
дали
|
||||
до
|
||||
докато
|
||||
докога
|
||||
дори
|
||||
досега
|
||||
доста
|
||||
е
|
||||
едва
|
||||
един
|
||||
ето
|
||||
за
|
||||
зад
|
||||
заедно
|
||||
заради
|
||||
засега
|
||||
затова
|
||||
защо
|
||||
защото
|
||||
и
|
||||
из
|
||||
или
|
||||
им
|
||||
има
|
||||
имат
|
||||
иска
|
||||
й
|
||||
каза
|
||||
как
|
||||
каква
|
||||
какво
|
||||
както
|
||||
какъв
|
||||
като
|
||||
кога
|
||||
когато
|
||||
което
|
||||
които
|
||||
кой
|
||||
който
|
||||
колко
|
||||
която
|
||||
къде
|
||||
където
|
||||
към
|
||||
ли
|
||||
м
|
||||
ме
|
||||
между
|
||||
мен
|
||||
ми
|
||||
мнозина
|
||||
мога
|
||||
могат
|
||||
може
|
||||
моля
|
||||
момента
|
||||
му
|
||||
н
|
||||
на
|
||||
над
|
||||
назад
|
||||
най
|
||||
направи
|
||||
напред
|
||||
например
|
||||
нас
|
||||
не
|
||||
него
|
||||
нея
|
||||
ни
|
||||
ние
|
||||
никой
|
||||
нито
|
||||
но
|
||||
някои
|
||||
някой
|
||||
няма
|
||||
обаче
|
||||
около
|
||||
освен
|
||||
особено
|
||||
от
|
||||
отгоре
|
||||
отново
|
||||
още
|
||||
пак
|
||||
по
|
||||
повече
|
||||
повечето
|
||||
под
|
||||
поне
|
||||
поради
|
||||
после
|
||||
почти
|
||||
прави
|
||||
пред
|
||||
преди
|
||||
през
|
||||
при
|
||||
пък
|
||||
първо
|
||||
с
|
||||
са
|
||||
само
|
||||
се
|
||||
сега
|
||||
си
|
||||
скоро
|
||||
след
|
||||
сме
|
||||
според
|
||||
сред
|
||||
срещу
|
||||
сте
|
||||
съм
|
||||
със
|
||||
също
|
||||
т
|
||||
тази
|
||||
така
|
||||
такива
|
||||
такъв
|
||||
там
|
||||
твой
|
||||
те
|
||||
тези
|
||||
ти
|
||||
тн
|
||||
то
|
||||
това
|
||||
тогава
|
||||
този
|
||||
той
|
||||
толкова
|
||||
точно
|
||||
трябва
|
||||
тук
|
||||
тъй
|
||||
тя
|
||||
тях
|
||||
у
|
||||
харесва
|
||||
ч
|
||||
че
|
||||
често
|
||||
чрез
|
||||
ще
|
||||
щом
|
||||
я
|
220
java/solr/server/solr/stackdump/conf/lang/stopwords_ca.txt
Normal file
@ -0,0 +1,220 @@
|
||||
# Catalan stopwords from http://github.com/vcl/cue.language (Apache 2 Licensed)
|
||||
a
|
||||
abans
|
||||
ací
|
||||
ah
|
||||
així
|
||||
això
|
||||
al
|
||||
als
|
||||
aleshores
|
||||
algun
|
||||
alguna
|
||||
algunes
|
||||
alguns
|
||||
alhora
|
||||
allà
|
||||
allí
|
||||
allò
|
||||
altra
|
||||
altre
|
||||
altres
|
||||
amb
|
||||
ambdós
|
||||
ambdues
|
||||
apa
|
||||
aquell
|
||||
aquella
|
||||
aquelles
|
||||
aquells
|
||||
aquest
|
||||
aquesta
|
||||
aquestes
|
||||
aquests
|
||||
aquí
|
||||
baix
|
||||
cada
|
||||
cadascú
|
||||
cadascuna
|
||||
cadascunes
|
||||
cadascuns
|
||||
com
|
||||
contra
|
||||
d'un
|
||||
d'una
|
||||
d'unes
|
||||
d'uns
|
||||
dalt
|
||||
de
|
||||
del
|
||||
dels
|
||||
des
|
||||
després
|
||||
dins
|
||||
dintre
|
||||
donat
|
||||
doncs
|
||||
durant
|
||||
e
|
||||
eh
|
||||
el
|
||||
els
|
||||
em
|
||||
en
|
||||
encara
|
||||
ens
|
||||
entre
|
||||
érem
|
||||
eren
|
||||
éreu
|
||||
es
|
||||
és
|
||||
esta
|
||||
està
|
||||
estàvem
|
||||
estaven
|
||||
estàveu
|
||||
esteu
|
||||
et
|
||||
etc
|
||||
ets
|
||||
fins
|
||||
fora
|
||||
gairebé
|
||||
ha
|
||||
han
|
||||
has
|
||||
havia
|
||||
he
|
||||
hem
|
||||
heu
|
||||
hi
|
||||
ho
|
||||
i
|
||||
igual
|
||||
iguals
|
||||
ja
|
||||
l'hi
|
||||
la
|
||||
les
|
||||
li
|
||||
li'n
|
||||
llavors
|
||||
m'he
|
||||
ma
|
||||
mal
|
||||
malgrat
|
||||
mateix
|
||||
mateixa
|
||||
mateixes
|
||||
mateixos
|
||||
me
|
||||
mentre
|
||||
més
|
||||
meu
|
||||
meus
|
||||
meva
|
||||
meves
|
||||
molt
|
||||
molta
|
||||
moltes
|
||||
molts
|
||||
mon
|
||||
mons
|
||||
n'he
|
||||
n'hi
|
||||
ne
|
||||
ni
|
||||
no
|
||||
nogensmenys
|
||||
només
|
||||
nosaltres
|
||||
nostra
|
||||
nostre
|
||||
nostres
|
||||
o
|
||||
oh
|
||||
oi
|
||||
on
|
||||
pas
|
||||
pel
|
||||
pels
|
||||
per
|
||||
però
|
||||
perquè
|
||||
poc
|
||||
poca
|
||||
pocs
|
||||
poques
|
||||
potser
|
||||
propi
|
||||
qual
|
||||
quals
|
||||
quan
|
||||
quant
|
||||
que
|
||||
què
|
||||
quelcom
|
||||
qui
|
||||
quin
|
||||
quina
|
||||
quines
|
||||
quins
|
||||
s'ha
|
||||
s'han
|
||||
sa
|
||||
semblant
|
||||
semblants
|
||||
ses
|
||||
seu
|
||||
seus
|
||||
seva
|
||||
seva
|
||||
seves
|
||||
si
|
||||
sobre
|
||||
sobretot
|
||||
sóc
|
||||
solament
|
||||
sols
|
||||
son
|
||||
són
|
||||
sons
|
||||
sota
|
||||
sou
|
||||
t'ha
|
||||
t'han
|
||||
t'he
|
||||
ta
|
||||
tal
|
||||
també
|
||||
tampoc
|
||||
tan
|
||||
tant
|
||||
tanta
|
||||
tantes
|
||||
teu
|
||||
teus
|
||||
teva
|
||||
teves
|
||||
ton
|
||||
tons
|
||||
tot
|
||||
tota
|
||||
totes
|
||||
tots
|
||||
un
|
||||
una
|
||||
unes
|
||||
uns
|
||||
us
|
||||
va
|
||||
vaig
|
||||
vam
|
||||
van
|
||||
vas
|
||||
veu
|
||||
vosaltres
|
||||
vostra
|
||||
vostre
|
||||
vostres
|
172
java/solr/server/solr/stackdump/conf/lang/stopwords_cz.txt
Normal file
@ -0,0 +1,172 @@
|
||||
a
|
||||
s
|
||||
k
|
||||
o
|
||||
i
|
||||
u
|
||||
v
|
||||
z
|
||||
dnes
|
||||
cz
|
||||
tímto
|
||||
budeš
|
||||
budem
|
||||
byli
|
||||
jseš
|
||||
můj
|
||||
svým
|
||||
ta
|
||||
tomto
|
||||
tohle
|
||||
tuto
|
||||
tyto
|
||||
jej
|
||||
zda
|
||||
proč
|
||||
máte
|
||||
tato
|
||||
kam
|
||||
tohoto
|
||||
kdo
|
||||
kteří
|
||||
mi
|
||||
nám
|
||||
tom
|
||||
tomuto
|
||||
mít
|
||||
nic
|
||||
proto
|
||||
kterou
|
||||
byla
|
||||
toho
|
||||
protože
|
||||
asi
|
||||
ho
|
||||
naši
|
||||
napište
|
||||
re
|
||||
což
|
||||
tím
|
||||
takže
|
||||
svých
|
||||
její
|
||||
svými
|
||||
jste
|
||||
aj
|
||||
tu
|
||||
tedy
|
||||
teto
|
||||
bylo
|
||||
kde
|
||||
ke
|
||||
pravé
|
||||
ji
|
||||
nad
|
||||
nejsou
|
||||
či
|
||||
pod
|
||||
téma
|
||||
mezi
|
||||
přes
|
||||
ty
|
||||
pak
|
||||
vám
|
||||
ani
|
||||
když
|
||||
však
|
||||
neg
|
||||
jsem
|
||||
tento
|
||||
článku
|
||||
články
|
||||
aby
|
||||
jsme
|
||||
před
|
||||
pta
|
||||
jejich
|
||||
byl
|
||||
ještě
|
||||
až
|
||||
bez
|
||||
také
|
||||
pouze
|
||||
první
|
||||
vaše
|
||||
která
|
||||
nás
|
||||
nový
|
||||
tipy
|
||||
pokud
|
||||
může
|
||||
strana
|
||||
jeho
|
||||
své
|
||||
jiné
|
||||
zprávy
|
||||
nové
|
||||
není
|
||||
vás
|
||||
jen
|
||||
podle
|
||||
zde
|
||||
už
|
||||
být
|
||||
více
|
||||
bude
|
||||
již
|
||||
než
|
||||
který
|
||||
by
|
||||
které
|
||||
co
|
||||
nebo
|
||||
ten
|
||||
tak
|
||||
má
|
||||
při
|
||||
od
|
||||
po
|
||||
jsou
|
||||
jak
|
||||
další
|
||||
ale
|
||||
si
|
||||
se
|
||||
ve
|
||||
to
|
||||
jako
|
||||
za
|
||||
zpět
|
||||
ze
|
||||
do
|
||||
pro
|
||||
je
|
||||
na
|
||||
atd
|
||||
atp
|
||||
jakmile
|
||||
přičemž
|
||||
já
|
||||
on
|
||||
ona
|
||||
ono
|
||||
oni
|
||||
ony
|
||||
my
|
||||
vy
|
||||
jí
|
||||
ji
|
||||
mě
|
||||
mne
|
||||
jemu
|
||||
tomu
|
||||
těm
|
||||
těmu
|
||||
němu
|
||||
němuž
|
||||
jehož
|
||||
jíž
|
||||
jelikož
|
||||
jež
|
||||
jakož
|
||||
načež
|
108
java/solr/server/solr/stackdump/conf/lang/stopwords_da.txt
Normal file
@ -0,0 +1,108 @@
|
||||
| From svn.tartarus.org/snowball/trunk/website/algorithms/danish/stop.txt
|
||||
| This file is distributed under the BSD License.
|
||||
| See http://snowball.tartarus.org/license.php
|
||||
| Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
| - Encoding was converted to UTF-8.
|
||||
| - This notice was added.
|
||||
|
||||
| A Danish stop word list. Comments begin with vertical bar. Each stop
|
||||
| word is at the start of a line.
|
||||
|
||||
| This is a ranked list (commonest to rarest) of stopwords derived from
|
||||
| a large text sample.
|
||||
|
||||
|
||||
og | and
|
||||
i | in
|
||||
jeg | I
|
||||
det | that (dem. pronoun)/it (pers. pronoun)
|
||||
at | that (in front of a sentence)/to (with infinitive)
|
||||
en | a/an
|
||||
den | it (pers. pronoun)/that (dem. pronoun)
|
||||
til | to/at/for/until/against/by/of/into, more
|
||||
er | present tense of "to be"
|
||||
som | who, as
|
||||
på | on/upon/in/on/at/to/after/of/with/for, on
|
||||
de | they
|
||||
med | with/by/in, along
|
||||
han | he
|
||||
af | of/by/from/off/for/in/with/on, off
|
||||
for | at/for/to/from/by/of/ago, in front/before, because
|
||||
ikke | not
|
||||
der | who/which, there/those
|
||||
var | past tense of "to be"
|
||||
mig | me/myself
|
||||
sig | oneself/himself/herself/itself/themselves
|
||||
men | but
|
||||
et | a/an/one, one (number), someone/somebody/one
|
||||
har | present tense of "to have"
|
||||
om | round/about/for/in/a, about/around/down, if
|
||||
vi | we
|
||||
min | my
|
||||
havde | past tense of "to have"
|
||||
ham | him
|
||||
hun | she
|
||||
nu | now
|
||||
over | over/above/across/by/beyond/past/on/about, over/past
|
||||
da | then, when/as/since
|
||||
fra | from/off/since, off, since
|
||||
du | you
|
||||
ud | out
|
||||
sin | his/her/its/one's
|
||||
dem | them
|
||||
os | us/ourselves
|
||||
op | up
|
||||
man | you/one
|
||||
hans | his
|
||||
hvor | where
|
||||
eller | or
|
||||
hvad | what
|
||||
skal | must/shall etc.
|
||||
selv | myself/yourself/herself/ourselves etc., even
|
||||
her | here
|
||||
alle | all/everyone/everybody etc.
|
||||
vil | will (verb)
|
||||
blev | past tense of "to stay/to remain/to get/to become"
|
||||
kunne | could
|
||||
ind | in
|
||||
når | when
|
||||
være | present tense of "to be"
|
||||
dog | however/yet/after all
|
||||
noget | something
|
||||
ville | would
|
||||
jo | you know/you see (adv), yes
|
||||
deres | their/theirs
|
||||
efter | after/behind/according to/for/by/from, later/afterwards
|
||||
ned | down
|
||||
skulle | should
|
||||
denne | this
|
||||
end | than
|
||||
dette | this
|
||||
mit | my/mine
|
||||
også | also
|
||||
under | under/beneath/below/during, below/underneath
|
||||
have | have
|
||||
dig | you
|
||||
anden | other
|
||||
hende | her
|
||||
mine | my
|
||||
alt | everything
|
||||
meget | much/very, plenty of
|
||||
sit | his, her, its, one's
|
||||
sine | his, her, its, one's
|
||||
vor | our
|
||||
mod | against
|
||||
disse | these
|
||||
hvis | if
|
||||
din | your/yours
|
||||
nogle | some
|
||||
hos | by/at
|
||||
blive | be/become
|
||||
mange | many
|
||||
ad | by/through
|
||||
bliver | present tense of "to be/to become"
|
||||
hendes | her/hers
|
||||
været | be
|
||||
thi | for (conj)
|
||||
jer | you
|
||||
sådan | such, like this/like that
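The header of this file describes its format: comments begin with a vertical bar, and each stop word is at the start of a line. A minimal Python sketch of reading that format follows. The function name `parse_snowball_stopwords` and the sample text are illustrative assumptions, not part of Stackdump or Solr, and treating whitespace as a separator between several words on one line is also an assumption.

```python
def parse_snowball_stopwords(text):
    """Return the stop words found in text that uses '|' for comments."""
    words = []
    for line in text.splitlines():
        # Everything after the first vertical bar is a comment.
        content = line.split("|", 1)[0]
        # What remains holds the stop word(s), separated by whitespace.
        # Blank and comment-only lines contribute nothing.
        words.extend(content.split())
    return words


# Hypothetical sample in the same layout as the file above.
sample = """\
 | A Danish stop word list.

og | and
i | in
sådan | such, like this/like that
"""

print(parse_snowball_stopwords(sample))
```

Splitting on the first bar only keeps any later bars inside the comment, so a gloss that itself contains `|` would not be mistaken for a word.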
|
292
java/solr/server/solr/stackdump/conf/lang/stopwords_de.txt
Normal file
@ -0,0 +1,292 @@
|
||||
| From svn.tartarus.org/snowball/trunk/website/algorithms/german/stop.txt
|
||||
| This file is distributed under the BSD License.
|
||||
| See http://snowball.tartarus.org/license.php
|
||||
| Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
| - Encoding was converted to UTF-8.
|
||||
| - This notice was added.
|
||||
|
||||
| A German stop word list. Comments begin with vertical bar. Each stop
|
||||
| word is at the start of a line.
|
||||
|
||||
| The number of forms in this list is reduced significantly by passing it
|
||||
| through the German stemmer.
|
||||
|
||||
|
||||
aber | but
|
||||
|
||||
alle | all
|
||||
allem
|
||||
allen
|
||||
aller
|
||||
alles
|
||||
|
||||
als | than, as
|
||||
also | so
|
||||
am | an + dem
|
||||
an | at
|
||||
|
||||
ander | other
|
||||
andere
|
||||
anderem
|
||||
anderen
|
||||
anderer
|
||||
anderes
|
||||
anderm
|
||||
andern
|
||||
anderr
|
||||
anders
|
||||
|
||||
auch | also
|
||||
auf | on
|
||||
aus | out of
|
||||
bei | by
|
||||
bin | am
|
||||
bis | until
|
||||
bist | art
|
||||
da | there
|
||||
damit | with it
|
||||
dann | then
|
||||
|
||||
der | the
|
||||
den
|
||||
des
|
||||
dem
|
||||
die
|
||||
das
|
||||
|
||||
daß | that
|
||||
|
||||
derselbe | the same
|
||||
derselben
|
||||
denselben
|
||||
desselben
|
||||
demselben
|
||||
dieselbe
|
||||
dieselben
|
||||
dasselbe
|
||||
|
||||
dazu | to that
|
||||
|
||||
dein | thy
|
||||
deine
|
||||
deinem
|
||||
deinen
|
||||
deiner
|
||||
deines
|
||||
|
||||
denn | because
|
||||
|
||||
derer | of those
|
||||
dessen | of him
|
||||
|
||||
dich | thee
|
||||
dir | to thee
|
||||
du | thou
|
||||
|
||||
dies | this
|
||||
diese
|
||||
diesem
|
||||
diesen
|
||||
dieser
|
||||
dieses
|
||||
|
||||
|
||||
doch | (several meanings)
|
||||
dort | (over) there
|
||||
|
||||
|
||||
durch | through
|
||||
|
||||
ein | a
|
||||
eine
|
||||
einem
|
||||
einen
|
||||
einer
|
||||
eines
|
||||
|
||||
einig | some
|
||||
einige
|
||||
einigem
|
||||
einigen
|
||||
einiger
|
||||
einiges
|
||||
|
||||
einmal | once
|
||||
|
||||
er | he
|
||||
ihn | him
|
||||
ihm | to him
|
||||
|
||||
es | it
|
||||
etwas | something
|
||||
|
||||
euer | your
|
||||
eure
|
||||
eurem
|
||||
euren
|
||||
eurer
|
||||
eures
|
||||
|
||||
für | for
|
||||
gegen | towards
|
||||
gewesen | p.p. of sein
|
||||
hab | have
|
||||
habe | have
|
||||
haben | have
|
||||
hat | has
|
||||
hatte | had
|
||||
hatten | had
|
||||
hier | here
|
||||
hin | there
|
||||
hinter | behind
|
||||
|
||||
ich | I
|
||||
mich | me
|
||||
mir | to me
|
||||
|
||||
|
||||
ihr | you, to her
|
||||
ihre
|
||||
ihrem
|
||||
ihren
|
||||
ihrer
|
||||
ihres
|
||||
euch | to you
|
||||
|
||||
im | in + dem
|
||||
in | in
|
||||
indem | while
|
||||
ins | in + das
|
||||
ist | is
|
||||
|
||||
jede | each, every
|
||||
jedem
|
||||
jeden
|
||||
jeder
|
||||
jedes
|
||||
|
||||
jene | that
|
||||
jenem
|
||||
jenen
|
||||
jener
|
||||
jenes
|
||||
|
||||
jetzt | now
|
||||
kann | can
|
||||
|
||||
kein | no
|
||||
keine
|
||||
keinem
|
||||
keinen
|
||||
keiner
|
||||
keines
|
||||
|
||||
können | can
|
||||
könnte | could
|
||||
machen | do
|
||||
man | one
|
||||
|
||||
manche | some, many a
|
||||
manchem
|
||||
manchen
|
||||
mancher
|
||||
manches
|
||||
|
||||
mein | my
|
||||
meine
|
||||
meinem
|
||||
meinen
|
||||
meiner
|
||||
meines
|
||||
|
||||
mit | with
|
||||
muss | must
|
||||
musste | had to
|
||||
nach | to(wards)
|
||||
nicht | not
|
||||
nichts | nothing
|
||||
noch | still, yet
|
||||
nun | now
|
||||
nur | only
|
||||
ob | whether
|
||||
oder | or
|
||||
ohne | without
|
||||
sehr | very
|
||||
|
||||
sein | his
|
||||
seine
|
||||
seinem
|
||||
seinen
|
||||
seiner
|
||||
seines
|
||||
|
||||
selbst | self
|
||||
sich | herself
|
||||
|
||||
sie | they, she
|
||||
ihnen | to them
|
||||
|
||||
sind | are
|
||||
so | so
|
||||
|
||||
solche | such
|
||||
solchem
|
||||
solchen
|
||||
solcher
|
||||
solches
|
||||
|
||||
soll | shall
|
||||
sollte | should
|
||||
sondern | but
|
||||
sonst | else
|
||||
über | over
|
||||
um | about, around
|
||||
und | and
|
||||
|
||||
uns | us
|
||||
unse
|
||||
unsem
|
||||
unsen
|
||||
unser
|
||||
unses
|
||||
|
||||
unter | under
|
||||
viel | much
|
||||
vom | von + dem
|
||||
von | from
|
||||
vor | before
|
||||
während | while
|
||||
war | was
|
||||
waren | were
|
||||
warst | wast
|
||||
was | what
|
||||
weg | away, off
|
||||
weil | because
|
||||
weiter | further
|
||||
|
||||
welche | which
|
||||
welchem
|
||||
welchen
|
||||
welcher
|
||||
welches
|
||||
|
||||
wenn | when
|
||||
werde | will
|
||||
werden | will
|
||||
wie | how
|
||||
wieder | again
|
||||
will | want
|
||||
wir | we
|
||||
wird | will
|
||||
wirst | willst
|
||||
wo | where
|
||||
wollen | want
|
||||
wollte | wanted
|
||||
würde | would
|
||||
würden | would
|
||||
zu | to
|
||||
zum | zu + dem
|
||||
zur | zu + der
|
||||
zwar | indeed
|
||||
zwischen | between
|
||||
|
78
java/solr/server/solr/stackdump/conf/lang/stopwords_el.txt
Normal file
@ -0,0 +1,78 @@
|
||||
# Lucene Greek Stopwords list
|
||||
# Note: by default this file is used after GreekLowerCaseFilter,
|
||||
# so when modifying this file use 'σ' instead of 'ς'
|
||||
ο
|
||||
η
|
||||
το
|
||||
οι
|
||||
τα
|
||||
του
|
||||
τησ
|
||||
των
|
||||
τον
|
||||
την
|
||||
και
|
||||
κι
|
||||
κ
|
||||
ειμαι
|
||||
εισαι
|
||||
ειναι
|
||||
ειμαστε
|
||||
ειστε
|
||||
στο
|
||||
στον
|
||||
στη
|
||||
στην
|
||||
μα
|
||||
αλλα
|
||||
απο
|
||||
για
|
||||
προσ
|
||||
με
|
||||
σε
|
||||
ωσ
|
||||
παρα
|
||||
αντι
|
||||
κατα
|
||||
μετα
|
||||
θα
|
||||
να
|
||||
δε
|
||||
δεν
|
||||
μη
|
||||
μην
|
||||
επι
|
||||
ενω
|
||||
εαν
|
||||
αν
|
||||
τοτε
|
||||
που
|
||||
πωσ
|
||||
ποιοσ
|
||||
ποια
|
||||
ποιο
|
||||
ποιοι
|
||||
ποιεσ
|
||||
ποιων
|
||||
ποιουσ
|
||||
αυτοσ
|
||||
αυτη
|
||||
αυτο
|
||||
αυτοι
|
||||
αυτων
|
||||
αυτουσ
|
||||
αυτεσ
|
||||
αυτα
|
||||
εκεινοσ
|
||||
εκεινη
|
||||
εκεινο
|
||||
εκεινοι
|
||||
εκεινεσ
|
||||
εκεινα
|
||||
εκεινων
|
||||
εκεινουσ
|
||||
οπωσ
|
||||
ομωσ
|
||||
ισωσ
|
||||
οσο
|
||||
οτι
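The note at the top of this file says the list is applied after GreekLowerCaseFilter, so entries use 'σ' where normal spelling has a final 'ς'. The Python sketch below only approximates that normalization and is not the Lucene implementation. It assumes the filter also drops accents, which the unaccented entries in this list suggest. The names `normalize_greek`, `is_stopword`, and `GREEK_STOPWORDS` are hypothetical.

```python
import unicodedata


def normalize_greek(token):
    """Lowercase, strip accents, and fold final sigma to plain sigma."""
    # Decompose so accents become separate combining marks, then drop them.
    decomposed = unicodedata.normalize("NFD", token)
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Lowercase, then replace the final-sigma form with the regular one.
    return stripped.lower().replace("ς", "σ")


# A few entries copied from the list above, for illustration only.
GREEK_STOPWORDS = {"ο", "η", "το", "τησ", "και", "ειναι", "ποιοσ"}


def is_stopword(token):
    """Check a raw token against the normalized stop word set."""
    return normalize_greek(token) in GREEK_STOPWORDS


print(normalize_greek("της"))
print(is_stopword("Είναι"))
```

The point of the sketch is the ordering: a new entry written with 'ς' would never match, because tokens have already been folded to 'σ' by the time they are compared with the list.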
|
54
java/solr/server/solr/stackdump/conf/lang/stopwords_en.txt
Normal file
@ -0,0 +1,54 @@
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
# a couple of test stopwords to test that the words are really being
|
||||
# configured from this file:
|
||||
stopworda
|
||||
stopwordb
|
||||
|
||||
# Standard english stop words taken from Lucene's StopAnalyzer
|
||||
a
|
||||
an
|
||||
and
|
||||
are
|
||||
as
|
||||
at
|
||||
be
|
||||
but
|
||||
by
|
||||
for
|
||||
if
|
||||
in
|
||||
into
|
||||
is
|
||||
it
|
||||
no
|
||||
not
|
||||
of
|
||||
on
|
||||
or
|
||||
such
|
||||
that
|
||||
the
|
||||
their
|
||||
then
|
||||
there
|
||||
these
|
||||
they
|
||||
this
|
||||
to
|
||||
was
|
||||
will
|
||||
with
|
354
java/solr/server/solr/stackdump/conf/lang/stopwords_es.txt
Normal file
@ -0,0 +1,354 @@
|
||||
| From svn.tartarus.org/snowball/trunk/website/algorithms/spanish/stop.txt
|
||||
| This file is distributed under the BSD License.
|
||||
| See http://snowball.tartarus.org/license.php
|
||||
| Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
| - Encoding was converted to UTF-8.
|
||||
| - This notice was added.
|
||||
|
||||
| A Spanish stop word list. Comments begin with vertical bar. Each stop
|
||||
| word is at the start of a line.
|
||||
|
||||
|
||||
| The following is a ranked list (commonest to rarest) of stopwords
|
||||
| deriving from a large sample of text.
|
||||
|
||||
| Extra words have been added at the end.
|
||||
|
||||
de | from, of
|
||||
la | the, her
|
||||
que | who, that
|
||||
el | the
|
||||
en | in
|
||||
y | and
|
||||
a | to
|
||||
los | the, them
|
||||
del | de + el
|
||||
se | himself, from him etc
|
||||
las | the, them
|
||||
por | for, by, etc
|
||||
un | a
|
||||
para | for
|
||||
con | with
|
||||
no | no
|
||||
una | a
|
||||
su | his, her
|
||||
al | a + el
|
||||
| es from SER
|
||||
lo | him
|
||||
como | how
|
||||
más | more
|
||||
pero | but
|
||||
sus | su plural
|
||||
le | to him, her
|
||||
ya | already
|
||||
o | or
|
||||
| fue from SER
|
||||
este | this
|
||||
| ha from HABER
|
||||
sí | himself etc
|
||||
porque | because
|
||||
esta | this
|
||||
| son from SER
|
||||
entre | between
|
||||
| está from ESTAR
|
||||
cuando | when
|
||||
muy | very
|
||||
sin | without
|
||||
sobre | on
|
||||
| ser from SER
|
||||
| tiene from TENER
|
||||
también | also
|
||||
me | me
|
||||
hasta | until
|
||||
hay | there is/are
|
||||
donde | where
|
||||
| han from HABER
|
||||
quien | whom, that
|
||||
| están from ESTAR
|
||||
| estado from ESTAR
|
||||
desde | from
|
||||
todo | all
|
||||
nos | us
|
||||
durante | during
|
||||
| estados from ESTAR
|
||||
todos | all
|
||||
uno | a
|
||||
les | to them
|
||||
ni | nor
|
||||
contra | against
|
||||
otros | other
|
||||
| fueron from SER
|
||||
ese | that
|
||||
eso | that
|
||||
| había from HABER
|
||||
ante | before
|
||||
ellos | they
|
||||
e | and (variant of y)
|
||||
esto | this
|
||||
mí | me
|
||||
antes | before
|
||||
algunos | some
|
||||
qué | what?
|
||||
unos | a
|
||||
yo | I
|
||||
otro | other
|
||||
otras | other
|
||||
otra | other
|
||||
él | he
|
||||
tanto | so much, many
|
||||
esa | that
|
||||
estos | these
|
||||
mucho | much, many
|
||||
quienes | who
|
||||
nada | nothing
|
||||
muchos | many
|
||||
cual | who
|
||||
| sea from SER
|
||||
poco | few
|
||||
ella | she
|
||||
estar | to be
|
||||
| haber from HABER
|
||||
estas | these
|
||||
| estaba from ESTAR
|
||||
| estamos from ESTAR
|
||||
algunas | some
|
||||
algo | something
|
||||
nosotros | we
|
||||
|
||||
| other forms
|
||||
|
||||
mi | me
|
||||
mis | mi plural
|
||||
tú | thou
|
||||
te | thee
|
||||
ti | thee
|
||||
tu | thy
|
||||
tus | tu plural
|
||||
ellas | they
|
||||
nosotras | we
|
||||
vosotros | you
|
||||
vosotras | you
|
||||
os | you
|
||||
mío | mine
|
||||
mía |
|
||||
míos |
|
||||
mías |
|
||||
tuyo | thine
|
||||
tuya |
|
||||
tuyos |
|
||||
tuyas |
|
||||
suyo | his, hers, theirs
|
||||
suya |
|
||||
suyos |
|
||||
suyas |
|
||||
nuestro | ours
|
||||
nuestra |
|
||||
nuestros |
|
||||
nuestras |
|
||||
vuestro | yours
|
||||
vuestra |
|
||||
vuestros |
|
||||
vuestras |
|
||||
esos | those
|
||||
esas | those
|
||||
|
||||
| forms of estar, to be (not including the infinitive):
|
||||
estoy
|
||||
estás
|
||||
está
|
||||
estamos
|
||||
estáis
|
||||
están
|
||||
esté
|
||||
estés
|
||||
estemos
|
||||
estéis
|
||||
estén
|
||||
estaré
|
||||
estarás
|
||||
estará
|
||||
estaremos
|
||||
estaréis
|
||||
estarán
|
||||
estaría
|
||||
estarías
|
||||
estaríamos
|
||||
estaríais
|
||||
estarían
|
||||
estaba
|
||||
estabas
|
||||
estábamos
|
||||
estabais
|
||||
estaban
|
||||
estuve
|
||||
estuviste
|
||||
estuvo
|
||||
estuvimos
|
||||
estuvisteis
|
||||
estuvieron
|
||||
estuviera
|
||||
estuvieras
|
||||
estuviéramos
|
||||
estuvierais
|
||||
estuvieran
|
||||
estuviese
|
||||
estuvieses
|
||||
estuviésemos
|
||||
estuvieseis
|
||||
estuviesen
|
||||
estando
|
||||
estado
|
||||
estada
|
||||
estados
|
||||
estadas
|
||||
estad
|
||||
|
||||
| forms of haber, to have (not including the infinitive):
|
||||
he
|
||||
has
|
||||
ha
|
||||
hemos
|
||||
habéis
|
||||
han
|
||||
haya
|
||||
hayas
|
||||
hayamos
|
||||
hayáis
|
||||
hayan
|
||||
habré
|
||||
habrás
|
||||
habrá
|
||||
habremos
|
||||
habréis
|
||||
habrán
|
||||
habría
|
||||
habrías
|
||||
habríamos
|
||||
habríais
|
||||
habrían
|
||||
había
|
||||
habías
|
||||
habíamos
|
||||
habíais
|
||||
habían
|
||||
hube
|
||||
hubiste
|
||||
hubo
|
||||
hubimos
|
||||
hubisteis
|
||||
hubieron
|
||||
hubiera
|
||||
hubieras
|
||||
hubiéramos
|
||||
hubierais
|
||||
hubieran
|
||||
hubiese
|
||||
hubieses
|
||||
hubiésemos
|
||||
hubieseis
|
||||
hubiesen
|
||||
habiendo
|
||||
habido
|
||||
habida
|
||||
habidos
|
||||
habidas
|
||||
|
||||
| forms of ser, to be (not including the infinitive):
|
||||
soy
|
||||
eres
|
||||
es
|
||||
somos
|
||||
sois
|
||||
son
|
||||
sea
|
||||
seas
|
||||
seamos
|
||||
seáis
|
||||
sean
|
||||
seré
|
||||
serás
|
||||
será
|
||||
seremos
|
||||
seréis
|
||||
serán
|
||||
sería
|
||||
serías
|
||||
seríamos
|
||||
seríais
|
||||
serían
|
||||
era
|
||||
eras
|
||||
éramos
|
||||
erais
|
||||
eran
|
||||
fui
|
||||
fuiste
|
||||
fue
|
||||
fuimos
|
||||
fuisteis
|
||||
fueron
|
||||
fuera
|
||||
fueras
|
||||
fuéramos
|
||||
fuerais
|
||||
fueran
|
||||
fuese
|
||||
fueses
|
||||
fuésemos
|
||||
fueseis
|
||||
fuesen
|
||||
siendo
|
||||
sido
|
||||
| sed also means 'thirst'
|
||||
|
||||
| forms of tener, to have (not including the infinitive):
|
||||
tengo
|
||||
tienes
|
||||
tiene
|
||||
tenemos
|
||||
tenéis
|
||||
tienen
|
||||
tenga
|
||||
tengas
|
||||
tengamos
|
||||
tengáis
|
||||
tengan
|
||||
tendré
|
||||
tendrás
|
||||
tendrá
|
||||
tendremos
|
||||
tendréis
|
||||
tendrán
|
||||
tendría
|
||||
tendrías
|
||||
tendríamos
|
||||
tendríais
|
||||
tendrían
|
||||
tenía
|
||||
tenías
|
||||
teníamos
|
||||
teníais
|
||||
tenían
|
||||
tuve
|
||||
tuviste
|
||||
tuvo
|
||||
tuvimos
|
||||
tuvisteis
|
||||
tuvieron
|
||||
tuviera
|
||||
tuvieras
|
||||
tuviéramos
|
||||
tuvierais
|
||||
tuvieran
|
||||
tuviese
|
||||
tuvieses
|
||||
tuviésemos
|
||||
tuvieseis
|
||||
tuviesen
|
||||
teniendo
|
||||
tenido
|
||||
tenida
|
||||
tenidos
|
||||
tenidas
|
||||
tened
|
||||
|
99
java/solr/server/solr/stackdump/conf/lang/stopwords_eu.txt
Normal file
@ -0,0 +1,99 @@
|
||||
# example set of basque stopwords
|
||||
al
|
||||
anitz
|
||||
arabera
|
||||
asko
|
||||
baina
|
||||
bat
|
||||
batean
|
||||
batek
|
||||
bati
|
||||
batzuei
|
||||
batzuek
|
||||
batzuetan
|
||||
batzuk
|
||||
bera
|
||||
beraiek
|
||||
berau
|
||||
berauek
|
||||
bere
|
||||
berori
|
||||
beroriek
|
||||
beste
|
||||
bezala
|
||||
da
|
||||
dago
|
||||
dira
|
||||
ditu
|
||||
du
|
||||
dute
|
||||
edo
|
||||
egin
|
||||
ere
|
||||
eta
|
||||
eurak
|
||||
ez
|
||||
gainera
|
||||
gu
|
||||
gutxi
|
||||
guzti
|
||||
haiei
|
||||
haiek
|
||||
haietan
|
||||
hainbeste
|
||||
hala
|
||||
han
|
||||
handik
|
||||
hango
|
||||
hara
|
||||
hari
|
||||
hark
|
||||
hartan
|
||||
hau
|
||||
hauei
|
||||
hauek
|
||||
hauetan
|
||||
hemen
|
||||
hemendik
|
||||
hemengo
|
||||
hi
|
||||
hona
|
||||
honek
|
||||
honela
|
||||
honetan
|
||||
honi
|
||||
hor
|
||||
hori
|
||||
horiei
|
||||
horiek
|
||||
horietan
|
||||
horko
|
||||
horra
|
||||
horrek
|
||||
horrela
|
||||
horretan
|
||||
horri
|
||||
hortik
|
||||
hura
|
||||
izan
|
||||
ni
|
||||
noiz
|
||||
nola
|
||||
non
|
||||
nondik
|
||||
nongo
|
||||
nor
|
||||
nora
|
||||
ze
|
||||
zein
|
||||
zen
|
||||
zenbait
|
||||
zenbat
|
||||
zer
|
||||
zergatik
|
||||
ziren
|
||||
zituen
|
||||
zu
|
||||
zuek
|
||||
zuen
|
||||
zuten
|
313
java/solr/server/solr/stackdump/conf/lang/stopwords_fa.txt
Normal file
@ -0,0 +1,313 @@
|
||||
# This file was created by Jacques Savoy and is distributed under the BSD license.
|
||||
# See http://members.unine.ch/jacques.savoy/clef/index.html.
|
||||
# Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
# Note: by default this file is used after normalization, so when adding entries
|
||||
# to this file, use the arabic 'ي' instead of 'ی'
|
||||
انان
|
||||
نداشته
|
||||
سراسر
|
||||
خياه
|
||||
ايشان
|
||||
وي
|
||||
تاكنون
|
||||
بيشتري
|
||||
دوم
|
||||
پس
|
||||
ناشي
|
||||
وگو
|
||||
يا
|
||||
داشتند
|
||||
سپس
|
||||
هنگام
|
||||
هرگز
|
||||
پنج
|
||||
نشان
|
||||
امسال
|
||||
ديگر
|
||||
گروهي
|
||||
شدند
|
||||
چطور
|
||||
ده
|
||||
و
|
||||
دو
|
||||
نخستين
|
||||
ولي
|
||||
چرا
|
||||
چه
|
||||
وسط
|
||||
ه
|
||||
كدام
|
||||
قابل
|
||||
يك
|
||||
رفت
|
||||
هفت
|
||||
همچنين
|
||||
در
|
||||
هزار
|
||||
بله
|
||||
بلي
|
||||
شايد
|
||||
اما
|
||||
شناسي
|
||||
گرفته
|
||||
دهد
|
||||
داشته
|
||||
دانست
|
||||
داشتن
|
||||
خواهيم
|
||||
ميليارد
|
||||
وقتيكه
|
||||
امد
|
||||
خواهد
|
||||
جز
|
||||
اورده
|
||||
شده
|
||||
بلكه
|
||||
خدمات
|
||||
شدن
|
||||
برخي
|
||||
نبود
|
||||
بسياري
|
||||
جلوگيري
|
||||
حق
|
||||
كردند
|
||||
نوعي
|
||||
بعري
|
||||
نكرده
|
||||
نظير
|
||||
نبايد
|
||||
بوده
|
||||
بودن
|
||||
داد
|
||||
اورد
|
||||
هست
|
||||
جايي
|
||||
شود
|
||||
دنبال
|
||||
داده
|
||||
بايد
|
||||
سابق
|
||||
هيچ
|
||||
همان
|
||||
انجا
|
||||
كمتر
|
||||
كجاست
|
||||
گردد
|
||||
كسي
|
||||
تر
|
||||
مردم
|
||||
تان
|
||||
دادن
|
||||
بودند
|
||||
سري
|
||||
جدا
|
||||
ندارند
|
||||
مگر
|
||||
يكديگر
|
||||
دارد
|
||||
دهند
|
||||
بنابراين
|
||||
هنگامي
|
||||
سمت
|
||||
جا
|
||||
انچه
|
||||
خود
|
||||
دادند
|
||||
زياد
|
||||
دارند
|
||||
اثر
|
||||
بدون
|
||||
بهترين
|
||||
بيشتر
|
||||
البته
|
||||
به
|
||||
براساس
|
||||
بيرون
|
||||
كرد
|
||||
بعضي
|
||||
گرفت
|
||||
توي
|
||||
اي
|
||||
ميليون
|
||||
او
|
||||
جريان
|
||||
تول
|
||||
بر
|
||||
مانند
|
||||
برابر
|
||||
باشيم
|
||||
مدتي
|
||||
گويند
|
||||
اكنون
|
||||
تا
|
||||
تنها
|
||||
جديد
|
||||
چند
|
||||
بي
|
||||
نشده
|
||||
كردن
|
||||
كردم
|
||||
گويد
|
||||
كرده
|
||||
كنيم
|
||||
نمي
|
||||
نزد
|
||||
روي
|
||||
قصد
|
||||
فقط
|
||||
بالاي
|
||||
ديگران
|
||||
اين
|
||||
ديروز
|
||||
توسط
|
||||
سوم
|
||||
ايم
|
||||
دانند
|
||||
سوي
|
||||
استفاده
|
||||
شما
|
||||
كنار
|
||||
داريم
|
||||
ساخته
|
||||
طور
|
||||
امده
|
||||
رفته
|
||||
نخست
|
||||
بيست
|
||||
نزديك
|
||||
طي
|
||||
كنيد
|
||||
از
|
||||
انها
|
||||
تمامي
|
||||
داشت
|
||||
يكي
|
||||
طريق
|
||||
اش
|
||||
چيست
|
||||
روب
|
||||
نمايد
|
||||
گفت
|
||||
چندين
|
||||
چيزي
|
||||
تواند
|
||||
ام
|
||||
ايا
|
||||
با
|
||||
ان
|
||||
ايد
|
||||
ترين
|
||||
اينكه
|
||||
ديگري
|
||||
راه
|
||||
هايي
|
||||
بروز
|
||||
همچنان
|
||||
پاعين
|
||||
كس
|
||||
حدود
|
||||
مختلف
|
||||
مقابل
|
||||
چيز
|
||||
گيرد
|
||||
ندارد
|
||||
ضد
|
||||
همچون
|
||||
سازي
|
||||
شان
|
||||
مورد
|
||||
باره
|
||||
مرسي
|
||||
خويش
|
||||
برخوردار
|
||||
چون
|
||||
خارج
|
||||
شش
|
||||
هنوز
|
||||
تحت
|
||||
ضمن
|
||||
هستيم
|
||||
گفته
|
||||
فكر
|
||||
بسيار
|
||||
پيش
|
||||
براي
|
||||
روزهاي
|
||||
انكه
|
||||
نخواهد
|
||||
بالا
|
||||
كل
|
||||
وقتي
|
||||
كي
|
||||
چنين
|
||||
كه
|
||||
گيري
|
||||
نيست
|
||||
است
|
||||
كجا
|
||||
كند
|
||||
نيز
|
||||
يابد
|
||||
بندي
|
||||
حتي
|
||||
توانند
|
||||
عقب
|
||||
خواست
|
||||
كنند
|
||||
بين
|
||||
تمام
|
||||
همه
|
||||
ما
|
||||
باشند
|
||||
مثل
|
||||
شد
|
||||
اري
|
||||
باشد
|
||||
اره
|
||||
طبق
|
||||
بعد
|
||||
اگر
|
||||
صورت
|
||||
غير
|
||||
جاي
|
||||
بيش
|
||||
ريزي
|
||||
اند
|
||||
زيرا
|
||||
چگونه
|
||||
بار
|
||||
لطفا
|
||||
مي
|
||||
درباره
|
||||
من
|
||||
ديده
|
||||
همين
|
||||
گذاري
|
||||
برداري
|
||||
علت
|
||||
گذاشته
|
||||
هم
|
||||
فوق
|
||||
نه
|
||||
ها
|
||||
شوند
|
||||
اباد
|
||||
همواره
|
||||
هر
|
||||
اول
|
||||
خواهند
|
||||
چهار
|
||||
نام
|
||||
امروز
|
||||
مان
|
||||
هاي
|
||||
قبل
|
||||
كنم
|
||||
سعي
|
||||
تازه
|
||||
را
|
||||
هستند
|
||||
زير
|
||||
جلوي
|
||||
عنوان
|
||||
بود
|
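The header note of stopwords_fa.txt asks that new entries use the Arabic 'ي' rather than the Persian 'ی', because the list is applied after normalization. A minimal sketch of that one substitution follows. The helper name `normalize_entry` is hypothetical, and only the single mapping named in the note is covered. It is not a reimplementation of the analyzer's normalizer.

```python
# Sketch only: applies the one character mapping mentioned in the header
# note of stopwords_fa.txt. The function name is hypothetical.
FARSI_YEH = "\u06cc"   # 'ی'
ARABIC_YEH = "\u064a"  # 'ي'


def normalize_entry(word):
    """Replace Persian yeh with Arabic yeh before adding a stop word."""
    return word.replace(FARSI_YEH, ARABIC_YEH)


if __name__ == "__main__":
    # 'برای' typed with the Persian yeh becomes the form stored in the list.
    print(normalize_entry("\u0628\u0631\u0627\u06cc"))
```

Running a candidate entry through such a helper before appending it keeps the list consistent with the entries already present, such as 'براي'.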
95
java/solr/server/solr/stackdump/conf/lang/stopwords_fi.txt
Normal file
@ -0,0 +1,95 @@
|
||||
| From svn.tartarus.org/snowball/trunk/website/algorithms/finnish/stop.txt
|
||||
| This file is distributed under the BSD License.
|
||||
| See http://snowball.tartarus.org/license.php
|
||||
| Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
| - Encoding was converted to UTF-8.
|
||||
| - This notice was added.
|
||||
|
||||
| forms of BE
|
||||
|
||||
olla
|
||||
olen
|
||||
olet
|
||||
on
|
||||
olemme
|
||||
olette
|
||||
ovat
|
||||
ole | negative form
|
||||
|
||||
oli
|
||||
olisi
|
||||
olisit
|
||||
olisin
|
||||
olisimme
|
||||
olisitte
|
||||
olisivat
|
||||
olit
|
||||
olin
|
||||
olimme
|
||||
olitte
|
||||
olivat
|
||||
ollut
|
||||
olleet
|
||||
|
||||
en | negation
|
||||
et
|
||||
ei
|
||||
emme
|
||||
ette
|
||||
eivät
|
||||
|
||||
|Nom Gen Acc Part Iness Elat Illat Adess Ablat Allat Ess Trans
|
||||
minä minun minut minua minussa minusta minuun minulla minulta minulle | I
|
||||
sinä sinun sinut sinua sinussa sinusta sinuun sinulla sinulta sinulle | you
|
||||
hän hänen hänet häntä hänessä hänestä häneen hänellä häneltä hänelle | he she
|
||||
me meidän meidät meitä meissä meistä meihin meillä meiltä meille | we
|
||||
te teidän teidät teitä teissä teistä teihin teillä teiltä teille | you
|
||||
he heidän heidät heitä heissä heistä heihin heillä heiltä heille | they
|
||||
|
||||
tämä tämän tätä tässä tästä tähän tallä tältä tälle tänä täksi | this
|
||||
tuo tuon tuotä tuossa tuosta tuohon tuolla tuolta tuolle tuona tuoksi | that
|
||||
se sen sitä siinä siitä siihen sillä siltä sille sinä siksi | it
|
||||
nämä näiden näitä näissä näistä näihin näillä näiltä näille näinä näiksi | these
|
||||
nuo noiden noita noissa noista noihin noilla noilta noille noina noiksi | those
|
||||
ne niiden niitä niissä niistä niihin niillä niiltä niille niinä niiksi | they
|
||||
|
||||
kuka kenen kenet ketä kenessä kenestä keneen kenellä keneltä kenelle kenenä keneksi| who
|
||||
ketkä keiden ketkä keitä keissä keistä keihin keillä keiltä keille keinä keiksi | (pl)
|
||||
mikä minkä minkä mitä missä mistä mihin millä miltä mille minä miksi | which what
|
||||
mitkä | (pl)
|
||||
|
||||
joka jonka jota jossa josta johon jolla jolta jolle jona joksi | who which
|
||||
jotka joiden joita joissa joista joihin joilla joilta joille joina joiksi | (pl)
|
||||
|
||||
| conjunctions
|
||||
|
||||
että | that
|
||||
ja | and
|
||||
jos | if
|
||||
koska | because
|
||||
kuin | than
|
||||
mutta | but
|
||||
niin | so
|
||||
sekä | and
|
||||
sillä | for
|
||||
tai | or
|
||||
vaan | but
|
||||
vai | or
|
||||
vaikka | although
|
||||
|
||||
|
||||
| prepositions
|
||||
|
||||
kanssa | with
|
||||
mukaan | according to
|
||||
noin | about
|
||||
poikki | across
|
||||
yli | over, across
|
||||
|
||||
| other
|
||||
|
||||
kun | when
|
||||
niin | so
|
||||
nyt | now
|
||||
itse | self
|
||||
|
184
java/solr/server/solr/stackdump/conf/lang/stopwords_fr.txt
Normal file
@ -0,0 +1,184 @@
|
||||
| From svn.tartarus.org/snowball/trunk/website/algorithms/french/stop.txt
|
||||
| This file is distributed under the BSD License.
|
||||
| See http://snowball.tartarus.org/license.php
|
||||
| Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
| - Encoding was converted to UTF-8.
|
||||
| - This notice was added.
|
||||
|
||||
| A French stop word list. Comments begin with vertical bar. Each stop
|
||||
| word is at the start of a line.
|
||||
|
||||
au | a + le
|
||||
aux | a + les
|
||||
avec | with
|
||||
ce | this
|
||||
ces | these
|
||||
dans | with
|
||||
de | of
|
||||
des | de + les
|
||||
du | de + le
|
||||
elle | she
|
||||
en | `of them' etc
|
||||
et | and
|
||||
eux | them
|
||||
il | he
|
||||
je | I
|
||||
la | the
|
||||
le | the
|
||||
leur | their
|
||||
lui | him
|
||||
ma | my (fem)
|
||||
mais | but
|
||||
me | me
|
||||
même | same; as in moi-même (myself) etc
|
||||
mes | me (pl)
|
||||
moi | me
|
||||
mon | my (masc)
|
||||
ne | not
|
||||
nos | our (pl)
|
||||
notre | our
|
||||
nous | we
|
||||
on | one
|
||||
ou | where
|
||||
par | by
|
||||
pas | not
|
||||
pour | for
|
||||
qu | que before vowel
|
||||
que | that
|
||||
qui | who
|
||||
sa | his, her (fem)
|
||||
se | oneself
|
||||
ses | his (pl)
|
||||
son | his, her (masc)
|
||||
sur | on
|
||||
ta | thy (fem)
|
||||
te | thee
|
||||
tes | thy (pl)
|
||||
toi | thee
|
||||
ton | thy (masc)
|
||||
tu | thou
|
||||
un | a
|
||||
une | a
|
||||
vos | your (pl)
|
||||
votre | your
|
||||
vous | you
|
||||
|
||||
| single letter forms
|
||||
|
||||
c | c'
|
||||
d | d'
|
||||
j | j'
|
||||
l | l'
|
||||
à | to, at
|
||||
m | m'
|
||||
n | n'
|
||||
s | s'
|
||||
t | t'
|
||||
y | there
|
||||
|
||||
| forms of être (not including the infinitive):
|
||||
été
|
||||
étée
|
||||
étées
|
||||
étés
|
||||
étant
|
||||
suis
|
||||
es
|
||||
est
|
||||
sommes
|
||||
êtes
|
||||
sont
|
||||
serai
|
||||
seras
|
||||
sera
|
||||
serons
|
||||
serez
|
||||
seront
|
||||
serais
|
||||
serait
|
||||
serions
|
||||
seriez
|
||||
seraient
|
||||
étais
|
||||
était
|
||||
étions
|
||||
étiez
|
||||
étaient
|
||||
fus
|
||||
fut
|
||||
fûmes
|
||||
fûtes
|
||||
furent
|
||||
sois
|
||||
soit
|
||||
soyons
|
||||
soyez
|
||||
soient
|
||||
fusse
|
||||
fusses
|
||||
fût
|
||||
fussions
|
||||
fussiez
|
||||
fussent
|
||||
|
||||
| forms of avoir (not including the infinitive):
|
||||
ayant
|
||||
eu
|
||||
eue
|
||||
eues
|
||||
eus
|
||||
ai
|
||||
as
|
||||
avons
|
||||
avez
|
||||
ont
|
||||
aurai
|
||||
auras
|
||||
aura
|
||||
aurons
|
||||
aurez
|
||||
auront
|
||||
aurais
|
||||
aurait
|
||||
aurions
|
||||
auriez
|
||||
auraient
|
||||
avais
|
||||
avait
|
||||
avions
|
||||
aviez
|
||||
avaient
|
||||
eut
|
||||
eûmes
|
||||
eûtes
|
||||
eurent
|
||||
aie
|
||||
aies
|
||||
ait
|
||||
ayons
|
||||
ayez
|
||||
aient
|
||||
eusse
|
||||
eusses
|
||||
eût
|
||||
eussions
|
||||
eussiez
|
||||
eussent
|
||||
|
||||
| Later additions (from Jean-Christophe Deschamps)
|
||||
ceci | this
|
||||
cela | that
|
||||
celà | that
|
||||
cet | this
|
||||
cette | this
|
||||
ici | here
|
||||
ils | they
|
||||
les | the (pl)
|
||||
leurs | their (pl)
|
||||
quel | which
|
||||
quels | which
|
||||
quelle | which
|
||||
quelles | which
|
||||
sans | without
|
||||
soi | oneself
|
||||
|
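The Snowball-derived lists above describe their own format in the header comments: a vertical bar starts a comment, and stop words sit at the start of a line. A minimal sketch of reading that format in Python follows. The function name `parse_snowball_stopwords` is hypothetical, and this is not Solr's own loader. It assumes, as the Finnish list suggests, that one line may hold several whitespace-separated words.

```python
# Sketch only: reads the "word | comment" layout used by the Snowball lists.
# The function name is hypothetical and not part of Solr or Stackdump.
def parse_snowball_stopwords(text):
    """Return the set of stop words found in Snowball-format text."""
    words = set()
    for line in text.splitlines():
        # Everything from the first vertical bar onward is a comment.
        content = line.split("|", 1)[0]
        # A line may carry several whitespace-separated words, or none.
        words.update(content.split())
    return words


if __name__ == "__main__":
    sample = (
        " | A French stop word list.\n"
        "au             |  a + le\n"
        "avec           |  with\n"
        "\n"
        "me meidän meidät | we\n"
    )
    print(sorted(parse_snowball_stopwords(sample)))
```

Splitting on the first bar before splitting on whitespace means comment-only and blank lines contribute nothing, so no special cases are needed for them.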
110
java/solr/server/solr/stackdump/conf/lang/stopwords_ga.txt
Normal file
@ -0,0 +1,110 @@
|
||||
|
||||
a
|
||||
ach
|
||||
ag
|
||||
agus
|
||||
an
|
||||
aon
|
||||
ar
|
||||
arna
|
||||
as
|
||||
b'
|
||||
ba
|
||||
beirt
|
||||
bhúr
|
||||
caoga
|
||||
ceathair
|
||||
ceathrar
|
||||
chomh
|
||||
chtó
|
||||
chuig
|
||||
chun
|
||||
cois
|
||||
céad
|
||||
cúig
|
||||
cúigear
|
||||
d'
|
||||
daichead
|
||||
dar
|
||||
de
|
||||
deich
|
||||
deichniúr
|
||||
den
|
||||
dhá
|
||||
do
|
||||
don
|
||||
dtí
|
||||
dá
|
||||
dár
|
||||
dó
|
||||
faoi
|
||||
faoin
|
||||
faoina
|
||||
faoinár
|
||||
fara
|
||||
fiche
|
||||
gach
|
||||
gan
|
||||
go
|
||||
gur
|
||||
haon
|
||||
hocht
|
||||
i
|
||||
iad
|
||||
idir
|
||||
in
|
||||
ina
|
||||
ins
|
||||
inár
|
||||
is
|
||||
le
|
||||
leis
|
||||
lena
|
||||
lenár
|
||||
m'
|
||||
mar
|
||||
mo
|
||||
mé
|
||||
na
|
||||
nach
|
||||
naoi
|
||||
naonúr
|
||||
ná
|
||||
ní
|
||||
níor
|
||||
nó
|
||||
nócha
|
||||
ocht
|
||||
ochtar
|
||||
os
|
||||
roimh
|
||||
sa
|
||||
seacht
|
||||
seachtar
|
||||
seachtó
|
||||
seasca
|
||||
seisear
|
||||
siad
|
||||
sibh
|
||||
sinn
|
||||
sna
|
||||
sé
|
||||
sí
|
||||
tar
|
||||
thar
|
||||
thú
|
||||
triúr
|
||||
trí
|
||||
trína
|
||||
trínár
|
||||
tríocha
|
||||
tú
|
||||
um
|
||||
ár
|
||||
é
|
||||
éis
|
||||
í
|
||||
ó
|
||||
ón
|
||||
óna
|
||||
ónár
|
161
java/solr/server/solr/stackdump/conf/lang/stopwords_gl.txt
Normal file
@ -0,0 +1,161 @@
|
||||
# galican stopwords
|
||||
a
|
||||
aínda
|
||||
alí
|
||||
aquel
|
||||
aquela
|
||||
aquelas
|
||||
aqueles
|
||||
aquilo
|
||||
aquí
|
||||
ao
|
||||
aos
|
||||
as
|
||||
así
|
||||
á
|
||||
ben
|
||||
cando
|
||||
che
|
||||
co
|
||||
coa
|
||||
comigo
|
||||
con
|
||||
connosco
|
||||
contigo
|
||||
convosco
|
||||
coas
|
||||
cos
|
||||
cun
|
||||
cuns
|
||||
cunha
|
||||
cunhas
|
||||
da
|
||||
dalgunha
|
||||
dalgunhas
|
||||
dalgún
|
||||
dalgúns
|
||||
das
|
||||
de
|
||||
del
|
||||
dela
|
||||
delas
|
||||
deles
|
||||
desde
|
||||
deste
|
||||
do
|
||||
dos
|
||||
dun
|
||||
duns
|
||||
dunha
|
||||
dunhas
|
||||
e
|
||||
el
|
||||
ela
|
||||
elas
|
||||
eles
|
||||
en
|
||||
era
|
||||
eran
|
||||
esa
|
||||
esas
|
||||
ese
|
||||
eses
|
||||
esta
|
||||
estar
|
||||
estaba
|
||||
está
|
||||
están
|
||||
este
|
||||
estes
|
||||
estiven
|
||||
estou
|
||||
eu
|
||||
é
|
||||
facer
|
||||
foi
|
||||
foron
|
||||
fun
|
||||
había
|
||||
hai
|
||||
iso
|
||||
isto
|
||||
la
|
||||
las
|
||||
lle
|
||||
lles
|
||||
lo
|
||||
los
|
||||
mais
|
||||
me
|
||||
meu
|
||||
meus
|
||||
min
|
||||
miña
|
||||
miñas
|
||||
moi
|
||||
na
|
||||
nas
|
||||
neste
|
||||
nin
|
||||
no
|
||||
non
|
||||
nos
|
||||
nosa
|
||||
nosas
|
||||
noso
|
||||
nosos
|
||||
nós
|
||||
nun
|
||||
nunha
|
||||
nuns
|
||||
nunhas
|
||||
o
|
||||
os
|
||||
ou
|
||||
ó
|
||||
ós
|
||||
para
|
||||
pero
|
||||
pode
|
||||
pois
|
||||
pola
|
||||
polas
|
||||
polo
|
||||
polos
|
||||
por
|
||||
que
|
||||
se
|
||||
senón
|
||||
ser
|
||||
seu
|
||||
seus
|
||||
sexa
|
||||
sido
|
||||
sobre
|
||||
súa
|
||||
súas
|
||||
tamén
|
||||
tan
|
||||
te
|
||||
ten
|
||||
teñen
|
||||
teño
|
||||
ter
|
||||
teu
|
||||
teus
|
||||
ti
|
||||
tido
|
||||
tiña
|
||||
tiven
|
||||
túa
|
||||
túas
|
||||
un
|
||||
unha
|
||||
unhas
|
||||
uns
|
||||
vos
|
||||
vosa
|
||||
vosas
|
||||
voso
|
||||
vosos
|
||||
vós
|
235
java/solr/server/solr/stackdump/conf/lang/stopwords_hi.txt
Normal file
@ -0,0 +1,235 @@
|
||||
# Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
# See http://members.unine.ch/jacques.savoy/clef/index.html.
|
||||
# This file was created by Jacques Savoy and is distributed under the BSD license.
|
||||
# Note: by default this file also contains forms normalized by HindiNormalizer
|
||||
# for spelling variation (see section below), such that it can be used whether or
|
||||
# not you enable that feature. When adding additional entries to this list,
|
||||
# please add the normalized form as well.
|
||||
अंदर
|
||||
अत
|
||||
अपना
|
||||
अपनी
|
||||
अपने
|
||||
अभी
|
||||
आदि
|
||||
आप
|
||||
इत्यादि
|
||||
इन
|
||||
इनका
|
||||
इन्हीं
|
||||
इन्हें
|
||||
इन्हों
|
||||
इस
|
||||
इसका
|
||||
इसकी
|
||||
इसके
|
||||
इसमें
|
||||
इसी
|
||||
इसे
|
||||
उन
|
||||
उनका
|
||||
उनकी
|
||||
उनके
|
||||
उनको
|
||||
उन्हीं
|
||||
उन्हें
|
||||
उन्हों
|
||||
उस
|
||||
उसके
|
||||
उसी
|
||||
उसे
|
||||
एक
|
||||
एवं
|
||||
एस
|
||||
ऐसे
|
||||
और
|
||||
कई
|
||||
कर
|
||||
करता
|
||||
करते
|
||||
करना
|
||||
करने
|
||||
करें
|
||||
कहते
|
||||
कहा
|
||||
का
|
||||
काफ़ी
|
||||
कि
|
||||
कितना
|
||||
किन्हें
|
||||
किन्हों
|
||||
किया
|
||||
किर
|
||||
किस
|
||||
किसी
|
||||
किसे
|
||||
की
|
||||
कुछ
|
||||
कुल
|
||||
के
|
||||
को
|
||||
कोई
|
||||
कौन
|
||||
कौनसा
|
||||
गया
|
||||
घर
|
||||
जब
|
||||
जहाँ
|
||||
जा
|
||||
जितना
|
||||
जिन
|
||||
जिन्हें
|
||||
जिन्हों
|
||||
जिस
|
||||
जिसे
|
||||
जीधर
|
||||
जैसा
|
||||
जैसे
|
||||
जो
|
||||
तक
|
||||
तब
|
||||
तरह
|
||||
तिन
|
||||
तिन्हें
|
||||
तिन्हों
|
||||
तिस
|
||||
तिसे
|
||||
तो
|
||||
था
|
||||
थी
|
||||
थे
|
||||
दबारा
|
||||
दिया
|
||||
दुसरा
|
||||
दूसरे
|
||||
दो
|
||||
द्वारा
|
||||
न
|
||||
नहीं
|
||||
ना
|
||||
निहायत
|
||||
नीचे
|
||||
ने
|
||||
पर
|
||||
पर
|
||||
पहले
|
||||
पूरा
|
||||
पे
|
||||
फिर
|
||||
बनी
|
||||
बही
|
||||
बहुत
|
||||
बाद
|
||||
बाला
|
||||
बिलकुल
|
||||
भी
|
||||
भीतर
|
||||
मगर
|
||||
मानो
|
||||
मे
|
||||
में
|
||||
यदि
|
||||
यह
|
||||
यहाँ
|
||||
यही
|
||||
या
|
||||
यिह
|
||||
ये
|
||||
रखें
|
||||
रहा
|
||||
रहे
|
||||
ऱ्वासा
|
||||
लिए
|
||||
लिये
|
||||
लेकिन
|
||||
व
|
||||
वर्ग
|
||||
वह
|
||||
वह
|
||||
वहाँ
|
||||
वहीं
|
||||
वाले
|
||||
वुह
|
||||
वे
|
||||
वग़ैरह
|
||||
संग
|
||||
सकता
|
||||
सकते
|
||||
सबसे
|
||||
सभी
|
||||
साथ
|
||||
साबुत
|
||||
साभ
|
||||
सारा
|
||||
से
|
||||
सो
|
||||
ही
|
||||
हुआ
|
||||
हुई
|
||||
हुए
|
||||
है
|
||||
हैं
|
||||
हो
|
||||
होता
|
||||
होती
|
||||
होते
|
||||
होना
|
||||
होने
|
||||
# additional normalized forms of the above
|
||||
अपनि
|
||||
जेसे
|
||||
होति
|
||||
सभि
|
||||
तिंहों
|
||||
इंहों
|
||||
दवारा
|
||||
इसि
|
||||
किंहें
|
||||
थि
|
||||
उंहों
|
||||
ओर
|
||||
जिंहें
|
||||
वहिं
|
||||
अभि
|
||||
बनि
|
||||
हि
|
||||
उंहिं
|
||||
उंहें
|
||||
हें
|
||||
वगेरह
|
||||
एसे
|
||||
रवासा
|
||||
कोन
|
||||
निचे
|
||||
काफि
|
||||
उसि
|
||||
पुरा
|
||||
भितर
|
||||
हे
|
||||
बहि
|
||||
वहां
|
||||
कोइ
|
||||
यहां
|
||||
जिंहों
|
||||
तिंहें
|
||||
किसि
|
||||
कइ
|
||||
यहि
|
||||
इंहिं
|
||||
जिधर
|
||||
इंहें
|
||||
अदि
|
||||
इतयादि
|
||||
हुइ
|
||||
कोनसा
|
||||
इसकि
|
||||
दुसरे
|
||||
जहां
|
||||
अप
|
||||
किंहों
|
||||
उनकि
|
||||
भि
|
||||
वरग
|
||||
हुअ
|
||||
जेसा
|
||||
नहिं
|
209
java/solr/server/solr/stackdump/conf/lang/stopwords_hu.txt
Normal file
@ -0,0 +1,209 @@
|
||||
| From svn.tartarus.org/snowball/trunk/website/algorithms/hungarian/stop.txt
|
||||
| This file is distributed under the BSD License.
|
||||
| See http://snowball.tartarus.org/license.php
|
||||
| Also see http://www.opensource.org/licenses/bsd-license.html
|
||||
| - Encoding was converted to UTF-8.
|
||||
| - This notice was added.
|
||||
|
||||
| Hungarian stop word list
|
||||
| prepared by Anna Tordai
|
||||
|
||||
a
|
||||
ahogy
|
||||
ahol
|
||||
aki
|
||||
akik
|
||||
akkor
|
||||
alatt
|
||||
által
|
||||
általában
|
||||
amely
|
||||
amelyek
|
||||
amelyekben
|
||||
amelyeket
|
||||
amelyet
|
||||
amelynek
|
||||
ami
|
||||
amit
|
||||
amolyan
|
||||
amíg
|
||||
amikor
|
||||
át
|
||||
abban
|
||||
ahhoz
|
||||
annak
|
||||
arra
|
||||
arról
|
||||
az
|
||||
azok
|
||||
azon
|
||||
azt
|
||||
azzal
|
||||
azért
|
||||
aztán
|
||||
azután
|
||||
azonban
|
||||
bár
|
||||
be
|
||||
belül
|
||||
benne
|
||||
cikk
|
||||
cikkek
|
||||
cikkeket
|
||||
csak
|
||||
de
|
||||
e
|
||||
eddig
|
||||
egész
|
||||
egy
|
||||
egyes
|
||||
egyetlen
|
||||
egyéb
|
||||
egyik
|
||||
egyre
|
||||
ekkor
|
||||
el
|
||||
elég
|
||||
ellen
|
||||
elő
|
||||
először
|
||||
előtt
|
||||
első
|
||||
én
|
||||
éppen
|
||||
ebben
|
||||
ehhez
|
||||
emilyen
|
||||
ennek
|
||||
erre
|
||||
ez
|
||||
ezt
|
||||
ezek
|
||||
ezen
|
||||
ezzel
|
||||
ezért
|
||||
és
|
||||
fel
|
||||
felé
|
||||
hanem
|
||||
hiszen
|
||||
hogy
|
||||
hogyan
|
||||
igen
|
||||
így
|
||||
illetve
|
||||
ill.
|
||||
ill
|
||||
ilyen
|
||||
ilyenkor
|
||||
ison
|
||||
ismét
|
||||
itt
|
||||
jó
|
||||
jól
|
||||
jobban
|
||||
kell
|
||||
kellett
|
||||
keresztül
|
||||
keressünk
|
||||
ki
|
||||
kívül
|
||||
között
|
||||
közül
|
||||
legalább
|
||||
lehet
|
||||
lehetett
|
||||
legyen
|
||||
lenne
|
||||
lenni
|
||||
lesz
|
||||
lett
|
||||
maga
|
||||
magát
|
||||
majd
|
||||
majd
|
||||
már
|
||||
más
|
||||
másik
|
||||
meg
|
||||
még
|
||||
mellett
|
||||
mert
|
||||
mely
|
||||
melyek
|
||||
mi
|
||||
mit
|
||||
míg
|
||||
miért
|
||||
milyen
|
||||
mikor
|
||||
minden
|
||||
mindent
|
||||
mindenki
|
||||
mindig
|
||||
mint
|
||||
mintha
|
||||
mivel
|
||||
most
|
||||
nagy
|
||||
nagyobb
|
||||
nagyon
|
||||
ne
|
||||
néha
|
||||
nekem
|
||||
neki
|
||||
nem
|
||||
néhány
|
||||
nélkül
|
||||
nincs
|
||||
olyan
|
||||
ott
|
||||
össze
|
||||
ő
|
||||
ők
|
||||
őket
|
||||
pedig
|
||||
persze
|
||||
rá
|
||||
s
|
||||
saját
|
||||
sem
|
||||
semmi
|
||||
sok
|
||||
sokat
|
||||
sokkal
|
||||
számára
|
||||
szemben
|
||||
szerint
|
||||
szinte
|
||||
talán
|
||||
tehát
|
||||
teljes
|
||||
tovább
|
||||
továbbá
|
||||
több
|
||||
úgy
|
||||
ugyanis
|
||||
új
|
||||
újabb
|
||||
újra
|
||||
után
|
||||
utána
|
||||
utolsó
|
||||
vagy
|
||||
vagyis
|
||||
valaki
|
||||
valami
|
||||
valamint
|
||||
való
|
||||
vagyok
|
||||
van
|
||||
vannak
|
||||
volt
|
||||
voltam
|
||||
voltak
|
||||
voltunk
|
||||
vissza
|
||||
vele
|
||||
viszont
|
||||
volna
|
46
java/solr/server/solr/stackdump/conf/lang/stopwords_hy.txt
Normal file
@ -0,0 +1,46 @@
|
||||
# example set of Armenian stopwords.
|
||||
այդ
|
||||
այլ
|
||||
այն
|
||||
այս
|
||||
դու
|
||||
դուք
|
||||
եմ
|
||||
են
|
||||
ենք
|
||||
ես
|
||||
եք
|
||||
է
|
||||
էի
|
||||
էին
|
||||
էինք
|
||||
էիր
|
||||
էիք
|
||||
էր
|
||||
ըստ
|
||||
թ
|
||||
ի
|
||||
ին
|
||||
իսկ
|
||||
իր
|
||||
կամ
|
||||
համար
|
||||
հետ
|
||||
հետո
|
||||
մենք
|
||||
մեջ
|
||||
մի
|
||||
ն
|
||||
նա
|
||||
նաև
|
||||
նրա
|
||||
նրանք
|
||||
որ
|
||||
որը
|
||||
որոնք
|
||||
որպես
|
||||
ու
|
||||
ում
|
||||
պիտի
|
||||
վրա
|
||||
և
|
Some files were not shown because too many files have changed in this diff.