Mirror of https://github.com/djohnlewis/stackdump (synced 2024-12-04 23:17:37 +00:00)
Made some minor amendments to the instructions in the README.
parent 049e857159
commit 1f9546e4b9
@@ -41,7 +41,7 @@ Stackdump was to be self-contained, so to get it up and running, simply extract

h3. Verify dependencies

-Next, you should verify that the required Java and Python versions are accessible in the PATH.
+Next, you should verify that the required Java and Python versions are accessible in the PATH. (If you haven't installed them yet, now is a good time to do so.)

Type @java -version@ and check that it is at least version 1.6.
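As a rough sketch of this check, the shell snippet below parses the output of @java -version@ and compares it against 1.6. The @version_at_least@ helper is a hypothetical name introduced here, not part of Stackdump, and the parsing assumes the usual @version "x.y.z"@ output format. The Python check only confirms that some @python@ is in the PATH.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when dotted version $1 is >= major.minor version $2.
version_at_least() {
    actual_major=${1%%.*}
    case "$1" in
        *.*) actual_rest=${1#*.}; actual_minor=${actual_rest%%.*} ;;
        *)   actual_minor=0 ;;   # e.g. a bare "21" with no minor part
    esac
    wanted_major=${2%%.*}
    wanted_minor=${2#*.}
    [ "$actual_major" -gt "$wanted_major" ] ||
        { [ "$actual_major" -eq "$wanted_major" ] && [ "$actual_minor" -ge "$wanted_minor" ]; }
}

# java -version writes to stderr, e.g.: java version "1.6.0_45"
if command -v java >/dev/null 2>&1; then
    java_version=$(java -version 2>&1 |
        sed -n 's/.*version "\([0-9][0-9.]*\).*/\1/p' | head -n 1)
    if [ -n "$java_version" ] && version_at_least "$java_version" 1.6; then
        echo "java $java_version found"
    else
        echo "could not confirm that java is at least version 1.6" >&2
    fi
else
    echo "java is not in the PATH" >&2
fi

# Only checks that a python executable can be found, not its version.
command -v python >/dev/null 2>&1 || echo "python is not in the PATH" >&2
```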
@@ -76,9 +76,17 @@ To start the import process, execute the following command -

... where site_url is the URL of the site you're importing, e.g. __android.stackexchange.com__; dump_date is the date of the data dump you're importing, e.g. __August 2012__; and finally path_to_xml_files is the path to the XML files you just extracted. The dump_date is a text string that is shown in the app only, so it can be in any format you want.

For example, to import the August 2012 data dump of the Android StackExchange site, you would execute -

@stackdump_dir/manage.sh import_site --base-url android.stackexchange.com --dump-date "August 2012" /tmp/android@

It is normal to get messages about unknown PostTypeIds and missing comments and answers. These messages are likely due to those posts being hidden via moderation.

This can take anywhere from a minute to 10 hours or more, depending on the site you're importing. As a rough guide, __android.stackexchange.com__ took a minute on my VM, while __stackoverflow.com__ took just over 10 hours.

-Repeat these steps for each site you wish to import.
+Repeat these steps for each site you wish to import. Do not attempt to import multiple sites at the same time; it will not work and you may end up with half-imported sites.

The import process can be cancelled at any time without any adverse effect; however, the next run will have to start from scratch.
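Because sites must be imported one at a time, a small wrapper can queue them up. The following is a minimal sketch, not part of Stackdump: the @import_sites@ function and the @STACKDUMP_DIR@ and @DUMP_DATE@ variables are names assumed for illustration, and it assumes one directory of extracted XML files per site.

```shell
#!/bin/sh
# Assumed defaults; adjust to where Stackdump was extracted.
STACKDUMP_DIR=${STACKDUMP_DIR:-stackdump_dir}
DUMP_DATE=${DUMP_DATE:-August 2012}

# Reads "site_url xml_dir" pairs from stdin and imports them strictly
# one after the other, stopping at the first failed import.
import_sites() {
    while read -r site_url xml_dir; do
        [ -n "$site_url" ] || continue   # skip blank lines
        echo "Importing $site_url from $xml_dir"
        # </dev/null keeps the import from consuming the remaining pairs
        "$STACKDUMP_DIR/manage.sh" import_site --base-url "$site_url" \
            --dump-date "$DUMP_DATE" "$xml_dir" </dev/null || return 1
    done
}
```

For example, piping the line @android.stackexchange.com /tmp/android@ into @import_sites@ runs the same command as the example above; any further lines are imported only after the previous import has finished.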

h3. Start the app

@@ -86,7 +94,7 @@ To start Stackdump, execute the following command -

@stackdump_dir/start_web.sh@

-... and visit port 8080 on that machine.
+... and visit port 8080 on that machine. That's it - your own offline, read-only instance of StackExchange.

If you need to change the port that it runs on, modify @stackdump_dir/python/src/stackdump/settings.py@ and restart the app.
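To confirm from a script that the app is answering on its port, something like the sketch below may help. It assumes @curl@ is installed and that the front page answers with a successful HTTP status; @wait_for_app@ is a hypothetical helper name, not something shipped with Stackdump.

```shell
#!/bin/sh
# Hypothetical helper: poll URL ($1) up to TRIES ($2, default 30) times,
# one second apart, and succeed as soon as the URL answers.
wait_for_app() {
    url=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the page
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

After starting the app, @wait_for_app http://localhost:8080/@ returns success once the page can be fetched; substitute the new port if you changed it in the settings file.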
@@ -94,6 +102,17 @@ Stackdump comes bundled with some init.d scripts as well which were tested on Ce

Both the search indexer and the app need to be running for Stackdump to work.

h2. Maintenance

Stackdump stores all its data in the @data@ directory under its root directory. If you want to start fresh, just stop the app and the search indexer, delete that directory, and restart the app and search indexer.

To delete certain sites from Stackdump, use the manage_sites management command -

@stackdump_dir/manage.sh manage_sites -l@ to list the sites (and their site keys) currently in the system;
@stackdump_dir/manage.sh manage_sites -d site_key@ to delete a particular site.

It is not necessary to delete a site before importing a new data dump of it, though; the import process automatically purges the old copy.
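The two @manage_sites@ commands can be combined into a guarded delete. This is only a sketch under assumptions: @delete_site@ and @STACKDUMP_DIR@ are illustrative names, and the confirmation prompt is an addition made here, not a feature of Stackdump.

```shell
#!/bin/sh
# Assumed default; adjust to where Stackdump was extracted.
STACKDUMP_DIR=${STACKDUMP_DIR:-stackdump_dir}

# Lists the imported sites, then deletes the site with key $1
# only after an explicit "y" from the user.
delete_site() {
    site_key=$1
    "$STACKDUMP_DIR/manage.sh" manage_sites -l </dev/null
    printf 'Delete site "%s"? [y/N] ' "$site_key"
    read -r answer
    case "$answer" in
        y|Y) "$STACKDUMP_DIR/manage.sh" manage_sites -d "$site_key" ;;
        *)   echo "Nothing deleted."; return 1 ;;
    esac
}
```

Calling @delete_site site_key@ with a key taken from the listing deletes that site; any answer other than @y@ leaves the data untouched.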

h2. Credits

Stackdump leverages several open-source projects to do various things, including -