mirror of https://github.com/djohnlewis/stackdump synced 2024-12-04 06:57:36 +00:00
Commit Graph

195 Commits

Author SHA1 Message Date
djohnlewis
ea8aefcaf7 xml output 2021-06-13 15:23:02 +01:00
djohnlewis
a10a6d1e4d Python update 2021-06-13 15:19:51 +01:00
djohnlewis
4616d04c56 Python update 2021-06-10 09:46:47 +01:00
djohnlewis
f53efd1422 image download 2021-05-24 20:14:14 +01:00
djohnlewis
20693a8764 added gitigore 2021-05-24 09:15:01 +01:00
Sornram Kmut'nb
f20e281d3d Update README.textile 2016-07-16 23:07:49 +07:00
Sornram Kmut'nb
0f16cd4bce Update README.textile 2016-07-16 23:07:14 +07:00
Sornram Kmut'nb
2d27d3efe4 Update README.textile 2016-07-16 23:05:35 +07:00
Sornram Kmut'nb
3f29d45964 Update README.textile 2016-07-16 23:03:24 +07:00
Skylar Ittner
dcc7203c97 Updated icon retrieval URL to fix 404 errors 2016-04-21 01:23:32 +00:00
Sam Lai
6a7b8ea432 Updated the README to reflect the new resource requirements needed for the latest StackOverflow data set. 2015-01-06 22:15:30 +00:00
Samuel Lai
c020660479 Bad URLs are left as bad external links, rather than re-written as bad internal links.

This doesn't make much difference, but I think it is nicer to assume bad URLs are external, rather than internal.
2014-05-17 19:34:16 +10:00
Alexei Baboulevitch
7f6ed7b438 Fixed an uncaught exception caused by broken URLs.
Examples of fixed pages: photo/11689, stackoverflow/315911
2014-05-12 14:35:28 +02:00
Samuel Lai
a06d2a4c55 Added minimum version information in more places in README. 2014-04-26 23:18:01 +10:00
Samuel Lai
55bec19665 Updated README with ideal Python version information. 2014-04-26 23:09:45 +10:00
Samuel Lai
a59e3b59d0 Updated readme with a few minor changes. 2014-04-25 23:51:42 +10:00
Samuel Lai
40121f2600 Added tag v1.3.1 for changeset 321f5e2fa176 2014-03-04 16:28:44 +11:00
Samuel Lai
db026d2ccc Fixed Supervisor config file example so it actually stops the components properly. 2014-03-04 16:19:30 +11:00
Samuel Lai
ae6e10e6c4 Fixed a bug where import_site will loop forever if a non-SolrError exception is encountered. 2014-03-04 15:47:11 +11:00
Samuel Lai
f79df598d3 Added tag v1.3 for changeset 5c1ae2e2f71a 2014-03-04 15:06:01 +11:00
Samuel Lai
4d6343584a Minor README tweaks. 2014-03-03 17:07:26 +11:00
Samuel Lai
9d1d6b135a Grrr. More textile issues. 2014-02-27 22:02:04 +11:00
Samuel Lai
96b06f7b35 Oops, textile syntax mistake. 2014-02-27 22:00:48 +11:00
Samuel Lai
28d79ea089 Added notes on using supervisor with stackdump. 2014-02-27 21:58:22 +11:00
Samuel Lai
ce7edf1ca0 Minor README tweaks. 2014-02-27 20:44:55 +11:00
Samuel Lai
4254f31859 Updated the README for the next release.
Fixes #8 by updating the URL to the data dumps.
2014-02-27 20:39:32 +11:00
Samuel Lai
c11fcfacf6 Fixes #9. Added ability for import_site command to resume importing if the connection to Solr is lost and restored. 2014-02-27 20:12:53 +11:00
Samuel Lai
7764f088c2 Added a setting to disable the rewriting of links and image URLs. 2014-02-27 18:52:25 +11:00
Samuel Lai
a4c6c2c7ba Certain ignored post type IDs are now recognised by the error handler and messages printed as such. 2014-02-27 18:13:04 +11:00
Samuel Lai
01f9b10c27 Fixed #7. Turns out post IDs are not unique across sites.
This change will require re-indexing of all sites unfortunately. On the upside, more questions to browse!
2014-02-27 17:57:34 +11:00
Sam
cdb93e6f68 Merged changes. 2014-02-16 01:04:19 +11:00
Sam
0990e00852 Added an original copy of pysolr.py so the custom changes can be worked out. 2014-02-16 01:03:05 +11:00
Samuel Lai
92e359174a Added some notes on importing StackOverflow on Windows. 2013-12-12 17:29:55 +11:00
Samuel Lai
c521fc1627 Added tag v1.2 for changeset 240affa260a1 2013-11-30 18:06:37 +11:00
Sam
722d4125e7 Added section in README re new PowerShell scripts.
Also fixed formatting and wording.
2013-12-01 03:43:58 +11:00
Sam
ce3eb04270 Updated README with v1.2 changes and SO import stats. 2013-12-01 03:33:40 +11:00
Samuel Lai
9613caa8d1 Changed settings so Solr now only listens on localhost, not all interfaces. 2013-11-29 15:18:55 +11:00
Samuel Lai
2583afeb90 Removed more redundant date/time parsing. 2013-11-29 15:11:32 +11:00
Samuel Lai
522e1ff4f2 Fixed bug in script where the directory change was not reverted when script exited. 2013-11-29 15:06:10 +11:00
Samuel Lai
36eb8d3980 Changed the name of the stackdump schema to something better than 'Example'. 2013-11-29 15:05:31 +11:00
Samuel Lai
a597b2e588 Merge import-perf-improvements branch to default. 2013-11-29 13:01:41 +11:00
Samuel Lai
4a9c4504b3 Updated bad docs. 2013-11-29 12:57:06 +11:00
Samuel Lai
77dd2def42 Oops, forgot to re-instate the comment index during the backout. 2013-11-29 01:42:17 +11:00
Samuel Lai
75a216f5a4 Backed out the comments-batching change.
It was causing weird perf issues and errors. Didn't really seem like it made things faster; if anything, things became slower.
2013-11-29 01:12:09 +11:00
Samuel Lai
bf09e36928 Changed other models to avoid unnecessary date/time parsing.
Added PRAGMA statements for comments table and changed flow so the siteId_postId index is now created after data has been inserted.
2013-11-29 00:18:54 +11:00
Samuel Lai
cdb8d96508 Comments are now committed in batches and using a 'prepared' statement via executemany.
Also fixed a Windows compatibility bug with the new temp comments db and a bug with the webapp now that the Comment model has moved. Dates are also no longer parsed from their ISO form for comments; instead left as strings and parsed by SQLObject internally as needed.
2013-11-28 23:51:53 +11:00
Samuel Lai
5868c8e328 Fixed settings for Windows compatibility. 2013-11-28 22:06:33 +11:00
Samuel Lai
8e3d21f817 Fixed settings for Windows compatibility. 2013-11-28 22:06:33 +11:00
Samuel Lai
2fea457b06 Added PowerShell equivalents to launch and manage Stackdump on Windows. 2013-11-28 21:53:45 +11:00
Samuel Lai
6469691e4b Added PowerShell equivalents to launch and manage Stackdump on Windows. 2013-11-28 21:53:45 +11:00