makehistory - Initialize or rebuild INN history database
makehistory [-abFIOSx] [-f filename] [-l count] [-L load-average]
[-s size] [-T tmpdir]
makehistory rebuilds the history(5) text file, which
contains a list of message-IDs of articles already seen by the server. It
can also be used to rebuild the overview database. Note that even though the
dbz indices for the history file are also rebuilt by
makehistory, it is useful to run makedbz(8) after makehistory(8) in
order to improve the efficiency of the indices (makehistory does not
know how large to make the hash table at first run, unless the size is given
by the -s flag).
The default location of the history text file is
pathdb/history; to specify an alternate location, use the -f
flag.
By default, makehistory will scan the entire spool, using
the storage manager, and write a history line for every article. To also
generate overview information, use the -O flag.
If a malformed article is found in the news spool, in a way that
prevents its integration into the history or overview data, a log line is
output and the malformed article is skipped.
WARNING: If you're trying to rebuild the overview database,
be sure to stop innd(8) and delete or zero out the existing database before
you start for the best results. An overview rebuild should not be done while
the server is running. Unless the existing overview is deleted, you may end
up with problems like out-of-order overview entries, excessively large
overview buffers, and the like.
If ovmethod in inn.conf is
"ovdb", you must have the ovdb processes
running while rebuilding overview. ovdb needs them available while writing
overview entries. You can start them by hand separate from the rest of the
server by running ovdb_init; see ovdb_init(8) for more details.
Similarly, if ovmethod in inn.conf is
"ovsqlite", you must have the
ovsqlite-server process running while rebuilding overview. See
ovsqlite-server(8) for more details and how to start it by hand.
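As a minimal sketch of the two requirements above, the following shell
function maps an ovmethod value to the helper that has to be running before
the rebuild. The function name overview_helper is hypothetical, and the
values are passed in literally here; on a real server they would come from
inn.conf (for instance through innconfval(1), assuming it is in your path).

```shell
#!/bin/sh
# Hypothetical helper: print the name of the program that must be running
# before "makehistory -O" for a given ovmethod, or "none".
overview_helper() {
    case "$1" in
        ovdb)     echo "ovdb_init" ;;
        ovsqlite) echo "ovsqlite-server" ;;
        *)        echo "none" ;;
    esac
}

# Example with literal values; on a real server one might pass
# "$(innconfval ovmethod)" instead (assumption: innconfval is in PATH).
for method in tradindexed buffindexed ovdb ovsqlite; do
    printf '%s: %s\n' "$method" "$(overview_helper "$method")"
done
```

The function only reports which helper is needed; starting it is still done
by hand as described in ovdb_init(8) and ovsqlite-server(8).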
Rebuilding overview data is as straightforward as:
- 1.
- Checking that the configuration file of the new overview storage method is
present in pathetc and fits your needs (buffindexed.conf,
ovdb.conf or ovsqlite.conf). Note that the tradindexed
overview storage method does not have a dedicated configuration file.
- 2.
- Making sure that INN is stopped ("rc.news stop" as the news user, or
whichever command you're using).
- 3.
- Setting the new overview storage method in the ovmethod parameter
in inn.conf.
- 4.
- Making sure that the directory specified by the pathoverview
parameter in inn.conf exists and is empty (or contains freshly
created buffindexed buffers, if using that overview storage method).
Otherwise, rename the current directory (to backup existing overview data)
and re-create pathoverview as the news user.
- 5.
- Starting ovdb_init or ovsqlite-server as the news user if
the new overview storage method is respectively ovdb or ovsqlite.
- 6.
- Running "makehistory -O -x -F" and
waiting for the command to finish. (You may notice a few log lines about
articles for which overview data cannot be inserted into the new overview
storage method. As long as there aren't tons of them, this is normal,
notably because, unlike innd, makehistory has an internal limit on the
length of the overview data it generates. Unfortunately,
these rare articles won't be present in the new overview.)
- 7.
- Stopping ovdb or ovsqlite helper programs if you started them during the
previous steps (running "rc.news stop"
as the news user will stop them; do not mind the messages related to the
fact that the news server was not running).
- 8.
- Starting INN and checking the logs to make sure everything is fine. You
will normally notice that the active file is renumbered
(rc.news takes care of that when run after an overview rebuild;
otherwise, manually run "ctlinnd renumber ''").
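The directory preparation in step 4 could be sketched as below. This is an
illustration under stated assumptions, not part of INN: the function name
prepare_overview_dir and the ".old.<timestamp>" backup suffix are made up,
and the demonstration runs on a scratch directory instead of the real
pathoverview. It is not suitable as-is for buffindexed, whose freshly
created buffers are meant to stay in place.

```shell
#!/bin/sh
# Hypothetical helper for step 4: make sure the overview directory exists
# and is empty, moving any existing contents aside as a backup.
prepare_overview_dir() {
    dir=$1
    if [ -d "$dir" ] && [ -n "$(ls -A "$dir")" ]; then
        # Non-empty directory: rename it so the old overview is kept.
        backup="$dir.old.$(date +%Y%m%d%H%M%S)"
        mv "$dir" "$backup" || return 1
        echo "existing overview moved to $backup"
    fi
    # Create the directory if it is missing or was just renamed.
    mkdir -p "$dir"
}

# Demonstration on a scratch directory rather than the real pathoverview.
scratch=$(mktemp -d)
mkdir "$scratch/overview"
touch "$scratch/overview/stale-data"
prepare_overview_dir "$scratch/overview"
ls -A "$scratch"
```

On a real server this would be run as the news user, so that the re-created
directory has the expected ownership.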
- -a
- Append to the history file rather than generating a new one. If you
append to the main history file, make sure innd(8) is throttled or
not running, or you can corrupt the history.
- -b
- Delete any messages found in the spool that do not have valid Message-ID
header fields in them.
- -F
- Fork a separate process to flush overview data to disk rather than doing
it directly. The advantage of this is that it allows makehistory to
continue to collect more data from the spool while the first batch of data
is being written to the overview database. The disadvantage is that up to
twice as much temporary disk space will be used for the generated overview
data. This option only makes sense in combination with -O. With
buffindexed, the overchan program is invoked to write
overview.
- -f filename
- Rather than writing directly to pathdb/history, instead write to
filename, also in pathdb.
- -I
- Don't store overview data for articles numbered lower than the lowest
article number in active. This is useful if, for whatever reason, there
are old articles on disk that shouldn't be available to readers or put
into the overview database.
- -l count
- This option specifies how many articles to process before writing the
accumulated overview information out to the overview database. The default
is "10000". Since overview write
performance is faster with sorted data, each "batch" gets
sorted. Increasing the batch size with this option may further improve
write performance, at the cost of longer sort times. Also, temporary space
will be needed to store the overview batches. At a rough estimate, about
300 * count bytes of temporary space will be required (not counting
temp files created by sort(1)). See the description of the -T
option for how to specify the temporary storage location. This option has
no effect with buffindexed, because buffindexed does not need sorted
overview and no batching is done.
- -L load-average
- Temporarily pause activities if the system load average exceeds the
specified level load-average. This allows makehistory to run
on a system being used for other purposes without monopolizing system
resources and thus making the response time for other applications
unacceptably slow. Using nice(1) does not help much for that because the
problem comes from disk I/O usage, and ionice(1) is not always available
or efficient.
- -O
- Create the overview database as well as the history file. Overview
information is only required if the server supports readers; it is not
needed for a transit-only server (see enableoverview in
inn.conf(5)). If you are using the buffindexed overview storage method,
erase all of your overview buffers before running makehistory with
-O.
- -S
- Rather than storing the overview data into the overview database, just
write it to standard output in a form suitable for feeding to
overchan later if wished. When this option is used, -F,
-I, -l, and -T are ignored. This option only makes
sense in combination with -O.
- -s size
- Size the history database for approximately size key-value pairs
(i.e. lines in history). Accurately specifying the size is an
optimization that will create a more efficient database. (The size should
be the estimated eventual number of articles, typically the size of the
old history file, in lines.)
By default, makehistory will create a database
optimized for handling about 6,000,000 articles (or 500,000 if the
slower tagged hash format is used). This size does not limit the number
of articles the news server can store in its history file. The
database will just get slower once that optimal size is exceeded, until
the next run of news.daily, which will resize it appropriately.
- -T tmpdir
- If -O is given, makehistory needs a location to write
temporary overview data. By default, it uses pathtmp, set in
inn.conf, but if this option is given, the provided tmpdir
is used instead. This is also used for temporary files created by sort(1)
(which is invoked in the process of writing overview information since
sorted overview information writes faster). By default, sort
usually uses your system temporary directory; see the sort(1) man page on
your system to be sure.
- -x
- If this option is given, makehistory won't write out history
file entries. This is useful mostly for building overview without
generating a new history file.
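To make the temporary space estimate given for -l concrete, here is a small
sketch of the arithmetic. The function names estimate_tmp_bytes and
tmp_has_room are hypothetical; the figure is only the rough "300 * count"
estimate from the -l description and does not count files created by
sort(1).

```shell
#!/bin/sh
# Rough temporary space needed for one overview batch, in bytes,
# using the "about 300 * count" estimate given for -l.
estimate_tmp_bytes() {
    echo $(( 300 * $1 ))
}

# Hypothetical check: is there at least that much room in a directory?
# df -Pk reports available space in 1024-byte blocks (fourth column).
tmp_has_room() {
    need_kb=$(( ($(estimate_tmp_bytes "$1") + 1023) / 1024 ))
    avail_kb=$(df -Pk "$2" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$need_kb" ]
}

estimate_tmp_bytes 10000    # default batch size: prints 3000000
estimate_tmp_bytes 30000    # prints 9000000
```

With -F, up to twice that amount may be in use at once, so a call such as
tmp_has_room 60000 /some/tmpdir would be a more cautious check for a batch
size of 30000.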
Here's a typical example of rebuilding the entire history and
overview database, removing broken articles in the news spool. This uses the
default temporary file locations and should be done while innd isn't
running (or is throttled).
makehistory -b -f history.n -O -l 30000 -I
This will rebuild the overview (if using buffindexed, erase the
existing overview buffers before running this command) and leave a new
history file as "history.n" in
pathdb. To preserve all of the history entries from the old
history file that correspond to rejected articles or expired
articles, follow the above command with:
cd <pathdb>
awk 'NF == 2 { print }' < history >> history.n
(replacing the path with your pathdb, if it isn't the
default). Then look over the new history file for problems and
run:
makedbz -s `wc -l < history.n` -f history.n
Then rename all of the files matching
"history.n.*" to
"history.*", replacing the current history
database and indices. After that, it's safe to unthrottle innd.
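The effect of the awk filter used above can be tried on made-up data before
touching a real history file. In this sketch the hashes, timestamps and
tokens are placeholders, loosely shaped like history(5) lines, and the first
awk call merely stands in for what makehistory would have written.

```shell
#!/bin/sh
# Made-up sample data: fields are tab-separated; the hashes, timestamps
# and tokens below are placeholders, not real history entries.
work=$(mktemp -d)
{
    printf '%s\t%s\t%s\n' '[AAAA]' '1700000000~-~1699990000' '@TOKEN1@'
    printf '%s\t%s\n'     '[BBBB]' '1700000100~-~1699990100'
    printf '%s\t%s\t%s\n' '[CCCC]' '1700000200~-~1699990200' '@TOKEN2@'
    printf '%s\t%s\n'     '[DDDD]' '1700000300~-~1699990300'
} > "$work/history"

# Pretend makehistory rebuilt only the two articles still in the spool.
awk 'NF == 3 { print }' < "$work/history" > "$work/history.n"

# The command from the text: append the two-field lines from the old file.
awk 'NF == 2 { print }' < "$work/history" >> "$work/history.n"

total=$(( $(wc -l < "$work/history.n") ))
echo "history.n now has $total lines"
rm -rf "$work"
```

The two-field lines are the ones without a storage token, which is why they
cannot be regenerated from the spool and have to be copied over.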
For a simpler example:
makehistory -b -f history.n -I -O
will scan the spool, removing broken articles and generating
history and overview entries for articles missing from history.
To pre-size the history file for 100,000,000 articles, and
generate overview data at the same time, you may directly use the following
command:
makehistory -O -s 100000000
You then do not need to run makedbz, as the history
file has already been generated and optimized for the expected number of
articles.
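When no better estimate is at hand, the line count of the old history file
is the usual value for -s. The following sketch derives it; the function
name history_size_hint and the OLD_HISTORY variable are hypothetical, and
the script only prints the command it would suggest without running it.

```shell
#!/bin/sh
# Hypothetical helper: suggest a value for -s from an existing history file,
# falling back to the documented default of 6,000,000 when the file is
# missing or empty.
history_size_hint() {
    if [ -s "$1" ]; then
        echo $(( $(wc -l < "$1") ))
    else
        echo 6000000
    fi
}

# Only prints the command it would suggest; nothing is executed.
old_history=${OLD_HISTORY:-/nonexistent/history}
echo "makehistory -O -s $(history_size_hint "$old_history")"
```

Setting OLD_HISTORY to the path of your current history file before running
the script makes it use the real line count.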
To just rebuild overview:
makehistory -O -x -F
- pathdb/history
- This is the default output file for makehistory.
- pathtmp
- Where temporary files are written unless -T is given.
Originally written by Rich $alz <rsalz@uunet.uu.net> for
InterNetNews and updated by various other people since.
active(5), ctlinnd(8), history(5), inn.conf(5), innd(8),
libinn_dbz(3), makedbz(8), ovdb_init(8), overchan(8),
ovsqlite-server(8).