yaz-marcdump - MARC record dump utility
yaz-marcdump [-i format]
[-o format]
[-f from] [-t to]
[-l spec] [-c cfile]
[-s prefix]
[-C size]
[-O offset]
[-L limit] [-n] [-p] [-r]
[-v] [-V] [file...]
yaz-marcdump reads MARC records from one or more files. It
parses each record and can write it in line format, ISO2709,
MARCXML[1], MARC-in-JSON[2], MarcXchange[3], or as
hex output.
This utility parses records in ISO2709 (raw MARC), line format,
and MARC-in-JSON format, as well as XML that is structured as
MARCXML/MarcXchange.
MARC-in-JSON encoding/decoding is supported in YAZ 5.0.5 and
later.
Note
As of YAZ 2.1.18, OAI-MARC is no longer supported. OAI-MARC is
deprecated. Use MARCXML instead.
By default, each record is written to standard output in line
format: one line per field, with $x introducing each subfield x. The output
format may be changed with option -o.
yaz-marcdump can also perform character set
conversion on each record.
-i format
Specifies input format. Must be one of marcxml, marc
(ISO2709), marcxchange (ISO25577), line (line mode MARC), turbomarc (Turbo
MARC), or json (MARC-in-JSON).
-o format
Specifies output format. Must be one of marcxml, marc
(ISO2709), marcxchange (ISO25577), line (line mode MARC), turbomarc (Turbo
MARC), or json (MARC-in-JSON).
-f from
Specify the character set of the input MARC record.
Should be used in conjunction with option -t. Refer to the yaz-iconv man page
for supported character sets.
-t to
Specify the character set of the output. Should be used
in conjunction with option -f. Refer to the yaz-iconv man page for supported
character sets.
-l leaderspec
Specify a simple modification string for the MARC leader. The
leaderspec is a comma-separated list of pos=value pairs, where pos is an
integer offset (0 - 23) into the leader and value is either a quoted string
or an integer (the character value in decimal). For example, to set the
leader at offset 9 to a, use 9='a'.
-s prefix
Splits a record batch into separate files whose names start
with the given prefix, each holding at most "chunk" ISO2709 records.
By default chunk is 1 (one record per file). See option -C.
-C chunksize
Specifies chunk size; to be used in conjunction with option
-s.
-O offset
Integer offset of the first record that should be
written. 0=first record, 1=second, and so on. With the -L option, this allows
a specific range of records to be processed.
-L limit
Integer limit for how many records should at most be
written. With -O option, this allows a specific range of records to be
processed.
-p
Makes yaz-marcdump print record number and input file
offset of each record read.
-n
Omits MARC output, so that the MARC input is only
checked.
-r
Writes to stderr a summary of the number of records read
by yaz-marcdump.
-v
Writes more information about the parsing process. Useful
if you have ill-formatted ISO2709 records as input.
-V
Prints YAZ version.
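The record-range options (-O, -L) and the splitting options (-s, -C)
described above come down to simple arithmetic. The following sketch,
assuming a POSIX shell, works it out for a hypothetical batch. The file name
batch.mrc, the prefix part, and the record counts are placeholders chosen for
illustration, not values taken from YAZ.

```shell
# Hypothetical goal: process records 101 through 150, counting from 1.
first=101
last=150
offset=$(( first - 1 ))         # -O counts from 0, so record 101 is offset 100
limit=$(( last - first + 1 ))   # -L is a record count, not an end position
echo "yaz-marcdump -O $offset -L $limit batch.mrc"

# Hypothetical batch of 1050 records split into chunks of at most 100.
records=1050
chunk=100
files=$(( (records + chunk - 1) / chunk ))   # ceiling division
echo "yaz-marcdump -s part -C $chunk batch.mrc   # should yield $files files"
```

The commands are printed rather than executed so that the sketch runs without
an input file. Remove the echo to run them against a real batch.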
The following command converts MARC21/USMARC in MARC-8 encoding to
MARC21/USMARC in UTF-8 encoding. Leader offset 9 is set to 'a', given here
as its decimal character value 97. Both input and output records are
ISO2709 encoded.
yaz-marcdump -f MARC-8 -t UTF-8 -o marc -l 9=97 marc21.raw >marc21.utf8.raw
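The -l value in the command above uses the decimal form of the leaderspec.
As a small sketch, assuming a POSIX shell, the decimal value for a leader
character can be derived with printf and used to build either form of the
spec. The variable names, and the second pair at offset 5, are illustrative
only.

```shell
# Derive the decimal character value for a leader character.
# POSIX printf treats a leading quote in a numeric argument as
# "use the code of the character that follows".
ch=a
code=$(printf '%d' "'$ch")

# Two forms of the same leaderspec for offset 9: quoted and decimal.
spec_quoted="9='$ch'"
spec_decimal="9=$code"

# Pairs are comma separated; this one also sets offset 5 to 'c'
# (an arbitrary second pair, chosen only to show the syntax).
spec_multi="9=$code,5='c'"

echo "$spec_quoted"
echo "$spec_decimal"
echo "$spec_multi"
```

Note that the shell removes unescaped quote characters before yaz-marcdump
sees its arguments, so the quoted form generally needs outer double quotes,
as in -l "$spec_quoted", if the single quotes are meant to reach the program.
The decimal form avoids the issue.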
The same records may be converted to MARCXML instead in UTF-8:
yaz-marcdump -f MARC-8 -t UTF-8 -o marcxml marc21.raw >marcxml.xml
Turbo MARC is a compact XML notation with the same semantics as
MARCXML, but which allows for faster processing via XSLT. In order to
generate Turbo MARC records encoded in UTF-8 from MARC21 (ISO2709), one could
use:
yaz-marcdump -f MARC8 -t UTF8 -o turbomarc -i marc marc21.raw >out.xml
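To check a file without converting it, the -n, -p, -r and -v options
described earlier can be combined. The following is a minimal sketch,
assuming a POSIX shell. The function name check_marc and its return codes 2
and 3 are made up for this illustration and are not part of YAZ.

```shell
# check_marc FILE: parse FILE without emitting records.
# Hypothetical helper; return codes 2 and 3 are this sketch's own convention.
check_marc() {
    file=$1
    if [ ! -r "$file" ]; then
        echo "check_marc: cannot read $file" >&2
        return 2
    fi
    if ! command -v yaz-marcdump >/dev/null 2>&1; then
        echo "check_marc: yaz-marcdump not found in PATH" >&2
        return 3
    fi
    # -n: omit MARC output, -p: print record number and file offset,
    # -r: summary of records read on stderr, -v: more parsing detail.
    yaz-marcdump -n -p -r -v "$file"
}
```

It could then be called as check_marc marc21.raw.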
prefix/bin/yaz-marcdump
prefix/include/yaz/marcdisp.h
1. MARCXML
   https://www.loc.gov/standards/marcxml/
2. MARC-in-JSON
   https://rossfsinger.com/blog/2010/09/a-proposal-to-serialize-marc-in-json/
3. MarcXchange
   https://www.loc.gov/standards/iso25577/