VuFind: delete imported records

The VuFind admin panel (http://www.yourhost.com/vufind/Admin/Home) lets you delete records by ID (it calls Records.php, and more precisely its deleteRecord method).

But if you want, for instance, to delete all the records from a bad import, you can do it directly from the command line with the util/deletes.php script.

The first parameter is the import file name and the second is its format. If no format is supplied, ‘marc’ is assumed:

cd $VUFIND_HOME;
php util/deletes.php import/400.mrc marc

In my case an error showed up:

PHP Warning:  parse_ini_file(../web/conf/config.ini): failed to open stream: No such file or directory in /usr/local/vufind/util/deletes.php on line 48
 
Warning: parse_ini_file(../web/conf/config.ini): failed to open stream: No such file or directory in /usr/local/vufind/util/deletes.php on line 48
Solr index is offline.

Mmmhh. Relative path issues.
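The warning is consistent with the relative path being resolved against the current working directory ($VUFIND_HOME, because of the cd) instead of the directory that holds deletes.php. A minimal Python sketch of that path arithmetic, assuming the install lives in /usr/local/vufind as in the error message (the resolve helper is a hypothetical name used only for illustration):

```python
import posixpath


def resolve(base_dir, relative_path):
    """Resolve relative_path against base_dir, collapsing '..' segments."""
    return posixpath.normpath(posixpath.join(base_dir, relative_path))


REL = "../web/conf/config.ini"

# Resolved from the working directory used in the post (cd $VUFIND_HOME)
from_cwd = resolve("/usr/local/vufind", REL)
# Resolved from the directory that contains deletes.php
from_script = resolve("/usr/local/vufind/util", REL)

print(from_cwd)     # a path outside the VuFind tree
print(from_script)  # the real config.ini location
```

Only the second path points inside the VuFind tree, which is why hard-coding the full path (or running the script from inside util/) makes the warning go away.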

I opened deletes.php and edited line 48 so that parse_ini_file reads /usr/local/vufind/web/conf/config.ini (the full path). But then another error showed up:

PHP Fatal error:  Call to a member function getData() on a non-object in /usr/local/vufind/util/deletes.php on line 85
 
Fatal error: Call to a member function getData() on a non-object in /usr/local/vufind/util/deletes.php on line 85

What’s the problem now? Well, I am NOT using MARC’s 001 tag as the record identifier: my import/marc_local.properties (which overrides import/marc.properties) sets the id to the value of the record’s 907a tag, and the fatal error suggests that getField('001') did not return a field object for my records. Ugh.

We will deal with this issue later in this post.

New delete tool script

I first decided to write a small PHP script that deletes a list of identifiers. I called it util/BorraRegistros.php.

This PHP script is called WITHOUT parameters, so you have to edit it and put the identifiers in the $lista_ids_registros array.

<?php
set_include_path('/usr/local/vufind/web/:/usr/local/vufind/web/sys/:/usr/local/lib/php/');
require_once 'Solr.php';
 
$configArray = parse_ini_file('/usr/local/vufind/web/conf/config.ini', true);
 
// Setup Solr Connection
$url = $configArray['Index']['url'];
$solr = new Solr($url);
if ($configArray['System']['debug']) {
    $solr->debug = true;
}
 
// ----------------------------------------------------------------
// This is the list of SOLR IDENTIFIERS to be deleted!!!
$lista_ids_registros = array('.b1000001x');
// ----------------------------------------------------------------
 
print "Record deletion tool\nThe records with the following identifiers will be deleted:\n";
print_r($lista_ids_registros);
 
// Confirm deletion...
echo "Are you sure you want to continue? Type 'yes' to continue: ";
$handle = fopen("php://stdin", "r");
$line = fgets($handle);
if (trim($line) != 'yes') {
    echo "Cancelled\n";
    exit;
}
echo "\n";
echo "Thanks, proceeding...\n";
 
// Delete each record listed in $lista_ids_registros
foreach ($lista_ids_registros as $id_registro) {
   print "\nPreparing to delete record '$id_registro'.............................";
   $solr->deleteRecord($id_registro);
   print "[ OK ]";
}
 
print "\nFinishing the deletion...";
// Now commit and optimize
$solr->commit();
$solr->optimize();
print "\n";
 
?>

For more on Solr delete by ID, see http://wiki.apache.org/solr/UpdateXmlMessages#A.22delete.22_by_ID_and_by_Query

This could also be done using cURL, but I kinda prefer it this way.
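For reference, the XML update message that a cURL-based variant would send for a delete by ID can be sketched in a few lines of Python. This only builds the message body; posting it to the Solr update URL is left out, and the helper name build_delete_message is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_delete_message(record_ids):
    """Build a Solr XML <delete> message with one <id> element per record."""
    root = ET.Element("delete")
    for record_id in record_ids:
        ET.SubElement(root, "id").text = record_id
    return ET.tostring(root, encoding="unicode")


# A separate commit message makes the deletion visible to searches.
COMMIT_MESSAGE = "<commit/>"

message = build_delete_message([".b1000001x"])
print(message)
print(COMMIT_MESSAGE)
```

Building the message with an XML library rather than string concatenation means characters such as & inside an identifier are escaped correctly.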

Fixing util/deletes.php for records identified by a MARC tag other than 001

Above we ran into problems when deleting recently imported records with util/deletes.php. The problem is that this script does not read import/marc_local.properties and therefore does not know that our Solr records might not be identified by the MARC 001 tag.

For instance, in my import/marc_local.properties we find:

id = 907a, first

So I modified util/deletes.php so that it takes into account that my Solr records are identified by tag 907 (subfield ‘a’) and NOT by tag 001. The changes are the commented-out getField('001') call and the lines that replace it inside the while loop.

<?php
/**
 *
 * Copyright (C) Villanova University 2007.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2,
 * as published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 *
 */
 
// Parse the command line parameters -- see if we are in "flat file" mode and
// find out what file we are reading in!
$filename = $argv[1];
$mode = isset($argv[2]) ? $argv[2] : 'marc';
 
// No filename specified?  Give usage guidelines:
if (empty($filename)) {
    echo "Delete records from VuFind's index.\n\n";
    echo "Usage: deletes.php [filename] [format]\n\n";
    echo "[filename] is the file containing records to delete.\n";
    echo "[format] is the format of the file -- it may be one of the following:\n";
    echo "\tflat - flat text format (deletes all IDs in newline-delimited file)\n";
    echo "\tmarc - binary MARC format (delete all record IDs from 001 fields)\n";
    echo "\tmarcxml - MARC-XML format (delete all record IDs from 001 fields)\n";
    echo '"marc" is used by default if no format is specified.' . "\n";
    die();
}
 
// File doesn't exist?
if (!file_exists($filename)) {
    die("Cannot find file: {$filename}\n");
}
 
require_once 'util.inc.php';        // set up util environment
require_once 'sys/Solr.php';
 
// Read Config file
//$configArray = parse_ini_file('../web/conf/config.ini', true);
$configArray = parse_ini_file('/usr/local/vufind/web/conf/config.ini', true);
// Setup Solr Connection
$url = $configArray['Index']['url'];
$solr = new Solr($url);
if ($configArray['System']['debug']) {
    $solr->debug = true;
}
 
// Count deleted records:
$i = 0;
 
// Flat file mode:
if ($mode == 'flat') {
    $ids = explode("\n", file_get_contents($filename));
    foreach($ids as $id) {
        $id = trim($id);
        if (!empty($id)) {
            $solr->deleteRecord($id);
            $i++;
        }
    }
// MARC file mode:
} else {
    // We need to load the MARC record differently if it's XML or binary:
    if ($mode == 'marcxml') {
        require_once 'File/MARCXML.php';
        $collection = new File_MARCXML($filename);
    } else {
        // this require refers to /usr/local/lib/php/File/MARC.php
        require_once 'File/MARC.php';
        $collection = new File_MARC($filename);
    }
 
    // Once the record is loaded, the rest of the logic is always the same:
    while ($record = $collection->next()) {
        // getField is defined in /usr/local/lib/php/File/MARC/Record.php
        // Original line, which reads the ID from tag 001:
        // $idField = $record->getField('001');
        // Replacement: read the ID from tag 907, subfield 'a'
        $idField = $record->getField('907');
        $subfield = $idField ? $idField->getSubfield('a') : false;
        if (!$subfield) {
            // Skip records without 907$a instead of dying with a fatal error
            continue;
        }
        $id = (string)$subfield->getData();
        $solr->deleteRecord($id);
        $i++;
    }
}
 
// Commit and Optimize if necessary:
if ($i) {
    $solr->commit();
    $solr->optimize();
}
?>

Now we can run the modified script (I saved my copy as util/deletes_by_907a.php) and it will work 🙂

clear; php $VUFIND_HOME/util/deletes_by_907a.php $VUFIND_HOME/import/400.mrc marc
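An alternative that avoids patching the script is its flat mode, which takes a newline-delimited list of IDs. For a MARCXML export, the 907$a values can be pulled out with a short Python sketch like the one below. It is only an illustration under assumptions: the function names are hypothetical, the sample record is made up, and binary .mrc files would need a MARC library instead.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://www.loc.gov/MARC21/slim}'."""
    return tag.rsplit("}", 1)[-1]


def extract_ids(marcxml_text, tag="907", code="a"):
    """Return the first tag/subfield value of every record that has one."""
    root = ET.fromstring(marcxml_text)
    ids = []
    for record in root.iter():
        if local_name(record.tag) != "record":
            continue
        for field in record:
            if local_name(field.tag) != "datafield" or field.get("tag") != tag:
                continue
            values = [sf.text.strip() for sf in field
                      if local_name(sf.tag) == "subfield"
                      and sf.get("code") == code and sf.text]
            if values:
                ids.append(values[0])
                break  # same idea as the 'first' modifier: one ID per record
    return ids


# Made-up sample: the second record has no 907 field and is skipped.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">4841</controlfield>
    <datafield tag="907" ind1=" " ind2=" ">
      <subfield code="a">.b1000001x</subfield>
    </datafield>
  </record>
  <record>
    <controlfield tag="001">4840</controlfield>
  </record>
</collection>"""

sample_ids = extract_ids(SAMPLE)
print("\n".join(sample_ids))
```

Saving that output to a file and passing the file to deletes.php with flat as the second parameter should delete the same records without touching the MARC-reading branch.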

Thanks for reading, have fun!

VuFind: showing hold information in records

These days I have been tinkering with VuFind ([1] and [2]).

Showing the Holds link

In some libraries that also run VuFind on top of an Innovative Millennium catalog, I had seen a link that lets users place holds on items directly. In my VuFind installation, however, it did not seem to work.

Digging around the VuFind wiki, I found that the file getting in my way was view-holdings.tpl, and more specifically these lines:

 {foreach from=$holding item=row}
    {if $row.barcode != ""}

My records had no barcode value assigned, so that part of the code was never executed.

How did I find out? By putting a print_r of the $row variable in Smarty (see how to print_r arrays in Smarty templates). My view-holdings.tpl now looks like this:

 {foreach from=$holding item=row}
    <em> Edit view-holdings.tpl by MiguelMartin</em><br />
    {$row|@print_r}
 
    <!-- I comment out this line and add the next one -->
    <!--if $row.barcode != ""-->
    {if $row.id != ""}

Since all my records do have an ID, the code runs and the link is shown.

Long live open source! 🙂

(To learn how to configure the connection between VuFind and your catalog software, see this link.)

Making the Holds link work with Innovative

The link generated by the change above looks like
http://yourvufindserver.com/vufind/Record/.b10002030/Hold

If we use the default Innovative driver (the pair of files Innovative.ini and Innovative.php), we get a PEAR error that says something like:

Cannot Process Place Hold - ILS Not Supported

This error comes from $VUFIND_HOME/web/services/Record/Hold.php.

To fix it, replace your Innovative.ini with something like the following (careful: change the ‘url’ in the [Catalog] and [PATRONAPI] sections so that it points to your own catalog):

[Catalog]
url = http://iii.server.org
 
; The following is a set of fields to look up for
; Change them to match your HTML
[OPAC]
location_column    = "Location"
call_no_column     = "Call No"
status_column      = "Status"
reserves_column    = "Location"
reserves_key_name  = "res"
status_avail       = "AVAILABLE"
status_due         = "DUE "
 
[CONFIG]
; "bib" or "item" level holds
holdtype = "bib"
 
[PATRONAPI]
; Enable III Patron API usage for patron authentication
; and profile information.
enabled = "true"
url = http://iii.server.org:4500/PATRONAPI/

And replace your Innovative.php with:

<?php
/**
 *
 * Copyright (C) Villanova University 2007.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2,
 * as published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 *
 */
require_once 'sys/Proxy_Request.php';
require_once 'Interface.php';
 
/**
 * VuFind Connector for Innovative
 *
 * This class uses screen scraping techniques to gather record holdings written
 * by Adam Bryn of the Tri-College consortium.
 *
 * @author Adam Brin <abrin@brynmawr.com>
 */
class Innovative implements DriverInterface
{
    public $config;
 
    public function __construct()
    {
        // Load Configuration for this Module
        $this->config = parse_ini_file('conf/Innovative.ini', true);
    }
 
    public function getStatus($id)
    {
        // Strip ID
        $id_ = substr(str_replace('.b', '', $id), 0, -1);
 
        // Load Record Page
        if (substr($this->config['Catalog']['url'], -1) == '/') {
            $host = substr($this->config['Catalog']['url'], 0, -1);
        } else {
            $host = $this->config['Catalog']['url'];
        }
        //$req = new Proxy_Request($host . '/record=b' . $id_);
 
        //Grab the full item list view
        $req = new Proxy_Request($host . '/search/.b' . $id_ . '/.b' . $id_ .'/1%2C1%2C1%2CB/holdings~' . $id_ . '&FF=&1%2C0%2C');
        if (PEAR::isError($req->sendRequest())) {
            return null;
        }
        $result = $req->getResponseBody();
 
        //strip out html before the first occurance of 'bibItems', should be '<table class="bibItems" '
        $r = substr($result, stripos($result, 'bibItems'));
        //strip out the rest of the first table tag.
        $r = substr($r,strpos($r,">")+1);
        //strip out the next table closing tag and everything after it.
        $r = substr($r,0,stripos($r,"</table"));
 
        //$r should only include the holdings table at this point
 
        //split up into strings that contain each table row, excluding the beginning tr tag.
        $rows = preg_split("/<tr([^>]*)>/",$r);
        $count = 0;
        $keys = array_pad(array(),10,"");
 
        $loc_col_name      = $this->config['OPAC']['location_column'];
        $call_col_name     = $this->config['OPAC']['call_no_column'];
        $status_col_name   = $this->config['OPAC']['status_column'];
        $reserves_col_name = $this->config['OPAC']['reserves_column'];
        $reserves_key_name = $this->config['OPAC']['reserves_key_name'];
        $stat_avail        = $this->config['OPAC']['status_avail'];
        $stat_due              = $this->config['OPAC']['status_due'];
 
        $ret = array();
        foreach ($rows as $row) {
                // Split up the contents of the row based on the th or td tag, excluding the tags themselves.
                $cols = preg_split("/<t(h|d)([^>]*)>/",$row);
 
                //for each th or td section, do the following.
                for ($i=0; $i < sizeof($cols); $i++) {
 
                        //replace non blocking space encodings with a space.
                        $cols[$i] = str_replace("&nbsp;"," ",$cols[$i]);
                        //remove html comments (preg_replace, since ereg_replace was removed in PHP 7)
                        $cols[$i] = preg_replace('/<!--.*?-->/s', '', $cols[$i]);
                        //Remove closing th or td tag, trim whitespace and decode html entities
                        $cols[$i] = html_entity_decode(trim(substr($cols[$i],0,stripos($cols[$i],"</t"))));
 
                        //If this is the first row, it is the header row and has the column names
                        if ($count == 1) {
                                $keys[$i] = $cols[$i];
                        } else if ($count > 1) { //not the first row, has holding info
                                //look for location column
                                if (stripos($keys[$i],$loc_col_name) > -1) {
                                        $ret[$count-2]['location'] = strip_tags($cols[$i]);
                                }
                                // Does column hold reserves information?
                                if (stripos($keys[$i],$reserves_col_name) > -1) {
                                        if (stripos($cols[$i],$reserves_key_name) > -1) {
                                        $ret[$count-2]['reserve'] = 'Y';
                                } else {
                                        $ret[$count-2]['reserve'] = 'N';
                                }
                        }
                    // Does column hold call numbers?
                    if (stripos($keys[$i],$call_col_name) > -1) {
                        $ret[$count-2]['callnumber'] = strip_tags($cols[$i]);
                    }
                    // Look for status information.
                    if (stripos($keys[$i],$status_col_name) > -1) {
                        if (stripos($cols[$i],$stat_avail) > -1) {
                            $ret[$count-2]['status'] = "Available On Shelf";
                            $ret[$count-2]['availability'] = 1;
                        } else {
                            $ret[$count-2]['status'] = "Available to request";
                            $ret[$count-2]['availability'] = 0;
                        }
                        if (stripos($cols[$i],$stat_due) > -1) {
                            $t = trim(substr($cols[$i],stripos($cols[$i],$stat_due)+strlen($stat_due)));
                            $t = substr($t,0,stripos($t," "));
                            $ret[$count-2]['duedate'] = $t;
                        }
                    }
                    //$ret[$count-2][$keys[$i]] = $cols[$i];
                    //$ret[$count-2]['id'] = $bibid;
                    $ret[$count-2]['id'] = $id;
                    $ret[$count-2]['number'] = ($count -1);
                    //Return a fake barcode so hold link is enabled
                    //  Should be dependent on settings variable,  If bib level holds.
                    $ret[$count-2]['barcode'] = '1234567890123';
                }
            }
            $count++;
        }
        return $ret;
    }
 
    public function getStatuses($ids)
    {
        $items = array();
        $count = 0;
        foreach ($ids as $id) {
               $items[$count] = $this->getStatus($id);
               $count++;
        }
        return $items;
    }
 
    public function getHolding($id)
    {
        return $this->getStatus($id);
    }
 
    public function getPurchaseHistory($id)
    {
        return array();
    }
 
    public function getHoldLink($id)
    {
        // Strip ID
        $id_ = substr(str_replace('.b', '', $id), 0, -1);
 
        //Build request link
        $link = $this->config['Catalog']['url'] . '/search?/.b' . $id_ . '/.b' . $id_ . '/1%2C1%2C1%2CB/request~b'. $id_;
        //$link = $this->config['Catalog']['url'] . '/record=b' . $id_;
        return $link;
    }
 
    public function getMyProfile($userinfo)
    {
        return $userinfo;
 
    }
    public function patronLogin($username,$password)
    {
        //Todo, if username is a barcode, test to make sure it fits proper format
 
        if($this->config['PATRONAPI']['enabled'] == 'true'){
                //use patronAPI to authenticate customer
                $url = $this->config['PATRONAPI']['url'];
 
                //build patronapi pin test request
 
                $req = new Proxy_Request($url . urlencode($username) . '/' . urlencode($password) . '/pintest');
                if (PEAR::isError($req->sendRequest())) {
                    return null;
                }
                $result = $req->getResponseBody();
 
                //search for successful response of "RETCOD=0"
                //(stripos returns false, never -1, when the string is not found)
                if (stripos($result, "RETCOD=0") === false) {
                        //pin did not match, can look up specific error to return more useful info.
                        return null;
 
                }
 
                //Pin did match, get patron information
                $req = new Proxy_Request($url . urlencode($username) . '/dump');
                if (PEAR::isError($req->sendRequest())) {
                    return null;
                }
                $result = $req->getResponseBody();
 
                //The following is taken and modified from patronapi.php by John Blyberg released under the GPL
                $api_contents = trim(strip_tags($result));
                $api_array_lines = explode("\n", $api_contents);
                // Parse every "KEY[code]=value" line once. The original wrapped this
                // in a while loop that could spin forever when the response had
                // neither a 14-character barcode nor an error number.
                $api_data = array();
                foreach ($api_array_lines as $api_line) {
                        $api_line = str_replace("p=", "peq", $api_line);
                        $api_line_arr = explode("=", $api_line);
                        if (count($api_line_arr) < 2) {
                                continue; // not a KEY=value line
                        }
                        $regex_match = array("/\[(.*?)\]/","/\s/","/#/");
                        $regex_replace = array('','','NUM');
                        $key = trim(preg_replace($regex_match, $regex_replace, $api_line_arr[0]));
                        $api_data[$key] = trim($api_line_arr[1]);
                }
 
                if (empty($api_data['PBARCODE'])) {
                        //no barcode found, can look up specific error to return more useful info.
                        //this check needs to be modified to handle using III patron ids also.
                        return null;
 
                }
                //return patron info (array keys are quoted: bare constants fail on current PHP)
                $ret = array();
                $ret['id'] = $api_data['PBARCODE']; //or should I return patron id num?
                $names = explode(',', $api_data['PATRNNAME']);
                $ret['firstname'] = $names[1];
                $ret['lastname'] = $names[0];
                $ret['cat_username'] = urlencode($username);
                $ret['cat_password'] = urlencode($password);
                $ret['email'] = $api_data['EMAILADDR'];
                $ret['major'] = null;
                $ret['college'] = $api_data['HOMELIBR'];
                $ret['homelib'] = $api_data['HOMELIBR'];
                //replace $ separator in III addresses with a comma
                $ret['address1'] = str_replace("$",", ",$api_data['ADDRESS']);
                $ret['address2'] = str_replace("$",", ",$api_data['ADDRESS2']);
                preg_match("/([0-9]{5}|[0-9]{5}-[0-9]{4})[ ]*$/",$api_data['ADDRESS'],$zipmatch);
                $ret['zip'] = isset($zipmatch[1]) ? $zipmatch[1] : null;  //retrieve from address
                $ret['phone'] = $api_data['TELEPHONE'];
                $ret['phone2'] = $api_data['TELEPHONE2'];
                //Should probably have a translation table for patron type
                $ret['group'] = $api_data['PTYPE'];
                $ret['expiration'] = $api_data['EXPDATE'];
                //Only if agency module is enabled.
                $ret['region'] = $api_data['AGENCY'];
                return $ret;
 
        }
        else {
                //use screen scrape
 
        }
 
    }
}
 
?>

This way, clicking the “Reservar” (place hold) link will take you to something like:

https://yourcatalog.com/patroninfo~S1/0/redirect=/search?/.b1000203/.b1000203/1%2C1%2C1%2CB/request~b1000203
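To make the ID handling in getHoldLink easier to follow, here is a small Python sketch that mirrors it: the '.b' prefix and the trailing check character are dropped, and the shortened ID is inserted three times into the request URL. The function names are hypothetical and the catalog URL is the placeholder from Innovative.ini.

```python
def strip_bib_id(record_id):
    """Mirror substr(str_replace('.b', '', $id), 0, -1) from the PHP driver."""
    return record_id.replace(".b", "")[:-1]


def build_hold_link(catalog_url, record_id):
    """Build the Innovative request URL the same way getHoldLink does."""
    short_id = strip_bib_id(record_id)
    return (catalog_url + "/search?/.b" + short_id + "/.b" + short_id
            + "/1%2C1%2C1%2CB/request~b" + short_id)


link = build_hold_link("http://iii.server.org", ".b10002030")
print(link)
```

The patroninfo~S1/0/redirect= prefix in the URL above is not produced by this function; it appears to come from the catalog redirecting to its login page first.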

See also: more help.

VuFind with Innovative Millennium and CDS-Invenio: importing records and configuring facets

A few days ago I described how to install VuFind.

Once the software is installed and the basics are confirmed to work (LDAP authentication, etc.), the next step is importing records, either from the library catalog or from OAI repositories.

Here are some conclusions from my first loading experiments.

Loading test records from Innovative Millennium (MARC format)

Suppose we have exported a file of MARC records (.mrc) from Millennium. If you have no record export at hand, you can use this page or this other one to get sample data.

My file is called $VUFIND_HOME/import/400.mrc

I import the records with the following command:

/usr/local/vufind/import-marc.sh import/400.mrc

More information in the VuFind wiki.

Loading test records from CDS-Invenio (MARCXML format)

Suppose we export a set of records from a repository in MARCXML format, for example this one.

I create a folder to store the exported .xml files:

mkdir $VUFIND_HOME/harvest/desdezaguan
mkdir $VUFIND_HOME/harvest/desdezaguan/marcxml
cd $VUFIND_HOME/harvest/desdezaguan/marcxml

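I fetch the records. The URL must be quoted: unquoted, the shell treats every & as a background operator and wget only requests search?as=1. The output shown here was captured from such an unquoted run, hence the `search?as=1' file name and the trailing [1]+ Done job line.

wget 'http://zaguan.unizar.es/search?as=1&cc=Tesis&m1=a&p1=&f1=&op1=a&m2=a&p2=&f2=&op2=a&m3=a&p3=&f3=&action_search=Buscar&c=Tesis&c=&sf=&so=a&rm=&rg=160&sc=1&of=xm'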
 
Connecting to zaguan.unizar.es|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `search?as=1'
 
    [  <=>                                                                                                ] 38,128       131K/s   in 0.3s
 
2010-09-30 14:11:26 (131 KB/s) - `search?as=1' saved [38128]
 
 
[1]+  Done                    wget http://zaguan.unizar.es/search?as=1

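I rename the downloaded file (if you quoted the URL, wget names the file after the whole query string; passing -O tesis0.xml to wget avoids the rename)…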

 mv search\?as\=1 tesis0.xml

Now we run the import. The script is invoked with the directory containing the XML files as its parameter:

[root@ harvest]# cd /usr/local/vufind/harvest; \
                     ./batch-import-marc.sh desdezaguan-tesis/marcxml/
 
Now Importing /usr/local/vufind/harvest/desdezaguan-tesis/marcxml//tesis0.xml ...
/usr/java/jre1.6.0_17/bin/java -Xms512m -Xmx512m -Dsolrmarc.solr.war.path=/usr/local/vufind/solr/jetty/webapps/solr.war -Dsolr.core.name=biblio -Dsolrmarc.path=/usr/local/vufind/import -Dsolr.path=/usr/local/vufind/solr -Dsolr.solr.home=/usr/local/vufind/solr -jar /usr/local/vufind/import/SolrMarc.jar /usr/local/vufind/import/import.properties /usr/local/vufind/harvest/desdezaguan-tesis/marcxml/tesis0.xml
 INFO [main] (MarcImporter.java:769) - Starting SolrMarc indexing.
 INFO [main] (Utils.java:189) - Opening file: /usr/local/vufind/import/import.properties
 INFO [main] (MarcHandler.java:325) - Attempting to open data file: /usr/local/vufind/harvest/desdezaguan-tesis/marcxml/tesis0.xml
 INFO [main] (MarcImporter.java:618) -  Updating to Solr index at /usr/local/vufind/solr
 INFO [main] (MarcImporter.java:634) -      Using Solr core biblio
 INFO [main] (SolrCoreLoader.java:102) - Using the data directory of: /usr/local/vufind/solr/biblio
 INFO [main] (SolrCoreLoader.java:104) - Using the multicore schema file at : /usr/local/vufind/solr/solr.xml
 INFO [main] (SolrCoreLoader.java:105) - Using the biblio core
 INFO [main] (MarcImporter.java:266) - Added record 1 read from file: 4841
 INFO [main] (MarcImporter.java:266) - Added record 2 read from file: 4840
 INFO [main] (MarcImporter.java:266) - Added record 3 read from file: 4823
 
....
 
 INFO [main] (MarcImporter.java:516) -  Adding 160 of 160 documents to index
 INFO [main] (MarcImporter.java:517) -  Deleting 0 documents from index
 INFO [main] (MarcImporter.java:391) - Calling commit
 INFO [main] (MarcImporter.java:402) - Done with the commit, closing Solr
 INFO [main] (MarcImporter.java:405) - Setting Solr closed flag
 INFO [main] (MarcImporter.java:431) - Connecting to solr server at URL: http://localhost:8080/solr/biblio/update
 INFO [main] (SolrUpdate.java:135) - <?xml version="1.0" encoding="UTF-8"?>
 INFO [main] (SolrUpdate.java:135) - <response>
 INFO [main] (SolrUpdate.java:135) - <lst name="responseHeader"><int name="status">0</int><int name="QTime">136</int></lst>
 INFO [main] (SolrUpdate.java:135) - </response>
 INFO [main] (MarcImporter.java:526) - Finished indexing in 0:01.00
 INFO [main] (MarcImporter.java:535) - Indexed 10 at a rate of about 8.0 per sec
 INFO [main] (MarcImporter.java:536) - Deleted 0 records
 INFO [Thread-2] (MarcImporter.java:465) - Starting Shutdown hook
 INFO [Thread-2] (MarcImporter.java:484) - Finished Shutdown hook

The command MOVES the file tesis0.xml and creates:

harvest/desdezaguan/marcxml/log and
harvest/desdezaguan/marcxml/processed

Let's see what each folder contains:

[root@ marcxml]# ls -l $VUFIND_HOME/harvest/desdezaguan/marcxml/log/
total 8
-rw-r--r-- 1 root root 3095 Sep 29 12:54 tesis0.xml.log
 
[root@olmo marcxml]# ls -l processed/
total 68
-rw-r--r-- 1 root root 59068 Sep 29 12:49 tesis0.xml (the original)
 
[root@olmo marcxml]# cd log/
[root@olmo log]# more tesis0.xml.log

*** NOTE: if the record identifier ALREADY EXISTS in VuFind, the record is not duplicated; it is updated.

More information in the VuFind wiki.

Configuring the display names of Solr facets in VuFind

Facets are described in the facets.ini file. Here is how our file looks after customizing the names shown for each facet. The left-hand side is the ‘logical name’ of the Solr index field and the right-hand side is the display name (aka ‘what shows up on the web page as a facet’).

* Note: accented characters work perfectly (thanks vufind guys!)

more $VUFIND_HOME/web/conf/facets.ini
 
; The order of display is as shown below
; The name of the index field is on the left
; The display name of the field is on the right
[Results]
institution        = Origen
building           = Localización
format             = Formato
 
; Use callnumber-first for LC call numbers, dewey-hundreds for Dewey Decimal:
callnumber-first   = "Call Number"
;dewey-hundreds     = "Call Number"
 
authorStr          = Autor
language           = Idioma
genre_facet        = Genero
era                = Era
geographic_facet   = Región
 
; Facets that will appear at the top of search results when the TopFacets
; recommendations module is used.  See the [TopRecommendations] section of
; searches.ini for more details.
[ResultsTop]
topic_facet        = "Suggested Topics"
 
; This section is reserved for special boolean facets.  These are displayed
; as checkboxes.  If the box is checked, the filter on the left side of the
; equal sign is applied.  If the box is not checked, the filter is not applied.
; The value on the right side of the equal sign is the text to display to the
; user.  It will be run through the translation code, so be sure to update the
; language files appropriately.
;
; Leave the section empty if you do not need checkbox facets.
;
; NOTE: Do not create CheckboxFacets using values that also exist in the
;       other facet sections above -- this will not work correctly.

The facet names will end up looking like this:
(screenshot: vufind facets example configuration)

Read more about facet configuration in VuFind and Solr.

Assigning values to Solr facets in VuFind

This is the next step we want to take. But first, let's look at how to query VuFind's Solr engine.
At http://yoursite.com:8080/solr/biblio/admin/form.jsp on your web server there is a friendly interface that shows the XML responses the engine returns for queries, which gives us an idea of how we want to assign values to each field.

Solr query interface:
(screenshot: vufind solr web interface)

Run a query and look at the XML it returns. Pay attention to the values the returned records have in each field. As a concrete example, let's look at the institution field, which by default always gets the static value ‘MyInstitution’ for every imported record:

<arr name="institution">
    <str>MyInstitution</str>
</arr>

This is due to the following line in $VUFIND_HOME/import/marc.properties, which assigns the static value ‘MyInstitution’ to that facet:

institution = "MyInstitution"
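To see at a glance which values a field such as institution holds across the whole index, a facet query is handy. The sketch below only builds the query URL with standard Solr parameters (q, rows, facet, facet.field); the host is the placeholder used in this post, the /select handler path is assumed to be the default one, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def build_facet_query_url(solr_core_url, field):
    """Build a Solr URL that returns facet counts for one field and no documents."""
    params = [
        ("q", "*:*"),            # match every record
        ("rows", "0"),           # only the facet counts are of interest
        ("facet", "true"),
        ("facet.field", field),
    ]
    return solr_core_url.rstrip("/") + "/select?" + urlencode(params)


url = build_facet_query_url("http://yoursite.com:8080/solr/biblio", "institution")
print(url)
```

Opening the resulting URL in a browser before and after a re-import is a quick way to check that a new mapping took effect.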

Suppose we want the ‘institution’ facet to reflect the origin of the data. We will have several different origins: catalog and repository.

If the record comes from the repository (that is, it has 980a == ‘TESIS’), we want this facet to store the string “Repositorio”.
To do that we edit the marc_local.properties file (this file overrides the default settings in marc.properties).

vi $VUFIND_HOME/import/marc_local.properties

And we add the following line:

# Assign to the 'institution' facet the value of tag 980a, mapped through the file zaguan_map.properties
institution = 980a,zaguan_map.properties

And the content of the file $VUFIND_HOME/import/zaguan_map.properties is:

[root@ import]# more /usr/local/vufind/import/zaguan_map.properties
 
# If the value of the MARCXML tag is 'TESIS', assign the string 'Repositorio' to the facet
TESIS = Repositorio

Likewise, suppose that if the record comes from the catalog (that is, it has 907a == ‘.b1XXX’) we want this facet to store the string “Catálogo”. In $VUFIND_HOME/import/marc_local.properties we put the line:

# Take characters 1 and 2 of tag 907a and map them through the file roble_map.properties.
institution = 907a[1-2],roble_map.properties

And in roble_map.properties:

[root@olmo import]# more /usr/local/vufind/import/roble_map.properties
b1 = Catalogo

Watch out for special characters such as the dot (.): they are interpreted as regular expressions and would have to be escaped!
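The effect of the two mapping rules can be pictured with a small Python sketch. It is not how SolrMarc implements them; it only reproduces the lookups described above, assuming that [1-2] selects the zero-based characters 1 and 2 (so '.b1000001x' yields 'b1', which matches the key in roble_map.properties). The function names are hypothetical.

```python
# Contents of the two map files from this post, as plain dictionaries
ZAGUAN_MAP = {"TESIS": "Repositorio"}
ROBLE_MAP = {"b1": "Catalogo"}


def map_980a(value):
    """institution = 980a,zaguan_map.properties: look the whole value up."""
    return ZAGUAN_MAP.get(value)


def map_907a_prefix(value):
    """institution = 907a[1-2],roble_map.properties: look up characters 1 and 2."""
    return ROBLE_MAP.get(value[1:3])


print(map_980a("TESIS"))
print(map_907a_prefix(".b1000001x"))
```

Taking the substring first means the leading dot never reaches the map file, which sidesteps the escaping problem mentioned above.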

It is also useful to set the field that will act as the record identifier in VuFind. In our case we want to use the value of tag 907a as the record identifier. Since the tag is repeatable, we also need to add the first modifier.

So we add the following line to marc_local.properties:

id = 907a, first

Remember to restart VuFind after these changes:

$VUFIND_HOME/vufind.sh restart

For now I leave you with the facets manual in the VuFind wiki so you can keep reading 😉

More experiences coming soon!