The VuFind admin panel (http://www.yourhost.com/vufind/Admin/Home) lets you delete records by ID (it calls Records.php, more precisely its deleteRecord method).
But if you want, for instance, to delete all the records from a bad import, you can do it directly from the system prompt with the util/deletes.php script. The first parameter is the import file name and the second is its format. If no format is supplied, ‘marc’ is assumed:
cd $VUFIND_HOME; php util/deletes.php import/400.mrc marc
In my case, an error showed up:
PHP Warning: parse_ini_file(../web/conf/config.ini): failed to open stream: No such file or directory in /usr/local/vufind/util/deletes.php on line 48
Warning: parse_ini_file(../web/conf/config.ini): failed to open stream: No such file or directory in /usr/local/vufind/util/deletes.php on line 48
Solr index is offline.
Mmmhh. Relative path issues.
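I opened deletes.php and edited line 48 so that parse_ini_file reads /usr/local/vufind/web/conf/config.ini (the full path) instead of the relative ../web/conf/config.ini, which is resolved against the directory the script is run from. But then another error showed up: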
PHP Fatal error: Call to a member function getData() on a non-object in /usr/local/vufind/util/deletes.php on line 85
Fatal error: Call to a member function getData() on a non-object in /usr/local/vufind/util/deletes.php on line 85
What’s the problem now? Well, I am NOT using MARC’s 001 tag as the identifier. As stated in my import/marc_local.properties (which overrides import/marc.properties), my id is set to the record’s 907a value… Ugh.
We will deal with this issue later, in the section on fixing util/deletes.php below.
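In the meantime, the usage text inside deletes.php (see the listing further down) mentions a "flat" format that deletes every ID found in a newline-delimited file, which would sidestep the 001 lookup altogether. A minimal sketch of how that might be used, assuming the config path fix above is already in place; the file name /tmp/ids.txt and the two helper functions are hypothetical, and '.b1000001x' is only a sample ID:

```shell
#!/bin/sh
# Sketch only: delete records by Solr ID through the "flat" mode of deletes.php.
# Assumption: VUFIND_HOME points at the VuFind installation.

# Write one Solr ID per line into the given file (hypothetical helper).
write_id_list() {
    out_file=$1
    shift
    printf '%s\n' "$@" > "$out_file"
}

# Run deletes.php in flat mode against an ID list file (hypothetical helper).
delete_flat() {
    php "$VUFIND_HOME/util/deletes.php" "$1" flat
}

# Usage (not executed here):
#   write_id_list /tmp/ids.txt '.b1000001x' '.b10000021'
#   delete_flat /tmp/ids.txt
```

The IDs in the file have to be the Solr identifiers (in my case the 907a values), not the 001 values.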
New delete tool script
I first decided to write a small PHP program that deletes a list of identifiers. I called it util/BorraRegistros.php.
This PHP script is called WITHOUT parameters, so you’ll have to edit it and put the identifiers to delete in the $lista_ids_registros array.
<?php
set_include_path('/usr/local/vufind/web/:/usr/local/vufind/web/sys/:/usr/local/lib/php/');
require_once 'Solr.php';

$configArray = parse_ini_file('/usr/local/vufind/web/conf/config.ini', true);

// Setup Solr Connection
$url = $configArray['Index']['url'];
$solr = new Solr($url);
if ($configArray['System']['debug']) {
    $solr->debug = true;
}

// ----------------------------------------------------------------
// This is the list of SOLR IDENTIFIERS to be deleted!!!
$lista_ids_registros = array('.b1000001x');
// ----------------------------------------------------------------

print "Record deletion tool\nThe records with the following identifiers will be deleted:\n";
print_r($lista_ids_registros);

// Confirm deletion...
echo "Are you sure you want to continue? Type 'yes' to continue: ";
$handle = fopen("php://stdin", "r");
$line = fgets($handle);
if (trim($line) != 'yes') {
    echo "Cancelled\n";
    exit;
}
echo "\n";
echo "Thank you, proceeding...\n";

// Delete each record whose ID is listed in $lista_ids_registros
foreach ($lista_ids_registros as $id_registro) {
    print "\nPreparing to delete record '$id_registro'.............................";
    $solr->deleteRecord($id_registro);
    print "[ OK ]";
}

print "\nFinishing the deletion...";

// Now commit and optimize
$solr->commit();
$solr->optimize();
print "\n";
?>
For more on Solr delete by ID, see http://wiki.apache.org/solr/UpdateXmlMessages#A.22delete.22_by_ID_and_by_Query
This script could also be done using cURL, but I kinda prefer it this way.
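For the curious, here is a rough sketch of what the cURL route could look like. It posts Solr's XML delete-by-ID message and then a commit. The update URL is an assumption (adjust it to the Index url and core of your own config.ini), and the helper function names are made up for this example:

```shell
#!/bin/sh
# Sketch only: delete one record by ID by posting XML to Solr's update handler.
# Assumption: this URL matches your installation; override it via the environment.
SOLR_UPDATE_URL=${SOLR_UPDATE_URL:-http://localhost:8080/solr/biblio/update}

# Escape the characters that are special inside XML text.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Build the delete-by-ID message for a single identifier.
build_delete_xml() {
    printf '<delete><id>%s</id></delete>' "$(xml_escape "$1")"
}

# Post an XML message to the update handler.
post_xml() {
    curl -sS "$SOLR_UPDATE_URL" \
        -H 'Content-Type: text/xml; charset=utf-8' \
        --data-binary "$1"
}

# Usage (not executed here):
#   post_xml "$(build_delete_xml '.b1000001x')"
#   post_xml '<commit/>'
```

Unlike the PHP script, this version asks for no confirmation, so double-check the IDs before running it.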
Fixing util/deletes.php to handle records identified by a MARC tag other than 001
Above we ran into problems when deleting recently imported records with util/deletes.php. The problem is that this script does not read import/marc_local.properties, so it does not notice that our Solr records might not be identified by their MARC 001 tag.
For instance, in my import/marc_local.properties we find:
id = 907a, first
So I modified util/deletes.php so that it takes into account that my Solr records are identified by tag 907 (subfield ‘a’) and NOT by tag 001! Notice the changes inside the while loop (around lines 86-90), which are marked with comments.
<?php
/**
 *
 * Copyright (C) Villanova University 2007.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2,
 * as published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 */

// Parse the command line parameters -- see if we are in "flat file" mode and
// find out what file we are reading in!
$filename = $argv[1];
$mode = isset($argv[2]) ? $argv[2] : 'marc';

// No filename specified? Give usage guidelines:
if (empty($filename)) {
    echo "Delete records from VuFind's index.\n\n";
    echo "Usage: deletes.php [filename] [format]\n\n";
    echo "[filename] is the file containing records to delete.\n";
    echo "[format] is the format of the file -- it may be one of the following:\n";
    echo "\tflat - flat text format (deletes all IDs in newline-delimited file)\n";
    echo "\tmarc - binary MARC format (delete all record IDs from 001 fields)\n";
    echo "\tmarcxml - MARC-XML format (delete all record IDs from 001 fields)\n";
    echo '"marc" is used by default if no format is specified.' . "\n";
    die();
}

// File doesn't exist?
if (!file_exists($filename)) {
    die("Cannot find file: {$filename}\n");
}

require_once 'util.inc.php'; // set up util environment
require_once 'sys/Solr.php';

// Read Config file
//$configArray = parse_ini_file('../web/conf/config.ini', true);
$configArray = parse_ini_file('/usr/local/vufind/web/conf/config.ini', true);

// Setup Solr Connection
$url = $configArray['Index']['url'];
$solr = new Solr($url);
if ($configArray['System']['debug']) {
    $solr->debug = true;
}

// Count deleted records:
$i = 0;

// Flat file mode:
if ($mode == 'flat') {
    $ids = explode("\n", file_get_contents($filename));
    foreach($ids as $id) {
        $id = trim($id);
        if (!empty($id)) {
            $solr->deleteRecord($id);
            $i++;
        }
    }
// MARC file mode:
} else {
    // We need to load the MARC record differently if it's XML or binary:
    if ($mode == 'marcxml') {
        require_once 'File/MARCXML.php';
        $collection = new File_MARCXML($filename);
    } else {
        // this require refers to /usr/local/lib/php/File/MARC.php
        require_once 'File/MARC.php';
        $collection = new File_MARC($filename);
    }

    // Once the record is loaded, the rest of the logic is always the same:
    while ($record = $collection->next()) {
        // getField is defined in /usr/local/lib/php/File/MARC/Record.php
        // Comment this line
        // $idField = $record->getField('001');
        // Add the following two lines...
        $idField = $record->getField('907');
        $idField = $idField->getSubfield('a');
        $id = (string)$idField->getData();
        $solr->deleteRecord($id);
        $i++;
    }
}

// Commit and Optimize if necessary:
if ($i) {
    $solr->commit();
    $solr->optimize();
}
?>
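The script above hard-codes 907a, which is right for my setup only. To check which field your own installation uses, you can pull the id line out of the properties file. This is just a sketch: it assumes a simple entry of the form "id = 907a, first", does not handle custom or scripted field specs, and the function name is made up for this example:

```shell
#!/bin/sh
# Sketch only: print the field spec mapped to "id" in a properties file,
# e.g. a line "id = 907a, first" yields "907a". Lines starting with "#"
# are skipped, and if the key appears more than once the last one is used.
id_field_spec() {
    sed -n 's/^[[:space:]]*id[[:space:]]*=[[:space:]]*\([^,[:space:]]*\).*$/\1/p' "$1" | tail -n 1
}

# Usage (not executed here):
#   spec=$(id_field_spec "$VUFIND_HOME/import/marc_local.properties")
#   tag=$(printf '%s' "$spec" | cut -c1-3)        # e.g. 907
#   subfield=$(printf '%s' "$spec" | cut -c4-)    # e.g. a (empty for 001)
```

If the function prints nothing for marc_local.properties, the id is probably not overridden there, and import/marc.properties is the file to look at.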
Now we can run the new script (I saved it as util/deletes_by_907a.php), and it works:
clear; php $VUFIND_HOME/util/deletes_by_907a.php $VUFIND_HOME/import/400.mrc marc
Thanks for reading, have fun!


