Renamed properties used for unittest to solrwayback_unittest.properties #411

Merged
merged 1 commit into from
Aug 16, 2023
1 change: 1 addition & 0 deletions src/bundle/README.md
@@ -5,3 +5,4 @@ Resources used when building the SolrWayback bundle.
- `install SolrWayback bundle`: See install guide [SolrWayback README](https://github.com/netarchivesuite/solrwayback/blob/master/README.md/)
- `indexing`: Scripts for indexing WARC files using [webarchive-discovery](https://github.com/ukwa/webarchive-discovery/)
- `Changes.md`: See version history [SolrWayback](https://github.com/netarchivesuite/solrwayback/blob/master/CHANGES.md/)
- `properties`: Default properties for the SolrWayback Bundle
@@ -184,7 +184,7 @@ private void scanRoot(Path path, Map<String, String> warcs) {
}
String filename = pathEntry.getFileName().toString();
if (!filePattern.matcher(filename).matches()) {
log.debug("Scanner encountered non-matching file '{}'", filename);
log.trace("Scanner encountered non-matching file '{}'", filename); //spamming too much during build
return;
}
if (warcs.containsKey(filename)) {
12 changes: 12 additions & 0 deletions src/test/java/README.txt
@@ -0,0 +1,12 @@
Information about unit tests.

Property loading.
For unit tests that require the properties to be initialised, load the properties this way:
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());

This uses the property file under src/test/resources/properties.

If you need a unit test with quite different properties, you can create a new property file and load that. Just be sure
to include "unittest" in the name of the property file.

TODO: more documentation
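
The naming rule above can be sketched as a small guard. This is only an illustration under stated assumptions, not project code: the class `UnitTestPropertiesSketch` and its methods `requireUnitTestName` and `load` are hypothetical and use nothing but `java.util.Properties`. In the real tests the loading itself goes through `PropertiesLoader.initProperties(...)` as shown in the README text.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class UnitTestPropertiesSketch {

    /** Hypothetical guard: rejects property files whose name lacks "unittest". */
    public static Path requireUnitTestName(Path file) {
        String name = file.getFileName().toString();
        if (!name.contains("unittest")) {
            throw new IllegalArgumentException(
                    "Test property file must contain 'unittest' in its name: " + name);
        }
        return file;
    }

    /** Loads a UTF-8 property file after checking its name. */
    public static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(requireUnitTestName(file), StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Write a throwaway property file so the sketch runs without the project checkout.
        Path dir = Files.createTempDirectory("props");
        Path file = dir.resolve("solrwayback_unittest.properties");
        Files.write(file, "url.normaliser=normal\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(load(file).getProperty("url.normaliser"));
    }
}
```

A guard like this would make it harder for a test to pick up a production-style property file by accident, which appears to be the motivation for the rename in this pull request.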
@@ -2,6 +2,7 @@

import java.io.PrintWriter;

import dk.kb.netarchivesuite.solrwayback.UnitTestUtils;
import dk.kb.netarchivesuite.solrwayback.properties.PropertiesLoader;
import dk.kb.netarchivesuite.solrwayback.solr.SolrStreamingExportClient;
import org.apache.solr.client.solrj.SolrClient;
@@ -13,7 +14,7 @@ public class TestGenerateCSV {

public static void main(String[] args) throws Exception{

PropertiesLoader.initProperties();
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());

String query = "thomas egense";
String filter = null;
@@ -2,6 +2,7 @@

import java.io.PrintWriter;

import dk.kb.netarchivesuite.solrwayback.UnitTestUtils;
import dk.kb.netarchivesuite.solrwayback.properties.PropertiesLoader;
import dk.kb.netarchivesuite.solrwayback.solr.SolrStreamingLinkGraphCSVExportClient;
import org.apache.solr.client.solrj.SolrClient;
@@ -13,7 +14,7 @@ public class TestGenerateLinkGraphCSV {

public static void main(String[] args) throws Exception{

PropertiesLoader.initProperties();
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());

String query = "katte";

@@ -38,7 +38,7 @@ public class HtmlParserUrlRewriterTest {
public void invalidateProperties() throws Exception{

// Need this to ensure that the normaliser has a known setting
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback.properties").getPath());
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());
Normalisation.setTypeFromConfig();

// We need this so that we know what the Solr server is set to
@@ -32,7 +32,7 @@ public class ScriptRewriterTest {
public void invalidateProperties() throws IOException {

// Need this to ensure that the normaliser has a known setting
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback.properties").getPath());
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());
Normalisation.setTypeFromConfig();
// PropertiesLoader.initProperties();
// Also need this so that we know what the Solr server is set to
@@ -5,6 +5,7 @@
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

import dk.kb.netarchivesuite.solrwayback.UnitTestUtils;
import dk.kb.netarchivesuite.solrwayback.interfaces.ArcSource;
import dk.kb.netarchivesuite.solrwayback.properties.PropertiesLoader;
import dk.kb.netarchivesuite.solrwayback.service.dto.ArcEntry;
@@ -13,7 +14,7 @@ public class TestExportArc {

public static void main (String[] args) throws Exception{

PropertiesLoader.initProperties();
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());


String arcFile="/media/teg/1200GB_SSD/netarkiv/0205/filedir/27119-33-20080401194737-00004-kb-prod-har-001.kb.dk.arc.gz";
@@ -8,6 +8,7 @@
import java.nio.file.StandardOpenOption;
import java.util.List;

import dk.kb.netarchivesuite.solrwayback.UnitTestUtils;
import dk.kb.netarchivesuite.solrwayback.interfaces.ArcSource;
import dk.kb.netarchivesuite.solrwayback.properties.PropertiesLoader;
import dk.kb.netarchivesuite.solrwayback.service.dto.ArcEntry;
@@ -19,7 +20,7 @@ public class TestExportWarc {

public static void main (String[] args) throws Exception{

PropertiesLoader.initProperties();
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());
SearchResult search = NetarchiveSolrClient.getInstance().search("hash:\"sha1:PROTE66RZ6GDXPZI3ZAHG6YPCXRKZMEN\"", 100000);
// /netarkiv/0105/filedir/272829-30-20170318193124175-00168-sb-prod-har-001.statsbiblioteket.dk.warc.gz

@@ -36,7 +36,7 @@ public class TestExportWarcStreaming extends UnitTestUtils {

@Before
public void setUpProperties() throws Exception{
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback.properties").getPath());
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());
}

@Test
@@ -268,7 +268,7 @@ private void assertBinaryEnding(byte[] expected, byte[] exported) {
}

public static void main(String[] args) throws Exception{
PropertiesLoader.initProperties();
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());
String source_file_path="/home/teg/workspace/solrwayback/storedanske_export-00000.warc";
int offset = 515818793;
ArcEntry warcEntry = WarcParser.getWarcEntry(ArcSource.fromFile(source_file_path),offset);
@@ -14,6 +14,7 @@
*/
package dk.kb.netarchivesuite.solrwayback.solr;

import dk.kb.netarchivesuite.solrwayback.UnitTestUtils;
import dk.kb.netarchivesuite.solrwayback.facade.Facade;
import dk.kb.netarchivesuite.solrwayback.properties.PropertiesLoader;
import dk.kb.netarchivesuite.solrwayback.service.exception.InvalidArgumentServiceException;
@@ -57,7 +58,7 @@ public class SolrGenericStreamingTest {
public static void setUp() throws Exception {
log.info("Setting up embedded server");

PropertiesLoader.initProperties();
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());

coreContainer = new CoreContainer(SOLR_HOME);
coreContainer.load();
@@ -14,6 +14,7 @@
*/
package dk.kb.netarchivesuite.solrwayback.solr;

import dk.kb.netarchivesuite.solrwayback.UnitTestUtils;
import dk.kb.netarchivesuite.solrwayback.parsers.HtmlParserUrlRewriter;
import dk.kb.netarchivesuite.solrwayback.parsers.ParseResult;
import dk.kb.netarchivesuite.solrwayback.properties.PropertiesLoader;
@@ -61,7 +62,7 @@ public class UrlResolveTest {
public static void setUp() throws Exception {
log.info("Setting up embedded server");

PropertiesLoader.initProperties();
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());

coreContainer = new CoreContainer(SOLR_HOME);
coreContainer.load();
@@ -28,7 +28,7 @@ public class URLAbsoluterTest {

@Before
public void setUpProperties() throws Exception{
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback.properties").getPath());
PropertiesLoader.initProperties(UnitTestUtils.getFile("properties/solrwayback_unittest.properties").getPath());
// We need this so that we know what the Solr server is set to
PropertiesLoader.WAYBACK_BASEURL = "http://localhost:0000/solrwayback/";
}
114 changes: 114 additions & 0 deletions src/test/resources/properties/solrwayback_unittest.properties
@@ -0,0 +1,114 @@
##solrwayback.properties (UTF-8)

##URL to the UKWA warc-indexer Solr server. The last part is the collection name.
solr.server=http://localhost:8983/solr/netarchivebuilder/

#Solr caching. Defaults to false if not defined.
solr.server.caching=true
solr.server.caching.max.entries=10000
# Age-based cache invalidation is not enabled by default, as index watching works better in most cases.
# See the description of solr.server.check.interval.seconds below for more details.
#solr.server.caching.age.seconds=86400

# Solr availability and index change check interval: Every x seconds a query for new documents is issued.
# If an index change is detected, caches will be cleared
#
# The check is light (cached by Solr) if the index has not changed and moderate if the index has been
# changed. If the backing index has billions of records and is continuously updated, active checking
# will strain the system. In that case it is recommended to disable active checking and use fixed time
# cache clearing with solr.server.caching.age.seconds instead.
#
# Default is 60 seconds
# Disable by setting to -1
# If the checking is disabled, consider setting solr.server.caching.age.seconds instead
solr.server.check.interval.seconds=60

## Link to this webapp itself. BaseURL for link rewrites must be full url.
wayback.baseurl=http://localhost:8080/solrwayback/

#Disables playback if true. A simple page with an error message is shown if playback is clicked.
#Also prevents showing full-size images and downloading binaries.
#Thumbnail images in search results are still shown.
playback.disabled=false


#Set to true to prevent SolrWayback URL hacking from accessing WARC files+offsets that are not in the Solr collection.
#Such access is possible if location+WARC filename+offset is known for a record.
#This has a performance impact. Only set to true if other WARC files are mounted on the OS that must not be accessed.
warc.files.verify.collection=false

# WARC files must be resolvable for playback to work.
# Plain files as well as HTTP URLs are supported.
# For the base case when WARCS have not been moved since index time, the
# RewriteLocationResolver is used with default setup.
# If WARC files are moved to another location after index, different
# implementations of ArcFileLocationResolverInterface are available.
#
# Default resolver: Optionally rewrites the input
warc.file.resolver.class=dk.kb.netarchivesuite.solrwayback.interfaces.RewriteLocationResolver
# Default parameters for RewriteLocationResolver: Return the input path unchanged:
# warc.file.resolver.parameters.path.regexp=.*
# warc.file.resolver.parameters.path.replacement=$0
# Sample parameters for RewriteLocationResolver that handles changed root location for WARC files,
# where the subfolder structure for the WARCs is preserved:
# warc.file.resolver.parameters.path.regexp=/home/harvester/warcs/(.*)
# warc.file.resolver.parameters.path.replacement=/warcs/$1
# Sample parameters for RewriteLocationResolver that rewrites to a HTTP server where all WARCs are accessible
# directly under the "warcstore/" folder:
# warc.file.resolver.parameters.path.regexp=.*/([^/]*)
# warc.file.resolver.parameters.path.replacement=http://example.com/warcstore/$1
#
# Mapping resolver: Uses a map of known WARCs
# warc.file.resolver.class=dk.kb.netarchivesuite.solrwayback.interfaces.FileMovedMappingResolver
# The FileMovedMappingResolver MUST have a file containing a list of
# full file paths for known WARCs, where a sample entry in the list could be
# /storage/warcs/col1/mywarc_123.warc.gz
# warc.file.resolver.parameters=/home/user/netarkivet.files
#
# Auto discovery: Scans folders for WARCs.
# IMPORTANT: On a networked drive with millions of WARCs, the scan might take significant time
# and IO resources. Use RewriteLocationResolver or FileMovedMappingResolver where possible.
# warc.file.resolver.class=dk.kb.netarchivesuite.solrwayback.interfaces.AutoFileResolver
# The AutoFileResolver MUST have at least one root to scan from
# warc.file.resolver.parameters.autoresolver.roots=/home/sw/warcs1,/netmounts/colfoo
# Per default, the roots are only scanned on SolrWayback start.
# Sample config for AutoFileResolver for scanning every hour:
# warc.file.resolver.parameters.autoresolver.rescan.enabled=true
# warc.file.resolver.parameters.autoresolver.rescan.seconds=3600


#Collection name. This is the name shown when exporting a page to PID-XML.
pid.collection.name=netarkivet.dk


#The possible values for url.normaliser are: normal, legacy and minimal.
# Only change the normaliser type if you know what you are doing.
# Only use minimal if the Solr index was built with a warc-indexer earlier than 3.0. All SolrWayback bundles ship a warc-indexer later than this. (Playback quality is drastically reduced.)
# Use legacy for 3.0-3.1 versions of the warc-indexer.
# Use normal for all warc-indexer versions 3.2.0+
url.normaliser=normal

# Optional list of Solr-params. Format is key1=value1;key2=value2,...
#solr.search.params=f.url_norm.qf=url

#------- Generate preview screenshots ------------------
#Used for preview screenshots shown on the page resources overview. Is not required.
#Chrome must be installed on the OS and headless chrome is used to generate the screenshots.
#The setup depends on the OS.

#Linux: chrome
#Ubuntu: chrome.command=chromium-browser
#Windows: chrome.command=C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe
#MAC1: chrome.command=/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
#MAC2: chrome.command="open -b com.google.Chrome"
#example command: chromium-browser --headless --disable-gpu --ipc-connection-timeout=3000 --screenshot=test.png --window-size=1280,1024 https://www.google.com/
chrome.command=chromium-browser

# This will work on Linux. Create the folder yourself.
screenshot.temp.imagedir=/home/xxx/solrwayback_screenshots/
#For windows (create the folder yourself)
#screenshot.temp.imagedir=C:\\solrwayback_screenshots\\

#Timeout in seconds. Optional, 10 seconds is default.
screenshot.preview.timeout=20
#-------------------------------------------------------
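
The `RewriteLocationResolver` samples in the properties file above pair a regular expression with a replacement that uses group references (`$0`, `$1`). The following is a minimal sketch of how such a pair behaves with `java.util.regex`. It is not the resolver's actual implementation: the class `RewriteSketch`, the method `rewrite`, and the choice to leave non-matching paths unchanged are assumptions made for illustration. In the HTTP sample, a slash is placed before the capture group so that the greedy `.*` cannot consume the filename and leave the group empty.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RewriteSketch {

    /**
     * Rewrites the path if the whole path matches regexp; otherwise returns it unchanged.
     * Leaving non-matching paths untouched is an assumption of this sketch.
     */
    public static String rewrite(String path, String regexp, String replacement) {
        Matcher m = Pattern.compile(regexp).matcher(path);
        return m.matches() ? m.replaceFirst(replacement) : path;
    }

    public static void main(String[] args) {
        String path = "/home/harvester/warcs/col1/a.warc.gz";

        // Default parameters: the input is returned unchanged.
        System.out.println(rewrite(path, ".*", "$0"));

        // Changed root location, subfolder structure preserved.
        System.out.println(rewrite(path, "/home/harvester/warcs/(.*)", "/warcs/$1"));

        // Flat HTTP store: keep only the filename after the last slash.
        System.out.println(rewrite(path, ".*/([^/]*)", "http://example.com/warcstore/$1"));
    }
}
```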